[ast-developers] Fwd: Re: [ast-users] Implementing SEEK_HOLE, SEEK_DATA in AST cp, mv, pax
Dan Shelton
2013-09-30 14:17:36 UTC
Hello,

I'm just forwarding the old conversation as a reminder: AST pax still
does not support SEEK_HOLE or SEEK_DATA (nor does it support the
SUN.holesdata pax header), nor do AST cp and mv support files with holes.

As a consequence, none of AST pax, cp, or mv is competitive with
implementations that support SEEK_HOLE and SEEK_DATA.

For example, moving a 200GB file with 99% holes across filesystems
takes less than 4 seconds with GNU mv (GNU has supported
SEEK_HOLE/SEEK_DATA since 2010) but takes a WHOPPING 18 minutes with
AST mv.
AST pax fares even worse against GNU tar if any files contain holes.

Dan

Forwarded conversation
Subject: Implementing SEEK_HOLE, SEEK_DATA in AST cp, mv, pax
------------------------

From: Dan Shelton <dan.f.shelton at googlemail.com>
Date: 26 March 2012 02:13
To: ast-users at research.att.com, ast-developers at research.att.com


Hello,

are there plans to implement support for SEEK_HOLE (let lseek() seek
to the next hole in a sparse file) and SEEK_DATA (let lseek() seek to
the next place with real data, usually after a hole) in AST cp, mv and
pax in the next 2-3 months? This has become VERY important to
enterprise customers now that Linux+btrfs, GNU coreutils, Solaris,
FreeBSD and others support this feature and that it is going to be
included in the next iteration of the POSIX standard
(http://man7.org/linux/man-pages/man2/lseek.2.html)
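
For readers unfamiliar with the two whence values, here is a minimal C
sketch of a hole-aware copy loop. It is an illustration only, not code
from AST or GNU coreutils. The helper name sparse_copy is hypothetical,
and the sketch assumes a platform whose <unistd.h> exposes SEEK_DATA and
SEEK_HOLE (on glibc that requires _GNU_SOURCE) and where lseek() fails
with ENXIO once no data remains past the given offset.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* glibc: expose SEEK_DATA / SEEK_HOLE */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy only the data regions of src_fd into dst_fd,
   leaving the holes unwritten.  Returns 0 on success, -1 on failure. */
int sparse_copy(int src_fd, int dst_fd)
{
    char buf[65536];
    assert(src_fd >= 0 && dst_fd >= 0);

    off_t end = lseek(src_fd, 0, SEEK_END);
    if (end < 0)
        return -1;

    off_t pos = 0;
    while (pos < end) {
        /* Jump over a hole (if any) to the next byte of real data. */
        off_t data = lseek(src_fd, pos, SEEK_DATA);
        if (data < 0) {
            if (errno == ENXIO)
                break;                     /* only a trailing hole is left */
            return -1;
        }
        /* Find where this data region ends (EOF counts as a hole). */
        off_t hole = lseek(src_fd, data, SEEK_HOLE);
        if (hole < 0)
            return -1;

        /* Copy the bytes in [data, hole) to the same offsets in dst. */
        for (pos = data; pos < hole; ) {
            off_t left = hole - pos;
            size_t want = left < (off_t)sizeof buf ? (size_t)left : sizeof buf;
            ssize_t n = pread(src_fd, buf, want, pos);
            if (n <= 0)
                return -1;                 /* 0: the file shrank under us */
            for (ssize_t off = 0; off < n; ) {
                ssize_t w = pwrite(dst_fd, buf + off, (size_t)(n - off),
                                   pos + off);
                if (w < 0)
                    return -1;
                off += w;
            }
            pos += n;
        }
    }
    /* Give dst the full length so that a trailing hole is preserved. */
    return ftruncate(dst_fd, end);
}
```

The point of the loop is that the time spent is proportional to the
amount of real data, not to the apparent file size. On a file system
that does not track holes, lseek() typically reports the whole file as
one data region, so the sketch degrades to an ordinary copy.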

As context, I've appended an older email from Jeff Liu at the end of
this message; it describes some internals of the GNU coreutils
implementation.

---------- Forwarded message ----------
From: Jeff Liu <jeff.liu at oracle.com>
Date: Fri, Aug 26, 2011 at 11:43 AM
Subject: Re: [zfs-discuss] bug#8061: Introduce SEEK_DATA/SEEK_HOLE to
extent_scan module
To: bug-coreutils at gnu.org
Cc: zfs-discuss at opensolaris.org, chris.mason at oracle.com,
linux-btrfs at vger.kernel.org


Dear All,

As SEEK_HOLE/SEEK_DATA has been implemented on Btrfs in 3.1.0+ and in
Glibc, I have worked out a new version for you to review.

Changes:
======
extent_scan.[c|h]:
1. add a function pointer to "struct extent_scan":
/* Scan method. */
bool (*extent_scan) (struct extent_scan *scan);

2. add a structure member to indicate that a seek-back failure may have occurred:
/* Failed to seek back to position 0 or not. */
bool seek_back_failed;
If the file system supports SEEK_HOLE, the support_seek_hole() check
leaves the file offset somewhere > 0, so we need to seek back to the
beginning before the subsequent extent scan.

3. rename extent_scan to fiemap_extent_scan.

4. add a new seek_extent_scan method.

5. add a new method to check whether SEEK_HOLE/SEEK_DATA is supported.
If the underlying file system supports SEEK_HOLE, assign
seek_extent_scan to scan->extent_scan; otherwise assign
fiemap_extent_scan to it.
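
The dispatch described in items 1, 2, 4 and 5 can be sketched roughly as
follows. This is not the actual patch: the struct is a trimmed-down,
hypothetical stand-in, the function bodies are assumptions, and
fiemap_extent_scan here is only a placeholder that reports the rest of
the file as one extent instead of issuing the real FIEMAP ioctl. Only
the names extent_scan, seek_back_failed, support_seek_hole,
seek_extent_scan, fiemap_extent_scan, extent_scan_init and
src_total_size come from the description above.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* glibc: expose SEEK_DATA / SEEK_HOLE */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical, trimmed-down stand-in for the patch's struct extent_scan. */
struct extent_scan {
    int fd;
    off_t src_total_size;
    off_t scan_start;        /* offset at which the next scan resumes */
    off_t ext_start;         /* start of the most recent data extent */
    off_t ext_len;           /* its length; 0 when no data is left */
    bool hit_final_extent;
    bool seek_back_failed;   /* Failed to seek back to position 0 or not. */
    bool (*extent_scan)(struct extent_scan *scan);   /* Scan method. */
};

/* Scan method built on lseek(): report the next data extent. */
static bool seek_extent_scan(struct extent_scan *scan)
{
    off_t data = lseek(scan->fd, scan->scan_start, SEEK_DATA);
    if (data < 0) {
        if (errno != ENXIO)
            return false;
        scan->ext_len = 0;               /* only a trailing hole remains */
        scan->hit_final_extent = true;
        return true;
    }
    off_t hole = lseek(scan->fd, data, SEEK_HOLE);
    if (hole < 0)
        return false;
    scan->ext_start = data;
    scan->ext_len = hole - data;
    scan->scan_start = hole;
    return true;
}

/* Placeholder (assumption): the real method would query FIEMAP; this one
   just reports everything from scan_start to EOF as a single extent. */
static bool fiemap_extent_scan(struct extent_scan *scan)
{
    scan->ext_start = scan->scan_start;
    scan->ext_len = scan->src_total_size - scan->scan_start;
    scan->scan_start = scan->src_total_size;
    scan->hit_final_extent = true;
    return true;
}

/* Probe for SEEK_HOLE.  The probe moves the file offset, so seek back to
   position 0 afterwards and record whether that failed. */
static bool support_seek_hole(struct extent_scan *scan)
{
    off_t pos = lseek(scan->fd, 0, SEEK_HOLE);
    if (pos < 0 && errno != ENXIO)       /* ENXIO: empty file, whence known */
        return false;
    if (lseek(scan->fd, 0, SEEK_SET) != 0)
        scan->seek_back_failed = true;
    return true;
}

/* Initialise the scan state and pick the scan method once, up front. */
void extent_scan_init(int fd, off_t src_total_size, struct extent_scan *scan)
{
    assert(fd >= 0);
    scan->fd = fd;
    scan->src_total_size = src_total_size;
    scan->scan_start = scan->ext_start = scan->ext_len = 0;
    scan->hit_final_extent = false;
    scan->seek_back_failed = false;
    scan->extent_scan = support_seek_hole(scan) ? seek_extent_scan
                                                : fiemap_extent_scan;
}
```

A caller would then loop on scan->extent_scan(scan) until it returns
false or sets hit_final_extent, without needing to know which method
was selected.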

copy.c:
1. pass src_total_size to extent_scan_init ().
2. for the first round of the extent scan, we also need to seek back to
position 0 if the first data extent starts at the beginning of the
source file.

Tested:
======
1. make syntax-check.
2. verify a copied sparse file with 4697 extents on btrfs
jeff at pibroch:~/gnu/coreutils$ python -c "f=open('/btrfs/sparse_test', 'w'); [(f.seek(x) or f.write(str(x))) for x in range(1, 1000000000, 99999)]; f.close()"
jeff at pibroch:~/gnu/coreutils$ ./src/cp --sparse=always /btrfs/sparse_test /btrfs/sp.seek
jeff at pibroch:~/gnu/coreutils$ cmp /btrfs/sparse_test /btrfs/sp.seek
jeff at pibroch:~/gnu/coreutils$ echo $?
0

Also, the previous patch was developed on Solaris ZFS, but my test
environment has since been lost. :( So if anyone can help test it on
ZFS, that would be appreciated!!