Discussion:
[ast-developers] Sparse file support for AST cp/mv/ln ? / was: Re: [ast-users] Fwd: Re: Implementing SEEK_HOLE, SEEK_DATA in AST cp, mv, pax
Roland Mainz
2013-09-30 15:12:23 UTC
I'm just forwarding the old conversation as a reminder - AST pax still
does not support SEEK_HOLE or SEEK_DATA (nor does it support the
SUN.holesdata pax header), nor do AST cp and mv support files with holes.
As a consequence, none of AST pax, cp or mv is competitive with
implementations which support SEEK_HOLE and SEEK_DATA.
For example, moving a 200GB file with 99% holes across filesystems
takes less than 4 seconds with GNU mv (GNU supports
SEEK_HOLE/SEEK_DATA since 2010) but takes a WHOPPING 18 minutes with
AST mv.
[snip]

Erm...
... AFAIK a copy-file-data algorithm would be just this:
1. Test whether $ getconf MIN_HOLE_SIZE <srcpath> # returns a value > 0
2. If [1] is true then check whether the file has at least one hole
(via |SEEK_HOLE|)
3. If [2] is true switch to a special version of the data copying code
which "simply" copies data via |write()| until it hits a hole, then
uses |lseek()| to seek forward to the start of the next data block, and
then uses |write()| again.
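
A minimal sketch of step 3 in plain C, assuming a platform where |lseek()| accepts |SEEK_DATA|/|SEEK_HOLE|. The names copy_bytes() and sparse_copy() are hypothetical and not part of AST; the sketch skips the MIN_HOLE_SIZE test from step 1 and extends the destination with |ftruncate()| so that a hole at the end of the file is preserved.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE 1	/* Linux requires this for SEEK_DATA/SEEK_HOLE */
#endif

#include <assert.h>
#include <errno.h>
#include <fcntl.h>	/* open(), for callers which open both files */
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* hypothetical helper: copy exactly |len| bytes from |src| to |dst| */
static int copy_bytes(int src, int dst, off_t len)
{
	char buf[65536];

	while (len > 0)
	{
		size_t want = (len < (off_t)sizeof(buf)) ? (size_t)len : sizeof(buf);
		ssize_t got = read(src, buf, want);
		ssize_t put = 0;

		if (got <= 0)
			return (-1);	/* read error or unexpected EOF */
		while (put < got)
		{
			ssize_t w = write(dst, buf + put, (size_t)(got - put));

			if (w < 0)
				return (-1);
			put += w;
		}
		len -= got;
	}
	return (0);
}

/*
 * hypothetical sketch: replicate the hole/data layout of |src| in |dst|.
 * Returns 0 on success, -1 on error (errno as set by the failing call).
 */
static int sparse_copy(int src, int dst)
{
	struct stat st;
	off_t pos = 0;

	if (fstat(src, &st) < 0)
		return (-1);

	while (pos < st.st_size)
	{
		off_t data = lseek(src, pos, SEEK_DATA);
		off_t hole;

		if (data < 0)
		{
			if (errno != ENXIO)
				return (-1);
			break;	/* only a trailing hole is left */
		}
		hole = lseek(src, data, SEEK_HOLE);
		if (hole < 0)
			return (-1);
		assert(hole > data);

		/* skip the hole [pos, data) in both files, then copy [data, hole) */
		if (lseek(src, data, SEEK_SET) < 0 || lseek(dst, data, SEEK_SET) < 0)
			return (-1);
		if (copy_bytes(src, dst, hole - data) < 0)
			return (-1);
		pos = hole;
	}

	/* seeking alone does not create a trailing hole; set the size instead */
	return (ftruncate(dst, st.st_size));
}
```

A caller would open the source read-only and the destination writable and empty, then call sparse_copy(rfd, wfd); if it fails with |EINVAL| or |ENOTSUP| the caller can fall back to the ordinary copy loop.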

Glenn: Does that sound correct ?

----

Bye,
Roland

P.S.: Testing: Files with holes at the start, at the end, in the
middle, files which are a giant, single hole...
--
__ . . __
(o.\ \/ /.o) roland.mainz at nrubsig.org
\__\/\/__/ MPEG specialist, C&&JAVA&&Sun&&Unix programmer
/O /==\ O\ TEL +49 641 3992797
(;O/ \/ \O;)
Roland Mainz
2013-09-30 17:06:58 UTC
Some notes for myself:
1. Test code:
-- snip --
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

bool
support_seek_hole (int fd)
{
	off_t pos;

	if (fpathconf(fd, _PC_MIN_HOLE_SIZE) < 0)
		return (false);

	/*
	 * Test two error conditions:
	 * 1. we have been compiled on an OS revision that
	 * supports |SEEK_HOLE| but run on an OS revision
	 * that does not support |SEEK_HOLE|, we get |EINVAL|.
	 * 2. the underlying filesystem does not support
	 * |SEEK_HOLE|, we get |ENOTSUP|.
	 */
	pos = lseek (fd, 0LL, SEEK_HOLE);
	if (pos < 0LL)
	{
		if ((errno == EINVAL) || (errno == ENOTSUP))
			return (false);
	}

	/* Do the same for |SEEK_DATA| */
	pos = lseek (fd, 0LL, SEEK_DATA);
	if (pos < 0LL)
	{
		if ((errno == EINVAL) || (errno == ENOTSUP))
			return (false);
	}

	return (true);
}
-- snip --


2. |lseek(fd, pos, SEEK_DATA)|/|lseek(fd, pos, SEEK_HOLE)| fail with
|errno| set to |ENXIO| if no further data block/hole can be found from
position |pos| on...
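
Because of that, a failing |lseek()| is not always a real error. A small sketch of a wrapper which keeps the three outcomes apart; the name seek_region() and its return convention are assumptions made up for this example, not an existing API:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE 1	/* Linux requires this for SEEK_DATA/SEEK_HOLE */
#endif

#include <assert.h>
#include <errno.h>
#include <fcntl.h>	/* open(), for callers */
#include <sys/types.h>
#include <unistd.h>

/*
 * hypothetical wrapper: look for the next data block
 * (whence == SEEK_DATA) or hole (whence == SEEK_HOLE) at or after |pos|.
 *
 * Returns  1 and stores the offset in |*found| if there is one,
 *          0 if |lseek()| failed with |ENXIO| (nothing further to find),
 *         -1 on any other error (errno is left as |lseek()| set it).
 */
static int seek_region(int fd, off_t pos, int whence, off_t *found)
{
	off_t res;

	assert(found != NULL);
	assert(whence == SEEK_DATA || whence == SEEK_HOLE);

	res = lseek(fd, pos, whence);
	if (res >= 0)
	{
		*found = res;
		return (1);
	}
	return ((errno == ENXIO) ? 0 : -1);
}
```

A loop over a file can then stop cleanly on 0 and only report a failure on -1.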

----

Bye,
Roland
Roland Mainz
2013-10-01 03:50:06 UTC

Attached (as "astksh20130926_sparsefile_cp001.diff.txt") is a
prototype patch which adds sparse file support to AST
cp(1)/mv(1)/ln(1) using the |SEEK_HOLE|/|SEEK_DATA| API from
POSIX.
Additionally I've attached "lsholes.c.txt" which is a small test
application to show the hole/data layout of a sparse file.

* Notes:
- |sfmove()| seems to turn longer sequences of '\0' data into holes.
While this is useful _sometimes_, it's devastating for databases&&other
software which depend on an exact replication of the layout of the
holes (and of real data which mostly has the value '\0').
- Erm... it's 5:35h AM here... any idea how |sfmove()| figures out if
data are all zero bytes and should be skipped ?
- Glenn: Where should the final hole-replication code live - in
src/lib/libcmd/cp.c or |sfmove()| ? Note that we need _both_ modes,
selectable via option (proposed name - like GNU cp(1) - ...
"--sparse"): By default we use |SEEK_HOLE|/|SEEK_DATA| to create an
exact replication of the hole/data layout of a file (--sparse=layout),
but optionally we need a way ("--sparse-zeros2holes") to turn sequences
of '\0' bytes longer than getconf(MIN_HOLE_SIZE) (or 512 bytes if not
available) into holes at the destination. This turns out to be a good
thing when creating boot CDROMs since some boot data have lots of '\0'
padding, but care must be taken in other cases where the holes are
important (the Solaris diskless boot kernel is a case where a wrong
hole layout results in an unbootable kernel for weird&&arcane reasons).
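
For the optional zeros-to-holes mode, a rough sketch of what the write side could look like. This is NOT how |sfmove()| does it (that is still an open question above); the names is_all_zero() and write_zeros_as_holes() are hypothetical, and |min_hole| stands for the MIN_HOLE_SIZE value (or 512).

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <fcntl.h>	/* open(), for callers */
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* true if all |len| bytes of |buf| are '\0' */
static int is_all_zero(const char *buf, size_t len)
{
	/* first byte is zero and every byte equals its successor */
	return (len == 0 ||
		(buf[0] == '\0' && memcmp(buf, buf + 1, len - 1) == 0));
}

/*
 * hypothetical sketch: write |buf| to |fd| in |min_hole|-sized chunks,
 * seeking over full chunks which consist only of '\0' bytes.
 * Returns the number of bytes turned into holes, or -1 on error.
 * If the last chunk was skipped the caller still has to set the final
 * file size (e.g. via |ftruncate()|).
 */
static off_t write_zeros_as_holes(int fd, const char *buf, size_t len,
	size_t min_hole)
{
	off_t skipped = 0;

	assert(min_hole > 0);

	while (len > 0)
	{
		size_t chunk = (len < min_hole) ? len : min_hole;

		if (chunk == min_hole && is_all_zero(buf, chunk))
		{
			if (lseek(fd, (off_t)chunk, SEEK_CUR) < 0)
				return (-1);
			skipped += (off_t)chunk;
		}
		else
		{
			size_t put = 0;

			while (put < chunk)
			{
				ssize_t w = write(fd, buf + put, chunk - put);

				if (w < 0)
					return (-1);
				put += (size_t)w;
			}
		}
		buf += chunk;
		len -= chunk;
	}
	return (skipped);
}
```

Note that this mode cannot tell "real" zero data from padding, which is exactly why it must stay optional.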

* Testing/Example:
-- snip --
$ rm -f x.x y.y
$ ./lsholes -T x.x
## writing text data at: 0
## writing text data at: 1048576
## writing zero data at: 2097152
## writing text data at: 3145728
data: from 0 to 131072 (size 131072)
hole: from 131072 to 1048576 (size 917504)
data: from 1048576 to 1179648 (size 131072)
hole: from 1179648 to 2097152 (size 917504)
data: from 2097152 to 2228224 (size 131072)
hole: from 2228224 to 3145728 (size 917504)
data: from 3145728 to 3145735 (size 7)
$ ksh -c 'builtin cp ; rm -f y.y ; cp x.x y.y ; true'
$ ./lsholes --list x.x y.y
# file: x.x
data: from 0 to 131072 (size 131072)
hole: from 131072 to 1048576 (size 917504)
data: from 1048576 to 1179648 (size 131072)
hole: from 1179648 to 2097152 (size 917504)
data: from 2097152 to 2228224 (size 131072)
hole: from 2228224 to 3145728 (size 917504)
data: from 3145728 to 3145735 (size 7)
# file: y.y
data: from 0 to 131072 (size 131072)
hole: from 131072 to 1048576 (size 917504)
data: from 1048576 to 1179648 (size 131072)
hole: from 1179648 to 2097152 (size 917504)
data: from 2097152 to 2228224 (size 131072)
hole: from 2228224 to 3145728 (size 917504)
data: from 3145728 to 3145735 (size 7)
#
# now try AST cp(1) from an old libcmd/ksh93 which does not have
# |SEEK_HOLE|/|SEEK_DATA| support:
#
$ ~/bin/ksh_noseekholesupport -c 'builtin cp ; cp x.x z.z ; true'
$ ./lsholes --list z.z
# file: z.z
data: from 0 to 131072 (size 131072)
hole: from 131072 to 1048576 (size 917504)
data: from 1048576 to 1179648 (size 131072)
hole: from 1179648 to 3145728 (size 1966080)
data: from 3145728 to 3145735 (size 7)
-- snip --
Note that the zero data written by lsholes -T ("## writing zero data at:
2097152") is missing in z.z: the old cp(1) turned that data block into
part of a hole.

Finally: Glenn... are there any objections to me adding lsholes(1)
(AST'tified of course) to libcmd ?

----

Bye,
Roland

P.S.: Linux's ext3/4fs respond to |SEEK_HOLE|/|SEEK_DATA| but do not
report holes... instead they return a single block of data regardless
of whether there are holes or not. The options are: 1. Use btrfs or 2.
use the ext3/4fs-specific filemap API. So for testing we're limited to
Solaris for now... if the |SEEK_HOLE|/|SEEK_DATA| support makes it into
libast/libcmd I'll look at the ext3/ext4fs-specific API as follow-up
work (mostly for legacy purposes... but it's still nice-to-have) ...
-------------- next part --------------
diff -r -u original/src/lib/libcmd/cp.c build_cpsparse/src/lib/libcmd/cp.c
--- src/lib/libcmd/cp.c 2013-07-16 23:45:26.000000000 +0200
+++ src/lib/libcmd/cp.c 2013-10-01 05:12:49.890609833 +0200
@@ -228,6 +228,169 @@
}
}

+#if defined(SEEK_HOLE) && defined(SEEK_DATA)
+#define SPARSEFILE_SUPPORT 1
+#endif
+
+#ifdef SPARSEFILE_SUPPORT
+static
+bool supports_seek_hole(int fd)
+{
+ off_t pos;
+
+/* Linux does not support |_PC_MIN_HOLE_SIZE| */
+#ifdef _PC_MIN_HOLE_SIZE
+ if (fpathconf(fd, _PC_MIN_HOLE_SIZE) < 0)
+ return (false);
+#endif
+
+ /*
+ * Test two error conditions:
+ * 1. we have been compiled on an OS revision that
+ * supports |SEEK_HOLE| but run on an OS revision
+ * that does not support |SEEK_HOLE|, we get |EINVAL|.
+ * 2. the underlying filesystem does not support
+ * |SEEK_HOLE|, we get |ENOTSUP|.
+ */
+ pos = lseek(fd, 0LL, SEEK_HOLE);
+ if (pos < 0LL)
+ {
+ if ((errno == EINVAL) || (errno == ENOTSUP))
+ return (false);
+ }
+
+ /* Do the same for |SEEK_DATA| */
+ pos = lseek(fd, 0LL, SEEK_DATA);
+ if (pos < 0LL)
+ {
+ if ((errno == EINVAL) || (errno == ENOTSUP))
+ return (false);
+ }
+
+ return (true);
+}
+
+#if 1
+#define D(x)
+#else
+#define D(x) x
+#endif
+
+typedef struct _sparsefiledatarec
+{
+ enum
+ {
+ SPFDREC_UNDEFINED = 0,
+ SPFDREC_DATA = 1,
+ SPFDREC_HOLE = 2
+ } type;
+ off_t begin;
+ off_t end;
+} sparsefiledatarec;
+
+static
+sparsefiledatarec *sparsefile_enumerate_holes(int fd, ssize_t *res_numrec)
+{
+ off_t data_pos,
+ hole_pos,
+ pos;
+ struct stat st;
+ D(int saved_errno);
+ sparsefiledatarec *rec = NULL;
+ size_t numrec = 0UL;
+
+ *res_numrec = -1L;
+
+ if (fstat(fd, &st) < 0)
+ return (NULL);
+
+ /* special case for files with zero size */
+ if (st.st_size == 0)
+ {
+ rec = malloc(sizeof(sparsefiledatarec));
+ if (!rec)
+ return (NULL);
+ rec->type = SPFDREC_DATA;
+ rec->begin = 0;
+ rec->end = 0;
+ *res_numrec = 0;
+ return (rec);
+ }
+
+ for (hole_pos = data_pos = pos = 0LL ; pos < st.st_size ; )
+ {
+ data_pos = lseek(fd, pos, SEEK_DATA);
+ D(saved_errno=errno;(void)printf("# data pos = %8ld\n", data_pos);errno=saved_errno);
+ if (data_pos < 0)
+ {
+ if (errno == ENXIO)
+ {
+ /* final data block */
+ }
+ else
+ {
+ free(rec);
+ return (NULL);
+ }
+ }
+
+ hole_pos = lseek(fd, pos, SEEK_HOLE);
+ D(saved_errno=errno;(void)printf("# hole pos = %8ld\n", hole_pos);errno=saved_errno);
+ if (hole_pos < 0)
+ {
+ if (errno == ENXIO)
+ {
+ /* final hole block */
+ }
+ else
+ {
+ free(rec);
+ return (NULL);
+ }
+ }
+
+ if (data_pos == pos)
+ {
+ D((void)printf("#data from %8ld to %8ld (size %8ld)\n",
+ data_pos, hole_pos, (hole_pos - data_pos)));
+ pos = hole_pos;
+
+ rec = realloc(rec, sizeof(sparsefiledatarec)*(numrec+1));
+ if (!rec)
+ return (NULL);
+ rec[numrec].type = SPFDREC_DATA;
+ rec[numrec].begin = data_pos;
+ rec[numrec].end = hole_pos;
+ numrec++;
+ }
+ else if (hole_pos == pos)
+ {
+ D((void)printf("#hole from %8ld to %8ld (size %8ld)\n",
+ hole_pos, data_pos, (data_pos - hole_pos)));
+ pos = data_pos;
+
+ rec = realloc(rec, sizeof(sparsefiledatarec)*(numrec+1));
+ if (!rec)
+ return (NULL);
+ rec[numrec].type = SPFDREC_HOLE;
+ rec[numrec].begin = hole_pos;
+ rec[numrec].end = data_pos;
+ numrec++;
+ }
+ else
+ {
+ free(rec);
+ return (NULL);
+ }
+ }
+
+ *res_numrec = numrec;
+
+ return (rec);
+}
+#endif /* SPARSEFILE_SUPPORT */
+
+
/*
* visit a single file and state.op to the destination
*/
@@ -605,6 +768,19 @@
}
else if (rfd >= 0)
{
+#ifdef SPARSEFILE_SUPPORT
+ sparsefiledatarec *sprec;
+ ssize_t spnumrec = 0L;
+ sprec = sparsefile_enumerate_holes(rfd, &spnumrec);
+ if (lseek(rfd, 0LL, SEEK_SET) < 0)
+ {
+ error(ERROR_SYSTEM|2, "%s: %s read stream seek error", ent->fts_path, state->path);
+ close(rfd);
+ close(wfd);
+ return 0;
+ }
+#endif /* SPARSEFILE_SUPPORT */
+
if (!(ip = sfnew(NiL, NiL, SF_UNBOUND, rfd, SF_READ)))
{
error(ERROR_SYSTEM|2, "%s: %s read stream error", ent->fts_path, state->path);
@@ -620,10 +796,58 @@
return 0;
}
n = 0;
- if (sfmove(ip, op, (Sfoff_t)SF_UNBOUND, -1) < 0)
- n |= 3;
- if (!sfeof(ip))
- n |= 1;
+#ifdef SPARSEFILE_SUPPORT
+ if (sprec)
+ {
+ ssize_t i;
+
+ for (i=0 ; (i < spnumrec) && (n == 0) ; i++)
+ {
+ Sfoff_t movesize = sprec[i].end - sprec[i].begin;
+ switch(sprec[i].type)
+ {
+ case SPFDREC_DATA:
+ /*
+ * fixme: |sfmove()| seems to optimise
+ * longer sequences of '\0' away and
+ * turns them into holes, too... this
+ * MUST not happen with native
+ * |SEEK_HOLE|/|SEEK_DATA|
+ * support
+ */
+ if (sfmove(ip, op, movesize, -1) < 0)
+ n |= 3;
+ break;
+ case SPFDREC_HOLE:
+ if (sfseek(ip, movesize, SEEK_CUR) < 0)
+ n |= 1;
+ if (sfseek(op, movesize, SEEK_CUR) < 0)
+ n |= 2;
+ break;
+ }
+ }
+
+ /*
+ * Just seeking to a new position does not set
+ * the sfio-internal eof flag. If the file
+ * ends with a hole we explicitly have to read
+ * something to get the EOF (or not)
+ */
+ if ((n == 0) && (sfgetc(ip) != EOF))
+ {
+ n |= 1;
+ }
+
+ free(sprec);
+ }
+ else
+#endif /* SPARSEFILE_SUPPORT */
+ {
+ if (sfmove(ip, op, (Sfoff_t)SF_UNBOUND, -1) < 0)
+ n |= 3;
+ if (!sfeof(ip))
+ n |= 1;
+ }
if (sfsync(op) || state->sync && fsync(wfd) || sfclose(op))
n |= 2;
if (sfclose(ip))
diff -r -u original/src/lib/libsum/sum-lmd.c build_cpsparse/src/lib/libsum/sum-lmd.c
--- src/lib/libsum/sum-lmd.c 2013-09-25 16:48:46.000000000 +0200
+++ src/lib/libsum/sum-lmd.c 2013-10-01 03:33:40.296099349 +0200
@@ -266,6 +266,7 @@
#define sha384_description "FIPS 180-2 SHA384 secure hash algorithm. The block count is not printed."
#define sha384_options "[+(version)?sha384 (solaris -lmd) 2005-07-26]"
#define sha384_match "sha384|sha-384|SHA384|SHA-384"
+#define sha384_scale 0
#define sha384_flags SUM_INDICATOR
#define sha384_init lmd_init
#define sha384_block lmd_block
-------------- next part --------------
/***********************************************************************
* *
* This software is part of the ast package *
* Copyright (c) 2013 AT&T Intellectual Property *
* and is licensed under the *
* Eclipse Public License, Version 1.0 *
* by AT&T Intellectual Property *
* *
* A copy of the License is available at *
* http://www.eclipse.org/org/documents/epl-v10.html *
* (with md5 checksum b35adb5213ca9657e911e9befb180842) *
* *
* Information and Software Systems Research *
* AT&T Research *
* Florham Park NJ *
* *
* Roland Mainz <roland.mainz at nrubsig.org> *
* *
***********************************************************************/

/* Linux requires |_GNU_SOURCE| for |SEEK_DATA|/|SEEK_HOLE|*/
#define _GNU_SOURCE 1

#include <stdlib.h>
#include <stdio.h>
#include <stdbool.h>
#include <string.h>
#include <err.h>

#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <fcntl.h>
#include <errno.h>


static
bool supports_seek_hole(int fd)
{
	off_t pos;

/* Linux does not support |_PC_MIN_HOLE_SIZE| */
#ifdef _PC_MIN_HOLE_SIZE
	if (fpathconf(fd, _PC_MIN_HOLE_SIZE) < 0)
		return (false);
#endif

	/*
	 * Test two error conditions:
	 * 1. we have been compiled on an OS revision that
	 * supports |SEEK_HOLE| but run on an OS revision
	 * that does not support |SEEK_HOLE|, we get |EINVAL|.
	 * 2. the underlying filesystem does not support
	 * |SEEK_HOLE|, we get |ENOTSUP|.
	 */
	pos = lseek(fd, 0LL, SEEK_HOLE);
	if (pos < 0LL)
	{
		if ((errno == EINVAL) || (errno == ENOTSUP))
			return (false);
	}

	/* Do the same for |SEEK_DATA| */
	pos = lseek(fd, 0LL, SEEK_DATA);
	if (pos < 0LL)
	{
		if ((errno == EINVAL) || (errno == ENOTSUP))
			return (false);
	}

	return (true);
}

#if 1
#define D(x)
#else
#define D(x) x
#endif

typedef struct _sparsefiledatarec
{
	enum
	{
		SPFDREC_UNDEFINED = 0,
		SPFDREC_DATA = 1,
		SPFDREC_HOLE = 2
	} type;
	off_t begin;
	off_t end;
} sparsefiledatarec;

static
sparsefiledatarec *sparsefile_enumerate_holes(int fd, ssize_t *res_numrec)
{
	off_t data_pos,
		hole_pos,
		pos;
	struct stat st;
	D(int saved_errno);
	sparsefiledatarec *rec = NULL;
	sparsefiledatarec *newrec;
	size_t numrec = 0UL;

	*res_numrec = -1L;

	if (fstat(fd, &st) < 0)
	{
		warn("fstat failed");
		return (NULL);
	}

	/* special case for files with zero size */
	if (st.st_size == 0)
	{
		rec = malloc(sizeof(sparsefiledatarec));
		if (!rec)
			return (NULL);
		rec->type = SPFDREC_DATA;
		rec->begin = 0;
		rec->end = 0;
		*res_numrec = 0;
		return (rec);
	}

	for (hole_pos = data_pos = pos = 0LL ; pos < st.st_size ; )
	{
		data_pos = lseek(fd, pos, SEEK_DATA);
		D(saved_errno=errno;(void)printf("# data pos = %8ld\n", (long)data_pos);errno=saved_errno);
		if (data_pos < 0)
		{
			if (errno == ENXIO)
			{
				/* no data left: the trailing hole runs to EOF */
				data_pos = st.st_size;
			}
			else
			{
				free(rec);
				return (NULL);
			}
		}

		hole_pos = lseek(fd, pos, SEEK_HOLE);
		D(saved_errno=errno;(void)printf("# hole pos = %8ld\n", (long)hole_pos);errno=saved_errno);
		if (hole_pos < 0)
		{
			if (errno == ENXIO)
			{
				/* no hole left: the final data block runs to EOF */
				hole_pos = st.st_size;
			}
			else
			{
				free(rec);
				return (NULL);
			}
		}

		if (data_pos == pos)
		{
			D((void)printf("#data from %8ld to %8ld (size %8ld)\n",
				(long)data_pos, (long)hole_pos, (long)(hole_pos - data_pos)));
			pos = hole_pos;

			/* keep the old array reachable if |realloc()| fails */
			newrec = realloc(rec, sizeof(sparsefiledatarec)*(numrec+1));
			if (!newrec)
			{
				free(rec);
				return (NULL);
			}
			rec = newrec;
			rec[numrec].type = SPFDREC_DATA;
			rec[numrec].begin = data_pos;
			rec[numrec].end = hole_pos;
			numrec++;
		}
		else if (hole_pos == pos)
		{
			D((void)printf("#hole from %8ld to %8ld (size %8ld)\n",
				(long)hole_pos, (long)data_pos, (long)(data_pos - hole_pos)));
			pos = data_pos;

			newrec = realloc(rec, sizeof(sparsefiledatarec)*(numrec+1));
			if (!newrec)
			{
				free(rec);
				return (NULL);
			}
			rec = newrec;
			rec[numrec].type = SPFDREC_HOLE;
			rec[numrec].begin = hole_pos;
			rec[numrec].end = data_pos;
			numrec++;
		}
		else
		{
			free(rec);
			return (NULL);
		}
	}

	*res_numrec = numrec;

	return (rec);
}


static
void printrec(sparsefiledatarec *rec, ssize_t numrec)
{
	ssize_t i;

	for (i=0 ; i < numrec ; i++)
	{
		switch(rec[i].type)
		{
			case SPFDREC_DATA:
				(void)printf("data: from\t%8ld to\t%8ld\t(size %8ld)\n",
					(long)rec[i].begin,
					(long)rec[i].end,
					(long)(rec[i].end - rec[i].begin));
				break;
			case SPFDREC_HOLE:
				(void)printf("hole: from\t%8ld to\t%8ld\t(size %8ld)\n",
					(long)rec[i].begin,
					(long)rec[i].end,
					(long)(rec[i].end - rec[i].begin));
				break;
			case SPFDREC_UNDEFINED:
				abort();
				break;
		}
	}
}


static
bool hasholerecord(sparsefiledatarec *rec, ssize_t numrec)
{
	ssize_t i;

	for (i=0 ; i < numrec ; i++)
	{
		switch(rec[i].type)
		{
			case SPFDREC_DATA:
				break;
			case SPFDREC_HOLE:
				return (true);
			case SPFDREC_UNDEFINED:
				abort();
				break;
		}
	}
	return (false);
}

static
int test_lsholes1(int ac, char *av[])
{
	int fd;
	int res = EXIT_SUCCESS;
	off_t p = 0LL;
	sparsefiledatarec *rec;
	ssize_t numrec = 0;

	fd = creat(av[1], 0666);
	if (fd < 0)
	{
		warn("Cannot open %s", av[1]);
		return (EXIT_FAILURE);
	}

	if (!supports_seek_hole(fd))
	{
		warn("filesystem does not support holes for %s", av[1]);
		(void)close(fd);
		return (EXIT_FAILURE);
	}

	(void)lseek(fd, 0LL, SEEK_SET);

	(void)printf("## writing text data at: %8ld\n", (long)p);
	(void)write(fd, "a start\n", 8);

	p = lseek(fd, 65536*16-8, SEEK_CUR);
	(void)printf("## writing text data at: %8ld\n", (long)p);
	(void)write(fd, "a middle\n", 9);

	p = lseek(fd, 65536*16-9, SEEK_CUR);
	(void)printf("## writing zero data at: %8ld\n", (long)p);
	(void)write(fd, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 13);

	p = lseek(fd, 65536*16-13, SEEK_CUR);
	(void)printf("## writing text data at: %8ld\n", (long)p);
	(void)write(fd, "an end\n", 7);

	(void)lseek(fd, 0LL, SEEK_SET);
	rec = sparsefile_enumerate_holes(fd, &numrec);
	if (!rec)
		perror("cannot obtain list of sparse entries");
	(void)close(fd);

	if (!rec)
		return (EXIT_FAILURE);

	printrec(rec, numrec);

	free(rec);

	return (res);
}


static
int do_list(const char *filename)
{
	int fd;
	int res = EXIT_SUCCESS;
	sparsefiledatarec *rec;
	ssize_t numrec = 0;

	(void)printf("# file: %s\n", filename);
	fd = open(filename, O_RDONLY);
	if (fd < 0)
	{
		warn("Cannot open %s", filename);
		return (EXIT_FAILURE);
	}

	if (!supports_seek_hole(fd))
	{
		warn("filesystem does not support holes for %s", filename);
		(void)close(fd);
		return (EXIT_FAILURE);
	}

	(void)lseek(fd, 0LL, SEEK_SET);
	rec = sparsefile_enumerate_holes(fd, &numrec);
	if (!rec)
		warn("cannot obtain list of sparse entries for %s", filename);
	(void)close(fd);

	if (!rec)
		return (EXIT_FAILURE);

	printrec(rec, numrec);

	free(rec);

	return (res);
}


static
int do_test(const char *filename)
{
	int fd;
	sparsefiledatarec *rec;
	ssize_t numrec = 0;
	bool hasholes;

	fd = open(filename, O_RDONLY);
	if (fd < 0)
	{
		warn("Cannot open %s", filename);
		return (EXIT_FAILURE);
	}

	if (!supports_seek_hole(fd))
	{
		warn("filesystem does not support holes for %s", filename);
		(void)close(fd);
		return (EXIT_FAILURE);
	}

	(void)lseek(fd, 0LL, SEEK_SET);
	rec = sparsefile_enumerate_holes(fd, &numrec);
	if (!rec)
		warn("cannot obtain list of sparse entries for %s", filename);
	(void)close(fd);

	if (!rec)
		return (EXIT_FAILURE);

	hasholes = hasholerecord(rec, numrec);

	free(rec);

	return (hasholes?EXIT_SUCCESS:EXIT_FAILURE);
}


int main(int ac, char *av[])
{
	if ((ac > 1) && (!strcmp(av[1], "-T") || !strcmp(av[1], "--selftest")))
	{
		av++;
		ac--;
		return (test_lsholes1(ac, av));
	}
	else if ((ac > 1) && (!strcmp(av[1], "-l") || !strcmp(av[1], "--list")))
	{
		const char *name;
		int res = EXIT_SUCCESS;

		av++;
		ac--;

		/* skip an optional "--" end-of-options marker */
		if (av[1] && (!strcmp(av[1], "--")))
		{
			av++;
			ac--;
		}

		av++;
		ac--;

		while(((ac-- > 0) && (name = *av++)))
		{
			if (do_list(name) != EXIT_SUCCESS)
				res = EXIT_FAILURE;
		}

		return (res);
	}
	else if ((ac > 1) && (!strcmp(av[1], "-t") || !strcmp(av[1], "--test")))
	{
		const char *name;
		int res = EXIT_FAILURE;

		av++;
		ac--;

		/* skip an optional "--" end-of-options marker */
		if (av[1] && (!strcmp(av[1], "--")))
		{
			av++;
			ac--;
		}

		av++;
		ac--;

		while(((ac-- > 0) && (name = *av++)))
		{
			if (do_test(name) == EXIT_SUCCESS)
				res = EXIT_SUCCESS;
		}

		return (res);
	}
	else
	{
		(void)fprintf(stderr, "%s: Unknown option %s\n", av[0], av[1]?av[1]:"");
		return (EXIT_FAILURE);
	}
}
Roland Mainz
2013-10-01 11:31:27 UTC
Post by Roland Mainz
- |sfmove()| seems to turn longer sequences of '\0' data into holes.
While this is useful _sometimes_, it's devastating for databases&&other
software which depend on an exact replication of the layout of the
holes (and of real data which mostly has the value '\0').
- Erm... it's 5:35h AM here... any idea how |sfmove()| figures out if
data are all zero bytes and should be skipped ?
[snip]

It turns out there is a sfio flag called |SF_WHOLE| to disable the
"turn zero bytes into holes" functionality...
... attached (as "astksh20130926_sparsefile_cp002.diff.txt") is a
patch which does exactly that...

... AFAIK the only open question before committing it to ast-ksh is the
--sparse option and how to name the keys for it...

----

Bye,
Roland
-------------- next part --------------
diff -r -u original/src/lib/libcmd/cp.c build_cpsparse/src/lib/libcmd/cp.c
--- src/lib/libcmd/cp.c 2013-07-16 23:45:26.000000000 +0200
+++ src/lib/libcmd/cp.c 2013-10-01 13:08:30.382728681 +0200
@@ -228,6 +228,169 @@
}
}

+#if defined(SEEK_HOLE) && defined(SEEK_DATA)
+#define SPARSEFILE_SUPPORT 1
+#endif
+
+#ifdef SPARSEFILE_SUPPORT
+static
+bool supports_seek_hole(int fd)
+{
+ off_t pos;
+
+/* Linux does not support |_PC_MIN_HOLE_SIZE| */
+#ifdef _PC_MIN_HOLE_SIZE
+ if (fpathconf(fd, _PC_MIN_HOLE_SIZE) < 0)
+ return (false);
+#endif
+
+ /*
+ * Test two error conditions:
+ * 1. we have been compiled on an OS revision that
+ * supports |SEEK_HOLE| but run on an OS revision
+ * that does not support |SEEK_HOLE|, we get |EINVAL|.
+ * 2. the underlying filesystem does not support
+ * |SEEK_HOLE|, we get |ENOTSUP|.
+ */
+ pos = lseek(fd, 0LL, SEEK_HOLE);
+ if (pos < 0LL)
+ {
+ if ((errno == EINVAL) || (errno == ENOTSUP))
+ return (false);
+ }
+
+ /* Do the same for |SEEK_DATA| */
+ pos = lseek(fd, 0LL, SEEK_DATA);
+ if (pos < 0LL)
+ {
+ if ((errno == EINVAL) || (errno == ENOTSUP))
+ return (false);
+ }
+
+ return (true);
+}
+
+#if 1
+#define D(x)
+#else
+#define D(x) x
+#endif
+
+typedef struct _sparsefiledatarec
+{
+ enum
+ {
+ SPFDREC_UNDEFINED = 0,
+ SPFDREC_DATA = 1,
+ SPFDREC_HOLE = 2
+ } type;
+ off_t begin;
+ off_t end;
+} sparsefiledatarec;
+
+static
+sparsefiledatarec *sparsefile_enumerate_holes(int fd, ssize_t *res_numrec)
+{
+ off_t data_pos,
+ hole_pos,
+ pos;
+ struct stat st;
+ D(int saved_errno);
+ sparsefiledatarec *rec = NULL;
+ size_t numrec = 0UL;
+
+ *res_numrec = -1L;
+
+ if (fstat(fd, &st) < 0)
+ return (NULL);
+
+ /* special case for files with zero size */
+ if (st.st_size == 0)
+ {
+ rec = malloc(sizeof(sparsefiledatarec));
+ if (!rec)
+ return (NULL);
+ rec->type = SPFDREC_DATA;
+ rec->begin = 0;
+ rec->end = 0;
+ *res_numrec = 0;
+ return (rec);
+ }
+
+ for (hole_pos = data_pos = pos = 0LL ; pos < st.st_size ; )
+ {
+ data_pos = lseek(fd, pos, SEEK_DATA);
+ D(saved_errno=errno;(void)printf("# data pos = %8ld\n", data_pos);errno=saved_errno);
+ if (data_pos < 0)
+ {
+ if (errno == ENXIO)
+ {
+ /* final data block */
+ }
+ else
+ {
+ free(rec);
+ return (NULL);
+ }
+ }
+
+ hole_pos = lseek(fd, pos, SEEK_HOLE);
+ D(saved_errno=errno;(void)printf("# hole pos = %8ld\n", hole_pos);errno=saved_errno);
+ if (hole_pos < 0)
+ {
+ if (errno == ENXIO)
+ {
+ /* final hole block */
+ }
+ else
+ {
+ free(rec);
+ return (NULL);
+ }
+ }
+
+ if (data_pos == pos)
+ {
+ D((void)printf("#data from %8ld to %8ld (size %8ld)\n",
+ data_pos, hole_pos, (hole_pos - data_pos)));
+ pos = hole_pos;
+
+ rec = realloc(rec, sizeof(sparsefiledatarec)*(numrec+1));
+ if (!rec)
+ return (NULL);
+ rec[numrec].type = SPFDREC_DATA;
+ rec[numrec].begin = data_pos;
+ rec[numrec].end = hole_pos;
+ numrec++;
+ }
+ else if (hole_pos == pos)
+ {
+ D((void)printf("#hole from %8ld to %8ld (size %8ld)\n",
+ hole_pos, data_pos, (data_pos - hole_pos)));
+ pos = data_pos;
+
+ rec = realloc(rec, sizeof(sparsefiledatarec)*(numrec+1));
+ if (!rec)
+ return (NULL);
+ rec[numrec].type = SPFDREC_HOLE;
+ rec[numrec].begin = hole_pos;
+ rec[numrec].end = data_pos;
+ numrec++;
+ }
+ else
+ {
+ free(rec);
+ return (NULL);
+ }
+ }
+
+ *res_numrec = numrec;
+
+ return (rec);
+}
+#endif /* SPARSEFILE_SUPPORT */
+
+
/*
* visit a single file and state.op to the destination
*/
@@ -605,6 +768,19 @@
}
else if (rfd >= 0)
{
+#ifdef SPARSEFILE_SUPPORT
+ sparsefiledatarec *sprec;
+ ssize_t spnumrec = 0L;
+ sprec = sparsefile_enumerate_holes(rfd, &spnumrec);
+ if (lseek(rfd, 0LL, SEEK_SET) < 0)
+ {
+ error(ERROR_SYSTEM|2, "%s: %s read stream seek error", ent->fts_path, state->path);
+ close(rfd);
+ close(wfd);
+ return 0;
+ }
+#endif /* SPARSEFILE_SUPPORT */
+
if (!(ip = sfnew(NiL, NiL, SF_UNBOUND, rfd, SF_READ)))
{
error(ERROR_SYSTEM|2, "%s: %s read stream error", ent->fts_path, state->path);
@@ -612,7 +788,20 @@
close(wfd);
return 0;
}
- if (!(op = sfnew(NiL, NiL, SF_UNBOUND, wfd, SF_WRITE)))
+ if (!(op = sfnew(NiL, NiL, SF_UNBOUND, wfd, SF_WRITE
+#ifdef SPARSEFILE_SUPPORT
+ /*
+ * Use real |SEEK_HOLE|/|SEEK_DATA| support if we have
+ * it and don't try to turn innocent '\0'-byte
+ * sequences into holes (which can corrupt databases,
+ * simulations or special boot binaries among other
+ * things). In the future we may have an option which
+ * selects the mode (default should be
+ * |SEEK_HOLE|/|SEEK_DATA|))
+ */
+ |SF_WHOLE
+#endif
+ )))
{
error(ERROR_SYSTEM|2, "%s: %s write stream error", ent->fts_path, state->path);
close(wfd);
@@ -620,10 +809,58 @@
return 0;
}
n = 0;
- if (sfmove(ip, op, (Sfoff_t)SF_UNBOUND, -1) < 0)
- n |= 3;
- if (!sfeof(ip))
- n |= 1;
+#ifdef SPARSEFILE_SUPPORT
+ if (sprec)
+ {
+ ssize_t i;
+
+ for (i=0 ; (i < spnumrec) && (n == 0) ; i++)
+ {
+ Sfoff_t movesize = sprec[i].end - sprec[i].begin;
+ switch(sprec[i].type)
+ {
+ case SPFDREC_DATA:
+ /*
+ * fixme: |sfmove()| seems to optimise
+ * longer sequences of '\0' away and
+ * turns them into holes, too... this
+ * MUST not happen with native
+ * |SEEK_HOLE|/|SEEK_DATA|
+ * support
+ */
+ if (sfmove(ip, op, movesize, -1) < 0)
+ n |= 3;
+ break;
+ case SPFDREC_HOLE:
+ if (sfseek(ip, movesize, SEEK_CUR) < 0)
+ n |= 1;
+ if (sfseek(op, movesize, SEEK_CUR) < 0)
+ n |= 2;
+ break;
+ }
+ }
+
+ /*
+ * Just seeking to a new position does not set
+ * the sfio-internal eof flag. If the file
+ * ends with a hole we explicitly have to read
+ * something to get the EOF (or not)
+ */
+ if ((n == 0) && (sfgetc(ip) != EOF))
+ {
+ n |= 1;
+ }
+
+ free(sprec);
+ }
+ else
+#endif /* SPARSEFILE_SUPPORT */
+ {
+ if (sfmove(ip, op, (Sfoff_t)SF_UNBOUND, -1) < 0)
+ n |= 3;
+ if (!sfeof(ip))
+ n |= 1;
+ }
if (sfsync(op) || state->sync && fsync(wfd) || sfclose(op))
n |= 2;
if (sfclose(ip))
Irek Szczesniak
2013-10-02 15:37:36 UTC
Post by Roland Mainz
It turns out there is a sfio flag called |SF_WHOLE| to disable the
"turn zero bytes into holes" functionality...
... attached (as "astksh20130926_sparsefile_cp002.diff.txt") is a
patch which does exactly that...
... AFAIK the only open question before committing it to ast-ksh is the
--sparse option and how to name the keys for it...

We've tested your patch with ORACLE databases and your own software
which creates sparse files and I'd say I'm impressed. It works better
than expected and casts a bad light on other cp(1) versions - it looks
so "simple" and yet nobody has implemented it for cp(1).

What I dislike is having an option to "invent" holes out of zero-byte
sequences. That is quite a dangerous way of blowing things up badly
and should be done in a separate utility.

Irek
