Discussion:
[ast-developers] [patch] Prototype flag for /dev/file@... to pass extra flags to |open()|&&co. ...
Roland Mainz
2013-08-09 05:17:07 UTC
Permalink
Hi!

----

Attached (as "astksh20130807_devfile001.diff.txt") is a prototype
patch for ast-ksh.2013-08-07 which adds a new "virtual device" called
/dev/file at ... which adds the ability to pass extra |O_*| flags to
open(2).

Example usage:
-- snip --
$ ksh -c 'redirect {n}</dev/file at directory/etc/profile '
/bin/ksh: /dev/file at directory/etc/profile: cannot open [Not a directory]
-- snip --
This error happens because /dev/file at directory sets the |O_DIRECTORY|
flag for the following path

** Notes:
* The code explicitly sits in libast only to make sure _any_ libast
consumer can use it (main consumers are going to be the CERN and GE
Healthcare camps who crave for |O_DIRECT| to disable kernel-buffering
for some I/O applications)

* The naming scheme was inspired by ksh93's /dev/tcp&&co. and Solaris
devfs (see http://docs.oracle.com/cd/E19963-01/html/819-3159/fgomr.html)
which uses complex expressions with ',', ':', '@' to describe
physical/virtual devices/device drivers under /devices

* I've avoided using other characters than ',', ':', '@' for the same
reasons as Sun's engineers avoided them in Solaris devfs... some of
them like '#' are toxic to some shells, blanks/whitespaces don't mix
with them either

* I've explicitly used /dev/file at ... as basis because using any
relative pathname doesn't work because somehow the attributes need to
be specified.

* Filenames/naming scheme alternatives - what does not work:
- The portable file name character set allows a lot of characters
- Looking at the RPM filename database even more exotic things like
"filename(attr)" or filename{attr}" are already in common use on
Linux.
- "foo;attr" doesn't work either since filename;string" is more or
less reserved for filesystems which do versioning (see VMS, Solaris's
samfs+CD and ISO9660 used on CDROMs).
- The final stroke on that idea comes from Windows where users and
applications use almost every sane/insane combination of characters
from the whole Unicode range except '\0'.

** Todo:
1. Discuss
2. Test SIGPOLL
3. Fix any issues
4. Make sure something like /dev/file at async,nonblock/dev/fd/19 works,
too (which means the matching lookup must be done recursive).

----

Bye,
Roland
--
__ . . __
(o.\ \/ /.o) roland.mainz at nrubsig.org
\__\/\/__/ MPEG specialist, C&&JAVA&&Sun&&Unix programmer
/O /==\ O\ TEL +49 641 3992797
(;O/ \/ \O;)
-------------- next part --------------
diff -r -u original/src/lib/libast/include/ast.h build_sigfixes/src/lib/libast/include/ast.h
--- src/lib/libast/include/ast.h 2013-06-27 16:03:06.000000000 +0200
+++ src/lib/libast/include/ast.h 2013-08-09 02:47:25.119595367 +0200
@@ -126,6 +126,7 @@
{
int fd;
pid_t pid;
+ int oflags;
Pathpart_t prot;
Pathpart_t host;
Pathpart_t port;
diff -r -u original/src/lib/libast/path/pathcanon.c build_sigfixes/src/lib/libast/path/pathcanon.c
--- src/lib/libast/path/pathcanon.c 2013-08-06 21:12:09.000000000 +0200
+++ src/lib/libast/path/pathcanon.c 2013-08-09 06:38:43.460376644 +0200
@@ -117,7 +117,12 @@
else if (!size)
size = strlen(path);
if (dev)
+ {
+ memset(dev, 0, sizeof(*dev));
dev->path.offset = 0;
+ dev->oflags = 0;
+ dev->fd = -1;
+ }
else
dev = &nodev;
if (*path == '/')
@@ -264,6 +269,69 @@
dev->prot.offset = 0;
}
}
+ else if (size > 6 && s[0] == 'f' && s[1] == 'i' && s[2] == 'l' && s[3] == 'e' && s[4] == '@')
+ {
+ char *param = &s[5];
+ const char *paramend = strchr(param, '/');
+ const char *tok;
+
+ if (!paramend)
+ {
+ errno = EINVAL;
+ return(NULL);
+ }
+
+ for (tok = param;
+ tok && ((paramend - tok) > 0L);
+ tok = strchr(tok, ','))
+ {
+ while(*tok == ',')
+ tok++;
+
+ errno = 0;
+
+#define MATCH_O_FLAG(s, name, nsize) \
+ ((!strncmp((s), (name), (nsize))) && (s[(nsize)]==',' || s[(nsize)]=='/'))
+ if (MATCH_O_FLAG(tok, "directory", 9))
+#ifdef O_DIRECTORY
+ dev->oflags |= O_DIRECTORY;
+#else
+ errno = ENOTSUP;
+#endif
+ else if (MATCH_O_FLAG(tok, "nonblock", 8))
+#ifdef O_NONBLOCK
+ dev->oflags |= O_NONBLOCK;
+#else
+ errno = ENOTSUP;
+#endif
+ else if (MATCH_O_FLAG(tok, "async", 5))
+#ifdef O_ASYNC
+ dev->oflags |= O_ASYNC;
+#else
+ errno = ENOTSUP;
+#endif
+ else if (MATCH_O_FLAG(tok, "sync", 4))
+#ifdef O_SYNC
+ dev->oflags |= O_SYNC;
+#else
+ errno = ENOTSUP;
+#endif
+ else if (MATCH_O_FLAG(tok, "direct", 6))
+#ifdef O_DIRECT
+ dev->oflags |= O_DIRECT;
+#else
+ errno = ENOTSUP;
+#endif
+ else
+ errno = EINVAL;
+
+ if (errno)
+ return (NULL);
+ }
+
+ r = paramend;
+ canon = NULL;
+ }
}
if (r)
{
diff -r -u original/src/lib/libast/path/pathopen.c build_sigfixes/src/lib/libast/path/pathopen.c
--- src/lib/libast/path/pathopen.c 2013-08-06 21:14:58.000000000 +0200
+++ src/lib/libast/path/pathopen.c 2013-08-09 06:47:16.729061537 +0200
@@ -163,6 +163,8 @@
b = canon ? canon : (char*)path;
if (pathdev(path, canon, size, flags, &dev) && dev.path.offset)
{
+ oflags |= dev.oflags;
+
if (dev.fd >= 0)
{
if (dev.pid > 0 && dev.pid != getpid())
@@ -323,10 +325,33 @@
return fd;
}
}
+ else
+ {
+ dev.path.offset = 0;
+ }
+
if (flags & PATH_DEV)
{
errno = ENODEV;
return -1;
}
- return openat(fd, b, oflags, mode);
+
+ fd = openat(fd, b + dev.path.offset, oflags, mode);
+ if ((fd >= 0) && (oflags & O_ASYNC))
+ {
+#if defined(O_ASYNC)
+ /*
+ * open(2) on Linux says:
+ * Currently, it is not possible to enable signal-driven I/O
+ * by specifying O_ASYNC when calling open(); use fcntl(2) to
+ * enable this flag.
+ * grrr...
+ */
+
+ oflags = fcntl(fd, F_GETFL, 0);
+ (void)fcntl(fd, F_SETFL, oflags|O_ASYNC);
+#endif
+ (void)fcntl(fd, F_SETOWN, getpid());
+ }
+ return (fd);
}
Glenn Fowler
2013-08-09 16:58:44 UTC
Permalink
this will be in the next alpha as
/dev/fcntl at ...
because it works on all types of paths, even /dev/fd

the patch happened quickly because the background work in pathdev() and
syscall restart intercepts (both part of the stability/debugging flurry this summer)
were already in place

it handles relative paths like this
/dev/fcntl at direct/./file_in_pwd

recursion not needed, pathdev() just loops back after peeling off /dev/fcntl at ... and setting dev.oflags
the O_ASYNC checks were moved to the ast_openat() intercept

one question is should /dev/fcntl at .../dev/fd/N be allowed to modify fd N?
the current implementation does allow this

a few points about ast patches
always add new struct members to the end of the struct, just before _FOO_PRIVATE_ members if any
this will maintain binary compatibility for the apis that return struct to the caller
this is really important for maintaining integrity in shared lib builds
(in particular, dev.oflags should be at the end)

background: the ast build farm is bleeding edge -- the next build uses the results of the previous build
this includes ksh and nmake -- its sometimes crazy but has served well as the first regression test

similarly (and not in this patch) never change macro bitmap values
always grab new high bits
even if visual sorting of the macro names/values goes out of whack

sometimes this is esthetically hard to do when members/macros seem to have natural groupings

finally, and this relates to the "ast background work already in place comment"
there are many patches that come close to replicating features already in ast
although some of the patches may be simple and non-intrusive
they require some thought to make sure they fit in with what's already in place
sometimes this means tweaking the ast api just a bit to get the functionality
and maybe with those tweaks a big patch can change into an api veneer over the ast api
all this takes thought to do it right
it doesn't mean a delayed patch is wrong or incorrect, but it may just need a big picture
analysis before being taken
like for instance the readlink(1) proposal -- its based on a stat(1) command that is already
covered in a more natural/readable way by ast ls(1) and tw(1)
if we take readlink(1) as it is -- what are we trying to do? emulate every api, good or bad
although there are bad apis in ast, the march forward is to make them better and not
bloat with additional bad apis
the thought that needs to go into readlink(1) and similar builtins is:
maybe make ls(1) a builtin (possibly with a few more %(FIELDS)) and make readlink(1) and friends
aliases or functions or scripts on top of ls

I guess this ramble boils down to -- just because a patch works doesn't make it right
--089e013a25e2d3e60a04e37ce281
Content-Type: text/plain; charset=ISO-8859-1
Hi!
----
Attached (as "astksh20130807_devfile001.diff.txt") is a prototype
patch for ast-ksh.2013-08-07 which adds a new "virtual device" called
/dev/file at ... which adds the ability to pass extra |O_*| flags to
open(2).
-- snip --
$ ksh -c 'redirect {n}</dev/file at directory/etc/profile '
/bin/ksh: /dev/file at directory/etc/profile: cannot open [Not a directory]
-- snip --
This error happens because /dev/file at directory sets the |O_DIRECTORY|
flag for the following path
* The code explicitly sits in libast only to make sure _any_ libast
consumer can use it (main consumers are going to be the CERN and GE
Healthcare camps who crave for |O_DIRECT| to disable kernel-buffering
for some I/O applications)
* The naming scheme was inspired by ksh93's /dev/tcp&&co. and Solaris
devfs (see http://docs.oracle.com/cd/E19963-01/html/819-3159/fgomr.html)
physical/virtual devices/device drivers under /devices
reasons as Sun's engineers avoided them in Solaris devfs... some of
them like '#' are toxic to some shells, blanks/whitespaces don't mix
with them either
* I've explicitly used /dev/file at ... as basis because using any
relative pathname doesn't work because somehow the attributes need to
be specified.
- The portable file name character set allows a lot of characters
- Looking at the RPM filename database even more exotic things like
"filename(attr)" or filename{attr}" are already in common use on
Linux.
- "foo;attr" doesn't work either since filename;string" is more or
less reserved for filesystems which do versioning (see VMS, Solaris's
samfs+CD and ISO9660 used on CDROMs).
- The final stroke on that idea comes from Windows where users and
applications use almost every sane/insane combination of characters
from the whole Unicode range except '\0'.
1. Discuss
2. Test SIGPOLL
3. Fix any issues
4. Make sure something like /dev/file at async,nonblock/dev/fd/19 works,
too (which means the matching lookup must be done recursive).
Irek Szczesniak
2013-08-10 15:43:25 UTC
Permalink
Post by Glenn Fowler
this will be in the next alpha as
/dev/fcntl at ...
because it works on all types of paths, even /dev/fd
the patch happened quickly because the background work in pathdev() and
syscall restart intercepts (both part of the stability/debugging flurry this summer)
were already in place
it handles relative paths like this
/dev/fcntl at direct/./file_in_pwd
I don't like the name. It implies that all fcntl operations are
possible which is not the case. For example, how do you pass the
return value of fcntl() back to the user and what does it mean? How do
you implement locks, shares, leases with /dev/fcntl? The name is
misleading.

Irek
Glenn Fowler
2013-08-10 15:47:09 UTC
Permalink
Post by Irek Szczesniak
Post by Glenn Fowler
this will be in the next alpha as
/dev/fcntl at ...
because it works on all types of paths, even /dev/fd
the patch happened quickly because the background work in pathdev() and
syscall restart intercepts (both part of the stability/debugging flurry this summer)
were already in place
it handles relative paths like this
/dev/fcntl at direct/./file_in_pwd
I don't like the name. It implies that all fcntl operations are
possible which is not the case. For example, how do you pass the
return value of fcntl() back to the user and what does it mean? How do
you implement locks, shares, leases with /dev/fcntl? The name is
misleading.
fair enough
how about "oflags"
Irek Szczesniak
2013-08-10 16:00:17 UTC
Permalink
Post by Glenn Fowler
Post by Irek Szczesniak
Post by Glenn Fowler
this will be in the next alpha as
/dev/fcntl at ...
because it works on all types of paths, even /dev/fd
the patch happened quickly because the background work in pathdev() and
syscall restart intercepts (both part of the stability/debugging flurry this summer)
were already in place
it handles relative paths like this
/dev/fcntl at direct/./file_in_pwd
I don't like the name. It implies that all fcntl operations are
possible which is not the case. For example, how do you pass the
return value of fcntl() back to the user and what does it mean? How do
you implement locks, shares, leases with /dev/fcntl? The name is
misleading.
fair enough
how about "oflags"
apropos oflag yields nothing. /dev/file was better than that one.
/dev/open would be a choice but that's already taken:
ls -l /dev/open
lrwxrwxrwx 1 root root 46 Apr 5 2010 /dev/open ->
../../devices/pci at 0,0/pci1022,1102 at 18,2:mc-open

Irek
Roland Mainz
2013-08-10 21:12:02 UTC
Permalink
Post by Irek Szczesniak
Post by Glenn Fowler
Post by Irek Szczesniak
Post by Glenn Fowler
this will be in the next alpha as
/dev/fcntl at ...
because it works on all types of paths, even /dev/fd
the patch happened quickly because the background work in pathdev() and
syscall restart intercepts (both part of the stability/debugging flurry this summer)
were already in place
it handles relative paths like this
/dev/fcntl at direct/./file_in_pwd
I don't like the name. It implies that all fcntl operations are
possible which is not the case. For example, how do you pass the
return value of fcntl() back to the user and what does it mean? How do
you implement locks, shares, leases with /dev/fcntl? The name is
misleading.
fair enough
how about "oflags"
apropos oflag yields nothing. /dev/file was better than that one.
ls -l /dev/open
lrwxrwxrwx 1 root root 46 Apr 5 2010 /dev/open ->
../../devices/pci at 0,0/pci1022,1102 at 18,2:mc-open
Grumpf... that's why I proposed /dev/file@ ... all other similar stuff
in /dev is already used-up by Solaris or Linux devices. Crazy enough
there is a /dev/ioctl but no /dev/fcntl ... but I don't like
/dev/fcntl very much because the *only* purpose is to open _files_
with extra flags (remember the only reason we have this was to get
access to |O_DIRECT| and |O_SYNC| for HPC (=high performance
computing) purposes to circumvent/bypass limitations in other
applications which don't have special flags to use this flags
directly.
|O_DIRECTORY| was later tacked-on because I tought it's usefull and
prevents mishaps in the #ifdef logic of the _original_ prototype patch
(before I added |ENOSYS|/|EINVAL| later) since it assumed that at
least one |O_*| flag must be there.

Can we stick with /dev/file@, please ?

----

Bye,
Roland
--
__ . . __
(o.\ \/ /.o) roland.mainz at nrubsig.org
\__\/\/__/ MPEG specialist, C&&JAVA&&Sun&&Unix programmer
/O /==\ O\ TEL +49 641 3992797
(;O/ \/ \O;)
Glenn Fowler
2013-08-12 14:29:24 UTC
Permalink
Post by Roland Mainz
Post by Irek Szczesniak
Post by Glenn Fowler
Post by Irek Szczesniak
Post by Glenn Fowler
this will be in the next alpha as
/dev/fcntl at ...
because it works on all types of paths, even /dev/fd
the patch happened quickly because the background work in pathdev() and
syscall restart intercepts (both part of the stability/debugging flurry this summer)
were already in place
it handles relative paths like this
/dev/fcntl at direct/./file_in_pwd
I don't like the name. It implies that all fcntl operations are
possible which is not the case. For example, how do you pass the
return value of fcntl() back to the user and what does it mean? How do
you implement locks, shares, leases with /dev/fcntl? The name is
misleading.
fair enough
how about "oflags"
apropos oflag yields nothing. /dev/file was better than that one.
ls -l /dev/open
lrwxrwxrwx 1 root root 46 Apr 5 2010 /dev/open ->
../../devices/pci at 0,0/pci1022,1102 at 18,2:mc-open
in /dev is already used-up by Solaris or Linux devices. Crazy enough
there is a /dev/ioctl but no /dev/fcntl ... but I don't like
/dev/fcntl very much because the *only* purpose is to open _files_
with extra flags (remember the only reason we have this was to get
access to |O_DIRECT| and |O_SYNC| for HPC (=high performance
computing) purposes to circumvent/bypass limitations in other
applications which don't have special flags to use this flags
directly.
|O_DIRECTORY| was later tacked-on because I tought it's usefull and
prevents mishaps in the #ifdef logic of the _original_ prototype patch
(before I added |ENOSYS|/|EINVAL| later) since it assumed that at
least one |O_*| flag must be there.
fine
Roland Mainz
2013-08-10 21:22:08 UTC
Permalink
Post by Glenn Fowler
this will be in the next alpha as
/dev/fcntl at ...
See http://lists.research.att.com/pipermail/ast-developers/2013q3/003070.html
... last sentence...
Post by Glenn Fowler
because it works on all types of paths, even /dev/fd
the patch happened quickly because the background work in pathdev() and
syscall restart intercepts (both part of the stability/debugging flurry this summer)
were already in place
it handles relative paths like this
/dev/fcntl at direct/./file_in_pwd
Mhhh... I would prefer to use
/dev/fcntl at direct/dev/fd/AT_CWD/file_in_pwd ... but that would require
to teach /dev/fd/ to honor "AT_CWDFD" as input value...
Post by Glenn Fowler
recursion not needed, pathdev() just loops back after peeling off /dev/fcntl at ... and setting dev.oflags
the O_ASYNC checks were moved to the ast_openat() intercept
Ok...
... did you move the |fcntl()| calls for Linux there, too ?
Post by Glenn Fowler
one question is should /dev/fcntl at .../dev/fd/N be allowed to modify fd N?
the current implementation does allow this
Erm... how should that work ?
Post by Glenn Fowler
a few points about ast patches
always add new struct members to the end of the struct, just before _FOO_PRIVATE_ members if any
this will maintain binary compatibility for the apis that return struct to the caller
this is really important for maintaining integrity in shared lib builds
(in particular, dev.oflags should be at the end)
Erm... I put the flag near the field where it logically belongs to
because we're still in alpha mode... right ? :-)

[snip]
Post by Glenn Fowler
like for instance the readlink(1) proposal -- its based on a stat(1) command that is already
covered in a more natural/readable way by ast ls(1) and tw(1)
if we take readlink(1) as it is -- what are we trying to do?
See below... readlink(1) has a special purpose...
Post by Glenn Fowler
emulate every api, good or bad
although there are bad apis in ast, the march forward is to make them better and not
bloat with additional bad apis
maybe make ls(1) a builtin
... that would be a very good idea because "ls" usually ranks high in
the logs of a system's accounting system
Post by Glenn Fowler
(possibly with a few more %(FIELDS)) and make readlink(1) and friends
aliases or functions or scripts on top of ls
Erm... the readlink(1) patch was there because there is the (urgend)
demand for it to make it possible that ksh93(1) can replace busybox
... the issue is that at system boot some weired and unusual stuff has
to be done... like booting from a so-called "miniroot" (e.g. UFS
filesystem filled with a mini-system loaded over tftp which then
mounts the real root filesystem (ZFS-, NFS-, UFS-based etc.) later).
At a certain point you need to process a set of symlinks _without_
having a '/' filesystem (don't ask... it only works with busybox
because readlink(1) and mount(1m) are shell builtins).

----

Bye,
Roland
--
__ . . __
(o.\ \/ /.o) roland.mainz at nrubsig.org
\__\/\/__/ MPEG specialist, C&&JAVA&&Sun&&Unix programmer
/O /==\ O\ TEL +49 641 3992797
(;O/ \/ \O;)
Glenn Fowler
2013-08-12 14:40:31 UTC
Permalink
Post by Roland Mainz
Post by Glenn Fowler
this will be in the next alpha as
/dev/fcntl at ...
See http://lists.research.att.com/pipermail/ast-developers/2013q3/003070.html
... last sentence...
Post by Glenn Fowler
because it works on all types of paths, even /dev/fd
the patch happened quickly because the background work in pathdev() and
syscall restart intercepts (both part of the stability/debugging flurry this summer)
were already in place
it handles relative paths like this
/dev/fcntl at direct/./file_in_pwd
Mhhh... I would prefer to use
/dev/fcntl at direct/dev/fd/AT_CWD/file_in_pwd ... but that would require
to teach /dev/fd/ to honor "AT_CWDFD" as input value...
Post by Glenn Fowler
recursion not needed, pathdev() just loops back after peeling off /dev/fcntl at ... and setting dev.oflags
the O_ASYNC checks were moved to the ast_openat() intercept
Ok...
... did you move the |fcntl()| calls for Linux there, too ?
yes
Post by Roland Mainz
Post by Glenn Fowler
one question is should /dev/fcntl at .../dev/fd/N be allowed to modify fd N?
the current implementation does allow this
Erm... how should that work ?
right, it can't really because the desired semantics for open(/dev/fd/N) is to get
a new file structure (i.e., separate seek pointer, flags)
ast only falls back to dup() semantics when it can't do it any other way
Post by Roland Mainz
Post by Glenn Fowler
a few points about ast patches
always add new struct members to the end of the struct, just before _FOO_PRIVATE_ members if any
this will maintain binary compatibility for the apis that return struct to the caller
this is really important for maintaining integrity in shared lib builds
(in particular, dev.oflags should be at the end)
Erm... I put the flag near the field where it logically belongs to
because we're still in alpha mode... right ? :-)
fair assumption
but alpha is relative
once things get burned into shared libs it gets messy
in particular ast not being able able to re-build itself
Post by Roland Mainz
[snip]
Post by Glenn Fowler
like for instance the readlink(1) proposal -- its based on a stat(1) command that is already
covered in a more natural/readable way by ast ls(1) and tw(1)
if we take readlink(1) as it is -- what are we trying to do?
See below... readlink(1) has a special purpose...
Post by Glenn Fowler
emulate every api, good or bad
although there are bad apis in ast, the march forward is to make them better and not
bloat with additional bad apis
maybe make ls(1) a builtin
... that would be a very good idea because "ls" usually ranks high in
the logs of a system's accounting system
Post by Glenn Fowler
(possibly with a few more %(FIELDS)) and make readlink(1) and friends
aliases or functions or scripts on top of ls
Erm... the readlink(1) patch was there because there is the (urgend)
demand for it to make it possible that ksh93(1) can replace busybox
... the issue is that at system boot some weired and unusual stuff has
to be done... like booting from a so-called "miniroot" (e.g. UFS
filesystem filled with a mini-system loaded over tftp which then
mounts the real root filesystem (ZFS-, NFS-, UFS-based etc.) later).
At a certain point you need to process a set of symlinks _without_
having a '/' filesystem (don't ask... it only works with busybox
because readlink(1) and mount(1m) are shell builtins).
as long as the stat(1) part of readlink(1) doesn't creep in I believe we can
(1) make ls a builtin
(2) make readlink a builtin-in alias of 'ls --format="%(some-new-ast-id)s"'

so now mount has to be a builtin too?

Loading...