[ast-developers] [BUG] ${.sh.subshell} broken

Martijn Dekker

2016-10-17 19:40:09 UTC

I know there are no active developers at this point, but here's one for
the archives in case there are active developers in the future.

The ${.sh.subshell} (subshell level) variable is broken:

case $(echo ${.sh.subshell} 1>&2
echo ${.sh.subshell} 1>&2
echo ${.sh.subshell}) in
1) echo ok ;;
esac

This should output '1' twice and then 'ok'. Instead it outputs '1' and
then '0'.

This means ${.sh.subshell} is reset after the first access, so only the
first access is correct.

Interestingly, this all works fine on an ancient ksh93, at least "Version
M 1993-12-28 r" as installed on sdf-eu.org. But versions from at least
2010 onward all have this problem.

Since ksh93 uniquely does subshells without forking, the canonical
method involving the comparison of $$ with $(exec sh -c 'echo $PPID')
doesn't work either, as subshells don't get separate PIDs.
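
For reference, here is a minimal sketch of that PID comparison as it would
look on shells that do fork their subshells. The function name
"insubshell_by_pid" is made up for illustration and is not taken from any
library. On ksh93 it is not expected to give a useful answer, for the
reason given above.

```shell
# Sketch of the PID-comparison test for shells that fork subshells.
# "insubshell_by_pid" is a hypothetical name used only in this example.
insubshell_by_pid() {
    # $$ keeps the main shell's PID even inside a subshell. The exec'd
    # sh reports its parent, which is the process running this function.
    [ "$$" != "$(exec sh -c 'echo $PPID')" ]
}

if insubshell_by_pid; then
    echo "main: in subshell"
else
    echo "main: not in subshell"
fi

(
    if insubshell_by_pid; then
        echo "parens: in subshell"
    else
        echo "parens: not in subshell"
    fi
)
```

The two calls differ only in whether they run inside parentheses, so on a
forking shell they should disagree.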

As far as I can tell, that leaves ksh93 without *any* reliable way of
figuring out if you're in a subshell.

I need this capability for my shell library to support ksh93. If anyone
can think of any workarounds, please let me know.

Many thanks,

- Martijn

Martijn Dekker

2016-10-17 20:58:43 UTC

Post by Martijn Dekker
I need this capability for my shell library to support ksh93. If anyone
can think of any workarounds, please let me know.

Of course I can't just let stuff like this go...

The bug occurs under really bizarrely specific circumstances.

This works fine:

$ (echo ${.sh.subshell}; echo ${.sh.subshell})
1
1

This works fine:

$ echo $( (echo ${.sh.subshell}; echo ${.sh.subshell}) )
2 2

But look what happens if you add a redirection, even a no-op one like 1>&1:

$ echo $( (echo ${.sh.subshell} 1>&1; echo ${.sh.subshell} 1>&1) )
2 0

And:

$ echo $( (echo ${.sh.subshell} 1>&2; echo ${.sh.subshell} 1>&2) )
2
0

BUT this works fine again:

$ echo $( (echo ${.sh.subshell} 1>&2; echo ${.sh.subshell} 1>&2) 2>&1 )
2 2

Also, with subshells that are not command substitutions, I can't seem to
trigger the bug at all:

$ (echo ${.sh.subshell} 1>&1; echo ${.sh.subshell} 1>&1)
1
1
$ ( (echo ${.sh.subshell} 1>&1; echo ${.sh.subshell} 1>&1) )
2
2

So it looks like this bug is triggered under some kind of bizarre
combination of circumstances related to command substitution and output
redirection.

- Martijn

Martijn Dekker

2016-10-17 22:15:54 UTC

Post by Martijn Dekker
$ echo $( (echo ${.sh.subshell} 1>&1; echo ${.sh.subshell} 1>&1) )
2 0

Turns out you don't actually have to read ${.sh.subshell} twice: it is
the output redirection within a command substitution that kills
${.sh.subshell}. This is finally starting to make some sense now.

$ echo $( (: 1>&1; echo ${.sh.subshell}) )
0

(expected output: 2)

- M.

Martijn Dekker

2016-10-17 23:03:59 UTC

Post by Martijn Dekker
Turns out you don't actually have to read ${.sh.subshell} twice: it is
the output redirection within a command substitution that kills
${.sh.subshell}. This is finally starting to make some sense now.
$ echo $( (: 1>&1; echo ${.sh.subshell}) )
0
(expected output: 2)

Annnd it turns out that ${.sh.subshell} is not read-only. You can
actually assign a value to it! Very strange, but this makes the
workaround obvious: save the value before doing output redirection,
restore it afterwards.

$ echo $( (save=${.sh.subshell}; : 1>&1; .sh.subshell=$save; echo ${.sh.subshell}) )
2
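
The save-and-restore idea can be wrapped in a helper so that it does not
have to be repeated around every redirection. What follows is only a
sketch under assumptions: the names "with_level_preserved", "_has_level",
"_saved_level" and "_status" are invented here, and the ksh93-only syntax
is hidden behind eval so that other shells can still parse the file.

```shell
# Sketch only; all names below are hypothetical.
# Decide once, at top level, whether this shell knows ${.sh.subshell}.
# eval keeps the ksh93-only syntax away from other shells' parsers.
if (eval ': "${.sh.subshell}"') 2>/dev/null; then
    _has_level=y
else
    _has_level=
fi

# Run a command that may redirect output, then restore the saved level.
with_level_preserved() {
    if [ -n "$_has_level" ]; then
        eval '_saved_level=${.sh.subshell}'
        "$@"
        _status=$?
        eval '.sh.subshell=$_saved_level'
        return "$_status"
    fi
    # Shells without ${.sh.subshell}: just run the command.
    "$@"
}

# Example command that writes to standard error.
trace() {
    echo "trace: $*" 1>&2
}

with_level_preserved trace "about to run something"
```

The feature check is done once outside any command substitution because
the check itself uses a redirection.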

In case anyone is interested, here's how I found the bug. First a little
simplified background information.

The cross-platform POSIX shell library I'm developing, "modernish"
<https://github.com/modernish/modernish>, includes a feature for robust
shell programming called "harden" that hardens a command against errors.
It does this by setting a shell function under the command's name that
first runs the real command, then automatically checks its exit status
against a user-specified value indicating a fatal error. This is my
attempt to provide something better than 'set -e' which is fundamentally
flawed.
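
To make the idea concrete, here is a hand-written approximation of what a
hardened 'grep' might look like when any exit status greater than 1 counts
as fatal. It is a sketch, not how 'harden' generates its functions; the
"die" below is a simplified stand-in, and the exit status 128 and the
variable "_grep_status" are arbitrary choices for this example.

```shell
# Simplified stand-in for a fatal-error handler (assumption: it only
# prints a message and exits; subshell handling is left out here).
die() {
    echo "fatal: $*" 1>&2
    exit 128
}

# Hand-written wrapper: run the real grep, then check its exit status.
grep() {
    command grep "$@"       # "command" bypasses this function
    _grep_status=$?
    if [ "$_grep_status" -gt 1 ]; then
        die "grep failed with status $_grep_status"
    fi
    return "$_grep_status"
}

# Statuses 0 (match) and 1 (no match) pass through untouched.
printf 'one\ntwo\n' | grep -c o
```

A missing input file makes the real grep exit with a status above 1, which
is the case the wrapper turns into a fatal error.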

If a fatal error is found, the function set by 'harden' calls another
modernish function, 'die'. Fatal errors should always kill the program,
so this function is designed to reliably halt program execution, even if
the error occurred within a subshell. To do this, "die" first checks if
we're currently in a subshell using a third function called
"insubshell". If we're not in a subshell, it simply exits. If we are, it
sends SIGTERM to the main shell ("$$") and then exits.
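
The logic just described can be sketched as follows. This is an
illustration rather than the library's code: the "insubshell" shown is a
PID-based stand-in that assumes a shell which forks its subshells, and
the exit status 128 is an arbitrary choice.

```shell
# Stand-in subshell check; assumes subshells are forked (not ksh93).
insubshell() {
    [ "$$" != "$(exec sh -c 'echo $PPID')" ]
}

# Sketch of a fatal-error handler that also stops the main shell.
die() {
    echo "fatal: $*" 1>&2
    if insubshell; then
        # $$ is still the main shell's PID inside a subshell.
        kill -s TERM "$$"
    fi
    exit 128    # status chosen arbitrarily for this sketch
}
```

If "insubshell" wrongly reports false, only the subshell exits and the
main program carries on.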

The aforementioned "insubshell" function has several platform-specific
versions; the correct one is automatically detected. For ksh93 I simply
have this:

insubshell() {
    ((.sh.subshell))
}

(returning false if ${.sh.subshell} is zero, true otherwise).

But this makes one of my test scripts fail: the hardened 'grep' (which,
remember, is a shell function calling the real grep) does not kill the
program on error:
https://github.com/modernish/modernish/blob/master/share/doc/modernish/testsuite/harden-test

harden -tp grep '> 1'   # harden and trace grep, whitelisting SIGPIPE
# [...]
print "this file has $(grep -c '.*' /almost/certainly/a/non/existent/file) lines"
print "we should never make it to here, BAD"

On ksh, 'die' fails to kill the main shell and the last line is printed,
because 'insubshell' fails to detect that 'die' is in a subshell.

And now I know why: since 'grep' was hardened with the -t option, the
'grep' shell function traces the command using output redirection before
doing its thing -- and this ksh bug is triggered by using output
redirection within a command substitution. Mystery solved.

Now on to implementing the necessary bug test (BUG_KSHSUBVAR) and
workarounds in modernish, so it once again becomes ksh compatible.
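
A probe for this bug could be built directly on the reduced test case
from earlier in the thread. The sketch below is hypothetical and is not
the actual BUG_KSHSUBVAR test; the names "has_subshell_var_bug" and
"_probe" are invented, and eval is used so that shells without
${.sh.subshell} can still parse the function.

```shell
# Hypothetical probe; not the actual BUG_KSHSUBVAR test.
has_subshell_var_bug() {
    # No ${.sh.subshell} in this shell: nothing to detect.
    (eval ': "${.sh.subshell}"') 2>/dev/null || return 1
    # Same shape as the reduced test case: redirect, then read the level.
    eval '_probe=$( (: 1>&1; echo "${.sh.subshell}") )'
    # Per the thread, the expected value is 2 and affected versions give 0.
    [ "$_probe" = 0 ]
}

if has_subshell_var_bug; then
    echo "subshell level is clobbered by redirection"
else
    echo "bug not detected"
fi
```

The probe is meant to be called from the top level of a script, not from
inside a command substitution, since its own feature check uses a
redirection.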

- M.