Discussion:
[ast-developers] syntax error when using "[()]" as awk field separator
lijo george
2015-05-17 21:23:09 UTC
I observed a syntax error with a script in ksh versions starting from
2012-08-01.
A minimal test case is given below.
Basically if the pattern "[()]" is used as field separator in awk, it
throws a syntax error.

# ksh --version
version sh (AT&T Research) 93u+ 2012-08-01
# cat test-simple.ksh
NAWK=/usr/bin/nawk
TEST="abc(def)g)"
A="`echo $TEST | $NAWK '
BEGIN{
FS = "[()]"
}
/^a/ {
print $1
}
END{
#print $2
}'`"
print $A
# ksh test-simple.ksh
test-simple.ksh: line 2: syntax error at line 5: `(' unexpected


This used to work in the 2011-02-08 version.
Is this a bug or expected behaviour?

From the sh_lex function in the lex.c file, I can see that there is a check
added for the pattern "[(" given below.
case S_EPAT:
epat:
if(fcgetc(n)==LPAREN && c!='[')

But in the earlier version (2011-02-08), the corresponding check was

case S_EPAT:
epat:
if(fcgetc(n)==LPAREN)


So I guess the extra check was added for some specific reason. In that
case, could someone please let me know why it was added and why the
syntax error is thrown?

Thanks,
Lijo
Terrence J. Doyle
2015-05-19 16:05:54 UTC
Actually, I'm surprised this ever worked. The quotes on line 5 of your
script should be escaped, like this:

FS = \"[()]\"

Without the escapes ksh matches the double-quote after A= with the first
one on line 5. So, the characters, [()], are unquoted (in both ksh and
awk). Then, quoting is re-initiated with the second double-quote on line 5.

However, I did try your script with ksh version: Version M 1993-12-28
s+. It did indeed succeed without the escaped double-quotes. Then, I
reran the script with line 5 modified like this:

FS = "()"

With that the older version of ksh failed with the error:

line 2: syntax error at line 5: `(' unexpected

With line 5 modified as:

FS = \"()\"

The script once again succeeded (albeit with a different answer because
awk's FS had a different value).

These variations seem to indicate that in older versions of ksh square
brackets could be used as a quasi-quoting mechanism. But, as far as I
know, this /feature/ was not documented and should not have been
relied upon. Escaping the double-quotes on line 5 is the way to go.

Terrence Doyle
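
Editorial sketch, not part of the original thread: besides escaping the double-quotes, one common way to sidestep the nested-quote problem described above is to use the $(...) form of command substitution instead of backquotes, since the text inside $(...) is parsed with its own quoting context. The version below is a minimal sketch under stated assumptions: plain `awk` stands in for /usr/bin/nawk, and printf replaces echo. Neither choice comes from the original script.

```shell
# Hypothetical rewrite of the test case; plain "awk" stands in for /usr/bin/nawk.
TEST='abc(def)g)'

# $(...) starts a fresh quoting context, so the double quotes inside the
# single-quoted awk program never pair with the double quote after A=.
A="$(printf '%s\n' "$TEST" | awk '
BEGIN {
    FS = "[()]"
}
/^a/ {
    print $1
}')"

printf '%s\n' "$A"
```

With FS set to the bracket expression [()], the input splits on either parenthesis, so the first field of the sample string is abc.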
_______________________________________________
ast-developers mailing list
http://lists.research.att.com/mailman/listinfo/ast-developers
lijo george
2015-05-22 12:10:40 UTC
Thanks for the clarification. So can I assume that the earlier behaviour
was unintended and that the current behaviour is the right one, i.e. that
the syntax error is valid?

The reason I'm asking is that there are some old scripts with this syntax
which used to work in the 2011 version of ksh. So if this is not a bug, I
guess I'll need to fix those scripts rather than wait for ksh to be fixed.

Terrence J. Doyle
2015-05-27 15:32:01 UTC
As far as I know the old behavior was unintended and incorrect, and the
current behavior is correct.

Terrence Doyle