[ast-developers] Bug in ~(X) and .sh.match
Dan Shelton
2013-08-20 20:31:38 UTC
I may have found a bug in ~(X). AFAIK the first index (.sh.match[0])
in .sh.match always lists all patterns for which matches have been
found, and all following indexes (.sh.match[1..inf]) store the
captured matches for a specific bracket pair, right?

If that's correct, why does '15' appear in .sh.match[0][0] but no
other match has '15' later?

ksh -c 'x="a15 b2 c3" ; d="${x//~(X)(([[:alnum:]])&([[:digit:]]))+/}" ; print -v .sh.match'
(
    (
        15
        2
        3
    )
    (
        5
        2
        3
    )
    (
        5
        2
        3
    )
    (
        5
        2
        3
    )
)

Dan
Roland Mainz
2013-08-21 13:58:17 UTC
Post by Dan Shelton
I may have found a bug in ~(X). AFAIK the first index (.sh.match[0])
in .sh.match always lists all patterns for which matches have been
found, and all following indexes (.sh.match[1..inf]) store the
captured matches for a specific bracket pair, right?
If that's correct, why does '15' appear in .sh.match[0][0] but no
other match has '15' later?
ksh -c 'x="a15 b2 c3" ; d="${x//~(X)(([[:alnum:]])&([[:digit:]]))+/}"
; print -v .sh.match'
(
(
15
2
3
)
(
5
2
3
)
Erm... the issue happens because the + is outside of any bracket pair.
That means only the last match of each bracket pair (e.g. the '5') is
stored, overwriting the matches captured by earlier iterations of the
same bracket pair, while .sh.match[0][0] in your example stores the
whole string which was matched (and not only the last captured match).

* Hints:
- Use (?:pattern) to turn a bracket pair into a non-capturing bracket
pair. It will logically group stuff together but will not generate an
entry in .sh.match
- It's usually "wise" to put * and + backtracking stuff inside a
bracket pair to prevent brain damage to the programmer... :-)
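
The two hints can be sketched as follows, again in Python's `re` module as a stand-in for ksh93 (an assumption: the `(?:...)` syntax is the same, but the variable names here are purely illustrative and nothing below is ksh93 output).

```python
import re

subject = "a15 b2 c3"

# + outside the bracket pair: the pair is overwritten on every iteration,
# so only the last digit survives in group 1.
outside = re.search(r"([0-9])+", subject)

# + inside a capturing pair, wrapped around a non-capturing (?:...) group:
# group 1 now holds the whole run of digits.
inside = re.search(r"((?:[0-9])+)", subject)

# Non-capturing only: the pattern still groups, but defines no capture groups.
grouping_only = re.compile(r"(?:[0-9])+")

print(outside.group(0), outside.group(1))
print(inside.group(0), inside.group(1))
print(grouping_only.groups)
```

With the + moved inside the capturing pair, the capture and the whole match agree, which is probably what the original pattern was meant to report.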

----

Bye,
Roland
--
__ . . __
(o.\ \/ /.o) roland.mainz at nrubsig.org
\__\/\/__/ MPEG specialist, C&&JAVA&&Sun&&Unix programmer
/O /==\ O\ TEL +49 641 3992797
(;O/ \/ \O;)
Glenn Fowler
2013-08-21 21:35:53 UTC
attached are some ast testregex tests that examine the pattern in your example
in the tests ~(A) is equivalent to ~(X)

testregex is in the ast-open package in src/cmd/re
the man page is online at www.research.att.com/sw/download/

testregex input data documents patterns, subject strings, and the expected outcome,
including sub-group matches

the tests exposed a bug in ast:regsubcomp() that will be fixed in the next alpha
-------------- next part --------------
# regex conjunction tests

A [[:alnum:]]+ a15 b2 c3 (0,3)
A ([[:alnum:]])+ a15 b2 c3 (0,3)(2,3)
A (([[:alnum:]]))+ a15 b2 c3 (0,3)(2,3)(2,3)

A [[:digit:]]+ a15 b2 c3 (1,3)
A ([[:digit:]])+ a15 b2 c3 (1,3)(2,3)
A (([[:digit:]]))+ a15 b2 c3 (1,3)(2,3)(2,3)

K +([[:alnum:]]) a15 b2 c3 (0,3)(0,3)
K +(([[:alnum:]])) a15 b2 c3 (0,3)(0,3)(2,3)

K +([[:digit:]]) a15 b2 c3 (1,3)(1,3)
K +(([[:digit:]])) a15 b2 c3 (1,3)(1,3)(2,3)

# the following group shows the difference between +(...) and (...)+ subgroup reporting

A (([[:alnum:]])&([[:digit:]]))+ a15 b2 c3 (1,3)(2,3)(2,3)(2,3)
K ~(A)(([[:alnum:]])&([[:digit:]]))+ a15 b2 c3 (1,3)(2,3)(2,3)(2,3)

A (?K-a)+(([[:alnum:]])&([[:digit:]])) a15 b2 c3 (1,3)(1,3)(2,3)(2,3)
K +(([[:alnum:]])&([[:digit:]])) a15 b2 c3 (1,3)(1,3)(2,3)(2,3)

# the following group should be the equivalent

A/ /(([[:alnum:]])&([[:digit:]]))+/X/g a15 b2 c3 aX bX cX
K/ /~(A)(([[:alnum:]])&([[:digit:]]))+/X/g a15 b2 c3 aX bX cX # regsubexec() BUG fixed #
A/ /(?K-a)+(([[:alnum:]])&([[:digit:]]))/X/g a15 b2 c3 aX bX cX # regsubexec() BUG fixed #
K/ /+(([[:alnum:]])&([[:digit:]]))/X/g a15 b2 c3 aX bX cX
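
To read the outcome column of the match tests above, it may help to turn the (start,end) pairs back into substrings. The sketch below is a small Python helper written for this purpose. `decode_offsets` is a hypothetical name, not part of testregex, and it assumes the pairs are half-open byte offsets into the subject string and handles only numeric pairs.

```python
import re

def decode_offsets(subject, outcome):
    """Map each (start,end) pair in a testregex-style outcome to the
    substring it denotes. The first pair is the whole match; the
    following pairs are the sub-group matches, in order."""
    pairs = re.findall(r"\((\d+),(\d+)\)", outcome)
    return [subject[int(start):int(end)] for start, end in pairs]

subject = "a15 b2 c3"

# (([[:alnum:]])&([[:digit:]]))+ with the + outside the bracket pairs
plus_outside = decode_offsets(subject, "(1,3)(2,3)(2,3)(2,3)")

# +(([[:digit:]])) in ksh pattern syntax, where +(...) is itself a group
plus_group = decode_offsets(subject, "(1,3)(1,3)(2,3)")

print(plus_outside)
print(plus_group)
```

Decoded this way, the (...)+ form reports '15' only as the whole match, while the +(...) form also reports '15' for its first sub-group.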
