Cedric Blancher
2014-01-14 00:03:25 UTC
Glenn, what does libast do in case of %t?
Ced
Forwarded conversation
Subject: [developer] Webrev for: 3141 strptime() doesn't support %t
------------------------
From: Gary Mills <gary_mills at fastmail.fm>
Date: 13 January 2014 16:16
To: developer at lists.illumos.org
This is for illumos bug `3141 strptime() doesn't support %t'.
http://cr.illumos.org/~webrev/jgmills/3141-1
The first change is addition of a case to handle %n and %t in the
pattern, both of which cause skipping of white space in the buffer.
This change alone is defeated by code at the end of many other cases
that advances the format pointer whenever the first character in the
buffer is a white space character. Strangely, this code operates
contrary to the design of strptime(), where elements of the pattern
drive the buffer scan. Fortunately, it is unnecessary and can be
removed. Other portions of the loop already handle white space. The
final change is the inclusion of leading white space removal for both
%d and %e, in case the values were space-padded, and to align it with
the standard. In general, interpretation of pattern elements in
strptime() should be more lenient than in strftime() so that parsing
is more likely to succeed.
This test shows how the new %t and %n in the strptime() pattern causes
whitespace skipping in the time string. My test program shows the
results of the parse by strptime(). If this succeeds, it displays the
time with the strftime() %+ format.
$ /tmp/tstrpft '%a %b %e %H:%M:%S %Y' 'Fri Jan 3 20:53:10 2014'
Format: %a %b %e %H:%M:%S %Y
Time string: Fri Jan 3 20:53:10 2014
Fully parsed: Fri Jan 3 20:53:10 2014
Result: 3 January, 2014 08:53:10 PM CST
$ /tmp/tstrpft '%a %b%n%e%t%H:%M:%S %Y' 'Fri Jan 3 20:53:10 2014'
Format: %a %b%n%e%t%H:%M:%S %Y
Time string: Fri Jan 3 20:53:10 2014
strptime failed
$ env LD_LIBRARY_PATH=proto/root_i386/lib /tmp/tstrpft '%a
%b%n%e%t%H:%M:%S %Y' 'Fri Jan 3 20:53:10 2014'
Format: %a %b%n%e%t%H:%M:%S %Y
Time string: Fri Jan 3 20:53:10 2014
Fully parsed: Fri Jan 3 20:53:10 2014
Result: 3 January, 2014 08:53:10 PM CST
This test shows the effect of white space removal for the %d format
element:
$ /tmp/tstrpft '%m/%e/%y' '01/ 9/14'
Format: %m/%e/%y
Time string: 01/ 9/14
Fully parsed: 01/ 9/14
Result: 9 January, 2014 12:00:00 AM CST
$ /tmp/tstrpft '%m/%d/%y' '01/ 9/14'
Format: %m/%d/%y
Time string: 01/ 9/14
strptime failed
$ env LD_LIBRARY_PATH=proto/root_i386/lib /tmp/tstrpft '%m/%d/%y' '01/ 9/14'
Format: %m/%d/%y
Time string: 01/ 9/14
Fully parsed: 01/ 9/14
Result: 9 January, 2014 12:00:00 AM CST
I also identified all consumers of strptime() in illumos:
usr/src/lib/gss_mechs/mech_krb5/krb5/krb/str_conv.c
usr/src/cmd/fm/fmdump/common/fmdump.c
usr/src/cmd/logadm/opts.c
usr/src/lib/libc/port/locale/getdate.c
usr/src/lib/smbsrv/libmlsvc/common/eventlog_log.c
usr/src/cmd/bnu/uustat.c
usr/src/lib/libkmf/libkmf/common/certgetsetop.c
usr/src/lib/krb5/plugins/kdb/ldap/libkdb_ldap/ldap_misc.c
usr/src/lib/libbe/common/be_utils.c
usr/src/lib/hbaapi/common/HBAAPILIB.c
usr/src/cmd/iscsiadm/sun_ima.c
usr/src/lib/libkmf/plugins/kmf_pkcs11/common/pkcs11_spi.c
usr/src/lib/smhba/common/SMHBAAPILIB.c
usr/src/lib/libima/common/ima-lib.c
Some of these try a whole list of patterns in the hope that one will
succeed. Others use only one. I've tested all of the patterns by
encoding a time string with strftime() and decoding it with
strptime(). All tests were successful, both with the old and new
versions of strptime(). This test demonstrates that the changes in
this webrev will have no effect on parsing by current consumers.
This is the original bug report:
https://www.illumos.org/issues/3141
--
-Gary Mills- -refurb- -Winnipeg, Manitoba, Canada-
-------------------------------------------
illumos-developer
Archives: https://www.listbox.com/member/archive/182179/=now
RSS Feed: https://www.listbox.com/member/archive/rss/182179/21175010-3e2de3b9
Modify Your Subscription:
https://www.listbox.com/member/?member_id=21175010&id_secret=21175010-7e834616
Powered by Listbox: http://www.listbox.com
----------
From: Garrett D'Amore <garrett at damore.org>
Date: 13 January 2014 21:14
To: Gary Mills <gary_mills at fastmail.fm>
Cc: illumos-dev <developer at lists.illumos.org>
I'm not sure this is correct.
If I'm reading your code correctly, it seems you'll parse %d the same
way as %e. Interestingly enough, the man page documents this behavior
*without* a leading space, but the code was made to do this because
strftime() of %e includes a leading space. (The idea being that
strftime/strptime should be reversible transforms.) The standard
seems in conflict of the specification here, because if I read it
correctly, %e is specifically scanned differently than it is emitted.
You'll see that %D in the standard is intended to parse those forms
where %e is present, and indeed, it expands differently for strptime
than it does for strftime. (In fact, our code is specifically *not*
conformant with the standard for %D, because it lacks the intervening
spaces!)
Now all that said, this code has lived a long time now. I'm a little
afraid of changes here, and I'd really like second opinions both in
the code review, and on the implications for standards conformance.
I'd also like you to test %D properly. It should support spaces in
the right places -- I think post your changes it probably will not.
----------
From: Gary Mills <gary_mills at fastmail.fm>
Date: 14 January 2014 00:27
To: Garrett D'Amore <garrett at damore.org>
Cc: illumos-dev <developer at lists.illumos.org>
http://pubs.opengroup.org/onlinepubs/7999959899/functions/strptime.html
says:
%d
The day of the month [01,31]; leading zeros are permitted but not
required.
%D
The date as %m / %d / %y.
%e
Equivalent to %d.
Likewise, the illumos strptime man page says:
%d
Day of month [1,31]; leading zero is permitted but not
required.
%D
Date as %m/%d/%y.
%e
Same as %d.
Neither of them explicitly mention a leading space. I take it then
that a leading zero or a leading space is optional for both %d and %e.
According to the standard and the man page, both pattern elements
should cause the same scan. Making them as permissive as possible
is the right thing to do.
succeeds:
$ /tmp/tstrpft '%D' '01/ 9/14'
Format: %D
Time string: 01/ 9/14
strptime failed
$ env LD_LIBRARY_PATH=proto/root_i386/lib /tmp/tstrpft '%D' '01/ 9/14'
Format: %D
Time string: 01/ 9/14
Fully parsed: 01/ 9/14
Result: 9 January, 2014 12:00:00 AM CST
I don't believe there are any embedded spaces in the %D expansion.
My guess is that the web page is wrong.
For completeness, I'll attach my test program.
----------
From: Garrett D'Amore <garrett at damore.org>
Date: 14 January 2014 00:36
To: Gary Mills <gary_mills at fastmail.fm>
Cc: illumos-dev <developer at lists.illumos.org>
Look at the standard more closely. The expansion of %D has embedded spaces.
Sent from my iPhone
illumos-developer
Archives: https://www.listbox.com/member/archive/182179/=now
RSS Feed: https://www.listbox.com/member/archive/rss/182179/21175010-3e2de3b9
Modify Your Subscription:
https://www.listbox.com/member/?member_id=21175010&id_secret=21175010-7e834616
--
Cedric Blancher <cedric.blancher at gmail.com>
Institute Pasteur
-------------- next part --------------
#include <stdio.h>
#include <locale.h>
#include <time.h>
int
main(int argc, char **argv)
{
char *format = "%c";
char *buf = "January 3, 2014 07:56:08 PM CST";
struct tm tm;
char *rv;
(void)setlocale(LC_ALL, "");
if (argc >= 2) {
format = argv[1];
}
if (argc >= 3) {
buf = argv[2];
}
printf("Format: %s\n", format);
printf("Time string: %s\n", buf);
rv = strptime(buf, format, &tm);
if (rv == NULL) {
printf("strptime failed\n");
}
else {
char s[256];
if (*rv == '\0') {
printf("Fully parsed: %s\n", buf);
}
else {
printf("Parse stopped at %s\n", rv);
}
if (strftime(s, sizeof (s), "%+", &tm) > 0)
printf("Result: %s\n", s);
}
}
-------------------------------------------
illumos-developer
Archives: https://www.listbox.com/member/archive/182179/=now
RSS Feed: https://www.listbox.com/member/archive/rss/182179/21175010-3e2de3b9
Modify Your Subscription: https://www.listbox.com/member/?member_id=21175010&id_secret=21175010-7e834616
Powered by Listbox: http://www.listbox.com
Ced
Forwarded conversation
Subject: [developer] Webrev for: 3141 strptime() doesn't support %t
------------------------
From: Gary Mills <gary_mills at fastmail.fm>
Date: 13 January 2014 16:16
To: developer at lists.illumos.org
This is for illumos bug `3141 strptime() doesn't support %t'.
http://cr.illumos.org/~webrev/jgmills/3141-1
The first change is addition of a case to handle %n and %t in the
pattern, both of which cause skipping of white space in the buffer.
This change alone is defeated by code at the end of many other cases
that advances the format pointer whenever the first character in the
buffer is a white space character. Strangely, this code operates
contrary to the design of strptime(), where elements of the pattern
drive the buffer scan. Fortunately, it is unnecessary and can be
removed. Other portions of the loop already handle white space. The
final change is the inclusion of leading white space removal for both
%d and %e, in case the values were space-padded, and to align it with
the standard. In general, interpretation of pattern elements in
strptime() should be more lenient than in strftime() so that parsing
is more likely to succeed.
This test shows how the new %t and %n in the strptime() pattern causes
whitespace skipping in the time string. My test program shows the
results of the parse by strptime(). If this succeeds, it displays the
time with the strftime() %+ format.
$ /tmp/tstrpft '%a %b %e %H:%M:%S %Y' 'Fri Jan 3 20:53:10 2014'
Format: %a %b %e %H:%M:%S %Y
Time string: Fri Jan 3 20:53:10 2014
Fully parsed: Fri Jan 3 20:53:10 2014
Result: 3 January, 2014 08:53:10 PM CST
$ /tmp/tstrpft '%a %b%n%e%t%H:%M:%S %Y' 'Fri Jan 3 20:53:10 2014'
Format: %a %b%n%e%t%H:%M:%S %Y
Time string: Fri Jan 3 20:53:10 2014
strptime failed
$ env LD_LIBRARY_PATH=proto/root_i386/lib /tmp/tstrpft '%a
%b%n%e%t%H:%M:%S %Y' 'Fri Jan 3 20:53:10 2014'
Format: %a %b%n%e%t%H:%M:%S %Y
Time string: Fri Jan 3 20:53:10 2014
Fully parsed: Fri Jan 3 20:53:10 2014
Result: 3 January, 2014 08:53:10 PM CST
This test shows the effect of white space removal for the %d format
element:
$ /tmp/tstrpft '%m/%e/%y' '01/ 9/14'
Format: %m/%e/%y
Time string: 01/ 9/14
Fully parsed: 01/ 9/14
Result: 9 January, 2014 12:00:00 AM CST
$ /tmp/tstrpft '%m/%d/%y' '01/ 9/14'
Format: %m/%d/%y
Time string: 01/ 9/14
strptime failed
$ env LD_LIBRARY_PATH=proto/root_i386/lib /tmp/tstrpft '%m/%d/%y' '01/ 9/14'
Format: %m/%d/%y
Time string: 01/ 9/14
Fully parsed: 01/ 9/14
Result: 9 January, 2014 12:00:00 AM CST
I also identified all consumers of strptime() in illumos:
usr/src/lib/gss_mechs/mech_krb5/krb5/krb/str_conv.c
usr/src/cmd/fm/fmdump/common/fmdump.c
usr/src/cmd/logadm/opts.c
usr/src/lib/libc/port/locale/getdate.c
usr/src/lib/smbsrv/libmlsvc/common/eventlog_log.c
usr/src/cmd/bnu/uustat.c
usr/src/lib/libkmf/libkmf/common/certgetsetop.c
usr/src/lib/krb5/plugins/kdb/ldap/libkdb_ldap/ldap_misc.c
usr/src/lib/libbe/common/be_utils.c
usr/src/lib/hbaapi/common/HBAAPILIB.c
usr/src/cmd/iscsiadm/sun_ima.c
usr/src/lib/libkmf/plugins/kmf_pkcs11/common/pkcs11_spi.c
usr/src/lib/smhba/common/SMHBAAPILIB.c
usr/src/lib/libima/common/ima-lib.c
Some of these try a whole list of patterns in the hope that one will
succeed. Others use only one. I've tested all of the patterns by
encoding a time string with strftime() and decoding it with
strptime(). All tests were successful, both with the old and new
versions of strptime(). This test demonstrates that the changes in
this webrev will have no effect on parsing by current consumers.
This is the original bug report:
https://www.illumos.org/issues/3141
--
-Gary Mills- -refurb- -Winnipeg, Manitoba, Canada-
-------------------------------------------
illumos-developer
Archives: https://www.listbox.com/member/archive/182179/=now
RSS Feed: https://www.listbox.com/member/archive/rss/182179/21175010-3e2de3b9
Modify Your Subscription:
https://www.listbox.com/member/?member_id=21175010&id_secret=21175010-7e834616
Powered by Listbox: http://www.listbox.com
----------
From: Garrett D'Amore <garrett at damore.org>
Date: 13 January 2014 21:14
To: Gary Mills <gary_mills at fastmail.fm>
Cc: illumos-dev <developer at lists.illumos.org>
I'm not sure this is correct.
If I'm reading your code correctly, it seems you'll parse %d the same
way as %e. Interestingly enough, the man page documents this behavior
*without* a leading space, but the code was made to do this because
strftime() of %e includes a leading space. (The idea being that
strftime/strptime should be reversible transforms.) The standard
seems in conflict of the specification here, because if I read it
correctly, %e is specifically scanned differently than it is emitted.
You'll see that %D in the standard is intended to parse those forms
where %e is present, and indeed, it expands differently for strptime
than it does for strftime. (In fact, our code is specifically *not*
conformant with the standard for %D, because it lacks the intervening
spaces!)
Now all that said, this code has lived a long time now. I'm a little
afraid of changes here, and I'd really like second opinions both in
the code review, and on the implications for standards conformance.
I'd also like you to test %D properly. It should support spaces in
the right places -- I think post your changes it probably will not.
RSS Feed: https://www.listbox.com/member/archive/rss/182179/21239177-3604570e
Modify Your Subscription: https://www.listbox.com/member/?&
Powered by Listbox: http://www.listbox.com
illumos-developer | Archives | Modify Your SubscriptionModify Your Subscription: https://www.listbox.com/member/?&
Powered by Listbox: http://www.listbox.com
----------
From: Gary Mills <gary_mills at fastmail.fm>
Date: 14 January 2014 00:27
To: Garrett D'Amore <garrett at damore.org>
Cc: illumos-dev <developer at lists.illumos.org>
I'm not sure this is correct.
If I'm reading your code correctly, it seems you'll parse %d the same
way as %e. Interestingly enough, the man page documents this behavior
*without* a leading space, but the code was made to do this because
strftime() of %e includes a leading space. (The idea being that
strftime/strptime should be reversible transforms.) The standard seems
in conflict of the specification here, because if I read it correctly,
%e is specifically scanned differently than it is emitted.
The standard here:If I'm reading your code correctly, it seems you'll parse %d the same
way as %e. Interestingly enough, the man page documents this behavior
*without* a leading space, but the code was made to do this because
strftime() of %e includes a leading space. (The idea being that
strftime/strptime should be reversible transforms.) The standard seems
in conflict of the specification here, because if I read it correctly,
%e is specifically scanned differently than it is emitted.
http://pubs.opengroup.org/onlinepubs/7999959899/functions/strptime.html
says:
%d
The day of the month [01,31]; leading zeros are permitted but not
required.
%D
The date as %m / %d / %y.
%e
Equivalent to %d.
Likewise, the illumos strptime man page says:
%d
Day of month [1,31]; leading zero is permitted but not
required.
%D
Date as %m/%d/%y.
%e
Same as %d.
Neither of them explicitly mention a leading space. I take it then
that a leading zero or a leading space is optional for both %d and %e.
According to the standard and the man page, both pattern elements
should cause the same scan. Making them as permissive as possible
is the right thing to do.
You'll see
that %D in the standard is intended to parse those forms where %e is
present, and indeed, it expands differently for strptime than it does
for strftime. (In fact, our code is specifically *not* conformant with
the standard for %D, because it lacks the intervening spaces!)
Now all that said, this code has lived a long time now. I'm a little
afraid of changes here, and I'd really like second opinions both in the
code review, and on the implications for standards conformance.
I'd also like you to test %D properly. It should support spaces in the
right places -- I think post your changes it probably will not.
Au contraire, the existing strptime() fails, but the new onethat %D in the standard is intended to parse those forms where %e is
present, and indeed, it expands differently for strptime than it does
for strftime. (In fact, our code is specifically *not* conformant with
the standard for %D, because it lacks the intervening spaces!)
Now all that said, this code has lived a long time now. I'm a little
afraid of changes here, and I'd really like second opinions both in the
code review, and on the implications for standards conformance.
I'd also like you to test %D properly. It should support spaces in the
right places -- I think post your changes it probably will not.
succeeds:
$ /tmp/tstrpft '%D' '01/ 9/14'
Format: %D
Time string: 01/ 9/14
strptime failed
$ env LD_LIBRARY_PATH=proto/root_i386/lib /tmp/tstrpft '%D' '01/ 9/14'
Format: %D
Time string: 01/ 9/14
Fully parsed: 01/ 9/14
Result: 9 January, 2014 12:00:00 AM CST
I don't believe there are any embedded spaces in the %D expansion.
My guess is that the web page is wrong.
For completeness, I'll attach my test program.
----------
From: Garrett D'Amore <garrett at damore.org>
Date: 14 January 2014 00:36
To: Gary Mills <gary_mills at fastmail.fm>
Cc: illumos-dev <developer at lists.illumos.org>
Look at the standard more closely. The expansion of %D has embedded spaces.
Sent from my iPhone
<tstrpft.c>
-------------------------------------------illumos-developer
Archives: https://www.listbox.com/member/archive/182179/=now
RSS Feed: https://www.listbox.com/member/archive/rss/182179/21175010-3e2de3b9
Modify Your Subscription:
https://www.listbox.com/member/?member_id=21175010&id_secret=21175010-7e834616
--
Cedric Blancher <cedric.blancher at gmail.com>
Institute Pasteur
-------------- next part --------------
#include <stdio.h>
#include <locale.h>
#include <time.h>
int
main(int argc, char **argv)
{
char *format = "%c";
char *buf = "January 3, 2014 07:56:08 PM CST";
struct tm tm;
char *rv;
(void)setlocale(LC_ALL, "");
if (argc >= 2) {
format = argv[1];
}
if (argc >= 3) {
buf = argv[2];
}
printf("Format: %s\n", format);
printf("Time string: %s\n", buf);
rv = strptime(buf, format, &tm);
if (rv == NULL) {
printf("strptime failed\n");
}
else {
char s[256];
if (*rv == '\0') {
printf("Fully parsed: %s\n", buf);
}
else {
printf("Parse stopped at %s\n", rv);
}
if (strftime(s, sizeof (s), "%+", &tm) > 0)
printf("Result: %s\n", s);
}
}
-------------------------------------------
illumos-developer
Archives: https://www.listbox.com/member/archive/182179/=now
RSS Feed: https://www.listbox.com/member/archive/rss/182179/21175010-3e2de3b9
Modify Your Subscription: https://www.listbox.com/member/?member_id=21175010&id_secret=21175010-7e834616
Powered by Listbox: http://www.listbox.com