Discussion:
[ast-developers] ksh crashes, because job_lock & job_unlock do not survive compiler optimizations
Michal Hlavinka
2014-06-25 10:27:21 UTC
Hi,

we found a bug caused by compiler optimizations.

The top of the stack looks like this:

#0 job_chksave (pid=5066) at jobs.c:1949
#1 job_reap (sig=17) at jobs.c:428
#2 <signal handler called>
#3 job_subsave () at sh/jobs.c:1990
#4 sh_subshell (shp=0x76cba0, t=0x7fd6050c9fe0, flags=4, comsub=3) at subshell.c:520

It's based on a patched 2012-08-01 release, so line numbers won't match precisely.

The interesting part is that job_reap was executed during job_subsave,
even though job_lock should prevent this:

void *job_subsave(void)
{
	struct back_save *bp = new_of(struct back_save,0);
	job_lock();
	*bp = bck;
	bp->prev = bck.prev;
	bck.count = 0;
	bck.list = 0;		/* <<---- HERE when the signal came */
	bck.prev = bp;
	job_unlock();
	return((void*)bp);
}

When checking job.in_critical, gdb showed that it was zero:
(gdb) p job.in_critical
$1 = 0

So the locking was not effective. I checked the disassembled
job_subsave and, to no surprise:
0x429801 <job_subsave+1>	mov    $0x18,%edi
0x429806 <job_subsave+6>	callq  0x405de0 <malloc@plt>
0x42980b <job_subsave+11>	mov    0x343e9e(%rip),%rdx	# 0x76d6b0 <bck>
0x429812 <job_subsave+18>	mov    %rax,%rbx
0x429815 <job_subsave+21>	mov    0x343bed(%rip),%eax	# 0x76d408 <job+40>
0x42981b <job_subsave+27>	movl   $0x0,0x343e8b(%rip)	# 0x76d6b0 <bck>
0x429825 <job_subsave+37>	mov    %rdx,(%rbx)
0x429828 <job_subsave+40>	mov    0x343e89(%rip),%rdx	# 0x76d6b8 <bck+8>
0x42982f <job_subsave+47>	test   %eax,%eax
0x429831 <job_subsave+49>	movq   $0x0,0x343e7c(%rip)	# 0x76d6b8 <bck+8>
0x42983c <job_subsave+60>	mov    %rdx,0x8(%rbx)
0x429840 <job_subsave+64>	mov    0x343e79(%rip),%rdx	# 0x76d6c0 <bck+16>
0x429847 <job_subsave+71>	mov    %rbx,0x343e72(%rip)	# 0x76d6c0 <bck+16>
0x42984e <job_subsave+78>	mov    %rdx,0x10(%rbx)
0x429852 <job_subsave+82>	jne    0x429887 <job_subsave+135>
0x429854 <job_subsave+84>	mov    0x343bb2(%rip),%edi	# 0x76d40c <job+44>
0x42985a <job_subsave+90>	test   %edi,%edi
0x42985c <job_subsave+92>	je     0x429887 <job_subsave+135>
0x42985e <job_subsave+94>	mov    0x341ca3(%rip),%rax	# 0x76b508 <Vmregion>
0x429865 <job_subsave+101>	movl   $0x1,0x343b99(%rip)	# 0x76d408 <job+40>
0x42986f <job_subsave+111>	mov    0x60(%rax),%rdx
0x429873 <job_subsave+115>	mov    $0x1,%eax
0x429878 <job_subsave+120>	mov    (%rdx),%esi
0x42987a <job_subsave+122>	test   %esi,%esi
0x42987c <job_subsave+124>	je     0x429890 <job_subsave+144>
0x42987e <job_subsave+126>	sub    $0x1,%eax
0x429881 <job_subsave+129>	mov    %eax,0x343b81(%rip)	# 0x76d408 <job+40>
0x429887 <job_subsave+135>	mov    %rbx,%rax
0x42988a <job_subsave+138>	pop    %rbx
0x42988b <job_subsave+139>	retq
0x42988c <job_subsave+140>	nopl   0x0(%rax)
0x429890 <job_subsave+144>	callq  0x428d60 <job_reap>
0x429895 <job_subsave+149>	mov    0x343b6d(%rip),%eax	# 0x76d408 <job+40>
0x42989b <job_subsave+155>	jmp    0x42987e <job_subsave+126>

There is no increment or decrement of job.in_critical (AKA <job+40>) anywhere.

Even with the volatile attribute, gcc reorders the code so that it locks,
immediately decrements in_critical again as part of unlocking, and only
then does the code that's supposed to be lock-protected. Adding memory
barriers was necessary to prevent the compiler from reorganizing the code.
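For illustration only (the real change is in the attached patch), gcc's empty asm with a "memory" clobber acts as such a compiler barrier; a minimal sketch, with hypothetical names:
---
/* sketch, not the attached patch: the empty asm with a "memory" clobber
   keeps gcc from caching or moving memory accesses across this point */
#define job_fence()	__asm__ __volatile__("" ::: "memory")
#define job_lock()	do { job.in_critical++; job_fence(); } while(0)
/* job_unlock() would need a matching job_fence() before its decrement */
---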

See the attached patch.

Michal


[Attachment: ksh-20120801-locking.patch (text/x-patch, 1298 bytes)
URL: <http://lists.research.att.com/pipermail/ast-developers/attachments/20140625/91d6c666/attachment.bin>]
Lionel Cons
2014-06-25 11:32:25 UTC
The attached patch isn't portable across all platforms.

Roland, do you see a portable solution?

Lionel
Post by Michal Hlavinka
[...]
Glenn Fowler
2014-06-25 11:55:56 UTC
Interesting that it was optimized out -- but it does seem to follow the
letter of the standard: references to the volatile variable must access
the actual memory and not be cached. In this case there is no reference,
because it was optimized out, so it's easy.
Is there any standard way to force the increment to happen at any
optimization level?

I think job_unlock() needs 2 more memory barriers, along with
volatile unsigned int in_critical or volatile struct jobs job.

Try this portable code instead:
---
#define job_lock()	asoincint(job.in_critical)
#define job_unlock() \
	do { \
		int	_sig; \
		if (asodecint(job.in_critical) == 1 && (_sig = job.savesig)) \
		{ \
			if (asoincint(job.in_critical) == 0 && !vmbusy()) \
				job_reap(_sig); \
			asodecint(job.in_critical); \
		} \
	} while(0)
---
I believe this code with one less aso op is equivalent but I'm not in a
spot to test it right now:
---
#define job_lock()	asoincint(job.in_critical)
#define job_unlock() \
	do { \
		int	_sig; \
		if (asogetint(job.in_critical) == 1 && (_sig = job.savesig) && !vmbusy()) \
			job_reap(_sig); \
		asodecint(job.in_critical); \
	} while(0)
---
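For context, these wrap critical sections like the one in job_subsave above; usage stays the same:
---
job_lock();		/* in_critical 0 -> 1: a signal handler now defers job_reap */
bck.count = 0;		/* updates that must not see job_reap run concurrently */
bck.list = 0;
job_unlock();		/* in_critical 1 -> 0: run the deferred job_reap if a signal arrived */
---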


On Wed, Jun 25, 2014 at 6:27 AM, Michal Hlavinka <mhlavink@redhat.com>
Post by Michal Hlavinka
[...]
Adam Edgar
2014-06-25 15:12:45 UTC
On newer versions of gcc, asoincint and asodecint should be using the builtin __sync_fetch_and_add, which is by definition atomic and synchronized across all cores/CPUs. If you trace through the included headers, see if you get to that or some other method. Perhaps iffe is not detecting the correct operator.
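A sketch of that mapping, assuming gcc >= 4.1 and the old-value (fetch-and-op) return convention the macros above rely on; the real aso headers pick the implementation via iffe:
---
/* both builtins are atomic and imply a full memory barrier */
static int asoincint_sketch(volatile int *p) { return __sync_fetch_and_add(p, 1); }
static int asodecint_sketch(volatile int *p) { return __sync_fetch_and_sub(p, 1); }
---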

ASE
[...]
Glenn Fowler
2014-06-26 12:06:07 UTC
Michal, did you get the reordering problem with the aso*() code?
I checked my ubuntu build, and the aso*() ops are macros that call the
__sync_* builtins, as Adam notes below.
If gcc optimization fiddles with __sync_* ordering, that's a big problem.

If the aso*() ops do fail to coax gcc to do what the code says, then add this
line to the ksh93 Makefile and rebuild:

* :NOOPTIMIZE: io.c jobs.c xec.c


On Wed, Jun 25, 2014 at 8:27 AM, Michal Hlavinka <mhlavink@redhat.com>
The volatile attribute is sufficient to enforce that the inc/decrement
happens and that the value is read from / stored to memory (not just a
register). The reordering is the bigger problem; with just volatile, it
did something like:

	load memory to register
	increment register
	store register to memory
	decrement memory
	do the locked stuff (just assigning)
	test whether the last decrement was zero
	do (or do not) the other tests from job_unlock and the job_reap call

So it did both the increment and the decrement of job.in_critical.
Volatile just means that all operations happen on memory, not cached
data (registers), and results are immediately written back to memory.
Also, all expressions using volatile variables will be executed in the
specified order. For example:
foo(int a)
{
	volatile int x = 1, y = 2;
	int z = 3;
	z += a;
	x = 3*x + a;
	y = 2*y + a;
	x = x + a;
	y = y + a;
}

just says that both the x = ... and y = ... assignments will be executed
as they are written and stored to memory immediately after every
assignment, but both y = operations can be executed before z += and the
x = ones.
There is no guaranteed order with respect to other volatile variables
unless an assignment itself uses the other volatile variable.
There is no control over ordering, and AFAIK (I spent half of yesterday
looking for a solution) there is no standard way to force the correct
order -- only side effects of some hacks, the memory barrier hack being one.
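For gcc, the hack in question is an empty asm with a "memory" clobber; inserted between statements, it keeps the compiler from moving memory accesses across it (sketch only):
---
#define barrier()	__asm__ __volatile__("" ::: "memory")

z += a;
barrier();	/* the x = / y = assignments below can no longer move above z += a */
x = 3*x + a;
---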
Post by Glenn Fowler
[...]
Michal Hlavinka
2014-06-26 15:39:37 UTC
With the aso*() ops, the assembler code looks fine. I used the code
you've posted, just added the & to the arguments.

It translates to the __sync_* builtins, which implicitly include a memory barrier.
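For the record, a minimal sketch of the corrected macros with the '&' added (your second variant):
---
#define job_lock()	asoincint(&job.in_critical)
#define job_unlock() \
	do { \
		int	_sig; \
		if (asogetint(&job.in_critical) == 1 && (_sig = job.savesig) && !vmbusy()) \
			job_reap(_sig); \
		asodecint(&job.in_critical); \
	} while(0)
---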
Post by Glenn Fowler
[...]
Glenn Fowler
2014-06-27 01:10:16 UTC
Thanks for catching the omitted '&'.
Update will be done tonight.
[...]