Discussion:
[ast-developers] malloc() in signal handlers sometimes causing deadlocks
Dr. Werner Fink
2014-07-09 10:36:07 UTC
Hi,

Even with _AST_std_malloc==0 and _map_malloc==1, the test suite
sometimes hangs forever in signal.sh. After attaching gdb to such a
hanging ksh process, I can see that the hang occurs in the signal
handler sh_fault() when the Siginfo structure is allocated. The
backtrace shows ksh ending up in a nanosleep() call within tvsleep(),
called from code below src/lib/libast/vmalloc/.

In other words, even the libast variant of memory allocation is not
reentrant.

Werner
--
"Having a smoking section in a restaurant is like having
a peeing section in a swimming pool." -- Edward Burr
Dr. Werner Fink
2014-07-09 12:13:05 UTC
The backtrace looks like this:

(gdb) up
#1 0x00000000004c6a7d in tvsleep (tv=<optimized out>, rv=0x0) at /usr/src/packages/BUILD/ksh93/src/lib/libast/tm/tvsleep.c:64
64 if ((r = nanosleep(&stv, &srv)) && errno == EINTR && rv)
(gdb)
#2 0x00000000004f65a9 in asorelax (nsec=<optimized out>) at /usr/src/packages/BUILD/ksh93/src/lib/libast/aso/asorelax.c:46
46 return tvsleep(&tv, 0);
(gdb)
#3 0x00000000004f653a in asolock (lock=0x7da858 <_Vmextern+120>, key=1241319115, type=<optimized out>)
at /usr/src/packages/BUILD/ksh93/src/lib/libast/aso/asolock.c:40
40 { for (;; asospinrest())
(gdb)
#4 0x00000000004fb30c in safebrkmem (vm=0x7da5c0 <_Vmheap>, caddr=0x0, csize=0, nsize=4194304, disc=<optimized out>)
at /usr/src/packages/BUILD/ksh93/src/lib/libast/vmalloc/vmdcsystem.c:206
206 asolock(&_Vmsbrklock, key, ASO_LOCK);
(gdb)
#5 0x00000000004ff34e in _vmsegalloc (vm=0x7da5c0 <_Vmheap>, blk=<optimized out>, size=16384, type=1)
at /usr/src/packages/BUILD/ksh93/src/lib/libast/vmalloc/vmsegment.c:377
377 if(!(base = (Vmuchar_t*)(*disc->memoryf)(vm, NIL(Void_t*), 0, segsz, disc)) )
(gdb)
#6 0x00000000004f9545 in bestpackget (vm=0x7fff00b146a0, ppos=<optimized out>, tid=1241319115)
at /usr/src/packages/BUILD/ksh93/src/lib/libast/vmalloc/vmbest.c:295
295 if(!(blk = (*_Vmsegalloc)(vm, NIL(Block_t*), sizeof(Pack_t)+EXTZ, VM_SEGEXTEND)) )
(gdb)
#7 0x00000000004f9840 in bestalloc (vm=0x7da5c0 <_Vmheap>, size=144, local=<optimized out>)
at /usr/src/packages/BUILD/ksh93/src/lib/libast/vmalloc/vmbest.c:762
762 } while (!(pk = bestpackget(vm, ppos, tid)));
(gdb)
#8 0x00000000004f7a29 in _ast_malloc (size=144) at /usr/src/packages/BUILD/ksh93/src/lib/libast/vmalloc/malloc.c:770
770 addr = (*Vmregion->meth.allocf)(Vmregion, size, 0);
(gdb)
#9 0x00000000004161a5 in set_trapinfo (info=<optimized out>, sig=<optimized out>, shp=<optimized out>)
at /usr/src/packages/BUILD/ksh93/src/cmd/ksh93/sh/fault.c:89
89 ip = malloc(sizeof(struct Siginfo));
(gdb)
#10 sh_fault (sig=36, info=0x7fff00b14fc0, context=<optimized out>) at /usr/src/packages/BUILD/ksh93/src/cmd/ksh93/sh/fault.c:210
210 set_trapinfo(shp,sig,info);
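
The trace is consistent with a self-deadlock: frame #3 spins in asolock() on _Vmsbrklock, and if that lock was taken by an allocation that the signal interrupted, the holder can never run again to release it. A minimal sketch of that pattern follows. It is not libast code: heap_lock, on_signal() and signal_during_allocation() are hypothetical names, a C11 atomic_flag stands in for the allocator lock, and the handler makes a single attempt instead of spinning, so the sketch reports the problem rather than hanging.

```c
#include <signal.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the allocator's internal lock
 * (_Vmsbrklock in the trace). */
static atomic_flag heap_lock = ATOMIC_FLAG_INIT;

/* Set by the handler when it finds the lock already taken. */
static volatile sig_atomic_t lock_was_busy = 0;

/* One attempt only. A spinning lock would retry here forever,
 * because the lock holder is the very code this handler interrupted. */
static void on_signal(int sig)
{
    (void)sig;
    if (atomic_flag_test_and_set(&heap_lock))
        lock_was_busy = 1;              /* held by the interrupted code */
    else
        atomic_flag_clear(&heap_lock);  /* lock was free: release it again */
}

/* Simulates a signal that arrives either inside (hold_lock != 0) or
 * outside (hold_lock == 0) the allocator's critical section.
 * Returns 1 if the handler found the lock busy, 0 if not, -1 on error. */
int signal_during_allocation(int hold_lock)
{
    void (*old)(int) = signal(SIGINT, on_signal);
    if (old == SIG_ERR)
        return -1;
    lock_was_busy = 0;

    if (hold_lock) {
        while (atomic_flag_test_and_set(&heap_lock))
            ;                           /* acquire (uncontended here) */
        raise(SIGINT);                  /* handler runs inside the critical section */
        atomic_flag_clear(&heap_lock);  /* release */
    } else {
        raise(SIGINT);                  /* handler runs with the lock free */
    }

    signal(SIGINT, old);                /* restore the previous disposition */
    return lock_was_busy;
}
```

signal_during_allocation(1) returns 1 and signal_during_allocation(0) returns 0. In the first case a handler that spun on the lock, as asolock() does in the trace, would never get it.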
Lionel Cons
2014-07-09 12:20:58 UTC
Isn't this the same bug Roland Mainz reported a while ago, typically
with SIGABRT involved?

Lionel
Adam Edgar
2014-07-09 14:13:34 UTC
Very few implementations of malloc are reentrant. Making malloc thread-safe without locking is not a trivial task. In my experience, using malloc within a signal trap is frowned upon.

ASE
Dr. Werner Fink
2014-07-10 07:10:23 UTC
Indeed, the main problem could be that most library functions can, in
some cases, behave as nonreentrant functions when they are called from
within a signal handler. The correct way seems to be to use (volatile)
sig_atomic_t variables and/or a setjmp/longjmp pair (BSD), or maybe a
context switch (SysV).

This would also allow the use of glibc's memory allocation.
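
The sig_atomic_t approach can be sketched as follows: the handler only stores the signal number, and the allocation happens later in the main flow, where malloc() is safe. This is an illustration, not ksh's fault.c; struct trap_record, on_signal(), install_trap() and collect_trap() are hypothetical names. The single slot is a simplification: it coalesces bursts, can lose a signal that arrives between the read and the clear, and does not carry the siginfo_t payload.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record, loosely modelled on the Siginfo allocation
 * seen in the trace. */
struct trap_record { int signo; };

/* Written by the handler, read and cleared by the main flow. */
static volatile sig_atomic_t pending_signo = 0;

/* The handler only stores a value; it calls no library function. */
static void on_signal(int sig)
{
    pending_signo = sig;
}

/* Installs the handler for sig. Returns 0 on success, -1 on failure. */
int install_trap(int sig)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(sig, &sa, NULL);
}

/* Called from the main loop, outside any handler, so malloc() is safe.
 * Returns a heap record for the pending signal, or NULL if nothing is
 * pending or malloc() fails. The caller frees the record. */
struct trap_record *collect_trap(void)
{
    int sig = pending_signo;
    if (sig == 0)
        return NULL;
    pending_signo = 0;

    struct trap_record *rec = malloc(sizeof *rec);
    if (rec)
        rec->signo = sig;
    return rec;
}
```

A main loop would call install_trap() once at startup and collect_trap() at a safe point on each iteration, running the trap action for any record it gets back.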
Werner

Dr. Werner Fink
2014-07-16 13:38:53 UTC
One expensive solution could be to block the signals that are used for
traps and to poll for them with sigtimedwait(); the kernel then queues
the signal events (POSIX 1003.1b).
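
A minimal sketch of that idea, assuming a platform that provides sigtimedwait() (Linux does). No handler runs at all, so the code that picks up the signal is ordinary main-flow code and may allocate freely. The helper names block_trap_signal() and poll_trap_signal() are hypothetical, and the sketch handles a single signal in a single-threaded process.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <time.h>

/* Blocks sig for the calling thread, so the kernel keeps it pending
 * instead of invoking a handler. Returns 0 on success, -1 on failure. */
int block_trap_signal(int sig)
{
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, sig);
    return sigprocmask(SIG_BLOCK, &set, NULL);
}

/* Polls for a pending instance of sig, waiting at most timeout_ms
 * milliseconds. Returns the signal number and fills *info on success;
 * returns -1 if nothing arrived in time (errno is then EAGAIN) or if
 * the call failed. */
int poll_trap_signal(int sig, long timeout_ms, siginfo_t *info)
{
    sigset_t set;
    struct timespec ts;

    sigemptyset(&set);
    sigaddset(&set, sig);
    ts.tv_sec = timeout_ms / 1000;
    ts.tv_nsec = (timeout_ms % 1000) * 1000000L;
    return sigtimedwait(&set, info, &ts);
}
```

The cost mentioned above shows up as the extra system call on every poll. Only real-time signals (SIGRTMIN and up) are queued per instance; repeated standard signals are merged into one pending signal while blocked.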