Cedric Blancher
2013-11-22 14:40:57 UTC
The bug report below might explain some of the reported signal
weirdness on FreeBSD and derivatives.
How far are we anyway with signal reliability vs volatile of the ksh93
struct jobs?
Ced
---------- Forwarded message ----------
From: Konstantin Belousov <kostikbel at gmail.com>
Date: 21 November 2013 22:15
Subject: Re: Problem with signal 0 being delivered to SIGUSR1 handler
To: Vitaly Magerya <vmagerya at gmail.com>
Cc: freebsd-hackers at freebsd.org, davidxu at freebsd.org, threads at freebsd.org
in the line before failed.
delivery and for some reason the code inside the deferred path called
into rtld for symbol binding. Than, rtld lock is locked, some code in
rtld is executed, and rtld lock is unlocked. Unlock causes _thr_ast()
run, which results in the nested check_deferred_signal() execution.
The check_deferred_signal() clearks si_signo, so on return the same
signal is delivered one more time, but is advertized as signo zero.
The _thr_rtld_init() approach of doing dummy calls does not really work,
since it is not practically possible to enumerate the symbols needed
during signal delivery.
My first attempt to fix this was to increment curthread->critical_count
around the calls to check_* functions in the _thr_ast(), but it causes
reverse problem of losing _thr_ast() runs on unlock.
I ended up with the flag to indicate that deferred delivery is running,
so check_deferred_signal() should avoid doing anything. A delicate
moment is that user signal handler is allowed to modify the passed
machine context to result the return from the signal handler to cause
arbitrary jump, or just do longjmp(). For this case, I also clear the
flag in thr_sighandler(), since kernel signal delivery means that nested
delivery code should not run right now.
Please try this.
diff --git a/lib/libthr/thread/thr_private.h b/lib/libthr/thread/thr_private.h
index 83a02b5..c6651cd 100644
--- a/lib/libthr/thread/thr_private.h
+++ b/lib/libthr/thread/thr_private.h
@@ -433,6 +433,9 @@ struct pthread {
/* the sigaction should be used for deferred signal. */
struct sigaction deferred_sigact;
+ /* deferred signal delivery is performed, do not reenter. */
+ int deferred_run;
+
/* Force new thread to exit. */
int force_exit;
diff --git a/lib/libthr/thread/thr_sig.c b/lib/libthr/thread/thr_sig.c
index 415ddb0..57c9406 100644
--- a/lib/libthr/thread/thr_sig.c
+++ b/lib/libthr/thread/thr_sig.c
@@ -162,6 +162,7 @@ thr_sighandler(int sig, siginfo_t *info, void *_ucp)
act = _thr_sigact[sig-1].sigact;
_thr_rwl_unlock(&_thr_sigact[sig-1].lock);
errno = err;
+ curthread->deferred_run = 0;
/*
* if a thread is in critical region, for example it holds low
level locks,
@@ -320,14 +321,18 @@ check_deferred_signal(struct pthread *curthread)
siginfo_t info;
int uc_len;
- if (__predict_true(curthread->deferred_siginfo.si_signo == 0))
+ if (__predict_true(curthread->deferred_siginfo.si_signo == 0 ||
+ curthread->deferred_run))
return;
+ curthread->deferred_run = 1;
uc_len = __getcontextx_size();
uc = alloca(uc_len);
getcontext(uc);
- if (curthread->deferred_siginfo.si_signo == 0)
+ if (curthread->deferred_siginfo.si_signo == 0) {
+ curthread->deferred_run = 0;
return;
+ }
__fillcontextx2((char *)uc);
act = curthread->deferred_sigact;
uc->uc_sigmask = curthread->deferred_sigmask;
--
Cedric Blancher <cedric.blancher at gmail.com>
Institute Pasteur
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 850 bytes
Desc: not available
URL: <http://lists.research.att.com/pipermail/ast-developers/attachments/20131122/67f74119/attachment.sig>
weirdness on FreeBSD and derivatives.
How far are we anyway with signal reliability vs volatile of the ksh93
struct jobs?
Ced
---------- Forwarded message ----------
From: Konstantin Belousov <kostikbel at gmail.com>
Date: 21 November 2013 22:15
Subject: Re: Problem with signal 0 being delivered to SIGUSR1 handler
To: Vitaly Magerya <vmagerya at gmail.com>
Cc: freebsd-hackers at freebsd.org, davidxu at freebsd.org, threads at freebsd.org
Hi, folks. I'm investigating a test case failure that devel/boehm-gc
has on recent FreeBSD releases. The problem is that a signal
handler registered for SIGUSR1 is sometimes called with signum=0,
which should not be possible under any conditions.
/* Compile with 'c99 -o example example.c -pthread'
*/
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
void signal_handler(int signum, siginfo_t *si, void *context) {
if (signum != SIGUSR1) {
printf("bad signal, signum=%d\n", signum);
exit(1);
}
}
void *thread_func(void *arg) {
return arg;
}
int main(void) {
struct sigaction sa = { 0 };
sa.sa_flags = SA_SIGINFO;
sa.sa_sigaction = signal_handler;
if (sigfillset(&sa.sa_mask) != 0) abort();
if (sigaction(SIGUSR1, &sa, NULL) != 0) abort();
for (int i = 0; i < 10000; i++) {
pthread_t t;
pthread_create(&t, NULL, thread_func, NULL);
pthread_kill(t, SIGUSR1);
Side note. pthread_kill(3) call behaviour is undefined if pthread_create(3)has on recent FreeBSD releases. The problem is that a signal
handler registered for SIGUSR1 is sometimes called with signum=0,
which should not be possible under any conditions.
/* Compile with 'c99 -o example example.c -pthread'
*/
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
void signal_handler(int signum, siginfo_t *si, void *context) {
if (signum != SIGUSR1) {
printf("bad signal, signum=%d\n", signum);
exit(1);
}
}
void *thread_func(void *arg) {
return arg;
}
int main(void) {
struct sigaction sa = { 0 };
sa.sa_flags = SA_SIGINFO;
sa.sa_sigaction = signal_handler;
if (sigfillset(&sa.sa_mask) != 0) abort();
if (sigaction(SIGUSR1, &sa, NULL) != 0) abort();
for (int i = 0; i < 10000; i++) {
pthread_t t;
pthread_create(&t, NULL, thread_func, NULL);
pthread_kill(t, SIGUSR1);
in the line before failed.
}
return 0;
}
Under FreeBSD 9.2-RELEASE amd64 I pretty consistently get
"signum=0" from this program, but you may need to run it a few
times or increase the number of iterations to see the same.
Interestingly enough, I don't see this behavior under 9.0-RELEASE.
So, any ideas what the problem here is?
It happens when libthr deferred signal handling path is taken for signalreturn 0;
}
Under FreeBSD 9.2-RELEASE amd64 I pretty consistently get
"signum=0" from this program, but you may need to run it a few
times or increase the number of iterations to see the same.
Interestingly enough, I don't see this behavior under 9.0-RELEASE.
So, any ideas what the problem here is?
delivery and for some reason the code inside the deferred path called
into rtld for symbol binding. Than, rtld lock is locked, some code in
rtld is executed, and rtld lock is unlocked. Unlock causes _thr_ast()
run, which results in the nested check_deferred_signal() execution.
The check_deferred_signal() clearks si_signo, so on return the same
signal is delivered one more time, but is advertized as signo zero.
The _thr_rtld_init() approach of doing dummy calls does not really work,
since it is not practically possible to enumerate the symbols needed
during signal delivery.
My first attempt to fix this was to increment curthread->critical_count
around the calls to check_* functions in the _thr_ast(), but it causes
reverse problem of losing _thr_ast() runs on unlock.
I ended up with the flag to indicate that deferred delivery is running,
so check_deferred_signal() should avoid doing anything. A delicate
moment is that user signal handler is allowed to modify the passed
machine context to result the return from the signal handler to cause
arbitrary jump, or just do longjmp(). For this case, I also clear the
flag in thr_sighandler(), since kernel signal delivery means that nested
delivery code should not run right now.
Please try this.
diff --git a/lib/libthr/thread/thr_private.h b/lib/libthr/thread/thr_private.h
index 83a02b5..c6651cd 100644
--- a/lib/libthr/thread/thr_private.h
+++ b/lib/libthr/thread/thr_private.h
@@ -433,6 +433,9 @@ struct pthread {
/* the sigaction should be used for deferred signal. */
struct sigaction deferred_sigact;
+ /* deferred signal delivery is performed, do not reenter. */
+ int deferred_run;
+
/* Force new thread to exit. */
int force_exit;
diff --git a/lib/libthr/thread/thr_sig.c b/lib/libthr/thread/thr_sig.c
index 415ddb0..57c9406 100644
--- a/lib/libthr/thread/thr_sig.c
+++ b/lib/libthr/thread/thr_sig.c
@@ -162,6 +162,7 @@ thr_sighandler(int sig, siginfo_t *info, void *_ucp)
act = _thr_sigact[sig-1].sigact;
_thr_rwl_unlock(&_thr_sigact[sig-1].lock);
errno = err;
+ curthread->deferred_run = 0;
/*
* if a thread is in critical region, for example it holds low
level locks,
@@ -320,14 +321,18 @@ check_deferred_signal(struct pthread *curthread)
siginfo_t info;
int uc_len;
- if (__predict_true(curthread->deferred_siginfo.si_signo == 0))
+ if (__predict_true(curthread->deferred_siginfo.si_signo == 0 ||
+ curthread->deferred_run))
return;
+ curthread->deferred_run = 1;
uc_len = __getcontextx_size();
uc = alloca(uc_len);
getcontext(uc);
- if (curthread->deferred_siginfo.si_signo == 0)
+ if (curthread->deferred_siginfo.si_signo == 0) {
+ curthread->deferred_run = 0;
return;
+ }
__fillcontextx2((char *)uc);
act = curthread->deferred_sigact;
uc->uc_sigmask = curthread->deferred_sigmask;
--
Cedric Blancher <cedric.blancher at gmail.com>
Institute Pasteur
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 850 bytes
Desc: not available
URL: <http://lists.research.att.com/pipermail/ast-developers/attachments/20131122/67f74119/attachment.sig>