Hi!

On Thu 05 Jul 2018 05:33, Mark H Weaver writes:

>> One problem I’ve noticed is that the child process that
>> ‘call-with-decompressed-port’ spawns would be stuck trying to get the
>> allocation lock:
>>
>> So it seems quite clear that the thing has the alloc lock taken. I
>> suppose this can happen if one of the libgc threads runs right when we
>> call fork and takes the alloc lock, right?
>
> Does libgc spawn threads that run concurrently with user threads? If
> so, that would be news to me. My understanding was that incremental
> marking occurs within GC allocation calls, and marking threads are only
> spawned after all user threads have been stopped, but I could be wrong.

I think Mark is correct.

> The first idea that comes to my mind is that perhaps the finalization
> thread is holding the GC allocation lock when 'fork' is called.

So of course we agree you're only supposed to "fork" when there are no
other threads running, I think.

As far as the finalizer thread goes, "primitive-fork" calls
"scm_i_finalizer_pre_fork" which should join the finalizer thread,
before the fork. There could be a bug obviously but the intention is
for Guile to shut down its internal threads. Here's the body of
primitive-fork fwiw:

    {
      int pid;
      scm_i_finalizer_pre_fork ();
      if (scm_ilength (scm_all_threads ()) != 1)
        /* Other threads may be holding on to resources that Guile needs --
           it is not safe to permit one thread to fork while others are
           running.

           In addition, POSIX clearly specifies that if a multi-threaded
           program forks, the child must only call functions that are
           async-signal-safe. We can't guarantee that in general. The
           best we can do is to allow forking only very early, before any
           call to sigaction spawns the signal-handling thread. */
        scm_display
          (scm_from_latin1_string
           ("warning: call to primitive-fork while multiple threads are running;\n"
            " further behavior unspecified. See \"Processes\" in the\n"
            " manual, for more information.\n"),
           scm_current_warning_port ());
      pid = fork ();
      if (pid == -1)
        SCM_SYSERROR;
      return scm_from_int (pid);
    }

> Another possibility: both the finalization thread and the signal
> delivery thread call 'scm_without_guile', which calls 'GC_do_blocking',
> which also temporarily grabs the GC allocation lock before calling the
> specified function. See 'GC_do_blocking_inner' in pthread_support.c in
> libgc. You spawn the signal delivery thread by calling 'sigaction' and
> you make work for it to do every second when the SIGALRM is delivered.

The signal thread is a possibility though in that case you'd get a
warning; the signal-handling thread appears in scm_all_threads. Do you
see a warning? If you do, that is a problem :)

>> If that is correct, the fix would be to call fork within
>> ‘GC_call_with_alloc_lock’.
>>
>> How does that sound?
>
> Sure, sounds good to me.

I don't think this is necessary. I think the problem is that other
threads are running. If we solve that, then we solve this issue; if we
don't solve that, we don't know what else those threads are doing, so we
don't know what mutexes and other state they might have.

Andy