Login fail after core-update without reboot

Open. Submitted by Pierre-Antoine Rouby.
Details
3 participants
  • Ludovic Courtès
  • Maxim Cournoyer
  • Pierre-Antoine Rouby
Owner
unassigned
Severity
important
Pierre-Antoine Rouby wrote on 17 Jul 2018 10:30
(address . bug-guix@gnu.org)
655514906.8428771.1531816208634.JavaMail.zimbra@inria.fr
Hi Guix,

I found a problem with 'guix system reconfigure' and core-updates. After
reconfiguring, it's impossible to log in on a tty; 'login' segfaults
with this error:

----------------------------------------------------------------------
login[30083]: segfault at 968 ip 00007f6ae6168ec8 sp 00007ffc7bd0f420 error 4 in libpthread-2.27.so[7f6ae6163000+19000]
----------------------------------------------------------------------

I think 'login' tries to use glibc-2.27 but it's still configured to use
glibc-2.26. It's possible this issue comes from '/etc/pam.d/login'.

I had to reboot my system.

gdb trace:
----------------------------------------------------------------------
process 24717 is executing new program: /gnu/store/31qbd404pmlm5bmb0l0r147mnjxzpq3y-shadow-4.6/bin/login
warning: Could not load shared library symbols for linux-vdso.so.1.
Do you need "set solib-search-path" or "set sysroot"?
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/gnu/store/l4lr0f5cjd0nbsaaf8b5dmcw1a1yypr3-glibc-2.27/lib/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x00007fb95cabaec8 in __pthread_initialize_minimal_internal () from /gnu/store/l4lr0f5cjd0nbsaaf8b5dmcw1a1yypr3-glibc-2.27/lib/libpthread.so.0
(gdb) bt
#0 0x00007fb95cabaec8 in __pthread_initialize_minimal_internal () from /gnu/store/l4lr0f5cjd0nbsaaf8b5dmcw1a1yypr3-glibc-2.27/lib/libpthread.so.0
#1 0x00007fb95caba621 in _init () from /gnu/store/l4lr0f5cjd0nbsaaf8b5dmcw1a1yypr3-glibc-2.27/lib/libpthread.so.0
#2 0x00007fb95d8dcaa0 in ?? () from /gnu/store/gwyb3679v49ljisgkvzay2xa3njgq4ii-linux-pam-1.3.0/lib/security/pam_env.so
#3 0x00007fb95eb1f33a in call_init.part () from /gnu/store/4sqaib7c2dfjv62ivrg9b8wa7bh226la-glibc-2.26.105-g0890d5379c/lib/ld-linux-x86-64.so.2
#4 0x00007fb95eb1f4f5 in _dl_init () from /gnu/store/4sqaib7c2dfjv62ivrg9b8wa7bh226la-glibc-2.26.105-g0890d5379c/lib/ld-linux-x86-64.so.2
#5 0x00007fb95eb23980 in dl_open_worker () from /gnu/store/4sqaib7c2dfjv62ivrg9b8wa7bh226la-glibc-2.26.105-g0890d5379c/lib/ld-linux-x86-64.so.2
#6 0x00007fb95e058901 in _dl_catch_error () from /gnu/store/4sqaib7c2dfjv62ivrg9b8wa7bh226la-glibc-2.26.105-g0890d5379c/lib/libc.so.6
#7 0x00007fb95eb23127 in _dl_open () from /gnu/store/4sqaib7c2dfjv62ivrg9b8wa7bh226la-glibc-2.26.105-g0890d5379c/lib/ld-linux-x86-64.so.2
#8 0x00007fb95e4f9f96 in dlopen_doit () from /gnu/store/4sqaib7c2dfjv62ivrg9b8wa7bh226la-glibc-2.26.105-g0890d5379c/lib/libdl.so.2
#9 0x00007fb95e058901 in _dl_catch_error () from /gnu/store/4sqaib7c2dfjv62ivrg9b8wa7bh226la-glibc-2.26.105-g0890d5379c/lib/libc.so.6
#10 0x00007fb95e4fa5a9 in _dlerror_run () from /gnu/store/4sqaib7c2dfjv62ivrg9b8wa7bh226la-glibc-2.26.105-g0890d5379c/lib/libdl.so.2
#11 0x00007fb95e4fa021 in dlopen@@GLIBC_2.2.5 () from /gnu/store/4sqaib7c2dfjv62ivrg9b8wa7bh226la-glibc-2.26.105-g0890d5379c/lib/libdl.so.2
#12 0x00007fb95e701f4d in _pam_load_module () from /gnu/store/gwyb3679v49ljisgkvzay2xa3njgq4ii-linux-pam-1.3.0/lib/libpam.so.0
#13 0x00007fb95e7025d9 in _pam_add_handler () from /gnu/store/gwyb3679v49ljisgkvzay2xa3njgq4ii-linux-pam-1.3.0/lib/libpam.so.0
#14 0x00007fb95e702cd6 in _pam_parse_conf_file () from /gnu/store/gwyb3679v49ljisgkvzay2xa3njgq4ii-linux-pam-1.3.0/lib/libpam.so.0
#15 0x00007fb95e7033d7 in _pam_init_handlers () from /gnu/store/gwyb3679v49ljisgkvzay2xa3njgq4ii-linux-pam-1.3.0/lib/libpam.so.0
#16 0x00007fb95e704bc1 in pam_start () from /gnu/store/gwyb3679v49ljisgkvzay2xa3njgq4ii-linux-pam-1.3.0/lib/libpam.so.0
#17 0x0000000000402f2c in main ()
----------------------------------------------------------------------

----------------------------------------------------------------------
(gdb) info sharedlibrary
From To Syms Read Shared Object Library
0x00007fb95eb10cc0 0x00007fb95eb2c990 Yes (*) /gnu/store/4sqaib7c2dfjv62ivrg9b8wa7bh226la-glibc-2.26.105-g0890d5379c/lib/ld-linux-x86-64.so.2
No linux-vdso.so.1
0x00007fb95e90d1b0 0x00007fb95e90de59 Yes (*) /gnu/store/gwyb3679v49ljisgkvzay2xa3njgq4ii-linux-pam-1.3.0/lib/libpam_misc.so.0
0x00007fb95e6ff9d0 0x00007fb95e706ae5 Yes (*) /gnu/store/gwyb3679v49ljisgkvzay2xa3njgq4ii-linux-pam-1.3.0/lib/libpam.so.0
0x00007fb95e4f9de0 0x00007fb95e4faa27 Yes (*) /gnu/store/4sqaib7c2dfjv62ivrg9b8wa7bh226la-glibc-2.26.105-g0890d5379c/lib/libdl.so.2
0x00007fb95e2e4a40 0x00007fb95e2f4775 Yes (*) /gnu/store/2ifmksc425qcysl5rkxkbv6yrgc1w9cs-gcc-5.5.0-lib/lib/libgcc_s.so.1
0x00007fb95df50750 0x00007fb95e088fac Yes (*) /gnu/store/4sqaib7c2dfjv62ivrg9b8wa7bh226la-glibc-2.26.105-g0890d5379c/lib/libc.so.6
0x00007fb95dd1a6a0 0x00007fb95dd20af8 Yes (*) /gnu/store/gwyb3679v49ljisgkvzay2xa3njgq4ii-linux-pam-1.3.0/lib/security/pam_unix.so
0x00007fb95dae0ba0 0x00007fb95dae5f47 Yes (*) /gnu/store/4sqaib7c2dfjv62ivrg9b8wa7bh226la-glibc-2.26.105-g0890d5379c/lib/libcrypt.so.1
0x00007fb95cabab40 0x00007fb95cac8657 Yes (*) /gnu/store/l4lr0f5cjd0nbsaaf8b5dmcw1a1yypr3-glibc-2.27/lib/libpthread.so.0
0x00007fb95d8dcf30 0x00007fb95d8de421 Yes (*) /gnu/store/gwyb3679v49ljisgkvzay2xa3njgq4ii-linux-pam-1.3.0/lib/security/pam_env.so
(*): Shared library is missing debugging information.
----------------------------------------------------------------------
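
The mismatch in the listing above can also be spotted mechanically. Here is
a minimal sketch in Python, assuming library paths of the form
/gnu/store/<hash>-glibc-<version>/...; the function name `glibc_versions`
is hypothetical and not part of Guix or gdb.

```python
import re

# Matches a store item such as
#   /gnu/store/<hash>-glibc-2.27/lib/libpthread.so.0
# and captures the version part of the directory name.
GLIBC_RE = re.compile(r"/gnu/store/[0-9a-z]+-glibc-([^/]+)/")


def glibc_versions(listing):
    """Return the set of glibc versions referenced in LISTING (a string)."""
    return set(GLIBC_RE.findall(listing))


# Two paths copied from the 'info sharedlibrary' output above.
sample = """\
/gnu/store/4sqaib7c2dfjv62ivrg9b8wa7bh226la-glibc-2.26.105-g0890d5379c/lib/libc.so.6
/gnu/store/l4lr0f5cjd0nbsaaf8b5dmcw1a1yypr3-glibc-2.27/lib/libpthread.so.0
"""

versions = glibc_versions(sample)
if len(versions) > 1:
    # More than one glibc mapped into the same process is the symptom here.
    print("mixed glibc versions:", sorted(versions))
```

Feeding it the full 'info sharedlibrary' output should report two versions
for the crashing 'login' process.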

--
Pierre-Antoine Rouby
Ludovic Courtès wrote on 23 Jul 2018 15:17
(name . Pierre-Antoine Rouby)(address . pierre-antoine.rouby@inria.fr)(address . 32182@debbugs.gnu.org)
87r2juqegc.fsf@gnu.org
Hello!

Pierre-Antoine Rouby <pierre-antoine.rouby@inria.fr> skribis:

> I think 'login' tries to use glibc-2.27 but it's still configured to use
> glibc-2.26. It's possible this issue comes from '/etc/pam.d/login'.

Indeed. The problem here is that ‘reconfigure’ updates /etc/pam.d, but
does not change the service definition of ‘login’, etc. Thus, when
‘login’ restarts, it reads the new /etc/pam.d/login, which contains a
line like:

session required /gnu/store/…-elogind-232.4/lib/security/pam_elogind.so

Consequently, ‘login’ dlopens pam_elogind.so, which is linked against
the new libc, which eventually causes it to crash.
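
To see which modules a given PAM file will make a program dlopen, one can
pull the module path out of each line. This is only a rough sketch in
Python, assuming the usual /etc/pam.d layout of "type control module-path
[arguments]"; the helper name `pam_module_path` is made up for
illustration, and the store path keeps the elided "…" from the line above.

```python
import re

# type, then control (a word or a [bracketed list]), then the module path.
PAM_LINE_RE = re.compile(
    r"^\s*-?(?:auth|account|password|session)\s+"
    r"(?:\[[^\]]*\]|\S+)\s+"
    r"(\S+)",
    re.IGNORECASE,
)


def pam_module_path(line):
    """Return the module path named on a pam.d LINE, or None."""
    match = PAM_LINE_RE.match(line)
    return match.group(1) if match else None


# The hash is elided ("…") in the original message and is left as-is.
line = "session required /gnu/store/…-elogind-232.4/lib/security/pam_elogind.so"
print(pam_module_path(line))
```

Each path returned this way is a shared object that the already-running
program loads into its own address space, next to whatever libc it was
started with.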

It’s a real issue on headless servers because you could lock yourself
out (‘sshd’ could have the same problem.)

I can think of several solutions:

1. Arrange for services to refer to /gnu/store/…-pam.d instead of
/etc/pam.d. This can maybe be achieved by modifying PAM such that
these applications honor $PAM_DIRECTORY or something like that.

2. Add support for “service chain-loading” in the Shepherd and/or
GuixSD. The idea is that, for services that cannot be restarted
right away because they are currently running, register code to
upgrade the service next time it is restarted (see
https://bugs.gnu.org/30706). That way, when ‘login’ restarts
after ‘reconfigure’, it’s the new ‘login’ service that would be
restarted.

Thoughts?

Ludo’.
Ludovic Courtès wrote on 8 Sep 2018 23:05
control message for bug #32182
(address . control@debbugs.gnu.org)
87h8izem3o.fsf@gnu.org
severity 32182 important
Ludovic Courtès wrote on 2 May 2020 16:37
Re: bug#32182: Login fail after core-update without reboot
(name . Pierre-Antoine Rouby)(address . pierre-antoine.rouby@inria.fr)(address . 32182@debbugs.gnu.org)
87mu6q5rnn.fsf@gnu.org
Hi, old bug! :-)

ludo@gnu.org (Ludovic Courtès) skribis:

> I can think of several solutions:
>
> 1. Arrange for services to refer to /gnu/store/…-pam.d instead of
> /etc/pam.d. This can maybe be achieved by modifying PAM such that
> these applications honor $PAM_DIRECTORY or something like that.

We should look into that.

> 2. Add support for “service chain-loading” in the Shepherd and/or
> GuixSD. The idea is that, for services that cannot be restarted
> right away because they are currently running, register code to
> upgrade the service next time it is restarted (see
> <https://bugs.gnu.org/30706>). That way, when ‘login’ restarts
> after ‘reconfigure’, it’s the new ‘login’ service that would be
> restarted.

That bit was implemented long ago with Shepherd service replacements.
So at least, now, one can run ‘herd start term-tty1’ or similar to get a
working login:


Ludo’.
Maxim Cournoyer wrote on 16 Dec 2021 16:56
(name . Ludovic Courtès)(address . ludo@gnu.org)
878rwksfvt.fsf@gmail.com
Hello,

Ludovic Courtès <ludo@gnu.org> writes:

[...]

>> I can think of several solutions:
>>
>> 1. Arrange for services to refer to /gnu/store/…-pam.d instead of
>> /etc/pam.d. This can maybe be achieved by modifying PAM such that
>> these applications honor $PAM_DIRECTORY or something like that.
>>
>> 2. Add support for “service chain-loading” in the Shepherd and/or
>> GuixSD. The idea is that, for services that cannot be restarted
>> right away because they are currently running, register code to
>> upgrade the service next time it is restarted (see
>> <https://bugs.gnu.org/30706>). That way, when ‘login’ restarts
>> after ‘reconfigure’, it’s the new ‘login’ service that would be
>> restarted.
>
> That bit was implemented long ago with Shepherd service replacements.
> So at least, now, one can run ‘herd start term-tty1’ or similar to get a
> working login:

Point 2 doesn't seem to help in working around or fixing the related
#52533 though, correct? Restarting the remote elogind or even
ssh-daemon doesn't work there, perhaps because 'guix deploy' wasn't able
to complete in the first place.

I guess that means we should look into fixing point 1., as you already
suggested. On top of that I'd propose disabling PAM unless there's a
good reason to have it on by default; as I wrote in the other issue,
`man sshd_config' documents that by default in OpenSSH it is disabled.

Thanks,

Maxim
Maxim Cournoyer wrote on 16 Dec 2021 17:15
(name . Ludovic Courtès)(address . ludo@gnu.org)
875yrosez3.fsf@gmail.com
Hi again,

Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:

> Hello,
>
> Ludovic Courtès <ludo@gnu.org> writes:
>
> [...]
>
>>> I can think of several solutions:
>>>
>>> 1. Arrange for services to refer to /gnu/store/…-pam.d instead of
>>> /etc/pam.d. This can maybe be achieved by modifying PAM such that
>>> these applications honor $PAM_DIRECTORY or something like that.
>>>
>>> 2. Add support for “service chain-loading” in the Shepherd and/or
>>> GuixSD. The idea is that, for services that cannot be restarted
>>> right away because they are currently running, register code to
>>> upgrade the service next time it is restarted (see
>>> <https://bugs.gnu.org/30706>). That way, when ‘login’ restarts
>>> after ‘reconfigure’, it’s the new ‘login’ service that would be
>>> restarted.
>>
>> That bit was implemented long ago with Shepherd service replacements.
>> So at least, now, one can run ‘herd start term-tty1’ or similar to get a
>> working login:
>
> Point 2 doesn't seem to help in working around or fixing the related
> #52533 though, correct? Restarting the remote elogind or even
> ssh-daemon doesn't work there, perhaps because 'guix deploy' wasn't able
> to complete in the first place.

Another bit that probably played a role here: the above failure to
complete is perhaps caused or made worse by #41238 (guix deploy close ssh
session after each store items sent): 'guix deploy' doesn't reuse a
single stable SSH session for everything it needs to do.

Maxim