'guix system reconfigure' can fail to load new system services

Open. Submitted by Andreas Enge.
Details
4 participants
  • Andreas Enge
  • Danny Milosavljevic
  • Ludovic Courtès
  • Ricardo Wurmus
Owner: unassigned
Severity: important
Andreas Enge wrote on 5 Mar 2018 00:06
Re: Nginx service fails
(address . guix-devel@gnu.org)(address . bug-guix@gnu.org)
20180304230637.GA24077@jurong
Well, I am turning this into a bug report, since it still occurs with
the latest git commit ac1a9ce8b07f3b80900ee08436ff6e683e8dc195 .

This is the result of "./pre-inst-env guix system reconfigure ...",
where "..." is my configuration file:

...
creating nginx log directory '/var/log/nginx'
creating nginx run directory '/var/run/nginx'
creating nginx temp directories '/var/run/nginx/{client_body,proxy,fastcgi,uwsgi,scgi}_temp'
nginx: [alert] could not open error log file: open() "/gnu/store/pp71iff1qxwhh82vm34g18h9kmn0xrg5-nginx-1.13.9/logs/error.log" failed (2: No such file or directory)
nginx: the configuration file /gnu/store/5ixkryw6jl32cm6d1g9jb8dm9rbz8csc-nginx.conf syntax is ok
nginx: configuration file /gnu/store/5ixkryw6jl32cm6d1g9jb8dm9rbz8csc-nginx.conf test is successful
`/gnu/store/zchh8s3r1bbmia3zfxsyhsz3c4b9fmps-openssh-authorized-keys/root' -> `/etc/ssh/authorized_keys.d/root'
`/gnu/store/zchh8s3r1bbmia3zfxsyhsz3c4b9fmps-openssh-authorized-keys/andreas' -> `/etc/ssh/authorized_keys.d/andreas'
guix system: loading new services: user-homes term-auto nginx...
shepherd: Evaluating user expression (register-services (primitive-load "/gnu/st?") ?).
guix system: error: exception caught while executing 'eval' on service 'root':
find-long-options: unbound variable
Installing for i386-pc platform.
/gnu/store/1dnbfda2p1bxwyl0rcm96ka9pmi0wb88-grub-2.02/sbin/grub-install: warning: disk does not exist, so falling back to partition device /dev/xvda2.
/gnu/store/1dnbfda2p1bxwyl0rcm96ka9pmi0wb88-grub-2.02/sbin/grub-install: warning: disk does not exist, so falling back to partition device /dev/xvda2.
/gnu/store/1dnbfda2p1bxwyl0rcm96ka9pmi0wb88-grub-2.02/sbin/grub-install: warning: disk does not exist, so falling back to partition device /dev/xvda2.
/gnu/store/1dnbfda2p1bxwyl0rcm96ka9pmi0wb88-grub-2.02/sbin/grub-install: error: cannot find a GRUB drive for /dev/sda. Check your device.map.
guix system: error: failed to install bootloader /gnu/store/9iv63jm07klxvrr4fpwv6q5vpnca13ja-bootloader-installer

The final error is "normal", since I am installing in a Xen virtual machine,
where /dev/sda does not exist; it did not matter before.
The real error occurs above: the exception caught while executing 'eval'
on service 'root'.

But:
# herd status nginx
herd: service 'nginx' could not be found

Then I do a
# ./pre-inst-env guix system roll-back
# herd status nginx
herd: service 'nginx' could not be found

In other words, I can roll back, but my previously running web server is
definitely gone! How do I get it back?

Andreas
Ricardo Wurmus wrote on 5 Mar 2018 08:23
Re: bug#30706: Nginx service fails
(name . Andreas Enge)(address . andreas@enge.fr)
87tvtvyoih.fsf@elephly.net
Andreas Enge <andreas@enge.fr> writes:

> Well, I am turning this into a bug report, since it still occurs with
> the latest git commit ac1a9ce8b07f3b80900ee08436ff6e683e8dc195 .
>
> This is the result of "./pre-inst-env guix system reconfigure ...",
> where "..." is my configuration file:
>
> ...
> creating nginx log directory '/var/log/nginx'
> creating nginx run directory '/var/run/nginx'
> creating nginx temp directories '/var/run/nginx/{client_body,proxy,fastcgi,uwsgi,scgi}_temp'
> nginx: [alert] could not open error log file: open() "/gnu/store/pp71iff1qxwhh82vm34g18h9kmn0xrg5-nginx-1.13.9/logs/error.log" failed (2: No such file or directory)
> nginx: the configuration file /gnu/store/5ixkryw6jl32cm6d1g9jb8dm9rbz8csc-nginx.conf syntax is ok
> nginx: configuration file /gnu/store/5ixkryw6jl32cm6d1g9jb8dm9rbz8csc-nginx.conf test is successful
> `/gnu/store/zchh8s3r1bbmia3zfxsyhsz3c4b9fmps-openssh-authorized-keys/root' -> `/etc/ssh/authorized_keys.d/root'
> `/gnu/store/zchh8s3r1bbmia3zfxsyhsz3c4b9fmps-openssh-authorized-keys/andreas' -> `/etc/ssh/authorized_keys.d/andreas'
> guix system: loading new services: user-homes term-auto nginx...
> shepherd: Evaluating user expression (register-services (primitive-load "/gnu/st?") ?).
> guix system: error: exception caught while executing 'eval' on service 'root':
> find-long-options: unbound variable

I had the same error when updating my i686 netbook after a long while.
After a reboot everything seemed to be fine, though.

--
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6 2150 197A 5888 235F ACAC
Andreas Enge wrote on 5 Mar 2018 08:43
(name . Ricardo Wurmus)(address . rekado@elephly.net)
20180305074324.GA7822@jurong
On Mon, Mar 05, 2018 at 08:23:18AM +0100, Ricardo Wurmus wrote:
> I had the same error when updating my i686 netbook after a long while.
> After a reboot everything seemed to be fine, though.

Ah, thanks for the information! A reboot made things worse in my case -
I rebooted the virtual machine, and now I cannot ssh into it any more.
So it looks like I will have to set it up from scratch again...

In my case, the problem occurred between February 28 and March 4.

Andreas
Ludovic Courtès wrote on 5 Mar 2018 11:09
(name . Andreas Enge)(address . andreas@enge.fr)
87k1uqbzq9.fsf@gnu.org
Andreas Enge <andreas@enge.fr> skribis:

> guix system: loading new services: user-homes term-auto nginx...
> shepherd: Evaluating user expression (register-services (primitive-load "/gnu/st?") ?).
> guix system: error: exception caught while executing 'eval' on service 'root':
> find-long-options: unbound variable

The problem we have here is that the agetty service expects
‘find-long-options’ from linux-boot.scm, and it expects it at the top
level.

So what happens above is that we evaluate in PID 1 code like:

(make <service>
  ;; …
  #:start (let ((tty … (find-long-options …) …))
            …))

If you run this on an “old” GuixSD, ‘find-long-options’ is undefined.

Thus the whole (register-services …) expression fails to evaluate, and
we end up with some of the services missing.

Conclusions:

  1. ‘guix system reconfigure’ should probably register services one by
     one so that if one of the service expressions is erroneous, we
     don’t bork everything. See ‘upgrade-shepherd-services’.

  2. It would be nice to delay execution of this whole default-tty thing
     to the #:start method. Ideas, Danny?

In general we should do as little as possible at the top level in the
Shepherd config file.
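
A minimal sketch of conclusion 1, in plain Guile rather than the actual
Shepherd or ‘upgrade-shepherd-services’ code: each service expression is
evaluated on its own, so one that refers to an unbound variable is
reported instead of taking the others down with it. The names
‘service-expressions’ and ‘load-services-one-by-one’ and the placeholder
expressions are hypothetical, made up for illustration.

```scheme
(use-modules (ice-9 match))

;; Hypothetical stand-ins for per-service expressions.  The second one
;; calls 'find-long-options', which is unbound in this process, much like
;; in the "old" PID 1 described above.
(define service-expressions
  '((nginx      . (list 'service 'nginx))
    (term-auto  . (find-long-options "console" '()))
    (user-homes . (list 'service 'user-homes))))

(define (load-services-one-by-one exprs)
  "Evaluate each (NAME . EXPRESSION) pair in EXPRS separately.  Return two
values: the names that evaluated fine and the names that raised an error."
  (let loop ((exprs exprs) (loaded '()) (failed '()))
    (match exprs
      (()
       (values (reverse loaded) (reverse failed)))
      (((name . expr) . rest)
       ;; 'false-if-exception' turns any error raised by EXPR into #f.
       (if (false-if-exception
            (begin (eval expr (interaction-environment)) #t))
           (loop rest (cons name loaded) failed)
           (loop rest loaded (cons name failed)))))))

(call-with-values
    (lambda () (load-services-one-by-one service-expressions))
  (lambda (loaded failed)
    (format #t "loaded: ~a~%failed: ~a~%" loaded failed)))
```

Here only ‘term-auto’ ends up in the failed list; evaluating all three
expressions as one form would have raised the error for the whole batch.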

Ludo’.
Danny Milosavljevic wrote on 6 Mar 2018 17:24
(name . Ludovic Courtès)(address . ludo@gnu.org)
20180306172442.76df0bb1@scratchpost.org
Hi Ludo,

> If you run this on an “old” GuixSD, ‘find-long-options’ is undefined.

How can it be that (gnu services base), with its find-long-options call, is
present but (gnu build linux-boot)'s find-long-options isn't?

Aren't they either both added by "guix system reconfigure" (or both removed)?

Also, when selecting an old generation in the GRUB boot menu, aren't both
(gnu build linux-boot) and (gnu services base) at the same generation when
starting up the service (as opposed to stopping the old service)?

> 1. ‘guix system reconfigure’ should probably register services one by
> one so that if one of the service expressions is erroneous, we
> don’t bork everything. See ‘upgrade-shepherd-services’.

Yes please.

> 2. IWBN to delay execution of this whole default-tty thing to the
> #:start method. Ideas, Danny?

The idea was that if you specify a serial console at boot, you can
actually log in at that console.

So it's trying to find out whether, at the time of service start,
there is a serial console specified (in the Linux command line), and if
so, start an agetty. Otherwise do not start that agetty.

We could also do that without a guix service - but I thought it would be
nice to have a guix service for it as well.
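
A minimal sketch of that check in plain Guile. It is not the
‘find-long-options’ from (gnu build linux-boot): ‘long-option-values’ is a
hypothetical stand-in, and the kernel command line below is made up (on a
running system it would come from /proc/cmdline).

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical stand-in: collect the values of every "OPTION=value"
;; argument in ARGUMENTS, or the empty list if OPTION does not appear.
(define (long-option-values option arguments)
  (let ((prefix (string-append option "=")))
    (filter-map (lambda (argument)
                  (and (string-prefix? prefix argument)
                       (substring argument (string-length prefix))))
                arguments)))

;; Made-up kernel command line, used only for illustration.
(define example-command-line
  "root=/dev/xvda2 console=tty0 console=ttyS0,115200")

(define consoles
  (long-option-values "console" (string-tokenize example-command-line)))

;; Start an agetty only if some console was specified.
(if (null? consoles)
    (display "no console specified, not starting agetty\n")
    (format #t "console options: ~s~%" consoles))
```

With the command line above, ‘consoles’ is ("tty0" "ttyS0,115200"); with
no ‘console=’ argument it is the empty list and no agetty would be started.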


Ludovic Courtès wrote on 8 Mar 2018 10:08
control message for bug #30706
(address . control@debbugs.gnu.org)
87r2ovndcm.fsf@gnu.org
retitle 30706 'guix system reconfigure' can fail to load new system services
Ludovic Courtès wrote on 8 Mar 2018 10:09
(address . control@debbugs.gnu.org)
87po4fndcj.fsf@gnu.org
severity 30706 important
Ludovic Courtès wrote on 10 Mar 2018 16:30
Re: bug#30706: Nginx service fails
(name . Danny Milosavljevic)(address . dannym@scratchpost.org)
87o9jwuewo.fsf@gnu.org
Heya,

Danny Milosavljevic <dannym@scratchpost.org> skribis:

>> If you run this on an “old” GuixSD, ‘find-long-options’ is undefined.
>
> How can it be that (gnu services base) with find-long-options call is present
> but the (gnu build linux-boot)'s find-long options isn't present?

The service-upgrade code loads new service definitions in PID 1.
However, it does not force a reload of already-loaded modules.

What happens here is that (gnu build linux-boot), the one without
‘find-long-options’, is already loaded in PID 1. Thus, we end up
using that one, which lacks ‘find-long-options’.

We could call ‘reload-module’, but that’s probably not a great idea as
it could cause breakage in previously-loaded code in PID 1. So I think
the current approach is the safest, and breakage of this sort should be
quite rare; we should pay attention to such issues, though, and try hard
to avoid them.

(Note that there’s no problem once you reboot, of course.)
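
The mechanism can be reproduced in any Guile process. This is a sketch
with a made-up module name, (demo linux-boot), not the service-upgrade
code: once a module exists in the process, a later ‘resolve-module’ call
hands back that same instance, so a binding added to the module's source
file afterwards is not visible.

```scheme
;; Simulate a process in which (demo linux-boot) is already loaded but
;; predates 'find-long-options'.  The module name and the binding defined
;; in it are hypothetical.
(define old-module
  (resolve-module '(demo linux-boot) #f #:ensure #t))
(module-define! old-module 'linux-command-line
                (lambda () "root=/dev/xvda2"))

;; Code loaded later asks for the same module name and gets the instance
;; that is already in memory; nothing is re-read from disk.
(define seen-by-new-code
  (resolve-module '(demo linux-boot) #f #:ensure #t))

(format #t "same module object: ~a~%"
        (eq? old-module seen-by-new-code))
(format #t "find-long-options defined: ~a~%"
        (module-defined? seen-by-new-code 'find-long-options))
```

A check such as ‘module-defined?’ is one way new code could detect the
situation before calling a procedure that an older module might lack.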

>> 1. ‘guix system reconfigure’ should probably register services one by
>> one so that if one of the service expressions is erroneous, we
>> don’t bork everything. See ‘upgrade-shepherd-services’.
>
> Yes please.
>
>> 2. IWBN to delay execution of this whole default-tty thing to the
>> #:start method. Ideas, Danny?
>
> The idea was that if you specify a serial console at boot that you can
> actually log in at that console.
>
> So it's trying to find out whether, at the time of service start,
> there is a serial console specified (in the Linux command line), and if
> so, start an agetty. Otherwise do not start that agetty.
>
> We could also do that without a guix service - but I thought it would be
> nice to have a guix service for it as well.

I agree. I think what you did in
c32e3ddedd103318ca3f0a4bf0c91c91e2517806 is good. The effect here is
just that agetty would fail to start upon reconfigure, but that’s an
acceptable limitation IMO.

Thanks,
Ludo’.