/run/booted-system is not protected from GC

Status: Done
Participants: Ludovic Courtès
Owner: unassigned
Submitted by: Ludovic Courtès
Severity: important

Ludovic Courtès wrote on 25 Feb 2021 10:00
/run/booted-system can be removed by ‘guix system delete-generations’
(address . bug-guix@gnu.org)
878s7c1pzh.fsf@inria.fr
/run/booted-system is not protected from GC. Here’s what I observed on
a machine with unattended upgrades (which includes automatic removal of
old system generations):

~$ ls -l /run/booted-system
lrwxrwxrwx 1 root root 33 Nov 2 16:06 /run/booted-system -> /var/guix/profiles/system-68-link
~$ ls -l /var/guix/profiles/system-68-link
ls: cannot access '/var/guix/profiles/system-68-link': No such file or directory
~$ ls -lrt /var/guix/profiles/system-*-link
lrwxrwxrwx 1 root root 50 Nov 29 01:34 /var/guix/profiles/system-74-link -> /gnu/store/ym7bs9pp9lxy0s1pjfrbic0pjjr7svzd-system
lrwxrwxrwx 1 root root 50 Dec 6 01:33 /var/guix/profiles/system-75-link -> /gnu/store/ivxak4d58gqz2xqihkc636nhwhpa1fs4-system
lrwxrwxrwx 1 root root 50 Dec 13 01:35 /var/guix/profiles/system-76-link -> /gnu/store/wqpwlqlfsc4yqm0nypzvan1a8sb9xmcc-system
lrwxrwxrwx 1 root root 50 Dec 27 01:33 /var/guix/profiles/system-77-link -> /gnu/store/y539xw934mbdcqidg6zaxrzq9hy8hm9p-system
lrwxrwxrwx 1 root root 50 Jan 4 09:29 /var/guix/profiles/system-78-link -> /gnu/store/3582jh1v9vn51wasyl1y189ng4vhqiy9-system
lrwxrwxrwx 1 root root 50 Jan 10 01:33 /var/guix/profiles/system-79-link -> /gnu/store/q67avpz4bfhq2zyfhh8ka6q9hpqzc3xj-system
lrwxrwxrwx 1 root root 50 Jan 17 01:33 /var/guix/profiles/system-80-link -> /gnu/store/13bwrvjgsl16sigwpa93yr4r51qnm8zi-system
lrwxrwxrwx 1 root root 50 Jan 19 14:23 /var/guix/profiles/system-81-link -> /gnu/store/v151wf6lj4ivgj3xwysi9fdmva55jzqp-system
lrwxrwxrwx 1 root root 50 Jan 24 01:35 /var/guix/profiles/system-82-link -> /gnu/store/rvarwdsymd94am8bc8b1rx2xdrxcvx6l-system
lrwxrwxrwx 1 root root 50 Jan 31 01:33 /var/guix/profiles/system-83-link -> /gnu/store/0sgd2yb702483zi3hl04wv4r4rn3ibcy-system
lrwxrwxrwx 1 root root 50 Feb 7 01:32 /var/guix/profiles/system-84-link -> /gnu/store/bix4yp9zs2h3vy8zi9ap9mazap727hng-system
lrwxrwxrwx 1 root root 50 Feb 14 01:33 /var/guix/profiles/system-85-link -> /gnu/store/702l59w3gbsc45c7nffsyv89vnaky5zc-system
lrwxrwxrwx 1 root root 50 Feb 21 01:34 /var/guix/profiles/system-86-link -> /gnu/store/qq4rz2fprvnsgqhj24v735hhmp189jl8-system

This is bad but mostly harmless, since store items in use by running
processes are GC-protected anyway (see ‘guix gc --list-busy’).

It breaks things like ‘guix deploy’ though. Specifically, its remote
initrd module check in (gnu machine ssh) looks for
/run/booted-system/kernel/lib/modules/KERNEL-VERSION/modules.alias, via
‘missing-modules’ of (gnu build linux-modules). That code throws
because ‘modules.alias’ is supposed to exist, which in turn leads ‘guix
deploy’ to crash badly:

$ guix time-machine -C channels.scm -- deploy deploy.scm
The following 1 machine will be deployed:
guix-hpc

guix deploy: deploying to guix-hpc...
substitute: updating substitutes from 'https://ci.guix.gnu.org'... 100.0%
The following derivations will be built:
/gnu/store/nh0n44vb4i445msj6g6i0kwyp3jgs39c-remote-exp.scm.drv
/gnu/store/44271p6fvygx0fa7dvjhmqz7jwnnr9sc-remote-assertion.scm.drv
/gnu/store/5nqzxyal9lhhga7bz9qf6hvml5v862xw-remote-assertion.scm.drv

0.0 MB will be downloaded
downloading from https://ci.guix.gnu.org/nar/lzip/1q9118pw4d18ihj91csfilbg6x2x29am-module-import-compiled ...
module-import-compiled 8KiB 1.7MiB/s 00:00 [##################] 100.0%

building /gnu/store/44271p6fvygx0fa7dvjhmqz7jwnnr9sc-remote-assertion.scm.drv...
building /gnu/store/5nqzxyal9lhhga7bz9qf6hvml5v862xw-remote-assertion.scm.drv...
building /gnu/store/nh0n44vb4i445msj6g6i0kwyp3jgs39c-remote-exp.scm.drv...
guix deploy: sending 8 store items (5 MiB) to 'localhost'...

FORMAT: error with call: (format #f "missing modules for ~a:~{ ~a~}<===~%" #<file-system-label "root"> ===>#f )
expected a list argument
FORMAT: INTERNAL ERROR IN FORMAT-ERROR!
destination: #f
format string: "missing modules for ~a:~{ ~a~}~%"
format args: (#<file-system-label "root"> #f)
error args: (#f "error in format" () #f)
Backtrace:
In guix/store.scm:
1305:8 19 (call-with-build-handler #<procedure 7f0d3cc35240 at guix/ui.scm:1171:2 (continue store things mode)> _)
In guix/scripts/deploy.scm:
170:14 18 (_)
In guix/store.scm:
1346:2 17 (map/accumulate-builds #<store-connection 256.99 7f0d39c6c000> _ _)
In srfi/srfi-1.scm:
586:17 16 (map1 (#<<unresolved> things: (("/gnu/store/b5nnbpgkvgdpzgvj67539ylcaqacj90l-guile-3.0.2.drv" . "out"…>))
In guix/store.scm:
1305:8 15 (call-with-build-handler #<procedure build-accumulator (continue store things mode)> _)
In ice-9/boot-9.scm:
1736:10 14 (with-exception-handler _ _ #:unwind? _ #:unwind-for-type _)
In guix/scripts/deploy.scm:
144:6 13 (_)
In guix/store.scm:
2066:24 12 (run-with-store #<store-connection 256.99 7f0d39c6c000> _ #:guile-for-build _ #:system _ #:target _)
In gnu/machine/ssh.scm:
445:2 11 (_ _)
338:4 10 (_ _)
In srfi/srfi-1.scm:
650:11 9 (for-each #<procedure 7f0d3e372d28 at gnu/machine/ssh.scm:338:14 (proc value)> _ _)
In gnu/machine/ssh.scm:
275:26 8 (_ #f)
In ice-9/format.scm:
1546:2 7 (format #f "missing modules for ~a:~{ ~a~}~%" #<file-system-label "root"> #f)
571:24 6 (format:format-work "missing modules for ~a:~{ ~a~}~%" (#<file-system-label "root"> #f))
In ice-9/boot-9.scm:
1736:10 5 (with-exception-handler _ _ #:unwind? _ #:unwind-for-type _)
In ice-9/format.scm:
102:10 4 (_)
In ice-9/boot-9.scm:
1669:16 3 (raise-exception _ #:continuable? _)
1669:16 2 (raise-exception _ #:continuable? _)
1669:16 1 (raise-exception _ #:continuable? _)
1669:16 0 (raise-exception _ #:continuable? _)

ice-9/boot-9.scm:1669:16: In procedure raise-exception:
error in format
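
A rough way to check for this failure mode on the target machine is to test whether ‘modules.alias’ is still reachable through /run/booted-system. This is only a sketch: `has_modules_alias` is a hypothetical helper, not part of Guix, and using `uname -r` for KERNEL-VERSION is an assumption.

```shell
# Hypothetical helper (not part of Guix): succeed if ROOT still provides
# modules.alias for the kernel version VERSION.
has_modules_alias() {
    root=$1
    version=$2
    [ -f "$root/kernel/lib/modules/$version/modules.alias" ]
}

# Assumption: the running kernel's version (uname -r) is the
# KERNEL-VERSION component of the path mentioned above.
if has_modules_alias /run/booted-system "$(uname -r)"; then
    echo "modules.alias found"
else
    echo "modules.alias missing: /run/booted-system may be dangling"
fi
```

On a machine in the state shown above, where /run/booted-system points to a deleted generation link, the second branch would be taken.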

Ludo’.
Ludovic Courtès wrote on 25 Feb 2021 10:59
control message for bug #46767
(address . control@debbugs.gnu.org)
871rd41n8w.fsf@gnu.org
severity 46767 important
quit
Ludovic Courtès wrote on 25 Feb 2021 11:00
(address . control@debbugs.gnu.org)
87zgzszcuy.fsf@gnu.org
retitle 46767 /run/booted-system is not protected from GC
quit
Ludovic Courtès wrote on 25 Feb 2021 11:44
Re: bug#46767: /run/booted-system can be removed by ‘guix system delete-generations’
(address . 46767@debbugs.gnu.org)
87r1l4zat9.fsf@gnu.org
Before rebooting, I had:

$ ls -l /run/{current,booted}-system
lrwxrwxrwx 1 root root 33 Nov 2 16:06 /run/booted-system -> /var/guix/profiles/system-68-link
lrwxrwxrwx 1 root root 50 Feb 21 01:34 /run/current-system -> /gnu/store/qq4rz2fprvnsgqhj24v735hhmp189jl8-system

After rebooting:

$ ls -l /run/{current,booted}-system
lrwxrwxrwx 1 root root 33 Feb 25 10:28 /run/booted-system -> /var/guix/profiles/system-86-link
lrwxrwxrwx 1 root root 33 Feb 25 10:28 /run/current-system -> /var/guix/profiles/system-86-link

/run/booted-system is created in ‘shepherd-boot-gexp’ as a symlink to
whatever /run/current-system points to:

(define (shepherd-boot-gexp config)
  "Return a gexp starting the shepherd service."
  (let ((shepherd (shepherd-configuration-shepherd config))
        (services (shepherd-configuration-services config)))
    #~(begin
        ;; Keep track of the booted system.
        (false-if-exception (delete-file "/run/booted-system"))
        (symlink (readlink "/run/current-system")
                 "/run/booted-system")
        …)))

So the solution is to make sure /run/current-system always points to the
store item rather than to the /var/guix symlink in the first place.

/run/current-system is created from (gnu build activation). When
reconfiguring or deploying, the symlink points to $GUIX_NEW_SYSTEM,
which is set to the store item in (guix scripts system reconfigure).

But when booting, /run/current-system is symlinked to the ‘--system’
kernel command-line argument, which is /var/guix/…. To address that, we
need to add a ‘canonicalize-path’ call so that the symlink target is the
store item itself.

Done in 412e4f081e9cdf38db9859e1548ef2362cde678e.
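
To illustrate why the indirection matters, here is a minimal sketch that reproduces the situation with throwaway paths under a temporary directory instead of the real /run, /var/guix and /gnu/store. The names `abc-system`, `booted-indirect` and `booted-direct` are made up for the example, and `readlink -f` stands in for Guile's ‘canonicalize-path’; this is not the code from the commit.

```shell
set -e
tmp=$(mktemp -d)
mkdir -p "$tmp/store/abc-system" "$tmp/profiles" "$tmp/run"

# A generation link, like /var/guix/profiles/system-N-link.
ln -s "$tmp/store/abc-system" "$tmp/profiles/system-1-link"

# Before the fix: the /run symlink points to the generation link.
ln -s "$tmp/profiles/system-1-link" "$tmp/run/booted-indirect"

# After the fix: resolve the chain first, so the target is the store item.
ln -s "$(readlink -f "$tmp/profiles/system-1-link")" "$tmp/run/booted-direct"

# Simulate deleting the generation, which removes only the link.
rm "$tmp/profiles/system-1-link"

# [ -e ] follows symlinks, so it fails on a dangling link.
if [ -e "$tmp/run/booted-indirect" ]; then indirect=ok; else indirect=dangling; fi
if [ -e "$tmp/run/booted-direct" ]; then direct=ok; else direct=dangling; fi

echo "indirect: $indirect"
echo "direct: $direct"

rm -rf "$tmp"
```

This prints "indirect: dangling" and "direct: ok": once the generation link is gone, only the symlink that names the store item directly still resolves.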

Ludo’.
Ludovic Courtès wrote on 25 Feb 2021 11:44
control message for bug #46767
(address . control@debbugs.gnu.org)
87pn0ozat0.fsf@gnu.org
tags 46767 fixed
close 46767
quit