Stale .go files are loaded when Guile and Guix are in the same prefix

Status: open. Submitted by Eric Bavier.
Participants: Eric Bavier, Ludovic Courtès
Owner: unassigned
Severity: important
Eric Bavier wrote on 20 Mar 2018 16:43
Commit bc499b113 broke guix on guile@2.0.14, improper <operating-system> field initialization
(address . bug-guix@gnu.org)
20180320154302.GL105827@pe06.us.cray.com
Hello Guix,

On the master branch (5d818b3557cc3b546d5bd0639359c14c7c0ab685), when
configured with guile@2.0.14, I get the following backtrace when
running `make`.

Backtrace:
In ice-9/boot-9.scm:
1739: 19 [#<procedure 34ebc6c0 ()>]
In unknown file:
?: 18 [primitive-load "/home/users/bavier/src/guix/./build-aux/compile-all.scm"]
In guix/build/compile.scm:
158: 17 [compile-files "." "/home/users/bavier/src/guix" ...]
107: 16 [load-files "." # # ...]
In ice-9/boot-9.scm:
2900: 15 [resolve-interface (gnu tests base) #:select ...]
2825: 14 [#<procedure 34dfc200 at ice-9/boot-9.scm:2813:4 (name #:optional autoload version #:key ensure)> # ...]
3101: 13 [try-module-autoload (gnu tests base) #f]
2412: 12 [save-module-excursion #<procedure 35c46750 at ice-9/boot-9.scm:3102:17 ()>]
3121: 11 [#<procedure 35c46750 at ice-9/boot-9.scm:3102:17 ()>]
In unknown file:
?: 10 [primitive-load-path "gnu/tests/base" ...]
In gnu/tests/base.scm:
390: 9 [#<procedure 38c523a0 ()>]
63: 8 [run-basic-test # # "basic" ...]
In ice-9/eval.scm:
387: 7 [eval # #]
387: 6 [eval # #]
411: 5 [eval # #]
387: 4 [eval # #]
In unknown file:
?: 3 [filter #<procedure 35c461e0 at ice-9/eval.scm:416:20 (a)> (# # # #)]
In ice-9/eval.scm:
411: 2 [eval # #]
411: 1 [eval # #]
387: 0 [eval # #]

ice-9/eval.scm:387:11: In procedure eval:
ice-9/eval.scm:387:11: In procedure mapped-device-target: Wrong type argument: #<<file-system> device: "my-root" title: label mount-point: "/" type: "ext4" flags: () options: #f mount?: #t needed-for-boot?: #f check?: #t create-mount-point?: #f dependencies: () location: ((line . 209) (column . 24) (filename . "gnu/tests.scm"))>

(as an aside: maybe we could postpone compilation of test modules
until `make check`).

I git bisect'd this failure to commit
bc499b113a598c0e7863da9887a4133472985713, which added the
'initrd-modules' field to the (@ (gnu system) <operating-system>)
record.

The %simple-os from (gnu tests base) seems improperly initialized. In
particular, the fields seem to be shifted:

scheme@(guile-user)> (@@ (gnu tests base) %simple-os)
$1 = #<<operating-system>
kernel: #<package linux-libre@4.15.7 ...>
kernel-arguments: ()
bootloader: #<<bootloader-configuration> bootloader: ...>
initrd: #<procedure base-initrd ...>
initrd-modules: ()
firmware: "komputilo"
host-name: #f
hosts-file: ()
mapped-devices: (#<<file-system> device: "my-root" ...> #<<file-system> ...> ...)
file-systems: ()
swap-devices: (#<<user-account> name: "alice" ...> ...)
...

Notice e.g. the "firmware" field has the value that should be in
"host-name", which has the value "hosts-file" should have, and
"mapped-devices" has the value "file-systems" should have, etc.

If you explicitly specify the new "initrd-modules" field this commit
added in (@ (gnu tests) %simple-os), then compilation proceeds as
expected.

--
Eric Bavier, Scientific Libraries, Cray Inc.
Ludovic Courtès wrote on 21 Mar 2018 00:12
(name . Eric Bavier)(address . bavier@cray.com)(address . 30879@debbugs.gnu.org)
877eq6ibp9.fsf@gnu.org
Hello Eric,

Eric Bavier <bavier@cray.com> skribis:

> scheme@(guile-user)> (@@ (gnu tests base) %simple-os)
> $1 = #<<operating-system>
> kernel: #<package linux-libre@4.15.7 ...>
> kernel-arguments: ()
> bootloader: #<<bootloader-configuration> bootloader: ...>
> initrd: #<procedure base-initrd ...>
> initrd-modules: ()
> firmware: "komputilo"
> host-name: #f
> hosts-file: ()
> mapped-devices: (#<<file-system> device: "my-root" ...> #<<file-system> ...> ...)
> file-systems: ()
> swap-devices: (#<<user-account> name: "alice" ...> ...)
> ...
>
> Notice e.g. the "firmware" field has the value that should be in
> "host-name", which has the value "hosts-file" should have, and
> "mapped-devices" has the value "file-systems" should have, etc.
>
> If you explicitly specify the new "initrd-modules" field this commit
> added in (@ (gnu tests) %simple-os), then compilation proceeds as
> expected.

That sounds a lot like regular ABI breakage: a new <operating-system>
field was added but gnu/tests/base.go wasn’t rebuilt, and thus was
expecting the previous struct layout.

Does “rm gnu/tests/base.go && make” suffice to fix this issue?

Thanks,
Ludo’.
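
The field shift described above can be reproduced with a toy model. This is only a sketch: it is plain shell, not Guile's record implementation, and it assumes that `firmware` was `()` in `%simple-os`, which the report does not state. The variable and function names are made up for illustration. Values are laid out in the field order from before the commit, then labelled by position with the field names from after the commit.

```sh
#!/bin/sh
# Illustration only: a positional "record", not Guile's actual records.

# Field order assumed by code compiled before the commit (stale)...
OLD_FIELDS='kernel kernel-arguments bootloader initrd firmware host-name hosts-file'
# ...and the order after 'initrd-modules' was added.
NEW_FIELDS='kernel kernel-arguments bootloader initrd initrd-modules firmware host-name'

# Values in OLD_FIELDS order, '|'-separated.
# Assumption: firmware is "()" here; the other values are taken from the report.
OLD_VALUES='linux-libre|()|bootloader-configuration|base-initrd|()|komputilo|#f'

# label_values FIELDS VALUES
# Print "field: value" lines, pairing names and values purely by position.
label_values() {
  rest=$2
  for field in $1; do
    value=${rest%%|*}              # first remaining value
    case $rest in
      *'|'*) rest=${rest#*|} ;;    # drop it, keep the tail
      *)     rest='' ;;            # that was the last value
    esac
    printf '%s: %s\n' "$field" "$value"
  done
}

# Old values read through the new layout: every field after 'initrd'
# shows the value meant for the following field.
label_values "$NEW_FIELDS" "$OLD_VALUES"
```

Reading the same values with `OLD_FIELDS` instead pairs `host-name` with `komputilo` again, which is the "ABI breakage" pattern: data and accessors disagree about field positions.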
Eric Bavier wrote on 21 Mar 2018 16:16
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 30879@debbugs.gnu.org)
20180321151642.GN105827@pe06.us.cray.com
On Wed, Mar 21, 2018 at 12:12:02AM +0100, Ludovic Courtès wrote:

> That sounds a lot like regular ABI breakage: a new <operating-system>
> field was added but gnu/tests/base.go wasn’t rebuilt, and thus was
> expecting the previous struct layout.
>
> Does “rm gnu/tests/base.go && make” suffice to fix this issue?

No, it doesn't help. Previously I had been running "make clean-go"
before each "make".

The error/backtrace is issued when build-aux/compile-all.scm tries to
load gnu/tests/base.scm, before it even gets to compilation.

--
Eric Bavier, Scientific Libraries, Cray Inc.
Ludovic Courtès wrote on 21 Mar 2018 22:04
(name . Eric Bavier)(address . bavier@cray.com)(address . 30879@debbugs.gnu.org)
87y3ildttr.fsf@gnu.org
Eric Bavier <bavier@cray.com> skribis:

> On Wed, Mar 21, 2018 at 12:12:02AM +0100, Ludovic Courtès wrote:
>
>> That sounds a lot like regular ABI breakage: a new <operating-system>
>> field was added but gnu/tests/base.go wasn’t rebuilt, and thus was
>> expecting the previous struct layout.
>>
>> Does “rm gnu/tests/base.go && make” suffice to fix this issue?
>
> No, it doesn't help. Previously I had been running "make clean-go"
> before each "make".
>
> The error/backtrace is issued when build-aux/compile-all.scm tries to
> load gnu/tests/base.scm, before it even gets to compilation.

Oh, can you “rm -rf ~/.cache/guile”?

One thing that could be an issue is that (gnu system install) loads
‘examples/bare-bones.tmpl’. Thus ‘bare-bones.tmpl.go’ ends up in
~/.cache/guile and could be out of sync.

Thanks,
Ludo’.
Eric Bavier wrote on 21 Mar 2018 22:14
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 30879@debbugs.gnu.org)
20180321211403.GO105827@pe06.us.cray.com
On Wed, Mar 21, 2018 at 10:04:00PM +0100, Ludovic Courtès wrote:
> Eric Bavier <bavier@cray.com> skribis:
>
> > On Wed, Mar 21, 2018 at 12:12:02AM +0100, Ludovic Courtès wrote:
> >
> >> That sounds a lot like regular ABI breakage: a new <operating-system>
> >> field was added but gnu/tests/base.go wasn’t rebuilt, and thus was
> >> expecting the previous struct layout.
> >>
> >> Does “rm gnu/tests/base.go && make” suffice to fix this issue?
> >
> > No, it doesn't help. Previously I had been running "make clean-go"
> > before each "make".
> >
> > The error/backtrace is issued when build-aux/compile-all.scm tries to
> > load gnu/tests/base.scm, before it even gets to compilation.
>
> Oh, can you “rm -rf ~/.cache/guile”?

"rm -rf ~/.cache/guile && make clean-go && make" resulted in an error,
but a slightly different backtrace:

```
LOAD gnu/tests/base.scm
Backtrace:
In ice-9/eval.scm:
432: 19 [eval # #]
In ice-9/boot-9.scm:
2412: 18 [save-module-excursion #<procedure 2998d7c0 at ice-9/boot-9.scm:4084:3 ()>]
4091: 17 [#<procedure 2998d7c0 at ice-9/boot-9.scm:4084:3 ()>]
1734: 16 [%start-stack load-stack ...]
1739: 15 [#<procedure 299b26c0 ()>]
In unknown file:
?: 14 [primitive-load "/home/users/bavier/src/guix/./build-aux/compile-all.scm"]
In guix/build/compile.scm:
158: 13 [compile-files "." "/home/users/bavier/src/guix" ...]
107: 12 [load-files "." # # ...]
In ice-9/boot-9.scm:
2900: 11 [resolve-interface (gnu tests base) #:select ...]
2825: 10 [#<procedure 298f2200 at ice-9/boot-9.scm:2813:4 (name #:optional autoload version #:key ensure)> # ...]
3101: 9 [try-module-autoload (gnu tests base) #f]
2412: 8 [save-module-excursion #<procedure 30cd0ed0 at ice-9/boot-9.scm:3102:17 ()>]
3121: 7 [#<procedure 30cd0ed0 at ice-9/boot-9.scm:3102:17 ()>]
In unknown file:
?: 6 [primitive-load-path "gnu/tests/base" ...]
In gnu/tests/base.scm:
390: 5 [#<procedure 30cdae40 ()>]
63: 4 [run-basic-test # # "basic" ...]
In gnu/system.scm:
501: 3 [operating-system-services # # #f]
476: 2 [essential-services # # #f]
576: 1 [operating-system-etc-service #]
In gnu/system/nss.scm:
217: 0 [name-service-switch->string (# # # # ...)]

gnu/system/nss.scm:217:19: In procedure name-service-switch->string:
gnu/system/nss.scm:217:19: In procedure struct_vtable: Wrong type argument in position 1 (expecting struct): (#<<service> type: #<service-type login ...
```

--
Eric Bavier, Scientific Libraries, Cray Inc.
Ludovic Courtès wrote on 22 Mar 2018 00:04
(name . Eric Bavier)(address . bavier@cray.com)(address . 30879@debbugs.gnu.org)
87r2oddo9l.fsf@gnu.org
Eric Bavier <bavier@cray.com> skribis:

[...]

> In gnu/system.scm:
> 501: 3 [operating-system-services # # #f]
> 476: 2 [essential-services # # #f]
> 576: 1 [operating-system-etc-service #]
> In gnu/system/nss.scm:
> 217: 0 [name-service-switch->string (# # # # ...)]
>
> gnu/system/nss.scm:217:19: In procedure name-service-switch->string:
> gnu/system/nss.scm:217:19: In procedure struct_vtable: Wrong type argument in position 1 (expecting struct): (#<<service> type: #<service-type login ...

This looks like another record issue: the code is accessing the
`services' field instead of the `name-service-switch' field, which is
right next to it.

So it looks like there are still stale .go files somewhere being picked
up. This time it would mean that nss.go is up-to-date and system.go is
stale, since nss.go assumes an offset for `name-service-switch' that is
+1 compared to that of system.go.

Could you maybe try:

rm -rf ~/.cache/guile
make clean-go
strace -f -o log make

and check in `log' whether .go files outside of the build tree are being
used?

Thanks,
Ludo'.
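
One way to do the check suggested above is to filter the strace log rather than read it by hand. The following is a minimal sketch, assuming the log uses the usual `open`/`openat` line format seen later in this thread; the function name `go_files_outside_tree` is hypothetical.

```sh
#!/bin/sh
# go_files_outside_tree LOG TREE
# List .go files that were opened successfully according to the strace
# log LOG and whose absolute path is not under the directory TREE.
go_files_outside_tree() {
  log=$1
  tree=$2
  grep -E 'open(at)?\(.*\.go"' "$log" \
    | grep -v ' = -1 ' \
    | sed -E 's/^[^"]*"([^"]*)".*$/\1/' \
    | awk -v tree="$tree/" 'substr($0, 1, 1) == "/" && index($0, tree) != 1' \
    | sort -u
}
```

After `strace -f -o log make`, running `go_files_outside_tree log "$PWD"` from the top of the checkout would print any such files. Relative paths are ignored on the assumption that they refer to the build tree.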
Eric Bavier wrote on 22 Mar 2018 15:45
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 30879@debbugs.gnu.org)
20180322144538.GP105827@pe06.us.cray.com
On Thu, Mar 22, 2018 at 12:04:06AM +0100, Ludovic Courtès wrote:
> Eric Bavier <bavier@cray.com> skribis:
>
> [...]
>
> > In gnu/system.scm:
> > 501: 3 [operating-system-services # # #f]
> > 476: 2 [essential-services # # #f]
> > 576: 1 [operating-system-etc-service #]
> > In gnu/system/nss.scm:
> > 217: 0 [name-service-switch->string (# # # # ...)]
> >
> > gnu/system/nss.scm:217:19: In procedure name-service-switch->string:
> > gnu/system/nss.scm:217:19: In procedure struct_vtable: Wrong type argument in position 1 (expecting struct): (#<<service> type: #<service-type login ...
>
> This looks like another record issue: the code is accessing the
> `services' field instead of the `name-service-switch' field, which is
> right next to it.
>
> So it looks like there are still stale .go files somewhere being picked
> up. This time it would mean that nss.go is up-to-date and system.go is
> stale, since nss.go assumes an offset for `name-service-switch' that is
> +1 compared to that of system.go.
>
> Could you maybe try:
>
> rm -rf ~/.cache/guile
> make clean-go
> strace -f -o log make
>
> and check in `log' whether .go files outside of the build tree are being
> used?

Oh, so it looks like .go files from the system-installed guix are being
picked up:

53692 openat(AT_FDCWD, "/usr/local/lib/guile/2.0/site-ccache/gnu/system.go", O_RDONLY|O_CLOEXEC) = 10

I hadn't expected that, but I suppose it makes sense. Running make
under ./pre-inst-env does not help.

We should probably find a way to prevent this in general, right? We
shouldn't be loading guix modules from outside the source tree during
build.

--
Eric Bavier, Scientific Libraries, Cray Inc.
Ludovic Courtès wrote on 22 Mar 2018 17:19
(name . Eric Bavier)(address . bavier@cray.com)(address . 30879@debbugs.gnu.org)
878takgk1z.fsf@gnu.org
Hello,

Eric Bavier <bavier@cray.com> skribis:

> Oh, so it looks like .go files from the system-installed guix are being
> picked up:
>
> 53692 openat(AT_FDCWD, "/usr/local/lib/guile/2.0/site-ccache/gnu/system.go", O_RDONLY|O_CLOEXEC) = 10
>
> I hadn't expected that, but I suppose it makes sense. Running make
> under ./pre-inst-env does not help.

From my Guix build and source tree, I see this:

$ rm gnu/system.go
$ ./pre-inst-env strace -o log guile --no-auto-compile -c '(use-modules (gnu system))'
;;; note: source file /home/ludo/src/guix/gnu/system.scm
;;; newer than compiled /run/current-system/profile/lib/guile/2.2/site-ccache/gnu/system.go
;;; note: source file /home/ludo/src/guix/gnu/system.scm
;;; newer than compiled /run/current-system/profile/lib/guile/2.2/site-ccache/gnu/system.go
;;; note: source file /home/ludo/src/guix/gnu/system.scm
;;; newer than compiled /run/current-system/profile/lib/guile/2.2/site-ccache/gnu/system.go
$ grep open.*gnu/system.go log
$ echo $?
1

Don’t you get a similar message?

Thanks,
Ludo’.
Eric Bavier wrote on 29 Mar 2018 19:06
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 30879@debbugs.gnu.org)
20180329170645.GQ105827@pe06.us.cray.com
On Thu, Mar 22, 2018 at 05:19:04PM +0100, Ludovic Courtès wrote:
> Hello,
>
> Eric Bavier <bavier@cray.com> skribis:
>
> > Oh, so it looks like .go files from the system-installed guix are being
> > picked up:
> >
> > 53692 openat(AT_FDCWD, "/usr/local/lib/guile/2.0/site-ccache/gnu/system.go", O_RDONLY|O_CLOEXEC) = 10
> >
> > I hadn't expected that, but I suppose it makes sense. Running make
> > under ./pre-inst-env does not help.
>
> From my Guix build and source tree, I see this:
>
> --8<---------------cut here---------------start------------->8---
> $ rm gnu/system.go
> $ ./pre-inst-env strace -o log guile --no-auto-compile -c '(use-modules (gnu system))'
> ;;; note: source file /home/ludo/src/guix/gnu/system.scm
> ;;; newer than compiled /run/current-system/profile/lib/guile/2.2/site-ccache/gnu/system.go
> ;;; note: source file /home/ludo/src/guix/gnu/system.scm
> ;;; newer than compiled /run/current-system/profile/lib/guile/2.2/site-ccache/gnu/system.go
> ;;; note: source file /home/ludo/src/guix/gnu/system.scm
> ;;; newer than compiled /run/current-system/profile/lib/guile/2.2/site-ccache/gnu/system.go
> $ grep open.*gnu/system.go log
> $ echo $?
> 1
> --8<---------------cut here---------------end--------------->8---
>
> Don’t you get a similar message?

No, I get

$ grep open.*gnu/system.go log
openat(AT_FDCWD, "/usr/local/lib/guile/2.0/site-ccache/gnu/system.go", O_RDONLY|O_CLOEXEC) = 5
$ echo $?
0

--
Eric Bavier, Scientific Libraries, Cray Inc.
Eric Bavier wrote on 11 Apr 2018 20:42
(address . 30879@debbugs.gnu.org)
20180411184215.GE105827@pe06.us.cray.com
As a workaround, I temporarily uninstalled Guix from the system. This
allowed compilation from my git checkout to succeed.

--
Eric Bavier, Scientific Libraries, Cray Inc.
Ludovic Courtès wrote on 1 May 2018 22:26
control message for bug #30879
(address . control@debbugs.gnu.org)
87k1snp1fy.fsf@gnu.org
severity 30879 important
Ludovic Courtès wrote on 15 May 2018 11:20
Re: bug#30879: Commit bc499b113 broke guix on guile@2.0.14, improper <operating-system> field initialization
(name . Eric Bavier)(address . bavier@cray.com)(address . 30879@debbugs.gnu.org)
87wow5i8av.fsf@gnu.org
Hello Eric,

Sorry for the late reply.

Eric Bavier <bavier@cray.com> skribis:

> On Thu, Mar 22, 2018 at 12:04:06AM +0100, Ludovic Courtès wrote:
>> Eric Bavier <bavier@cray.com> skribis:
>>
>> [...]
>>
>> > In gnu/system.scm:
>> > 501: 3 [operating-system-services # # #f]
>> > 476: 2 [essential-services # # #f]
>> > 576: 1 [operating-system-etc-service #]
>> > In gnu/system/nss.scm:
>> > 217: 0 [name-service-switch->string (# # # # ...)]
>> >
>> > gnu/system/nss.scm:217:19: In procedure name-service-switch->string:
>> > gnu/system/nss.scm:217:19: In procedure struct_vtable: Wrong type argument in position 1 (expecting struct): (#<<service> type: #<service-type login ...

[...]

> Oh, so it looks like .go files from the system-installed guix are being
> picked up:
>
> 53692 openat(AT_FDCWD, "/usr/local/lib/guile/2.0/site-ccache/gnu/system.go", O_RDONLY|O_CLOEXEC) = 10
>
> I hadn't expected that, but I suppose it makes sense. Running make
> under ./pre-inst-env does not help.
>
> We should probably find a way to prevent this in general, right?

It seems that the problem here is that both Guile and Guix were
installed with --prefix=/usr/local.

Guile contains by default $prefix/lib/guile/2.0/site-ccache in its
%load-compiled-path. Thus, it will always find the .go files of that
Guix that’s installed in the same prefix.
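
A rough model of this lookup, as a sketch only: the search walks a list of directories and the first one containing the compiled file wins. Guile's real lookup does more (the notes quoted earlier show it also compares source and compiled timestamps), and the function name `first_compiled_match` is hypothetical. The point is that once the build tree's .go file has been removed, for example by "make clean-go", the next directory on the path supplies it.

```sh
#!/bin/sh
# first_compiled_match RELATIVE_FILE DIR...
# Print the path of RELATIVE_FILE in the first DIR that contains it.
# Return non-zero if no directory has the file.
first_compiled_match() {
  file=$1
  shift
  for dir in "$@"; do
    if [ -f "$dir/$file" ]; then
      printf '%s\n' "$dir/$file"
      return 0
    fi
  done
  return 1
}
```

For instance, `first_compiled_match gnu/system.go "$PWD" /usr/local/lib/guile/2.0/site-ccache` prints the installed copy whenever the checkout has no `gnu/system.go` of its own, assuming the build tree is searched before the installation prefix.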

> We shouldn't be loading guix modules from outside the source tree
> during build.

In general we can (and do: see the ‘make-go’ target and see
‘pre-inst-env’), but in this case we can’t really prevent it because
$prefix/lib/… is in the default search path of Guile, which is
admittedly problematic.

Maybe we should just forbid installing Guix in the same prefix as Guile,
and detect that at configure time.

WDYT?

Ludo’.
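
A configure-time check along these lines could look like the sketch below. It is not the check Guix uses: the function names are hypothetical, and it assumes the caller already knows Guile's prefix, for example from `pkg-config --variable=prefix guile-2.0`. It compares the two strings after dropping one trailing slash and does not resolve symlinks.

```sh
#!/bin/sh
# same_prefix GUILE_PREFIX GUIX_PREFIX
# Succeed when both prefixes name the same directory, ignoring one
# trailing slash. Symlinks are not resolved.
same_prefix() {
  a=${1%/}
  b=${2%/}
  [ "$a" = "$b" ]
}

# check_prefixes GUILE_PREFIX GUIX_PREFIX
# Print an error and fail when Guix would be installed next to Guile.
check_prefixes() {
  if same_prefix "$1" "$2"; then
    echo "error: Guix must not be installed in Guile's prefix ($1)" >&2
    return 1
  fi
  return 0
}
```

With the setup from this report, `check_prefixes /usr/local /usr/local` fails, while a distinct Guix prefix passes.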
Ludovic Courtès wrote on 15 May 2018 11:21
control message for bug #30879
(address . control@debbugs.gnu.org)
87vabpi8a5.fsf@gnu.org
retitle 30879 Stale .go files are loaded when Guile and Guix are in the same prefix