time-machine: backtrace about maybe-remove-expired-cache-entries

  • Done
Details
  • Participants (3): Ludovic Courtès, Maxime Devos, zimoun
  • Owner: unassigned
  • Submitted by: zimoun
  • Severity: normal
zimoun wrote on 25 May 2022 18:49
(address . bug-guix@gnu.org)
87leupd0bq.fsf@gmail.com
Hi,

From 8a87e29, I get:


$ guix time-machine --commit=9d795fb -- help
Backtrace:
14 (primitive-load "/home/sitour/.config/guix/current/bin/…")
In guix/ui.scm:
2229:7 13 (run-guix . _)
2192:10 12 (run-guix-command _ . _)
In ice-9/boot-9.scm:
1752:10 11 (with-exception-handler _ _ #:unwind? _ # _)
1747:15 10 (with-exception-handler #<procedure 7f1459d16b40 at ic…> …)
In guix/store.scm:
671:3 9 (_)
In ice-9/boot-9.scm:
1752:10 8 (with-exception-handler _ _ #:unwind? _ # _)
In guix/store.scm:
658:37 7 (thunk)
In guix/status.scm:
809:4 6 (call-with-status-report _ _)
In guix/store.scm:
1320:8 5 (call-with-build-handler #<procedure 7f1459e534b0 at g…> …)
In guix/inferior.scm:
885:2 4 (cached-channel-instance #<store-connection 256.99 7f1…> …)
In guix/cache.scm:
39:10 3 (maybe-remove-expired-cache-entries "/home/sitour/.cac…" …)
In srfi/srfi-19.scm:
287:16 2 (time-normalize! #<time type: time-monotonic nanosecond…>)
In ice-9/boot-9.scm:
1685:16 1 (raise-exception _ #:continuable? _)
1685:16 0 (raise-exception _ #:continuable? _)

ice-9/boot-9.scm:1685:16: In procedure raise-exception:
In procedure <: Wrong type argument in position 2: #<eof>

Then ~/.cache/guix/inferiors/last-expiry-cleanup is empty

$ cat ~/.cache/guix/inferiors/last-expiry-cleanup


It was probably erased by the previous time-machine call. In any case, the
command keeps failing until I remove the file
~/.cache/guix/inferiors/last-expiry-cleanup.

This is hard to debug. Any ideas?


Cheers,
simon
Ludovic Courtès wrote on 26 May 2022 17:05
(name . zimoun)(address . zimon.toutoune@gmail.com)(address . 55638@debbugs.gnu.org)
871qwgfi6k.fsf@gnu.org
Hi,

zimoun <zimon.toutoune@gmail.com> wrote:

> In guix/cache.scm:
> 39:10 3 (maybe-remove-expired-cache-entries "/home/sitour/.cac…" …)
> In srfi/srfi-19.scm:
> 287:16 2 (time-normalize! #<time type: time-monotonic nanosecond…>)
> In ice-9/boot-9.scm:
> 1685:16 1 (raise-exception _ #:continuable? _)
> 1685:16 0 (raise-exception _ #:continuable? _)
>
> ice-9/boot-9.scm:1685:16: In procedure raise-exception:
> In procedure <: Wrong type argument in position 2: #<eof>
>
>
> Then ~/.cache/guix/inferiors/last-expiry-cleanup is empty
>
> $ cat ~/.cache/guix/inferiors/last-expiry-cleanup

This file was empty when you ran the command instead of containing an
integer (could have been a file system corruption or something like
that).

Solution:

rm ~/.cache/guix/inferiors/last-expiry-cleanup

HTH!

Ludo’.
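
The workaround above removes the file unconditionally. A more cautious variant, sketched here in POSIX shell, deletes it only when it does not hold a plain non-negative integer. The function name `is_valid_expiry_file` is hypothetical, and the assumption that a valid file contains nothing but decimal digits is taken from the description above, not from guix/cache.scm.

```shell
# Hypothetical helper (not part of Guix): succeed only if the file named
# by $1 exists and contains a single non-negative decimal integer.
is_valid_expiry_file() {
    [ -f "$1" ] || return 1
    # Command substitution strips trailing newlines, so "123\n" is accepted.
    content=$(cat -- "$1") || return 1
    case "$content" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
        *) return 0 ;;
    esac
}
```

It could be used as `f=~/.cache/guix/inferiors/last-expiry-cleanup; [ -e "$f" ] && ! is_valid_expiry_file "$f" && rm -- "$f"`, which leaves a well-formed file alone.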
Maxime Devos wrote on 26 May 2022 17:12
(address . 55638@debbugs.gnu.org)
bc32e583c62832e3c64b1f75455d196bf0ec48ee.camel@telenet.be
Ludovic Courtès wrote on Thu 26-05-2022 at 17:05 [+0200]:
> This file was empty when you ran the command instead of containing an
> integer (could have been a file system corruption or something like
> that).
>
> Solution:
>
>   rm ~/.cache/guix/inferiors/last-expiry-cleanup

It's a work-around, but there's still an underlying problem:
guix/cache.scm doesn't do 'fsync+rename', so the file is not created
atomically, so in case of an abrupt shutdown or C-c at the wrong time,
the file becomes corrupted without fault of the file system.

As such, WDYT of making last-expiry-date more robust, by treating
invalid contents as time=0 or something like that?

Greetings,
Maxime.
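
The two ideas in this message can be illustrated outside Guix with a small POSIX shell sketch: write the timestamp to a temporary file, flush it, and rename it into place, and on the reading side treat a missing, empty, or malformed file as 0. The helper names are hypothetical, this is not the code in guix/cache.scm, and calling `sync` with a file argument assumes GNU coreutils 8.24 or later.

```shell
# Hypothetical helpers illustrating the pattern; not code from Guix.

# Write VALUE ($2) to FILE ($1) atomically: create a temporary file in the
# same directory, flush it to disk, then rename it over FILE.  A reader
# sees either the old contents or the new ones, never a truncated file.
write_atomically() {
    file=$1 value=$2
    tmp=$(mktemp "$file.XXXXXX") || return 1
    if printf '%s\n' "$value" > "$tmp" &&
       sync "$tmp" &&                  # assumes GNU coreutils >= 8.24
       mv -f -- "$tmp" "$file"; then   # rename within one file system
        return 0
    fi
    rm -f -- "$tmp"
    return 1
}

# Print the integer stored in FILE ($1), or 0 when FILE is missing,
# empty, or contains anything other than decimal digits.
read_expiry_or_zero() {
    content=$(cat -- "$1" 2>/dev/null) || content=
    case "$content" in
        ''|*[!0-9]*) echo 0 ;;
        *) echo "$content" ;;
    esac
}
```

Falling back to 0 means a damaged file simply looks like a cleanup that happened very long ago, so the next run performs the cleanup and rewrites the file instead of failing.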


zimoun wrote on 27 May 2022 10:31
(address . 55638@debbugs.gnu.org)
CAJ3okZ2LahyQAPpXZbHJT0jxdYK_OuQecz5rXqdZ+m8FPjjsxA@mail.gmail.com
Hi,

On Thu, 26 May 2022 at 17:05, Ludovic Courtès <ludo@gnu.org> wrote:

> This file was empty when you ran the command instead of containing an
> integer (could have been a file system corruption or something like
> that).

No, I did nothing special and the file system is not corrupted. :-)

I am just using "guix time-machine" intensively.


> Solution:
>
> rm ~/.cache/guix/inferiors/last-expiry-cleanup

Yes, this is what I did, because I know the internals well enough.

As Maxime said, it should be robust. Therefore, see the fix:



Cheers,
simon
Ludovic Courtès wrote on 28 May 2022 19:12
(name . Maxime Devos)(address . maximedevos@telenet.be)
87leulbmz0.fsf@gnu.org
Hi,

Maxime Devos <maximedevos@telenet.be> wrote:

> It's a work-around, but there's still an underlying problem:
> guix/cache.scm doesn't do 'fsync+rename', so the file is not created
> atomically, so in case of an abrupt shutdown or C-c at the wrong time,
> the file becomes corrupted without fault of the file system.

Right, I guess this is what we should fix first, by using
‘with-atomic-file-output’ for instance.

> As such, WDYT of making last-expiry-date more robust, by treating
> invalid contents as time=0 or something like that?

That too.

Ludo’.
zimoun wrote on 30 May 2022 15:11
(name . Ludovic Courtès)(address . ludo@gnu.org)
CAJ3okZ2=UG5ZdsG6oJLkv=t+va_5twAxOAFbg4f=Xk0JKLaQ0Q@mail.gmail.com
Hi,

On Sat, 28 May 2022 at 19:12, Ludovic Courtès <ludo@gnu.org> wrote:

> Right, I guess this is what we should fix first, by using
> ‘with-atomic-file-output’ for instance.

Please have a look at the patch


which fixes the issue.


Cheers,
simon
Ludovic Courtès wrote on 8 Jul 2022 13:45
control message for bug #55638
(address . control@debbugs.gnu.org)
87y1x3x1yx.fsf@gnu.org
close 55638
quit