Grafter should check every byte it modifies, to avoid corruption

  • Open
Details
2 participants
  • Ludovic Courtès
  • Mark H Weaver
Owner
unassigned
Submitted by
Mark H Weaver
Severity
important
Mark H Weaver wrote on 17 Apr 2021 11:39
(address . bug-guix@gnu.org)
87eef98d1i.fsf@netris.org
Recall that the grafting code performs a set of substitutions, replacing
store item names (i.e. file names in /gnu/store) with replacement store
items of the same length, with rules like:
"fx3979c88s9yxdbchyf36qryawgzpwb5-libx11-1.6.10" =>
"rwkqxykm91a75w9afhb41saj0dmf30hw-libx11-1.6.12".

The grafting code currently only checks the first 33 bytes, consisting
of the nix-base32 hash and the "-". It *assumes* that the remainder of
the associated store item name immediately follows, and blindly writes
the replacement string over whatever is there.
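
To make the failure mode concrete, here is a minimal Python sketch of the behaviour described above. It is not the actual Guile grafting code; the function name `graft_unchecked` is hypothetical. The byte string is taken from the Racket hexdump below, and the old item name is assumed to be "gdk-pixbuf+svg-2.40.0", the same as the replacement's.

```python
HASH_LEN = 32  # length of the nix-base32 hash in a store item name


def graft_unchecked(data: bytes, old: bytes, new: bytes) -> bytes:
    """Match only the hash and the "-", then overwrite len(new) bytes."""
    assert len(old) == len(new)
    prefix = old[:HASH_LEN + 1]          # the 33 bytes that are checked
    buf = bytearray(data)
    pos = buf.find(prefix)
    while pos != -1:
        # Nothing past the "-" is compared against `old` here.
        buf[pos:pos + len(new)] = new
        pos = buf.find(prefix, pos + len(new))
    return bytes(buf)


# Assumed old name; only its first 33 bytes appear intact in the .zo file.
old = b"irjan5wq7j25fa2m6n2xhl8mglsaqxn4-gdk-pixbuf+svg-2.40.0"
new = b"6f208dak2wdb0ar1hn8yk190ywggiw3c-gdk-pixbuf+svg-2.40.0"

# Bytes as they appear in the ungrafted utils_rkt.zo (name stored in a
# non-literal form, so the hash and "-" are not followed by the name).
zo = (b"/gnu/store/irjan5wq7j25fa2m6n2xhl8mglsaqxn4-"
      b"I\x02\x02\xa5\x02\xfd\x01+svg-2.40.0/lib/")

corrupted = graft_unchecked(zo, old, new)
print(corrupted)
```

The result ends in "-gdk-pixbuf+svg-2.40.0b/", matching the grafted hexdump: the unrelated bytes after the "-" and the start of "/lib/" have been overwritten.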

Here's a real-world example of silent corruption caused by this in a
Racket .zo file, before commit 834aa48504a24f0c79e858fc295edbf63815a408
which patched Racket to avoid embedding this store reference:

mhw@jojen ~$ diff -u <(hexdump -C $(guix build racket --no-grafts)/share/racket/pkgs/gui-lib/mred/private/wx/gtk/compiled/utils_rkt.zo) \
<(hexdump -C $(guix build racket )/share/racket/pkgs/gui-lib/mred/private/wx/gtk/compiled/utils_rkt.zo)
--- /dev/fd/63 2021-04-15 04:36:01.240427788 -0400
+++ /dev/fd/62 2021-04-15 04:36:01.240427788 -0400
@@ -2047,11 +2047,11 @@
00007fe0 49 8b 6f 0b 08 00 09 d0 02 2f d7 fe d0 02 07 f2 |I.o....../......|
00007ff0 02 0b 00 62 12 04 00 12 12 05 00 3e 12 06 00 17 |...b.......>....|
00008000 12 07 3d 02 f0 28 32 02 04 75 6e 69 78 00 1e 26 |..=..(2..unix..&|
-00008010 5a 2f 67 6e 75 2f 73 74 6f 72 65 2f 69 72 6a 61 |Z/gnu/store/irja|
-00008020 6e 35 77 71 37 6a 32 35 66 61 32 6d 36 6e 32 78 |n5wq7j25fa2m6n2x|
-00008030 68 6c 38 6d 67 6c 73 61 71 78 6e 34 2d 49 02 02 |hl8mglsaqxn4-I..|
-00008040 a5 02 fd 01 2b 73 76 67 2d 32 2e 34 30 2e 30 2f |....+svg-2.40.0/|
-00008050 6c 69 62 2f c2 02 39 2e 73 6f 9b 02 1d 43 9b 02 |lib/..9.so...C..|
+00008010 5a 2f 67 6e 75 2f 73 74 6f 72 65 2f 36 66 32 30 |Z/gnu/store/6f20|
+00008020 38 64 61 6b 32 77 64 62 30 61 72 31 68 6e 38 79 |8dak2wdb0ar1hn8y|
+00008030 6b 31 39 30 79 77 67 67 69 77 33 63 2d 67 64 6b |k190ywggiw3c-gdk|
+00008040 2d 70 69 78 62 75 66 2b 73 76 67 2d 32 2e 34 30 |-pixbuf+svg-2.40|
+00008050 2e 30 62 2f c2 02 39 2e 73 6f 9b 02 1d 43 9b 02 |.0b/..9.so...C..|
00008060 57 30 07 01 02 0e 35 00 05 a2 02 c3 12 08 0c 26 |W0....5........&|
00008070 00 18 12 09 02 2c 84 00 55 02 0f 54 02 09 13 7f |.....,..U..T....|
00008080 54 02 1b 49 54 02 15 32 54 02 15 1c 54 02 1f 01 |T..IT..2T...T...|

For the record, when I originally wrote this fast(er) grafting code
(commit 5a1add373ab427a3b336981d857252e703a9f8d1), by design it only
rewrote the hashes, and so naturally it had the following desirable
property: it never overwrote any byte without first checking it against
an expected value. Later, starting in commit
57bdd79e485801ccf405ca7389bd099809fe5d67, the grafting code was modified
to allow rewriting the entire store item name (notably including the
version number). Unfortunately, although the set of overwritten bytes
was extended past the "-", the set of bytes *checked* was left
unchanged, and thus the aforementioned desirable property was lost.

I think we ought to restore that property, to avoid silent corruptions
such as the example above.

Ideally, an error should be raised if the 'hash+dash' pattern is present
but not followed by the expected bytes, so that we will be alerted to a
problem.
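
A rough Python sketch of what such a check could look like follows. It is only an illustration under assumptions, not a proposed patch to the Guile code; `graft_checked` and `GraftMismatch` are hypothetical names.

```python
HASH_LEN = 32  # length of the nix-base32 hash in a store item name


class GraftMismatch(Exception):
    """The hash and "-" were found, but the expected name did not follow."""


def graft_checked(data: bytes, old: bytes, new: bytes) -> bytes:
    """Rewrite `old` to `new`, verifying every byte before overwriting it."""
    if len(old) != len(new):
        raise ValueError("replacement must have the same length")
    prefix = old[:HASH_LEN + 1]
    buf = bytearray(data)
    pos = buf.find(prefix)
    while pos != -1:
        found = bytes(buf[pos:pos + len(old)])
        if found != old:
            # Refuse to write over bytes that were never checked.
            raise GraftMismatch(
                f"offset {pos}: expected {old!r}, found {found!r}")
        buf[pos:pos + len(new)] = new
        pos = buf.find(prefix, pos + len(new))
    return bytes(buf)
```

With this shape, a file such as the Racket .zo above would make the graft fail loudly instead of being silently altered.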

It would be even better to detect hashes by themselves, even if not
followed by a dash, or even non-trivial substrings of hashes, in order
to help alert us to problems like this.
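
One possible shape for such a scan, again as a hedged Python sketch rather than actual grafter code: the name `find_stray_hash_fragments` and the `min_len` threshold of 8 characters are assumptions made for illustration.

```python
HASH_LEN = 32  # length of the nix-base32 hash in a store item name


def find_stray_hash_fragments(data: bytes, old: bytes, min_len: int = 8) -> list:
    """Return offsets where a fragment of the hash in `old` (at least
    `min_len` bytes long) starts outside any complete occurrence of `old`."""
    hash_part = old[:HASH_LEN]

    # Offsets covered by complete, well-formed occurrences of the item name.
    covered = set()
    pos = data.find(old)
    while pos != -1:
        covered.update(range(pos, pos + len(old)))
        pos = data.find(old, pos + 1)

    # Any window of the hash found elsewhere is suspicious.
    stray = set()
    for i in range(HASH_LEN - min_len + 1):
        fragment = hash_part[i:i + min_len]
        pos = data.find(fragment)
        while pos != -1:
            if pos not in covered:
                stray.add(pos)
            pos = data.find(fragment, pos + 1)
    return sorted(stray)
```

A non-empty result would indicate a bare or truncated hash that a plain "hash+dash" search cannot see, and could be reported as a warning or an error.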

I'll try to find the time to work on this soon, but no promises.

Mark
Ludovic Courtès wrote on 21 Apr 2021 23:51
control message for bug #47838
(address . control@debbugs.gnu.org)
87pmyn2tng.fsf@gnu.org
severity 47838 important
quit