Hi Ludo,

Thanks for your comments!

Ludovic Courtès writes:

>> StorePath: /gnu/store/81bdcd5x4v50i28h98bfkvvkx9cky63w-hello-2.10
>> URL: nar/gzip/81bdcd5x4v50i28h98bfkvvkx9cky63w-hello-2.10
>> Compression: gzip
>> FileSize: 67363
>> ERIS: urn:erisx2:BIBC2LUTIQH43S2KRIAV7TBXNUUVPZTMV6KFA2M7AL5V6FNE77VNUDDVDAGJUEEAFATVO2QQT67SMOPTO3LGWCJFU7BZVCF5VXEQQW25BE
>> URL: nar/zstd/81bdcd5x4v50i28h98bfkvvkx9cky63w-hello-2.10
>> Compression: zstd
>> FileSize: 64917
>> ERIS: urn:erisx2:BIBO7KS7SAWHDNC43DVILOSQ3F3SRRHEV6YPLDCSZ7MMD6LZVCHQMEQ6FUBTJAPSNFF7XR5XPTP4OQ72OPABNEO7UYBUN42O46ARKHBTGM
>
> Do we really need one URN per compression method?  Couldn’t we leave
> compression (of individual chunks, possibly) as a “detail” handled by
> the encoding or the transport layer?

I agree that it would be nice to leave this to the encoding layer, as
that would allow certain optimizations (e.g. de-duplication).

Unfortunately, we haven't yet figured out what the most suitable
compression/format would be. Something like EROFS seems good (as it
aligns data to fixed block sizes) [1]. But it seems a bit "clunky" for
just an archive format, and there do not seem to be any libraries that
we could use to integrate it neatly. It seems possible to block-align a
Tar archive, but that seems a bit hacky [2]. Other things to look into
might be Tarlz [3] and ZPAQ [4].

To get started, I suggest just using one of the compressions/formats
already in Guix. zstd seems to be a reasonable choice (for the same
reasons that it makes sense to use zstd with `--discover` [5]).

Does that sound like a plan?
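To illustrate why block alignment matters for de-duplication, here is a minimal Python sketch. It is not ERIS and not any of the formats mentioned above: it only splits byte strings into fixed-size blocks and compares SHA-256 digests. The 1 KiB block size, the helper names, and the sample data are all assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 1024  # assumed block size, for illustration only


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Split data into fixed-size blocks (zero-padding the last) and hash each."""
    digests = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size].ljust(block_size, b"\0")
        digests.append(hashlib.sha256(block).hexdigest())
    return digests


def shared_blocks(a: bytes, b: bytes) -> int:
    """Count the distinct blocks that occur in both inputs."""
    return len(set(block_digests(a)) & set(block_digests(b)))


# Hypothetical file content made of four distinct blocks.
file_content = b"".join(bytes([i]) * BLOCK_SIZE for i in range(1, 5))

# Archive whose header is padded to a block boundary: file blocks stay intact.
aligned = b"H" * BLOCK_SIZE + file_content
# Archive with a 100-byte header: every file block is shifted and hashes differently.
unaligned = b"H" * 100 + file_content

print(shared_blocks(file_content, aligned))    # 4
print(shared_blocks(file_content, unaligned))  # 0
```

With the aligned layout all four file blocks are shared with the original content; with the unaligned header none are, which is the effect a block-aligned archive format would avoid.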
[1] https://inqlab.net/git/guile-eris.git/tree/examples/dedup-fs/Readme.org
[2] https://unix.stackexchange.com/questions/276908/make-tar-or-other-archive-with-data-block-aligned-like-in-original-files-for/279384#279384
[3] http://lzip.nongnu.org/tarlz.html
[4] http://mattmahoney.net/dc/zpaq.html
[5] https://guix.gnu.org/en/blog/2021/getting-bytes-to-disk-more-quickly/

>> If the `--ipfs` is used for `guix publish` then the encoded blocks are also
>> uploaded to the IPFS daemon. The nar could then be retrieved from anywhere like
>> this:
>>
>>   (use-modules (eris)
>>                (eris blocks ipfs))
>>
>>   (eris-decode->bytevector
>>    "urn:erisx2:BIBC2LUTIQH43S2KRIAV7TBXNUUVPZTMV6KFA2M7AL5V6FNE77VNUDDVDAGJUEEAFATVO2QQT67SMOPTO3LGWCJFU7BZVCF5VXEQQW25BE"
>>    eris-blocks-ipfs-ref)
>>
>> These patches do not yet retrieve content from IPFS (TODO). But in principle,
>> anybody connected to IPFS can get the nar with the ERIS URN. This could be used
>> to reduce load on substitute server as they would only need to publish the ERIS
>> URN directly - substitutes could be delivered much more peer-to-peer.
>
> Nice.  So adjusting ‘guix substitute’ should be relatively easy?

Yes, relatively! :)

I meant to send in a V2 that does this before going on holidays, but I'm
afraid I won't make it. V2 will come in early January!

>> Other transports that I have been looking in to and am pretty sure will work
>> include: HTTP (with RFC 2169 [3]), GNUNet, OpenDHT. This is, imho, the
>> advantage of ERIS over IPFS directly or GNUNet directly. The encoding and
>> identifiers (URN) are abstracted away from specific transports (and also
>> applications). ERIS is almost exactly the same encoding as used in GNUNet
>> (ECRS).
>
> As a first step, ‘guix publish’ could implement RFC 2169, too.
>
> I gather implementing the HTTP and IPFS backends in ‘guix substitute’
> should be relatively easy, right?

Yes, those seem to be the two easiest backends to implement.
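For reference, RFC 2169 resolves a URN over HTTP with a request of the form `/uri-res/<service>?<urn>`, where the `N2R` service maps a URN to the resource itself. The Python sketch below only builds such a request URL. The function name and the server host name are hypothetical, and what a substitute server would actually return for an ERIS URN (whole content or individual blocks) is not settled here.

```python
from urllib.parse import quote


def n2r_url(base_url: str, urn: str) -> str:
    """Build an RFC 2169 N2R (URN to resource) request URL.

    The URN goes into the query string as-is; ':' is kept unescaped,
    matching the examples in the RFC.
    """
    return f"{base_url.rstrip('/')}/uri-res/N2R?{quote(urn, safe=':')}"


# Hypothetical substitute server; the URN is the gzip one from the narinfo.
url = n2r_url(
    "https://substitutes.example.org",
    "urn:erisx2:BIBC2LUTIQH43S2KRIAV7TBXNUUVPZTMV6KFA2M7AL5V6FNE77VNUDDVDAGJUEEAFATVO2QQT67SMOPTO3LGWCJFU7BZVCF5VXEQQW25BE",
)
print(url)
```

Since the ERIS URN consists only of Base32 characters and colons, no percent-encoding is applied to it in practice.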

>> A tricky things is figuring out how to multiplex all these different
>> transports and storages...
>
> Yes.  We don’t know yet what performance and data availability will be
> like on IPFS, for instance, so it’s important for users to be able to
> set priorities.  It’s also important to gracefully fall back to direct
> HTTP downloads when fancier p2p methods fail, regardless of how they
> fail.

Agree.

Thanks,
-pukkamustard