From: pukkamustard
To: Ludovic Courtès
Cc: ~pukkamustard/eris@lists.sr.ht, 52555@debbugs.gnu.org
Subject: Re: bug#52555: [RFC PATCH 0/3] Decentralized substitute distribution with ERIS
Date: Thu, 23 Dec 2021 11:42:46 +0000
Message-ID: <86bl17ms56.fsf@posteo.net>
In-reply-to: <87h7b3gs64.fsf@gnu.org>
References: <20211216161724.547-1-pukkamustard@posteo.net> <87h7b3gs64.fsf@gnu.org>

Hi Ludo,

Thanks for your comments!

Ludovic Courtès writes:

>> StorePath: /gnu/store/81bdcd5x4v50i28h98bfkvvkx9cky63w-hello-2.10
>> URL: nar/gzip/81bdcd5x4v50i28h98bfkvvkx9cky63w-hello-2.10
>> Compression: gzip
>> FileSize: 67363
>> ERIS: urn:erisx2:BIBC2LUTIQH43S2KRIAV7TBXNUUVPZTMV6KFA2M7AL5V6FNE77VNUDDVDAGJUEEAFATVO2QQT67SMOPTO3LGWCJFU7BZVCF5VXEQQW25BE
>> URL: nar/zstd/81bdcd5x4v50i28h98bfkvvkx9cky63w-hello-2.10
>> Compression: zstd
>> FileSize: 64917
>> ERIS: urn:erisx2:BIBO7KS7SAWHDNC43DVILOSQ3F3SRRHEV6YPLDCSZ7MMD6LZVCHQMEQ6FUBTJAPSNFF7XR5XPTP4OQ72OPABNEO7UYBUN42O46ARKHBTGM
>
> Do we really need one URN per compression method?  Couldn’t we leave
> compression (of individual chunks, possibly) as a “detail” handled by
> the encoding or the transport layer?
>

I agree that it would be nice to leave this to the encoding layer, as
that would allow certain optimizations (e.g. de-duplication).

Unfortunately, we haven't figured out yet what the most suitable
compression/format would be. Something like EROSFS seems good (as it
aligns data to fixed block sizes) [1]. But this seems a bit "clunky" for
just an archive format, and there do not seem to be any libraries that
we could use to integrate it neatly. It seems possible to block-align a
Tar archive, but that seems a bit hacky [2]. Other things to look into
might be Tarlz [3] and ZPAQ [4].

To get started I suggest just using one of the compressions/formats
already in Guix. zstd seems to be a reasonable choice (for the same
reasons that it makes sense to use zstd with `--discover` [5]).

Does that sound like a plan?
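
To illustrate why alignment to fixed block sizes matters for
de-duplication, here is a toy sketch. It is not part of the patches and
does not use guile-eris: strings stand in for archive contents, the
block size of 4 is arbitrary, and `split-blocks`, `pad-to-block` and
`shared-blocks` are made-up helper names.

```scheme
(use-modules (srfi srfi-1))   ; delete-duplicates, lset-intersection

;; Hypothetical helper: split STR into consecutive blocks of SIZE
;; characters (the last block may be shorter).
(define (split-blocks str size)
  (let loop ((start 0) (blocks '()))
    (if (>= start (string-length str))
        (reverse blocks)
        (let ((end (min (+ start size) (string-length str))))
          (loop end (cons (substring str start end) blocks))))))

;; Hypothetical helper: pad STR with "_" up to the next multiple of SIZE.
(define (pad-to-block str size)
  (let ((extra (remainder (string-length str) size)))
    (if (zero? extra)
        str
        (string-append str (make-string (- size extra) #\_)))))

;; Hypothetical helper: the distinct blocks of A that also occur in B.
(define (shared-blocks a b size)
  (lset-intersection string=?
                     (delete-duplicates (split-blocks a size))
                     (split-blocks b size)))

;; The same "file" stored in two archives with different headers.
(define file "AAAABBBB")

;; Unaligned: the headers shift the file contents across block
;; boundaries, so no block is shared.
(define unaligned
  (shared-blocks (string-append "h1" file)
                 (string-append "hdr2x" file)
                 4))

;; Aligned: headers are padded to the block size, so the file's blocks
;; are identical in both archives and can be de-duplicated.
(define aligned
  (shared-blocks (string-append (pad-to-block "h1" 4) file)
                 (string-append (pad-to-block "hdr2x" 4) file)
                 4))

(write unaligned) (newline)   ; prints ()
(write aligned) (newline)     ; prints ("AAAA" "BBBB")
```

Compressing the whole archive as one stream has a similar effect to the
unaligned case: a small change early in the stream tends to change all
later blocks.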

[1] https://inqlab.net/git/guile-eris.git/tree/examples/dedup-fs/Readme.org
[2] https://unix.stackexchange.com/questions/276908/make-tar-or-other-archive-with-data-block-aligned-like-in-original-files-for/279384#279384
[3] http://lzip.nongnu.org/tarlz.html
[4] http://mattmahoney.net/dc/zpaq.html
[5] https://guix.gnu.org/en/blog/2021/getting-bytes-to-disk-more-quickly/

>> If the `--ipfs` is used for `guix publish` then the encoded blocks are also
>> uploaded to the IPFS daemon. The nar could then be retrieved from anywhere like
>> this:
>>
>> (use-modules (eris)
>>              (eris blocks ipfs))
>>
>> (eris-decode->bytevector
>>  "urn:erisx2:BIBC2LUTIQH43S2KRIAV7TBXNUUVPZTMV6KFA2M7AL5V6FNE77VNUDDVDAGJUEEAFATVO2QQT67SMOPTO3LGWCJFU7BZVCF5VXEQQW25BE"
>>  eris-blocks-ipfs-ref)
>>
>> These patches do not yet retrieve content from IPFS (TODO). But in principle,
>> anybody connected to IPFS can get the nar with the ERIS URN. This could be used
>> to reduce load on substitute server as they would only need to publish the ERIS
>> URN directly - substitutes could be delivered much more peer-to-peer.
>
> Nice.  So adjusting ‘guix substitute’ should be relatively easy?

Yes, relatively! :)

I meant to send in a V2 that does this before going on holidays, but I'm
afraid I won't make it. V2 will come in early January!

>> Other transports that I have been looking in to and am pretty sure will work
>> include: HTTP (with RFC 2169 [3]), GNUNet, OpenDHT. This is, imho, the
>> advantage of ERIS over IPFS directly or GNUNet directly. The encoding and
>> identifiers (URN) are abstracted away from specific transports (and also
>> applications). ERIS is almost exactly the same encoding as used in GNUNet
>> (ECRS).
>
> As a first step, ‘guix publish’ could implement RFC 2169, too.
>
> I gather implementing the HTTP and IPFS backends in ‘guix substitute’
> should be relatively easy, right?

Yes, those seem to be the two easiest backends to implement.

>> A tricky things is figuring out how to multiplex all these different
>> transports and storages...
>
> Yes.  We don’t know yet what performance and data availability will be
> like on IPFS, for instance, so it’s important for users to be able to
> set priorities.  It’s also important to gracefully fall back to direct
> HTTP downloads when fancier p2p methods fail, regardless of how they
> fail.

Agreed.

Thanks,
-pukkamustard
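
For illustration only, the priority-and-fallback behaviour discussed
above could be sketched roughly as follows. This is not code from the
patch series and does not use guile-eris: `fetch-with-fallback` and the
three stand-in backends are made-up names, and a real backend would
fetch and decode blocks instead of returning a string.

```scheme
;; Try each backend in BACKENDS, a list of (NAME . PROCEDURE) pairs in
;; priority order.  A backend "fails" if it returns #f or throws any
;; exception.  Return (NAME . RESULT) for the first backend that
;; succeeds, or #f if all of them fail.
(define (fetch-with-fallback urn backends)
  (let loop ((remaining backends))
    (if (null? remaining)
        #f
        (let* ((name   (car (car remaining)))
               (fetch  (cdr (car remaining)))
               (result (catch #t
                         (lambda () (fetch urn))
                         ;; Regardless of how it fails, treat it as
                         ;; "not available here".
                         (lambda (key . args) #f))))
          (if result
              (cons name result)
              (loop (cdr remaining)))))))

;; Stand-in backends (assumptions for this example only).
(define (failing-ipfs urn) (error "IPFS daemon unreachable"))
(define (empty-dht urn) #f)
(define (http-download urn) "nar contents")

(define result
  (fetch-with-fallback
   "urn:erisx2:BIBC2LUTIQH43S2KRIAV7TBXNUUVPZTMV6KFA2M7AL5V6FNE77VNUDDVDAGJUEEAFATVO2QQT67SMOPTO3LGWCJFU7BZVCF5VXEQQW25BE"
   ;; The order of this list is the user's priority.
   (list (cons 'ipfs failing-ipfs)
         (cons 'dht  empty-dht)
         (cons 'http http-download))))

(write result) (newline)   ; prints (http . "nar contents")
```

Keeping the priority as a plain ordered list means users could reorder
or drop backends without touching the fallback logic.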