From: zimoun
Date: Mon, 9 Mar 2020 13:40:39 +0100
Subject: Re: [PATCH v2 0/3] Xapian for Guix package search
To: Arun Isaac
Cc: Ludovic Courtès, Pierre Neidhardt, 39258@debbugs.gnu.org

On Sun, 8 Mar 2020 at 10:02, Arun Isaac wrote:

> >> It turns out that most of the time is spent in printing and texinfo
> >> rendering of the search results.
>
> Also, when we put all package metadata into the Xapian index, we don't
> have to look up any of the package variables in (gnu packages *) during
> `guix search` time. This also contributes substantially to the speedup.

Yes, magic power of inverted index. ;-)

> > Also, if the 12K+ descriptions need to be rendered at the time the user
> > runs ‘guix pull’, the experience may not be great, because it could take
> > a bit of time.
>
> This is a problem, but I would see it as a necessary "compilation"
> step. :-P In fact, this whole patchset speeds up `guix search` by doing
> part of the work of `guix search` ahead of time. So, some such cost is
> unavoidable.

Currently "guix pull" is rather long on my machine. I would accept a
couple of seconds more (even minutes). So this compilation step could
be done at the "guix pull" time.
Or we could even imagine indexing in the background.

> > What I like about the recutils format in this context is that it’s both
> > human- and machine-readable. The examples in the manual show how it can
> > be useful to select the information displayed or to refine the search
> > (info "(guix) Invoking guix package").
>
> Xapian's query language is much more natural (as in natural language)
> than the regexp based techniques we need to use with recutils. I have
> hardly ever used the regexp based search and I suspect many others
> haven't either. Also, refining the search query should be easier to do
> with Xapian. We could even use Xapian's query expansion feature to
> suggest improved queries to the user.
>
> That said, if we want the recutils format, we can still keep it in a
> simplified form like so.
>
> name: inkscape
> version: 0.92.4
> synopsis: Vector graphics editor
>
> name: inklingreader
> version: 0.8
> synopsis: Wacom Inkling sketch format conversion and manipulation
>
> > Also: I’d recommend tackling one thing at a time. :-)
>
> I totally agree, but I'm tempted to say that pre-rendering would be a
> lot cheaper with the simplified form of search results. :-)

IMHO, we "just" need to offer different outputs mimicking "git log
--format". Something like "guix search --format=". What do you think?

All the best,
simon
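
To make the "--format" idea above a bit more concrete, here is a minimal
sketch, written in Python rather than Guile only for brevity. Everything
in it is an assumption for illustration: the placeholder letters (%n,
%v, %s), the function name format_package, and the dictionary
representation of a package are hypothetical and not an existing Guix
interface. It only shows how a git-log-style format string could select
which fields of a search result get printed.

```python
import re

# Hypothetical placeholder letters mapped to package fields.
# These are illustrative only; the thread does not define any.
PLACEHOLDERS = {"n": "name", "v": "version", "s": "synopsis"}


def format_package(package, fmt):
    """Expand %n, %v and %s in fmt using the fields of package.

    '%%' yields a literal '%'. Unknown placeholders are left
    untouched, which is a conservative choice for a sketch.
    """
    def expand(match):
        key = match.group(1)
        if key == "%":
            return "%"
        field = PLACEHOLDERS.get(key)
        return package[field] if field else match.group(0)

    return re.sub(r"%(.)", expand, fmt)


# The two records from the simplified recutils example quoted above.
packages = [
    {"name": "inkscape", "version": "0.92.4",
     "synopsis": "Vector graphics editor"},
    {"name": "inklingreader", "version": "0.8",
     "synopsis": "Wacom Inkling sketch format conversion and manipulation"},
]

for package in packages:
    print(format_package(package, "%n@%v: %s"))
```

With such a scheme, the simplified recutils output quoted above would
just be one particular format string among others, and pre-rendering
would only be needed for the fields a given format actually asks for.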