Hello,

Arun Isaac wrote:

> Here is the second iteration of my Xapian Guix package search patchset. I have
> found the reason the earlier patchset did not show significant speedup. It
> turns out that most of the time is spent in printing and texinfo rendering of
> the search results. So, in this patchset, I pre-render the search results
> while building the Xapian index and stuff them into the Xapian database
> itself. Therefore, during `guix search`, I just pull out the pre-rendered
> search results and print it on the screen. This is much faster. See comparison
> below.
>
> With a warm cache,
> $ time guix search inkscape
>
> real 0m1.787s
> user 0m1.745s
> sys 0m0.111s
>
> $ time /tmp/test/bin/guix search inkscape
>
> real 0m0.199s
> user 0m0.182s
> sys 0m0.024s

Nice!

In general, pre-rendering doesn’t seem practical to me: the output of
‘guix search’ is locale-dependent (it speaks the user’s language) and
adjusts to the terminal width (well, this is temporarily broken on Guile
3.0.0, but see ‘%text-width’ in (guix ui)).

Also, if the 12K+ descriptions need to be rendered at the time the user
runs ‘guix pull’, the experience may not be great, because it could take
a bit of time.

WDYT?

> Why not use a simpler package search results format like Arch Linux or Debian
> does? We could just display the package name, version and synopsis like so.
>
> inkscape 0.92.4
> Vector graphics editor
> inklingreader 0.8
> Wacom Inkling sketch format conversion and manipulation
>
> Why do we need the entire recutils format? If the user is interested, they can
> always use `guix package --show` to get the full recutils formatted
> info. Having shorter search results will make everything even faster and much
> more readable. WDYT?

What I like about the recutils format in this context is that it’s both
human- and machine-readable. The examples in the manual show how it can
be useful to select the information displayed or to refine the search
(info "(guix) Invoking guix package").

Also: I’d recommend tackling one thing at a time. :-)

> Ludovic Courtès writes:
>
>> Note that ‘guix search’ time is largely dominated by I/O.
>
> Yes, `guix search` is I/O intensive. That is why I expect Xapian to do better
> since it only needs to access matching packages not all packages. Also, the
> Xapian index is fast at all times. It is not very dependent on a warm
> filesystem cache.

Yes, indeed.

>> On my laptop,
>> I get (first measurement is cold cache, second one is warm cache):
>>
>> --8<---------------cut here---------------start------------->8---
>> $ sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'
>> $ time guix search foo >/dev/null
>>
>> real 0m2.631s
>> user 0m1.134s
>> sys 0m0.124s
>> $ time guix search foo >/dev/null
>>
>> real 0m0.836s
>> user 0m1.027s
>> sys 0m0.053s
>> --8<---------------cut here---------------end--------------->8---
>>
>> It’s hard to do better on the warm cache case because at this level,
>> there may be other things to optimize having little to do with searching
>> itself.
>>
>> Note that this is on an SSD; the cold-cache case must be worse on NFS or
>> on a spinning disk, and there we could gain a lot.
>
> My laptop is quite old with a particularly slow HDD. Hence my motivation to
> improve guix search performance!

Were you able to measure the cost of rendering specifically?

Here’s what I see when I turn ‘package->recutils’ into a no-op:

--8<---------------cut here---------------start------------->8---
$ sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'
$ time ./pre-inst-env guix search foo

real 0m1.617s
user 0m0.812s
sys 0m0.094s
$ time ./pre-inst-env guix search foo

real 0m0.595s
user 0m0.747s
sys 0m0.043s
--8<---------------cut here---------------end--------------->8---

To compare with:

--8<---------------cut here---------------start------------->8---
$ time ./pre-inst-env guix search foo >/dev/null

real 0m0.829s
user 0m1.026s
sys 0m0.046s
--8<---------------cut here---------------end--------------->8---

I think we should look at a profile of ‘package->recutils’, there’s
probably room for improvement there.

Thoughts?

Ludo’.