Search paths of dependencies are not honored

OpenSubmitted by Ludovic Courtès.
Details
4 participants
  • Julien Lepiller
  • Ludovic Courtès
  • Maxime Devos
  • Mark H Weaver
Owner
unassigned
Severity
important
L
L
Ludovic Courtès wrote on 10 Dec 2015 10:36
(address . bug-guix@gnu.org)
87bn9yk5mf.fsf@gnu.org
As of 0.9.0+, only the search paths of packages explicitly in the
profile are shown by ‘--search-paths’ and similar. This is a problem
for libraries that have associated environment variables.

For instance, if one installs an application linked against OpenSSL,
they will not know about ‘SSL_CERT_DIR’ and ‘SSL_CERT_FILE’. Similarly
for GStreamer and ‘GST_PLUGIN_PATH’, libc and ‘GUIX_LOCPATH’, and so on.

Fixing it seems tricky. We could pass the profile builder the closure’s
graph (via #:references-graph) and the the set of search paths of all
the direct and indirect dependencies of the profile. The builder would
then need to somehow determine the subset of these search paths that is
actually useful, and use it to build the search path spec in the
manifest as well as etc/profile.

Ludo’.
L
L
Ludovic Courtès wrote on 26 Mar 2017 15:18
control message for bug #22138
(address . control@debbugs.gnu.org)
87shm0rxb3.fsf@gnu.org
severity 22138 important
J
J
Julien Lepiller wrote on 1 Aug 2019 22:12
Search paths of dependencies are not honored
(address . 22138@debbugs.gnu.org)
20190801221206.17965136@sybil.lepiller.eu
Hi, I've been looking at our current code and would like to propose the
attached patch for that issue.
From cfd2c229087166ab4cc0a9e2bdb72c8b393bcdd5 Mon Sep 17 00:00:00 2001
From: Julien Lepiller <julien@lepiller.eu>
Date: Thu, 1 Aug 2019 22:09:38 +0200
Subject: [PATCH] guix: Recursively honor search paths of dependencies.

* guix/packages.scm (all-transitive-inputs)
(package-all-transitive-inputs)
(package-all-transitive-native-search-paths): New procedures.
* guix/profiles.scm (package->manifest-entry): Use
package-all-transitive-native-search-paths to generate manifest search
paths.
---
guix/packages.scm | 53 +++++++++++++++++++++++++++++++++++++++++++++++
guix/profiles.scm | 2 +-
2 files changed, 54 insertions(+), 1 deletion(-)

Toggle diff (100 lines)
diff --git a/guix/packages.scm b/guix/packages.scm
index c94a651f27..f9095759f1 100644
--- a/guix/packages.scm
+++ b/guix/packages.scm
@@ -101,6 +101,7 @@
             package-transitive-propagated-inputs
             package-transitive-native-search-paths
             package-transitive-supported-systems
+	    package-all-transitive-native-search-paths
             package-mapping
             package-input-rewriting
             package-input-rewriting/spec
@@ -686,6 +687,42 @@ preserved, and only duplicate propagated inputs are removed."
       ((input rest ...)
        (loop rest (cons input result) propagated first? seen)))))
 
+(define (all-transitive-inputs inputs)
+  "Return the closure of INPUTS when considering the 'propagated-inputs',
+'inputs' and 'native-inputs' edges.  Omit duplicate inputs, except for
+those already present in INPUTS itself.
+
+This is implemented as a breadth-first traversal such that INPUTS is
+preserved, and only duplicate propagated inputs are removed."
+  (define (seen? seen item outputs)
+    ;; FIXME: We're using pointer identity here, which is extremely sensitive
+    ;; to memoization in package-producing procedures; see
+    ;; <https://bugs.gnu.org/30155>.
+    (match (vhash-assq item seen)
+      ((_ . o) (equal? o outputs))
+      (_       #f)))
+
+  (let loop ((inputs     inputs)
+             (result     '())
+             (transitive '())
+             (first?     #t)
+             (seen       vlist-null))
+    (match inputs
+      (()
+       (if (null? transitive)
+           (reverse result)
+           (loop (reverse (concatenate transitive)) result '() #f seen)))
+      (((and input (label (? package? package) outputs ...)) rest ...)
+       (if (and (not first?) (seen? seen package outputs))
+           (loop rest result transitive first? seen)
+           (loop rest
+                 (cons input result)
+                 (cons (package-direct-inputs package) transitive)
+                 first?
+                 (vhash-consq package outputs seen))))
+      ((input rest ...)
+       (loop rest (cons input result) transitive first? seen)))))
+
 (define (package-direct-sources package)
   "Return all source origins associated with PACKAGE; including origins in
 PACKAGE's inputs."
@@ -720,6 +757,11 @@ with their propagated inputs."
 with their propagated inputs, recursively."
   (transitive-inputs (package-direct-inputs package)))
 
+(define (package-all-transitive-inputs package)
+  "Return the transitive inputs of PACKAGE---i.e., its direct inputs along
+with their propagated inputs, recursively."
+  (all-transitive-inputs (package-direct-inputs package)))
+
 (define (package-transitive-target-inputs package)
   "Return the transitive target inputs of PACKAGE---i.e., its direct inputs
 along with their propagated inputs, recursively.  This only includes inputs
@@ -749,6 +791,17 @@ recursively."
                          '()))
                       (package-transitive-propagated-inputs package))))
 
+(define (package-all-transitive-native-search-paths package)
+  "Return the list of search paths for PACKAGE and its propagated inputs,
+recursively."
+  (append (package-native-search-paths package)
+          (append-map (match-lambda
+                        ((label (? package? p) _ ...)
+                         (package-native-search-paths p))
+                        (_
+                         '()))
+                      (package-all-transitive-inputs package))))
+
 (define (transitive-input-references alist inputs)
   "Return a list of (assoc-ref ALIST <label>) for each (<label> <package> . _)
 in INPUTS and their transitive propagated inputs."
diff --git a/guix/profiles.scm b/guix/profiles.scm
index f5c863945c..dd6a31562f 100644
--- a/guix/profiles.scm
+++ b/guix/profiles.scm
@@ -318,7 +318,7 @@ file name."
                      (item package)
                      (dependencies (delete-duplicates deps))
                      (search-paths
-                      (package-transitive-native-search-paths package))
+                      (package-all-transitive-native-search-paths package))
                      (parent parent)
                      (properties properties))))
     entry))
-- 
2.22.0
M
M
Mark H Weaver wrote on 5 Aug 2019 18:23
(name . Julien Lepiller)(address . julien@lepiller.eu)(address . 22138@debbugs.gnu.org)
87ftmfa8cp.fsf@netris.org
Hi Julien,

Julien Lepiller <julien@lepiller.eu> writes:

Toggle quote (15 lines)
> Hi, I've been looking at our current code and would like to propose the
> attached patch for that issue.
>
> From cfd2c229087166ab4cc0a9e2bdb72c8b393bcdd5 Mon Sep 17 00:00:00 2001
> From: Julien Lepiller <julien@lepiller.eu>
> Date: Thu, 1 Aug 2019 22:09:38 +0200
> Subject: [PATCH] guix: Recursively honor search paths of dependencies.
>
> * guix/packages.scm (all-transitive-inputs)
> (package-all-transitive-inputs)
> (package-all-transitive-native-search-paths): New procedures.
> * guix/profiles.scm (package->manifest-entry): Use
> package-all-transitive-native-search-paths to generate manifest search
> paths.

As I recall this kind of solution has been proposed in the past and
rejected. It's a reasonable suggestion, but I personally think that it
goes too far, because it would include a great many packages whose code
is nowhere to be found in the resulting profile. For example, it would
include documentation generators used to build man pages, and the
compilers that were used to build those documentation generators, etc,
all the way back to the early bootstrap binaries.

Having said this, I agree that there is a longstanding problem in Guix
with search-paths not including enough packages in its calculation.
We've known about this problem for a long time, but as far as I know we
have not yet found a satisfactory solution.

Our current hacky workaround for problems like this has been to set
certain environment variables unconditionally in /etc/profile. For
example, you'll see that MANPATH, INFOPATH, XDG_DATA_DIRS,
XDG_CONFIG_DIRS, XCURSOR_PATH, DICPATH, and GST_PLUGIN_PATH are all set
there. See 'operating-system-etc-service' in (gnu system) for the
relevant code.

At the very least, I think we should wait for input from Ludovic before
applying anything along these lines.

Anyway, thanks for looking into it and making the proposal.

Best,
Mark
J
J
Julien Lepiller wrote on 5 Aug 2019 18:31
(name . Mark H Weaver)(address . mhw@netris.org)(address . 22138@debbugs.gnu.org)
DA98A3D3-98B0-444C-B53A-3A6D37366908@lepiller.eu
Le 5 août 2019 18:23:55 GMT+02:00, Mark H Weaver <mhw@netris.org> a écrit :
Toggle quote (51 lines)
>Hi Julien,
>
>Julien Lepiller <julien@lepiller.eu> writes:
>
>> Hi, I've been looking at our current code and would like to propose
>the
>> attached patch for that issue.
>>
>> From cfd2c229087166ab4cc0a9e2bdb72c8b393bcdd5 Mon Sep 17 00:00:00
>2001
>> From: Julien Lepiller <julien@lepiller.eu>
>> Date: Thu, 1 Aug 2019 22:09:38 +0200
>> Subject: [PATCH] guix: Recursively honor search paths of
>dependencies.
>>
>> * guix/packages.scm (all-transitive-inputs)
>> (package-all-transitive-inputs)
>> (package-all-transitive-native-search-paths): New procedures.
>> * guix/profiles.scm (package->manifest-entry): Use
>> package-all-transitive-native-search-paths to generate manifest
>search
>> paths.
>
>As I recall this kind of solution has been proposed in the past and
>rejected. It's a reasonable suggestion, but I personally think that it
>goes too far, because it would include a great many packages whose code
>is nowhere to be found in the resulting profile. For example, it would
>include documentation generators used to build man pages, and the
>compilers that were used to build those documentation generators, etc,
>all the way back to the early bootstrap binaries.
>
>Having said this, I agree that there is a longstanding problem in Guix
>with search-paths not including enough packages in its calculation.
>We've known about this problem for a long time, but as far as I know we
>have not yet found a satisfactory solution.
>
>Our current hacky workaround for problems like this has been to set
>certain environment variables unconditionally in /etc/profile. For
>example, you'll see that MANPATH, INFOPATH, XDG_DATA_DIRS,
>XDG_CONFIG_DIRS, XCURSOR_PATH, DICPATH, and GST_PLUGIN_PATH are all set
>there. See 'operating-system-etc-service' in (gnu system) for the
>relevant code.
>
>At the very least, I think we should wait for input from Ludovic before
>applying anything along these lines.
>
>Anyway, thanks for looking into it and making the proposal.
>
> Best,
> Mark

Tgis patch doesn't add any dependency to any package, since it doesn't record any more stone paths than without. It indeed defines more variables than needed, but isn't that ok?
L
L
Ludovic Courtès wrote on 23 Aug 2019 16:42
(name . Mark H Weaver)(address . mhw@netris.org)
87d0gw54fz.fsf@gnu.org
Hi Mark,

Mark H Weaver <mhw@netris.org> skribis:

Toggle quote (22 lines)
> Julien Lepiller <julien@lepiller.eu> writes:
>
>> Hi, I've been looking at our current code and would like to propose the
>> attached patch for that issue.
>>
>> From cfd2c229087166ab4cc0a9e2bdb72c8b393bcdd5 Mon Sep 17 00:00:00 2001
>> From: Julien Lepiller <julien@lepiller.eu>
>> Date: Thu, 1 Aug 2019 22:09:38 +0200
>> Subject: [PATCH] guix: Recursively honor search paths of dependencies.
>>
>> * guix/packages.scm (all-transitive-inputs)
>> (package-all-transitive-inputs)
>> (package-all-transitive-native-search-paths): New procedures.
>> * guix/profiles.scm (package->manifest-entry): Use
>> package-all-transitive-native-search-paths to generate manifest search
>> paths.
>
> As I recall this kind of solution has been proposed in the past and
> rejected. It's a reasonable suggestion, but I personally think that it
> goes too far, because it would include a great many packages whose code
> is nowhere to be found in the resulting profile.

I agree.

We need to take into account run-time dependencies (references), not
build-time dependencies, and that’s what makes things more difficult.

The attached procedure computes the search paths of a package based on
its run-time dependencies. ‘profile-derivation’ could use this to
compute its ‘etc/profile’ file and its ‘manifest’ file. Here’s what it
gives for ‘w3m’:

Toggle snippet (34 lines)
(("BASH_LOADABLES_PATH"
("lib/bash")
":"
directory
#f)
("GUIX_LOCPATH" ("lib/locale") ":" directory #f)
("SSL_CERT_DIR"
("etc/ssl/certs")
#f
directory
#f)
("SSL_CERT_FILE"
("etc/ssl/certs/ca-certificates.crt")
#f
regular
#f)
("TERMINFO_DIRS"
("share/terminfo")
":"
directory
#f)
("XDG_DATA_DIRS" ("share") ":" directory #f)
("GIO_EXTRA_MODULES"
("lib/gio/modules")
":"
directory
#f)
("PERL5LIB"
("lib/perl5/site_perl")
":"
directory
#f))

We get GUIX_LOCPATH and SSL_CERT_DIR, which is exactly what we want.
However, the other search paths that we get look less desirable:
intuitively, it seems that we may not need them, and in the case of
XDG_DATA_DIRS, setting it may be detrimental.

For ‘icecat’ this gives additional things like PYTHONPATH,
GUILE_LOAD_PATH, XML_CATALOG_FILES, and so on.

In practice that’s probably OK though: if you only have ‘icecat’ and
other GUIs in your profile, Guix won’t suggest setting GUILE_LOAD_PATH
or PYTHONPATH.

So I think we can come up with a solution based on the patch below, but
we’ll have to test it on Guix System and on foreign distros to see if
the extra search paths that it brings are warranted.

Thoughts?

Ludo’.
Toggle diff (77 lines)
diff --git a/guix/packages.scm b/guix/packages.scm
index c94a651f27..7d5c8198ac 100644
--- a/guix/packages.scm
+++ b/guix/packages.scm
@@ -26,6 +26,7 @@
   #:use-module (guix store)
   #:use-module (guix monads)
   #:use-module (guix gexp)
+  #:use-module (guix modules)
   #:use-module (guix base32)
   #:use-module (guix grafts)
   #:use-module (guix derivations)
@@ -749,6 +750,64 @@ recursively."
                          '()))
                       (package-transitive-propagated-inputs package))))
 
+(define (package-run-time-search-paths package)
+  "Return a file that contains a list of search path specifications
+corresponding to the run-time references of PACKAGE."
+  (define search-paths
+    (map (lambda (package)
+           (cons (package-full-name package "-")
+                 (map search-path-specification->sexp
+                      (package-native-search-paths package))))
+         (package-closure (list package))))
+
+  (define build
+    (with-imported-modules (source-module-closure
+                            '((guix build utils)
+                              (guix search-paths)
+                              (guix build store-copy)))
+      #~(begin
+          (use-modules (guix build utils)
+                       (guix build store-copy)
+                       (guix search-paths)
+                       (srfi srfi-1)
+                       (ice-9 match)
+                       (ice-9 pretty-print))
+
+          (define search-paths
+            ;; Map items like "coreutils-8.20" to their search path specs.
+            (map (match-lambda
+                   ((item . paths)
+                    (cons item (map sexp->search-path-specification paths))))
+                 '#$search-paths))
+
+          (define (store-item-search-paths item)
+            "Return the search paths that apply to ITEM, a store item."
+            (or (any (match-lambda
+                       ((entry . paths)
+                        (and (string-suffix? entry item)
+                             paths)))
+                     search-paths)
+                '()))
+
+          (let* ((references (map store-info-item
+                                  (call-with-input-file "graph"
+                                    read-reference-graph)))
+                 (paths      (delete-duplicates
+                              (append-map store-item-search-paths
+                                          references))))
+            (call-with-output-file #$output
+              (lambda (port)
+                (pretty-print (map search-path-specification->sexp
+                                   paths)
+                              port)))))))
+
+  (computed-file (string-append (package-full-name package "-")
+                                "-search-paths")
+                 build
+                 #:options
+                 `(#:references-graphs (("graph" ,package))
+                   #:properties '((type . search-paths)))))
+
 (define (transitive-input-references alist inputs)
   "Return a list of (assoc-ref ALIST <label>) for each (<label> <package> . _)
 in INPUTS and their transitive propagated inputs."
L
L
Ludovic Courtès wrote on 24 Aug 2019 15:52
(name . Mark H Weaver)(address . mhw@netris.org)
87pnkuwtzl.fsf@gnu.org
Ludovic Courtès <ludo@gnu.org> skribis:

Toggle quote (4 lines)
> The attached procedure computes the search paths of a package based on
> its run-time dependencies. ‘profile-derivation’ could use this to
> compute its ‘etc/profile’ file and its ‘manifest’ file.

One difficulty with this approach is that search paths are currently a
“host-side” feature: there’s a ‘search-paths’ field in <manifest-entry>,
which is populated before the profile is actually built.

What I suggest above would mean turning search paths into a “build-side”
feature, which has some implications.

Needs more thought…

Ludo’.
J
J
Julien Lepiller wrote on 26 Nov 2019 13:00
Search paths of dependencies are not honored
(address . 22138@debbugs.gnu.org)
3EB63F9F-2885-4A59-B945-CE8011778017@lepiller.eu

I'm not sure what the implications would be. Your results look good and I think it's close enough to what we need. Now, to get SSL_CERT_DIR to be defined, we simply need a ssl cert dir to be present in the profile, which is what users expect.

I would argue that we could simply add every possible variable to the profile, it wouldn't make much of a difference, but if you feel using only runtime dependencies is better, I won't argue against it. It's fine with me, even if it seems more complicated than it needs to be.

As a user, I have control over the content of my profile. When I have some file in it, I expect it to be "functional" in the sense that it contributes something to the functionalities of the profile. Not setting a variable when we could makes the file non-functional, since it can't be used directly. Currently, one needs to add another non-functional package to the profile (like opennsl), which doesn't contribute anything except metadata.

If I have a package in my profile that adds an unrelated file (eg. a python library), and it is not desirable, I think we should rather fix the package to add a separate output for that library, rather than restricting the set of environment variables.
M
M
Maxime Devos wrote on 14 Jan 20:48 +0100
Should search paths of dependencies be honored automatically?
(address . 22138@debbugs.gnu.org)
e00f34344ec42db021f06004b2117d8b8a0d8f55.camel@telenet.be
(first section)

Toggle quote (10 lines)
> As of 0.9.0+, only the search paths of packages explicitly in the
> profile are shown by ‘--search-paths’ and similar. This is a problem
> for libraries that have associated environment variables.
>
> For instance, if one installs an application linked against OpenSSL,
> they will not know about ‘SSL_CERT_DIR’ and ‘SSL_CERT_FILE’. Similarly
> for GStreamer and ‘GST_PLUGIN_PATH’, libc and ‘GUIX_LOCPATH’, and so on.
> [...]
> [... stuff about references-graph ...]

I want to note that it is possible to use a library in a program without
needing the environment variables. E.g., web servers can use openssl,
but often only for listening so they don't need a certificate bundle.
A game using GStreamer for sounds knows in advance which plugins it will
need, so the package definition would be using 'wrap-program' or the like
to hardcode the set of plugins.

Many (almost all?) packages keep a reference to the "lib" output of "gcc",
but almost no packages need '{,CROSS_}C_INCLUDE_PATH', '{,CROSS_}CPLUS_INCLUDE_PATH' or
'{,CROSS_}LIBRARY_PATH'.

That said, adding these superfluos environment variables is probably
harmless in practice, but it kind of breaks some abstractions:

If I use a package as input that uses a search path and I fix the search path
using 'wrap-program' or the like, then I would expect things to be as if there
wasn't a search path in the first place, but IIUC the proposed patch looking
at references would break things.

Also, the patch seems to only affect profiles and not build environments,
creating another distance between them, which seems suboptimal.

For these reasons, I don't think search paths of dependencies should be honoured.
However, there are things to improve:

(second section)

I believe a better option would be to document search paths better in the manual
(there's almost nothing currently about search paths) and make search paths easier
to use.

E.g., we could move ‘generic’ search paths used by several packages (e.g.
INFOPATH, TERMINFO_DIRS, XDG_DATA_DIRS, GST_PLUGIN_PATH, SSL_CERT_DIR
and SSL_CERT_FILE --- why is SSL_CERT_DIR missing from icecat and guile
anyway?) (counter-example: MINETEST_MOD_PATH, because it is used by only
one package) into a common module next to each other, which some small comments

[...]
;; FWIW I think I suggested something like this elsewhere, in the context
;; of module import loops.

;; for info readers (info-reader, emacs, ...)
(define-public %info-path ...)
;; for programs interacting with terminal emulators,
;; e.g. almost anything using 'ncurses'
(define-public %terminfo-dirs-path ...)
;; for packages using TLS, e.g. web browsers (icecat, ungoogled-chromium, ...),
;; e-mail clients (evolution, icedove ...), most software 'openssl'
;;
;; Not all packages using TLS respect one of these two variables.
;;
;; (TODO check if these examples actually respect SSL_CERT_DIR/PATH)
(define-public %x509-cert-dir-path ...)
(define-public %x509-cert-file-path ...)
[...]

and mention this module in the documentation of 'search-paths',
invite people to look at the source code of the module for examples
people maybe even documenting these search paths in the manual.
Maybe also insert a reminder about search paths in the 'reviewing' section?

I think this (keeping standard search paths together, with some explanation,
with some example packages using these search paths ...) would help discoverability
a lot, reducing the risk that a package is defined without appropriate search paths.

Such a common module would also reduce redunancy (a single definition of %foo-path
instead of sprinkling it among separate packages) and reduce the chance of module import
loops (maybe we can move the python search path there?), although the latter could
be solved with thunking (at some cost).

It's also possible to do both this (manual documentation, common search path module)
and the patch using the reference graph.

This ignores GUIX_LOCPATH, but that's used by almost anything, so it could be
special-cased I think?

Greetings,
Maxime.
-----BEGIN PGP SIGNATURE-----

iI0EABYKADUWIQTB8z7iDFKP233XAR9J4+4iGRcl7gUCYeHTpRccbWF4aW1lZGV2
b3NAdGVsZW5ldC5iZQAKCRBJ4+4iGRcl7v+HAQCW39ekw+T8CnedRiOqVHTO/sqW
6ZtJVAs1Z3SROPsSpwEA8haWJBuOCFLVKuNXndpAzsavmxqEU0xnjHcVDbJBTgc=
=sClV
-----END PGP SIGNATURE-----


M
M
Maxime Devos wrote on 22 Jan 22:27 +0100
Search paths of dependencies are not honored
(address . 22138@debbugs.gnu.org)
e3960b96e782b5d01c93a8615e4f76463bb55682.camel@telenet.be
Hi,

On second thought, honouring search paths of dependencies
(filtered by looking at the references?) would solve much more problems
than it could create -- a theoretical ‘problem’ is that more
environment variables than strictly needed might be defined (see e.g.
the wrap-program example) but it doesn't seem that it would create
problems in practice.

There's a limitation to keep in mind though: static libraries.

Suppose we have a static library variant curl/static tht links against
openssl:static. Then due to the static linking, curl won't keep a
reference to openssl:static (*), so there won't be a SSL_CERT_DIR/FILE
search path.

(*) Actually, that might wrong, libcrypto.a keeps a reference to 
/gnu/store/plr00nij45964cyy7sfcg5rcsi8hks0h-openssl-1.1.1l.

For some examples in the wild where this kind of propagation(*) of
search paths is useful: all packages using ncurses (TERMINFO_DIRS),
openssl (SSL_CERT_DIR/SSL_CERT_FILE), all packages using glibc
(GUIX_LOCPATH) -- basically all software(!).

(*) it's not propagated-inputs but the concept doesn't seem completely
unlike.

As such, I agree with the concept (notwithstanding the previous mail I
sent), with the caveat that I haven't investigated the implementation
or tested it.

Greetings,
Maxime
-----BEGIN PGP SIGNATURE-----

iI0EABYKADUWIQTB8z7iDFKP233XAR9J4+4iGRcl7gUCYex2sxccbWF4aW1lZGV2
b3NAdGVsZW5ldC5iZQAKCRBJ4+4iGRcl7vGxAQC+DWY5lylg2qhcDQwEfnrPWy/H
zm+gka4paIYI5GdDKwD/V5i9oSk0gbqHi4H0E1v85tnEndjK0VADdXNL32Q8PgM=
=HAEM
-----END PGP SIGNATURE-----


?