Re: [opensuse-factory] RFC Generic Packaging for Languages that have vendor/ Trees


Neal Gompa
On Tue, Dec 19, 2017 at 4:32 PM, Aleksa Sarai <[hidden email]> wrote:

> Hello *,
>
> This is a proposal for having a generic packaging system of RPMs for
> languages that use "vendor/" trees. Please respond with any feedback you
> have on the details of this proposal.
>
> The main justification for the need for this proposal is that we have
> seen the recent rise of languages that have an *enormous* number of
> "micro-packages" (JavaScript is the most well-known offender here, where
> the majority of widely used packages are only several lines long, but
> Rust has a similar issue, and Go/Ruby do too). This has effectively made
> it an impracticality (or even an impossibility for some languages) to
> create a 1-to-1 RPM mapping for each package. So while a 1-to-1 RPM
> mapping is arguably the most ideal (both from an ideological perspective
> and a tooling perspective), the maintenance burden is far too high.
>
> Another problem is that many projects written in these sorts of
> languages these days "vendor" their dependencies, usually using a
> language-specific package manager to do so. (This is slightly ironic in
> my opinion, because if they'd integrated more with distributions this
> ideally wouldn't be necessary, but that ship has sailed.) This is a
> problem that also needs to be resolved. Luckily such projects usually
> have some sort of "lock file" that describes what is present inside the
> "vendor/" tree -- this is something that will be useful later. It should
> be noted that the 1-to-1 RPM mapping doesn't help here either, as it
> will further balloon the number of packages we would need to have
> (as each project might have different version dependencies). Debian has
> been attempting to do this with Go packages, and as far as I can see
> it's quite a futile effort because of the maintenance burden that comes
> from it.
>
> At the moment the way that most packages deal with this problem is that
> they just punt completely on reproducibility and auditability, and just
> vendor all dependencies in a project and then tar up the vendor/ tree
> and include it in the OBS project. For a JavaScript project this would
> involve just running `yarn <blah>` (or whatever the command is) and then
> taking node_modules/ and creating a node_modules.tar.xz that is
> included in the specfile. The main problem with this approach currently
> is that it is completely unauditable and nobody knows what's inside
> that magic vendor blob. *However* the core idea is not completely
> insane. The Rust folks have also started doing the same thing with
> cargo-vendor.
>
> And here we come to my proposal. The idea is to take what is already
> being done in these projects, and create better tooling around it to
> make the work of development, maintenance, security, and legal much
> easier.
>
> First, we need to provide more metadata about these vendor blobs in the
> RPM layer, so that security could at least *track* what versions of
> things are used by a project. And in the worst case, it should be
> possible to patch a vendor blob. This would likely best be done through
> RPM macros, by creating a virtual Provides for each of the vendored
> libraries. This matches what Fedora does for bundled libraries[1]. The
> Provides could be just as simple as
>
>     Provides: bundled(rust:nix) = 0.8.1
>
> Or something more involved to be extra paranoid:
>
>     Provides: bundled(rust:registry+https://github.com/rust-lang/crates.io-index:nix) = 0.8.1
>
> Secondly, in order to make this vendor archive reproducible, I propose
> we have an OBS service that can be used to vendor a source tree (which
> can obviously be run either locally or on OBS). It will produce all of
> the vendor archives created by language-specific tools, and produce a
> language-agnostic manifest of what was downloaded (the name, language,
> version, git commit, and so on). The idea is that this manifest could be
> used by the RPM macros above rather than writing language-specific
> macros.
>
> I have already started working on the OBS service, but I would love to
> hear your feedback on this proposal.
>
> [1]: https://fedoraproject.org/wiki/Bundled_Libraries?rd=Packaging:Bundled_Libraries
>
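
To make the manifest-to-Provides step in the quoted proposal concrete, here is a minimal sketch in Python. The `VendoredDep` record and both function names are hypothetical; the proposal does not define a manifest format, so this only assumes each entry carries a language, a name, a version, and optionally a source. The two output forms mirror the two Provides lines quoted above.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class VendoredDep:
    """One entry of a hypothetical language-agnostic vendor manifest."""
    language: str
    name: str
    version: str
    source: Optional[str] = None  # e.g. a registry URL, if the tool reports one


def provides_line(dep: VendoredDep, paranoid: bool = False) -> str:
    """Render one 'Provides: bundled(...)' line for a vendored dependency."""
    ident = f"{dep.language}:{dep.name}"
    if paranoid and dep.source:
        # The "extra paranoid" form also records where the dependency came from.
        ident = f"{dep.language}:{dep.source}:{dep.name}"
    return f"Provides: bundled({ident}) = {dep.version}"


def render_provides(deps: Iterable[VendoredDep], paranoid: bool = False) -> List[str]:
    """Deduplicate and sort entries so the generated spec fragment is stable."""
    unique = sorted(set(deps), key=lambda d: (d.language, d.name, d.version))
    return [provides_line(d, paranoid) for d in unique]


if __name__ == "__main__":
    nix = VendoredDep(
        "rust", "nix", "0.8.1",
        source="registry+https://github.com/rust-lang/crates.io-index",
    )
    print(provides_line(nix))
    print(provides_line(nix, paranoid=True))
```

Sorting and deduplicating is a design choice made here for reproducible output; a real macro or OBS service might order entries differently.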

I don't fully disagree with your proposal, but I will point out a few things:

* The current vendoring of Rust crates is temporary. We're waiting on
RPM 4.14[1] and the new product builder to come online (DimStar
already slapped me once for breaking Tumbleweed with rich deps
before...). I'm working on making rust2rpm make openSUSE-friendly spec
files (mainly add the boilerplate header, skip conversion of SPDX to
Fedora license tags, generate changes file) so that crates can be
easily packaged and shipped in the distribution. Right now, Fedora has
well over 230 Rust crates packaged[2], and the packaging for them is
pretty trivial[3]. We've also got a good handle on cargo integration,
so crates function as if they're in a local cargo registry for things
to depend on.

* I'm not sure why openSUSE hasn't adopted the bundled() Provides
thing across the board anyway. There are plenty of packages that ship
vendored trees/libraries and no one knows what they are. In general,
it's really not a bad idea to do that. In my opinion, it's
irresponsible to not require what you bundle to be defined.
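
As a rough illustration of why declaring bundled() Provides helps, the sketch below parses text shaped like the output of `rpm -q --provides` and lists what a package bundles. The `Bundled` tuple, the `parse_bundled` name, and the sample package data are all made up for this example, and the `language:` prefix follows the form used in the quoted proposal rather than any established guideline.

```python
import re
from typing import List, NamedTuple, Optional


class Bundled(NamedTuple):
    language: Optional[str]
    source: Optional[str]
    name: str
    version: Optional[str]


# Matches "bundled(<something>)" with an optional "= <version>" suffix.
_BUNDLED_RE = re.compile(r"^bundled\((?P<inner>[^)]+)\)(?:\s*=\s*(?P<version>\S+))?$")


def parse_bundled(provides_text: str) -> List[Bundled]:
    """Extract bundled() entries from a block of Provides lines."""
    found = []
    for line in provides_text.splitlines():
        match = _BUNDLED_RE.match(line.strip())
        if not match:
            continue  # an ordinary Provides, not a bundled component
        inner = match.group("inner")
        language, source, name = None, None, inner
        if ":" in inner:
            # Assumed layout: language[:source]:name, with the name after the last colon.
            language, rest = inner.split(":", 1)
            source, _, name = rest.rpartition(":")
        found.append(Bundled(language, source or None, name, match.group("version")))
    return found


# Made-up sample; real input would come from `rpm -q --provides <package>`.
SAMPLE = """\
mytool = 1.2.0-1.1
bundled(rust:nix) = 0.8.1
bundled(rust:registry+https://github.com/rust-lang/crates.io-index:libc) = 0.2.34
"""

if __name__ == "__main__":
    for dep in parse_bundled(SAMPLE):
        print(dep)
```

With entries like these in the RPM metadata, a security team could search a repository for every package that bundles a given crate and version.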

Generally speaking, I think this is a solid idea, but I solidly do not
believe we will be continuing the vendored crates practice for much
longer in Rust.

[1]: https://build.opensuse.org/request/show/558345
[2]: https://koji.fedoraproject.org/koji/search?match=glob&type=package&terms=rust-*
[3]: https://pagure.io/fedora-rust/playground


--
真実はいつも一つ!/ Always, there's only one truth!
--
To unsubscribe, e-mail: [hidden email]
To contact the owner, e-mail: [hidden email]


Luke Jones
On 20 December 2017 at 11:49, Neal Gompa <[hidden email]> wrote:

> I don't fully disagree with your proposal, but I will point out a few things:
>
> * The current vendoring of rust crates is temporary. We're waiting on
> RPM 4.14[1] and the new product builder to come online (DimStar
> already slapped me once for breaking Tumbleweed with rich deps
> before...). I'm working on making rust2rpm make openSUSE-friendly spec
> files (mainly add the boilerplate header, skip conversion of SPDX to
> Fedora license tags, generate changes file) so that crates can be
> easily packaged and shipped in the distribution. Right now, Fedora has
> well over 230 Rust crates packaged[2], and the packaging for them is
> pretty trivial[3]. We've also got a good handle on cargo integration,
> so crates function as if they're in a local cargo registry for things
> to depend on.
>
> * I'm not sure why openSUSE hasn't adopted the bundled() Provides
> thing across the board anyway. There are plenty of packages that ship
> vendored trees/libraries and no one knows what they are. In general,
> it's really not a bad idea to do that. In my opinion, it's
> irresponsible to not require what you bundle to be defined.
>
> Generally speaking, I think this is a solid idea, but I solidly do not
> believe we will be continuing the vendored crates practice for much
> longer in Rust.
>
> [1]: https://build.opensuse.org/request/show/558345
> [2]: https://koji.fedoraproject.org/koji/search?match=glob&type=package&terms=rust-*
> [3]: https://pagure.io/fedora-rust/playground
>
>
> --
> 真実はいつも一つ!/ Always, there's only one truth!

Just to add to what Neal wrote - where possible we should absolutely be
using RPM-packaged deps, especially in the case of Rust.

However, I am fairly certain that there will be cases where using
vendored blobs of sources may be acceptable (though not for distribution
in the main openSUSE trees) for user-built and user-provided packages. I
wouldn't expect a hobbyist to package a pile of dependencies, so maybe
something in place for tracking vendoring would be wise here?

If we aim to package each and every dependency it's going to turn into
lunacy pretty damn quickly, so the goal here should be distribution of
only the... well, top packages? I don't know, but we should be selective
and work backwards from there.

And yeah nah, we won't be continuing with vendored packages for long
once rich deps are in place. The current vendored packages are only a
temporary thing to keep the Rust structure and packaging ticking over.

Neal Gompa
On Tue, Dec 19, 2017 at 11:54 PM, Aleksa Sarai <[hidden email]> wrote:

> On 2017-12-19, Neal Gompa <[hidden email]> wrote:
>> * The current vendoring of rust crates is temporary. We're waiting on
>> RPM 4.14[1] and the new product builder to come online (DimStar
>> already slapped me once for breaking Tumbleweed with rich deps
>> before...). I'm working on making rust2rpm make openSUSE-friendly spec
>> files (mainly add the boilerplate header, skip conversion of SPDX to
>> Fedora license tags, generate changes file) so that crates can be
>> easily packaged and shipped in the distribution. Right now, Fedora has
>> well over 230 Rust crates packaged[2], and the packaging for them is
>> pretty trivial[3]. We've also got a good handle on cargo integration,
>> so crates function as if they're in a local cargo registry for things
>> to depend on.
>
> Is there a document somewhere that explains how it works? I read through
> the Fedora wiki page on Rust packaging[1] last time the RPM feature was
> mentioned on this list, but it doesn't explain anything about the
> current status (unless "rust2rpm" is the current status?).
>

Well, if you want to do it by hand, we do document how you're supposed
to do it: https://fedoraproject.org/wiki/Packaging:Rust

Unlike Go, which is mostly B.S. on packaging, we have been taking a
careful approach to ensure we're on a solid path for Rust.

>> * I'm not sure why openSUSE hasn't adopted the bundled() Provides
>> thing across the board anyway. There are plenty of packages that ship
>> vendored trees/libraries and no one knows what they are. In general,
>> it's really not a bad idea to do that. In my opinion, it's
>> irresponsible to not require what you bundle to be defined.
>>
>> Generally speaking, I think this is a solid idea, but I solidly do not
>> believe we will be continuing the vendored crates practice for much
>> longer in Rust.
>
> Okay. I just want to make sure that we don't run into the same
> maintenance problem we already have with Ruby packages (which will end
> up being worse due to the multi-versioning support in Rust, as well as
> the existence of far more micro-packages than in the Ruby universe).
> Does the current plan for Rust packaging account for that?
>
> [1]: https://fedoraproject.org/wiki/SIGs/Rust
>

Our Rust packaging design is deliberately built around the need to
package multiple versions of things. That said, when we encounter such
situations, we are encouraged to upgrade to the latest crate versions
and submit patches upstream. Igor Gnatenko has been like a machine,
doing just that across most of Fedora's crates.

But yes, we handle multiple versions of crates within a dep tree
perfectly fine. :)

> --
> Aleksa Sarai
> Senior Software Engineer (Containers)
> SUSE Linux GmbH
> <https://www.cyphar.com/>



--
真実はいつも一つ!/ Always, there's only one truth!