this post was submitted on 02 Sep 2024
42 points (92.0% liked)

Python

Welcome to the Python community on the programming.dev Lemmy instance!

[–] takeda@lemmy.world 3 points 2 months ago (2 children)

> The dockerfile does not guarantee this, but the docker image or any OCI image does.

That's true, but also misleading.

An OCI image is like a JPEG image, while a Dockerfile is like the text prompt you give ChatGPT to generate that image.

Yes, every time you look at the JPEG it is the exact same image, but that's kind of obvious. The real problem is that if you send the same text prompt to ChatGPT, you get something slightly different every time.

Nix brings true reproducibility. In this analogy, the same prompt produces the exact same image. This lets you check that prompt into your source control, and if you mess something up there's always a way back.

This is something docker promised, but never delivered.

> Dockerfile should not be confused with the artifact.

It should not, but artifacts never had a problem with mutating before we had Docker. If you generated an rpm package and stored it in an artifactory, it was always the same exact package (unless someone overwrote it, lol).

> Operationally we usually expect a dockerfile to be identical across many builds of different releases and know the artifact produced will have different code

But that's basically the problem Docker claimed to fix. It's also the problem you frequently run into when a pipeline that worked fine one day suddenly stops working the next, because something your Dockerfile references changed (maybe a base image was updated and broke something). You can lock things to specific hashes, but you need to be very conscious about it, and in the wild I have never seen anyone really doing it.
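
To make the "lock things to specific hashes" point concrete, here is a rough sketch of a check that flags base images referenced by a mutable tag rather than a digest. The function name `unpinned_base_images` and the example Dockerfile are made up for illustration, and the digest is a placeholder, not a real image digest. It only looks at `FROM` lines, so it says nothing about unpinned packages installed later in the build.

```python
import re

# Matches "FROM [--flag=value ...] image" and captures the image reference.
FROM_RE = re.compile(r"^\s*FROM\s+(?:--\S+\s+)*(\S+)", re.IGNORECASE)
# Matches a trailing "AS stage_name" on a FROM line.
ALIAS_RE = re.compile(r"\s+AS\s+(\S+)\s*$", re.IGNORECASE)


def unpinned_base_images(dockerfile_text: str) -> list[str]:
    """Return base image references that are not pinned to a sha256 digest."""
    unpinned = []
    stage_names = set()  # aliases of earlier build stages, e.g. "build"
    for line in dockerfile_text.splitlines():
        match = FROM_RE.match(line)
        if not match:
            continue
        image = match.group(1)
        is_stage = image in stage_names
        is_scratch = image.lower() == "scratch"
        is_pinned = "@sha256:" in image
        if not (is_stage or is_scratch or is_pinned):
            unpinned.append(image)
        alias = ALIAS_RE.search(line)
        if alias:
            stage_names.add(alias.group(1))
    return unpinned


# Placeholder digest for illustration only; a real one comes from the registry.
FAKE_DIGEST = "0" * 64

EXAMPLE_DOCKERFILE = f"""\
FROM python:3.12-slim AS build
RUN pip install poetry
FROM python@sha256:{FAKE_DIGEST}
COPY --from=build /app /app
"""

if __name__ == "__main__":
    print(unpinned_base_images(EXAMPLE_DOCKERFILE))
```

Only the tag-based `python:3.12-slim` line is reported here. A tag can point to a different image tomorrow, while a digest reference cannot, which is the difference the pipeline breakage above comes down to.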

> Anything you are doing with nix to make the lock files perfect is the same amount of work you’d be doing to any method of producing an OCI artifact.

It is not. Hashes and lock files are built in, and Nix uses them by default.

If, for example, I use a flake, flake.lock holds the exact revision of nixpkgs (the package repo) at a point in time. That happens without any additional effort. poetry2nix converts the poetry.lock file to Nix packages that are once again locked in time, and that also happens behind the scenes.

The result is that all dependencies, both the Python dependencies (from poetry.lock) and the rest of the system (Python itself, C libraries, etc., from flake.lock), are locked and in my repo. So everything is repeatable without effort on my side.

To repeat that with a Dockerfile is much more challenging.
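
As a small illustration of what "locked and in my repo" means, here is a sketch that lists the pinned revision of each input in a flake.lock. It assumes the lock file is JSON with a top-level `nodes` map whose entries carry a `locked` attribute set containing `rev` and `narHash`, which is how I understand the format. The helper name `locked_inputs` and the sample data (including the all-zero revision and the hash string) are placeholders, not values from a real lock file.

```python
import json


def locked_inputs(flake_lock_text: str) -> dict[str, str]:
    """Map each input node in a flake.lock to its pinned revision (or hash)."""
    lock = json.loads(flake_lock_text)
    pinned = {}
    for name, node in lock.get("nodes", {}).items():
        locked = node.get("locked")
        if locked is None:
            # The root node only lists its inputs and has no "locked" entry.
            continue
        # Prefer the git revision; fall back to the content hash if absent.
        pinned[name] = locked.get("rev") or locked.get("narHash", "")
    return pinned


# Illustrative sample only: the rev and narHash below are placeholders.
SAMPLE_FLAKE_LOCK = json.dumps(
    {
        "nodes": {
            "nixpkgs": {
                "locked": {
                    "type": "github",
                    "owner": "NixOS",
                    "repo": "nixpkgs",
                    "rev": "0" * 40,
                    "narHash": "sha256-PLACEHOLDER",
                },
                "original": {"type": "github", "owner": "NixOS", "repo": "nixpkgs"},
            },
            "root": {"inputs": {"nixpkgs": "nixpkgs"}},
        },
        "root": "root",
        "version": 7,
    }
)

if __name__ == "__main__":
    for name, rev in locked_inputs(SAMPLE_FLAKE_LOCK).items():
        print(f"{name} is pinned to {rev}")
```

Because that file is committed next to the code, checking out an old commit also brings back the exact nixpkgs revision it was built against.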

> I do think your approach is interesting though. Certainly less effort than manually packing an OCI with something like buildpaks or trying to run through bazel to get your way through a distroless build (two other methods that don’t make massive images with a Debian base). And obviously ‘From:scratch’ in docker build land is a nightmare.

If you get your app built with Nix, the whole thing, including all of the app's dependencies, is explicitly referenced, so you can wrap it into a Docker image, an rpm file, an OS image, etc.

It's controversial, but IMO Nix is actually easier than what we are doing now. I think the problem is that it is a massive paradigm shift, and most of what people know from existing technologies will generally not be useful, so you have to relearn everything.

But IMO it pays off. For example, when starting a new project I can package the whole thing in 5 minutes. poetry2nix translates the project and its dependencies into Nix packages, and since Nix then understands my project's dependencies, it can package it automatically.

[–] arendjr@programming.dev 2 points 2 months ago

You make a good point in that Docker promised to make dev environments reproducible so that everyone on the team would have the same environment. They even succeeded in that, but either intentionally or accidentally omitted reproducibility over time due to the introduction of non-locked dependencies.