
I was wondering: do you know of a limit on how many rootless containers one can run on a Linux host?

I'm running Fedora Server and have resources to spare, but once I pass about 15 containers, podman starts to hang and crash.

I then need to manually delete the storage folder under ~/.local/share/... for podman to work again.

It might be related to the user namespace keep-id flag (UserNS=keep-id).
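
For reference, the quadlet units look roughly like this (the image and port here are just placeholders, not my actual setup):

```ini
# ~/.config/containers/systemd/example.container
[Container]
Image=docker.io/library/nginx:latest
PublishPort=8080:80
# the flag I suspect:
UserNS=keep-id

[Service]
Restart=always

[Install]
WantedBy=default.target
```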

top 4 comments
[–] droopy4096@lemmy.ca 8 points 3 weeks ago (1 children)

I'm not convinced that what you're running into is a podman-specific issue; it's more likely a resource or configuration issue. In my experience, "vanilla" podman with properly set-up rootless containers will run as much workload as the machine can handle. My company's customers seem to be running production workloads with it just fine.

Oh wait, by rootless container you really meant running podman rootless? I still don't see an issue, though. What specifically are you doing? I mean, what's the configuration and what's the workload?

[–] zeGoomba@programming.dev 0 points 3 weeks ago

My config is nothing special. I'm running all containers as user 1000 via quadlets. Sporadically, I get:

Dec 05 13:40:27 home-lab systemd-coredump[200811]: [🡕] Process 200795 (podman) of user 1000 dumped core.

                                                   Module libbz2.so.1 from rpm bzip2-1.0.8-18.fc40.x86_64
                                                   Module libsepol.so.2 from rpm libsepol-3.7-2.fc40.x86_64
                                                   Module libpcre2-8.so.0 from rpm pcre2-10.44-1.fc40.x86_64
                                                   Module libcap-ng.so.0 from rpm libcap-ng-0.8.4-4.fc40.x86_64
                                                   Module libgpg-error.so.0 from rpm libgpg-error-1.49-1.fc40.x86_64
                                                   Module libpam_misc.so.0 from rpm pam-1.6.1-4.fc40.x86_64
                                                   Module libpam.so.0 from rpm pam-1.6.1-4.fc40.x86_64
                                                   Module libattr.so.1 from rpm attr-2.5.2-3.fc40.x86_64
                                                   Module libacl.so.1 from rpm acl-2.3.2-1.fc40.x86_64
                                                   Module libcrypt.so.2 from rpm libxcrypt-4.4.36-10.fc40.x86_64
                                                   Module libeconf.so.0 from rpm libeconf-0.6.2-2.fc40.x86_64
                                                   Module libsemanage.so.2 from rpm libsemanage-3.7-2.fc40.x86_64
                                                   Module libselinux.so.1 from rpm libselinux-3.7-5.fc40.x86_64
                                                   Module libaudit.so.1 from rpm audit-4.0.2-1.fc40.x86_64
                                                   Module libseccomp.so.2 from rpm libseccomp-2.5.5-1.fc40.x86_64
                                                   Module podman from rpm podman-5.3.1-1.fc40.x86_64
                                                   Stack trace of thread 200805:
                                                   #0  0x0000558789bfa4a1 runtime.raise.abi0 (podman + 0x934a1)
                                                   #1  0x0000558789bd6cc8 runtime.sigfwdgo (podman + 0x6fcc8)
                                                   #2  0x0000558789bd51a5 runtime.sigtrampgo (podman + 0x6e1a5)
                                                   #3  0x0000558789bfa7a9 runtime.sigtramp.abi0 (podman + 0x937a9)
                                                   #4  0x00007efdbc0cad00 __restore_rt (libc.so.6 + 0x40d00)
                                                   #5  0x0000558789bfa4a1 runtime.raise.abi0 (podman + 0x934a1)
                                                   #6  0x0000558789bbda26 runtime.fatalpanic (podman + 0x56a26)
                                                   #7  0x0000558789bbc998 runtime.gopanic (podman + 0x55998)
                                                   #8  0x0000558789bd64d8 runtime.sigpanic (podman + 0x6f4d8)
                                                   #9  0x000055878a5a7842 github.com/containers/storage.(*layerStore).load (podman + 0xa40842)
                                                   #10 0x000055878a5a9608 github.com/containers/storage.(*store).newLayerStore (podman + 0xa42608)
                                                   #11 0x000055878a5bc7dd github.com/containers/storage.(*store).getLayerStoreLocked (podman + 0xa557dd)
                                                   #12 0x000055878a5bc935 github.com/containers/storage.(*store).getLayerStore (podman + 0xa55935)
                                                   #13 0x000055878a5cc451 github.com/containers/storage.(*store).Mounted (podman + 0xa65451)
                                                   #14 0x000055878ac99b88 github.com/containers/podman/v5/libpod.(*storageService).UnmountContainerImage (podman + 0x1132b88)
                                                   #15 0x000055878abec81a github.com/containers/podman/v5/libpod.(*Container).unmount (podman + 0x108581a)
                                                   #16 0x000055878abe8865 github.com/containers/podman/v5/libpod.(*Container).cleanupStorage (podman + 0x1081865)
                                                   #17 0x000055878abe965b github.com/containers/podman/v5/libpod.(*Container).cleanup (podman + 0x108265b)
                                                   #18 0x000055878ac6c2ce github.com/containers/podman/v5/libpod.(*Runtime).removeContainer (podman + 0x11052ce)
                                                   #19 0x000055878ac6aad0 github.com/containers/podman/v5/libpod.(*Runtime).RemoveContainer (podman + 0x1103ad0)
                                                   #20 0x000055878ad05948 github.com/containers/podman/v5/pkg/domain/infra/abi.(*ContainerEngine).removeContainer (podman + 0x119e948)
                                                   #21 0x000055878ad06745 github.com/containers/podman/v5/pkg/domain/infra/abi.(*ContainerEngine).ContainerRm.func1 (podman + 0x119f745)
                                                   #22 0x000055878ace297b github.com/containers/podman/v5/pkg/parallel/ctr.ContainerOp.func1 (podman + 0x117b97b)
                                                   #23 0x000055878aade678 github.com/containers/podman/v5/pkg/parallel.Enqueue.func1 (podman + 0xf77678)
                                                   #24 0x0000558789bf8c41 runtime.goexit.abi0 (podman + 0x91c41)
                                                   ELF object binary architecture: AMD x86-64

I have enough RAM, CPU, and disk to spare....

When this error happens, I can't run any podman commands without a core dump, i.e. I can't run podman images, podman ps, and so on... The only solution is to delete the storage folder manually and pull all the images again; then it's back to normal.
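
In case it helps anyone, the reset I end up doing is roughly this (assumes the default rootless storage location; the service name is a placeholder):

```sh
# stop the affected quadlet services first
systemctl --user stop mycontainer.service

# if podman commands still respond, this wipes this user's images/containers/volumes:
podman system reset --force

# when every podman command core-dumps, I fall back to deleting the storage dir:
rm -rf ~/.local/share/containers/storage

# then pull the images again and restart the services
```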

[–] Voroxpete@sh.itjust.works 3 points 3 weeks ago

The practical limit on the number of containers you can run on one system is in the high hundreds or even thousands, depending on how you configure some things and on your available hardware. It's certainly more than you'll ever use unless you get into some auto-scaling swarm config stuff.

The issue is more about resource limits and access to shared resources. I'd start by trying to figure out whether there are specific containers that don't play well together. Bring your setup online slowly, one container at a time, and take note of when things start to get funky. Then start testing combinations of those specific containers. See if there's one you can remove from the mix that suddenly makes things more stable.
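
With quadlets that's easy enough to script; something like this (service names are placeholders) brings them up one at a time so you can spot the first one that makes podman misbehave:

```sh
# start quadlet-generated user services one by one and check podman between each
for svc in db.service app.service proxy.service; do
    systemctl --user start "$svc"
    sleep 30        # give it time to settle
    podman ps       # if this starts hanging or core-dumping, the last service is a suspect
done
```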

[–] zeGoomba@programming.dev 1 points 2 weeks ago

Small update.

It seems to be caused by UserNS=keep-id. When adding it to an image with a lot of files, podman hangs for a while, then crashes while doing its chown. This leaves some layers in an invalid state.
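
Related tip: rather than deleting the whole storage directory every time, podman 5.x has a storage consistency check that can flag the broken layers. I haven't leaned on it heavily, so treat this as a sketch:

```sh
# report inconsistencies (broken or orphaned layers) in the local storage
podman system check

# --repair (if your build supports it) attempts to remove the damaged layers
podman system check --repair
```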