Most people have probably heard of Podman: an OCI container management tool very similar to Docker, with some very interesting differences, like being able to run completely rootless, easy integration with systemd, and the use of pods, to mention a few.

The most common way of running Podman containers is podman run, easily translated from docker run.
For example, to start a web server on port 8080, run:
podman run --name my_webserver -dt -p 8080:80/tcp docker.io/nginx
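A quick way to check that it came up (podman ps is covered more below; the curl check is just an illustration):

podman ps
curl -I http://localhost:8080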

But I’ve always preferred running my containers from a declared file like docker-compose.yml, for easier portability and backups, so the run method never appealed to me. podman generate systemd is interesting, but it still requires starting the container with podman run and then converting everything to systemd service files.

Enter Quadlets

Then, a few weeks ago, I stumbled upon Quadlets: a way to write the systemd services directly, through reasonable templating with a [Container] section. That was the answer I needed to migrate away from docker-compose.yml.

So, for example, a simple docker-compose.yml for homer could look like this:

---
services:
  homer:
    image: b4bz/homer
    container_name: homer
    volumes:
      - /your/local/assets/:/www/assets
    ports:
      - 8080:8080
    user: 1000:1000 
    environment:
      - TZ=Europe/Stockholm

Rewritten as a homer.container quadlet, it would look like this:

[Container]
ContainerName=homer
Image=docker.io/b4bz/homer

Volume=/your/local/assets:/www/assets
PublishPort=8080:8080
Environment=TZ=Europe/Stockholm
User=1000

[Service]
Restart=on-failure

[Install]
WantedBy=default.target

Then run systemctl --user daemon-reload to generate the service, and start it with systemctl --user start homer.
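As a quick sketch of the whole cycle (the rootless quadlet directory is covered in more detail below):

mkdir -p ~/.config/containers/systemd/
cp homer.container ~/.config/containers/systemd/
systemctl --user daemon-reload
systemctl --user start homer
systemctl --user status homer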


Is it really that simple? Well, both yes and no. It’s that simple if you just run Podman as root, but my goal and my reason for swapping to Podman is to run it rootless, and while that’s pretty easy to get going, there are some hurdles to work through.


Setting up the environment/server/VM for rootless podman

I’ll call my unprivileged user servuser in all the examples.

  • Quadlets require Podman version 4.4+ and cgroups v2
    • Check with podman --version and podman info --format "{{.Host.CgroupsVersion}}"
  • Rootless containers require subordinate UIDs/GIDs mapped in /etc/subuid and /etc/subgid.
    • If not set, add servuser:100000:65536 to both files as root/sudo (see the command summary after this list).
  • Create a path to store the quadlet service files for a rootless user:
    • ~/.config/containers/systemd/ <- This is my preferred choice.
    • Other options: $XDG_RUNTIME_DIR/containers/systemd/, /etc/containers/systemd/users/$(UID) or /etc/containers/systemd/users/
  • Enable lingering (this allows the services to keep running after logout):
    • sudo loginctl enable-linger servuser
  • If hosting something that requires privileged ports (like a reverse proxy on 80/443 or DNS on 53), you have to allow it:
    • Add net.ipv4.ip_unprivileged_port_start=80 to /etc/sysctl.conf
    • Enforce the new setting with sysctl -p /etc/sysctl.conf
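Summarised as commands, a rough sketch of the setup above (privileged steps as root/sudo, with the subuid range being the example one from this list):

podman --version
podman info --format "{{.Host.CgroupsVersion}}"
# Only add the subuid/subgid line if the user isn't already mapped:
echo "servuser:100000:65536" | sudo tee -a /etc/subuid /etc/subgid
mkdir -p ~/.config/containers/systemd/
sudo loginctl enable-linger servuser
# Only needed for privileged ports:
echo "net.ipv4.ip_unprivileged_port_start=80" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p /etc/sysctl.conf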

Create a container - with a network to connect containers

Create a network unit:
~/.config/containers/systemd/stacknet.network

[Unit]
Description=Stacknet network
# This is systemd syntax to wait for the network to be online before starting this service:
After=network-online.target
 
[Network]
NetworkName=stacknet
# These are optional, podman will just create it randomly otherwise.
Subnet=10.10.0.0/24
Gateway=10.10.0.1
DNS=9.9.9.9
 
[Install]
WantedBy=default.target
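After a daemon-reload, the network can be started and checked. Quadlet typically names the generated unit stacknet-network.service, but it's worth verifying on your system:

systemctl --user daemon-reload
systemctl --user start stacknet-network.service
podman network ls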

Then create a container to use the network:
~/.config/containers/systemd/nginx.container

[Unit]
Description=Nginx container

[Container]
ContainerName=nginx
Image=docker.io/nginx

Volume=%h/container_volumes/nginx/serve:/usr/share/nginx/html:Z,U
# And here we define the shared network:
Network=stacknet.network
PublishPort=80:80
PublishPort=443:443
Environment=TZ=Europe/Stockholm

[Service]
Restart=on-failure

[Install]
WantedBy=default.target

  • Volume:
    • %h is systemd syntax for $HOME. :U tells Podman to chown the source volume to match the default UID+GID within the container.
    • SELinux: :z sets a shared content label, while :Z sets a private, unshared label that only this container can use.
  • Environment:
    • Just use the usual syntax but with Environment= prefix.
  • After=network-online.target - Service waits for the host network before starting.
  • WantedBy=default.target - Service is started at boot.

After that, a second container can be added with the same Network=stacknet.network option, and options like After=nginx.service or Wants=nginx.service can be used to set up dependencies and ordering.
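As a minimal sketch, a second quadlet on the same network could look something like this (the whoami image and names are just illustrative, not part of my actual stack):

~/.config/containers/systemd/whoami.container

[Unit]
Description=Whoami test container
# Start after (and together with) the nginx container:
After=nginx.service
Wants=nginx.service

[Container]
ContainerName=whoami
# Hypothetical example image, swap in whatever you actually run:
Image=docker.io/traefik/whoami:latest
Network=stacknet.network

[Service]
Restart=on-failure

[Install]
WantedBy=default.target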

This could also be handled with pods - though I’ll leave that for another day.

Setting up automatic updates

Setting up podman auto-update keeps your containers fresh.

podman auto-update pulls down new container images and restarts the containers configured for auto-updates. If restarting a systemd unit after updating its image fails, it rolls back to the previous image and restarts the unit once more.

  • Ensure the two necessary systemd units are enabled and started: systemctl --user enable podman-auto-update.{service,timer} --now
  • Then add AutoUpdate=registry to the quadlet’s [Container] section (see the sketch below).
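In practice it amounts to something like this; podman auto-update --dry-run just previews which containers would be updated:

# Add AutoUpdate=registry to the [Container] section of e.g. homer.container, then:
systemctl --user daemon-reload
systemctl --user restart homer
podman auto-update --dry-run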

It might be worth delving deeper into options like Notify=healthy and HealthCmd= to have a surer way of assessing whether a service started successfully.
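As a hedged sketch, the relevant [Container] keys could look like this (Notify=healthy needs a fairly recent Podman, and the curl healthcheck assumes curl exists inside the image):

# Only report the unit as started once the healthcheck passes:
Notify=healthy
HealthCmd=curl -f http://localhost:80/ || exit 1
HealthInterval=30s
HealthRetries=3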



Some more advanced and in-depth settings and troubleshooting

It’s not always straightforward, and the documentation is still rather scarce.
Looking at systemctl --user status containername is usually my first go-to, but it’s also worth looking at the logs with journalctl --user -xeu containername.service.

When trying to start a new quadlet, Failed to start name.service: Unit name.service not found. is usually due to a typo or syntax error in the .container file, which sadly isn’t pointed out.
Make frequent use of podman ps and podman stats while troubleshooting. An alias, e.g. alias scu='systemctl --user', is also neat.
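One way to hunt down such syntax errors is to run the quadlet generator by hand in dry-run mode; on Fedora the binary usually lives under /usr/libexec/podman/, though the exact path and flags may differ between versions:

/usr/libexec/podman/quadlet -dryrun -user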

You can manually inspect the generated services at /run/user/1000/systemd/generator/containername.service (with 1000 being your UID).

More info on subuid and usermap

Containers map their internal users to the subuids of the user running the container. In the setup above we looked at /etc/subuid; this file sets a range of subuids for each user.
The syntax is username:start_uid:uid_count, so in my example it was set to servuser:100000:65536: starting at 100000, with a (default) range of 65536 subordinate UIDs to use as isolated namespaces within my user namespace.
Read more general info

You’ll encounter this when mapping bind mounts into your containers, as the mapped user will own the files.
Root within the container is mapped to the running user outside, so files created by root within will be owned by servuser outside.
But if an internal user of e.g. 1000 creates files in a mount, they’ll be owned by subuid 100999 on the outside: 100000 + 1000 - 1 (minus one because root is mapped to servuser, so the subuid range starts at container UID 1). This can be manipulated in a few ways to get the right mapping, depending on what you need, with options like UserNS=keep-id and UIDMap.
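You can inspect the actual mapping of your rootless user namespace with podman unshare; the columns are ID inside the namespace, ID on the host, and range:

podman unshare cat /proc/self/uid_map
podman unshare cat /proc/self/gid_map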

I’ll explain a complex case of this, with a linuxserver.io image; those usually do primary setup as root and then run the actual service as UID 1000.

Let’s look at linuxserver/nginx as an example. The docker-compose.yml would look like this:

---
services:
  nginx:
    image: lscr.io/linuxserver/nginx:latest
    container_name: nginx
    environment:
      - PUID=1000
      - PGID=1000
      - TZ=Europe/Stockholm
    volumes:
      - /home/servuser/container_volumes/nginx/config:/config
    ports:
      - 80:80
      - 443:443
    restart: unless-stopped

Rewritten as a quadlet:

[Container]
ContainerName=nginx
Image=lscr.io/linuxserver/nginx:latest

Volume=%h/container_volumes/nginx/config:/config:Z,U
PublishPort=80:80
PublishPort=443:443
Environment=TZ=Europe/Stockholm

Environment=PUID=1000
Environment=PGID=1000

UIDMap=1000:0:1
UIDMap=0:1:1000
UIDMap=1001:1001:64536

[Service]
Restart=on-failure

[Install]
WantedBy=default.target

  • UIDMap
    • 1000:0:1 - 1000 within, intermediate 0, which automaps to my servuser; range 1, so only that one.
    • 0:1:1000 - 0 within, intermediate 1 (my first subuid), range 1000. So internal 0-999 is mapped to intermediate 1-1000, i.e. my subuids 100000-100999.
    • 1001:1001:64536 - 1001 within, intermediate 1001, range 64536 (the rest of my subuid range, that is).

That way, 1000 within is mapped to my servuser, while the full range of 0-65536 (except 1000) is mapped to my subuids 100000-165535.
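A quick sanity check is to look at the ownership of the bind mount on the host; files created by UID 1000 inside the container should show up as owned by servuser:

ls -ln ~/container_volumes/nginx/config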

You can also choose to have separate users on the same host, to further segment the namespaces between containers, each user with their own subuid range.

Devices and SELinux

As I prefer using Fedora Server for Podman, I’ve also encountered some constraints with SELinux.
One example of this is when I wanted to map the /dev/net/tun device inside my VPN container: SELinux blocked permissions to that device for the rootless container.

This can be solved with some SELinux policy creation (inspired by: source).

My container is called gluetun. To generate a new policy from the denials in audit.log, follow these steps as root/sudo:

  • # grep gluetun /var/log/audit/audit.log | audit2allow -M cont_tun
  • Then inspect the generated rules: # cat cont_tun.te
module cont_tun 1.0;

require {
        type tun_tap_device_t;
        type container_file_t;
        type container_t;
        class chr_file { getattr ioctl open read write };
        class sock_file watch;
}

#============= container_t ==============
allow container_t container_file_t:sock_file watch;

#!!!! This avc can be allowed using the boolean 'container_use_devices'
allow container_t tun_tap_device_t:chr_file { getattr ioctl open read write };

  • If it looks correct, install the module: # semodule -i cont_tun.pp

You could also copy someone else’s rule, like the .te file above, and convert it to a policy:

  • Convert it to a policy module # checkmodule -M -m -o gluetun_policy.mod gluetun_policy.te
  • Compile the policy # semodule_package -o gluetun_policy.pp -m gluetun_policy.mod and install it # semodule -i gluetun_policy.pp
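Alternatively, as the #!!!! hint in the generated .te file suggests, this particular denial can also be allowed with an SELinux boolean instead of a custom module:

sudo setsebool -P container_use_devices 1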

Then restart the container and check the logs to see whether the permission errors are gone!



That’s it for now.

Read more!

Further exploration

  • Experiment with pods, pros and cons?
  • Create and manage with Ansible.
  • Host on a more container focused OS like Fedora CoreOS or openSUSE MicroOS or even