What Are User Namespaces and Rootless Containers?

Podman — Linux

Containers are really just fancy processes that use kernel gimmicks!

To understand both of these concepts, user namespaces and rootless containers, we first need to understand cgroups and namespaces.

Cgroups and Namespaces

Cgroups (control groups) are a kernel mechanism for limiting and measuring the total resources used by a group of processes running on a system. For example, you can apply CPU, memory, network, or I/O limits to a group of processes.
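To see which cgroup a process belongs to, Linux exposes /proc/<pid>/cgroup, where each line has the form hierarchy-ID:controller-list:cgroup-path. The Python sketch below parses that format. The function names are my own, and the sample paths in the comments are hypothetical, not taken from the systems used in this post.

```python
def parse_cgroup_line(line):
    """Split one /proc/<pid>/cgroup line into its three fields."""
    # maxsplit=2 because the cgroup path itself may contain ':'.
    hierarchy_id, controllers, path = line.rstrip("\n").split(":", 2)
    return {
        "hierarchy_id": int(hierarchy_id),
        # cgroup v2 lines have an empty controller list ("0::/some/path").
        "controllers": [c for c in controllers.split(",") if c],
        "path": path,
    }


def read_cgroups(pid="self", proc_root="/proc"):
    """Return the parsed cgroup memberships of a process ([] if unavailable)."""
    try:
        with open(f"{proc_root}/{pid}/cgroup") as handle:
            return [parse_cgroup_line(line) for line in handle if line.strip()]
    except OSError:
        return []  # not Linux, or /proc is not mounted


if __name__ == "__main__":
    for entry in read_cgroups():
        print(entry)
```

On a cgroup v2 system you would typically see a single entry with hierarchy ID 0. Inside a container the path usually differs from the one you see on the host.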

Namespaces are another kernel mechanism: they let one set of processes see one set of resources while another set of processes sees a different set. For example, you can limit visibility to certain process trees, network interfaces, user IDs, or filesystem mounts. There are several types of namespaces:

  1. Mount
  2. PID (process)
  3. Network
  4. User
    and others (UTS, IPC, cgroup, time)
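As a rough illustration, Linux exposes the namespaces of each process as symbolic links under /proc/<pid>/ns, with targets such as user:[4026531837]. The Python sketch below lists them for the current process. The helper names are my own, the inode number in the comment is only an example, and the function simply returns an empty dict on systems without /proc.

```python
import os
import re

# Link targets under /proc/<pid>/ns look like "user:[4026531837]".
_NS_LINK = re.compile(r"^(?P<kind>[a-z_]+):\[(?P<inode>\d+)\]$")


def parse_ns_link(target):
    """Split a namespace link target into (kind, inode), or None if malformed."""
    match = _NS_LINK.match(target)
    if match is None:
        return None
    return match.group("kind"), int(match.group("inode"))


def list_namespaces(pid="self", proc_root="/proc"):
    """Return {entry name: namespace inode} for one process."""
    ns_dir = os.path.join(proc_root, str(pid), "ns")
    result = {}
    try:
        names = sorted(os.listdir(ns_dir))
    except OSError:
        return result  # not Linux, or /proc is unavailable
    for name in names:
        try:
            parsed = parse_ns_link(os.readlink(os.path.join(ns_dir, name)))
        except OSError:
            continue  # entry vanished or is not readable
        if parsed is not None:
            result[name] = parsed[1]
    return result


if __name__ == "__main__":
    for name, inode in list_namespaces().items():
        print(f"{name:20} {inode}")
```

Two processes that share a namespace report the same inode number for it, so running this on the host and again inside a container is a quick way to see which namespaces differ.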

How Containers Relate to Cgroups and Namespaces

Using cgroups and namespaces together, you can achieve process isolation. Process isolation is the feature on which the whole empire of containers has been built. Linux containers are built with a full set of namespaces so that they can see only their own filesystem, their own processes, their own user IDs, and the network interfaces they have been allowed to access. This lets us create a kind of virtual system that is really just a process.

In the session below, I run an Ubuntu container with Docker. Running it as an unprivileged user did not work (a drawback), so I ran the command again using sudo. Once inside the container, I ran the id and ps commands. Did you notice the output? This process has its own environment.

The host system has one process tree, but once you are inside the container process you see a different process tree. This is process isolation in a nutshell!

arun@ubuntu:~$ id
uid=1000(arun) gid=1000(arun) groups=1000(arun),4(adm),20(dialout),24(cdrom),25(floppy),27(sudo),29(audio),30(dip),44(video),46(plugdev),108(lxd),114(netdev)
arun@ubuntu:~$ sudo docker run --rm -it ubuntu bash
root@f16396d5449f:/# id
uid=0(root) gid=0(root) groups=0(root)
root@f16396d5449f:/# ps
PID TTY TIME CMD
1 pts/0 00:00:00 bash
10 pts/0 00:00:00 ps
Docker Running Ubuntu Container

Now I am going to repeat the same steps, but this time using Podman (another container engine). Podman also lets unprivileged users run containers, so I don't have to use sudo. Once inside the container, you see output similar to before, and the user is root again. But there is a difference: this root is not the real root. The process actually runs on the host with the privileges of the standard user who started the container (user: arun). This is an example of a rootless container. I will explain a little more later in this post.

Podman Running Ubuntu Container

User Namespaces

We have discussed what namespaces are; now let's move on to user namespaces. User namespaces are a Linux feature used to separate user IDs and group IDs between the host and containers.

With Podman you can run containers as either a privileged or an unprivileged user. This is a great feature, because one of the best ways to limit privilege-escalation attacks from within a container is to run the container's applications as unprivileged users. A user namespace lets you specify a user identifier (UID) and group identifier (GID) mapping for your containers.

Question: where does this mapping live?

Answer: in the /etc/subuid and /etc/subgid files.

subuid and subgid

How do subuid and subgid work?

The entry for user arun (arun:100000:65536) gives that user a range of 65536 subordinate IDs on the host, starting at 100000, which can be mapped to user IDs inside a container. This range is reserved by default. As you add more users to the system, the file is amended and each new user gets the next range, so the ranges do not overlap (for example, ak starts at 165536).

Look at the details below, where I run a container as a standard user, and notice how the UID of the container process is mapped to that standard user. It should give you a better idea.
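To make the arithmetic concrete, here is a minimal Python sketch that parses lines in the name:start:count format used by /etc/subuid and /etc/subgid and computes the last ID of each range. The SubIdRange type and parse_subid function are my own names for illustration. The sample text is the file content shown in this post.

```python
from typing import NamedTuple


class SubIdRange(NamedTuple):
    name: str    # user that owns the range
    start: int   # first subordinate ID on the host
    count: int   # number of IDs in the range

    @property
    def last(self):
        """Last host ID that belongs to this range."""
        return self.start + self.count - 1


def parse_subid(text):
    """Parse name:start:count lines into SubIdRange tuples, skipping blank lines."""
    ranges = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, start, count = line.split(":")
        ranges.append(SubIdRange(name, int(start), int(count)))
    return ranges


SAMPLE = """\
arun:100000:65536
ak:165536:65536
"""

if __name__ == "__main__":
    for r in parse_subid(SAMPLE):
        print(f"{r.name}: host IDs {r.start}..{r.last}")
```

For the sample, arun owns host IDs 100000 to 165535 and ak owns 165536 to 231071, so the second range begins exactly where the first one ends.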

[arun@centos ~]$ id
uid=1000(arun) gid=1000(arun) groups=1000(arun),4(adm),190(systemd-journal) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
[arun@centos ~]$ cat /etc/subuid
arun:100000:65536
ak:165536:65536
[arun@centos ~]$ cat /etc/subgid
arun:100000:65536
ak:165536:65536
[arun@centos ~]$ cat /proc/self/uid_map
0 0 4294967295
[arun@centos ~]$ podman run --rm -it ubuntu bash
# container terminal
root@bfda7167e840:/# id
uid=0(root) gid=0(root) groups=0(root)

root@bfda7167e840:/# cat /proc/self/uid_map
0 1000 1
1 100000 65536
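Each line of uid_map has three fields: the first ID inside the namespace, the first ID it maps to outside (here, on the host), and the length of the range. The Python sketch below translates a container UID to a host UID using the two maps shown above. The function names are my own, and it assumes the file is read from inside the container, so that the second field refers to the parent (host) namespace.

```python
def parse_id_map(text):
    """Parse uid_map/gid_map text into (inside_start, outside_start, length) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            entries.append(tuple(int(field) for field in fields))
    return entries


def to_host_id(inside_id, entries):
    """Translate an ID inside the namespace to the outside ID, or None if unmapped."""
    for inside_start, outside_start, length in entries:
        if inside_start <= inside_id < inside_start + length:
            return outside_start + (inside_id - inside_start)
    return None


# Map seen inside the rootless Podman container started by user arun.
ROOTLESS_MAP = parse_id_map("0 1000 1\n1 100000 65536\n")
# Identity map seen on the host before starting the container.
HOST_MAP = parse_id_map("0 0 4294967295\n")

if __name__ == "__main__":
    for uid in (0, 1, 33, 65536, 65537):
        print(f"container UID {uid} -> host UID {to_host_id(uid, ROOTLESS_MAP)}")
```

So root (UID 0) inside the rootless container is really UID 1000 (arun) on the host, container UIDs 1 to 65536 land in arun's subordinate range from /etc/subuid, and anything beyond that range has no mapping at all.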

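Now let's see another example, this time running as root. Here the container's uid_map is the same identity mapping as on the host, so root in the container is real root.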

[root@centos ~]# id
uid=0(root) gid=0(root) groups=0(root) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023

[root@centos ~]# cat /proc/self/uid_map
0 0 4294967295

[root@centos ~]# podman run --rm -it ubuntu bash
# container terminal
root@e68aa59f54ef:/# id
uid=0(root) gid=0(root) groups=0(root)

root@e68aa59f54ef:/# cat /proc/self/uid_map
0 0 4294967295

root@e68aa59f54ef:/# ps
PID TTY TIME CMD
1 pts/0 00:00:00 bash
10 pts/0 00:00:00 ps

Rootless containers

I believe you now have some idea of how your container process is affected when you run a container as a standard user versus as root.

Running a containerised process as root always poses a threat, and we want to avoid it. Podman helps here: it lets you run containers as your regular user, as we saw in the examples above. Such containers are rootless containers.

This adds another layer to container security. If the container engine, runtime, or container orchestrator (e.g. K8s) is compromised, the attacker won't gain root privileges on the host.

That's it for this post. I have covered Podman in more detail in another post, which you can read to build a better understanding of Podman.

In quest of understanding How Systems Work !
