Hello world!
So, we figured out how to build Docker images using a Dockerfile in the last article. I hope everyone now knows how to build a Docker image from scratch.
Before we start this article, let’s get familiar with the Docker image filesystem. How many of you know about the Docker image filesystem?
Docker images are read-only templates that can contain an operating system with an application on top of it. An image is then used to create Docker containers.
But most importantly, we should know how a Docker image works.
The Docker image filesystem is layered. It uses union file systems to combine these layers into a single image.
One of the reasons Docker is so lightweight is because of these layers. When you change a Docker image—for example, update an application to a new version— a new layer gets built. Thus, rather than replacing the whole image or entirely rebuilding, as you may do with a virtual machine, only that layer is added or updated. Now you don’t need to distribute a whole new image, just the update, making distributing Docker images faster and simpler.
Source: Official Docker Documentation
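To make the layering idea concrete, here is a toy sketch in plain shell. It is not how Docker actually mounts images; it only imitates the lookup order of a union filesystem by copying two hypothetical layer directories (layer1 and layer2, names made up for this example) into one merged view, bottom layer first.

```shell
#!/bin/sh
# Toy model of layered images: each directory stands for one read-only layer.
# Assumption: this only imitates "upper layers shadow lower layers";
# Docker itself uses a union filesystem rather than copying files.
work=$(mktemp -d)
mkdir -p "$work/layer1" "$work/layer2" "$work/merged"

# Base layer: an "operating system" file plus version 1 of the application.
echo "base os" > "$work/layer1/os-release"
echo "app v1"  > "$work/layer1/app.txt"

# Upper layer: only the file that changed is stored here.
echo "app v2"  > "$work/layer2/app.txt"

# Apply the layers from bottom to top, so a file in an upper layer
# replaces the file with the same name from a lower layer.
for layer in layer1 layer2; do
  cp -R "$work/$layer/." "$work/merged/"
done

cat "$work/merged/os-release"   # prints: base os (unchanged, from layer1)
cat "$work/merged/app.txt"      # prints: app v2  (shadowed by layer2)
```

Note that layer1 still holds app v1 untouched: the update lives entirely in layer2, which is why only that layer would need to be shipped.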
Deep Diving to understand Docker Image File System
Let’s do some more digging to understand it in practice. I’m using Docker Machine on a Mac and I have started the default environment:
…and I’ve removed everything, so it should not show me any containers, images or intermediate images.
You can see that my Docker host is clean, with no sign of an image or a container.
Have you ever wondered where Docker images are stored?
The first place to look is /var/lib/docker/. We are using Docker Machine, so I’m going to ssh into my Docker Machine and show you:
Right now, nothing is present, as we have not pulled or built any images. Let me pull the RunDeck Docker image from my Docker Hub repository and see what happens:
docker pull bhalothia/docker-rundeck:v1.1
Alright, now the RunDeck image is downloaded and we are good to create containers out of it. We can check what has been updated inside the /var/lib/docker folder on the Docker Machine.
Before that, let’s run docker images and docker images -a:
At this point, you might be wondering what the difference is between docker images and docker images -a.
Docker filesystem layers are stored by default at /var/lib/docker/graph, which is called the graph database. Each of these layers corresponds to an IMAGE ID.
When I ran docker pull bhalothia/docker-rundeck:v1.1, the image was downloaded one layer at a time. Layer ba249489d0b6 was downloaded as a <none>:<none> image, and so on. The final image was bhalothia/docker-rundeck:v1.1, and all the other layers were downloaded as intermediate images. Docker lets you see all intermediate images using the -a flag.
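One way to see exactly which IDs the -a flag adds is to compare the two listings. The sketch below is only an illustration: the function name intermediate_image_ids is made up, and it assumes a Docker version where docker images --all --quiet includes intermediate layers, as in the output discussed here.

```shell
# Print the image IDs that show up only with --all,
# i.e. the intermediate layers hidden from a plain "docker images".
# Hypothetical helper; needs a working docker CLI when it is called.
intermediate_image_ids() {
  # IDs listed without -a: the top-level images.
  top_level=$(docker images --quiet | sort -u)

  docker images --all --quiet | sort -u | while read -r id; do
    # Keep an ID only if it is absent from the top-level list
    # (-x: whole-line match, -F: fixed string, not a regex).
    printf '%s\n' "$top_level" | grep -q -x -F -- "$id" || printf '%s\n' "$id"
  done
}
```

Calling intermediate_image_ids on the Docker host prints one ID per line; an empty result means every image in the cache is a top-level image.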
The layers, one after the other, from ba249489d0b6 up to the final image 241f5ca9955c, form a parent-child hierarchy.
To verify this, you can run this command: cd /var/lib/docker/graph && more layer_id/json
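If you would rather follow the whole chain than open the files one by one, the parent lookup can be scripted. The sketch below assumes the layout described above, with one directory per layer under the graph folder, and additionally assumes that each json file stores its parent's ID in a "parent" field that is missing on the base layer. The function name layer_chain is made up for this example, and the sed-based parsing is a shortcut, not a real JSON parser.

```shell
# Print the chain of layer IDs from the given layer down to the base layer.
# Usage: layer_chain <graph_dir> <layer_id>
# Assumption: <graph_dir>/<layer_id>/json contains "parent":"<id>",
# and the base layer has no "parent" field at all.
layer_chain() {
  graph_dir=$1
  id=$2
  while [ -n "$id" ] && [ -f "$graph_dir/$id/json" ]; do
    echo "$id"
    # Pull out the value of the "parent" key; the result is empty
    # when the key is missing, which ends the loop.
    id=$(sed -n 's/.*"parent"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' \
           "$graph_dir/$id/json" | head -n 1)
  done
}
```

On the Docker host this would be called as layer_chain /var/lib/docker/graph followed by the full ID of the final image, and it should list that image first and the base layer last.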
So, if you are worried about the <none>:<none> images, which are also called intermediate images, you can relax.
But wait… what is a dangling image?
Hey, you need to know about the evil <none>:<none> images as well: the ones that aren’t intermediate layers of any tagged image and that show up when you run docker images.
Docker keeps every image you have used cached on disk, even if no container is currently running from it.
Check all open issues on the Docker repository.
Here’s how you can clean up dangling images:
docker rmi $(docker images --quiet --filter "dangling=true")
It will give you an error if there are no dangling images. If you don’t want that, piping the IDs through xargs might work for you (with GNU xargs, add -r so that docker rmi is skipped when the list is empty):
docker images -qf dangling=true | xargs docker rmi
Either command performs the garbage collection manually.
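If you want the cleanup to behave the same way regardless of which xargs you have, one option is to check for an empty list yourself. This is a minimal sketch, not an official Docker recipe; the function name remove_dangling_images is made up for this example, and it uses only the docker images and docker rmi commands shown above.

```shell
# Remove dangling images, doing nothing when there are none.
# Hypothetical helper; needs a working docker CLI when it is called.
remove_dangling_images() {
  ids=$(docker images --quiet --filter "dangling=true")

  if [ -z "$ids" ]; then
    echo "No dangling images to remove."
    return 0
  fi

  # $ids is left unquoted on purpose so that each ID
  # is passed to docker rmi as a separate argument.
  docker rmi $ids
}
```

Running remove_dangling_images on a clean host just prints the message instead of failing, which makes it easier to drop into a cron job or a build script.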