Doing Things in Docker

How to Modify Docker Images

Here's a detailed explanation of how to modify Docker images based on your requirements.

I presume you are somewhat familiar with Docker and know the basics, such as running Docker containers.

In previous articles we discussed updating Docker containers and writing Dockerfiles.

What exactly is modifying a docker image?

A container image is built in layers; in other words, it is a collection of layers, and each Dockerfile instruction contributes one. For example, consider the following Dockerfile:

FROM alpine:latest

RUN apk add --no-cache python3

ENTRYPOINT ["python3", "-c", "print('Hello World')"]

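This Dockerfile has three instructions, so the image built from it gets three layers: one brought in by FROM, one from RUN, and one from ENTRYPOINT. Strictly speaking, only instructions that change the filesystem (such as RUN, COPY and ADD) produce a layer with content; instructions like ENTRYPOINT and CMD only record metadata and show up with a size of 0B.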

You can confirm that by building the image:

docker image build -t dummy:0.1 .

Then run the command docker image history on the built image:

articles/Modify a Docker Image on  modify-docker-images [?] took 12s 
❯ docker image history dummy:0.1 
IMAGE          CREATED          CREATED BY                                      SIZE      COMMENT
b997f897c2db   10 seconds ago   /bin/sh -c #(nop)  ENTRYPOINT ["python3" "-c…   0B        
ee217b9fe4f7   10 seconds ago   /bin/sh -c apk add --no-cache python3           43.6MB    
28f6e2705743   35 hours ago     /bin/sh -c #(nop)  CMD ["/bin/sh"]              0B        
<missing>      35 hours ago     /bin/sh -c #(nop) ADD file:80bf8bd014071345b…   5.61MB

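Ignore the last row, marked <missing>. It is the filesystem layer of the alpine base image, which was built elsewhere, so there is no local image ID for it. The three rows above it are the three layers counted earlier.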
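To see which rows carry actual filesystem content, you can count the entries whose SIZE is not 0B. The following is a minimal sketch, not an official recipe: count_fs_layers is a made-up helper name, and it assumes the sizes arrive one per line, which is what docker image history --format '{{.Size}}' prints. The sample input is the SIZE column from the listing above.

```shell
# Count history rows that carry filesystem content (SIZE other than 0B).
# Reads one size per line on stdin, for example from:
#   docker image history --format '{{.Size}}' dummy:0.1
# count_fs_layers is a hypothetical helper, not a Docker command.
count_fs_layers() {
    grep -v -c -x '0B'
}

# The SIZE column from the listing above, top row first.
printf '%s\n' 0B 43.6MB 0B 5.61MB | count_fs_layers
```

For this image the answer is 2: the RUN row and the base image's ADD row. The ENTRYPOINT and CMD rows are metadata only.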

Each of these layers is read-only. Because no process in a running container can modify the contents of the image, the layers can be shared by many containers without keeping a copy for each one. For a container's processes to be able to read and write, another layer is added on top of the read-only layers when the container is created. This layer is writable and is not shared with other containers.

The downside of this r/w layer is that changes made in it are not persistent: they are gone once the container is removed. You can use volumes to persist some data, but sometimes you may need to add a layer before an existing layer, delete a layer from an image, or simply replace a layer. These are the reasons one might want to modify an existing Docker image.
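The sharing described above can be pictured with a toy model in plain shell. This is only an illustration under loud assumptions: each "layer" is an ordinary directory, lookup is a made-up helper, and real Docker storage drivers work quite differently under the hood. The point is just that a container's writable layer shadows the image without changing it.

```shell
# Toy model only: each "layer" is a plain directory, listed bottom to top.
# lookup is a hypothetical helper, not part of Docker.
# It prints FILE from the topmost layer that contains it.
lookup() {
    file=$1
    shift
    found=""
    for layer in "$@"; do
        if [ -f "$layer/$file" ]; then
            found="$layer/$file"
        fi
    done
    if [ -n "$found" ]; then
        cat "$found"
    fi
}

work=$(mktemp -d)
mkdir "$work/image" "$work/rw_a" "$work/rw_b"

# The read-only image layer holds the original file.
echo "from the image" > "$work/image/motd"
# Container A overwrites it; the new copy lands in A's own writable layer.
echo "changed in container A" > "$work/rw_a/motd"

lookup motd "$work/image" "$work/rw_a"   # container A's view
lookup motd "$work/image" "$work/rw_b"   # container B's view
cat "$work/image/motd"                   # the shared image layer itself
```

Container A sees its own change, while container B and the image layer still show the original file.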

In this article, I'm going to cover all the cases I mentioned above, using different methods.

Methods of modifying a docker image

There are two ways you can modify a docker image.

Through Dockerfiles.
Using the command docker container commit.

I'll explain both methods and, along the way, point out which use cases each one suits best.

Method 1: Modifying docker image through the Dockerfile

Modifying a Docker image essentially means modifying the layers of the image. Since each Dockerfile instruction represents one layer, changing a line of the Dockerfile changes the corresponding layer of the image as well.

So to add a layer to the image, you add another instruction to the Dockerfile. To remove a layer, you remove a line, and to change a layer, you change the line accordingly.

There are two ways you can use a Dockerfile to modify an image.

Using the image that you want to modify as a base image and building a child image.
Modifying the actual Dockerfile of the image you want to change.

Let me explain which method should be used when, and how.

1. Using an image as a base image

This is when you take the image you want to modify and add layers to it to build a new child image. Unless an image is built from scratch, every image is a modification of some parent image.

Consider the previous Dockerfile. Say the image built from that Dockerfile is named dummy:0.1. Now suppose I need to use Perl instead of Python3 to print "Hello World", but I don't want to remove Python3. I could just use the dummy:0.1 image as the base image (since Python3 is already there) and build from that, like the following:

FROM dummy:0.1

RUN apk add --no-cache perl

ENTRYPOINT ["perl", "-e", "print \"Hello World\n\""]

Here I'm building on top of dummy:0.1, adding more layers to it as I see fit.

This method is not much help if your intention is to change or delete an existing layer. For that, you need to follow the next method.

2. Modifying the image's Dockerfile

Since the existing layers of an image are read-only, you cannot directly modify them through a new Dockerfile. With the FROM instruction in a Dockerfile, you can only take an image as a base and add layers on top of it.

Some tasks may require altering an existing layer. You can do that with the previous method and a bunch of counteracting RUN instructions (deleting files, removing or replacing packages added in a previous layer), but it isn't an ideal solution or what I'd recommend, because it adds extra layers and leaves the image much larger than it needs to be.

A better method is to not use the image as a base image, but to change that image's actual Dockerfile. Consider the previous Dockerfile again: what if I did not need to keep Python3 in the image and wanted to replace the Python3 package and command with the Perl ones?

Following the previous method, I'd have to create a new Dockerfile like so:

FROM dummy:0.1

RUN apk del python3 && apk add --no-cache perl

ENTRYPOINT ["perl", "-e", "print \"Hello World\n\""]

Once built (here as dummy:0.2), this image has a total of five layers, again not counting the <missing> row:

articles/Modify a Docker Image on  modify-docker-images [?] took 3s 
❯ docker image history dummy:0.2
IMAGE          CREATED          CREATED BY                                      SIZE      COMMENT
2792036ddc91   10 seconds ago   /bin/sh -c #(nop)  ENTRYPOINT ["perl" "-e" "…   0B        
b1b2ec1cf869   11 seconds ago   /bin/sh -c apk del python3 && apk add --no-c…   34.6MB    
ecb8694b5294   3 hours ago      /bin/sh -c #(nop)  ENTRYPOINT ["python3" "-c…   0B        
8017025d71f9   3 hours ago      /bin/sh -c apk add --no-cache python3 &&    …   43.6MB    
28f6e2705743   38 hours ago     /bin/sh -c #(nop)  CMD ["/bin/sh"]              0B        
<missing>      38 hours ago     /bin/sh -c #(nop) ADD file:80bf8bd014071345b…   5.61MB

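Also, the size of the image is 83.8 MB. The 43.6 MB Python3 layer is still stored in the image, even though a later layer deletes the package.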

articles/Modify a Docker Image on  modify-docker-images [?] 
❯ docker images
REPOSITORY   TAG       IMAGE ID       CREATED          SIZE
dummy        0.2       2792036ddc91   19 seconds ago   83.8MB

Now, instead of doing that, take the initial Dockerfile and replace the Python3 lines with their Perl equivalents, like so:

FROM alpine:latest

RUN apk add --no-cache perl

ENTRYPOINT ["perl", "-e", "print \"Hello World\n\""]

Built as dummy:0.3, the image now has only three layers, and its size has dropped to 40.2 MB.

articles/Modify a Docker Image on  modify-docker-images [?] took 3s 
❯ docker image history dummy:0.3
IMAGE          CREATED         CREATED BY                                      SIZE      COMMENT
f35cd94c92bd   9 seconds ago   /bin/sh -c #(nop)  ENTRYPOINT ["perl" "-e" "…   0B        
053a6a6ba221   9 seconds ago   /bin/sh -c apk add --no-cache perl              34.6MB    
28f6e2705743   38 hours ago    /bin/sh -c #(nop)  CMD ["/bin/sh"]              0B        
<missing>      38 hours ago    /bin/sh -c #(nop) ADD file:80bf8bd014071345b…   5.61MB    

articles/Modify a Docker Image on  modify-docker-images [?] 
❯ docker images
REPOSITORY   TAG       IMAGE ID       CREATED          SIZE
dummy        0.3       f35cd94c92bd   29 seconds ago   40.2MB

Image successfully changed.
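The two image sizes can be cross-checked against the SIZE column of the history listings. Here is a rough sketch of that arithmetic; sum_mb is a made-up helper, and the numbers are copied by hand from the listings above.

```shell
# Add up sizes given in MB, one per line, and print the total with two decimals.
# sum_mb is a hypothetical helper used only for this check.
sum_mb() {
    awk '{ total += $1 } END { printf "%.2f\n", total }'
}

# dummy:0.2: alpine base + Python3 layer + "apk del python3 && apk add perl" layer
printf '%s\n' 5.61 43.6 34.6 | sum_mb

# dummy:0.3: alpine base + Perl layer
printf '%s\n' 5.61 34.6 | sum_mb
```

The totals, 83.81 and 40.21, line up with the 83.8MB and 40.2MB reported by docker images. Deleting Python3 in a later layer did not give back the 43.6 MB of the layer that installed it.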

💡

Using the image as a base image is the better choice when you only want to add layers on top of the existing ones. It doesn't help much when you need to delete, replace, or reorder existing layers. That is where editing the Dockerfile shines.

Method 2: Modifying image using docker commit

There is another method: you can take a snapshot of a running container and turn it into an image of its own.

Let's build an image identical to dummy:0.1, but this time without using a Dockerfile. Since I used alpine:latest as the base of dummy:0.1, spin up a container from that image.

docker run --rm --name alpine -ti alpine ash

Now, inside the container, add the Python3 package with apk add --no-cache python3. Once done, leave the container running (it was started with --rm, so exiting would delete it), open a new terminal window and run the following command (or something similar):

docker container commit --change='ENTRYPOINT ["python3", "-c", "print(\"Hello World\")"]' alpine dummy:0.4

With the --change flag I'm adding a Dockerfile instruction to the new dummy:0.4 image (in this case, the ENTRYPOINT instruction).

With the docker container commit command, you basically convert the container's outermost r/w layer into a r/o layer, append it to the existing image's layers, and create a new image. This method is more intuitive and interactive, so you may want to use it instead of Dockerfiles, but do understand that it isn't very reproducible: the steps you ran inside the container are not recorded anywhere. The same rules also apply to removing or altering existing layers. Adding a layer just to remove or alter something done in a previous layer is not the best idea, at least in most cases.
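If you do go the commit route, scripting the steps at least keeps a record of what was done. The following is only a sketch under assumptions: run and snapshot_with_python are made-up helpers, py-snapshot is a hypothetical container name, and the DRY_RUN switch is my own addition so the commands can be inspected without a Docker daemon. It also uses docker container diff to list what the writable layer contains before it is committed.

```shell
# Sketch only: run and snapshot_with_python are hypothetical helpers.
# With DRY_RUN=1 the docker commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

snapshot_with_python() {
    name=$1      # container name (made up for this example)
    target=$2    # tag for the new image
    # Start a long-lived container from alpine so commands can be run in it.
    run docker run -d --name "$name" alpine sleep 3600
    # Make the change in the container's writable layer.
    run docker exec "$name" apk add --no-cache python3
    # List what changed relative to the image (A = added, C = changed, D = deleted).
    run docker container diff "$name"
    # Freeze the writable layer into a new image.
    run docker container commit "$name" "$target"
    # Remove the container afterwards.
    run docker rm -f "$name"
}

DRY_RUN=1
snapshot_with_python py-snapshot dummy:0.4
```

Set DRY_RUN=0 to run the commands for real, which needs a running Docker daemon. The --change option shown earlier can be added to the commit step to set the ENTRYPOINT.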

That concludes this article. I hope it was helpful to you. If you have any questions, do comment down below.

Debdut Chakraborty

I'm me.

@imdebdut kolkata
