Docker 104 — Dockerfile instructions in detail

Nitin Manju
8 min read · Aug 18, 2023

In the previous chapter, we understood how the build process works and we also created a simple Dockerfile. In this section, we will discuss a bit more about Dockerfile capabilities and the various commands that we can use. We will only discuss the bare minimum required to build a working Dockerfile.

Dockerfile instructions explained

Below we will discuss the most commonly used Dockerfile instructions, which will help you build a basic as well as an advanced Dockerfile. Let's begin!

FROM

So far we have mainly been talking about the FROM instruction. Its syntax is as below:

FROM [--platform=<platform>] <image> [AS <name>]

FROM can be used multiple times in a Dockerfile. Each FROM starts a new build stage and sets the base image for the instructions that follow, until the next FROM instruction is issued. AS <name> is an optional argument that gives the build stage a name, which can be referenced at a later point, for example in a subsequent FROM or in COPY --from=<name>. The --platform=<platform> flag is also optional and selects the platform of the image when the FROM instruction references a multi-platform image. Values for this flag include linux/amd64, linux/arm64, and windows/amd64. FROM is usually the first instruction in a Dockerfile. The only instruction that can precede it is ARG (explained later), which declares variables that the FROM line can then use.
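As a minimal sketch of how these pieces fit together, the Dockerfile below declares an ARG before FROM, names the first stage with AS, and starts a second stage with another FROM. The BASE_TAG variable, the builder stage name, and the /artifact.txt path are made up for illustration.

```dockerfile
# Declared before the first FROM, so it can be used in the FROM lines
ARG BASE_TAG=3.19

# First build stage, named "builder"
FROM alpine:${BASE_TAG} AS builder
RUN echo "built in the builder stage" > /artifact.txt

# Second build stage: starts again from a fresh base image
FROM alpine:${BASE_TAG}
# Copy a single file out of the named stage
COPY --from=builder /artifact.txt /artifact.txt
CMD ["cat", "/artifact.txt"]
```

Only the last stage ends up in the final image, so a container started from it should print the line written in the builder stage.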

CMD

The CMD instruction provides the default command and parameters that are executed when a container is run. Only one CMD instruction takes effect in a Dockerfile: if several are present, only the last one is used. Note that the CMD command and its parameters can be overridden when the container is run, by passing a different command to docker run after the image name.

There are three ways to write the CMD instruction. The first is the JSON array format, also called the exec form. Here the instruction runs the executable directly and passes param1 and param2 to it.

CMD ["executable","param1","param2"]

CMD can also be written in the shell form. Here, command is the executable and param1 and param2 are passed to it as parameters. The whole line is run through a shell (/bin/sh -c by default on Linux).

CMD command param1 param2

There is a third way, which provides only default parameters and works hand in hand with the ENTRYPOINT instruction (explained next).

CMD ["param1","param2"]

Here the executable is provided by the ENTRYPOINT instruction and CMD only provides the default parameters. For this to work, the Dockerfile needs both a CMD and an ENTRYPOINT instruction, and both should be written in the JSON array format.
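A small sketch of this third form, assuming the alpine base image and using echo as a stand-in for a real application:

```dockerfile
FROM alpine
# The executable and its fixed first argument
ENTRYPOINT ["echo", "Hello"]
# Default parameter, appended to the ENTRYPOINT
CMD ["World"]

# docker run <image>          prints: Hello World
# docker run <image> Docker   prints: Hello Docker
```

Anything passed after the image name replaces the CMD parameters, while the ENTRYPOINT stays in place.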

ENTRYPOINT

The ENTRYPOINT instruction specifies the command that is executed when a container starts. It sets a default executable for the container, and when the exec form is used, any arguments given to docker run are appended to it. Unlike CMD, ENTRYPOINT is not replaced by those arguments. Overriding it requires the --entrypoint flag of docker run. This instruction is often used to define the primary application or process that runs within the container, which keeps container behaviour consistent and predictable across environments.

The syntax for the ENTRYPOINT instruction looks as below:

ENTRYPOINT ["executable", "param1", "param2"]

Just like the CMD instruction, it can also be used in the shell form:

ENTRYPOINT command param1 param2

Only the last ENTRYPOINT instruction in the Dockerfile will have an effect.
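To illustrate, here is a sketch with two ENTRYPOINT instructions, again using echo on the alpine image as a placeholder:

```dockerfile
FROM alpine
# Ignored, because a later ENTRYPOINT follows
ENTRYPOINT ["echo", "first"]
# Only this one takes effect
ENTRYPOINT ["echo", "second"]

# docker run <image>                     prints: second
# docker run --entrypoint date <image>   runs date instead of echo
```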

WORKDIR

The WORKDIR instruction sets the working directory for the instructions that follow it in a Dockerfile, such as RUN, CMD, ENTRYPOINT, COPY, and ADD. If the directory doesn't exist, WORKDIR will create it. Setting the working directory explicitly keeps the Dockerfile readable and ensures that commands run in the intended directory.

The syntax is very simple:

WORKDIR /path/to/workdir

WORKDIR can be used multiple times in a Dockerfile. When a relative path is given, it is resolved against the previous WORKDIR, so the final working directory is the combination of the instructions.

WORKDIR /dir1
WORKDIR dir2
WORKDIR dir3
ENTRYPOINT [ "pwd" ]

Try this yourself by building the example below and running the resulting image. The output of the pwd command will be /dir1/dir2/dir3.

docker build https://github.com/nitin-manju/DockerSamples.git#main:InstructionSamples/WORKDIR
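For contrast, the variation below is a sketch (not part of the sample repository) in which the last WORKDIR uses an absolute path, which replaces the current directory instead of extending it:

```dockerfile
FROM alpine
WORKDIR /dir1
# Relative path: resolved against /dir1, giving /dir1/dir2
WORKDIR dir2
# Absolute path: replaces the current directory, giving /dir3
WORKDIR /dir3
ENTRYPOINT ["pwd"]

# docker run <image>   prints: /dir3
```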

COPY

The COPY instruction copies files and directories from the build context into the image during the build process. It requires two arguments: the source path (relative to the build context) and the destination path within the image. This instruction is useful for adding application code, configuration files, and other assets to the image. COPY deals only with file system content and doesn't perform URL retrieval or unpacking, which keeps image construction straightforward and predictable.

The syntax is simple:

COPY <src>... <dest>

Multiple source paths can be given and they all work as relative paths under the build context.

If the paths contain whitespace, an alternative form can be used:

COPY ["<src>",... "<dest>"]

Wildcard characters like * and ? can be used in the source paths to match file names.

COPY resource-* /resources/
COPY resource-x? /resources/

If a WORKDIR instruction has been added before the COPY instruction, a relative destination path is resolved against the WORKDIR directory.
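The sketch below shows relative and absolute destinations side by side. The config directory and the file names are hypothetical and would need to exist in the build context for the build to succeed.

```dockerfile
FROM alpine
WORKDIR /app
# Relative destination: files land in /app/config/
COPY config/*.json config/
# Absolute destination: WORKDIR is not applied, the file lands in /docs/
# The JSON array form is used because the file name contains a space
COPY ["release notes.txt", "/docs/"]
```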

ADD

The ADD Docker instruction serves to incorporate files and directories from the build context or remote URLs into a Docker image during construction. It functions similarly to the COPY instruction but includes additional capabilities like unpacking compressed files and fetching remote resources. While versatile, caution should be exercised with ADD as it might lead to unexpected behaviours when dealing with URLs. For simpler file copying, COPY is generally preferred. Utilize ADD when the advanced features it offers are necessary, such as when dealing with compressed archives or web content retrieval.

The syntax for ADD is similar to COPY and supports the same wildcards. Additionally, remote sources such as URLs and Git repositories can be given as the source path, and a local tar archive given as the source is unpacked into the destination.
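As a rough sketch of the two extra capabilities, the example below adds a local archive and a remote file. The archive name app.tar.gz and the example.com URL are placeholders, not real resources.

```dockerfile
FROM alpine
# Local tar archive from the build context: unpacked into /opt/app/
ADD app.tar.gz /opt/app/
# Remote URL: downloaded as-is into /tmp/, not unpacked
ADD https://example.com/archive.tar.gz /tmp/
```

Note the difference: only local archives are extracted automatically, while a file fetched from a URL is stored as downloaded.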

RUN

The RUN instruction executes commands during the image build process. It allows you to install packages, configure settings, and perform other setup steps in the image. Each RUN instruction creates a new layer in the image, capturing the changes made by the command. This instruction is crucial for setting up the environment and dependencies required for the application to run correctly. Combining related commands into fewer RUN instructions helps keep the number of layers and the image size down.

The syntax for the RUN instruction is below:

RUN <command>

This is the shell form: the command is run in a shell (/bin/sh -c by default on Linux) during the build. You can also provide the command in the exec form, as shown below:

RUN ["executable", "param1", "param2"]

Below is an example that uses the RUN instruction to install the Redis package. The Redis server is then started when a container is run from the image.

docker build https://github.com/nitin-manju/DockerSamples.git#main:Redis

The Dockerfile for this example can be found in the Redis directory of the same repository.
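As a sketch of both forms in one place, the Dockerfile below chains two commands in a single shell-form RUN and then repeats the version check in exec form. It assumes the alpine base image and its redis package.

```dockerfile
FROM alpine
# Shell form: both commands run in one /bin/sh -c invocation and produce one layer
RUN apk add --no-cache redis \
    && redis-server --version
# Exec form: no shell is involved, so there is no variable expansion or && chaining
RUN ["redis-server", "--version"]
```

Chaining with && also stops the build at the first failing command, instead of continuing with a half-finished setup.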

EXPOSE

The EXPOSE instruction documents the network ports that a container will listen on when it's running. It doesn't actually publish the ports. Rather, it serves as a form of documentation for developers and operators, showing which ports the containerized application needs. To make a port reachable from the host, publish it at run time with the -p flag, which maps a container port to a host port. While not mandatory, EXPOSE is particularly valuable when multiple containers communicate or when deploying to orchestrated environments, because it makes the networking requirements of the image explicit.

EXPOSE 80

This declares port 80 with the default protocol, TCP. The protocol can also be specified explicitly:

EXPOSE 80/udp

Since EXPOSE serves only as documentation, use the -p flag of the docker run command to actually apply a port mapping at run time. The -P flag publishes all the exposed ports to random ports on the host system. See the example for the -p flag below.

docker run -p 80:80/tcp ...

An example Dockerfile that exposes a port is below.

FROM alpine

RUN apk add --update redis

EXPOSE 6379

ENTRYPOINT ["redis-server"]

VOLUME

The VOLUME instruction designates a specific directory within a container as a volume, allowing data to persist beyond the container's lifecycle. This data can be shared between containers or with the host system. By decoupling data storage from the container, VOLUME supports data persistence, scalability, and data sharing across different containers.

An example that declares the /logs directory as a volume is shown below.

VOLUME ["/logs"]

However, it’s worth noting that modern best practices often lean towards using named volumes or bind mounts for more controlled and manageable data storage, as these options offer better management and flexibility compared to the VOLUME instruction.

An example of a Dockerfile that uses VOLUME is shown below:

FROM alpine

VOLUME /volume

CMD echo "hello world" > /volume/message.txt

When you run a container from the built image, you can map a host directory onto the volume using the -v flag followed by <host directory>:<container path>:

docker run -v /host-volume:/volume <image-name>

In the above example, /host-volume is a directory on the host machine. After the container exits, message.txt can be found in that directory.

ARG

The ARG instruction declares variables that can be given values at build time. This is particularly useful when you want to pass external values as build parameters. We can have multiple ARG instructions in a Dockerfile.

If an ARG is declared before the FROM, it can be used in the FROM instruction. However, it is not available to the instructions after the FROM. To use it there, declare it again after the FROM without a value, and it will pick up the value set earlier.

To pass a value during the build command, use the --build-arg flag for each ARG instruction.

Example: Dockerfile ARG Instruction in action

ARG MESSAGE=HelloWorld

FROM alpine

ARG MESSAGE

RUN echo "$MESSAGE" > file.txt

CMD cat file.txt

In the above example, an ARG instruction declares a parameter MESSAGE with a default value before the FROM, and it is declared again after the FROM so that it can be used there. The RUN instruction writes its value into a file (file.txt) using the echo command, and the CMD prints the file content using the cat command when a container is run.

Build this image with the parameter passed externally as an argument:

docker build --build-arg MESSAGE="Hello World From External ARG" .

Run this image and it should print the message that was passed at build time: Hello World From External ARG

ARG is multipurpose and is used in various scenarios during the docker image build process.

ENV

The ENV instruction sets environment variables within a Docker image. These variables are available to the subsequent build instructions and to the processes that run in containers created from the image. This instruction is valuable for configuring application-specific settings, such as database URLs or default ports.

ENV enhances containerization by promoting configuration separation from code, making the image more adaptable across environments. It also improves image maintainability by consolidating configuration in a single location, simplifying updates. However, sensitive data should be handled with care, and more secure methods like Docker secrets are recommended for sensitive information.

The syntax to define an ENV is as below:

ENV <key>=<value> ...

You can set multiple key-value pairs on the same line, separated by spaces. These variables persist in containers created from the image and can be overridden at run time using --env <key>=<value>.
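A short sketch of both points, with APP_ENV and APP_PORT as made-up variable names:

```dockerfile
FROM alpine
# Two variables set in one instruction
ENV APP_ENV=production APP_PORT=8080
# Shell form, so the variables are expanded when the container starts
CMD echo "env=$APP_ENV port=$APP_PORT"

# docker run <image>                         prints: env=production port=8080
# docker run --env APP_ENV=staging <image>   prints: env=staging port=8080
```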

In the next chapter, we will build a Dockerfile with some of the instructions discussed above. Hope you have enjoyed this post. Stay tuned for more Docker 101!
