Docker 104 — Dockerfile instructions in detail
In the previous chapter, we understood how the build process works and we also created a simple Dockerfile. In this section, we will discuss a bit more about Dockerfile capabilities and the various commands that we can use. We will only discuss the bare minimum required to build a working Dockerfile.
Dockerfile instruction explained
Below we will discuss the most commonly used Dockerfile instructions, which will help you build a basic as well as an advanced Dockerfile. Let's begin!
FROM
We have mainly been talking about the FROM instruction so far. Its syntax is as below:
FROM [--platform=<platform>] <image> [AS <name>]
FROM can be used multiple times in a Dockerfile. Each FROM starts a new build stage and sets the base image for the instructions that follow, until the next FROM instruction is issued. AS <name> is optional and gives the build stage a name that can be referenced at a later point. The --platform=<platform> flag is also optional and selects the platform of the image when FROM references a multi-platform image. Values for this flag can include linux/amd64, linux/arm64, or windows/amd64. FROM is usually the first instruction in a Dockerfile. The only instruction that can precede it is ARG (explained later), which declares variables that FROM can then use.
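As a rough illustration of multiple FROM instructions and the AS <name> alias, here is a minimal two-stage sketch. The stage name builder and the file /artifact.txt are made up for this example, and COPY --from=<name> is assumed as the way the later stage refers back to the named one.

```dockerfile
# Stage 1: named "builder" with AS so that later stages can refer to it
FROM alpine AS builder
RUN echo "built artifact" > /artifact.txt

# Stage 2: starts again from a fresh base image;
# only what is copied over from "builder" ends up in the final image
FROM alpine
COPY --from=builder /artifact.txt /artifact.txt
CMD ["cat", "/artifact.txt"]
```

Only the last stage becomes the final image, which is why this pattern is commonly used to keep build tools out of the image that is shipped.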
CMD
The CMD instruction is mainly used to provide the default command and parameters for when a container instance is run. Only one CMD instruction takes effect in a Dockerfile: if several are provided, only the last one is used. Note that the CMD command and parameters can be overridden when the container is run.
There are three ways to write the CMD instruction. The first is the JSON array, or exec, form. Here the executable is run directly and param1 and param2 are passed to it.
CMD ["executable","param1","param2"]
CMD can also be provided in the shell form. Here, command is the executable and param1 and param2 are passed as its parameters, and the whole line is run through a shell (/bin/sh -c by default on Linux).
CMD command param1 param2
There is a third way, which provides only default parameters and works hand in hand with the ENTRYPOINT instruction (explained next).
CMD ["param1","param2"]
Here the executable is provided by the ENTRYPOINT instruction and CMD only provides the default parameters. In this method, both the CMD and ENTRYPOINT instructions are required in the Dockerfile, and ENTRYPOINT should be written in the exec (JSON array) form.
ENTRYPOINT
The ENTRYPOINT instruction specifies the command that is executed when a Docker container starts. It sets a default executable for the container, which can be combined with arguments supplied when the container is run. Unlike CMD, ENTRYPOINT is not replaced by arguments given to docker run: with the exec form, those arguments are appended to it, and replacing it requires the --entrypoint flag. This instruction is often used to define the primary application or process that runs within the container, so that the container behaves the same way across different environments.
The syntax for the ENTRYPOINT instruction looks as below:
ENTRYPOINT ["executable", "param1", "param2"]
Just like the CMD instruction, it can also be used in the shell form:
ENTRYPOINT command param1 param2
Only the last ENTRYPOINT instruction in the Dockerfile will have an effect.
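The way ENTRYPOINT and CMD combine can be sketched as follows. The image name greet used in the comments is a placeholder, not something from the sample repository.

```dockerfile
FROM alpine
# Fixed part: always runs, whatever arguments "docker run" receives
ENTRYPOINT ["echo", "Hello"]
# Default arguments: appended to ENTRYPOINT, replaced by "docker run" arguments
CMD ["world"]

# Assuming the image was built with: docker build -t greet .
#   docker run greet          prints: Hello world
#   docker run greet Docker   prints: Hello Docker
#   docker run --entrypoint date greet   replaces the entrypoint itself
```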
WORKDIR
The WORKDIR instruction sets the working directory for the instructions that follow it in a Dockerfile. The path given to WORKDIR becomes the location in which RUN, CMD, ENTRYPOINT, COPY, and ADD instructions are executed. If the directory doesn't exist, WORKDIR will create it. Using WORKDIR keeps the Dockerfile readable and makes sure that commands run in the intended directory instead of relying on cd inside RUN instructions.
The syntax is very simple:
WORKDIR /path/to/workdir
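If you use multiple WORKDIR instructions with relative paths, each path is resolved against the previous working directory, so the final working directory is the combination of all of them. An absolute path resets the working directory instead.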
WORKDIR /dir1
WORKDIR dir2
WORKDIR dir3
ENTRYPOINT [ "pwd" ]
Try this yourself by building the sample Dockerfile with the command below and running the resulting image. The output of the pwd command will be /dir1/dir2/dir3.
docker build https://github.com/nitin-manju/DockerSamples.git#main:InstructionSamples/WORKDIR
COPY
The COPY instruction copies files and directories from the build context on the host machine into a Docker image during the build process. It requires two arguments: the source path (relative to the build context) and the destination path within the image. This instruction is useful for incorporating application code, configuration files, and other assets into the image. COPY deals solely with file system content and doesn't perform URL retrieval or unpacking, which keeps the image build straightforward and predictable.
The syntax is simple:
COPY <src>... <dest>
Multiple source paths can be given, and they are all treated as relative paths under the build context.
If the paths contain whitespace, the JSON array form can be used instead:
COPY ["<src>",... "<dest>"]
Wildcard characters like * and ? can be used in the source paths to match file names.
COPY resource-* /resources/
COPY resource-x? /resources/
If a WORKDIR instruction has been added before the COPY instruction, a relative destination path is resolved against the WORKDIR directory. An absolute destination path is used as it is.
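The interaction between WORKDIR and the COPY destination can be sketched like this. The file config.yaml is a hypothetical file assumed to exist in the build context.

```dockerfile
FROM alpine
WORKDIR /app

# Relative destination: resolved against WORKDIR, so the file lands in /app/config/
COPY config.yaml config/

# Absolute destination: WORKDIR is not involved, so the file lands in /etc/myapp/
COPY config.yaml /etc/myapp/
```

The trailing slash on the destination tells Docker to treat it as a directory and to create it if it is missing.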
ADD
The ADD instruction copies files and directories from the build context or from remote URLs into a Docker image during the build. It works like the COPY instruction but adds two capabilities: it automatically unpacks local tar archives into the destination, and it can fetch remote resources. Be careful with ADD, as this implicit behaviour can be surprising; for example, an archive downloaded from a URL is not unpacked. For plain file copying, COPY is generally preferred. Use ADD only when you need its extra features, such as unpacking a local archive or retrieving content from the web.
The syntax for ADD is similar to COPY and follows similar use of wildcards. Additionally, remote paths like Git repositories and archive files can be provided as the source path.
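The difference between ADD and COPY for a local archive can be sketched as below. The archive site.tar.gz is a hypothetical file assumed to exist in the build context.

```dockerfile
FROM alpine

# ADD unpacks a local tar archive: its contents end up under /srv/site/
ADD site.tar.gz /srv/site/

# COPY leaves the archive untouched: the file itself ends up at /srv/archive/site.tar.gz
COPY site.tar.gz /srv/archive/
```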
RUN
The RUN instruction executes commands during the image build process. It allows you to install packages, configure settings, and perform other setup steps inside the image. Each RUN instruction creates a new layer in the image, capturing the changes made by the command. This instruction is crucial for setting up the environment and dependencies required for the application to run correctly. Combining related commands into a single RUN instruction keeps the number of layers down and helps with image size and layer caching.
The syntax for the RUN instruction is below:
RUN <command>
In this shell form, the command is run in a shell during the build. You can also provide the command in the exec form, as shown below:
RUN ["executable", "param1", "param2"]
Below is an example that uses the RUN instruction to install the Redis package in an image that then starts the Redis server.
docker build https://github.com/nitin-manju/DockerSamples.git#main:Redis
The Dockerfile for this example can be found in the Redis directory of the same repository.
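As a small sketch of the two RUN forms and of combining commands into one layer (this is not the Dockerfile from the repository above, just an illustration using the redis package from Alpine):

```dockerfile
FROM alpine

# Shell form: both commands run in one shell invocation and produce one layer.
# --no-cache keeps the apk package index out of the layer.
RUN apk add --no-cache redis && redis-server --version

# Exec form: the executable is run directly, without a shell
RUN ["redis-server", "--version"]
```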
EXPOSE
The EXPOSE instruction documents the network ports that a container will listen on when it's running. It doesn't actually publish the ports; rather, it serves as documentation for developers and operators, showing which ports the containerized application needs. To make a port reachable from the host, publish it at runtime with the -p flag, which maps a container port to a host port. While not mandatory, EXPOSE is particularly valuable when multiple containers communicate or when deploying to orchestrated environments, because it makes the intended ports explicit.
EXPOSE 80
This will expose port 80 as a TCP (default) port. However, we can specify the protocol for the port.
EXPOSE 80/udp
Since this serves only as documentation, use the -p flag with the docker run command to actually apply the port mapping at runtime. The -P flag will publish all the exposed ports to random ports on the host system. See the example for the -p flag below.
docker run -p 80:80/tcp ...
An example Dockerfile that exposes a port is below.
FROM alpine
RUN apk add --update redis
EXPOSE 6379
ENTRYPOINT ["redis-server"]
VOLUME
The VOLUME instruction declares a directory within a container as a volume mount point, allowing data written there to persist beyond the container's lifecycle. This data can be shared between containers or with the host system. By decoupling data storage from the container, VOLUME supports data persistence and data sharing across different containers.
An example that declares the /logs directory as a volume is shown below.
VOLUME ["/logs"]
However, it’s worth noting that modern best practice often leans towards specifying named volumes or bind mounts at run time, as these offer more control and flexibility than the VOLUME instruction.
An example of a VOLUME is shown below:
FROM alpine
VOLUME /volume
CMD echo "hello world" > /volume/message.txt
When you run the built image, provide the host directory using the -v flag in the form <host directory>:<container volume>:
docker run -v /host-volume:/volume <image-name>
In the above example, /host-volume is a directory on the host machine.
ARG
The ARG instruction declares build-time variables. This is particularly useful when you want to pass external values as build parameters. We can have multiple ARG instructions in a Dockerfile.
If an ARG is declared before FROM, it can be used in the FROM instruction. However, it is not available after the FROM. To use it after the FROM, declare it again, this time without a value.
To pass a value during the build command, use the --build-arg flag, once for each ARG you want to set.
Example: Dockerfile ARG Instruction in action
ARG MESSAGE=HelloWorld
FROM alpine
ARG MESSAGE
RUN echo "$MESSAGE" > file.txt
CMD cat file.txt
In the above example, an ARG instruction declares a parameter MESSAGE before the FROM, and the parameter is declared again after the FROM so that it can be used there. Its value is written into a file (file.txt) using the echo command, and the file content is finally printed using the cat command.
Build this image with the parameter passed externally as an argument:
docker build --build-arg MESSAGE="Hello World From External ARG" .
Run this image and you should see the below output:
Hello World From External ARG
ARG is multipurpose and is used in various scenarios during the docker image build process.
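A common case for an ARG before FROM is choosing the base image tag at build time. The sketch below assumes a variable named ALPINE_VERSION and the file /base.txt, both made up for this example.

```dockerfile
# Declared before FROM: usable in the FROM instruction itself
ARG ALPINE_VERSION=3.19
FROM alpine:${ALPINE_VERSION}

# Declared again without a value to make it available inside this build stage
ARG ALPINE_VERSION
RUN echo "Built on Alpine ${ALPINE_VERSION}" > /base.txt
CMD ["cat", "/base.txt"]

# To pick another base version at build time:
#   docker build --build-arg ALPINE_VERSION=3.20 .
```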
ENV
The ENV instruction sets environment variables within a Docker image. These variables are available to subsequent instructions during the build and to the processes running within the container. This instruction is valuable for configuring application-specific settings, such as database URLs or API keys.
ENV helps separate configuration from code, making the image more adaptable across environments. It also improves image maintainability by keeping configuration in a single location, which simplifies updates. However, values set with ENV are stored in the image, so sensitive data should be handled with care; more secure methods like Docker secrets are recommended for sensitive information.
The syntax to define an ENV is as below:
ENV <key>=<value> ...
You can set multiple key-value pairs on the same line, separated by spaces. These variables persist when containers are run from the image and can be overridden at run time using --env <key>=<value>.
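Here is a minimal sketch of ENV with two variables and a run-time override. The variable names APP_ENV and APP_PORT and the image name env-demo are made up for this example.

```dockerfile
FROM alpine
# Two key-value pairs on one line, separated by a space
ENV APP_ENV=production APP_PORT=8080

# Shell form, so the variables are expanded by the shell when the container starts
CMD echo "env=$APP_ENV port=$APP_PORT"

# Assuming the image was built with: docker build -t env-demo .
#   docker run env-demo                        prints: env=production port=8080
#   docker run --env APP_ENV=staging env-demo  prints: env=staging port=8080
```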
In the next chapter, we will build a Dockerfile with some of the instructions discussed above. Hope you have enjoyed this post. Stay tuned for more Docker 101!