
Creating Docker Images Using Dockerfile for Beginners



Efficient, secure Docker images are central to running containerized applications well. This guide walks through best practices for writing Dockerfiles, along with techniques for reducing image size without sacrificing functionality or security.

Assumptions/Prerequisites

To get the most out of this guide, you should have:

  • Basic knowledge of Docker and Dockerfile syntax.
  • Familiarity with Node.js applications, including package.json and npm.
  • Docker installed and set up on your machine.
  • A sample Node.js project to test the examples.

Best Practices

1. Start with a Minimal Base Image

  • Use lightweight base images like node:alpine to reduce image size.
  • For example:
FROM node:alpine

2. Specify an Explicit Version

  • Avoid latest (or an unversioned tag such as node:alpine) for your base image. These tags move over time, so the same Dockerfile can produce different images on different days.
  • Instead:
FROM node:18.16.0-alpine
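
One way to keep the pinned version in a single place is a build argument declared before FROM. This is a minimal sketch, assuming the same node:18.16.0-alpine tag used above; the NODE_VERSION name is our own choice for illustration:

```dockerfile
# NODE_VERSION is a hypothetical build argument. An ARG declared
# before FROM can be referenced in the FROM line.
ARG NODE_VERSION=18.16.0
FROM node:${NODE_VERSION}-alpine
```

The default can be overridden at build time with docker build --build-arg NODE_VERSION=<version> ., which keeps upgrades an explicit, reviewable change.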

3. Minimize the Number of Layers

  • Combine related commands into a single RUN instruction to reduce the number of layers in the final image.
  • Example of combining commands:
RUN apk add --no-cache curl \
    && npm install --omit=dev \
    && npm cache clean --force

4. Use Multi-Stage Builds

  • Multi-stage builds allow you to separate build-time and runtime dependencies, which reduces the final image size.
  • Example:
# Build stage: full image with build tooling
FROM node:18.16.0 AS builder
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build

# Runtime stage: pinned, minimal image with production dependencies only
FROM node:18.16.0-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY package.json package-lock.json ./
RUN npm ci --omit=dev
CMD ["node", "dist/index.js"]

5. Avoid Adding Unnecessary Files

  • Use .dockerignore to exclude files that are not needed in the build context.
  • Example .dockerignore:
node_modules
*.log
*.tmp
dist

6. Install Only Required Packages

  • Be precise about the tools and libraries you install to avoid unnecessary bloat.
  • Example:
RUN npm install express dotenv

7. Clean Up Temporary Files

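  • Remove cache files and temporary artifacts in the same RUN instruction that creates them. A cleanup in a later instruction adds a new layer but does not shrink the earlier one.
  • Example:
RUN npm install --omit=dev && npm cache clean --force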

8. Use COPY Instead of ADD

  • Prefer COPY over ADD unless you need to extract a tar file or download from a URL.
  • Example:
COPY src/ /app/src/

9. Set Metadata Using Labels

  • Add labels for documentation, versioning, and maintainership.
  • Example:
LABEL maintainer="you@example.com" \
version="1.0" \
description="Node.js application"
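
The label keys above are free-form. If you prefer standardized names, the OCI image specification defines a set of annotation keys under org.opencontainers.image. A small sketch with the same placeholder values as above:

```dockerfile
# Predefined OCI annotation keys; the values are placeholders.
LABEL org.opencontainers.image.authors="you@example.com" \
      org.opencontainers.image.version="1.0" \
      org.opencontainers.image.description="Node.js application"
```

Labels can be read back with docker inspect on the built image.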

10. Leverage Caching

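  • Order the instructions in your Dockerfile so that the ones that change least often come first. Docker reuses cached layers up to the first instruction whose inputs have changed.
  • Copy the dependency manifests and install dependencies before copying the rest of the source code, which changes most often.
  • Example:
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm install
COPY . .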
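
Beyond layer ordering, BuildKit offers cache mounts that persist a directory between builds. The sketch below assumes BuildKit is the active builder, that the build runs as root (so npm's cache lives in /root/.npm), and that a package-lock.json is present:

```dockerfile
# syntax=docker/dockerfile:1
FROM node:18.16.0-alpine
WORKDIR /app
COPY package.json package-lock.json ./
# Keep npm's download cache in a BuildKit cache mount so it is reused
# across builds but never written into an image layer.
RUN --mount=type=cache,target=/root/.npm npm ci
COPY . .
```

Because the cache directory is mounted only for the duration of that RUN instruction, no separate cache cleanup step is needed for it.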

11. Set a Non-Root User

  • Avoid running your application as root for security reasons.
  • Example (Alpine-based images; the addgroup and adduser flags differ on Debian-based images):
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser
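
The official node images already include an unprivileged user named node, so creating a user is optional there. A minimal sketch, assuming a package-lock.json is present and using index.js as a hypothetical entry point:

```dockerfile
FROM node:18.16.0-alpine
WORKDIR /app
# --chown makes the copied files owned by the built-in "node" user
# instead of root.
COPY --chown=node:node package.json package-lock.json ./
RUN npm ci --omit=dev
COPY --chown=node:node . .
# Everything from here on, including the running container, uses "node".
USER node
CMD ["node", "index.js"]
```

Switching users after the install step means the dependencies are installed as root while the application itself runs without root privileges.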

12. Use ENTRYPOINT with CMD

  • Use ENTRYPOINT for the main application command and CMD for its default arguments. Arguments passed to docker run replace CMD but leave ENTRYPOINT in place.
  • Example:
ENTRYPOINT ["node", "dist/index.js"]
CMD ["--help"]
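
To make the interaction concrete, the annotated fragment below shows which command runs in each case. The image name myapp and the --port flag are hypothetical:

```dockerfile
ENTRYPOINT ["node", "dist/index.js"]
# Default argument, used only when docker run is given no arguments.
CMD ["--help"]
# docker run myapp              runs: node dist/index.js --help
# docker run myapp --port 8080  runs: node dist/index.js --port 8080
```

The exec (JSON array) form shown here is what allows CMD and docker run arguments to be appended to ENTRYPOINT.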

13. Optimize Image Layers

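  • The legacy builder's --squash flag merges the layers produced by a build into one, which can reduce image size. It is experimental, requires experimental features to be enabled on the Docker daemon, and is not supported by BuildKit, so a multi-stage build is usually the more portable way to get the same result.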
  • Example:
docker build --squash -t myapp:optimized .

14. Scan for Vulnerabilities

  • Use tools like Trivy or Docker Scout (docker scout cves, which replaced the older docker scan command) to identify known vulnerabilities in your image.
  • Example:
trivy image myapp:latest

Example Dockerfile

Here’s a complete Dockerfile incorporating best practices for a Node.js application:

# Stage 1: Build stage
FROM node:18.16.0 AS builder
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build

# Stage 2: Runtime stage (pinned, minimal base image)
FROM node:18.16.0-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY package.json package-lock.json ./
RUN npm ci --omit=dev
# Run as the unprivileged user that ships with the official node images
USER node
CMD ["node", "dist/index.js"]

Reducing Image Size Checklist

  • Use lightweight base images.
  • Remove unnecessary files and cache.
  • Leverage multi-stage builds.
  • Use .dockerignore effectively.
  • Combine commands and minimize layers.

Conclusion

Following these best practices produces Docker images that are efficient, secure, and suited to production environments. The same habits also improve maintainability. Building small, robust images is an essential step toward containerized applications that run smoothly in any environment.
