
Orchestrating Containers with Docker Swarm

Introduction

As your applications grow in complexity, managing containers on a single host becomes limiting. You need high availability, scalability, and easy management of distributed containers. Docker Swarm is Docker's native orchestration solution, enabling you to cluster multiple Docker hosts, deploy multi-container applications, and manage them as a single logical unit. In this lesson, you'll learn how to set up and operate a Docker Swarm cluster, deploy services, scale your applications, and understand how Swarm handles orchestration and service discovery.


What is Docker Swarm?

Docker Swarm turns a pool of Docker hosts into a single, virtual Docker host. It provides:

  • Clustering: Pooling multiple hosts (nodes) into a single swarm.
  • Service orchestration: Defining and running multi-container applications (services).
  • Scaling: Seamlessly scaling services up or down.
  • Self-healing: Detecting and replacing failed containers.
  • Load balancing: Distributing requests among containers.

Swarm mode has been built into Docker Engine since version 1.12, so there is nothing extra to install before you enable it.


Swarm Architecture

A Docker Swarm consists of two types of nodes:

  • Manager Nodes: Handle cluster management tasks (orchestration, scheduling, maintaining cluster state).
  • Worker Nodes: Run containers (tasks) as instructed by managers.

By default, manager nodes also act as workers and run containers, but in production it's recommended to dedicate them to management tasks.


Initializing a Swarm

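Let's create a basic Swarm cluster. You need at least one machine running Docker Engine (Linux, or macOS/Windows with Docker Desktop). One machine is enough for a single-node swarm; adding worker nodes requires additional hosts that can reach the manager over the network.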

Step 1: Initialize the Swarm

On the first node (which will become the manager):

docker swarm init --advertise-addr <MANAGER-IP>
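  • --advertise-addr tells Swarm which IP address this manager advertises to other nodes. They use it to reach the manager, so it matters most when the host has more than one network interface.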

Example:

docker swarm init --advertise-addr 192.168.1.100

After running this command, Docker will display a join command for worker nodes:

docker swarm join --token <TOKEN> <MANAGER-IP>:2377
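Running docker swarm init on a node that already belongs to a swarm fails, which is awkward in setup scripts that may run more than once. The following is a minimal sketch of a guard around the init step, assuming a POSIX shell. The helper name ensure_swarm is hypothetical (it is not a Docker command); the docker info format field and the init flag are the standard CLI ones.

```shell
# ensure_swarm is a hypothetical helper name, not part of the Docker CLI.
# It initializes a swarm only if this node is not already in one.
ensure_swarm() {
    advertise_addr="$1"
    # LocalNodeState is reported as "active" once the node belongs to a swarm.
    state="$(docker info --format '{{.Swarm.LocalNodeState}}')" || return 1
    if [ "$state" = "active" ]; then
        echo "swarm already active on this node"
    else
        docker swarm init --advertise-addr "$advertise_addr"
    fi
}

# Usage, with the example address from this lesson:
#   ensure_swarm 192.168.1.100
```

The sketch only distinguishes "active" from everything else; other states (for example a locked swarm) would still need manual attention.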

Step 2: Add Worker Nodes

On each worker node, run the docker swarm join command provided by the manager.

Example:

docker swarm join --token SWMTKN-1-xxxx 192.168.1.100:2377
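If the join command has scrolled out of your terminal, a manager can print the worker token again with docker swarm join-token. As a small sketch, the helper below rebuilds the full join command from that token. The function name worker_join_command is hypothetical, and the manager address is whatever you pass in (the example address from this lesson is used in the usage comment).

```shell
# worker_join_command is a hypothetical helper; run it on a manager node.
# It prints a ready-to-paste join command for new workers.
worker_join_command() {
    manager_addr="$1"    # manager address and port, e.g. 192.168.1.100:2377
    # -q prints only the token instead of the full instructions.
    token="$(docker swarm join-token -q worker)" || return 1
    printf 'docker swarm join --token %s %s\n' "$token" "$manager_addr"
}

# Usage:
#   worker_join_command 192.168.1.100:2377
```

Treat the token like a password: anyone who has it and can reach the manager can add a node to the swarm.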

Step 3: Verify the Swarm

On the manager node, run:

docker node ls

You should see a list of all nodes in the Swarm.
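For scripted checks, it can be handy to count how many nodes report a Ready status instead of reading the table by eye. This is a minimal sketch, assuming a POSIX shell on a manager node; count_ready_nodes is a hypothetical helper name.

```shell
# count_ready_nodes is a hypothetical helper; run it on a manager node.
# It prints the number of nodes whose status is "Ready".
count_ready_nodes() {
    # --format with {{.Status}} prints one status per node, one per line.
    docker node ls --format '{{.Status}}' | grep -c '^Ready$'
}

# Usage:
#   echo "Ready nodes: $(count_ready_nodes)"
```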


Deploying Services in Swarm

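In Swarm, you don't run containers directly. Instead, you deploy services. A service describes the desired state (image, number of replicas, published ports), and Swarm creates and maintains the tasks needed to match it.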

Creating a Service

docker service create --name webserver -p 80:80 nginx:alpine
  • --name webserver: The name of the service.
  • -p 80:80: Publish port 80 on every node in the swarm and route it to port 80 of the service's containers.
  • nginx:alpine: The image to use.

You can list services with:

docker service ls

Scaling Services

To scale the webserver service to 3 replicas:

docker service scale webserver=3

Swarm will ensure 3 containers (tasks) are running, distributing them across available nodes.
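Scaling is not instantaneous, so scripts often need to know when the running replica count has caught up with the desired count. The sketch below parses the REPLICAS column (for example 3/3) from docker service ls. The helper name service_converged is hypothetical, and the approach assumes a replicated service.

```shell
# service_converged is a hypothetical helper; run it on a manager node.
# It succeeds (exit 0) when running replicas equal desired replicas.
service_converged() {
    svc="$1"
    # The name filter matches by prefix, so awk keeps only the exact name.
    replicas="$(docker service ls --filter "name=$svc" \
        --format '{{.Name}} {{.Replicas}}' \
        | awk -v s="$svc" '$1 == s { print $2 }')"
    running="${replicas%%/*}"    # part before the slash
    desired="${replicas##*/}"    # part after the slash
    [ -n "$replicas" ] && [ "$running" = "$desired" ]
}

# Usage:
#   docker service scale webserver=3
#   until service_converged webserver; do sleep 2; done
```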

Updating Services

You can update the image or configuration of a running service:

docker service update --image nginx:latest webserver

By default, Swarm performs a rolling update, replacing tasks one at a time with the new image.


Service Discovery and Load Balancing

Swarm provides built-in service discovery. Services are reachable by name from other services attached to the same overlay network.

  • Ingress Load Balancing: When you publish a port, Swarm's routing mesh accepts connections on that port on every node and load-balances them across all running tasks of the service.
  • Service-to-Service Communication: Containers of services attached to the same overlay network can reach each other by service name through Swarm's built-in DNS.
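To make service-to-service communication concrete, here is a minimal sketch that creates an overlay network and attaches two services to it. The function name deploy_on_shared_network, the service names api and web, and the published port 8080 are all hypothetical choices for illustration; nginx:alpine is simply reused from the examples in this lesson.

```shell
# deploy_on_shared_network is a hypothetical helper; run it on a manager.
# The service names (api, web) and port 8080 are illustrative only.
deploy_on_shared_network() {
    net="$1"
    # Create a user-defined overlay network that spans the swarm.
    docker network create --driver overlay "$net" &&
    # Both services join the same network, so each can resolve the
    # other by its service name.
    docker service create --name api --network "$net" nginx:alpine &&
    docker service create --name web --network "$net" -p 8080:80 nginx:alpine
}

# Usage:
#   deploy_on_shared_network appnet
```

With both services on appnet, a process inside a web container should be able to reach the other service at http://api/ without knowing which node its tasks run on.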

Rolling Updates and Rollbacks

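Swarm supports rolling updates and rollbacks. When a service runs more than one replica, an update can proceed without taking the whole service offline.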

Rolling Update Example

docker service update --image nginx:1.21-alpine webserver

You can control update parameters (like parallelism and delay):

docker service update \
  --update-parallelism 2 \
  --update-delay 10s \
  --image nginx:1.21-alpine webserver
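To get a feel for how parallelism and delay interact, the arithmetic can be sketched in shell. This is a rough lower bound only: it assumes the delay is applied between batches, that parallelism is at least 1, and it ignores the time each task takes to stop and start. The helper names update_batches and min_update_wait are hypothetical.

```shell
# Hypothetical helpers for estimating rolling-update timing.
# Assumes parallelism >= 1 and ignores task start-up time.

# Number of batches = ceil(replicas / parallelism).
update_batches() {
    replicas="$1"
    parallelism="$2"
    echo $(( (replicas + parallelism - 1) / parallelism ))
}

# Minimum time spent waiting between batches, in seconds.
min_update_wait() {
    batches="$(update_batches "$1" "$2")"
    echo $(( (batches - 1) * $3 ))
}

# Usage: 5 replicas, parallelism 2, delay 10s
#   update_batches 5 2       # 3 batches (2 + 2 + 1 tasks)
#   min_update_wait 5 2 10   # 20 seconds of delay in total
```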

Rollback Example

If something goes wrong:

docker service rollback webserver
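Instead of rolling back by hand, you can ask Swarm to roll back on its own when an update fails, using the --update-failure-action flag. The sketch below wraps that in a helper; the name safe_update is hypothetical, and the parallelism and delay values are illustrative choices, not recommendations.

```shell
# safe_update is a hypothetical helper; run it on a manager node.
# It updates one task at a time and rolls back automatically on failure.
safe_update() {
    service="$1"
    image="$2"
    docker service update \
        --update-failure-action rollback \
        --update-parallelism 1 \
        --update-delay 10s \
        --image "$image" "$service"
}

# Usage:
#   safe_update webserver nginx:1.21-alpine
```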

Common Operations

Viewing Service Tasks

docker service ps webserver

Removing a Service

docker service rm webserver

Draining a Node

Stop scheduling new tasks on a node and move its existing tasks to other nodes (e.g., for maintenance):

docker node update --availability drain <NODE-ID>
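A drained node stays drained until you set its availability back to active. As a minimal sketch of a maintenance workflow, the helper below drains a node, runs a command you supply, and then reactivates the node. The name maintain_node is hypothetical, and the maintenance command in the usage comment is a placeholder.

```shell
# maintain_node is a hypothetical helper; run it on a manager node.
# Usage: maintain_node <NODE-ID> <command> [args...]
maintain_node() {
    node="$1"
    shift
    # Stop scheduling on the node and move its tasks elsewhere.
    docker node update --availability drain "$node" || return 1
    # Run the caller's maintenance command and remember its exit status.
    "$@"
    status=$?
    # Make the node schedulable again, even if maintenance failed.
    docker node update --availability active "$node"
    return "$status"
}

# Usage (placeholder maintenance command):
#   maintain_node worker-1 echo "patching worker-1"
```

Note that Swarm does not move existing tasks back when the node becomes active again; the node only receives tasks from later scheduling decisions, such as scaling or updates.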

Real-World Use Cases

  • High availability web apps: Deploy web servers across multiple hosts with automatic failover.
  • Batch processing: Run distributed workers that can be scaled up or down easily.
  • Dev/Test environments: Quickly spin up reproducible environments for testing multi-component applications.

Common Mistakes and Pitfalls

  • Running workloads on manager nodes: For production, dedicate manager nodes to orchestration and avoid running user containers on them.
  • Forgetting persistent storage: Swarm tasks can be rescheduled to any node; use shared storage (e.g., NFS, cloud block storage) for stateful workloads.
  • Using docker run instead of docker service create: In Swarm mode, orchestrate with services, not individual containers.
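  • Network misconfiguration: Ensure all nodes can reach each other on the required ports: 2377/tcp (cluster management), 7946/tcp and 7946/udp (node-to-node communication), and 4789/udp (overlay network traffic).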
  • Not monitoring service health: Swarm can restart failed containers, but persistent failures should be investigated.
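For the last point, a quick way to spot persistent failures is to list only the tasks of a service that report an error. This is a minimal sketch, assuming a POSIX shell with awk on a manager node; the helper name failed_tasks is hypothetical, and the "|" separator is an arbitrary choice for parsing.

```shell
# failed_tasks is a hypothetical helper; run it on a manager node.
# It prints "task-name|error" for every task that has an error message.
failed_tasks() {
    docker service ps --no-trunc --format '{{.Name}}|{{.Error}}' "$1" \
        | awk -F '|' '$2 != ""'
}

# Usage:
#   failed_tasks webserver
```

An empty result means no task of that service currently reports an error; repeated entries for the same slot suggest a crash loop worth investigating.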

Summary/Recap

  • Docker Swarm orchestrates containers across a cluster of hosts.
  • It provides service deployment, scaling, rolling updates, and built-in load balancing.
  • Swarm is easy to initialize and operate, making it a good choice for small/medium clusters and getting started with orchestration.
  • For production, pay attention to storage, node roles, and network configuration.

Quiz

  1. What is the difference between a Swarm manager node and a worker node?

    Answer: Manager nodes handle orchestration, cluster management, and scheduling tasks, while worker nodes run the containers (tasks) as instructed by managers.

  2. How do you scale a service named api to 5 replicas in Docker Swarm?

    Answer: docker service scale api=5

  3. Which command initializes a Docker Swarm on a node with the IP address 10.0.0.1?

    Answer: docker swarm init --advertise-addr 10.0.0.1

  4. True or False: In Swarm mode, you should use docker run to deploy containers across the cluster.

    Answer: False. You should use docker service create and related service commands.

  5. What does the docker node update --availability drain <NODE-ID> command do?

    Answer: It marks the node as unavailable for scheduling new tasks and stops the tasks running on it, so Swarm reschedules them on other nodes.