The Hidden Cost of 'docker run': Untangling Resource Limits and Container Sprawl

Every developer has run docker run without a second thought. It's the fastest way to spin up an environment, test a service, or debug a dependency. But that convenience has a dark side: containers that consume unlimited memory, pile up on disk, and drift into configurations that no one remembers. This article unpacks the real cost of ignoring resource limits and container sprawl. We'll walk through the common mistakes teams make, the patterns that actually prevent runaway containers, and the maintenance traps that turn a clean stack into a tangled mess. You'll learn how to set memory and CPU limits without breaking applications, how to clean up unused images and volumes systematically, and when the docker run approach should be replaced by orchestration or ephemeral workflows.

Where the Sprawl Begins: The 'docker run' Reflex

The trouble starts innocently. A developer needs to test a new database version, so they type docker run -d --name test-mongo mongo:5. It works. Later, they need a caching layer: docker run -d --name redis-cache redis:alpine. By the end of the week, that machine hosts a dozen containers, none of them resource-limited. Each container pulls an image, creates a writable layer, and claims a chunk of disk. Without limits, a single misbehaving container can starve the host of memory or CPU, affecting every other workload.

We see this pattern in teams of all sizes. The docker run command is so easy that it bypasses any governance. Developers treat it like a local tool, not a production workload. But when that same machine also runs monitoring agents, CI runners, or other services, the lack of limits becomes a risk. The hidden cost isn't just the disk space—it's the unpredictability. A container that leaks memory can crash the Docker daemon, or worse, trigger OOM kills on critical processes.

Another contributor to sprawl is the habit of running containers with --restart unless-stopped or always without thinking about cleanup. Containers that exit with errors stay in the docker ps -a list indefinitely. Images accumulate layers from frequent pulls. Volumes remain orphaned after container deletion. Over months, a single developer's workstation can accumulate tens of gigabytes of unused Docker artifacts. The reflex to docker run without flags like --rm or --memory is the root cause.

The Real Price of Unlimited Resources

When a container has no memory limit, it can consume all available RAM on the host. In a shared development environment, that means other containers may be killed by the kernel's OOM killer, leading to data loss or corrupted state. CPU limits are equally important: a container running an infinite loop can peg all cores, making the host unresponsive. Many teams learn this the hard way when a containerized Node.js app with a memory leak brings down the CI server.

Disk sprawl is quieter but equally costly. Each docker pull downloads image layers. Even if you delete the container, the image stays. Over time, docker system df reveals gigabytes of dangling images and volumes. Cleaning them manually is tedious, and automated cleanup scripts often get skipped. The result is a host that runs out of disk space, causing writes to fail and containers to crash.

Foundations Readers Confuse: Limits vs. Reservations vs. Swappiness

Many developers conflate memory limits with memory reservations, or assume that setting a limit guarantees performance. In Docker, --memory sets a hard limit: the container cannot exceed that amount of RAM. If it does, the kernel kills it (or throttles it with swap). --memory-reservation is a soft limit: Docker tries to keep the container under this value, but it can burst above if the host has free memory. Without a reservation, a container may get starved when the host is under pressure.

Swap is another misunderstood knob. By default, containers can use swap up to twice the memory limit. That means a container with --memory=512m can effectively use 1 GB of combined RAM and swap. This can mask memory leaks and degrade performance, because swapping is slow. Setting --memory-swap to the same value as --memory disables swap, forcing the container to stay within RAM. But this also means the container may be OOM-killed sooner, which is often preferable to thrashing.

CPU limits are similarly nuanced. --cpus limits the number of CPU cores a container can use. But if you set --cpus=0.5 on a multi-core host, the container may still use 100% of one core for half the time, then be throttled. The scheduler is fair, but bursty workloads can still cause contention. --cpu-shares (relative weight) is even softer: it only matters when cores are contended. A container with shares of 1024 gets twice the CPU of one with 512, but if no other containers are competing, both can use all cores.

Common Misconception: 'I Set a Limit, So It's Safe'

Setting a memory limit does not prevent the container from being killed. It only defines the threshold. If the application inside the container has a memory leak, it will eventually hit the limit and be OOM-killed. The limit just makes the failure predictable and contained. Without a limit, the leak can bring down the entire host. The real safety comes from combining limits with monitoring and restart policies, plus fixing the leak.

Another misconception is that CPU limits are always beneficial. For CPU-bound workloads, a limit can improve fairness. But for I/O-bound or idle services, limits add unnecessary overhead. The Docker CPU scheduler uses Completely Fair Scheduler (CFS) quotas, which can cause latency spikes if the quota period is too short. For many web servers, it's better to leave CPU unlimited and rely on memory limits and process-level cgroups.

Patterns That Usually Work: Setting Limits Without Breaking Apps

The safest starting point is to set memory limits based on the application's peak observed usage. Run the container without limits in a staging environment, monitor its memory consumption over a week, then set --memory to 1.5x the observed peak. Add --memory-reservation at 0.75x the peak to ensure the container gets baseline memory under pressure. For CPU, start with --cpus=1 for most services, and adjust based on profiling.

For databases, memory limits are tricky. Databases often use large caches. Setting a memory limit that is too low can cause performance degradation or crashes. A better approach is to use the database's own memory configuration (e.g., innodb_buffer_pool_size for MySQL) and set the Docker memory limit to 1.2x that value. Leave CPU unlimited for databases, because query processing is bursty and latency-sensitive.

Another pattern that works is using docker run --rm for ephemeral containers. This automatically removes the container when it exits, preventing accumulation of stopped containers. Combine with --restart on-failure for services that should restart after crashes, but avoid --restart always unless you have a cleanup strategy.

Checklist for a Healthy 'docker run' Workflow

Always set --memory and --memory-swap (equal to disable swap) for production workloads.
Use --memory-reservation to reserve baseline memory.
Set --cpus for CPU-bound services; leave unlimited for I/O-bound ones.
Add --rm for temporary containers.
Run docker system prune -af --volumes weekly in CI or cron.
Monitor container resource usage with docker stats or a tool like cAdvisor.

Anti-Patterns and Why Teams Revert

One common anti-pattern is setting memory limits too low, causing frequent OOM kills. A team might set --memory=256m on a Java application that needs 512 MB, then wonder why it keeps crashing. They revert to no limits, thinking limits are broken. The fix is to profile first, then set limits with a buffer. Another anti-pattern is using --cpu-shares instead of --cpus. Shares only matter under contention, so a container with high shares can still be starved if the host is overloaded. Teams often find that --cpus gives more predictable behavior.

Another mistake is forgetting to set --memory-swap. If you set --memory=512m but leave swap default, the container can use up to 1 GB of swap. This hides memory issues and causes performance degradation. Teams see slow response times and blame the network, when the real culprit is swapping. They then disable limits entirely, hoping to fix the slowness. Instead, they should set --memory-swap=512m to force the container to stay in RAM.

Container sprawl is also driven by the habit of running docker run for one-off tasks without cleanup. Developers might run a container to test a script, then forget about it. Over weeks, the host fills with stopped containers and dangling images. Teams revert to manual cleanup, which is error-prone. The solution is to enforce a policy: use --rm for one-off tasks, and schedule automated pruning.

Why Teams Abandon Limits Altogether

In some cases, teams abandon resource limits because they cause more problems than they solve. For example, a web server behind a load balancer might get OOM-killed under a traffic spike, even though the spike is temporary. The limit prevents the server from handling the load, causing a cascade of failures. The team removes the limit and relies on horizontal scaling instead. This is valid for stateless services, but for stateful ones like databases, limits are still necessary. The key is to match the limit to the workload's actual needs, not a guess.

Maintenance, Drift, and Long-Term Costs

Over time, Docker environments drift. Images that were pulled months ago remain on disk. Containers that were started with specific flags get forgotten. The docker run commands used to create them are lost in terminal history. When a developer leaves the team, no one knows which containers are critical and which are leftovers. This drift leads to configuration inconsistencies: some containers have limits, others don't. New team members copy commands from old wikis that omit flags, perpetuating the sprawl.

The long-term cost is operational debt. A host with 50 containers, each with different restart policies and no limits, is a ticking time bomb. When a memory leak occurs, it's hard to isolate because all containers compete for resources. The team spends hours debugging, only to find that a forgotten container is eating RAM. The fix is to document all docker run commands in a Docker Compose file or a script, and to enforce resource limits as part of the deployment pipeline.

Another cost is storage. Docker images and layers can consume tens of gigabytes. Without regular pruning, disk usage grows until the host runs out of space. This triggers alerts, but by then, the team must manually decide what to delete. Automated pruning with docker system prune -af can remove unused images, containers, and networks, but it also removes cached layers, slowing down future pulls. A better strategy is to use a retention policy: keep images used in the last 30 days, prune everything older.

How to Detect Drift Early

Run docker stats --no-stream periodically to check resource usage. Set up alerts when memory or CPU usage exceeds 80% of the limit. Use docker system df to monitor disk usage. If you see more than 10 GB of dangling images, it's time to prune. For teams with many hosts, use a tool like Portainer or a simple cron job that runs docker system prune -af --filter until=72h to keep only recent artifacts.

When Not to Use This Approach

Resource limits via docker run are not suitable for every scenario. If you are running containers on a Kubernetes cluster, the node-level cgroups and pod limits handle resource management. Setting limits inside the container with docker run is redundant and can conflict with Kubernetes limits. In that case, rely on the orchestrator's resource requests and limits.

Another scenario is when you need guaranteed performance for latency-sensitive applications. CPU limits with CFS quotas can cause throttling, leading to tail latency spikes. For such workloads, it's better to run the container without CPU limits and use CPU pinning (--cpuset-cpus) to dedicate cores. Memory limits should still be set, but with a generous buffer.

For development workstations, strict limits can be counterproductive. Developers often run multiple containers for testing, and limits can cause unexpected failures that waste time. A better approach is to set soft limits via --memory-reservation and no hard limit, or to use Docker Desktop's built-in resource settings. The goal is to prevent a single container from starving the host, not to enforce production-like constraints.

Finally, if you are running containers on a host with plenty of spare resources (e.g., a dedicated server with 64 GB RAM for a single application), you may not need limits at all. But this is rare in practice, and it's still good practice to set limits as a safety net. The decision should be based on the risk tolerance of the team.

Open Questions and FAQ

How do I choose the right memory limit for a container?

Monitor the container's memory usage under realistic load for at least 24 hours. Use docker stats or a monitoring tool. Set the limit to 1.5x the observed peak. For Java applications, account for the JVM heap and off-heap memory. For databases, include the buffer pool size.

Should I use `--memory-swap` or not?

If you want predictable performance and early detection of memory leaks, set --memory-swap equal to --memory to disable swap. If you prefer to allow some swapping as a safety net (at the cost of performance), leave the default (twice the memory limit). For most production services, disable swap.

What about `--oom-kill-disable`?

This flag prevents the container from being killed by the OOM killer. Use it only if you have a custom OOM handling mechanism inside the container, or if the container is critical and you prefer it to hang rather than die. In most cases, it's safer to let the OOM killer act, because a hanging container can cause cascading failures.

How often should I prune Docker artifacts?

Run docker system prune -af --volumes weekly on development machines and daily on CI servers. For production hosts, prune only dangling images (not volumes) to avoid data loss. Use filters like --filter until=72h to keep recent images.

Is Docker Compose better than `docker run` for resource limits?

Yes. Docker Compose allows you to define resource limits in a YAML file, making them repeatable and documented. It also handles cleanup better. For any multi-container setup, prefer Compose over individual docker run commands.

What if my container needs more memory temporarily?

Use --memory-reservation to set a soft limit, and --memory as a hard limit. The container can burst above the reservation if the host has free memory, but it will be throttled or killed if it hits the hard limit. This gives you a safety margin without wasting resources.

Can I set limits on running containers?

Yes. Use docker update --memory 512m --cpus 0.5 to change limits on a running container. The container does not need to be restarted. This is useful for adjusting limits based on real-time monitoring.

What about GPU limits?

Docker supports GPU access via --gpus, but there is no built-in GPU memory limit. Use NVIDIA's MPS or MIG features to partition GPUs. For most ML workloads, limit the number of GPUs and rely on the application to manage memory.

To get started today, audit your current Docker host with docker stats and docker system df. Identify the top three containers by memory usage and set limits on them. Schedule a weekly prune. Convert your most-used docker run commands into a Docker Compose file with resource limits. These three actions will cut your container sprawl and make your environment more predictable.

The Hidden Cost of 'docker run': Untangling Resource Limits and Container Sprawl

Table of Contents

Where the Sprawl Begins: The 'docker run' Reflex

The Real Price of Unlimited Resources

Foundations Readers Confuse: Limits vs. Reservations vs. Swappiness

Common Misconception: 'I Set a Limit, So It's Safe'

Patterns That Usually Work: Setting Limits Without Breaking Apps

Checklist for a Healthy 'docker run' Workflow

Anti-Patterns and Why Teams Revert

Why Teams Abandon Limits Altogether

Maintenance, Drift, and Long-Term Costs

How to Detect Drift Early

When Not to Use This Approach

Open Questions and FAQ

How do I choose the right memory limit for a container?

Should I use `--memory-swap` or not?

What about `--oom-kill-disable`?

How often should I prune Docker artifacts?

Is Docker Compose better than `docker run` for resource limits?

What if my container needs more memory temporarily?

Can I set limits on running containers?

What about GPU limits?

Comments (0)

Table of Contents

Where the Sprawl Begins: The 'docker run' Reflex

The Real Price of Unlimited Resources

Foundations Readers Confuse: Limits vs. Reservations vs. Swappiness

Common Misconception: 'I Set a Limit, So It's Safe'

Patterns That Usually Work: Setting Limits Without Breaking Apps

Checklist for a Healthy 'docker run' Workflow

Anti-Patterns and Why Teams Revert

Why Teams Abandon Limits Altogether

Maintenance, Drift, and Long-Term Costs

How to Detect Drift Early

When Not to Use This Approach

Open Questions and FAQ

How do I choose the right memory limit for a container?

Should I use --memory-swap or not?

What about --oom-kill-disable?

How often should I prune Docker artifacts?

Is Docker Compose better than docker run for resource limits?

What if my container needs more memory temporarily?

Can I set limits on running containers?

What about GPU limits?

Share this article:

Comments (0)

Should I use `--memory-swap` or not?

What about `--oom-kill-disable`?

Is Docker Compose better than `docker run` for resource limits?