
Running Claude Locally on Your Mac Using Docker

The Day My Laptop Said: "Main Yahan Hoon"

Let me tell you a story. A few weeks ago, a student from my Docker workshop messaged me on LinkedIn. He had just got a brand new MacBook Pro with the M4 chip — the one with the fancy Apple Silicon that everyone is raving about. He wanted to run an AI model locally. Not on some server, not on AWS, not on any cloud. Locally. On his own machine.


His exact words were: "Bhai, mujhe AI apne laptop mein chahiye. Koi subscription nahi, koi data leak nahi, bas mera laptop aur meri AI."

(Translation: "Bro, I want AI on my laptop. No subscription, no data leaks, just my laptop and my AI.")

Honestly? Same, bhai. Same.

That conversation is what inspired this article. If you have ever wondered how to run a Claude-compatible AI model locally — fully isolated, inside Docker, without touching your production environment — this guide is for you. We are going to do it step by step, command by command, and I am going to explain why each step matters.

What Are We Actually Building?

Before we dive in, let me paint a clear picture. By the end of this guide, you will have:

  • Docker Model Runner running on your Mac — Docker's built-in way to run AI models locally

  • A Claude-compatible open-source model downloaded and ready (like Qwen3-Coder or GPT-OSS)

  • Claude Code installed on your Mac, connected to your local model

  • A fully isolated sandbox — if something breaks or you want to stop, one command brings your Mac back to normal

Think of it like this: you are setting up a parallel universe inside your laptop. In that universe, an AI model runs freely. In your normal universe — your projects, your work, your production code — nothing changes. The two never mix unless you want them to.

Important: Anthropic's Claude (the one on claude.ai) is proprietary and cannot be downloaded or run locally. What we are running here is Claude Code — Anthropic's coding agent — connected to open-source models that behave similarly. The experience is remarkably close.

Before You Begin — What You Need

  • A MacBook with Apple Silicon (M1, M2, M3, or M4 chip)

  • At least 16GB RAM — 32GB is better for larger models

  • Docker Desktop installed and running (download from docker.com)

  • A decent internet connection for the initial model download (models are 10–17GB)

  • Basic familiarity with Terminal — nothing advanced, just knowing how to open it

That is it. No Anthropic API key, no AWS account, no cloud subscription needed.
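Want a quick pre-flight check before you start? Here is a small sketch that confirms the basics from the checklist above (the sysctl key is macOS-specific, so it is skipped quietly elsewhere):

```shell
# Pre-flight sketch. On Apple Silicon, uname -m prints "arm64".
uname -m
# Total RAM in bytes (macOS-specific sysctl key)
sysctl -n hw.memsize 2>/dev/null || true
# Docker CLI must be on your PATH
docker --version 2>/dev/null || echo "Docker not found - install Docker Desktop first"
```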

Step 1 — Enable Docker Model Runner

Open your Terminal and run:

docker desktop enable model-runner --tcp

This one command activates Docker's built-in model runner with TCP support. Your local AI model will be accessible via a standard HTTP API — just like calling any REST API. You can even query it with curl later.

Think of it like switching on a power socket in your wall. Nothing runs yet, but the infrastructure is ready.
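To confirm the socket is actually live, a quick curl works. This is a sketch that assumes the default Model Runner TCP port (12434) and the same OpenAI-compatible endpoint path we will point Claude Code at in Step 4:

```shell
# Sanity check: list the models the runner knows about.
curl -s http://localhost:12434/engines/llama.cpp/v1/models \
  || echo "Model Runner not reachable - is Docker Desktop running?"
```

An empty list is fine at this stage; we have not pulled any models yet.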

Pro tip: Make sure Docker Desktop is updated to the latest version before running this. Older versions may not have the model-runner feature. Go to Docker Desktop → Check for Updates.

Step 2 — Install Claude Code

Run this in your Terminal:

curl -fsSL https://claude.ai/install.sh | bash

When it finishes, you will see:

✔ Claude Code successfully installed!
Version: 2.1.74
Location: ~/.local/bin/claude

Now fix your PATH so the claude command works anywhere:

echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.zshrc && source ~/.zshrc

Think of this like adding a new contact to your phone — now you can dial Claude directly from any folder in Terminal.


Step 3 — Pull Your AI Model

Start with a small model to confirm everything works:

docker model pull ai/smollm2

SmolLM2 is only 270MB and downloads in seconds. Once that succeeds, go for the main model:

docker model pull ai/qwen3-coder

Qwen3-Coder is about 17.67GB — one of the best coding-focused open-source models available right now. On an M4 Mac, it runs fast thanks to Apple's unified memory architecture.

Go make chai. Come back. It will be ready.

If your download fails midway (you will see an "unexpected EOF" error), do not panic. This is just a network interruption. Run the same docker model pull command again; Docker resumes from where it left off rather than restarting from scratch.

To check your downloaded models:

docker model list
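Since the runner speaks a standard HTTP API, you can also talk to a pulled model with nothing but curl. A hedged sketch, assuming ai/smollm2 is already downloaded and the runner is on its default port (12434):

```shell
# Ask the small test model a question over plain HTTP.
curl -s http://localhost:12434/engines/llama.cpp/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "ai/smollm2",
        "messages": [{"role": "user", "content": "Say hello in one word."}]
      }' || echo "Request failed - is the model runner enabled?"
```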

Step 4 — Launch Claude Code With Your Local Model

This is the moment we have been building towards:

ANTHROPIC_BASE_URL=http://localhost:12434/engines/llama.cpp/v1 \
ANTHROPIC_API_KEY=fake \
claude --model ai/qwen3-coder:latest

Here is what each line does:

  • ANTHROPIC_BASE_URL points Claude Code to your local Docker model runner instead of Anthropic's servers

  • ANTHROPIC_API_KEY=fake is a placeholder — running locally, no real API key is needed

  • --model ai/qwen3-coder:latest tells Claude Code which model to use

If everything is working, the Claude Code welcome screen will appear in your terminal — complete with ASCII art and a theme selection menu. Your local AI sandbox is now live.
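Typing those environment variables every session gets old fast. Here is a small convenience sketch for your ~/.zshrc; the function name claude-local is my own invention, not an official command:

```shell
# Wrapper so you can launch your local setup with one word.
# "claude-local" is a made-up name; rename it to taste.
claude-local() {
  ANTHROPIC_BASE_URL=http://localhost:12434/engines/llama.cpp/v1 \
  ANTHROPIC_API_KEY=fake \
  claude --model "${1:-ai/qwen3-coder:latest}"
}
```

After a source ~/.zshrc, claude-local launches Qwen3-Coder by default, and claude-local ai/smollm2 picks the small model instead.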


The Sandbox Magic — Why Docker Makes This Beautiful

Everything we set up lives inside Docker's isolated environment. Your Mac filesystem, your projects, your existing tools — none of that is touched. It is like having a guest room in your house. Your guest can do whatever they want in that room. When they leave, the room goes back to normal.

To stop the entire AI setup and free up memory:

docker desktop disable model-runner

To remove a model you no longer need:

docker model rm ai/smollm2

To see all running Docker processes:

docker ps

This is the beauty of containerization. You are in control. The AI is your tool, not the other way around.

Which Model Should You Use?

| Model | Size | Best For |
| --- | --- | --- |
| ai/smollm2 | 270 MB | Testing your setup, quick experiments, learning |
| ai/qwen3-coder | 17.67 GB | Coding tasks, debugging, writing scripts |
| gpt-oss | 17.45 GB | General conversation, writing, mixed tasks |

Common Mistakes and How to Avoid Them

"The model download failed halfway" Run the pull command again. Docker resumes downloads rather than restarting. Be patient with 17GB files.

"The claude command is not found" You need to add ~/.local/bin to your PATH. Run the export command from Step 2 again, then run source ~/.zshrc.

"It is asking me for an API key" Set ANTHROPIC_API_KEY=fake as shown in Step 4. Your local model does not need a real key.

"The model is running but it is very slow" Close Chrome tabs, Slack, Figma, and other memory-heavy apps. AI models need RAM. On 16GB machines this matters a lot. On M4 with 32GB, this is rarely an issue.

Real-World Use Cases for This Setup

  • Code review: Ask the model to review your functions without sending code to external servers

  • Learning: Ask it to explain Docker concepts or Kubernetes architecture in simple English or Hindi

  • Offline work: On a train to Delhi with no internet? Your local AI still works

  • Private projects: Clients with NDAs? Run AI assistance locally — nothing leaves your machine

  • Experimentation: Try different open-source models with zero billing surprises

Wrapping Up — AI on Your Terms

We started with a student who wanted AI on his laptop — no subscriptions, no data going to the cloud. I think we delivered exactly that.

In five steps, you have gone from a fresh Mac to a fully functional local AI sandbox powered by Docker. The setup is isolated, reversible, and surprisingly capable — especially on Apple Silicon where the unified memory makes running large models genuinely fast.

The key insight: Docker is not just for deploying web apps. It is a sandboxing tool — and for AI experimentation, that sandboxing is exactly what you want. Your production environment stays clean. Your experiments stay contained. And when you are done, you shut it all down with one command.

If you try this and run into any issues, drop a comment below. I am happy to help debug.

And to that student who messaged me: ab tera laptop bhi bol sakta hai, "Main yahan hoon."

(Your laptop can now say: "I am here.")