Running AI models on your own laptop or PC is now easy with Foundry Local, a small tool you install and use from the command line. It lets you download, run, and manage modern AI models locally so you can experiment, build apps, or work offline.
What is Foundry Local?
Foundry Local is a preview tool from Microsoft that runs AI models directly on your device instead of in the cloud. It includes a command-line interface (CLI) called foundry that handles model download, hardware detection, and local serving for you.
You can start a model with one command, chat with it in the terminal, or connect it to your own applications using local APIs and SDKs.
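Foundry Local serves an OpenAI-compatible REST API on localhost, so any HTTP client can talk to a running model. As a minimal sketch of assembling such a request in Python — note the base URL below is an illustrative assumption, since the service picks its port at startup and reports the actual endpoint:

```python
import json
import urllib.request

# The local service speaks an OpenAI-compatible REST API. The base URL here is
# an illustrative assumption: Foundry Local chooses its port when the service
# starts, so check the service output for the real endpoint on your machine.
BASE_URL = "http://localhost:5273/v1"

def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a chat-completions request for the local OpenAI-compatible API."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Once a model is running, urllib.request.urlopen(req) would send this.
req = build_chat_request("qwen2.5-0.5b", "Why is the sky blue?")
print(req.full_url)
```

Because the API follows the OpenAI wire format, the official OpenAI SDKs can also be pointed at the local endpoint instead of hand-building requests like this.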
Image source: Microsoft Learn
Basic requirements
Your machine needs to meet some minimum specs:
Operating system: Windows 10, Windows 11, Windows Server 2025, or macOS.
Memory: At least 8 GB RAM (16 GB recommended).
Disk: At least 3 GB free (15 GB recommended).
Internet: Needed once to download models; after that you can run offline.
Optional accelerators: GPU or NPU (NVIDIA, AMD, Intel, Qualcomm, or Apple silicon) for faster performance.
On Windows, NPU acceleration requires Windows 11 24H2 or later, and some NPUs (such as Intel's) need specific drivers installed.
Quick setup with the CLI
You can install and run your first model in just a few steps.
1. Install Foundry Local
Windows
winget install Microsoft.FoundryLocal
macOS
brew tap microsoft/foundrylocal
brew install foundrylocal
You can also download installers or packages from the official Foundry Local GitHub repository if you prefer.
2. Run your first model
After installation, open a terminal and run:
foundry model run qwen2.5-0.5b
Foundry Local will:
Download the model (first time only).
Pick the best variant for your hardware (GPU, NPU, or CPU).
Start an interactive prompt in the terminal.
You can then type a question like:
Why is the sky blue?
The model replies directly in the terminal window.
Using other models
You are not limited to a single model. List the catalog and run any entry by name:
foundry model list
foundry model run <model-name>
Foundry Local automatically downloads the right build for your hardware: CUDA for NVIDIA GPUs, NPU builds for Qualcomm/other NPUs, or CPU if no accelerator is found.
You can also run OpenAI’s open-weight GPT-OSS-20B model like this (requires an NVIDIA GPU with at least 16 GB of VRAM and Foundry Local 0.6.87 or later):
foundry model run gpt-oss-20b
Starter projects for real use
To move beyond the terminal, Microsoft provides starter projects that show real scenarios using Foundry Local:
Chat app with multiple models.
Text summarization tool for files or pasted text.
Function calling example using a Phi-4 mini model.
Each starter includes setup steps, full source code, and configuration so you can learn by running and modifying working solutions.
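The function calling starter uses the same OpenAI-style request shape, with a tools array describing the functions the model is allowed to call. A hedged sketch of such a request body — the get_weather tool and the phi-4-mini alias are illustrative assumptions here, not taken from the starter:

```python
import json

# Sketch of an OpenAI-style function-calling request body, as a starter
# project might send it to the local endpoint. The tool schema and the
# model alias are illustrative assumptions.
tool = {
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool for illustration
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

request_body = {
    "model": "phi-4-mini",  # alias is an assumption; check `foundry model list`
    "messages": [{"role": "user", "content": "What's the weather in Oslo?"}],
    "tools": [tool],
}

print(json.dumps(request_body, indent=2))
```

If the model decides a tool is needed, its reply contains the function name and arguments instead of plain text, and your application performs the actual call.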
Foundry Local architecture diagram. Image source: Microsoft Learn
Helpful CLI commands
The foundry CLI is grouped into a few main areas:
Model (manage and run models), e.g. foundry model list and foundry model run <model-name>.
Service (control the local background service), e.g. foundry service restart.
Cache (manage downloaded models on disk); run foundry cache --help for the available subcommands.
To see all commands at once:
foundry --help
Update, remove, and fix issues
Upgrade Foundry Local
Windows
winget upgrade --id Microsoft.FoundryLocal
macOS
brew upgrade foundrylocal
Uninstall
Windows
winget uninstall Microsoft.FoundryLocal
macOS
brew rm foundrylocal
brew untap microsoft/foundrylocal
brew cleanup --scrub
Fix service connection errors
If you see an error like “Request to local service failed” or a 127.0.0.1:0 message when running foundry model list, restart the local service:
foundry service restart
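To double-check that the service actually came back up, you can probe the endpoint with plain Python. This is a generic HTTP reachability check, not a Foundry Local API — pass in whatever endpoint your service reports:

```python
import urllib.request
import urllib.error

def service_reachable(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if something answers HTTP at base_url, False otherwise."""
    try:
        urllib.request.urlopen(base_url, timeout=timeout)
        return True
    except urllib.error.HTTPError:
        return True  # the server responded, just with an error status code
    except (urllib.error.URLError, OSError):
        return False  # connection refused, timeout, or nothing bound there

# Example: probing a port where nothing is listening returns False.
print(service_reachable("http://127.0.0.1:9", timeout=0.5))
```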
Restarting usually fixes port or binding issues without a full reinstall.

Install Foundry Local, run one command to start a model, and you immediately have powerful AI running on your own machine. No cloud account or GPU is strictly required, but both help a lot.