AI  

Getting Started with Foundry Local: Run AI Models on Your Own Device

Running AI models on your own laptop or PC is now easy with Foundry Local, a small tool you install and use from the command line. It lets you download, run, and manage modern AI models locally so you can experiment, build apps, or work offline.​

What is Foundry Local?

Foundry Local is a preview tool from Microsoft that runs AI models directly on your device instead of in the cloud. It includes a command-line interface (CLI) called foundry that handles model download, hardware detection, and local serving for you.​

You can start a model with one command, chat with it in the terminal, or connect it to your own applications using local APIs and SDKs.​

ska-rmavbild-2025-05-22-kl-143543

Image source: Microsoft Learn

Basic requirements

Your machine needs to meet some minimum specs:​

  • Operating system: Windows 10, Windows 11, Windows Server 2025, or macOS.

  • Memory: At least 8 GB RAM (16 GB recommended).

  • Disk: At least 3 GB free (15 GB recommended).

  • Internet: Needed once to download models; after that you can run offline.

  • Optional accelerators: GPU or NPU (NVIDIA, AMD, Intel, Qualcomm, or Apple silicon) for faster performance.​

On Windows, new NPUs require Windows 24H2 or later, and some NPUs (like Intel) need specific drivers installed.

Quick setup with the CLI

You can install and run your first model in just a few steps.​

1. Install Foundry Local

Windows

  
    winget install Microsoft.FoundryLocal
  

macOS

  
    brew tap microsoft/foundrylocal
  
  
    brew install foundrylocal
  

You can also download installers or packages from the official Foundry Local GitHub repository if you prefer.​

2. Run your first model

After installation, open a terminal and run:​

  
    foundry model run qwen2.5-0.5b
  

Foundry Local will

  • Download the model (first time only).

  • Pick the best variant for your hardware (GPU, NPU, or CPU).

  • Start an interactive prompt in the terminal.

You can then type a question like:

Why is the sky blue?

The model replies directly in the terminal window.​

Using other models

You are not limited to a single model.​

  • To see all available models:

foundry model list
  • To run a different one, replace the name:

foundry model run <model-name>
  • Foundry Local automatically downloads the right build for your hardware: CUDA for NVIDIA GPUs, NPU builds for Qualcomm/other NPUs, or CPU if no accelerator is found.​

  • You can also run OpenAI’s latest open-source model GPT-OSS-20B like this (requires a strong NVIDIA GPU with at least 16 GB VRAM and Foundry Local 0.6.87+):​

foundry model run gpt-oss-20b

Starter projects for real use

To move beyond the terminal, Microsoft provides starter projects that show real scenarios using Foundry Local:​

  • Chat app with multiple models.

  • Text summarization tool for files or pasted text.

  • Function calling example using a Phi-4 mini model.​

Each starter includes setup steps, full source code, and configuration so you can learn by running and modifying working solutions.​

foundry-local-arch

image source: Microsoft Learn

Helpful CLI commands

The foundry CLI is grouped into a few main areas:​

Model (manage and run models):

  • See help: foundry model --help

Service (local background service):

  • See help: foundry service --help

Cache (manage downloaded models on disk):

  • See help: foundry cache --help

To see all commands at once:

foundry --help

Update, remove, and fix issues

Upgrade Foundry Local

Windows

winget upgrade --id Microsoft.FoundryLocal

macOS

brew upgrade foundrylocal

Uninstall

Windows

winget uninstall Microsoft.FoundryLocal

macOS

brew rm foundrylocal
brew untap microsoft/foundrylocal
brew cleanup --scrub

Fix service connection errors

If you see an error like “Request to local service failed” or a 127.0.0.1:0 message when running foundry model list, restart the local service:​

foundry service restart

This usually fixes port or binding issues without a full reinstall.​ Install Foundry Local, run one command to start a model, and you immediately have powerful AI running on your own machine no cloud account or GPU strictly required, but they help a lot.