Microsoft has introduced Foundry Local on Android, a new way for developers to run AI models directly on mobile devices with no cloud calls, no network latency, and no dependency on internet connectivity. The gated preview is now open at https://aka.ms/foundrylocal-androidprp.
Smartphones are now powerful enough to run optimized AI models locally, and Microsoft’s new offering brings that capability to Android with a streamlined developer experience. With Foundry Local, developers can deploy open-source models from Microsoft Foundry straight to mobile devices, unlocking gains in speed, privacy, and cost efficiency.
Why on-device AI matters
• Extra privacy for healthcare, finance, and other sensitive workflows
• Lower cloud costs by reducing server round trips
• Reliable offline performance in low-connectivity conditions
• Faster experiences with no network delays
Foundry Local on Android has already been tested with early customers. PhonePe has integrated the SDK into its app, which serves more than 618 million users. This integration powers an upcoming AI-driven feature inside its digital payments platform.
New Speech-to-Text API powered by Whisper
Microsoft is adding on-device speech capability through a new Speech API. Apps can transcribe audio locally with low latency, and by default no audio ever leaves the device. This enables voice-driven workflows such as offline note capture, form filling, and field service operations.
Developers can acquire Whisper models through the Foundry Catalog, download and load them with a few lines of C#, and begin streaming transcription immediately. Documentation is available at https://aka.ms/foundrylocal-audiodocs.
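The documented path uses the C# SDK directly, but the same capability should also be reachable through the SDK's optional OpenAI-compliant local web server. The sketch below is a minimal Python client that encodes an audio file as `multipart/form-data` and posts it to an OpenAI-style `/v1/audio/transcriptions` route; the base URL, port, route, and model alias are all assumptions to replace with whatever your local instance actually reports, and are not confirmed by the docs linked above:

```python
import io
import json
import mimetypes
import urllib.request
import uuid

# Placeholder values: substitute the endpoint and Whisper model alias
# your Foundry Local instance reports. These are assumptions.
BASE_URL = "http://localhost:5272/v1"
MODEL = "whisper-tiny"


def build_multipart(audio_bytes: bytes, filename: str, model: str):
    """Encode the model field and audio file as multipart/form-data,
    the request shape OpenAI-style transcription endpoints accept.
    Returns (body_bytes, content_type_header)."""
    boundary = uuid.uuid4().hex
    body = io.BytesIO()
    # Plain form field carrying the model alias.
    body.write(
        f'--{boundary}\r\nContent-Disposition: form-data; '
        f'name="model"\r\n\r\n{model}\r\n'.encode()
    )
    # File part carrying the raw audio bytes.
    ctype = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    body.write(
        f'--{boundary}\r\nContent-Disposition: form-data; '
        f'name="file"; filename="{filename}"\r\n'
        f'Content-Type: {ctype}\r\n\r\n'.encode()
    )
    body.write(audio_bytes)
    body.write(f"\r\n--{boundary}--\r\n".encode())
    return body.getvalue(), f"multipart/form-data; boundary={boundary}"


def transcribe(path: str) -> str:
    """Send a local audio file to the (assumed) transcription route
    and return the transcribed text. Audio never leaves the machine."""
    with open(path, "rb") as f:
        data, content_type = build_multipart(f.read(), path, MODEL)
    req = urllib.request.Request(
        f"{BASE_URL}/audio/transcriptions",
        data=data,
        headers={"Content-Type": content_type},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["text"]
```

A call such as `transcribe("note.wav")` would then return the transcript string, keeping the voice-driven workflows described above fully offline.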
Foundry Local SDK: Faster, lighter, and simpler
The new SDK delivers a cleaner development experience with:
• Self-contained packaging
• Smaller footprint
• Simple APIs for chat completions and audio transcription
• Optional OpenAI-compliant local web server
• Integration with Windows ML for automatic hardware detection
Developers can load models like Qwen in minutes, with the SDK automatically selecting the most performant model for the hardware. A full walkthrough is available at https://aka.ms/foundrylocalSDK.
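Because the local web server is OpenAI-compliant, any OpenAI-style client can drive a loaded model with no SDK-specific code. A minimal Python sketch of a chat completion against that server; the port and the Qwen model alias are placeholders (assumptions to replace with the values your setup reports):

```python
import json
import urllib.request

# Placeholder values: use the endpoint and model alias your local
# Foundry Local server actually exposes. These are assumptions.
BASE_URL = "http://localhost:5272/v1"
MODEL = "qwen2.5-0.5b"


def build_chat_request(prompt: str) -> dict:
    """Build an OpenAI-style chat completion payload."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }


def chat(prompt: str) -> str:
    """POST the payload to the local server and return the reply text."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Calling `chat("Summarize on-device AI in one sentence.")` would return the model's reply, with every token generated on the local hardware.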
Coming to edge and hybrid environments with Arc-enabled Kubernetes
Foundry Local is also expanding beyond mobile devices. Microsoft previewed an upcoming version that runs in containers on Arc-enabled Kubernetes, enabling on-device AI at industrial scale. This setup targets manufacturing, sovereign cloud, and disconnected environments, allowing customers to deploy models validated on local machines and run them on edge hardware through Azure Local.
Interested developers can join the preview list at https://aka.ms/FL-K8s-Preview-Signup.
What’s next
Microsoft plans to bring Foundry Local to General Availability, deepen Android support, and advance Windows AI Foundry. The roadmap includes Linux support, tool calling, multimodal features, and expanded on-prem server compatibility.
Partners, including NimbleEdge, PhonePe, Morgan Stanley, Dell, and AnythingLLM, have contributed to shaping the platform. Dell highlighted broader model access across its AI PC portfolio, and AnythingLLM emphasized the benefit of running fast models like DeepSeek and Qwen locally without building a custom LLM engine.
Developers can get started now:
• Try Foundry Local: https://aka.ms/foundrylocal