September 12, 2025 – Hugging Face today announced the availability of its Inference Providers within GitHub Copilot Chat for Visual Studio Code, giving developers direct access to state-of-the-art open large language models (LLMs) inside their coding environment.
With this integration, developers can now select and run advanced open LLMs—including Kimi K2, DeepSeek V3.1, and GLM 4.5—seamlessly in VS Code through the Hugging Face provider.
*Demo: selecting and running open models through the Hugging Face provider in Copilot Chat.*
## Key Highlights
- **Broad model access:** Use leading open-source LLMs with tool-calling capabilities.
- **Multi-provider flexibility:** A single API enables switching across providers such as Groq, Cerebras, Together AI, and SambaNova (see the sketch after this list).
- **Performance and reliability:** Optimized for high availability and low-latency inference.
- **Transparent pricing:** Users are charged exactly what the selected provider charges, with no hidden costs.
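As a rough illustration of that single API (outside of VS Code), the sketch below calls one of the featured models through the `huggingface_hub` Python client. The `provider="auto"` setting and the Kimi K2 model ID are assumptions for the example; any supported provider and model pair can be substituted.

```python
from huggingface_hub import InferenceClient

# One client, one API: the `provider` argument selects the backend
# (e.g. "groq", "cerebras", "together", "sambanova"), while "auto"
# lets Hugging Face route the request to an available provider.
client = InferenceClient(provider="auto", api_key="hf_...")  # your HF token

completion = client.chat.completions.create(
    model="moonshotai/Kimi-K2-Instruct",  # illustrative model ID
    messages=[{"role": "user", "content": "Summarize what a mutex does."}],
)
print(completion.choices[0].message.content)
```

Switching providers is then a one-line change to the `provider` argument; the rest of the call stays the same.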
## Quick Start
To enable Hugging Face within Copilot Chat:
1. Install the HF Copilot Chat extension in Visual Studio Code (version 1.104.0 or later).
2. Open the chat interface and navigate to the model picker, then select **Manage Models…**.
3. Choose **Hugging Face** as the provider.
4. Enter your Hugging Face API token, available in your account settings (a quick way to verify the token is sketched after these steps).
5. Add the desired models to the picker.
If the Hugging Face option does not appear, update Visual Studio Code to the latest version and reload the extension.
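As an optional check before entering the token in VS Code, the short sketch below verifies that it authenticates correctly. It assumes the token is exported in an environment variable named `HF_TOKEN`; that name is a common convention, not something the extension requires.

```python
import os

from huggingface_hub import whoami

# Assumes the token from your Hugging Face settings page is exported
# as HF_TOKEN; the variable name itself is just a convention.
info = whoami(token=os.environ["HF_TOKEN"])
print(f"Token OK, authenticated as: {info['name']}")
```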
## Availability
The Hugging Face integration is available immediately. All users on the free tier receive a limited number of monthly inference credits to experiment with. Paid tiers—Pro, Team, and Enterprise—provide $2 in monthly credits and expanded, pay-as-you-go access across all providers.
A full demonstration of the integration is available via the official workflow video.