
Commit 6c20176

mchammer01, Copilot, lecoursen, hubwriter, and crwaters16 authored
BYOK & Local Model Support in Copilot CLI [GA] (#60430)
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Laura Coursen <lecoursen@github.com>
Co-authored-by: hubwriter <hubwriter@github.com>
Co-authored-by: Claire W <78226508+crwaters16@users.noreply.github.com>
1 parent 85bb724 commit 6c20176

9 files changed

Lines changed: 213 additions & 3 deletions


content/copilot/concepts/agents/copilot-cli/about-copilot-cli.md

Lines changed: 15 additions & 0 deletions
@@ -307,6 +307,21 @@ You can change the model used by {% data variables.copilot.copilot_cli %} by usi

Each time you submit a prompt to {% data variables.product.prodname_copilot_short %} in {% data variables.copilot.copilot_cli_short %}'s interactive interface, and each time you use {% data variables.copilot.copilot_cli_short %} programmatically, your monthly quota of {% data variables.product.prodname_copilot_short %} premium requests is reduced by one, multiplied by the multiplier shown in parentheses in the model list. For example, `Claude Sonnet 4.5 (1x)` indicates that with this model each time you submit a prompt your quota of premium requests is reduced by one. For information about premium requests, see [AUTOTITLE](/copilot/concepts/billing/copilot-requests).

### Using your own model provider

You can configure {% data variables.copilot.copilot_cli_short %} to use your own model provider instead of {% data variables.product.github %}-hosted models. This lets you connect to an OpenAI-compatible endpoint, Azure OpenAI, or Anthropic, including locally running models such as Ollama. You configure your model provider using environment variables.

| Environment variable | Description |
|---|---|
| `COPILOT_PROVIDER_BASE_URL` | The base URL of your model provider's API endpoint. |
| `COPILOT_PROVIDER_TYPE` | The provider type: `openai` (default), `azure`, or `anthropic`. The `openai` type works with any OpenAI-compatible endpoint, including Ollama and vLLM. |
| `COPILOT_PROVIDER_API_KEY` | Your API key for authenticating with the provider. Not required for providers that don't use authentication, such as a local Ollama instance. |
| `COPILOT_MODEL` | The model to use (required when using a custom provider). You can also set this with the `--model` command-line option. |

Models used with {% data variables.copilot.copilot_cli_short %} must support **tool calling** (function calling) and **streaming**. If the model does not support these capabilities, {% data variables.copilot.copilot_cli_short %} returns an error. For best results, the model should have a context window of at least 128k tokens.

For details on how to configure your model provider, run `copilot help providers` in your terminal.
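As a quick sketch, the following configures the CLI to use a locally running Ollama instance (this assumes Ollama's default port and a hypothetical model named `llama3.2` that you have already pulled):

```shell
# Point Copilot CLI at a local Ollama server instead of GitHub-hosted models
export COPILOT_PROVIDER_BASE_URL=http://localhost:11434
# COPILOT_PROVIDER_TYPE is omitted: it defaults to openai, which Ollama's API is compatible with
export COPILOT_MODEL=llama3.2  # hypothetical; use a model you have pulled
copilot
```

No `COPILOT_PROVIDER_API_KEY` is set here because a local Ollama instance does not use authentication.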
## Use {% data variables.copilot.copilot_cli_short %} via ACP

ACP (the Agent Client Protocol) is an open standard for interacting with AI agents. It allows you to use {% data variables.copilot.copilot_cli_short %} as an agent in any third-party tools, IDEs, or automation systems that support this protocol.

content/copilot/how-tos/copilot-cli/administer-copilot-cli-for-your-enterprise.md

Lines changed: 1 addition & 0 deletions
@@ -55,6 +55,7 @@ All other controls do **not** affect {% data variables.copilot.copilot_cli_short

* **Model Context Protocol (MCP) server policies**: Enterprise policies that control whether MCP servers can be used, or which MCP registry servers are allowed
* **IDE-specific policies**: Policies configured for specific IDEs or editor extensions
* **Content exclusions**: File path-based content exclusions
* **User-configured model providers (BYOK)**: Users can configure {% data variables.copilot.copilot_cli_short %} to use their own model providers via environment variables. This is configured at the _user level_ and cannot be controlled by enterprise policies.

## Why can't my developers access {% data variables.copilot.copilot_cli_short %}?

content/copilot/how-tos/copilot-cli/cli-best-practices.md

Lines changed: 16 additions & 0 deletions
@@ -104,6 +104,22 @@ Use `/model` to choose from available models based on your task complexity:

You can switch models mid-session with `/model` as task complexity changes.

If your organization or enterprise has configured custom models using their own LLM provider API keys, those models also appear at the bottom of the `/model` list.

### Use your own model provider

You can configure {% data variables.copilot.copilot_cli_short %} to use your own model provider instead of {% data variables.product.github %}-hosted models. Run `copilot help providers` for full setup instructions.

**Key considerations:**

* Your model must support **tool calling** (function calling) and **streaming**. {% data variables.copilot.copilot_cli_short %} returns an error if either capability is missing.
* For best results, use a model with a context window of at least 128k tokens.
* Built-in sub-agents (`/review`, `/task`, explore, `/fleet`) automatically inherit your provider configuration.
* Premium request cost estimates are hidden when you use your own provider. Token usage (input, output, and cache counts) is still displayed.
* `/delegate` only works if you are also signed in to {% data variables.product.github %}. It transfers the session to {% data variables.product.github %}'s server-side {% data variables.product.prodname_copilot_short %}, not to your provider.

See [Using your own model provider](/copilot/concepts/agents/copilot-cli/about-copilot-cli#using-your-own-model-provider).
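For a one-off session, you can combine the provider environment variables with the `--model` option instead of exporting `COPILOT_MODEL` (a sketch assuming a local Ollama instance and a hypothetical `llama3.2` model):

```shell
# Select the provider via an environment variable, and the model per invocation
export COPILOT_PROVIDER_BASE_URL=http://localhost:11434
copilot --model llama3.2
```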
## 2. Plan before you code

### Plan mode

content/copilot/how-tos/copilot-cli/customize-copilot/index.md

Lines changed: 1 addition & 0 deletions
@@ -12,6 +12,7 @@ children:
  - /create-skills
  - /add-mcp-servers
  - /create-custom-agents-for-cli
  - /use-byok-models
  - /plugins-finding-installing
  - /plugins-creating
  - /plugins-marketplace
Lines changed: 121 additions & 0 deletions
---
title: Using your own LLM models in GitHub Copilot CLI
shortTitle: Use your own model provider
intro: 'Use a model from an external provider of your choice in {% data variables.product.prodname_copilot_short %} by supplying your own API key.'
allowTitleToDifferFromFilename: true
versions:
  feature: copilot
contentType: how-tos
category:
  - Configure Copilot
  - Configure Copilot CLI
---

You can configure {% data variables.copilot.copilot_cli_short %} to use your own LLM provider, also called BYOK (Bring Your Own Key), instead of {% data variables.product.github %}-hosted models. This lets you connect to OpenAI-compatible endpoints, Azure OpenAI, or Anthropic, including locally running models such as Ollama.

## Prerequisites

* {% data variables.copilot.copilot_cli_short %} is installed. See [AUTOTITLE](/copilot/how-tos/copilot-cli/set-up-copilot-cli/install-copilot-cli).
* You have an API key from a supported LLM provider, or you have a local model running (such as Ollama).

## Supported providers

{% data variables.copilot.copilot_cli_short %} supports three provider types:

| Provider type | Compatible services |
|---|---|
| `openai` | OpenAI, Ollama, vLLM, Foundry Local, and any other OpenAI Chat Completions API-compatible endpoint. This is the default provider type. |
| `azure` | Azure OpenAI Service. |
| `anthropic` | Anthropic (Claude models). |

For additional examples, run `copilot help providers` in your terminal.

## Model requirements

Models must support **tool calling** (also called function calling) and **streaming**. If a model does not support either capability, {% data variables.copilot.copilot_cli_short %} returns an error. For best results, use a model with a context window of at least 128k tokens.

## Configuring your provider

You configure your model provider by setting environment variables before starting {% data variables.copilot.copilot_cli_short %}.

| Environment variable | Required | Description |
|---|---|---|
| `COPILOT_PROVIDER_BASE_URL` | Yes | The base URL of your model provider's API endpoint. |
| `COPILOT_PROVIDER_TYPE` | No | The provider type: `openai` (default), `azure`, or `anthropic`. |
| `COPILOT_PROVIDER_API_KEY` | No | Your API key for the provider. Not required for providers that do not use authentication, such as a local Ollama instance. |
| `COPILOT_MODEL` | Yes | The model identifier to use. You can also set this with the `--model` command-line flag. |

## Connecting to an OpenAI-compatible endpoint

Use the following steps if you are connecting to OpenAI, Ollama, vLLM, Foundry Local, or any other endpoint that is compatible with the OpenAI Chat Completions API.

1. Set environment variables for your provider. For example, for a local Ollama instance:

   ```shell
   export COPILOT_PROVIDER_BASE_URL=http://localhost:11434
   export COPILOT_MODEL=YOUR-MODEL-NAME
   ```

   Replace `YOUR-MODEL-NAME` with the name of the model you have pulled in Ollama (for example, `llama3.2`).
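   If you have not yet pulled the model, you can do so with Ollama's own CLI first (this assumes Ollama is installed locally; `llama3.2` is an example model name):

   ```shell
   ollama pull llama3.2   # download the model locally
   ollama list            # confirm the model is available
   ```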
1. For a remote OpenAI endpoint, also set your API key:

   ```shell
   export COPILOT_PROVIDER_BASE_URL=https://api.openai.com
   export COPILOT_PROVIDER_API_KEY=YOUR-OPENAI-API-KEY
   export COPILOT_MODEL=YOUR-MODEL-NAME
   ```

   Replace `YOUR-OPENAI-API-KEY` with your OpenAI API key and `YOUR-MODEL-NAME` with the model you want to use (for example, `gpt-4o`).

{% data reusables.copilot.copilot-cli.start-cli %}

## Connecting to Azure OpenAI

1. Set the environment variables for Azure OpenAI:

   ```shell
   export COPILOT_PROVIDER_BASE_URL=https://YOUR-RESOURCE-NAME.openai.azure.com/openai/deployments/YOUR-DEPLOYMENT-NAME
   export COPILOT_PROVIDER_TYPE=azure
   export COPILOT_PROVIDER_API_KEY=YOUR-AZURE-API-KEY
   export COPILOT_MODEL=YOUR-DEPLOYMENT-NAME
   ```

   Replace the following placeholders:

   * `YOUR-RESOURCE-NAME`: your Azure OpenAI resource name
   * `YOUR-DEPLOYMENT-NAME`: the name of your model deployment
   * `YOUR-AZURE-API-KEY`: your Azure OpenAI API key

{% data reusables.copilot.copilot-cli.start-cli %}

## Connecting to Anthropic

1. Set the environment variables for Anthropic:

   ```shell
   export COPILOT_PROVIDER_TYPE=anthropic
   export COPILOT_PROVIDER_API_KEY=YOUR-ANTHROPIC-API-KEY
   export COPILOT_MODEL=YOUR-MODEL-NAME
   ```

   Replace `YOUR-ANTHROPIC-API-KEY` with your Anthropic API key and `YOUR-MODEL-NAME` with the Claude model you want to use (for example, `claude-opus-4-5`).

{% data reusables.copilot.copilot-cli.start-cli %}

## Running in offline mode

You can run {% data variables.copilot.copilot_cli_short %} in offline mode to prevent it from contacting {% data variables.product.github %}'s servers. This is designed for isolated environments where the CLI should communicate only with your local or on-premises model provider.

> [!IMPORTANT]
> Offline mode only guarantees full network isolation if your provider is also local or within the same isolated environment. If `COPILOT_PROVIDER_BASE_URL` points to a remote endpoint, your prompts and code context are still sent over the network to that provider.

1. Configure your provider environment variables as described in [Configuring your provider](#configuring-your-provider).

1. Set the offline mode environment variable:

   ```shell
   export COPILOT_OFFLINE=true
   ```

{% data reusables.copilot.copilot-cli.start-cli %}

content/copilot/how-tos/copilot-cli/set-up-copilot-cli/authenticate-copilot-cli.md

Lines changed: 29 additions & 2 deletions
@@ -12,14 +12,40 @@ category:

## About authentication

If you use your own LLM provider API keys (BYOK), {% data variables.product.github %} authentication is not required.

Authentication is required for any other {% data variables.copilot.copilot_cli %} usage.

When authentication is required, {% data variables.copilot.copilot_cli_short %} supports three methods. The method you use depends on whether you are working interactively or in an automated environment.

* **OAuth device flow**: The default and recommended method for interactive use. When you run `/login` in {% data variables.copilot.copilot_cli_short %}, the CLI generates a one-time code and directs you to authenticate in your browser. This is the simplest way to authenticate.
* **Environment variables**: Recommended for CI/CD pipelines, containers, and non-interactive environments. You set a supported token as an environment variable (`COPILOT_GITHUB_TOKEN`, `GH_TOKEN`, or `GITHUB_TOKEN`), and the CLI uses it automatically without prompting.
* **{% data variables.product.prodname_cli %} fallback**: If you have {% data variables.product.prodname_cli %} (`gh`, not to be confused with `copilot`) installed and authenticated, {% data variables.copilot.copilot_cli_short %} can use its token automatically. This is the lowest-priority method and activates only when no other credentials are found.

Once authenticated, {% data variables.copilot.copilot_cli_short %} remembers your login and automatically uses the token for all {% data variables.product.prodname_copilot_short %} API requests. You can log in with multiple accounts, and the CLI will remember the last-used account. Token lifetime and expiration depend on how the token was created and on your account or organization settings.
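For example, in a CI/CD pipeline you can authenticate non-interactively by exporting one of the supported token variables before starting the CLI. This is a sketch; `MY_CI_SECRET` is a hypothetical secret provided by your pipeline:

```shell
# The CLI detects the token automatically, with no /login prompt
export COPILOT_GITHUB_TOKEN="$MY_CI_SECRET"
copilot
```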

## Unauthenticated use

If you configure {% data variables.copilot.copilot_cli_short %} to use your own LLM provider API keys (BYOK), {% data variables.product.github %} authentication is **not required**. {% data variables.copilot.copilot_cli_short %} can connect directly to your configured provider without a {% data variables.product.github %} account or token.

However, without {% data variables.product.github %} authentication, the following features are **not available**:

* `/delegate`: Requires {% data variables.copilot.copilot_coding_agent %}, which runs on {% data variables.product.github %}'s servers
* {% data variables.product.github %} MCP server: Requires authentication to access {% data variables.product.github %} APIs
* {% data variables.product.github %} Code Search: Requires authentication to query {% data variables.product.github %}'s search index

You can combine BYOK with {% data variables.product.github %} authentication to get the best of both: your preferred model for AI responses, plus access to {% data variables.product.github %}-hosted features like `/delegate` and code search.
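A minimal sketch of this combined setup, assuming a local Ollama instance and a hypothetical `llama3.2` model:

```shell
# AI model requests go to your own provider...
export COPILOT_PROVIDER_BASE_URL=http://localhost:11434
export COPILOT_MODEL=llama3.2
# ...while GitHub authentication (for example, via /login inside the CLI, or a token)
# keeps GitHub-hosted features such as /delegate and code search available
copilot
```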
### Offline mode

If you set the `COPILOT_OFFLINE` environment variable to `true`, {% data variables.copilot.copilot_cli_short %} runs without contacting {% data variables.product.github %}'s servers. In offline mode:

* No {% data variables.product.github %} authentication is attempted.
* The CLI only makes network requests to your configured BYOK provider.
* Telemetry is fully disabled.

Offline mode is **only fully air-gapped** if your BYOK provider is local or otherwise within the same isolated environment (for example, a model running on-premises with no external network access). If `COPILOT_PROVIDER_BASE_URL` points to a remote or internet-accessible endpoint, prompts and code context are still sent over the network to that provider. Without offline mode, telemetry is still sent normally, even when you use BYOK without {% data variables.product.github %} authentication.

### Supported token types

| Token type | Prefix | Supported | Notes |

@@ -50,7 +76,8 @@ When you run a command, {% data variables.copilot.copilot_cli_short %} checks fo

1. GitHub CLI (`gh auth token`) fallback

> [!NOTE]
> * An environment variable silently overrides a stored OAuth token. If you set `GH_TOKEN` for another tool, the CLI uses that token instead of the OAuth token from `copilot login`. To avoid unexpected behavior, unset environment variables you do not intend the CLI to use.
> * When you configure BYOK provider environment variables (for example, `COPILOT_PROVIDER_BASE_URL`, `COPILOT_PROVIDER_API_KEY`), {% data variables.copilot.copilot_cli_short %} uses these for AI model requests regardless of your {% data variables.product.github %} authentication status. {% data variables.product.github %} tokens are only needed for {% data variables.product.github %}-hosted features.

## Authenticating with OAuth
content/copilot/responsible-use/copilot-cli.md

Lines changed: 24 additions & 0 deletions
@@ -107,6 +107,30 @@ You can grant {% data variables.copilot.copilot_cli_short %} specific permission

For more information about security practices while using {% data variables.copilot.copilot_cli %}, see "Security considerations" in [AUTOTITLE](/copilot/concepts/agents/about-copilot-cli#security-considerations).

## Data handling when using your own model provider

When you configure {% data variables.copilot.copilot_cli_short %} to use your own model provider, your prompts, code context, and generated responses are sent directly to the provider you configure. They are not routed through {% data variables.product.github %}. You are responsible for reviewing and complying with the terms of service and data handling policies of your chosen provider.

### Telemetry

When you use your own model provider without offline mode, {% data variables.copilot.copilot_cli_short %} continues to send telemetry to {% data variables.product.github %} as usual. This telemetry does not include your prompts or code, but it does include usage metadata.

If you enable offline mode by setting the `COPILOT_OFFLINE` environment variable to `true`, all telemetry is disabled. In offline mode, {% data variables.copilot.copilot_cli_short %} only makes network requests to your configured model provider.
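For example, the following enables offline mode alongside a local provider (a sketch; `llama3.2` is a hypothetical model name):

```shell
export COPILOT_OFFLINE=true                              # disable telemetry and GitHub requests
export COPILOT_PROVIDER_BASE_URL=http://localhost:11434  # local provider only
export COPILOT_MODEL=llama3.2
copilot
```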
### Authentication and feature availability

{% data variables.product.github %} authentication is not required when using your own model provider (BYOK). Without {% data variables.product.github %} authentication, the following features are unavailable:

* `/delegate`, which hands off the session to {% data variables.product.github %}'s server-side {% data variables.product.prodname_copilot_short %}
* The {% data variables.product.github %} MCP server
* {% data variables.product.github %} Code Search

In offline mode, web-based tools such as `web_fetch` and {% data variables.product.github %} Code Search are also disabled.

### No fallback to {% data variables.product.github %}-hosted models

If your model provider configuration is invalid, {% data variables.copilot.copilot_cli_short %} exits with an error. It does not fall back to {% data variables.product.github %}-hosted models. Common failures, such as connection refused, authentication errors, model not found, and timeouts, produce user-friendly messages with actionable guidance.

## Limitations of {% data variables.copilot.copilot_cli %}

Depending on factors such as your codebase and input data, you may experience different levels of performance when using {% data variables.copilot.copilot_cli %}. The following information is designed to help you understand system limitations and key concepts about performance as they apply to {% data variables.copilot.copilot_cli %}.
Lines changed: 5 additions & 0 deletions
1. Start {% data variables.copilot.copilot_cli_short %}.

   ```bash
   copilot
   ```

data/variables/copilot.yml

Lines changed: 1 addition & 1 deletion
@@ -199,4 +199,4 @@ copilot_workspace: 'Copilot Workspace'
copilot_workspace_short: 'Workspace'

# BYOK
- copilot_byok_supported_features: '{% data variables.copilot.copilot_chat %}'
+ copilot_byok_supported_features: '{% data variables.copilot.copilot_chat %} and {% data variables.copilot.copilot_cli %}'
