A Brief Discussion on Copilot CLI's Autopilot and YOLO Mode Mechanisms and Quota Pitfalls
I recently tested the Copilot CLI on an old project. Having just noticed that GPT-5.4 mini had been released, and figuring my quota was sufficient, I gave it a try. It turned out I had misunderstood Autopilot's quota deduction mechanism and accidentally stepped into a quota trap, so I looked up the relevant information to check whether my understanding was correct.
Features You Need to Know for Automated Execution
When an AI Agent executes tasks, it defaults to pausing and waiting for user input when it encounters actions that require confirmation. This is reasonable for security, but if you want it to run through an entire process, you need to configure some execution behaviors.
WARNING
Automated execution carries risks. Before running, ensure your code is under version control and carefully evaluate if there are external interfaces or database connections involved.
YOLO Mode
YOLO (You Only Live Once) mode controls whether the system "auto-approves" all high-risk actions, including read/write, delete, and terminal execution requests.
- How to enable:
  - Add a parameter at startup: `gh copilot --allow-all` (or the community-standard `--yolo` parameter).
  - If the Copilot interface is already open, enter the slash command `/yolo` or `/allow-all`.
- Actual behavior:
  - Normally, even if the AI decides the next step is to run `rm -rf`, the system pops up a confirmation prompt by default.
  - After enabling YOLO, these confirmations are silently approved.
I personally prefer using "New Copilot CLI Session" in VS Code, which opens the interface in a tab rather than a separate window, making it easier to track which window belongs to which workspace. Since I am already logged in when I enter this way, I usually enable it by typing `/yolo` directly in the interface.
Execution Modes
In the Copilot CLI interactive interface, you can cycle through the following three modes using Shift + Tab:
- Standard: The default interactive mode where the user provides instructions step-by-step. The AI responds and waits for the next input, with the pace of task progression controlled by the user.
- Plan: The AI first clarifies questions to confirm the scope of requirements, then creates a structured implementation plan. It only executes after the plan is confirmed, making it suitable for cross-file or complex logical tasks.
- Autopilot: The AI enters an autonomous loop, no longer waiting for user input at every step, until the task completes, an error occurs, the user presses Ctrl+C, or the continuation limit is reached. If full tool permissions are not granted, operations requiring approval are automatically rejected, which may prevent the task from completing. You can use the `--max-autopilot-continues` parameter to cap the number of autonomous continuations. Official documentation: Autopilot Mode Details
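The stopping conditions described above can be sketched as a simple control loop. This is purely illustrative: the function and field names below are my own invention based on the documented behavior, not Copilot CLI internals.

```python
# Illustrative sketch of Autopilot's stopping conditions as described above.
# All names (autopilot, needs_approval, etc.) are hypothetical — this is
# NOT Copilot CLI source code, just a model of the documented behavior.

def autopilot(task_steps, max_continues, allow_all_tools):
    continues = 0
    for step in task_steps:
        if step["needs_approval"] and not allow_all_tools:
            # Without full tool permissions, approval requests are
            # auto-rejected, which may leave the task unfinished.
            return "stopped: approval rejected"
        continues += 1
        if continues > max_continues:
            # A --max-autopilot-continues style cap: stop immediately.
            return "stopped: continuation limit reached"
    return "completed"

print(autopilot([{"needs_approval": False}] * 3, max_continues=5, allow_all_tools=True))
print(autopilot([{"needs_approval": True}], max_continues=5, allow_all_tools=False))
print(autopilot([{"needs_approval": False}] * 10, max_continues=5, allow_all_tools=True))
```

The three calls end in the three different ways: normal completion, auto-rejection without full permissions, and hitting the continuation cap.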
VS Code also has a similar setting, `chat.agent.maxRequests`, but the two differ in both positioning and billing:
| | `--max-autopilot-continues` | `chat.agent.maxRequests` |
|---|---|---|
| Tool | Copilot CLI | VS Code |
| What it limits | Autopilot's autonomous continuation count | The agent's AI model call turns |
| Billing timing | Each autonomous continuation deducts one premium request | Only user-issued prompts are billed; tool calls and clicking "Continue" are not counted separately |
| On reaching the limit | Execution stops immediately | Asks whether to continue |
| Design purpose | Prevent infinite loops | Prevent the agent from running in the wrong direction, keeping the developer in control |
So far I have not seen a Copilot CLI setting that corresponds to `chat.agent.maxRequests`.
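For reference, the VS Code side is a plain entry in `settings.json` (the value 25 here is just an example; check your VS Code version for the actual default):

```json
{
  // Ask for confirmation after this many agent-mode requests
  // instead of letting the agent continue indefinitely.
  "chat.agent.maxRequests": 25
}
```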
The Autopilot Quota Trap
The mechanism of Autopilot is: when it is time for user confirmation, if the user does not respond, it will reply on your behalf and continue execution. Each "reply on your behalf" round-trip deducts from your quota.
GPT-family models have a habit (other models might too, but GPT is especially proactive): after a task is completed, they actively ask whether you want to perform further actions. Normally you can decide for yourself whether to continue, but paired with Autopilot, it replies for you and triggers the next step.
The scenario I encountered was: low-tier GPT model + low reasoning + just asking a question (not actually executing a task). Under that combination, the model replied without thinking carefully, and after replying it wanted to confirm again from another angle. It kept looping, and I watched `Continuing autonomously (0.33 premium requests)` appear 5 or 6 times. This scenario is fairly easy to reproduce. Since it was deducting a low-tier model's quota the loss was limited, but it still felt bad Q.Q
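Rough arithmetic on what that loop cost. The 0.33 multiplier is the figure shown in the CLI message above; the 6 iterations are my observed count, so treat both as example inputs rather than fixed constants.

```python
# Cost of an Autopilot self-continue loop: each "Continuing autonomously"
# step deducts a fraction of a premium request.
per_continue = 0.33   # multiplier shown for the low-tier model in my session
loops = 6             # times the message appeared before the loop stopped

wasted = per_continue * loops
print(f"{wasted:.2f} premium requests burned")  # → 1.98 premium requests burned

# The same loop with a higher-multiplier model is far more expensive, e.g.
# a hypothetical 10x model: 10 * 6 = 60 premium requests for zero useful work.
```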
What is more noteworthy is the other direction: if you switch to a high-billing model like Claude Opus, when Autopilot cannot end properly, the cost of each meaningless trigger is much higher.
In fact, many users online have reported that Autopilot cannot end correctly after a task is completed, leading to a large amount of quota being burned in the background:
- GitHub Issue #1532: Infinite loop issue in Autopilot mode
- GitHub Issue #1477: Discussion on quota consumption for follow-up requests
Summary
When the quota is sufficient, providing enough context for the AI to judge the direction on its own, combined with a model with strong execution capabilities like GPT-5.4, you can consider enabling YOLO + Autopilot to let it optimize autonomously. However, in most scenarios, using YOLO is enough; you don't necessarily need to enable Autopilot. If you are just asking questions rather than executing tasks, adding Autopilot is more likely to cause unnecessary quota consumption.
Changelog
- 2026-03-22 Initial document creation.
