A Brief Discussion on Mainstream AI Service Ecosystems and Related Tools

Recently, while exploring n8n, I realized that the existing services for my Side Project (such as PostgreSQL) were too cumbersome for my use case, so I decided to rewrite the entire architecture.

Furthermore, the new architecture the AI had previously planned had significant issues. Although we initially settled on a seemingly reasonable version, two days later I scrapped the n8n plan again and decided to wait until the weekend to replan it properly.

I am using this cooling-off period to reflect on my actual usage scenarios for various AI tools.

Mainstream AI Service Ecosystems

Currently, the mainstream closed-source AI model providers on the market are primarily OpenAI, Anthropic, and Google. In terms of basic functionality (Q&A chat, multimodal input, web search, etc.), all three are quite mature.

Each provider generally has three ecosystems, with different billing methods and default data privacy settings:

  • Developer Ecosystem: Primarily based on API Keys, billed by usage, suitable for programmatic integration and automated workflows. By default, data sent to these APIs is not used to train models, with the exception of the Google AI Studio free tier (see the Google section below).
  • Enterprise and Team Ecosystem: Provides unified account management, usage quota control, and advanced SLA guarantees. It usually involves contract-based pricing, and data contracts explicitly prohibit the use of data for training.
  • Consumer Ecosystem: Divided into free plans and subscription tiers, with subscriptions offering more powerful models or higher usage limits. Web chat data may be used for training by default, though each provider offers settings to disable this.
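To make "billed by usage" concrete, here is a tiny sketch that estimates pay-as-you-go cost from token counts. The per-million-token prices are placeholders I made up for illustration, not any provider's real rates:

```python
# Sketch: estimate pay-as-you-go API cost from token usage.
# The prices below are PLACEHOLDERS, not real provider rates.
PRICE_PER_MILLION = {
    "input": 3.00,    # USD per 1M input tokens (hypothetical)
    "output": 15.00,  # USD per 1M output tokens (hypothetical)
}

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for one API call."""
    cost = (input_tokens / 1_000_000) * PRICE_PER_MILLION["input"]
    cost += (output_tokens / 1_000_000) * PRICE_PER_MILLION["output"]
    return round(cost, 6)

# A 2,000-token prompt with a 500-token reply:
print(estimate_cost(2_000, 500))  # 0.0135
```

The point is just the shape of the math: output tokens usually cost several times more than input tokens, so verbose replies dominate the bill.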

OpenAI Ecosystem

As the earliest mover, OpenAI has built a massive application ecosystem by connecting various external tools and the widely popular GPTs. It is usually the first service that comes to mind when the general public encounters AI or considers a subscription.

OpenAI was the first AI service I encountered, but it is currently the only one of the three I have never actually subscribed to, so I will only provide an overview here. Subscription plans are divided into Go, Plus, and Pro tiers. API usage (Developer Ecosystem) and web subscription (Consumer Ecosystem) are billed separately.

Key tools:

  • ChatGPT (Web/App) is the primary chat interface, featuring a built-in Canvas that allows you to open a separate window next to the chat to directly edit documents or code.
  • GPTs allow for the quick creation of custom bots, supporting external API connections (Actions), which is suitable for embedding into automated workflows.
  • GPT Image 1 (text-to-image) is built into the chat and available to free users.
  • Sora (video generation) is currently limited to paid plans like Plus / Pro.

Anthropic Ecosystem

Instead of pursuing a "jack-of-all-trades" approach, Anthropic has focused its skill tree on specific domains like "coding and long-form logical reasoning," without features like image or video generation. This has led to an interesting gap in their user base: they have high visibility in the developer community, with many already subscribed or using the API; however, in general companies (including software companies), fewer people subscribe (perhaps because it's too expensive?), and some have never even heard of it, knowing only ChatGPT.

Claude has the smallest product line of the three. The developer ecosystem obtains API Keys via the Anthropic Console using a pay-as-you-go model. The consumer ecosystem is divided into Pro / Max subscription tiers for individual users.

Note that Claude's token calculation and limitation mechanism differ from others: the free tier has a very small quota, and if a single conversation becomes too long, you will be forced to "start a new chat" to continue. Although paid subscriptions do not have this hard length limit, because Claude includes the "entire conversation history" in the Context Window when replying, the longer the conversation, the more tokens are consumed per turn, burning through your daily quota faster. Additionally, "shared quota between Claude Code and the web version" is a common pitfall; at the end of last December, I was unable to use the web version because I had used Claude Code extensively for testing.
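The quota math above can be sketched: if every reply re-reads the whole history, total input tokens grow roughly quadratically with the number of turns. A toy model with uniform message sizes (purely illustrative, not Anthropic's actual accounting):

```python
def total_input_tokens(turns: int, tokens_per_message: int) -> int:
    """Toy model: on each turn, the model re-reads all prior messages
    (user + assistant) plus the new user message as input."""
    total = 0
    history = 0  # tokens accumulated in the conversation so far
    for _ in range(turns):
        history += tokens_per_message  # new user message joins history
        total += history               # entire history sent as input
        history += tokens_per_message  # assistant reply joins history
    return total

print(total_input_tokens(10, 500))  # 50000
print(total_input_tokens(20, 500))  # 200000 - double the turns, 4x the tokens
```

This is why starting a fresh chat (or summarizing and continuing) stretches a daily quota much further than one marathon conversation.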

Overview of Common Claude Tools

  • Claude (Web/App): The main chat interface.

  • Claude Code: A CLI AI Agent tool built specifically for developers. It can run directly in the terminal or IDE, automatically reading the local Codebase, executing Shell commands, fixing bugs, creating PRs, and supporting MCP (Model Context Protocol) to connect to external tools. It currently offers two usage modes:

    • Claude Code on Web: A browser interface that, after connecting to a GitHub account, clones the Repository to a cloud VM managed by Anthropic, suitable for scenarios where you don't want to install environments locally.
    • Claude Code on desktop: Integrated into the official Desktop App, providing a graphical user interface, including visual Diff comparisons, real-time App previews, and the ability to run multiple local or remote sessions simultaneously.
  • Claude Cowork: An Agent preview feature launched in the official Desktop App for paid users, allowing Claude to directly access local files and handle complex, multi-step tasks.

  • Artifacts: A real-time preview feature built into the Claude Web/App. When you ask it to write code, create a webpage, or write a long report, a separate preview window opens next to the chat box, allowing you to see the rendered result directly or copy the entire clean content with one click. This feature solves several annoying problems of traditional web chat interfaces:

    1. If code blocks are nested within other Markdown blocks, they get truncated once the nesting limit is exceeded, and the subsequent content spills outside the block.
    2. Selecting and copying text with a mouse often ruins the formatting upon pasting.
    3. Clicking the "Copy" button on a chat message often includes the AI's conversational filler, which then has to be manually cleaned up.

Google Ecosystem

With massive resources and cross-platform integration capabilities, Google has the most sprawling feature set of the three. Early market attention was relatively low, but after GPT-5's performance fell short of expectations late last year, Google launched Gemini 3.0 Pro, leveraging mobile bundling, half-price offers, and "all-in-one" plans to successfully attract a large number of users to switch.

Although all AI providers suffer from training data lag, Google's product line is so vast that "internal teams often don't recognize each other's work": sometimes even Gemini itself doesn't know which services are available. For example, when asked about Antigravity, it had no idea the tool existed; NotebookLM had supported Chinese Podcasts since mid-last year, but when asked in January, it replied that it wasn't supported yet. This situation is particularly noticeable within the Google ecosystem.

1. Developer Ecosystem (Google AI)

Services are primarily provided via Google AI Studio.

  • Provides developers with free API Keys, but with RPM (Requests Per Minute) and other quota limits.
  • Can be linked to a credit card to switch to pay-as-you-go, paying only for what you use.
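If you stay on the free tier, it helps to throttle calls client-side so you never trip the RPM cap. A minimal sliding-window check (the `rpm=2` figure below is just for the demo, not Google's actual limit):

```python
from collections import deque

class RpmLimiter:
    """Sliding-window check: allow at most `rpm` requests per 60 seconds."""
    def __init__(self, rpm: int):
        self.rpm = rpm
        self.sent = deque()  # timestamps of recently allowed requests

    def allow(self, now: float) -> bool:
        # Drop timestamps that have aged out of the 60-second window.
        while self.sent and now - self.sent[0] >= 60:
            self.sent.popleft()
        if len(self.sent) < self.rpm:
            self.sent.append(now)
            return True
        return False

limiter = RpmLimiter(rpm=2)
print(limiter.allow(0.0))   # True
print(limiter.allow(1.0))   # True
print(limiter.allow(2.0))   # False - third request inside the window
print(limiter.allow(61.0))  # True - the first request has aged out
```

In a real client you would pass `time.monotonic()` as `now` and sleep (or queue) when `allow` returns False.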

WARNING

Note: By default, data from free API Keys may be used by Google to train base models. This is only automatically disabled after upgrading to a paid plan; be sure to check this if you are concerned about privacy.

2. Enterprise Ecosystem (Google Cloud / Vertex AI)

Vertex AI, built on top of GCP (Google Cloud Platform).

  • Specifically for enterprise users, providing a complete MLOps toolchain.
  • No free quota; billing is based entirely on GCP resource and API usage.

3. Consumer Ecosystem (Google One)

Google One was originally just for "cloud storage subscriptions," but to promote AI, it later added subscription plans that include Gemini services.

  • About Subscription Plans: The earliest plan including Gemini Advanced was Google One AI Premium. Later, it was split into Plus (added this year), Pro, and Ultra tiers.
  • What is Gemini Advanced?
    • Many people confuse Google One subscriptions with Gemini Advanced. Simply put, Google One is your "paid subscription plan name," while Gemini Advanced is the "advanced web interface and service" unlocked after payment.
    • Only in Gemini Advanced mode can you call Google's higher-end models (e.g., Gemini 3.1 Pro) and enjoy more relaxed conversation limits and longer context memory.
  • Family Sharing and "AI Points":
    • Google One can be shared with a family group of up to 6 people (including yourself).
    • Most software benefits (like individual Gemini conversation quotas) are account-independent. Therefore, some people create 6 accounts to form a family group, switching to other accounts when the main account's quota is exhausted.
    • However, be aware: "Cloud Storage" and "AI Points" are shared by the whole family! There is no way to limit individual member usage separately.
    • AI Points are usually consumed when performing compute-intensive generation tasks, such as generating high-quality images or videos using Imagen 4.

TIP

Now, AI Points can also be used for Antigravity model call quotas, though I am skeptical about how many calls the 1000 points per month included in the Pro tier can actually provide.

Overview of Common Google Tools

Google has stuffed Gemini into so many products; here is a summary of some common tools and usage insights:

Daily Chat and Development Assistance

  • Gemini App (including Gemini Advanced)
    • The most common entry point, available on web and mobile. It is a general-purpose tool that can do a bit of everything and is very convenient; however, for specific tasks requiring precision, it is recommended to use dedicated tools.
  • Gemini Code Assist & Gemini CLI
    • Gemini Code Assist: An IDE extension. Honestly, it feels a bit awkward; in VS Code, most people are accustomed to the GitHub Copilot ecosystem. Perhaps the target audience uses other IDEs.
    • Gemini CLI: A command-line interface that can be configured to connect to Gemini App, Google AI Studio, or Vertex AI accounts.
    • Positioning: If you are using an agent editor like Antigravity, you might find the official CLI or Assist less useful. However, for simple, repetitive, high-volume tasks, you can offload them to Gemini CLI to save your primary quota.
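For the "simple, repetitive, high-volume" case, the typical shape is a loop that fans files out to a non-interactive CLI call. The sketch below only builds and prints the command lines (a dry run); it assumes `gemini -p` is the CLI's non-interactive prompt flag, so verify the exact invocation against your installed version before running anything for real:

```python
# Dry run: build the gemini CLI commands we would run for each file.
# "gemini -p" is assumed to be the non-interactive prompt flag; verify locally.
import shlex

files = ["docs/a.md", "docs/b.md"]  # example paths
commands = [
    ["gemini", "-p", f"Summarize {f} in three bullet points"]
    for f in files
]
for cmd in commands:
    # Print instead of subprocess.run(cmd) so nothing actually executes.
    print(shlex.join(cmd))
```

Swapping the `print` for `subprocess.run(cmd)` turns this into a real batch job that never touches your primary editor quota.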

Office and Knowledge Management

  • NotebookLM: Mainly used for two scenarios:
    1. Document Summarization and Presentation Generation
      • A great tool for admins or PMs. You can upload meeting minutes or long documents as "sources" and have it organize key points and outlines.
      • The latest version allows you to use commands to edit PPTs and even download .pptx files (previously, you could only export PPT-formatted PDFs).
      • Disadvantages and Limitations:
        1. PPT generation is limited to a maximum of 15 slides per turn.
          • Workaround: First, ask the AI to write a complete "PPT outline and slide content" for the entire document. Then, create a new "source" for every 15 slides of the transcript. Select only one local source at a time to generate the PPT, and finally manually merge these .pptx files in PowerPoint. As long as you keep the prompts (e.g., layout, tone) consistent for each source, you can indeed produce presentations longer than 15 slides with a consistent style.
        2. Generated presentations have a NotebookLM watermark in the bottom right corner (usually forced on non-Ultra subscribers).
          • Removal: The exported .pptx watermark is often "baked" into the image (flattened with the background), making it impossible to select and delete in PowerPoint. Current community solutions include: first, uploading the file to Canva and using the "Magic Eraser" feature (requires Canva Pro) to wipe the watermark; or using online AI background/watermark removal tools like NotebookLM Watermark Remover.
        3. Text in the generated presentation becomes non-editable images.
          • Solution: A popular free community tool is DeckEdit. It can restore presentations generated by NotebookLM (including layouts converted to images and key illustrations generated by Nano Banana Pro) into PPTX / PDF formats with "directly editable text."
    2. Personal Knowledge Base
      • Disadvantage: Source management is not very intuitive. While suitable as a long-term knowledge base, due to poor management, it is mostly used as a one-off document summarization tool.
      • When source content changes: If it's a local file, you must delete the old one and re-upload; if it's a Google Drive file, although it supports integration, it doesn't automatically show updates in the list, requiring manual clicks to trigger synchronization.
    • Podcast Audio Generation (Audio Overview): Can turn documents into a radio show with a male and female host. Personally, I find the utility debatable, as the information density is low, leaning more towards demonstrating voice expressiveness.

Image Generation and Online Agents

  • Google Labs (AI Test Kitchen)
    • This is where Google usually places its latest experimental tools. It recently underwent a major overhaul: ImageFX (image generation), Flow (video generation), and MusicFX (music) were previously separate interfaces. Now, core features like image generation and editing are consolidated into New Flow, allowing users to complete a "generate -> edit -> video generation" workflow in one interface.
    • I want to try it, but I haven't found a use case yet; these experimental audiovisual features are usually the main consumers of "AI Points."
  • Jules (Online Coding Agent)
    • An AI agent running in the background. You give it tasks or Issues, and it will create branches, write code, and even open PRs for you to review.
    • Disadvantages:
      1. It uses Gemini under the hood, which is not ideal for complex projects.
      2. If you care about Git Commit Message formats, it seems to only "append" messages; if you are unsatisfied with the generated message, you have to manually rewrite it during Squash and Merge.
      3. Because it is non-interactive, requirements must be written very precisely. In my previous tests, the instructions were too vague, and without the ability to correct it in real-time, it made a mess of the project.
    • However, for simple projects, it's a decent way to practice writing requirement specifications and reviewing code.

Others

  • Almost all Google Workspace services (Docs, Gmail, Drive, etc.) have already integrated or are integrating Gemini add-ons.

Google Ecosystem Summary

Google is the most generous with free quotas. The main differences between Google's tiers are "usage frequency (or AI Points)," "the underlying model level that can be called," and "privacy settings."

Although I have recommended Gemini to some non-engineering friends, it wasn't for its coding ability. While it's sufficient for daily chat, it clearly falls short of competitors when dealing with complex DevOps or backend architecture tasks. However, Gemini's biggest advantage lies in its "all-in-one" bundling and diverse UI carriers. It seamlessly integrates with Workspace office software and provides various interfaces like Gemini App, NotebookLM, ImageFX, and Jules. Most importantly, quotas are calculated separately across different platforms. This "use up the quota on one side and switch to the other" strategy is what makes it the best value for money right now.

Web Chat Interface and Model Experience

Regardless of the provider, most people interact with the Web/App chat interface, which is the most intuitive way to experience the "personality" of each model. The following are personal impressions and may not reflect reality.

Model Chat Personalities

TIP

The following is based on GPT‑5.2, Claude 4.5, and Gemini 3.1. GPT‑5.3 Instant (released March 4) and GPT‑5.4 (released March 5) have just been released and I haven't accumulated enough experience with them yet.

  • ChatGPT (Prone to talking to itself): It is recommended to use system instructions to lower the default enthusiasm, otherwise, it often repeats filler from multiple angles regarding the same issue. The overall feeling is that it talks to itself, often ignoring the context or definitions provided by the user and replying according to its own understanding; many online have reported that it often fails to answer questions directly, requiring several rounds of back-and-forth to get on track.
  • Claude (Too neutral to have a stance): When discussing technical issues like code architecture, Claude is actually quite organized; but once the topic extends to softer areas without clear rights or wrongs, the "lack of stance" is particularly noticeable. In version 4.5, the typical problem was: as soon as you added new information, it would immediately change its stance to accommodate you, which was annoying every time. If the context provided was insufficient, it would enter "perfunctory mode," constantly summarizing what you said until it understood your inclination before formally responding—in a way, this way of speaking is a bit like me. When it knows what to express, it is very verbose. However, by version 4.6, the verbosity improved significantly, and replies became much more concise. As for other issues, I haven't verified them.
  • Gemini (Overly eager and too confident): At first, I thought it had more of a stance than Claude, but later I realized it just has a "strong desire to perform," much like someone wanting to be praised (this refers to the tone, not that it actually wants to be praised). The typical scenario is when fixing bugs; it often confidently declares, "I made these changes, and here is why this will definitely succeed," only for the execution to fail, sometimes with the exact same error.

Mutual Awareness of Models

I really want to complain: even though these three have updated their models several times recently, their awareness of each other's latest versions is always stuck in the past. Current test results are roughly as follows:

| Question Target | Perceived ChatGPT Latest | Perceived Claude Latest | Perceived Gemini Latest |
| --- | --- | --- | --- |
| ChatGPT | 5 | 3.7 | 2 |
| Claude | 4.5 | 4.6 | 2.5 |
| Gemini Pro | 4.5 | 3.5 | 3.1 |

(Note: Interestingly, Gemini Flash's training data is the most up-to-date; also, Antigravity is still not in the Gemini model's training data.)

Sometimes when I ask AI to polish my notes, if I don't watch it closely, it might secretly "downgrade" the correct version numbers in my notes back to the old versions it recognizes. This also happens frequently when modifying Docker Image versions in compose.yml.
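One mitigation for this version-downgrade habit is to pin exact image tags and flag them in a comment, so a stray "correction" stands out immediately in a diff. A minimal compose.yml sketch (the service and tag are just examples):

```yaml
# compose.yml - pin exact tags; a "helpful" AI rewrite will show up in the diff.
services:
  db:
    image: postgres:17.2   # pinned on purpose; do not change without asking
    restart: unless-stopped
```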

Supplementary Insights on Gemini Model Experience

Here are a few issues that are particularly noticeable in actual use, not limited to the web version or Agents, but the behavior of the Gemini model itself:

  • Proofreading style is hard to tune: Gemini is particularly frustrating when it comes to proofreading. Either it makes the content overly concise and formulaic (it calls this "professional"); or if you ask it to be more conversational, it becomes filled with metaphors and exaggerated terms like "powerful weapon," "significant improvement," "extremely efficient," "best practice," and "comprehensive analysis." If you tell it to just reorder without deleting information, it becomes afraid to change anything. In contrast, the Claude model can adjust the arrangement while retaining the original tone and intent, and the experience gap is quite significant.

  • Response speed and stability: Honestly, I don't feel much improvement in the model; the response speed is getting slower. When using Gemini Pro in Antigravity, sometimes it gets stuck, and when asked about the status after interruption, it takes several minutes to respond; switching to Claude results in a quick reply, and the contrast is obvious.

  • Severe anomaly events: From the night of March 3 to the early morning of March 4, there was a severe anomaly: whether in the Gemini App or Antigravity, it inexplicably outputted its thought process, and a closer look revealed it contained other users' personal data or a large amount of content completely unrelated to my questions, as if it were outputting someone else's content into my chat. Others in the community have reported similar situations; related discussions: Threads @freakyketz, Threads @hiphop3535.

  • Weird logic for providing links: For some reason, Gemini often doesn't provide direct website links but instead gives Google search links. I once asked it to proofread an article, and it changed all the relative paths of local images into Google search link formats...

  • Output integrity instability: Gemini is good at digesting long background materials, but in terms of the "integrity" of its proactive output, it feels significantly weaker than other models. Taking Antigravity's Planning mode as an example, it produces a complete execution plan before execution; but if you discuss and fine-tune during the planning process, the subsequently regenerated versions often quietly omit details that were already confirmed, rather than modifying them on the original basis. If you don't explicitly specify each item when reminding it to add them back, it usually only restores them partially, or even replaces the original complete description with a summary. This tendency is not limited to the carrier; it can also happen in the web version and CLI (see "General Observations on Model Attention" below).

    Additionally, note that the number of copies of Antigravity's Implementation Plan is limited (about 20). Once the number of iterations is high, the correct content of older versions may be permanently lost due to version overwriting. When assigning editing tasks, it is recommended to explicitly limit the scope of modifications to avoid it "optimizing" areas that were not requested; the original intent is easily lost during repeated iterations.

General Observations on Model Attention

The following applies not only to web versions or Agent editors but to all AI conversations:

  • Large context window ≠ strong attention capability: Many models claim to have a super-large Context Window, but this is different from "being able to continuously pay attention to the entire conversation." Just like when we talk, we might remember topics from several rounds ago, but if they are suddenly mentioned in the latest round, we might not immediately react to what is being discussed.
  • Over-focusing leads to tunnel vision: When AI spends a lot of time on the same problem, the attention weight of that problem in the model becomes very high, and it gets stuck there. For example, I once encountered a situation where the container image was upgraded and the settings were different. The correct approach was to directly apply the pre-planned password, but the AI chose to set the development mode first. I have also encountered situations where the direction was wrong; even after giving reminders and explaining where its understanding was wrong, it insisted that I didn't understand and kept drilling into the wrong path.
  • Repeated iterations easily omit details from the beginning: When asking the model to modify or regenerate the same document multiple times in a long conversation, you often find that the output content "gets shorter and shorter," and details that were previously confirmed quietly disappear. This is a common limitation of long-context models: the longer the conversation and the more conditions that need to be maintained simultaneously, the more the model's ability to extract information from the beginning decays, prioritizing the most recent instructions, and details from the beginning are easily glossed over. Practical strategy: Before regenerating, explicitly restate the key paragraphs that must be preserved, or limit the model to modifying only a specific range to avoid full-text rewriting.
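The "restate what must be preserved" strategy above can even be mechanized: keep a list of must-keep phrases and check each regenerated draft against it before accepting. A minimal sketch (the phrases are invented examples):

```python
def missing_required(draft: str, required_phrases: list[str]) -> list[str]:
    """Return the must-keep phrases that a regenerated draft has dropped."""
    return [p for p in required_phrases if p not in draft]

required = ["backup runs nightly at 02:00", "PostgreSQL 17"]
draft = "The service uses PostgreSQL 17 and exposes port 8080."
print(missing_required(draft, required))  # ['backup runs nightly at 02:00']
```

If the returned list is non-empty, paste those phrases back into the next prompt instead of trusting the model to remember them.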

Invisible Landmines in File Uploads

Successful upload ≠ the model has read the complete content.

Web-based file uploads actually have many traps:

  • Each provider has hard limits on the number of uploads. Interestingly, when packing files into a ZIP, the free version of ChatGPT can bypass the single-turn quantity limit using compressed files; but the paid Gemini still strictly checks the number of files inside the compressed file, making the paid experience worse than the free version of others.
  • The file parsing mechanism of the Gemini web version is unstable. I once uploaded a Markdown file of about 3500 lines, only to find during the conversation that the model had only read the beginning and the end, with the middle missing. The root cause is that the web interface frontend truncates the attachment during parsing, resulting in incomplete information received by the model. Conversely, pasting the same text directly into the chat box allows it to be read completely. In January, there were also cases where the model didn't receive attachments in chats created from Gems. Gemini suggested using Google AI Studio for uploads to avoid this, but I haven't verified it (there are many ways to read large amounts of content, such as handing it directly to an Agent editor, without relying on the web interface).
  • Previously, when using ChatGPT, I compressed a project into a ZIP and uploaded it, and the answer was clearly inconsistent with the actual project structure. After questioning, it admitted that it "did not read all the files" and asked me to re-upload the missing parts (the key is that the tone of the reply was self-righteous, like "Yeah, I didn't read everything," not feeling that it had done anything wrong, which can be emotionally frustrating).

In the past year or two, web chat interfaces have supported web search, but the actual mechanism of "web search" is not what most people imagine: the model does not directly connect to the URL you paste; it searches through a search engine. Gemini uses Google Search, ChatGPT currently uses its own ChatGPT Search, and I am not sure what Claude uses.

History of Web Search by Provider

  • Google (Gemini): Started as a search engine and has had the most native Web access capability since the Bard era.
  • OpenAI (ChatGPT): Opened "Browse with Bing" to Plus users in May 2023, suspended it in July due to paywall content access, and relaunched it in September. SearchGPT prototype was released in July 2024, and it was officially integrated into ChatGPT at the end of October, allowing users of all plans to obtain real-time search information.
  • Anthropic (Claude): The latest mover; native search support arrived slowest, and it previously relied more on third-party tool integration.

This mechanism has several practical implications:

  • If your website has not been indexed by a search engine, Gemini won't be able to read it. For example, when I moved my notes to GitHub Pages this year, I encountered this problem initially because Google hadn't indexed it yet. If you find your site cannot be searched, remember to submit sitemap.xml to Google Search Console.
  • Websites with robots.txt restrictions, or GitHub itself (possibly due to authorization concerns), are generally unreadable by these models.
  • When it can't read something, Gemini is a bit annoying; it doesn't want you to know it can't read it, so it starts making up content based on the URL information and conversation context; after being questioned, it will apologize for "being lazy" and try again, but it continues to reply randomly, sometimes with content similar to the previous round.
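For reference, the sitemap.xml mentioned above is just a small XML file listing your pages; a minimal example with a placeholder domain looks like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.github.io/notes/</loc>
    <lastmod>2026-03-01</lastmod>
  </url>
</urlset>
```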

However, Agent editors (like Antigravity, Copilot) do not have this problem, because they usually run directly in your local development environment (or cloud workspace) and are authorized to read the file system directly or fetch URL content through built-in tools.

Custom AI Assistants

Currently, each provider has custom AI assistant features: ChatGPT's GPTs, Gemini's Gem, and Claude's Project. All three allow uploading data and setting roles, but they differ in feature completeness and usage barriers:

  • GPTs (ChatGPT): The most complete features, can connect to external services (Actions), and is the most flexible of the three. However, it is a paid feature; since I don't use it as my main tool, I have never subscribed and haven't tried downloading existing tools from the GPT Store.
  • Project (Claude): Also requires a subscription. Conversations initiated from a Project are managed uniformly by the Project, making it easy to organize chats.
  • Gem (Gemini): Can be used without a subscription, making it the friendliest to free users. However, the features are basic, and it does not support categorized management; at most, it displays the recent Gem used to create the chat record when starting a new conversation.

The main purpose of a Gem is to let the model simulate a specific perspective or personality to think about problems. You can also set workflows and output formats in a Gem. As for the best setting format, some online recommend XML, YAML, Markdown, etc. I personally use a mixed format, but I'm not sure if it's the best practice.
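As one possible shape of that "mixed format" (purely illustrative; I am not claiming this is best practice), Markdown headings for structure with a YAML-ish output spec at the end:

```markdown
# Role
You are a blunt backend reviewer. Skip praise and filler.

## Workflow
1. Restate the question in one line.
2. Answer, then list trade-offs.

## Output format (YAML)
answer: <one paragraph>
trade_offs:
  - <item>
```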

Global Personalization

Web chat interfaces provide "Personalization" features, allowing you to pre-write guidelines to make the AI's default behavior more aligned with your habits.

This has a core difference from custom assistants: custom assistants are tailored for specific domains or tasks; personalization is applied as a general default format for all general conversations. When a Gem operates independently, it usually overrides global settings and is not affected by them.

Regarding the details of Prompt strategies, why too many rules make AI go off-track, and how to combine positive guidance and negative constraints, I have organized them into another article: Prompt Positive Guidance vs. Negative Constraints: Learning from Pitfalls.

AI Agent Editors

I mainly use Antigravity and GitHub Copilot as my Agent editors.

Antigravity vs Copilot vs Gemini CLI

The billing and usage habits of these three tools are very different, and once you understand them, it's easy to allocate their use:

  • Antigravity: Suitable for tasks that require "back-and-forth discussion." An agent editor that allows it to write JavaScript scripts to operate websites while fixing bugs (but don't expect it to precisely fix CSS layout issues). It was very useful when using the Claude model for development, with a quota reset every 5 hours, but it became less sufficient after Claude added weekly quota limits (unless you are in a family group of six, in which case, never mind). The Gemini model can be used to discuss project bugs to find inspiration, or you can switch to the Claude model to write plans.
  • GitHub Copilot: Suitable for assigning "implementation tasks." The billing method is based on per-conversation Q&A, but different models have different usage weight multipliers (lightweight models consume less, high-end models consume more). Because the unit of count is fixed regardless of whether it's answering questions or massive implementation, as long as you increase the iteration limit, giving the execution plan discussed in Antigravity to Copilot for development is more efficient than letting Antigravity run complex tasks and burning through the Claude model quota.
  • Gemini CLI: The positioning is a bit awkward, like a low-spec version of Antigravity, and it can only select the Gemini model, which is prone to errors when used for coding or complex tasks. The OAuth mode counts usage like Copilot, but there is a Token limit per turn. Unless it's a simple task or running in the background to organize project documents, it's rarely used.
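The per-model weighting mentioned above is just multiplication, but it is worth sketching because it shapes which model you pick for which task. The multipliers below are placeholders, since the real table changes over time:

```python
# PLACEHOLDER multipliers - check your provider's current table.
MODEL_MULTIPLIER = {
    "lightweight": 0.25,
    "standard": 1.0,
    "high-end": 3.0,
}

def requests_consumed(calls: dict[str, int]) -> float:
    """Total weighted requests for a batch of calls, keyed by model tier."""
    return sum(MODEL_MULTIPLIER[tier] * n for tier, n in calls.items())

# 40 lightweight calls plus 10 high-end calls:
print(requests_consumed({"lightweight": 40, "high-end": 10}))  # 40.0
```

Under these (made-up) weights, 40 lightweight calls cost the same as about 13 high-end ones, which is why routing bulk work to cheaper models stretches the quota.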

Gemini Model Script Batch Processing Issues

The Gemini model tends to batch-process tasks with scripts, which creates a problem: when assigning a task, you need to judge for yourself whether the requirement is actually suitable for script processing. You don't need to come up with the implementation details, but you do need enough judgment to tell whether the basic logic is sound. If you're unsure yourself but let Gemini run scripts anyway, it's very likely to break things; even when it confidently claims it will use rigorous regular expressions, don't take that on faith. And once it decides to use a script, asking it to switch to reading each file one by one meets real resistance: both Antigravity and Gemini CLI often need several rounds of back-and-forth before they give up the script.
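One way to keep a script-happy agent auditable is to split the work into a dry-run preview and a separate apply step. A minimal Python sketch of that pattern (the directory layout, key names, and word-boundary regex here are hypothetical examples, not what Gemini actually generates):

```python
import re
from pathlib import Path

def preview_rename(docs_dir: Path, old: str) -> list[tuple[str, int]]:
    """Dry run: read each file one by one and count matches without
    writing anything, so the logic can be reviewed before it runs."""
    report = []
    for path in sorted(docs_dir.glob("*.md")):
        n = len(re.findall(rf"\b{re.escape(old)}\b", path.read_text(encoding="utf-8")))
        if n:
            report.append((path.name, n))
    return report

def apply_rename(docs_dir: Path, old: str, new: str) -> int:
    """Apply the replacement only after the preview has been reviewed."""
    total = 0
    for path in sorted(docs_dir.glob("*.md")):
        text = path.read_text(encoding="utf-8")
        replaced, n = re.subn(rf"\b{re.escape(old)}\b", new, text)
        if n:
            path.write_text(replaced, encoding="utf-8")
            total += n
    return total
```

Asking the agent to emit the preview report first gives you a checkpoint to catch a flawed regex before it rewrites every file.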

WARNING

As of 2026-03-12, Google AI adjusted its subscription plan architecture, and Antigravity's quota is linked accordingly. The official announcement is as follows:

We're evolving Google AI plans to give you more control over how you build. Every subscription includes built-in AI credits, which can now be used for Antigravity, giving you a seamless path to scale.

Google AI Pro is the home for the practical builder, hobbyists, students, and developers who live in the IDE and don't necessarily rely on an agent. This plan features generous limits for Gemini Flash, with a baseline quota included to "taste test" our most advanced premium models.

Google AI Ultra serves as the daily driver for those shipping at the highest scale who need consistent, high-volume access to our most complex models.

If you're on Pro but need "extra juice" for a heavy sprint or deeper access to premium models, simply top up your AI credits to customize your plan.

Keep building. Keep shipping.

Simply put, the Pro plan positions Gemini Flash as the main driver, with limited trial quotas for advanced models (Gemini Pro, Claude); Ultra is for those who need high-frequency access to advanced models; and if Pro users temporarily need more quota, AI Credits can be purchased separately.

Impact on actual use: quotas for the Gemini Pro and Claude models have been significantly reduced and now reset on a weekly cycle. The situation is a bit paradoxical: Gemini CLI, the lower-tier tool, currently has a relatively generous Gemini Pro quota, while Antigravity, despite the better experience, has a very limited weekly quota for advanced models, which significantly weakens its cost-performance advantage.

Since this mechanism may continue to be adjusted, I will no longer track or update it.

I have always been a heavy user of Visual Studio and only open VS Code when working on non-.NET projects. So at first I used Copilot directly in Visual Studio, but the experience was frustrating; it wasn't until Chinese New Year that I realized the default Copilot chat interface in Visual Studio isn't in Agent mode at all. To get Agent features, you have to switch modes manually.

In terms of the completeness of Agent features, the three interfaces rank roughly Copilot CLI > VS Code > Visual Studio. Copilot CLI has the most settings and the highest flexibility; VS Code is second, with a smooth interface and convenient extensions; and although the official pitch for Visual Studio is "deep optimization for .NET projects," my personal experience is that its Agent features are relatively limited: rather than a bonus, it feels like a trade-off for the .NET integration. Recently, VS Code releases have kept back-porting CLI features, and the gap is slowly narrowing (see the VS Code February 2026 release (1.110)).

TIP

However, if I am modifying a .NET project myself, I will still use Visual Studio. Specifically, it depends on whether the task is developer-led or Agent-led. For the former, I use Visual Studio; for the latter, I use Copilot CLI or Visual Studio Code.

WARNING

When using Copilot in a WSL environment (e.g., Ubuntu), it is recommended to prefer VS Code. Copilot CLI currently relies on PowerShell when it needs to execute scripts, and WSL usually does not have it installed by default; once an operation requiring PowerShell comes up, the Agent does not automatically fall back to bash but asks you to run the command manually and report the result back, severely breaking the continuity of the automated workflow.

Agent Context Management

When using an Agent, creating appropriate rule files and context documents is very important. Some practical observations:

  • Without a rule file, the AI steps on the same landmines every time: For some temporary tasks I'm too lazy to create files like AGENTS.md, and as a result the Agent trips over the same landmines on every run, wasting a lot of time before it even gets to the task proper.
  • Cross-workspace context awareness (no longer working?): I remember that when using Antigravity, if I had windows open in two different workspaces, A and B, the conversation in window A could perceive and read the content of files currently open in window B. Recently, though, this cross-workspace reading capability seems to be gone, and each project window's context is now completely isolated. I suspect the vendor closed off this global collection mechanism for privacy and security reasons (to avoid accidentally leaking confidential code from elsewhere) or to save on token costs.
  • Conversations that are too long will freeze: Antigravity freezes when a single conversation gets too long, forcing you to start a new one, and by then it's often too late to compress the conversation content and carry it over to the new chat.
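For the first point, even a minimal rule file helps. AGENTS.md is just plain Markdown that agents read for project instructions; the sketch below is purely illustrative (the conventions and "landmines" listed are hypothetical examples drawn from this article's context, not a required schema):

```markdown
# AGENTS.md

## Project conventions
- Target .NET 8; ask before adding new NuGet packages.
- Run the test suite before declaring a task finished.

## Known landmines
- Do not batch-edit files under `docs/` with scripts; read and change them one by one.
- The local PostgreSQL container must be running before integration tests will pass.
```

Even two or three lines like these spare the Agent from rediscovering the same constraints at the start of every session.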

Personal Trade-offs and Postscript

Gemini Pro 3 was actually quite good in December last year, correctly answering many questions that other AIs had gotten wrong in the first half of the year. But the more I use it now, the worse it feels, and I can't tell whether it has genuinely regressed or whether other models have improved and raised my own bar (the frequency of getting angry while chatting with Gemini keeps climbing). At this point I have just one expectation: that the next version can say "I don't know" when it doesn't know, instead of always pretending to know everything.

Regarding model selection, I have always believed Claude is the best model for coding, and I was tempted to resubscribe after Claude upgraded to 4.6. On reflection, though, the original idea behind subscribing was to use one model for coding and one for other domains. Copilot already covers the coding side, and Claude doesn't particularly stand out in non-engineering scenarios, so for now I'll just use the free quota for occasional discussions and don't plan to resubscribe.

The high-end Claude Opus 4.6 is also more expensive. While Sonnet 4.6 and GPT-5.3 Codex both carry a weight of 1 in Copilot's usage count, some in the community believe that GPT-5.3 Codex performs better in actual coding-agent (automated agent) work. For example, Po-Chih suggests: Claude Opus 4.6 for writing documents and planning, GPT-5.3 Codex (the latest is now 5.4) for coding and development, and Gemini 3.1 Pro for web and design.

However, I find that I have become heavily dependent on AI. When I want to build something new, my current development process is roughly as follows:

  1. Throw collected information, ideas, and preliminary discussion records to Antigravity to generate a prototype plan.
  2. Review materials based on the plan, build my own mental model, and then iterate on the entire plan with the AI.
  3. Finally, hand over the actual development work to Copilot.
  4. Formally join the ranks of PMs who don't code and only talk.

Change Log

  • 2026-03-07 Initial document creation.
  • 2026-03-13
    • Added explanations regarding Google's adjustment of the Antigravity quota mechanism on 2026-03-12.
    • Added issues and suggestions regarding Gemini's unstable output integrity.
    • Added limitations of Copilot CLI in WSL environments and rule file reading issues.