OpenAI Launches ChatGPT Agent – What It Is, How It Works & All You Need to Know

On July 17, OpenAI unveiled ChatGPT Agent, a general-purpose AI assistant that not only talks but acts. More than just a chatbot, this new system can autonomously browse websites, use APIs, manipulate spreadsheets, run code in a terminal, and generate editable documents. Crucially, it does all of this using its own virtual computer.

While Silicon Valley has been buzzing for over a year about so-called AI agents—intelligent software that can perform multi-step tasks on your behalf—OpenAI’s rollout marks one of the first times that such capabilities are available to mainstream users. Starting this week, subscribers to ChatGPT Pro, Plus, and Team plans can activate Agent Mode to let the model plan their week, generate pitch decks, handle research workflows, or even shop for groceries—end to end.

With this release, OpenAI is positioning itself at the forefront of the next major shift in AI—agentic systems that don’t just answer questions, but complete tasks independently.

Let’s show you what ChatGPT agent can do 🧵

Need Monday metrics? ChatGPT agent can fetch the data, generate the spreadsheet, and schedule it to run again—automatically. pic.twitter.com/oPz7bxMkUK
— OpenAI (@OpenAI) July 17, 2025

Table of Contents

Agent Mode: What It Is and How It Works

What’s New Compared to Traditional ChatGPT?

Performance Benchmarks

Safety Measures

The Bigger Picture

Final Thoughts

Agent Mode: What It Is and How It Works

With ChatGPT Agent, OpenAI is introducing an entirely new paradigm for human-computer interaction. Rather than acting as a passive text generator, the Agent behaves more like a full-fledged digital assistant that can plan, act, and deliver tangible results on your behalf.

At its core, the Agent is the result of merging two of OpenAI’s most advanced experimental systems: Operator—which enabled the model to interact with websites via clicks, text inputs, and navigation—and Deep Research, which specialized in synthesizing information from large volumes of online sources into clear, structured reports. By integrating both toolchains, OpenAI has created a single, cohesive system capable of executing complex, multi-step tasks in a way that feels natural and conversational.

Unlike previous ChatGPT experiences that relied on simple prompt-response mechanics, Agent Mode runs atop a dedicated virtual computer, giving it the ability to juggle multiple tools in a controlled, persistent environment. This means it can:

Open and interact with web interfaces through a visual browser, clicking buttons, submitting forms, and filtering results.
Use a text-based browser to scrape and analyze large volumes of content, especially useful for reasoning-heavy research.
Run live code and manipulate files through a Unix-style terminal.
Access third-party services directly through Connectors, allowing it to pull emails from Gmail, analyze code from GitHub, check your calendar in Google Calendar, or push updates into cloud storage.
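
To make the four tool types above concrete, here is a minimal sketch of how a request could be broken into steps and routed to a tool. It is purely illustrative and not OpenAI's implementation: the `Tool`, `Step`, `pick_tool`, and `plan` names and the keyword routing are assumptions made for this example. In the real product, the model itself decides which tool to use.

```python
from dataclasses import dataclass
from enum import Enum


class Tool(Enum):
    VISUAL_BROWSER = "visual browser"  # clicking, forms, filtering
    TEXT_BROWSER = "text browser"      # bulk reading and research
    TERMINAL = "terminal"              # running code, handling files
    CONNECTOR = "connector"            # Gmail, GitHub, Google Calendar, ...


@dataclass
class Step:
    description: str
    tool: Tool


# Hypothetical keyword routing, checked in order; a stand-in for the
# model's own judgment, which is what actually picks the tool.
ROUTES = [
    ({"calendar", "email", "github"}, Tool.CONNECTOR),
    ({"run", "analyze", "spreadsheet"}, Tool.TERMINAL),
    ({"reserve", "book", "click"}, Tool.VISUAL_BROWSER),
]


def pick_tool(description: str) -> Tool:
    words = set(description.lower().split())
    for keywords, tool in ROUTES:
        if words & keywords:
            return tool
    return Tool.TEXT_BROWSER  # default: read and research


def plan(descriptions):
    return [Step(d, pick_tool(d)) for d in descriptions]


steps = plan([
    "Check my calendar for a free evening next week",
    "Read reviews of Japanese restaurants nearby",
    "Book a table for two",
])
for step in steps:
    print(f"{step.tool.value}: {step.description}")
```

The point of the sketch is the shape of the workflow: one request becomes several steps, and each step may land on a different tool.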

The real power of this system comes from how fluidly it moves between these tools. For example, you might say:

“Find a time next week when I’m free for dinner, choose a well-rated Japanese restaurant nearby that fits the schedule, and make a reservation for two.”

Or:

“Download the latest Q2 earnings reports from our top three competitors, analyze their financials, and build a 10-slide presentation highlighting key takeaways, revenue trends, and notable risks.”

And the Agent doesn’t just respond with suggestions—it executes. It opens websites, downloads documents, runs analysis in the terminal, and compiles the output into editable PowerPoint decks (.pptx) or formatted Excel spreadsheets (.xlsx) that you can immediately use or share.

What once required a clunky combination of browser tabs, note-taking apps, third-party tools, and manual effort is now unified in a single prompt-based workflow. You don’t need to coordinate between ChatGPT and five different services anymore—everything happens within the conversation. And crucially, the Agent decides how to tackle the task intelligently, choosing the optimal combination of tools based on your request.

This kind of agentic behavior marks a major step forward in making AI feel less like a chatbot—and more like a real collaborator.

What’s New Compared to Traditional ChatGPT?

The original ChatGPT was powerful, but passive. It could explain, suggest, and generate text—but it couldn’t take action. If it told you how to book a reservation or analyze a document, you still had to do the work.

ChatGPT Agent changes that. It doesn’t just advise—it acts. Using its own virtual computer, the Agent can click through websites, run code, analyze files, generate spreadsheets or slide decks, and interact with APIs. It’s the difference between getting directions and having a driver.

Compared to classic ChatGPT, the Agent is also more fluid and stateful. You can interrupt it mid-task, revise instructions, or steer it in a new direction without starting over. The conversation becomes a workspace, not just a Q&A.

We have graded the results of @OpenAI's evaluation on FrontierMath Tier 1–3 questions, and found a 27% (± 3%) performance. ChatGPT agent is a new model fine-tuned for agentic tasks, equipped with text/GUI browser tools and native terminal access. 🧵 pic.twitter.com/L7D3cEp58I
— Epoch AI (@EpochAIResearch) July 17, 2025

Its real power comes from how it picks tools on its own—shifting between a visual browser, text browser, terminal, or connectors depending on the task. That level of autonomy is new—and brings new risks.

To address that, OpenAI built in safeguards: the Agent asks for permission before making any irreversible move, and sensitive workflows like banking or email require “Watch Mode,” where you must stay present. It also skips memory by default to reduce data leakage risks.

In short, ChatGPT is now a full-fledged AI assistant that works for you, not just with you. It marks a fundamental shift from conversation to execution.

Performance Benchmarks

Unlike prior versions that mostly produced conversational output, ChatGPT Agent was benchmarked against real-world tasks and industry workflows—and the results are impressive.

Benchmark	Task Type	ChatGPT Agent	Best Previous Model	Human Baseline	Notes
Humanity’s Last Exam (HLE)	Expert-level reasoning across subjects	41.6% (44.4%†)	GPT-4o + tools (26.6%)	—	Sets new SOTA; † = multi-attempt run
FrontierMath (Tier 1–3)	Unpublished, expert-level math problems	27.4%	o4-mini (19.3%)	—	Evaluated by Epoch AI
DSBench: Data Analysis	Practical analytics tasks	87.9%	GPT-4o (34.1%)	64.1%	Real-world business workflows, spreadsheet parsing
DSBench: Data Modeling	Predictive modeling / ML tasks	85.5%	GPT-4o (45.5%)	65.0%	Includes exploratory data science, model selection
SpreadsheetBench (.xlsx)	Spreadsheet editing + formatting	45.5%	Copilot in Excel (20.0%)	71.3%	Uses LibreOffice on OSX; agent edits .xlsx files directly
Investment Banking Modeling	Complex multi-sheet financial models	71.3%	Deep research (55.9%)	—	Includes 3-statement models, LBOs, with formula validation
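
To put the table in perspective, the gap between ChatGPT Agent and the best previous model can be computed directly. The short sketch below uses only the single-attempt scores from the table above. The `gains` helper is an illustrative name, not part of any benchmark tooling.

```python
# (benchmark, ChatGPT Agent score, best previous model score), in percent,
# copied from the table above
rows = [
    ("Humanity's Last Exam", 41.6, 26.6),
    ("FrontierMath (Tier 1-3)", 27.4, 19.3),
    ("DSBench: Data Analysis", 87.9, 34.1),
    ("DSBench: Data Modeling", 85.5, 45.5),
    ("SpreadsheetBench", 45.5, 20.0),
    ("Investment Banking Modeling", 71.3, 55.9),
]


def gains(agent, previous):
    """Return (gain in percentage points, ratio to the previous best)."""
    absolute = agent - previous
    relative = agent / previous
    return round(absolute, 1), round(relative, 2)


for name, agent, previous in rows:
    points, ratio = gains(agent, previous)
    print(f"{name}: +{points} points ({ratio}x the previous best)")
```

Percentage points and ratios tell different stories: a small absolute gain on a very hard benchmark can still be a large relative jump.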

Safety Measures

With ChatGPT Agent, OpenAI moves from passive text generation to real-world action—and the security stakes rise accordingly.

The Agent introduces new risks like prompt injection and data leakage, so OpenAI equipped it with its most advanced safeguards yet:

Explicit user confirmation before any irreversible action
Real-time tool monitoring for suspicious behavior
No memory access in Agent Mode to prevent long-term data exposure
Watch Mode for sensitive tasks (e.g. banking), requiring the user to stay present
High-risk filters for biological, chemical, and financial misuse
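
Two of the safeguards above, confirmation before irreversible actions and Watch Mode for sensitive tasks, amount to a gate that every action must pass. The sketch below is a hypothetical illustration of that logic, not OpenAI's code: the `Action` fields and the `may_proceed` function are assumed names.

```python
from dataclasses import dataclass


@dataclass
class Action:
    name: str
    irreversible: bool = False  # e.g. sending an email, placing an order
    sensitive: bool = False     # e.g. banking, where Watch Mode applies


def may_proceed(action: Action, user_confirmed: bool, user_watching: bool) -> bool:
    """Return True only if the (hypothetical) safeguards allow the action."""
    if action.sensitive and not user_watching:
        return False  # sensitive tasks pause unless the user stays present
    if action.irreversible and not user_confirmed:
        return False  # irreversible steps need explicit approval first
    return True


browse = Action("read restaurant reviews")
order = Action("place grocery order", irreversible=True)
banking = Action("open banking site", sensitive=True)

for action in (browse, order, banking):
    allowed = may_proceed(action, user_confirmed=False, user_watching=False)
    print(f"{action.name}: {'proceed' if allowed else 'wait for the user'}")
```

In this toy version, only the harmless browsing step runs unattended. The other two wait until the user confirms or is watching.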

OpenAI has classified the system as having High capability in the biological and chemical domain under its Preparedness Framework, despite having no evidence of misuse, signaling a precaution-first approach.

As CEO Sam Altman put it, this is “cutting edge and experimental”—powerful, but not ready for truly high-stakes tasks or sensitive personal data just yet.

The Bigger Picture

The launch of ChatGPT Agent comes as interest in agentic AI hits a tipping point. AI agents are becoming digital workers that can act, plan, and automate across tools and services.

The market is moving fast. According to Litslink, the AI agent sector is expected to grow from $5.4B in 2022 to $47.1B by 2030, with a 45% annual growth rate. Over 85% of enterprises are forecast to use AI agents by 2025, driven by productivity gains and cost efficiency.
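
The market-size endpoints and the growth rate quoted above can be checked against each other with basic compound-growth arithmetic. The sketch below takes the figures as written: the 2022 base year and the eight-year window come straight from the sentence, and the helper names are illustrative. Taken at face value, the two endpoints imply roughly 31% growth a year, so the 45% figure likely rests on a different base year or window in the original forecast.

```python
def compound(start, rate, years):
    """Value after growing `start` at `rate` per year for `years` years."""
    return start * (1 + rate) ** years


def implied_rate(start, end, years):
    """Constant annual growth rate that takes `start` to `end` in `years` years."""
    return (end / start) ** (1 / years) - 1


start, end = 5.4, 47.1  # market size in billions of dollars, as quoted
years = 2030 - 2022

print(f"Implied annual growth: {implied_rate(start, end, years):.1%}")
print(f"$5.4B at 45% for {years} years: ${compound(start, 0.45, years):.1f}B")
```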

OpenAI isn’t alone. Anthropic released Computer Use, Google is investing heavily in Project Astra, and Perplexity launched its own agent, Comet. But OpenAI’s edge may be its unified approach, which combines a browser, terminal, connectors, and reasoning in one system, and its massive head start in user adoption.

This rollout also sets the stage for what’s next. OpenAI has already confirmed plans to:

Improve formatting for slideshows and spreadsheets
Enable user-defined templates
Reintroduce memory (once safe)
Support recurring automated tasks

Access is expanding gradually. Pro users can start now (400 agent tasks/month), while access for Plus and Team users is rolling out (40/month). Enterprise support is expected later this summer. EU access is still pending.

Final Thoughts

ChatGPT Agent represents the most significant step forward in OpenAI’s product line since the launch of GPT-4. It marks a shift from passive interaction to active task execution—transforming ChatGPT from a conversational tool into a hands-on digital assistant.

For users, this means less time switching between apps, fewer manual steps, and more streamlined workflows—whether that’s planning meetings, analyzing reports, or generating polished presentations from scratch. For OpenAI, it signals a broader strategy: not just leading in model performance, but in how AI is used day-to-day.

At the same time, the rollout remains cautious and deliberate. OpenAI places safety at the forefront—memory is turned off by default, critical actions require explicit user approval, and high-risk domains demand active supervision. The company makes no claim that this is a finished product; ChatGPT Agent is powerful, but still in an experimental phase.

Even so, the direction is clear. As capabilities mature and adoption grows, AI agents are set to become standard across both personal and professional use cases. With this release, OpenAI moves beyond simply entering the agent race—it’s actively shaping what that future looks like.

Michal Langmajer
July 18, 2025