Hello Everyone,
OpenAI is launching a new general-purpose AI agent in ChatGPT, which OpenAI says can complete a wide variety of computer-based tasks on behalf of users.
OpenAI says the agent can automatically navigate a user’s calendar, generate editable presentations and slideshows, and run code.
Is this a Manus AI competitor?
Key capabilities:
• Handles tasks from start to finish autonomously
• Navigates websites and clicks through interfaces
• Conducts deep research across multiple sources
• Runs code and performs complex analysis
• Creates editable slideshows and spreadsheets
A boost for Deep Research?
This product lets OpenAI’s Deep Research do more.
ChatGPT agent, as the bot is called, runs on a new AI model created to power the capability. But how it is substantially different from the likes of Manus AI is not clear.
The tool combines several capabilities from OpenAI’s previous agentic tools, including Operator and Deep Research, and arrives about six months after those tools were announced.
What is it even for?
The "ChatGPT agent" is designed to navigate websites, filter results, prompt for logins when needed, execute code, run analyses, and create editable documents like presentations or spreadsheets.
So this means ChatGPT can “do more” with slideshows, spreadsheets and agents you might already find on Genspark.
While OpenAI is trying to “unify” some of its tools, it sounds a bit clumsy.
ChatGPT agent, launched by OpenAI everywhere apart from the EU, not only “thinks” but also “acts”, OpenAI said in its PR.
Operator can browse the web and interact with websites to complete tasks.
Deep Research is designed for multi-step research: it combines information from different sources and generates a report.
ChatGPT agent requests permission before taking significant actions and can be interrupted and halted at any point.
There are a Lot of Possibilities Here
It has access to both a visual and a text-based browser, a terminal, OpenAI APIs, and ChatGPT connectors (for linking to services like Gmail and GitHub). And, according to OpenAI, the agent runs in its own virtual machine, which preserves context – the back and forth of prompts, responses, and data.
Early in the morning on July 18th, Sam Altman and four OpenAI researchers introduced the upcoming Agent mode of OpenAI in a live broadcast.
The Mythical “Second Brain” of Gen AI
This isn’t about chatting. It’s about having a second brain that (some influencers posted):
→ Researches online
→ Books travel
→ Analyzes data
→ Builds slides & spreadsheets
→ Writes code
→ Sends emails
→ Operates independently
Things that for the most part people do themselves with the help of multiple AI tools.
Case Study Examples on the Go?
ChatGPT Agent can follow prompts like "look at my calendar and brief me on upcoming client meetings based on recent news," "plan and buy ingredients to make Japanese breakfast for four," and "analyze three competitors and create a slide deck," per OpenAI's blog post.
To activate the tool, users can select “agent mode” in ChatGPT’s dropdown menu of tools.
If I’m being honest, the dozens of AI agents that Silicon Valley companies including OpenAI, Google, and Perplexity have unveiled in recent years promised a lot and haven’t been very helpful in real-world or complex tasks.
ChatGPT agent will be available to Pro, Plus, and Team users Thursday, and will be available for Enterprise and Education users later this summer, OpenAI said.
This “unified approach” feels a bit anticlimactic, as OpenAI has delayed both GPT-5 and its open-weight model multiple times. They also don’t seem to take trust and safety as seriously as they once did.
Just Browser-Based Tooling, Continued
ChatGPT Agent will ask permission "before taking actions of consequence," like entering passwords or payment information. Users can also take over the browser at any time.
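The permission step described above can be sketched in a few lines of Python. This is only an illustration of the general "pause before consequential actions" pattern, not how OpenAI implements it: the `Action` type, the `CONSEQUENTIAL` set, and the `ask_user` callback are all hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical labels for the "actions of consequence" mentioned above
CONSEQUENTIAL = {"enter_password", "enter_payment_info"}


@dataclass
class Action:
    kind: str
    detail: str


def run_with_permission(
    actions: list[Action], ask_user: Callable[[Action], bool]
) -> list[str]:
    """Walk through actions in order, pausing for approval on consequential ones."""
    log = []
    for action in actions:
        if action.kind in CONSEQUENTIAL and not ask_user(action):
            # The user declined, so the agent does not perform this step
            log.append(f"skipped {action.kind}")
            continue
        log.append(f"did {action.kind}")
    return log


plan = [
    Action("open_page", "shop.example"),
    Action("enter_payment_info", "card on file"),
]

# Simulate a user who declines every permission request
result = run_with_permission(plan, ask_user=lambda action: False)
print(result)  # ['did open_page', 'skipped enter_payment_info']
```

Routine steps run without interruption here, and only the flagged ones wait on the user, which is the trade-off between autonomy and control that OpenAI is describing.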
The difference between Operator and ChatGPT agent, however, is that the new agent is equipped with “deep research” capabilities that allow it to synthesize larger amounts of information it gathers from the web.
Is this Operator with Deep Research or the other way around, and does it even matter? 🤨
OpenAI says ChatGPT agent is far more capable than its previous offerings. It isn’t so clear how.
To put it simply, in agent mode you can make requests of ChatGPT directly: "I still need a pair of shoes for my wedding. Go to an e-commerce platform and buy them for me." Or, "Design some pet accessories for me and place an order for printing directly." "Search for information and generate a PPT directly." ChatGPT will then open a virtual machine by itself and carry out the task step by step.
Some of us might find some narrow use cases where this is useful, but most of us will be scratching our heads. Other potential uses for the ChatGPT agent include booking a restaurant reservation, creating detailed research reports and performing financial analysis, or so OpenAI’s PR material claims.
Advanced Reasoning on the Fly?
Powered by a new model, the Computer-Using Agent (CUA), which combines GPT-4o’s vision capabilities with reinforcement learning, ChatGPT Agent excels in reasoning and self-correction. It scored 41.6% on Humanity’s Last Exam (HLE) and 27.4% on FrontierMath, setting new benchmarks for AI agents.
With a parallel rollout strategy (running multiple attempts and selecting the most confident result), its HLE score improves to 44.4%.
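For readers wondering what a parallel rollout looks like in practice, here is a minimal Python sketch of the general "run several attempts, keep the most confident one" idea. It is not OpenAI's implementation: the `Attempt` type, the `run_attempt` callable, and the self-reported confidence score are assumptions made for the example, and `fake_attempt` is a stand-in for a real model call.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable


@dataclass
class Attempt:
    answer: str
    confidence: float  # hypothetical self-reported score between 0 and 1


def parallel_rollout(run_attempt: Callable[[int], Attempt], n: int = 8) -> Attempt:
    """Run n independent attempts concurrently and return the most confident one."""
    if n < 1:
        raise ValueError("n must be at least 1")
    with ThreadPoolExecutor(max_workers=n) as pool:
        # pool.map keeps results in the same order as the inputs
        attempts = list(pool.map(run_attempt, range(n)))
    return max(attempts, key=lambda attempt: attempt.confidence)


def fake_attempt(seed: int) -> Attempt:
    # Stand-in for a real model call: fixed toy scores so the result is repeatable
    scores = [0.42, 0.91, 0.67, 0.55]
    return Attempt(answer=f"answer-{seed}", confidence=scores[seed % len(scores)])


best = parallel_rollout(fake_attempt, n=4)
print(best.answer, best.confidence)  # answer-1 0.91
```

Note that the extra accuracy comes at the price of running the task several times over, which is worth keeping in mind when reading the higher HLE number.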
The agent can also leverage ChatGPT connectors, which allow you to connect apps like Gmail and GitHub so ChatGPT can find information relevant to your prompts and use it in its responses.
ChatGPT Agent is error-prone and should be used with caution. The PR and media coverage about it were noticeably weak and repetitive. Given that Claude Code leads, it’s hard to trust OpenAI or Google as much as Anthropic with anything MCP Agent related.
Access is Expensive
ChatGPT agent is available to Pro ($200/month, 400 prompts/month), Plus, and Team users, with Enterprise and Education access planned for later in 2025. It’s not yet available in the European Economic Area or Switzerland.
Users found Operator didn't save them that much time because it required a lot of human interaction. OpenAI says this new tool broadens its agent's "real-world utility." However, that too remains to be seen. OpenAI isn’t performing as well on specific products as it once did, due to AI talent loss and being limited to ChatGPT’s traction. They notably failed to acquire Windsurf in a very botched M&A operation.
To develop the new tool, the company combined the teams behind both Operator and Deep Research into one unified team. It sounded like a rushed endeavor.
The model underlying ChatGPT agent offers state-of-the-art performance on several benchmarks, according to OpenAI. Though this was not explained in a transparent way.