What is Microsoft's Magentic One?
Microsoft Research is building a generalist multi-agent system on top of its AutoGen framework for AI agents.
Hey Everyone,
While it’s debatable how well Microsoft’s various Copilots have performed, their pivot to AI Agents is already under way.
Magentic-One is a multi-agent development platform that's aimed at fielding teams of agents focused on tackling many tasks people encounter in daily life.
The platform allows creators to define and deploy a task-focused team of LLM "experts" with abilities to surf the web, access files, and write and execute programs to accomplish specific goals. Magentic-One is built on the open-source AutoGen framework. I'm excited to see what the larger community of researchers and developers will do with Magentic-One.
Microsoft’s ideas around AI agents seem to depend mostly on what OpenAI is doing. What frameworks for AI agents are going to do well? It’s still a bit soon to tell. Enterprises looking to deploy multiple AI agents often need to implement a framework to manage them.
What if a team of agents could accomplish tasks together? Magentic-One is such a team. It consists of an Orchestrator that plans and coordinates four other agents: a FileSurfer for handling diverse file types, a WebSurfer for browser interactions, a Coder for writing programs, and a ComputerTerminal for executing code.
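To make the team structure concrete, here is a minimal sketch in plain Python. It is not the AutoGen or Magentic-One API: the `Team` class, `make_stub`, and the stub handlers are hypothetical names invented for illustration, and the "agents" only echo their instructions. The point is just that each specialist is a named, swappable component that something else can route work to.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical stand-in: an "agent" here is just a function from an
# instruction string to a result string. Real agents wrap an LLM and tools.
Handler = Callable[[str], str]


@dataclass
class Team:
    """A named roster of specialist agents (illustrative only)."""
    agents: Dict[str, Handler] = field(default_factory=dict)

    def add(self, name: str, handler: Handler) -> None:
        self.agents[name] = handler

    def remove(self, name: str) -> None:
        self.agents.pop(name, None)

    def dispatch(self, name: str, instruction: str) -> str:
        # Route one instruction to one named specialist.
        if name not in self.agents:
            raise KeyError(f"no agent named {name!r}")
        return self.agents[name](instruction)


def make_stub(role: str) -> Handler:
    # Stub that echoes the instruction, tagged with the agent's role.
    return lambda instruction: f"[{role}] {instruction}"


team = Team()
for role in ("FileSurfer", "WebSurfer", "Coder", "ComputerTerminal"):
    team.add(role, make_stub(role))

print(team.dispatch("WebSurfer", "open the project page"))
```

Because the roster is just a dictionary in this sketch, adding or dropping a specialist is a single call to `add` or `remove`.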
Consider reading my flagship Newsletter, AI Supremacy.
As a test, Microsoft Research showed that their generalist team achieves competitive results across multiple benchmarks, reaching 38% on GAIA, 27.7% on AssistantBench, and 32.8% on WebArena tasks, which is statistically comparable to specialized state-of-the-art systems.
Github: https://github.com/microsoft/autogen/tree/main/python/packages/autogen-magentic-one
Blog: https://www.microsoft.com/en-us/research/articles/magentic-one-a-generalist-multi-agent-system-for-solving-complex-tasks/
Tech Report: https://www.microsoft.com/en-us/research/publication/magentic-one-a-generalist-multi-agent-system-for-solving-complex-tasks/
Is Magentic One an early winner of AutoGen?
Magentic-One is a generalist agentic system built on AutoGen whose performance is competitive with state-of-the-art (SOTA) solutions on multiple challenging agentic benchmarks, without requiring modification to the underlying agents or how they collaborate.
Can AI be its own Orchestrator?
Modern AI agents, driven by advances in large foundation models, promise to enhance our productivity and transform our lives by augmenting our knowledge and capabilities.
To achieve this vision, AI agents must effectively plan, perform multi-step reasoning and actions, respond to novel observations, and recover from errors, to successfully complete complex tasks across a wide range of scenarios.
The Microsoft Research team sees Magentic-One as a high-performing open-source agentic system for solving such tasks. Magentic-One uses a multi-agent architecture where a lead agent, the Orchestrator, plans, tracks progress, and re-plans to recover from errors.
Throughout task execution, the Orchestrator also directs other specialized agents to perform tasks as needed, such as operating a web browser, navigating local files, or writing and executing Python code.
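The plan, track, and re-plan cycle described above can be sketched as a small loop. This is a toy under stated assumptions, not Microsoft's implementation: `run_orchestrator`, the `replan` callback, the `max_replans` limit, and the three stub agents are all hypothetical, and a real Orchestrator would use an LLM to write and revise the plan rather than a hard-coded fallback.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, str]  # (agent name, instruction)


def run_orchestrator(
    plan: List[Step],
    replan: Callable[[str, str], List[Step]],
    agents: Dict[str, Callable[[str], str]],
    max_replans: int = 2,
) -> List[str]:
    """Execute a plan step by step, re-planning when a step fails."""
    log: List[str] = []   # progress record kept by the orchestrator
    queue = list(plan)
    replans = 0
    while queue:
        name, instruction = queue.pop(0)
        try:
            result = agents[name](instruction)
            log.append(f"ok {name}: {result}")
        except Exception as exc:
            log.append(f"failed {name}: {exc}")
            if replans >= max_replans:
                log.append("giving up")
                break
            replans += 1
            # The new plan replaces whatever was left of the old one.
            queue = replan(name, instruction)
    return log


# Hypothetical stub agents: the web surfer always fails in this demo.
def web_surfer(instruction: str) -> str:
    raise RuntimeError("page timed out")


def coder(instruction: str) -> str:
    return "wrote script for " + instruction


def terminal(instruction: str) -> str:
    return "ran " + instruction


agents = {"WebSurfer": web_surfer, "Coder": coder, "ComputerTerminal": terminal}
plan = [("WebSurfer", "download data"), ("ComputerTerminal", "analysis")]


def fallback(failed_agent: str, instruction: str) -> List[Step]:
    # Hard-coded recovery: hand the failed step to the Coder instead.
    return [("Coder", instruction), ("ComputerTerminal", "analysis")]


log = run_orchestrator(plan, fallback, agents)
for line in log:
    print(line)
```

In this run the WebSurfer step fails, the fallback plan hands the same instruction to the Coder, and the remaining work completes. Capping the number of re-plans keeps a persistently failing task from looping forever.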
Their experiments show that Magentic-One achieves statistically competitive performance to the state-of-the-art on three diverse and challenging agentic benchmarks: GAIA, AssistantBench, and WebArena. Notably, Magentic-One achieves these results without modification to core agent capabilities or to how they collaborate, demonstrating progress towards the vision of generalist agentic systems.
Moreover, Magentic-One’s modular design allows agents to be added or removed from the team without additional prompt tuning or training, easing development and making it extensible to future scenarios. They provide an open-source implementation of Magentic-One and AutoGenBench, a standalone agentic evaluation tool.
AutoGenBench provides built-in controls for repetition and isolation to run agentic benchmarks where actions may produce side-effects, in a rigorous and contained way. Magentic-One, AutoGenBench, and detailed empirical performance evaluations of Magentic-One, including ablations and error analysis, are available at https://aka.ms/magentic-one.
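Why repetition and isolation matter is easy to show with a toy harness. The sketch below is not AutoGenBench and does not use its interface: `run_repeated` and `toy_task` are hypothetical, and a fresh temporary directory stands in for whatever isolation mechanism the real tool uses. Each run gets its own working directory, so a file written by one run cannot leak into the next.

```python
import tempfile
from pathlib import Path
from typing import Callable, List


def run_repeated(task: Callable[[Path, int], bool], repetitions: int) -> List[bool]:
    """Run a task several times, each in its own throwaway directory."""
    results: List[bool] = []
    for run_index in range(repetitions):
        # The directory and everything in it is deleted when the block exits.
        with tempfile.TemporaryDirectory() as workdir:
            results.append(task(Path(workdir), run_index))
    return results


def toy_task(workdir: Path, run_index: int) -> bool:
    # Produces a side effect (a file) and reports whether a previous
    # run's side effect was visible. With isolation it never should be.
    marker = workdir / "side_effect.txt"
    leaked = marker.exists()
    marker.write_text(f"run {run_index}")
    return not leaked


outcomes = run_repeated(toy_task, 3)
success_rate = sum(outcomes) / len(outcomes)
print(outcomes, success_rate)
```

Repeating the task gives a success rate rather than a single pass or fail, which is what you want when agent behavior varies from run to run.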
The company that can make the winning framework for AI agents is going to be a huge winner.
Contributors: Adam Fourney, Gagan Bansal, Hussein Mozannar, Cheng Tan, Eduardo Salinas, Erkang (Eric) Zhu, Friederike Niedtner, Grace Proebsting, Griffin Bassman, Jack Gerrits, Jacob Alber, Peter Chang, Ricky Loynd, Robert West, Victor Dibia, Ahmed Awadallah, Ece Kamar, Rafah Hosn, Saleema Amershi. - Look them up on LinkedIn if you want to see what they are saying about it.
It’s great to see Microsoft so bullish on AutoGen, as I’ve been following it from the beginning.
Are Generalist Agentic Systems going to manifest in 2025 or 2026?