Fara1.5-27B scored 72% on Online-Mind2Web, beating OpenAI Operator (58.3%) and Gemini 2.5 Computer Use (57.3%).
The models are open-weight, come in 4 billion, 9 billion, and 27 billion parameter sizes, and are built on fine-tuned Qwen 3.5.
Fara1.5-9B is live now on Azure AI Foundry; 4B and 27B arrive shortly.
Imagine telling your computer to look up vacation rentals, compare five sites, fill out the booking form, and confirm the one closest to the beach. You go make coffee. It’s done when you get back. That is the promise of “computer use agents”—AI that reads your browser screen and clicks, scrolls, and types exactly as a human would, with no special plugins required.
OpenAI tried this first with Operator, launched in January 2025 at $200 a month before being folded into ChatGPT Agent and shut down in August. Google has Gemini 2.5 Computer Use. Both are proprietary, cloud-based, and expensive to run.
This week, Microsoft Research released a tiny model named Fara1.5—and on the benchmarks that count, it beats them both.
The family comes in three sizes: 4 billion, 9 billion, and 27 billion parameters, all built on Qwen3.5, an Alibaba base model that Microsoft fine-tuned for browser work, with all weights publicly released. (Parameters are what determine an AI model’s breadth of knowledge, with more generally meaning a higher capacity.)
Getting there required rethinking the whole development process from scratch. “We started with a simple question: What does it take to make a small model genuinely good at agentic tasks?” the AI Frontiers team wrote. “The answer spanned the full lifecycle—data generation, training objectives, model design, and orchestration had to be redesigned together rather than in isolation.”
The benchmarks
Online-Mind2Web is the benchmark that matters most for the task Microsoft set out to excel at. It tests how often an AI agent correctly completes 300 diverse, real-world tasks across 136 popular live websites (things like comparing products, filling forms, and booking services), scored as a percentage of tasks finished correctly on the actual, changing internet.
Fara1.5-27B scored 72%. OpenAI Operator scored 58.3%. Google’s Gemini 2.5 Computer Use scored 57.3%. Yutori’s Navigator n1, the top proprietary alternative, reached 64.7%. Even Fara1.5-9B, the mid-sized model, hit 63.4%—ahead of both OpenAI and Google.
Open-source rivals also fell short. Alibaba’s GUI-Owl-1.5 at 8 billion parameters scored 48.6%. AI2’s MolmoWeb scored 35.3%. Microsoft’s own previous model, Fara-7B, scored 34.1%, so Fara1.5-9B’s 63.4% is nearly double its predecessor’s score at a comparable size.
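To make those gaps concrete, here is a minimal Python sketch that compares the Online-Mind2Web scores reported above. The numbers come straight from this article; the dictionary and helper names (`ONLINE_MIND2WEB`, `ratio`, `gap`) are made up for illustration and are not part of any benchmark tooling.

```python
# Online-Mind2Web scores as reported in this article (percent of tasks completed).
ONLINE_MIND2WEB = {
    "Fara1.5-27B": 72.0,
    "Yutori Navigator n1": 64.7,
    "Fara1.5-9B": 63.4,
    "OpenAI Operator": 58.3,
    "Gemini 2.5 Computer Use": 57.3,
    "GUI-Owl-1.5 (8B)": 48.6,
    "MolmoWeb": 35.3,
    "Fara-7B": 34.1,
}


def ratio(model: str, baseline: str) -> float:
    """How many times higher `model` scores than `baseline`, rounded to 2 places."""
    return round(ONLINE_MIND2WEB[model] / ONLINE_MIND2WEB[baseline], 2)


def gap(model: str, baseline: str) -> float:
    """Difference in percentage points between `model` and `baseline`."""
    return round(ONLINE_MIND2WEB[model] - ONLINE_MIND2WEB[baseline], 1)


if __name__ == "__main__":
    print(ratio("Fara1.5-9B", "Fara-7B"))        # 1.86, i.e. "nearly double"
    print(gap("Fara1.5-27B", "OpenAI Operator"))  # 13.7 percentage points
```

By the same arithmetic, the 27B model scores a little more than twice what Fara-7B did, though at roughly four times the parameter count.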
On WebVoyager, a second benchmark that measures task success on the live web and is scored the same way, Fara1.5-27B hit 88.6%, edging OpenAI Operator’s 87.0% and beating H Company’s 30-billion-parameter Holo2 at 83.0%.
How it learned
The secret sauce is the training pipeline. Microsoft used a system called FaraGen1.5 to generate the training data. Here’s the clever part: they used GPT-5.4—OpenAI’s model—as a “teacher agent” to demonstrate how to complete browser tasks. Those demonstrations become the training data for Fara1.5. You’re essentially using OpenAI’s most capable model to train a rival open-source one.
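The article does not describe FaraGen1.5’s internals, so what follows is only a generic Python sketch of the idea: turning a teacher agent’s recorded demonstrations into supervised training examples. Every name here (`Step`, `Trajectory`, `to_training_examples`) and the prompt/completion format are hypothetical, and keeping only successful demonstrations is an assumption, not something Microsoft has confirmed.

```python
from dataclasses import dataclass


@dataclass
class Step:
    observation: str  # hypothetical: a text description of what is on screen
    action: str       # hypothetical: the action the teacher took, e.g. "click(search_box)"


@dataclass
class Trajectory:
    task: str
    steps: list[Step]
    succeeded: bool   # whether the teacher finished the task


def to_training_examples(trajectories: list[Trajectory]) -> list[dict[str, str]]:
    """Flatten teacher demonstrations into (prompt, completion) pairs."""
    examples = []
    for traj in trajectories:
        if not traj.succeeded:
            continue  # assumption: failed demonstrations are discarded
        for step in traj.steps:
            examples.append({
                "prompt": f"Task: {traj.task}\nScreen: {step.observation}",
                "completion": step.action,
            })
    return examples


# Two made-up demonstrations: one that worked and one that did not.
demos = [
    Trajectory(
        task="Find a beach rental",
        steps=[
            Step("search page with empty box", "type(search_box, 'beach rental')"),
            Step("results list", "click(first_result)"),
        ],
        succeeded=True,
    ),
    Trajectory(
        task="Book a flight",
        steps=[Step("login wall", "click(back)")],
        succeeded=False,
    ),
]

training_data = to_training_examples(demos)
```

Here `training_data` contains two examples, both from the successful demonstration. The smaller model is then fine-tuned to produce each `completion` when shown the matching `prompt`.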
They also created six fake, fully functional replicas of real websites—email clients, calendars, marketplaces—so the model could practice tasks that require logins or irreversible actions (like actually sending an email or booking a flight) without touching real accounts. That’s called synthetic domain training, and it’s a significant part of why Fara1.5 handles “gated” tasks better than its predecessors.
Every model is designed to stop and ask before doing something it cannot undo. “Balancing robust safeguards such as Critical Points with seamless user journeys is key,” Yash Lara, Senior PM Lead at Microsoft Research, told VentureBeat. “Having a UI, like Microsoft Research’s Magentic-UI, is vital for giving users opportunities to intervene when necessary, while also helping to avoid approval fatigue.”
That matters because OpenAI was not subtle about the risks when it launched ChatGPT Agent. “When you sign ChatGPT agent into websites or enable connectors, it will be able to access sensitive data from those sources, such as emails, files, or account information,” the company wrote.
Fara1.5 runs everything through MagenticLite, a sandboxed browser environment that logs every action and lets users halt the agent at any point.
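In rough terms, that combination of stop-and-ask safeguards, action logging, and a user halt switch can be pictured as a gate around the agent’s action loop. The Python sketch below is illustrative only: it is not MagenticLite’s or Magentic-UI’s actual API, and the names `Action`, `run_with_critical_points`, `approve`, and `should_halt` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Action:
    name: str
    irreversible: bool = False  # hypothetical flag marking a "critical point"


def run_with_critical_points(
    actions: Iterable[Action],
    approve: Callable[[Action], bool],
    should_halt: Callable[[], bool] = lambda: False,
) -> list[tuple[str, str]]:
    """Run actions in order, logging each one.

    Stops if the user halts the agent, or if an irreversible action
    is not approved by the user.
    """
    log: list[tuple[str, str]] = []
    for action in actions:
        if should_halt():
            log.append(("halted", action.name))
            break
        if action.irreversible and not approve(action):
            log.append(("blocked", action.name))
            break
        # A real agent would perform the browser action here.
        log.append(("executed", action.name))
    return log


plan = [
    Action("fill_booking_form"),
    Action("confirm_payment", irreversible=True),
    Action("close_tab"),
]

# The user declines the irreversible step, so the agent stops there.
declined_log = run_with_critical_points(plan, approve=lambda a: False)
```

In this run `declined_log` records the form being filled and the payment being blocked, and nothing after that. Asking only at irreversible steps, rather than at every click, is one way to limit the approval fatigue Lara mentions.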
Browser AI has become a crowded race, with Google’s Gemini in Chrome, Perplexity’s Comet, and Anthropic’s Claude for Chrome all competing. Fara1.5’s edge is that it is open: the weights are public, the inference code is open on GitHub, and it runs on hardware you control. Fara1.5-9B is live now on Azure AI Foundry; the 4B and 27B variants arrive shortly. Microsoft says it plans to expand Fara1.5 beyond the browser and into desktop and enterprise software next.
The FSNN News Room is the voice of our in-house journalists, editors, and researchers. We deliver timely, unbiased reporting at the crossroads of finance, cryptocurrency, and global politics, providing clear, fact-driven analysis free from agendas.