FSNN | Free Speech News Network
Monday, June 29

Ornith Is the Open-Source Coding Model Built for Agents, Not Humans

By News Room

In brief

  • DeepReinforce released Ornith-1.0 on June 25 under MIT license, purpose-built for AI coding agents working in real terminal and repository environments.
  • The 9B variant scores 69.4 on SWE-bench Verified, outperforming Google’s Gemma 4-31B (52.0).
  • Ornith’s own model card warns the models may underperform on non-coding tasks—they are wired for developer pipelines, not general-purpose AI conversations.

DeepReinforce, an AI research lab previously known for CUDA-L1 and the IterX code-agent optimization loop, released Ornith-1.0 late last week. It is a family of open-source coding models available on Hugging Face in four sizes: 9 billion parameters, 31 billion, a 35 billion mixture-of-experts model, and a 397 billion mixture-of-experts flagship. All are released under the MIT license with no regional restrictions.

Parameters are the adjustable numerical values, roughly the dials, that a model tunes during training. In general, the more parameters, the more capable the model. A 9-billion-parameter model is considered small: light enough to run on a high-end smartphone, but not capable of handling heavy reasoning tasks reliably. A 397-billion-parameter model is much more capable but requires heavy computing, the kind that is not available on consumer hardware.
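
To make the hardware gap concrete, here is a back-of-the-envelope sketch that multiplies parameter count by bytes per parameter. The precision table and the function name are illustrative assumptions, not figures from DeepReinforce, and the estimate covers weight storage only.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Weights only; activations and the KV cache need additional memory.
# The precision labels below are common storage formats, chosen for illustration.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billions: float, precision: str = "fp16") -> float:
    """Return approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return params_billions * 1e9 * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    # The four Ornith-1.0 sizes named in the article.
    for size in (9, 31, 35, 397):
        row = ", ".join(
            f"{name}: {weight_memory_gb(size, name):.1f} GB"
            for name in BYTES_PER_PARAM
        )
        print(f"{size}B -> {row}")
```

At 16-bit precision the 9B model needs about 18 GB for weights alone, while the 397B model needs roughly 794 GB. A mixture-of-experts model activates only part of its weights per token, but all of them still have to be stored.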

The lab describes it as “a self-improving family of open-source models specially for agentic coding tasks.” That word—agentic—is doing a lot of work.

Aloha! 🌺 Meet Ornith-1.0, a family of open-source LLMs specialized for agentic coding.

Ornith-1.0 spans the full parameter sizes including 9B Dense, 31B Dense, 35B MoE, and 397B MoE. It achieves state-of-the-art performance among open-source models of comparable size on… pic.twitter.com/7g1rmacLps

— Ornith (@ornith_) June 25, 2026

Most AI that people interact with is conversational: you type, it responds, the exchange ends. Agentic AI is different—it gets a task and takes actions to complete it without a human guiding each step. In a coding context, that means an AI that reads files, runs tests, identifies what failed, fixes the code, and loops again until it’s done.
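
The read, test, fix, repeat cycle described above can be sketched in a few lines. This is a toy illustration, not Ornith's actual harness: the in-memory "repository", the `run_tests` check, and the `stub_model` function are all hypothetical stand-ins for a real codebase, test suite, and model call.

```python
from typing import Callable, Dict, Optional, Tuple

Repo = Dict[str, str]  # file path -> source text (toy stand-in for a repository)


def run_tests(repo: Repo) -> Optional[str]:
    """Toy test runner: return None on success, or a failure message."""
    namespace: dict = {}
    exec(repo["calc.py"], namespace)  # load the code under test
    if namespace["add"](2, 3) != 5:
        return "add(2, 3) should equal 5"
    return None


def stub_model(repo: Repo, failure: str) -> Repo:
    """Hypothetical stand-in for a coding model: return a patched copy."""
    patched = dict(repo)
    patched["calc.py"] = repo["calc.py"].replace("a - b", "a + b")
    return patched


def agent_loop(
    repo: Repo,
    propose_fix: Callable[[Repo, str], Repo],
    max_fixes: int = 5,
) -> Tuple[Repo, bool, int]:
    """Run tests, request a fix on failure, and repeat until green or out of budget."""
    fixes = 0
    while True:
        failure = run_tests(repo)
        if failure is None:
            return repo, True, fixes
        if fixes == max_fixes:
            return repo, False, fixes
        repo = propose_fix(repo, failure)
        fixes += 1


if __name__ == "__main__":
    buggy = {"calc.py": "def add(a, b):\n    return a - b\n"}
    _, passed, fixes = agent_loop(buggy, stub_model)
    print(f"passed={passed} after {fixes} fix(es)")
```

The fix budget matters: without it, an agent that never finds a working patch would loop forever with no human there to stop it.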

So agentic AI means no one needs to be at the keyboard most of the time. That’s the whole point. It is also where the most commercially relevant progress is happening in 2026: models that can run unsupervised through 20-step dev workflows are worth more than ones that write a clean function on request.

However, most large language models are still designed with human feedback in mind.

How Ornith’s brain works

Most AI coding agents are paired with a human-designed harness—a fixed set of rules for how the agent structures its work: when to call a tool, how to handle an error, how to decompose a multi-step problem. Ornith instead “treats the scaffold as a learnable object that co-evolves with the policy.”

Translation: instead of inheriting someone else’s playbook, it develops its own.

During reinforcement learning, each training step happens in two stages. The model first reads the task and proposes a refined strategy for approaching it. Then it uses that strategy to generate a solution.

The reward from the outcome flows back to both stages, so the model is optimized for writing better strategies, not just better code. Do that thousands or millions of times, and task-specific approaches emerge without a human engineering them.
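
The idea of one outcome reward updating both stages can be illustrated with a toy example. The sketch below is not DeepReinforce's training code: it assumes a two-choice strategy stage, a two-choice solution stage, a made-up verifier that rewards only one combination, and a plain REINFORCE-style update on softmax preferences with no baseline.

```python
import math
import random
from typing import List, Tuple


def softmax(prefs: List[float]) -> List[float]:
    """Turn preference scores into probabilities."""
    top = max(prefs)
    exps = [math.exp(p - top) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def sample(probs: List[float], rng: random.Random) -> int:
    """Draw one index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def reinforce_update(prefs: List[float], action: int, reward: float, lr: float) -> None:
    """REINFORCE step: the gradient of log-softmax is (1 - p) for the chosen action, -p otherwise."""
    probs = softmax(prefs)
    for i in range(len(prefs)):
        grad = (1.0 if i == action else 0.0) - probs[i]
        prefs[i] += lr * reward * grad


def train(steps: int = 2000, lr: float = 0.1, seed: int = 0) -> Tuple[List[float], List[List[float]]]:
    rng = random.Random(seed)
    strategy_prefs = [0.0, 0.0]                 # stage 1: which strategy to propose
    solution_prefs = [[0.0, 0.0], [0.0, 0.0]]   # stage 2: solution, conditioned on the strategy
    for _ in range(steps):
        s = sample(softmax(strategy_prefs), rng)
        a = sample(softmax(solution_prefs[s]), rng)
        # Toy verifier (an assumption): only strategy 1 with solution 1 passes.
        reward = 1.0 if (s == 1 and a == 1) else 0.0
        # The same outcome reward updates both stages.
        reinforce_update(strategy_prefs, s, reward, lr)
        reinforce_update(solution_prefs[s], a, reward, lr)
    return strategy_prefs, solution_prefs


if __name__ == "__main__":
    strategy_prefs, solution_prefs = train()
    print("P(strategy 1):", round(softmax(strategy_prefs)[1], 3))
    print("P(solution 1 | strategy 1):", round(softmax(solution_prefs[1])[1], 3))
```

Because the strategy stage is rewarded only when the downstream solution passes, it drifts toward strategies that make good solutions likely, which is the co-evolution effect the lab describes, shown here at miniature scale.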

DeepReinforce also takes reward hacking seriously. If the model can write its own training scaffold, it can theoretically write a scaffold that games the verifier—touching a file to make it look like it completed a task without actually doing the work. Three layers of defense block this: the environment and test suite are immutable and outside the model’s reach, a deterministic monitor flags any attempt to access restricted paths or alter verification scripts, and a frozen judge model sits on top of the automated verifier as a veto.
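
A minimal sketch of how such layered checks could gate a reward follows. Everything here is an assumption for illustration: the `Episode` fields, the restricted path prefixes, and the toy judge are hypothetical, and the first layer (an immutable environment) is a property of the sandbox itself, represented only by treating `tests_passed` as a result the model cannot edit.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical protected locations; a real setup would define its own.
RESTRICTED_PREFIXES = ("/tests/", "/verifier/")


@dataclass
class Episode:
    touched_paths: List[str]  # every path the agent tried to read or write
    tests_passed: bool        # result computed outside the agent's reach
    transcript: str           # the agent's actions, shown to the judge


def monitor_flags(episode: Episode) -> bool:
    """Deterministic monitor: flag any access to a restricted path."""
    return any(path.startswith(RESTRICTED_PREFIXES) for path in episode.touched_paths)


def gated_reward(episode: Episode, judge_approves: Callable[[str], bool]) -> float:
    """Grant reward only if the monitor, the verifier, and the judge all agree."""
    if monitor_flags(episode):
        return 0.0  # tampering attempt: no reward regardless of test result
    if not episode.tests_passed:
        return 0.0  # automated verifier failed
    if not judge_approves(episode.transcript):
        return 0.0  # frozen judge vetoes a suspicious "pass"
    return 1.0


if __name__ == "__main__":
    def toy_judge(transcript: str) -> bool:
        # Stand-in for a frozen judge model: reject a bare `touch` with no real work.
        return "touch " not in transcript

    honest = Episode(["/src/app.py"], True, "edit /src/app.py; run tests")
    tamper = Episode(["/tests/test_app.py"], True, "edit /tests/test_app.py")
    faked = Episode(["/src/done.flag"], True, "touch /src/done.flag")
    for name, ep in (("honest", honest), ("tamper", tamper), ("faked", faked)):
        print(name, gated_reward(ep, toy_judge))
```

Ordering the checks from cheapest to most expensive means the judge model is consulted only for episodes that already passed the deterministic checks.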

The numbers

The flagship 397 billion parameter model posts 82.4 on SWE-bench Verified—a test where an AI is given a real bug from an open-source GitHub repository and must fix it without seeing the test suite, scored as the percentage of issues it successfully resolves.

That beats Claude Opus 4.7’s 80.8 and DeepSeek-V4-Pro’s 80.6 on the same test. On Terminal Bench 2.1—89 tasks run inside containerized terminal environments ranging from debugging async code to resolving security vulnerabilities, scored by completion rate—it posts 77.5 against Claude Opus 4.7’s 70.3. 

Given that SWE-bench contamination concerns have been raised publicly (OpenAI argued earlier this year that models were inflating scores by memorizing benchmark solutions seen during training), Ornith also reports numbers on SWE-bench Pro, a harder version using more diverse, less-leaked codebases and scored the same way. The 397 billion model lands at 62.2 there: meaningfully lower, but still competitive with the field, and still better than DeepSeek-V4-Pro.

The 9 billion parameter model might be the more interesting data point. It posts 69.4 on SWE-bench Verified—higher than Gemma 4-31B’s 52 and competitive with Qwen 3.5-35B’s 70, despite being 3-4 times smaller.

Who it’s for, and who it isn’t

Ornith-1.0 is explicitly not a general-purpose AI. The model’s own documentation says it may underperform on tasks outside agentic coding. If you want AI to summarize a document, help you write your doctoral thesis, or draft an email, Ornith-1.0 is the wrong pick.

It’s optimized for a narrow problem set: developer pipelines where an AI agent takes a task description, operates inside a code repository or terminal session, and completes multi-step work without intervention. This is a tool that was built for people who are already running agent infrastructure—not for people trying to decide if AI is worth using.

The “beats Claude” headline is real but requires context. As Decrypt reported, every lab is now chasing performance on agentic coding evals, because that’s where the useful performance differences live.

Ornith-1.0-397B does surpass Claude Opus 4.7 on both coding benchmarks cited above, but Anthropic’s current flagship, Claude Opus 4.8, scores higher. The comparison that holds is within the open-source category, at comparable parameter counts, on coding-specific agent tasks.

For developers building self-hosted coding pipelines, agentic infrastructure, or similar coding-focused work, the small and medium models running on edge hardware may be genuinely useful, but the average Joe may be better off looking elsewhere.
