FSNN | Free Speech News Network
Six Months Of ‘AI CSAM Crisis’ Headlines Were Based On Misleading Data

By News Room | 4 weeks ago | 9 Mins Read | 280 Views
from the lies,-damned-lies,-and… dept

Remember last summer when everyone was freaking out about the explosion of AI-generated child sexual abuse material? The New York Times ran a piece in July with the headline “A.I.-Generated Images of Child Sexual Abuse Are Flooding the Internet.” The National Center for Missing & Exploited Children (NCMEC) put out a blog post calling the numbers an “alarming increase” and a “wake-up call.” The numbers were genuinely shocking: NCMEC reported receiving 485,000 AI-related CSAM reports in the first half of 2025, compared to just 67,000 for all of 2024.

That’s a big increase! And it would obviously be super concerning if any AI company were detecting so much AI-generated CSAM, especially since we keep hearing that the big AI models (perhaps with the exception of Grok…) have put safeguards in place against CSAM generation.

The source of most of those reports? Amazon, which had submitted a staggering 380,000 of them, even though most people don’t think of Amazon as much of an AI company. But, still, it became a six-alarm fire over how much AI-generated CSAM Amazon had supposedly discovered. There were news stories about it, politicians demanding action, and a general sentiment that this proved how big the problem was.

Except… it turns out that wasn’t actually what was happening. At all.

Bloomberg just published a deep dive into what was actually going on with Amazon’s reports, and the truth is very, very different from what everyone assumed. According to Bloomberg:

Amazon.com Inc. reported hundreds of thousands of pieces of content last year that it believed included child sexual abuse, which it found in data gathered to improve its artificial intelligence models. Though Amazon removed the content before training its models, child safety officials said the company has not provided information about its source, potentially hindering law enforcement from finding perpetrators and protecting victims.

Here’s the kicker—and I cannot stress this enough—none of Amazon’s reports involved AI-generated CSAM.

None of its reports submitted to NCMEC were of AI-generated material, the spokesperson added. Instead, the content was flagged by an automatic detection tool that compared it against a database of known child abuse material involving real victims, a process called “hashing.” Approximately 99.97% of the reports resulted from scanning “non-proprietary training data,” the spokesperson said.

What Amazon was actually reporting was known CSAM—images of real victims that already existed in databases—that their scanning tools detected in datasets being considered for AI training. They found it using traditional hash-matching detection tools, flagged it, and removed it before using the data. Which is… actually what you’d want a company to do?
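The hash-matching process described above can be sketched roughly like this (a minimal illustration only, not Amazon’s actual system: real deployments match against vetted industry hash databases and typically use perceptual hashes such as PhotoDNA rather than exact cryptographic hashes, and every name and hash value here is hypothetical):

```python
import hashlib

# Hypothetical stand-in for a vetted database of hashes of known material.
# (This example value is just the SHA-256 of the bytes b"foo".)
KNOWN_HASHES = {
    "2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae",
}

def sha256_of(data: bytes) -> str:
    """Exact-match fingerprint of a file's raw bytes."""
    return hashlib.sha256(data).hexdigest()

def scan_training_sample(data: bytes) -> bool:
    """True if the sample matches a known hash, i.e. it should be
    removed from the training pipeline and reported."""
    return sha256_of(data) in KNOWN_HASHES

# A vetting pass over candidate training data: drop anything that matches.
dataset = [b"foo", b"harmless text"]
clean = [d for d in dataset if not scan_training_sample(d)]
```

The key point, which the reporting confusion obscured, is that nothing in this pipeline generates anything; it only recognizes files that already exist in a database of known material.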

But because it was found in the context of AI development, and because NCMEC’s reporting form has exactly one checkbox that says “Generative AI” with no way to distinguish between “we found known CSAM in our training data pipeline” and “our AI model generated new CSAM,” Amazon checked the box.

And thus, a massive misunderstanding was born.

Again, let’s be clear and separate out a few things here: the fact that Amazon found CSAM (known or not) in its training data is bad. It is a troubling sign of how much CSAM is found in the various troves of data AI companies use for training. And maybe the focus should be on that. Also, the fact that they then reported it to NCMEC and removed it from their training data after discovering it with hash matching is… good. That’s how things are supposed to work.

But the fact that the media (with NCMEC’s help) turned this into “OMG AI generated CSAM is growing at a massive rate” is likely extremely misleading.

Riana Pfefferkorn at Stanford, who co-authored an important research report last year about the challenges of NCMEC’s reporting system (which we wrote two separate posts about), wrote a letter to NCMEC that absolutely nails what went wrong here:

For half a year, “Massive Spike In AI-Generated CSAM” is the framing I’ve seen whenever news reports mention those H1 2025 numbers. Even the press release for a Senate bill about safeguarding AI models from being tainted with CSAM stated, “According to the National Center for Missing & Exploited Children, AI-generated material has proliferated at an alarming rate in the past year,” citing the NYT article.

Now we find out from Bloomberg that zero of Amazon’s reports involved AI-generated material; all 380,000 were hash hits to known CSAM. And we have Fallon [McNulty, executive director of the CyberTipline] confirming to Bloomberg that “with the exception of Amazon, the AI-related reports [NCMEC] received last year came in ‘really, really small volumes.’”

That is an absolutely mindboggling misunderstanding for everyone — the general public, lawmakers, researchers like me, etc. — to labor under for so long. If Bloomberg hadn’t dug into Amazon’s numbers, it’s not clear to me when, if ever, that misimpression would have been corrected. 

She’s not wrong. Nearly 80% of all “Generative AI” CyberTipline reports to NCMEC in the first half of 2025 involved no AI-generated CSAM at all. The actual volume of AI-generated CSAM being reported? Apparently “really, really small.”

Now, to be (slightly?) fair to the NYT, they did run a minor correction a day after their original story noting that the 485,000 reports “comprised both A.I.-generated material and A.I. attempts to create material, not A.I.-generated material alone.” But that correction still doesn’t capture what actually happened. It wasn’t “AI-generated material and attempts”—it was overwhelmingly “known CSAM detected during AI training data vetting.” Those are very different things.

And it gets worse. Bloomberg reports that Amazon’s scanning threshold was set so low that many of those reports may not have even been actual CSAM:

Amazon believes it over-reported these cases to NCMEC to avoid accidentally missing something. “We intentionally use an over-inclusive threshold for scanning, which yields a high percentage of false positives,” the spokesperson added.

So we’ve got reports that aren’t AI-generated CSAM, many of which may not even be CSAM at all. Very helpful.
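The “over-inclusive threshold” point is easiest to see with perceptual hashing, where matching is a distance comparison rather than exact equality. A toy sketch, assuming 8-bit fingerprints for readability (real perceptual hashes such as PhotoDNA use far larger ones, and all values here are made up):

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two perceptual-hash fingerprints."""
    return bin(a ^ b).count("1")

def matches(sample_hash: int, known_hash: int, max_distance: int) -> bool:
    """A sample 'matches' if its hash is within max_distance bits of a
    known hash. Raising max_distance is more over-inclusive: it catches
    slightly altered copies, but also flags unrelated content."""
    return hamming(sample_hash, known_hash) <= max_distance

known = 0b1010_1010  # hypothetical fingerprint of known material

strict = matches(0b1010_1011, known, max_distance=1)  # True: 1 bit differs
loose = matches(0b1111_0000, known, max_distance=6)   # True: 4 bits differ,
                                                      # a likely false positive
```

An “intentionally over-inclusive threshold,” in these terms, is just a large `max_distance`: safer against misses, but guaranteed to sweep in matches that aren’t real.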

The frustrating thing is that this kind of confusion wasn’t just entirely predictable—it was predicted! When Pfefferkorn and her colleagues at Stanford published their report about NCMEC’s CSAM reporting system, they explicitly called out the ambiguity in the reporting options and warned that platforms would likely over-report out of an abundance of caution, because the penalties (both criminal and reputational) for missing anything are so dire.

Indeed, the form for submitting to the CyberTipline has one checkbox for “Generative AI” that, as Pfefferkorn notes in her letter, can mean wildly different things depending on who’s checking it:

When the meaning of checking a single checkbox is so ambiguous that absent additional information, reports of known CSAM found in AI training data are facially indistinguishable from reports of new AI-generated material (or of text-only prompts seeking CSAM, or of attempts to upload known CSAM as part of a prompt, etc.), and that ambiguity leads to a months-long massive public misunderstanding about the scale of the AI-CSAM problem, then it is clear that the CyberTipline reporting form itself needs to change — not just how one particular ESP fills it out. 

To its credit, NCMEC did respond quickly to Pfefferkorn, and their response is… illuminating. They confirmed they’re working on updating the reporting system, but also noted that Amazon’s reports contained almost no useful information:

all those Amazon reports included minimal data, not even the file in question or the hash value, much less other contextual information about where or how Amazon detected the matching file

As Pfefferkorn put it, Amazon was basically giving NCMEC reports that said “we found something” with nothing else attached. NCMEC says they only learned about the false-positive issue last week and are “very frustrated” by it.

Indeed, NCMEC’s boss told Bloomberg:

“There’s nothing then that can be done with those reports,” she said. “Our team has been really clear with [Amazon] that those reports are inactionable.”

There’s plenty of blame to go around here. Amazon clearly should have been more transparent about what they were reporting and why. NCMEC’s reporting form is outdated and creates ambiguity that led to a massive public misunderstanding. And the media (NYT included) ran with alarming numbers without asking obvious questions like “why is Amazon suddenly reporting 25x more than last year and no other AI company is even close?”

But, even worse, policymakers spent six months operating under the assumption that AI-generated CSAM was exploding at an unprecedented rate. Legislation was proposed. Resources were allocated. Public statements were made. All based on numbers that fundamentally misrepresented what was actually happening.

As Pfefferkorn notes:

Nobody benefits from being so egregiously misinformed. It isn’t a basis for sound policymaking (or an accurate assessment of NCMEC’s resource needs) if the true volume of AI-generated CSAM being reported is a mere fraction of what Congress and other regulators believe it is. It isn’t good for Amazon if people mistakenly think the company’s AI products are uniquely prone to generating CSAM compared with other options on the market (such as OpenAI, with its distant-second 75,000 reports during the same time period, per NYT). That impression also disserves users trying to pick safe, responsible AI tools to use; in actuality, per today’s revelations about training data vetting, Amazon is indeed trying to safeguard its models against CSAM. I can certainly think of at least one other AI company that’s been in the news a lot lately that seems to be acting far more carelessly.

None of this means that AI-generated CSAM isn’t a real and serious problem. It absolutely is, and it needs to be addressed. But you can’t effectively address a problem if your data about the scope of that problem is fundamentally wrong. And you especially can’t do it when the “alarming spike” that everyone has been pointing to turns out to be something else entirely.

The silver lining here, as Pfefferkorn points out, is that the actual news is… kind of good? Amazon’s AI models aren’t CSAM-generating machines. The company was actually doing the responsible thing by vetting its training data. And the real volume of AI-generated CSAM reports is apparently much lower than we’ve been led to believe.

But that good news was buried for six months under a misleading narrative that nobody bothered to dig into until Bloomberg did. And that’s a failure of transparency, of reporting systems, and of the kind of basic journalistic skepticism that should have kicked in when one company was suddenly responsible for 78% of all reports in a category.

We’ll see if NCMEC’s promised updates to the reporting form actually address these issues. In the meantime, maybe we can all agree that the next time there’s a 700% increase in reports of anything, it’s worth asking a few questions before writing the “everything is on fire” headline.

Filed Under: ai, csam, cybertipline, data, moral panic

Companies: amazon, ncmec, ny times
