Gemini 3.1 Flash Lite's Web Scraping Accuracy Has

Google's newest lightweight model reportedly scrapes the web with 100% link accuracy at roughly $4 per thousand pages. If you run any kind of automated data pipeline, that number just made you sit up straighter. If you own a website, it should also make you a little uncomfortable.

Both reactions are reasonable. They're also not mutually exclusive.

Hamish from the Income Stream Surfers YouTube channel published a hands-on test this week of Gemini 3.1 Flash Lite, Google's new small model, described in its own documentation as "designed for lightweight agentic workflows, simple data extraction, and applications where responsiveness and API costs are the primary constraints." His results were striking enough to warrant coverage. So does the context the video doesn't bother with.

A note before the numbers: Hamish refers throughout his video to a model he calls "GPT-5 Nano" as a comparison baseline. As of this writing, OpenAI has not publicly released a model by that name. It's possible Hamish is using an internal label, a preview model not publicly announced, or a name he hasn't checked against OpenAI's published model list. I'm flagging this because the hallucination rate he cites for this model (roughly 50% of returned links) is used as a benchmark throughout the video, and the comparison only holds if we know what we're actually comparing. Similarly, Hamish cites a May 7th release date for Flash Lite and claims a 1 million token context window; both are plausible but sourced entirely from his video. Verify against Google's official model documentation before you build anything critical on those specs.

What he actually tested

The stack is straightforward: feed a URL to Jina Reader (a service that converts web pages to clean markdown), then pass that markdown to Gemini 3.1 Flash Lite with a prompt asking it to extract structured JSON—links, images, product data, whatever you're after. No complex infrastructure. No Firecrawl subscription.
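For readers who want to see the shape of that stack, here is a minimal sketch, not Hamish's actual code. It assumes Jina Reader's public URL-prefix endpoint (https://r.jina.ai/) and treats the model call as a hypothetical call_model function you supply yourself, wrapping whichever SDK and model ID you use. The prompt wording and the links/images JSON shape are my own illustrative choices.

```python
import json
from typing import Callable
from urllib.request import Request, urlopen

# Jina Reader returns markdown for any page URL appended to this prefix.
JINA_READER_PREFIX = "https://r.jina.ai/"
FENCE = "`" * 3  # models often wrap JSON in a code fence

# Illustrative prompt; the video's exact wording is not reproduced here.
PROMPT = (
    "Extract every link and image URL from the markdown below. "
    'Return only JSON shaped like {"links": [], "images": []}. '
    "Copy URLs exactly as they appear and do not invent any."
)


def reader_url(page_url: str) -> str:
    """Build the Jina Reader URL for a page."""
    return JINA_READER_PREFIX + page_url


def fetch_markdown(page_url: str, timeout: float = 30.0) -> str:
    """Fetch the page as markdown via Jina Reader (network call)."""
    request = Request(reader_url(page_url), headers={"User-Agent": "scrape-sketch/0.1"})
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def parse_model_json(raw: str) -> dict:
    """Parse the model's reply, tolerating a code fence around the JSON."""
    text = raw.strip()
    if text.startswith(FENCE):
        text = text.split("\n", 1)[1] if "\n" in text else ""
        text = text.rsplit(FENCE, 1)[0]
    return json.loads(text)


def extract(markdown: str, call_model: Callable[[str], str]) -> dict:
    """Send prompt plus markdown to the model and normalise the result.

    call_model is a hypothetical wrapper: it takes a prompt string and
    returns the model's text reply.
    """
    data = parse_model_json(call_model(PROMPT + "\n\n" + markdown))
    return {
        "links": list(data.get("links", [])),
        "images": list(data.get("images", [])),
    }
```

In use, that would look like extract(fetch_markdown("https://example.com"), call_model). Keeping the model call behind a plain function makes it easy to swap models and compare them on the same page, which is essentially what the video does.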

Hamish ran this against a real site, 2min.it, and came back with 96 out of 96 URLs returning HTTP 200, meaning every link the model returned resolved to a live page. He also got 36 out of 36 images verified as genuine. In his own words: "To have 100% real URLs here is extremely interesting at a very, very good price."
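A check like "96 out of 96 returned HTTP 200" is easy to reproduce on your own output. The sketch below is one way to do it with Python's standard library. The function names and the HEAD-request approach are my assumptions, not taken from the video, and some servers reject HEAD, so a real run may need a GET fallback.

```python
from typing import Callable, Iterable
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the final HTTP status for url, or 0 if the request fails outright."""
    try:
        request = Request(url, method="HEAD", headers={"User-Agent": "link-check-sketch/0.1"})
        with urlopen(request, timeout=timeout) as response:
            return response.status  # redirects are followed, so this is the final status
    except HTTPError as error:
        return error.code  # 4xx and 5xx responses arrive as exceptions
    except (OSError, ValueError):
        return 0  # DNS failure, timeout, refused connection, or malformed URL


def check_links(urls: Iterable[str], get_status: Callable[[str], int] = http_status) -> dict:
    """Split URLs into live (HTTP 200) and dead (anything else); duplicates are checked once."""
    statuses = {url: get_status(url) for url in urls}
    live = [url for url, code in statuses.items() if code == 200]
    dead = {url: code for url, code in statuses.items() if code != 200}
    return {"live": live, "dead": dead, "total": len(statuses)}
```

Run check_links over the model's returned list and compare len(report["live"]) with report["total"]. A perfect score here means every URL resolves, which is not quite the same as every URL having been on the scraped page.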

For anyone who's run LLM scraping pipelines before, that accuracy figure is legitimately notable. Language models have a well-documented tendency to confabulate plausible-looking URLs that go nowhere, a maddening failure mode when you're trying to build a link database or extract product catalog data. Hamish says he previously used a different lightweight model for this task and found that roughly 50% of the returned links were hallucinated; he then upgraded to Gemini 3 Flash and still saw about 75% hallucination on the same task. Flash Lite apparently fixed what its bigger sibling couldn't. That's the kind of engineering outcome that actually matters in production.
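If you want to measure a hallucination rate yourself, one cheap approach that needs no network is to check whether each returned URL literally appears in the markdown the model was given. This is a sketch of my own, not the method used in the video. The regex is deliberately simple: it only sees absolute http(s) URLs and will truncate URLs that contain parentheses.

```python
import re

# Matches absolute http(s) URLs, stopping at whitespace, quotes, angle
# brackets, and the ")" or "]" that close markdown link syntax.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def urls_in_markdown(markdown: str) -> set[str]:
    """Collect the absolute URLs present in the source markdown."""
    return {match.rstrip(".,;:") for match in URL_PATTERN.findall(markdown)}


def split_grounded(returned: list[str], markdown: str) -> tuple[list[str], list[str]]:
    """Separate model-returned URLs into those found in the source and those not."""
    source = urls_in_markdown(markdown)
    grounded = [url for url in returned if url in source]
    ungrounded = [url for url in returned if url not in source]
    return grounded, ungrounded


def hallucination_rate(returned: list[str], markdown: str) -> float:
    """Fraction of returned URLs that do not appear in the source markdown."""
    if not returned:
        return 0.0
    _, ungrounded = split_grounded(returned, markdown)
    return len(ungrounded) / len(returned)
```

Anything in the ungrounded list deserves a second look. It may be a relative link the model resolved correctly, or it may be invented.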

The pricing math holds up on its face: $0.25 per million input tokens, $1.50 per million output tokens, with batch API discounts that bring 1,000 scraped pages down to around $4. That's cheap enough that cost is no longer a meaningful barrier to operating an LLM scraping pipeline at scale.
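To sanity-check that figure against your own pages, the arithmetic is simple enough to write down. The two prices below are the ones quoted above. The tokens-per-page figures and the discount value in the example are placeholders I made up, since the video does not give per-page token counts, so substitute your own measurements and the current batch terms.

```python
INPUT_PRICE_PER_M = 0.25   # USD per million input tokens, as quoted above
OUTPUT_PRICE_PER_M = 1.50  # USD per million output tokens, as quoted above


def cost_per_thousand_pages(
    input_tokens_per_page: int,
    output_tokens_per_page: int,
    discount: float = 0.0,
) -> float:
    """Estimate USD cost for 1,000 pages.

    discount is a fraction between 0 and 1 (for example a batch discount);
    the right value is an assumption to check against current pricing.
    """
    per_page = (
        input_tokens_per_page / 1_000_000 * INPUT_PRICE_PER_M
        + output_tokens_per_page / 1_000_000 * OUTPUT_PRICE_PER_M
    )
    return per_page * 1000 * (1.0 - discount)


if __name__ == "__main__":
    # Hypothetical page: 20,000 tokens of markdown in, 2,000 tokens of JSON out.
    for discount in (0.0, 0.5):
        estimate = cost_per_thousand_pages(20_000, 2_000, discount)
        print(f"discount={discount:.0%}: ${estimate:.2f} per 1,000 pages")
```

The output side is six times the price per token, so a verbose JSON schema moves the total faster than a long page does.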

The grounding trap

The most practically useful thing in Hamish's video is a warning that has nothing to do with Flash Lite specifically—it applies to any Gemini model with grounding enabled.

Grounding lets the model pull live Google Search results to augment its responses, similar to how Perplexity works. It's genuinely useful for research tasks. It's also priced per query, and models will run as many queries as they think they need unless you explicitly constrain them.

Hamish discovered this the hard way: his daily API spend went from $10 to $150 overnight because a model decided 30 queries was the right number for a given task. "We went from spending like $10 a day to $150 a day, which luckily I noticed and stopped," he says. His advice: cap the query count explicitly in your system prompt, or skip grounding entirely and do traditional LLM scraping instead. Good advice. The kind of thing that would have saved his team real money if someone had put it in the documentation more prominently.
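A prompt-level cap is only a request to the model, so it is worth pairing with a hard stop in your own code. The sketch below is a hypothetical client-side guard: the prompt wording, the class, and the per-query price are all my assumptions. How you read the number of searches a response used depends on your SDK, so it is passed in here as a plain integer. The guard can only trip after a response comes back, so it limits the damage rather than preventing the first expensive call.

```python
from dataclasses import dataclass

# Illustrative system-prompt cap; tune the number to your task.
SYSTEM_PROMPT_CAP = (
    "Use at most 3 search queries for this task. "
    "Stop searching as soon as you have enough information to answer."
)


class BudgetExceeded(RuntimeError):
    """Raised when estimated grounding spend passes the daily limit."""


@dataclass
class GroundingBudget:
    price_per_query: float  # USD; hypothetical, check current grounding pricing
    daily_limit: float      # USD you are willing to spend per day
    queries_today: int = 0

    @property
    def spend_today(self) -> float:
        """Estimated spend so far today."""
        return self.queries_today * self.price_per_query

    def record(self, queries_used: int) -> None:
        """Tally the searches one response used; raise once over the limit."""
        self.queries_today += queries_used
        if self.spend_today > self.daily_limit:
            raise BudgetExceeded(
                f"estimated grounding spend ${self.spend_today:.2f} "
                f"exceeds daily limit ${self.daily_limit:.2f}"
            )
```

Call budget.record(n) after every grounded response and let BudgetExceeded halt the job. Billing alerts on the cloud account are a sensible second line of defence.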

The part the video doesn't cover

Hamish is transparent about what he's doing with this pipeline: HarborSEO, his product, uses LLM scraping to extract product data and information from other people's websites, then feeds that structured data into AI-generated blog posts. He's also explicit that the data itself is sellable—"LLM scraping is something that you can basically resell. So what you actually resell is the JSON structured data."

That's a legitimate business model, and it's not illegal in most jurisdictions (robots.txt compliance and ToS questions vary by site). But it's worth naming what it actually is: an automated system that reads websites at scale without the site owner's knowledge or consent, extracts their content, and repackages it into a competing product.

The websites being scraped don't know it's happening. They don't get paid. When the output is an AI-written article that surfaces in search results alongside—or instead of—the original source, the economics flow entirely one direction.

I'm not saying Hamish is doing anything wrong. The web has always had crawlers; Google's entire business is built on scraping. But there's a difference between a search engine indexing a page so users can find the original, and a content pipeline extracting structured data to generate articles that substitute for the original. One sends traffic to sources. The other doesn't need to.

The real reason this model release matters, beyond the benchmark numbers, is that it removes cost as a friction point for the latter. At $4 per thousand pages, the barrier to running an industrial-scale content extraction operation is essentially gone. That's what "100% accuracy at this price" means in practice.

Whether that's a problem depends on where you sit. If you're a developer building data pipelines, this is a genuinely useful tool at a genuinely good price. If you publish original content, research, or product information on the web, the model that just dropped makes it cheaper than ever for someone else to extract the value from your work and monetize it elsewhere.

Hamish isn't the threat here—he's just the early adopter showing you the playbook. The question worth sitting with is what happens when everyone running a content business reads the same tutorial.

Rachel "Rach" Kovacs is Buzzrag's cybersecurity and privacy correspondent. Former white hat hacker, former InfoSec director, permanently suspicious of anything priced to scale.