Web Scraping With an API: A Beginner's Guide
Anna Kubo's freeCodeCamp tutorial shows beginners how to scrape the web using SerpApi and Node.js — skipping the hard parts without skipping the learning.
Written by AI. Tyler Nakamura

Photo: AI. Mika Sørensen
Here's the deal with web scraping tutorials: most of them start by assuming you already know how to lose. You're halfway through the setup, a website throws a CAPTCHA at you, your script breaks, and you're googling "proxy rotation" at 11pm wondering why you even started. The learning curve isn't the code — it's the walls websites put up before you even get to write any.
That's the exact problem Anna Kubo addresses head-on in her new freeCodeCamp tutorial, Web Scraping for Beginners – Extract Data with an API. Her pitch is refreshingly honest: instead of teaching you to fight those walls yourself, she shows you how to let someone else handle them.
"Instead of building scrapers from scratch, dealing with proxies, or constantly fixing broken scripts, we're going to use an API that handles all of that for us. It runs searches in a real browser, solves CAPTCHAs automatically, and returns clean, structured data in JSON that we can use directly in our apps."
That API is SerpApi, which acts as a fully managed scraping layer between your code and the web. You send it a query; it runs the search in a real browser, handles bot detection and rate limits on its end, and hands you back clean JSON. The tradeoff — and it's worth naming plainly — is that you're introducing a paid third-party dependency into your project. SerpApi has a free tier, but if your scraping needs scale up, so does the bill. Kubo is transparent about this, and the tutorial is actually sponsored by SerpApi, which she discloses upfront. That context matters when evaluating her approach.
What the Tutorial Actually Covers
The hour-long course is structured in two acts: learn the tools, then build something real with them.
The first act walks through SerpApi's capabilities using a Node.js project. After signing up and grabbing an API key, Kubo runs a dead-simple terminal command — two parameters, q for query and your API key — and gets back Google search results as JSON. From there, she layers in optional parameters: location, language, Google domain, country code. The documentation walkthrough is genuinely useful here; she points out that you can't just make up parameter values and shows you where to find SerpApi's supported options.
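To make that request shape concrete, here is a minimal sketch of building the search URL in Node.js. It is not the tutorial's exact command. The function names `buildSearchUrl` and `search` are made up for illustration, and the parameter names (`engine`, `q`, `api_key`, `location`, `hl`, `gl`, `google_domain`) follow SerpApi's public Google Search documentation as I understand it, so check them against the docs before relying on them.

```javascript
// Sketch: build a SerpApi Google Search URL from the two required
// parameters (q, api_key) plus the optional ones described above.
const SERPAPI_ENDPOINT = "https://serpapi.com/search.json";

function buildSearchUrl(query, apiKey, options = {}) {
  if (!query) throw new Error("query (q) is required");
  if (!apiKey) throw new Error("apiKey is required");

  const params = new URLSearchParams({ engine: "google", q: query });

  // Optional parameters are only sent when the caller supplies them.
  // Their values should come from SerpApi's supported lists, not be invented.
  for (const name of ["location", "hl", "gl", "google_domain"]) {
    if (options[name]) params.set(name, options[name]);
  }

  params.set("api_key", apiKey);
  return `${SERPAPI_ENDPOINT}?${params.toString()}`;
}

// Hypothetical helper: runs the search with the global fetch in Node 18+.
async function search(query, apiKey, options) {
  const response = await fetch(buildSearchUrl(query, apiKey, options));
  if (!response.ok) {
    throw new Error(`SerpApi request failed with status ${response.status}`);
  }
  return response.json(); // parsed JSON search results
}
```

Calling `search("coffee shops", key, { location: "Austin, Texas, United States", hl: "en", gl: "us" })` would then resolve to the parsed JSON. Keeping URL construction separate from the network call makes the parameter handling easy to check without spending API credits.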
What's worth paying attention to is how much ground the API covers. Beyond Google Search, SerpApi offers engines for Bing, DuckDuckGo, Amazon product listings, the Apple App Store, YouTube search, and more. Kubo also demos the Google Short Videos API (for scraping Instagram Reels, YouTube Shorts, and similar content) and the Google Lens API, which lets you run reverse image searches programmatically and pull back visual match results. The Lens demo — passing in a photo of Danny DeVito and getting back visual matches from across the web — is a solid illustration of how weird and powerful this kind of tooling can be.
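Switching engines mostly comes down to changing one parameter. The sketch below assumes the Google Lens engine is selected with `engine=google_lens`, takes the image address in a `url` parameter, and returns a `visual_matches` array whose items carry `title`, `link`, and `source`. Those names reflect my reading of SerpApi's documentation rather than anything shown in this article, and both function names are hypothetical.

```javascript
// Sketch: same endpoint as a normal search, different `engine` value.
function buildLensUrl(imageUrl, apiKey) {
  const parsed = new URL(imageUrl); // throws on malformed input
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error("imageUrl must be an http(s) URL");
  }
  const params = new URLSearchParams({
    engine: "google_lens", // assumed engine identifier
    url: parsed.href,      // publicly reachable image to search with
    api_key: apiKey,
  });
  return `https://serpapi.com/search.json?${params.toString()}`;
}

// Assumed response shape: { visual_matches: [{ title, link, source, ... }] }
function summarizeVisualMatches(response, limit = 5) {
  const matches = Array.isArray(response?.visual_matches)
    ? response.visual_matches
    : [];
  // Keep only the fields a simple results list would need.
  return matches
    .slice(0, limit)
    .map(({ title, link, source }) => ({ title, link, source }));
}
```

The defensive `Array.isArray` check matters because a response with no matches may simply omit the array.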
Python, Java, Rust, and even Google Sheets integrations are available through SerpApi's documentation, though Kubo builds everything in Node.js here.
The Real Project: A Short Video Scraper
The second act is where things get interesting from a "what can I actually do with this" perspective. Kubo builds a full-stack web app — Node.js/Express backend, plain HTML/CSS/JavaScript frontend — that:
- Takes a search query and a result count (5, 10, or 15 videos)
- Hits SerpApi's Google Short Videos engine
- Renders the results as a grid of cards with thumbnails, titles, durations, and sources
- Downloads individual videos (or all of them at once) to your local machine via yt-dlp
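The first three steps in that list can be sketched as a few small functions that a backend route would call. This is an illustration under assumptions, not Kubo's code. The engine name `google_short_videos`, the `short_video_results` array, and its field names are taken from my reading of SerpApi's documentation, and `parseCount`, `buildShortVideosUrl`, and `toCards` are hypothetical helpers.

```javascript
// Sketch of the backend logic behind the search form.
const ALLOWED_COUNTS = [5, 10, 15];

// Query-string values arrive as strings, so coerce and validate them.
function parseCount(raw) {
  const n = Number(raw);
  return ALLOWED_COUNTS.includes(n) ? n : ALLOWED_COUNTS[0];
}

function buildShortVideosUrl(query, apiKey) {
  const params = new URLSearchParams({
    engine: "google_short_videos", // assumed engine identifier
    q: query,
    api_key: apiKey,
  });
  return `https://serpapi.com/search.json?${params.toString()}`;
}

// Reduce the API response to the fields each card displays.
// Assumed response shape: { short_video_results: [{ title, thumbnail, ... }] }
function toCards(response, count) {
  const results = Array.isArray(response?.short_video_results)
    ? response.short_video_results
    : [];
  return results.slice(0, count).map((video) => ({
    title: video.title ?? "Untitled",
    thumbnail: video.thumbnail ?? null,
    duration: video.duration ?? null,
    source: video.source ?? null,
    link: video.link ?? null,
  }));
}
```

In an Express handler, the flow would be roughly: read `q` and `count` from the request, fetch `buildShortVideosUrl(q, key)`, then send `toCards(json, parseCount(count))` back to the frontend, which renders one card per item.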
The yt-dlp integration is the genuinely clever part. SerpApi returns video metadata and links; yt-dlp is the open-source tool that actually fetches and saves the video file as an MP4. Kubo runs it as a child process from within Node using execFile, which is a pattern worth knowing if you're new to Node's child_process module.
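For readers new to `child_process`, here is a minimal sketch of the `execFile` pattern. It assumes `yt-dlp` is installed and on your PATH. The specific flags (`--no-playlist`, `--merge-output-format`, `-o`) are real yt-dlp options, but choosing them is my assumption rather than a record of what the tutorial passes, and `buildYtDlpArgs` and `downloadVideo` are hypothetical names.

```javascript
const { execFile } = require("node:child_process");
const path = require("node:path");

// Build the argument list separately so it can be inspected and tested.
function buildYtDlpArgs(videoUrl, outputDir) {
  const parsed = new URL(videoUrl); // throws on malformed input
  if (!["http:", "https:"].includes(parsed.protocol)) {
    throw new Error("Only http(s) video links are accepted");
  }
  return [
    "--no-playlist",                  // download just this one video
    "--merge-output-format", "mp4",   // use an MP4 container when merging streams
    "-o", path.join(outputDir, "%(title)s.%(ext)s"), // output filename template
    parsed.href,
  ];
}

// Run yt-dlp as a child process and resolve with its stdout.
function downloadVideo(videoUrl, outputDir) {
  return new Promise((resolve, reject) => {
    const args = buildYtDlpArgs(videoUrl, outputDir);
    execFile("yt-dlp", args, (error, stdout, stderr) => {
      if (error) reject(new Error(stderr || error.message));
      else resolve(stdout);
    });
  });
}
```

The reason to prefer `execFile` over `exec` here is that arguments are passed as an array and no shell is spawned, so a link containing shell metacharacters cannot turn into an injected command. Validating the URL first also stops a value that begins with a dash from being read as a yt-dlp option.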
The build is intentionally stripped down — she calls it "a skeleton" — but that framing is honest rather than apologetic. The point isn't to hand you a finished product; it's to show you the connective tissue so you can extend it yourself. The full code is on GitHub if you want to dig in.
The Part That Actually Matters: API Key Security
Scattered throughout the tutorial — and given its own dedicated section at the end — is a consistent emphasis on keeping your API key out of your code. Kubo shows her own key on screen early on and immediately explains why that's fine in that context (she's disabling it after filming) but dangerous in production:
"If other people get hold of your API key and use it in their projects, they could potentially use up all your credits or even rack up a hefty credit card bill if you decide to attach credit cards to your platforms. That goes for every API key out there."
The solution she implements is a .env file with the dotenv package, storing the key as an environment variable accessed via process.env.API_KEY. It's a foundational security habit, and the fact that she builds the whole project first and then retrofits the env variable handling actually reinforces why it matters — you can see exactly where the exposure would occur.
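A minimal sketch of that pattern follows. It assumes a `.env` file containing a line like `API_KEY=your_key_here` that is listed in `.gitignore`. The `dotenv` call is shown as a comment so the snippet runs without installing anything, and `getApiKey` and `maskKey` are hypothetical helpers, not part of the tutorial.

```javascript
// With the dotenv package installed, this line loads .env into process.env:
//   require("dotenv").config();

// Read the key once, and fail loudly at startup if it is missing.
function getApiKey(env = process.env) {
  const key = env.API_KEY;
  if (!key || key.trim() === "") {
    throw new Error(
      "API_KEY is not set. Add it to .env and keep .env out of version control."
    );
  }
  return key.trim();
}

// Hypothetical helper: show only the last four characters in logs.
function maskKey(key) {
  if (key.length <= 4) return "****";
  return "*".repeat(key.length - 4) + key.slice(-4);
}
```

Failing at startup beats discovering the missing key on the first request, and masking keeps the full value out of terminal output and screen recordings.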
This is the kind of thing that beginner tutorials often skip because it feels like infrastructure rather than learning. Kubo doesn't skip it.
The Honest Tension Here
There's a real question buried in this approach that the tutorial doesn't fully surface, and it's worth sitting with: what are you actually learning when the hard parts are abstracted away?
Traditional web scraping tutorials — the kind Kubo describes as "super complicated" or prone to breaking — teach you to understand HTTP requests, parse HTML with tools like Cheerio or BeautifulSoup, rotate user agents, handle pagination, and debug when sites change their markup. That's genuinely difficult, and a lot of beginners bounce off it. Kubo's approach gets you to working, useful output faster. But the knowledge of why CAPTCHAs exist, how rate limiting works, and what's actually happening when SerpApi "runs searches in a real browser" lives inside a black box you're paying to not open.
That's not a criticism of the tutorial, which is honest about what it is. For someone who needs to automate data collection for a project and doesn't have weeks to spend on scraping infrastructure, this is a practical path. For someone who wants to understand web scraping deeply, it is a solid starting point that may eventually prompt them to go deeper on their own.
"I just want to give you the skeleton to do this. So you can go forth, make it your own, add features, expand on this idea."
That framing — skeleton, not finished product — is probably the right way to hold the whole tutorial. Kubo isn't selling you a complete solution. She's showing you that working data extraction is more accessible than the hard-mode tutorials suggest, and handing you something you can actually run and modify.
The question of whether the abstraction is training wheels or a ceiling depends entirely on where you want to go next.
— Tyler Nakamura, Consumer Tech & Gadgets Correspondent, BuzzRAG
2026-06-09