Web Scraping With an API: A Beginner's Guide

Here's the deal with web scraping tutorials: most of them assume you already know how to get past a website's defenses. You're halfway through the setup, a website throws a CAPTCHA at you, your script breaks, and you're googling "proxy rotation" at 11 p.m. wondering why you even started. The learning curve isn't the code; it's the walls websites put up before you get to write any.

That's the exact problem Anna Kubo addresses head-on in her new freeCodeCamp tutorial, Web Scraping for Beginners – Extract Data with an API. Her pitch is refreshingly honest: instead of teaching you to fight those walls yourself, she shows you how to let someone else handle them.

"Instead of building scrapers from scratch, dealing with proxies, or constantly fixing broken scripts, we're going to use an API that handles all of that for us. It runs searches in a real browser, solves CAPTCHAs automatically, and returns clean, structured data in JSON that we can use directly in our apps."

That API is SerpApi, which acts as a fully managed scraping layer between your code and the web. You send it a query; it runs the search in a real browser, handles bot detection and rate limits on its end, and hands you back clean JSON. The tradeoff — and it's worth naming plainly — is that you're introducing a paid third-party dependency into your project. SerpApi has a free tier, but if your scraping needs scale up, so does the bill. Kubo is transparent about this, and the tutorial is actually sponsored by SerpApi, which she discloses upfront. That context matters when evaluating her approach.

What the Tutorial Actually Covers

The hour-long course is structured in two acts: learn the tools, then build something real with them.

The first act walks through SerpApi's capabilities using a Node.js project. After signing up and grabbing an API key, Kubo runs a dead-simple terminal command — two parameters, q for query and your API key — and gets back Google search results as JSON. From there, she layers in optional parameters: location, language, Google domain, country code. The documentation walkthrough is genuinely useful here; she points out that you can't just make up parameter values and shows you where to find SerpApi's supported options.
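
For a sense of what that request looks like in Node.js, here is a minimal sketch rather than Kubo's exact code. It assumes Node 18 or later (for the built-in fetch) and SerpApi's JSON search endpoint; the function names buildSearchUrl and search are made up for illustration, and the optional values in the usage note are examples that should be checked against SerpApi's supported lists.

```javascript
// Minimal sketch: build a SerpApi Google Search URL and fetch the JSON result.
// Assumes Node 18+ (global fetch). Function names are illustrative only.
const SERPAPI_ENDPOINT = "https://serpapi.com/search.json";

// `q` and `api_key` are the two parameters the tutorial starts with.
// `options` carries the optional ones: location, hl, gl, google_domain.
function buildSearchUrl(query, apiKey, options = {}) {
  const params = new URLSearchParams({ engine: "google", q: query, api_key: apiKey });
  for (const [name, value] of Object.entries(options)) {
    // Skip blank options so they never reach the API as empty strings.
    if (value !== undefined && value !== "") params.set(name, value);
  }
  return `${SERPAPI_ENDPOINT}?${params.toString()}`;
}

async function search(query, apiKey, options) {
  const response = await fetch(buildSearchUrl(query, apiKey, options));
  if (!response.ok) {
    throw new Error(`SerpApi request failed: HTTP ${response.status}`);
  }
  return response.json();
}
```

Calling search("coffee shops", key, { location: "Austin, Texas, United States", hl: "en", gl: "us", google_domain: "google.com" }) would then resolve to the parsed JSON. Keeping the URL builder separate from the network call makes the parameter handling easy to check without spending API credits.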

What's worth paying attention to is how much ground the API covers. Beyond Google Search, SerpApi offers engines for Bing, DuckDuckGo, Amazon product listings, the Apple App Store, YouTube search, and more. Kubo also demos the Google Short Videos API (for scraping Instagram Reels, YouTube Shorts, and similar content) and the Google Lens API, which lets you run reverse image searches programmatically and pull back visual match results. The Lens demo — passing in a photo of Danny DeVito and getting back visual matches from across the web — is a solid illustration of how weird and powerful this kind of tooling can be.
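
Switching engines is mostly a matter of changing one parameter. The sketch below is an assumption-laden illustration, not code from the tutorial: the engine names google_lens and google_short_videos and the Lens url parameter reflect SerpApi's documented naming as best I understand it, the image address is a placeholder, and buildEngineUrl is a hypothetical helper.

```javascript
// Sketch: the same endpoint serves different engines via the `engine` parameter.
// Engine and parameter names are assumptions to verify against SerpApi's docs.
function buildEngineUrl(engine, engineParams, apiKey) {
  const params = new URLSearchParams({ engine, ...engineParams, api_key: apiKey });
  return `https://serpapi.com/search.json?${params.toString()}`;
}

// Reverse image search: Google Lens takes the address of an image (placeholder here).
const lensUrl = buildEngineUrl(
  "google_lens",
  { url: "https://example.com/photo.jpg" },
  "YOUR_API_KEY"
);

// Short-form video search: takes an ordinary text query.
const shortsUrl = buildEngineUrl(
  "google_short_videos",
  { q: "latte art" },
  "YOUR_API_KEY"
);
```

Fetching either URL returns JSON in the same way as a regular search; only the shape of the results differs by engine.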

SerpApi's documentation also covers integrations for Python, Java, Rust, and even Google Sheets, though Kubo builds everything in Node.js here.

The Real Project: A Short Video Scraper

The second act is where things get interesting from a "what can I actually do with this" perspective. Kubo builds a full-stack web app — Node.js/Express backend, plain HTML/CSS/JavaScript frontend — that:

Takes a search query and a result count (5, 10, or 15 videos)
Hits SerpApi's Google Short Videos engine
Renders the results as a grid of cards with thumbnails, titles, durations, and sources
Downloads individual videos (or all of them at once) to your local machine via yt-dlp
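
The middle steps of that list, trimming the API response to the requested count and shaping it for the card grid, might look roughly like this. It is a sketch under assumptions: the short_video_results key and the per-video field names (title, thumbnail, duration, source, link) are guesses at the response shape and should be checked against a real response, and toCards is a hypothetical helper, not a function from Kubo's repository.

```javascript
// Sketch: turn a Short Videos API response into card data for the frontend.
// The response key and field names below are assumptions, not confirmed API facts.
const ALLOWED_COUNTS = [5, 10, 15];

function toCards(apiResponse, requestedCount) {
  // Fall back to the smallest count if the client sends anything unexpected.
  const count = ALLOWED_COUNTS.includes(requestedCount) ? requestedCount : ALLOWED_COUNTS[0];
  const results = Array.isArray(apiResponse.short_video_results)
    ? apiResponse.short_video_results
    : [];
  return results.slice(0, count).map((video) => ({
    title: video.title ?? "Untitled",
    thumbnail: video.thumbnail ?? null,
    duration: video.duration ?? null,
    source: video.source ?? null,
    link: video.link ?? null,
  }));
}
```

Validating the count on the server keeps a hand-edited request from asking for more results than the interface offers.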

The yt-dlp integration is the genuinely clever part. SerpApi returns video metadata and links; yt-dlp is the open-source tool that actually fetches and saves the video file as an MP4. Kubo runs it as a child process from within Node using execFile, which is a pattern worth knowing if you're new to Node's child_process module.
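
If the execFile pattern is new to you, here is a minimal sketch of it. It assumes yt-dlp is installed and on your PATH, and that ffmpeg is available for the remux step; the helper names, the output template, and the URL check are my own illustrative choices rather than the tutorial's code.

```javascript
// Sketch: run yt-dlp as a child process with execFile (no shell involved).
// Assumes yt-dlp is on PATH; --remux-video also needs ffmpeg installed.
const { execFile } = require("node:child_process");
const { promisify } = require("node:util");
const path = require("node:path");

const execFileAsync = promisify(execFile);

function buildYtDlpArgs(videoUrl, outputDir) {
  // Only accept http(s) URLs; new URL() throws on anything unparseable.
  const { protocol } = new URL(videoUrl);
  if (protocol !== "http:" && protocol !== "https:") {
    throw new Error(`Unsupported URL protocol: ${protocol}`);
  }
  return [
    "--remux-video", "mp4",                           // end up with an .mp4 container
    "-o", path.join(outputDir, "%(title)s.%(ext)s"),  // yt-dlp output template
    "--",                                             // everything after this is a URL, not a flag
    videoUrl,
  ];
}

async function downloadVideo(videoUrl, outputDir) {
  // Arguments are passed as an array, so the URL is never interpreted by a shell.
  const { stdout } = await execFileAsync("yt-dlp", buildYtDlpArgs(videoUrl, outputDir));
  return stdout;
}
```

The reason to prefer execFile over exec here is that the video link comes from outside your program, and execFile hands it to yt-dlp as a plain argument instead of splicing it into a shell command string.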

The build is intentionally stripped down — she calls it "a skeleton" — but that framing is honest rather than apologetic. The point isn't to hand you a finished product; it's to show you the connective tissue so you can extend it yourself. The full code is on GitHub if you want to dig in.

The Part That Actually Matters: API Key Security

Scattered throughout the tutorial — and given its own dedicated section at the end — is a consistent emphasis on keeping your API key out of your code. Kubo shows her own key on screen early on and immediately explains why that's fine in that context (she's disabling it after filming) but dangerous in production:

"If other people get hold of your API key and use it in their projects, they could potentially use up all your credits or even rack up a hefty credit card bill if you decide to attach credit cards to your platforms. That goes for every API key out there."

The solution she implements is a .env file with the dotenv package, storing the key as an environment variable accessed via process.env.API_KEY. It's a foundational security habit, and the fact that she builds the whole project first and then retrofits the env variable handling actually reinforces why it matters — you can see exactly where the exposure would occur.
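
In practice that means a .env file containing a single line such as API_KEY=your_key_here, listed in .gitignore so it never reaches a repository, plus a few lines of setup. The sketch below is one way to wire it up, not necessarily how Kubo does it: the guarded require and the getApiKey helper are my additions so the snippet fails loudly when the key is missing.

```javascript
// Sketch: read the SerpApi key from the environment instead of hard-coding it.
// Loads .env via the dotenv package when it is installed; otherwise relies on
// whatever is already set in the shell environment.
try {
  require("dotenv").config();
} catch (err) {
  if (err.code !== "MODULE_NOT_FOUND") throw err;
}

// `env` defaults to process.env; passing an object makes the check easy to test.
function getApiKey(env = process.env) {
  const key = env.API_KEY;
  if (!key || key.trim() === "") {
    throw new Error("API_KEY is not set; add it to .env or the shell environment");
  }
  return key.trim();
}
```

Failing at startup with a clear message tends to be easier to debug than a rejected API request several calls later.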

This is the kind of thing that beginner tutorials often skip because it feels like infrastructure rather than learning. Kubo doesn't skip it.

The Honest Tension Here

There's a real question buried in this approach that the tutorial doesn't fully surface, and it's worth sitting with: what are you actually learning when the hard parts are abstracted away?

Traditional web scraping tutorials — the kind Kubo describes as "super complicated" or prone to breaking — teach you to understand HTTP requests, parse HTML with tools like Cheerio or BeautifulSoup, rotate user agents, handle pagination, and debug when sites change their markup. That's genuinely difficult, and a lot of beginners bounce off it. Kubo's approach gets you to working, useful output faster. But the knowledge of why CAPTCHAs exist, how rate limiting works, and what's actually happening when SerpApi "runs searches in a real browser" lives inside a black box you're paying to not open.
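
To make the contrast concrete, here is a small taste of what the do-it-yourself route involves: rotating the User-Agent header and backing off when a site answers with HTTP 429 (too many requests). This is a simplified sketch of two of the techniques listed above, not anything from the tutorial; the user agent strings are examples, the retry limits are arbitrary, and it does nothing about CAPTCHAs or proxies.

```javascript
// Sketch: fetch a page with a rotating User-Agent and exponential backoff.
// Assumes Node 18+ (global fetch). Values below are illustrative, not tuned.
const USER_AGENTS = [
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
  "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Safari/605.1.15",
];

// Cycle through the list so consecutive attempts present different agents.
function pickUserAgent(attempt) {
  return USER_AGENTS[attempt % USER_AGENTS.length];
}

// Delay doubles each attempt: 500 ms, 1000 ms, 2000 ms, ...
function backoffDelayMs(attempt, baseMs = 500) {
  return baseMs * 2 ** attempt;
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function fetchHtml(url, maxAttempts = 4) {
  for (let attempt = 0; attempt < maxAttempts; attempt += 1) {
    const response = await fetch(url, { headers: { "User-Agent": pickUserAgent(attempt) } });
    if (response.ok) return response.text();
    // Retry only on rate limiting (429) or server errors; other statuses are final.
    if (response.status !== 429 && response.status < 500) {
      throw new Error(`Request failed: HTTP ${response.status}`);
    }
    if (attempt < maxAttempts - 1) await sleep(backoffDelayMs(attempt));
  }
  throw new Error(`Gave up after ${maxAttempts} attempts`);
}
```

Even this much leaves out HTML parsing, pagination, and markup changes, which is roughly the workload a managed API takes off your plate.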

That's not a criticism of the tutorial, which is honest about what it is. For someone who needs to automate data collection for a project and doesn't have weeks to spend on scraping infrastructure, this is a practical path. For someone who wants to understand web scraping deeply, it is a solid starting point that might eventually prompt them to go deeper on their own.

"I just want to give you the skeleton to do this. So you can go forth, make it your own, add features, expand on this idea."

That framing — skeleton, not finished product — is probably the right way to hold the whole tutorial. Kubo isn't selling you a complete solution. She's showing you that working data extraction is more accessible than the hard-mode tutorials suggest, and handing you something you can actually run and modify.

The question of whether the abstraction is training wheels or a ceiling depends entirely on where you want to go next.

— Tyler Nakamura, Consumer Tech & Gadgets Correspondent, BuzzRAG