
Building a Serverless AI Agent with Pi and Google Cloud

A developer tutorial walks through deploying a personal AI bookkeeping agent to Google Cloud Run using Pi, Express, and Cloud Storage—accessible from any device.

Written by AI. Marcus Chen-Ramirez

June 11, 2026 · 7 min read

There's a recurring fantasy in developer circles: the personal AI that knows your business, lives in the cloud, costs you almost nothing to run, and answers to no one but you. No SaaS subscription. No vendor reading your data. No pricing tier that inexplicably doubles when you hit a threshold.

ZazenCodes, a programming-focused YouTube channel, published a tutorial last week doing its best to make that fantasy practical. The video—28 minutes of live terminal work, architecture diagrams, and a few detours through Toronto barbershop receipts—walks through building a serverless AI bookkeeping agent using a framework called Pi, deploying it to Google Cloud Run, and getting it working on a phone. The whole thing, from empty folder to cloud-hosted app, happens inside a single session.

What's interesting about the approach isn't really the bookkeeping part. It's the architecture choices, and what they reveal about where DIY AI tooling currently sits.

The Stack, Unpacked

Pi is the load-bearing piece here that most developers won't have heard of. The tutorial describes it as an "agent harness"—the infrastructure layer that wraps a language model and handles memory, skills, and configuration so you're not rebuilding those primitives from scratch every time. The underlying model (in this case, Anthropic's Claude via API) is essentially a plugin. Pi handles the rest.

"Pi is everything surrounding the language model," the creator explains. "It's like the infrastructure that makes it accessible to us."

The rest of the stack is deliberately mundane: an Express HTTP server for the chat frontend, Google Cloud Run for serverless container hosting, and Google Cloud Storage for persistence. That last choice deserves some attention.

Serverless architectures have a fundamental problem with state: when the container instance spins down, which happens automatically when nobody is using it (that is the whole cost-saving point), any data stored inside it vanishes. For a chatbot that's logging your expenses, that's fatal. The solution here is to mount a Cloud Storage bucket using GCS Fuse (Cloud Storage FUSE), a filesystem interface that lets the container treat cloud object storage like a local directory. Cloud Run has native support for this. The creator presents it as a discovery made in real time: "Honestly, guys, I did not even know about this."
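To make the persistence idea concrete, here is a minimal sketch, assuming the bucket is mounted at a directory the caller passes in. The tutorial does not state the mount path, and the helper names `readRecords` and `appendRecord` are hypothetical. Because the mount behaves like a local directory, ordinary file I/O is all the application code needs.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helpers: the tutorial does not show its persistence code.
// dataDir is assumed to be the directory where the bucket is mounted.

// Read a JSON array from the data directory, or return an empty list
// if the file does not exist yet.
function readRecords<T>(dataDir: string, fileName: string): T[] {
  const file = path.join(dataDir, fileName);
  if (!fs.existsSync(file)) return [];
  return JSON.parse(fs.readFileSync(file, "utf8")) as T[];
}

// Append one record and rewrite the whole file. Rewriting the full file
// is simple, but two writers at once would overwrite each other.
function appendRecord<T>(dataDir: string, fileName: string, record: T): T[] {
  fs.mkdirSync(dataDir, { recursive: true });
  const records = [...readRecords<T>(dataDir, fileName), record];
  fs.writeFileSync(
    path.join(dataDir, fileName),
    JSON.stringify(records, null, 2),
  );
  return records;
}
```

The same code runs unchanged against a local folder during development, which is most of the appeal of a filesystem-style mount. For a single-user tool, rewriting the whole file on each append is probably acceptable; anything with concurrent writers would need a different storage approach.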

That candor is either refreshing or concerning depending on what you're hoping for. The tutorial isn't documentation—it's someone building in public, learning as they go. That has real value. It also means readers following along will need to validate decisions independently.

Claude Does the Heavy Lifting

The most technically revealing portion isn't the deployment itself—it's watching the creator use Claude Code to write the deployment code. The prompt engineering is actually thoughtful. Rather than asking for a generic Express app, the creator constructs a detailed specification: target file structure, Pi configuration, memory schema (facts.json, expenses.json, an images directory), skills registry, Node version pinning, even a stylistic preference for dark mode. The initial build is described as a "one-shot": one prompt, one complete scaffold.
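As an illustration of what a memory schema in such a specification might pin down, the sketch below types one expense entry and validates untrusted JSON against it. Only the file names (facts.json, expenses.json, an images directory) come from the tutorial; every field name and the `isExpenseEntry` helper are assumptions made for this example.

```typescript
// File names as listed in the tutorial's specification.
const MEMORY_FILES = {
  facts: "facts.json",
  expenses: "expenses.json",
  images: "images",
} as const;

// Hypothetical fields: the tutorial does not show the schema's contents.
interface ExpenseEntry {
  date: string; // e.g. "2026-06-01"
  vendor: string;
  amount: number;
  category: string;
  receiptImage?: string; // path relative to the images directory
}

// Type guard that checks a parsed JSON value before it is stored.
function isExpenseEntry(value: unknown): value is ExpenseEntry {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.date === "string" &&
    typeof v.vendor === "string" &&
    typeof v.amount === "number" &&
    Number.isFinite(v.amount) &&
    typeof v.category === "string" &&
    (v.receiptImage === undefined || typeof v.receiptImage === "string")
  );
}
```

Validating at the boundary matters here because the entries are produced by a language model reading a photo, so a malformed or partial object is a realistic input.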

The catch is that Claude doesn't know what Pi is. It's not in the model's training data—Pi is too recent and too niche. So the creator instructs Claude to research Pi before writing any code, which it does by examining the locally installed package. That's a useful pattern for working with rapidly evolving frameworks: tell the model to read the source before writing against it.

The cloud deployment phase uses a similar approach, but with higher stakes. The creator grants Claude access to the authenticated gcloud CLI and instructs it to provision infrastructure. Secrets get stored in Google Secret Manager. A service account gets created. The container gets configured with one CPU and a gigabyte of memory. Claude orchestrates all of this through prompted terminal commands.

"It's using one CPU, it's using a gig of memory. This will have implications on cost. And now we're adding volumes."
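The tutorial does not print the exact commands Claude ran, so the following is only a sketch of how such a deployment could be assembled. It builds the argument list for `gcloud run deploy` with one CPU, 1 GiB of memory, a secret exposed as an environment variable, and a Cloud Storage volume mount. The flag names reflect the gcloud CLI as I understand it and are worth checking against `gcloud run deploy --help`; the volume name `data` and the `DeployOptions` type are hypothetical.

```typescript
// Hypothetical option bag; none of these names come from the tutorial.
interface DeployOptions {
  service: string;
  image: string;
  region: string;
  serviceAccount: string;
  bucket: string;
  mountPath: string;
  // Maps an environment variable name to a Secret Manager secret name.
  secretEnv: Record<string, string>;
}

// Build the argv for `gcloud run deploy` without executing anything,
// so the command can be reviewed before it touches real resources.
function buildDeployArgs(o: DeployOptions): string[] {
  const args = [
    "run", "deploy", o.service,
    "--image", o.image,
    "--region", o.region,
    "--cpu", "1",
    "--memory", "1Gi",
    "--service-account", o.serviceAccount,
    "--add-volume", `name=data,type=cloud-storage,bucket=${o.bucket}`,
    "--add-volume-mount", `volume=data,mount-path=${o.mountPath}`,
  ];
  const secrets = Object.entries(o.secretEnv)
    .map(([envName, secretName]) => `${envName}=${secretName}:latest`)
    .join(",");
  if (secrets.length > 0) args.push("--set-secrets", secrets);
  return args;
}
```

Returning the arguments instead of running them mirrors the review-before-execute posture the tutorial adopts for this phase: the list can be printed and inspected, then handed to a process runner once it looks right.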

The creator turns off "bypass permissions" before this phase—meaning Claude has to ask before running each terminal command. That's the right call when an AI is provisioning real cloud resources. Small typos in infrastructure commands can be expensive.

What Actually Works

By the end of the tutorial, the stack does what it promises. A receipt photo dragged into the browser becomes a structured expense entry in a JSON file sitting in a cloud storage bucket. The same flow works from a phone. The agent correctly categorizes items, asks clarifying questions when categories don't fit cleanly, and persists everything between sessions.

The mobile demo is the satisfying part. Open the Cloud Run URL in Safari, authenticate with a shared password, tap "Add to Home Screen," and you have something that looks and behaves like a native app, except the agent is Claude running behind your own serverless service and the storage is your own Google Cloud bucket, and you control both. For someone tired of paying $20/month for a notes app with "AI features," there's a real appeal here.

"This video will give you an agent that you can use, pretty much. That's what I'm telling you."

The creator is not wrong. But the tutorial is also honest about what's left undone.

The Honest Gaps

Authentication is the most obvious weak link, and the tutorial acknowledges it repeatedly. The security model is a single shared password stored in Google Secret Manager—no multi-factor authentication, no per-user sessions, no audit logging. The creator flags this multiple times: "I'm not advocating you to actually set this up... You can harden the authentication later, basically."
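One small hardening step that fits the shared-password model is to compare the supplied password in constant time. The sketch below is an assumption about how that check could look, not the tutorial's code; `passwordMatches` is a hypothetical helper, and the expected password is assumed to reach the process from Secret Manager, for example through an environment variable.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hypothetical helper, not taken from the tutorial.
// Both values are hashed first so the buffers always have equal length,
// which timingSafeEqual requires, and so the comparison time does not
// depend on how many leading characters match.
function passwordMatches(supplied: string, expected: string): boolean {
  const suppliedDigest = createHash("sha256").update(supplied, "utf8").digest();
  const expectedDigest = createHash("sha256").update(expected, "utf8").digest();
  return timingSafeEqual(suppliedDigest, expectedDigest);
}
```

This closes only a narrow timing side channel. It does nothing for the gaps named above: there is still one password for everyone, no second factor, and no record of who logged in.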

That "harden it later" posture is endemic to developer tutorials, and it deserves some scrutiny. For a personal productivity tool that never leaves your pocket, shared-password auth is probably fine. For anything that touches real financial data and might be accessed from unfamiliar networks, it isn't. The tutorial gestures at the gap without filling it.

There's also the question of cost visibility. The tutorial mentions cost implications without quantifying them. Google Cloud Run is genuinely cheap for low-traffic personal tools, and you can often stay within free-tier limits, but the combination of Cloud Run, Cloud Storage, and Secret Manager, plus Anthropic API costs per token, means the economics depend heavily on usage. Someone following this tutorial should do their own math before treating it as a free alternative to commercial tools.
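That math can be sketched in a few lines. The function below is a rough estimator under stated assumptions: compute is billed only while a request is being handled, every rate is supplied by the reader from current price lists, and free tiers, Cloud Storage, and Secret Manager charges are left out. All names here (`Usage`, `Rates`, `estimateMonthlyCost`) are hypothetical, and no real prices are baked in.

```typescript
// Hypothetical usage profile for one month.
interface Usage {
  requestsPerMonth: number;
  secondsPerRequest: number;
  vcpus: number;
  memoryGiB: number;
  inputTokensPerRequest: number;
  outputTokensPerRequest: number;
}

// Rates are placeholders: fill them in from the providers' price lists.
interface Rates {
  perVcpuSecond: number;
  perGiBSecond: number;
  perMillionInputTokens: number;
  perMillionOutputTokens: number;
}

// Estimate compute and model costs separately so it is clear which
// side dominates. Free tiers and storage are deliberately ignored.
function estimateMonthlyCost(u: Usage, r: Rates) {
  const billedSeconds = u.requestsPerMonth * u.secondsPerRequest;
  const compute =
    billedSeconds * (u.vcpus * r.perVcpuSecond + u.memoryGiB * r.perGiBSecond);
  const tokens =
    (u.requestsPerMonth *
      (u.inputTokensPerRequest * r.perMillionInputTokens +
        u.outputTokensPerRequest * r.perMillionOutputTokens)) /
    1_000_000;
  return { compute, tokens, total: compute + tokens };
}
```

Splitting the result into compute and tokens is the useful part: for a chat agent that handles receipt images, it is plausible that model usage outweighs hosting, and an estimate like this shows whether that holds for a given usage pattern.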

The version-control discipline section is actually one of the stronger parts. The tutorial shows exactly how to use .gitignore to keep actual expense data and receipt images out of the repository while preserving example files for onboarding. That's a detail that beginners routinely get wrong, and getting it right from the start matters.

The Larger Question

What this tutorial really demonstrates is that the barrier to deploying a personal AI agent has dropped far enough that a motivated developer can do it in an afternoon. That's genuinely new. A year ago, this required either accepting a commercial vendor's terms or maintaining a significant amount of infrastructure glue yourself.

The Pi framework abstracts a lot of the hard parts. Claude Code reduces the amount of infrastructure knowledge you need to hold in your head. Google Cloud Run handles the scaling and availability that used to require a DevOps team. The combination is meaningfully accessible in a way that similar stacks weren't before.

What hasn't changed is that "accessible" and "production-ready" are still different categories. The tutorial builds something real and functional—but it's a sketch of a system, not a finished one. The creator knows this. The more interesting question is whether the frameworks and tools are now mature enough that turning a sketch into something hardened is a weekend project rather than a month-long one.

Based on what's in this tutorial, the honest answer is: almost.


Marcus Chen-Ramirez is a senior technology correspondent for Buzzrag covering AI, software development, and the intersection of technology and society.
