Python Backtesting Tools Promise a Lot. Know the Limits.
Zipline can simulate stock trading strategies in Python — but the leverage trap and survivorship bias can make bad strategies look brilliant. Here's what to watch.
Written by AI. Bob Reynolds

Photo: AI. Otieno Okello
Here is what a Python backtesting tool will show you: you invested $10,000 in Apple and Nvidia using a momentum strategy from 2014 to 2025, and you ended up with $80,000. Here is what it will not show you, unless you know to ask: whether any of that was real, repeatable, or the result of the simulator quietly letting you borrow unlimited money.
That gap, between what the backtest reports and what would have actually happened with your savings, is the whole story. A recent tutorial by the NeuralNine channel on YouTube walks through the Zipline backtesting library for Python, and it does so honestly enough that the tool's real dangers surface right in the middle of the demonstration. Those moments are worth paying attention to.
What backtesting actually is
The concept is straightforward. You write a set of rules — buy when this indicator rises above that one, sell when it falls below — and then you run those rules against years of historical stock price data, day by day, to see how you would have done. Zipline, the library demonstrated in the video, runs this as a simulator that walks through historical data one trading day at a time, executing your strategy as if it were happening in real time.
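To make the day-by-day idea concrete, here is a minimal sketch in plain Python rather than Zipline itself. The `run_backtest` function, the `rule` callback interface, and the price list are all invented for illustration; Zipline's real event loop is considerably more involved.

```python
from typing import Callable, List

def run_backtest(prices: List[float],
                 rule: Callable[[List[float]], int],
                 starting_cash: float) -> List[float]:
    """Walk through prices one day at a time and return daily portfolio values.

    `rule` sees only the prices up to and including today and returns the
    number of shares it wants to hold (a hypothetical interface, not Zipline's).
    """
    cash, shares = starting_cash, 0
    values = []
    for day in range(len(prices)):
        history = prices[: day + 1]          # no peeking at future prices
        target = rule(history)
        price = prices[day]
        cash -= (target - shares) * price    # buy or sell the difference at today's price
        shares = target
        values.append(cash + shares * price)
    return values

# Illustrative rule: hold 1 share whenever today's price is above yesterday's.
def hold_after_up_day(history: List[float]) -> int:
    return 1 if len(history) >= 2 and history[-1] > history[-2] else 0

prices = [10.0, 11.0, 12.0, 11.0, 13.0]      # made-up prices, not market data
values = run_backtest(prices, hold_after_up_day, starting_cash=100.0)
print(values)
```

Note that nothing in this loop stops `cash` from going negative. A simulator has to check for that explicitly, or it will happily trade with money the account does not have.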
The original Zipline was built by Quantopian, a company that let retail traders develop and run algorithmic strategies. Quantopian shut down in November 2020. The library survived because Stefan Jansen, author of Machine Learning for Algorithmic Trading, picked up maintenance and released what's now called Zipline Reloaded. The presenter is candid about the package's uncertain status: "Zipline was a package by Quantopian... they're no longer maintaining this package. So this package is basically dead." Jansen is keeping it functional, but it's worth noting you're building on infrastructure that the original organization abandoned.
The leverage trap, demonstrated live
The most instructive moment in the tutorial comes when the presenter sets the starting capital to ten dollars — not ten thousand, ten dollars — and runs a strategy that buys one hundred shares of Apple at a time. The simulator executes every trade without complaint.
"Even with $1, I can buy endless amounts of Apple stock just because I can use leverage here."
That is not a minor footnote. That is the difference between a backtest that means something and one that means nothing. If your simulator lets you buy $15,000 worth of stock with $10 in the account, the resulting profit curve is not a measure of your strategy's quality. It's a measure of how much Apple stock went up. Any strategy applied to Apple between 2014 and 2018, when the stock roughly doubled on a split-adjusted basis (from around $18 to around $39), will look smart. Most of them aren't.
The presenter shows you how to fix this — adding constraints that prevent borrowing beyond what you actually have — but the fix requires knowing to add it. The default behavior is to let you borrow freely. For anyone running these simulations without that context, the numbers will look better than reality by a substantial margin.
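The arithmetic of the trap, and of a cash-only fix, fits in a few lines. This is a sketch in plain Python, not the presenter's code and not Zipline's API: `affordable_shares` is a hypothetical helper, and the $150 share price is an assumed round number chosen so that 100 shares cost $15,000.

```python
def affordable_shares(cash: float, price: float, requested: int) -> int:
    """Cap a buy order at what the available cash can pay for (no borrowing)."""
    if price <= 0:
        raise ValueError("price must be positive")
    max_shares = int(max(cash, 0.0) // price)
    return min(requested, max_shares)

# $10 in the account, 100 shares requested, as in the demonstration.
# The share price is an assumption for illustration, not a quoted Apple price.
cash, price, requested = 10.0, 150.0, 100

unconstrained_cost = requested * price                    # what an unchecked simulator spends
constrained = affordable_shares(cash, price, requested)   # what the account can actually buy

print(f"unchecked order: ${unconstrained_cost:,.0f} of stock on ${cash:,.0f} of cash")
print(f"cash-only order: {constrained} shares")
```

With the check in place the $10 account buys nothing at all, which is the honest answer. Any profit the unchecked version reports comes from borrowed money the simulator never charged for.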
The strategies themselves
The tutorial walks through three approaches, each a step up in mechanical complexity.
First: a pure coin-flip strategy. Buy or sell randomly each day. This is presented explicitly as the baseline for how not to trade, and it's a useful baseline — the random strategy, run against Apple during a bull market and inflated by leverage, still shows the portfolio growing. That should give anyone pause.
Second: a moving average crossover strategy, which compares the average price over the last 30 days against the average over the last 100 days. When the short-term average rises above the long-term average, buy. When it falls below, sell. The presenter notes — correctly — that once leverage is removed, this strategy on Apple roughly broke even over the test period, ending close to where it started. The academic literature on these kinds of rules-based trend strategies is mixed at best in liquid markets, though the evidence varies considerably depending on the asset class and time period examined.
Third: a momentum indicator called MACD (moving average convergence divergence), which measures the relationship between two different moving averages to generate buy and sell signals. Using standard parameters (fast period 12, slow period 26, signal period 9) and running from 2014 through 2025 with leverage disabled, this strategy showed a reported return of around 600 percent. The presenter is appropriately cautious: "I'm showing you how to work with it on a programming basis."
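Both the crossover rule and MACD reduce to comparing moving averages, which a short plain-Python sketch can show. This is not the tutorial's code: the function names are made up, the prices are synthetic, and the exponential average is seeded with the first price, which is one common convention among several. Charting packages that seed differently will disagree on early values.

```python
from typing import List, Optional, Tuple

def sma(prices: List[float], window: int) -> List[Optional[float]]:
    """Simple moving average; None until `window` prices are available."""
    out: List[Optional[float]] = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - window : i + 1]) / window)
    return out

def ema(values: List[float], period: int) -> List[float]:
    """Exponential moving average, seeded with the first value (an assumption)."""
    alpha = 2.0 / (period + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out

def crossover_signal(prices: List[float],
                     short_window: int = 30,
                     long_window: int = 100) -> List[int]:
    """+1 when the short SMA is above the long SMA, -1 when below, else 0."""
    short_avg = sma(prices, short_window)
    long_avg = sma(prices, long_window)
    signal = []
    for a, b in zip(short_avg, long_avg):
        if a is None or b is None or a == b:
            signal.append(0)                 # not enough history, or no edge
        else:
            signal.append(1 if a > b else -1)
    return signal

def macd(prices: List[float], fast: int = 12, slow: int = 26,
         signal_period: int = 9) -> Tuple[List[float], List[float], List[float]]:
    """Return (macd_line, signal_line, histogram)."""
    fast_ema, slow_ema = ema(prices, fast), ema(prices, slow)
    macd_line = [f - s for f, s in zip(fast_ema, slow_ema)]
    signal_line = ema(macd_line, signal_period)
    histogram = [m - s for m, s in zip(macd_line, signal_line)]
    return macd_line, signal_line, histogram

# Synthetic, steadily rising prices: not market data.
prices = [100.0 + i for i in range(150)]
signal = crossover_signal(prices)
macd_line, signal_line, histogram = macd(prices)
```

On a series that only goes up, both indicators say "buy" as soon as they have enough history. That is the point made above: in a one-way market, almost any trend rule looks smart.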
The Nvidia problem
The multi-stock section is where things get genuinely interesting, and where a detail slips past that deserves more attention.
The presenter builds a portfolio of five stocks (Apple, Meta, Nvidia, Tesla, and Google) and allocates 19 percent of the portfolio to each position. Five positions at 19 percent each add up to 95 percent, leaving 5 percent unaccounted for. This is almost certainly an intentional cash buffer rather than a math error, but it passes without explanation. It is a small gap, but if you're building your own version of this and expecting fully deployed capital, you'll wonder where the missing 5 percent went.
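The allocation arithmetic is easy to check. In this small sketch the ticker symbols are assumed from the company names, and the $10,000 starting value is the figure used elsewhere in the demonstration.

```python
tickers = ["AAPL", "META", "NVDA", "TSLA", "GOOGL"]   # symbols assumed from the company names
weight_per_position = 0.19
portfolio_value = 10_000.0

target_weights = {t: weight_per_position for t in tickers}
invested = sum(target_weights.values())
cash_buffer = 1.0 - invested

print(f"invested: {invested:.0%}, cash buffer: {cash_buffer:.0%}")
print(f"per position: ${weight_per_position * portfolio_value:,.0f}, "
      f"idle cash: ${cash_buffer * portfolio_value:,.0f}")
```

If every position is filled, $1,900 goes to each stock and about $500 stays in cash. Positions that the strategy has not entered, or has sold, add to that idle cash.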
More importantly: the final portfolio is 62 percent cash, with Nvidia and Apple at roughly 19 percent each, and the total portfolio value has gone from $10,000 to $80,000. The presenter suspects Nvidia is responsible for most of that: "This is crazy. Probably mostly due to Nvidia is my guess."
That guess is almost certainly right. On a split-adjusted basis, Nvidia's stock rose from well under $1 in early 2014 to over $130 by early 2025, a gain of more than a hundredfold. Any strategy that held Nvidia during that period would look brilliant. The question is whether the strategy caused the holding or whether it just happened to stay long on a stock that became one of the most valuable companies in history. Those are different things, and a backtest run on data you've already seen cannot tell you which one it is.
This is the deeper problem with backtesting generally. You are always testing against history you already know. A strategy that happened to hold Nvidia through the AI boom looks prescient in hindsight. Run the same strategy on the stocks that didn't make it — the ones that were promising in 2014 and are now worth very little — and the picture changes. Zipline will test whatever stocks you tell it to test. It will not tell you that you chose the winners before you started.
What the tool is actually good for
None of this means the tool is useless. The presenter is clear throughout that this is a programming tutorial, not financial advice: "I might make mistakes here when it comes to explaining financial concepts. You are responsible for your own decisions."
That disclaimer is doing real work. Zipline is genuinely useful for programmers who want to understand how to structure a trading simulation — how to feed in historical price data, how to track portfolio value over time, how to model transaction costs and the small price gaps that occur between when you decide to trade and when the trade executes. These are real considerations that matter if you're ever building anything more serious than a demonstration script.
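Transaction costs and slippage can be modeled crudely in a few lines. The sketch below is plain Python with hypothetical helper names, and the default figures (5 basis points of slippage, half a cent per share in commission) are assumptions for illustration, not Zipline's defaults or any broker's rates.

```python
def fill_price(quoted: float, shares: int, slippage_bps: float) -> float:
    """Worsen the quoted price by slippage: buys pay more, sells receive less."""
    direction = 1 if shares > 0 else -1
    return quoted * (1 + direction * slippage_bps / 10_000)

def trade_cash_flow(quoted: float, shares: int,
                    slippage_bps: float = 5.0,
                    commission_per_share: float = 0.005) -> float:
    """Cash change from a trade (negative for buys), including both frictions."""
    price = fill_price(quoted, shares, slippage_bps)
    commission = abs(shares) * commission_per_share
    return -shares * price - commission

buy = trade_cash_flow(100.0, 10)      # buy 10 shares quoted at $100
sell = trade_cash_flow(100.0, -10)    # sell them again at the same quote
round_trip_cost = -(buy + sell)

print(f"round trip on an unchanged price costs ${round_trip_cost:.2f}")
```

Even with the price unchanged, buying and selling loses a little money each time. A strategy that trades often pays that toll often, which is why a backtest without these frictions tends to flatter high-turnover rules.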
The tutorial also handles data sourcing practically. Quandl (now Nasdaq Data Link) provides historical price data free with registration but locks current data behind a paywall. The presenter shows a working alternative using Yahoo Finance's free API to download recent price data and feed it into Zipline directly, a useful workaround that gives the simulator access to current markets without a subscription.
The question the tool cannot answer
Knowing that a strategy would have worked from 2014 to 2025 tells you something. It does not tell you that it will work from today forward, because markets in the next decade will not be identical to markets in the last one. A strategy trained on a period that included a historic bull run in technology stocks is not the same as a strategy that works in general. Backtesting is a check against the most obvious forms of stupidity. It is not a guarantee.
Retail investors have been promised systematic edges by technology tools in every generation — black-box trading software in the 1990s, algorithmic platforms in the 2000s, AI-powered robo-advisors more recently. Some of those tools were genuinely useful. Most of them worked best for the people selling them. Zipline sits in a different category — it's transparent, open-source, and the NeuralNine tutorial presents it honestly — but the pattern of new technology flattering its users with impressive-looking historical returns is older than the software.
The leverage default that inflates your results is easy to fix once you know it's there. The harder version of the same problem, selecting assets that already went up and then congratulating yourself on the strategy that held them, doesn't have a setting you can adjust.
By Bob Reynolds, Senior Technology Correspondent, BuzzRAG