Why Linear Algebra Is the Secret Language of AI
How machine learning actually works: IBM's Fangfang Lee breaks down the math that turns cat photos into numbers computers can understand.
By Tyler Nakamura
March 20, 2026

Photo: IBM Technology / YouTube
Here's something wild: when your phone's camera recognizes your face, it's not actually seeing you. It's doing math. Really specific, really fast math that involves concepts you might've slept through in college.
Fangfang Lee from IBM Technology just dropped an explainer that actually makes this make sense, and honestly? It's kind of beautiful how this all works. The gap between "computer processes image" and "computer understands what's in the image" isn't magic—it's linear algebra, and understanding even the basics changes how you think about AI.
The Translation Problem
Computers don't experience the world like we do. They can't look at a photo of your dog and think "good boy." They need everything converted into numbers first. As Lee explains it: "Computers cannot process images, text, audios, or videos directly like humans. Instead, we need to translate these inputs into a language they can understand, mathematics."
That's where linear algebra comes in. It's the Rosetta Stone between human-readable data and machine-processable information. Every image, every word, every sound—it all gets transformed into mathematical objects that computers can actually work with.
The process is called vectorization, which sounds intimidating but is basically just organized number storage. Your vacation photo? It becomes a matrix where each pixel gets a number representing its color intensity. A sentence? It turns into a vector that somehow captures what the words mean, not just what letters they contain.
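The image case can be sketched in a few lines. This is a toy example, not Lee's code: a hypothetical 3x3 grayscale image where each cell holds a made-up pixel intensity from 0 (black) to 255 (white).

```python
import numpy as np

# A hypothetical 3x3 grayscale image: each cell is one pixel's
# intensity, 0 = black, 255 = white. The numbers are invented.
image = np.array([
    [  0, 128, 255],
    [ 64, 192,  32],
    [255,   0, 128],
])

print(image.shape)  # (3, 3) -- rows x columns of pixels
```

A real photo is the same idea at a much larger scale, with a third dimension for color channels.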
The Building Blocks
Lee breaks down four fundamental types of mathematical objects that make this all possible:
Scalars are just single numbers. Think of them as the atoms of this whole system—5, 2.6, π. Single points in space.
Vectors are one-dimensional lists of numbers, like [2, 3, 4]. They're how you represent simple sequences or directions.
Matrices level up to two dimensions—rows and columns. This is where things get interesting because you can represent entire images as matrices. Each cell in the grid corresponds to a pixel's brightness or color.
Tensors go full sci-fi, handling three or more dimensions. They're the heavy lifters in frameworks like TensorFlow (the name isn't subtle). When you're processing video or working with massive language models, you're dealing with tensors.
What matters isn't memorizing these definitions—it's understanding that this hierarchy exists to handle increasingly complex data. A single temperature reading? Scalar. A sentence? Vector. An image? Matrix. A video? Tensor.
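In NumPy (a common choice, not one named in the video), the whole hierarchy is just the number of dimensions on an array. A minimal sketch with made-up values:

```python
import numpy as np

scalar = np.array(5.0)               # single number: 0 dimensions
vector = np.array([2, 3, 4])         # 1-D list of numbers
matrix = np.array([[1, 2],
                   [3, 4]])          # 2-D grid: rows and columns
tensor = np.zeros((10, 64, 64))      # 3-D block, e.g. 10 frames of 64x64 pixels

for obj in (scalar, vector, matrix, tensor):
    print(obj.ndim)  # prints 0, 1, 2, 3
```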
Measuring Similarity (The Interesting Part)
Once everything's converted to numbers, the real question becomes: how do you compare things? How does a recommendation algorithm know two movies are similar, or a search engine know which results match your query?
Two methods dominate: Euclidean distance and cosine similarity.
Euclidean distance is straightforward—it measures the literal distance between two vectors in space. You calculate the difference between each corresponding dimension, square them, sum them up, and take the square root. It's the Pythagorean theorem on steroids. The output is unbounded, which means it can be any positive number.
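That recipe (difference per dimension, square, sum, square root) fits in a few lines. A minimal sketch using only the standard library:

```python
import math

def euclidean_distance(a, b):
    # Difference in each dimension, squared, summed, then square-rooted.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

print(euclidean_distance([0, 0], [3, 4]))  # 5.0 -- the classic 3-4-5 triangle
```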
Cosine similarity takes a different approach. Instead of measuring distance, it measures the angle between two vectors. Lee explains: "The closer the two vectors are in their semantic meaning the smaller the angles will be." The output ranges from -1 to 1, which makes it standardized and easier to interpret.
When cosine similarity equals 1, the vectors point in the exact same direction—they're basically identical in meaning. Zero means they're perpendicular, representing completely independent features. Negative 1 means they point in opposite directions.
That perpendicular thing is actually significant. "In machine learning, when two vectors are perpendicular, it means that the features that they represent are completely independent of one another," Lee notes. That's not just math trivia—it tells you whether features in your data actually relate to each other or not.
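All three cases, including the perpendicular one, show up in a short sketch. This is an illustrative implementation, not code from the video:

```python
import math

def cosine_similarity(a, b):
    # Dot product of the vectors, divided by the product of their lengths.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1, 0], [2, 0]))   # 1.0  -- same direction
print(cosine_similarity([1, 0], [0, 5]))   # 0.0  -- perpendicular: independent features
print(cosine_similarity([1, 0], [-3, 0]))  # -1.0 -- opposite directions
```

Note that `[1, 0]` and `[2, 0]` score a perfect 1.0 even though their lengths differ: cosine similarity only cares about direction, which is exactly why it's preferred for comparing meanings rather than magnitudes.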
The Efficiency Hack
Here's where it gets practical: modern AI models work with insane amounts of data. Training large language models involves billions of tokens, and doing full-dimensional calculations on all of that would be computationally ridiculous.
Enter Singular Value Decomposition (SVD), which Lee describes as "not only elegant but extremely versatile." SVD takes one massive matrix and breaks it down into three smaller, more manageable matrices that can be reconstructed back into the original.
The example Lee uses is perfect: imagine a matrix representing user ratings of movies. Rows are users, columns are movies, and each cell contains a rating. SVD splits this into three matrices: U (capturing user behavior patterns), Sigma (a diagonal matrix indicating which features matter most), and V-transposed (capturing movie characteristics).
The genius move? "Using SVD algorithm, we can select and only retain the most informative features based on the singular values and disregard unhelpful information." You're essentially compressing the data while keeping what matters. It's like converting a RAW photo to JPEG—you lose some information, but you keep the important stuff and save massive amounts of storage and processing power.
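Here's the movie-ratings example as a sketch in NumPy, with an invented 4-users-by-3-movies matrix. Keeping only the top two singular values gives a compressed approximation that's close to the original:

```python
import numpy as np

# Hypothetical ratings matrix: 4 users x 3 movies (made-up numbers).
ratings = np.array([
    [5.0, 4.0, 1.0],
    [4.0, 5.0, 1.0],
    [1.0, 1.0, 5.0],
    [2.0, 1.0, 4.0],
])

# Full decomposition: ratings == U @ diag(S) @ Vt
U, S, Vt = np.linalg.svd(ratings, full_matrices=False)

# Keep only the top-k singular values -- the most informative features --
# and disregard the rest.
k = 2
approx = U[:, :k] @ np.diag(S[:k]) @ Vt[:k, :]

print(np.round(approx, 1))  # close to the original, but lower rank
```

That's the JPEG trade-off in matrix form: `approx` is not byte-for-byte identical to `ratings`, but it preserves the dominant user/movie patterns while discarding the least informative dimension.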
Why This Actually Matters
If you're shopping for AI-powered devices or trying to understand what's actually happening when ChatGPT responds to your prompts, this isn't academic. These operations—matrix multiplication, dot products, dimensionality reduction—are happening thousands of times per second.
When your phone processes your voice command locally instead of sending it to the cloud, it's running optimized matrix operations on a chip designed specifically for this math. When a recommendation algorithm suggests your next binge-watch, it's calculating cosine similarity between your viewing history vector and millions of other options.
The frameworks Lee mentions—PyTorch, TensorFlow, Keras—are all built to make these linear algebra operations fast and efficient. They've abstracted away a lot of the complexity, but understanding what's happening under the hood helps you make better decisions about which AI tools to trust and which claims to be skeptical of.
Lee puts it plainly at the end: linear algebra "converts data into mathematical form, computations into organized structure, and structure into actionable intelligence." That's the pipeline. That's how the magic trick works.
And honestly? Once you see it, you can't unsee it. Every AI feature, every smart recommendation, every image recognition system—it's all just extremely organized math, doing what math does best: finding patterns humans would never spot on their own.
— Tyler Nakamura, Consumer Tech & Gadgets Correspondent
Watch the Original Video
How Linear Algebra Powers Machine Learning (ML)
IBM Technology
11m 19s
About This Source
IBM Technology
IBM Technology, a YouTube channel launched in late 2025, has swiftly garnered a following of 1.5 million subscribers. The channel serves as an educational platform designed to demystify cutting-edge technological topics such as AI, quantum computing, and cybersecurity. Drawing on IBM's rich history of technological innovation, it aims to provide viewers with the knowledge and skills necessary to succeed in today's tech-driven world.