1.1. Vectors & Operations


🪄 Step 1: Intuition & Motivation

  • Core Idea: Imagine every piece of data you see — a house with square footage and price, a photo with pixels, or a sentence turned into numbers — as a list of quantities. That list is called a vector. Vectors are the basic building blocks of how machines “see” the world in numbers.

  • Simple Analogy: Think of a vector like a directional arrow in space. It doesn’t just tell you where something is, but also how to move there — just like a Google Maps direction saying, “Go 3 steps north, 4 steps east.”


🌱 Step 2: Core Concept

What’s Happening Under the Hood?

A vector is simply a collection of numbers stacked together. For example, $ \vec{v} = [2, 3] $ means “go 2 units right and 3 units up.”

These numbers often represent features in data.

  • In housing data: $[2000, 3, 450000]$ could represent a house with 2000 sqft, 3 bedrooms, and a price of $450,000.
  • In an image: A vector might represent pixel brightness values.

Now, the magic starts when we do operations on these vectors (a short code sketch follows this list):

  • Addition: Combine directions (e.g., $[1,2] + [3,1] = [4,3]$).
  • Scalar Multiplication: Stretch or shrink vectors (e.g., $2 \cdot [3,1] = [6,2]$).
  • Dot Product: Measure how aligned two directions are (e.g., how similar two data points are).
  • Cross Product (3D only): Find a new direction perpendicular to both vectors.
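Below is a minimal NumPy sketch of these four operations; the vectors are illustrative values chosen for this example.

```python
import numpy as np

a = np.array([1, 2])
b = np.array([3, 1])

print(a + b)                 # addition: [4 3]
print(2 * np.array([3, 1]))  # scalar multiplication: [6 2]
print(np.dot(a, b))          # dot product: 1*3 + 2*1 = 5

# The cross product is defined for 3-D vectors
u = np.array([1, 0, 0])
v = np.array([0, 1, 0])
print(np.cross(u, v))        # [0 0 1], perpendicular to both u and v
```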

Why It Works This Way

When we add two vectors, we’re essentially chaining two movements. For instance, if one vector says “go east 2, north 3” ($[2,3]$) and another says “go west 1, north 2” ($[-1,2]$), the total move is $[1,5]$.

In machine learning, this represents combining signals — like blending different features to make a single prediction.

The dot product, on the other hand, tells you how aligned two vectors are. If they point in the same direction → large positive dot product. If opposite → negative. If perpendicular → zero.

That’s why cosine similarity (which uses the dot product) measures how similar two vectors are in orientation, regardless of their magnitude.
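Here is a small sketch of that behavior, plus a cosine similarity helper built from the dot product; the vectors are made up for illustration.

```python
import numpy as np

def cosine_similarity(a, b):
    # dot product divided by the product of lengths gives cos(theta)
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

a = np.array([2.0, 3.0])
print(np.dot(a, np.array([4.0, 6.0])))    # same direction  -> 26.0 (large positive)
print(np.dot(a, np.array([-2.0, -3.0])))  # opposite        -> -13.0 (negative)
print(np.dot(a, np.array([3.0, -2.0])))   # perpendicular   -> 0.0

# Cosine similarity ignores magnitude: [4, 6] is just a stretched [2, 3]
print(cosine_similarity(a, np.array([4.0, 6.0])))  # 1.0
```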

How It Fits in ML Thinking

Vectors are how data lives inside a model. Every data point, every feature, and even the model’s parameters (weights) are vectors.

When a model makes predictions, it’s just taking dot products — combining inputs and weights to compute outputs.

For instance, in linear regression: $y = Xw + b$ Each row of $X$ is a data vector, and $w$ is a weight vector. Their dot product gives the model’s prediction.
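As a rough sketch of that prediction step (with made-up feature values, weights, and bias, following the housing example above):

```python
import numpy as np

# Each row of X is one data vector: [sqft, bedrooms]
X = np.array([[2000.0, 3.0],
              [1500.0, 2.0]])
w = np.array([150.0, 10000.0])  # hypothetical weight vector
b = 25000.0                     # hypothetical bias

# Each prediction is the dot product of a row of X with w, plus b
y_hat = X @ w + b
print(y_hat)  # [355000. 270000.]
```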


📐 Step 3: Mathematical Foundation

Vector Operations
  1. Addition: $ \vec{a} + \vec{b} = [a_1 + b_1, a_2 + b_2, \dots, a_n + b_n] $ → Adds component-wise.

  2. Scalar Multiplication: $ c \vec{a} = [c \cdot a_1, c \cdot a_2, \dots, c \cdot a_n] $ → Scales the vector’s length (magnitude).

  3. Dot Product: $ \vec{a} \cdot \vec{b} = a_1 b_1 + a_2 b_2 + \dots + a_n b_n $ → Measures similarity between two vectors. If both are normalized (length 1): $ \cos(\theta) = \vec{a} \cdot \vec{b} $

The dot product tells you how much one vector “leans” in the direction of another. In machine learning, that means how strongly an input aligns with a weight vector, or how similar two items are in embedding space.
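One way to make “leaning” concrete is the scalar projection of $\vec{a}$ onto $\vec{b}$, which is $\vec{a} \cdot \vec{b} / ||\vec{b}||$. A tiny sketch with illustrative vectors:

```python
import numpy as np

a = np.array([3.0, 4.0])
b = np.array([1.0, 0.0])  # unit vector along the x-axis

# How far a "leans" along b: the scalar projection
projection = np.dot(a, b) / np.linalg.norm(b)
print(projection)  # 3.0 -- exactly the x-component of a
```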

Vector Norms
  1. $L_1$ Norm (Manhattan Distance): $ ||\vec{v}||_1 = |v_1| + |v_2| + \dots + |v_n| $ → Measures total distance if you move along axes (like city blocks).

  2. $L_2$ Norm (Euclidean Distance): $ ||\vec{v}||_2 = \sqrt{v_1^2 + v_2^2 + \dots + v_n^2} $ → The straight-line distance from the origin.

In optimization, the $L_2$ norm penalizes large weights more heavily, “pulling” them smoothly toward zero. The $L_1$ norm pushes some weights exactly to zero, creating sparsity. That’s why we use $L_1$ for feature selection and $L_2$ for smoothing.
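A quick sketch of both norms, and of the penalty terms a regularized model would add to its loss (the weight vector and the strength `lam` are illustrative):

```python
import numpy as np

w = np.array([3.0, -4.0, 0.5])

l1 = np.linalg.norm(w, ord=1)  # |3| + |-4| + |0.5| = 7.5    (Manhattan)
l2 = np.linalg.norm(w)         # sqrt(9 + 16 + 0.25) ≈ 5.02  (Euclidean)
print(l1, l2)

# As regularizers, these norms of the weight vector are added to the loss
lam = 0.1
l1_penalty = lam * l1      # tends to push some weights exactly to zero (sparsity)
l2_penalty = lam * l2**2   # smoothly shrinks all weights toward zero
print(l1_penalty, l2_penalty)
```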

🧠 Step 4: Key Ideas

  • Vectors represent both direction and magnitude in numerical space.
  • Operations like dot product and norm help measure similarity and importance.
  • All data and model weights in ML are vectors (or collections of vectors).

⚖️ Step 5: Strengths, Limitations & Trade-offs

Strengths:

  • Forms the foundation for all data representations in ML.
  • Enables compact, efficient mathematical manipulation.
  • Provides geometric intuition behind algorithms (e.g., similarity, projections).

Limitations:

  • Hard to visualize beyond three dimensions.
  • Beginners can struggle to connect algebraic operations with geometric meaning.
  • May feel mechanical until tied to model intuition.

Trade-off: The power of vectors lies in balance. Treat them as numbers (for computation) and as directions (for intuition); both views are essential for mastering deeper topics like PCA or Gradient Descent.

🚧 Step 6: Common Misunderstandings

  • Myth: Vectors are just lists of numbers. → Truth: They represent directions and magnitudes that encode relationships between features.
  • Myth: The dot product only has numerical meaning. → Truth: It measures alignment — a geometric concept crucial in similarity-based ML methods.
  • Myth: Norms are only for measuring size. → Truth: Norms act as regularizers, influencing how models generalize.

🧩 Step 7: Mini Summary

🧠 What You Learned: Vectors are numerical representations of data that capture both direction and magnitude.

⚙️ How It Works: Through operations like addition, dot product, and norms, vectors let us measure relationships, combine information, and compute similarity.

🎯 Why It Matters: Understanding vectors is the foundation for everything — from matrix algebra to gradient descent.
