1.1. Batch vs. Real-Time Processing
🪄 Step 1: Intuition & Motivation
Core Idea (in 1 short paragraph): Machine learning systems need to process data — lots of it. But here’s the catch: not all data arrives neatly at once. Some systems work best when data is collected over time and processed in one big “batch,” while others need to react instantly, like catching fraud as it happens. The big question is when to wait and when to act immediately.
Simple Analogy: Imagine cooking for a wedding.
- Batch processing is like preparing all the food the night before — slow, planned, and organized.
- Real-time processing is like a live chef station — fast, reactive, and serving people as they come.

Both feed people, but each shines in different scenarios.
🌱 Step 2: Core Concept
Let’s unpack what’s really going on behind these two patterns.
What’s Happening Under the Hood?
In batch processing, data is gathered over a period — hours or days — and then processed in one go. Think of it as taking a full snapshot of your world and analyzing it. For example:
- A recommendation engine retrains every night on all user interactions from the day.
- The pipeline uses tools like Spark, Hadoop, or Airflow to extract, transform, and load (ETL) data into a training pipeline.
In real-time processing, data is processed the moment it arrives. Imagine a transaction flowing into a fraud detection service. The model must predict fraud instantly, so each new event triggers an inference. Message brokers like Kafka deliver continuous data streams, and stream processors like Flink or Spark Streaming ensure each event is handled with minimal delay.
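The contrast can be sketched in a few lines of plain Python. This is a toy illustration, not the actual Spark or Kafka APIs — `batch_process`, `realtime_process`, and the event shape are all hypothetical names chosen for the example:

```python
# Toy sketch (hypothetical functions; not real Spark/Kafka APIs).
# Batch: collect everything first, then process in one pass.
def batch_process(events):
    return [e["amount"] * 2 for e in events]  # stand-in for a heavy transform

# Real-time: handle each event the moment it arrives.
def realtime_process(event_stream, on_result):
    for event in event_stream:           # e.g. messages arriving on a topic
        on_result(event["amount"] * 2)   # inference fires once per event

events = [{"amount": i} for i in range(5)]

batch_out = batch_process(events)                  # one big pass
stream_out = []
realtime_process(iter(events), stream_out.append)  # one event at a time

assert batch_out == stream_out == [0, 2, 4, 6, 8]
```

Both paths compute the same result; what differs is *when* each item is processed relative to its arrival.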
Why It Works This Way
Batch systems are efficient when:
- You don’t need immediate results.
- Data can be aggregated and processed periodically (e.g., daily).
Real-time systems are vital when:
- Each new data point matters immediately (e.g., self-driving cars, stock trading, credit card fraud).
The key trade-off is between timeliness and efficiency. Batch = “cheaper but slower.” Real-time = “faster but more expensive.”
How It Fits in ML Thinking
Machine learning pipelines depend on how data flows.
- Batch pipelines are used for model training — when we want a model to learn from all historical data.
- Real-time pipelines are used for model inference — when we want the model to predict right now based on new input.
In production, both often coexist: A model trained on batch data serves predictions in real time.
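That hybrid pattern can be sketched as follows. The "model" is deliberately trivial (a historical mean) and the function names are illustrative assumptions, not a real training or serving framework:

```python
# Sketch of the common hybrid: train on a batch, serve predictions per event.
def train_on_batch(history):
    """Nightly batch job: fit on all accumulated data."""
    return sum(history) / len(history)   # "model" = historical mean

def serve(model, new_event):
    """Real-time path: score a single incoming event instantly."""
    return new_event - model             # e.g. deviation from the baseline

model = train_on_batch([10, 20, 30])  # offline, runs periodically
score = serve(model, 50)              # online, runs per request
print(score)  # 30.0
```

The batch side is slow but sees all history; the serving side touches only one event and must return immediately.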
📐 Step 3: Mathematical Foundation
While batch vs. real-time is more architectural than mathematical, one idea helps us quantify their performance: throughput vs. latency.
Throughput–Latency Relationship
- Throughput measures how much data we can process per second.
- Latency measures how long each data point takes to process.
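These two quantities can be written down precisely. For a system that finishes $N$ items in total wall-clock time $T_{\text{total}}$, with item $i$ arriving at $t^{\text{arrive}}_i$ and completing at $t^{\text{done}}_i$:

```latex
\text{throughput} = \frac{N}{T_{\text{total}}},
\qquad
\text{latency}_i = t^{\text{done}}_i - t^{\text{arrive}}_i
```

Note that in a batch system an item may sit waiting until the batch window closes, so its latency includes queueing time, not just compute time.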
Imagine a restaurant:
- Throughput = number of meals served per hour.
- Latency = time from order to table.

Batch systems prioritize throughput; real-time systems prioritize latency.
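A small simulation with a virtual clock (no real timing, all numbers are made-up time units) shows why batching raises average latency even when per-item work is identical:

```python
# Virtual-clock simulation: 10 items arrive 1 time-unit apart,
# and each takes exactly 1 unit of work in either mode.
arrivals = list(range(10))          # item i arrives at time i

# Real-time: processing starts on arrival, finishes 1 unit later.
rt_latency = [1] * len(arrivals)

# Batch: wait for all 10 items, then process them back to back.
start = arrivals[-1]                # batch kicks off at t = 9
finish = [start + k + 1 for k in range(len(arrivals))]
batch_latency = [f - a for f, a in zip(finish, arrivals)]

print(sum(rt_latency) / 10)      # 1.0  (average real-time latency)
print(sum(batch_latency) / 10)   # 10.0 (average batch latency)
```

Total work is the same in both runs; only the waiting differs. That waiting is exactly what batch systems trade away in exchange for amortized, high-throughput processing.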
🧠 Step 4: Assumptions or Key Ideas
- Data can arrive continuously or in chunks.
- Systems must choose based on the business goal — accuracy vs. immediacy.
- Infrastructure choices (Kafka, Spark, Airflow) differ depending on this design decision.
- Real-time often means streaming architecture, stateful computations, and low-latency serving layers.
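"Stateful computation" in the last bullet can be made concrete with a toy per-key counter — the kind of keyed state a stream processor like Flink manages for you. The event schema and threshold here are invented for illustration:

```python
from collections import defaultdict

# Stateful stream sketch: a running event counter per card,
# updated event by event with an immediate decision.
state = defaultdict(int)

def on_event(event):
    """Low-latency handler: mutate state, then decide right away."""
    state[event["card"]] += 1
    return state[event["card"]] > 2   # flag after 3 events on one card

stream = [{"card": "A"}, {"card": "B"}, {"card": "A"}, {"card": "A"}]
flags = [on_event(e) for e in stream]
print(flags)  # [False, False, False, True]
```

The tricky part in production is that this state must survive restarts and stay consistent across parallel workers — which is why streaming frameworks exist at all.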
⚖️ Step 5: Strengths, Limitations & Trade-offs
Strengths

Batch Processing
- Simpler to manage.
- Cost-efficient for large-scale analysis.
- Ideal for model retraining and long-term trends.

Real-Time Processing
- Enables instant decisions (fraud detection, alerts).
- Great for adaptive systems (recommendations, personalization).

Limitations

Batch Processing
- Not suitable for immediate responses.
- Delayed insights.
- Requires significant compute for large data loads.

Real-Time Processing
- Expensive to maintain.
- Difficult to ensure data consistency.
- Harder debugging and scaling.
Batch = accuracy and efficiency over time. Real-time = responsiveness and immediacy at higher cost.
Think of it like watching news:
- Batch = reading tomorrow’s newspaper (complete but delayed).
- Real-time = watching live TV (immediate but chaotic).
🚧 Step 6: Common Misunderstandings (Optional)
- “Real-time systems always outperform batch.” → Not true; per-event processing carries overhead, so batch achieves far higher throughput whenever immediacy isn’t required.
- “Batch systems are outdated.” → Far from it; most enterprise pipelines rely heavily on batch for training and analytics.
- “You must choose one.” → In practice, hybrid architectures (called lambda or kappa architectures) combine both.
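The hybrid idea behind a lambda architecture can be sketched in miniature. This is a toy model with made-up numbers, not a real serving layer — the point is only that queries merge a precomputed batch view with a small real-time buffer:

```python
# Toy lambda-architecture sketch: a periodic batch view plus a
# speed layer covering events since the last batch run.
batch_events = [3, 5, 7]        # processed by last night's batch job
batch_view = sum(batch_events)  # precomputed aggregate

speed_buffer = []               # events since the batch ran

def ingest(x):
    speed_buffer.append(x)      # speed layer: incremental, low latency

def query_total():
    return batch_view + sum(speed_buffer)  # serving layer merges both views

ingest(2)
ingest(4)
print(query_total())  # 21
```

A kappa architecture takes this further by dropping the batch layer and reprocessing history through the same streaming path.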
🧩 Step 7: Mini Summary
🧠 What You Learned: The difference between batch and real-time lies in how data is processed — all at once vs. continuously.
⚙️ How It Works: Batch pipelines store and process data periodically; real-time systems react to events as they occur.
🎯 Why It Matters: Choosing the right approach determines how fast, costly, and reliable your ML system will be.