3.3 Time Series and Lag Analysis
🪄 Step 1: Intuition & Real-World Analogy
🎯 Core Idea
Time-series analysis in SQL is all about understanding change over time — tracking trends, growth, and periodic patterns directly inside your database.
You can use:
LAG()/LEAD()→ compare current values to previous or next rows.- Moving averages → smooth out daily noise and highlight trends.
This allows analysts to say things like:
“Revenue grew by 15% compared to last week,” “Average sales over the last 7 days are trending up,” or “User logins are dropping compared to yesterday.”
🌍 Real-Life Analogy
Imagine you’re a fitness trainer 📈 tracking your clients’ weights:
- You write down their weights daily.
- You want to see how much they lost since yesterday → that’s
LAG(). - You want to see the average weight over the last 7 days → that’s a moving average.
The more historical data you look at (the wider your window), the smoother your trend line — but the heavier your notebook. That’s the trade-off: insight vs. computation cost.
💬 Conversational Style
You might think:
“Wait, can’t I just compute differences in Excel?”
Sure — but imagine your dataset has 10 million daily transactions.
SQL lets you compute trends at scale, using OVER windows that operate directly in the query engine.
And it’s not just about numbers — it’s about patterns and deltas.
🌱 Step 2: The Core Concept Explained
What It Really Does
Time series analysis in SQL lets you:
- Compare a value to its neighbors (
LAG,LEAD). - Compute rolling metrics like averages or sums over time windows.
- Detect trends (growth rate, spikes, seasonality).
It’s all powered by window frames (ROWS BETWEEN X PRECEDING AND CURRENT ROW) that define how many rows around the current one the function can “see.”
How It Works Internally
For each row:
- SQL looks at its partition (optional) and order (usually time).
- It builds a window frame — a set of nearby rows.
- Functions like
LAG(),LEAD(), orAVG()compute using those rows.
Each time you widen the window, the engine stores more intermediate data in memory, which affects performance.
Common Use Cases
- Day-over-day revenue growth (
LAG). - Week-over-week moving average (
AVG OVER). - Detecting churn or inactivity gaps (
LEAD). - Computing rolling retention, session durations, or sales trends.
💻 Step 3: Code Demonstration & Explanation
Let’s explore practical examples that build intuition and query fluency.
🧩 Example 1: Day-over-Day Change with LAG()
SELECT
order_date,
SUM(amount) AS daily_revenue,
LAG(SUM(amount)) OVER (ORDER BY order_date) AS prev_day_revenue,
SUM(amount) - LAG(SUM(amount)) OVER (ORDER BY order_date) AS revenue_change
FROM sales
GROUP BY order_date
ORDER BY order_date;Explanation:
LAG()lets the current day look “back” at the previous day’s revenue.- You can easily compute deltas, growth percentages, or detect drop-offs.
| order_date | daily_revenue | prev_day_revenue | revenue_change |
|---|---|---|---|
| 2025-01-01 | 1200 | NULL | NULL |
| 2025-01-02 | 1800 | 1200 | 600 |
| 2025-01-03 | 1600 | 1800 | -200 |
🧩 Example 2: Next Event Preview with LEAD()
SELECT
customer_id,
order_date,
LEAD(order_date) OVER (PARTITION BY customer_id ORDER BY order_date) AS next_order_date,
LEAD(order_date) OVER (PARTITION BY customer_id ORDER BY order_date) - order_date AS days_between_orders
FROM orders;Explanation:
LEAD()peeks forward to find the next event for each user.- Useful for measuring re-engagement, session frequency, or churn risk.
🧩 Example 3: Moving Average (Rolling Weekly)
SELECT
order_date,
SUM(amount) AS daily_sales,
AVG(SUM(amount)) OVER (
ORDER BY order_date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS rolling_weekly_avg
FROM sales
GROUP BY order_date
ORDER BY order_date;Explanation:
- Looks at a 7-day window (current + 6 days before).
- Smooths daily fluctuations — ideal for identifying weekly trends.
🧩 Example 4: Moving Sum with Explicit Range
SELECT
order_date,
SUM(amount) AS daily_sales,
SUM(SUM(amount)) OVER (
ORDER BY order_date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS weekly_total
FROM sales
GROUP BY order_date
ORDER BY order_date;Explanation: Perfect for rolling totals — e.g., “total revenue over the last 7 days.”
⚙️ Step 4: Performance Insights & Optimization
🧠 What Happens Under the Hood
- Each window frame requires temporary storage of multiple rows.
- For
LAG()/LEAD(), the system just fetches one row offset — very cheap. - For moving averages, SQL maintains an in-memory sliding buffer of the last N rows. The larger the window → the more memory and I/O needed.
⚙️ Optimizing Time Series Queries
Use the Smallest Possible Window
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW→ better than
UNBOUNDED PRECEDINGwhen you only need a short period.Pre-Aggregate by Date If you have billions of transactions per day, first summarize daily totals:
CREATE MATERIALIZED VIEW daily_sales AS SELECT order_date, SUM(amount) AS daily_revenue FROM sales GROUP BY order_date;Partition by Entity When Possible Example:
PARTITION BY region ORDER BY order_dateLimits memory usage by splitting windows per region.
Cluster or Index by Time Ensures data is physically ordered by
order_date, reducing sort cost.Monitor Execution Plans Use:
EXPLAIN (ANALYZE, BUFFERS)Look for “WindowAgg” nodes — if they show large buffer usage, consider breaking query into chunks.
🚀 Rule of Thumb
The wider your window frame, the heavier the memory footprint. Wide rolling windows ≈ temporary cache explosion.
🎯 Step 5: Interview Perspective & Reasoning
Common Interview Questions
- “What is the difference between
LAG()andLEAD()?” - “When would you use a moving average in SQL?”
- “If your window frame is too wide, what happens to performance?”
- “What’s the difference between ROWS and RANGE in window frames?”
- “How do you compute day-over-day growth for millions of rows efficiently?”
Reasoning Expectation
Interviewers want to see:
- Clarity on temporal logic (time ordering & partitions).
- Awareness of execution costs (sorts, memory, and buffers).
- Ability to justify design decisions: “I’d pre-aggregate or partition by date for performance.”
Bonus Challenge
You have sales data at the transaction level for 1 billion rows. Compute 7-day rolling revenue per region efficiently. Discuss indexing, partitioning, and when you’d switch to materialized views.
🧩 Step 6: Common Mistakes & Debugging Insights
🚨 Common Mistakes (Click to Expand)
Forgetting
ORDER BYin window functions → undefined ordering = meaningless results.Using
UNBOUNDED PRECEDINGunnecessarily → performance killer.Confusing
RANGEandROWS:ROWS= fixed number of rows.RANGE= all rows within a value range (e.g., 7 days).
Not pre-aggregating → applying rolling windows over raw events instead of daily summaries.
Missing index on time column → expensive sorts.
Debugging Tip:
Use EXPLAIN (ANALYZE, BUFFERS) to check memory usage of WindowAgg.
If your rolling query is slow:
- Limit window frame.
- Materialize intermediate results.
- Check physical data order.
🧠 Step 7: Key Takeaways
🧠 Main Idea: Window functions like
LAG(),LEAD(), and moving averages let SQL understand change over time — powering real analytics directly in queries.
💻 How It Works: They operate over ordered partitions using window frames that define how many past/future rows are visible.
🚀 Why It Matters: They’re essential for trend analysis, anomaly detection, and forecasting pipelines — but mastering performance trade-offs is what sets experts apart.