Streaming vs Batch: Why Sub-Second Aggregation Matters

Batch and streaming aren’t rivals. They’re answers to different questions. The mistake is using one where the other belongs. The deciding factor is almost always how fast a stale answer becomes a wrong action.

Batch

Collect data over an interval, then process it all at once. Simple, cheap, and correct for anything where a delay between event and insight is fine: daily reports, weekly cohorts, model training sets. Most analytics is batch and should stay batch.

Streaming

Process each event as it arrives; keep aggregates continuously up to date. More moving parts, but the result reflects reality now. It’s worth the complexity only when the gap between “it happened” and “I know” has a cost.
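To make the contrast concrete, here is a minimal sketch of both styles applied to the same events. The event shape (a timestamp in seconds plus an agent id), the 60-second interval, and the names batch_counts and StreamingCounter are assumptions made for illustration; they are not taken from any particular system.

```python
from collections import defaultdict

# Hypothetical events: (timestamp_seconds, agent_id)
events = [(0.2, "a"), (0.9, "b"), (1.5, "a"), (61.0, "a"), (61.3, "b"), (62.0, "a")]

def batch_counts(events, interval=60.0):
    """Batch style: collect everything, then count per fixed interval in one pass."""
    buckets = defaultdict(lambda: defaultdict(int))
    for ts, agent in events:
        buckets[int(ts // interval)][agent] += 1
    return {bucket: dict(counts) for bucket, counts in sorted(buckets.items())}

class StreamingCounter:
    """Streaming style: update the aggregate on every event as it arrives."""

    def __init__(self):
        self.counts = defaultdict(int)

    def update(self, agent):
        self.counts[agent] += 1
        return self.counts[agent]  # the current value is available immediately

# Batch: one answer per interval, available only after the interval closes.
per_interval = batch_counts(events)

# Streaming: the running total is correct after every single event.
counter = StreamingCounter()
running = [counter.update(agent) for _, agent in events]
```

The batch function is shorter and has no state to manage, which is why it is the right default. The streaming counter earns its extra state only when something needs to read the count between intervals.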

The cost of the window

A batch aggregator that runs every 30–60 seconds has a built-in blind spot of that length. For a dashboard a human glances at, 60 seconds is invisible. For an autonomous agent making tool calls in a loop, 60 seconds can be thousands of calls: at, say, 50 calls per second, that is 3,000 calls before the next run even looks. The blind spot is the incident. That’s the entire reason Phronis uses streaming SQL: detection has to happen inside the window where the damage is still small, and a batch timer puts the answer outside it.
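The size of the blind spot can be worked out with simple arithmetic. The sketch below assumes a runaway agent that starts at time zero and calls at a constant rate, a detector that fires once a call-count threshold is crossed, and a batch job that runs on a fixed timer. The rate (50 calls per second), the threshold (100 calls), and both function names are hypothetical values chosen for illustration, not figures from Phronis.

```python
import math

def calls_before_streaming_detection(threshold):
    """A per-event check fires on the call that crosses the threshold."""
    return threshold

def calls_before_batch_detection(rate_per_s, threshold, interval_s):
    """A periodic job can only notice at the first run after the threshold is crossed."""
    t_cross = threshold / rate_per_s                         # when the threshold is crossed
    t_detect = math.ceil(t_cross / interval_s) * interval_s  # next scheduled batch run
    return round(rate_per_s * t_detect)                      # calls made by then

# Hypothetical runaway agent: 50 calls/s, alert threshold of 100 calls.
streaming = calls_before_streaming_detection(threshold=100)                          # 100
batch = calls_before_batch_detection(rate_per_s=50, threshold=100, interval_s=60)    # 3000
```

Under these assumptions the threshold is crossed after two seconds, but the batch job does not look until the 60-second mark, so the agent makes 30 times as many calls before anything can react. Shrinking the batch interval narrows the gap; it closes only when the check runs per event.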

How to choose

Ask: if my answer is 60 seconds stale, what happens?

Nothing much: batch. Don’t pay for streaming you don’t need.
An automated system keeps acting on the wrong state, fast: streaming. The latency is the point.

Takeaway

Streaming isn’t “better” than batch. It’s what you reach for when staleness drives bad automated action within seconds. Detecting a runaway agent, fraud in flight, or a cascading failure all share that shape. A daily report does not. Match the tool to the cost of the delay.