The detection layer is where Phronis earns its latency budget. It’s built on RisingWave, a streaming SQL engine whose materialized views are incremental: they update as each event arrives, not on a batch timer. That single property is the difference between catching a runaway agent in 400ms and catching it in 40 seconds.
// 01 — INCREMENTAL, NOT MICRO-BATCH
A traditional analytics MV is recomputed on a schedule. RisingWave maintains its MVs continuously: when an event lands, only the affected window’s state changes. There’s no “wait for the next batch.” Checkpointing is tuned to 100ms, which keeps the end-to-end event-to-alert path under the 500ms target.
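How the 100ms figure was set isn't shown in this log. A minimal sketch, assuming it was done through RisingWave's system parameters: the checkpoint interval is the barrier interval multiplied by the checkpoint frequency, so a 100ms barrier with a checkpoint on every barrier gives a 100ms checkpoint. The specific values here are an assumption that matches the number above, not the actual Phronis config.

```sql
-- Assumption: tuning is done via system parameters; values chosen to match
-- the 100ms figure in the text, not copied from the real deployment.

-- Inspect the current settings first.
SHOW PARAMETERS;

-- Emit a barrier every 100ms (the default is 1000ms).
ALTER SYSTEM SET barrier_interval_ms = 100;

-- Checkpoint on every barrier, so checkpoint interval = 100ms * 1.
ALTER SYSTEM SET checkpoint_frequency = 1;
```

Shorter intervals trade throughput and storage write load for latency, so it's worth measuring before pushing the value lower.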
// 02 — THE DETECTION VIEW
A TUMBLE window slices the event stream into fixed intervals and counts tool calls per agent per window:
CREATE MATERIALIZED VIEW mv_agent_tool_call_rate AS
SELECT
    agent_id,
    window_start,
    window_end,
    COUNT(*) AS call_count
FROM TUMBLE(agent_events, event_time, INTERVAL '60 seconds')
WHERE event_type = 'TOOL_CALL'
GROUP BY agent_id, window_start, window_end;
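To see the windowing behave, here is a small sketch you can run in a scratch RisingWave instance. The table is a hypothetical stand-in for the real event stream (only the three columns the view reads), and the `LLM_CALL` event type and sample rows are made up for illustration.

```sql
-- Hypothetical stand-in for the real event stream. The production source
-- definition is not shown in this log.
CREATE TABLE agent_events (
    agent_id   VARCHAR,
    event_type VARCHAR,
    event_time TIMESTAMP
);

-- Create mv_agent_tool_call_rate exactly as defined above, then:

INSERT INTO agent_events VALUES
    ('agent-1', 'TOOL_CALL', '2024-01-01 00:00:05'),
    ('agent-1', 'TOOL_CALL', '2024-01-01 00:00:42'),
    ('agent-1', 'LLM_CALL',  '2024-01-01 00:00:50'),  -- filtered out by WHERE
    ('agent-1', 'TOOL_CALL', '2024-01-01 00:01:10'),  -- lands in the next window
    ('agent-2', 'TOOL_CALL', '2024-01-01 00:00:30');

-- Make the inserts visible to the following query.
FLUSH;

SELECT agent_id, window_start, call_count
FROM mv_agent_tool_call_rate
ORDER BY agent_id, window_start;
-- agent-1 | 2024-01-01 00:00:00 | 2
-- agent-1 | 2024-01-01 00:01:00 | 1
-- agent-2 | 2024-01-01 00:00:00 | 1
```

Inserting one more `TOOL_CALL` row for an existing window and re-running the query changes only that window's count, which is the incremental behavior described earlier.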
mv_circuit_breaker_triggers sits on top and fires when call_count crosses the threshold (more than 500 tool calls in a 60-second window in production; the demo config uses a tighter window so a runaway agent trips within seconds). The moment the count crosses the line, the view produces a row.
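The definition of the trigger view isn't reproduced in this log. A minimal sketch, assuming it is a plain filter over the rate view with the production threshold hard-coded (the real view may carry extra columns or read the threshold from config):

```sql
-- Sketch only: assumes the trigger view is a filter on the rate view.
-- 500 is the production threshold quoted in the text.
CREATE MATERIALIZED VIEW mv_circuit_breaker_triggers AS
SELECT
    agent_id,
    window_start,
    window_end,
    call_count
FROM mv_agent_tool_call_rate
WHERE call_count > 500;  -- "more than 500 tool calls" in one window
```

Because the upstream count is maintained per event, a window's row enters this view as soon as its count reaches 501, without waiting for the window to close.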
// 03 — FROM ROW TO ALERT
That row is the alert. A sink writes it straight to the phronis.alerts topic in Redpanda, where the AlertExecutor is waiting. There’s no polling. The alert exists because the streaming view produced it, within one checkpoint interval of the threshold breaking. Alongside detection, parallel MVs track latency percentiles, token cost by model, and token drift against a running baseline.
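The sink definition isn't shown here either. A hedged sketch using RisingWave's Kafka connector, which Redpanda accepts because it speaks the Kafka protocol. The sink name `alerts_sink` and the broker address `redpanda:9092` are placeholders, and the append-only JSON encoding is an assumption about how the alerts are serialized.

```sql
-- Sketch only: sink name and broker address are hypothetical placeholders.
CREATE SINK alerts_sink
FROM mv_circuit_breaker_triggers
WITH (
    connector = 'kafka',
    properties.bootstrap.server = 'redpanda:9092',
    topic = 'phronis.alerts'
)
FORMAT PLAIN ENCODE JSON (
    -- The trigger view is updated in place as call_count keeps growing,
    -- so it is not naturally append-only; this forces append-only output.
    force_append_only = 'true'
);
```

One consequence of this choice, worth verifying against the RisingWave docs for your version: each further count update after the threshold is crossed can produce another message for the same agent and window, so the consumer should treat alerts as idempotent. The alternative is an upsert-format sink keyed on `agent_id` and `window_start`.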
TAKEAWAYS
- Incremental materialized views update per-event; micro-batch views update per-interval. For real-time enforcement, only the former hits a sub-second budget.
- A TUMBLE window + a threshold is a complete call-storm detector, expressed as plain SQL, not a custom stream-processing job.
- Push detection results into a topic the enforcer is already consuming. The alert should be a side effect of detection, not a separate poll.
NEXT
- Build log 04: the kill switch: turning an alert into an enforced stop, persisted in PostgreSQL.
