<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel><title>@frogwebp — Data Engineering Log</title><description>Self-hosted data systems documented in the open. Build logs, engineering concepts, and process notes by Anthony Pena.</description><link>https://frogwebp.com/</link><language>en-us</language><item><title>The first one</title><link>https://frogwebp.com/log/the-first-one/</link><guid isPermaLink="true">https://frogwebp.com/log/the-first-one/</guid><description>I didn&apos;t wait until I was ready. I decided, and then I moved.</description><pubDate>Sat, 20 Jun 2026 00:00:00 GMT</pubDate></item><item><title>REFRESH MATERIALIZED VIEW CONCURRENTLY: How It Works, How to Break It</title><link>https://frogwebp.com/log/refresh-materialized-view-concurrently/</link><guid isPermaLink="true">https://frogwebp.com/log/refresh-materialized-view-concurrently/</guid><description>The incremental refresh, and the one mistake that turns it into a full rewrite.</description><pubDate>Fri, 19 Jun 2026 00:00:00 GMT</pubDate></item><item><title>An Idempotent ELT Pipeline With SKIP LOCKED and Phase Commits</title><link>https://frogwebp.com/log/spectrum-idempotent-elt-pipeline/</link><guid isPermaLink="true">https://frogwebp.com/log/spectrum-idempotent-elt-pipeline/</guid><description>Three committed phases that make re-runs produce exactly the right result.</description><pubDate>Thu, 18 Jun 2026 00:00:00 GMT</pubDate></item><item><title>JSONB Staging: Keep Raw Events Unmodified</title><link>https://frogwebp.com/log/spectrum-jsonb-staging-ingestion/</link><guid isPermaLink="true">https://frogwebp.com/log/spectrum-jsonb-staging-ingestion/</guid><description>Why the staging table stores event payloads as JSONB, not typed columns.</description><pubDate>Wed, 17 Jun 2026 00:00:00 GMT</pubDate></item><item><title>The Circuit Breaker That Tripped Before the Demo Started</title><link>https://frogwebp.com/log/phronis-stale-alert-replay/</link><guid isPermaLink="true">https://frogwebp.com/log/phronis-stale-alert-replay/</guid><description>Stale alerts surviving in a topic, replaying into a fresh run.</description><pubDate>Tue, 16 Jun 2026 00:00:00 GMT</pubDate></item><item><title>You Can&apos;t Bind to a ClusterIP: Redpanda on Kubernetes</title><link>https://frogwebp.com/log/phronis-cluster-ip-bind/</link><guid isPermaLink="true">https://frogwebp.com/log/phronis-cluster-ip-bind/</guid><description>A networking lesson about pod interfaces versus Service IPs.</description><pubDate>Mon, 15 Jun 2026 00:00:00 GMT</pubDate></item><item><title>The Init Container That Couldn&apos;t Reach localhost</title><link>https://frogwebp.com/log/phronis-init-container-localhost/</link><guid isPermaLink="true">https://frogwebp.com/log/phronis-init-container-localhost/</guid><description>Init containers run to completion before the main one starts, so &apos;localhost&apos; is empty.</description><pubDate>Sun, 14 Jun 2026 00:00:00 GMT</pubDate></item><item><title>An Em-Dash Broke My PowerShell Script</title><link>https://frogwebp.com/log/phronis-em-dash-powershell/</link><guid isPermaLink="true">https://frogwebp.com/log/phronis-em-dash-powershell/</guid><description>How one Unicode character became a syntax error 5,000 miles from where it was typed.</description><pubDate>Sat, 13 Jun 2026 00:00:00 GMT</pubDate></item><item><title>Asking Your Pipeline Questions: An MCP Server for Claude</title><link>https://frogwebp.com/log/phronis-mcp-server/</link><guid isPermaLink="true">https://frogwebp.com/log/phronis-mcp-server/</guid><description>&apos;Which agent is misbehaving?&apos; Answered in plain English over RisingWave.</description><pubDate>Fri, 12 Jun 2026 00:00:00 GMT</pubDate></item><item><title>I Built a Self-Hosted Amplitude With Pure PostgreSQL</title><link>https://frogwebp.com/log/spectrum-self-hosted-amplitude-postgresql/</link><guid isPermaLink="true">https://frogwebp.com/log/spectrum-self-hosted-amplitude-postgresql/</guid><description>Kicking off the Spectrum build: why I went all-in on SQL for product analytics.</description><pubDate>Thu, 11 Jun 2026 00:00:00 GMT</pubDate></item><item><title>I Chose a Boring Stack on Purpose</title><link>https://frogwebp.com/log/boring-stack-on-purpose/</link><guid isPermaLink="true">https://frogwebp.com/log/boring-stack-on-purpose/</guid><description>On depth over hype, and the quiet discipline of staying.</description><pubDate>Wed, 10 Jun 2026 00:00:00 GMT</pubDate></item><item><title>NOW() Was Silently Wrecking My CONCURRENTLY Refresh</title><link>https://frogwebp.com/log/spectrum-now-concurrently-refresh-bug/</link><guid isPermaLink="true">https://frogwebp.com/log/spectrum-now-concurrently-refresh-bug/</guid><description>How one timestamp turned an incremental refresh into a full table rewrite.</description><pubDate>Tue, 09 Jun 2026 00:00:00 GMT</pubDate></item><item><title>Incremental Materialized Views, Explained</title><link>https://frogwebp.com/log/incremental-materialized-views/</link><guid isPermaLink="true">https://frogwebp.com/log/incremental-materialized-views/</guid><description>The difference between &apos;recompute on a timer&apos; and &apos;update as data arrives.&apos;</description><pubDate>Mon, 08 Jun 2026 00:00:00 GMT</pubDate></item><item><title>Star Schema vs Snowflake — When to Actually Use Each</title><link>https://frogwebp.com/log/star-schema-vs-snowflake/</link><guid isPermaLink="true">https://frogwebp.com/log/star-schema-vs-snowflake/</guid><description>Both are dimensional models. The difference is how far you normalize.</description><pubDate>Sun, 07 Jun 2026 00:00:00 GMT</pubDate></item><item><title>Funnels and Retention in Pure SQL</title><link>https://frogwebp.com/log/spectrum-analytics-pure-sql/</link><guid isPermaLink="true">https://frogwebp.com/log/spectrum-analytics-pure-sql/</guid><description>The analytics views that power the dashboards: no Python, no dbt, just SQL.</description><pubDate>Sat, 06 Jun 2026 00:00:00 GMT</pubDate></item><item><title>Real-Time Circuit Breaking for AI Agents — What Batch Tools Can&apos;t Do</title><link>https://frogwebp.com/log/phronis-realtime-circuit-breaking/</link><guid isPermaLink="true">https://frogwebp.com/log/phronis-realtime-circuit-breaking/</guid><description>Detect and halt a runaway agent in under 500ms, before the blast radius grows.</description><pubDate>Fri, 05 Jun 2026 00:00:00 GMT</pubDate></item><item><title>Semantic Search Over 100 PDFs: Meaning, Not Ctrl+F</title><link>https://frogwebp.com/log/contextflow-semantic-search-kickoff/</link><guid isPermaLink="true">https://frogwebp.com/log/contextflow-semantic-search-kickoff/</guid><description>Ask a question in plain English; get ranked answers with page provenance in under 2 seconds.</description><pubDate>Thu, 04 Jun 2026 00:00:00 GMT</pubDate></item><item><title>Sessionization in Pure SQL</title><link>https://frogwebp.com/log/spectrum-sessionization-sql/</link><guid isPermaLink="true">https://frogwebp.com/log/spectrum-sessionization-sql/</guid><description>Grouping raw events into sessions with a 30-minute inactivity gap, no Python required.</description><pubDate>Wed, 03 Jun 2026 00:00:00 GMT</pubDate></item><item><title>A/B Experiments With Wilson Score Intervals in SQL</title><link>https://frogwebp.com/log/spectrum-ab-experiments-wilson/</link><guid isPermaLink="true">https://frogwebp.com/log/spectrum-ab-experiments-wilson/</guid><description>Computing statistically sound confidence intervals for conversion rates without leaving PostgreSQL.</description><pubDate>Tue, 02 Jun 2026 00:00:00 GMT</pubDate></item><item><title>Fifty-Four Bugs, Documented in Public</title><link>https://frogwebp.com/log/bugs-documented-in-public/</link><guid isPermaLink="true">https://frogwebp.com/log/bugs-documented-in-public/</guid><description>The failures aren&apos;t the embarrassing part of the portfolio. They are the portfolio.</description><pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate></item><item><title>Schema Registry: Producers That Can Never Break Consumers</title><link>https://frogwebp.com/log/schema-registry-data-contracts/</link><guid isPermaLink="true">https://frogwebp.com/log/schema-registry-data-contracts/</guid><description>Data contracts enforced at the broker, not hoped for in a code review.</description><pubDate>Sun, 31 May 2026 00:00:00 GMT</pubDate></item><item><title>The IMDS Saga: Five Bugs, One Wrong Parameter Name</title><link>https://frogwebp.com/log/phronis-imds-saga/</link><guid isPermaLink="true">https://frogwebp.com/log/phronis-imds-saga/</guid><description>A multi-session debugging story that came down to s3.access.key.id vs s3.access.key.</description><pubDate>Sat, 30 May 2026 00:00:00 GMT</pubDate></item><item><title>The Circuit-Breaker Pattern, Applied to AI Agents</title><link>https://frogwebp.com/log/circuit-breaker-pattern-ai-agents/</link><guid isPermaLink="true">https://frogwebp.com/log/circuit-breaker-pattern-ai-agents/</guid><description>A classic resilience idea, pointed at a new kind of runaway process.</description><pubDate>Fri, 29 May 2026 00:00:00 GMT</pubDate></item><item><title>&quot;Unknown&quot; Was the Most Dangerous Row in My Warehouse</title><link>https://frogwebp.com/log/spectrum-unknown-dangerous-row/</link><guid isPermaLink="true">https://frogwebp.com/log/spectrum-unknown-dangerous-row/</guid><description>Why a synthetic fallback bucket quietly corrupts tenant metrics.</description><pubDate>Thu, 28 May 2026 00:00:00 GMT</pubDate></item><item><title>2.5MB Through a 48KB Door: The XCom Size Limit</title><link>https://frogwebp.com/log/contextflow-xcom-size-limit/</link><guid isPermaLink="true">https://frogwebp.com/log/contextflow-xcom-size-limit/</guid><description>A silent empty result, and why Airflow swallowed the data without an error.</description><pubDate>Wed, 27 May 2026 00:00:00 GMT</pubDate></item><item><title>Streaming vs Batch: Why Sub-Second Aggregation Matters</title><link>https://frogwebp.com/log/streaming-vs-batch-aggregation/</link><guid isPermaLink="true">https://frogwebp.com/log/streaming-vs-batch-aggregation/</guid><description>When &apos;every 30 seconds&apos; is the same as &apos;too late.&apos;</description><pubDate>Tue, 26 May 2026 00:00:00 GMT</pubDate></item><item><title>Sub-500ms Detection: TUMBLE Windows in RisingWave</title><link>https://frogwebp.com/log/phronis-tumble-windows-detection/</link><guid isPermaLink="true">https://frogwebp.com/log/phronis-tumble-windows-detection/</guid><description>Incremental materialized views that update as each event lands, not every 30 seconds.</description><pubDate>Mon, 25 May 2026 00:00:00 GMT</pubDate></item><item><title>Idempotency in Data Pipelines</title><link>https://frogwebp.com/log/idempotency-data-pipelines/</link><guid isPermaLink="true">https://frogwebp.com/log/idempotency-data-pipelines/</guid><description>Why the same run twice should produce the same result once.</description><pubDate>Sun, 24 May 2026 00:00:00 GMT</pubDate></item><item><title>The SDK: @agent / @tool Decorators and ~1ms Event Emission</title><link>https://frogwebp.com/log/phronis-sdk-decorators/</link><guid isPermaLink="true">https://frogwebp.com/log/phronis-sdk-decorators/</guid><description>Instrument an agent in two decorators, with no framework lock-in.</description><pubDate>Sat, 23 May 2026 00:00:00 GMT</pubDate></item><item><title>The Star Schema, Monthly Partitions, and Why I Indexed That Way</title><link>https://frogwebp.com/log/spectrum-star-schema-partitions/</link><guid isPermaLink="true">https://frogwebp.com/log/spectrum-star-schema-partitions/</guid><description>Building the warehouse layer: fact_events, four dimensions, and the index strategy.</description><pubDate>Fri, 22 May 2026 00:00:00 GMT</pubDate></item><item><title>Why I Write Before I Have the Answers</title><link>https://frogwebp.com/log/write-before-the-answers/</link><guid isPermaLink="true">https://frogwebp.com/log/write-before-the-answers/</guid><description>Documenting the process, not just the result: the conviction underneath all of it.</description><pubDate>Thu, 21 May 2026 00:00:00 GMT</pubDate></item><item><title>The Kill Switch: ~600ms to Halt, Persisted in PostgreSQL</title><link>https://frogwebp.com/log/phronis-kill-switch/</link><guid isPermaLink="true">https://frogwebp.com/log/phronis-kill-switch/</guid><description>From alert to enforced stop, with state that survives a restart.</description><pubDate>Wed, 20 May 2026 00:00:00 GMT</pubDate></item><item><title>One User, Every Tenant&apos;s Retention</title><link>https://frogwebp.com/log/spectrum-cross-tenant-retention/</link><guid isPermaLink="true">https://frogwebp.com/log/spectrum-cross-tenant-retention/</guid><description>A missing WHERE clause that leaked retention across accounts.</description><pubDate>Tue, 19 May 2026 00:00:00 GMT</pubDate></item><item><title>Cleaning Multilingual PDFs: Ligatures, IPA, and Broken Unicode</title><link>https://frogwebp.com/log/contextflow-cleaning-multilingual-pdfs/</link><guid isPermaLink="true">https://frogwebp.com/log/contextflow-cleaning-multilingual-pdfs/</guid><description>Six transformations, in an order that matters, to make academic text searchable.</description><pubDate>Mon, 18 May 2026 00:00:00 GMT</pubDate></item><item><title>The Partition That Existed But Couldn&apos;t Be Found</title><link>https://frogwebp.com/log/spectrum-partition-ddl-visibility/</link><guid isPermaLink="true">https://frogwebp.com/log/spectrum-partition-ddl-visibility/</guid><description>A DDL-visibility bug, and why partition creation has to commit alone.</description><pubDate>Sun, 17 May 2026 00:00:00 GMT</pubDate></item><item><title>NULL ≠ NULL: A Duplicate-Geography Bug</title><link>https://frogwebp.com/log/spectrum-nulls-not-distinct/</link><guid isPermaLink="true">https://frogwebp.com/log/spectrum-nulls-not-distinct/</guid><description>Why dedup silently failed, and the Postgres 15 feature that fixed it.</description><pubDate>Sat, 16 May 2026 00:00:00 GMT</pubDate></item><item><title>Chunking With Provenance: 512 Chars, 64 Overlap, Full Lineage</title><link>https://frogwebp.com/log/contextflow-chunking-with-provenance/</link><guid isPermaLink="true">https://frogwebp.com/log/contextflow-chunking-with-provenance/</guid><description>Splitting text so retrieval stays accurate and every result can cite its source.</description><pubDate>Fri, 15 May 2026 00:00:00 GMT</pubDate></item><item><title>Embeddings Explained: From Text to Cosine Similarity</title><link>https://frogwebp.com/log/embeddings-explained/</link><guid isPermaLink="true">https://frogwebp.com/log/embeddings-explained/</guid><description>How words become vectors, and why distance in that space means meaning.</description><pubDate>Thu, 14 May 2026 00:00:00 GMT</pubDate></item><item><title>The Airflow DAG: Validate, Extract, Transform, Load, Notify</title><link>https://frogwebp.com/log/contextflow-airflow-dag/</link><guid isPermaLink="true">https://frogwebp.com/log/contextflow-airflow-dag/</guid><description>Orchestration with the right retries per task, and chunks that don&apos;t go through XCom.</description><pubDate>Wed, 13 May 2026 00:00:00 GMT</pubDate></item><item><title>Shipping It to Kubernetes: Kind + Podman + Helm</title><link>https://frogwebp.com/log/phronis-kubernetes-helm/</link><guid isPermaLink="true">https://frogwebp.com/log/phronis-kubernetes-helm/</guid><description>From an 11-container compose stack to 8 pods Running in under 3 minutes.</description><pubDate>Tue, 12 May 2026 00:00:00 GMT</pubDate></item><item><title>Deterministic Chunk IDs: Idempotency as Structure, Not a Flag</title><link>https://frogwebp.com/log/contextflow-deterministic-chunk-ids/</link><guid isPermaLink="true">https://frogwebp.com/log/contextflow-deterministic-chunk-ids/</guid><description>Re-running the pipeline updates in place, never duplicates, because of how IDs are made.</description><pubDate>Mon, 11 May 2026 00:00:00 GMT</pubDate></item><item><title>384 Dimensions, Zero API Cost: Local Embeddings</title><link>https://frogwebp.com/log/contextflow-local-embeddings/</link><guid isPermaLink="true">https://frogwebp.com/log/contextflow-local-embeddings/</guid><description>Why a small local model and L2 normalization are the right default for RAG.</description><pubDate>Sun, 10 May 2026 00:00:00 GMT</pubDate></item><item><title>Building Alone, in Public</title><link>https://frogwebp.com/log/building-alone-in-public/</link><guid isPermaLink="true">https://frogwebp.com/log/building-alone-in-public/</guid><description>Deep work, scenius, and why solitude and an audience aren&apos;t opposites.</description><pubDate>Sat, 09 May 2026 00:00:00 GMT</pubDate></item><item><title>Cold Storage That Time-Travels: Iceberg on MinIO</title><link>https://frogwebp.com/log/phronis-iceberg-cold-storage/</link><guid isPermaLink="true">https://frogwebp.com/log/phronis-iceberg-cold-storage/</guid><description>Every agent event, in Parquet, queryable as of any point in the past.</description><pubDate>Fri, 08 May 2026 00:00:00 GMT</pubDate></item><item><title>The Sidebar That Mutated Global State</title><link>https://frogwebp.com/log/contextflow-sidebar-global-mutation/</link><guid isPermaLink="true">https://frogwebp.com/log/contextflow-sidebar-global-mutation/</guid><description>A Streamlit slider that quietly rewrote the app&apos;s settings singleton.</description><pubDate>Thu, 07 May 2026 00:00:00 GMT</pubDate></item><item><title>Why My .env Was Silently Ignored</title><link>https://frogwebp.com/log/contextflow-env-silently-ignored/</link><guid isPermaLink="true">https://frogwebp.com/log/contextflow-env-silently-ignored/</guid><description>Nested Pydantic settings classes each need to be told where to read.</description><pubDate>Wed, 06 May 2026 00:00:00 GMT</pubDate></item><item><title>FOR UPDATE SKIP LOCKED: Queue Processing Without a Queue</title><link>https://frogwebp.com/log/for-update-skip-locked-queue/</link><guid isPermaLink="true">https://frogwebp.com/log/for-update-skip-locked-queue/</guid><description>Turn any PostgreSQL table into a safe, concurrent work queue.</description><pubDate>Tue, 05 May 2026 00:00:00 GMT</pubDate></item><item><title>structlog Is Not logging: The %s Trap</title><link>https://frogwebp.com/log/contextflow-structlog-printf-trap/</link><guid isPermaLink="true">https://frogwebp.com/log/contextflow-structlog-printf-trap/</guid><description>Two logging APIs that look identical and behave completely differently.</description><pubDate>Mon, 04 May 2026 00:00:00 GMT</pubDate></item><item><title>The Protocol Pattern: Swappable Embedders With Duck Typing</title><link>https://frogwebp.com/log/python-protocol-pattern/</link><guid isPermaLink="true">https://frogwebp.com/log/python-protocol-pattern/</guid><description>Python structural typing for clean, replaceable seams.</description><pubDate>Sun, 03 May 2026 00:00:00 GMT</pubDate></item><item><title>RAG Provenance: Why Every Chunk Must Know Where It Came From</title><link>https://frogwebp.com/log/rag-provenance/</link><guid isPermaLink="true">https://frogwebp.com/log/rag-provenance/</guid><description>Retrieval you can&apos;t cite is retrieval you can&apos;t trust.</description><pubDate>Sat, 02 May 2026 00:00:00 GMT</pubDate></item><item><title>Chunking for RAG: Size, Overlap, and the Separator Hierarchy</title><link>https://frogwebp.com/log/chunking-for-rag/</link><guid isPermaLink="true">https://frogwebp.com/log/chunking-for-rag/</guid><description>The most underrated knob in a retrieval pipeline.</description><pubDate>Fri, 01 May 2026 00:00:00 GMT</pubDate></item><item><title>The Long Detour Into Engineering</title><link>https://frogwebp.com/log/the-long-detour/</link><guid isPermaLink="true">https://frogwebp.com/log/the-long-detour/</guid><description>On restlessness, reading, and finding the work that finally fit.</description><pubDate>Thu, 30 Apr 2026 00:00:00 GMT</pubDate></item></channel></rss>