pg_duckpipe: Real-Time CDC for Streaming Postgres Tables into Columnar DuckLake
TL;DR: pg_duckpipe is a PostgreSQL extension that continuously streams your heap tables into DuckLake columnar tables via WAL-based CDC. One SQL call to start, no external infrastructure required.

Source: DEV Community
Why pg_duckpipe?

When we released pg_ducklake, it brought a columnar lakehouse storage layer to PostgreSQL: DuckDB-powered analytical tables backed by Parquet, with metadata living in PostgreSQL's own catalog. One question kept coming up: how do I keep these analytical tables in sync with my transactional tables automatically?

This is a real problem. If you manage DuckLake tables by hand, running periodic ETL jobs or batch inserts, you end up with stale data, extra scripts to maintain, and an operational surface area that grows with every table. For teams that want fresh analytical views of their OLTP data, this quickly becomes painful.

pg_duckpipe addresses this. It is a PostgreSQL extension (and optionally a standalone daemon) that streams changes from regular heap tables into DuckLake colu