Lakehouse Plumber
Managing dozens of Lakeflow/DLT pipelines means thousands of lines of repetitive Python — inconsistent patterns, boilerplate sprawl, and painful maintenance across environments.
Lakehouse Plumber turns concise YAML actions into fully-featured Databricks Lakeflow Declarative Pipelines (formerly Delta Live Tables) — without hiding the Databricks platform you already know and love.
How LHP Solves It
Eliminates boilerplate — a template + 5-line config replaces 86 lines of Python per table.
Zero runtime overhead — pure code generation, not a runtime framework.
Transparent output — readable Python files, version-controlled and debuggable in the Databricks IDE.
Fits DataOps workflows — CI/CD, automated testing, multi-environment substitutions.
No lock-in — the output is plain Python & SQL you own and control.
Data democratization — power users create artifacts within platform standards.
Real-World Example
Instead of repeating 86 lines of Python per table, write a 5-line configuration:
pipeline: raw_ingestions
flowgroup: customer_ingestion
use_template: csv_ingestion_template
template_parameters:
  table_name: customer
  landing_folder: customer
Result: across 50 tables, 4,300 lines of repetitive Python (50 × 86) shrink to about 250 lines total (1 template + 50 simple configs). See Quickstart for the full template and generated output.
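To make the idea of "template + parameters = generated Python" concrete, here is a minimal sketch using only the Python standard library. It is not LHP's implementation: the template text, the `/landing/` path, and the `render` helper are hypothetical stand-ins chosen for illustration, and the real `csv_ingestion_template` is the one shown in the Quickstart.

```python
from string import Template

# Hypothetical template body (an assumption for illustration only).
# The generated text is ordinary pipeline Python; it is rendered here, not executed.
CSV_INGESTION_TEMPLATE = Template('''\
import dlt

@dlt.table(name="${table_name}_raw")
def ${table_name}_raw():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "csv")
        .load("/landing/${landing_folder}/")
    )
''')


def render(template: Template, parameters: dict[str, str]) -> str:
    """Fill every placeholder; raises KeyError if a parameter is missing."""
    return template.substitute(parameters)


# The flowgroup from the example above, expressed as a plain dict.
flowgroup = {
    "pipeline": "raw_ingestions",
    "flowgroup": "customer_ingestion",
    "use_template": "csv_ingestion_template",
    "template_parameters": {
        "table_name": "customer",
        "landing_folder": "customer",
    },
}

generated_source = render(CSV_INGESTION_TEMPLATE, flowgroup["template_parameters"])
print(generated_source)
```

Because the output is plain source text, it can be written to a file, committed, and reviewed like any other Python module, which is the point of a code-generation approach over a runtime framework.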
Quick Start
Get started in minutes:
pip install lakehouse-plumber
lhp init my_project
cd my_project
# Edit your YAML flowgroups (IntelliSense auto-configured)
lhp validate --env dev
lhp generate --env dev
# Inspect the generated/ directory — readable Python ready for Databricks
Note
New to LHP? Follow the Quickstart to build your first pipeline in 10 minutes.
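In a CI job, the same validate-then-generate sequence could be wrapped in a small script. The sketch below assumes only the two commands shown above (`lhp validate --env <env>` and `lhp generate --env <env>`); the helper names `lhp_commands` and `run_steps` are hypothetical, and the runner is injectable so the logic can be exercised without LHP installed.

```python
import subprocess
from typing import Callable


def lhp_commands(env: str) -> list[list[str]]:
    """Return the validate and generate commands for one environment, in order."""
    return [
        ["lhp", "validate", "--env", env],
        ["lhp", "generate", "--env", env],
    ]


def run_steps(env: str, runner: Callable[..., object] = subprocess.run) -> None:
    """Run each step; with subprocess.run, check=True raises on a non-zero exit,
    so generation is skipped when validation fails."""
    for command in lhp_commands(env):
        runner(command, check=True)


# Example usage in a CI script (requires lhp on PATH):
# run_steps("dev")
```

Stopping at the first failing step keeps a broken YAML flowgroup from producing partially generated output.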
Core Workflow
The execution model is deliberately simple:
graph LR
A[Load] --> B{0..N Transform}
B --> C[Write]
Load: Ingest raw data from CloudFiles, Delta, JDBC, SQL, or custom Python.
Transform: Apply zero or many transforms (SQL, Python, schema, data-quality, temp-tables…).
Write: Persist results as Streaming Tables, Materialized Views, or Snapshots.
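The Load → 0..N Transform → Write shape can be sketched in a few lines of plain Python. This is a conceptual model only, not how LHP or Databricks executes a pipeline: `run_flow`, the in-memory rows, and the sample transforms are all hypothetical names introduced for illustration.

```python
from typing import Callable

Row = dict[str, object]
Loader = Callable[[], list[Row]]
Transform = Callable[[list[Row]], list[Row]]
Writer = Callable[[list[Row]], None]


def run_flow(load: Loader, transforms: list[Transform], write: Writer) -> list[Row]:
    """One load, zero or more transforms applied in order, one write."""
    rows = load()
    for transform in transforms:
        rows = transform(rows)
    write(rows)
    return rows


def load_customers() -> list[Row]:
    # Stand-in for a real source such as CloudFiles or JDBC.
    return [
        {"id": 1, "name": "ada"},
        {"id": None, "name": "ghost"},
        {"id": 2, "name": "grace"},
    ]


def drop_missing_ids(rows: list[Row]) -> list[Row]:
    # A data-quality style transform: discard rows without an id.
    return [row for row in rows if row["id"] is not None]


def uppercase_names(rows: list[Row]) -> list[Row]:
    # A simple column transform.
    return [{**row, "name": str(row["name"]).upper()} for row in rows]


target_table: list[Row] = []  # stand-in for a streaming table or materialized view

result = run_flow(load_customers, [drop_missing_ids, uppercase_names], target_table.extend)

# With an empty transform list, loaded rows pass straight through to the writer.
passthrough = run_flow(load_customers, [], lambda rows: None)
```

The empty-list case is what "0..N" means in the diagram: a flow with no transforms is still a valid Load → Write flow.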
Where to next
The sidebar groups documentation by purpose, following the Diátaxis framework:
Get Started — install LHP, set up your editor, and ship your first pipeline.
How-to — task-shaped recipes for common data-engineering problems.
Explanation — the why behind LHP’s design and patterns.
Reference — exhaustive lookup tables for CLI flags, YAML keys, error codes, and the public API.
If you have a specific problem, jump straight to Overview. If you want to understand the execution model first, read Architecture.