Bundle Configuration Reference
This page catalogs every option LHP exposes for Databricks Asset Bundle (DAB) integration. For the step-by-step walk-through, see Configure Bundles.
Scope
LHP generates DAB pipeline and job resource YAML under resources/lhp/. It
does not replace the Databricks CLI and never modifies databricks.yml.
Catalog and schema must come from pipeline_config.yaml — see
Configuring catalog and schema for pipelines for the resolution rules.
Bundle activation
CLI flags
| Flag | Command | Behavior |
|---|---|---|
|  |  | Skip |
|  |  | Disable bundle sync even when |
|  |  | Path to pipeline config YAML (relative to project root). |
|  |  | With |
|  |  | Path to job config YAML. |
|  |  | Write generated job YAML under |
|  |  | One of |
There is no --bundle flag. Bundle support is enabled whenever databricks.yml exists in
the project root and --no-bundle is not set.
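The activation rule can be sketched in a few lines of Python. This is an illustration of the rule as stated above, not LHP's implementation; the function name bundle_enabled and its parameters are hypothetical.

```python
from pathlib import Path


def bundle_enabled(project_root: Path, no_bundle: bool = False) -> bool:
    """Hypothetical helper: True when databricks.yml exists and --no-bundle is unset."""
    # The no_bundle argument stands in for the --no-bundle CLI flag.
    return not no_bundle and (Path(project_root) / "databricks.yml").is_file()
```

Calling bundle_enabled(Path("."), no_bundle=True) returns False regardless of whether databricks.yml is present.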
Project layout
<project_root>/
├── databricks.yml # User-owned; LHP does not modify it
├── lhp.yaml # LHP project config
├── pipelines/ # Flowgroup YAML
├── substitutions/ # <env>.yaml per target
├── config/ # Optional pipeline_config.yaml / job_config.yaml
├── resources/
│ ├── lhp/ # LHP-owned; regenerated each run
│ └── *.job.yml # User-owned; LHP leaves them alone
└── generated/ # Auto-generated Python; do not edit
Files under resources/lhp/ carry a # Generated by LakehousePlumber
header. Manual edits are overwritten on the next generate cycle.
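To check by hand whether a resource file is LHP-owned, a small script along these lines may help. It is a sketch that assumes the header sits on the first line of the file; the helper name is_lhp_owned is hypothetical and not part of LHP.

```python
from pathlib import Path

# Header prefix as documented above.
LHP_HEADER = "# Generated by LakehousePlumber"


def is_lhp_owned(path: Path) -> bool:
    """Hypothetical check: does the first line start with the LHP header?"""
    # Assumption: the header is always the very first line of the file.
    with path.open(encoding="utf-8") as handle:
        return handle.readline().startswith(LHP_HEADER)
```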
Sync behavior
Conservative sync runs after every successful lhp generate. The decision
matrix is enforced by BundleManager.sync_resources_with_generated_files:
| Generated dir | File in | Action |
|---|---|---|
| Exists | LHP-owned | Preserve (no rewrite) |
| Exists | LHP-owned + | Regenerate |
| Exists | User-edited (no LHP header) | Rename to |
| Exists | None | Create |
| Missing | Any | Delete |
| Any | Multiple files defining the same pipeline | Raise |
Backup names use .bkup, .bkup.1, .bkup.2 on collision.
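The collision scheme can be pictured with the following sketch. It assumes the suffix is appended to the full file name; that detail and the function name next_backup_path are assumptions for illustration, not LHP's code.

```python
from pathlib import Path


def next_backup_path(path: Path) -> Path:
    """Hypothetical helper: pick the first free name among .bkup, .bkup.1, .bkup.2, ..."""
    # Assumption: the suffix is appended to the whole file name.
    candidate = path.with_name(path.name + ".bkup")
    counter = 0
    while candidate.exists():
        counter += 1
        candidate = path.with_name(f"{path.name}.bkup.{counter}")
    return candidate
```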
Target binding rules
Each target in databricks.yml must have a matching substitutions/<target>.yaml.
Pipeline names in pipelines/ must use [a-zA-Z0-9_-]+ only.
Generated resource files: <pipeline_name>.pipeline.yml (preferred) or <pipeline_name>.yml.
Generated pipeline resource key: <pipeline_name>_pipeline.
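As a rough illustration of the naming rules, the sketch below validates a pipeline name and derives the preferred resource file name and resource key. The function resource_names is hypothetical, and raising ValueError is an assumption; LHP may report the problem differently.

```python
import re

# Character set documented above for pipeline names.
PIPELINE_NAME_RE = re.compile(r"[a-zA-Z0-9_-]+")


def resource_names(pipeline_name: str) -> tuple[str, str]:
    """Hypothetical helper: return (preferred file name, resource key)."""
    if not PIPELINE_NAME_RE.fullmatch(pipeline_name):
        raise ValueError(f"invalid pipeline name: {pipeline_name!r}")
    return f"{pipeline_name}.pipeline.yml", f"{pipeline_name}_pipeline"
```

For example, resource_names("bronze_load") yields the file name bronze_load.pipeline.yml and the key bronze_load_pipeline.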
Pipeline configuration
File format
Multi-document YAML. The first document holds project_defaults; each
subsequent document targets one or more pipelines via the pipeline key.
project_defaults:
  catalog: "${catalog}"
  schema: "${schema}"
  serverless: true
---
pipeline:
  - bronze_load
serverless: false
clusters:
  - label: default
    node_type_id: Standard_D16ds_v5
    autoscale:
      min_workers: 2
      max_workers: 10
Pass the file path with --pipeline-config / -pc.
Top-level keys
The keys below are rendered explicitly by pipeline_resource.yml.j2. Source of truth:
EXPLICITLY_RENDERED_PIPELINE_CONFIG_KEYS in
src/lhp/bundle/manager.py.
| Key | Type | Default | Notes |
|---|---|---|---|
|  | string | none | Unity Catalog name. Required if |
|  | string | none | Schema name. Required if |
|  | bool |  | Pipeline compute mode. |
|  | string |  | One of |
|  | string |  | One of |
|  | bool |  | Streaming/continuous mode. |
|  | bool | none | Photon engine. Non-serverless only. |
|  | list | none | Cluster specs. Used when |
|  | dict | none | Spark/DLT properties. Values must be quoted strings. |
|  | list | none | Email recipients + alert types. See Notification keys. |
|  | dict | none | Pipeline tags. Non-serverless only. |
|  | dict or | none | Per-pipeline event log override. |
|  | dict | none | Pip dependencies passed through as-is. |
|  | list | none | Pipeline ACL entries. See Permission keys. |
Any other top-level key is rendered verbatim via the pass-through filter,
including run_as, trigger, budget_policy_id, edit_mode, and any
Databricks Pipelines API field added after your LHP release.
Cluster keys
Each entry under clusters:
| Key | Notes |
|---|---|
|  | Required. |
|  | Optional. Mutually exclusive with |
|  | Optional. |
|  | Optional. |
|  | Optional. |
|  | Optional cluster policy. |
|  | Required when |
|  | Required when |
|  | Optional. Typically |
Notification keys
Each entry under notifications:
email_recipients: list of email strings.
alerts: list of alert types (on-update-success, on-update-failure, on-update-fatal-failure, on-flow-failure).
Permission keys
Each entry under permissions:
level: one of CAN_VIEW, CAN_RUN, CAN_MANAGE.
user_name, group_name, or service_principal_name: exactly one.
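A pre-flight check for a permissions entry could look like the sketch below. It only encodes the two rules listed above. Whether and how LHP itself validates these entries is not stated here, so the function check_permission and its ValueError are purely illustrative.

```python
VALID_LEVELS = {"CAN_VIEW", "CAN_RUN", "CAN_MANAGE"}
PRINCIPAL_KEYS = ("user_name", "group_name", "service_principal_name")


def check_permission(entry: dict) -> None:
    """Hypothetical validator for one entry under permissions."""
    if entry.get("level") not in VALID_LEVELS:
        raise ValueError(f"level must be one of {sorted(VALID_LEVELS)}")
    # Exactly one principal key may be present.
    present = [key for key in PRINCIPAL_KEYS if key in entry]
    if len(present) != 1:
        raise ValueError(f"expected exactly one of {PRINCIPAL_KEYS}, found {present}")
```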
Configuration block
The configuration dict is merged with LHP’s mandatory
bundle.sourcePath entry. All values must be quoted strings; unquoted
booleans or integers raise a validation error. Any user-supplied
bundle.sourcePath is silently ignored.
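The behavior described above can be approximated as follows. This is a sketch under assumptions: the function name build_configuration is hypothetical, ValueError stands in for LHP's validation error, and the order of checks (type check first, then dropping a user-supplied bundle.sourcePath) is a guess.

```python
def build_configuration(user_config: dict, source_path: str) -> dict:
    """Hypothetical merge of user configuration with the mandatory bundle.sourcePath."""
    for key, value in user_config.items():
        # Unquoted YAML booleans and integers arrive as bool/int, not str.
        if not isinstance(value, str):
            raise ValueError(
                f"configuration[{key!r}] must be a quoted string, got {type(value).__name__}"
            )
    # A user-supplied bundle.sourcePath is dropped in favor of the mandatory value.
    merged = {k: v for k, v in user_config.items() if k != "bundle.sourcePath"}
    merged["bundle.sourcePath"] = source_path
    return merged
```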
Monitoring alias
The reserved key __eventlog_monitoring under pipeline: targets the
monitoring pipeline generated by monitoring in lhp.yaml. See
Monitoring Reference for resolution rules.
Merge precedence
DEFAULT_PIPELINE_CONFIG → project_defaults → pipeline-specific. Deep
merge for dicts; lists are replaced wholesale.
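The merge semantics (dicts merged recursively, lists replaced) can be illustrated with a generic deep-merge sketch. The function deep_merge and the sample values are illustrative only; they are not LHP's code or its real defaults.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Merge override into a copy of base: dicts recurse, everything else is replaced."""
    result = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = deep_merge(result[key], value)
        else:
            # Lists and scalars replace the lower-precedence value wholesale.
            result[key] = deepcopy(value)
    return result


# Hypothetical layers, lowest precedence first.
built_in = {"serverless": True, "tags": {"owner": "platform"}}
project_defaults = {"tags": {"team": "data"}, "clusters": [{"label": "default"}]}
pipeline_specific = {"serverless": False, "clusters": [{"label": "maintenance"}]}

resolved = deep_merge(deep_merge(built_in, project_defaults), pipeline_specific)
```

In this example the tags dict ends up with both owner and team, while clusters contains only the pipeline-specific entry.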
Substitution applies to every field. Tokens resolve from
substitutions/<env>.yaml at generate time.
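Token substitution across every field amounts to a recursive walk over the configuration. The sketch below assumes tokens have the form ${name} and that unknown tokens are left untouched; both points, and the function name substitute, are assumptions rather than documented LHP behavior.

```python
import re

# Matches ${name}; dotted references such as ${var.catalog} are deliberately not matched.
TOKEN_RE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def substitute(value, tokens: dict):
    """Hypothetical recursive substitution over strings, dicts, and lists."""
    if isinstance(value, str):
        # Assumption: tokens without a mapping are left as written.
        return TOKEN_RE.sub(lambda m: str(tokens.get(m.group(1), m.group(0))), value)
    if isinstance(value, dict):
        return {key: substitute(item, tokens) for key, item in value.items()}
    if isinstance(value, list):
        return [substitute(item, tokens) for item in value]
    return value
```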
Catalog/schema validation
Both catalog and schema must be set, or neither.
Both must be non-empty after substitution.
Missing or partial definition raises BundleResourceError with docs_reference="docs/configure_catalog_schema.rst".
See Configuring catalog and schema for pipelines for per-pipeline and project_defaults
configuration, resolution order, and the full error reference.
Job configuration
File format
project_defaults:
  max_concurrent_runs: 1
  performance_target: STANDARD
  queue:
    enabled: true
---
job_name:
  - bronze_ingestion_job
timeout_seconds: 7200
schedule:
  quartz_cron_expression: "0 0 2 * * ?"
  timezone_id: America/New_York
Pass the file path with --job-config / -jc. Use --bundle-output to write the job
file under resources/ for bundle deployment.
Top-level keys
The keys below are rendered explicitly by job_resource.yml.j2. Source of truth:
EXPLICITLY_RENDERED_JOB_CONFIG_KEYS in
src/lhp/core/services/job_generator.py. Defaults from
JobGenerator.DEFAULT_JOB_CONFIG.
| Key | Default | Notes |
|---|---|---|
|  |  | Concurrent run cap. |
|  |  | One of |
|  |  | Job queueing. |
|  | none | Job-level timeout. |
|  | none | Job tag dict. |
|  | none |  |
|  | none |  |
|  | none | Job ACL entries (same shape as pipeline |
|  | none | Required when |
|  | none | Required when |
|  | none |  |
|  | none | Monitoring job only. |
|  |  | LHP-internal; controls master-job emission. Never written to output. |
|  | none (auto) | LHP-internal; overrides the master-job name. Never written to output. |
Any other top-level key passes through as-is. Common examples:
trigger.file_arrival, continuous, run_as.service_principal_name,
git_source, health, parameters, environments, edit_mode,
budget_policy_id. LHP does not validate pass-through fields against the
Databricks Jobs API; misspellings surface at deploy time.
Merge precedence
DEFAULT_JOB_CONFIG → project_defaults → job-specific. Deep merge for
dicts; lists are replaced wholesale. Key order as written by the author is preserved.
Multi-job orchestration
Set job_name on flowgroups in pipelines/*.yaml to split execution into
named jobs.
pipeline: data_bronze
flowgroup: customer_ingestion
job_name:
  - bronze_ingestion_job
Rules
All-or-nothing: if any flowgroup sets job_name, every flowgroup must set it.
Format: ^[a-zA-Z0-9_-]+$.
--pipeline filter is rejected in multi-job mode.
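The first two rules can be expressed as a small check over parsed flowgroups. This sketch assumes each flowgroup is a dict whose optional job_name is a string or a list of strings; the function multi_job_mode and its ValueError are hypothetical, not LHP's validator.

```python
import re

JOB_NAME_RE = re.compile(r"^[a-zA-Z0-9_-]+$")


def multi_job_mode(flowgroups: list[dict]) -> bool:
    """Hypothetical check: True if multi-job mode is active, raising on rule violations."""
    with_job = [fg for fg in flowgroups if fg.get("job_name")]
    if not with_job:
        return False
    # All-or-nothing: once one flowgroup sets job_name, all must.
    if len(with_job) != len(flowgroups):
        raise ValueError("job_name must be set on every flowgroup or on none")
    for fg in with_job:
        names = fg["job_name"]
        for name in [names] if isinstance(names, str) else names:
            if not JOB_NAME_RE.fullmatch(name):
                raise ValueError(f"invalid job_name: {name!r}")
    return True
```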
Generated artifacts
resources/
├── <job_name>.job.yml # One per unique job_name
└── <project>_master.job.yml # Master orchestrator
The master job wires individual jobs together via task_key references with
depends_on edges resolved from dependency analysis.
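To picture how depends_on edges follow from a dependency map, consider the sketch below. It only shows the task_key and depends_on shape; the real master job contains more fields, and the function master_tasks, its input format, and the use of job names as task keys are assumptions made for illustration.

```python
def master_tasks(dependencies: dict[str, list[str]]) -> list[dict]:
    """Hypothetical builder: map each job to a task with depends_on edges."""
    tasks = []
    for job_name in sorted(dependencies):
        # Assumption: the task key equals the job name.
        task = {"task_key": job_name}
        upstream = sorted(dependencies[job_name])
        if upstream:
            task["depends_on"] = [{"task_key": name} for name in upstream]
        tasks.append(task)
    return tasks
```

A job with no upstream dependencies produces a task without a depends_on entry.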
Generated resource example
# Generated by LakehousePlumber - Bundle Resource for bronze_load
resources:
  pipelines:
    bronze_load_pipeline:
      name: bronze_load_pipeline
      catalog: ${var.catalog}
      schema: ${var.bronze_schema}
      serverless: true
      libraries:
        - glob:
            include: ${workspace.file_path}/generated/${bundle.target}/bronze_load/**
      root_path: ${workspace.file_path}/generated/${bundle.target}/bronze_load
      configuration:
        bundle.sourcePath: ${workspace.file_path}/generated/${bundle.target}
LHP always emits libraries as a glob, root_path under
${workspace.file_path}/generated/${bundle.target}, and the
bundle.sourcePath configuration entry.
Configuration templates
lhp init writes starter templates under config/:
config/pipeline_config.yaml.tmpl
config/job_config.yaml.tmpl
Copy each template to a file name without the .tmpl suffix before editing.
Version enforcement
The optional required_lhp_version key in lhp.yaml pins generation to a
specific LHP release range, so the same project produces the same Python output
across development and CI. lhp validate and lhp generate fail when the
installed LHP version falls outside the range. Informational commands such as
lhp show skip the check so you can inspect a project even on a mismatched
LHP version.
LHP accepts any PEP 440 version specifier:
# Exact pin
required_lhp_version: "==0.4.1"
# Allow patch updates only (equivalent to >=0.4.1,<0.5.0)
required_lhp_version: "~=0.4.1"
# Range with exclusion
required_lhp_version: ">=0.4.1,<0.5.0,!=0.4.3"
# Allow minor updates
required_lhp_version: ">=0.4.0,<1.0.0"
Projects without required_lhp_version run on any installed LHP version.
Emergency bypass
Set LHP_IGNORE_VERSION=1 to skip version checking temporarily:
export LHP_IGNORE_VERSION=1
lhp generate -e dev
# Or inline for a single command
LHP_IGNORE_VERSION=1 lhp validate -e prod
Warning
LHP_IGNORE_VERSION=1 defeats the purpose of version pinning. Reserve it
for incident response, not regular workflows.
CI/CD integration
Install the LHP version matching the project requirement before running
lhp validate or lhp generate:
# Install the exact range from lhp.yaml
pip install "lakehouse-plumber$(yq -r .required_lhp_version lhp.yaml)"
# Or pin a known-good range
pip install "lakehouse-plumber>=0.4.1,<0.5.0"
# Validate and generate (fail-fast on mismatch)
lhp validate -e prod
lhp generate -e prod
Error codes
BundleResourceError: Missing, incomplete, or empty catalog/schema after substitution (carries docs_reference="docs/configure_catalog_schema.rst"). Also raised on multiple files defining the same pipeline, malformed YAML in resources/lhp/, or filesystem failure. See Configuring catalog and schema for pipelines for catalog/schema cases.
LHPConfigError 028: BundleManager initialized with no project_root.
See also
Configure Bundles: Bundle setup walk-through.
Configuring catalog and schema for pipelines: Catalog and schema configuration via pipeline_config.yaml.
How to Set Up CI/CD for an LHP Project: CI/CD patterns and deployment workflows.
Architecture: How LHP’s generation and sync layers fit together.
Dependency Analysis & Job Generation: Pipeline dependency graph and orchestration job generation.
Monitoring Reference: Event log and monitoring pipeline schema.
Error Reference: Full error code catalog.