Bundle Configuration Reference

This page catalogs every option LHP exposes for Databricks Asset Bundle (DAB) integration. For the step-by-step walk-through, see Configure Bundles.

Scope

LHP generates DAB pipeline and job resource YAML under resources/lhp/. It does not replace the Databricks CLI and never modifies databricks.yml. Catalog and schema must come from pipeline_config.yaml — see Configuring catalog and schema for pipelines for the resolution rules.

Bundle activation

CLI flags

| Flag | Command | Behavior |
|------|---------|----------|
| --no-bundle | lhp init | Skip databricks.yml and resources/lhp/ scaffolding. Bundle is enabled by default. |
| --no-bundle | lhp generate | Disable bundle sync even when databricks.yml exists. |
| --pipeline-config FILE, -pc FILE | lhp generate | Path to pipeline config YAML (relative to project root). |
| --force, -f | lhp generate | With --pipeline-config, rewrites existing LHP-owned bundle YAML resource files. Without -pc the flag has no effect (Python is always regenerated). |
| --job-config FILE, -jc FILE | lhp deps | Path to job config YAML. |
| --bundle-output, -b | lhp deps | Write generated job YAML under resources/ for bundle deployment. |
| --format, -f | lhp deps | One of dot, json, text, job, all. Default all. |

There is no --bundle flag. Bundle is on whenever databricks.yml exists in the project root and --no-bundle is not set.
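The activation rule can be sketched in a few lines of Python. This is an illustration only: the function name bundle_enabled and its parameters are hypothetical and are not part of LHP's API.

```python
from pathlib import Path


def bundle_enabled(project_root: Path, no_bundle: bool = False) -> bool:
    """Return True when bundle sync would run for this project.

    Hypothetical helper: it mirrors the documented rule that bundle mode is
    on when databricks.yml exists in the project root and --no-bundle is
    not set.
    """
    has_bundle_file = (project_root / "databricks.yml").is_file()
    return has_bundle_file and not no_bundle
```

Calling bundle_enabled(Path("."), no_bundle=True) returns False even when databricks.yml is present, which matches the --no-bundle row for lhp generate.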

Project layout

<project_root>/
├── databricks.yml          # User-owned; LHP does not modify it
├── lhp.yaml                # LHP project config
├── pipelines/              # Flowgroup YAML
├── substitutions/          # <env>.yaml per target
├── config/                 # Optional pipeline_config.yaml / job_config.yaml
├── resources/
│   ├── lhp/                # LHP-owned; regenerated each run
│   └── *.job.yml           # User-owned; LHP leaves them alone
└── generated/              # Auto-generated Python; do not edit

Files under resources/lhp/ carry a # Generated by LakehousePlumber header. Manual edits are overwritten on the next generate cycle.

Sync behavior

Conservative sync runs after every successful lhp generate. The decision matrix is enforced by BundleManager.sync_resources_with_generated_files:

| Generated dir | File in resources/lhp/ | Action |
|---------------|------------------------|--------|
| Exists | LHP-owned | Preserve (no rewrite) |
| Exists | LHP-owned + --force + --pipeline-config | Regenerate |
| Exists | User-edited (no LHP header) | Rename to .bkup and recreate |
| Exists | None | Create |
| Missing | Any | Delete |
| Any | Multiple files defining the same pipeline | Raise BundleResourceError |

Backup names use .bkup, .bkup.1, .bkup.2 on collision.
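A minimal sketch of the decision matrix and the backup naming follows. It is not the code in BundleManager: the function names, the string labels for file state and action, the priority given to the duplicate check, and the choice to append .bkup to the full file name are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Optional


def sync_action(
    generated_dir_exists: bool,
    resource_state: Optional[str],  # None, "lhp-owned", or "user-edited" (assumed labels)
    force_with_config: bool = False,  # --force together with --pipeline-config
    duplicate_definitions: bool = False,
) -> str:
    """Map one row of the documented matrix to an action label."""
    if duplicate_definitions:
        # Assumption: the duplicate check wins over every other row.
        return "raise BundleResourceError"
    if not generated_dir_exists:
        return "delete"
    if resource_state is None:
        return "create"
    if resource_state == "user-edited":
        return "backup and recreate"
    # Remaining case: an LHP-owned file whose generated directory exists.
    return "regenerate" if force_with_config else "preserve"


def next_backup_path(resource_file: Path) -> Path:
    """Pick the first free name among .bkup, .bkup.1, .bkup.2, ..."""
    candidate = resource_file.with_name(resource_file.name + ".bkup")
    counter = 0
    while candidate.exists():
        counter += 1
        candidate = resource_file.with_name(f"{resource_file.name}.bkup.{counter}")
    return candidate
```

For example, sync_action(True, "lhp-owned") yields "preserve", while the same call with force_with_config=True yields "regenerate".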

Target binding rules

  • Each target in databricks.yml must have a matching substitutions/<target>.yaml.

  • Pipeline names in pipelines/ must use [a-zA-Z0-9_-]+ only.

  • Generated resource files: <pipeline_name>.pipeline.yml (preferred) or <pipeline_name>.yml.

  • Generated pipeline resource key: <pipeline_name>_pipeline.
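The naming rules above can be expressed as a small helper. The function resource_names is hypothetical and uses ValueError as a stand-in; LHP's own validation and error types may differ.

```python
import re

# Character set taken from the pipeline-name rule above.
PIPELINE_NAME_RE = re.compile(r"[a-zA-Z0-9_-]+")


def resource_names(pipeline_name: str) -> tuple[str, str]:
    """Return (preferred resource file name, resource key) for a pipeline."""
    if not PIPELINE_NAME_RE.fullmatch(pipeline_name):
        raise ValueError(f"invalid pipeline name: {pipeline_name!r}")
    file_name = f"{pipeline_name}.pipeline.yml"
    resource_key = f"{pipeline_name}_pipeline"
    return file_name, resource_key
```

For the pipeline bronze_load this returns bronze_load.pipeline.yml and bronze_load_pipeline.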

Pipeline configuration

File format

Multi-document YAML. The first document holds project_defaults; each subsequent document targets one or more pipelines via the pipeline key.

config/pipeline_config.yaml
project_defaults:
  catalog: "${catalog}"
  schema: "${schema}"
  serverless: true

---
pipeline:
  - bronze_load
serverless: false
clusters:
  - label: default
    node_type_id: Standard_D16ds_v5
    autoscale:
      min_workers: 2
      max_workers: 10

Pass the file path with --pipeline-config / -pc.

Top-level keys

Explicitly rendered by pipeline_resource.yml.j2. Source of truth: EXPLICITLY_RENDERED_PIPELINE_CONFIG_KEYS in src/lhp/bundle/manager.py.

| Key | Type | Default | Notes |
|-----|------|---------|-------|
| catalog | string | none | Unity Catalog name. Required if schema is set. Supports ${token}. |
| schema | string | none | Schema name. Required if catalog is set. Supports ${token}. |
| serverless | bool | true | Pipeline compute mode. |
| edition | string | ADVANCED | One of CORE, PRO, ADVANCED. Ignored when serverless: true. |
| channel | string | CURRENT | One of CURRENT, PREVIEW. |
| continuous | bool | false | Streaming/continuous mode. |
| photon | bool | none | Photon engine. Non-serverless only. |
| clusters | list | none | Cluster specs. Used when serverless: false. See Cluster keys. |
| configuration | dict | none | Spark/DLT properties. Values must be quoted strings. |
| notifications | list | none | Email recipients + alert types. See Notification keys. |
| tags | dict | none | Pipeline tags. Non-serverless only. |
| event_log | dict or false | none | Per-pipeline event log override. false opts out of project-level event_log. |
| environment | dict | none | Pip dependencies passed through as-is. |
| permissions | list | none | Pipeline ACL entries. See Permission keys. |

Any other top-level key is rendered verbatim via the pass-through filter, including run_as, trigger, budget_policy_id, edit_mode, and any Databricks Pipelines API field added after your LHP release.

Cluster keys

Each entry under clusters:

| Key | Notes |
|-----|-------|
| label | Required. default for the main cluster. |
| node_type_id | Optional. Mutually exclusive with instance_pool_id. |
| instance_pool_id | Optional. |
| driver_node_type_id | Optional. |
| driver_instance_pool_id | Optional. |
| policy_id | Optional cluster policy. |
| autoscale.min_workers | Required when autoscale set. |
| autoscale.max_workers | Required when autoscale set. |
| autoscale.mode | Optional. Typically ENHANCED. |

Notification keys

Each entry under notifications:

  • email_recipients: list of email strings.

  • alerts: list of alert types — on-update-success, on-update-failure, on-update-fatal-failure, on-flow-failure.

Permission keys

Each entry under permissions:

  • level: one of CAN_VIEW, CAN_RUN, CAN_MANAGE.

  • user_name, group_name, or service_principal_name: exactly one.
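As a pre-flight check, the two permission rules can be sketched as follows. The helper check_permission is hypothetical; whether LHP validates these entries itself, and how it reports problems, is not covered here.

```python
VALID_LEVELS = {"CAN_VIEW", "CAN_RUN", "CAN_MANAGE"}
PRINCIPAL_KEYS = ("user_name", "group_name", "service_principal_name")


def check_permission(entry: dict) -> list[str]:
    """Return a list of problems for one permissions entry (empty if valid)."""
    problems = []
    if entry.get("level") not in VALID_LEVELS:
        problems.append(f"level must be one of {sorted(VALID_LEVELS)}")
    # Exactly one principal key may be present.
    principals = [key for key in PRINCIPAL_KEYS if key in entry]
    if len(principals) != 1:
        problems.append(
            f"expected exactly one of {PRINCIPAL_KEYS}, found {len(principals)}"
        )
    return problems
```

An entry such as {"level": "CAN_RUN", "group_name": "data-engineers"} passes, while an entry naming both a user and a group is reported.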

Configuration block

The configuration dict is merged with LHP’s mandatory bundle.sourcePath entry. All values must be quoted strings; unquoted booleans or integers raise a validation error. Any user-supplied bundle.sourcePath is silently ignored.
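The merge behavior described here can be sketched as below. The function build_configuration is hypothetical, ValueError stands in for whatever validation error LHP raises, and the bundle.sourcePath value is copied from the generated resource example on this page.

```python
SOURCE_PATH_KEY = "bundle.sourcePath"
# Value as it appears in the generated resource example on this page.
SOURCE_PATH_VALUE = "${workspace.file_path}/generated/${bundle.target}"


def build_configuration(user_config: dict) -> dict:
    """Merge user configuration with the mandatory bundle.sourcePath entry."""
    merged = {}
    for key, value in user_config.items():
        if key == SOURCE_PATH_KEY:
            # A user-supplied bundle.sourcePath is dropped without warning.
            continue
        if not isinstance(value, str):
            # Unquoted YAML booleans and integers arrive as bool/int, not str.
            raise ValueError(f"configuration value for {key!r} must be a quoted string")
        merged[key] = value
    merged[SOURCE_PATH_KEY] = SOURCE_PATH_VALUE
    return merged
```

In YAML terms, spark.sql.shuffle.partitions: "200" is accepted, while the unquoted 200 parses as an integer and is rejected.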

Monitoring alias

The reserved key __eventlog_monitoring under pipeline: targets the monitoring pipeline generated from the monitoring key in lhp.yaml. See Monitoring Reference for resolution rules.

Merge precedence

DEFAULT_PIPELINE_CONFIG → project_defaults → pipeline-specific. Deep merge for dicts; lists are replaced wholesale.

Substitution applies to every field. Tokens resolve from substitutions/<env>.yaml at generate time.
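The merge order in this section can be illustrated with a short deep-merge sketch. The functions deep_merge and resolve_config are hypothetical, and the sample values are illustrative: the built-in defaults are taken from the defaults column of the top-level keys table, not from LHP's source.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Merge override into a copy of base: dicts recurse, everything else replaces."""
    result = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = deep_merge(result[key], value)
        else:
            # Lists and scalars are replaced wholesale, never concatenated.
            result[key] = deepcopy(value)
    return result


def resolve_config(defaults: dict, project_defaults: dict, specific: dict) -> dict:
    """Apply the three layers from lowest to highest precedence."""
    return deep_merge(deep_merge(defaults, project_defaults), specific)


# Illustrative inputs only.
builtin = {"serverless": True, "edition": "ADVANCED", "channel": "CURRENT"}
project = {
    "catalog": "${catalog}",
    "tags": {"team": "data"},
    "clusters": [{"label": "default", "node_type_id": "Standard_D4ds_v5"}],
}
bronze = {
    "serverless": False,
    "tags": {"tier": "bronze"},
    "clusters": [{"label": "default", "node_type_id": "Standard_D16ds_v5"}],
}
merged = resolve_config(builtin, project, bronze)
```

In merged, tags holds both team and tier because dicts merge key by key, while clusters contains only the pipeline-specific entry because the list replaces the one from project_defaults.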

Catalog/schema validation

  • Both catalog and schema must be set, or neither.

  • Both must be non-empty after substitution.

  • Missing or partial definition raises BundleResourceError with docs_reference="docs/configure_catalog_schema.rst".

See Configuring catalog and schema for pipelines for per-pipeline and project_defaults configuration, resolution order, and the full error reference.

Job configuration

File format

config/job_config.yaml
project_defaults:
  max_concurrent_runs: 1
  performance_target: STANDARD
  queue:
    enabled: true

---
job_name:
  - bronze_ingestion_job
timeout_seconds: 7200
schedule:
  quartz_cron_expression: "0 0 2 * * ?"
  timezone_id: America/New_York

Pass the file path with --job-config / -jc. Use --bundle-output to write the job file under resources/ for bundle deployment.

Top-level keys

Explicitly rendered by job_resource.yml.j2. Source of truth: EXPLICITLY_RENDERED_JOB_CONFIG_KEYS in src/lhp/core/services/job_generator.py. Defaults from JobGenerator.DEFAULT_JOB_CONFIG.

| Key | Default | Notes |
|-----|---------|-------|
| max_concurrent_runs | 1 | Concurrent run cap. |
| performance_target | STANDARD | One of STANDARD, PERFORMANCE_OPTIMIZED. |
| queue.enabled | true | Job queueing. |
| timeout_seconds | none | Job-level timeout. |
| tags | none | Job tag dict. |
| email_notifications | none | on_start / on_success / on_failure lists of recipients. |
| webhook_notifications | none | on_start / on_success / on_failure lists of {id} entries. |
| permissions | none | Job ACL entries (same shape as pipeline permissions). |
| schedule.quartz_cron_expression | none | Required when schedule set. |
| schedule.timezone_id | none | Required when schedule set. |
| schedule.pause_status | none | PAUSED or UNPAUSED. |
| notebook_cluster | none | Monitoring job only. new_cluster dict or existing_cluster_id. |
| generate_master_job | true | LHP-internal; controls master-job emission. Never written to output. |
| master_job_name | none (auto) | LHP-internal; overrides the master-job name. Never written to output. |

Any other top-level key passes through as-is. Common examples: trigger.file_arrival, continuous, run_as.service_principal_name, git_source, health, parameters, environments, edit_mode, budget_policy_id. LHP does not validate pass-through fields against the Databricks Jobs API; misspellings surface at deploy time.

Merge precedence

DEFAULT_JOB_CONFIG → project_defaults → job-specific. Deep merge for dicts; lists are replaced wholesale. Author key order preserved.

Multi-job orchestration

Set job_name on flowgroups in pipelines/*.yaml to split execution into named jobs.

pipelines/bronze/customer.yaml (excerpt)
pipeline: data_bronze
flowgroup: customer_ingestion
job_name:
  - bronze_ingestion_job

Rules

  • All-or-nothing: if any flowgroup sets job_name, every flowgroup must set it.

  • Format: ^[a-zA-Z0-9_-]+$.

  • --pipeline filter is rejected in multi-job mode.
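The first two rules can be checked before generation with a sketch like the one below. The helper check_job_names is hypothetical, and it assumes each flowgroup has already been loaded into a dict with flowgroup and optional job_name keys, as in the excerpt above.

```python
import re

# Same character set as the documented ^[a-zA-Z0-9_-]+$ format.
JOB_NAME_RE = re.compile(r"[a-zA-Z0-9_-]+")


def check_job_names(flowgroups: list[dict]) -> list[str]:
    """Return problems found; an empty list means the rules hold."""
    problems = []
    assigned = [fg for fg in flowgroups if fg.get("job_name")]
    # All-or-nothing: once one flowgroup sets job_name, all of them must.
    if assigned and len(assigned) < len(flowgroups):
        missing = [
            fg.get("flowgroup", "<unnamed>")
            for fg in flowgroups
            if not fg.get("job_name")
        ]
        problems.append("job_name missing on: " + ", ".join(missing))
    for fg in assigned:
        names = fg["job_name"]
        if isinstance(names, str):
            names = [names]  # tolerate a scalar as well as a list
        for name in names:
            if not JOB_NAME_RE.fullmatch(str(name)):
                problems.append(
                    f"invalid job_name {name!r} on {fg.get('flowgroup', '<unnamed>')}"
                )
    return problems
```

A project where no flowgroup sets job_name returns an empty list, since the all-or-nothing rule only applies once at least one flowgroup opts in.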

Generated artifacts

resources/
├── <job_name>.job.yml          # One per unique job_name
└── <project>_master.job.yml    # Master orchestrator

The master job wires individual jobs together via task_key references with depends_on edges resolved from dependency analysis.

Generated resource example

resources/lhp/bronze_load.pipeline.yml
# Generated by LakehousePlumber - Bundle Resource for bronze_load
resources:
  pipelines:
    bronze_load_pipeline:
      name: bronze_load_pipeline
      catalog: ${var.catalog}
      schema: ${var.bronze_schema}
      serverless: true
      libraries:
        - glob:
            include: ${workspace.file_path}/generated/${bundle.target}/bronze_load/**
      root_path: ${workspace.file_path}/generated/${bundle.target}/bronze_load
      configuration:
        bundle.sourcePath: ${workspace.file_path}/generated/${bundle.target}

LHP always emits libraries as a glob, root_path under ${workspace.file_path}/generated/${bundle.target}, and the bundle.sourcePath configuration entry.
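The three always-emitted fields follow a fixed pattern derived from the pipeline name, sketched below. The function pipeline_paths is hypothetical; the real output is rendered from pipeline_resource.yml.j2, and this sketch only reproduces the path layout visible in the example above.

```python
# Base path as it appears in the generated resource example.
GENERATED_BASE = "${workspace.file_path}/generated/${bundle.target}"


def pipeline_paths(pipeline_name: str) -> dict:
    """Build the libraries, root_path, and configuration fields for one pipeline."""
    pipeline_root = f"{GENERATED_BASE}/{pipeline_name}"
    return {
        "libraries": [{"glob": {"include": f"{pipeline_root}/**"}}],
        "root_path": pipeline_root,
        "configuration": {"bundle.sourcePath": GENERATED_BASE},
    }
```

The ${workspace.file_path} and ${bundle.target} references are left as literal text here; they are bundle variables that appear unresolved in the generated YAML.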

Configuration templates

lhp init writes starter templates under config/:

  • config/pipeline_config.yaml.tmpl

  • config/job_config.yaml.tmpl

Copy each template to the same name without the .tmpl suffix before editing.

Version enforcement

The optional required_lhp_version key in lhp.yaml pins generation to a specific LHP release range, so the same project produces the same Python output across development and CI. lhp validate and lhp generate fail when the installed LHP version falls outside the range. Informational commands such as lhp show skip the check so you can inspect a project even on a mismatched LHP version.

LHP accepts any PEP 440 version specifier:

lhp.yaml — version specifier examples
# Exact pin
required_lhp_version: "==0.4.1"

# Allow patch updates only (equivalent to >=0.4.1,<0.5.0)
required_lhp_version: "~=0.4.1"

# Range with exclusion
required_lhp_version: ">=0.4.1,<0.5.0,!=0.4.3"

# Allow minor updates
required_lhp_version: ">=0.4.0,<1.0.0"

Projects without required_lhp_version run on any installed LHP version.

Emergency bypass

Set LHP_IGNORE_VERSION=1 to skip version checking temporarily:

Bypass version checking
export LHP_IGNORE_VERSION=1
lhp generate -e dev

# Or inline for a single command
LHP_IGNORE_VERSION=1 lhp validate -e prod

Warning

LHP_IGNORE_VERSION=1 defeats the purpose of version pinning. Reserve it for incident response, not regular workflows.
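The gating described in this section can be summarized with a small sketch. The function version_check_required is hypothetical. It assumes that only validate and generate enforce the requirement and that only the exact value 1 activates the bypass; the behavior of other commands and other values is not specified on this page.

```python
import os
from typing import Mapping, Optional

# Commands documented as failing on a version mismatch.
ENFORCING_COMMANDS = {"validate", "generate"}


def version_check_required(
    command: str, environ: Optional[Mapping[str, str]] = None
) -> bool:
    """Decide whether required_lhp_version should be enforced for a command."""
    env = os.environ if environ is None else environ
    if env.get("LHP_IGNORE_VERSION") == "1":
        # Emergency bypass: skip the check entirely.
        return False
    return command in ENFORCING_COMMANDS
```

Passing environ explicitly keeps the function easy to test without touching the real process environment.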

CI/CD integration

Install the LHP version matching the project requirement before running lhp validate or lhp generate:

CI pipeline with version enforcement
# Install the exact range from lhp.yaml
pip install "lakehouse-plumber$(yq -r .required_lhp_version lhp.yaml)"

# Or pin a known-good range
pip install "lakehouse-plumber>=0.4.1,<0.5.0"

# Validate and generate (fail-fast on mismatch)
lhp validate -e prod
lhp generate -e prod

Error codes

  • BundleResourceError: raised when catalog/schema is missing, incomplete, or empty after substitution (carries docs_reference="docs/configure_catalog_schema.rst"). Also raised when multiple files define the same pipeline, when YAML in resources/lhp/ is malformed, or on filesystem failure. See Configuring catalog and schema for pipelines for the catalog/schema cases.

  • LHPConfigError 028: BundleManager was initialized with no project_root.

See also