Configuring catalog and schema for pipelines¶
Every Databricks Lakeflow Declarative Pipeline targets a Unity Catalog
catalog and schema. In Lakehouse Plumber (LHP), both values come from
pipeline_config.yaml — either as per-pipeline entries or as project-wide
defaults. lhp generate requires both values and fails fast when either is
missing.
Changed in version 0.8.7: Catalog and schema must be defined in pipeline_config.yaml. Previous
versions wrote default_pipeline_catalog and default_pipeline_schema
variables into databricks.yml automatically and the pipeline-resource
template fell back to ${var.default_pipeline_*}. Both behaviors are
removed. See Migration from databricks.yml variables.
Per-pipeline configuration¶
Set catalog and schema inside a per-pipeline entry. Each pipeline gets
its own YAML document keyed by pipeline::
---
pipeline: ingestion_dev
catalog: my_dev_catalog
schema: bronze
serverless: true
A per-pipeline entry wins over project_defaults for every key it sets. LHP
deep-merges nested dicts, so per-pipeline configuration and event_log
blocks merge field-by-field; lists (clusters, notifications,
permissions) replace the project-default list wholesale.
Project defaults¶
Declare project_defaults at the top of the file when several pipelines
share the same catalog and schema. The block applies to every pipeline unless
a per-pipeline entry overrides specific keys:
project_defaults:
catalog: shared_catalog
schema: bronze
serverless: true
---
pipeline: ingestion_dev
---
pipeline: ingestion_prod
The key is project_defaults (not default_pipeline_config or any
variant). LHP reads it once at config load and merges it into every
pipeline’s resolved configuration.
Resolution order¶
LHP resolves each pipeline’s configuration in three layers, with later layers overriding earlier ones via deep merge:
DEFAULT_PIPELINE_CONFIG— LHP’s built-in defaults (serverless: true,edition: ADVANCED,channel: CURRENT,continuous: false).project_defaults— the top-of-file block inpipeline_config.yaml.Pipeline-specific entry — the document keyed by
pipeline: <name>.
Per-pipeline values override project_defaults; project_defaults overrides
the built-in defaults. The following config illustrates the precedence rules:
project_defaults:
catalog: shared_dev
schema: bronze
---
pipeline: ingestion_a
schema: silver # overrides project_defaults; catalog inherits 'shared_dev'
---
pipeline: ingestion_b
# both catalog and schema inherit from project_defaults
After resolution, ingestion_a runs against shared_dev.silver and
ingestion_b runs against shared_dev.bronze.
Substitution tokens (${token}, %{local_var}) work inside any of these
layers. LHP resolves tokens against substitutions/<env>.yaml after the deep
merge completes. See Substitutions & Secrets for the substitution syntax.
Passing the pipeline config¶
Pass the config path with --pipeline-config (short form -pc) on every
lhp generate invocation:
lhp generate --env dev --pipeline-config config/pipeline_config.yaml
The flag accepts paths relative to the project root or absolute paths.
Changed in version 0.8.7: --pipeline-config is required when bundle support is enabled
(databricks.yml is present in the project root and --no-bundle is
not passed). Invocations that omit the flag now fail with
LHP-CFG-023 before any files are written or deleted. To skip bundle
resource generation entirely, pass --no-bundle.
Pre-flight validation¶
lhp generate runs catalog/schema validation before any side effects:
no directory is wiped, no code is generated, no bundle YAML is written until
preflight passes. Two checks fire in order:
CLI flag check (
LHP-CFG-023) — when bundle support is enabled,--pipeline-configmust be supplied. Fails in <1 second on a missing flag.Catalog/schema resolution (
LHP-CFG-026) — every pipeline (plus the synthetic monitoring pipeline if monitoring is enabled inlhp.yaml) is checked for resolvedcatalogandschema. All failures are aggregated into one error, grouped by failure type — see Error reference below.
Preflight runs identically under --dry-run: a dry-run invocation with
missing config still surfaces the same errors, just without ever touching
the filesystem.
Without --pipeline-config and with bundle support enabled, every pipeline
fails with the error described in Error reference.
For multi-environment workflows, pair one config file with each environment:
lhp generate --env dev --pipeline-config config/pipeline_config-dev.yaml
lhp generate --env prod --pipeline-config config/pipeline_config-prod.yaml
Migration from databricks.yml variables¶
Prior to v0.8.7, LHP auto-populated two variables in databricks.yml for
every target:
targets:
dev:
variables:
default_pipeline_catalog:
default: my_dev_catalog
default_pipeline_schema:
default: bronze
The pipeline-resource template then emitted
catalog: ${var.default_pipeline_catalog} and
schema: ${var.default_pipeline_schema} when per-pipeline values were
absent. Both halves of that pathway are removed.
To migrate:
Delete the
default_pipeline_cataloganddefault_pipeline_schemaentries from each target’svariables:block indatabricks.yml. Remove thevariables:block entirely if those were the only entries.Move the values into
pipeline_config.yaml— typically intoproject_defaultsso they apply across every pipeline.
Before:
targets:
dev:
variables:
default_pipeline_catalog:
default: my_dev_catalog
default_pipeline_schema:
default: bronze
prod:
variables:
default_pipeline_catalog:
default: my_prod_catalog
default_pipeline_schema:
default: gold
After:
targets:
dev:
mode: development
prod:
mode: production
project_defaults:
catalog: my_dev_catalog
schema: bronze
project_defaults:
catalog: my_prod_catalog
schema: gold
If different pipelines need different catalogs or schemas inside one environment, add per-pipeline entries that override the defaults. See Per-pipeline configuration.
Error reference¶
Preflight surfaces two distinct error codes. LHP-CFG-023 covers the CLI
flag check; LHP-CFG-026 is the aggregated catalog/schema validation
result. Both carry a doc_link pointing to this page so that programmatic
catchers (CI tooling, editor integrations) can route users back to it.
LHP-CFG-023 — --pipeline-config is required¶
Message: --pipeline-config is required when bundle support is enabled
Triggered when databricks.yml exists in the project root (bundle support
is enabled), --no-bundle was not passed, and --pipeline-config /
-pc was omitted. Without that flag, PipelineConfigLoader loads empty
defaults and every pipeline would fail catalog/schema validation later — so
preflight surfaces the actionable error up front.
Fix: Pass --pipeline-config config/pipeline_config.yaml (or your
project’s equivalent path). To skip bundle resource generation entirely
without supplying the flag, pass --no-bundle.
LHP-CFG-026 — aggregated catalog/schema validation¶
Message: Catalog/schema validation failed for N pipeline(s)
Triggered when any pipeline (including the synthetic monitoring pipeline if
monitoring is enabled) has missing or empty catalog / schema after
config merge and substitution. Failures are aggregated across all
pipelines and grouped by failure type. The error’s context["failures"]
exposes the structured payload to programmatic consumers:
{
"both_missing": ["pipeline_a", "pipeline_b"],
"incomplete": [{"pipeline_name": "...", "catalog": "...", "schema": null}],
"empty_after_substitution": [{"pipeline_name": "...", ...}]
}
The three failure categories below explain each bucket.
Both missing¶
Pipelines listed under both_missing have neither catalog nor
schema set after the deep merge (DEFAULT_PIPELINE_CONFIG →
project_defaults → per-pipeline). The most common cause is a
pipeline_config.yaml without a project_defaults block and no
per-pipeline overrides.
Fix: Add catalog and schema to project_defaults (for
project-wide values) or to each pipeline’s own entry (for per-pipeline values).
Incomplete pairing¶
Pipelines listed under incomplete have one of catalog or schema
defined but the other missing. LHP requires both or neither — partial
configuration is treated as a typo, not as a fallback request.
Fix: Add the missing key. If the intent was to inherit from
project_defaults for one half, remove the other half from the
per-pipeline entry so both resolve from the same layer.
Empty after substitution¶
Pipelines listed under empty_after_substitution have catalog or
schema that resolved to whitespace-only strings after
substitutions/<env>.yaml was applied. The keys exist in the config but
evaluate to nothing meaningful.
Fix: Inspect the substitution file for the failing environment. Run
lhp substitutions --env <env> to print the resolved values for every
token. Add or correct the entries that resolve to empty strings.
Note
Pre-0.8.7 versions raised three separate LHP-CFG-026 errors — one per
failure category, on the first failing pipeline. The current preflight
walks every pipeline once and aggregates into a single error. Users who
parsed the previous title strings should switch to
LHPConfigError.context["failures"], which is the stable contract.
See also¶
Configure Bundles — Bundle setup walk-through and directory layout.
Bundle Configuration Reference — Full
pipeline_config.yamlschema and bundle sync rules.Substitutions & Secrets — Substitution token syntax and resolution order.
Error Reference — Full LHP error code catalog.