How to Set Up CI/CD for an LHP Project
=======================================

.. meta::
   :description: Set up CI/CD for a Lakehouse Plumber project: generate on PR, deploy on merge, and promote across dev, uat, and prod with substitutions.

This how-to sets up a CI/CD pipeline that validates Lakehouse Plumber (LHP) configurations on every pull request, deploys to a development workspace on merge, and promotes the same commit through ``uat`` and ``prod`` :term:`Databricks Asset Bundle` (DAB) targets. The primary example uses GitHub Actions; Azure DevOps and GitLab follow the same shape.

Before you start
----------------

You need:

- An LHP project with ``databricks.yml`` and at least one substitution file per environment (``substitutions/dev.yaml``, ``substitutions/uat.yaml``, ``substitutions/prod.yaml``). See :doc:`configure_bundles` to bootstrap one.
- A Databricks service principal with deploy permissions on each target workspace.
- The Databricks CLI installed on the runner (the ``databricks/setup-cli`` action handles this on GitHub Actions).

Do not commit anything under ``generated/`` or ``resources/lhp/``; those are build artifacts. Add them to ``.gitignore``.

Workflow shape
--------------

Every LHP CI/CD pipeline follows the same three steps, repeated per target:

1. ``lhp validate --env <env>``: fail fast on YAML or substitution errors.
2. ``lhp generate --env <env>``: produce Python files under ``generated/<env>/`` and resource YAML under ``resources/lhp/<env>/``. Bundle integration is enabled by default; pass ``--no-bundle`` only if you are not using DABs.
3. ``databricks bundle deploy --target <env>``: deploy the generated bundle.

Both ``lhp validate`` and ``lhp generate`` exit non-zero on failure (``65`` for validation or dependency errors, ``78`` for configuration errors, ``66`` for missing files; full list in :doc:`errors_reference`). A failing step stops the workflow before ``databricks bundle deploy`` is ever called.

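
The three steps and their exit codes can be sketched as a small shell wrapper, which may also help when reproducing a CI failure locally. This is a minimal sketch and not part of LHP: the function names ``describe_exit``, ``run_step``, and ``run_pipeline`` are hypothetical, and the labels only restate the three exit codes listed above.

.. code-block:: bash

   #!/usr/bin/env bash
   # Hypothetical wrapper (not shipped with LHP): run validate -> generate ->
   # deploy for one environment and stop at the first failing step.
   set -u

   # Map an exit code to the category named in this guide.
   describe_exit() {
     case "$1" in
       0)  echo "success" ;;
       65) echo "validation or dependency error" ;;
       66) echo "missing file" ;;
       78) echo "configuration error" ;;
       *)  echo "other failure (see errors_reference)" ;;
     esac
   }

   # Run one command; on failure, report the category and return its exit code.
   run_step() {
     "$@"
     local status=$?
     if [ "$status" -ne 0 ]; then
       echo "step failed: $* -> exit $status ($(describe_exit "$status"))" >&2
     fi
     return "$status"
   }

   # Run the three steps in order; '&&' skips the later steps once one fails.
   run_pipeline() {
     local env="$1"
     run_step lhp validate --env "$env" &&
       run_step lhp generate --env "$env" &&
       run_step databricks bundle deploy --target "$env"
   }

   # Only run when an environment name is passed on the command line.
   if [ "$#" -ge 1 ]; then
     run_pipeline "$1"
   fi

Saved under a name such as ``lhp-ci.sh`` (an assumed filename), the sketch would run as ``./lhp-ci.sh dev``. Its exit status is that of the first failing step, so a CI runner fails the job without reaching the deploy.
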

Generate on PR, deploy on merge
-------------------------------

The workflow below runs ``lhp validate`` and a dry-run ``lhp generate`` on pull requests, then deploys the merged commit to ``dev`` on push to ``main``. Add ``uat`` and ``prod`` jobs that gate on Git tags (see "Promote across environments" below).

.. code-block:: yaml
   :caption: .github/workflows/lhp-cicd.yml

   name: LHP CI/CD

   on:
     pull_request:
       branches: [main]
     push:
       branches: [main]
       tags: ['v*-uat', 'v*-prod']

   permissions:
     contents: read
     id-token: write  # required for OIDC

   jobs:
     validate:
       if: github.event_name == 'pull_request'
       runs-on: ubuntu-latest
       steps:
         - uses: actions/checkout@v4
         - uses: actions/setup-python@v5
           with:
             python-version: '3.12'
             cache: 'pip'
         - run: pip install lakehouse-plumber
         - run: lhp validate --env dev --verbose
         - run: lhp generate --env dev --dry-run

     deploy-dev:
       if: github.event_name == 'push' && github.ref == 'refs/heads/main'
       runs-on: ubuntu-latest
       environment: development
       env:
         DATABRICKS_AUTH_TYPE: github-oidc
         DATABRICKS_HOST: ${{ vars.DATABRICKS_HOST_DEV }}
         DATABRICKS_CLIENT_ID: ${{ vars.DATABRICKS_CLIENT_ID_DEV }}
       steps:
         - uses: actions/checkout@v4
         - uses: actions/setup-python@v5
           with:
             python-version: '3.12'
         - uses: databricks/setup-cli@main
         - run: pip install lakehouse-plumber
         - run: lhp generate --env dev
         - run: databricks bundle deploy --target dev

The ``validate`` job runs without Databricks credentials, because ``lhp validate`` and ``lhp generate --dry-run`` are local operations.

.. note::

   OIDC federation (``github-oidc``) avoids storing long-lived Databricks tokens. Create one federation policy per environment, scoped to the matching GitHub environment subject. See the Databricks OIDC federation docs for the exact ``service-principal-federation-policy create`` payload.

Promote across environments
---------------------------

Use the same commit for every target.
Substitution files (``substitutions/<env>.yaml``) supply the per-environment values; the same YAML in ``pipelines/`` generates per-environment Python under ``generated/<env>/`` and resource YAML under ``resources/lhp/<env>/``.

Trigger promotions with Git tags, not branches. This preserves the commit-once, deploy-many guarantee:

- ``v1.2.3-uat`` deploys the tagged commit to UAT.
- ``v1.2.3-prod`` deploys the same tagged commit to production, gated by a GitHub environment with required reviewers.

Add these jobs to the workflow above:

.. code-block:: yaml
   :caption: UAT and prod promotion jobs

   # Add both jobs under the existing "jobs:" key.
   deploy-uat:
     if: startsWith(github.ref, 'refs/tags/v') && endsWith(github.ref, '-uat')
     runs-on: ubuntu-latest
     environment: uat
     env:
       DATABRICKS_AUTH_TYPE: github-oidc
       DATABRICKS_HOST: ${{ vars.DATABRICKS_HOST_UAT }}
       DATABRICKS_CLIENT_ID: ${{ vars.DATABRICKS_CLIENT_ID_UAT }}
     steps:
       - uses: actions/checkout@v4
         with:
           ref: ${{ github.ref }}
       - uses: actions/setup-python@v5
         with:
           python-version: '3.12'
       - uses: databricks/setup-cli@main
       - run: pip install lakehouse-plumber
       - run: lhp generate --env uat
       - run: databricks bundle deploy --target uat

   deploy-prod:
     if: startsWith(github.ref, 'refs/tags/v') && endsWith(github.ref, '-prod')
     runs-on: ubuntu-latest
     environment: production  # configure required reviewers in GitHub
     env:
       DATABRICKS_AUTH_TYPE: github-oidc
       DATABRICKS_HOST: ${{ vars.DATABRICKS_HOST_PROD }}
       DATABRICKS_CLIENT_ID: ${{ vars.DATABRICKS_CLIENT_ID_PROD }}
     steps:
       - uses: actions/checkout@v4
         with:
           ref: ${{ github.ref }}
       - uses: actions/setup-python@v5
         with:
           python-version: '3.12'
       - uses: databricks/setup-cli@main
       - run: pip install lakehouse-plumber
       - run: lhp generate --env prod
       # "bundle deploy" has no --mode flag; production mode comes from
       # "mode: production" on the prod target in databricks.yml.
       - run: databricks bundle deploy --target prod

Configure GitHub environments (``Settings → Environments``) for ``development``, ``uat``, and ``production``. Attach required reviewers and deployment branch and tag rules to ``production``; because production deploys run from tags, those rules must allow the ``v*-prod`` tag pattern.
The federation policy subject for each environment must match the name set in that job's ``environment:`` field.

Recover from a failed deploy
----------------------------

Bundle deploys are desired-state: the next successful deploy overwrites whatever the previous run left behind. Two failure modes deserve specific handling:

1. ``lhp generate`` failed. The pipeline never reached ``databricks bundle deploy``. Fix the YAML or substitution error, commit, and re-tag (or re-trigger the job). Exit codes from :doc:`errors_reference` identify the failure category.
2. ``databricks bundle deploy`` failed mid-run. Some resources may have been updated. Re-run the same workflow to regenerate and redeploy.

If you need to roll back to a previous version, tag the older commit with a new tag that still ends in ``-prod`` so it matches the ``v*-prod`` trigger (for example ``v1.2.2-hotfix-prod``); the deploy job regenerates from that commit's YAML and replaces the bad deployment.

Other CI platforms
------------------

Azure DevOps Pipelines, GitLab CI, and Bitbucket Pipelines follow the same three-step shape (``lhp validate`` → ``lhp generate`` → ``databricks bundle deploy``). The differences are platform-specific:

- **OIDC issuer and subject format.** Each platform exposes a different issuer URL and subject pattern. Create one federation policy per environment per platform.
- **Auth-type variable.** Set ``DATABRICKS_AUTH_TYPE`` to the value Databricks documents for the platform: ``github-oidc`` on GitHub Actions and ``azure-devops-oidc`` on Azure DevOps. Platforms without a dedicated type, such as GitLab and Bitbucket, typically use ``env-oidc`` with the platform's OIDC token exported as ``DATABRICKS_OIDC_TOKEN``. Confirm the current values in the Databricks authentication docs before relying on them.
- **Approval gates.** Use Azure DevOps Environments, GitLab protected environments, or Bitbucket Deployments to enforce production reviewers.

The CLI commands, exit codes, substitution files, and bundle configuration are identical across platforms.

See also
--------

- :doc:`architecture`: why LHP separates source YAML from generated artifacts.
- :doc:`configure_bundles`: enable DAB integration in an LHP project.
- :doc:`bundle_config_reference`: bundle and pipeline configuration fields.
- :doc:`errors_reference`: exit codes and error categories returned by ``lhp validate`` and ``lhp generate``.