Monitoring Reference
=====================

.. meta::
   :description: Configuration reference for Lakehouse Plumber pipeline monitoring: event log fields, monitoring fields, materialized view rules, workflow job schema, reserved aliases, and validation error codes.

This page catalogs every ``event_log`` and ``monitoring`` key in ``lhp.yaml``, the schema of the dedicated job-config file, the reserved ``__eventlog_monitoring`` alias, and the validation errors LHP raises. For the walk-through, see :doc:`enable_monitoring`.

Prerequisites: Databricks Asset Bundles enabled (``databricks.yml`` exists), Unity Catalog workspace, and a cloud storage path (typically a Unity Catalog volume) for streaming checkpoints.

Event Log Configuration
-----------------------

The ``event_log`` block in ``lhp.yaml`` injects ``event_log:`` blocks into every generated pipeline resource file during ``lhp generate``. String fields support ``${token}`` substitution from ``substitutions/.yaml``.

.. list-table:: ``event_log`` fields
   :header-rows: 1
   :widths: 22 12 18 48

   * - Field
     - Type
     - Default
     - Description
   * - ``enabled``
     - boolean
     - ``true``
     - Set to ``false`` to keep the section but skip injection.
   * - ``catalog``
     - string
     - (required when ``enabled``)
     - Unity Catalog name for event log tables.
   * - ``schema``
     - string
     - (required when ``enabled``)
     - Schema name for event log tables.
   * - ``name_prefix``
     - string
     - ``""``
     - Prefix prepended to the generated event log table name.
   * - ``name_suffix``
     - string
     - ``""``
     - Suffix appended to the generated event log table name.

Per-pipeline table name formula: ``{name_prefix}{pipeline_name}{name_suffix}``.

Pipeline-Level Overrides
~~~~~~~~~~~~~~~~~~~~~~~~

In ``pipeline_config.yaml`` (loaded by ``lhp generate -pc``):

* An ``event_log:`` mapping for a pipeline **fully replaces** the project-level block (no merge).
* ``event_log: false`` opts that pipeline out of event logging. Opt-outs are excluded from the monitoring notebook's source list.

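
The table name formula and the override rules above can be sketched in a few lines of Python. This is an illustrative model only, not LHP's implementation: the function name ``resolve_event_log`` and the shape of the returned dictionary are assumptions made for this example.

.. code-block:: python

   from typing import Optional, Union


   def resolve_event_log(
       project_cfg: Optional[dict],
       override: Union[dict, bool, None],
       pipeline_name: str,
   ) -> Optional[dict]:
       """Return the effective event log settings for one pipeline, or None.

       Hypothetical helper: models the documented rules, not LHP internals.
       """
       # ``event_log: false`` in pipeline_config.yaml opts the pipeline out.
       if override is False:
           return None
       # A mapping override fully replaces the project-level block (no merge).
       cfg = override if isinstance(override, dict) else project_cfg
       if cfg is None or not cfg.get("enabled", True):
           return None
       prefix = cfg.get("name_prefix", "")
       suffix = cfg.get("name_suffix", "")
       return {
           "catalog": cfg["catalog"],
           "schema": cfg["schema"],
           # {name_prefix}{pipeline_name}{name_suffix}
           "name": f"{prefix}{pipeline_name}{suffix}",
       }


   project = {"catalog": "main", "schema": "_meta", "name_suffix": "_event_log"}

   # Project-level block applies: the suffix is appended to the pipeline name.
   print(resolve_event_log(project, None, "bronze_ingest"))

   # Full replacement: the project-level suffix is NOT inherited.
   print(resolve_event_log(project, {"catalog": "ops", "schema": "logs"}, "bronze_ingest"))

Because the override replaces rather than merges, a pipeline-level mapping has to restate ``name_prefix`` or ``name_suffix`` if the pipeline should keep them.
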
For the full ``pipeline_config.yaml`` schema, see :doc:`bundle_config_reference`.

Monitoring Configuration
------------------------

The ``monitoring`` block generates three artifacts: a union notebook, an MVs-only Lakeflow Declarative Pipeline, and a Databricks Workflow job chaining them. Monitoring requires ``event_log`` enabled (LHP raises ``LHP-CFG-008`` otherwise).

.. list-table:: ``monitoring`` fields
   :header-rows: 1
   :widths: 25 10 22 43

   * - Field
     - Type
     - Default
     - Description
   * - ``enabled``
     - boolean
     - ``true``
     - When ``false``, no monitoring artifacts are generated and previously generated artifacts are removed on the next ``lhp generate``.
   * - ``checkpoint_path``
     - string
     - required when ``enabled``
     - Base path for streaming checkpoints. Each monitored pipeline gets the subdirectory ``{checkpoint_path}/{pipeline_name}/``. Typically a Unity Catalog volume.
   * - ``job_config_path``
     - string
     - required when ``enabled``
     - Project-root-relative path to a flat single-document YAML file describing the monitoring Workflow job. See `Workflow Job Schema`_.
   * - ``max_concurrent_streams``
     - integer
     - ``10``
     - ``ThreadPoolExecutor`` ``max_workers`` for the union notebook. Must be ≥ 1.
   * - ``pipeline_name``
     - string
     - ``${project_name}_event_log_monitoring``
     - Used for the generated pipeline, job, and the directory under ``generated//``.
   * - ``catalog``
     - string
     - inherits ``event_log.catalog``
     - Override catalog for monitoring tables.
   * - ``schema``
     - string
     - inherits ``event_log.schema``
     - Override schema for monitoring tables.
   * - ``streaming_table``
     - string
     - ``all_pipelines_event_log``
     - Delta table the union notebook writes into (regular Delta table, created by Structured Streaming on first run).
   * - ``materialized_views``
     - list
     - one default view
     - Materialized view definitions. See `Materialized Views`_.
   * - ``enable_job_monitoring``
     - boolean
     - ``false``
     - Adds a ``jobs_stats`` view populated via the Databricks SDK. See `Job Monitoring`_.

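
As a rough illustration of how the defaults in this table combine, the following Python sketch resolves a ``monitoring`` mapping against an ``event_log`` mapping. The helper names ``resolve_monitoring`` and ``checkpoint_dir`` are hypothetical, and ``ValueError`` merely stands in for LHP's ``LHP-CFG-008``; the real validation lives inside LHP and may differ in detail.

.. code-block:: python

   from typing import Optional


   def resolve_monitoring(
       monitoring: dict, event_log: dict, project_name: str
   ) -> Optional[dict]:
       """Apply the documented defaults; hypothetical helper, not LHP code."""
       if not monitoring.get("enabled", True):
           return None  # no monitoring artifacts are generated
       # Both strings are required when monitoring is enabled.
       for key in ("checkpoint_path", "job_config_path"):
           if not monitoring.get(key):
               raise ValueError(f"monitoring.{key} is required when enabled")
       max_streams = monitoring.get("max_concurrent_streams", 10)
       if max_streams < 1:
           raise ValueError("monitoring.max_concurrent_streams must be >= 1")
       # catalog and schema fall back to the event_log values when unset.
       catalog = monitoring.get("catalog") or event_log["catalog"]
       schema = monitoring.get("schema") or event_log["schema"]
       table = monitoring.get("streaming_table", "all_pipelines_event_log")
       return {
           "pipeline_name": monitoring.get(
               "pipeline_name", f"{project_name}_event_log_monitoring"
           ),
           "max_concurrent_streams": max_streams,
           "streaming_table_fqn": f"{catalog}.{schema}.{table}",
       }


   def checkpoint_dir(checkpoint_path: str, pipeline_name: str) -> str:
       """Per-pipeline checkpoint subdirectory: {checkpoint_path}/{pipeline_name}/."""
       return f"{checkpoint_path.rstrip('/')}/{pipeline_name}/"


   resolved = resolve_monitoring(
       {
           "checkpoint_path": "/Volumes/main/_meta/checkpoints/event_logs",
           "job_config_path": "config/monitoring_job_config.yaml",
       },
       {"catalog": "main", "schema": "_meta"},
       "acme",
   )
   print(resolved["pipeline_name"])        # acme_event_log_monitoring
   print(resolved["streaming_table_fqn"])  # main._meta.all_pipelines_event_log

In this model an empty mapping fails on the first required key, which mirrors the documented behavior that ``monitoring: {}`` never passes validation.
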
Both required strings support ``${token}`` substitution. When ``job_config_path`` contains unresolved tokens, LHP defers the file-existence check until the environment is selected. ``monitoring: {}`` parses but always fails validation, because both required strings are empty.

Minimal Valid Configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: yaml
   :caption: lhp.yaml

   event_log:
     catalog: "${catalog}"
     schema: _meta
     name_suffix: "_event_log"

   monitoring:
     checkpoint_path: "/Volumes/${catalog}/_meta/checkpoints/event_logs"
     job_config_path: "config/monitoring_job_config.yaml"

Override Resolution
~~~~~~~~~~~~~~~~~~~

Monitoring catalog and schema resolve as: ``monitoring.catalog`` / ``monitoring.schema`` (when set) wins; otherwise inherits from ``event_log``. The Delta table fully qualified name is ``{catalog}.{schema}.{streaming_table}``.

Materialized Views
------------------

The ``materialized_views`` field has three modes:

.. list-table::
   :header-rows: 1
   :widths: 30 70

   * - Value
     - Behavior
   * - omitted or ``null``
     - LHP generates the default ``events_summary`` view (see `Default events_summary Schema`_).
   * - ``[]`` (empty list)
     - No materialized views and no MVs pipeline file. The Workflow job contains only the notebook task. Cleanup removes any previously generated ``monitoring.py``.
   * - explicit list
     - Only the listed views are generated, replacing the default.

View Definition Fields
~~~~~~~~~~~~~~~~~~~~~~

Each entry must declare ``name`` (required, unique within the list) and exactly one of ``sql`` (inline SQL string) or ``sql_path`` (project-root-relative path to a SQL file). Each view becomes a ``@dp.materialized_view`` function in ``generated///monitoring.py``, fully qualified as ``{monitoring_catalog}.{monitoring_schema}.{view_name}``.

.. code-block:: yaml
   :caption: lhp.yaml (custom views)

   monitoring:
     checkpoint_path: "/Volumes/${catalog}/_meta/checkpoints/event_logs"
     job_config_path: "config/monitoring_job_config.yaml"
     materialized_views:
       - name: error_events
         sql: "SELECT * FROM all_pipelines_event_log WHERE event_type = 'error'"
       - name: custom_analysis
         sql_path: "sql/monitoring_custom_analysis.sql"

Default events_summary Schema
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The default view aggregates run status, timing, row metrics, and run config from the union Delta table.

.. list-table::
   :header-rows: 1
   :widths: 32 14 54

   * - Column
     - Type
     - Description
   * - ``pipeline_name``, ``pipeline_id``, ``update_id``
     - STRING
     - Pipeline identifiers and run (update) ID.
   * - ``run_status``
     - STRING
     - Final status (``COMPLETED``, ``FAILED``, ``CANCELED``, …).
   * - ``trigger_cause``
     - STRING
     - ``USER_ACTION``, ``SCHEDULED``, ``API_CALL``, …
   * - ``is_full_refresh``
     - BOOLEAN
     - Full refresh vs. incremental.
   * - ``dbr_version``
     - STRING
     - Databricks Runtime version.
   * - ``compute_type``
     - STRING
     - ``Serverless`` or ``Classic``.
   * - ``run_start_time``, ``run_end_time``
     - TIMESTAMP
     - Run start and end.
   * - ``duration_minutes``
     - DOUBLE
     - Run duration in minutes (2 decimal places).
   * - ``tables_processed``
     - BIGINT
     - Distinct tables (flows) processed.
   * - ``total_upserted_rows``, ``total_deleted_rows``, ``total_rows_affected``
     - BIGINT
     - Row counts across all tables.
   * - ``total_dropped_records``
     - BIGINT
     - Records dropped by data quality expectations.

.. _monitoring-job-config:

Workflow Job Schema
-------------------

``monitoring.job_config_path`` must point to a flat single-document YAML mapping. LHP deep-merges the file over its defaults (``max_concurrent_runs=1``, ``performance_target=STANDARD``, ``queue.enabled=true``) and then token-substitutes via the active environment's ``substitutions/.yaml``. The job name is fixed at ``{pipeline_name}_job`` and the ``pipeline_task`` is generated automatically.

.. list-table:: Job-config fields (all optional)
   :header-rows: 1
   :widths: 32 68

   * - Field
     - Description
   * - ``notebook_cluster.new_cluster``
     - New-cluster spec for the notebook task. Mutually exclusive with ``existing_cluster_id``. Serverless when neither is set.
   * - ``notebook_cluster.existing_cluster_id``
     - Attach the notebook task to an existing cluster.
   * - ``queue.enabled``
     - Enable job-run queueing.
   * - ``performance_target``
     - ``STANDARD`` (default) or ``PERFORMANCE_OPTIMIZED``.
   * - ``timeout_seconds``
     - Job-level timeout in seconds.
   * - ``max_concurrent_runs``
     - Maximum concurrent runs. Default ``1``.
   * - ``schedule``
     - Quartz cron schedule: ``quartz_cron_expression``, ``timezone_id``, ``pause_status``.
   * - ``tags``
     - Free-form ``{key: value}`` mapping. Merged recursively over LHP defaults.
   * - ``email_notifications``, ``webhook_notifications``
     - ``on_start`` / ``on_success`` / ``on_failure`` recipient or webhook lists.
   * - ``permissions``
     - User/group permission entries.

Disallowed keys: a top-level ``project_defaults:`` wrapper, ``job_name:``, and ``pipeline_task`` entries (silently overridden).

.. warning::

   Legacy auto-pickup of ``templates/bundle/job_config.yaml`` for monitoring jobs is **removed**. Use ``monitoring.job_config_path`` to point at a dedicated file. The ``__eventlog_monitoring`` alias inside the generic ``config/job_config.yaml`` consumed by ``lhp deps`` for *orchestration* jobs is unaffected.

.. _monitoring-jobs:

Job Monitoring
--------------

When ``monitoring.enable_job_monitoring: true``, LHP adds a Python-load chain that correlates Databricks Jobs with their pipeline runs via the Databricks SDK.

Generated files under ``generated///``:

* ``monitoring.py``: adds a ``v_jobs_stats`` view and a ``jobs_stats`` MV.
* ``jobs_stats_loader.py``: calls the SDK to scan recent job runs, correlate each pipeline update with its triggering job, and enrich rows with pipeline tags (``spec.tags``) and job tags (``settings.tags``).

Default SDK lookback is 7 days (168 hours), configurable via the ``lookback_hours`` pipeline parameter. The ``jobs_stats`` view inherits the monitoring pipeline's catalog and schema.

.. list-table:: ``jobs_stats`` schema
   :header-rows: 1
   :widths: 28 14 58

   * - Column
     - Type
     - Description
   * - ``pipeline_id``, ``pipeline_name``
     - STRING
     - Pipeline identifiers.
   * - ``update_id``
     - STRING
     - Pipeline update correlated with the job run.
   * - ``job_id``, ``job_run_id``, ``job_name``
     - STRING
     - Triggering job identifiers and name.
   * - ``job_run_start_time``, ``job_run_end_time``
     - TIMESTAMP
     - Job run start and end.
   * - ``job_run_status``
     - STRING
     - Final status (``SUCCESS``, ``FAILED``, ``UNKNOWN``, …).
   * - ``pipeline_tags``, ``job_tags``
     - STRING
     - JSON map of ``spec.tags`` and ``settings.tags`` respectively.

Reserved Aliases
----------------

The ``__eventlog_monitoring`` alias (double-underscore prefix) targets the monitoring pipeline from ``pipeline_config.yaml`` without hardcoding its name. It resolves to ``monitoring.pipeline_name`` (default ``${project_name}_event_log_monitoring``) at generation time.

.. list-table::
   :header-rows: 1
   :widths: 40 60

   * - Constraint
     - Violation result
   * - Monitoring must be enabled in ``lhp.yaml``.
     - Alias entry is dropped and LHP logs a warning.
   * - Cannot coexist with the actual monitoring pipeline name in the same config.
     - LHP raises ``LHP-VAL-010``.
   * - Must appear as a standalone ``pipeline:`` value, not inside a list.
     - LHP raises ``LHP-VAL-011``.

The alias inside the generic orchestration ``config/job_config.yaml`` (consumed by ``lhp deps``) is independent and still supported.

Automatic Cleanup
-----------------

LHP reconciles monitoring artifacts on every ``lhp generate``. Toggling monitoring, renaming ``pipeline_name``, or switching ``materialized_views`` between populated and empty never leaves stale files.

.. list-table::
   :header-rows: 1
   :widths: 25 75

   * - Artifact
     - Reconciliation
   * - Notebook directory
     - ``monitoring//`` is cleared before each write; an empty ``monitoring/`` is removed.
   * - Job resource
     - Any ``resources/lhp/*.job.yml`` whose header matches the monitoring job comment is removed before write. Renaming ``pipeline_name`` cleans up the prior file.
   * - Pipeline directory
     - When monitoring is disabled, LHP scans ``generated//*`` for ``monitoring.py`` files marked ``FLOWGROUP_ID = "monitoring"`` and removes the directory.

Validation Errors
-----------------

.. list-table::
   :header-rows: 1
   :widths: 22 78

   * - Code
     - Trigger
   * - ``LHP-CFG-006``
     - ``event_log`` is not a YAML mapping, or fails to parse.
   * - ``LHP-CFG-007``
     - ``event_log.enabled`` is ``true`` but ``catalog`` or ``schema`` is missing.
   * - ``LHP-CFG-008``
     - ``monitoring`` is not a mapping; ``monitoring`` enabled without ``event_log``; ``checkpoint_path`` or ``job_config_path`` missing; ``job_config_path`` file not found (when path has no unresolved tokens); ``materialized_views`` is not a list or contains a non-mapping entry; a materialized view is missing ``name``; duplicate view names; both ``sql`` and ``sql_path`` set on one view.
   * - ``LHP-VAL-010``
     - Both ``__eventlog_monitoring`` and the resolved monitoring pipeline name appear in ``pipeline_config.yaml``.
   * - ``LHP-VAL-011``
     - ``__eventlog_monitoring`` appears inside a pipeline list.
   * - ``LHP-IO-001``
     - At post-substitution time, the resolved ``job_config_path`` does not point to a readable file.

For the full error catalog, see :doc:`errors_reference`.

See also
--------

* :doc:`enable_monitoring`: how-to walk-through for turning monitoring on, with the workflow-job configuration steps and pre-V0.8.2 migration notes.
* :doc:`operational_metadata`: reference for project-level operational metadata columns that complement event log monitoring.
* :doc:`architecture`: explanation of the LHP generation model, including how the orchestrator finalizes monitoring artifacts.
* :doc:`bundle_config_reference`: bundle integration and ``pipeline_config.yaml`` schema.
* :doc:`errors_reference`: full error code reference.