Changelog¶
All notable changes to Lakehouse Plumber are documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
[0.8.7] — 2026-05-21¶
Breaking changes¶
lhp generate now requires --pipeline-config / -pc when bundle support is enabled (databricks.yml present, --no-bundle not set). Previously, generation would proceed and fail late during bundle sync. New error code: LHP-CFG-023. Users who relied on a default-empty pipeline_config must now supply a path explicitly. See docs/configure_catalog_schema.rst.
resources/lhp/ is now wiped and regenerated on every lhp generate. The directory is exclusively LHP-managed. Users who placed custom resource YAMLs (hand-written jobs, dashboards, secret scopes) under resources/lhp/ must move them to resources/ (top level) or a non-lhp subdirectory before upgrading, or those files will be deleted on the next generate. Files outside resources/lhp/ are never touched, with one exception: the monitoring job YAML at resources/<name>.job.yml is identified by its sentinel header (# Generated by LakehousePlumber - Monitoring Job) and replaced on each run.
catalog and schema are now REQUIRED in pipeline_config.yaml: set them per-pipeline or via the top-level project_defaults block. Missing or incomplete catalog/schema causes lhp generate to fail fast with LHPConfigError (LHP-CFG-026), aggregated across all pipelines plus the synthetic monitoring pipeline (when monitoring is enabled). Programmatic consumers read LHPConfigError.context["failures"] (grouped by both_missing, incomplete, empty_after_substitution) instead of parsing message text. Previously LHP auto-populated default_pipeline_catalog / default_pipeline_schema in databricks.yml and the pipeline-resource template fell back to ${var.default_pipeline_*}. Both halves of that pathway are removed. See docs/configure_catalog_schema.rst for the migration guide.
Smart generation removed. Every lhp generate now full-regenerates all flowgroups (equivalent to the old --force --no-state behaviour); there is no incremental mode. The lhp state subcommand, the --no-cleanup flag, and the .lhp_state.json / .lhp_state/ files are all removed. Any leftover state files are auto-cleaned on the first run of this version.
PythonFileCopier.apply_copy_record no longer mutates state. It returns a list of CopiedFileEntry records and the caller invokes track_generated_file separately. In-tree callers route through PipelineProcessor._apply_copy_records automatically.
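A minimal sketch of how a programmatic consumer might read the grouped failures instead of parsing message text. The LHPConfigError class below is a stand-in defined only for this illustration; the real constructor and the exact shape of each failure entry (here assumed to be a list of pipeline names per category) are assumptions, not LHP's documented API.

```python
class LHPConfigError(Exception):
    """Stand-in for LHP's error class; only the `.context` dict is modelled."""

    def __init__(self, message, context=None):
        super().__init__(message)
        self.context = context or {}


def summarize_failures(err):
    """Return {category: sorted pipeline names}, skipping empty categories."""
    failures = err.context.get("failures", {})
    return {cat: sorted(names) for cat, names in failures.items() if names}


# Hypothetical payload: the category names come from the changelog entry,
# the list-of-names shape is assumed for illustration.
err = LHPConfigError(
    "LHP-CFG-026",
    context={
        "failures": {
            "both_missing": ["bronze"],
            "incomplete": ["silver", "gold"],
            "empty_after_substitution": [],
        }
    },
)
summary = summarize_failures(err)
```

Reading the structured context keeps callers independent of the wording of the error message.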
Added¶
Pre-flight catalog/schema validation (lhp.bundle.preflight): every pipeline (plus the synthetic monitoring pipeline if monitoring is enabled) is validated for catalog and schema resolution BEFORE any side effects. Failures are aggregated across all pipelines and grouped by failure type (both_missing / incomplete / empty_after_substitution), surfaced via LHP-CFG-026 with structured context["failures"]. Preflight runs identically under --dry-run. Typical fail-fast time is under 1 second with zero filesystem changes; the most common cause of an empty resources/lhp/ (catalog/schema misconfig) can no longer happen.
Per-pipeline parallel generation. One worker process per pipeline via ProcessPoolExecutor (spawn start method). Phase A (parse, codegen, format) and the intra-pipeline portion of Phase B (cross-flowgroup validation, .py writes, copied-module application, test-reporting hook) all run inside the worker, eliminating the main-thread GIL-starvation hot path that bottlenecked large projects.
Worker count auto-detection at ~80% of OS-visible CPU, capped to the workload size. Override with the LHP_MAX_WORKERS env var or the --max-workers CLI flag on lhp generate / lhp validate. Use --max-workers 1 for sequential execution.
LHPError.from_worker_exception + lhp_error_from_worker_failure factories: reconstruct worker-side exceptions on the main thread while preserving the original exception type via dual-inheritance subclasses (LHPValidationError(LHPError, ValueError), LHPFileError(LHPError, FileNotFoundError)). Existing except ValueError / except FileNotFoundError handlers continue to catch worker failures.
docs/configure_catalog_schema.rst: migration guide for the new mandatory catalog/schema rule, with worked examples for per-pipeline config, project_defaults, resolution order, the migration from the removed databricks.yml variables, and an error reference covering LHP-CFG-023 and LHP-CFG-026.
Release-time performance gate (pytest tests/performance/ -m performance): opt-in via LHP_RUN_PERFORMANCE_GATE=1. Captures baselines per release; excluded from default CI to keep PR pipelines fast. Comprehensive unit tests for PerformanceTimer and its snapshot accessor.
Picklable test fakes (tests/fakes/, tests/test_fakes_picklable.py) for worker-boundary tests that need to cross the spawn boundary.
E2E coverage expansion: state CLI semantics, test actions, temp-table transformations, JDBC load, Python load, built-in sinks, Delta CDC reader, generate flags, negative paths.
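Two of the entries above can be sketched in a few lines. The worker-count helper is only an illustration of "about 80% of visible CPUs, capped to the workload": the function name, the rounding (floor) and the minimum of one worker are assumptions. The exception classes are stand-ins with the same base-class layout the changelog describes, showing why existing except ValueError handlers keep working.

```python
import math
import os


def pick_worker_count(pipeline_count, cpu_count=None, override=None):
    """Roughly 80% of visible CPUs, capped to the number of pipelines."""
    if override is not None:
        # An explicit override (flag or env var) wins; never below 1.
        return max(1, override)
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return max(1, min(pipeline_count, math.floor(cpus * 0.8)))


class LHPError(Exception):
    """Stand-in base class for this sketch."""


class LHPValidationError(LHPError, ValueError):
    """Dual inheritance: an LHPError that is also a ValueError."""


def caught_as_value_error():
    """Raise the dual-inheritance error and catch it as a plain ValueError."""
    try:
        raise LHPValidationError("bad config")
    except ValueError as exc:
        return type(exc).__name__
```

With these definitions, ten CPUs and one hundred pipelines yield eight workers, while three pipelines on sixteen CPUs yield three.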
Changed¶
Catalog/schema validation moved from the bundle-write phase to a dedicated preflight stage. Late-bound validation inside BundleManager.generate_resource_file_content is replaced by a single internal-error guard (LHP-GEN-001) that fires only when preflight is bypassed by a non-CLI caller (programming bug). The three separate raises that this method used to emit are gone.
LHP-CFG-026 is now the aggregated form: a single error with all failures grouped by category. Programmatic consumers should read LHPConfigError.context["failures"] instead of parsing message text.
PipelineDelta moved from lhp.core.state_models to lhp.models.processing (lost its files_skipped field).
build_lhp_source_header moved from lhp.utils.smart_file_writer to lhp.utils.file_header; adds a normalize_content helper used by every bundle/pipeline write site.
ChecksumCache no longer holds a threading.Lock (main-thread-only usage now that workers are process-isolated).
The blueprint provenance map (Dict[Tuple[str, str], BlueprintProvenance]) now crosses the spawn boundary into workers; synthetic-flowgroup detection and blueprint-aware dependency resolution work the same in the worker as on the main thread.
Deprecated¶
--force and --no-state flags on lhp generate: accepted with a deprecation warning; will be removed in a future release. Both are no-ops since full regeneration is the default. Remove them from CI invocations.
Fixed¶
Generating without --pipeline-config no longer wipes resources/lhp/ and generated/<env>/ before failing. A subsequent databricks bundle deploy will no longer find an empty resources/lhp/ from a misconfigured run.
Init template (pipeline_config_env.yaml.tmpl) no longer claims LHP auto-loads templates/bundle/pipeline_config.yaml; that pathway was removed in this release. The comment now correctly documents the --pipeline-config requirement.
Removed¶
Smart-generation subsystem entirely: lhp state CLI subcommand, --no-cleanup flag, .lhp_state.json / .lhp_state/ files (auto-deleted on first run of new version), and the modules lhp.core.state_manager, lhp.core.state_dependency_resolver, lhp.core.state.*, lhp.utils.smart_file_writer, lhp.services.state_display_*, lhp.core.strategies, lhp.core.services.generation_planning_service, lhp.cli.commands.state_command.
DatabricksYAMLManager module (and the runtime dependency on ruamel.yaml that it was the sole consumer of, where applicable).
BundleManager._update_databricks_variables method, plus the auto-population of default_pipeline_catalog / default_pipeline_schema in databricks.yml targets.<env>.variables.
Bundle-resource preservation decision tree (Scenarios 1a / 1b / 2 / 4 in BundleManager._sync_pipeline_resource, the Conservative Approach for LHP-vs-user file preservation); orphan cleanup of resources/lhp/; LHP-vs-user header sentinel inside resources/lhp/.
Deprecated ${var.default_pipeline_catalog} / ${var.default_pipeline_schema} fallback in pipeline_resource.yml.j2.
variables: block in the init template’s databricks.yml.j2 (default_pipeline_catalog / default_pipeline_schema were the only entries).
ActionOrchestrator._sync_bundle_resources (orphan method with no production caller) and its 8 test references.
Orchestrator main-thread machinery: _PipelineProgress, _assemble_pipeline, _assemble_pipeline_outputs (worker-internal now).
DependencyTracker.set_checksum_cache (the cache is now provided at factory-construction time via DependencyTracker.for_project(..., checksum_cache=...)).
Outdated benchmark scripts and results (replaced by the structured release-time performance gate above).
Known limitation¶
lhp generate is non-atomic between the resources/lhp/ wipe and the end of bundle sync: if the process is interrupted (OS kill, power loss) after preflight passes but before bundle sync completes, resources/lhp/ may be left partially populated or empty. A subsequent databricks bundle deploy against that state could silently drop pipelines from the workspace. Catalog/schema misconfigs (the most common trigger in pre-preflight versions) can no longer cause this; preflight rejects them before any wipe. Re-running lhp generate to completion restores the directory.
0.8.6 — 2026-05-11¶
Changed¶
custom_datasource and custom_sink generated output format changed from inline-embed to copy-and-import. The user’s PySpark DataSource / DataSink source file is now copied verbatim into a custom_python_functions/ subdirectory beside the generated pipeline file, and the pipeline imports the class by name. Previously the user’s class body (50–250 lines) was inlined into every generated file that registered a custom source/sink. YAML user-facing surface is unchanged. The new generated file is shorter, more diff-able, and consistent with the existing python-transform action pattern.
Added¶
Cloudpickle registration for custom sources/sinks: generated files now emit _lhp_cloudpickle.register_pickle_by_value(custom_python_functions) between the imports block and PIPELINE_ID. This is the one-line fix that makes the import-based pattern work across the local-Spark / executor boundary: PySpark’s vendored cloudpickle is what serializes registered DataSource classes to executors, and only register_pickle_by_value against the vendored copy actually takes effect.
Import name-collision detection in ImportManager.add_import: two from … import … lines that bind the same local name to different modules now raise LHPValidationError (LHP-VAL-021). This catches a class of silent shadowing bugs that affected python load/transform actions and (post-refactor) the new copy-and-import pattern. Existing projects with legitimately conflicting symbol names will see this as an error on the next regenerate; rename one of the conflicting symbols, or alias one of the imports, to resolve.
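The collision rule can be illustrated with a short standalone check. This is a sketch using Python's ast module over a source string, not the ImportManager.add_import implementation; the helper name find_import_collisions is hypothetical.

```python
import ast


def find_import_collisions(source):
    """Return {local_name: sorted modules} for names bound from 2+ modules."""
    bound = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom):
            module = node.module or "."  # bare relative import: from . import x
            for alias in node.names:
                local = alias.asname or alias.name  # the name actually bound
                bound.setdefault(local, set()).add(module)
    return {name: sorted(mods) for name, mods in bound.items() if len(mods) > 1}


colliding = "from a.utils import clean\nfrom b.utils import clean\n"
aliased = "from a.utils import clean\nfrom b.utils import clean as clean_b\n"
```

The aliased variant binds two distinct local names, which is why aliasing one of the imports resolves the error.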
0.8.5 — 2026-04-24¶
Added¶
Job-config pass-through: any top-level key in job_config.yaml that is not one of LHP’s explicitly handled keys (max_concurrent_runs, queue, performance_target, timeout_seconds, tags, email_notifications, webhook_notifications, permissions, schedule, notebook_cluster) is now rendered verbatim into the generated job YAML. Users can use newly-released Databricks Jobs API fields (trigger types, continuous, run_as, git_source, health, parameters, environments, edit_mode, budget_policy_id, …) without waiting for an LHP release that adds explicit support.
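The pass-through rule amounts to partitioning the top-level keys. A minimal sketch, assuming the config has already been loaded into a dict; the function name split_job_config is hypothetical and the handled-key set is copied from the entry above.

```python
HANDLED_KEYS = frozenset({
    "max_concurrent_runs", "queue", "performance_target", "timeout_seconds",
    "tags", "email_notifications", "webhook_notifications", "permissions",
    "schedule", "notebook_cluster",
})


def split_job_config(job_config):
    """Split top-level keys into (explicitly handled, rendered verbatim)."""
    handled = {k: v for k, v in job_config.items() if k in HANDLED_KEYS}
    passthrough = {k: v for k, v in job_config.items() if k not in HANDLED_KEYS}
    return handled, passthrough


handled, passthrough = split_job_config({
    "max_concurrent_runs": 1,
    "continuous": {"pause_status": "UNPAUSED"},
    "budget_policy_id": "abc-123",
})
```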
Changed¶
Python dependency extraction tracks local-variable bindings: patterns like tbl = "cat.sch.t"; spark.read.table(tbl) are now resolved. Reassignments and conditional branches emit the union of possible values. Variables whose value comes from function parameters, function return values, or string concatenation remain unresolvable; for those cases, declare an explicit source: on the action. For Python actions, parser output and explicit source: are unioned. Implemented as a new _TableExtractor(ast.NodeVisitor) with _Scope / _Binding classes in utils/python_parser.py, replacing the prior ast.walk over ast.Call that logged ast.Name arguments as unresolvable.
Jinja2 templates loaded via PackageLoader instead of file-system loaders. Templates ship as package resources and are discovered through importlib.resources, removing the editable-install dependency on a source tree shadow. Applied in utils/template_renderer.py, core/init_template_loader.py, and generators/base_generator.py; job_generator.py retains FileSystemLoader for user-provided bundle template directories, which are not package resources.
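The binding-resolution idea can be shown with a deliberately simplified sketch. Unlike the _TableExtractor described above, this version has no scope tracking: it collects every string literal assigned to a name anywhere in the module and unions them, which is enough to show why reassignments yield several candidates and why function parameters stay unresolved. The function name extract_tables is hypothetical.

```python
import ast


def extract_tables(source):
    """Resolve `.table(<literal or simple variable>)` calls to table names."""
    tree = ast.parse(source)

    # Pass 1: name -> set of string literals ever assigned to it.
    bindings = {}
    for node in ast.walk(tree):
        if (isinstance(node, ast.Assign)
                and isinstance(node.value, ast.Constant)
                and isinstance(node.value.value, str)):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    bindings.setdefault(target.id, set()).add(node.value.value)

    # Pass 2: look at the first positional argument of every `.table(...)` call.
    tables = set()
    for node in ast.walk(tree):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == "table"
                and node.args):
            arg = node.args[0]
            if isinstance(arg, ast.Constant) and isinstance(arg.value, str):
                tables.add(arg.value)
            elif isinstance(arg, ast.Name):
                tables |= bindings.get(arg.id, set())  # unknown name -> nothing
    return sorted(tables)


simple = 'tbl = "cat.sch.t"\ndf = spark.read.table(tbl)\n'
branchy = 'if flag:\n    t = "a.b.c"\nelse:\n    t = "a.b.d"\ndf = spark.read.table(t)\n'
param = "def load(t):\n    return spark.read.table(t)\n"
```

The source strings are only parsed, never executed, so the undefined spark and flag names are harmless here.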
Fixed¶
lhp deps now honors trigger.file_arrival (and any other Databricks Jobs API field) in job_config.yaml. Previously, keys the Jinja template didn’t explicitly handle were silently dropped from the generated orchestration job YAML.
lhp deps extracts source tables from externalized write-target code: materialized views and custom sinks using write_target.sql_path, write_target.sql, write_target.module_path, or write_target.batch_handler previously appeared to have no upstream dependencies, causing gold pipelines and silver-MV flowgroups to be reported as root nodes in the dependency graph. New _iter_sql_bodies / _iter_python_bodies generators in core/services/dependency_analyzer.py walk these externalized bodies the same way they walk inline SQL/Python.
0.8.4 — 2026-04-22¶
Fixed¶
Missing .j2 files in published wheels: the lhp.templates.monitoring package was declared in pyproject.toml but its *.j2 template files were not included by the wheel’s package-data glob. As a result, lhp generate against a project that exercised monitoring would fail with TemplateNotFound: monitoring/union_event_logs.py.j2 on installs from PyPI (editable installs from a checkout were unaffected because the source tree shadowed the missing package data). The package-data inclusion list now covers monitoring templates so wheel and editable installs render the same output.
0.8.3 — 2026-04-21¶
Fixed¶
Cold-run race in the monitoring event-log union notebook: the generated notebook ran N parallel streaming queries that each called .toTable(TARGET_TABLE), so on a cold target table all N threads raced to CREATE the table: one thread won and the remaining N−1 failed with TABLE_OR_VIEW_ALREADY_EXISTS. The notebook now pre-creates the target through an idempotent _ensure_target_exists() prologue that samples the schema from the first readable source event log before the ThreadPoolExecutor block starts. Existing projects must regenerate their monitoring notebook to pick up the prologue.
Changed¶
Dedicated monitoring.job_config_path replaces the __eventlog_monitoring alias: the monitoring workflow job is now configured from a separate, single-document job_config.yaml that describes only the monitoring workflow, rather than from a special-cased alias inside the shared job_config.yaml. ProjectConfigLoader validates that monitoring.job_config_path is set and that the file exists (tokenized paths are deferred to the orchestrator); JobGenerator.generate_monitoring_job now receives a resolved job_config dict from its caller. Users on 0.8.2 with monitoring enabled should add monitoring.job_config_path to lhp.yaml and move the __eventlog_monitoring block from the shared job_config.yaml into the file it points at.
0.8.2 — 2026-04-17¶
Added¶
Multi-CDC fan-in to one streaming table (closes #113): multiple write actions in mode: cdc that share a catalog.schema.table target now combine into one dp.create_streaming_table() plus N dp.create_auto_cdc_flow() calls, each with its own name= and its own per-flow CDC parameters (ignore_null_updates, apply_as_deletes, apply_as_truncates, column_list, except_column_list). This is the CDC counterpart to the standard-mode append-flow fan-in LHP already supported. A new CdcFanInCompatibilityValidator raises LHPConfigError when shared fields (keys, sequence_by, SCD type, track_history_*, partition_columns, table_properties) disagree between fan-in participants, rejects mode-mixing on the same target, and rejects source: [v1, v2] + mode: cdc with guidance to split into one write action per source (this combination silently caused truncations on prior versions).
Per-pipeline event-log monitoring checkpoints (fixes #96): replaces the single-query UNION ALL streaming flowgroup with a pair of artifacts whose checkpoints survive adding or removing monitored pipelines. A notebook (monitoring/{env}/union_event_logs.py) runs N independent streaming queries under a ThreadPoolExecutor with trigger(availableNow=True), each with its own checkpoint at {checkpoint_path}/{pipeline_name}; a separate MV-only DLT flowgroup reads the populated Delta table; a Databricks workflow chains the two via notebook_task → pipeline_task. New required setting: monitoring.checkpoint_path. New tunables: max_concurrent_streams (default 10) and enable_job_monitoring.
test action type in the JSON schema: flowgroup.schema.json now lists "test" in the action type enum (previously accepted only via runtime parsing).
Changed¶
Env-scoped generation context replaces per-file composite checksums: the per-file file_composite_checksum and generation_context on FileState are replaced by a single last_generation_context on ProjectState. The --include-tests flip detection is now a single env-wide comparison instead of O(N) per-file checksum recomputations. A new StalenessCache lets the display phase and per-pipeline filter share a single env-wide staleness scan per run, and include_tests is forwarded to orphan cleanup so test_reporting_* artifacts are reaped when the flag flips to False.
generate and validate now share one flowgroup discovery pass rather than re-scanning at each phase; FlowgroupDiscoverer caches include patterns and the source-path index across calls within a run.
lhp validate --include-tests has parity with lhp generate, filtering test actions out of validation when the flag is absent.
Hardened state persistence: malformed or legacy state files now raise LHPFileError with actionable guidance instead of silently resetting.
Deprecated¶
{token} substitution syntax: docs, init templates, generator templates, and source comments are migrated to ${token}. The deprecation is surfaced through logger.warning (rather than DeprecationWarning) so it reaches end users via normal CLI output. Migrate any {token} usages in YAML to ${token}; the only legitimate non-$ braces syntax remaining is %{local_var} for local variables.
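For bulk migration, a regular expression can rewrite bare {token} references while leaving ${token} and %{local_var} untouched. This is a hedged sketch, not a tool shipped with LHP: the helper name migrate_tokens is hypothetical, and the pattern assumes tokens are plain identifiers. Review the result by hand, since unrelated brace usage in a YAML file could also match.

```python
import re

# A bare {identifier} that is not preceded by $, % or { and not followed by }.
# The extra brace checks keep Jinja-style {{name}} expressions intact.
_BARE_TOKEN = re.compile(r"(?<![$%{])\{([A-Za-z_]\w*)\}(?!\})")


def migrate_tokens(text):
    """Rewrite deprecated {token} references to ${token}."""
    return _BARE_TOKEN.sub(r"${\1}", text)


before = "{catalog}.${schema}.%{local_var}"
after = migrate_tokens(before)
```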
0.8.1 — 2026-04-14¶
Added¶
External test result reporting: new test_reporting block in lhp.yaml generates a per-pipeline _test_reporting_hook.py event hook. The hook uses @dp.on_event_hook to accumulate DQ expectation results from flow_progress events and publishes them at pipeline terminal state via a user-supplied provider module declared by module_path and function_name. Actions gain an optional test_id field for linkage to external test management systems. Generated hooks, provider module copies, and __init__.py are tracked as pipeline artifacts and cleaned up when test_reporting is removed from lhp.yaml. ${token} substitution is applied to provider module copies so secret/config tokens resolve at generate time.
Three built-in test-reporting providers: delta_test_reporter.py (appends results to a pre-existing Delta table), ado_test_reporter.py (publishes to ADO Test Plans via a test_case_mapping config), and ado_test_reporter_inline.py (publishes to ADO where test_id is itself the ADO Test Case ID). Each implements the publish_results(results, config, context, spark) contract with dry_run support and structured logging.
--include-tests flag on lhp validate for test-reporting validation parity with lhp generate.
0.8.0 — 2026-04-12¶
Added¶
source_function parameters for snapshot CDC: declare keyword arguments in source_function and have them bound via functools.partial at generation time. Makes snapshot functions reusable and testable outside LHP without baking substitution tokens into the function body. AST validation enforces keyword-only args (* separator) and rejects unknown parameter names.
PerformanceTimer utility for structured timing instrumentation of generation phases.
Performance testing project (Example_Projects/performance_testing/): synthetic 4000-flowgroup project across 100 per-domain pipelines (20 domains × 5 layers) for realistic stress testing of discovery, staleness analysis, and generation.
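The source_function parameter binding can be pictured as follows. This is a sketch only: the snapshot function, its parameter names and its return value are hypothetical, and the check uses inspect.signature at runtime, whereas the entry above describes LHP validating the function's AST at generation time.

```python
import inspect
from functools import partial


def next_snapshot(latest_version, *, table_name, lookback_days=7):
    """Hypothetical snapshot function; parameters after `*` are keyword-only."""
    return (table_name, lookback_days, latest_version)


def bind_parameters(func, parameters):
    """Bind declared parameters with functools.partial, rejecting unknown names."""
    keyword_only = {
        name
        for name, param in inspect.signature(func).parameters.items()
        if param.kind is inspect.Parameter.KEYWORD_ONLY
    }
    unknown = set(parameters) - keyword_only
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    return partial(func, **parameters)


bound = bind_parameters(next_snapshot, {"table_name": "cat.sch.t"})
result = bound(3)
```

Because the values are bound from outside, the same function can be called directly in a unit test with whatever keyword arguments the test needs.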
Changed¶
Large-project generation is ~8× faster: the find_source_yaml O(N×F) bottleneck is replaced by a lazy source-path index on FlowgroupDiscoverer, reducing roughly 4M filesystem operations to a single-pass index build followed by O(1) lookups (500 s → 64 s for a 2000-flowgroup project). A per-run ChecksumCache with thread-safe locking ensures each file is read and hashed at most once during parallel generation; redundant discovery, hashing, and staleness analysis across phases of generate are eliminated; a single CodeFormatter instance is reused across worker threads (no more repeated pyproject.toml reads); and --force now wipes the output directory directly instead of running orphan detection over a pre-built active_flowgroups set.
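The lazy-index idea can be sketched with a small class that scans the tree once, on first use, and answers later lookups from a dict. The class name SourcePathIndex and the choice of keying by file stem are assumptions made for this illustration; the changelog does not say how FlowgroupDiscoverer keys its index.

```python
import tempfile
from pathlib import Path


class SourcePathIndex:
    """Build a name -> path map on first lookup, then reuse it."""

    def __init__(self, root):
        self._root = Path(root)
        self._index = None
        self.scans = 0  # counts full directory walks, for demonstration

    def find(self, name):
        if self._index is None:
            self.scans += 1
            # One walk of the tree instead of one walk per lookup.
            self._index = {p.stem: p for p in sorted(self._root.rglob("*.yaml"))}
        return self._index.get(name)


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "bronze").mkdir()
    (Path(tmp) / "bronze" / "orders.yaml").write_text("flowgroup: orders\n")
    index = SourcePathIndex(tmp)
    first = index.find("orders")
    second = index.find("orders")
    missing = index.find("customers")
    scan_count = index.scans
```

However many lookups follow, the directory tree is walked once per index instance.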
Fixed¶
CloudFiles source.schema now applies via readStream.schema() before .load() instead of df.schema() after the load (#98). The previous ordering left Auto Loader unable to honor user-supplied schemas in some configurations.
CloudFiles path/file exclusion via pathGlobFilter and metadata-based filtering (#87): Auto Loader pipelines can now exclude directories or individual files.
0.7.8 — 2026-04-08¶
Added¶
catalog / schema namespace format replaces the legacy database: "catalog.schema" field across load sources and write targets. YAML can now declare catalog: my_cat and schema: my_schema as separate keys. A new namespace_normalizer service transparently converts the old database shorthand and emits a deprecation warning, so existing projects keep generating identical output. Closes #100.
source.schema enforcement for CloudFiles loads: the load action’s source.schema is now applied on the DataStreamReader chain before .load() for Auto Loader sources, replacing the previously invalid post-load property access that silently dropped the user-supplied schema. New source_schema_load E2E fixture exercises the path end-to-end.
Mandatory pipeline configuration: every pipeline must now have an explicit pipeline_config.yaml. The implicit fallback that derived catalog/schema from project-level defaults is deprecated and will be removed. Migration: add a pipeline_config.yaml (or per-environment override) for any pipeline that previously relied on the default lookup.
Supply chain security hardening: all GitHub Actions are pinned to immutable SHA hashes; new workflows add pip-audit (SARIF), bandit (SARIF), gitleaks, liccheck license compliance, OpenSSF Scorecard, SLSA provenance generation, and reproducible builds via SOURCE_DATE_EPOCH. Adds .pre-commit-config.yaml, CODEOWNERS, SECURITY.md, Dependabot config, and pinned dev dependencies.
Changed¶
Deterministic depends_on ordering in generated orchestration job YAML. job_generator now sorts the depends_on list so generated job files are byte-identical across runs, eliminating spurious diffs when regenerating bundles.
Data quality templates use withColumns (plural) in a single dict call instead of looped withColumn calls. Generated code is shorter and runs one projection per stage. Fixes #92.
Fixed¶
CloudFiles description fallback now formats the file format string correctly. Previously the fallback emitted the literal string <built-in function format> because the generator referenced Python’s built-in format() instead of the file_format variable.
CloudFiles template dead batch branch removed: the template had an unreachable batch read path that has been cleaned up alongside the schema-placement fix. Closes #98.
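The description bug is easy to reproduce in isolation. The two functions below are hypothetical reductions of the generator code, written only to show the symptom: interpolating the name format picks up Python's built-in function when no local variable of that name exists.

```python
def describe_buggy(file_format):
    # `format` is not the argument; it resolves to the built-in function.
    return f"Load {format} files"


def describe_fixed(file_format):
    return f"Load {file_format} files"


buggy = describe_buggy("json")
fixed = describe_fixed("json")
```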
0.7.7 — 2026-03-17¶
Added¶
Quarantine mode for data quality transforms: new mode: quarantine on data-quality transform actions routes failed rows to a Dead Letter Queue table via foreach_batch_sink + append_flow, with automatic recycling of fixed records through Change Data Feed. Supports both CloudFiles and non-CloudFiles sources. Backed by a new QuarantineConfig model, schema and validator updates, a new data_quality_quarantine.py.j2 template, and get_all_expectations_as_drop() in the DQE parser to coerce expectations into the quarantine pattern.
Delta load options validation: the Delta load generator now validates the options block against the supported flag set and rejects incompatible combinations (for example, mixing readChangeFeed with options that don’t apply to CDF reads). Documentation in docs/actions/load_actions.rst lists the supported options and their constraints.
Fixed¶
Quote escaping in quarantine template: applied the tojson Jinja2 filter to the four remaining interpolation sites in data_quality_quarantine.py.j2 (inverse_filter, failed_rule_data, and two related sites) to prevent SyntaxError when an expectation rule contains double quotes (e.g. status = "active"). The _EXPECTATIONS dict already used tojson; this brings the rest of the template in line.
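Why the filter matters can be shown without Jinja2 at all. The sketch below uses json.dumps as a stand-in for the tojson filter (the filter additionally escapes some HTML-sensitive characters) and compares naive interpolation with escaped interpolation for the rule quoted above. The variable names are illustrative.

```python
import json

rule = 'status = "active"'

# Naive interpolation: the rule's own quotes end the string literal early.
naive_line = f'inverse_filter = "{rule}"'

# Escaped interpolation: json.dumps emits a quoted, escaped literal.
safe_line = f"inverse_filter = {json.dumps(rule)}"


def compiles(source):
    """Return True if `source` is syntactically valid Python."""
    try:
        compile(source, "<generated>", "exec")
        return True
    except SyntaxError:
        return False
```

Compiling the naive line fails, while the escaped line compiles and evaluates back to the original rule text.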
0.7.6 — 2026-03-06¶
Added¶
Declarative event log configuration: new event_log: section in lhp.yaml with catalog, schema, and name_suffix (LHP token substitution supported). LHP injects an event_log block into every generated pipeline bundle resource automatically. Pipelines can override the project-wide setting (event_log: {custom}) or opt out (event_log: false) in pipeline_config.yaml. Closes #82.
Synthetic monitoring pipeline generation: new monitoring: section in lhp.yaml generates a self-contained DLT pipeline that UNIONs every event log table in the project into a single streaming table and emits a default pipeline_run_summary materialized view (status, duration, and row metrics per update). Knobs: pipeline_name, catalog, schema, streaming_table, and materialized_views for custom MV definitions. The __eventlog_monitoring alias is recognised as a reserved pipeline target in pipeline_config.yaml.
enable_job_monitoring: true: when set under monitoring:, LHP generates an additional jobs_stats materialized view via a Python load action that uses the Databricks SDK to correlate pipeline updates with the triggering job runs and enriches the output with both pipeline and job tags. The jobs_stats_loader.py is shipped as a package resource under src/lhp/templates/monitoring/ and loaded via importlib.resources.
Instance pool support in pipeline clusters: instance_pool_id and driver_instance_pool_id are now accepted in pipeline_config.yaml cluster blocks as alternatives to node_type_id. The Jinja2 template conditionally renders the pool fields and omits node_type_id when a pool is configured. Closes #83.
sql_path on materialized view write targets and batch_handler / foreachbatch sink_type are now recognised by the Pydantic WriteTarget model, both JSON schemas, the MV field allowlist, and the MV validator. custom_datasource is also accepted as a load source type. Closes #85.
Self-contained materialized views no longer require a load action. The validator and dependency resolver now exempt MV-only flowgroups that use sql, sql_path, or a CTE-only definition, so a gold MV can read directly from upstream tables without an explicit load.
Changed¶
Documentation overhaul: switched the Sphinx theme to Furo with dark mode and a custom OG image, added SEO meta descriptions to every page, reorganized the landing page (problem statement → value proposition → trimmed example → grouped features), and split the monolithic actions_reference.rst into per-type pages under docs/actions/. Extracted new standalone guides for substitutions, operational metadata, and dynamic templates. Toctrees are grouped into Getting Started, Configuration Guides, Deployment & Operations, and Reference. Added “Best Practices” sections. Closes #27.
Monitoring docs now document jobs_stats as a materialized view (it was previously mis-typed as a streaming table) and include full schema tables for events_summary (16 columns) and jobs_stats (11 columns).
Fixed¶
Materialized view & write target validation stack (13 bugs): synced the validators, Pydantic models, and JSON schemas with what the generators and docs already supported. Removes the undocumented name alias for table across validator and generators, removes the invalid path field from the delta source allowlist, removes a dead transform_fields dict and unreachable dict-source handling from write generators, and fixes an orphaned transform to append (not raise) so accumulated errors like “must have at least one Load action” are preserved. Fixes the malformed Reference_Templates/standard_ingestion.yaml.
0.7.5 — 2026-02-17¶
Added¶
Pipeline-level configuration entries in pipeline_config.yaml: arbitrary Spark/DLT key-value pairs declared under a configuration: block are now rendered alongside the mandatory bundle.sourcePath in generated bundle resource YAML. Validation enforces a dict of string-only values, and bundle.sourcePath is filtered out to prevent duplication.
Pipeline environment dependencies propagation: the environment section in pipeline_config.yaml is now rendered into generated bundle resource YAML, enabling pip package dependencies for DLT pipelines. Closes #74.
lhp deps refactor: consolidated three duplicate source extraction implementations into a shared source_extractor module (extract_action_sources, is_cdc_write_action, extract_cdc_sources); replaced per-job re-analysis with NetworkX graph partitioning that filters the global graph by job membership; switched from nx.find_cycle (one cycle) to nx.simple_cycles (all cycles, capped at 20); and added a circular-dependency guard that skips job-format generation with a warning when cycles are detected while still emitting the other output formats.
Changed¶
lhp init initializes in the current working directory instead of creating a subdirectory; project_name is now used only for template rendering. The --bundle flag was flipped to --no-bundle so bundle (Databricks Asset Bundles) is the default. The generated bundle now includes a bundle_uuid field rendered into databricks.yml, and the .gitignore template adds *.tmpl, .lhp/, and .bundle/ while dropping .vscode/ (kept for IntelliSense schemas). Conflict detection switched from a directory-exists check to an lhp.yaml conflict check, with selective cleanup on failure instead of shutil.rmtree. Migration: existing users invoking lhp init <name> should now run it from inside the intended project directory and pass --no-bundle to opt out of bundle generation.
DependenciesCommand error propagation: removed the error-swallowing try/except from DependenciesCommand.execute() and the IOError wrapping from DependencyOutputManager.save_outputs(). Errors now propagate to the CLI error boundary so failures are surfaced instead of silently downgraded.
Fixed¶
SQL parser CTE name leak that created false cross-CTE dependencies when one CTE referenced another; the parser now scopes CTE names per query and correctly handles subqueries and UNION / INTERSECT / EXCEPT set operations.
0.7.4 — 2026-01-19¶
Added¶
Local variables in flowgroups: new top-level variables: section in flowgroup YAML lets users define reusable values scoped to a single flowgroup, resolved before template parameters and environment substitution. Referenced via %{var_name} to reduce repetition and keep related values close to where they are used. Resolves #58.
Multi-target config for jobs and pipelines: job_name and pipeline keys now accept a list of names in job_config.yaml and pipeline_config.yaml, applying the same configuration block to every entry. Duplicate names and empty lists are rejected with clear errors. Resolves #66.
Python 3.13 support: CI now tests against Python 3.11, 3.12, and 3.13. pyproject.toml dependencies updated to their latest compatible versions.
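The local-variable entry above can be illustrated with a small resolver. This is a sketch under assumptions: the function name resolve_local_variables is hypothetical, variable names are taken to be plain identifiers, and an undefined variable is treated as an error. It shows %{var_name} being replaced while ${token} is left for a later substitution pass.

```python
import re

_LOCAL_VAR = re.compile(r"%\{(\w+)\}")


def resolve_local_variables(text, variables):
    """Replace %{name} with its flowgroup-local value; leave ${token} alone."""

    def replace(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"undefined local variable: {name}")
        return str(variables[name])

    return _LOCAL_VAR.sub(replace, text)


resolved = resolve_local_variables(
    "${catalog}.bronze.%{entity}_raw", {"entity": "orders"}
)
```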
Changed¶
Delta load action: unified options field. YAML now uses a single options: map for Delta sources; the previous reader_options, cdf_enabled, and cdc_options fields are no longer supported and raise an error pointing users to the new structure. Migration: move keys from reader_options: and cdc_options: into options: (for example, cdf_enabled: true becomes options: { readChangeFeed: "true" }).
Minimum Python version raised to 3.11. Python 3.8, 3.9, and 3.10 are no longer supported. Resolves #72.
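The Delta migration can be sketched as a dict transformation over an already-parsed source block. The helper name migrate_delta_source is hypothetical, and copying reader_options and cdc_options keys across unchanged is an assumption; check each key against the supported options list before relying on it.

```python
LEGACY_FIELDS = ("reader_options", "cdc_options", "cdf_enabled")


def migrate_delta_source(source):
    """Fold legacy Delta source fields into a single `options` map."""
    migrated = {k: v for k, v in source.items() if k not in LEGACY_FIELDS}
    options = dict(source.get("options", {}))
    options.update(source.get("reader_options", {}))
    options.update(source.get("cdc_options", {}))
    if source.get("cdf_enabled"):
        # Mapping given in the changelog: cdf_enabled: true -> readChangeFeed: "true"
        options["readChangeFeed"] = "true"
    if options:
        migrated["options"] = options
    return migrated


old_source = {
    "type": "delta",
    "table": "orders",
    "cdf_enabled": True,
    "reader_options": {"ignoreDeletes": "true"},
}
new_source = migrate_delta_source(old_source)
```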
Fixed¶
Load generators (CloudFiles, Delta, Kafka) now validate that the options field is a dictionary and raise a user-friendly error rather than failing later during template rendering.
CodeFormatter now logs the error type, traceback, and the first 500 characters of the offending code when Black formatting fails, replacing silent or opaque failures.
0.7.3 — 2026-01-07¶
Added¶
ForEachBatch sink: new foreachbatch sink type lets users invoke a user-supplied Python batch_handler callable for each micro-batch produced by a streaming flow. Required keys (module_path, batch_handler) are validated up-front by WriteActionValidator, and the referenced module is tracked as a dependency so edits trigger regeneration. Documentation covers configuration, common use cases (REST APIs, external systems, custom merge logic), and best practices. Resolves #18.
Substitution in python_transform action fields: module_path and function_name are now passed through the substitution engine, so values like ${python_modules_root}/cleanup.py and ${env}_clean resolve correctly per environment.
Changed¶
Sink dependency tracking: StateDependencyResolver now records module_path for both foreachbatch and custom sinks, so edits to those handler files participate in incremental regeneration alongside the YAML.
Validators (ConfigValidator, FlowgroupProcessor, PipelineValidator) now re-raise LHPError as-is rather than wrapping it, producing consistent error formatting and preserving the original error context. Secret and flowgroup validation messages now include the detailed validation context.
0.7.2 — 2025-12-19¶
Added¶
Per-source readMode for streaming tables: each source in a streaming-table write action can now declare readMode: stream or readMode: batch independently. The generated code emits spark.readStream or spark.read per source, instead of forcing a single mode on the whole table. Resolves #22.
Parallel flowgroup processing: ActionOrchestrator now processes flowgroups concurrently via a new parallel_processor module, with PythonFileCopier providing thread-safe file copy and conflict detection for the python-action copy step. Single-process behavior is preserved for diagnostics; speed-up is largest on projects with many flowgroups.
External-file dependency extraction from template parameters: StateDependencyResolver heuristically detects file paths (schema files, SQL files, custom python modules) passed as template parameters and tracks them as dependencies, so edits trigger correct regeneration even when the path lives inside a template-expanded value.
Pipeline-generation summary: generation now logs the number of files written vs. skipped by the smart writer, surfacing what the incremental path actually did.
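To make the per-source readMode item concrete, here is a minimal sketch of how a generator might pick the reader expression for each source. The function name, the default mode, and the exact emitted expression are assumptions for illustration; the real templates may render something different.

```python
def render_source_read(table, read_mode="stream"):
    """Return the reader expression for one source of a streaming-table write."""
    if read_mode == "stream":
        return f'spark.readStream.table("{table}")'
    if read_mode == "batch":
        return f'spark.read.table("{table}")'
    raise ValueError(f"readMode must be 'stream' or 'batch', got {read_mode!r}")


# Two sources of the same write action, each with its own mode (hypothetical names).
sources = [
    {"table": "bronze.orders", "readMode": "stream"},
    {"table": "bronze.customers", "readMode": "batch"},
]
rendered = [render_source_read(s["table"], s["readMode"]) for s in sources]
```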
Changed¶
Cross-platform path normalization: StateDependencyResolver, DependencyTracker, StateCleanupService, PythonFileCopier, Action, and CodeGenerator now normalize file paths to forward slashes via a new utils/path_utils.py. State files written on Windows and Linux are now interoperable, and generated python files use relative, environment-independent paths. Resolves #52.
Orchestrator refactor: Orchestrator was restructured into helper functions for source extraction and batch processing. Action validators moved into a dedicated validators/ package, and a new OperationalMetadataService centralizes operational-metadata handling (previously duplicated across generators). generate_pipeline() was removed in favor of the unified generate_pipeline_by_field() path.
Operational metadata import management: BaseActionGenerator now consolidates metadata retrieval and import detection into a single get_metadata_and_imports() call, and registers expressions with ImportManager for semantic tracking. Imports declared by metadata expressions stay consistent with the file’s actual imports.
Package description updated in pyproject.toml, README.md, and LLM.txt from “Lakeflow Declarative Pipelines” to “Lakeflow Spark Declarative Pipelines”, aligning with the upstream Databricks terminology adopted in 0.7.0.
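The forward-slash normalization described above can be approximated with the standard library alone. The helper below is a sketch, not the contents of utils/path_utils.py; its name is hypothetical.

```python
from pathlib import PureWindowsPath


def normalize_path(path):
    """Return the path with forward slashes, whichever separator it was written with."""
    # PureWindowsPath accepts both "\\" and "/" as separators on any host OS.
    return PureWindowsPath(path).as_posix()


windows_style = normalize_path("pipelines\\bronze\\orders.yaml")
posix_style = normalize_path("pipelines/bronze/orders.yaml")
```

Storing the normalized form in state files is what lets a checkout on Linux compare equal to one recorded on Windows.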
Fixed¶
create_table: false is now correctly honored end-to-end (minor flow-only bug).
TableCreationValidator now embeds a complete example configuration in its LHPError, so users can see the exact YAML shape needed to resolve a conflict rather than only the error class.
0.7.1 — 2025-11-26¶
Added¶
Catalog and schema in pipeline_config.yaml: new catalog and schema keys at the pipeline-config level let users set the Unity Catalog target per environment without repeating the value in every flowgroup. Values are validated and embedded in the generated *.pipeline.yml resource.
Multi-job orchestration via job_name: flowgroups can now declare a job_name, grouping them into separate Databricks jobs. lhp deps generates per-job orchestration files plus a master orchestration job; an “all-or-nothing” validation rule prevents partially-tagged pipelines from producing an ambiguous job graph. Resolves #45.
Schema file support (YAML/JSON) in streaming-table and materialized-view writes: table_schema: can now reference an external .yaml, .yml, or .json file in addition to inline DDL/SQL. A new SchemaParser converts the structured definition to DDL at generation time.
External-file schema for cloudFiles.schemaHints: schema hints can now point at an external DDL or SQL file rather than being inlined. The referenced files are tracked by StateDependencyResolver so edits trigger regeneration.
schema_transform action: external files and strict/permissive modes. The schema transform was reworked to support external schema files, and its strict/permissive mode handling was fixed. Resolves #23.
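The “all-or-nothing” job_name rule can be illustrated with a short check over parsed flowgroups. This sketch applies the rule to whatever list it is given; the scope LHP actually uses (project-wide or per pipeline), the field names in the dictionaries, and the error type are assumptions.

```python
def group_by_job(flowgroups):
    """Group flowgroup names by job_name, rejecting a partially tagged set."""
    tagged = [fg["flowgroup"] for fg in flowgroups if fg.get("job_name")]
    untagged = [fg["flowgroup"] for fg in flowgroups if not fg.get("job_name")]
    if tagged and untagged:
        raise ValueError(
            "job_name must be set on every flowgroup or on none; "
            f"missing on: {', '.join(untagged)}"
        )
    jobs = {}
    for fg in flowgroups:
        if fg.get("job_name"):
            jobs.setdefault(fg["job_name"], []).append(fg["flowgroup"])
    return jobs


# Hypothetical flowgroups, all tagged, so grouping succeeds.
jobs = group_by_job([
    {"flowgroup": "orders_bronze", "job_name": "ingest"},
    {"flowgroup": "orders_silver", "job_name": "refine"},
    {"flowgroup": "customers_bronze", "job_name": "ingest"},
])
```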
Changed¶
schema field renamed to table_schema in write-action configuration. schema remains accepted for backward compatibility but is now documented as legacy; new code and examples should use table_schema. The rename disambiguates the field from PySpark’s schema and from cloud-files schema hints.
Templates renamed to drop characters that broke checkouts on Windows file systems.
Fixed¶
SQLLoadGenerator and SQLTransformGenerator now fall back to Path.cwd() when no project_root is available in the context, fixing failures in projects that invoke the API directly. Resolves #16.
Error handling in LakehousePlumberApplicationFacade and ActionOrchestrator no longer duplicates error details across log lines and re-raises.
0.7.0 — 2025-11-10¶
Changed¶
Generated code migrated from dlt to pyspark.pipelines as dp (Spark Declarative Pipelines API). This is the headline change in 0.7.0: Lakehouse Plumber now emits Lakeflow Spark Declarative Pipelines (SDP) code aligned with the current Databricks API, replacing the legacy DLT decorators across every code path. Specifically:
import dlt → from pyspark import pipelines as dp
@dlt.table → @dp.materialized_view (for materialized views)
@dlt.view → @dp.temporary_view
All dlt.* calls (e.g. dlt.read_stream, dlt.create_streaming_table) → dp.*
The deprecated refresh_schedule parameter on materialized views is no longer emitted.
Import categorization also recognizes pyspark.pipelines as the DLT-equivalent module. Migration: YAML inputs are unchanged; regenerate the project (lhp generate --env <env> --force) to pick up the new decorators. Hand-edited dlt.* code in custom python actions must be updated to dp.* manually. Cluster/runtime must support the pyspark.pipelines API.
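For hand-edited custom python actions, a text-level helper along the following lines could apply the unambiguous renames from the mapping above. It is a sketch only, not a tool LHP provides: it works on plain text rather than a syntax tree, the function name is hypothetical, and it deliberately leaves @dlt.table alone because the right replacement depends on whether the table is a materialized view.

```python
import re

_IMPORT = re.compile(r"^import dlt[ \t]*$", re.MULTILINE)
_VIEW = re.compile(r"@dlt\.view\b")
_OTHER = re.compile(r"\bdlt\.(?!table\b)")   # every dlt.* name except dlt.table
_TABLE = re.compile(r"\bdlt\.table\b")


def migrate_dlt_source(code):
    """Apply the unambiguous dlt -> dp renames; return (new_code, lines_to_review)."""
    code = _IMPORT.sub("from pyspark import pipelines as dp", code)
    code = _VIEW.sub("@dp.temporary_view", code)
    code = _OTHER.sub("dp.", code)
    # Report 1-based line numbers that still use dlt.table for a manual decision.
    review = [number for number, line in enumerate(code.splitlines(), start=1)
              if _TABLE.search(line)]
    return code, review


legacy = (
    "import dlt\n"
    "\n"
    "@dlt.view\n"
    "def v():\n"
    '    return dlt.read_stream("raw")\n'
    "\n"
    "@dlt.table\n"
    "def t():\n"
    '    return dlt.read("v")\n'
)
migrated, review_lines = migrate_dlt_source(legacy)
```

The returned line numbers point at decorators that still need a human choice before the file imports cleanly.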
Added¶
Sink framework for write actions. New modular sink architecture (BaseSink + concrete implementations) lets write actions target destinations beyond Delta tables. Three sink types ship in 0.7.0:
Delta sink: writes to Delta tables with full streaming-table / materialized-view option support.
Kafka sink: streams output to Kafka topics with configurable serialization and partitioning. A dedicated kafka_validator checks broker URLs, topic names, and options up-front.
Custom sink: extensibility hook for user-supplied destinations.
Sinks are configured under a sink: block in the write action. Existing Delta-only write actions continue to work unchanged. Resolves #17.
Unresolved-token validation (LHP-CFG-010): a new validate_no_unresolved_tokens() step runs after substitution and recursively scans rendered config for stray {token} patterns (excluding dbutils.secrets.get references), with detailed context and fix suggestions. Circular substitution references are detected with a 10-iteration cap and surfaced as warnings.
lhp show substitutions command: displays available substitution tokens for an environment, useful for diagnosing rendering issues caught by the new validator. Resolves #42.
Quarantine plumbing in expectations template: the data-quality expectations Jinja template now uses named variables (fail / drop / warn) for expectation lists, in preparation for upcoming quarantine support. Resolves #39.
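The unresolved-token check can be pictured as a recursive walk over the rendered configuration. The sketch below is not the body of validate_no_unresolved_tokens(): the token pattern, the path format in the results, and the way secret references are skipped (any string mentioning dbutils.secrets.get) are assumptions chosen to keep the example short.

```python
import re

# A stray token is assumed to look like {identifier}.
_TOKEN = re.compile(r"\{[A-Za-z_][A-Za-z0-9_]*\}")


def find_unresolved_tokens(config, path=""):
    """Return (path, token) pairs for every stray {token} left in rendered config."""
    found = []
    if isinstance(config, dict):
        for key, value in config.items():
            child = f"{path}.{key}" if path else str(key)
            found.extend(find_unresolved_tokens(value, child))
    elif isinstance(config, list):
        for index, value in enumerate(config):
            found.extend(find_unresolved_tokens(value, f"{path}[{index}]"))
    elif isinstance(config, str) and "dbutils.secrets.get" not in config:
        found.extend((path, token) for token in _TOKEN.findall(config))
    return found


# Hypothetical rendered config with two leftover tokens and one secret reference.
rendered = {
    "source": {
        "path": "/mnt/{landing_zone}/orders",
        "password": 'dbutils.secrets.get(scope="kv", key="{k}")',
    },
    "tags": ["ok", "{team}"],
}
problems = find_unresolved_tokens(rendered)
```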
Fixed¶
Action-validator integration with the new write/sink schema; bundle manager updated to emit sink-aware pipeline resources.
0.6.5 — 2025-10-29¶
Added¶
Kafka load source action: new type: kafka load action generates code for spark.readStream.format("kafka") (or batch spark.read) with full option pass-through. Built on the same generator pattern as CloudFiles, with a dedicated KafkaLoadGenerator, Jinja2 template, action-registry entry, and config validator. Includes optional operational-metadata columns and works with both streaming and batch read modes. Closes #38.
Kafka auth reference templates: shipped reference templates for the two most common managed-Kafka auth patterns (Azure Event Hubs with OAuth and AWS MSK with IAM authentication), plus documentation describing the required options and connection strings.
Fixed¶
Quote/backslash escaping in load templates: cloudfiles, custom_datasource, and jdbc load templates now correctly escape quotes and backslashes in option values. Previously, an option value containing a quote or a backslash (e.g., a regex pattern, a Windows path, or a JSON-encoded secret) could produce syntactically invalid generated Python, or a generated literal whose runtime value differed from the configured one.
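The escaping fix can be illustrated with a helper that turns an option value into a double-quoted Python literal. This is a sketch of the general technique, not the filter the templates use; the function names are hypothetical, and only backslashes and double quotes are handled, as in the changelog entry.

```python
import ast


def python_string_literal(value):
    """Render value as a double-quoted Python literal, escaping \\ and \"."""
    # Backslashes must be doubled first so the quote escapes are not re-escaped.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def render_option(key, value):
    """Render one .option(...) call as it might appear in generated code."""
    return f".option({python_string_literal(key)}, {python_string_literal(value)})"


tricky = 'C:\\data\\"raw"'                    # contains backslashes and quotes
literal = python_string_literal(tricky)
round_tripped = ast.literal_eval(literal)     # parse the literal back into a str
```

Round-tripping through ast.literal_eval is a convenient way to check that the generated literal evaluates to exactly the configured value.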
0.6.4 — 2025-10-29¶
Added¶
Multi-flowgroup YAML files: a single pipeline YAML file can now declare multiple flowgroups, in either multi-document syntax (multiple documents per file separated by ---) or array syntax (a top-level list of flowgroup mappings). Previously, every flowgroup required its own file, which forced large pipelines into deep directory trees. Existing single-flowgroup files continue to parse unchanged. Closes #12, #28.
lhp generate -pc <config-file> regenerates DAB pipeline YAML on --force: when the pipeline-config flag is supplied together with --force, the corresponding Databricks Asset Bundle pipeline YAML files under resources/ are rewritten. Previously, --force only regenerated the Python pipeline code, leaving stale resource YAML on disk.
Anonymous usage telemetry with explicit opt-out: lhp generate now emits aggregated, anonymous usage metrics (flowgroup and template counts, project/machine identifiers hashed) to help prioritize feature work. Two opt-out paths are honored: setting LHP_DISABLE_ANALYTICS=1 in the environment, and running inside pytest. Documentation describes what is collected and how to turn it off.
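Both multi-flowgroup syntaxes reduce to the same flat list once the file is parsed. The sketch below assumes the YAML documents have already been loaded into Python objects (for instance by a multi-document YAML loader) and only shows the flattening step; the function name and the field names in the sample data are hypothetical.

```python
def flatten_flowgroups(documents):
    """Flatten parsed YAML documents from one file into a list of flowgroup mappings."""
    flowgroups = []
    for document in documents:
        if document is None:              # empty document between separators
            continue
        if isinstance(document, list):    # array syntax: top-level list of mappings
            flowgroups.extend(document)
        elif isinstance(document, dict):  # one flowgroup per document
            flowgroups.append(document)
        else:
            raise TypeError(f"unexpected top-level YAML value: {type(document).__name__}")
    return flowgroups


multi_document = [{"flowgroup": "orders"}, None, {"flowgroup": "customers"}]
array_syntax = [[{"flowgroup": "orders"}, {"flowgroup": "customers"}]]
single = [{"flowgroup": "orders"}]
```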
Fixed¶
Malformed YAML for cluster config in pipeline-config flow: custom cluster blocks supplied via lhp generate -pc no longer produce syntactically invalid DAB pipeline YAML. Closes #37.
0.6.3 — 2025-10-28¶
Fixed¶
Template-level presets are now applied (#34): templates could declare a presets: list, but the field was missing from the Template model and the value was silently dropped, so none of the listed presets were actually applied to generated actions. The Template model now carries presets, and FlowgroupProcessor applies them after template expansion with the documented precedence: flowgroup-level presets override template-level presets. Referencing a preset that does not exist now raises ValueError instead of failing silently. Projects that previously relied on the (broken) silent-skip behavior will see a hard error on regenerate; remove the bogus reference or create the missing preset file (e.g., bronze_layer.yaml) to resolve.
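The precedence rule and the new hard error can be sketched in a few lines. This example uses a shallow dictionary merge and hypothetical preset contents purely for illustration; how FlowgroupProcessor actually merges preset values is not described in this changelog.

```python
def apply_presets(template_presets, flowgroup_presets, available):
    """Merge preset values so that flowgroup-level presets win over template-level ones."""
    merged = {}
    # Template presets are applied first, so later flowgroup presets override them.
    for name in [*template_presets, *flowgroup_presets]:
        if name not in available:
            raise ValueError(f"preset not found: {name}")
        merged.update(available[name])
    return merged


# Hypothetical preset definitions keyed by preset name.
available = {
    "bronze_layer": {"table_properties": "bronze", "quality": "raw"},
    "team_defaults": {"quality": "checked"},
}
settings = apply_presets(["bronze_layer"], ["team_defaults"], available)
```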
0.6.2 — 2025-10-28¶
Added¶
Customizable DLT pipeline configuration via YAML (-pc/--pipeline-config): a new pipeline-config YAML file lets projects define DLT pipeline-level defaults (serverless, clusters, notifications, channel, edition, photon, configuration map, …) and per-pipeline overrides. The BundleManager loads this file and merges it into the generated DAB pipeline resource YAML, removing the need to hand-edit generated resources/ files after every regenerate. Closes #13, #14, #31; resolves #29.
Customizable orchestration job configuration: a complementary job-config file lets users set max_concurrent_runs, notifications, schedule/trigger, tags, and related Databricks Jobs API fields applied by lhp deps when generating the orchestration job YAML. Fixes #15, #21.
Bundle-mode job output to resources/: when bundle output is enabled, generated job YAML files are written to the resources/ directory so they are picked up by databricks bundle deploy without additional wiring. Fixes #26.
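Defaults plus per-pipeline overrides usually come down to a recursive dictionary merge. The sketch below shows one common way to do it; whether BundleManager merges nested maps key by key or replaces them wholesale is an assumption here, as are the sample keys and values.

```python
def deep_merge(base, override):
    """Return base updated with override, merging nested dictionaries key by key."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical project-wide defaults and one pipeline's overrides.
defaults = {"serverless": True, "channel": "CURRENT", "configuration": {"a": "1"}}
overrides = {"serverless": False, "configuration": {"b": "2"}}
pipeline_settings = deep_merge(defaults, overrides)
```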
Fixed¶
Python transform action: import resolution bug affecting generated pipeline files has been corrected.
0.6.1 — 2025-10-01¶
Fixed¶
Dependency detection bug: a minor incorrect-edge issue in the lhp deps graph builder (introduced alongside the v0.6.0 dependency feature) has been corrected. Graphs now match the intended source-to-target relationships.
0.6.0 — 2025-10-01¶
Added¶
Pipeline dependency analysis (lhp deps): new subcommand that walks every flowgroup, extracts source-table references from both SQL and Python action bodies, and produces a project-wide dependency graph. Output is available in multiple formats: dot (Graphviz), json, text, and job (a generated Databricks Jobs YAML that runs upstream pipelines before downstream ones), plus all, which emits every format. Backed by a new DependencyOutputManager, a PythonParser that recognizes spark.sql(...), spark.read.table(...), and related patterns, and a SQLParser that handles joins, CTEs, and quoted / multi-part identifiers. This is the foundation lhp deps continues to build on in subsequent releases.
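As a rough picture of what the dependency analysis does, the sketch below pulls table names out of Python source with a regular expression and renders edges in Graphviz DOT form. It is far simpler than the PythonParser and DependencyOutputManager named above: it only recognizes literal spark.read.table / spark.readStream.table calls, and all names in it are hypothetical.

```python
import re

# Literal table reads only; spark.sql(...) bodies would need a SQL parser.
_TABLE_READ = re.compile(r"""spark\.read(?:Stream)?\.table\(\s*["']([^"']+)["']\s*\)""")


def extract_sources(python_code):
    """Return the sorted, de-duplicated table names read by a Python action body."""
    return sorted(set(_TABLE_READ.findall(python_code)))


def to_dot(edges):
    """Render (source, target) pairs as a Graphviz digraph."""
    lines = ["digraph dependencies {"]
    for source, target in sorted(edges):
        lines.append(f'  "{source}" -> "{target}";')
    lines.append("}")
    return "\n".join(lines)


code = (
    'df = spark.read.table("bronze.orders")'
    ".join(spark.readStream.table('bronze.customers'), \"id\")"
)
sources = extract_sources(code)
dot = to_dot((source, "silver.orders") for source in sources)
```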
0.5.9 — 2025-09-15¶
Changed¶
Architectural refactor: monolithic classes broken down into single-responsibility services. ActionOrchestrator (~1300 lines), StateManager (~1300 lines), and the lhp CLI entrypoint (~1500 lines) were decomposed into focused service modules. New modules include core/services/ (code_generator, flowgroup_discoverer, flowgroup_processor, pipeline_validator, generation_planning_service), core/state/ (dependency_tracker, state_analyzer, state_persistence, state_cleanup_service), core/commands.py, core/factories.py, core/layers.py, core/strategies.py, and utils/template_renderer.py / utils/yaml_loader.py. Public APIs were promoted from previously private methods so the orchestration pipeline is composable and testable. User-facing CLI behavior is unchanged.
CLI restructured into per-command modules under src/lhp/cli/commands/. Each command (generate, validate, init, show, state, stats, plus list_*) now lives in its own file behind a shared base_command.py. The formerly 1500-line main.py is now a thin dispatcher.
Removed¶
src/lhp/bundle/yaml_processor.py removed. The legacy YAML processor module and its tests were deleted; bundle YAML modifications now go through bundle/databricks_yaml_manager.py (ruamel.yaml-backed) and the new template-renderer service. No user-facing YAML syntax change.
LegacyGenerateCommand and its 1300-line test suite removed. Generation now flows exclusively through the new GenerateCommand implementation introduced earlier in the 0.5.x series.
Fixed¶
Deterministic ordering in BundleManager. Iterating over pipelines for bundle resource sync is now sorted, eliminating spurious diffs between consecutive lhp generate runs on different platforms.
Added¶
End-to-end integration test fixture project under tests/e2e/ (testing_project/) covering bronze/silver/gold pipelines, test actions, custom Python functions, and a multi-environment substitution setup. Used by the new E2E suite to catch generation regressions across the whole orchestrator-to-bundle path.
CI: JUnit XML reports and Codecov integration. Test runs in GitHub Actions now publish JUnit XML and upload coverage to Codecov.
0.5.2 — 2025-09-02¶
Added¶
Automatic databricks.yml variable management. lhp generate now scans generated Python files for the first catalog.schema pattern, locates the matching variable names in the current environment’s substitutions/<env>.yaml, and writes the resolved per-environment values into the variables: block of every matching target in databricks.yml. Two variables are populated: default_pipeline_catalog and default_pipeline_schema. This removes the need to hand-edit databricks.yml after changing substitution values.
ruamel.yaml-backed DatabricksYAMLManager. The new src/lhp/bundle/databricks_yaml_manager.py is the only place in LHP that uses ruamel.yaml; it preserves comments, quoting, and key order when updating databricks.yml variables. Other YAML operations continue to use PyYAML.
New runtime dependency: ruamel.yaml>=0.17.0.
Changed¶
Bundle pipeline-resource template now emits static variable references (${var.default_pipeline_catalog} / ${var.default_pipeline_schema}) instead of inlining catalog/schema values extracted from generated Python. Catalog/schema selection is now driven entirely by databricks.yml variables, which LHP populates from substitutions.
resources/lhp/ is now flat: per-environment subdirectories (resources/lhp/dev/, resources/lhp/prod/, …) were removed; resource files live directly under resources/lhp/. The bundle template references ${workspace.file_path}/generated/${bundle.target}/<pipeline>/** so each environment still gets its own deployed artifact set. Existing projects regenerating with lhp generate will see resource YAML files relocate; commit the move.
lhp init --bundle scaffolding refreshed. Generated databricks.yml now defines top-level variables: for default_pipeline_catalog / default_pipeline_schema, adds a tst target between dev and prod, and switches prod/tst to service-principal run_as and permission stubs.
Pre-flight validation of databricks.yml targets. Generation now fails fast with MissingDatabricksTargetError if any substitution environment lacks a matching targets.<env> block in databricks.yml.
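The pre-flight target check amounts to a set difference between substitution environments and the targets defined in databricks.yml. In the sketch below the exception class is a local stand-in for MissingDatabricksTargetError, and the function name and input shapes are assumptions; the databricks.yml content is represented as an already-parsed dictionary.

```python
class MissingTargetError(Exception):
    """Local stand-in for LHP's MissingDatabricksTargetError (illustration only)."""


def check_targets(environments, databricks_config):
    """Raise if any substitution environment has no targets.<env> block."""
    targets = databricks_config.get("targets") or {}
    missing = sorted(set(environments) - set(targets))
    if missing:
        raise MissingTargetError(
            "databricks.yml has no target for: " + ", ".join(missing)
        )


# Hypothetical parsed databricks.yml with dev and prod but no tst target.
config = {"targets": {"dev": {}, "prod": {}}}
check_targets(["dev", "prod"], config)        # passes silently
try:
    check_targets(["dev", "tst", "prod"], config)
    error_message = None
except MissingTargetError as exc:
    error_message = str(exc)
```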
0.5.1 — 2025-09-01¶
Added¶
README link to the ReadTheDocs documentation site.
black>=23.0.0 is now a runtime dependency (previously only a dev-time requirement); generated code formatting works in environments that install LHP without dev extras.
0.5.0 — 2025-08-29¶
Added¶
Test actions. A new top-level ActionType.TEST plus nine test_type values: row_count, uniqueness, referential_integrity, completeness, range, schema_match, all_lookups_found, custom_sql, custom_expectations. Each test is emitted as a DLT expectation with a configurable on_violation of fail, warn, or drop. Tests are validated by a new TestActionValidator and generated by TestActionGenerator (src/lhp/generators/test/). See docs/test_actions.rst.
lhp generate --include-tests flag. Test actions are skipped by default for faster CI builds; pass --include-tests (or run a test-only environment) to emit them. Flowgroups that contain only tests produce no Python file when --include-tests is not set.
Per-environment generated output. lhp generate --env <env> now writes to generated/<env>/<pipeline>/ by default instead of a single generated/ tree, so dev/tst/prod artifacts coexist without overwriting each other. Override with --output.
required_lhp_version in lhp.yaml. Projects can pin the framework to a PEP 440 specifier (==, ~=, >=, <). lhp generate and lhp validate fail with a clear LHP-CFG-007/008 error when the installed version is out of range. Bypass with LHP_IGNORE_VERSION=1 (intended for emergencies, not production).
Per-environment bundle resources. Bundle resource files are written under resources/lhp/<env>/ so each target deploys its own resource set; the pipeline template emits libraries.glob and root_path paths scoped to the environment.
CI/CD reference documentation (docs/cicd_reference.rst, ~2000 lines) covering GitHub Actions, Azure DevOps, and Bitbucket workflows for Asset-Bundle deployments. A sample lakehouse-cicd.yml workflow ships in the ACME example project.
New runtime dependency: packaging>=23.2 (used for version-specifier checking).
Changed¶
lhp generate --format flag removed. Black formatting is now always applied to generated code; the redundant opt-in flag was dropped. Behavior is equivalent to the previous behavior with --format enabled, so no migration is needed beyond removing the flag from scripts.
lhp generate warns when the requested --env has no matching target in databricks.yml. Generation continues, but the warning surfaces the missing target before deploy time.
Empty flowgroups are now skipped silently. A flowgroup whose actions all evaluate to no-ops (typically a tests-only flowgroup without --include-tests) no longer produces an empty .py file or a state entry.
Earlier Releases (v0.2.6-alpha – v0.4.1) — 2025-07-10 through 2025-08-18¶
Fourteen tagged releases (V0.2.6-alpha through v0.4.1) spanning
roughly five weeks of early development. These predate the v0.5.0
test-actions / per-environment-output milestone and were primarily
rapid internal iteration on the core surface area. Pull requests were
not yet routine; nearly all changes landed as direct pushes. What
follows is a condensation, not a per-version diary.
Added¶
Initial PyPI release. V0.2.6-alpha (2025-07-10) was the first tagged release on PyPI; the CI publish workflow and PyPI version-check guard landed shortly after, followed by ReadTheDocs and GitHub Pages publishing.
Databricks Asset Bundles integration (v0.3.1, PR #6, 2025-07-21). Introduced src/lhp/bundle/ (manager, template fetcher, YAML processor, exceptions, bundle-detection), the lhp init --bundle flag, automatic resource-YAML synchronization under resources/lhp/, and the databricks.yml.tmpl scaffold. The flag became the default in v0.7.5.
State tracking and staleness detection (v0.2.12–v0.2.13). Introduced .lhp_state.json, the lhp state command, the state-display service, composite-checksum staleness logic, and dependency discovery. These were the basis of incremental regeneration until smart generation was removed in 0.8.7.
create_table field and append-flow API for streaming-table writes, with orchestrator-level cross-flowgroup validation (V0.2.6-alpha).
VS Code IntelliSense for LHP YAML via lhp setup-intellisense, with JSON schemas and editor configuration (v0.2.7+).
Include-pattern filtering in lhp.yaml to scope which YAML files are processed (v0.2.7).
Pipeline-field-based flowgroup discovery: the orchestrator now groups flowgroups by their pipeline: field rather than directory layout, allowing one directory tree to contribute to many pipelines (v0.2.7).
Table tags on streaming tables and materialized views (v0.3.3).
Temp-table transform action (v0.2.15).
Operational metadata integration across load and transform generators, with template-level column-sort logic.
Custom PySpark DataSource as a load action (v0.4.0, 2025-08-04). New custom_datasource generator and import_manager utility for detecting and rewriting user imports, the foundation of the custom_datasource / custom_sink surface area today.
Multi-platform, multi-version CI: Linux, macOS, Windows across Python 3.8–3.12, with forward-compatible type-annotation refactors and Windows-specific logging fixes (v0.3.6).
lhp init Jinja2 template system (v0.3.4). The scaffolded project layout (pipelines/, presets/, substitutions/dev|tst|prod.yaml, expectations/, schemas/, templates/, bundle/) was introduced here as *.j2 / *.tmpl assets in src/lhp/templates/init/.
Removed¶
The legacy src/lhp/notebook/ module (deployment.py, interface.py, widgets.py, ~1,700 LOC) was deleted in v0.3.4 and replaced with the Jinja2 init-template loader. Projects scaffolded prior to v0.3.4 used a notebook-based deployment flow that no longer exists.
For per-commit detail across this era, see git log V0.2.6-alpha..v0.4.1.