Stark Informatics
Foundation accelerator

Medallion Lakehouse Starter

A production-shaped Bronze / Silver / Gold lakehouse for Microsoft Fabric — deployed in your tenant in under a day, ready to extend with your own data.

Stack: Fabric + OneLake
Language: PySpark + T-SQL
Deploy time: < 1 day
Updates: 12 months
License: Commercial

The skeleton we use to start nearly every Fabric engagement, packaged so your team can use it without us. Three Lakehouses laid out for Bronze / Silver / Gold, a metadata-driven Spark notebook framework that handles 80% of pipelines you'll ever build, deployment pipelines wired for dev/test/prod, and a Direct Lake semantic model template ready to plug into Power BI.

Designed so a senior engineer can deploy it, configure their first source, and see data flow Bronze → Silver → Gold → Power BI in their first afternoon.

Who this is for

This is for engineering teams who:

  • Are starting fresh on Microsoft Fabric and don't want to reinvent the medallion pattern from scratch.
  • Already have Fabric capacity provisioned and at least one source they want to land into a lakehouse.
  • Want to ship the first production data product in two weeks, not two quarters.
  • Prefer to learn by working through a battle-tested reference rather than from a blog post.

It's not for teams who need a no-code or self-service tool. This is engineering source — Python notebooks, JSON config, Spark transformations. Bring an engineer.

What's in the box

A complete, source-controlled Fabric deployment package. Open the repo, fill in three config files, run the deployment script.

File | Type | What it does
infrastructure/deploy.ps1 | PS | Idempotent PowerShell script that provisions the three Lakehouses, the workspace, and Git integration via Fabric REST APIs.
infrastructure/workspace_config.json | JSON | Workspace + capacity assignments + role assignments. Edit this file, run deploy, done.
notebooks/00_framework_setup.ipynb | Python | Bootstraps the metadata tables and configuration schema in the Bronze lakehouse.
notebooks/10_ingest_bronze.ipynb | Python | Metadata-driven ingest. Reads a control table; for each source, lands raw data idempotently with file partitioning and arrival metadata.
notebooks/20_transform_silver.ipynb | Python | Bronze → Silver with Delta MERGE patterns, deduplication, type conformance, slowly-changing-dimension Type 1 + Type 2 templates.
notebooks/30_serve_gold.ipynb | Python | Silver → Gold star schema patterns with surrogate keys, late-arriving dimension handling, and built-in row-count + freshness checks.
notebooks/90_optimize.ipynb | Python | Scheduled OPTIMIZE + VACUUM + Z-ORDER with metadata-driven hot-table selection.
pipelines/orchestrator.json | JSON | Fabric pipeline that runs the notebooks in order, with retry, timeout, and failure alerting baked in.
warehouse/control_tables.sql | T-SQL | Schema for the metadata control tables: source_registry, load_history, data_quality_results.
semantic-model/template.bim | TMDL | Direct Lake semantic model template with date dimension, conformed measures, RLS placeholder, BPA-clean.
powerbi/starter-report.pbix | PBIX | Two-page Power BI report wired to the semantic model: a working "first dashboard."
tests/test_bronze.py | Python | pytest suite covering ingest idempotency, schema evolution, and partition correctness.
docs/RUNBOOK.md | MD | The Wednesday-morning operations runbook: alerts, on-call playbook, common failure modes and fixes.
docs/ARCHITECTURE.md | MD | Annotated reference architecture with the design decisions explained.
docs/CAPACITY_SIZING.xlsx | XLSX | Capacity sizing model: input your workload, get an F-SKU recommendation.
Total: 28 files, ~3,800 lines of code, 14 pages of documentation. Tested in production on 12+ client engagements.
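
To make the metadata-driven ingest idea concrete, here is a minimal sketch in plain Python rather than PySpark. It is not the shipped notebook code. The source_registry and load_history names come from the control tables listed above; everything else (the column names source_name and enabled, the fetch callable, the content-hash load key, the ingest_date partition layout, and the _source_name / _ingested_at metadata columns) is an assumption made for illustration.

```python
import hashlib
import json
from datetime import datetime, timezone


def ingest_bronze(source_registry, load_history, fetch, now=None):
    """Land each enabled source once per distinct payload (illustrative only).

    source_registry: list of dicts; 'source_name' and 'enabled' are hypothetical columns
    load_history:    dict keyed by load key, standing in for the load_history table
    fetch:           hypothetical callable(source_name) -> list of row dicts
    """
    now = now or datetime.now(timezone.utc)
    landed = []
    for source in source_registry:
        if not source.get("enabled", True):
            continue
        name = source["source_name"]
        rows = fetch(name)
        # Key the load on source + content hash so rerunning the same payload is a no-op.
        payload = json.dumps(rows, sort_keys=True, default=str)
        digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        load_key = f"{name}:{digest}"
        if load_key in load_history:
            continue
        # Partition by arrival date and stamp every row with arrival metadata.
        partition = f"{name}/ingest_date={now:%Y-%m-%d}"
        stamped = [
            dict(row, _source_name=name, _ingested_at=now.isoformat())
            for row in rows
        ]
        load_history[load_key] = {"partition": partition, "row_count": len(stamped)}
        landed.append((partition, stamped))
    return landed


registry = [
    {"source_name": "orders", "enabled": True},
    {"source_name": "legacy_crm", "enabled": False},
]
history = {}
first_run = ingest_bronze(registry, history, lambda name: [{"id": 1}])
second_run = ingest_bronze(registry, history, lambda name: [{"id": 1}])
```

In this sketch the second run lands nothing because the load key is already recorded, which is the behaviour an idempotency test would check. A real implementation would more likely key on a file path or CDC watermark than on a hash of the full payload.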

How the architecture works

Three lakehouses, one workspace per environment, one orchestrator pipeline, one semantic model.

[Architecture diagram. Sources: Azure SQL (CDC), files (S3/ADLS), SaaS APIs. An orchestrator pipeline runs Bronze → Silver → Gold → semantic model. Bronze: raw, append-only. Silver: cleansed, typed. Gold: star schema. Power BI reads the semantic model via Direct Lake. All three lakehouses and the semantic model live in OneLake as open Delta-Parquet, Purview-governed, with Capacity Metrics and chargeback by workspace.]
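
The Bronze → Silver step (append-only raw rows in, cleansed and typed rows out) can be sketched in a few lines. This is a plain-Python stand-in for what a Delta MERGE with deduplication would do in Spark, not the accelerator's actual notebook. The function name to_silver, the column names, and the "latest arrival wins" rule are assumptions chosen for the example.

```python
from decimal import Decimal


def to_silver(bronze_rows, key, order_by, casts):
    """Keep the latest row per business key, then cast columns to target types.

    Illustrative stand-in for a Spark/Delta MERGE; all names are hypothetical.
    """
    latest = {}
    for row in bronze_rows:
        current = latest.get(row[key])
        # The later arrival wins; on a tie the first row seen is kept.
        if current is None or row[order_by] > current[order_by]:
            latest[row[key]] = row

    silver = []
    for row in latest.values():
        typed = dict(row)
        for column, cast in casts.items():
            typed[column] = cast(row[column])  # type conformance
        silver.append(typed)
    return sorted(silver, key=lambda r: r[key])


bronze = [
    {"order_id": "1", "amount": "10.50", "_ingested_at": "2024-01-01T00:00:00"},
    {"order_id": "1", "amount": "12.00", "_ingested_at": "2024-01-02T00:00:00"},
    {"order_id": "2", "amount": "7.25", "_ingested_at": "2024-01-01T00:00:00"},
]
silver = to_silver(
    bronze,
    key="order_id",
    order_by="_ingested_at",
    casts={"order_id": int, "amount": Decimal},
)
```

Bronze keeps both versions of order 1; Silver keeps only the later one, with order_id as an integer and amount as a Decimal. Comparing _ingested_at as strings only works here because every timestamp uses the same ISO 8601 format.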

From download to production data flow

  1. Day 1 morning. Download the repo. Open workspace_config.json, fill in your tenant + capacity + workspace names. Run deploy.ps1. Three lakehouses appear in your tenant; Git integration kicks in.
  2. Day 1 afternoon. Connect your first source. Add a row to the source_registry control table. Run the Bronze ingest notebook. Verify data lands.
  3. Day 2. Configure the Silver transformation for that source. Copy a template, adjust the column mappings, run. Watch data flow Bronze → Silver.
  4. Day 3. Define the Gold star schema. Build your first fact + 2-3 dimensions using the templates. Wire the semantic model to your Gold tables.
  5. Day 4. Open the starter Power BI report, swap in your model, publish. Your first dashboard is live on real data.
  6. Week 2. Add 2-3 more sources following the same pattern. Configure the orchestrator pipeline schedule. Hand over the runbook to your operations team.
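
The Day 1 morning step depends on workspace_config.json being filled in completely before deploy.ps1 runs. As a rough illustration, a pre-flight check could look like the sketch below. The key names (tenant_id, capacity_name, workspaces with dev / test / prod entries) and the "<placeholder>" convention are assumptions for this example, not the file's documented schema.

```python
import json

ENVIRONMENTS = ("dev", "test", "prod")  # one workspace per environment


def unfilled_settings(config_text):
    """Return the config keys that are missing, empty, or still a '<placeholder>'.

    All key names here are hypothetical; adjust them to the real file.
    """
    config = json.loads(config_text)

    def is_blank(value):
        return (
            not isinstance(value, str)
            or not value.strip()
            or value.startswith("<")
        )

    problems = [
        key for key in ("tenant_id", "capacity_name") if is_blank(config.get(key))
    ]
    workspaces = config.get("workspaces", {})
    problems += [
        f"workspaces.{env}" for env in ENVIRONMENTS if is_blank(workspaces.get(env))
    ]
    return problems


sample = json.dumps(
    {
        "tenant_id": "<your-tenant-id>",
        "capacity_name": "fabric-f8",
        "workspaces": {"dev": "lakehouse-dev", "test": "lakehouse-test"},
    }
)
problems = unfilled_settings(sample)
```

For the sample above, the check reports the untouched tenant placeholder and the missing prod workspace, so the gaps surface before any provisioning call is made.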

Pricing

One-time purchase. Includes 12 months of free updates. Cancel any time — your repo stays yours.

Individual
$499
1 developer · personal projects only
  • Full repo + docs
  • 12 months of updates
  • Email support (5-day response)
Buy individual
Team
$1,999
Up to 10 developers, one org
Site
$4,999
Unlimited developers, one org
  • Everything in Team
  • Unlimited org-wide use
  • 2-hour implementation workshop
  • Lifetime updates
Buy site

What's NOT included

Honesty matters more than upsell. This accelerator gets you a Fabric medallion lakehouse foundation. It does not include:

  • The data itself. You bring your sources.
  • Source-specific connectors beyond the templates (we ship Azure SQL CDC, files, and REST patterns; bring your own for SAP, Workday, etc.).
  • Custom semantic-model business logic. The template has the structure; you fill in the measures.
  • White-glove implementation. For that, our End-to-End Solution Build service uses this accelerator as the foundation.

Frequently asked questions

Do I need a Fabric capacity already?
Yes. The accelerator deploys into your existing Fabric tenant. If you don't have capacity yet, book a discovery call and we'll help you size it.
What's the minimum F-SKU I need?
F2 will run the framework but is too small for real workloads. F8 is the practical minimum for production. F16+ if you're loading more than a handful of sources.
Can I use it on multiple client projects?
Yes — Team license and above include commercial use across unlimited client projects. Just don't resell or sublicense the source.
Do updates require re-deploying?
No. Updates ship as Git commits to a branch we maintain. Pull them, merge into your fork, redeploy when convenient.
What if I find a bug?
Email astark@starkinformatics.com. Bugs are fixed at no charge for licensed customers, usually within 5 business days.
Can I extend it?
It's designed to be extended. Add new source types by adding rows to the control table. Add new transformations by following the notebook templates. The framework is documented and source-controlled.
Refund policy?
Full refund within 14 days of purchase if it doesn't work for you, no questions asked.

License terms

Individual: Single named developer. Personal projects + commercial work where you are the sole consumer. May not be used to deliver to clients.

Team: Up to 10 named developers within one organization. May be used on unlimited internal projects and on client-facing engagements as part of your services. May not be resold standalone.

Site: Unlimited developers within one organization. Same commercial-use rights as Team. Lifetime updates included.

All tiers: you may modify the source freely. You may not redistribute the original or modified source as a competing accelerator product.

Start with the right foundation

Most Fabric projects spend their first month rebuilding the medallion pattern. Buy ours, skip the month.

Buy team license: $1,999
See a live demo