Stark Informatics

OneLake

The single logical data lake under every Microsoft Fabric workload. Open Delta-Parquet, one copy of data, governed end-to-end by Microsoft Purview — the OneDrive for data.

GA · Foundation · 7 min read

What it is

OneLake is the unified data lake that comes with every Fabric tenant. Conceptually it's a single, logical, hierarchical lake — one OneLake per tenant, structured under workspaces and Fabric items. Physically it's ADLS Gen2 underneath. Data is stored in open Delta-Parquet so that any engine (Fabric, Spark, Databricks, Trino, Snowflake-via-Iceberg) can read it without movement.

The promise: one copy of every dataset, available to every Fabric workload, governed centrally.

Hierarchy & addressing

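The path scheme is simple. There is one OneLake per tenant, so the tenant does not appear in the path; addressing starts at the workspace. OneLake exposes ADLS Gen2-compatible endpoints, so the same location can be written as an HTTPS URL or as an abfss URI:

https://onelake.dfs.fabric.microsoft.com/<workspace>/<item>.{Lakehouse|Warehouse|...}/Tables/...
abfss://<workspace>@onelake.dfs.fabric.microsoft.com/<item>.{Lakehouse|Warehouse|...}/Tables/...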

Every workspace gets a node. Every Lakehouse, Warehouse, Eventhouse, or other item gets a node beneath it. Files sit under Files/ and managed Delta tables under Tables/. The OneLake Catalog is the unified discovery surface across workspaces and domains.
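
To make the workspace / item / folder hierarchy concrete, here is a minimal Python sketch that assembles a OneLake URI from its parts. It assumes the ADLS Gen2-compatible endpoint onelake.dfs.fabric.microsoft.com and name-based addressing; the function name onelake_uri and the sample workspace and item names are hypothetical, chosen only for illustration.

```python
ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def onelake_uri(workspace: str, item: str, item_type: str,
                relative_path: str, scheme: str = "abfss") -> str:
    """Build a OneLake URI for a path inside a Fabric item.

    workspace     -- workspace name (the first node in the hierarchy)
    item          -- item name, e.g. a Lakehouse called "Orders"
    item_type     -- item type suffix, e.g. "Lakehouse" or "Warehouse"
    relative_path -- path inside the item, e.g. "Tables/fact_orders"
    scheme        -- "abfss" (Spark-style) or "https" (REST-style)
    """
    if scheme not in ("abfss", "https"):
        raise ValueError(f"unsupported scheme: {scheme!r}")
    rel = relative_path.strip("/")
    item_segment = f"{item}.{item_type}"
    if scheme == "abfss":
        # The workspace plays the role of the ADLS Gen2 container (file system).
        return f"abfss://{workspace}@{ONELAKE_HOST}/{item_segment}/{rel}"
    return f"https://{ONELAKE_HOST}/{workspace}/{item_segment}/{rel}"


if __name__ == "__main__":
    print(onelake_uri("Sales", "Orders", "Lakehouse", "Tables/fact_orders"))
    # abfss://Sales@onelake.dfs.fabric.microsoft.com/Orders.Lakehouse/Tables/fact_orders
```

Keeping the construction in one helper means a workspace or item rename touches a single place instead of every notebook and pipeline that embeds the string.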

Access patterns

  • Native Fabric items. Lakehouse, Warehouse, KQL DB, Eventhouse all live inside OneLake automatically. No setup needed.
  • OneLake file APIs. Use Azure Storage Explorer, the OneLake VS Code extension, or Python (azure.storage.filedatalake) to read and write files programmatically.
  • Shortcuts. Reference data in place across OneLake, ADLS Gen2, Amazon S3, Dataverse, and GCS — without copying. See the Shortcuts guide.
  • Mirroring. Continuously replicate Azure SQL Database, Azure Cosmos DB, and Snowflake into OneLake, or mirror Azure Databricks Unity Catalog metadata so its tables appear in Fabric without a second copy. See the Mirroring guide.
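
As an illustration of the file-API access pattern, the sketch below lists the files under an item's Files/ folder with the azure.storage.filedatalake package. It assumes the azure-identity and azure-storage-file-datalake packages are installed and that DefaultAzureCredential can find a signed-in identity with access to the workspace. The helper names connect_to_workspace and list_item_files, and the workspace and item names in the usage comment, are hypothetical.

```python
from typing import List

ONELAKE_ACCOUNT_URL = "https://onelake.dfs.fabric.microsoft.com"


def connect_to_workspace(workspace: str):
    """Return a FileSystemClient scoped to one Fabric workspace.

    The Azure SDK imports live inside the function so the module can be
    imported (and the listing helper tested) without the SDK installed.
    """
    from azure.identity import DefaultAzureCredential
    from azure.storage.filedatalake import DataLakeServiceClient

    service = DataLakeServiceClient(
        account_url=ONELAKE_ACCOUNT_URL,
        credential=DefaultAzureCredential(),
    )
    # In OneLake, the workspace takes the place of the ADLS Gen2 file system.
    return service.get_file_system_client(workspace)


def list_item_files(fs_client, item: str, folder: str = "Files") -> List[str]:
    """List file paths (not directories) under <item>/<folder>.

    `item` includes the type suffix, e.g. "Orders.Lakehouse".
    """
    paths = fs_client.get_paths(path=f"{item}/{folder}")
    return [p.name for p in paths if not p.is_directory]


# Usage (requires network access and Fabric permissions):
#   fs = connect_to_workspace("Sales")
#   for name in list_item_files(fs, "Orders.Lakehouse"):
#       print(name)
```

Because list_item_files only relies on get_paths, it can be exercised against a stub client in unit tests before it ever touches a real tenant.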

Security & governance

OneLake security composes Fabric's identity model (Microsoft Entra ID), workspace roles, item-level permissions, and granular OneLake data-access roles. Microsoft Purview is built in: sensitivity labels propagate across workloads, DLP policies can be applied to supported Fabric items, and the OneLake Catalog assesses your data trust posture.

Implication: Data created in a Lakehouse can be queried from a notebook in another workspace, joined into a Warehouse, and consumed by Power BI without ever leaving its OneLake folder — and Purview enforces the same labels everywhere.

Best practices

  • Design domains and workspaces first. They are your governance perimeter. Naming and ownership matter more than you think.
  • Use Shortcuts before you copy. If data already lives in S3, Dataverse, or another OneLake workspace, a Shortcut avoids both a second copy and the pipeline that would have to keep it in sync.
  • Apply Purview labels at the source. Sensitivity labels inherit downstream; assigning at the Lakehouse table is far cleaner than retro-labeling.
  • Treat OneLake paths as production identifiers. The path becomes the de facto contract between Fabric and external consumers.
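
One way to treat paths as contracts is to parse and validate them at the boundary instead of passing raw strings around. The sketch below is a minimal example under stated assumptions: it accepts only name-based URIs on the onelake.dfs.fabric.microsoft.com endpoint, and the OneLakePath type and parse_onelake_uri function are hypothetical names, not part of any SDK.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit

ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


@dataclass(frozen=True)
class OneLakePath:
    workspace: str
    item: str
    item_type: str
    relative_path: str


def parse_onelake_uri(uri: str) -> OneLakePath:
    """Split an abfss:// or https:// OneLake URI into its components.

    Raises ValueError when the URI does not match the expected shape.
    """
    parts = urlsplit(uri)
    if parts.hostname != ONELAKE_HOST:
        raise ValueError(f"not a OneLake endpoint: {uri!r}")
    segments = [s for s in parts.path.split("/") if s]

    if parts.scheme == "abfss":
        # abfss://<workspace>@<host>/<item>.<type>/<path>
        if not parts.username:
            raise ValueError(f"missing workspace in {uri!r}")
        workspace = parts.username
    elif parts.scheme == "https":
        # https://<host>/<workspace>/<item>.<type>/<path>
        if not segments:
            raise ValueError(f"missing workspace in {uri!r}")
        workspace, segments = segments[0], segments[1:]
    else:
        raise ValueError(f"unsupported scheme in {uri!r}")

    if not segments or "." not in segments[0]:
        raise ValueError(f"missing <item>.<type> segment in {uri!r}")
    item, item_type = segments[0].rsplit(".", 1)
    return OneLakePath(workspace, item, item_type, "/".join(segments[1:]))
```

A consumer-facing check such as `parse_onelake_uri(uri).relative_path.startswith("Tables/")` then fails fast when a path drifts from the agreed layout.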

Common pitfalls

  • Putting raw and serving data in the same workspace. OneLake permissions are granted mainly at the workspace and item level. Mixing tiers turns governance into Swiss cheese.
  • Re-mirroring data already in OneLake. If a Lakehouse already has the data, use a Shortcut. Re-mirroring doubles cost and breaks the "one copy" promise.
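
For the second pitfall, a Shortcut between two Lakehouses can be created programmatically. The sketch below only builds the request (URL and JSON body) for the Fabric REST "create shortcut" operation as I understand it; treat the endpoint shape and field names as assumptions to verify against the current Fabric REST reference. The function name and every ID in the usage comment are hypothetical placeholders.

```python
import json
from typing import Dict, Tuple

FABRIC_API = "https://api.fabric.microsoft.com/v1"


def onelake_shortcut_request(
    dest_workspace_id: str,
    dest_item_id: str,
    shortcut_name: str,
    shortcut_parent_path: str,
    source_workspace_id: str,
    source_item_id: str,
    source_path: str,
) -> Tuple[str, Dict]:
    """Return (url, body) for a POST that creates a OneLake-to-OneLake shortcut.

    The shortcut appears at <shortcut_parent_path>/<shortcut_name> inside the
    destination item and points at <source_path> inside the source item.
    No data is copied; only the reference is created.
    """
    url = (f"{FABRIC_API}/workspaces/{dest_workspace_id}"
           f"/items/{dest_item_id}/shortcuts")
    body = {
        "path": shortcut_parent_path,   # e.g. "Tables"
        "name": shortcut_name,          # e.g. "fact_orders"
        "target": {
            "oneLake": {
                "workspaceId": source_workspace_id,
                "itemId": source_item_id,
                "path": source_path,    # e.g. "Tables/fact_orders"
            }
        },
    }
    return url, body


# Usage: send with any HTTP client and a Microsoft Entra bearer token, e.g.
#   url, body = onelake_shortcut_request("<dest-ws-id>", "<dest-item-id>",
#                                        "fact_orders", "Tables",
#                                        "<src-ws-id>", "<src-item-id>",
#                                        "Tables/fact_orders")
#   print(url, json.dumps(body, indent=2))
```

Separating request construction from the HTTP call keeps the payload reviewable and testable without credentials.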

Architecting OneLake the right way

Getting domains, workspaces, and item layout right at the start saves quarters of rework later.
