Parquet File Format

Category: Interoperability
Platform: Databricks, Azure Synapse Analytics, Generic Data Lake

Context

Data products are stored as files on Azure Data Lake Storage Gen2 (Data Product Storage).

To ensure interoperability and consistent usage patterns, we want to agree on a common file format.

We assume that data products will frequently be combined across domains.

Decision

We use Apache Parquet for data products.
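
As an illustration of how conformance to this decision could be checked, the sketch below tests whether a file carries the Parquet magic bytes. The Parquet format places the 4-byte marker PAR1 at both the start and the end of a file. The names looks_like_parquet and PARQUET_MAGIC are hypothetical and not part of this policy, and the check reads a local file path; accessing the Data Product Storage itself would need a storage client, which is not shown here.

```python
import os

# 4-byte marker that the Parquet format places at the start and end of a file.
# Assumption: files written with an encrypted footer use a different marker
# and would be rejected by this simple check.
PARQUET_MAGIC = b"PAR1"


def looks_like_parquet(path):
    """Return True if the file starts and ends with the Parquet magic bytes.

    This is a cheap plausibility check only. It does not validate the
    footer metadata or the schema of the data product.
    """
    # Leading magic (4) + footer length field (4) + trailing magic (4).
    if os.path.getsize(path) < 12:
        return False
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-4, os.SEEK_END)
        tail = f.read(4)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC
```

A check like this could run before a data product is published, so that files in other formats are flagged early.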

Consequences

Automation