Data Mesh Governance by Example
Curated examples for Data Mesh guiding values, an operating model, and global policies to support a federated governance group.
We want this to be an open source collection of policy examples, driven by the community. Contribute by submitting a pull request on the GitHub repository.
The data mesh governance group consists of representatives from the domain teams and the data platform team.
They are temporarily supported by subject-matter experts to address special issues, e.g. concerning legal, compliance, and security.
Together, they make sure that data products in the mesh are interoperable and can be used securely. For this, they agree on a few architectural decisions and global policies. To make it easy for domain teams to implement the policies, they specify the requirements for the data platform to automate the policies as much as possible.
Guiding values are the fundamental beliefs we agree on when implementing data mesh governance.
They guide us to make the right choices and give justification for our decisions.
- Promote the usage of data products
- Optimize experience for generalist majority
- Standardize for interoperability
- Enforce consistent security
- Design for automation
The operating model defines the structure and processes of the data mesh governance group.
After the group has been formed, its first meeting needs to decide on the collaboration mode, the communication channels, and a policy repository.
- Data Product Specification
- Data Contract Specification
- Address scheme
- File Format
- Partitioning Keys
- Timestamps as ISO 8601 Strings
- Money amounts in cents as integers
- Common IDs
- Well-known Field Names
- Bitemporal Timestamp Fields
- Naming Conventions (environment, database, table, column, file, bucket, …)
- Documentation of data products
- Mandatory Fields for Data Products
- Schema Format
- Access Request
- Ticket with manual steps
- Decentralized self-service via Pull Requests
- Central self-service app with decentralized handlers
- Access granted through AWS IAM Policies
- ACLs managed by domain teams
- Reassess after x months
- One domain publishes consents as a data product
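Two of the interoperability policies listed above, timestamps as ISO 8601 strings and money amounts in cents as integers, are easy to illustrate. The following is a minimal sketch of how a domain team might apply them when writing a record. The helper names `to_iso8601` and `to_cents`, the field names, and the choice of UTC and half-up rounding are assumptions made for illustration; they are not prescribed by the policies.

```python
from datetime import datetime, timezone
from decimal import Decimal, ROUND_HALF_UP


def to_iso8601(ts: datetime) -> str:
    """Serialize a timezone-aware datetime as an ISO 8601 string in UTC."""
    if ts.tzinfo is None:
        # Assumption: naive timestamps are rejected rather than guessed at.
        raise ValueError("naive datetimes are ambiguous; attach a timezone first")
    return ts.astimezone(timezone.utc).isoformat()


def to_cents(amount: str) -> int:
    """Convert a decimal amount given as a string (e.g. "19.99") to integer cents."""
    # Decimal avoids the binary floating-point errors of float("19.99") * 100.
    cents = (Decimal(amount) * 100).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return int(cents)


# Hypothetical record of a data product that follows both policies.
record = {
    "order_id": "o-123",
    "created_at": to_iso8601(datetime(2023, 5, 1, 12, 30, tzinfo=timezone.utc)),
    "total_cents": to_cents("19.99"),
}
```

Storing amounts as integer cents keeps sums exact across consumers, and a single timestamp format spares every consumer from guessing time zones.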
Privacy & Compliance
- Data Classification
- PII data separation
- PII Anonymization
- Data Stored in Customer’s Business Region
- PHI (protected health info)
- Data Retention Periods
- Right to be Forgotten By Tombstone Events
- Politically exposed person (PEP)
- People in witness protection program
- Observability Metrics
- Cost reporting
- Data Product Creation
- Self-service app (Backstage.io)
- Ownership for New Data Products
- Ownership for Legacy Data Products
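The "Right to be Forgotten By Tombstone Events" policy listed under Privacy & Compliance can be sketched as follows. This is one possible interpretation, not the policy's definition: it assumes that events are keyed by a data subject ID and that an event with an empty payload acts as a tombstone, so that any consumer replaying the log drops what it holds for that key. The `Event` type and the `materialize` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    key: str                  # hypothetical: ID of the data subject
    payload: Optional[dict]   # None marks a tombstone for this key


def materialize(events: list[Event]) -> dict[str, dict]:
    """Replay events in order; a tombstone erases the state held for its key."""
    state: dict[str, dict] = {}
    for event in events:
        if event.payload is None:
            state.pop(event.key, None)  # forget the subject, if present
        else:
            state[event.key] = event.payload
    return state


events = [
    Event("customer-1", {"email": "a@example.com"}),
    Event("customer-2", {"email": "b@example.com"}),
    Event("customer-1", None),  # erasure request for customer-1
]
view = materialize(events)
```

In this sketch only the consumer's materialized view is cleaned up. Removing the earlier events from the underlying storage is a separate concern that the data platform would have to handle.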
While it is not the federated governance group’s actual job to define the architecture of the data platform,
decisions about the platform have consequences for global policies and vice versa, e.g. for policy automation and monitoring.
The governance group therefore has to keep track of the decisions made about the data platform.