Power BI

Export data transformations as rule metadata

simon opper on 04 Jul 2019 07:08:23

I want to be able to capture the data transformations made to a dataset as a set of rules. JSON will suffice for now, but ultimately it should adopt a rule markup language, such as one of the standards-based approaches used in other frameworks.

For example, I want to export all column name text replacements in M queries to a list of rules that become metadata about those transformations. In essence, this captures the lineage and provenance of the data.
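As a rough illustration of the column rename export, here is a minimal Python sketch. It assumes the renames appear as Table.RenameColumns calls in the nested-list form {{"old", "new"}, ...} and that column names contain no braces or quotes. The function name extract_rename_rules, the rule field names and the sample query are all made up for the example; nothing here is an existing Power BI feature.

```python
import json
import re

# Matches Table.RenameColumns(<table>, {{"old", "new"}, ...}) and captures the
# inner pairs. Assumes the nested-list form and names without braces or quotes.
RENAME_CALL = re.compile(
    r'Table\.RenameColumns\s*\([^{]*\{((?:\s*\{[^{}]*\}\s*,?)+)\}'
)
PAIR = re.compile(r'\{\s*"([^"]*)"\s*,\s*"([^"]*)"\s*\}')


def extract_rename_rules(query_name, m_text):
    """Return one rule dict per column rename found in an M query."""
    rules = []
    for call in RENAME_CALL.finditer(m_text):
        for old, new in PAIR.findall(call.group(1)):
            rules.append({
                "rule": "rename_column",  # hypothetical rule type name
                "query": query_name,
                "from": old,
                "to": new,
            })
    return rules


# Hypothetical M query used only to exercise the sketch.
SAMPLE_M = '''let
    Source = Csv.Document(File.Contents("orders.csv")),
    Renamed = Table.RenameColumns(Source, {{"cust_nm", "Customer Name"}, {"ord_dt", "Order Date"}})
in
    Renamed'''

if __name__ == "__main__":
    print(json.dumps(extract_rename_rules("Orders", SAMPLE_M), indent=2))
```

A regex is only good enough for a demonstration. A real exporter would need a proper M parser to cope with the single-pair form, escaped quotes and renames built up from variables.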

Ultimately, I would have rules/metadata covering a whole host of transforms, data validations and constraints.

e.g.

datatype changes
text replacements - original names plus any other future label applied to the source column
data validations - e.g. text conforms to a pattern rule (such as a regex)
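To make the three kinds of rule concrete, this is one possible shape for them as plain records, plus a check for the validation kind. The field names, column names and the postcode pattern are invented for illustration and do not come from any existing standard.

```python
import re

# Hypothetical rule records, one per kind listed above.
RULES = [
    {"rule": "datatype_change", "column": "Order Date", "from": "text", "to": "date"},
    {"rule": "text_replacement", "column": "Customer Name",
     "original": "cust_nm", "labels": ["Customer Name"]},
    {"rule": "validation", "column": "Postcode", "pattern": r"[A-Z]{2}\d{4}"},
]


def failing_values(rule, values):
    """Return the values that do not fully match a validation rule's pattern."""
    pattern = re.compile(rule["pattern"])
    return [v for v in values if pattern.fullmatch(v) is None]


validation = next(r for r in RULES if r["rule"] == "validation")
print(failing_values(validation, ["AB1234", "ab1234", "AB12345"]))
```

Keeping each rule as a flat record means the same list can be dumped to JSON now and mapped onto a rule markup language later.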

The metadata should be additive, be persisted, and move with the data from its source through every dataflow and every report that consumes that data. Similar to the concept of the CDM, it should form a set of metadata for search, discovery and master data management.

All the data is already present in the PBIX files via the .bim models, so it will require a set of backend parsers to extract the M queries, and possibly also DAX, and dump these into a set of rules files.
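A parser along these lines could start by pulling the M text out of the model JSON. The sketch below assumes a .bim file that follows the tabular model JSON layout (model, tables, partitions, and a source with type "m" and an expression). Getting that JSON out of a .pbix file is not shown, and the sample model is invented.

```python
import json


def m_expressions(bim):
    """Yield (table name, M text) for every M partition in a parsed .bim model."""
    for table in bim.get("model", {}).get("tables", []):
        for partition in table.get("partitions", []):
            source = partition.get("source", {})
            if source.get("type") != "m":
                continue
            expr = source.get("expression", "")
            # Expressions may be stored as one string or as a list of lines.
            if isinstance(expr, list):
                expr = "\n".join(expr)
            yield table["name"], expr


# Hypothetical model with one M partition and one calculated (DAX) partition.
SAMPLE_BIM = json.loads('''
{
  "model": {
    "tables": [
      {
        "name": "Orders",
        "partitions": [
          {"name": "Orders", "source": {"type": "m", "expression": ["let", "    Source = 1", "in", "    Source"]}}
        ]
      },
      {
        "name": "Calendar",
        "partitions": [
          {"name": "Calendar", "source": {"type": "calculated", "expression": "CALENDARAUTO()"}}
        ]
      }
    ]
  }
}
''')

for name, text in m_expressions(SAMPLE_BIM):
    print(name, "->", text.splitlines()[0])
```

The calculated partition is skipped here. A DAX extractor would be a second pass over the same structure.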

This will require that a unique ID is assigned internally to each data object from source to consumption, and that an additive, persisted record of this is included with the rules.
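One way to picture the ID and the additive record is an append-only log keyed by a repeatable identifier. This is only a sketch under my own assumptions: the class names, the stage labels and the example URI are hypothetical, and deriving the ID with uuid5 from a source URI is just one option.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LineageEvent:
    object_id: str   # stable ID of the data object the event belongs to
    stage: str       # e.g. "source", "dataflow", "report"
    rule: dict       # the transformation rule applied at this stage


@dataclass
class LineageLog:
    events: list = field(default_factory=list)

    def append(self, event):
        """Add an event; existing events are never edited or removed."""
        self.events.append(event)

    def history(self, object_id):
        """Return every recorded event for one data object, oldest first."""
        return [e for e in self.events if e.object_id == object_id]


def new_object_id(source_uri):
    """Derive a repeatable ID from a source URI (uuid5, so reruns agree)."""
    return str(uuid.uuid5(uuid.NAMESPACE_URL, source_uri))


log = LineageLog()
oid = new_object_id("https://example.com/sales/orders.csv#cust_nm")
log.append(LineageEvent(oid, "source", {"rule": "ingest"}))
log.append(LineageEvent(oid, "dataflow",
                        {"rule": "rename_column", "from": "cust_nm", "to": "Customer Name"}))
print([e.stage for e in log.history(oid)])
```

Because the ID is derived from the source rather than generated at random, a dataflow and a report that both start from the same column would arrive at the same ID without coordinating.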

Finally, all of this should be available as metadata integrated with Azure Data Catalog. It should provide a set of facets to use for data governance of enterprise data assets.