Master data management (MDM) is an approach to reducing data redundancy by maintaining a definitive "record of truth," or master data, for critical data so that it can serve as a single source of reference. Ideally, MDM organizes data sharing among multiple applications or departments.
It's common for organizations to have duplicate information on different systems. For example, customer information could be stored in both a CRM and accounting system:
While some data in both systems is identical, other data may be similar but not the same.
In a small organization, it can be straightforward to keep just a few systems up to date. But as an organization brings more systems online, its business data often becomes increasingly duplicated, which makes it harder to keep all of the systems up to date. The process of keeping data up to date between disparate systems is called data integration.
For example, imagine a delivery service that stores information about its customers in several systems:
As customer information changes over time (e.g. customer addresses or phone numbers), keeping all the systems up to date becomes increasingly difficult.
Consider the diagram above:
A customer's email address may be stored on several systems, which may result in conflicting values. If so, which system holds the true/correct value?
In the example above, if the Credit Card Processing system receives an updated customer address, which of the other systems also need that data? The Accounting and Distribution systems. If only the street address changed, but the city, state, and zip code remain the same, then only the street address element of the customer's data record needs to be updated in the other two systems.
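The idea of updating only the changed data element can be sketched as a field-level comparison between two versions of a record. This is a minimal illustration, not YOUnite's implementation; the field names and values are hypothetical.

```python
def changed_fields(old: dict, new: dict) -> dict:
    """Return only the fields whose values differ between two versions of a record."""
    return {
        key: value
        for key, value in new.items()
        if key not in old or old[key] != value
    }


# Hypothetical customer address before and after the change.
old_record = {"street": "12 Elm St", "city": "Springfield", "state": "IL", "zip": "62701"}
new_record = {"street": "98 Oak Ave", "city": "Springfield", "state": "IL", "zip": "62701"}

# Only the street address needs to be sent to the other systems.
update = changed_fields(old_record, new_record)
print(update)  # {'street': '98 Oak Ave'}
```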
Note: The most recent version of a given record is called the data record.
If all systems need to be updated, then each system needs to know how to transform the data to be consumed by each of the other systems. This requires programming many transformations. In an extreme case, the example above could take five transformations for each of the six systems, requiring 30 transformation modules or adaptors.
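The arithmetic behind that figure is easy to check. The sketch below counts the transformations needed when every system converts data directly for every other system. The hub comparison assumes one adaptor per connected system, which is an illustrative assumption rather than a documented YOUnite figure.

```python
def point_to_point_transformations(n_systems: int) -> int:
    """Each system needs one transformation for each of the other systems."""
    return n_systems * (n_systems - 1)


def hub_adaptors(n_systems: int) -> int:
    """Assumption: with a central hub, each system needs a single adaptor."""
    return n_systems


for n in (3, 6, 12):
    print(n, point_to_point_transformations(n), hub_adaptors(n))
```

For the six systems in the example this gives 6 × 5 = 30 point-to-point transformations, and the count grows roughly with the square of the number of systems.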
Note: An adaptor is software located within a system that connects to, and shares data through, the YOUnite Data Hub. The adaptor focuses on Extract, Transform, and Load (ETL) functions, ensuring that a system's "outbound data" meets defined format requirements before it is transformed into the "inbound data" format that another system requires. The YOUnite Data Hub sits between the various systems, routing data between them based on which systems have access to which data (data governance).
The easiest and most common way to handle changes is through batch updates. However, latency between batch updates may cause business process issues.
Transferring data between systems can be a daunting task. Some applications have built-in adaptors to handle transformations, but these generally handle only a subset of the required data. Where the built-in adaptors fall short or don't exist, an organization may spend resources developing "one-off" adaptors to meet an ongoing transformation need.
Using the delivery service example above, the Warehouse Management system should get access to only a subset of the customer data for security reasons, and it shouldn't get any level of access to the Credit Card Processing system. This level of access is defined by Access Control Lists (ACLs). Data governance covers managing the ACLs and defining where the data records are stored.
The challenge doesn't end with just the customer data. Additional data, such as product, inventory, and employee data, may need to be kept up to date on several systems:
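As a rough sketch of how an ACL can restrict a system to a subset of customer data, the example below filters a record down to the fields a requesting system is allowed to see. The ACL structure, system names, and field names are hypothetical and do not reflect YOUnite's actual ACL format.

```python
# Hypothetical ACLs: for each system, the customer fields it may read.
ACLS = {
    "warehouse": {"name", "street", "city", "state", "zip"},
    "accounting": {"name", "street", "city", "state", "zip", "email", "phone"},
}


def visible_fields(system: str, record: dict) -> dict:
    """Return only the fields the given system is allowed to read."""
    allowed = ACLS.get(system, set())  # unknown systems see nothing
    return {key: value for key, value in record.items() if key in allowed}


customer = {
    "name": "Ada Park",
    "street": "98 Oak Ave",
    "city": "Springfield",
    "state": "IL",
    "zip": "62701",
    "email": "ada@example.com",
    "phone": "555-0100",
}

warehouse_view = visible_fields("warehouse", customer)
print(sorted(warehouse_view))  # ['city', 'name', 'state', 'street', 'zip']
```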
In MDM, the set of fields or properties (e.g. name, address, phone number) that define a set of data (e.g. Customer) is called a data domain (or sometimes just called "domain").
MDM solves the problem of keeping interrelated systems up to date by creating a separate system where data domains are defined for all systems inside the organization. The domains provide neutral data formats or schemas for all systems, facilitating data sharing.
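One way to picture a domain acting as a neutral format: each system converts its own record layout to and from the domain schema, rather than to every other system's layout. The sketch below is illustrative only. The `Customer` fields and the CRM and accounting column names are hypothetical, and this is not how YOUnite defines domains.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """Hypothetical neutral schema for the Customer data domain."""
    name: str
    street: str
    city: str
    state: str
    zip_code: str


def crm_to_domain(row: dict) -> Customer:
    """Convert a (hypothetical) CRM row into the neutral domain format."""
    return Customer(
        name=f"{row['first_name']} {row['last_name']}",
        street=row["addr1"],
        city=row["city"],
        state=row["state"],
        zip_code=row["postal"],
    )


def domain_to_accounting(customer: Customer) -> dict:
    """Convert the neutral domain format into a (hypothetical) accounting row."""
    return {
        "CUST_NAME": customer.name,
        "ADDRESS": f"{customer.street}, {customer.city}, "
                   f"{customer.state} {customer.zip_code}",
    }


crm_row = {
    "first_name": "Ada", "last_name": "Park", "addr1": "98 Oak Ave",
    "city": "Springfield", "state": "IL", "postal": "62701",
}
customer = crm_to_domain(crm_row)
accounting_row = domain_to_accounting(customer)
print(accounting_row)
```

With this arrangement, adding a new system means writing conversions to and from `Customer` only, not to every existing system.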
The data for a data domain defined in YOUnite can:
Using the delivery service example above:
In the federated model, data records can be stored in the YOUnite Data Store or in one or more systems connected to YOUnite. Many systems may hold similar data but generally the organization as a whole decides which system(s) hold the data records. Note that with YOUnite's federated model, different groups inside of an organization can designate which system holds the master data. YOUnite's governance model can manage who can access the data.
In the federated example, governance can be set so that:
For the Warehouse Management System... | For the Accounting System... |
---|---|
data access is restricted to data stored in the Distribution and CRM systems | data access is allowed to all the other systems |
data record lookups return only information that is appropriate for the division's needs | data record lookups return information from all systems, which may include Credit Card Processing, for example |
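The governance rules in the table can be sketched as a lookup that merges data only from the source systems a requester is permitted to read. Everything here (system names, stored data, and the rule structure) is a hypothetical illustration, not YOUnite's API.

```python
# Hypothetical data held by each connected system, keyed by customer id.
SYSTEM_DATA = {
    "crm": {"c1": {"name": "Ada Park", "email": "ada@example.com"}},
    "distribution": {"c1": {"street": "98 Oak Ave", "city": "Springfield"}},
    "credit_card": {"c1": {"card_last4": "4242"}},
}

# Hypothetical governance rules: which source systems each requester may read.
READABLE_SOURCES = {
    "warehouse": ["distribution", "crm"],
    "accounting": ["crm", "distribution", "credit_card"],
}


def federated_lookup(requester: str, customer_id: str) -> dict:
    """Merge the customer's data from every source the requester may read."""
    merged = {}
    for source in READABLE_SOURCES.get(requester, []):
        merged.update(SYSTEM_DATA[source].get(customer_id, {}))
    return merged


print(sorted(federated_lookup("warehouse", "c1")))
print(sorted(federated_lookup("accounting", "c1")))
```

The warehouse lookup never touches the credit card system, while the accounting lookup draws on all three sources.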
Implementing MDM involves a process of determining where the data records are stored (whether it's the MDM Data Store or Federated MDM) and managing who (a person or a system) has read, write, update, and delete privileges to those systems.
Data records are the most recent versions of records stored in the systems connected to YOUnite. Master data is data in a particular domain or a particular element that has been declared the "Record of Truth." It's not always necessary or appropriate for systems to access an organization's master data. Many data access requests are for data records that may or may not contain master data. However, YOUnite can propagate changes from a system that contains data records to others in the YOUnite ecosystem on a permission-appropriate basis (that is, subject to governance).
Several terms have been introduced and it may be helpful to review them before moving on:
A good source for more MDM background is Mark Allen and Dalton Cervo's Multi-Domain Master Data Management: Advanced MDM and Data Governance in Practice. Waltham, MA: Morgan Kaufmann, 2015. (ISBN 978-0-12-800835-5).