It Starts with Data Integration

It's common for organizations to hold duplicate information in different systems. For example, customer information could be stored in both a CRM and an accounting system:

While some data in both systems is identical, some is similar but not the same. 

In a small organization, keeping just a few systems up to date is usually not an issue:

The process of keeping data up to date between disparate systems is called "Data Integration" or DI.

As organizations bring more and more systems online, their business data is duplicated even further, and keeping all of the systems up to date becomes increasingly difficult. For example, imagine a delivery service that stores information about its customers in several systems:

As information about its customers changes, keeping all the systems up to date becomes disproportionately harder as the number of systems increases, because every system may need a connection to every other system. This raises several questions:

  • What system holds the truth?
  • Which system(s) get notified when a change is made on one of the systems?  
    • If all systems need to be updated, each system needs to know how to transform the data into a form consumable by each of the other systems, which requires programming many transformations. In the extreme case, the example above could take five transformations for each of the six systems, for a total of 30 transformations.
  • How are the changes handled?  In real time or in batch?  If in batch, what is the latency between batch updates?  
  • Transferring data between systems becomes a data integration nightmare: some systems have to spend a great deal of resources converting data so that it can be consumed by, or received from, other systems.
  • How does the organization manage access to the data?  Perhaps the Warehouse should get access to only a subset of the customer data or shouldn't get any level of access to the Credit Card Processing system.  
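The transformation count in the extreme case above can be checked with a short sketch. It assumes every system sends changes directly to every other system; the function name is hypothetical and used only for illustration.

```python
def point_to_point_transformations(system_count: int) -> int:
    """Count transformations when each system converts its data
    for every other system directly (no shared neutral format)."""
    # Each of the n systems needs one transformation per other system.
    return system_count * (system_count - 1)


# Show how the count grows as systems are added.
for n in (2, 3, 6, 10):
    print(n, "systems:", point_to_point_transformations(n), "transformations")
```

For six systems this gives 6 × 5 = 30, matching the figure above. The count grows with the square of the number of systems.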

The problem doesn't end with just the customer data; other data, such as product, inventory and employee data, may need to be kept up to date on several systems as well:

In Master Data Management (or MDM), the set of fields or properties that define a set of data (e.g. Customer) is called a data domain. 

Migrating to Master Data Management

Master data management (MDM) solves this problem by creating a separate system where data domains (or domains) are defined for all systems inside the organization. The domains provide neutral data formats or schemas for all systems. YOUnite MDM can either centrally save the latest change of a record, or it can make note of when a change occurs in one of the organization's systems without actually storing the data; the latter is called federated MDM.
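As a rough illustration of a neutral format, the sketch below converts a CRM record into a customer domain record and then into an accounting record. It is not YOUnite's API: the `Customer` class, the adapter functions, and every field name are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """Hypothetical neutral schema for the customer data domain."""
    name: str
    email: str
    phone: str


def from_crm(record: dict) -> Customer:
    """Convert a CRM record (assumed field names) to the neutral format."""
    return Customer(
        name=record["full_name"],
        email=record["email_address"],
        phone=record["phone"],
    )


def to_accounting(customer: Customer) -> dict:
    """Convert the neutral format to an accounting record (assumed field names)."""
    return {
        "acct_name": customer.name,
        "billing_email": customer.email,
        "tel": customer.phone,
    }


crm_record = {
    "full_name": "Acme Company",
    "email_address": "billing@acme.example",
    "phone": "555-0100",
}
accounting_record = to_accounting(from_crm(crm_record))
print(accounting_record)
```

With a shared domain format, each system only needs a pair of adapters to and from the domain, rather than one transformation per other system.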

So in the example of our delivery service, the latest version of a customer record is called the customer's Master Data Record. In a federated MDM implementation, this would be the latest version of the customer's information, expressed in the format defined by the customer data domain.

Additionally, MDM provides better control over data access, or governance, than DI alone. MDM can manage who can see what. In the federated example, perhaps the Warehouse Management system only has access to data stored in the Distribution and CRM systems, but the Accounting system has access to all systems. When the Warehouse Division looks up the master data record for its customer, the Acme Company, it may get a different result than the Accounting Division, since the Accounting Division has access, or "scope", to Acme Company's information on all systems.
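The effect of scope on a federated lookup can be sketched as follows. This is a simplified model, not how YOUnite implements governance: the per-system data, the field names, and the `master_record` function are all hypothetical.

```python
# Hypothetical data each system holds about the Acme Company.
SYSTEM_DATA = {
    "CRM": {"name": "Acme Company", "contact": "jane@acme.example"},
    "Distribution": {"delivery_address": "1 Main St"},
    "Accounting": {"credit_limit": 50000},
}

# Hypothetical scopes: which source systems each requester may read.
SCOPES = {
    "Warehouse Management": {"CRM", "Distribution"},
    "Accounting": {"CRM", "Distribution", "Accounting"},
}


def master_record(requester: str) -> dict:
    """Assemble the master data record from only the systems in scope."""
    record = {}
    # Sort for a deterministic merge order; unknown requesters see nothing.
    for system in sorted(SCOPES.get(requester, set())):
        record.update(SYSTEM_DATA[system])
    return record


print(master_record("Warehouse Management"))
print(master_record("Accounting"))
```

In this sketch the Warehouse lookup omits `credit_limit` because the Accounting system is outside its scope, while the Accounting lookup includes fields from all three systems.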

