...

  • MDM is the process of describing and cataloging data inside an organization and understanding which stakeholders value which sources of data.
  • DI is the process of integrating disparate data. This ranges from annual CSV exports and imports between systems to real-time connectors between systems.
  • Master Data is what is considered the source of truth for a given data domain. See "Establish the Truth" below.

Making DI & MDM Easy is Impossible

YOUnite's primary focus has been to make this process as easy and non-intrusive as possible. ’Nuff said.

Always Start by Analyzing the Use Cases

If you start by analyzing the data instead, you add to the already arduous work of normalizing data by analyzing data that is not relevant to your MDM process.

Example: A college system uses a Learning Management System (LMS). The LMS also has features for Ed Planning; however, the college system uses a separate system for Ed Planning. A data analyst would be wasting precious time cataloging the entire LMS schema, since the Ed Planning features in the LMS are not used.

Think in Terms of REST

Asking use case questions in terms of RESTful operations (HTTP GET, PUT, POST and DELETE, following REST principles) helps keep the focus in what can become a very convoluted process of analysis. When the analysis deviates from this, it almost always leads to paralysis. Ultimately, YOUnite breaks transactions down into RESTful operations, so knowing which operations to avoid can save a lot of time.

Example: The College Application system never wants to delete a student once they have been added to the system. Analysis of the DELETE request can therefore be skipped for this application.
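
The outcome of these use case questions could be recorded as a small table of required operations per system. The following is a minimal sketch, not YOUnite configuration: the system keys, the table name and every entry other than the excluded DELETE are assumptions made for illustration.

```python
# Hypothetical record of which HTTP operations each system's use cases need.
# DELETE is left out for the College Application system, as in the example
# above; all other entries are illustrative assumptions.
REQUIRED_OPERATIONS = {
    "college_application": {"GET", "PUT"},
    "lms": {"GET", "PUT", "POST", "DELETE"},
}

ALL_OPERATIONS = ("GET", "PUT", "POST", "DELETE")


def operations_to_analyze(system):
    """Return the operations that still need analysis for a system."""
    required = REQUIRED_OPERATIONS.get(system, set())
    return [op for op in ALL_OPERATIONS if op in required]


def operations_to_skip(system):
    """Return the operations whose analysis can be skipped for a system."""
    required = REQUIRED_OPERATIONS.get(system, set())
    return [op for op in ALL_OPERATIONS if op not in required]
```

Keeping the skipped operations explicit makes it easy to review why a given request was never analyzed.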

Establish the Truth

Analysis reveals the truth, i.e. which systems hold the truth values for a given domain. As you catalogue the entities, note which systems hold the truth, since knowing this reduces the amount of analysis required.

...

Knowing this, you no longer need to worry about what issues may arise from sending data from the LMS to other systems and can focus primarily on how data will flow from either the Application System or SIS into other systems such as the LMS.
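
One way to picture this reduction is to list only the flows that leave a source of truth. The sketch below is a rough illustration under stated assumptions: the domain name "student", the dictionary names and the system keys are hypothetical, and flows between two sources of truth are left out as a simplification.

```python
# Hypothetical catalogue: which systems hold the truth for a data domain.
SOURCES_OF_TRUTH = {"student": {"application_system", "sis"}}

# Hypothetical list of every system that touches the domain.
PARTICIPANTS = {"student": {"application_system", "sis", "lms"}}


def flows_to_analyze(domain):
    """Return (source, target) pairs that flow from a source of truth
    into a system that does not hold the truth for the domain."""
    sources = SOURCES_OF_TRUTH[domain]
    targets = PARTICIPANTS[domain] - sources
    return sorted((source, target) for source in sources for target in targets)
```

In this sketch the LMS never appears as a source, so no flow out of the LMS needs analysis for the domain.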

The MDM Process is a Multi-Dimensional Cross-Cutting Concern

There is no way around it. You must analyze:

...

Ultimately, the data architects create a worksheet that contains the required attributes to complete an operation for a given entity for a given adaptor.
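
Such a worksheet can be thought of as a lookup from (adaptor, entity, operation) to the set of required attributes. The following is a minimal sketch of that idea; the class name, the adaptor and attribute names, and the helper function are all hypothetical and are not part of YOUnite.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorksheetKey:
    """One worksheet row: an operation on an entity for an adaptor."""
    adaptor: str
    entity: str
    operation: str  # HTTP method, e.g. "POST"


# Hypothetical worksheet contents, for illustration only.
WORKSHEET = {
    WorksheetKey("sis", "student", "POST"): {"first_name", "last_name", "date_of_birth"},
    WorksheetKey("sis", "student", "PUT"): {"student_id"},
}


def missing_attributes(key, payload):
    """Return the required attributes that a payload does not supply."""
    if key not in WORKSHEET:
        raise KeyError(f"no worksheet entry for {key}")
    return sorted(WORKSHEET[key] - set(payload))
```

An operation that was never analyzed has no row, so asking about it raises an error instead of silently passing.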

If an HTTP Operation Is Not Required for an Adaptor, Don't Analyze It

Example: There is never a situation where the analysts for the College Application system want YOUnite to create (POST) a new student, because they need to maintain control of that process. There is therefore no need to analyze the required elements for a POST /student for the College Application system.

Generally Speaking, All Changes to a Data Record Should Generate a Notification to All Adaptors Interested in That Data Domain

If the application tied to an adaptor has a well-written RESTful interface, it will allow you to register a callback for changes. If not, you will need to find another way to detect changes.
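
The text does not say how changes should be detected when no callback is available. One common fallback, shown here only as an assumed approach, is to poll the application and compare the new snapshot with the previous one. The function name and the snapshot shape (a dictionary keyed by record id) are hypothetical.

```python
def detect_changes(previous, current):
    """Compare two snapshots keyed by record id and classify the differences."""
    created = sorted(current.keys() - previous.keys())
    deleted = sorted(previous.keys() - current.keys())
    updated = sorted(
        record_id
        for record_id in current.keys() & previous.keys()
        if current[record_id] != previous[record_id]
    )
    return {"created": created, "updated": updated, "deleted": deleted}
```

Polling trades timeliness for simplicity: a change is only noticed at the next poll, and a record that changes and changes back between polls is missed.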

...

Example: A college course catalogue system would not get a notification that a student has been deleted from the system, but several others would, such as the college application system and the college SIS.
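
Routing by data domain can be sketched as a registry of which adaptors are interested in which domains. This is an illustration under assumptions, not YOUnite's notification mechanism: the class, its methods and the adaptor names are hypothetical.

```python
from collections import defaultdict


class NotificationRouter:
    """Tracks which adaptors are interested in which data domains."""

    def __init__(self):
        self._interests = defaultdict(set)

    def register(self, adaptor, domain):
        """Record that an adaptor wants notifications for a domain."""
        self._interests[domain].add(adaptor)

    def recipients(self, domain):
        """Return the adaptors to notify when a record in the domain changes."""
        return sorted(self._interests.get(domain, ()))


router = NotificationRouter()
router.register("college_application", "student")
router.register("sis", "student")
router.register("course_catalogue", "course")
```

With these registrations, a deleted student is reported to the college application system and the SIS, while the course catalogue system is left alone.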

If Data Elements Are Used by Only One System, Then Don't Normalize Them Unless They Are Used Inside Another Data Domain

The job of the data analyst is to create as little work as possible. A single element added to a federated data domain has an exponential effect on the complexity of the overall system.

Example: A college system uses an Ed Planning system that tracks meetings between the student and college faculty and staff. Other systems may use the Ed Planning data, but if no other system in the college system uses the scheduling system, then the scheduling data can be ignored when modeling the student, faculty or college data domains.
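
The rule in this section can be expressed as a simple filter over an inventory of data elements. The sketch below assumes a hypothetical inventory that maps each element to the systems that use it, plus a set of elements referenced from another data domain; all names are illustrative.

```python
def elements_to_normalize(usage, referenced_elsewhere):
    """Keep an element if more than one system uses it, or if another
    data domain references it; everything else can be left alone."""
    return sorted(
        element
        for element, systems in usage.items()
        if len(systems) > 1 or element in referenced_elsewhere
    )


# Hypothetical inventory, for illustration only.
usage = {
    "student.name": {"sis", "lms", "ed_planning"},
    "meeting.schedule": {"ed_planning"},
    "advisor.office": {"ed_planning"},
}
referenced_elsewhere = {"advisor.office"}
```

Here the scheduling element is used by one system and referenced nowhere else, so it stays out of the federated model.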

A Couple of Additional Points

  • The YOUnite adaptor might need to read and manipulate non-MDR attributes to complete transactions.

  • When building an MDR worksheet, you also need to keep a reference data worksheet. This is data that changes infrequently (e.g. states, countries) but is commonly cross-referenced by other domains (e.g. customers). Decide where the reference data should reside, and consider storing some or all of it in the YOUnite data store for performance reasons.

...