MDM Background & Notes
From Lou's Back of the Napkin MDM notes:
What features are considered best of breed?
- Good data quality
- proper metadata management
- agile SOA
- Performance tuning
What are our needs?
- Multi-domain support: A domain is a data model. A domain can have a narrow focus (e.g. "students") or a broad one (e.g. HEDM). HEDM and SAP are examples of single-domain products that are focused on a given vertical.
- Vendors: Informatica, Orchestra, Riversand (.Net only), TIBCO. Oracle is not multi-domain.
- Some products may let the user configure a domain but will support only one (perhaps Oracle fits that bill?)
- Flexible schemas (models) with the ability to create different versions of the same model. This prevents having to rev all apps tied to a given version of a model.
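A minimal sketch of what versioned models could look like, assuming a simple in-memory registry. ModelRegistry, ModelVersion and the "student" fields are hypothetical names used only for illustration, not any vendor's API. The point is that an app pinned to version 1 keeps getting the version 1 shape after version 2 is published.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ModelVersion:
    """One published version of a domain model (hypothetical structure)."""
    domain: str
    version: int
    fields: tuple[str, ...]


@dataclass
class ModelRegistry:
    """Keeps every published version so apps tied to older versions keep working."""
    _versions: dict[tuple[str, int], ModelVersion] = field(default_factory=dict)

    def publish(self, domain: str, fields: tuple[str, ...]) -> ModelVersion:
        # The next version number is one past the highest already published.
        current = [v for (d, v) in self._versions if d == domain]
        model = ModelVersion(domain, max(current, default=0) + 1, fields)
        self._versions[(domain, model.version)] = model
        return model

    def get(self, domain: str, version: int) -> ModelVersion:
        return self._versions[(domain, version)]

    def project(self, record: dict, domain: str, version: int) -> dict:
        # Return only the fields that the requested model version defines.
        model = self.get(domain, version)
        return {name: record.get(name) for name in model.fields}


registry = ModelRegistry()
registry.publish("student", ("id", "name"))           # version 1
registry.publish("student", ("id", "name", "email"))  # version 2

record = {"id": 7, "name": "Ada", "email": "ada@example.edu"}
v1_view = registry.project(record, "student", 1)  # shape seen by older apps
v2_view = registry.project(record, "student", 2)  # shape seen by newer apps
```

Publishing a new version never mutates an old one, so only the apps that want the new fields need to change.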
- Data modeling: The process of cataloging and reconciling the differences between similar data stored in different systems
- The domain modeling process should be fairly interactive and flexible. A bad approach would be having to configure DB tables. A better approach would be an interactive UI that allows the data architect/steward to view the schemas/models of the multiple applications being reconciled while building the domain model.
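As a rough illustration of the reconciliation step, the sketch below maps records from two source systems onto one domain model and reports where they disagree. The system names ("sis", "lms"), the field names and the FIELD_MAPS table are all assumptions made up for this example; a real product would manage these mappings through its own tooling.

```python
# Hypothetical mappings: source field name -> domain model field name.
FIELD_MAPS = {
    "sis": {"stu_id": "student_id", "lname": "last_name", "fname": "first_name"},
    "lms": {"userId": "student_id", "surname": "last_name", "givenName": "first_name"},
}


def to_domain(system: str, record: dict) -> dict:
    """Rename a source record's fields to the domain model's names."""
    mapping = FIELD_MAPS[system]
    # Fields with no mapping are dropped rather than guessed at.
    return {mapping[k]: v for k, v in record.items() if k in mapping}


def differences(a: dict, b: dict) -> dict:
    """Return the domain fields whose values disagree between two records."""
    shared = a.keys() & b.keys()
    return {k: (a[k], b[k]) for k in sorted(shared) if a[k] != b[k]}


sis_rec = to_domain("sis", {"stu_id": "42", "lname": "Lovelace", "fname": "Ada"})
lms_rec = to_domain("lms", {"userId": "42", "surname": "Lovelace", "givenName": "A."})
conflicts = differences(sis_rec, lms_rec)
```

Here conflicts contains only first_name, which is the kind of discrepancy a data steward would then resolve.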
- Governance models: The resources and processes involved in controlling access to master data.
- Permissions can become very complicated quickly. Think of NT's nightmarish file permission hierarchy.
- The simplicity of the permission model needs to be factored into the decision process
- Easy to set/view for the data stewards and MDM developers
- Centralized vs Federated (or both): How is the data stored? Is it stored in a golden DB that is part of the MDM system or does MDM reach in to where the data is stored and re-create master data on the fly?
- CCC needs both:
- Federated capability enables buy-in from stakeholders who aren't comfortable or fully on board with MDM
- For performance and simplicity reasons, stakeholders will consider the golden DB approach as they become more comfortable with MDM, as new systems are brought online, or as systems are upgraded.
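The two storage styles can be sketched side by side. This is only an illustration under stated assumptions: GoldenStore, FederatedView, from_sis and from_lms are hypothetical names, the fetchers are stand-ins for calls into real source systems, and "later source wins" is a deliberately simplistic merge rule.

```python
from typing import Callable


class GoldenStore:
    """Centralized: master records are stored inside the MDM system."""

    def __init__(self) -> None:
        self._records: dict[str, dict] = {}

    def put(self, key: str, record: dict) -> None:
        self._records[key] = dict(record)

    def get(self, key: str) -> dict:
        return dict(self._records[key])


class FederatedView:
    """Federated: the master record is rebuilt from source systems on each read."""

    def __init__(self, fetchers: list[Callable[[str], dict]]) -> None:
        # Later fetchers override earlier ones on conflicting fields
        # (a precedence rule chosen only for this sketch).
        self._fetchers = fetchers

    def get(self, key: str) -> dict:
        merged: dict = {}
        for fetch in self._fetchers:
            merged.update(fetch(key))
        return merged


# Stand-ins for calls into real source systems.
def from_sis(key: str) -> dict:
    return {"student_id": key, "name": "Ada Lovelace"}


def from_lms(key: str) -> dict:
    return {"student_id": key, "email": "ada@example.edu"}


federated = FederatedView([from_sis, from_lms])
golden = GoldenStore()
# A stakeholder moving to the golden DB could be seeded from the federated view.
golden.put("42", federated.get("42"))
```

Because both classes expose the same get method, a consumer does not need to know which style backs a given domain, which is what makes supporting both practical.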
- Integration focused vs. information management
- Synchronization
Most vendors are built for a single data domain - CRM or Product.
- We would need to break new ground with educational data.
Open Source vendors:
- Akeneo - Not in Gartner quadrant. Not multi-domain. Product focused.
- Pimcore - Not in Gartner quadrant. Not multi-domain. Product & Content Management focused.
- Talend - Not in Gartner quadrant at all
Feedback from users:
- Significant time and expense involved.
Interesting stuff:
- Kettle ETL (open source)
- DataCleaner (commercial open source) for data quality