A data domain (domain) refers to a data model, such as student or course, and is defined by the parties responsible for data governance. The data domain defines the model schema and other attributes for that focus. Common examples of data domains are customers, students, employees, parts, and product orders.
YOUnite allows data architects to model multiple domains (customers, students, employees, parts, product orders, etc.) as part of a single solution.
...
MDM Data Store: The domain's data records are stored in YOUnite. This is used when the entire organization is comfortable or mandated to migrate a domain to a single store. The YOUnite data store is optimal for reference data such as a list of states, countries, zip-codes, etc.
Federated domains: Federated domains do not store their data in YOUnite but retrieve and update the data on the systems in which it resides, for example MIS, ERP, or CRM systems. Federated domains require adaptors, metadata, and governance configurations and are covered in detail *TODO: TUTORIAL ON FEDERATED*
Domain Model Schemas
A domain model schema refers to the attributes (properties), format, and other metadata that define how a specific domain should expect to store the data (either in the YOUnite Data Store or federated), for the purpose of standardizing how data is exchanged between systems. The Data Governance Steward is responsible for configuring and maintaining domain model schemas. A domain model schema is a JSON object that defines the properties for the domain's schema. The root node of the model schema is the properties element. See Valid Property Names and Valid Types for ModelSchema Properties below.
...
property | required | valid values | description |
---|---|---|---|
name | yes | Must be between 2 and 128 characters long and must start with an alphabetic character. | The domain name. Must be unique to the entire YOUnite deployment since domains are typically shared. TODO NEED SOME DESCRIPTION OF VALID CHARACTER PARAMETERS. |
description | no | 0 to 255 characters long. If longer it will be truncated. | A human readable description of the domain. |
zoneUuid | no | Owning zone's domain UUID | The zone that the domain, the domain's versions, and its data records will be tied to. If this is omitted the caller's current zone will be used. Note that the caller must have permissions to create a domain. |
domainType | no | MDM_DATA_STORE or FEDERATED | The domain type can be either: MDM_DATA_STORE, which is when the domain's data records are stored in YOUnite. This is used when the entire organization is comfortable or mandated to migrate a domain to a single store. MDM_DATA_STORE is optimal for reference data such as a list of states, countries, zip-codes, etc. FEDERATED domains do not store their data in YOUnite, but reference and update data on the systems in which it resides. Federated domains require adaptors, metadata, and governance configurations that are covered in detail *TODO: TUTORIAL ON FEDERATED*. |
The model is created in the next step. Models are tied to specific versions, which is covered below.
...
POST /domains/versions/<zone-uuid>
Code Block | ||
---|---|---|
| ||
{
  "modelSchema": {
    "properties": {
      "<property-name>": { "type": "<property-type>", ...item1 properties.... },
      "<property-name>": { ...item1 properties.... }
    }
  },
  "description": "<description>",
  "fastDuplicateDetectionProperties": "<property-name> [,<property-name2>, ....]"
} |
Domain Version Properties Descriptions
...
Once the domain/version has been created you can POST data records to, and retrieve data records from, the domain.
Domain Properties Details
The information below provides further domain property details:
- displayProperty
- uniquenessRules
- Federated Domains: Uniqueness Rules & Required Properties
- Valid property names
- Valid property types
...
Each domain must have a display property (displayProperty). The display property acts as a primary key for the domain. For example, the "states" domain below uses the abbreviation property as the display property.
Code Block | ||
---|---|---|
| ||
{
"name": "states",
"version": 1,
"json": {
"name" : "California",
"abbreviation" : "CA"
}
} |
Use the /drs endpoint and the appropriate domain and display property to GET a data record:
GET /drs?filters=name:states,displayProperty:CA
If there are multiple versions of a domain, and a version other than the default is needed, the version number can be included in the URI. For example, assume there are three versions of the "states" domain and the current version is version 3. The consumer can retrieve the California version 1 data record by using the following:
GET /drs?filters=name:states,version:1,displayProperty:CA
Note: See Posting a Data Record and Retrieving a Data Record sections for further details on posting/retrieving data records.
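The two GET requests above follow the same pattern, which can be sketched as a small helper. This is only an illustration: the function name `build_drs_query` is hypothetical, and the sketch assumes the filter values need no URL escaping, which the page does not specify.

```python
from typing import Optional


def build_drs_query(domain: str, display_value: str, version: Optional[int] = None) -> str:
    """Build the /drs request path for a single data record.

    The filter names (name, version, displayProperty) are taken from the
    examples on this page; everything else is an assumption.
    """
    filters = [f"name:{domain}"]
    # The version filter is only added when a non-default version is wanted.
    if version is not None:
        filters.append(f"version:{version}")
    filters.append(f"displayProperty:{display_value}")
    return "/drs?filters=" + ",".join(filters)


print(build_drs_query("states", "CA"))
print(build_drs_query("states", "CA", version=1))
```

Omitting `version` reproduces the first request and passing `version=1` reproduces the second.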
Display Property Rules
- The value provided for the displayProperty must be unique between all domain entries of a given domain type (e.g. each entry in the "state" domain must have a unique stateName).
- Only one property in a domain can be the displayProperty. If more than one property is required to ensure uniqueness see the Uniqueness Rules Property below.
- Display properties are limited to type STRING.
- Properties designated as the displayProperty are required; i.e. null values are not allowed.
- Display properties are case sensitive e.g. "California" is NOT equal to "california".
...
The optional uniquenessRules property is used as an added data record identifier to prevent data record duplication.
Uniqueness Rules
- MDM provides uniqueness rules to prevent data record duplication in the case where a simple displayProperty won't suffice.
- Optionally, specify a comma-separated list of domain properties whose values, in aggregate, must be unique across all the data records for the domain.
- If no uniquenessRules values are provided, MDM will use the displayProperty as the uniquenessRules.
- Uniqueness rule properties are limited to type STRING.
- Values in uniquenessRules properties are case sensitive e.g. "California" is NOT equal to "california".
...
The goal for federated domains is to keep the combined list of:
- Properties in the uniqueness rules to a minimum
- Properties defined with the "required" option to a minimum
If the properties list is long, some adaptors associated with the domain might not contain all of the properties and will be unable to add new records to a data domain.
It is important that a service associated with a domain:
...
} |
11/16/17 Per Mark: THINGS HAVE CHANGED…. in the “Domain Version Properties Descriptions” and throughout the page.
- No more displayProperty, uniquenessRules: they got replaced with "fastDuplicateDetectionProperties" (FDDP). See the glossary. It has a good description of "Deterministic Uniqueness", "De-Duplication" and "Probabilistic Uniqueness"; it also has a pretty good description of "Fast Duplicate Detection Properties".
- Basically, we can’t have dupes so FDDP are the way we prevent them.
Domain Uniqueness: One of the functions of an MDM system is to detect, resolve, and avoid duplicate data, which is a significant problem in an organization dependent on many disparate systems. Uniqueness refers to a Data Record's (DR) state as being the only representation within that domain across the systems managed by MDM. For example, there should only be one student DR for any given student. Refer to Deterministic Uniqueness and Fast Duplicate Detection Properties for how CCCTC MDM manages uniqueness.
Deterministic Uniqueness: Deterministic Uniqueness is an MDM approach to detecting whether an MDR is unique for its domain (refer to Domain Uniqueness). When a transaction through MDM attempts to POST (add) a new record, Deterministic Uniqueness combines one or more individual attributes (fields) that make up the MDR and queries the MDM stored values for that domain to make sure that they do not already exist. If they do, then a process is initiated to resolve and/or report the duplication. The attributes that collectively define uniqueness for that domain are identified by the Zone Data Steward. The CCCTC MDM implements Deterministic Uniqueness.
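The deterministic check described above can be illustrated with a minimal sketch. It is not the MDM implementation: `DomainStore`, `DuplicateRecordError`, and the student property names are hypothetical, the stored values are held in memory, and the comparison is assumed to be exact and case sensitive.

```python
from typing import Dict, List, Set, Tuple


class DuplicateRecordError(Exception):
    """Raised when a new record matches an existing one on every key property."""


class DomainStore:
    """Hypothetical in-memory stand-in for the values MDM stores for one domain."""

    def __init__(self, key_properties: List[str]) -> None:
        # The properties that, combined, define uniqueness for the domain.
        self.key_properties = key_properties
        self._seen: Set[Tuple[str, ...]] = set()
        self.records: List[Dict[str, str]] = []

    def _key(self, record: Dict[str, str]) -> Tuple[str, ...]:
        # Combine the individual attribute values into one composite key.
        return tuple(record[name] for name in self.key_properties)

    def add(self, record: Dict[str, str]) -> None:
        key = self._key(record)
        if key in self._seen:
            # A real system would start a resolve/report process here.
            raise DuplicateRecordError(f"duplicate on {self.key_properties}: {key}")
        self._seen.add(key)
        self.records.append(record)


store = DomainStore(["firstName", "lastName", "phone"])
store.add({"firstName": "Ana", "lastName": "Lopez", "phone": "555-0100"})
try:
    store.add({"firstName": "Ana", "lastName": "Lopez", "phone": "555-0100"})
except DuplicateRecordError as err:
    print("rejected:", err)
```

Keeping the composite keys in a set makes each check a single lookup, which is one plausible reading of why the properties are called "fast".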
Fast Duplicate Detection Properties: Fast Duplicate Detection Properties are attributes of a Domain and identify those fields that, when combined, MDM should use to detect whether the DR is unique. Refer to Domain Uniqueness, Deterministic Uniqueness, Duplicates and De-Duplication for more.
Probabilistic Uniqueness: Probabilistic Uniqueness attempts to go further than Deterministic Uniqueness in identifying duplicate transactions by assigning a priority to each of the attributes that collectively define what is a unique Master Data Record (MDR). As an example, uniqueness for a student may consist of the student first and last name, street, city, state, zip code and phone number. It may be that all the fields match an existing record except for the street name, which is only off by a few characters. This may indicate a spelling error when the street was manually entered. While the first and last name must match exactly and so were defined as a high priority when identifying duplicates, the street may have been defined as a low priority so, in this case, MDM would assume the existing MDR is "probably" a duplicate and would act accordingly. See Deterministic Uniqueness for contrast. The CCCTC MDM does not implement Probabilistic Uniqueness.
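For contrast with the deterministic approach, the student example above can be sketched as a weighted comparison. Since the CCCTC MDM does not implement Probabilistic Uniqueness, everything here is an assumption made for illustration: the property names, the weights, the 0.95 threshold, and the use of Python's `difflib.SequenceMatcher` as the similarity measure.

```python
from difflib import SequenceMatcher
from typing import Dict

# Hypothetical priorities: a higher weight gives the attribute more influence.
WEIGHTS = {"firstName": 3.0, "lastName": 3.0, "street": 1.0, "phone": 1.0}
# High-priority attributes that must match exactly.
EXACT_MATCH_REQUIRED = {"firstName", "lastName"}
# Assumed cutoff above which a record is treated as a probable duplicate.
THRESHOLD = 0.95


def similarity(a: str, b: str) -> float:
    """Return a ratio between 0.0 and 1.0; 1.0 means the strings are identical."""
    return SequenceMatcher(None, a, b).ratio()


def is_probable_duplicate(candidate: Dict[str, str], existing: Dict[str, str]) -> bool:
    # Any mismatch on a high-priority attribute rules out a duplicate.
    for name in EXACT_MATCH_REQUIRED:
        if candidate[name] != existing[name]:
            return False
    # Otherwise combine the per-attribute similarities into a weighted average.
    total = sum(WEIGHTS.values())
    score = sum(w * similarity(candidate[n], existing[n]) for n, w in WEIGHTS.items())
    return score / total >= THRESHOLD


existing = {"firstName": "Ana", "lastName": "Lopez",
            "street": "123 Main Street", "phone": "555-0100"}
typo = dict(existing, street="123 Mian Street")
print(is_probable_duplicate(typo, existing))
```

With these weights a street that is off by a couple of characters still scores above the threshold, while an unrelated street or a different last name does not.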
Domain Version Properties Descriptions
property | required | valid values | description |
---|---|---|---|
modelSchema | yes | See Model Schema Properties and Post a Domain below for details. | A JSON model describing the schema for the data domain; it defines the properties that make up the domain's schema. The root node of the model schema is the properties element. |
description | no | 0 to 255 characters long. If longer it will be truncated. | A human readable description of the domain version. |
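fastDuplicateDetectionProperties | yes | One or more property names, comma-separated. | The properties that, when combined, MDM uses to detect whether a data record is unique. |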
Once the domain/version has been created you can POST data records to, and retrieve data records from, the domain.
Domain Properties Details
The information below provides further domain property details:
- fastDuplicateDetectionProperties
- Federated Domains: Uniqueness Rules & Required Properties
- Valid property names
- Valid property types
Fast Duplicate Detection Properties
Each domain must have a duplicate detection property (fastDuplicateDetectionProperties). Fast Duplicate Detection Properties are attributes of a domain and identify those fields that, when combined, MDM should use to detect whether the data record is unique. Refer to Domain Uniqueness, Deterministic Uniqueness, and Probabilistic Uniqueness for more information.
&&&Update code block and other content in this section??&&&
Code Block | ||
---|---|---|
| ||
{
"name": "states",
"version": 1,
"json": {
"name" : "California",
"abbreviation" : "CA"
}
} |
Use the /drs endpoint and the appropriate domain and display property to GET a data record:
GET /drs?filters=name:states,displayProperty:CA
If there are multiple versions of a domain, and a version other than the default is needed, the version number can be included in the URI. For example, assume there are three versions of the "states" domain and the current version is version 3. The consumer can retrieve the California version 1 data record by using the following:
GET /drs?filters=name:states,version:1,displayProperty:CA
Note: See Posting a Data Record and Retrieving a Data Record sections for further details on posting/retrieving data records.
Fast Duplicate Detection Properties Rules
- The value provided for the displayProperty must be unique between all domain entries of a given domain type (e.g. each entry in the "state" domain must have a unique stateName).
- Only one property in a domain can be the displayProperty. If more than one property is required to ensure uniqueness see the Uniqueness Rules Property below.
- Display properties are limited to type STRING.
- Properties designated as the displayProperty are required; i.e. null values are not allowed.
- Display properties are case sensitive e.g. "California" is NOT equal to "california".
Valid Property Names
...
Property | Description |
---|---|
minItems | Minimum items allowed in the array. |
maxItems | Maximum items allowed in the array. |
default | If the item is not provided or is null, the default value is used. |
items | Contains the list of sub-properties in the node. |
required | A non-null value for this item must be provided (false by default). |
description | A human-readable description of the property. |
&&&11/16/17 From Mark: double back and check the section titled "An Example of Creating a Domain in Two Steps” is accurate since some of the API options parameters may change.&&&
An Example of Creating a Domain in Two Steps
...
POST /domains/versions/7f28180b-7d9f-42b5-b5ed-d4a0e7ec09fc
Code Block | ||
---|---|---|
| ||
{
  "fastDuplicateDetectionProperties": "abbreviation",
  "description": "A reference list of states in the North American States: USA, Mexico and Canada",
  "modelSchema": {
    "properties": {
      "name": { "type": "string", "description": "The state's official name", "min": 2, "max": 80, "required": true },
      "abbreviation": { "type": "string", "description": "The state's official abbreviation", "min": 2, "max": 2, "required": true }
    }
  }
} |
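To make the effect of the property options in the model schema above more concrete, the following sketch checks a data record against them. It is a client-side illustration only, not YOUnite's validation logic: `validate_record` is a hypothetical helper, and it assumes that "min" and "max" bound the string length and that "required" means a non-null value must be present.

```python
from typing import Any, Dict, List

# The property options from the "states" model schema (descriptions omitted).
MODEL_SCHEMA: Dict[str, Any] = {
    "properties": {
        "name": {"type": "string", "min": 2, "max": 80, "required": True},
        "abbreviation": {"type": "string", "min": 2, "max": 2, "required": True},
    }
}


def validate_record(record: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the record passed."""
    errors: List[str] = []
    for prop, rules in schema["properties"].items():
        value = record.get(prop)
        if value is None:
            # "required" is false by default, per the property table.
            if rules.get("required", False):
                errors.append(f"{prop}: a non-null value is required")
            continue
        if rules.get("type") == "string":
            if not isinstance(value, str):
                errors.append(f"{prop}: expected a string")
                continue
            # Assumption: min/max are limits on the string length.
            if "min" in rules and len(value) < rules["min"]:
                errors.append(f"{prop}: shorter than {rules['min']} characters")
            if "max" in rules and len(value) > rules["max"]:
                errors.append(f"{prop}: longer than {rules['max']} characters")
    return errors


print(validate_record({"name": "California", "abbreviation": "CA"}, MODEL_SCHEMA))
print(validate_record({"name": "California", "abbreviation": "CAL"}, MODEL_SCHEMA))
```

The first record passes with no problems reported, while the second is flagged because its abbreviation exceeds the two-character maximum.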
...