A data domain (domain) refers to a data model, such as student or course, and is defined by the parties responsible for data governance. The data domain defines the model schema and other attributes for that focus. Common examples of data domains are customers, students, employees, parts, and product orders.

YOUnite allows data architects to model multiple domains (customers, students, employees, parts, product orders, etc.) as part of a single solution.

...

MDM Data Store: The domain's data records are stored in YOUnite. This is used when the entire organization is comfortable migrating, or is mandated to migrate, a domain to a single store. The YOUnite data store is optimal for reference data such as lists of states, countries, zip codes, etc.

Federated domains: Federated domains do not store their data in YOUnite but retrieve and update the data on the systems in which it resides, for example MIS, ERP, or CRM systems. Federated domains require adaptors, metadata, and governance configurations and are covered in detail *TODO: TUTORIAL ON FEDERATED*

Domain Model Schemas

A domain model schema refers to the attributes (properties), format, and other metadata that define how a specific domain should expect to store its data (either in the YOUnite Data Store or federated), for the purpose of standardizing how data is exchanged between systems. The Data Governance Steward is responsible for configuring and maintaining domain model schemas. A domain model schema is a JSON object that defines the properties for the domain's schema. The root node of the model schema is the properties element. See Valid Property Names and Valid Types for ModelSchema Properties below.

...

property | required | valid values | description
name | yes | Must be between 2 and 128 characters long and must start with an alpha character. The name property value can only contain upper/lower case alpha characters, digits, "_" and "-". | The domain name. Must be unique to the entire YOUnite deployment since domains are typically shared.
description | no | 0 to 255 characters long. If longer it will be truncated. | A human readable description of the domain.
zoneUuid | no | Owning zone's domain UUID | The zone that the domain, the domain's versions, and its data records will be tied to. If this is omitted the caller's current zone will be used. Note that the caller must have permissions to create a domain.
domainType | no | MDM_DATA_STORE or FEDERATED | The domain type can be either MDM_DATA_STORE or FEDERATED. With MDM_DATA_STORE, the domain's data records are stored in YOUnite. This is used when the entire organization is comfortable migrating, or is mandated to migrate, a domain to a single store. MDM_DATA_STORE is optimal for reference data such as lists of states, countries, zip codes, etc. FEDERATED domains do not store their data in YOUnite, but reference and update data on the systems in which it resides. Federated domains require adaptors, metadata, and governance configurations that are covered in detail *TODO: TUTORIAL ON FEDERATED*.

The model is created in the next step. Models are tied to specific versions, which is covered below.
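The naming rule for the name property can be checked on the client before a request is sent. The following is a minimal sketch, assuming the rule of 2 to 128 characters, a leading letter, and only letters, digits, "_" and "-". The function name is hypothetical and not part of YOUnite, and treating "alpha" as ASCII letters only is an assumption.

```python
import re

# Assumed rule: 2 to 128 characters, first character is an ASCII letter,
# remaining characters are ASCII letters, digits, "_" or "-".
_DOMAIN_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_-]{1,127}")


def is_valid_domain_name(name: str) -> bool:
    """Return True if the name satisfies the assumed domain naming rule."""
    return _DOMAIN_NAME_RE.fullmatch(name) is not None
```

A caller might run this check before building the domain request so that an invalid name is rejected early rather than by the server.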

...

POST /domains/versions/<zone-uuid>

Code Block
languagetext
{
	"modelSchema": {
		"properties": {
			"<property-name>": {
				"type": "<property-type>",
				...item1 properties....
			},
			"<property-name>": {
				...item2 properties....
			}
		}
	},
	"description": "<description>",
	"fastDuplicateDetectionProperties": "<property-name> [,<property-name2>, ....]"
}

Domain Version Properties Descriptions

...

Once the domain/version has been created you can POST data records to, and retrieve data records from, the domain.

Domain Properties Details

The information below provides further domain property details:

...

Each domain must have a display property (displayProperty). The display property acts as a primary key for the domain. For example, the "states" domain below uses the abbreviation property as the display property.

Code Block
languagejs
{
	"name": "states",
	"version": 1,
	"json": {
		"name" : "California", 
		"abbreviation" : "CA"
	}
 }

Use the /drs endpoint and the appropriate domain and display property to GET a data record:

GET /drs?filters=name:states,displayProperty:CA

If there are multiple versions of a domain, and a version other than the default is needed, the version number can be included in the URI. For example, assume there are three versions of the "states" domain and the current version is version 3. The consumer can retrieve the California version 1 data record by using the following:

GET /drs?filters=name:states,version:1,displayProperty:CA

Note: See Posting a Data Record and Retrieving a Data Record sections for further details on posting/retrieving data records.
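The two GET requests above differ only in the optional version filter. The following is a minimal sketch of assembling the request path, assuming the filters are plain "key:value" pairs joined by commas exactly as shown. The function name is hypothetical, and URL encoding of special characters is not addressed here.

```python
from typing import Optional


def build_drs_path(domain: str, display_value: str, version: Optional[int] = None) -> str:
    """Build a /drs request path from the domain name, an optional
    version number, and the display property value."""
    filters = [f"name:{domain}"]
    if version is not None:
        # The version filter is only added when a non-default version is wanted.
        filters.append(f"version:{version}")
    filters.append(f"displayProperty:{display_value}")
    return "/drs?filters=" + ",".join(filters)
```

Calling build_drs_path("states", "CA") yields the first request shown above, and passing version=1 yields the second.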

Display Property Rules
  • The value provided for the displayProperty must be unique among all entries of a given domain (e.g., each entry in the "states" domain must have a unique abbreviation).
  • Only one property in a domain can be the displayProperty. If more than one property is required to ensure uniqueness, see the Uniqueness Rules Property below.
  • Display properties are limited to type STRING.
  • Properties designated as the displayProperty are required, i.e., null values are not allowed.
  • Display properties are case sensitive, e.g., "California" is NOT equal to "california".

...

The optional uniquenessRules property is used as an added data record identifier to prevent data record duplication.

Uniqueness Rules
  • MDM provides uniqueness rules to prevent data record duplication in the case where a simple displayProperty won't suffice.
  • Optionally, specify a comma-separated list of domain properties whose values, in aggregate, must be unique across all the data records for the domain.
  • If no uniquenessRules values are provided, MDM will use the displayProperty as the uniquenessRules.
  • Uniqueness rule properties are limited to type STRING.
  • Values in uniquenessRules properties are case sensitive, e.g., "California" is NOT equal to "california".

...

The goal for federated domains is to keep the combined list of:

  1. Properties in the uniqueness rules to a minimum
  2. Properties defined with the "required" option to a minimum

If the properties list is long, some adaptors associated with the domain might not contain all of the properties and will be unable to add new records to a data domain.
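The concern above can be illustrated with a small check: an adaptor can only add new records if it supplies every property that is either required or part of the uniqueness rules. This is a sketch under that assumption; the function name and the way property lists are passed in are hypothetical and do not reflect how YOUnite evaluates adaptors.

```python
from typing import Iterable, List


def missing_properties(
    adaptor_properties: Iterable[str],
    required_properties: Iterable[str],
    uniqueness_properties: Iterable[str],
) -> List[str]:
    """Return the properties an adaptor lacks, sorted by name.

    An empty result means the adaptor covers the combined list of
    required and uniqueness properties.
    """
    needed = set(required_properties) | set(uniqueness_properties)
    return sorted(needed - set(adaptor_properties))
```

The shorter the combined list, the more likely it is that every adaptor returns an empty result and can therefore add records.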

It is important that a service associated with a domain:

...


}

11/16/17 Per Mark: THINGS HAVE CHANGED…. in the “Domain Version Properties Descriptions” and throughout the page.

  • No more displayProperty or uniquenessRules: they were replaced with "fastDuplicateDetectionProperties" (FDDP). See the glossary, which has good descriptions of "Deterministic Uniqueness", "De-Duplication", "Probabilistic Uniqueness", and "Fast Duplicate Detection Properties".
  • Basically, we can’t have dupes so FDDP are the way we prevent them.

Domain Uniqueness: One of the functions of an MDM system is to detect, resolve, and avoid duplicate data, which is a significant problem in an organization dependent on many disparate systems. Uniqueness refers to a Data Record's (DR) state as being the only representation within that domain across the systems managed by MDM. For example, there should be only one student DR for any given student. Refer to Deterministic Uniqueness and Fast Duplicate Detection Properties for how CCCTC MDM manages uniqueness.

Deterministic Uniqueness: Deterministic Uniqueness is an MDM approach to detecting whether a Master Data Record (MDR) is unique for its domain (refer to Domain Uniqueness). When a transaction through MDM attempts to POST (add) a new record, Deterministic Uniqueness combines one or more individual attributes (fields) that make up the MDR and queries the MDM stored values for that domain to make sure that the combination does not already exist. If it does, then a process is initiated to resolve and/or report the duplication. The attributes that collectively define uniqueness for that domain are identified by the Zone Data Steward. The CCCTC MDM implements Deterministic Uniqueness.

Fast Duplicate Detection Properties: Fast Duplicate Detection Properties are attributes of a Domain and identify those fields that, when combined, MDM should use to detect whether the DR is unique. Refer to Domain Uniqueness, Deterministic Uniqueness, Duplicates and De-Duplication for more.
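The deterministic check described in these entries can be sketched as follows: the values of the fast duplicate detection properties are combined into one key, and a new record is rejected if that key is already stored. This is an illustration only. The class and function names are hypothetical, the in-memory dictionary stands in for the MDM store, and exact (case-sensitive) comparison of values is an assumption.

```python
from typing import Dict, Tuple


def fddp_key(record: dict, fddp: str) -> Tuple:
    """Combine the values of the comma-separated FDDP names into one key."""
    names = [n.strip() for n in fddp.split(",") if n.strip()]
    return tuple(record[n] for n in names)


class DomainStore:
    """Hypothetical in-memory stand-in for the stored records of one domain."""

    def __init__(self, fddp: str) -> None:
        self.fddp = fddp
        self._records: Dict[Tuple, dict] = {}

    def add(self, record: dict) -> bool:
        """Store the record and return True, or return False if a record
        with the same combined FDDP values already exists."""
        key = fddp_key(record, self.fddp)
        if key in self._records:
            return False
        self._records[key] = record
        return True
```

In a real deployment a rejected record would trigger the resolve/report process mentioned above rather than a simple False return.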

Probabilistic Uniqueness: Probabilistic Uniqueness attempts to go further than Deterministic Uniqueness in identifying duplicate transactions by assigning a priority to each of the attributes that collectively define what is a unique Master Data Record (MDR). As an example, uniqueness for a student may consist of the student's first and last name, street, city, state, zip code, and phone number. It may be that all the fields match an existing record except for the street name, which is only off by a few characters. This may indicate a spelling error when the street was manually entered. While the first and last name must match exactly, and so were defined as a high priority when identifying duplicates, the street may have been defined as a low priority. In this case, MDM would assume the new record is "probably" a duplicate of the existing MDR and would act accordingly. See Deterministic Uniqueness for contrast. The CCCTC MDM does not implement Probabilistic Uniqueness.
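To make the contrast with deterministic matching concrete, here is a small illustrative sketch of the priority idea. It is not how any MDM product implements probabilistic matching: the function name, the two priority levels, the use of Python's difflib similarity ratio, and the 0.8 threshold are all assumptions chosen for the example.

```python
from difflib import SequenceMatcher
from typing import Iterable


def classify(
    candidate: dict,
    existing: dict,
    high_priority: Iterable[str],
    low_priority: Iterable[str],
    threshold: float = 0.8,
) -> str:
    """Compare a candidate record with an existing one.

    High-priority attributes must match exactly. Low-priority attributes
    may differ slightly and still count as a probable duplicate.
    """
    high = list(high_priority)
    low = list(low_priority)
    if any(candidate[p] != existing[p] for p in high):
        return "distinct"
    if all(candidate[p] == existing[p] for p in low):
        return "duplicate"
    # Low-priority values differ: accept them only if each pair is similar enough.
    similar = all(
        SequenceMatcher(None, candidate[p], existing[p]).ratio() >= threshold
        for p in low
    )
    return "probable duplicate" if similar else "distinct"
```

With first and last name as high priority and street as low priority, a candidate whose street is "Main Stret" would be classified as a probable duplicate of an existing record with "Main Street", while a different last name would make the records distinct.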


Domain Version Properties Descriptions

property | required | valid values | description
modelSchema | yes | See Model Schema Properties and Post a Domain below for details. | A JSON model describing the schema for the data domain; it defines the properties that make up the domain's schema. The root node of the model schema is the properties element.
description | no | 0 to 255 characters long. If longer it will be truncated. | A human readable description of the domain version.
fastDuplicateDetectionProperties | | |


Once the domain/version has been created you can POST data records to, and retrieve data records from, the domain.

Domain Properties Details

The information below provides further domain property details:

Fast Duplicate Detection Properties

Each domain must have a duplicate detection property (fastDuplicateDetectionProperties). Fast Duplicate Detection Properties are attributes of a domain and identify those fields that, when combined, MDM should use to detect whether the data record is unique. Refer to Domain Uniqueness, Deterministic Uniqueness, and Probabilistic Uniqueness for more information.

&&&Update code block and other content in this section??&&&

Code Block
languagejs
{
	"name": "states",
	"version": 1,
	"json": {
		"name" : "California", 
		"abbreviation" : "CA"
	}
 }

Use the /drs endpoint and the appropriate domain and display property to GET a data record:

GET /drs?filters=name:states,displayProperty:CA

If there are multiple versions of a domain, and a version other than the default is needed, the version number can be included in the URI. For example, assume there are three versions of the "states" domain and the current version is version 3. The consumer can retrieve the California version 1 data record by using the following:

GET /drs?filters=name:states,version:1,displayProperty:CA

Note: See Posting a Data Record and Retrieving a Data Record sections for further details on posting/retrieving data records.

Fast Duplicate Detection Properties Rules
  • The value provided for the displayProperty must be unique among all entries of a given domain (e.g., each entry in the "states" domain must have a unique abbreviation).
  • Only one property in a domain can be the displayProperty. If more than one property is required to ensure uniqueness, see the Uniqueness Rules Property below.
  • Display properties are limited to type STRING.
  • Properties designated as the displayProperty are required, i.e., null values are not allowed.
  • Display properties are case sensitive, e.g., "California" is NOT equal to "california".

Valid Property Names

...

Property | Description
minItems | Minimum items allowed in the array.
maxItems | Maximum items allowed in the array.
default | If the item is not provided or is null, the default value is used.
items | Contains the list of sub-properties in the node.
required | A non-null value for this item must be provided (false by default).
description | A human-readable description of the property.


&&&11/16/17 From Mark: double back and check the section titled "An Example of Creating a Domain in Two Steps” is accurate since some of the API options parameters may change.&&&

An Example of Creating a Domain in Two Steps

...

POST /domains/versions/7f28180b-7d9f-42b5-b5ed-d4a0e7ec09fc

Code Block
languagetext
{
	"fastDuplicateDetectionProperties": "abbreviation",
	"description": "A reference list of states in the North American States: USA, Mexico and Canada",
	"modelSchema": {
		"properties": {
			"name": {
				"type": "string",
				"description": "The state's official name",
				"min": 2,
				"max": 80,
				"required": true
			},
			"abbreviation": {
				"type": "string",
				"description": "The state's official abbreviation",
				"min": 2,
				"max": 2,
				"required": true
			}
		}
	}
}
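Before posting a domain version body like the one above, it can help to confirm that every name listed in fastDuplicateDetectionProperties is actually defined in the model schema. The sketch below builds the "states" body as a Python dictionary and runs that check on the client. The function name is hypothetical, and whether YOUnite performs the same validation on the server is not stated in this page.

```python
import json
from typing import List


def undefined_fddp_names(body: dict) -> List[str]:
    """Return FDDP names that are missing from modelSchema.properties."""
    defined = body["modelSchema"]["properties"]
    names = [n.strip() for n in body["fastDuplicateDetectionProperties"].split(",")]
    return [n for n in names if n and n not in defined]


body = {
    "fastDuplicateDetectionProperties": "abbreviation",
    "description": "A reference list of states in the North American States: USA, Mexico and Canada",
    "modelSchema": {
        "properties": {
            "name": {"type": "string", "min": 2, "max": 80, "required": True},
            "abbreviation": {"type": "string", "min": 2, "max": 2, "required": True},
        }
    },
}

# Serialize only when every FDDP name refers to a defined property.
if not undefined_fddp_names(body):
    payload = json.dumps(body)
```

The resulting payload string is what would be sent as the request body of the POST shown above.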

...