Data Domains

A data domain (domain) refers to a data model, such as student or course, and is defined by the parties responsible for data governance. The data domain defines the model schema and other attributes for that focus. Common examples of data domains are customers, students, employees, parts, and product orders.  Data domains are typically created by the Data Governance Steward (DGS).

YOUnite allows the DGS to architect models for multiple domains (customers, students, employees, parts, product orders, etc.) as part of a single solution.

Once a version of a domain is created and its data records are loaded (or mapped, in a federated model), it can be referenced by other data domains, adaptors, and API consumers as a source of truth. Domains have version numbers so that other domains, adaptors, and applications can bind to a specific domain and version (e.g. students:v3). Data governance can designate data at specific adaptors as the master data and can control which zone has the appropriate ACLs to access and update it.
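
To illustrate binding to a specific domain and version, the sketch below splits a reference such as students:v3 into its parts. The "<name>:v<number>" syntax is inferred from the example above, and the names DomainRef and parse_domain_ref are hypothetical helpers, not part of the YOUnite API.

```python
import re
from typing import NamedTuple, Optional


class DomainRef(NamedTuple):
    name: str
    version: Optional[int]  # None means "use the domain's default version"


# Assumed syntax: "<name>" or "<name>:v<number>", as in "students:v3".
_REF = re.compile(r"(?P<name>[^:\s]+)(?::v(?P<version>[1-9][0-9]*))?")


def parse_domain_ref(ref: str) -> DomainRef:
    """Split a domain reference into its name and optional version number."""
    match = _REF.fullmatch(ref)
    if match is None:
        raise ValueError(f"not a valid domain reference: {ref!r}")
    version = match.group("version")
    return DomainRef(match.group("name"), int(version) if version else None)
```

A reference without a version suffix is returned with version set to None, which a caller could interpret as the domain's default version.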


Domain Types

The data records for a domain can be stored either as:

MDM Data Store: The domain's data records are stored in YOUnite. Use this type when the entire organization is willing, or mandated, to migrate a domain to a single store. The YOUnite data store is well suited to reference data such as lists of states, countries, and ZIP codes.

Federated: Federated domains do not store their data in YOUnite; they retrieve and update the data on the systems in which it resides, for example MIS, ERP, or CRM systems. Federated domains require adaptors, metadata, and governance configurations. Accessing federated data is covered in Accessing Data Records.

Domain Model Schemas

A domain model schema refers to the attributes (properties), format, and other metadata that define how a specific domain expects to store its data (either in the YOUnite data store or federated), for the purpose of standardizing how data is exchanged between systems. The Data Governance Steward is responsible for configuring and maintaining domain model schemas. A domain model schema is a JSON object that defines the properties of the domain's schema. The root node of the model schema is the properties element. See Valid Property Names and Valid Model Schema Types below.

Domain Creation Overview

An overview of the domain creation process is described below, and is followed by an example of the domain creation process and then posting records to and retrieving records from the domain. 

POST the Domain

The first step in creating a domain is to define the domain name. You can also define the new domain's type and the zone it is attached to (its "owning zone"). If you do not define the type and zone, YOUnite uses the defaults described below:

POST /domains

{ "name": "<domain name>", "description": "<human readable description of the data domain>", "zoneUuid": "<owning zone uuid>", "domainType": "<domain type>" }

Domain Properties Descriptions

| property | required | valid values | description |
| --- | --- | --- | --- |
| name | yes | Must be 2 to 128 characters long and must start with an alpha character. Can only contain upper/lower case alpha characters, digits, "_", and "-". | The domain name. Must be unique across the entire YOUnite deployment since domains are typically shared. |
| description | no | 0 to 255 characters long. Longer values are truncated. | A human-readable description of the domain. |
| zoneUuid | no | The owning zone's UUID. | The zone that the domain, the domain's versions, and its data records will be tied to. If this is omitted, the caller's current zone is used. Note that the caller must have permission to create a domain. |
| domainType | no | MDM_DATA_STORE or FEDERATED | MDM_DATA_STORE (default): the domain's data records are stored in YOUnite. This is used when the entire organization is willing, or mandated, to migrate a domain to a single store. It is well suited to reference data such as lists of states, countries, and ZIP codes. FEDERATED: the domain does not store its data in YOUnite, but references and updates data on the systems in which it resides. Federated domains require adaptors, metadata, and governance configurations. |
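
The constraints in the table above can be checked on the client before calling POST /domains. The following is a minimal sketch, not YOUnite client code: build_domain_request and its constants are hypothetical names, and the function only assembles the JSON body without sending it.

```python
import re

# Name rule from the table: 2 to 128 characters, starting with a letter,
# followed by letters, digits, "_" or "-".
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_-]{1,127}")
DOMAIN_TYPES = ("MDM_DATA_STORE", "FEDERATED")
MAX_DESCRIPTION = 255


def build_domain_request(name, description=None, zone_uuid=None, domain_type=None):
    """Return a dict that can be serialized as the POST /domains body."""
    if NAME_PATTERN.fullmatch(name) is None:
        raise ValueError(f"invalid domain name: {name!r}")
    body = {"name": name}
    if description is not None:
        # Longer descriptions are truncated anyway; trimming here mirrors that.
        body["description"] = description[:MAX_DESCRIPTION]
    if zone_uuid is not None:
        body["zoneUuid"] = zone_uuid
    if domain_type is not None:
        if domain_type not in DOMAIN_TYPES:
            raise ValueError(f"invalid domainType: {domain_type!r}")
        body["domainType"] = domain_type
    return body
```

Optional fields are left out of the body when they are not supplied, so that the documented defaults apply.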

The model is created in the next step. Models are tied to specific versions, which is covered below.

POST a Domain Version

With the domain in place, its first version can be created. The domain version defines the properties that make up its model schema. You may want to create a new domain version if you want to add more properties to the model, for instance.

Domain version numbers are generated automatically, starting at 1 and continuing in ascending order. The first version of a domain is the default version and remains the default even if more versions of the domain are created. The PATCH method for the /domains/<domain-uuid> endpoint can be used to change a domain's default version. See the YOUnite API for implementation details.

The root node of the model schema is the properties element. See Valid Property Names and Valid Types for Model Schema Properties below for details.

A domain version is defined with a domain JSON Object as described below:

POST /domains/versions/<zone-uuid>

{ "modelSchema": { "properties": { "<property-name>": { "type": "<property-type>", ...item1 properties.... }, "<property-name>": { ... } } }, "description": "<description>", "fastDuplicateDetectionProperties": "<property-name>[, <property-name2>, ...]" }

Note: The displayProperty and uniquenessRules properties have been replaced by fastDuplicateDetectionProperties (FDDP). Duplicate data records are not allowed, and FDDP is the mechanism that prevents them. See the glossary for descriptions of Deterministic Uniqueness, De-Duplication, Probabilistic Uniqueness, and Fast Duplicate Detection Properties.

Domain Version Properties Descriptions

| property | required | valid values | description |
| --- | --- | --- | --- |
| modelSchema | yes | See Valid Property Names and Valid Model Schema Types below for details. | A JSON model describing the schema for the data domain; it defines the properties that make up the domain's schema. The root node of the model schema is the properties element. |
| description | no | 0 to 255 characters long. Longer values are truncated. | A human-readable description of the domain version. |
| fastDuplicateDetectionProperties | yes | A comma-separated list of one or more valid properties for the given domain version. Each property must be of type String, Number, Int, or Boolean. | The properties that ensure a given data record is unique within the domain version. YOUnite uses them as uniqueness rules to prevent data record duplication: the values of the listed properties, in aggregate, must be unique across all the data records for the domain. Only the simple datatypes STRING, NUMBER, INT, and BOOLEAN are allowed. Values are case sensitive, e.g. "California" is NOT equal to "california". |
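
As a rough illustration of the fastDuplicateDetectionProperties constraints, the sketch below checks a comma-separated list against a model schema before a domain version is posted. check_fddp is a hypothetical helper; it looks only at top-level properties and assumes type names are written as in the schema examples (for instance "string").

```python
SIMPLE_TYPES = {"string", "number", "int", "boolean"}


def check_fddp(model_schema, fddp):
    """Return a list of problems found in a comma-separated FDDP string."""
    properties = model_schema.get("properties", {})
    names = [part.strip() for part in fddp.split(",") if part.strip()]
    problems = []
    if not names:
        problems.append("at least one property is required")
    for name in names:
        spec = properties.get(name)
        if spec is None:
            problems.append(f"{name}: not defined in the model schema")
        elif str(spec.get("type", "")).lower() not in SIMPLE_TYPES:
            problems.append(f"{name}: type must be string, number, int, or boolean")
    return problems
```

An empty result means the list names only existing properties of a simple type.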

Once the domain/version has been created you can POST data records to, and retrieve data records from, the domain.

Domain Properties Details

The information below provides further domain property details:

Fast Duplicate Detection Properties (FDDPS)

Each domain must have a duplicate detection property (fastDuplicateDetectionProperties). Fast Duplicate Detection Properties (FDDPS) are attributes of a domain and identify those fields that, when combined, MDM should use to detect whether the data record is unique. Refer to Domain Uniqueness, Deterministic Uniqueness, and Probabilistic Uniqueness for more information.
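
The idea of combining several fields into one uniqueness check can be sketched as follows. This is only an illustration of the concept, not YOUnite's implementation; fddp_key and find_duplicates are hypothetical names, and records are plain dictionaries.

```python
def fddp_key(record, fddp_names):
    """Build a composite key from the FDDP values; strings compare case-sensitively."""
    return tuple(record.get(name) for name in fddp_names)


def find_duplicates(records, fddp_names):
    """Return the records whose composite FDDP key has already been seen."""
    seen = set()
    duplicates = []
    for record in records:
        key = fddp_key(record, fddp_names)
        if key in seen:
            duplicates.append(record)
        else:
            seen.add(key)
    return duplicates
```

Because the key is built from all listed properties together, two records are duplicates only if every listed value matches exactly.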

"fastDuplicateDetectionProperties": {

      "description": "The optional property that serves as a label for data records defined by the domain version.  It also can be used as a filter in the API's GET resource.",

      "type": "string"

    },

{ "name": "states", "version": 1, "json": { "name" : "California", "abbreviation" : "CA" }  }

Use the /drs endpoint and the appropriate domain and display property to GET a data record:

GET /drs?filters=name:states,displayProperty:CA

If there are multiple versions of a domain and a version other than the default is needed, the version number can be included in the URI. For example, assume there are three versions of the "states" domain and the current version is version 3. The consumer can retrieve the version 1 data record for California by using the following:

GET /drs?filters=name:states,version:1,displayProperty:CA
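
The two request forms above differ only in the optional version filter, which the sketch below makes explicit. drs_path is a hypothetical helper that builds the path string; it assumes the filter values contain no commas or colons and does not perform URL encoding.

```python
def drs_path(domain, display_value, version=None):
    """Build the /drs request path using the filter syntax shown above."""
    filters = [f"name:{domain}"]
    if version is not None:
        # Omit the version filter to get the domain's default version.
        filters.append(f"version:{version}")
    filters.append(f"displayProperty:{display_value}")
    return "/drs?filters=" + ",".join(filters)
```

Calling drs_path("states", "CA") reproduces the first request, and passing version=1 reproduces the second.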

Note: See Posting a Data Record and Retrieving a Data Record sections for further details on posting/retrieving data records.

Fast Duplicate Detection Properties Rules

Fast Duplicate Detection Properties are attributes of a domain and identify those fields that, when combined, MDM should use to detect whether the data record is unique. Refer to Domain Uniqueness, Deterministic Uniqueness, and De-Duplication for more information.

  • The value provided for the displayProperty must be unique between all domain entries of a given domain type (e.g. each entry in the "state" domain must have a unique stateName).

  • Only one property in a domain can be the displayProperty. If more than one property is required to ensure uniqueness see the Uniqueness Rules Property below.

  • Display properties are limited to type STRING.

  • Properties designated as the displayProperty are required; i.e. null values are not allowed.

  • Display properties are case sensitive e.g. "California" is NOT equal to "california".

Valid Property Names

  • Must start with a letter or "_" (underscore).

  • Can only contain letters, digits, "-" (dash), and "_" (underscore).

  • Can be up to 64 characters in length.

  • Are case-insensitive.

  • If two properties have the same name at the same level, only one will be used. In Example 1 below, only one "name" property will be used. In Example 2, both will be used because "name" occurs at different levels in the JSON structure.



Example 1
{ "properties": { ... "name": {...}, "city": {...}, "state": {...}, "name": {...}, ... } }
Example 2
{ "properties": { ... "owner": { "name": {...}, "phone": {...}, .... }, "pet": { "name": {...}, "classification": {...}, .... }, .... } }
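
The naming rules above translate into a short check. The sketch below is illustrative only: is_valid_property_name and find_name_clashes are hypothetical helpers, and it assumes letters are allowed after the first character. The clash check takes a list of names rather than a parsed JSON object, because most JSON parsers silently keep only one of two duplicate keys.

```python
import re

# Starts with a letter or "_", then letters, digits, "-" or "_"; 64 characters at most.
PROPERTY_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_-]{0,63}")


def is_valid_property_name(name):
    """Check a single property name against the rules listed above."""
    return PROPERTY_NAME.fullmatch(name) is not None


def find_name_clashes(names):
    """Return names that repeat at one level, compared case-insensitively."""
    seen = set()
    clashes = []
    for name in names:
        folded = name.lower()
        if folded in seen:
            clashes.append(name)
        seen.add(folded)
    return clashes
```

Applied to the names in Example 1, find_name_clashes reports the second "name"; in Example 2 each level is checked separately, so no clash is reported.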

Valid Model Schema Types

Each property, with the exception of a node, requires a type property, e.g.:

"type": "string"

Rules About Required and Default Values

  • If a domain property has a default value defined in its modelSchema, then any domain data records posted will use the default value if the property either does not include a value or is sent in with a null value.

For example, if a domain named "College" contains a model schema with a property StateAbbreviation with a default value of "CA",
then any college posted to the College domain without the StateAbbreviation property, or with StateAbbreviation
set with a null value, will use the default value "CA".

  • If a domain property has required set to true, then a valid non-null value must be posted for it (by default, required is set to false).

For example, if a college domain model schema has a property CollegeName with required set to true, then any data record posted must include the CollegeName property or a BAD_REQUEST (400) will be returned.

  • If both required and default are used to define a given property, then default is ignored and POSTing data must include the required property.
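
The three rules above can be expressed in a few lines. This is a minimal sketch that handles only top-level properties; apply_rules is a hypothetical helper, and the ValueError stands in for the BAD_REQUEST (400) response.

```python
def apply_rules(schema_properties, record):
    """Return a copy of record with defaults applied, or raise ValueError."""
    result = dict(record)
    for name, spec in schema_properties.items():
        if result.get(name) is not None:
            continue  # a non-null value was posted; nothing to do
        # From here on the property is either absent or null.
        if spec.get("required", False):
            # required takes precedence: any default is ignored.
            raise ValueError(f"BAD_REQUEST (400): {name} is required")
        if "default" in spec:
            result[name] = spec["default"]
    return result
```

With the College example, a record posted without StateAbbreviation comes back with "CA" filled in, while a record without CollegeName is rejected.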

Node

A container-node item is a node that contains sub-properties. For example, "address" is a container node with the sub-items "city" and "state."

{ "properties": { ... "address": { "city": {...}, "state": {...} } } }



Container-nodes can be nested.

The property "type": "node" isn't required (or recommended) but can be used for clarity. If the "type", "required", and/or "description" properties are used, the sub-properties must be contained inside the "items" property:

{ "properties": { ... "address": { "type": "node", "items": { "city": {...}, "state": {...} } } } }
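
Code that reads a model schema has to accept both node layouts shown above. The sketch below walks a schema and lists the dotted paths of its leaf properties. The helper names are hypothetical, and it assumes that a property without a "type" is a node and that no sub-property is itself named "type" or "items".

```python
def is_node(spec):
    """A property is treated as a node if it has no type or its type is "node"."""
    return spec.get("type", "node") == "node"


def sub_properties(node):
    """Return a node's sub-properties for either of the two layouts."""
    return node["items"] if "items" in node else dict(node)


def leaf_paths(properties, prefix=""):
    """List dotted paths of all non-node properties, descending into nested nodes."""
    paths = []
    for name, spec in properties.items():
        path = prefix + name
        if is_node(spec):
            paths.extend(leaf_paths(sub_properties(spec), path + "."))
        else:
            paths.append(path)
    return paths
```

Both the short form and the "items" form of the "address" node yield the same paths, address.city and address.state.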

| Property | Description |
| --- | --- |
| required | A non-null value for this item must be provided (false by default). Items inside the container-node can override the parent container node's required setting. |
| items | Contains the list of sub-properties in the node. This is required only if the "type", "required", and/or "description" properties are used. |


String

A variable-length string of characters. The following properties are applied when data is posted for this item:

| Property | Description |
| --- | --- |
| min | Minimum string length. |
| max | Maximum string length. |
| regex | String must match the regex pattern. |
| default | If the item is not provided or is null, the default value is used. |
| required | A non-null value for this item must be provided (false by default). |
| description | A human-readable description of the property. |
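
The length and pattern constraints for a string property could be checked as in the sketch below. check_string is a hypothetical helper, and it assumes the regex must match the entire value, which the description above does not state explicitly.

```python
import re


def check_string(value, spec):
    """Return a list of constraint violations for a posted string value."""
    problems = []
    if "min" in spec and len(value) < spec["min"]:
        problems.append("shorter than min")
    if "max" in spec and len(value) > spec["max"]:
        problems.append("longer than max")
    # Assumption: the pattern has to match the whole string.
    if "regex" in spec and re.fullmatch(spec["regex"], value) is None:
        problems.append("does not match regex")
    return problems
```

A property spec that omits min, max, or regex simply skips the corresponding check.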

Int

A numeric, whole number.

| Property | Description |
| --- | --- |
| min | Minimum value allowed. |
| max | Maximum value allowed. |
| default | If the item is not provided or is null, the default value is used. |
| required | A non-null value for this item must be provided (false by default). |
| description | A human-readable description of the property. |

Number

A numeric, decimal number with up to 15 digits of precision.

| Property | Description |
| --- | --- |
| min | Minimum value allowed. |
| max | Maximum value allowed. |
| default | If the item is not provided or is null, the default value is used. |
| required | A non-null value for this item must be provided (false by default). |
| description | A human-readable description of the property. |

Boolean

A Boolean, allowing only the two values true and false.

| Property | Description |
| --- | --- |
| default | If the item is not provided or is null, the default value is used. |
| required | A non-null value for this item must be provided (false by default). |
| description | A human-readable description of the property. |

Enum

Enumerations (enums) can be either a primitive type (string, int, or number) or a cross-reference to an entire set or subset of data records in another domain.

| Property | Description |
| --- | --- |
| enumType | The type of enum: int (a numeric whole number), number (a numeric, decimal number with up to 15 digits of precision), string (a string of characters), or xref (a cross-reference to another domain; use in conjunction with xrefLocation). |
| xrefLocation | If enumType is set to xref, then xrefLocation points to the domain that makes up the elements of the enumeration. For example, if a version of a domain "countries" contains a list of data record values that represent all countries, then a domain can reference it by specifying an enumeration type of xref and the desired domain and version number: enumType: xref, xrefLocation: /domains/countries:v1. A subset of the data record values in the domain can be specified using the data property. |
| data | An array of enum values, e.g. "data": ["Mays", "McCovey", "Marichal"] is a three-item string enumeration. If the enumType is xref, the list of values in the xrefLocation can be limited to the subset of data records whose display values are contained in the data array. For example, if /domains/countries:v1 contains a list of all countries but the domain should only reference countries in North America, then the data entries would be "data": ["CANADA", "USA", "MEXICO"], assuming these are the display values for Canada, the United States, and Mexico. |
| default | If the item is not provided or is null, the default value is used. |
| required | A non-null value for this item must be provided (false by default). |
| description | A human-readable description of the property. |
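
How data interacts with enumType can be sketched as follows. allowed_values is a hypothetical helper; the xref_values argument stands in for the display values that would be read from the domain named by xrefLocation, since fetching them is outside the scope of this example.

```python
def allowed_values(spec, xref_values=None):
    """Return the values an enum property accepts.

    For a primitive enum the values come straight from "data". For an xref
    enum they come from the referenced domain, optionally narrowed by "data".
    """
    if spec.get("enumType") == "xref":
        values = list(xref_values or [])
        if "data" in spec:
            subset = set(spec["data"])
            values = [value for value in values if value in subset]
        return values
    return list(spec.get("data", []))
```

For the North America example, passing the full list of country display values as xref_values returns only the three listed in data.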

URI

Note: The "xref" uriType and "xrefLocation" have been moved to a new type called "xref". This section still describes the earlier form.

A properly formed Uniform Resource Locator (URL). A URL is a type of Uniform Resource Identifier (URI). By default, a property of type uri can be any valid URI, or it can be limited by a regex pattern or to a domain cross-reference ("uriType": "xref").

A cross-reference (xref) points to another domain or domain data element. For more on "uriType": "xref" and "xrefLocation" see Cross Domains.

See type enum and the data property to limit the cross-reference to a subset of the domain's data records (e.g. if a domain "Company" exists with all Global 1000 companies, an enumeration of EU_Companies could be created referencing only the Global 1000 companies in the European Union).

| Property | Description |
| --- | --- |
| uriType | The uri type (e.g. url, blob, xref). |
| xrefLocation | The location of the domain item or domain item data element. Used in conjunction with "uriType": "xref". |
| regex | String must match the regex pattern. |
| default | If the item is not provided or is null, the default value is used. |
| required | A non-null value for this item must be provided (false by default). |
| description | A human-readable description of the property. |

Array

An array can contain any type of value including nodes and nested arrays.

| Property | Description |
| --- | --- |
| minItems | Minimum items allowed in the array. |
| maxItems | Maximum items allowed in the array. |
| default | If the item is not provided or is null, the default value is used. |
| items | Contains the list of sub-properties in the node. |