YOUnite Adaptor Guide for Java Developers

What is a YOUnite Adaptor

YOUnite Adaptors are essentially extensions to the YOUnite DataHub that give it access to the systems where managed data is delivered to and retrieved from. Third parties implement these adaptors using the YOUnite Adaptor SDK. Adaptors are a fundamental component of federated data domains.

For more information on YOUnite adaptors see:

An introduction to adaptors can be found on the Introduction to YOUnite page and the Adaptors page.

Information on managing adaptors can be found on the Managing Adaptors page.

The YOUnite API Documentation can be found at https://younite.us/api

How to Get Started

An adaptor is implemented using the YOUnite Adaptor SDK. The SDK handles much of the complexity behind the scenes, but developers still need to be aware of a few things: some configuration elements, the adaptor life cycle, and the proper use of the SDK annotations.

The first step is to get the SDK itself. For Java, the most common way is to use the Maven artifact that YOUnite provides. For those not using Maven, the SDK library can also be downloaded directly <TODO Kevin: Link here to latest sdk jar file>. The Maven configuration is as follows:

<dependencies>
  <dependency>
    <groupId>com.younite</groupId>
    <artifactId>adaptor-sdk</artifactId>
    <version>1.0.0-SNAPSHOT</version>
  </dependency>
</dependencies>

You will also need to add a repository to your project POM to access the YOUnite Maven Repository: <TODO KEVIN: ONCE YOUNITE MAVEN REPO IS WORKING... INSERT INFO BELOW>

<repositories>
  <repository>
    <id>younite-snapshots</id>
    <name>younite</name>
    <url>https://younite.bintray.com/</url>
  </repository>
</repositories>

Once these are configured in your project POM, the YOUnite Adaptor SDK is available to your project.

Dependencies

The YOUnite Adaptor SDK aims to be a small, easy-to-use library, so a design goal has been to avoid depending on external libraries as much as possible. Sometimes, however, it is better to rely on well-designed and well-tested libraries than to roll your own. The SDK therefore depends on only a few libraries, which minimizes the impact of integrating it into existing applications that may use the same libraries, possibly at different versions. The first of these dependencies is the Reflections library (org.reflections), which provides the runtime reflection capabilities needed to find and resolve classes and methods. The second is the Jackson YAML data format module, which is used primarily to support the YAML configuration file format, if selected.

<dependency>
  <groupId>org.reflections</groupId>
  <artifactId>reflections</artifactId>
  <!-- use latest version of Reflections -->
  <version>0.9.11</version>
</dependency>
<!-- Jackson YAML Databind -->
<dependency>
  <groupId>com.fasterxml.jackson.dataformat</groupId>
  <artifactId>jackson-dataformat-yaml</artifactId>
  <version>${jackson.version}</version>
</dependency>

Adaptor Architecture

Once an adaptor is connected to the YOUnite DataHub through the YOUnite Message Bus, it is able to send and receive data and ops messages. To save adaptor developers time, the SDK requires only a minimal configuration step, so that developers can focus on the business logic their adaptor is being built for rather than on the inner workings of sending, receiving, and parsing messages. To facilitate this, the YOUnite Adaptor SDK provides annotations that developers use to define the capabilities of their adaptor, that is, the data it can produce and/or consume. These capabilities loosely translate into a Pub/Sub configuration on the YOUnite DataHub. Essentially, they tell the DataHub which domain properties the adaptor wants to receive changes for, and which of its own domain properties it will push to the DataHub when a change occurs within the local service(s) the adaptor works with. These local services could be a database, an FTP server, in-memory data, or a remote service. The dynamic nature of YOUnite domains leaves it to individual adaptor implementations to determine how they access and retrieve domain property data and what transformations they apply to it.

Connecting Adaptors to the YOUnite DataHub

To connect to the YOUnite DataHub, an adaptor must use a dynamic transport layer. Out of the box, YOUnite is integrated with the ActiveMQ (AMQ) message bus as the transport layer. Regardless of the underlying transport layer, the YOUnite Adaptor SDK shields the adaptor developer from the details of how to connect, send, and receive data over it. The SDK comes bundled with an implementation of the AMQ transport layer, so nothing more is needed to get started except for some configuration details, which are described next.

Configuring

Because the YOUnite DataHub runtime engine may be deployed in any number of environments (including but not limited to local developer machines, QA, staging, and production), the YOUnite Adaptor SDK must be told how to connect to the YOUnite DataHub. Specifically, it needs the URL used to connect to the transport layer, the authentication details that allow the transport layer to identify the adaptor to the YOUnite DataHub as a legitimate adaptor, a valid adaptor UUID and zone UUID (previously registered with the YOUnite DataHub), and possibly the OAuth server URL used by the transport layer. (TODO: Determine if this is necessary.. or can the transport layer which is already configured with the URL details in order to check the validity of the adaptor authentication just use what it has configured).

There are two ways to configure an adaptor. One is to provide a YAML configuration file to the YOUnite Adaptor SDK in the init() method. The other is to create a Config object, provided by the SDK, and fill in its configuration properties, such as those mentioned previously and a few others. When a YAML file is used, it is converted into a Config object, so both options are equivalent as far as the SDK is concerned. A YAML file can cause runtime issues if it is invalid or cannot be found. On the other hand, it allows configuration to live in an external file rather than in code that must be recompiled to pick up changes.
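To illustrate the programmatic option, the following minimal sketch models the connection values described above in a small stand-in class. SketchConfig and all of its field names are hypothetical and are not the SDK's actual Config API; the real property names should be taken from the SDK itself. The sketch only shows the benefit of validating values when the object is built, rather than discovering a malformed UUID or URL at connection time.

```java
import java.net.URI;
import java.util.UUID;

// Hypothetical stand-in for the SDK's Config object; every name here is an assumption.
final class SketchConfig {
    final URI transportUrl;    // where the transport layer (e.g. the message bus) listens
    final UUID adaptorUuid;    // adaptor UUID previously registered with the DataHub
    final UUID zoneUuid;       // zone UUID previously registered with the DataHub
    final String clientId;     // authentication details; their real shape is assumed
    final String clientSecret;

    SketchConfig(String transportUrl, String adaptorUuid, String zoneUuid,
                 String clientId, String clientSecret) {
        // URI.create and UUID.fromString throw IllegalArgumentException on malformed input,
        // so a bad value fails here instead of at connection time.
        this.transportUrl = URI.create(requireText(transportUrl, "transportUrl"));
        this.adaptorUuid = UUID.fromString(requireText(adaptorUuid, "adaptorUuid"));
        this.zoneUuid = UUID.fromString(requireText(zoneUuid, "zoneUuid"));
        this.clientId = requireText(clientId, "clientId");
        this.clientSecret = requireText(clientSecret, "clientSecret");
    }

    // Rejects null or blank values with a message naming the offending field.
    private static String requireText(String value, String name) {
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException(name + " is required");
        }
        return value;
    }
}
```

A YAML file would carry the same values; the trade-off described above (external file versus recompilation) is unchanged by where the validation happens.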

Startup

Once everything is configured, call the AdaptorSDK.init() method, passing it either the Config object or the location of the YAML configuration file. <Add links to example working code here>. This takes the configuration, attaches to the transport layer, authenticates and, if all goes well, sets up data and ops listeners for incoming messages. Behind the scenes, the YOUnite Adaptor SDK locates any annotated adaptor classes, builds the capabilities list from the annotated methods within those classes, connects to the transport layer, sends the capabilities list to the YOUnite DataHub, and builds the in-memory mappings the SDK needs to process incoming messages.

Mappings

Internally, the YOUnite Adaptor SDK maintains data structures that keep track of the capabilities of the adaptor and the associated domain version(s) of the YOUnite DataHub. These make it possible to respond to incoming data, as well as to send outgoing data in response to whatever form of data change detection the adaptor developer implements. This will become clear in the next few sections, starting with adaptor capabilities and how they are defined.

Adaptor Capabilities

As mentioned previously, when an adaptor is started (the AdaptorSDK.init() method is called), it builds the list of capabilities that it needs to send to the YOUnite DataHub so the DataHub knows what data the adaptor accepts (most notably from other adaptors), as well as what data the adaptor can provide (most notably to other adaptors). But what are these capabilities, and how do you define them?

A capability is nothing more than a description of a specific domain name, a version, and a set of properties defined by the domain schema associated with that name and version. In practice, the adaptor developer will either work closely with the Data Steward to identify the domains and their properties (as defined by a JSON Schema representing the domain), or use the YOUnite Adaptor SDK Adaptor Generator tool to quickly generate an adaptor stub (more on this tool below). The purpose of an adaptor is to receive domain/version property data from other adaptors and do something with it, or to send the domain/version property data it is responsible for to other adaptors (via the DataHub). There therefore has to be a way to define exactly which properties of which domain/version the adaptor is able to work with. The Java YOUnite Adaptor SDK provides a set of annotations for the methods of an annotated Adaptor class. The annotations have properties that indicate the domain name, the version, and the property names within the domain that the annotated method expects as input or returns as output.

Let's see a simple example, including how we annotate the Adaptor class:

@Adaptor(name = "SimpleAdaptor")
public class MyAdaptor {
  @GetFromAdaptor(domains = {
    @Domain(name = "Student", version = "1", properties = {})
  })
  public Student getStudent() {
    return new Student("Jane Doe");
  }
}

The snippet above shows a valid adaptor that uses the @Adaptor annotation to declare that the class MyAdaptor is an implementation of an Adaptor. It provides a single method, getStudent(), which carries the @GetFromAdaptor annotation. You can see the list of annotations the YOUnite Adaptor SDK supports here (Insert link to the table of annotations, definitions, etc). Each annotation has a specific purpose. In this case, @GetFromAdaptor marks a method that responds to an incoming request to retrieve the specified properties from the local service; in this simple example, a new Student object is returned. More examples will show when and how to use the various method annotations.

All of the Action annotations have two things in common. First, they always specify at least one domain by name and version. There is no use case for an Action annotation without a domain name AND version; in fact, the in-memory mapping mentioned earlier requires that both always be provided. If, while processing adaptors, the SDK discovers a @Domain annotation with no name and version, or with only a name or only a version, a processing exception occurs because the method is deemed an invalid use of the annotation. Second, every @Domain must specify a properties array. Unlike the name and version, however, the properties array may be empty: it is a hint to the YOUnite Adaptor SDK about which properties of the domain this particular method (and ultimately the adaptor) is able to accept or send, depending on the Action annotation being used.
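The name-and-version rule can be sketched with plain Java reflection. The annotations below are stand-ins declared only so the example compiles on its own; their exact shape in the SDK is an assumption, as are the DomainValidator helper and the choice of IllegalStateException (the guide does not name the exception type the SDK actually throws).

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Stand-ins for the SDK annotations; the real declarations may differ.
@Retention(RetentionPolicy.RUNTIME)
@interface Domain {
    String name() default "";
    String version() default "";
    String[] properties();
}

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface GetFromAdaptor {
    Domain[] domains();
}

// Hypothetical helper: rejects any @Domain that is missing a name, a version, or both.
final class DomainValidator {
    static void validate(Class<?> adaptorClass) {
        for (Method method : adaptorClass.getDeclaredMethods()) {
            GetFromAdaptor action = method.getAnnotation(GetFromAdaptor.class);
            if (action == null) {
                continue; // not an action method
            }
            for (Domain domain : action.domains()) {
                if (domain.name().isEmpty() || domain.version().isEmpty()) {
                    throw new IllegalStateException("@Domain on " + method.getName()
                            + "() must specify both name and version");
                }
            }
        }
    }
}

class ValidAdaptor {
    @GetFromAdaptor(domains = {@Domain(name = "Student", version = "1", properties = {})})
    public String getStudent() {
        return "Jane Doe";
    }
}

class InvalidAdaptor {
    // The version is missing, so validation is expected to fail.
    @GetFromAdaptor(domains = {@Domain(name = "Student", properties = {})})
    public String getStudent() {
        return "Jane Doe";
    }
}
```

Calling DomainValidator.validate(ValidAdaptor.class) returns normally, while DomainValidator.validate(InvalidAdaptor.class) throws.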

What does this mean exactly? Whenever a @Domain is specified, the object associated with that domain definition is passed in as a parameter OR returned from the method, depending on the Action annotation being used, regardless of the properties specified. So what is the purpose of specifying properties? They let the YOUnite DataHub know exactly which properties to send to or expect from an adaptor. For example, in the above snippet, the DataHub knows that this adaptor will return a Student domain object, version 1, in response to the GetFromAdaptor action. Because no properties are specified, the declaration is essentially a wildcard ("*"), meaning that any or all of the properties within the domain may be returned as part of the Student object.
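The wildcard rule can be expressed in a few lines. PropertyHint below is a hypothetical helper, not part of the SDK; it only illustrates how an empty properties array can be read as "every property of the domain".

```java
import java.util.Arrays;

// Hypothetical helper illustrating the wildcard reading of an empty properties array.
final class PropertyHint {
    // Returns true when the declared list covers the given property.
    // An empty list is treated as the wildcard "*", so it covers everything.
    static boolean covers(String[] declared, String property) {
        return declared.length == 0 || Arrays.asList(declared).contains(property);
    }
}
```

With properties = {} every property is covered; with a non-empty list, only the names it contains are covered.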

So what happens if you define a property?

@Adaptor(name = "SimpleAdaptor")
public class MyAdaptor {
  @GetFromAdaptor(domains = {
    @Domain(name = "Student", version = "1", properties = {"name"})
  })
  public Student getStudent() {
    return new Student("Jane Doe");
  }
}

The above tells the YOUnite DataHub to expect only the name property of the Student domain (version 1) to be returned. More to the point, it tells the YOUnite DataHub that the capability of THIS adaptor is that it returns the name property of the Student domain, version 1. The DataHub uses this information to assemble a routing manifest (link to ROUTING details here). So what happens if the adaptor method sets additional properties on the object?

@Adaptor(name = "SimpleAdaptor")
public class MyAdaptor {
  @GetFromAdaptor(domains = {
    @Domain(name = "Student", version = "1", properties = {"name"})
  })
  public Student getStudent() {
    Student student = new Student();
    student.setName("Jane Doe");
    student.setAge("21");
    student.setAddress("Some Address");
    return student;
  }
}

We have declared only the name property, yet we added age and address to the returned object. Several outcomes are possible. The entire object, as populated here, could be sent back (in JSON format), with the DataHub ignoring the other properties because the annotation only declares the name property. Alternatively, the YOUnite Adaptor SDK itself could intervene in one of two ways: (1) check that only the declared properties contain data and throw a runtime exception if any other property is non-null (a null property is omitted from the JSON string), or (2) strip the object so that ONLY the declared properties contain data before it is sent on its way. Either SDK-side option may be adopted in the future, but as of version 1 of the YOUnite Adaptor SDK the entire object is sent as populated and the DataHub is left to deal with it.
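For illustration only, the stripping option could look something like the sketch below. This is not what version 1 of the SDK does, and PropertyScrubber is a hypothetical name; the domain object is modelled as a simple map of property names to values rather than as the JSON the SDK actually sends.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of SDK-side scrubbing; version 1 of the SDK sends the object as is.
final class PropertyScrubber {
    // Returns a copy of the values that keeps only the declared properties.
    // An empty declared list is the wildcard, so everything is kept.
    static Map<String, Object> scrub(Map<String, Object> values, String... declared) {
        if (declared.length == 0) {
            return new LinkedHashMap<>(values);
        }
        Set<String> allowed = new HashSet<>(Arrays.asList(declared));
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : values.entrySet()) {
            if (allowed.contains(entry.getKey())) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }
}
```

Scrubbing a student with name, age, and address against the declared list {"name"} leaves only the name entry.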

So what if an adaptor needs to work with multiple domains? Perhaps the local service handles both students and courses. One option is to define two @Adaptor classes, each with its own set of annotated methods for a specific domain:

@Adaptor(name = "StudentAdaptor")
public class StudentAdaptor {
  @GetFromAdaptor(domains = {
    @Domain(name = "Student", version = "1", properties = {"name"})
  })
  public Student getStudent() {
    Student student = new Student();
    student.setName("Jane Doe");
    return student;
  }
}

------------------------------------------------------------------------

@Adaptor(name = "CourseAdaptor")
public class CourseAdaptor {
  @GetFromAdaptor(domains = {
    @Domain(name = "Course", version = "1", properties = {})
  })
  public Course getCourse() {
    return new Course();
  }
}

The problem with this approach is that you may need to work with both objects in one class, without setting up additional helper methods or other means of using both classes together. A better approach is to use both domains in a single adaptor:

@Adaptor(name = "StudentAndCourseAdaptor")
public class StudentCourseAdaptor {
  @GetFromAdaptor(domains = {
    @Domain(name = "Student", version = "1", properties = {"name"})
  })
  public Student getStudent() {
    Student student = new Student();
    student.setName("Jane Doe");
    return student;
  }

  @GetFromAdaptor(domains = {
    @Domain(name = "Course", version = "1", properties = {})
  })
  public Course getCourse() {
    return new Course();
  }
}

In the above, both domains are handled in separate methods within a single adaptor. You can specify as many of these as you want in one class, though to keep code clean it is best to do so only when you actually need to work with multiple domains in the same class.

What is a Capability Then?

A capability is nothing more than a domain name, a version, and a subset of the properties defined by the domain schema, declared in one of the Action annotations. Each Action annotation specifies at least one domain name and version (remember, there is no point in defining an action without the domain name and version it expects to work with; it is an error otherwise). Each and every @Domain() becomes a capability.

Same Domain, Different Properties

It is possible to use the same Action annotation multiple times on different methods. This allows the adaptor developer to separate their code by domain/version or, if so inclined, even to handle individual properties on a property-per-method basis. For example:

@Adaptor(name = "StudentAndCourseAdaptor")
public class StudentCourseAdaptor {
  @GetFromAdaptor(domains = {
    @Domain(name = "Student", version = "1", properties = {"name"})
  })
  public Student getStudentName() {
    Student student = new Student();
    student.setName("Jane Doe");
    return student;
  }

  @GetFromAdaptor(domains = {
    @Domain(name = "Student", version = "1", properties = {"address"})
  })
  public Student getStudentAddress() {
    Student student = new Student();
    student.setAddress("address");
    return student;
  }
}

In the above, the same action is used on two methods for the same domain and version, but the property is different for each. One returns the Student with the name filled out; the other returns the Student with the address filled out. You may have noticed that both return a Student object, yet they declare different properties. So what is to stop an adaptor developer from filling in properties of either Student object other than the one(s) the method is declared to manage? Nothing. Any extra data will simply be ignored by the DataHub. <TODO: We MIGHT remove data before it is sent at the SDK layer.. not initially..but in a future version, scrub data not defined so as to avoid the DataHub having to do that work (e.g. distribute the load of doing that work to the adaptors to limit the processing needed by DataHub for this menial task). > Let's be clear, though: the purpose of specifying the same action for the same domain/version with different properties is code aesthetics. An adaptor developer who tries to shoehorn extra data into the object is being a bad citizen. Don't do it. Keep the code concise and consistent.

Combining Capabilities

In the previous section, you saw that you can define the same action for the same domain and version but declare different properties. You also learned that a capability is nothing more than a domain name, a version, and its properties. So what happens when two (or more) methods with the same Action annotation each declare a subset of the domain schema properties? They are combined into one capability. A capability is the domain name/version and the union of all properties declared, regardless of the number of methods the declarations are distributed over.
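The merge can be sketched with plain reflection. The annotations below are stand-ins so the example compiles on its own, and CapabilityBuilder and the "name/version" key format are hypothetical; the SDK's internal representation is not described in this guide. The sketch handles a single action type and simply unions the declared names, so it does not model how an empty (wildcard) list combines with explicit ones.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Stand-ins for the SDK annotations; the real declarations may differ.
@Retention(RetentionPolicy.RUNTIME)
@interface Domain {
    String name();
    String version();
    String[] properties();
}

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface GetFromAdaptor {
    Domain[] domains();
}

// Hypothetical builder: one capability per domain name/version, holding the union
// of the properties declared across all annotated methods.
final class CapabilityBuilder {
    static Map<String, Set<String>> build(Class<?> adaptorClass) {
        Map<String, Set<String>> capabilities = new TreeMap<>();
        for (Method method : adaptorClass.getDeclaredMethods()) {
            GetFromAdaptor action = method.getAnnotation(GetFromAdaptor.class);
            if (action == null) {
                continue; // not an action method
            }
            for (Domain domain : action.domains()) {
                String key = domain.name() + "/" + domain.version();
                capabilities.computeIfAbsent(key, k -> new TreeSet<>())
                        .addAll(Arrays.asList(domain.properties()));
            }
        }
        return capabilities;
    }
}

class SplitStudentAdaptor {
    @GetFromAdaptor(domains = {@Domain(name = "Student", version = "1", properties = {"name"})})
    public String getStudentName() {
        return "Jane Doe";
    }

    @GetFromAdaptor(domains = {@Domain(name = "Student", version = "1", properties = {"address"})})
    public String getStudentAddress() {
        return "address";
    }
}
```

Building the capabilities for SplitStudentAdaptor yields a single entry, "Student/1", whose property set contains both address and name.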

What About Flowing Into Adaptors From YOUnite?

There are actions to handle those situations too: PostOrLinkToAdaptor, PutToAdaptor, and DeleteAtAdaptor. PostOrLinkToAdaptor is used when new data is to be stored at the local service. Typically this means the entire domain object is provided as the parameter to the method annotated with PostOrLinkToAdaptor. Changes to individual properties, on the other hand, fall under the PutToAdaptor action, which is used to update (think overwrite) the local data with what is provided by the DataHub.
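The difference between storing a new object and overwriting individual properties can be illustrated with a toy local service. Everything in this sketch is hypothetical: InMemoryStudentService, its method names, and the string record id are assumptions made for the example, and the SDK annotations are deliberately left out so the block stands on its own.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory local service; a real adaptor would talk to a database, FTP server, etc.
final class InMemoryStudentService {
    private final Map<String, Map<String, Object>> rows = new HashMap<>();

    // What a PostOrLinkToAdaptor-style method might do: store the entire new record.
    void create(String id, Map<String, Object> student) {
        rows.put(id, new HashMap<>(student));
    }

    // What a PutToAdaptor-style method might do: overwrite only the supplied properties.
    void update(String id, Map<String, Object> changes) {
        Map<String, Object> row = rows.get(id);
        if (row == null) {
            throw new IllegalArgumentException("Unknown student: " + id);
        }
        row.putAll(changes);
    }

    // Returns a read-only view of the stored record, or null if there is none.
    Map<String, Object> find(String id) {
        Map<String, Object> row = rows.get(id);
        return row == null ? null : Collections.unmodifiableMap(row);
    }
}
```

After creating a student with a name and an address and then updating only the address, the stored record keeps the original name and carries the new address.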

Unlike GetFromAdaptor, which can only return a single domain object, the incoming data actions can support multiple domain types in one method:

@Adaptor(name = "StudentAndCourseAdaptor")
public class StudentCourseAdaptor {
  @PutToAdaptor(domains = {
    @Domain(name = "Student", version = "1", properties = {"name"}),
    @Domain(name = "Course", version = "1", properties = {})
  })
  public void updateStudentOrCourse(Student student, Course course) {
    if (null != student && null != course) {
      // do something that may require both objects to be present at the same time
    } else if (null != student) {
      // do something with the student object at the local service
    } else if (null != course) {
      // do something with the course object at the local service
    }
  }
}

Here we define a single @PutToAdaptor action, yet we specify two domains. The method may be called with a Student object, a Course object, or both. If the message that arrives at the adaptor contains both student data AND course data, the method is called with BOTH objects provided, which makes it possible to use both objects together before working with the local service. It may seem unlikely, but there could well be use cases where both a student and a course must be provided before the local service can be updated. For example, the local service's student table might have a NOT NULL column referencing a course (e.g. the course table), so a student cannot be updated unless the course data is provided as well. Because we cannot predict when such a thing may be required, either or both domain objects may be provided in a single method call. It is therefore best to null-check each object (and its properties) before use to avoid NullPointerExceptions.
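How a multi-domain method ends up with null for an absent domain can be sketched as follows. This guide does not describe how the SDK actually maps a message to method parameters; matching incoming objects to parameters by type is just one plausible approach, and Dispatcher, SketchAdaptor, and the minimal Student and Course classes are hypothetical.

```java
import java.lang.reflect.Method;

class Student {
    final String name;
    Student(String name) { this.name = name; }
}

class Course {
    final String title;
    Course(String title) { this.title = title; }
}

// Hypothetical adaptor that records which branch handled the call.
class SketchAdaptor {
    String lastCall = "";

    public void updateStudentOrCourse(Student student, Course course) {
        if (student != null && course != null) {
            lastCall = "both";
        } else if (student != null) {
            lastCall = "student";
        } else if (course != null) {
            lastCall = "course";
        } else {
            lastCall = "none";
        }
    }
}

// Hypothetical dispatcher: fills each parameter with the first incoming object of a
// matching type and leaves it null when the message carried no such object.
final class Dispatcher {
    static void dispatch(Object adaptor, String methodName, Object... incoming) {
        for (Method method : adaptor.getClass().getDeclaredMethods()) {
            if (!method.getName().equals(methodName)) {
                continue;
            }
            Class<?>[] types = method.getParameterTypes();
            Object[] args = new Object[types.length]; // all null by default
            for (int i = 0; i < types.length; i++) {
                for (Object candidate : incoming) {
                    if (types[i].isInstance(candidate)) {
                        args[i] = candidate;
                        break;
                    }
                }
            }
            try {
                method.invoke(adaptor, args);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Could not invoke " + methodName, e);
            }
            return;
        }
        throw new IllegalArgumentException("No method named " + methodName);
    }
}
```

Dispatching only a Student leaves the course parameter null, which is why the null checks in the adaptor method matter.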