YOUnite Adaptor Guide for Java Developers

What is a YOUnite Adaptor

YOUnite Adaptors are essentially extensions to the YOUnite DataHub, allowing access to managed data. Third parties implement these adaptors using the YOUnite Adaptor SDK. Adaptors are a fundamental component of federated data domains.

For more information on YOUnite adaptors see:

Introductions to Adaptors can be found on the Introduction to YOUnite page and the Adaptors page.

Information on managing adaptors can be found on the Managing Adaptors page.

The YOUnite API Documentation can be found at https://younite.us/api

How to Get Started

To implement an adaptor you use the YOUnite Adaptor SDK (SDK). The SDK handles much of the behind-the-scenes complexity, but developers still need to manage: configuration elements, the life cycle, and the proper use of the SDK Annotations.

1. Get the YOUnite Adaptor SDK. For Java, the most common method is to use the YOUnite-provided Maven artifact. If you don't use Maven, you can download the SDK library directly. <TODO Kevin: Link here to latest sdk jar file>.
2. Add the following Maven dependency to your project's pom.xml file:

<dependencies>
  <dependency>
    <groupId>com.younite</groupId>
    <artifactId>adaptor-sdk</artifactId>
    <version>1.0.0-SNAPSHOT</version>
  </dependency>
</dependencies>

3. Add a repository to your project POM to access the YOUnite Maven Repository: <TODO KEVIN: ONCE YOUNITE MAVEN REPO IS WORKING... INSERT INFO BELOW>

<repositories>
  <repository>
    <id>younite-snapshots</id>
    <name>younite</name>
    <url>https://younite.bintray.com/</url>
  </repository>
</repositories>

4. Once you have these configurations in place for your project, the YOUnite Adaptor SDK will be available.

Dependencies

The YOUnite Adaptor SDK aims to be a very small, easy-to-use library, and one of its design goals is to avoid external library dependencies as much as possible. Sometimes, though, it is better to use a well-designed and tested library for menial work than to roll your own. The YOUnite Adaptor SDK therefore depends on only one external library: the Google Reflections library, which provides the run-time reflection capabilities needed to find and resolve classes and methods. Keeping the dependency list this short minimizes the chance of a dependency conflict when the SDK is integrated into an existing application that already uses the same libraries, possibly in different versions.

Use the following configuration in your project's pom.xml file:

<dependency>
  <groupId>org.reflections</groupId>
  <artifactId>reflections</artifactId>
  <!-- use latest version of Reflections -->
  <!-- maintained here: https://github.com/ronmamo/reflections -->
  <version>0.9.11</version>
</dependency>

Adaptor Architecture

Once an adaptor is connected to the YOUnite DataHub through the YOUnite Message Bus, it is able to send and receive data and ops messages. To save the adaptor developer's time, the SDK keeps configuration to a minimum so that developers can focus on the business logic their adaptor is being built for rather than on the inner workings of sending, receiving, and parsing messages. To facilitate this, the YOUnite Adaptor SDK provides annotations that developers use to define the capabilities of their adaptor, that is, the data it can produce and/or consume. These capabilities loosely translate into a Pub/Sub configuration on the YOUnite DataHub. Essentially, they tell the YOUnite DataHub which domain properties the adaptor is interested in receiving changes for, and which of its own domain properties it will send to the YOUnite DataHub, either when a change occurs within the local service(s) the adaptor works with or when the YOUnite DataHub requests them. These local services could be a database, an FTP server, in-memory data, or a remote service with data. In fact, the dynamic nature of YOUnite domains leaves it to each adaptor implementation to determine how it accesses and retrieves domain property data and which transformations, if any, it applies to that data.

Connecting Adaptors to the YOUnite DataHub

To connect to the YOUnite DataHub, an adaptor must make use of a dynamic transport layer. Out of the box, YOUnite is integrated with the ActiveMQ (AMQ) message bus as the transport layer. Regardless of the underlying transport layer, the YOUnite Adaptor SDK shields the adaptor developer from the details of how to connect, send, and receive data over it. The YOUnite Adaptor SDK comes bundled with an implementation of the AMQ transport layer, so nothing more is needed to get started except for some configuration details, which are described next.

Configuring

Because the YOUnite DataHub runtime engine may be deployed in any number of environments (including but not limited to local developer machines, QA, Staging, and Production), it is necessary to tell the YOUnite Adaptor SDK how to connect to the YOUnite DataHub. Specifically, the SDK needs the URL it will use to connect to the transport layer, the authentication details that allow the transport layer to identify the adaptor to the YOUnite DataHub as a legitimate adaptor, a valid adaptor UUID and Zone UUID (previously registered with the YOUnite DataHub), and possibly the OAuth server URL used by the transport layer. (TODO: Determine if this is necessary.. or can the transport layer which is already configured with the URL details in order to check the validity of the adaptor authentication just use what it has configured).

There are two ways to provide this configuration. One is to supply a YAML configuration file. The other is to create a Config object, provided by the SDK, and fill in its configuration properties, such as those mentioned previously and a few others. When a YAML file is used, the SDK turns it into a Config object, so the two options are equivalent as far as the SDK is concerned. Providing a YAML file can introduce runtime issues if the file is invalid or cannot be found. On the other hand, it allows for an external configuration file, as opposed to code that has to be recompiled to pick up any configuration changes.

Startup

Once you have everything configured, you simply call the AdaptorSDK.init() method, passing it either the Config object or the file location of the YAML configuration file. <Add links to example working code here>. The call takes the configuration, attaches to the transport layer, authenticates and, if all goes well, sets up data and ops listeners for incoming messages. Behind the scenes, the YOUnite Adaptor SDK attempts to locate any annotated adaptor classes, builds the capabilities list from the annotated methods within those classes, connects to the transport layer, sends the capabilities list to the YOUnite DataHub, and builds the in-memory mappings the SDK needs to process incoming messages properly.

Mappings

Internally, the YOUnite Adaptor SDK maintains data structures to keep track of the capabilities of the adaptor and the associated domain version(s) of the YOUnite DataHub. This is what makes it possible to respond to incoming data, as well as to send outgoing data due to some form of data change detection that the adaptor developer implements. This will all become clear in the next few sections, but first: adaptor capabilities and how they are defined.

Adaptor Capabilities

As mentioned previously, when an adaptor is started (the AdaptorSDK.init() method is called), it builds the list of capabilities that it needs to send to the YOUnite DataHub so the DataHub knows what data the adaptor accepts (most notably from other adaptors), as well as what data the adaptor can provide (most notably to other adaptors). But what are these capabilities, and how do you define them?

A capability is nothing more than a description of a specific domain name, a version, and a set of the properties defined by the domain schema associated with that domain name and version. This means the adaptor developer will either work closely with the Data Steward to figure out the domains and their properties (as defined by a JSON Schema representing the domain), or use the YOUnite Adaptor SDK Adaptor Generator tool to quickly generate an adaptor stub (more on this tool below). The purpose of an adaptor is to receive domain/version property data from other adaptors and do something with it, or to send the domain/version property data it is responsible for to other adaptors (via the DataHub). So there has to be a way to define exactly which properties of which domain/version the adaptor is able to work with. The Java YOUnite Adaptor SDK provides a set of annotations for the methods of an annotated Adaptor class. The annotations have properties that indicate the domain name, version, and property names within the domain that the annotated method expects as input or returns as output.

Let's see a simple example, including how we annotate the Adaptor class:

@Adaptor(name = "SimpleAdaptor")
public class MyAdaptor {
  @GetFromAdaptor(domains = {
    @Domain(name = "Student", version = "1", properties = {})
  })
  public Student getStudent() {
    return new Student("Jane Doe");
  }
}

In the above snippet, you can see a valid adaptor that uses the @Adaptor annotation to mark the class MyAdaptor as an implementation of an Adaptor. It also provides a single method, getStudent(), which carries the @GetFromAdaptor annotation. You can see the list of annotations the YOUnite Adaptor SDK supports here (Insert link to the table of annotations, definitions, etc). Suffice it to say, each annotation has a specific purpose. In this case, @GetFromAdaptor defines a method that responds to an incoming request to retrieve the specified properties from the local service. In this simple example, a new Student object is returned. More examples will provide details on when and how to use the various method annotations. They all, however, have two things in common. First, they always specify at least one domain by name and version. There is no use case for any of the Action annotations being used without a domain name AND version. In fact, the in-memory mapping mentioned earlier requires that the domain name AND version both be provided at all times. If, during the processing of adaptors, a @Domain annotation is discovered with no name and version, or with only a name or only a version, a processing exception will occur because the method is deemed an invalid use of the annotation. Second, for every @Domain specified, a properties array is required. Unlike the domain name and version, though, the properties array does not need to be filled in: it is just a hint to the YOUnite Adaptor SDK about which properties of the domain this particular method (and ultimately the Adaptor) is able to accept or send, depending on the Action annotation being used.

What does this mean exactly? Simply that whenever a @Domain is specified, the object associated with that Domain definition is passed in as a parameter OR returned from the method, again depending on the Action annotation being used, regardless of the properties specified. So what is the purpose of specifying properties then? Simple. It lets the YOUnite DataHub know exactly what properties to send to or expect from an adaptor. For example, in the above snippet, the DataHub knows that this adaptor will return a Student domain object, version 1, in response to the GetFromAdaptor action. Because no properties are specified, it is essentially a wildcard "*", meaning any or all of the properties within the domain may be returned as part of the Student object.

So what happens if you define a property:

@Adaptor(name = "SimpleAdaptor")
public class MyAdaptor {
  @GetFromAdaptor(domains = {
    @Domain(name = "Student", version = "1", properties = {"name"})
  })
  public Student getStudent() {
    return new Student("Jane Doe");
  }
}

The above simply tells the YOUnite DataHub to only expect the name property of the Student domain (version 1) to be returned. Or more to the point, it tells the YOUnite DataHub that the capability of THIS Adaptor is that it returns the Name property of the Student domain, version 1. This allows the YOUnite DataHub to use this information to assemble a routing manifest (link to ROUTING details here). So what happens if the adaptor method sets additional properties on the object?

@Adaptor(name = "SimpleAdaptor")
public class MyAdaptor {
  @GetFromAdaptor(domains = {
    @Domain(name = "Student", version = "1", properties = {"name"})
  })
  public Student getStudent() {
    Student student = new Student();
    student.setName("Jane Doe");
    student.setAge("21");
    student.setAddress("Some Address");
    return student;
  }
}

It is clear that we have declared only the name property, yet we added age and address to the returned object. There are three ways this could be handled. First, the entire object as populated here is sent back (in JSON format) and the DataHub simply ignores the other properties, because the annotation only indicates the name property. Second, the YOUnite Adaptor SDK could check that only the declared properties contain data and throw a runtime exception if anything else is set to something other than null (a null value results in the JSON string not containing the property name/value). Third, the SDK could modify the object so that ONLY the declared properties contain data before it is sent on its way. The last two options are possible, but as of version 1 of the YOUnite Adaptor SDK the entire object is sent with everything filled in, and it is left to the DataHub to deal with.
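An adaptor developer who prefers not to send undeclared data can trim it before returning. The following is a minimal sketch of that idea only, not SDK behavior: it models the domain object as a plain Map, and the class and method names (PropertyScrubber, scrub) are hypothetical. An empty declared set is treated as the "*" wildcard described earlier.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of the YOUnite Adaptor SDK.
class PropertyScrubber {
  /**
   * Returns a copy of the data holding only the declared properties.
   * An empty declared set is treated as the "*" wildcard, so everything is kept.
   */
  static Map<String, Object> scrub(Map<String, Object> data, Set<String> declared) {
    if (declared.isEmpty()) {
      return new LinkedHashMap<>(data);
    }
    Map<String, Object> kept = new LinkedHashMap<>();
    for (Map.Entry<String, Object> entry : data.entrySet()) {
      if (declared.contains(entry.getKey())) {
        kept.put(entry.getKey(), entry.getValue());
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    Map<String, Object> student = new LinkedHashMap<>();
    student.put("name", "Jane Doe");
    student.put("age", "21");
    student.put("address", "Some Address");

    Set<String> declared = new HashSet<>(Arrays.asList("name"));
    System.out.println(scrub(student, declared)); // prints {name=Jane Doe}
  }
}
```

The original map is left untouched, so the local service data is never modified by the trimming step.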

OK.. so what if we want to work with multiple domains in a single Adaptor class? Perhaps the local service handles students and courses. How would we do this? Well we could define two @Adaptor classes, each with their own set of annotated methods related to a specific domain:

@Adaptor(name = "StudentAdaptor")
public class StudentAdaptor {
  @GetFromAdaptor(domains = {
    @Domain(name = "Student", version = "1", properties = {"name"})
  })
  public Student getStudent() {
    Student student = new Student();
    student.setName("Jane Doe");
    return student;
  }
}

------------------------------------------------------------------------

@Adaptor(name = "CourseAdaptor")
public class CourseAdaptor {
  @GetFromAdaptor(domains = {
    @Domain(name = "Course", version = "1", properties = {})
  })
  public Course getCourse() {
    return new Course();
  }
}

The problem that may arise with this though is that you may have a need to work with both objects in one class and do not want to have to set up additional helper methods or other means to use both classes in some manner. A better approach is to be able to use both domains in a single adaptor:

@Adaptor(name = "StudentAndCourseAdaptor")
public class StudentCourseAdaptor {
  @GetFromAdaptor(domains = {
    @Domain(name = "Student", version = "1", properties = {"name"})
  })
  public Student getStudent() {
    Student student = new Student();
    student.setName("Jane Doe");
    return student;
  }

  @GetFromAdaptor(domains = {
    @Domain(name = "Course", version = "1", properties = {})
  })
  public Course getCourse() {
    return new Course();
  }
}

In the above, you can see how both domains are accounted for in separate methods but in just the one Adaptor. You can specify as many of these as you want in one class, though to keep code clean it would be best to follow this practice only when you may need to utilize multiple domains in the one class.

What is a Capability Then?

Simple.. a capability is nothing more than a domain name, version and a subset of properties defined by the Domain Schema, declared in one of the Action annotations. For each action annotation defined, at least one domain name and version are specified (remember.. no point in defining an action without the domain name and version it expects to work with.. it is an error otherwise). Each and every @Domain() becomes a capability.
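To make the name-and-version rule concrete, here is a rough sketch of how such a check could be done with plain Java reflection. It does not use the SDK: the annotations @GetSketch and @DomainSketch are hypothetical stand-ins declared locally so the example compiles on its own, and the real SDK's processing and exception types may well differ.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

class DomainValidator {
  /** Returns one message per domain declaration that lacks a name or a version. */
  static List<String> validate(Class<?> adaptorClass) {
    List<String> problems = new ArrayList<>();
    for (Method method : adaptorClass.getDeclaredMethods()) {
      GetSketch action = method.getAnnotation(GetSketch.class);
      if (action == null) {
        continue; // not an action method
      }
      for (DomainSketch domain : action.domains()) {
        if (domain.name().isEmpty() || domain.version().isEmpty()) {
          problems.add(method.getName() + ": domain name and version are both required");
        }
      }
    }
    return problems;
  }

  public static void main(String[] args) {
    System.out.println("GoodAdaptor problems: " + validate(GoodAdaptor.class));
    System.out.println("BadAdaptor problems: " + validate(BadAdaptor.class));
  }
}

// Hypothetical stand-in for the SDK's @Domain annotation.
@Retention(RetentionPolicy.RUNTIME)
@interface DomainSketch {
  String name() default "";
  String version() default "";
  String[] properties() default {};
}

// Hypothetical stand-in for an action annotation such as @GetFromAdaptor.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface GetSketch {
  DomainSketch[] domains();
}

class GoodAdaptor {
  @GetSketch(domains = { @DomainSketch(name = "Student", version = "1") })
  public String getStudent() { return "Jane Doe"; }
}

class BadAdaptor {
  @GetSketch(domains = { @DomainSketch(name = "Student") }) // version missing
  public String getStudent() { return "Jane Doe"; }
}
```

Running a check like this in a unit test catches an incomplete declaration before the adaptor is ever started.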

Same Domain, Different Properties

It is possible to use the same action annotation multiple times on different methods. This allows the adaptor developer to separate their code by domain/version, if they wish, or even to separate individual properties on a property-per-method basis. For example:

@Adaptor(name = "StudentAndCourseAdaptor")
public class StudentCourseAdaptor {
  @GetFromAdaptor(domains = {
    @Domain(name = "Student", version = "1", properties = {"name"})
  })
  public Student getStudentName() {
    Student student = new Student();
    student.setName("Jane Doe");
    return student;
  }

  @GetFromAdaptor(domains = {
    @Domain(name = "Student", version = "1", properties = {"address"})
  })
  public Student getStudentAddress() {
    Student student = new Student();
    student.setAddress("address");
    return student;
  }
}

In the above, you can see that the same action is used on two methods for the same domain and version. However, the property is different for each. One returns the Student with the name filled out, the other returns the Student with the address filled out. You may have noticed they both return a Student object, yet they declare different properties. So what is to stop an adaptor developer from filling in properties of either Student object other than the property(ies) it is declared to manage? Nothing. Yup.. nothing. Any extra data will simply be ignored by the DataHub. <TODO: We MIGHT remove data before it is sent at the SDK layer.. not initially..but in a future version, scrub data not defined so as to avoid the DataHub having to do that work (e.g. distribute the load of doing that work to the adaptors to limit the processing needed by DataHub for this menial task). > Let's be clear though: the purpose of specifying the same action for the same domain/version but different properties is code aesthetics. An adaptor developer who tries to shoehorn extra data into the object is being a bad citizen. Don't do it. Keep the code concise and consistent.

Combining Capabilities

In the previous section, you saw that you could define the same action for the same domain and version, but declare different properties. You also learned that a capability is nothing more than the domain name and version and its properties. So what happens when you have two (or more) methods with the same action annotation, each declaring a subset of the domain schema properties? They get combined into one capability. A capability is the domain name/version and the sum of all properties declared, regardless of the number of methods the declarations are distributed over.
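The combining rule can be pictured as a union of property sets. The sketch below is only an illustration under stated assumptions: the Declaration class and the "action:domain/version" key are hypothetical, the SDK's internal data structures are not documented here, and wildcard declarations (an empty properties array) are deliberately left out.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical model of capability merging, not the SDK's implementation.
class CapabilityMerger {
  /** One domain declaration as it might be read off an annotated method. */
  static final class Declaration {
    final String action;
    final String domain;
    final String version;
    final List<String> properties;

    Declaration(String action, String domain, String version, String... properties) {
      this.action = action;
      this.domain = domain;
      this.version = version;
      this.properties = Arrays.asList(properties);
    }
  }

  /** Unions the properties of all declarations sharing an action, domain name, and version. */
  static Map<String, Set<String>> merge(List<Declaration> declarations) {
    Map<String, Set<String>> capabilities = new TreeMap<>();
    for (Declaration d : declarations) {
      String key = d.action + ":" + d.domain + "/" + d.version; // assumed key format
      capabilities.computeIfAbsent(key, k -> new TreeSet<>()).addAll(d.properties);
    }
    return capabilities;
  }

  public static void main(String[] args) {
    List<Declaration> declared = Arrays.asList(
        new Declaration("GetFromAdaptor", "Student", "1", "name"),
        new Declaration("GetFromAdaptor", "Student", "1", "address"));
    System.out.println(merge(declared)); // prints {GetFromAdaptor:Student/1=[address, name]}
  }
}
```

Two methods, one capability: the getStudentName() and getStudentAddress() methods from the previous section would contribute to the same entry.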

What About Flowing Into Adaptors From YOUnite?

There are actions to handle those situations too. There is PostOrLinkToAdaptor, and PutToAdaptor, as well as DeleteAtAdaptor. PostOrLinkToAdaptor would be used when new data is to be stored at the local service. Typically this would mean the entire domain object is provided as the parameter to the method that is annotated with PostOrLinkToAdaptor. Data changes to individual properties, on the other hand, would fall under the PutToAdaptor action. This is used to update (think overwrite) the local data with what is provided from the DataHub.

Unlike the GetFromAdaptor which can only return a single Domain object, the various incoming data actions can support multiple domain types for one method:

@Adaptor(name = "StudentAndCourseAdaptor")
public class StudentCourseAdaptor {
  @PutToAdaptor(domains = {
    @Domain(name = "Student", version = "1", properties = {"name"}),
    @Domain(name = "Course", version = "1", properties = {})
  })
  public void updateStudentOrCourse(Student student, Course course) {
    if (null != student && null != course) {
      // do something that may require both objects to be present at the same time
    } else if (null != student) {
      // do something with the student object at the local service
    } else if (null != course) {
      // do something with the course object at the local service
    } else {
      // neither object was provided; nothing to do
    }
  }
}

Here you see we define a single @PutToAdaptor action, yet we specify two domains. It is possible that the method will be called with a Student object OR a Course object. OR.. both! What? Yes.. if the message that arrives at the adaptor contains both student data AND course data, the method would be called with BOTH objects provided. This opens up the ability to use both objects at once before working with the local service. It may seem unlikely, but there could very well be use cases where BOTH a student and a course domain must be provided before the local service can be updated. Maybe the local service's student table has a NOT NULL column that refers to a course (e.g. a course table), so a student cannot be updated without the course data being provided as well. Again, this is unlikely in many cases, but we cannot predict when such a thing may be required. Thus, we allow for either or both domain objects to be provided in a single method call. As such, it is best to set up null checks before attempting to use the objects (or their properties) to avoid NPEs.

Detecting Changes (rough draft.. implementation not yet in place so this is subject to change)

One of the ways the SDK makes life easier for developers is by sending a local entity change to YOUnite MDM without the developer having to write the code to do so. The developer may still need to write some code to determine how that change is detected. However, as long as the @PutToMdm annotated method is called with a domain object, that object will be sent to YOUnite MDM. This is essentially a PUSH from the adaptor to YOUnite MDM.

To standalone or integrate...

At some point, you need a way to get your adaptor started. In the above sections you learned how to configure the adaptor to connect to the transport layer and how to apply annotations to describe the capabilities of your adaptor. But your adaptor implementation has to actually call AdaptorSDK.init() somehow. You can do this either by creating a standalone application wrapper (a microservice, if you will) or by integrating your adaptor implementation into an existing application, such as a web application.

Standalone

If you are starting out with a clean slate and need a way to start your adaptor, a simple application framework can be used to get things started. The most important point to understand is that the YOUnite Adaptor SDK does not have a background thread that starts up and keeps it running. If you call AdaptorSDK.init() and your application wrapper does not keep a thread alive and running, the application will abruptly end. Therefore, for the adaptor to be of any use, it is essential that your application framework maintains a thread to keep it alive.
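One common way to keep a standalone wrapper alive is to park the calling thread on a latch that a JVM shutdown hook releases. The sketch below shows that pattern under stated assumptions: KeepAlive and its methods are hypothetical names, and the start-up step is passed in as a Runnable because the SDK is not on the classpath of this example.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical keep-alive helper for a standalone adaptor wrapper; not part of the SDK.
class KeepAlive {
  private final CountDownLatch stopSignal = new CountDownLatch(1);

  /** Releases any thread parked in awaitStop(). */
  void requestStop() {
    stopSignal.countDown();
  }

  boolean isStopped() {
    return stopSignal.getCount() == 0;
  }

  /** Parks the calling thread until requestStop() is called. */
  void awaitStop() {
    try {
      stopSignal.await();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt for the caller
    }
  }

  /**
   * Runs the start-up step (for example, the call to AdaptorSDK.init()),
   * registers a shutdown hook, then keeps the calling thread alive until
   * the JVM is asked to stop (Ctrl+C, SIGTERM) or requestStop() is called.
   */
  void runUntilStopped(Runnable startUp) {
    startUp.run();
    Runtime.getRuntime().addShutdownHook(new Thread(this::requestStop));
    awaitStop();
  }
}
```

From a main method this would be used as new KeepAlive().runUntilStopped(() -> { /* call AdaptorSDK.init() here */ }); so that the non-daemon main thread stays parked after initialization instead of returning.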

As described in the previous section on configuration, the AdaptorSDK.init() static method needs the Config object and the String[] packages array passed to it. This is enough for the Adaptor SDK to look for classes annotated with the @Adaptor annotation. The Config object you know about. The String[] packages array specifies the package names in which annotated adaptor classes may reside. Its primary purpose is to restrict the search to the specified packages, which speeds up the process at runtime. Also note that the classpath searched is determined by the two classloaders the scanning process makes use of. The first is the thread context classloader, which is typically the classloader that loaded the application and the SDK library itself. The second is the classloader that owns the Reflections library the SDK uses to find classes. In a standalone application this will typically be the same classloader as the thread context classloader.

Integration

Integrating the YOUnite Adaptor SDK into an existing application is similar to the standalone route, but with some subtle differences depending on the application being integrated into. The primary caveat to be aware of is the possible difference in classloader hierarchy. In a standalone application the YOUnite Adaptor SDK library will be loaded by the thread context classloader, whereas in some types of applications, such as a web application that executes in a container like Tomcat or Jetty, the container reorders the classloader hierarchy so that web applications and their internal contents are loaded in a specific order, which in turn controls the order in which dependent libraries are discovered. As such, it is possible that the thread context classloader will be different from the second classloader used by the Google Reflections library. While this should pose no problem for the use of the YOUnite Adaptor SDK, it is nevertheless important to be aware of these potential scenarios in case a runtime classpath issue arises within the adaptor implementation. If additional classloaders need to be added to the two mentioned above, an overloaded init() method is available which takes an array of ClassLoader as the middle parameter (e.g. init(Config, ClassLoader[], String...)). With this, it is possible to add any additional classloaders, such as specific container loaders, to the list of loaders used to search for annotated adaptor classes.
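When diagnosing a classloader mismatch, it helps to see which loaders are actually in play. The following sketch uses only standard JDK calls to collect the thread context classloader and the loaders of any classes you name; the class and method names (ClassLoaderReport, candidateLoaders) are hypothetical. The resulting array is the kind of value that could be handed to the ClassLoader[] parameter of the overloaded init() mentioned above, though that usage has not been verified against the SDK here.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical diagnostic helper; uses only standard JDK APIs.
class ClassLoaderReport {
  /**
   * Collects the distinct, non-null loaders: the thread context classloader
   * first, followed by the defining loader of each anchor class.
   */
  static ClassLoader[] candidateLoaders(Class<?>... anchors) {
    Set<ClassLoader> loaders = new LinkedHashSet<>();
    ClassLoader context = Thread.currentThread().getContextClassLoader();
    if (context != null) {
      loaders.add(context);
    }
    for (Class<?> anchor : anchors) {
      ClassLoader loader = anchor.getClassLoader(); // null means the bootstrap loader
      if (loader != null) {
        loaders.add(loader);
      }
    }
    return loaders.toArray(new ClassLoader[0]);
  }

  public static void main(String[] args) {
    // In a real adaptor, pass your @Adaptor classes as the anchors.
    for (ClassLoader loader : candidateLoaders(ClassLoaderReport.class)) {
      System.out.println(loader);
    }
  }
}
```

If this prints a single loader in a standalone application but two or more inside a container, the container's classloader hierarchy is the likely reason annotated classes are not being found.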

Otherwise, the integration is about the same as the Standalone method. At some point, presumably a one time initialize method, you make the call to the AdaptorSDK.init() method same as the standalone approach. Like the standalone approach, it is assumed the integration container framework has some sort of background thread running keeping the process alive.

Other caveats (Maybe this should go in to a troubleshooting section at the end?)

The integration approach may run into another issue that sometimes arises in applications that make use of 3rd party libraries: versioning. Given how and why the Google Reflections library is used (namely, for scanning the classpath for classes with specific annotations, signatures, etc.), it is unlikely that the application being integrated into would have another version of it on the classpath. It is possible, however, which is why this section is relevant. Many developers will be aware of the term "classpath hell". This situation arises when two (or more) copies of the same library are on the application classpath, usually with different versions. It is often the case that 3rd party libraries are bundled as a "fat jar" that includes classes from yet other libraries, which might also be bundled in other libraries. A common case is logging. Often log4j or other logging libraries will be bundled via the "fat jar" process, and that is when you can run into runtime classpath issues. This is especially difficult to narrow down in integrated platform environments like servlet and JEE containers because of the way they munge the classloader hierarchy. A telltale sign is when you start to see runtime exceptions such as ClassNotFoundException showing up in the logs. In the case of the YOUnite Adaptor SDK, it might present as if the SDK is not finding any of your annotated adaptor classes.
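A quick way to check for duplicate copies of a library is to ask a classloader for every location from which it can see one of that library's class files. The sketch below does this with the standard ClassLoader.getResources call; DuplicateClassFinder and findCopies are hypothetical names, and org.reflections.Reflections is used as the probe class on the assumption that it is the entry point of the Reflections library on your classpath.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical troubleshooting helper; uses only standard JDK APIs.
class DuplicateClassFinder {
  /** Lists every location from which the given loader can see the named class file. */
  static List<URL> findCopies(ClassLoader loader, String className) {
    String resource = className.replace('.', '/') + ".class";
    try {
      return Collections.list(loader.getResources(resource));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    ClassLoader loader = DuplicateClassFinder.class.getClassLoader();
    List<URL> copies = findCopies(loader, "org.reflections.Reflections");
    System.out.println(copies.size() + " location(s) found");
    for (URL copy : copies) {
      System.out.println("  " + copy);
    }
  }
}
```

More than one location in the output suggests that several jars (or a fat jar plus a regular jar) contain the same class, which is a good starting point for tracking down a version conflict.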

Examples

Here is a list of examples provided via the <portal? sdk.zip outside of maven? link to github projects??> with information on each.

TBD...