How to Write Integration Transforms

Importing data from external sources into an ontology, such as the Essential Architecture Manager repository, requires custom transforms. This tutorial describes how the Essential Integration Engine works and explains how to develop your own custom transforms. Each transform that you create can be re-used to synchronise, update and import new instances from the external source, e.g. from a configuration management database.

How the Essential Integration Engine Works

The Essential Integration Engine imports information and data from external sources by operating on the Protege Java API, which ensures that the consistency and integrity of the ontology are preserved. The Standard Functions library provides a suite of functions that you call from your custom transform, and these functions handle most of the required API calls.
This diagram describes how the integration server operates. Transforms are written in XSL that produces the integration script for the specific source data instances. It is this script that uses the Protege API, via the Standard Functions, to import and synchronise the external data with your Essential ontology.

How to construct a transform

The key concept is that transforms take the form of XSL documents that transform your source information into a Python script, which imports and synchronises the source data. The Integration Engine executes the XSL against the source data and then executes the resulting script to complete the integration process. See 'importEssentialInstances.xsl' for examples of the transform XSL.

To support the synchronisation, the Integration Engine needs to understand which external data source or repository the source information has come from. This is done by giving your data source a name that is used each time you import from that source, e.g. the name of another repository, or even the name of a specific spreadsheet that contains source data you wish to import. The first step of your transform is to ensure that this repository has been defined in the Essential repository.

Standard Start of the Transform Document

This repository definition should be part of a standard start to every transform XSL document. You can use the example transforms as templates on which to base your transforms. This should contain:
As the transform is creating a Python script file, the XSL is mostly made up of <xsl:text> statements. Note that each statement in the script must be terminated by a carriage return character, which the transform must write out explicitly after the statement.

Chunking Tokens

A restriction in the underlying Apache scripting engine limits the size of a script file that can be run in a single call to execute the script. The Essential Integration Engine splits long scripts into smaller chunks, while preserving the state between each chunk. This means that you can reference variables across chunks of the script, so the chunking has no impact on the script that your transform creates.

However, to ensure that the script is split at valid points, i.e. at the end of a script statement rather than in the middle of one, you should write 'chunking tokens' after each node of the source data has been processed, e.g. after each application definition in a list of applications to import, or after each business process definition in a list of processes to import. The chunking token to use is: ####_End_of_Node_####

The Process of Writing a Transform

Map the source information to the Essential Meta Model

The first step of the process of writing your own transform is to understand how your source information should map to the Essential Meta Model. Identify the Essential Meta Classes that you are importing instances of and the slots on those classes that you will be populating.

Plan your script

Having identified logically how your source information maps to the Essential Meta Model, you should understand what the integration needs to do.
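
The chunking behaviour described above can be illustrated with a small, self-contained Python sketch. It is not the Integration Engine's actual implementation, which this tutorial does not show: the function names split_into_chunks and run_chunks and the max_chars limit are hypothetical, and the script lines are placeholders rather than real Standard Functions calls. The sketch assumes that the script is only ever split where a chunking token appears and that all chunks run against one shared set of variables.

```python
CHUNK_TOKEN = "####_End_of_Node_####"


def split_into_chunks(script_text, max_chars):
    """Group whole nodes into chunks, splitting only at chunking tokens.

    max_chars is a hypothetical size limit; a single node longer than the
    limit is still kept whole, because splitting it would break a statement.
    """
    nodes = [n.strip() for n in script_text.split(CHUNK_TOKEN) if n.strip()]
    chunks, current = [], ""
    for node in nodes:
        candidate = current + "\n" + node if current else node
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = node
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def run_chunks(chunks):
    """Execute each chunk in turn against one shared namespace."""
    namespace = {}
    for chunk in chunks:
        exec(chunk, namespace)
    return namespace


# A placeholder script of the shape a transform might emit: one token
# after each processed node of the source data.
script = "\n".join([
    "imported = []",
    CHUNK_TOKEN,
    "imported.append('Application A')",
    CHUNK_TOKEN,
    "imported.append('Application B')",
    CHUNK_TOKEN,
])

chunks = split_into_chunks(script, max_chars=40)
state = run_chunks(chunks)
print(len(chunks), state["imported"])
```

Because the variable imported is defined in the first chunk and used in the later ones, the sketch shows why preserving state between chunks makes the splitting invisible to the script. Note also that the token begins with a # character, so a Python interpreter reading an unsplit script would treat it as a comment.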

Write the XSL Document

Once you are clear about the script that you need to run to import your source information, you are ready to produce the XSL document that transforms your source data.

External Repository Instance References

To support the on-going synchronisation of external data and to avoid import duplications, each instance or relationship that is imported is assigned a unique identifier, called the External Repository Reference Instance. You should identify a unique identifier in your source data for each instance that you wish to import. This could be the repository identifier of the source instance as it appears in the source repository, or it could be the name of the object to import, as long as it is unique. Each instance that is imported has this external identifier combined with the external repository identifier to create a unique reference for each instance in Essential that has been imported.

Scripting Environment Basics

The scripting environment provides a key global variable, 'kb', that provides a reference to the overall Essential knowledge base. From this variable, you can access all the instances, classes and slots that you need in order to perform your imports.

Examples

- Get a reference to a specific instance, e.g. of a Business Process

The standard functions provide a helper function for setting slot values on an instance, setSlot(). To avoid issues with mis-matches of cardinality on instance slots, we recommend using the addIfNotThere() standard function to set Instance Slots: addIfNotThere(anInstance, "my_slot_name", aReferencedInstance). See below.
For slots that can contain multiple values, use addOwnSlotValue(), e.g. to add an instance to a relationship slot - in this case, to add an actor to a parent group:

This function should also be used for all single-cardinality Instance slots, as it automatically uses the correct Protege API call depending on whether the slot can accept multiple instances or not.
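
The difference between overwriting a slot and adding a value only when it is absent can be sketched with an in-memory stand-in. Everything below is hypothetical: StubInstance is not a Protege class, set_slot_sketch and add_if_not_there_sketch are not the Standard Functions themselves, and the slot name is only illustrative. The sketch assumes that an add-if-absent helper appends to a multi-valued slot and replaces the value of a single-valued one.

```python
class StubInstance:
    """Hypothetical in-memory stand-in for an instance (not the Protege API)."""

    def __init__(self, name, multi_valued_slots=()):
        self.name = name
        self.multi_valued_slots = set(multi_valued_slots)
        self.slots = {}


def set_slot_sketch(instance, slot_name, value):
    """Overwrite the slot with one value (single-cardinality behaviour)."""
    instance.slots[slot_name] = [value]


def add_if_not_there_sketch(instance, slot_name, value):
    """Add value only if it is absent; return True when something changed."""
    current = instance.slots.setdefault(slot_name, [])
    if value in current:
        return False
    if slot_name in instance.multi_valued_slots:
        current.append(value)  # multi-valued slot: keep existing values
    else:
        instance.slots[slot_name] = [value]  # single-valued slot: replace
    return True


group = StubInstance("Finance Team")
actor = StubInstance("Accounts Clerk", multi_valued_slots=["is_member_of_actor"])

# Running the same import step twice leaves exactly one value in the slot.
first = add_if_not_there_sketch(actor, "is_member_of_actor", group)
second = add_if_not_there_sketch(actor, "is_member_of_actor", group)
print(first, second, len(actor.slots["is_member_of_actor"]))
```

The point of the add-if-absent pattern is that a transform can be re-run against the same source data without duplicating relationship values.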

Standard Functions Library

For most custom transforms, the getEssentialInstance() function is all that is required. This uses the source identifier for an instance and attempts to find it in the Essential ontology repository. If found, the instance is returned and its slot values can then be updated. If it is not found, a new instance is created in the Essential Architecture Manager repository, using the specified source identifier to create a unique external reference in the Essential repository. This new instance is then returned and its slot values can then be updated.

getEssentialInstance(theClassName, theExternalRef, theExternalRepository, theInstanceName)
# Get a reference to the instance of the specified class that has the specified external reference in the

getEssentialInstanceContains(theClassName, theExternalRef, theExternalRepository, theInstanceName)
# Find the instance by a contains case-sensitive match on the instance name in Essential repository

getEssentialInstanceContainsIgnoreCase(theClassName, theExternalRef, theExternalRepository, theInstanceName, theMatchString)
# Find the instance by a contains match - ignoring case - on the instance name in Essential repository

getEssentialInstanceIgnoreCase(theClassName, theExternalRef, theExternalRepository, theInstanceName, theMatchString)
# Function to find instances by a name match (precise, not contains), regardless of case

getEssentialNodeInstanceIgnoreCase(theClassName, theExternalRef, theExternalRepository, theInstanceName, theMatchString)
# Function to find Technology_Node instances by a name match (precise, not contains), regardless of case

getExternalRefInst(theExternalRefList, theExternalRepository)
# Return the external reference that applies to the specified External Repository from a list

createExternalRefInst(theExternalRepositoryName, theExternalReference)
# Create a new External Reference record to be associated with an Essential instance.

getExternalRepository(theExternalRepositoryName)
# Get a reference to the instance of External_Repository that has the specified name

timestamp()
# Return a string of the current date/time to be used for timestamping.

setOrUpdateTechInstAttributeByName(theAttributeName, theAttributeValue, theInstance)
# Update the named attribute associated with the specified technology instance object

setOrUpdateTechNodeAttributeByName(theAttributeName, theAttributeValue, theInstance)
# Update the named attribute associated with the specified technology Node (theInstance) object

setSlot(theInstance, theSlotName, theInstanceToAdd)
# Set a slot value to the specified value. To be used with single-cardinality slots

addIfNotThere(theInstance, theSlotName, theInstanceToAdd)
# Add the slot value to the specified instance only if it's not already there.

defineExternalRepository(theExternalRepository, theDescription)
# Define a new External Repository or ignore definition if the repository is already known

addNewEAMAttribute(theName, theDescription, theUnit)
# Function to add a new Attribute instance to the Essential model.

getNameSlot(theInstance)
# Find the right name slot for a given instance. If it's an EA_Class instance, the

getNameSlotForClass(theClassName)
# Find the right name slot for a given class. If it's an EA_Class, the
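
The find-or-create behaviour described for getEssentialInstance() can be sketched as follows. This is a rough model under stated assumptions, not the library's code: StubRepository and the two *_sketch functions are hypothetical stand-ins, the repository name, class name and identifiers are only illustrative, and the sketch assumes that instances are keyed by class, external repository and external reference, and that finding an existing instance does not rename it.

```python
class StubRepository:
    """Hypothetical in-memory stand-in for the Essential repository."""

    def __init__(self):
        self.external_repositories = {}
        # (class name, external repository, external reference) -> record
        self.instances = {}


def define_external_repository_sketch(repo, name, description):
    """Register the data source once; repeated definitions are ignored."""
    repo.external_repositories.setdefault(name, description)


def get_essential_instance_sketch(repo, class_name, external_ref,
                                  external_repository, instance_name):
    """Return the matching instance, creating it first if it is unknown."""
    key = (class_name, external_repository, external_ref)
    if key not in repo.instances:
        repo.instances[key] = {
            "class": class_name,
            "name": instance_name,
            "external_ref": external_ref,
            "external_repository": external_repository,
        }
    return repo.instances[key]


repo = StubRepository()
define_external_repository_sketch(repo, "Example CMDB", "Illustrative source")

# First import: the instance is unknown, so it is created.
app = get_essential_instance_sketch(
    repo, "Application_Provider", "APP-001", "Example CMDB", "Payroll")
app["description"] = "Imported from the example source"

# Re-import with the same external reference: the same instance comes back.
again = get_essential_instance_sketch(
    repo, "Application_Provider", "APP-001", "Example CMDB", "Payroll System")
print(again is app, len(repo.instances))
```

Keying on the external reference rather than on the instance name is what lets a repeated import update an existing instance instead of creating a duplicate, even when the name has changed in the source.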