5 Minute Sketch

Workflow Documents

A workflow document contains a sequential and/or parallel composition of steps.

Each workflow step represents a call to an external astrogrid-enabled resource.

Example resources are datacenters (data resources) and astronomical tool servers (computational resources).

All resources conform to the CEA standard, and so can be treated in a uniform manner.

The CEA standard defines the web interfaces, message formats and registry entries each resource must provide.

User Side

When a Jes server receives a workflow document submission it annotates it with a unique execution identifier (JobURN) and stores it locally.

Users can query the Jes server to retrieve a list of the JobURNs for jobs they have submitted.

Users can retrieve a workflow document from Jes by passing its JobURN. This will be the original document that was submitted, annotated with execution details.

Processing Side

Workflow documents are processed in an asynchronous manner.

Workflow documents may be queued before processing.

Initiating Steps

When Jes processes a workflow document it:
  1. loads it from the local store
  2. determines which steps should be started next using a scheduling policy
  3. resolves the resources used in these steps to endpoints of CEA services by querying an astrogrid registry
  4. dispatches each of the candidate steps as a task to the appropriate CEA service.
  5. records the steps that have been executed as annotations in the workflow document
  6. saves the updated document back to the local store.
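
The six steps above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the classes SketchStep, SketchWorkflow and InitiateStepsSketch are hypothetical stand-ins, the store and registry are plain maps, dispatching is reduced to recording the call, and the scheduling policy is assumed to be a plain sequence.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for the workflow object model.
class SketchStep {
    final String resource;   // abstract resource name used by the step
    String status = "NEW";   // execution annotation, updated as the step runs
    String endpoint;         // CEA endpoint the step was sent to, once known

    SketchStep(String resource) { this.resource = resource; }
}

class SketchWorkflow {
    final String jobUrn;
    final List<SketchStep> steps = new ArrayList<>();

    SketchWorkflow(String jobUrn) { this.jobUrn = jobUrn; }
}

class InitiateStepsSketch {
    final Map<String, SketchWorkflow> store = new HashMap<>();  // local store, keyed by JobURN
    final Map<String, String> registry = new HashMap<>();       // resource name -> CEA endpoint
    final List<String> dispatched = new ArrayList<>();          // calls that would go to CEA

    // Assumed policy: a plain sequence, so a step starts only after
    // every step before it has completed.
    List<SketchStep> nextSteps(SketchWorkflow wf) {
        List<SketchStep> next = new ArrayList<>();
        for (SketchStep s : wf.steps) {
            if (s.status.equals("COMPLETED")) continue;
            if (s.status.equals("NEW")) next.add(s);
            break;
        }
        return next;
    }

    void process(String jobUrn) {
        SketchWorkflow wf = store.get(jobUrn);                       // 1. load
        for (SketchStep step : nextSteps(wf)) {                      // 2. select
            step.endpoint = registry.get(step.resource);             // 3. resolve
            dispatched.add(step.resource + " -> " + step.endpoint);  // 4. dispatch
            step.status = "PENDING";                                 // 5. annotate
        }
        store.put(jobUrn, wf);                                       // 6. save
    }
}
```

Calling process twice in a row on the same JobURN dispatches nothing the second time, because the first step is still PENDING and the assumed policy waits for it.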

Tracking Progress of Steps

The CEA standard defines a web-interface that CEA servers must use to notify a controller (in this case Jes) when the execution status of a task changes.

A task has one of the following execution statuses: PENDING, RUNNING, COMPLETED, ERROR.

CEA servers may optionally use the same method to report other progress / logging information to a controller.

When Jes receives notification from a CEA server it:

  1. loads the relevant workflow document from the store
  2. records the notification as an annotation to the corresponding job step
  3. if the notification states that the task has completed or has ended in error, Jes examines the workflow document for further steps to initiate.
    • If no further steps can be initiated, the status of the entire workflow is set to 'completed' or 'error'
  4. saves the updated document back to the local store.
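
The notification handling above might look roughly like the following. It is a sketch, not the Jes code: the types are hypothetical, loading and saving (steps 1 and 4) are left out, the workflow is assumed to be a plain sequence, and an error is assumed to end the whole workflow.

```java
import java.util.ArrayList;
import java.util.List;

// The four task statuses named in the text.
enum TaskStatus { PENDING, RUNNING, COMPLETED, ERROR }

// Hypothetical, simplified step and workflow types.
class NotifiedStep {
    TaskStatus status;                                  // null until the step is dispatched
    final List<String> annotations = new ArrayList<>();
}

class NotifiedWorkflow {
    final List<NotifiedStep> steps = new ArrayList<>();
    String overallStatus = "running";
}

class NotificationSketch {
    // Applies one status notification to the step at position index.
    static void onNotification(NotifiedWorkflow wf, int index, TaskStatus reported) {
        NotifiedStep step = wf.steps.get(index);
        step.status = reported;                                  // 2. record on the step
        step.annotations.add("status changed to " + reported);

        boolean finished = reported == TaskStatus.COMPLETED || reported == TaskStatus.ERROR;
        if (!finished) return;                                   // nothing further to initiate

        int next = index + 1;                                    // 3. look for a further step
        if (reported == TaskStatus.COMPLETED && next < wf.steps.size()) {
            wf.steps.get(next).status = TaskStatus.PENDING;      // stands in for dispatching
        } else {
            // No further step can be initiated: set the status of the whole workflow.
            wf.overallStatus = reported == TaskStatus.ERROR ? "error" : "completed";
        }
    }
}
```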


Architecture

Workflow Documents

The core workflow document format and the annotations used to record execution information are defined as XML schemas:
  • Workflow (schema/html) the document core - defines steps, sequences, flows
  • Execution Record (schema/html) annotations to record execution
  • CEA Types (schema/html) types used in execution records
  • CEA Parameters (schema/html) types passed as parameters to CEA
  • Credentials (schema/html) authentication fragment of a workflow.

Within the Jes implementation, a Castor-generated object representation of the workflow documents is used (javadoc). These classes are part of the workflow-objects maven project. These classes are pretty uninteresting in themselves - they provide a structured way to manipulate and build valid workflow documents, but no further functionality.

  • Workflow Document Object Model:
    WorkflowDocument.png


Jes Server

The jes server is in the Jes maven project.

It has two web-interfaces - Job Controller (wsdl/javadoc) called by clients and Job Monitor (wsdl/javadoc) called by CEA.

Both interfaces provide entry-points into a single system of shared components.

  • Components of Jes Server:
    JesComponents.png

The components of the JesServer are:

  • Job Controller - implementation of the Job Controller web interface. Allows users to submit and retrieve workflow documents.
  • Job Monitor - implementation of the Job Monitor web interface. Allows CEA servers to notify jes of task progress.
  • Job Store - abstraction of a store for workflow documents.
  • Task Queue - internal component, used to ensure concurrent requests to JobController and JobMonitor web interfaces are processed sequentially when needed.
  • Job Scheduler - processes workflow documents - initiates steps and tracks their progress
    • Policy - pluggable component, used to determine next steps to initiate, etc.
    • ToolResolver - resolves a resource name to endpoint URLs of the CEA servers that provide it.
    • Dispatcher - passes a step task to a cea server for execution.
  • Component Manager - selects which implementations of each component to instantiate, and wires them together to form the jes server.

Job Controller

implementation: JobController

This component handles requests from clients in one of two ways. Requests that do not change server state (reading a workflow from the store, listing the workflows in the store) are processed by calling the job store component directly and returning the results to the client.

Requests that alter the server state in some way (submitWorkflow, deleteWorkflow, cancelWorkflow) are validated and then passed to the job scheduler component for processing (which will actually be the task-queue implementation).

Job Monitor

implementation: JobMonitor

This component is a very thin wrapper. It validates the notification message it receives from a CEA server, and then passes it to the job scheduler component for processing (which will actually be the task-queue implementation).

Job Store

interface: JobFactory

implementations: FileJobFactoryImpl

Has methods to create, find, list, load and delete workflow documents in a persistent store. Documents are keyed by JobURN, and are returned as objects in the workflow object model.

A range of implementations is available, persisting workflow documents to an SQL database, the file system, or an in-memory hashmap. Different implementations are used for unit testing and for deploying the jes service.

This component provides no locking or transactional support (as this would tie it more closely to a particular implementation). Components that call the job store must therefore ensure for themselves that concurrent writes, race conditions, etc. do not occur.
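
As a rough illustration of the in-memory variant, the sketch below keys documents by JobURN in a map. The interface SketchJobStore, its method names and the JobURN format are invented for this example and are not the JobFactory signatures. The workflow document is held as a plain string rather than as workflow objects.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical document type; the real store returns workflow object model instances.
class StoredJob {
    final String jobUrn;
    final String owner;
    String document;   // workflow XML, kept as a string here

    StoredJob(String jobUrn, String owner, String document) {
        this.jobUrn = jobUrn;
        this.owner = owner;
        this.document = document;
    }
}

// Hypothetical interface; method names are illustrative, not JobFactory's.
interface SketchJobStore {
    StoredJob create(String owner, String document);
    StoredJob find(String jobUrn);      // returns null when the JobURN is unknown
    List<String> list(String owner);    // JobURNs of the owner's jobs
    void delete(String jobUrn);
}

// No locking here, matching the text: callers must serialize access themselves.
class InMemoryJobStore implements SketchJobStore {
    private final Map<String, StoredJob> jobs = new LinkedHashMap<>();
    private int counter = 0;

    public StoredJob create(String owner, String document) {
        String urn = "jes:job:" + (++counter);   // made-up JobURN format
        StoredJob job = new StoredJob(urn, owner, document);
        jobs.put(urn, job);
        return job;
    }

    public StoredJob find(String jobUrn) { return jobs.get(jobUrn); }

    public List<String> list(String owner) {
        List<String> urns = new ArrayList<>();
        for (StoredJob job : jobs.values()) {
            if (job.owner.equals(owner)) urns.add(job.jobUrn);
        }
        return urns;
    }

    public void delete(String jobUrn) { jobs.remove(jobUrn); }
}
```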

Task Queue

interface: same as job scheduler - acts as a decorator for an existing job scheduler implementation.

implementation: SchedulerTaskQueueDecorator

This component provides the same interface as a job scheduler, but doesn't implement these methods itself. Instead, each method call is encapsulated as a java.lang.Runnable object and added to an internal task queue.

This component decorates an existing job scheduler instance. A background thread processes the tasks in the queue by passing them to the existing job scheduler.

This component is necessary because the job controller and job monitor web service components can receive multiple requests simultaneously from different clients.

After the web service components validate the requests, processing is handed to the job scheduler task queue, which holds the requests until they are handled by the background thread. In effect, concurrent streams of requests are merged into a single sequential request stream by this component.

This arrangement allows the job scheduler to be wrapped so that it can process multiple concurrent requests in a single-threaded fashion. This removes the need for complicated synchronization or locking logic in the scheduler and job store components.

The implementation of this component uses a task queue class provided by an external library.
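
The decorator idea can be sketched as follows. Note the substitutions: a single-threaded ExecutorService from java.util.concurrent stands in for the external task queue class, and SketchScheduler with its two methods is a hypothetical, cut-down scheduler interface.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical, reduced scheduler interface.
interface SketchScheduler {
    void submitJob(String jobUrn);
    void reportStatus(String jobUrn, String status);
}

// A plain scheduler with no synchronization of its own.
class RecordingScheduler implements SketchScheduler {
    final List<String> log = new ArrayList<>();

    public void submitJob(String jobUrn) { log.add("submit " + jobUrn); }
    public void reportStatus(String jobUrn, String status) {
        log.add("status " + jobUrn + " " + status);
    }
}

// Same interface as the scheduler it wraps; each call becomes a queued Runnable
// that one background thread later runs against the wrapped scheduler.
class QueueingSchedulerDecorator implements SketchScheduler {
    private final SketchScheduler delegate;
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    QueueingSchedulerDecorator(SketchScheduler delegate) { this.delegate = delegate; }

    public void submitJob(String jobUrn) {
        worker.execute(() -> delegate.submitJob(jobUrn));
    }

    public void reportStatus(String jobUrn, String status) {
        worker.execute(() -> delegate.reportStatus(jobUrn, status));
    }

    // Stops accepting calls and waits for the queued ones to finish.
    boolean shutdownAndWait(long seconds) {
        worker.shutdown();
        try {
            return worker.awaitTermination(seconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Because only the worker thread ever touches the wrapped scheduler, RecordingScheduler can use an ordinary ArrayList without locks.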

Job Scheduler

interface: JobScheduler

implementations: SchedulerImpl

Implements the main methods of the jes system:

  • start a job
  • update job based on feedback from CEA
  • cancel execution of a job
  • delete job from store

The scheduler uses the Job Store, Policy, Locator and Dispatcher components to achieve this. The scheduler class itself contains the code that orchestrates these components, logs messages, and adds annotations to the workflow objects.

Policy

interface: Policy

implementations: FullPolicy

Policy is a pluggable component that the scheduler uses to select the steps to be executed, and to calculate the overall status of a job based on the status of its steps.

The results returned by different policy implementations are expected to differ.

Two helpers assist the implementation of policies: xpath expressions can be used to select nodes from the workflow document objects, and a visitor pattern implementation can be used to traverse the workflow document object model to search for steps or accumulate results.
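
To make the two jobs of a policy concrete, here is a small sketch over a tree of sequences, flows and steps. All class names are hypothetical. It uses plain recursive traversal in place of the xpath and visitor helpers, and it assumes one particular interpretation: a sequence runs its children in order, a flow runs them in parallel, and any error stops further scheduling.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

abstract class Activity {
    abstract boolean isDone();                           // every step underneath completed
    abstract boolean hasError();                         // some step underneath failed
    abstract void collectStartable(List<StepNode> out);  // steps that may start now
}

class StepNode extends Activity {
    final String name;
    String status = "NEW";   // NEW until dispatched, then a CEA status

    StepNode(String name) { this.name = name; }
    boolean isDone() { return status.equals("COMPLETED"); }
    boolean hasError() { return status.equals("ERROR"); }
    void collectStartable(List<StepNode> out) { if (status.equals("NEW")) out.add(this); }
}

abstract class CompositeNode extends Activity {
    final List<Activity> children;

    CompositeNode(Activity... children) { this.children = Arrays.asList(children); }
    boolean isDone() {
        for (Activity c : children) if (!c.isDone()) return false;
        return true;
    }
    boolean hasError() {
        for (Activity c : children) if (c.hasError()) return true;
        return false;
    }
}

// Children run one after another: only the first unfinished child may start steps.
class SequenceNode extends CompositeNode {
    SequenceNode(Activity... children) { super(children); }
    void collectStartable(List<StepNode> out) {
        for (Activity c : children) {
            if (!c.isDone()) { c.collectStartable(out); return; }
        }
    }
}

// Children run in parallel: every child may start steps.
class FlowNode extends CompositeNode {
    FlowNode(Activity... children) { super(children); }
    void collectStartable(List<StepNode> out) {
        for (Activity c : children) c.collectStartable(out);
    }
}

class SimplePolicySketch {
    static List<StepNode> nextSteps(Activity root) {
        List<StepNode> out = new ArrayList<>();
        if (!root.hasError()) root.collectStartable(out);   // assumed: stop on first error
        return out;
    }

    static String overallStatus(Activity root) {
        if (root.hasError()) return "ERROR";
        return root.isDone() ? "COMPLETED" : "RUNNING";
    }
}
```

A different policy could, for example, keep scheduling independent flow branches after an error; that is the kind of variation between implementations the text refers to.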

There's a range of different policy implementations, that interpret parts of the workflow document in different ways. In general, I don't think that policies are the way to tackle the scheduling problem - in particular once additional features are added to jes (retrying failed steps, trying alternative servers) or the workflow document format (loops, variables).

ItnSixWorkflowExtensionsReport presents an alternative approach, based on a prototype of Jes that replaces the policy component with a scripting engine that interprets a set of execution rules generated from the workflow document.

Dispatcher

interface: Dispatcher

implementations: ApplicationControllerDispatcher (now badly named; should be CeaDispatcher)

A dispatcher component provides a single method 'dispatchStep', which takes the step to dispatch, plus the parent workflow. The component should execute the step in some way.

At present, there's only a single implementation, for a CEA service. This extracts the task parameters from the step, resolves the resource name to a URL endpoint, and then uses the CEA delegate to call the CEA server at this endpoint with the task parameters.

Tool Resolver

interface: Locator

implementations: RegistryToolLocator

A tool resolver converts an abstract resource name to the endpoints of the CEA services that provide this resource. The production implementation of the tool resolver queries the astrogrid registry to produce this information.

This is quite a convoluted process: the registry entry for the resource needs to be retrieved first, and then a separate query is executed to find the CEA servers that support this resource. It is a bit brittle at the moment, as it relies on the registry query language and the CEA schemas.

Other implementations of locator are provided - these can be used for unit testing, or stand-alone deployment, where the registry is replaced by an xml configuration file listing known resources.
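
A registry-free locator of the kind described above could be as small as the following sketch. The names SketchLocator and StaticLocator are made up for illustration. The known resources are registered in code here, whereas the text describes them being read from an xml configuration file.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical interface: abstract resource name in, CEA endpoints out.
interface SketchLocator {
    List<String> locate(String resourceName);
}

// Answers from a fixed table of known resources instead of querying a registry.
class StaticLocator implements SketchLocator {
    private final Map<String, List<String>> known = new HashMap<>();

    void register(String resourceName, String endpoint) {
        known.computeIfAbsent(resourceName, k -> new ArrayList<>()).add(endpoint);
    }

    public List<String> locate(String resourceName) {
        List<String> endpoints = known.get(resourceName);
        if (endpoints == null) {
            throw new IllegalArgumentException("no CEA service provides " + resourceName);
        }
        return Collections.unmodifiableList(endpoints);
    }
}
```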

Component Manager

interface: ComponentManager

implementations:

Wraps a picocontainer. Configuration is examined to determine which component implementations to register with the picocontainer. After the components are registered, the container is started. The picocontainer then takes care of instantiating components in the correct order, and plumbing them together.

The started component manager is stored in a static singleton ComponentManagerFactory from where it is accessible throughout the webapp - in particular the web-service implementations.

The component manager has accessor methods to retrieve the web-interface components (Job Controller, Job Monitor) from the picocontainer. These are called by the corresponding web-service implementations.
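
The selection-and-wiring role can be illustrated without a container. The sketch below wires components by hand through constructors instead of using picocontainer. The property keys jes.store and jes.store.dir and all class names are invented for the example. It shows the pattern only: read the configuration once, pick implementations, connect them, and refuse to start on a bad configuration.

```java
import java.util.Properties;

// Hypothetical components, reduced to what the wiring needs.
interface StoreComponent {
    String describe();
}

class MemoryStoreComponent implements StoreComponent {
    public String describe() { return "in-memory store"; }
}

class FileStoreComponent implements StoreComponent {
    private final String dir;

    FileStoreComponent(String dir) { this.dir = dir; }
    public String describe() { return "file store in " + dir; }
}

class SchedulerComponent {
    final StoreComponent store;   // dependency supplied through the constructor

    SchedulerComponent(StoreComponent store) { this.store = store; }
}

class ComponentManagerSketch {
    private final SchedulerComponent scheduler;

    // All configuration is read here, so a bad setting fails at startup.
    ComponentManagerSketch(Properties config) {
        String kind = config.getProperty("jes.store", "memory");   // hypothetical key
        StoreComponent store;
        if (kind.equals("memory")) {
            store = new MemoryStoreComponent();
        } else if (kind.equals("file")) {
            String dir = config.getProperty("jes.store.dir");      // hypothetical key
            if (dir == null) {
                throw new IllegalStateException("jes.store.dir is required for the file store");
            }
            store = new FileStoreComponent(dir);
        } else {
            throw new IllegalStateException("unknown jes.store: " + kind);
        }
        this.scheduler = new SchedulerComponent(store);
    }

    SchedulerComponent getScheduler() { return scheduler; }
}
```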

Implementation Notes

Config
Uses org.astrogrid.common.config for configuration, although its use is restricted to the component manager packages. The jes server is fully configured before it starts operating, so configuration errors cause the service to fail at startup rather than during execution.

Logging
Uses commons logging.

Testing
Unit tests are written with JUnit.

In addition, components ship with configuration tests written as JUnit tests. These can be called on a running jes server to verify that it is correctly configured. There's a link from the jes webapp pages to run these tests.


Jes Delegates

Two delegates to the jes service are available.

The Job Controller delegate (javadoc) is called by client-side, user-driven code to submit workflow documents to the jes service and to retrieve them from it.

The Job Monitor delegate (javadoc) is called by CEA services to notify changes in task execution status.

Client-side Workflow Library

This library (javadoc) is used by the astrogrid portal (and other clients - for instance integration tests) to interact with the Workflow system.

This library is in the Workflow maven project.

It layers on top of the Job Controller delegate, plus Registry and MySpace delegates.

It provides convenient methods to:

  • construct workflow documents
    • query registry for CEA resources
    • add step objects to the workflow, pre-populated with registry info
  • save and load workflow documents to myspace
  • submit and retrieve workflow documents to jes servers

  • Workflow Client Library - class diagram:
    Workflowclientlibrary.png

-- NoelWinstanley - 19 May 2004

Topic attachments

  • JesComponents.png (17.3 K, 2004-05-19 14:30, NoelWinstanley) - Components of Jes Server
  • WorkflowDocument.png (58.3 K, 2004-05-19 14:28, NoelWinstanley) - Workflow Document Object Model
  • WorkflowSystem.png (21.7 K, 2004-05-19 14:29, NoelWinstanley) - Deployment Diagram of Workflow System
  • Workflowclientlibrary.png (59.9 K, 2004-05-19 14:29, NoelWinstanley) - Workflow Client Library - class diagram
Topic revision: r2 - 2004-05-19 - 14:14:00 - NoelWinstanley
 