ACE
(Astronomical Catalogue Extractor)
Design
Martin Hill
mch@roe.ac.uk
The Astronomical Catalogue Extractor is the Astrogrid team's contribution to the
AVO Demonstrator (AVODemo). It is essentially a SExtractor application published as a web service.
Introduction
Although this document is primarily the design document for ACE, I am also using it to 'brain dump' ideas that surface as a result of considering this application. These are mostly written in square brackets as [ Aside.... ].
Background
See AVODemoRequirements
Supporting Documents
See AVODemo for Glossary, Reference docs, etc
Technologies
These have been chosen out of the various technologies discussed over the previous Astrogrid meetings, to try them out and/or confirm their suitability:
Java
XML
SOAP
Apache
Axis
ACE Client/Server actions
In summary, the following actions will be performed between the ACE client and server (consider this a brief use case):
Admin:
- Request Metadata
- Respond with Metadata
- Request Defaults
- Respond with Defaults
Running Extraction
- Request Extraction
- Refuse Request
- Accept Request
- Return results
Return Failures/Exceptions
- Invalid post from client
- Illegal parameters
- SExtractor application failure
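As a rough illustration of how the actions and failures above might map onto Java types, here is a minimal sketch. All names (AceService, AceException, the Kind values) are hypothetical and not part of any agreed interface; parameters and results are shown as plain XML strings only to keep the example small, and a refused request is not modelled separately.

```java
/** Hypothetical contract covering the client/server actions listed above. */
interface AceService {

    /** Admin: returns a description of the service (assumed to be XML text). */
    String requestMetadata();

    /** Admin: returns the default parameter set (assumed to be XML text). */
    String requestDefaults();

    /** Runs an extraction and returns the results, or throws on failure. */
    String requestExtraction(String parametersXml) throws AceException;
}

/** Hypothetical exception carrying one of the three failure categories. */
class AceException extends Exception {

    enum Kind { INVALID_POST, ILLEGAL_PARAMETERS, SEXTRACTOR_FAILURE }

    private final Kind kind;

    AceException(Kind kind, String message) {
        super(message);
        this.kind = kind;
    }

    Kind getKind() {
        return kind;
    }
}
```

Keeping the failure category as an enum rather than as separate exception classes is just one option; it makes it easy to copy the category into a SOAP fault later.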
Components
This section looks at the parts that go to make up ACE. I have tried to separate them into logical, reasonably independent parts that can be written independently as a set of Java classes, and can similarly become work packages.
Collaborative view:
Grid Server
(The Server that the ACE service will be provided on).
To reduce network load, it will probably be sensible for data and applications to be co-located wherever possible.
For example, Edinburgh's expertise in optical imaging might mean that they provide a number of optical-image-editing tools on one grid server, so that several processes can be run on one set of evolving data without any network bandwidth being required.
Any Grid Server might have any number of applications and data sets (or indeed none if it is a simple space provider), and different providers might provide different sets.
It appears that Apache/Axis can handle this service management. Axis-generated WSDL should correspond well with the Axis-managed SOAP messages, keeping things consistent, although there may be problems interfacing with different implementations of SOAP.
Axis includes Log4J, so for consistency the service application should report errors to this sub service.
May also require an FTP server for retrieving results.
Application Service
The classes that manage the execution of the service; ie, the ones that represent the service instance. It will handle instance-related things such as temporary file locations. Will be an assembly of the following classes.
Use XML documents to pass parameters through SOAP.
Beware internal links in the documents: not guaranteed unique.
Low priority: colocate SExtractor with Data. ESO need to be informed that they will need Apache/Axis running
SExtractor Executor
A set of classes that executes a SExtractor application. It will manage such things as monitoring the output streams and returning the exit codes.
Check: What workfiles does SExtractor create? SExtractor may have to be run from instance-unique locations to allow several instances to run at once.
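The following is a minimal sketch of what the executor might look like, assuming each run gets its own temporary working directory so that concurrent instances do not overwrite each other's workfiles. The class name ExternalRunner, the Result type and the "ace-run-" directory prefix are hypothetical; the command line itself is supplied by the caller, so nothing here depends on SExtractor's actual options.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical runner: one external process per instance-unique directory. */
final class ExternalRunner {

    /** Exit code, captured output lines and the directory the process ran in. */
    static final class Result {
        final int exitCode;
        final List<String> output;
        final File workDir;

        Result(int exitCode, List<String> output, File workDir) {
            this.exitCode = exitCode;
            this.output = output;
            this.workDir = workDir;
        }
    }

    static Result run(List<String> command) throws IOException, InterruptedException {
        // A fresh directory per run keeps workfiles of parallel instances apart.
        File workDir = Files.createTempDirectory("ace-run-").toFile();

        ProcessBuilder builder = new ProcessBuilder(command);
        builder.directory(workDir);
        builder.redirectErrorStream(true); // merge stderr into stdout

        Process process = builder.start();
        List<String> lines = new ArrayList<>();
        // Reading until end-of-stream also prevents the child blocking on a full pipe.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        int exitCode = process.waitFor();
        return new Result(exitCode, lines, workDir);
    }
}
```

A non-zero exit code would be reported back to the client as a SExtractor application failure; cleaning up the working directory afterwards is left out of the sketch.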
Parameter Assembler
Creates a list of parameters and values from the following sources, validating them as it does so:
- Default parameter values
- SExtractor-format config template file
- or XML-format config template file
- Contents of SOAP document listing parameters
Overriding is carried out as above; ie, we start with a list of defaults. Any specified in the templates override these, and any specified in the SOAP document override them all.
The Assembler creates a Parameter List.
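The override order can be sketched as below. This is only an illustration: the class name ParameterAssembler, the use of plain string maps as the Parameter List, and the validation rule (reject any name that is not in the defaults) are all assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of the layered override: defaults, template, SOAP. */
final class ParameterAssembler {

    /** Merges the three sources in increasing order of priority. */
    static Map<String, String> assemble(Map<String, String> defaults,
                                        Map<String, String> template,
                                        Map<String, String> soap) {
        Map<String, String> merged = new LinkedHashMap<>(defaults);
        overrideKnown(merged, template, "template");
        overrideKnown(merged, soap, "SOAP document");
        return merged;
    }

    /** Applies overrides; unknown names are rejected (assumed validation rule). */
    private static void overrideKnown(Map<String, String> target,
                                      Map<String, String> overrides,
                                      String source) {
        for (Map.Entry<String, String> entry : overrides.entrySet()) {
            if (!target.containsKey(entry.getKey())) {
                throw new IllegalArgumentException(
                        "Unknown parameter '" + entry.getKey() + "' in " + source);
            }
            target.put(entry.getKey(), entry.getValue());
        }
    }
}
```

For example, if the defaults set DETECT_THRESH to 1.5, the template sets it to 2.0 and the SOAP document sets it to 3.0, the resulting list holds 3.0; a parameter given only in the template keeps the template value.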
Parameter List
An internal representation of the parameter/value pairs
SExtractor Config file parser
Extracts parameters from a SExtractor-format config file. Used for both reading SExtractor-format config template files to create the initial Parameter List, and also for reading the config file actually used when executing the application so that the actual configuration used can be returned with the results. (TBD on the latter - this could be created directly from the Parameter List)
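A minimal sketch of such a parser follows, assuming the usual layout of a SExtractor configuration file: one keyword per line followed by its value, with '#' starting a comment. The class name SexConfigParser is hypothetical, and the lines are assumed to have been read from the file already.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical parser for "KEYWORD value  # comment" style config lines. */
final class SexConfigParser {

    static Map<String, String> parse(List<String> lines) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String raw : lines) {
            // Drop anything after a comment marker, then surrounding blanks.
            int hash = raw.indexOf('#');
            String line = (hash >= 0 ? raw.substring(0, hash) : raw).trim();
            if (line.isEmpty()) {
                continue; // blank or comment-only line
            }
            // The keyword is the first token; the rest of the line is the value.
            String[] parts = line.split("\\s+", 2);
            String value = parts.length > 1 ? parts[1].trim() : "";
            params.put(parts[0], value);
        }
        return params;
    }
}
```

The value is kept as a single string, so comma-separated values stay intact for the validation step to interpret.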
SExtractor Config file writer
Takes a Parameter List, and writes its parameters out in the ASCII configuration file format that SExtractor expects.
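A sketch of the writer, under the same assumption that the configuration format is one keyword and value per line. The class name SexConfigWriter and the 16-character keyword column are arbitrary choices for illustration.

```java
import java.util.Map;

/** Hypothetical writer producing one "KEYWORD value" line per parameter. */
final class SexConfigWriter {

    static String write(Map<String, String> parameters) {
        StringBuilder text = new StringBuilder();
        for (Map.Entry<String, String> entry : parameters.entrySet()) {
            // Left-align the keyword in a fixed-width column for readability.
            text.append(String.format("%-16s %s", entry.getKey(), entry.getValue()));
            text.append('\n');
        }
        return text.toString();
    }
}
```

Returning the text rather than writing a file directly leaves the choice of instance-specific file location to the caller.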
URI Resolver
Resolves URIs to local filenames where possible. This may need to be done explicitly - can we be sure that all TCP/IP implementations will do this automatically?
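One way of doing this explicitly is sketched below. It only treats file: URIs with no host, or with a host matching this server, as local; everything else would have to be copied in. The class name LocalUriResolver and the localHost argument are hypothetical.

```java
import java.io.File;
import java.net.URI;
import java.util.Optional;

/** Hypothetical resolver: maps a URI to a local file when it is clearly local. */
final class LocalUriResolver {

    static Optional<File> toLocalFile(URI uri, String localHost) {
        if (!"file".equalsIgnoreCase(uri.getScheme())) {
            return Optional.empty(); // e.g. http or ftp: must be fetched instead
        }
        String host = uri.getHost();
        boolean local = host == null
                || host.isEmpty()
                || host.equalsIgnoreCase("localhost")
                || host.equalsIgnoreCase(localHost);
        if (!local || uri.getPath() == null) {
            return Optional.empty();
        }
        return Optional.of(new File(uri.getPath()));
    }
}
```

An empty result tells the caller that the data is remote and needs copying in before SExtractor is run.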
Data Converter
(TBD) if the output of the service is to be pure-XML VOTable, we will need a FITS (or ASCII) --> VOTable converter. This should be reasonably straightforward...
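To give a feel for the size of the job, here is a rough sketch of the writing half of such a converter, assuming the catalogue has already been read into numeric rows. The class name VoTableWriter is hypothetical, every column is assumed to be a double, and the element layout follows my understanding of the basic VOTable structure (FIELD descriptions followed by TABLEDATA rows); it has not been checked against the schema.

```java
import java.util.List;

/** Hypothetical writer turning numeric catalogue rows into VOTable-style XML. */
final class VoTableWriter {

    static String toVoTable(List<String> columnNames, List<double[]> rows) {
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\"?>\n");
        xml.append("<VOTABLE version=\"1.0\">\n <RESOURCE>\n  <TABLE>\n");
        // One FIELD element per catalogue column (all assumed to be doubles).
        for (String name : columnNames) {
            xml.append("   <FIELD name=\"").append(escape(name))
               .append("\" datatype=\"double\"/>\n");
        }
        xml.append("   <DATA>\n    <TABLEDATA>\n");
        for (double[] row : rows) {
            if (row.length != columnNames.size()) {
                throw new IllegalArgumentException("Row width does not match column count");
            }
            xml.append("     <TR>");
            for (double value : row) {
                xml.append("<TD>").append(value).append("</TD>");
            }
            xml.append("</TR>\n");
        }
        xml.append("    </TABLEDATA>\n   </DATA>\n  </TABLE>\n </RESOURCE>\n</VOTABLE>\n");
        return xml.toString();
    }

    /** Escapes the characters that are special inside an XML attribute value. */
    private static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;")
                   .replace(">", "&gt;").replace("\"", "&quot;");
    }
}
```

Reading the FITS or ASCII catalogue, and carrying units and column descriptions across, are the parts this sketch leaves out.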
Test Harness/UI
The User Interface can be a quick-and-dirty web form where the user can specify:
- image location
- (optional) chi-squared image location
- (optional) variance map (weighted image)
- Parameters for object extraction
- template configuration file (XML or SExtractor format)
And an area where progress and success/failure messages can be displayed.
This will need a script behind it to wrap it all up in the right SOAP message and send it to the service.
Messages/Sequence of Events
This section describes the various messages passed between the clients and servers.
NB - I've made a distinction between the GridServer and a GridService; the former being the overall controller of the services available on a server, and the latter being an instance of a particular service.
Getting Metadata
Normal Extraction
In more detail, the normal flow of events is as follows:
- Axis receives SOAP message:
  - Validates message
  - Creates instance of SExtractor Executor
  - Calls SExtractor Executor with SOAP-document contents
- SExtractor Executor:
  - Resolves URLs, copies in remote files.
  - Assembles Parameter List:
    - Creates initial set from defaults
    - Parses template config file (SExtractor or XML format) and overrides values
    - Takes those given by SOAP message and overrides values
  - Writes out instance configuration file
  - Executes SExtractor application
  - (Monitors output, waits for completion)
  - Converts output to VOTable format
  - Writes XML-format config template based on Parameter List (or SExtractor config file)
  - Posts VOTable format and config template to client
Failures
Security
In general the security aspects are low priority, as this is not a safety-, mission- or business-critical application. However, certain threats can be easily prevented, so a quick analysis is done here.
Quick Analysis
Threats to Data
- Service Denial
- Malicious
- Overloading (too many requests, or images too large)
- Integrity (interfering with data stream to create valid but spoof data)
- eg Creating new messages and sending
- eg Sending previously captured messages later.
- Wossname (providing garbage data)
- Wossname (Theft - taking data without authority)
Note that these affect servers as well as clients.
Sources
- Internet Hackers (Affecting the internet traffic)
- Ignorant Scientists (Innocently trying to get services not paid for)
- Thieving(!) Scientists (Deliberately trying to get services not paid for)
- Malicious Staff at Grid Service Providers.
- Malicious people with access to Grid Service Provider computers.
Effects & Treatments
No Safety of Life issues
No Mission Critical issues
(? check)
Payment Issues: Some resources will be free to some, paid for by others.
Availability Issues: Some resources will be free to a select set of people/groups, and not available at all to others.
None of these apply to a one-off demonstration, but do if ACE is to be used afterwards.
Definitely
- Application Server (kept for general use?)
- Some of the communications protocol
Possibly
-- MartinHill - 29 Aug 2002