Architect's Blog

This page records decisions, by myself and others, that affect the AstroGrid architecture. I hope to capture the reasoning and detail of apparently-small decisions that come to dominate the software product. This is inspired by a note in Brooks' The Mythical Man-Month.

-- GuyRixon - 20 Dec 2004

SSO stuff

Current state of play on SSO...

  • Still tracking the emerging IVOA standards.
  • Test MyProxy server in Cambridge (mail me if you want a presence there).
  • Client code in astrogrid/community/resolver component (CVS head) to drive MyProxy.
  • Code in astrogrid/security component (branch security-gtr-1593) to sign messages to services and to check the signatures inside services.
  • Code in astrogrid/applications component to sign/verify requests (using astrogrid/security, above).
  • Code in astrogrid/desktop/impl component (CVS head) to log in with MyProxy (uses astrogrid/community/resolver, above).
  • Code in astrogrid/desktop/impl component (uncommitted) to get "authenticate" instructions from user and pass to CEA delegate.
  • Code in astrogrid/desktop/api component to help pass credentials from MyProxy to CEA delegate.

Thus, we're basically saying that the next major AR release will have all the bits for authenticating to CEA services in the application launcher. It won't be able to use secured CEA services through JES.

Note that we need to add something to the AR to support authentication. We need this anyway, for other clients using the AR, but even the Workbench can't push credentials into CEA without this change; CEA delegates are too well encapsulated by ACR to allow this. The AR changes should be pure additions, so nothing should break.

Note also that the current prototype couples the workbench directly to the MyProxy service: this is a wart. The coupling isn't apparent directly in the AR (thankfully), but it's still regrettable. I hope to subsume the MyProxy client inside the security facade so that the Workbench and/or AR don't see it. This means that we can later ditch MyProxy in favour of, say, WS-Trust if we need to.

-- GuyRixon - 30 May 2006

Reverse AJAX

If Ajax makes your workbench cleaner the more you scrub it then does reverse Ajax...? No, let's not go there. Reverse AJAX (note acronym) is a set of techniques for pushing information from servers to clients without a proper callback-service on the latter. Not something that AstroGrid is committed to, but relevant to some of the ACR work.

-- GuyRixon - 30 May 2005

On UI non-design

Quote from Hacknot:

It's not hard to spot interfaces constructed by [developers without skill in HCI design]. One of the tell-tale signs is that the application's dialog boxes look like dumping grounds for abandoned components. There's no obvious order or structure, nothing lines up with anything else. The net result looks like your dog threw up a Scrabble set.

I shall now pray and give thanks that nothing AstroGrid's produced has ever been quite that bad. :-)

-- GuyRixon - 05 Apr 2006

Web 2.0

I found this as a definition of what Web 2.0 is about:

  • The Web and all its connected devices as one global platform of reusable services and data
  • Data consumption and remixing from all sources, particularly user generated data
  • Continuous and seamless update of software and data, often very rapidly
  • Rich and interactive user interfaces
  • Architecture of participation that encourages user contribution
It's in a page by Dion Hinchcliffe. If you replace "Web" with "VO" it sounds very familiar.

-- GuyRixon - 03 Apr 2006

Commons Logging

I read today some interesting commentary on Jakarta commons-logging, all of it negative and some of it worrying. The primary case for the prosecution is by Ceki Gülcü; note the section concerning the "ThreadDeath" problems. More directly, the original inventor of clogging, Rod Waldhoff, explains the circumstances in which he'd expect it to be used. They are a very narrow set of the use cases:

In fact, there are very limited circumstances in which Commons Logging is useful. If you're building a stand-alone application, don't use commons-logging. If you're building an application server, don't use commons-logging. If you're building a moderately large framework, don't use commons-logging. If however, like the Jakarta Commons project, you're building a tiny little component that you intend for other developers to embed in their applications and frameworks, and you believe that logging information might be useful to those clients, and you can't be sure what logging framework they're going to want to use, then commons-logging might be useful to you.

In AstroGrid components, we provide frameworks rather than libraries and we get to choose the logging implementation. All our products use log4j and are configured via log4j.properties. However, we all use commons-logging too. It strikes me that we could and should eliminate the commons-logging layer as code comes up for maintenance. It can't hurt and it might cure some problems.

-- GuyRixon - 13 Mar 2006

"Type Managers"

I found some evangelism about a pattern for whole applications called Type Manager. I'm not sure yet whether I agree with all this. If the principle is considered sound, then the details of the pattern must surely apply to a lot of the interactive clients that we might make for the VO, especially those that serve a specific area of astronomy. The file-manager specialization of the pattern might apply to generic MySpace browsers.

The stipulation that a "proper" file-manager has to include a CD-burning function is interesting. The ability to archive data in MySpace to a CD (or DVD) would be quite useful. One could imagine adding this to a UI, such as the Workbench, or building it as a service, possibly as a specialized kind of VOStore.

-- GuyRixon - 16 Nov 2005

Security architecture

My summary of where I think we might go for security in 2005-2006: SecurityArchitectureFor2005. This is my personal pick of everybody else's good ideas. It's what I shall push for in IVOA unless persuaded otherwise.

-- GuyRixon - 18 Apr 2005

Ideas for installing AstroGrid components

JohnTaylor's idea for using GUI installers with AGINAB: AstroGridComponentInstallers.

An alternative idea for a new web-application container: WebAppDeploymentBasedOnJetty.

-- GuyRixon - 08 Apr 2005

Schemata published

The AstroGrid schemata are now published on software.astrogrid.org. Get 'em while they're cold and dead!

-- GuyRixon - 01 Mar 2005

Apache Apollo

The Apache Apollo project is an implementation of OASIS' WS-ResourceFramework in Java.

-- GuyRixon - 25 Feb 2005

Axis v2

The first milestone release of Axis v2 has been announced: AxisV2Announcement.

-- GuyRixon - 25 Feb 2005

schemaLocation for WSDL imports

When processing a WSDL document, the import statements are resolved by the processor, relative to wherever it read the document from; this means that relative URLs in the schemaLocation attributes don't work well. Consider the element

  <xsd:import
     namespace="http://www.astrogrid.org/schema/AGWorkflow/v1"
      schemaLocation="../schema/Workflow.xsd"/>
which is in the live copy of CommonExecutionConnector.wsdl. It's saying that on the server, the subsidiary schema is in a sibling directory, called schema, of the directory holding the WSDL document. This link breaks if the WSDL document wasn't taken from such a directory: e.g. it's a copy on my local disc or the WSDL contract emitted directly from the service.

When we have a stable location from which to serve schemata (within the next few days), then we should write, in import elements, absolute URLs pointing to this location. I intend to make this change in the first release of the contracts sub-system.
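The breakage described above can be reproduced with nothing more than java.net.URI. This is a minimal sketch, not code from any AstroGrid component: the class name and all the base URLs are made up for illustration, and only the relative schemaLocation value is taken from the WSDL quoted above.

```java
import java.net.URI;

// Sketch: how a WSDL processor resolves a schemaLocation against the
// address from which it happened to read the WSDL document.
class SchemaLocationSketch {

    // Resolve a schemaLocation value against the WSDL's own URI.
    static String resolve(String wsdlUri, String schemaLocation) {
        return URI.create(wsdlUri).resolve(schemaLocation).toString();
    }

    public static void main(String[] args) {
        String relative = "../schema/Workflow.xsd";
        // Hypothetical places from which a client might obtain the same WSDL.
        String[] bases = {
            "http://example.org/contracts/wsdl/CommonExecutionConnector.wsdl",
            "http://example.org:8080/cea/services/CommonExecutionConnector?wsdl",
            "file:/home/user/CommonExecutionConnector.wsdl"
        };
        for (String base : bases) {
            System.out.println(base + " -> " + resolve(base, relative));
        }
        // An absolute schemaLocation (hypothetical path) ignores the base.
        System.out.println(resolve(bases[2],
                "http://software.astrogrid.org/schema/Workflow.xsd"));
    }
}
```

Only the first base has a sibling schema directory in the right place. The other two resolve to locations that exist only by accident, which is the argument for absolute URLs once the schemata have a stable home.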

-- GuyRixon - 18 Feb 2005

The reason that relative links are used is to make development easier - you don't need to publish a new version of the schema before you can make changes to the WSDL - it can work off simultaneously edited local copies of all the relative files. When publishing the WSDL in the workflow objects project, care was taken to ensure that these relative links were maintained. I think that Axis totally rewrites the WSDL that it issues for a service anyway, so it would resolve those links. I am rather against the idea of absolute links in import/include statements: it makes life much more difficult.

-- PaulHarrison - 25 Feb 2005

The proper way to handle this is to use an XML catalogue, as implemented in Xerces. Tools that expect a catalogue tend to break if relative URLs are used.

Axis doesn't rewrite the WSDL when working from a WSDL file (c.f. Axis generating its own WSDL from Java at run-time) and this will break the tools that derive stubs from WSDL at run-time. Any tool that reads WSDL emitted by the web-service itself doesn't see the context for the imports and so can't resolve them using relative URLs.

If one is changing a schema (.xsd or .wsdl), then one has to issue a new namespace. In that case, a changed subsidiary schema can have a relative URL until it's committed into the contracts project.

-- GuyRixon - 01 Mar 2005

Catalogues are the best way to handle schema locations, but support for this in tools is extremely patchy - my experience is that relative URLs are the lowest common denominator that work in the largest number of tools.

Axis does rewrite the WSDL that it emits from the web service itself - compare http://zhumulangma.star.le.ac.uk:8080/astrogrid-jes-SNAPSHOT/services/ResultListener?wsdl with the input WSDL written for the service http://www.astrogrid.org/viewcvs/*checkout*/astrogrid/workflow-objects/wsdl/CEAResultsListener.wsdl?rev=HEAD&content-type=text/plain

-- PaulHarrison - 02 Mar 2005

In workflow-objects/wsdd/CEAResultsListener-deploy.wsdd, Axis is told to generate the WSDL at run-time from the Java classes, informed by the WSDD parameters. It doesn't use the WSDL file supplied in the source. Compare this with the registry sub-system where the WSDL supplied as source is used and Axis propagates any relative URLs to the client where they break.

Generating WSDL at run-time makes it hard to do version control on the WSDL contract, so we're moving away from that technique. Therefore, relative URLs are and remain a problem. It's possible that Axis 1.2+ might resolve the import statements on the server side; I'll check.

-- GuyRixon - 02 Mar 2005

URNs for namespaces

Down with URLs! Up with URNS! For XML namespaces, at least: UrnsForNamespaceUris.

-- GuyRixon - 11 Feb 2005

Wizard dialogs in Swing

If we're going to start making Swing clients for AstroGrid, we may as well have some wizards. Sun have published a guide to writing wizards.

-- GuyRixon - 11 Feb 2005

Use of JNDI in web applications: difficult details

AstroGrid uses JNDI in web applications to get at configuration data. We write the configuration data in env-entry elements in the deployment descriptor and get them into the Java code by calling the APIs for naming defined by J2EE.

The J2EE 1.4 specification (section J2EE.5.2.1.2) says

The Application Component Provider must declare all the environment entries accessed from the application component's code. The environment entries are declared using the env-entry elements in the deployment descriptor.
In other words, if it's not written in web.xml you mustn't ask for it in a Java class. By extension, it's improper to use the absence of declaration for a particular environment entry as a trigger to look elsewhere for the configuration: the code is not allowed to make that test.

Our configuration library looks up a JNDI key that would contain a URL to a configuration file if the key were present; if the key is absent, then the library looks for the configuration on the classpath. The library also looks for individual configuration items in JNDI then falls back to this properties file. This now looks to be an improper use of J2EE: one of the things that just happen to work on Tomcat and might fail in another container.

We should look at changing this. We might change to support some kind of null value in a JNDI key that's always present. Or we could just accept that we don't use properties any more and require all the configuration keys to be declared in the deployment descriptor; we've pretty much agreed to do the latter thing. In that case, the speculative look-ups in JNDI should be removed from the codebase.
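A minimal sketch of what a look-up without the speculative fall-back might look like. The class, the method and the key name are hypothetical and this is not the existing configuration library; it assumes the entry is declared in web.xml with env-entry-type java.lang.String.

```java
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;

// Sketch: read only entries that the deployment descriptor declares.
class DeclaredConfigSketch {

    // Look up an entry that web.xml is assumed to declare as an env-entry.
    // A missing entry is treated as a deployment error, not as a cue to
    // search elsewhere, so the NamingException is allowed to propagate.
    static String require(String name) throws NamingException {
        Context env = (Context) new InitialContext().lookup("java:comp/env");
        return (String) env.lookup(name);
    }

    public static void main(String[] args) {
        try {
            // Hypothetical key, for illustration only.
            System.out.println(require("org.astrogrid.registry.url"));
        } catch (NamingException e) {
            // Outside a J2EE container there is no java:comp/env to consult.
            System.out.println("Configuration entry not available: " + e);
        }
    }
}
```

The point of the design is that there is exactly one source of configuration, so the code never has to test whether a key is declared.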

WS-RF implementations from Apache

The Apache Muse project is an implementation of "Management Using Web Services", and that protocol uses the OASIS WS-Resource Framework that is the infrastructure for modern grid services. From the Muse web-site:

The Muse 0.5 Alpha release contains an implementation of the following specifications:

  • Management Using Web Services (MUWS) 0.5
  • WS-ResourceProperties 1.1
  • WS-ResourceLifetime 1.1
  • WS-BaseFaults 1.0
  • WS-BaseNotification 1.0
  • WS-Topics 1.0

The release provides a way to take a MUWS-specific WSDL file and generate a MUWS-compliant webservice which can be hosted inside of Apache Axis. This release contains the original implementation provided by HP and is a self-contained implementation of the specs. We've provided an architecture document here.

Ignoring the first entry in the list, that's the set of standards needed for the Universal Worker Service proposal recently uploaded to IVOA (UWS is basically CEA rejigged to run over WS-RF). Their page goes on to say:

The future Muse 1.0 implementation will use Apache Apollo Project as its WS-ResourceFramework implementation and Apache Hermes Project as its Web Services Notification implementation.

There is clearly a lot of momentum building up behind WS-RF. These Apache implementations are an alternative to the Globus Toolkit for us Java developers. Muse itself is still in the Apache 'Incubator' so isn't ready for production use.

-- GuyRixon - 10 Feb 2005

Gathering interface schemata into a Maven project

Noel suggested a nice way to store the various schemata associated with our web-service contracts: MavenProjectForContracts. We need first to sort out the versioning: VersionControlForContracts.

-- GuyRixon - 02 Feb 2005, updated on 08 Feb 2005

XML schemata and the registry

A while ago, we had a debate about the value of using XML. The main points in favour were

  • it's checkable in detail using schemata written to the W3C XML schema language;
  • it can be worked with 'industry-standard tools'.

Currently, we're losing these benefits, at least in respect of registry metadata, because we don't have a complete set of schemata and the industry-standard tools have problems with the schemata we do have.

This is the back-story: I'm struggling to create v0.10 registrations. I want to write these with oXygen v5.1, my preferred XML-editor, to check them against schema and only then to put them in the registry. There are three problems.

First, VOResource v0.10 refers to a subsidiary schema VOMetadata v0.1, from which it uses the element dcterm. The subsidiary schema is not imported by the main schema, which makes the latter invalid in itself. The VOMetadata schema is not visible on the IVOA site.

Second, I need to get the schemata into the editor. The only way to do this is to use schemaLocation attributes in the document (or in another schema). This is a lot of typing each time I create a document.

Finally, the editor can create the document structure for me, but only if:

  • I give it a single, master schema from which to work;
  • I can tell it a global element in the master schema.

Proposed refactoring to fix this: RegistrySchemataThatWorkWithTools.

-- GuyRixon - 01 Feb 2005

Axis web-application

Suggested rationalization of how we package Axis: UsingAxisWebApp.

-- GuyRixon - 26 Jan 2005

On protecting resource registrations

Kevin asked about securing the resource registry. Here's my reply, quoting his original email: ProtectingRegistryEmailExchange.

-- GuyRixon - 25 Jan 2005

Shibboleth

There is a new-ish version of the Shibboleth protocol specification. I'm working through it, trying to understand HowToUseShibboleth.

-- GuyRixon - 25 Jan 2005

Standards for web services

There is a paper on xml.com by Rich Salz discussing standards for SOAP services. Among other conclusions, it recommends using SOAP 1.1 in preference to SOAP 1.2.

-- GuyRixon - 25 Jan 2005

Use of IVORNs and name resolution

The current use of IVORNs makes me uneasy, because we don't use them exactly according to their apparent rules. This makes for structures that work nicely inside AstroGrid and unravel in the wider VO.

I now suggest slightly different rules for naming of accounts and things in VOSpace: NamesBasedOnIVORNs. This is a collection and formalization of ideas that have been around for a while.

Previous discussion on this:

Implicit loop in workflows

Ideas for easier looping following the Consortium meeting last month: ImplicitLoopsInWorkflow.

-- GuyRixon - 04 Jan 2005

Axis v1.2 and WSDL

In Axis v1.2 (currently in beta at release candidate 2), there is a feature to set the endpoint address of any emitted WSDL, including WSDL taken from a file (as opposed to WSDL generated from the Java code). This feature is always on (no associated configuration options that I can see). It's not present in the code for Axis 1.1.

Therefore, in order to serve WSDL from files, as we want to do, we need to upgrade to Axis 1.2. This version also claims to have better conformance to the WS-I basic profile. We should make this version standard for Itn9, but not for Itn8 because

  • we have enough to do without converting all the sub-systems; and
  • we risk interoperation problems if we mix versions in one system.

However, it would be useful to run some tests with Axis 1.2 now.

-- GuyRixon - 05 Jan 2005

WSDL emitted by Axis

We want our WSDL contracts to be written down in files. This causes problems at installation, as Axis doesn't support WSDL-in-files very well.

  1. How does Axis find the WSDL file?
  2. How does the endpoint address get set?

From the Axis 1.2 reference manual concerning the wsdlFile element from the WSDD vocabulary:

The path to a WSDL File; can be an absolute path or a resource that axis.jar can load. Useful to export your custom WSDL file. When specify a path to a resource, place a forward slash to start at the beginning of the classpath (e.g "/org/someone/res/mywsdl.wsdl"). How does Axis know whether to return a file or resource? It looks for a file first, if that is missing a resource is returned.

If I understand the jargon correctly, Axis requires you either to put the WSDL file on the classpath or to give the absolute file-path. We don't know the absolute file-path when we write the WSDD file, and we don't want yet another file that has to be edited by the person installing the WAR. Therefore, we are better off putting the WSDL files in the classpath. It is simplest to put them directly in the classes directory and to refer to them in the WSDD as /mywsdl.wsdl, i.e. not specifying a package name.
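The file-first, resource-second rule quoted from the manual can be sketched as below. This is my reading of the documented behaviour, not Axis's own code; the class and method names are hypothetical.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Sketch: locate a WSDL document the way the manual describes,
// trying the file system first and the classpath second.
class WsdlLocatorSketch {

    // Returns a stream on the WSDL, or null if neither a file nor a
    // classpath resource of that name exists.
    static InputStream open(String path) throws IOException {
        File file = new File(path);
        if (file.isFile()) {
            return new FileInputStream(file);
        }
        // A leading slash makes the look-up start at the classpath root,
        // e.g. WEB-INF/classes in a web-application.
        return WsdlLocatorSketch.class.getResourceAsStream(path);
    }

    public static void main(String[] args) throws IOException {
        // "/mywsdl.wsdl" is the unpackaged form suggested above.
        try (InputStream in = open("/mywsdl.wsdl")) {
            System.out.println(in == null ? "WSDL not found" : "WSDL found");
        }
    }
}
```

Under this rule a name such as /mywsdl.wsdl is found on the classpath on any installation, provided no file of that name happens to sit at the root of the file system.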

To set the endpoint address we currently have to edit the WSDL file to match the deployment. It may be possible to include the part of the WSDL contract that sets the address from a servlet that generates the correct endpoint from its context. Or we might be able to extend Axis to adjust the address for us.
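One way to avoid hand-editing is to rewrite the address as the WSDL is served. The sketch below uses only the JDK's DOM and transformer APIs; it is neither Axis code nor a committed AstroGrid servlet, the class and method names are hypothetical, and it assumes the contract uses the WSDL 1.1 SOAP binding.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Sketch: set the endpoint in a WSDL document read from a file.
class EndpointRewriteSketch {

    // Namespace of the WSDL 1.1 SOAP binding elements.
    static final String WSDL_SOAP_NS = "http://schemas.xmlsoap.org/wsdl/soap/";

    // Returns the WSDL text with every soap:address pointing at endpoint.
    static String withEndpoint(String wsdl, String endpoint) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(wsdl)));
        NodeList addresses = doc.getElementsByTagNameNS(WSDL_SOAP_NS, "address");
        for (int i = 0; i < addresses.getLength(); i++) {
            ((Element) addresses.item(i)).setAttribute("location", endpoint);
        }
        Transformer out = TransformerFactory.newInstance().newTransformer();
        out.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter text = new StringWriter();
        out.transform(new DOMSource(doc), new StreamResult(text));
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Cut-down, hypothetical WSDL with a placeholder address.
        String wsdl = "<definitions xmlns='http://schemas.xmlsoap.org/wsdl/'"
                + " xmlns:soap='http://schemas.xmlsoap.org/wsdl/soap/'>"
                + "<service name='S'><port name='P' binding='B'>"
                + "<soap:address location='REPLACE_ME'/>"
                + "</port></service></definitions>";
        System.out.println(withEndpoint(wsdl, "http://example.org:8080/cea/services/S"));
    }
}
```

In a servlet, the endpoint argument would be built from the request's own context, so the WSDL file in the WAR never needs to be edited.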

-- GuyRixon - 04 Jan 2005

More on web-application contexts

A disappointment and a re-think

The approach noted above, where one context uses the document base in another context turns out not to be effective in Tomcat 5. If you undeploy (from Manager) the context that links to the document base, then the documents in the linked-to context are deleted. This is exactly the opposite behaviour to what we would want. (Conversely, if the context file in $CATALINA_HOME/conf/Catalina/localhost is deleted directly, then the parasitic context goes away without damaging the base context; but we can't rely on everybody working in this way.) Thus, this approach is ruled out.

If we can't use linked contexts, then there's no point in using context.xml for properties and the properties are better in web.xml which is standard, as Noel points out above. (Actually, I'd come to the same conclusion before seeing Noel's note.)

Thus, the new direction is to write all properties into the deployment descriptor web.xml. As noted in the book Core J2EE Patterns, the deployment descriptor forms a contract between the developer and the person deploying the software (and we like contracts). The DD is a machine-readable form of that contract, as proposed by Noel at the consortium meeting.

I still intend to investigate JSPs and servlets to set properties. Regarding Paul's suggestion, the tools that come with the servlet container are OK only if they actually are OK; Tomcat's are not. The Admin web-application in all Tomcat versions so far out of beta (5.0.x, x<30 and 5.5.y, y<5) limits the length of the value to 64. All Tomcat versions limit the length of the key to ~24. We can't configure our current web-applications under these restrictions. The Admin web-application in Tomcat isn't even part of the basic installation for 5.5.5 and later; you have to install it separately (and the installation process is broken); therefore we can't rely on it being there in all target sites. Anyway, the Admin web-application is a poor interface for setting parameters (bad ergonomics, little contextual help, confusing etc.).

I think we should at least offer a servlet that lists the properties (as in some of the "fingerprint" pages). I am investigating code that can update the values in the deployment descriptor (but not the structure of the same) such that a restart of the web-application loads the new values. If this has a JSP UI, then we can customize the property-setting page with contextual help for each web-application. I hold that to be a good thing. I note that the person managing the deployment can use either this facility, or the tools provided by the container, or both in order to set the properties.
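Updating values without touching structure could look roughly like this. It is a minimal sketch, not the code under investigation: the class and method names are hypothetical, and it assumes a web.xml without a DOCTYPE and without namespace prefixes on the env-entry elements.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Sketch: change the value of an existing env-entry in web.xml.
class EnvEntryEditSketch {

    // Returns the descriptor text with the named entry's value replaced.
    // Entries are never added or removed: an undeclared name is an error.
    static String setEnvEntry(String webXml, String name, String value) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(webXml)));
        NodeList entries = doc.getElementsByTagName("env-entry");
        boolean found = false;
        for (int i = 0; i < entries.getLength(); i++) {
            Element entry = (Element) entries.item(i);
            Element nameEl = child(entry, "env-entry-name");
            Element valueEl = child(entry, "env-entry-value");
            if (nameEl != null && valueEl != null
                    && name.equals(nameEl.getTextContent().trim())) {
                valueEl.setTextContent(value);
                found = true;
            }
        }
        if (!found) {
            throw new IllegalArgumentException("No env-entry with a value declared for " + name);
        }
        Transformer out = TransformerFactory.newInstance().newTransformer();
        out.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter text = new StringWriter();
        out.transform(new DOMSource(doc), new StreamResult(text));
        return text.toString();
    }

    // First descendant element with the given tag, or null if there is none.
    private static Element child(Element parent, String tag) {
        NodeList list = parent.getElementsByTagName(tag);
        return list.getLength() > 0 ? (Element) list.item(0) : null;
    }
}
```

A JSP built on something like this would write the result back to WEB-INF/web.xml and then ask for the web-application to be restarted so that the new values load.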

-- GuyRixon - 21 Dec 2004

A 'better Admin' application may be all that we need, but I think we'll need a 'property metadata' file that groups appropriate properties with appropriate descriptions (including group descriptions) so that the configuration process is useful. Perhaps some components will need more specialist applications; setting up dataservices to connect to the data may be better done through special pages? Most of the rest of the dataservice settings are standard though...

As far as I can see, we can't set properties using the JNDI service which is a pain. If possible it would be nice to avoid messing about with web.xml directly; it would be better to access/change it through the container so that we don't hit lock/refresh problems or variations between containers (can we be sure there aren't any?!). Is this possible?

-- MartinHill - 05 Jan 2005

JSPs to display and (separately) edit the env-entry elements of web.xml are now committed in the common sub-system on a branch. These are a partial solution. As Martin noted above, it's better to use the container. I would love to have a special, privileged web-application to improve Tomcat: one that managed the environment properly and allowed updating other web-apps. It's possible to write this beast, but particularly beastly to do so, since the documentation for Tomcat internals (catgut?) is poor. I've tried once and been defeated, so the JSP hack is a stop-gap that can work now.

-- GuyRixon - 11 Jan 2005

Have you considered using JMX (Java Management Extensions)? It seems to be intended for monitoring and configuring running applications. This Java standard is supported by a lot of servlet containers (list), including Tomcat (Tomcat JMX interface), which means that if we made the system configurable via JMX, we wouldn't be tied just to Tomcat.

There's a tutorial on JMX on Sun's site.
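For a flavour of what JMX-based configuration involves, here is a minimal sketch using the standard-MBean convention from the JDK. The property, the class names and the object name are all hypothetical; no AstroGrid component is known to expose such a bean.

```java
import java.lang.management.ManagementFactory;
import javax.management.Attribute;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Sketch: expose one configuration property through JMX.
class JmxConfigSketch {

    // Management interface: by convention its name is the class name
    // plus "MBean", and each getter/setter pair defines one attribute.
    public interface SettingsMBean {
        String getRegistryEndpoint();
        void setRegistryEndpoint(String endpoint);
    }

    // The managed object holding the (hypothetical) property.
    public static class Settings implements SettingsMBean {
        private volatile String registryEndpoint = "http://localhost:8080/registry";

        public String getRegistryEndpoint() {
            return registryEndpoint;
        }

        public void setRegistryEndpoint(String endpoint) {
            this.registryEndpoint = endpoint;
        }
    }

    // Register the settings object with the JVM's platform MBean server.
    static ObjectName register(Settings settings, String name) throws Exception {
        ObjectName objectName = new ObjectName(name);
        ManagementFactory.getPlatformMBeanServer().registerMBean(settings, objectName);
        return objectName;
    }

    public static void main(String[] args) throws Exception {
        Settings settings = new Settings();
        ObjectName name = register(settings, "org.astrogrid.sketch:type=Settings,name=demo");
        // A management console would do the equivalent of this call remotely.
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        server.setAttribute(name, new Attribute("RegistryEndpoint", "http://example.org/registry"));
        System.out.println(settings.getRegistryEndpoint());
    }
}
```

Changes made this way take effect in the running application but are not persisted anywhere, so JMX would complement the deployment descriptor rather than replace it.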

-- NoelWinstanley - 20 Jan 2005

On trust

This from a page by Peter Torr.

Trust is not transitive. If I trust you and you trust Bob, that doesn't mean that I trust Bob.

This bears on the single-sign-on discussion and particularly on the issue of whether communities can be federated such that signing in to community A is equivalent to signing in to community B.

-- GuyRixon - 21 Dec 2004

Similarly, I've recently been wondering what you do when you change security system. For example, if you're sending a file to an ftp server, or private disk space, we need some way of looking up the appropriate login from the given Principal. This is (possibly?) alright for high security service -> low security service (e.g. astrogrid service -> ftp server), but is obviously no good the other way around.

-- MartinHill - 05 Jan 2005

Configuring web-application contexts

I'm still looking for a "cheap" way of setting properties in a web-application. My ideal is to include, with each AstroGrid sub-system, a JSP/servlet rig that can change the properties for that web-application. Ideally, I'd like this to work for any J2EE set-up and not just in Tomcat.

So far, I've established that direct write-access to JNDI is not the way, in that web-applications are not allowed by the J2EE specification to rebind their JNDI keys, and that Tomcat enforces this. Tomcat's own Admin web-application is clearly an exception, but may be some localized hack.

Hackin' on properties files from a servlet is feasible, but defeats the use of properties in web.xml and context.xml.

API calls in Tomcat to manipulate contexts work at some level (the Manager application uses them), but are presumably non-portable and very badly documented.

Therefore, I'm currently thinking of this approach:

  • Each WAR contains servlets and/or JSPs to make a new context.xml in which the "document base" directory points back to the original context created from the WAR (as suggested for PAL).
  • The process of creating the new context gives the user a form in which to set all the properties for which defaults are set in web.xml or context.xml.
  • To change a property, undeploy the context and deploy a new one with updated values.

-- GuyRixon - 20 Dec 2004

I believe that context.xml is specific just to Tomcat - so a portable tool would just have to work with web.xml alone. In web.xml it's possible to set up values for configuration keys, but only for simple types - complex types (e.g. JDBC drivers) need to be configured elsewhere -- NoelWinstanley - 20 Dec 2004

I think that perhaps the goal of trying to set these properties in JSP is misguided. As has already been discovered, it is really the role of the container to manage this sort of property setting - what might be a better aim is to try to have a Java application that manages the initial install of the component and sets initial properties (using knowledge of the relationships between properties). Once up and running, any tweaking of the properties can be done with the usual J2EE container mechanisms.

-- PaulHarrison - 20 Dec 2004

I don't think there's any contradiction in our making the components configurable in any j2ee container (by putting such properties as we can in web.xml), and also having some extra goodies (deployment scripts like AGINAB, property-editing JSPs etc) that are Tomcat 5.x only. All our software should work in any container, but if the user isn't using Tomcat 5 (which will surely be the most popular choice), he'll just have more work to do to set things up.

-- JohnTaylor - 05 Jan 2005

Topic revision: r37 - 2006-05-30 - 18:29:28 - GuyRixon
 
