Database access through web services and the Grid: recent progress

There has been considerable progress in making databases available on the Grid as web services.

  • A draft specification of the interfaces is available.
  • Prototype code to those interfaces exists.
  • There is a funded project to develop production-quality code.
  • A working group to standardize these interfaces has been proposed to GGF.

Who?

The database work is a UK initiative. It is being steered by the Database Task Force (DBTF) of the e-Science core programme.

The funded project is called OGSA-DAI, where DAI stands for Database Access and Integration. It is funded by the core programme and gets matching "funding", in the form of seconded developers, from IBM's lab at Hursley. I understand that DBTF somehow steers OGSA-DAI but the exact relationship is unclear to me.

The working group has yet to be authorized by GGF. When it comes into being, it will be an open, international group for discussing the proposed interfaces and reference implementations in preparation for their becoming a Grid standard. I understand that the WG will meet three times a year, at the GGF conferences, and essentially anybody in the Grid community can ask to table documents and comments at those meetings.

What?

The specification is available at the NeSC web-site. The available paper is dated 1st February 2002, but I suspect that an update is due soon.

The interfaces provide low-level access to individual databases via web services of a class called DatabaseService. By low-level, I mean that there is almost no abstraction of the form of the database. To make a query, the software calling the web service first has to find out the schema of the database and then must phrase the query to match that schema. Abstraction, such as translating operands from UCDs to names of actual columns, has to happen in the caller, not in the web service.
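As a rough illustration of that division of labour, here is a minimal sketch of the caller's side. It is not the OGSA-DAI interface: an in-memory SQLite database stands in for a DatabaseService, and the table, the column names, the UCD-to-column mapping and all function names are assumptions made up for the example.

```python
import sqlite3

# Hypothetical stand-in for a remote DatabaseService: an in-memory SQLite
# database with one invented table of sources.
def make_demo_database():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sources (ra_deg REAL, dec_deg REAL, vmag REAL)")
    conn.executemany(
        "INSERT INTO sources VALUES (?, ?, ?)",
        [(10.5, -5.0, 12.1), (200.0, 45.0, 8.3)],
    )
    return conn

def discover_schema(conn, table):
    # Step 1: the caller asks the database what columns the table has.
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

# Illustrative caller-side mapping from UCDs to this database's column names.
# The service knows nothing about it.
UCD_TO_COLUMN = {
    "POS_EQ_RA_MAIN": "ra_deg",
    "POS_EQ_DEC_MAIN": "dec_deg",
    "PHOT_MAG_V": "vmag",
}

def build_query(table, columns, ucds):
    # Step 2: the caller translates UCDs to real column names and phrases
    # the query to match the schema it has just discovered.
    selected = [UCD_TO_COLUMN[ucd] for ucd in ucds]
    missing = [name for name in selected if name not in columns]
    if missing:
        raise ValueError(f"columns not in schema: {missing}")
    return f"SELECT {', '.join(selected)} FROM {table}"

conn = make_demo_database()
columns = discover_schema(conn, "sources")
query = build_query("sources", columns, ["POS_EQ_RA_MAIN", "PHOT_MAG_V"])
rows = conn.execute(query).fetchall()
```

The point of the sketch is only that the schema lookup and the UCD translation both sit in the calling code; the database just runs the query it is handed.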

The basic operations are query, update, bulk-load and schema-update. There is also an interface to allow transactions. Output of data selected in queries is quite sophisticated: it uses a separate, dynamically-generated web service (a DeliveryService) which can use special transports like GridFTP and can deliver data asynchronously.
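The asynchronous delivery pattern can be sketched as below, under heavy assumptions. The class and method names are hypothetical and are not taken from the draft specification; a thread pool stands in for the remote service, and no real transport such as GridFTP is involved. The sketch shows only the shape of the interaction: a query returns a handle to a separate delivery object, and the caller collects the results from that object later.

```python
from concurrent.futures import ThreadPoolExecutor

class DeliverySketch:
    """Hypothetical stand-in for a per-query DeliveryService."""

    def __init__(self, future):
        self._future = future

    def is_ready(self):
        # True once the selected data are available for collection.
        return self._future.done()

    def fetch(self, timeout=None):
        # Block until the data arrive, or raise if the timeout expires.
        return self._future.result(timeout=timeout)

class DatabaseServiceSketch:
    """Hypothetical stand-in for a DatabaseService holding rows in memory."""

    def __init__(self, rows):
        self._rows = list(rows)
        self._pool = ThreadPoolExecutor(max_workers=1)

    def query(self, predicate):
        # The selection runs in the background; the caller gets back a
        # freshly created delivery object instead of the data themselves.
        future = self._pool.submit(
            lambda: [row for row in self._rows if predicate(row)]
        )
        return DeliverySketch(future)

    def close(self):
        self._pool.shutdown()

service = DatabaseServiceSketch([1, 2, 3, 4])
delivery = service.query(lambda row: row % 2 == 0)
selected = delivery.fetch(timeout=5)
service.close()
```

Separating the delivery object from the query call is what lets the caller carry on with other work, or hand the delivery reference to another party, while the data are being prepared.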

The fine details of the interface are still quite fluid. In many cases, DBTF is still debating the underlying semantics and trying to understand how the interfaces best map to established techniques for RDBMS.

The interfaces are designed to work with any kind of database. Notably, they are expected to work with RDBMS using SQL and with native-XML databases using XPath and XQuery. Presumably, the interfaces could be made to work with object databases.

There is already a partial implementation for an XML database (Xindice, by Apache) using XPath and possibly XQuery. This is from Rob Baxter's team at EPCC. At present, this presents only web services and doesn't do authentication with GSI or authorization. It's independent of OGSA so far, but this will change.

An implementation for relational data (for any RDBMS supporting the JDBC interfaces) is expected soon from IBM Hursley. Both these reference implementations will be developed to beta-test stage during spring and summer of 2002.

The interfaces are specified independently of OGSA at present, but many of the features cover the same ground. The reference implementations are expected to use OGSA features explicitly. The final specification will presumably be for Grid services rather than just web services.

Where next?

These are the specifications and products of which AstroGrid is supposed to be an early adopter. DBTF and OGSA-DAI will be seeking serious feedback when their reference implementations reach the beta stage.

In the meantime, to give us more time to react, I have arranged to borrow a copy of the alpha version of the XML implementation from EPCC. This is given on the understanding that there's no support yet and that all details may change. I intend to set up a demonstration of the DatabaseService that AstroGrid members can try out. Suggestions for test data are welcome, but I'd thought to load something relating to our ResourceCatalogue as a read-only database.

In the summer, AstroGrid needs to look closely at the new facilities and report back to DBTF. By extension, we seriously need to think about the costs of fitting our system to OGSA-DAI. More prototypes and demonstrations are indicated, especially of the RDBMS version.

When?

(These dates are from my rather-confused notes on meetings at NeSC and may not be accurate.)

  • Alpha "escape" of XML implementation: start of May.
  • Proper Alpha release of XML implementation: end of May.
  • Alpha release of RDBMS implementation: start of July.
  • Statement of contents of final release: middle of May.
  • Beta releases: about two months after alpha releases.
  • Possible training course at NeSC: October.

Alpha releases would go to early adopters only. Some papers concerning this work will go forward to GGF5, so comments from AstroGrid before then would be timely.

-- GuyRixon - 22 Apr 2002

Thanks for doing that, Guy, your notes are more complete than mine in most areas. My notes have the dates for the public release of the OGSA-DAI software as:

  • XML databases: 19th July
  • Relational databases: 1st Sept.
But these are, of course, just estimates based on current progress.

-- Clive Page - 22 Apr 2002.

Topic revision: r2 - 2002-04-22 - 12:33:30 - ClivePage
 