[DRAFT: 2002-03-16]

Comments from the AstroGrid consortium on
"Data Requirements for the Grid: Scoping Study Report" (Draft 1Ab: 2002-02-08)

Bob Mann (Edinburgh, rgm@roe.ac.uk)
Clive Page (Leicester, cgp@star.le.ac.uk)
Anita Richards (Jodrell Bank, amsr@jb.man.ac.uk)
Guy Rixon (Cambridge, gtr@ast.cam.ac.uk)

Summary:

This report provides a very useful summary of the exercise, performed at the request of the Database and Architecture Taskforces of the UK e-science core programme, to identify the requirements for creating, maintaining and accessing data in a Grid environment. It contains a wealth of information that will be very valuable in the development of the services required to integrate databases into the Grid. Its author, Dave Pearson, is to be commended warmly for producing such a detailed and wide-ranging report, and for seeking input to it from such a broad cross-section of the UK e-science community during the requirements analysis exercise that he conducted. The document certainly meets its stated aim of providing an introductory overview of the importance and role of data in the Grid.

The report explicitly states that it does not describe any requirements in sufficient detail to be the sole input for designing solutions, and that it is intended only as an input to a programme to refine and prioritise requirements so that the design and prototyping of software components can begin. We feel, however, that two significant changes are required before it can fulfil that role properly. Firstly, by describing "the requirements for creating, maintaining and accessing data in a Grid environment", the report implicitly conflates requirements placed on two distinct communities, namely the curators of the data to be accessed through the Grid and the developers of the Grid middleware required to perform that access. The Database and Architecture Taskforces may be able to direct the work of the latter group, to some extent, but they have very little influence over the former, probably no more than advocating protocols and standards that the data curators should adopt if they want their data to be accessible via the Grid. In the light of this, it would be useful if the report stated explicitly to which community each requirement is addressed, so that the requirements on Grid middleware developers could be clearly identified and prioritised, and the requisite labour marshalled by the Database and Architecture Taskforces. Secondly, the structure of the document could be improved significantly if the text were numbered at the subsection or paragraph level, and if the requirements identified within it were themselves numbered and made obvious by bold text or some other means. It would then be much easier to check that Appendix III really is a complete summary of the requirements; without that assurance, it is impossible to start prioritising them.

Further comments are given below: general comments first, then specific comments in the order in which they arise in the text. Their number should not be taken as detracting from the fact that we welcome this report warmly and will gladly contribute to its further development, and to its translation into a plan of design and prototyping work, insofar as that impinges on the work of AstroGrid.


General comments:

 

Specific comments: