(2) The Science Analysis Summary: defining key requirements for AstroGrid
(2.1) Summary
This report describes the derivation and formulation of the science requirements that are being used to define the scope of the Phase-B development of AstroGrid. The AstroGrid project has determined a sample set of representative science cases, the 'AstroGrid Ten', from which the necessary AstroGrid product deliverables are derived. Each science case sets challenging demands in areas such as resource discovery (from data through to published literature sources), manipulation of large, multi-location, multi-TB data sets, and application of sequences of algorithmic processing.
AstroGrid will provide tools and systems to aid the astronomical researcher as they transform experimental and theoretical model data into the information by which the physical processes under investigation can be understood.
Note: references to the RedBook are to the AstroGrid Phase-A report, that is, this document.
(2.2) Introduction
The AstroGrid 'vision' is described in Chapter 1 of the RedBook. Briefly, it is one whereby AstroGrid will provide the UK astronomy community in particular, and the global astronomy community in general, with access to powerful, sophisticated, distributed data and advanced processing capabilities. An emphasis will be on enabling more efficient science (e.g. by speeding up processes that are currently undertaken, through improved tools for accessing and manipulating multiple and distributed data sets). Key challenges here are in providing a system that can give access to large, distributed data sets. Likewise, AstroGrid will enable more effective science via its focus on providing improved workflow capabilities, e.g. in the development of its ontological processes, which aim to provide directed workflows for common sets of tasks.
This report focuses on the key science drivers that have been taken by the
AstroGrid project in order to determine which deliverables should be produced by the project by the end of its three year development cycle (end Dec 2004). The process by which the science drivers were obtained is described, noting the importance of the initial gathering of a wide set of requirements which were subsequently narrowed to give a final, well defined, set of drivers. These were chosen to be representative of science of current and near term relevance to the UK community, with a good coverage of astronomy, solar and solar-terrestrial physics cases. The '
AstroGrid Ten' science cases are described in detail, along with the major areas of functionality that are required by these cases. Thus the '
AstroGrid Ten' provide the primary science drivers for the
Phase-B development of AstroGrid.
Specific targets demanded by the science drivers, which will need to be satisfied by the final project deliverables, are noted. These are set in terms of the volumes and types of data sets that may need to be discovered and processed during a researcher's analysis of that specific science topic.
This document notes how the science drivers for the
AstroGrid project fit with the development of the
Astrophysical Virtual Observatory (
AVO) project (of which
AstroGrid is a primary member) and other virtual observatory initiatives.
In closing, the areas of astronomical science that may be focussed on in any future development of the AstroGrid or related projects are commented upon.
(2.3) Capturing the Science Requirements
The
AstroGrid project initially determined to seek out a wide and demanding set of science drivers which might shape a generic virtual
observatory. These VO science cases have been written up, to varying degrees of completeness, on the
AstroGrid Wiki VO science requirements area.
The following sections outline the primary mechanisms by which the project has attempted to gather its science requirements, both from activities internal to the project and from external engagement with the community. Additional details of this process, including full details of seminars given, engagement with the community etc., are to be found on relevant areas of the AstroGrid Wiki.
(2.3.1) Gathering Virtual Observatory Science Drivers
The
AstroGrid consortium members generated science cases reflecting the scientific interests of their research groups. Because the consortium contains representatives involved in a wide range of UK astronomy, solar and STP research activity, a broad range of science cases, stressing areas such as radio astronomy, solar physics and solar/terrestrial physics, resulted. This was a major and early task of the Project.
In the first instance a number of specific use cases were formulated. These were often concerned with how part of a science problem might be approached, for instance, running a query on a database to locate and show the positions of known QSOs. Other cases are aimed at easing the process of acquiring sufficient data to address a particular problem, e.g. returning the colours of galaxies and their bulges as a function of redshift to study alternative theories of galaxy evolution. The distinction between 'science cases' and more generic 'use cases' rapidly became apparent. Emphasis was placed on capturing and formalising the science cases as listed at
http://wiki.astrogrid.org/bin/view/VO/ScienceProblemList.
Further input for the science cases was sought from a number of areas. The project scientist was responsible for assessing current major
scientific research strands with a view to identifying those areas likely to benefit from the promise of access to the distributed data and
processing capabilities to be opened up by a virtual observatory. Areas here included those science topics serviced by large
scale multi-wavelength data surveys or those requiring access to large scale computational facilities. Note was made of a science requirements survey undertaken by the
Science Definition Team of the
NVO.
The project scientist and other team members have been involved in discussions with their research colleagues in a variety of
situations. A series of presentations have been made at University research departments and at national meetings such as the
RAS's
National Astronomy Meeting and
RAS AGM. Importantly the project has engaged with younger researchers in astronomy, with ad-hoc meetings arranged with PhD students and new Post Docs at the
IoA, MSSL and other institutes. The focus of these discussions was on which elements of AstroGrid would most likely support young researchers' work. Their key concerns were in having access to capabilities which would speed up mundane data processing and manipulation tasks; thus the concept of assisted workflows appealed.
Engagement with a number of large scale projects has also led to significant scientific input, e.g. with the
UKIDSS
consortium, the
ING Wide Field Survey, the
XMM-Survey Science Centre and its role in the
XMM Serendipitous Sky Survey.
(2.3.2) The Pilot Programme: Feeding Back Science Requirements
The
AstroGrid Phase-A Pilot Programme, as described in the Phase A Report
section
Pilot programme report, has also
provided a number of inputs into the Science Requirements of the
project. In a similar vein, five science topics were fed back into the pilot programme as the basis for the pilots as described in
Section 7.2 of
Pilot programme report.
In a joint
AstroGrid/
SpaceGRID initiative, the
space science research (SSR) community was invited to comment on the
requirements of a possible virtual observatory system providing
distributed access to data and processing assets. The
questionnaire issued is located in full at
http://www.spacegrid.rl.ac.uk/spacegrid/SSR_online_q.htm. Summary results of
this exercise are reported in the
Pilot programme report and the
WPA5.5 report.
This report did not outline specific scientific drivers. However, it did produce an indication of
the priority areas in generic capabilities of interest to the space
science community, and these have been input into derivation of the
AstroGrid science drivers.
(2.3.3) Presenting the AstroGrid Project: Receiving Input
The project scientist and other
AstroGrid staff have been active in
giving presentations and seminars about the
AstroGrid project
throughout Phase A of the project. The
list of talks and seminars is
given in the
Progress against Goals section of this report (located on the wiki site).
This has widened the UK community's appreciation of the possibilities of the project and led
to significant input into the generation of new science cases and
amendments to pre-existing ones.
A number of general articles have been presented in journals such as
PPARC's
Frontiers magazine, the RAS's
Astronomy and Geophysics magazine etc. These have
invited scientific feedback, and have led to a number of interesting comments.
(2.4) Shaping the AstroGrid Drivers from the Virtual Observatory Science Cases
The previous section (
2.3) has described the process by which the
AstroGrid project captured its science drivers. The complete list is held at
http://wiki.astrogrid.org/bin/view/VO/ScienceProblemList. It is anticipated that this selection of science drivers will be continually expanded upon throughout the project lifetime. These drivers will form an important resource for input into partner VO initiatives, especially the
AVO. Partner projects such as
EGSO and
SpaceGRID are also likely to draw from them. These 'VO' science cases, together with those being highlighted by other virtual observatory projects, will be used in shaping the evolution of longer term VO initiatives.
With the collation of the VO science drivers, the project recognised that
AstroGrid would not be able to produce a virtual observatory capable of meeting the demands of all of these cases. Thus a formal and rigorous process was undertaken in order to select a well defined set of science drivers, the
AstroGrid Ten, which would be used to shape the
AstroGrid deliverables.
The project scientist and scientific members of the team analysed the science drivers over a number of review meetings. Based on selection criteria the science problems were distilled to produce the key set of ten drivers.
These drivers were chosen to:
- Represent a cross section of currently topical Astronomy, Solar and Space Science research areas
- Require functionality covering a wide spread of technical areas
- Be achievable within the AstroGrid Phase-B project span, both in terms of the technical complexity of the solution and in terms of the availability of input science data sets
- Have a well defined user base who would benefit from capabilities generated by the project
- Yield tools that, while satisfying the science driver, would also be of use across a wide range of problem areas
Unlike the wider list of VO science drivers, the AstroGrid Ten drivers will be under more formal version control from the beginning of AstroGrid's Phase B.
(2.5) 'The AstroGrid Ten' Science Drivers
For each science driver a typical flow of events has been constructed which decomposes the tasks required to complete that process.
Sequence Diagrams have then been generated for each of the science cases, and for the generic technical use cases (these covering activities such as negotiating access to jobs, logon to the system etc.). The sum of these tasks represents the components of the system that are required to form the
AstroGrid Phase-B product, to be developed within the framework laid down by the
project architecture.
It is worth noting that
AstroGrid aims to provide tools and capabilities to help the researcher in producing solutions to these science topics. However, AstroGrid will not in itself provide the answers; rather, the researcher will be presented with new capabilities to make them more efficient and effective. This will be especially so in the areas of data discovery, transformation of data into information via access to processing facilities, and management of the processing flow of events.
AstroGrid will mean that the researcher will be able to devote more time to the understanding of the astrophysics revealed by the results, in other words, more time can be given to the important step of transforming information into knowledge.
The capabilities derived for the
AstroGrid Ten will have a usefulness to a wider scientific audience.
Any science problem with a similar workflow to one of the
AstroGrid Ten will be supported. For instance, searching for
AGB candidates would benefit from the system developed to support searches of
high redshift Quasars - the difference being one of types of input catalogues, and constraints on the discovery space.
From Science Driver to AstroGrid Product
The
AstroGrid Ten science drivers are used to define the scope of the
AstroGrid deliverables. Each use case was analysed and decomposed into a work flow, with the tasks required by the science cases being identified.
- Generic Use Cases
A number of use cases were captured which have generic utility. These cases are those that would be needed in any baseline virtual observatory system, and are largely infrastructural in nature. Use cases in this category include those dealing with security (e.g. Determine Identity), monitoring, job control etc. Generic use cases were captured in a wide sense as generic 'Virtual Observatory' use cases in the VO area of the Wiki - see VO.UseCaseList.
- Specific Use Cases
Analysis of each science use case revealed that, in addition to the generic use cases, each would require more specialised use cases. For instance, the HiZQuasars case requires use cases such as those allowing the determination of the redshifts of objects in the field.
- The AstroGrid Use Case Set
The analysis of the Ten science cases revealed the minimum set of use cases that would be required to enable the construction of the capabilities required to meet the demands of these science drivers.
In parallel to the scientific requirements process the
architectural shape of
AstroGrid was being formulated. The use cases demanded by the science drivers, together with any of those indicated by the outline
system architecture thus represent the reduced set of use cases that will need to be developed by the
AstroGrid project. These are listed
in the
AstroGrid Use Case wiki area at
http://wiki.astrogrid.org/bin/view/Astrogrid/UseCases. For a full description of the further process by which the
AstroGrid
project will construct its products, refer to the
RedBook sections,
Architecture Overview and
Phase B Plan.
This problem involves aiding the discovery of Brown Dwarfs from large scale survey data sets. Brown dwarfs are intrinsically faint and rare objects, so their detection is not straightforward. It can be done, however, through a combination of selection criteria using colour and proper motion information. Colour selection is the more important, because brown dwarfs populate a well-defined photospheric temperature range (although the coolest brown dwarfs have unusual spectral energy distributions, peaking around 1 micron, due to the absorption of near-infrared continuum flux by water and methane), but proper motion selection can help, too, since any detectable brown dwarfs must be nearby and, so, on average, they will have relatively high proper motions.
The use of wide field optical/near-IR surveys to localise Brown Dwarfs is discussed by Basri, 2000; his Figure 7 shows the colour magnitude diagram for low mass Pleiades members.
The key areas of
AstroGrid functionality required are:
- access to large area optical and near-IR data sets
- the ability to search for objects in colour-colour space, with objects referenced against model predictions for that colour-colour space
- the ability to cross match samples selected in the colour-colour search with possible multi-epoch data to determine the objects' proper motions.
The resulting brown dwarf sample data sets can then be used as input into spectroscopic confirmation programmes, confirming the nature of the objects by means of tests such as the 'Lithium Test' (see
Martin et al, 2000).
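The colour and proper motion selection described above can be sketched as follows. This is an illustrative sketch only, not AstroGrid code: the field names, the R-I and I-K colour cuts, and the proper motion threshold are assumptions made for the example, and the proper motion is computed with a small-angle approximation.

```python
import math

def proper_motion_mas_per_yr(ra1, dec1, ra2, dec2, dt_years):
    """Proper motion between two epochs (small-angle approximation).

    Positions are in degrees; the result is in milliarcseconds per year.
    """
    mean_dec = math.radians(0.5 * (dec1 + dec2))
    d_ra = (ra2 - ra1) * math.cos(mean_dec)  # RA offset projected on the sky
    d_dec = dec2 - dec1
    sep_mas = math.hypot(d_ra, d_dec) * 3600.0 * 1000.0
    return sep_mas / dt_years

def select_candidates(objects, min_r_i, min_i_k, min_pm, dt_years):
    """Return names of objects that are red in both colours and fast moving.

    Each object is a dict with hypothetical keys: name, r, i, k (magnitudes)
    and ra1, dec1, ra2, dec2 (positions at two epochs, in degrees).
    """
    selected = []
    for obj in objects:
        red_enough = (obj["r"] - obj["i"] >= min_r_i
                      and obj["i"] - obj["k"] >= min_i_k)
        pm = proper_motion_mas_per_yr(obj["ra1"], obj["dec1"],
                                      obj["ra2"], obj["dec2"], dt_years)
        if red_enough and pm >= min_pm:
            selected.append(obj["name"])
    return selected
```

In practice the colour cuts would be set by reference to model predictions for the colour-colour space, as noted in the list above, rather than fixed by hand.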
A comprehensive flow of events is contained in the
Brown Dwarf Sequence Diagramme wiki area. For the specific example where Brown Dwarfs are discovered in Galactic Clusters from multi-colour survey data the following flow of events occurs:
- The astronomer searches the resource catalogue for catalogues containing Galactic clusters via PerformRegistrySearch.
- A list of cluster catalogues is returned via MySpaceStoreResults, and the astronomer selects one or more cluster catalogues via SelectCatalogue.
- Next, the astronomer searches the selected catalogues for cluster locations via PerformCatalogueSearch.
- A list of locations, defined by right ascension, declination, radius, and distance, is returned via MySpaceStoreResults.
- The astronomer then returns to the resource catalogue and executes a complex query for catalogues with coverage of I, K, or R wavelengths over each cluster location via ComplexQuery.
- A list of catalogues with I, K, or R coverage of the cluster location is returned via MySpaceStoreResults.
- The astronomer selects one or more catalogues via SelectCatalogue and searches them for two or more datasets covering the cluster location in the same wavelength (either all datasets with I coverage or all with K coverage) via PerformCatalogueSearch.
- The datasets are stored to MySpace via MySpaceStoreResults.
- Now the astronomer can prepare the data for the proper motion survey. The datasets are astrometrically aligned using a library function, a web service, or user code via DetermineProgram.
- Next, the proper motion can be applied to the datasets by user code, a web service, or a library function via DetermineProgram.
- The program calculates a proper motion vector-point diagram of objects in the dataset. The diagram is stored on MySpace via MySpacePublishDerivedData and returned to the astronomer.
and thus the
UseCases required to deliver this science case are:
- PerformRegistrySearch
- PerformCatalogueSearch
- MySpaceStoreResults
- ComplexQuery
- DetermineProgram
- MySpacePublishDerivedData
- SelectCatalogue
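The derivation of the use case list from the flow of events can be illustrated with a short sketch. The use case names are those given above; the step descriptions are abbreviated, and the data structure itself is only an assumption made for illustration, not part of the AstroGrid design.

```python
# Each step of the Brown Dwarf flow of events, paired with the use cases it invokes.
FLOW = [
    ("search resource catalogue for cluster catalogues", ["PerformRegistrySearch"]),
    ("return cluster catalogues and select some", ["MySpaceStoreResults", "SelectCatalogue"]),
    ("search selected catalogues for cluster locations", ["PerformCatalogueSearch"]),
    ("return list of locations", ["MySpaceStoreResults"]),
    ("query for catalogues with I, K or R coverage", ["ComplexQuery"]),
    ("return catalogues with coverage", ["MySpaceStoreResults"]),
    ("select catalogues and search for datasets", ["SelectCatalogue", "PerformCatalogueSearch"]),
    ("store datasets", ["MySpaceStoreResults"]),
    ("astrometrically align datasets", ["DetermineProgram"]),
    ("apply proper motion analysis", ["DetermineProgram"]),
    ("store vector-point diagram", ["MySpacePublishDerivedData"]),
]

def required_use_cases(flow):
    """Return the distinct use cases in a flow, in order of first appearance."""
    seen = []
    for _description, use_cases in flow:
        for use_case in use_cases:
            if use_case not in seen:
                seen.append(use_case)
    return seen

if __name__ == "__main__":
    for name in required_use_cases(FLOW):
        print(name)
```

Applying the same reduction to each of the science cases, and taking the union of the results, gives the kind of minimum use case set discussed earlier in this section.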
This sequence diagram represents a possible flow of events for this problem:
Detailed breakdowns such as these have been performed for each science case in turn, with full details accessible via the links in this document (or via the wiki pages). Further analysis of each sub case is performed to reveal the complete case set. At this stage class diagrammes are drawn up, and eventually software construction is undertaken.
Low surface brightness systems are often missed from wide field survey catalogues due to selection effects acting against their discovery. However, it is important to locate and understand the properties of this population as it could contain a significant mass (e.g.
Impey & Bothun, 1997). A knowledge of the number and mass distribution of low surface brightness galaxies is also vital when comparing theories of galaxy formation and evolution.
The key areas of
AstroGrid functionality required are:
- access to image surveys with relevant magnitude and depth
- dual pass algorithms to remove initially bright structures then localise extended low surface brightness features.
- comparison of structures across multiwavelength data sets, e.g. optical from the WFS, infrared from UKIDSS
A comprehensive flow of events is contained in the
Low Surface Brightness galaxy sequence diagramme wiki area.
Supernovae searches (e.g.
Perlmutter et al, 1999) typically programme observations of a set area of sky (the area imaged being dependent on the size of the SN sample desired, for SN samples at lower redshift larger areas of sky are required due to volume effects). The selection of the correct sample of Type Ia's at the imaging search stage is important because confirmation of the SN comes from spectroscopy often obtained on the largest ground based telescopes, such as the VLT, for the higher (z>0.7) redshift SN. Therefore it is important to minimise 'wasted' spectroscopic and followup time on Type II SN.
A problem with current techniques is that for any candidate discovered there is an uncertainty as to whether or not the candidate is in fact the desired Type Ia SN. Whilst Type Ia SN are typically brighter than Type II core collapse SN, some (~10%) Type II's can contaminate the sample.
A rapid knowledge of the environment in which any SN is discovered can improve the situation. Pre-determination of the galaxy redshifts utilizing photometric methods enables an estimate to be made of the candidate's distance upon discovery, and thus a better estimation of which type of SN it is. Further, information on the galaxy, for instance its morphological type, may also aid in rapid classification of the SN. Type Ia's, being formed by an explosion resulting from the accretion of matter onto a degenerate star, are found in all classes of galaxy. However, Type II's, which result from the catastrophic explosion of a massive star, have not been found in early elliptical galaxies.
The key areas of
AstroGrid functionality required are:
- search literature and published sources for possible spectroscopic redshifts of galaxies in SN survey fields.
- search archives for spectroscopic data of objects in the field, then determine redshifts of galaxies in the fields, perhaps using automated techniques such as those developed for the 2dFGRS (see Colless et al, 2001)
- Locate multicolour broadband optical data for the search fields
- Determine photometric redshifts to galaxies in the fields using a variety of techniques (e.g. hyperz, more recently (2002) Z-Peg).
- identify possible galaxy clusters
- cross reference position of newly discovered SN from search. If located in a galaxy for which the redshift is known from one of the above techniques, return an assigned redshift for that SN.
- return morphological information of the galaxy in which the SN candidate is located (if applicable)
A comprehensive flow of events is contained in the
Supernova Galaxy Environment sequence diagramme wiki area.
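The cross-referencing step in the list above, in which a newly discovered SN is matched to a galaxy of known redshift, might look something like the following sketch. The catalogue fields (ra, dec, z, morph) and the matching radius are hypothetical; a production system would need a more careful association criterion than simple nearest-neighbour matching.

```python
import math

def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(math.sqrt(h))) * 3600.0

def assign_host(sn_ra, sn_dec, galaxies, max_sep_arcsec):
    """Return the nearest galaxy within max_sep_arcsec of the SN, or None.

    The returned dict carries the assigned redshift ('z') and, where
    available, the morphological type ('morph') of the host.
    """
    best = None
    best_sep = max_sep_arcsec
    for galaxy in galaxies:
        sep = angular_sep_arcsec(sn_ra, sn_dec, galaxy["ra"], galaxy["dec"])
        if sep <= best_sep:
            best, best_sep = galaxy, sep
    return best
```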
The Hubble Deep Field (HDF) is a 'blank' area of sky observed with unprecedented resolution and sensitivity by the HST, revealing about 3000 faint galaxies within a 3 arcmin-square region (also including flanking fields). Fields of up to 40' centred on the HDF have since been imaged at wavelengths from radio to X-ray. In order to better understand the nature of the objects in the HDF, it is vital to be able to correctly cross identify sources seen in various wavelength regimes. This involves effort in aligning the data sets, and searching for significant correlations between sub-sets of properties. For example, it turns out that there is an excess of radio sources (including those too faint to be catalogued) within the error boxes of selected optical sources in the HDF.
Only recently has the nature of the brightest sub-mm source (HDF850.1) in the HDF-N been unravelled, as described by
Dunlop et al, 2002. The key to this discovery was the combination of new deep infrared imagery with careful astrometric alignment and association techniques to relate the various data sets. Techniques developed by AstroGrid to support further work in this area will be applicable to source identifications in the substantial number of fields for which deep multiwavelength data sets are becoming available (e.g. HDF, CDF, Subaru/XMM-Newton Deep Survey fields etc.).
The key areas of
AstroGrid functionality required are:
- Automatic registration and calibration
- Search of all available published data
- Tests for correlations (based on user-supplied criteria) across many catalogues
- Searches of image (or other) data for uncatalogued sources which become significant if found to coincide with detections at other wavelengths
- Search for sources not detected in optical - thus identify objects such as dust-enshrouded starbursts
A comprehensive flow of events is contained in the
Deep Field Surveys sequence diagramme wiki area.
Clusters of galaxies can be used to trace the distribution of matter in the universe over large scales. Clusters are typically X-ray or optically selected. Many optically selected cluster samples have suffered from various selection effects, such as the use of data in only one colour (e.g.
Dalton et al, 1992).
New techniques (e.g.
Gal et al, 2000) select clusters using multicolour data to localise clusters which are predicted to contain an overabundance of red, early type galaxies. Cluster identification using Optical and Near-IR data uses positional information to select clusters (e.g.
Gladders & Yee, 2000).
Cluster distributions can be compared to matter distributions generated by e.g. Lambda CDM models (e.g.
Nagamine et al, 2001) or Warm Dark Matter models (e.g.
Bode et al, 2001). These models now have sufficient resolution to show dwarf galaxies. Morphologies of the cluster galaxies will be directly compared with predictions from models of galaxy formation (e.g. Eke et al, 2000).
The key areas of
AstroGrid functionality required are:
- select sources marked as galaxies, select only those in a particular locus of the (g-r) vs (i-r) colour space, and then create density maps
- determination of photometric and/or spectroscopic redshifts of the sample cluster galaxies
- comparison with n-body code model outputs: issues include interfacing to large model data sets, visualisation of model vs real data (e.g. matter vs clusters at ranges of redshift), statistical correlations etc.
A comprehensive flow of events is contained in the
Galaxy Clustering sequence diagramme wiki area.
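The first item in the list above (select galaxies, restrict to a locus in (g-r) vs (i-r) colour space, then build a density map) can be sketched as below. The field names, colour ranges and grid size are assumptions made for the example; the map is a simple count of selected galaxies per sky cell.

```python
import math

def density_map(sources, gr_range, ir_range, ra_range, dec_range, n_bins):
    """Count colour-selected galaxies on an n_bins x n_bins grid of sky cells.

    Each source is a dict with hypothetical keys: type, ra, dec (degrees)
    and g, r, i (magnitudes). Rows of the grid run along declination.
    """
    grid = [[0] * n_bins for _ in range(n_bins)]
    for s in sources:
        if s["type"] != "galaxy":
            continue
        g_r = s["g"] - s["r"]
        i_r = s["i"] - s["r"]
        in_locus = (gr_range[0] <= g_r <= gr_range[1]
                    and ir_range[0] <= i_r <= ir_range[1])
        if not in_locus:
            continue
        # floor (not int) so that sources just outside the field are rejected
        ix = math.floor((s["ra"] - ra_range[0]) / (ra_range[1] - ra_range[0]) * n_bins)
        iy = math.floor((s["dec"] - dec_range[0]) / (dec_range[1] - dec_range[0]) * n_bins)
        if 0 <= ix < n_bins and 0 <= iy < n_bins:
            grid[iy][ix] += 1
    return grid
```

Peaks in such a map would be candidate clusters, to be followed up with the redshift determination described in the second item.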
Quasars at high redshifts will provide vital clues to the processes involved in the formation of the
first bound objects. Near-IR survey data from UKIRT's WFCAM (via the
UKIDSS survey) and later VISTA survey programmes will enable many quasars in the redshift range 5.5 < z < 7 to be discovered. This will enable a number of principal scientific goals to be met.
A key rationale is that quasars at the highest redshifts may enable the investigation of the epoch of reionisation of the Universe. Such an effect is already being reported for the reionisation of He II via studies of quasars between 3 < z < 4 (see Theuns et al, 2002). Higher redshift HiZQuasars would probe the neutral Inter Galactic Medium at this epoch.
The key areas of
AstroGrid functionality required are:
- Access to large scale optical and near-IR surveys, especially those in the IR to be provided by UKIDSS
- Selection of candidate samples in colour-colour space via comparison with model predictions (c.f. optical techniques as described for the SDSS survey by Richards et al, 2002).
A comprehensive flow of events is contained in the
High Z Quasars sequence diagramme wiki area.
Flare stars are generally low temperature red, M-class, dwarf stars. Our Sun also experiences flares, and these are related in some poorly understood fashion to
Coronal Mass Ejections.
Schaefer et al, 2000 have noted that a number of nearby solar type (F-G) stars have undergone superflare events, with the energy in the flares >100 times the most energetic measured from our Sun. The census of stars with 'superflares' is incomplete due to the difficulty in collating the various data sources for nearby flaring stars. This case will aid in providing a full sample of superflare stars. The linkage of CMEs to flares could be studied by searching for evidence of CMEs in the sample of superflare stars. One technique is to look for evidence of absorption in, for instance, Si UV lines during a CME event for those flare stars in binary systems with a hot white dwarf (see e.g.
Bond et al, 2001).
The key areas of
AstroGrid functionality required are:
- Localisation of flare stars from the literature
- Lightcurve generation for flare stars from published photometry - estimation of energy in the flare
- Determine availability of high res UV spectra for stars identified as having super flares.
A comprehensive flow of events is contained in the
Solar Stellar Flare Comparison sequence diagramme wiki area.
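As an illustration of the flare energy estimation listed above, the sketch below integrates the luminosity in excess of the quiescent level over the duration of the flare using the trapezoidal rule. It assumes the published photometry has already been converted to luminosities (in erg/s) at known times (in seconds); that conversion, and the choice of quiescent level, are outside the scope of the sketch.

```python
def flare_energy(times_s, lum_erg_s, quiescent_erg_s):
    """Estimate flare energy in erg by trapezoidal integration.

    times_s: observation times in seconds, in increasing order.
    lum_erg_s: luminosity at each time, in erg/s.
    quiescent_erg_s: assumed non-flaring luminosity, in erg/s.
    Only the excess above the quiescent level contributes.
    """
    energy = 0.0
    for k in range(1, len(times_s)):
        excess_prev = max(lum_erg_s[k - 1] - quiescent_erg_s, 0.0)
        excess_next = max(lum_erg_s[k] - quiescent_erg_s, 0.0)
        energy += 0.5 * (excess_prev + excess_next) * (times_s[k] - times_s[k - 1])
    return energy
```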
There is a current debate as to whether large scale coronal waves and chromospheric waves (Moreton, 1961) are related. Moreton waves were found to propagate at large distances from a solar flare site with velocities ranging from a few hundred to several thousand km/s. Due to the high speeds observed, it was assumed that the origin of the Moreton wave was in the corona and not in the chromosphere. Coronal waves were first observed by the EUV Imaging Telescope (EIT) onboard the Solar and Heliospheric Observatory (SOHO) spacecraft (
Thompson et al, 1999). They appeared in difference images as a bright front with a following dimmed or depleted region of the corona with propagation speeds of a few hundred km/s.
A key goal is to determine whether coronal waves are MHD fast mode waves originating from a solar flare site, or if they are a global coronal mass ejection lifting off the surface of the disk. This will be achieved by searching for flares, preferably occurring at disk centre, isolating the times, and then finding the necessary datasets (EUV and Halpha spectra, EUV/SXR imaging).
The key areas of
AstroGrid functionality required are:
- Localisation of coronal wave via image subtraction techniques
- Discovery of supporting multi-wavelength observational data sets covering correct spatial, temporal space (e.g. Zhang et al, 2001).
- Visualisation of flare and wave datasets
A comprehensive flow of events is contained in the
Solar Coronal Waves sequence diagramme wiki area.
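The image subtraction technique referred to above can be illustrated with a minimal sketch. Images are represented here as plain lists of rows of pixel values, and the threshold is an arbitrary assumption; real EIT data would need co-alignment and normalisation before differencing.

```python
def difference_image(later, earlier):
    """Pixel-by-pixel difference (later minus earlier) of two equal-sized images."""
    return [[b - a for a, b in zip(row_earlier, row_later)]
            for row_earlier, row_later in zip(earlier, later)]

def mark_front_and_dimming(diff, threshold):
    """Render a difference image as text rows.

    '+' marks brightened pixels (a candidate wave front), '-' marks
    dimmed pixels (a candidate depleted region), '.' marks the rest.
    """
    rows = []
    for row in diff:
        symbols = []
        for value in row:
            if value >= threshold:
                symbols.append("+")
            elif value <= -threshold:
                symbols.append("-")
            else:
                symbols.append(".")
        rows.append("".join(symbols))
    return rows
```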
Solar models are currently used to predict STP events impacting on the local solar-earth space environment. This information can be used to advise the telecommunications and power industries of geomagnetic disturbances (e.g. via the US's Space Environment Center's Space Weather page at http://www.sel.noaa.gov/today.html). Solar events such as flares, CMEs, and the progression of the solar cycle can cause electromagnetic disturbances in the Earth's magnetosphere. Satellites, radio and television broadcasts, and mobile telephones all experience service interruptions during periods of high solar activity.
Several existing models take solar activity parameters (i.e., time, duration, location, and intensity of events) as input and predict the resulting STP events that will occur in the Earth's magnetosphere, e.g.
Geomagnetic Storms. The solar models, solar datasets used as input, and STP datasets used to verify output predictions, are not, however, accessible from a single interface. Models include the
Relativistic Electron Forecast and
Wang Sheeley models from NOAA.
A key goal is to provide a selection of these models as AstroGrid web services. An individual model can be tested with several solar datasets to compare modelled predictions with actual STP datasets during different stages of solar activity. Also, one solar dataset may be chosen as an input to several models in order to ascertain which model most closely predicts STP events during a given time period.
The key areas of
AstroGrid functionality required are:
- Provide unified point of web service access to distributed models
- Capture relevant STP data to compare with predictions from models
A comprehensive flow of events is contained in the
Solar STP Event Coincidence sequence diagramme wiki area.
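The comparison of several models against one observed STP dataset could be sketched as follows. The choice of root-mean-square error as the measure of closeness is an assumption made for the example, as are the model names; the text above does not prescribe a particular metric.

```python
import math

def rms_error(predicted, observed):
    """Root-mean-square difference between predicted and observed values."""
    total = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(total / len(observed))

def best_model(predictions_by_model, observed):
    """Return the name of the model whose predictions are closest to the data.

    predictions_by_model maps a (hypothetical) model name to its list of
    predicted values, sampled at the same times as the observed values.
    """
    return min(predictions_by_model,
               key=lambda name: rms_error(predictions_by_model[name], observed))
```

The same two functions cover the converse test, where a single model is run on several solar datasets: the RMS error is then compared across datasets rather than across models.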
Study of the morphology of the tail of the Earth's magnetosphere during the onset of geomagnetic storms is important
in understanding the processes involved, and the impact of the storm on the magnetosphere. Geomagnetic storms can influence the performance of satellite systems such as the GPS (e.g.
Skone & de Jong, 2000) and also, in severe cases, impact power transmission (see e.g. http://www.mpelectric.com/storms/). The observational data can be compared against models of the magnetosphere (e.g.
Raeder et al, 2001).
The key areas of
AstroGrid functionality required are:
- Determine temporal location of storm
- Retrieve list of in-situ satellites with suitable instrumentation located in the magnetosphere during the relevant time periods.
- Conversion of the position data to a defined coordinate system and the magnetic field data to specific units. The appropriate coordinate system will depend on the application.
A comprehensive flow of events is contained in the
Magnetic Storm Onset sequence diagramme wiki area.
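The first and last of the steps listed above can be sketched in a few lines. This is an illustration under stated assumptions: the storm is located from a Dst index series using a threshold of -50 nT (an assumed value; the appropriate threshold depends on the application), the hourly samples are invented, and positions are rescaled with a mean Earth radius of 6371 km. Rotation between named coordinate systems is not shown, since it depends on the chosen system.

```python
from datetime import datetime, timedelta

STORM_THRESHOLD_NT = -50.0   # assumed threshold, not an AstroGrid value
EARTH_RADIUS_KM = 6371.0     # mean Earth radius, used as the length unit
NT_PER_GAUSS = 1.0e5         # 1 gauss = 1e-4 tesla = 1e5 nanotesla

def storm_intervals(times, dst_nt, threshold=STORM_THRESHOLD_NT):
    """Return (start, end) times of runs of samples with Dst below threshold."""
    intervals = []
    start = last = None
    for t, value in zip(times, dst_nt):
        if value < threshold:
            if start is None:
                start = t          # first sample of a new run
            last = t               # latest sample still inside the run
        elif start is not None:
            intervals.append((start, last))
            start = None
    if start is not None:          # series ended while still in a storm
        intervals.append((start, last))
    return intervals

def km_to_earth_radii(position_km):
    """Rescale a position vector from kilometres to Earth radii."""
    return tuple(c / EARTH_RADIUS_KM for c in position_km)

def gauss_to_nanotesla(field_gauss):
    """Rescale a magnetic field vector from gauss to nanotesla."""
    return tuple(c * NT_PER_GAUSS for c in field_gauss)

# Hypothetical hourly Dst samples.
t0 = datetime(2002, 1, 1, 12)
times = [t0 + timedelta(hours=h) for h in range(8)]
dst = [-20.0, -45.0, -120.0, -300.0, -180.0, -70.0, -40.0, -30.0]
intervals = storm_intervals(times, dst)
```

The returned intervals are the time windows against which the list of in-situ satellites would then be queried.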
(2.6) Providing Functionality as Indicated by the AstroGrid Ten.
The ten science drivers itemised in section 2.5 above cover a number of representative science topics. They demand the development of a range of functionalities by
AstroGrid.
(2.6.1) The Required Capabilities of AstroGrid
As described above, the decomposition of each of the science use cases, together with a consideration of the minimum generic infrastructure required by the project architecture, leads to the required set of system use cases. Analysis of these leads to the derivation of the requirements for the software development (as discussed in the
Phase-B Plan). The main functionality areas that will be provided by the
AstroGrid project are described in the
ArchitectureOverview section of the
RedBook.
Science Cat = Science category: A-S (Astronomy: Stellar), A-G
(Astronomy: Galaxy), A-C (Astronomy: Cosmology), S (Solar), STP
(Solar-Terrestrial Physics). A matrix element marked 'Y' indicates that this area will be important, whilst 'y' indicates a lesser degree of importance.
The analysis of the processes involved in meeting the requirements set by the science drivers shows that a number of areas are highlighted, with the relative importance of these areas varying from case to case.
For instance, the
MagneticStormOnset case does not deal with large data volumes. Rather, the problem is one of discovering data from a number of dispersed and heterogeneous data sets, where data streams need to be isolated according to the spatial and temporal position of the in-situ detectors. Non-standard data sets need to be addressed, as the STP data is often of the 'pen-plotter' variety, with many differing variables being monitored by on-flight detectors measuring in-situ flows (e.g. the magnetic field at a point in space).
The
Galaxy Clusters use case involves the manipulation of large multi-TB scale multi-wavelength data sets, and the ability to run sophisticated algorithms on the pixel and catalogue data. The extension to this case additionally will require tools to directly compare the observational cluster data with outputs from theoretical models.
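One of the catalogue algorithms implied by this case is the generation of cluster maps by smoothing the galaxy distribution. A minimal sketch follows, assuming a Gaussian smoothing kernel and flat (x, y) coordinates; the function names, the kernel choice, and the toy catalogue are illustrative assumptions, not part of the AstroGrid design.

```python
import math

def smoothed_density_map(galaxies, grid_x, grid_y, sigma):
    """Sum a Gaussian kernel of width sigma, centred on each galaxy, at every grid point.

    galaxies: iterable of (x, y) positions in any consistent units
    Returns a list of rows, one row per value in grid_y.
    """
    two_sigma_sq = 2.0 * sigma ** 2
    return [
        [
            sum(math.exp(-((gx - x) ** 2 + (gy - y) ** 2) / two_sigma_sq)
                for x, y in galaxies)
            for gx in grid_x
        ]
        for gy in grid_y
    ]

def peak_position(density, grid_x, grid_y):
    """Return the (x, y) grid point where the smoothed density is largest."""
    best = max(
        ((value, gx, gy)
         for row, gy in zip(density, grid_y)
         for value, gx in zip(row, grid_x)),
        key=lambda item: item[0],
    )
    return best[1], best[2]

# Hypothetical catalogue: a tight group near (2, 3) plus two isolated galaxies.
galaxies = [(2.0, 3.0), (2.1, 3.1), (1.9, 2.9), (2.0, 3.2), (8.0, 1.0), (5.0, 7.0)]
grid = [0.2 * i for i in range(51)]          # 0.0 to 10.0 in steps of 0.2
density = smoothed_density_map(galaxies, grid, grid, sigma=0.5)
peak = peak_position(density, grid, grid)
```

On a real multi-TB catalogue this brute-force sum would be replaced by a binned or tree-based method; the sketch only shows what the smoothing step computes.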
(2.6.2) Specific Targets for Data Manipulation Tasks
The
AstroGrid Ten set demands upon the capabilities that must be provided in order to service these science cases. The following table lists example data manipulation problems which researchers would wish to complete in a certain time by using Phase-B
AstroGrid deliverables.
| Science Case | Science Cat | Data Manipulation Problem | Completion Time |
|---|---|---|---|
| Brown Dwarfs | A-S | Perform complex query on 1000 deg^2 of multi-colour (e.g. 6 band, Sloan & UKIDSS) to return 1000+ Brown Dwarf candidate sample, plus postage cutouts of pixel data (~20Mb pixel data per cutout). For each candidate return proper motion estimate via astrometry of multi-temporal data sets. | One hour |
| LSB Galaxies | A-G | Select candidates from IR data sets (e.g. 1000 sq deg of UKIDSS Large Area Survey data), cross match with Sloan optical, perform background analysis, calculate luminosities | One hour |
| SN Environment | A-G | For search fields, classify galaxies, determine redshifts (literature search, spectral or photometric methods) | 30 mins |
| Deep Fields | A-C | Automatic registration and calibration of multicolour data sets (~10 wavebands, few x 100Mb/band), object identification, cross identification and association | One hour |
| Galaxy Clusters | A-C | Generate cluster maps from smoothing functions applied to red early type galaxies identified in catalogue data (e.g. VST, UKIDSS). Assign redshifts to clusters via photometric techniques, compare sample with model outputs as function of z | Four hours |
| Hi-z QSOs | A-C | Perform complex query on 2000 deg^2 of multi-colour (e.g. 6 band, Sloan & UKIDSS) to return 1000+ QSO candidate sample, plus postage cutouts of pixel data (~20Mb pixel data per cutout). Ability to cross reference the sample QSO set against X-ray, Radio, NED, catalogue lists. | One hour |
| Solar/Stellar Flares | A-S/S | Recover flare photometry for all flare stars in the solar neighbourhood (within 25pc), generate flare energy estimates, localise high resolution UV spectra for candidate sample | Two hours |
| Coronal Waves | S | Localise previous three months of solar imaging data, process images to determine where and when coronal waves occurred. Retrieve all supporting observational data with coverage for these waves | Two hours |
| STP/Solar Events | S-STP | Provide unified point of web service access to distributed models, capture relevant STP data to compare with predictions from models, test individual model via comparison to various solar datasets | One hour |
| Magnetic Storms | STP | Determine period of geomagnetic storm from Dst index, locate satellite data, return in geocentric coordinate space | 30 mins |
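To give a sense of scale for these targets, the sketch below estimates the sustained data rate implied by the Brown Dwarfs row. It assumes that '1000+' means exactly 1000 cutouts, that 'Mb' denotes megabytes of 10^6 bytes, and that only the cutout pixel data are counted; these are assumptions made for the arithmetic, not figures from the Phase-B plan.

```python
def required_rate_mb_per_s(n_items, size_mb_each, completion_hours):
    """Sustained rate (MB/s) needed to deliver n_items within the time limit."""
    total_mb = n_items * size_mb_each
    return total_mb / (completion_hours * 3600.0)

# Brown Dwarfs row: 1000 cutouts of ~20 MB each, to be returned within one hour.
total_gb = 1000 * 20.0 / 1000.0
rate = required_rate_mb_per_s(1000, 20.0, 1.0)
```

Under these assumptions the request moves about 20 GB, which needs a sustained rate of roughly 5.6 MB/s before any time for the query itself or for the astrometry is counted.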
(2.6.3) The AstroGrid System: Core Functions and User Add-Ons
The evolving Phase-B plans detail
AstroGrid's roadmap for Phase-B of the project, to deliver software which will enable the creation of a working, grid-enabled Virtual Observatory (VO) based around key UK astronomical data centres. Activity will be concentrated in the following areas:
- Component Services
This is the main activity of the build phase and concerns the building of the web and grid service-based components from which the VO will be constructed.
- Library Services
These are also service-based components but will be wrappers or interfaces to existing tools or libraries.
- Portal & Client Programs
These are stand-alone programs with which the user will interact; they will make use of the service components defined above.
- Demonstrations
These activities are technology trials required to test whether certain technologies work the way we require, or to check how areas of research can be exploited.
- Research
This activity is, as it says, research into areas which are still insufficiently understood to be incorporated into the system.
- Test Implementations
This involves the implementation of working versions of the software in one or more data centres to test the feasibility of the software being developed.
The end user will interact with the system through the
AstroGrid Portal & Client Programs and be able to manipulate key data sets held both by
AstroGrid, and others accessible via compatible interfaces (e.g. those made available by the
AVO, NVO). Manipulations of these data sets will be possible by the application of the key
AstroGrid library of services and tools (e.g. image classifiers, database manipulation tools etc.).
Running User Provided Algorithms
However, it is recognised that the work flow of many astronomical tasks indicates that the individual researcher will want to apply some forms of algorithmic processing to their data set that is not provided for in the core
AstroGrid offer.
AstroGrid will devote effort to making existing software packages such as Starlink, IRAF, AIPS++ and SolarSoft available within the system.
AstroGrid's
Phase-B architecture will also support a limited ability for users to integrate the use of their specific processing algorithms into the system. This capability is foreseen to be implemented in the
late 2004 iteration of the project. For those cases where the individual user code cannot be uploaded into the system, the user will of course be able to access their data products held by the
AstroGrid system and process these data locally.
(2.6.4) Data Sources for AstroGrid
As initially stated in the original project submission for
AstroGrid, the scientific focus is on exploiting key data or resources held by the UK data centres. For instance, the
Brown Dwarf Selection case would mine
SuperCOSMOS,
UKIDSS and
WFS data as primary data sources. These UK sourced data sets held by the
AstroGrid consortium data centres are listed in full at
http://wiki.astrogrid.org/bin/view/Astrogrid/DataCentres. They represent a heterogeneous set of data, covering a wide range of wavelengths from Radio (e.g. Merlin at Jodrell) to X-Ray (e.g. XMM-Newton at Leicester). Limited access to model data will be provided; initial conversations are beginning in this area.
AstroGrid will additionally enable access to many data sets held elsewhere. This will include data sets held by
AstroGrid's partners in the
AVO consortium, including data held by ESO, and catalogue data accessible through the CDS-Strasbourg.
(2.7) The AstroGrid Science Input into Partner Virtual Observatory Projects
(2.7.1) AstroGrid and the Astrophysical Virtual Observatory
AstroGrid has adopted an approach based on suitable scoping of a widely drawn set of virtual observatory science drivers.
AstroGrid is also part of the European
Astrophysical Virtual Observatory project. The
AVO is
currently undertaking its three year Phase-A study which will lead
to a fully fledged proposal to develop a facility class
Euro-VO.
A key area of the
AVO is its
Science work area. Input to the Science WA is provided by the Science Working Group, whose membership is drawn from the European astronomical community.
At its second meeting a sub group of the SWG, chaired by the
AstroGrid project scientist, was formed to derive the science
requirements for the
AVO's initial science/technology January 2003 demonstrator product. Development of the
AVO demonstrator is now progressing, with
AstroGrid taking a lead role in the definition and production of the web services aspects of the demo. Specifically
AstroGrid is working to make available an image extractor (
SExtractor, see
Bertin & Arnouts, 1996) as a web service which will enable on-the-fly user defined cut-outs and re-extractions of distributed
GOODS data sets. This work is described at
http://wiki.astrogrid.org/bin/view/Astrogrid/AVODemo.
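The cut-out step of such a service can be pictured with the following sketch. It is not the AVO demonstrator code and does not call SExtractor; the image is a plain list of pixel rows, and the function name and its parameters are hypothetical.

```python
def cutout(image, centre_row, centre_col, half_size):
    """Return the square sub-image around a pixel, clipped at the image edges.

    The full cut-out has side 2 * half_size + 1; it is smaller where the
    requested region extends beyond the image.
    """
    n_rows = len(image)
    n_cols = len(image[0]) if n_rows else 0
    row_lo = max(0, centre_row - half_size)
    row_hi = min(n_rows, centre_row + half_size + 1)
    col_lo = max(0, centre_col - half_size)
    col_hi = min(n_cols, centre_col + half_size + 1)
    return [row[col_lo:col_hi] for row in image[row_lo:row_hi]]

# Toy 5 x 6 image whose pixel value encodes its own position (10 * row + column).
image = [[10 * r + c for c in range(6)] for r in range(5)]
stamp = cutout(image, centre_row=2, centre_col=3, half_size=1)
edge_stamp = cutout(image, centre_row=0, centre_col=0, half_size=1)
```

In a distributed setting the point of doing this server-side is that only the small cut-out, rather than the full image, crosses the network before re-extraction.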
(2.7.2) AstroGrid and the National Virtual Observatory
Because
AstroGrid has consortium partners involved in Astronomy, Solar
Astrophysics and STP, its scientific remit is somewhat more extensive than that of the
NVO, as outlined in the
Science Definition Team's report dated April 2002.
However,
AstroGrid is more constrained by its three year project timescale, which is shorter than the five year NVO project span. Thus the
AstroGrid project has based its approach on supporting rather specific programmes to ensure that the project delivers a functional, although limited in scope, virtual observatory for the UK.
AstroGrid is in regular contact with the NVO project, both formally through joint membership of the International Virtual Observatory Alliance, and informally through cross attendance at each other's project meetings. The
AstroGrid science drivers show a limited degree of overlap with the current NVO set of drivers, for instance in the area of offering enhanced tools to analyse the low surface brightness universe. The possibility of the production of a diverse set of independently developed tools in this area is seen as being of benefit to the community. In general though, the
AstroGrid science drivers are different to those of the NVO.
(2.7.3) AstroGrid and the International Virtual Observatory Alliance
The
International Virtual Observatory Alliance (IVOA) is an alliance of virtual observatory initiatives. Its aim is to provide a forum in which the common elements, required to ensure that the systems developed by the various partners are interoperable with one another, can be identified and standards agreed upon. Most of these common elements have to do with standards for data, interfaces and perhaps ontologies and registries. Other common or shared elements may be in the form of software packages, source code libraries, and development tools. Some others have to do with issues of policy, funding and securing international support at governmental levels. The first significant milestone of the IVOA was the agreement and release of the VOTable interoperability standard.
AstroGrid is a founder member of the IVOA, which is currently composed of representatives from the major projects (i.e.
AstroGrid,
AVO, the US National Virtual Observatory (NVO)) and all other currently funded virtual observatory initiatives (i.e. eAstronomy Australia, Canadian Virtual Observatory, German Virtual Observatory, Russian Virtual Observatory, Virtual Observatory India). The Chair of the IVOA is Bob Hanisch (NVO), Deputy Chair: Peter Quinn (
AVO), Technical Chair: Roy Williams (NVO), and Secretary: Nic Walton (
AstroGrid).
In this international context,
AstroGrid is stressing the importance of science drivers in shaping the development of the global VO initiatives. The importance of science drivers has been agreed upon, and is recognised in the IVOA's support for a roadmap of international development in which demonstrations of science driven capability are featured.
(2.8) The Evolutionary Path
At the completion of its Phase-B,
AstroGrid aims to provide a fully functional Virtual Observatory capability, with a specific focus on meeting the scientific demands of its UK user community. The
AstroGrid product is defined in scope by the science drivers listed in this document, and by the resulting minimum architecture needed to deliver the capabilities demanded by these drivers. The implementation plan by which the
AstroGrid product will be delivered is described in the
Phase-B plan at
http://wiki.astrogrid.org/bin/view/Astrogrid/RbPhaseBPlan.
The
AstroGrid framework is being constructed in a manner that will ensure that it is both durable and capable of further expansion to offer increased capabilities in the future. This philosophy is in line with the
AstroGrid Vision, where the focus is on producing an organic system that supports and facilitates scientific endeavour, rather than actually 'doing' the science. The design is being shaped with a view to minimising the operational costs of the VO system in the longer term.
Future UK virtual observatory initiatives would build on the
AstroGrid product. It is clear that the
AstroGrid system would provide the UK with a significant entry point into the planned
Euro-VO initiative, whereby a facility class European virtual observatory will be created.
AstroGrid is providing capabilities in a number of areas, for example:
- Creating the multi-space digital sky, whereby seamless access and manipulation is enabled to diverse data sets covering the sky
- Allowing seamless integration of model and observational data sets
- Providing discovery and access to necessary compute and data storage assets
However, many exciting and challenging issues will need to be addressed in future virtual observatory initiatives, including:
- Allow the creation of 'topic-specific' workspaces, giving access to all data and tools relevant to a certain astrophysical problem
- Facilitate the creation of aided work-flows, whereby the user is able to construct a personalised data pipeline from VO components, with the system providing sophisticated guidance.
- Provide a means whereby the outputs of data manipulations can be automatically fed back into the operations of telescopes, both for real-time and ordinary observational programmes
- Increase the offer of powerful visualisation capabilities, especially in the domain of multi-dimensional visualisation, via the application of technical advances in immersive computing.
- Ultimately provide for a dynamic, self accreting digital sky, whereby all global astronomical observational endeavour is captured, tagged for quality and ingested, for future manipulation and analysis by the global community.
AstroGrid, or its UK successors, will play a key role in providing solutions to these issues.
(2.9) Closing Remarks
AstroGrid is a modest three year project which has analysed a wide range of possible science drivers for the creation of a virtual observatory. It has defined a key set of drivers - the
AstroGrid Ten - and is using these as a basis upon which it determines the capabilities that it will produce in its Phase-B. It is noted that the science drivers will require the construction of a sophisticated system capable of providing access to significant heterogeneous petabyte scale data sets located in the UK and elsewhere. Tools to discover and manipulate these data will significantly aid the research community. In particular,
AstroGrid will increase both the efficiency and effectiveness of the UK astronomer, and enable them to devote more time to the vital task of understanding the physical processes at work as revealed by the results from their data manipulations. This will undoubtedly lead to significant advances in the scientific productivity of the UK astronomical community.
This science summary closes by noting that with the completion of
AstroGrid's Phase-B, the UK will be well positioned for a leadership role in future larger scale European virtual observatory initiatives.
(2.10) References
(Astronomy and Geophysics) The
Royal Astronomical Society's in-house
journal - see
A&G at Blackwell
(
AstroVirtel): http://www.stecf.org/astrovirtel/
and accepted proposals at
http://archive.eso.org/wdb/wdb/vo/avt_prop/query
(
AVO) The Astrophysical Virtual Observatory - a three year
EC funded programme charged with mapping out the structure of a
facility class virtual observatory for Europe. See
http://www.eso.org/avo
Basri, G, 2000, ARA&A, 38, 485, 'Observations of Brown Dwarfs'
Bertin and Arnouts, 1996, A&AS, 117, 393, 'SExtractor: Software for source extraction'
Bode et al, 2001, ApJ, 556, 93, 'Halo Formation in Warm Dark Matter Models'
Bond et al, 2001,
ApJ, 560, 919, 'Detection of Coronal Mass Ejections in V471 Tauri with the Hubble Space Telescope'
Dalton et al, 1992, ApJ, 390, 1, 'Spatial correlations in a redshift survey of APM galaxy clusters'
Dunlop et al, 2002, MNRAS, in press, 'Discovery of the host galaxy of HDF850.1, the brightest sub-mm source in the Hubble Deep Field'
(
EGSO) European Grid of Solar Observatories - an
EU IST funded programme. See
http://www.mssl.ucl.ac.uk/grid/egso/
Eke et al, 2000, MNRAS, 315, 18, 'The cosmological dependence of galactic specific angular momenta'
(
Euro-VO) The current working title for the European
Virtual Observatory initiative (shortly at
http://www.euro-vo.org).
AstroGrid is providing vital
scientific and technical input into the development of this future
programme.
(
Frontiers)
PPARC's in-house magazine. A number of
articles describing aspects of the
AstroGrid project and Virtual
Observatory initiatives have been published:
http://www.pparc.ac.uk/frontiers
Gal et al, 2000, AJ, 119, 12, 'The Northern Sky Optical Cluster Survey. I. Detection of Galaxy Clusters in DPOSS'
Gladders & Yee, 2000, AJ, 120, 2148, 'A New Method For Galaxy Cluster Detection. I. The Algorithm'
The Great Observatories Origins Deep Survey (GOODS) is a public, multiwavelength survey that will cover two 150 arcmin^2 fields. These fields are centered around the HDF-N (Hubble Deep Field North) and the CDF-S (Chandra Deep Field South): see
http://www.eso.org/goods.
Impey & Bothun,
1997, ARA&A, 35, 267, 'Low Surface Brightness Galaxies'
The International Virtual Observatory Alliance. Its home page will shortly be found at
http://www.ivoa.net. The IVOA Mission and Roadmap is located at
http://wiki.astrogrid.org/bin/view/IVOA/RoadMap
Martin et al, 2000, 543, 299, 'Membership and Multiplicity among Very Low Mass Stars and Brown Dwarfs in the Pleiades Cluster'
Nagamine et al, 2001, ApJ, 558, 497, 'Star Formation History and Stellar Metallicity Distribution in a Cold Dark Matter Universe'
(
NAM2002) National Astronomy Meeting 2002 in Bristol, see
http://www.star.bris.ac.uk/nam
(
NVO) The US National Virtual Observatory project:
http://www.us-vo.org
Perlmutter et al, 1999, ApJ, 517, 565, 'Measurements of Omega and Lambda from 42 High-Redshift Supernovae'
Raeder et al, 2001,
Solar Physics, 204, 323, 'Global Simulation of Magnetospheric Space Weather Effects of the Bastille Day Storm'
(RAS AGM)
Royal Astronomical Society AGM 2002 at
http://www.ras.org.uk/
"Relativistic Electron Forecast Model", USAF and NOAA Space Environment Center,
http://www.sec.noaa.gov/refm/
Richards et al, 2002, AJ, in press, 'Spectroscopic Target Selection in the Sloan Digital Sky Survey: The Quasar Sample'
Schaefer et al, 2000, ApJ, 529, 1026, 'Superflares on Ordinary Solar-Type Stars'
(
SDT) National Virtual Observatory Science Definition Team (SDT) -
final report at
http://nvosdt.org/sdt-final.pdf
Skone & de Jong, 2000, Earth Planets Science, 52, 1067, 'The impact of geomagnetic substorms on GPS receiver performance'
(
SpaceGRID) An ESA funded programme to investigate
possible uses of Grid technology in supporting the scientific and
technical functions of ESA:
http://spacegrid.esa.int
SuperCOSMOS Sky Survey:
http://www-wfau.roe.ac.uk/sss/
Theuns et al, 2002, ApJ, 574, 111, 'Detection of He II Reionization in the Sloan Digital Sky Survey Quasar Sample'
Thompson et al, 1999, ApJ, 517, 151, 'SOHO/EIT Observations of the 1997 April 7 Coronal Transient: Possible Evidence of Coronal Moreton Waves'
(
UKIDSS) UKIRT Infrared Deep Sky Survey
"Wang Sheeley Model", USAF and NOAA Space Environment Center,
http://www.sec.noaa.gov/ws/
(
WFS) The Isaac Newton Group's Wide Field Survey programme:
http://www.ast.cam.ac.uk/~wfcsur/index.php
(
XMM-SSC) XMM-Newton Survey Science Centre:
http://xmmssc-www.star.le.ac.uk/
Zhang et al, 2001, ApJ, 559, 452, 'On the Temporal Relationship between Coronal Mass Ejections and Flares'
--
NicholasWalton - 27 Dec 2002