r38 - 10 Feb 2003 - 15:46:24 - NicholasWalton

PhaseAReport

(2) The Science Analysis Summary: defining key requirements for AstroGrid

(2.1) Summary

This report describes the derivation and formulation of the science requirements that are being used to define the scope of the Phase-B development of AstroGrid. The AstroGrid project has determined a sample set of representative science cases, the 'AstroGrid Ten', from which the necessary AstroGrid product deliverables are derived. Each science case sets challenging demands in areas such as resource discovery (from data through published literature sources), manipulation of large, multi-location, multi-TB data sets, and application of sequences of algorithmic processing. AstroGrid will provide tools and systems to aid astronomical researchers as they transform experimental and theoretical model data into the information by which the physical processes under investigation can be understood.

Note: Reference is made to the RedBook; this is the AstroGrid Phase-A report, i.e. this document.

(2.2) Introduction

The AstroGrid 'vision' is described in Chapter 1 of the RedBook. Briefly, it is one whereby AstroGrid will provide the UK astronomy community in particular, and the global astronomy community in general, with access to powerful, sophisticated, distributed data and advanced processing capabilities. An emphasis will be on enabling more efficient science (e.g. by speeding up processes that are currently undertaken, through improved tools for accessing and manipulating multiple and distributed data sets). Key challenges here are in providing a system that can give access to large, distributed data sets. Likewise, AstroGrid will enable more effective science via its focus on providing improved workflow capabilities, e.g. in the development of its ontological processes, which aim to provide directed workflows for common sets of tasks.

This report focuses on the key science drivers that have been taken by the AstroGrid project in order to determine which deliverables should be produced by the project by the end of its three year development cycle (end Dec 2004). The process by which the science drivers were obtained is described, noting the importance of the initial gathering of a wide set of requirements which were subsequently narrowed to give a final, well defined, set of drivers. These were chosen to be representative of science of current and near term relevance to the UK community, with a good coverage of astronomy, solar and solar-terrestrial physics cases. The 'AstroGrid Ten' science cases are described in detail, along with the major areas of functionality that are required by these cases. Thus the 'AstroGrid Ten' provide the primary science drivers for the Phase-B development of AstroGrid.

Specific targets demanded by the science drivers, which will need to be satisfied by the final project deliverables, are noted. These are set in terms of the volumes and types of data sets that may need to be discovered and processed during a researcher's analysis of that specific science topic.

This document notes how the science drivers for the AstroGrid project fit with the development of the Astrophysical Virtual Observatory (AVO) project (of which AstroGrid is a primary member) and other virtual observatory initiatives. In closing, the future areas of astronomical science that may be focussed on in any future development of the AstroGrid or related projects, are commented upon.

(2.3) Capturing the Science Requirements

The AstroGrid project initially determined to seek out a wide and demanding set of science drivers which might shape a generic virtual observatory. These VO science cases have been written up, to varying degrees of completeness, on the AstroGrid Wiki VO science requirements area.

The following sections outline the primary mechanisms by which the project has attempted to gather its science requirements, both from activities internal to the project and from external engagement with the community. Additional details of this process, including full details of seminars given, engagement with the community etc., are to be found on relevant areas of the AstroGrid Wiki.

(2.3.1) Gathering Virtual Observatory Science Drivers

The AstroGrid consortium members generated science cases reflecting the scientific interests of their research groups. Because the consortium contains representatives involved in a wide range of UK astronomy, solar and STP research activity, a broad range of science cases, stressing areas such as radio astronomy, solar physics and solar/terrestrial physics, resulted. This was a major and early task of the Project.

In the first instance a number of specific use cases were formulated. These were often concerned with how part of a science problem might be approached, for instance, running a query on a database to locate and show the positions of known QSOs. Other cases are aimed at easing the process of acquiring sufficient data to address a particular problem, e.g. returning the colours of galaxies and their bulges as a function of redshift to study alternative theories of galaxy evolution. The distinction between 'science cases' and more generic 'use cases' rapidly became apparent. Emphasis was placed on capturing and formalising the science cases as listed at http://wiki.astrogrid.org/bin/view/VO/ScienceProblemList.

Further input for the science cases was sought from a number of areas. The project scientist was responsible for assessing current major scientific research strands with a view to identifying those areas likely to benefit from the promise of access to the distributed data and processing capabilities to be opened up by a virtual observatory. Areas here included those science topics serviced by large scale multi-wavelength data surveys or those requiring access to large scale computational facilities. Note was made of a science requirements survey undertaken by the Science Definition Team of the NVO.

The project scientist and other team members have been involved in discussions with their research colleagues in a variety of situations. A series of presentations has been made at University research departments and at national meetings such as the RAS's National Astronomy Meeting and RAS AGM. Importantly, the project has engaged with younger researchers in astronomy, with ad-hoc meetings arranged with PhD students and new Post Docs at the IoA, MSSL and other institutes. The focus of these discussions was which elements of AstroGrid would most likely support young researchers' work. Key areas of concern to them were in having access to capabilities which would speed the process of mundane data processing and manipulation tasks; thus the concept of assisted workflows appealed.

Engagement with a number of large scale projects has also led to significant scientific input, e.g. with the UKIDSS consortium, the ING Wide Field Survey, the XMM-Survey Science Centre and its role in the XMM Serendipitous Sky Survey.

(2.3.2) The Pilot Programme: Feeding Back Science Requirements

The AstroGrid Phase-A Pilot Programme, as described in the Phase A Report section Pilot programme report, has also provided a number of inputs into the Science Requirements of the project. In a similar vein, five science topics were fed back into the pilot programme as the basis for the pilots as described in Section 7.2 of Pilot programme report.

In a joint AstroGrid/SpaceGRID initiative, the space science research (SSR) community was invited to comment on the requirements of a possible virtual observatory system providing distributed access to data and processing assets. The questionnaire issued is located in full at http://www.spacegrid.rl.ac.uk/spacegrid/SSR_online_q.htm. Summary results of this exercise are reported in the Pilot programme report and the WPA5.5 report. This exercise did not outline specific scientific drivers. However, it did produce an indication of the priority areas in generic capabilities of interest to the space science community, and these have been input into the derivation of the AstroGrid science drivers.

(2.3.3) Presenting the AstroGrid Project: Receiving Input

The project scientist and other AstroGrid staff have been active in giving presentations and seminars about the AstroGrid project throughout Phase A of the project. The list of talks and seminars is given in the Progress against Goals section of this report (located on the wiki site). This has widened the UK community's appreciation of the possibilities of the project and led to significant input into the generation of new science cases and amendments to pre-existing ones.

A number of general articles have been presented in journals such as PPARC's Frontiers magazine, the RAS's Astronomy and Geophysics magazine etc. These have invited scientific feedback, and have led to a number of interesting comments.

(2.4) Shaping the AstroGrid Drivers from the Virtual Observatory Science Cases

The previous section (2.3) has described the process by which the AstroGrid project captured its science drivers. The complete list is held at http://wiki.astrogrid.org/bin/view/VO/ScienceProblemList. It is anticipated that this selection of science drivers will be continually expanded upon throughout the project lifetime. These drivers will form an important resource for input into partner VO initiatives, especially the AVO. Partner projects such as EGSO and SpaceGRID are also likely to draw from them. These 'VO' science cases, together with those being highlighted by other virtual observatory projects, will be used in shaping the evolution of longer term VO initiatives.

With the collation of the VO science drivers, the project recognised that AstroGrid would not be able to produce a virtual observatory capable of meeting the demands of all of these cases. Thus a formal and rigorous process was undertaken in order to select a well defined set of science drivers, the AstroGrid Ten, which would be used to shape the AstroGrid deliverables. The project scientist and scientific members of the team analysed the science drivers over a number of review meetings. Based on selection criteria the science problems were distilled to produce the key set of ten drivers.

These drivers were chosen to:

  • Represent a cross section of currently topical Astronomy, Solar and Space Science research areas
  • Require functionalities covering a wide spread of technical areas
  • Be achievable within the AstroGrid Phase-B project span, both in terms of technical complexity of solution and in terms of availability of input science data sets
  • Have a well defined user base who would benefit from capabilities generated by the project
  • At the same time, yield tools that would be of use across a wide range of problem areas

The AstroGrid Ten drivers will, however, be under more formal version control from the beginning of AstroGrid's Phase B.

(2.5) 'The AstroGrid Ten' Science Drivers

For each science driver a typical flow of events has been constructed which decomposes the tasks required to complete that process. Sequence Diagrams have then been generated for each of the science cases, and for the generic technical use cases (these covering activities such as negotiating access to jobs, logon to the system etc.). The sum of these tasks represents the components of the system that are required to form the AstroGrid Phase-B product, to be developed within the framework laid down by the project architecture.

It is worth noting that AstroGrid aims to provide tools and capabilities to help the researcher in producing solutions to these science topics. However, AstroGrid will not in itself provide the answers; rather, the researcher will be presented with new capabilities to make them more efficient and effective. This will be especially so in the areas of data discovery, transformation of data into information via access to processing facilities, and management of the processing flow of events. AstroGrid will mean that the researcher will be able to devote more time to the understanding of the astrophysics revealed by the results. In other words, more time can be given to the important step of transforming information into knowledge.

The capabilities derived for the AstroGrid Ten will have a usefulness to a wider scientific audience. Any science problem with a similar workflow to one of the AstroGrid Ten will be supported. For instance, searching for AGB candidates would benefit from the system developed to support searches of high redshift Quasars - the difference being one of types of input catalogues, and constraints on the discovery space.

From Science Driver to AstroGrid Product

The AstroGrid Ten science drivers are used to define the scope of the AstroGrid deliverables. Each use case was analysed and decomposed into a work flow, with the tasks required by the science cases being identified.

  • Generic Use Cases
    A number of use cases were captured which have generic utility. These cases are those that would be needed in any baseline virtual observatory system, and are largely infrastructural in nature. Use cases in this category include those dealing with security (e.g. Determine Identity), monitoring, job control etc. Generic use cases were captured in a wide sense as generic 'Virtual Observatory' use cases in the VO area of the Wiki - see VO.UseCaseList.

  • Specific Use Cases
    Analysis of each science use case revealed that, in addition to the need for the generic use cases, each would require more specialised use cases. For instance, the HiZQuasars case requires use cases such as those allowing the determination of the redshifts of objects in the field.

  • The AstroGrid Use Case Set
    The analysis of the Ten science cases revealed the minimum set of use cases that would be required to enable the construction of the capabilities required to meet the demands of these science drivers.

In parallel to the scientific requirements process the architectural shape of AstroGrid was being formulated. The use cases demanded by the science drivers, together with any of those indicated by the outline system architecture thus represent the reduced set of use cases that will need to be developed by the AstroGrid project. These are listed in the AstroGrid Use Case wiki area at http://wiki.astrogrid.org/bin/view/Astrogrid/UseCases. For a full description of the further process by which the AstroGrid project will construct its products, refer to the RedBook sections, Architecture Overview and Phase B Plan.

(2.5.1) Brown Dwarf Selection

This problem involves aiding the discovery of Brown Dwarfs from large scale survey data sets. Brown dwarfs are intrinsically faint and rare objects, so their detection is not straightforward. It can be done, however, through a combination of selection criteria using colour and proper motion information. Colour selection is the more important, because brown dwarfs populate a well-defined photospheric temperature range (although the coolest brown dwarfs have unusual spectral energy distributions, peaking around 1 micron, due to the absorption of near-infrared continuum flux by water and methane). Proper motion selection can help too, since any detectable brown dwarfs must be nearby and so, on average, will have relatively high proper motions. The use of wide field optical/near-IR surveys to localise Brown Dwarfs is discussed by Basri, 2000; his Figure 7 shows the colour magnitude diagram for low mass Pleiades members.

The key areas of AstroGrid functionality required are:

  • access to large area optical and near-IR data sets
  • the ability to search for objects in colour-colour space, with objects referenced against model predictions for that colour-colour space
  • the ability to cross match samples selected in the colour-colour search with possible multi-epoch data to determine the objects' proper motions.
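
As an illustration only, the colour and proper motion selection described above might be sketched as follows in Python. The record field names, the simple colour thresholds (standing in for a comparison against model predictions) and the proper motion cut are all assumptions made for this sketch; they are not AstroGrid interfaces or values taken from this report.

```python
import math


def proper_motion_mas_per_yr(ra1_deg, dec1_deg, ra2_deg, dec2_deg, baseline_yr):
    """Total proper motion in mas/yr from positions measured at two epochs.

    Uses the small-angle approximation, scaling the RA offset by cos(dec).
    """
    mean_dec = math.radians(0.5 * (dec1_deg + dec2_deg))
    d_ra = (ra2_deg - ra1_deg) * math.cos(mean_dec)
    d_dec = dec2_deg - dec1_deg
    shift_mas = math.hypot(d_ra, d_dec) * 3600.0 * 1000.0  # degrees -> mas
    return shift_mas / baseline_yr


def select_candidates(sources, min_i_minus_k, min_r_minus_i, min_pm_mas_yr):
    """Keep sources that are red in (I-K) and (R-I) and show a high proper motion.

    Each source is a dict with hypothetical keys: R, I, K (magnitudes),
    ra1/dec1 and ra2/dec2 (degrees at two epochs) and baseline_yr.
    """
    picked = []
    for src in sources:
        red_enough = (src["I"] - src["K"] >= min_i_minus_k
                      and src["R"] - src["I"] >= min_r_minus_i)
        pm = proper_motion_mas_per_yr(src["ra1"], src["dec1"],
                                      src["ra2"], src["dec2"],
                                      src["baseline_yr"])
        if red_enough and pm >= min_pm_mas_yr:
            picked.append(src)
    return picked
```

In practice the colour cuts would be replaced by a comparison with model tracks in colour-colour space, and the resulting candidates passed on for spectroscopic confirmation.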

The resulting brown dwarf sample data sets can then be used as input into spectroscopic confirmation programmes, confirming the nature of the objects by means of tests such as the 'Lithium Test' (see Martin et al, 2000).

A comprehensive flow of events is contained in the Brown Dwarf Sequence Diagramme wiki area. For the specific example where Brown Dwarfs are discovered in Galactic Clusters from multi-colour survey data the following flow of events occurs:

  1. The astronomer searches the resource catalogue for catalogues containing Galactic clusters via PerformRegistrySearch.
  2. A list of cluster catalogues is returned via MySpaceStoreResults, and the astronomer selects one or more cluster catalogues via SelectCatalogue.
  3. Next, the astronomer searches the selected catalogues for cluster locations via PerformCatalogueSearch.
  4. A list of locations, defined by right ascension, declination, radius, and distance, is returned via MySpaceStoreResults.
  5. The astronomer then returns to the resource catalogue and executes a complex query for catalogues with coverage of I, K, or R wavelengths over each cluster location via ComplexQuery.
  6. A list of catalogues with I, K, or R coverage of the cluster location is returned via MySpaceStoreResults.
  7. The astronomer selects one or more catalogues via SelectCatalogue and searches them for two or more datasets covering the cluster location in the same wavelength (either all datasets with I coverage or all with K coverage) via PerformCatalogueSearch.
  8. The datasets are stored to MySpace via MySpaceStoreResults.
  9. Now the astronomer can prepare the data for the proper motion survey. The datasets are astrometrically aligned using a library function, a web service, or user code via DetermineProgram.
  10. Next, the proper motion can be applied to the datasets by user code, a web service, or a library function via DetermineProgram.
  11. The program calculates a proper motion vector-point diagram of objects in the dataset. The diagram is stored on MySpace via MySpacePublishDerivedData and returned to the astronomer.

The use cases required to deliver this science case are therefore:

  1. PerformRegistrySearch
  2. PerformCatalogueSearch
  3. MySpaceStoreResults
  4. ComplexQuery
  5. DetermineProgram
  6. MySpacePublishDerivedData
  7. SelectCatalogue
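
The derivation of this use case list from the flow of events can be sketched in a few lines of Python. The data structure below is an assumption made for illustration (it simply transcribes the eleven steps above); only the use case names come from this report.

```python
# Each entry: (step number, use cases invoked in that step), transcribed
# from the Brown Dwarf flow of events given in this section.
FLOW_OF_EVENTS = [
    (1, ["PerformRegistrySearch"]),
    (2, ["MySpaceStoreResults", "SelectCatalogue"]),
    (3, ["PerformCatalogueSearch"]),
    (4, ["MySpaceStoreResults"]),
    (5, ["ComplexQuery"]),
    (6, ["MySpaceStoreResults"]),
    (7, ["SelectCatalogue", "PerformCatalogueSearch"]),
    (8, ["MySpaceStoreResults"]),
    (9, ["DetermineProgram"]),
    (10, ["DetermineProgram"]),
    (11, ["MySpacePublishDerivedData"]),
]


def required_use_cases(flow):
    """Return the distinct use cases, in order of first appearance in the flow."""
    seen = []
    for _step, use_cases in flow:
        for name in use_cases:
            if name not in seen:
                seen.append(name)
    return seen


if __name__ == "__main__":
    for name in required_use_cases(FLOW_OF_EVENTS):
        print(name)
```

The result contains the same seven use cases as the list above, although ordered by first appearance in the flow rather than as listed.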

This sequence diagram represents a possible flow of events for this problem:


browndwarfSD2.gif

Detailed breakdowns such as these have been performed for each science case in turn, with full details accessible via the links in this document (or via the wiki pages). Further analysis of each sub case is performed to reveal the complete case set. At this stage class diagrammes, and eventually software construction, are undertaken.

(2.5.2) Discovering Low Surface Brightness Galaxies

Low surface brightness systems are often missed from wide field survey catalogues due to selection effects acting against their discovery. However, it is important to locate and understand the properties of this population as it could contain a significant mass (e.g. Impey & Bothun, 1997). A knowledge of the number and mass distribution of low surface brightness galaxies is also vital when comparing theories of galaxy formation and evolution.

The key areas of AstroGrid functionality required are:

  • access to image surveys with relevant magnitude and depth
  • dual pass algorithms to remove initially bright structures then localise extended low surface brightness features.
  • comparison of structures across multiwavelength data sets, e.g. optical from the WFS, infrared from UKIDSS

A comprehensive flow of events is contained in the Low Surface Brightness galaxy sequence diagramme wiki area.

(2.5.3) The Galaxy Environment of Supernovae at Cosmological Distances

Supernova searches (e.g. Perlmutter et al, 1999) typically programme observations of a set area of sky (the area imaged depends on the size of the SN sample desired; for SN samples at lower redshift, larger areas of sky are required due to volume effects). The selection of the correct sample of Type Ia's at the imaging search stage is important because confirmation of the SN comes from spectroscopy, often obtained on the largest ground based telescopes, such as the VLT, for the higher redshift (z>0.7) SN. It is therefore important to minimise spectroscopic and follow-up time 'wasted' on Type II SN.

A problem with current techniques is that for any candidate discovered there is an uncertainty as to whether or not the candidate is in fact the desired Type Ia SN. Whilst Type Ia SN are typically brighter than Type II core collapse SN, some (~10%) Type II's can contaminate the sample.

A rapid knowledge of the environment in which any SN is discovered can improve the situation. Pre-determination of the galaxy redshifts utilizing photometric methods enables an estimate to be made of the candidate's distance upon discovery, and thus a better estimation of which type of SN it is. Further, information on the galaxy, for instance its morphological type, may also aid in rapid classification of the SN. Type Ia's, being formed by an explosion resulting from the accretion of matter onto a degenerate star, are found in all classes of galaxy. However, Type II's, which result from the catastrophic explosion of a massive star, have not been found in early elliptical galaxies.

The key areas of AstroGrid functionality required are:

  • search literature and published sources for possible spectroscopic redshifts of galaxies in SN survey fields.
  • search archives for spectroscopic data of objects in the field, then determine redshifts of galaxies in the fields, perhaps using automated techniques such as those developed for the 2dFGRS (see Colless et al, 2001)
  • Locate multicolour broadband optical data for the search fields
  • Determine photometric redshifts to galaxies in the fields using a variety of techniques (e.g. hyperz, more recently (2002) Z-Peg).
  • identify possible galaxy clusters
  • cross reference position of newly discovered SN from search. If located in a galaxy for which the redshift is known from one of the above techniques, return an assigned redshift for that SN.
  • return morphological information of the galaxy in which the SN candidate is located (if applicable)
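
The final two steps, cross referencing a new SN position against galaxies of known redshift and returning the host information, might look something like the following Python sketch. The record layout, the matching radius and the rule of taking the nearest galaxy within that radius are assumptions for illustration, not a description of an AstroGrid component.

```python
import math


def angular_separation_arcsec(ra1_deg, dec1_deg, ra2_deg, dec2_deg):
    """Great-circle separation in arcseconds, using the haversine formula."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1_deg, dec1_deg, ra2_deg, dec2_deg))
    a = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a)))) * 3600.0


def assign_host(sn, galaxies, max_sep_arcsec):
    """Return redshift and morphology of the nearest galaxy within max_sep_arcsec.

    sn and galaxies are dicts with hypothetical keys (name, ra, dec, z,
    z_method, morphology). Returns None when no galaxy is close enough.
    """
    best, best_sep = None, max_sep_arcsec
    for gal in galaxies:
        sep = angular_separation_arcsec(sn["ra"], sn["dec"], gal["ra"], gal["dec"])
        if sep <= best_sep:
            best, best_sep = gal, sep
    if best is None:
        return None
    return {
        "sn": sn["name"],
        "host": best["name"],
        "z": best["z"],
        "z_method": best["z_method"],          # e.g. spectroscopic or photometric
        "morphology": best.get("morphology"),  # None if not available
        "separation_arcsec": best_sep,
    }
```

A nearest-neighbour match is the simplest possible association rule; a real service would probably weight by galaxy size and positional uncertainty.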

A comprehensive flow of events is contained in the Supernova Galaxy Environment sequence diagramme wiki area.

(2.5.4) Object Identification in Deep Field Surveys

The Hubble Deep Field (HDF) is a 'blank' area of sky observed with unprecedented resolution and sensitivity by the HST, revealing about 3000 faint galaxies within a 3 arcmin-square region (also including flanking fields). Fields of up to 40' centred on the HDF have since been imaged at wavelengths from radio to x-ray. In order to better understand the nature of the objects in the HDF, it is vital to be able to correctly cross identify sources seen in various wavelength regimes. This involves effort in aligning the data sets, and searching for significant correlations between sub-sets of properties. For example, it turns out that there is an excess of radio sources (including those too faint to be catalogued) within the error boxes of selected optical sources in the HDF.

Only recently has the nature of the brightest sub-mm source (HDF850.1) in the HDF-N been unravelled, as described by Dunlop et al, 2002. The key to this discovery was the combination of new deep imagery in the infrared with careful astrometric alignment and association techniques to relate the various data sets. Techniques developed by AstroGrid to support further work in this area will be applicable to source identifications in the substantial numbers of fields for which deep multiwavelength data sets are becoming available (e.g. HDF, CDF, Subaru/XMM-Newton Deep Survey fields etc.).

The key areas of AstroGrid functionality required are:

  • Automatic registration and calibration
  • Search of all available published data
  • Tests for correlations (based on user-supplied criteria) across many catalogues
  • Searches of image (or other) data for uncatalogued sources which become significant if found to coincide with detections at other wavelengths
  • Search for sources not detected in optical - thus identify objects such as dust-enshrouded starbursts

A comprehensive flow of events is contained in the Deep Field Surveys sequence diagramme wiki area.

(2.5.5) Localising Galaxy Clusters

Clusters of galaxies can be used to trace the distribution of matter in the universe over large scales. Clusters are typically X-ray or optically selected. Many optically selected cluster samples have suffered from various selection effects - such as the use of data in only one colour (e.g. Dalton et al, 1992).

New techniques (e.g. Gal et al, 2000) select clusters using multicolour data to localise clusters which are predicted to contain an overabundance of red, early type galaxies. Cluster identification using Optical and Near-IR data uses positional information to select clusters (e.g. Gladders & Yee, 2000).

Cluster distributions can be compared to matter distributions generated by e.g. Lambda CDM models (e.g. Nagamine et al, 2001) or Warm Dark Matter models (e.g. Bode et al, 2001). These models now have sufficient resolution to show dwarf galaxies. Morphologies of the cluster galaxies will be directly compared with predictions from models of galaxy formation (e.g. Eke et al, 2000).

The key areas of AstroGrid functionality required are:

  • select sources marked as galaxies, select only those in a particular locus of the (g-r) vs (i-r) colour space, and then create density maps
  • determination of photometric and/or spectroscopic redshifts of the sample cluster galaxies
  • comparison with n-body code model outputs: issues include interfacing to large model data sets, visualisation of model vs real data - e.g. matter vs clusters at ranges of redshift, statistical correlations etc.
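
The first of these steps (select galaxies, restrict to a locus in (g-r) vs (i-r) colour space, then build a density map) can be sketched as below. This is a minimal Python illustration under stated assumptions: the field names, the rectangular colour locus and the flat-sky gridding are all hypothetical simplifications.

```python
def select_red_galaxies(sources, g_r_range, i_r_range):
    """Keep sources classified as galaxies whose (g-r) and (i-r) colours fall
    inside a rectangular locus. Ranges are (min, max) tuples in magnitudes."""
    picked = []
    for src in sources:
        if src["class"] != "galaxy":
            continue
        g_r = src["g"] - src["r"]
        i_r = src["i"] - src["r"]
        if g_r_range[0] <= g_r <= g_r_range[1] and i_r_range[0] <= i_r <= i_r_range[1]:
            picked.append(src)
    return picked


def density_map(sources, ra_min, dec_min, cell_deg, n_ra, n_dec):
    """Count sources per cell on a regular RA/Dec grid.

    Flat-sky approximation with no cos(dec) correction, so only suitable for
    a small field. Returns a list of rows, indexed as grid[dec_cell][ra_cell].
    """
    grid = [[0] * n_ra for _ in range(n_dec)]
    for src in sources:
        col = int((src["ra"] - ra_min) // cell_deg)
        row = int((src["dec"] - dec_min) // cell_deg)
        if 0 <= col < n_ra and 0 <= row < n_dec:
            grid[row][col] += 1
    return grid
```

Overdense cells in the resulting map would then be candidate cluster locations, to be followed up with redshift determination for the member galaxies.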

A comprehensive flow of events is contained in the Galaxy Clustering sequence diagramme wiki area.

(2.5.6) Discovering High Redshift Quasars

Quasars at high redshifts will provide vital clues to the processes involved in the formation of the first bound objects. Near-IR survey data from UKIRT's WFCAM (via the UKIDSS survey) and later VISTA survey programmes will enable many quasars in the redshift range 5.5 < z < 7 to be discovered. This will enable a number of principal scientific goals to be met. A key rationale is that quasars at the highest redshifts may enable the investigation of the epoch of reionisation of the Universe. Such an effect is already being reported for the reionisation of He II via studies of quasars between 3 < z < 4 (see Theuns et al, 2002). Higher redshift quasars would probe the neutral Inter Galactic Medium at this epoch.

The key areas of AstroGrid functionality required are:

  • Access to large scale optical and near-IR surveys, especially those in the IR to be provided by UKIDSS
  • Selection of candidate samples in colour-colour space via comparison with model predictions (c.f. optical techniques as described for the SDSS survey by Richards et al, 2002).

A comprehensive flow of events is contained in the High Z Quasars sequence diagramme wiki area.

(2.5.7) The Solar-Stellar Flare Comparison

Flare stars are generally low temperature red, M-class, dwarf stars. Our Sun also experiences flares, and these are related in some poorly understood fashion to Coronal Mass Ejections (CMEs). Schaefer et al, 2000 have noted that a number of nearby solar type (F-G) stars have undergone superflare events, with the energy in the flares >100 times the most energetic measured from our Sun. The census of stars with 'superflares' is incomplete due to the difficulty in collating the various data sources for nearby flaring stars. This case will aid in providing a full sample of superflare stars. The linkage of CMEs to flares could be studied by searching for evidence of CMEs in the sample of superflare stars. One technique is to look for absorption in, for instance, Si UV lines during a CME event for those flare stars in binary systems with a hot white dwarf (see e.g. Bond et al, 2001).

The key areas of AstroGrid functionality required are:

  • Localisation of flare stars from the literature
  • Lightcurve generation for flare stars from published photometry - estimation of energy in the flare
  • Determine availability of high res UV spectra for stars identified as having super flares.
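
To illustrate the light curve step, the energy in a flare can be estimated by integrating the excess luminosity above the quiescent level over the duration of the event. The Python sketch below uses the trapezoidal rule and assumes the photometry has already been converted to luminosity (which requires a distance); the function names and the externally supplied solar reference energy are assumptions for this example only.

```python
def flare_energy_erg(times_s, luminosity_erg_s, quiescent_erg_s):
    """Integrate the luminosity excess above quiescence over time.

    times_s: sample times in seconds (increasing).
    luminosity_erg_s: luminosity at each sample in erg/s.
    Negative excesses are clipped to zero. Trapezoidal rule.
    """
    if len(times_s) != len(luminosity_erg_s):
        raise ValueError("times and luminosities must have the same length")
    excess = [max(0.0, lum - quiescent_erg_s) for lum in luminosity_erg_s]
    energy = 0.0
    for k in range(1, len(times_s)):
        dt = times_s[k] - times_s[k - 1]
        energy += 0.5 * (excess[k] + excess[k - 1]) * dt
    return energy


def is_superflare(energy_erg, solar_reference_erg, factor=100.0):
    """True if the flare exceeds `factor` times a reference solar flare energy.

    The factor of 100 follows the description in this section; the reference
    energy itself must be supplied by the user.
    """
    return energy_erg > factor * solar_reference_erg
```

For a simple triangular flare peaking at 2e30 erg/s above quiescence and lasting 200 s, this gives an energy of about 2e32 erg.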

A comprehensive flow of events is contained in the Solar Stellar Flare Comparison sequence diagramme wiki area.

(2.5.8) Deciphering Solar Coronal Waves

There is a current debate as to whether large scale coronal waves and chromospheric waves (Moreton, 1961) are related. Moreton waves were found to propagate to large distances from a solar flare site with velocities ranging from a few hundred to several thousand km/s. Due to the high speeds observed, it was assumed that the origin of the Moreton wave was in the corona and not in the chromosphere. Coronal waves were first observed by the EUV Imaging Telescope (EIT) onboard the Solar and Heliospheric Observatory (SOHO) spacecraft (Thompson et al, 1999). They appeared in difference images as a bright front with a following dimmed or depleted region of the corona, with propagation speeds of a few hundred km/s.

A key goal is to determine whether coronal waves are MHD fast mode waves originating from a solar flare site, or if they are a global coronal mass ejection lifting off the surface of the disk. This will be achieved by searching for flares, preferably occurring on disk centre, isolating the times, and then finding the necessary datasets (EUV and Halpha spectra, EUV/SXR imaging).

The key areas of AstroGrid functionality required are:

  • Localisation of coronal wave via image subtraction techniques
  • Discovery of supporting multi-wavelength observational data sets covering correct spatial, temporal space (e.g. Zhang et al, 2001).
  • Visualisation of flare, wave datasets
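
The image subtraction step can be illustrated with a minimal Python sketch. It assumes two co-aligned images of equal size held as nested lists of pixel values; the function names and the thresholding are assumptions for this example, and real EIT data would first need registration and exposure normalisation.

```python
def difference_image(later, earlier):
    """Pixel-by-pixel difference (later minus earlier) of two aligned images."""
    if len(later) != len(earlier) or any(
            len(a) != len(b) for a, b in zip(later, earlier)):
        raise ValueError("images must have identical dimensions")
    return [[b - a for a, b in zip(row_e, row_l)]
            for row_e, row_l in zip(earlier, later)]


def bright_pixels(diff, threshold):
    """(row, col) of pixels that brightened by more than threshold: the wave front."""
    return [(r, c) for r, row in enumerate(diff)
            for c, value in enumerate(row) if value > threshold]


def dimmed_pixels(diff, threshold):
    """(row, col) of pixels that dimmed by more than threshold: the depleted region."""
    return [(r, c) for r, row in enumerate(diff)
            for c, value in enumerate(row) if value < -threshold]
```

Tracking the bright front positions through a sequence of such difference images would give the propagation speed of the wave.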

A comprehensive flow of events is contained in the Solar Coronal Waves sequence diagramme wiki area.

(2.5.9) Linking Solar and STP Events

Solar models are currently used to predict STP events as impacting on the local solar-earth space environment. This information can be used to advise the telecommunications and power industries of geomagnetic disturbances (e.g. via the US's Space Environment Center's Space Weather page at http://www.sel.noaa.gov/today.html). Solar events such as flares, CMEs, and the progression of the solar cycle can cause electromagnetic disturbances in the Earth's magnetosphere. Satellites, radio and television broadcasts, and mobile telephones all experience service interruptions during periods of high solar activity.

Several existing models take solar activity parameters (i.e., time, duration, location, and intensity of events) as input and predict the resulting STP events that will occur in the Earth's magnetosphere, e.g. Geomagnetic Storms. The solar models, solar datasets used as input, and STP datasets used to verify output predictions, are not, however, accessible from a single interface. Models include the Relativistic Electron Forecast and Wang Sheeley models from NOAA.

A key goal is to provide a selection of these models as AstroGrid web services. An individual model can be tested with several solar datasets to compare modelled predictions with actual STP datasets during different stages of solar activity. Also, one solar dataset may be chosen as an input to several models in order to ascertain which model most closely predicts STP events during a given time period.

The key areas of AstroGrid functionality required are:

  • Provide unified point of web service access to distributed models
  • Capture relevant STP data to compare with predictions from models

A comprehensive flow of events is contained in the Solar STP Event Coincidence sequence diagramme wiki area.

(2.5.10) Geomagnetic Storms and their Impact on the Magnetosphere

Study of the morphology of the tail of the Earth's magnetosphere during the onset of geomagnetic storms is important in understanding the processes involved, and the impact of the storm on the magnetosphere. Geomagnetic storms can influence the performance of satellite systems such as the GPS (e.g. Skone & de Jong, 2000) and, in severe cases, impact power transmission (see e.g. http://www.mpelectric.com/storms/). The observational data can be compared against models of the magnetosphere (e.g. Raeder et al, 2001).

The key areas of AstroGrid functionality required are:

  • Determine temporal location of storm
  • Retrieve list of in-situ satellites with suitable instrumentation located in the magnetosphere during the relevant time periods.
  • Conversion of the position data to a defined coordinate system and the magnetic field data to specific units. The appropriate coordinate system will depend on the application.
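Two of the steps listed above, locating the storm in time and converting magnetic field data to specific units, can be sketched briefly. The sketch assumes an hourly storm index series and flags contiguous intervals where the index falls to or below a threshold; the -50 value is an illustrative assumption, not a figure from this report, and the function names are hypothetical. The unit conversion uses the standard relation 1 gauss = 100 000 nT.

```python
def storm_intervals(times, dst, threshold=-50.0):
    """Return (start, end) time pairs where dst stays at or below threshold.

    The threshold default is an illustrative assumption.
    """
    intervals = []
    start = None
    last_below = None
    for t, value in zip(times, dst):
        if value <= threshold:
            if start is None:
                start = t          # a new disturbed interval begins
            last_below = t
        elif start is not None:
            intervals.append((start, last_below))
            start = None
    if start is not None:          # series ended while still disturbed
        intervals.append((start, last_below))
    return intervals


def gauss_to_nanotesla(value_gauss):
    """Convert a magnetic field strength from gauss to nanotesla."""
    return value_gauss * 1.0e5


# Illustrative hourly index values (hours 0..7); not real measurements.
hours = list(range(8))
index_values = [-10, -30, -60, -120, -80, -40, -20, -55]

intervals = storm_intervals(hours, index_values)
field_nt = gauss_to_nanotesla(0.5)
```

The returned intervals could then drive the search for in-situ satellites that were in the magnetosphere during those periods.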

A comprehensive flow of events is contained in the Magnetic Storm Onset sequence diagramme wiki area.

(2.6) Providing Functionality as Indicated by the AstroGrid Ten

The ten science drivers itemised in section 2.5 above cover a number of representative science topics. They demand the development of a range of functionality by AstroGrid.

(2.6.1) The Required Capabilities of AstroGrid

As described above, the decomposition of each of the science use cases, together with a consideration of the minimum generic infrastructure required by the project architecture, leads to the required set of system use cases. Analysis of these leads to the derivation of the requirements for the software development (as discussed in the Phase-B Plan). The main functionality areas that will be provided by the AstroGrid project are described in the ArchitectureOverview section of the RedBook.

| Science Case | Science Cat | Advanced Algorithms | Astronomical Query Language | Compute Intensive | Database Access | Data Mining | IAA | AstroOntology and WorkFlow | Resource Location |
| Brown Dwarfs | A-S | y | y | y | y | Y | y | Y | y |
| LSB Galaxies | A-G | y | y | Y |  |  | y |  | y |
| SN Environment | A-G | y | y | Y | Y |  | y |  | Y |
| Deep Fields | A-C | Y | Y | y | Y |  | y | y | y |
| Galaxy Clusters | A-C | Y | y | Y | Y | Y | Y |  |  |
| Hi-z QSO's | A-C | Y | y | y | y | Y | Y |  |  |
| Solar/Stellar Flares | A-S/S | y | Y | y |  | Y | y | Y | Y |
| Coronal Waves | S | y | y | y | y | y | y |  | Y |
| STP/Solar Events | S-STP | y | y |  | Y | y | y | Y | Y |
| Magnetic Storms | STP |  | Y |  | Y | Y | y | Y | Y |

Science Cat = Science category: A-S (Astronomy: Stellar), A-G (Astronomy: Galaxy), A-C (Astronomy: Cosmology), S (Solar), STP (Solar-Terrestrial Physics). A matrix element marked 'Y' indicates that this area will be important, whilst 'y' indicates a lesser degree of importance.

The analysis of the processes involved in meeting the requirements set by the science drivers shows that a number of areas are highlighted, with the relative importance of these areas varying from case to case.

For instance, the MagneticStormOnset case does not deal with large data volumes. Rather, the problem is one of discovering data from a number of dispersed and heterogeneous data sets, where data streams need to be isolated according to the spatial and temporal position of the in-situ detectors. Non-standard data sets need to be addressed, as the STP data is often of the 'pen-plotter' variety, with many differing variables being monitored by on-flight detectors measuring in-situ flows (e.g. the magnetic field at a point in space).

The Galaxy Clusters use case involves the manipulation of large multi-TB scale multi-wavelength data sets, and the ability to run sophisticated algorithms on the pixel and catalogue data. The extension to this case additionally will require tools to directly compare the observational cluster data with outputs from theoretical models.

(2.6.2) Specific Targets for Data Manipulation Tasks

The AstroGrid Ten set demands upon the capabilities that must be provided in order to service these science cases. The following table lists example data manipulation problems which researchers would wish to complete in a certain time by using Phase-B AstroGrid deliverables.

| Science Case | Science Cat | Data Manipulation Problem | Completion Time |
| Brown Dwarfs | A-S | Perform complex query on 1000 deg^2 of multi-colour (e.g. 6 band, Sloan & UKIDSS) data to return a sample of 1000+ Brown Dwarf candidates, plus postage cutouts of pixel data (~20Mb pixel data per cutout). For each candidate return a proper motion estimate via astrometry of multi-temporal data sets. | One hour |
| LSB Galaxies | A-G | Select candidates from IR data sets (e.g. 1000 sq deg of UKIDSS Large Area Survey data), cross match with Sloan optical, perform background analysis, determine luminosity calculations | One hour |
| SN Environment | A-G | For search fields, classify galaxies, determine redshifts (literature search, spectral or photometric methods) | 30 mins |
| Deep Fields | A-C | Automatic registration and calibration of multicolour data sets (~10 wavebands, few x 100Mb/band), object identification, cross identification and association | One hour |
| Galaxy Clusters | A-C | Generate cluster maps from smoothing functions applied to red early type galaxies identified in catalogue data (e.g. VST, UKIDSS). Assign redshifts to clusters via photometric techniques, compare sample with model outputs as function of z | Four hours |
| Hi-z QSO's | A-C | Perform complex query on 2000 deg^2 of multi-colour (e.g. 6 band, Sloan & UKIDSS) data to return a sample of 1000+ QSO candidates, plus postage cutouts of pixel data (~20Mb pixel data per cutout). Ability to cross reference the sample QSO set against X-ray, Radio, NED, catalogue lists. | One hour |
| Solar/Stellar Flares | A-S/S | Recover flare photometry for all flare stars within 25pc of the solar neighbourhood, generate estimates of the energy in the flare, localise high resolution UV spectra for candidate sample | Two hours |
| Coronal Waves | S | Localise previous three months of solar imaging data, process images to determine where and when coronal waves occurred. Retrieve all supporting observational data with coverage for these waves | Two hours |
| STP/Solar Events | S-STP | Provide unified point of web service access to distributed models, capture relevant STP data to compare with predictions from models, test individual model via comparison to various solar datasets | One hour |
| Magnetic Storms | STP | Determine period of geomagnetic storm from Dst index, locate satellite data, return in geocentric coordinate space | 30 mins |
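To give a feel for what these targets imply, the Brown Dwarfs row can be turned into a rough sustained data rate. The sketch below takes the table's figures at face value (1000 cutouts as a lower bound, about 20Mb each, one hour) and keeps the unit "Mb" exactly as written in the table; it ignores query and processing time, so it is a back-of-the-envelope lower bound rather than a performance requirement. The function name `required_rate` is hypothetical.

```python
def required_rate(n_items, size_per_item, seconds):
    """Average transfer rate needed to deliver n_items within the time limit."""
    return n_items * size_per_item / seconds


# Figures from the Brown Dwarfs row: 1000+ cutouts, ~20Mb each, one hour.
n_cutouts = 1000          # lower bound, the table says "1000+"
cutout_size = 20          # in the table's unit, "Mb"
time_limit = 60 * 60      # one hour in seconds

total_volume = n_cutouts * cutout_size                    # same unit as cutout_size
rate = required_rate(n_cutouts, cutout_size, time_limit)  # unit per second
print(f"total volume: {total_volume} Mb, sustained rate: {rate:.2f} Mb/s")
```

The same function applies to the other rows wherever the table gives both a data volume and a completion time.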

(2.6.3) The AstroGrid System: Core Functions and User Add-Ons

The evolving Phase-B plans detail AstroGrid's roadmap for Phase-B of the project, to deliver software which will enable the creation of a working, grid-enabled Virtual Observatory (VO) based around key UK astronomical data centres. Activity will be concentrated on the following areas:

  • Component Services
    This is the main activity of the build phase and concerns the building of the web and grid service-based components from which the VO will be constructed.
  • Library Services
    These are also service-based components but will be wrappers or interfaces to existing tools or libraries.
  • Portal & Client Programs
    These are stand-alone programs with which the user will interact; they will make use of the service components defined above.
  • Demonstrations
    These activities are technology trials required to test whether certain technologies work the way we require, or to check how areas of research can be exploited.
  • Research
    This activity is, as it says, research into areas which are still insufficiently understood to be incorporated into the system.
  • Test Implementations
    This involves the implementation of working versions of the software in one or more data centres to test the feasibility of the software being developed.

The end user will interact with the system through the AstroGrid Portal & Client Programs and be able to manipulate key data sets held both by AstroGrid and by others accessible via compatible interfaces (e.g. those made available by the AVO, NVO). Manipulation of these data sets will be possible by the application of the key AstroGrid library of services and tools (e.g. image classifiers, database manipulation tools, etc.).

Running User Provided Algorithms

However, it is recognised that the work flow of many astronomical tasks indicates that the individual researcher will want to apply some forms of algorithmic processing to their data set that are not provided for in the core AstroGrid offer. AstroGrid will devote effort to making available existing software packages such as Starlink, IRAF, AIPS++ and SolarSoft within the system. AstroGrid's Phase-B architecture will also support a limited ability for users to integrate the use of their specific processing algorithms into the system. This capability is foreseen to be implemented in the late 2004 iteration of the project. For those cases where the individual user code cannot be uploaded into the system, the user will of course be able to access their data products held by the AstroGrid system and process these data locally.

(2.6.4) Data Sources for AstroGrid

As stated in the original project submission for AstroGrid, the scientific focus is on exploiting key data or resources held by the UK data centres. For instance, the Brown Dwarf Selection case would mine SuperCOSMOS, UKIDSS and WFS data as primary data sources. These UK sourced data sets held by the AstroGrid consortium data centres are listed in full at http://wiki.astrogrid.org/bin/view/Astrogrid/DataCentres. They represent a heterogeneous set of data, covering a wide range of wavelengths from Radio (e.g. Merlin at Jodrell) to X-Ray (e.g. XMM-Newton at Leicester). Limited access to model data will be provided; initial conversations are beginning in this area. AstroGrid will additionally enable access to many data sets held elsewhere. This will include data sets held by AstroGrid's partners in the AVO consortium, including data held by ESO, and catalogue data accessible through the CDS-Strasbourg.

(2.7) The AstroGrid Science Input into Partner Virtual Observatory Projects

(2.7.1) AstroGrid and the Astrophysical Virtual Observatory

AstroGrid has adopted an approach based on suitable scoping of a widely drawn set of virtual observatory science drivers. AstroGrid is also part of the European Astrophysical Virtual Observatory project. The AVO is currently undertaking its three year Phase-A study which will lead to a fully fledged proposal to develop a facility class Euro-VO.

A key area of the AVO is its Science work area. Input to the Science WA is provided by the Science Working Group (SWG), whose membership is drawn from the European astronomical community.

At its second meeting a sub-group of the SWG, chaired by the AstroGrid project scientist, was formed to derive the science requirements for the AVO's initial science/technology demonstrator product of January 2003. Development of the AVO demonstrator is now progressing, with AstroGrid taking a lead role in the definition and production of the web services aspects of the demo. Specifically, AstroGrid is working to make available an image extractor (SExtractor, see Bertin & Arnouts, 1996) as a web service which will enable on-the-fly, user-defined cut-outs and re-extractions of distributed GOODS data sets. This work is described at http://wiki.astrogrid.org/bin/view/Astrogrid/AVODemo.

(2.7.2) AstroGrid and the National Virtual Observatory

Because AstroGrid has consortium partners involved in Astronomy, Solar Astrophysics and STP, its scientific remit is somewhat more extensive than that of the NVO, as outlined in the Science Definition Team's report dated April 2002. However, AstroGrid is more constrained by its shorter three year project timescale compared to the five year NVO project span. Thus the AstroGrid project has based its approach on supporting rather specific programmes to ensure that the project delivers a functional, although limited in scope, virtual observatory for the UK. AstroGrid is in regular contact with the NVO project, both formally through joint membership of the International Virtual Observatory Alliance, and informally through cross attendance at each other's project meetings. The AstroGrid science drivers show a limited degree of overlap with the current NVO set of drivers, for instance in the area of offering enhanced tools to analyse the low surface brightness universe. The possibility of the production of a diverse set of independently developed tools in this area is seen as being of benefit to the community. In general though, the AstroGrid science drivers are different to those of the NVO.

(2.7.3) AstroGrid and the International Virtual Observatory Alliance

The International Virtual Observatory Alliance (IVOA) is an alliance of virtual observatory initiatives. Its aim is to provide a forum in which the common elements, required to ensure that the systems developed by the various partners are interoperable with one another, can be identified and standards agreed upon. Most of these common elements have to do with standards for data, interfaces and perhaps ontologies and registries. Other common or shared elements may be in the form of software packages, source code libraries, and development tools. Some others have to do with issues of policy, funding and securing international support at governmental levels. The first significant milestone of the IVOA was the agreement and release of the VOTable interoperability standard.
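As an illustration of what a shared table format buys in practice, the sketch below reads a small, hand-written VOTable-style document using only the Python standard library. The document is deliberately simplified (no XML namespace, two columns, inline TABLEDATA) and the table contents are made up; the function name `read_votable_rows` is hypothetical, and this is not a full implementation of the standard.

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written example document; the values are illustrative only.
VOTABLE_XML = """<VOTABLE version="1.0">
  <RESOURCE>
    <TABLE name="candidates">
      <FIELD name="RA" datatype="double" unit="deg"/>
      <FIELD name="DEC" datatype="double" unit="deg"/>
      <DATA>
        <TABLEDATA>
          <TR><TD>10.5</TD><TD>-3.25</TD></TR>
          <TR><TD>11.0</TD><TD>4.75</TD></TR>
        </TABLEDATA>
      </DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""


def read_votable_rows(xml_text):
    """Return the first table as a list of {field name: float value} dicts."""
    root = ET.fromstring(xml_text)
    table = root.find("./RESOURCE/TABLE")
    # FIELD elements describe the columns, in order.
    names = [field.get("name") for field in table.findall("FIELD")]
    rows = []
    for tr in table.findall("./DATA/TABLEDATA/TR"):
        values = [float(td.text) for td in tr.findall("TD")]
        rows.append(dict(zip(names, values)))
    return rows


rows = read_votable_rows(VOTABLE_XML)
```

Because column names and units travel with the data, a tool written by one project can consume a table produced by another without a private format agreement.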

AstroGrid is a founder member of the IVOA, which is currently composed of representatives from the major initiatives (i.e. AstroGrid, AVO and the US National Virtual Observatory (NVO)) and all other currently funded virtual observatory initiatives (i.e. eAstronomy Australia, Canadian Virtual Observatory, German Virtual Observatory, Russian Virtual Observatory, Virtual Observatory India). The Chair of the IVOA is Bob Hanisch (NVO), Deputy Chair: Peter Quinn (AVO), Technical Chair: Roy Williams (NVO), and Secretary: Nic Walton (AstroGrid).

In this international context, AstroGrid is stressing the importance of science drivers in shaping the development of the global VO initiatives. The importance of these drivers has been agreed upon, and is recognised in the IVOA's support for a roadmap of international development in which demonstrations of science driven capability are featured.

(2.8) The Evolutionary Path

At the completion of its Phase-B, AstroGrid aims to provide a fully functional Virtual Observatory capability, with a specific focus on meeting the scientific demands of its UK user community. The AstroGrid product is defined in scope by the science drivers listed in this document, and by the resulting minimum architecture needed to deliver the capabilities demanded by these drivers. The implementation plan by which the AstroGrid product will be delivered is described in the Phase-B plan at http://wiki.astrogrid.org/bin/view/Astrogrid/RbPhaseBPlan.

The AstroGrid framework is being constructed in a manner that will ensure that it is both durable and capable of further expansion to offer increased capabilities in the future. This philosophy is in line with the AstroGrid vision, where the focus is on producing an organic system that supports and facilitates scientific endeavour, rather than actually 'doing' the science. The design is being shaped with a view to minimising the operational costs of the VO system in the longer term.

Future UK virtual observatory initiatives would build on the AstroGrid product. It is clear that the AstroGrid system would provide the UK with a significant entry point into the planned Euro-VO initiative, whereby a facility class European virtual observatory will be created.

AstroGrid is providing capabilities in a number of areas, for example:

  • Creating the multi-space digital sky, whereby seamless access and manipulation is enabled to diverse data sets covering the sky
  • Allowing seamless integration of model and observational data sets
  • Providing discovery and access to necessary compute and data storage assets

However, many exciting and challenging issues will need to be addressed in future virtual observatory initiatives, including:

  • Allow the creation of 'topic-specific' workspaces, giving access to all data and tools relevant to a certain astrophysical problem
  • Facilitate the creation of aided work-flows, whereby the user is able to construct their personalised data pipeline, using VO components, in a manner where the system provides sophisticated guidance.
  • Provide a means whereby the outputs of data manipulations can be automatically fed back into the operations of telescopes, both for real-time and ordinary observational programmes
  • Increase the offer of powerful visualisation capabilities, especially in the domain of multi-dimensional visualisation, via the application of technical advances in immersive computing.
  • Ultimately provide for a dynamic, self accreting digital sky, whereby all global astronomical observational endeavour is captured, tagged for quality and ingested, for future manipulation and analysis by the global community.

AstroGrid, or its UK successors, will play a key role in providing solutions to these issues.

(2.9) Closing Remarks

AstroGrid is a modest three year project which has analysed a wide range of possible science drivers for the creation of a virtual observatory. It has defined a key set of drivers - the AstroGrid Ten - and is using these as the basis upon which it determines the capabilities that it will produce in its Phase-B. It is noted that the science drivers will require the construction of a sophisticated system capable of providing access to significant heterogeneous peta-byte scale data sets located in the UK and elsewhere. Tools to discover and manipulate these data will significantly aid the research community. In particular, AstroGrid will increase both the efficiency and effectiveness of the UK astronomer, and enable them to devote more time to the vital task of understanding the physical processes at work as revealed by the results from their data manipulations. This will undoubtedly lead to significant advances in the scientific productivity of the UK astronomical community.

This science summary closes by noting that with the completion of AstroGrid's Phase-B, the UK will be well positioned for a leadership role in future larger scale European virtual observatory initiatives.

(2.10) References

(Astronomy and Geophysics) The Royal Astronomical Society's in-house journal - see A&G at Blackwell

(AstroVirtel) http://www.stecf.org/astrovirtel/ and accepted proposals at http://archive.eso.org/wdb/wdb/vo/avt_prop/query

(AVO) The Astrophysical Virtual Observatory - a three year EC funded programme charged with mapping out the structure of a facility class virtual observatory for Europe. See http://www.eso.org/avo

Basri, G, 2000 ARA&A, 38, 485, 'Observations of Brown Dwarfs'

Bertin and Arnouts, 1996, A&AS, 117, 393, 'SExtractor: Software for source extraction'

Bode et al, 2001, ApJ, 556, 93, 'Halo Formation in Warm Dark Matter Models'

Bond et al, 2001, ApJ, 560, 919, 'Detection of Coronal Mass Ejections in V471 Tauri with the Hubble Space Telescope'

Dalton et al, 1992, ApJ, 390, 1, 'Spatial correlations in a redshift survey of APM galaxy clusters'

Dunlop et al, 2002, MNRAS, in press, 'Discovery of the host galaxy of HDF850.1, the brightest sub-mm source in the Hubble Deep Field'

(EGSO) European Grid of Solar Observatories - an EU IST funded programme. See http://www.mssl.ucl.ac.uk/grid/egso/

Eke et al, 2000, MNRAS, 315, 18, 'The cosmological dependence of galactic specific angular momenta'

(Euro-VO) The current working title for the European Virtual Observatory initiative (shortly at http://www.euro-vo.org). AstroGrid is providing vital scientific and technical input into the development of this future programme.

(Frontiers) PPARC's in-house magazine. A number of articles describing aspects of the AstroGrid project and Virtual Observatory initiatives have been published: http://www.pparc.ac.uk/frontiers

Gal et al, 2000, AJ, 119, 12, 'The Northern Sky Optical Cluster Survey. I. Detection of Galaxy Clusters in DPOSS'

Gladders & Yee, 2000, AJ, 120, 2148, 'A New Method For Galaxy Cluster Detection. I. The Algorithm'

The Great Observatories Origins Deep Survey (GOODS) is a public, multiwavelength survey that will cover two 150 arcmin^2 fields. These fields are centred around the HDF-N (Hubble Deep Field North) and the CDF-S (Chandra Deep Field South): see http://www.eso.org/goods.

Impey & Bothun, 1997, ARA&A, 35, 267, 'Low Surface Brightness Galaxies'

The International Virtual Observatory Alliance. Its home page will shortly be found at http://www.ivoa.net. The IVOA Mission and Roadmap is located at http://wiki.astrogrid.org/bin/view/IVOA/RoadMap

Martin et al, 2000, 543, 299, 'Membership and Multiplicity among Very Low Mass Stars and Brown Dwarfs in the Pleiades Cluster'

Nagamine et al, 2001, ApJ, 558, 497, 'Star Formation History and Stellar Metallicity Distribution in a Cold Dark Matter Universe'

(NAM2002) National Astronomy Meeting 2002 in Bristol, see http://www.star.bris.ac.uk/nam

(NVO) The US National Virtual Observatory project: http://www.us-vo.org

Perlmutter et al, 1999, ApJ, 517, 565, 'Measurements of Omega and Lambda from 42 High-Redshift Supernovae'

Raeder et al, 2001, Solar Physics, 204, 323, 'Global Simulation of Magnetospheric Space Weather Effects of the Bastille Day Storm'

(RAS AGM) Royal Astronomical Society AGM 2002 at http://www.ras.org.uk/

"Relativistic Electron Forecast Model", USAF and NOAA Space Environment Center, http://www.sec.noaa.gov/refm/

Richards et al, 2002, AJ in press, 'Spectroscopic Target Selection in the Sloan Digital Sky Survey: The Quasar Sample'

Schaefer et al, 2000, ApJ, 529, 1026, 'Superflares on Ordinary Solar-Type Stars'

(SDT) National Virtual Observatory Science Definition Team (SDT) - final report at http://nvosdt.org/sdt-final.pdf

Skone & de Jong, 2000, Earth Planets Science, 52, 1067, 'The impact of geomagnetic substorms on GPS receiver performance'

(SpaceGRID) An ESA funded programme to investigate possible uses of Grid technology in supporting the scientific and technical functions of ESA: http://spacegrid.esa.int

SuperCOSMOS Sky Survey: http://www-wfau.roe.ac.uk/sss/

Theuns et al, 2002, ApJ, 574, 111, 'Detection of He II Reionization in the Sloan Digital Sky Survey Quasar Sample'

Thompson et al, 1999, ApJ, 517, 151, 'SOHO/EIT Observations of the 1997 April 7 Coronal Transient: Possible Evidence of Coronal Moreton Waves'

(UKIDSS) UKIRT Infrared Deep Sky Survey

"Wang Sheeley Model", USAF and NOAA Space Environment Center, http://www.sec.noaa.gov/ws/

(WFS) The Isaac Newton Group's Wide Field Survey programme: http://www.ast.cam.ac.uk/~wfcsur/index.php

(XMM-SSC) XMM-Newton Survey Science Centre: http://xmmssc-www.star.le.ac.uk/

Zhang et al, 2001, ApJ, 559, 452, 'On the Temporal Relationship between Coronal Mass Ejections and Flares'

-- NicholasWalton - 27 Dec 2002
