Research astronomer
To compute the number density of some species of object according to an appropriate measure in some data space.
This could be almost anything, since the problem is very generic, so I'm not sure whether it is really a ScienceProblem or a UseCase.
For definiteness, consider doing this with a federation of the Early Data Release of the Sloan Digital Sky Survey and the data for the same region from the SuperCOSMOS Sky Survey.
An astronomer wants to select a sample of objects according to some search criteria, which might be a fairly complicated function of the attributes of the objects stored in one or more catalogues - e.g. the ScienceProblems HaloWhiteDwarfs and BrownDwarfSelection.
The astronomer formulates and runs the appropriate query, and is returned a result set listing, say, 100 objects that satisfy the criteria. In many scientific applications, this number by itself is of limited use. What the astronomer really needs is the number density of such objects, which entails computing the selection function of the query in addition to returning the list of objects that satisfy the selection criteria.
At the very least, this reduces to knowing the area of sky over which objects could have been selected as satisfying these criteria, so that the astronomer knows that there are, say, ten such objects per square degree. In the general case, where the selection criteria might define a complicated polyhedron in a multi-dimensional data space spanned by attributes (e.g. fluxes in different bands) stored in several survey catalogues, this will be very challenging, and will require quite sophisticated metadata describing the survey coverage in each passband. For example, in optical astronomy one often masks out regions around bright stars when performing source extraction. In this case it is not enough to record the total area masked out of a given survey area: the positions of the masked regions must be recorded too, so that the masked area within the overlap with some other survey region(s) can be computed.
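As a rough illustration of the area bookkeeping described above, here is a minimal Python sketch. It assumes that each survey footprint is a simple RA/Dec box with no RA wrap-around and that each bright-star mask is a circle; the footprints, mask positions and function names are all hypothetical, and real coverage metadata would be far messier. The unmasked area is estimated on a grid of cells that are equal-area in (RA, sin Dec).

```python
import math

SQ_DEG_PER_SR = (180.0 / math.pi) ** 2

def box_area_sq_deg(box):
    """Solid angle of an (ra_min, ra_max, dec_min, dec_max) box, in square degrees."""
    ra_min, ra_max, dec_min, dec_max = box
    d_sin = math.sin(math.radians(dec_max)) - math.sin(math.radians(dec_min))
    return math.radians(ra_max - ra_min) * d_sin * SQ_DEG_PER_SR

def box_overlap(a, b):
    """Intersection of two RA/Dec boxes, or None if they do not overlap."""
    ra_min, ra_max = max(a[0], b[0]), min(a[1], b[1])
    dec_min, dec_max = max(a[2], b[2]), min(a[3], b[3])
    if ra_min >= ra_max or dec_min >= dec_max:
        return None
    return (ra_min, ra_max, dec_min, dec_max)

def separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    p1, p2 = math.radians(dec1), math.radians(dec2)
    d_ra = math.radians(ra2 - ra1)
    h = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(d_ra / 2) ** 2
    return math.degrees(2 * math.asin(math.sqrt(min(1.0, h))))

def effective_area_sq_deg(box, masks, n=200):
    """Unmasked area of `box`, estimated from an n x n grid of equal-area cells.

    `masks` is a list of (ra, dec, radius) circles, all in degrees.
    """
    ra_min, ra_max, dec_min, dec_max = box
    s_min = math.sin(math.radians(dec_min))
    s_max = math.sin(math.radians(dec_max))
    kept = 0
    for i in range(n):
        ra = ra_min + (i + 0.5) * (ra_max - ra_min) / n
        for j in range(n):
            dec = math.degrees(math.asin(s_min + (j + 0.5) * (s_max - s_min) / n))
            if not any(separation_deg(ra, dec, m_ra, m_dec) < r for m_ra, m_dec, r in masks):
                kept += 1
    return box_area_sq_deg(box) * kept / (n * n)

# Hypothetical footprints and bright-star masks (ra, dec, radius), all in degrees.
survey_a = (0.0, 10.0, -1.0, 1.0)
survey_b = (5.0, 15.0, -1.0, 1.0)
masks = [(7.5, 0.0, 0.5), (2.0, 0.0, 0.5)]  # only the first lies inside the overlap

overlap = box_overlap(survey_a, survey_b)
area = effective_area_sq_deg(overlap, masks)
density = 100 / area  # 100 selected objects, as in the example above
```

Only the mask that falls inside the overlap reduces the effective area, which is why the total masked area of each survey is not sufficient on its own.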
At present this is done only for relatively straightforward situations (e.g. a combination of a colour cut and a flux limit), where the selection function can be computed fairly readily, given a sufficiently good description (not always available) of the survey coverage.
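The straightforward case might look something like the following sketch. The attribute names (`g`, `r`), the cut values and the function name are hypothetical, the usable area is assumed to be already known from the coverage description, and the quoted uncertainty is just the Poisson error on the count.

```python
import math

def number_density(catalogue, area_sq_deg, r_limit=19.0, min_g_minus_r=1.2):
    """Apply a flux limit and a colour cut, then return (density, Poisson error).

    `catalogue` is a list of dicts holding hypothetical magnitudes "g" and "r";
    `area_sq_deg` is the area over which such objects could have been selected.
    """
    selected = [
        obj for obj in catalogue
        if obj["r"] <= r_limit and obj["g"] - obj["r"] >= min_g_minus_r
    ]
    n = len(selected)
    return n / area_sq_deg, math.sqrt(n) / area_sq_deg

# Made-up objects, purely for illustration.
catalogue = [
    {"g": 20.0, "r": 18.5},  # passes both cuts
    {"g": 19.0, "r": 18.5},  # too blue
    {"g": 21.5, "r": 19.5},  # fainter than the flux limit
    {"g": 20.5, "r": 18.0},  # passes both cuts
]
density, error = number_density(catalogue, area_sq_deg=2.0)
```

This is the easy part; the hard part remains obtaining a trustworthy value for the area.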
It's not clear how the VO should go about this, but delivering this functionality is essential to many of the statistical queries that are assumed to be at the heart of the VO.
The crux of this is having a sufficiently good description of the coverage of every survey over which the query is run. If that is available then this problem reduces, in principle, to nothing more than some coordinate geometry (or equivalent, using spatial indices in a scheme like HEALPix or HTM) to compute the intersection of a set of sky regions. In practice, any variation in survey depth will complicate this greatly.
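To make the pixel-based version of this concrete, here is a minimal sketch in Python. It uses a toy equal-area grid in (RA, sin Dec) as a stand-in for a real scheme; it is not HEALPix or HTM, and the grid size, the function names and the idea of storing one limiting magnitude per pixel are all assumptions made for illustration. Each survey's coverage is a mapping from pixel index to limiting magnitude, and a pixel counts towards the usable area only if every survey covers it to at least the depth the query needs.

```python
import math

# Toy pixelisation: N_RA x N_Z cells, equal-area in (RA, sin Dec).
# This is NOT HEALPix or HTM, just a simple stand-in for a spatial index.
N_RA, N_Z = 3600, 1800
FULL_SKY_SQ_DEG = 4.0 * math.pi * (180.0 / math.pi) ** 2
PIXEL_AREA_SQ_DEG = FULL_SKY_SQ_DEG / (N_RA * N_Z)

def pixel_index(ra_deg, dec_deg):
    """Index of the toy-grid pixel containing the given position."""
    i = min(int((ra_deg % 360.0) / 360.0 * N_RA), N_RA - 1)
    z = math.sin(math.radians(dec_deg))
    j = min(int((z + 1.0) / 2.0 * N_Z), N_Z - 1)
    return j * N_RA + i

def usable_area_sq_deg(coverages, mag_limits):
    """Area covered by every survey to at least the required depth.

    `coverages` is a list of dicts mapping pixel index -> limiting magnitude;
    `mag_limits` gives the faintest magnitude the query needs in each survey.
    """
    common = set(coverages[0])
    for cov in coverages[1:]:
        common &= set(cov)
    usable = [
        p for p in common
        if all(cov[p] >= m for cov, m in zip(coverages, mag_limits))
    ]
    return len(usable) * PIXEL_AREA_SQ_DEG

# Hypothetical coverage maps: survey B is shallower over part of the overlap.
cov_a = {p: 22.0 for p in range(0, 1000)}
cov_b = {p: (20.5 if p < 600 else 19.0) for p in range(500, 1500)}
area = usable_area_sq_deg([cov_a, cov_b], mag_limits=[21.0, 20.0])
```

Here the two surveys share 500 pixels, but only 100 of them are deep enough in survey B, so a simple intersection of footprints would overestimate the usable area fivefold.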
--
BobMann - 18 Feb 2002