Multi-Condition Catalogue Queries
This is essentially ADQL access to catalogues, and could be achieved by extending the number of data sets which are accessible via ADQL. Most of the use cases so far need only very simple constraints on the values of columns, so the fact that only a limited set of SQL functions is available should not be a problem.
There are ways to make the queries more efficient (for the user!), some of which arise from (but go beyond) VizieR capabilities.
Examples of use cases (feedback from workshops, NAM etc.):
- Finding all objects from any catalogue in a given redshift range
- Finding a list of all galaxies with velocity dispersions which also have X-ray and radio observations
Extracted requirements:
- Directly identify all catalogues containing particular columns e.g. a redshift column
- Simply searching for 'redshift' is not the same, since this might be in the general description, e.g. 'star-formation rates of high-redshift sub-mm sources' without giving specific redshifts.
- Searching column names is the next step, but ideally all columns should have UCDs, since that is the only way to be inclusive (yes, they can still be slightly ambiguous, but less so than anything else presently available).
- As a future refinement, could we automatically run all harvested tables without UCDs through the UCD generator? Duncan L-G was working on something like this for LEDAS?
- Sometimes catalogues contain 'implicit' columns, e.g. 'A list of 21-cm flux densities' where the column heading is simply 'Flux Density' and there is no 'wavelength' column, but the wavelength is implied; ideally this should also be indicated in some way which allows searching. These issues have been covered by the UCD group.
- Send the same query to multiple catalogues, with no cross-matching needed (e.g. Use Case i, the redshift-range search)
- The crudest method is for the user to save the query and edit it for each catalogue, and to have a STILTS utility to concatenate the resulting VOTables
- If there were a way of using UCDs then only the catalogue name would have to be changed
- It would also be necessary to specify which columns are to be returned (since 'all' might be hundreds). Here, UCDs and UCD fragments would be even more useful, e.g. the UCDs for position and for flux (which might be in any band).
- Find multiple properties for the same object in all catalogues (e.g. Use Case ii, galaxies with velocity dispersions plus X-ray and radio observations)
- This is initially similar to the above but the various properties might be in separate catalogues
- In Use Case ii, there are probably fewer catalogues containing measurements of velocity dispersions than catalogues containing radio or X-ray data. One approach would be for the user first to rank their constraints, most restrictive first. A series of queries (ideally automated) could then be:
- Get a list of all objects with (position, velocity dispersion).
- Send the positions as a multiple query to all X-ray catalogues (see MultipleObjectSearch).
- Create a federated list (position, velocity dispersion, X-ray properties) and discard those without X-ray.
- Repeat for Radio catalogues etc.
- This will be simpler when/if the suggested federation of the most comprehensive all-sky surveys is accomplished (USNOB+2MASS etc.).
- These cases can all be achieved by constraints like 'less than' and 'greater than' on single columns (for positional searches, 'circle' would be useful but not essential). There are other use cases which need more complicated functions and comparisons between columns in different catalogues, but those are not essential here.
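
As an illustration of the UCD-driven approach to Use Case i, the following is a minimal Python sketch, not an existing AstroGrid or STILTS facility. It assumes that each catalogue's column-to-UCD mapping has already been harvested into a dictionary. The table names, column names and the helper functions `columns_matching` and `build_queries` are all hypothetical. Given a UCD for the constrained quantity and a list of UCD fragments for the columns to return, it generates one simple ADQL query per catalogue that actually has the constrained column, so that only the catalogue name and its local column names change between queries.

```python
# Hypothetical harvested metadata: table name -> {column name: UCD}.
# Table names, column names and UCD assignments are illustrative only.
CATALOGUES = {
    "cat_a": {
        "RAJ2000": "pos.eq.ra;meta.main",
        "DEJ2000": "pos.eq.dec;meta.main",
        "S21": "phot.flux.density;em.radio",
        "z": "src.redshift",
    },
    "cat_b": {"ra": "pos.eq.ra", "dec": "pos.eq.dec", "redshift": "src.redshift"},
    "cat_c": {"ra": "pos.eq.ra", "dec": "pos.eq.dec", "Fx": "phot.flux;em.X-ray"},
}


def columns_matching(columns, ucd_fragment):
    """Return the names of columns with a UCD word starting with ucd_fragment."""
    return [
        name
        for name, ucd in columns.items()
        if any(word.startswith(ucd_fragment) for word in ucd.split(";"))
    ]


def build_queries(catalogues, constraint_ucd, low, high, return_ucds):
    """Build one ADQL query per catalogue that has a column with constraint_ucd.

    The constraint is a simple open range (low < column < high) on a single
    column; the returned columns are chosen by UCD fragment, not by name.
    """
    queries = {}
    for table, columns in catalogues.items():
        constrained = columns_matching(columns, constraint_ucd)
        if not constrained:
            continue  # this catalogue has no such column, so skip it
        column = constrained[0]  # assumption: take the first matching column
        selected = []
        for fragment in return_ucds:
            for name in columns_matching(columns, fragment):
                if name not in selected:
                    selected.append(name)
        if column not in selected:
            selected.append(column)  # always return the constrained column too
        queries[table] = (
            f"SELECT {', '.join(selected)} FROM {table} "
            f"WHERE {column} > {low} AND {column} < {high}"
        )
    return queries


# Use Case i: objects in a redshift range, returning positions and any fluxes.
queries = build_queries(CATALOGUES, "src.redshift", 0.5, 1.0, ["pos.eq", "phot.flux"])
for adql in queries.values():
    print(adql)
```

In this sketch `cat_c` is skipped because it has no redshift column, which is the behaviour wanted from a search on UCDs rather than on free-text descriptions. Taking the first matching column is a simplification; a real service would have to decide what to do when a catalogue carries several columns with the same UCD.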
--
AnitaRichards - 20 Apr 2006