r2 - 11 Sep 2002 - 14:33:58 - GuyRixon

Refactoring UCDs for better precision

UCDs are a wonderful idea: an agreed set of machine-readable standard names for physical and observable quantities. This is exactly what is needed to make interoperable data-files. VOTable could not exist without UCDs.

However, the historical origin of UCDs - as a description of actual, concrete tables held by CDS - has led to a hierarchical structure that is good, but could be better with some refactoring. The problem is that to get the greatest possible precision, the tree of UCDs needs a vast number of twigs, not all of which are currently available.

For example, there are POS_EQ_RA_MAIN and POS_EQ_RA_OTHER to distinguish between ruling and subsidiary columns of Right Ascension. For position angle (POS_ANG_DIST_*), the MAIN/OTHER distinction is not available. There is an ERROR UCD for measurement uncertainty, but no way to distinguish, say, error in RA from error in declination. There is a branch of the UCD tree for modelled parameters (MODEL_*), but this branch does not include UCDs for celestial positions derived from a model.

The underlying problem seems to be that many aspects of what a quantity means are independent of one another. These aspects should really be axes in a many-dimensional parameter space, but the current UCDs treat them instead as branches of a tree. Inevitably, the tree is either too "bushy" to be parsed comfortably, or has been pruned too severely to describe all the quantities; currently we have the pruned version.

I propose breaking the separable ideas in the UCD hierarchy out into their logical axes and allowing UCDs to be composed of sets of these atoms. E.g., POS_EQ_RA_MAIN becomes POS_EQ_RA/MAIN, where POS_EQ_RA and MAIN are atoms and can be recognised separately in parsing. (Technically, it's the lexical analyser that does the separation.) This allows us to construct "missing" UCDs like POS_EQ_RA/ERROR and POS_ANG_DIST/MAIN without introducing any new terms.

If the atoms are always parsed separately, the extra load on the parser need not be great. In fact, the number of terms it has to recognise will go down, not up.
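
As a minimal sketch of that lexical step, assuming atoms are joined by a single slash and checked one at a time against a flat vocabulary. The function names and the contents of KNOWN_ATOMS are illustrative only and are not part of any agreed UCD list.

```python
# Hypothetical vocabulary of atoms; a real one would be derived from the UCD tree.
KNOWN_ATOMS = {"POS_EQ_RA", "POS_EQ_DEC", "POS_ANG_DIST", "MAIN", "OTHER", "ERROR"}


def split_ucd(ucd: str) -> list[str]:
    """Break a composite UCD at the slashes and return its atoms in order."""
    return [atom for atom in ucd.split("/") if atom]


def unknown_atoms(ucd: str) -> list[str]:
    """Return the atoms of a composite UCD that are not in the vocabulary."""
    return [atom for atom in split_ucd(ucd) if atom not in KNOWN_ATOMS]


print(split_ucd("POS_EQ_RA/MAIN"))
print(unknown_atoms("POS_ANG_DIST/MAIN"))
```

With P primary terms and Q qualifiers, a flat tree needs up to P x Q names to cover every combination, whereas the lookup above only ever needs P + Q entries.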

Here are some possible axes for defining the atoms.

  • Primary nature: e.g. POS_EQ_RA as opposed to POS_EQ_DEC etc. This axis takes over most of the UCD namespace.
  • Ruling/subsidiary column in a table: MAIN or OTHER.
  • Absolute or relative value: ABS or REL.
  • Value or uncertainty: VALUE or ERROR.
  • Observed (i.e. measured) or modelled: OBS or MODEL.
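
One way to picture these axes is as a small lookup table that assigns each atom of a composite UCD to its axis. The sketch below is only an illustration: the axis names (column_role, scale, kind, origin) are invented here, and the rule that a UCD carries at most one atom per qualifier axis is an assumption rather than part of the proposal.

```python
# Qualifier axes from the list above; the axis names themselves are hypothetical.
QUALIFIER_AXES = {
    "column_role": {"MAIN", "OTHER"},
    "scale": {"ABS", "REL"},
    "kind": {"VALUE", "ERROR"},
    "origin": {"OBS", "MODEL"},
}


def classify(ucd: str) -> tuple[list[str], dict[str, str]]:
    """Sort the atoms of a composite UCD into primary atoms and qualifiers."""
    primary: list[str] = []
    qualifiers: dict[str, str] = {}
    for atom in ucd.split("/"):
        axis = next(
            (name for name, members in QUALIFIER_AXES.items() if atom in members),
            None,
        )
        if axis is None:
            # Anything that is not a known qualifier is taken as primary nature.
            primary.append(atom)
        elif axis in qualifiers:
            # Assumed rule: two atoms on the same axis contradict each other.
            raise ValueError(f"two atoms on axis {axis!r}: {qualifiers[axis]}, {atom}")
        else:
            qualifiers[axis] = atom
    return primary, qualifiers


print(classify("POS_EQ_RA/ERROR/MODEL"))
```

Under these assumptions, a celestial position derived from a model is expressed by combining existing atoms, and a contradictory tag such as POS_EQ_RA/MAIN/OTHER is rejected.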

We could also break some complex parts of the tree, such as the lists of passbands for photometric magnitudes, into two atoms for the primary nature. These special axes might be as follows.

  • Filter system: e.g. MAG-JOHNSON, MAG-KRON-COUSINS, MAG-SLOAN, etc.
  • Conventional names for passbands in a system: e.g. MAG-B, MAG-V, MAG-R or MAG-I.
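
With filter system and passband held as separate atoms, selecting all photometry in one system becomes a simple membership test. The sketch below assumes column metadata is available as a mapping from column name to composite UCD; the column names and the particular UCD strings are invented for illustration.

```python
# Hypothetical column metadata: column name -> composite UCD.
columns = {
    "ra": "POS_EQ_RA/MAIN",
    "r_mag": "MAG-SLOAN/MAG-R/MAIN",
    "i_mag": "MAG-SLOAN/MAG-I",
    "v_mag": "MAG-JOHNSON/MAG-V",
    "r_err": "MAG-SLOAN/MAG-R/ERROR",
}


def columns_with_atom(columns: dict[str, str], atom: str) -> list[str]:
    """Return the names of the columns whose UCD contains the given atom."""
    return [name for name, ucd in columns.items() if atom in ucd.split("/")]


print(columns_with_atom(columns, "MAG-SLOAN"))
print(columns_with_atom(columns, "MAG-R"))
```

The same function answers both "which columns use this filter system?" and "which columns are in this passband?", without listing every system-and-band combination in advance.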

This would reduce the number of UCDs needed for completeness by the average number of passbands in a filter system; roughly by a factor of five. It would also allow a programme to search trivially for "Sloan-system photometry" (i.e. to look for MAG-SLOAN) without having to search separately for every UCD in the Sloan system.

To make this scheme work we would need only a few changes to the definition and use of UCDs.

  1. The atoms need to be isolated from the current tree of UCDs and recorded to generate the new vocabulary. The community might choose to add a few new concepts at this point.
  2. The new terms should be defined. This reduces the risk that makers of data-sets will use the terms wrongly in their metadata.
  3. Software that parses UCDs needs to be prepared to break the tags at the slashes and to match separately each resultant word.
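
For step 1, part of the isolation could perhaps be mechanised. The sketch below rewrites an existing underscore-joined UCD into the slash form by peeling off trailing qualifiers. It assumes that qualifiers only ever appear as suffixes and handles only MAIN and OTHER, the two suffixes mentioned above; prefixes such as MODEL_* would need a curated mapping instead. The function name is hypothetical.

```python
# Qualifier suffixes to peel off; limited to the two discussed in this proposal.
QUALIFIER_SUFFIXES = ("MAIN", "OTHER")


def to_composite(legacy: str) -> str:
    """Rewrite a legacy UCD such as POS_EQ_RA_MAIN as POS_EQ_RA/MAIN."""
    parts = legacy.split("_")
    qualifiers: list[str] = []
    # Peel qualifiers from the right, always leaving at least one part as primary.
    while len(parts) > 1 and parts[-1] in QUALIFIER_SUFFIXES:
        qualifiers.insert(0, parts.pop())
    return "/".join(["_".join(parts)] + qualifiers)


for name in ("POS_EQ_RA_MAIN", "POS_EQ_RA_OTHER", "POS_EQ_DEC", "ERROR"):
    print(name, "->", to_composite(name))
```

Names without a recognised suffix pass through unchanged, so the function can be run over the whole existing list and its output reviewed by hand.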

-- GuyRixon - 11 Sep 2002
