Some notes and examples for the Chimera workshop.

Digitising legacy data.

Worry that non-digital data may be 'left behind'.

ESRC encourages publication of raw data for others to use. For example, the Entangled data project had problems with revealing interviews.

Researchers have a lot of 'raw' data in their heads, or in private notes. These are difficult to publish in a usable form.

If we disconnect the data from the original researcher, does it lose detail? Do we need to keep the link back to the originating researcher to make the data understandable?


BHPS

Anonymise the data before publication. Withhold date of birth and location details.

There is, however, a 'special licence' for access to more detailed data.
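The withholding step noted above could be sketched roughly as follows. This is a minimal illustration only: the field names (date_of_birth, address, postcode) and the sample record are hypothetical, since the notes do not give the real BHPS variable names or say exactly what the special licence release contains.

```python
# Hypothetical field names; the real BHPS variable names are not given in the notes.
SENSITIVE_FIELDS = {"date_of_birth", "address", "postcode"}


def anonymise(record, withheld=SENSITIVE_FIELDS):
    """Return a copy of the record with the withheld fields left out."""
    return {key: value for key, value in record.items() if key not in withheld}


# Made-up example record, for illustration only.
raw = {
    "person_id": 101,
    "date_of_birth": "XXXX-XX-XX",
    "postcode": "XXX XXX",
    "income_band": "B",
}

public = anonymise(raw)
```

The original record is left untouched, so a more detailed release could be produced from the same source by passing a smaller withheld set.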

Mentioned plugging data into models.

BHPS data processing

  • Consistency checks
  • Sample management
  • Plausibility checks

Where necessary, some edits are performed.
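As a rough sketch of what the consistency and plausibility checks listed above might look like, the example below flags records for review. The rules, thresholds and field names (age, employed, hours_worked) are assumptions made for illustration; the notes say the real checking and cleaning code is not released.

```python
def consistency_errors(record):
    """Consistency: answers within one record must not contradict each other."""
    errors = []
    hours = record.get("hours_worked") or 0
    if record.get("employed") is False and hours > 0:
        errors.append("hours_worked is positive but employed is False")
    return errors


def plausibility_errors(record):
    """Plausibility: a single value must fall within a believable range."""
    errors = []
    age = record.get("age")
    # The 0 to 120 range is an assumed threshold, not one taken from BHPS.
    if age is not None and not 0 <= age <= 120:
        errors.append("age is outside the range 0 to 120")
    return errors


def check(record):
    """Run both kinds of check and return every problem found."""
    return consistency_errors(record) + plausibility_errors(record)


good = {"age": 34, "employed": True, "hours_worked": 37}
bad = {"age": 150, "employed": False, "hours_worked": 10}
```

Here check only reports problems; any edit to the data would be a separate, recorded step, which fits the point that users need to trust the processing.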

Reputation of the dataset and dataset providers is key. Users need to trust the dataset and the processing done to it.

They do release the code to generate the derived variables. They don't release the code used in the checking and cleaning process.

They used to use paper questionnaires. They now use questions on a laptop, which reduces errors in sequence.

Some work has been done on studying the impact of the change.


Metadata

Metadata takes time and money. Some fields don't have much metadata because they don't have the time/money.

Metadata is specific to its audience. If you define or constrain the metadata, you constrain who can or will use it, e.g. light and dark side astronomers.

DDI

Data Documentation Initiative. Called a codebook, but now in XML.

Ontologies

Tend to be fixed, without design for extension later.

Semantic web: provides a flexible grammar, but you need to supply the vocabulary for your own field.
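The split between a shared grammar and a field-specific vocabulary can be illustrated with a small sketch. It uses plain Python tuples rather than any real RDF library, and the vocabularies and the example statement are hypothetical.

```python
# Grammar: every statement is a (subject, predicate, object) triple.
# Vocabulary: the predicate names, which each field supplies for itself.
# Both vocabularies below are made up for illustration.
SURVEY_VOCAB = {"hasRespondent", "collectedIn"}
ASTRO_VOCAB = {"observedAt", "hasMagnitude"}


def is_valid(triple, vocabulary):
    """A statement is accepted if it has three parts and a known predicate."""
    return len(triple) == 3 and triple[1] in vocabulary


statement = ("survey_wave", "collectedIn", "some_year")

valid_for_survey = is_valid(statement, SURVEY_VOCAB)
valid_for_astro = is_valid(statement, ASTRO_VOCAB)
```

The same three-part structure works for both fields; only the set of predicates changes, which is the part each community has to agree on.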

If you have a rich language, you gain flexibility, but you lose control. People will re-use it to express their own ideas.

Ontologies and Knowledges re-spelled. Let go; stop trying to control the information. Allow the community to build their own structures, ontologies etc.


Adding additional notes on how to use the wiki ....

and a comment from -- TonyLinde

Topic revision: r3 - 2006-01-24 - 15:52:03 - TonyLinde
 