Scratch page for working on the VOSpace specification.


Abstract

VOSpace is a SOAP interface for access to data stores.

This version applies the VOSpace concept to flat, unconnected data spaces.
Future versions of the specification will add extensions to support a hierarchical structure and links between the individual space services.

Introduction

VOSpace is an interface standard for data stores. It specifies how VO agents and applications can use network attached data stores to persist and exchange data in a standard way.

A VOSpace web service is an access point for a distributed storage network. Through this access point, a client can:

  • add or delete data objects;
  • manipulate metadata for the data objects;
  • obtain URIs through which the content of the data objects can be accessed.

VOSpace does not define how the data are stored, but only how they are accessed. Thus, the VOSpace interface can readily be added to an existing storage system.

When we speak of “a VOSpace”, we mean the arrangement of data accessible through one particular VOSpace service. A VOSpace data node represents a data object within a VOSpace service.

Nodes in VOSpace have unique identifiers expressed as URIs in the vos:// scheme, as defined below.

In this version of the standard, each VOSpace service provides a single, flat set of data objects, similar to a service described by the earlier VOStore standard; this version of the VOSpace specification supersedes VOStore.

There are no links between VOSpace 1 services.

Future versions of the VOSpace specification may provide support for a hierarchical arrangement of data objects within a space, and may provide support for VOSpace services to be linked such that a client can navigate them as one global space.

Services implementing the current version of the specification can be linked in as leaf nodes of this global space without needing to change; the VOSpace 2+ services will make the links.

VOSpace identifiers

The identifier for a node in VOSpace shall be a URI with the scheme vos. Such a URI shall have the following parts with the meanings and encoding rules defined in RFC2396 [2].

  • scheme;
  • naming authority;
  • path;
  • (optional) query;
  • (optional) fragment identifier;

The naming authority for a VOSpace node shall be the VOSpace service through which the node was created. The authority part of the URI shall be constructed from the IVO registry identifier [3] for that service by deleting the ivo:// prefix and changing all forward-slash characters (‘/’) in the resource key to exclamation marks (‘!’).

This is an example of a possible VOSpace identifier.

vos://org.astrogrid.cam!vospace!container-6/siap-out-1.vot?foo=bar#baz

  • vos:// is the URI scheme for the identifier.

Using a separate URI scheme for VOSpace identifiers enables clients to distinguish between IVO registry identifiers and VOSpace identifiers.

  • org.astrogrid.cam!vospace!container-6 is the authority part of the URI, corresponding to the IVO registry identifier ivo://org.astrogrid.cam/vospace/container-6.

There should be a VOSpace service registered with this identifier.
This is the IVO registry identifier of the VOSpace service that contains the node.

  • /siap-out-1.vot is the node path part of the URI.

Slashes in the path imply a hierarchical arrangement of data, as is normal with URIs. Since the current version of the specification does not support data hierarchies, an identifier for a node in a current service must have one slash at the start of the path and no other slashes.

The node / represents the entire VOSpace-1.0 but may not be addressed as a node via the 1.0 interface.

  • ?foo=bar is a query string and thus is something to which the VOSpace service is supposed to respond.

No queries of this nature are defined in the current version of the specification, but the query string is reserved for use in later versions of the VOSpace specification. VOSpace-1 identifiers must not contain the ‘?’ delimiter.

  • #baz is a fragment identifier. Its meaning attaches to the data returned from the VOSpace node, not to the node itself.

The fragment identifier should be interpreted by the client, not by the VOSpace service; the service shall ignore any fragment identifiers in a received node identifier.
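As a rough illustration of the parts listed above, the example identifier can be split with the urlsplit function from Python's standard library. The helper names split_identifier and is_addressable_vospace1_path are hypothetical and not part of the specification; the path check only mirrors the restrictions stated in this section.

```python
from urllib.parse import urlsplit

EXAMPLE = "vos://org.astrogrid.cam!vospace!container-6/siap-out-1.vot?foo=bar#baz"


def split_identifier(identifier):
    """Split a vos:// identifier into the URI parts named in this section."""
    parts = urlsplit(identifier)
    if parts.scheme != "vos":
        raise ValueError("not a VOSpace identifier: " + identifier)
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }


def is_addressable_vospace1_path(path):
    """True if the path has one slash at the start and no other slashes.

    The bare path "/" stands for the whole space and is not addressable
    as a node, so it is rejected here.
    """
    return path.startswith("/") and len(path) > 1 and "/" not in path[1:]
```

A service following this sketch would simply discard the "fragment" entry, since fragments are for the client to interpret.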

A VOSpace identifier is globally unique, and identifies one specific node in a specific VOSpace service.

A client should use the following procedure to resolve access to a VOSpace node from a VOSpace identifier:

  • Extract the authority part of the VOSpace URI.
  • Convert the authority back to the IVO registry identifier of the VOSpace service by changing any ‘!’ characters to ‘/’ and adding the ivo:// prefix.
  • Resolve the IVO registry identifier to an endpoint for the VOSpace service using the IVO resource registry.
  • Access the node via the endpoint using one of the web service methods defined in this standard.

Given the example identifier vos://org.astrogrid.cam!vospace!container-6/siap-out-1.vot?foo=bar#baz, this would mean:

  • Extract the authority part of the VOSpace URI
    • org.astrogrid.cam!vospace!container-6

  • Convert the authority back to the IVO registry identifier of the VOSpace service by changing any ‘!’ characters to ‘/’
    • org.astrogrid.cam/vospace/container-6

  • and adding the ivo:// prefix
    • ivo://org.astrogrid.cam/vospace/container-6

  • The client can then use the IVO registry to resolve the VOSpace service endpoint from the IVO identifier
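The conversion steps above can be sketched in a few lines of Python. The function names are hypothetical, and the final registry lookup is left out because it depends on the registry client in use.

```python
from urllib.parse import urlsplit

IVO_PREFIX = "ivo://"


def authority_to_ivo_id(authority):
    """Change '!' back to '/' and add the ivo:// prefix."""
    return IVO_PREFIX + authority.replace("!", "/")


def ivo_id_to_authority(ivo_id):
    """Inverse mapping: delete the ivo:// prefix and change '/' to '!'."""
    if not ivo_id.startswith(IVO_PREFIX):
        raise ValueError("not an IVO registry identifier: " + ivo_id)
    return ivo_id[len(IVO_PREFIX):].replace("/", "!")


def service_ivo_id(vos_uri):
    """Return the IVO registry identifier of the service that holds the node."""
    parts = urlsplit(vos_uri)
    if parts.scheme != "vos":
        raise ValueError("not a VOSpace identifier: " + vos_uri)
    return authority_to_ivo_id(parts.netloc)
```

The two mapping functions are inverses of each other, which gives a simple round-trip check for an implementation.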

VOSpace data model

Data model diagram :

  • Node property needs 'readonly' attribute.
  • View needs 'original' attribute.

Nodes and node types

The type of a VOSpace node determines how the VOSpace service stores and interprets the node data.
The types are arranged in a hierarchy, with more detailed types inheriting the structure of more generic types.

The following types are defined:

  • Node is the most basic type. It has only an identifier and a type attribute.
  • DataNode describes a data item stored in the VOSpace.
  • UnstructuredDataNode describes a data item for which the VOSpace does not understand the data format.

When data is stored and retrieved from an UnstructuredDataNode, the bit pattern read back shall be identical to that written.

  • StructuredDataNode describes a data item for which the space understands the format and may make transformations that preserve the meaning of the data.

When data is stored and retrieved from a StructuredDataNode, the bit pattern returned may be different to the original. For example, storing tabular data from a VOTable file will preserve the tabular data, but any comments in the original XML file may be lost.

As far as I know, the current implementations only support UnstructuredDataNode. Can we deprecate the use of StructuredDataNode and UnstructuredDataNode, and replace them with something better in the next version of the specification?

A Node has the following elements:

  • uri : the identifier URI, in the vos scheme, for the node, URI-encoded according to RFC2396 [2].
  • properties : a set of metadata properties for the node, set either by the client or by the service (see following section for details).

A DataNode has the following elements:

  • accepts : a list of the views (data formats) that the node can accept.
  • provides : a list of the views (data formats) that the node can provide.
  • busy : a boolean flag to indicate that the data associated with the node cannot be accessed (i.e. the bulk data transfer operations will be rejected).

The busy flag is used to indicate that an internal operation is in progress, and the node data is not available.

In the current version of the specification, all nodes are data nodes, either structured or unstructured.

Future versions of the specification may introduce new types of nodes that are not data nodes, such as links and containers.

The set of node types is defined by this standard and is closed; new types may be introduced only via new versions of the standard.

To comply with the standard, a client or service must support all the node types defined in the current specification to the extent that it can parse an XML description of a node of that type.
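One possible way to picture the type hierarchy described in this section is as a set of Python dataclasses. This is only a sketch: the class layout and the default values are assumptions, and the XML representation is defined by the VOSpace schema, not by this code.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Most basic type: an identifier plus a set of properties."""
    uri: str
    # Property URI -> string value (assumed representation).
    properties: Dict[str, str] = field(default_factory=dict)


@dataclass
class DataNode(Node):
    """A data item stored in the VOSpace."""
    accepts: List[str] = field(default_factory=list)   # View URIs the node can accept
    provides: List[str] = field(default_factory=list)  # View URIs the node can provide
    busy: bool = False  # True while the node data cannot be accessed


@dataclass
class UnstructuredDataNode(DataNode):
    """The space does not understand the format; bytes are returned unchanged."""


@dataclass
class StructuredDataNode(DataNode):
    """The space understands the format and may transform the data."""
```

Because the more detailed types inherit from the generic ones, code that only needs the identifier and properties can treat every node as a Node.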

Properties

Properties are simple string-based name-value pairs associated with a node.

Individual Properties should contain simple, short string values, not large blocks of information. If a system needs to attach a large amount of metadata to a node, then it should either use multiple small Properties, or a single Property containing a URI or URL pointing to an external resource that contains the additional metadata.

A Property has the following elements:

  • uri : the Property identifier.
  • value : the string value of the Property.
  • readOnly : a boolean flag to indicate that the Property cannot be changed by the client.

Properties of a node may, in general, be set either by the client or the service.
However, the service may define some properties as read-only, and this aspect is revealed by the readOnly attribute of the property.
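A minimal sketch of how a service might honour the readOnly flag when a client asks to change a Property. The Property class, the set_property helper, the choice of exception and the example URIs are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Property:
    uri: str
    value: str
    read_only: bool = False  # assumed default: writable by the client


def set_property(properties, uri, value):
    """Apply a client request to set a Property, refusing read-only ones.

    `properties` maps Property URI -> Property for a single node.
    """
    existing = properties.get(uri)
    if existing is not None and existing.read_only:
        raise PermissionError("property is read-only: " + uri)
    properties[uri] = Property(uri, value)
```

In this sketch the flag is stored with each Property of a node, so the service is free to set it differently for different nodes.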

The current VOSpace specification does not define a closed list of Properties.
Appendix xx of this specification includes an initial list of Properties defined by the VOSpace team.
However, clients and services may define and use their own Properties without requiring a change to the specification.

Property values

Unless they have special meaning to the service or client, Properties are treated as simple string values.

Some Properties may have meaning to the service. Others may have meaning only to one specific type of client; the service stores these properties but does not interpret them.
A service implementation does not need to understand the meaning of all the Properties of a node; any Properties that it does not understand can simply be stored as text strings.

Currently, there are no standard properties. However, property names with the prefix vos. are reserved. Later versions of the standard may define the use of these properties.

Possible properties are

  • MIME type;
  • size of data set;
  • identity of node owner;
  • time of last modification.

Property identifiers

Every new type of Property requires a unique URI to identify the Property and its meaning.

The rules for the Property identifiers are similar to the rules for namespace URIs in XML schema. The only restriction is that it must be a valid (unique) URI.

  • An XML schema namespace identifier can be just a simple URN, e.g. urn:my-namespace
  • Within the IVOA, the convention for namespace identifiers is to use a HTTP URL pointing to the namespace schema, or a resource describing it

The current VOSpace schema defines Property identifiers as anyURI. The only restriction is that it must be a valid (unique) URI.

  • A Property URI can be a simple URN, e.g. urn:my-property

This may be sufficient for testing and development on a private system, but it is not scalable for use on a public service.
For a production system, any new Properties should have unique URIs that can be resolved into a description of the Property.

Ideally, these should be IVO registry URIs that point to a description registered in the IVO registry.

  • e.g. ivo://my-registry/vospace/properties/my-property

Using an IVO registry URI to identify Properties has two main advantages

  • IVO registry URIs are by their nature unique, which makes it easy to ensure that different teams do not accidentally use the same URI.
  • If the IVO registry URI points to a description registered in the IVO registry, this provides a mechanism to discover what the Property means.

Property descriptions

If the URI for a particular Property is resolvable, i.e. an IVO registry identifier or a HTTP URL, then it should point to an XML resource that describes the Property.

A Property description should describe the data type and meaning of a Property.

A PropertyDescription should have the following members.

  • uri : the formal URI of the Property (same value as in the Property structure)

  • _readOnly_ : if true, the property may be set by the service but not by the client.

It is up to the service to decide if the client can modify a property, based on the identity of the caller, so the readonly flag is part of the property element in the server response, not the property description.
If a service decides a node is read-only, then the client can't modify any of the properties. On the other hand, is there really a use case for a property that can never be modified by any client?

  • Description : Text block describing the meaning and validation rules of the Property

Can we add a display name, in addition to the description?

A PropertyDescription may have the following optional members.

  • UCD : the Universal Content Descriptor (in the UCD1+ scheme) for the Property value
  • Unit : the unit of measurement of the Property

These elements are based on experience with the Common Execution Architecture. The information in a Property description can be used to generate a UI for displaying and modifying the different types of Properties.

Machine readable validation rules will probably have to wait until the next version of the specification.

Note that at the time of writing, the schema for registering PropertyDescriptions in the IVO registry has not been finalised.

UI display name

If a client is unable to resolve a Property identifier into a description, then it may just display the identifier as a text string.

  • urn:modified-date

If the client can resolve the Property identifier into a description, then the client may use the information in the description to display a human readable name and description of the Property.

  • Last modification date of the node data
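The fallback behaviour described above could look something like the following on the client side. The descriptions dictionary stands in for whatever lookup the client uses to resolve Property identifiers; it is an assumption, since the registry schema for PropertyDescriptions has not been finalised.

```python
def property_label(uri, descriptions):
    """Return human readable text for a Property, or the raw URI as a fallback.

    `descriptions` maps Property URI -> description text for the
    identifiers that the client was able to resolve.
    """
    description = descriptions.get(uri)
    if not description:
        # Unresolvable identifier: just display it as a text string.
        return uri
    return description
```

With this approach an unknown Property still shows up in the UI, only with a less friendly label.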

Property editors

If the client is unable to resolve a Property identifier into a description, or does not understand the type information defined in the description, then the client may treat the Property value as a simple text string.

If the client can resolve the Property identifier into a description, then the client may use the information in the description to display an appropriate editing tool for the Property.

In the current version of the specification the rules for editing Properties are as follows :

  • A service may impose validation rules on the values of specific types of Properties
  • If a client attempts to set a Property to an invalid value, then the service may reject the change.
  • Where possible, the validation rules for a type of Property should be defined in the Property description

Future versions of the VOSpace specification may extend the PropertyDescription to include more specific machine readable validation rules for a Property type.

Note that at the time of writing, the schema for registering validation rules in PropertyDescriptions has not been finalised.
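Until machine readable rules are defined, a service could keep its own validation rules in code. The sketch below assumes a simple table of check functions keyed by Property URI; the URI urn:my-size and the rule attached to it are invented for the example.

```python
def is_valid_value(uri, value, validators):
    """Check a proposed Property value against the service's own rules.

    `validators` maps Property URI -> function(value) -> bool.
    Properties without a rule are treated as plain strings and accepted.
    """
    check = validators.get(uri)
    if check is None:
        return True
    return bool(check(value))


# Hypothetical rule: a size Property must be a non-negative integer string.
EXAMPLE_VALIDATORS = {
    "urn:my-size": lambda value: value.isdigit(),
}
```

A service would call is_valid_value before applying a change and may reject the change when it returns False.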


Views

A View describes the data formats and contents available for importing or exporting data to or from a VOSpace node.

The metadata for each VOSpace Node contains two lists of Views.

  • accepts is a list of Views that the service can accept for importing data into the Node
  • provides is a list of Views that the service can provide for exporting data from the Node

A View has the following members:

  • uri : the View identifier, e.g.
    • ivo://net.ivoa.vospace/formats/binary
    • ivo://net.ivoa.vospace/formats/votable-1.0
    • ivo://net.ivoa.vospace/formats/any – this is a reserved URI to identify unstructured data, i.e. data of any format.

  • original : an optional boolean flag to indicate that the View preserves the original bit pattern of the data. The default value is true if it is not specified.
  • param : a set of name-value pairs that can be used to specify additional arguments for the View, e.g. JPEG compression level.
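As a sketch only, the View members could be held in a small Python dataclass, with the default for original taken from the text above. The class itself, the JPEG view URI and the parameter name are made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class View:
    uri: str                # the View identifier
    original: bool = True   # defaults to true when not specified
    params: Dict[str, str] = field(default_factory=dict)  # extra name-value arguments


# A view that returns the stored bytes unchanged.
binary_view = View("ivo://net.ivoa.vospace/formats/binary")

# A hypothetical lossy view with one parameter; the URI and the
# parameter name are illustrative, not defined by the specification.
jpeg_view = View("urn:my-jpeg-view", original=False, params={"compression": "80"})
```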

Example use cases

Simple file store

A simple VOSpace system that stores data as binary files can just return the contents of the original file. The client supplies a View identifier when it imports the data, and the service uses this information to describe the data to other clients.

A file based system can use the special case View identifier ivo://net.ivoa.vospace/views/any to indicate that it will accept any data format or View for a Node.

For example :

  • A client imports a file into the service, specifying a View to describe the file contents
  • The service stores the data as a binary file and keeps a record of the View
  • The service can then use the View supplied by the client to describe the data to other clients

This type of service is not required to understand the imported data, or to verify that its contents match the View; it treats all data as binary files.
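The simple file store behaviour can be sketched as an in-memory class. The class and method names are hypothetical, and a real service would write to disk and expose these steps through the web service operations instead.

```python
ANY_VIEW = "ivo://net.ivoa.vospace/views/any"


class SimpleFileStore:
    """Stores every import as opaque bytes and remembers the client's View."""

    def __init__(self):
        self._content = {}
        self._views = {}

    def accepts(self):
        # Any data format or View is accepted for a Node.
        return [ANY_VIEW]

    def import_data(self, name, content, view_uri):
        # The data is not parsed or checked against the View.
        self._content[name] = bytes(content)
        self._views[name] = view_uri

    def provides(self, name):
        # Describe the data to other clients using the View supplied on import.
        return [self._views[name]]

    def export_data(self, name):
        # The bit pattern returned is identical to the one imported.
        return self._content[name]
```

Since the store never inspects the content, a client that supplies the wrong View will simply mislead other clients; the service cannot detect it.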

Database store

A VOSpace system that stores data in database tables would need to be able to understand the data format of an imported file in order to parse the data and store it correctly. This means that the service can only accept a specific set of Views or data formats for importing data into the Node. In order to tell the client what input data formats it can accept, the service publishes a list of specific Views in the accepts list for each Node.

On the output side, a database system would not be able to provide access to the original input file. The contents of the file would have been transferred into the database table and then discarded. The system has to generate the output results from the contents of the database table. In order to support this, the service needs to be able to tell the client which Views the data is available in.

The database system may offer access to the table contents as either VOTable or FITS files; it may also offer zip or tar.gz compressed versions of these. In that case, the system needs to be able to express nested file formats such as 'zip containing VOTable' and 'tar.gz containing FITS'.

A service may also offer subsets of the data. For example, a workflow system may only want to look at the table headers to decide what steps are required to process the data. If the table contains a large quantity of data, then downloading the whole contents just to look at the header information is inefficient. To make this easier, a database system may offer a 'metadata only' View of the table, returning a VOTable or FITS file containing just the metadata headers and no rows.

So our example service may want to offer the following Views of a database table :

  • Table contents as FITS
  • Table contents as VOTable

  • Table contents as zip containing FITS
  • Table contents as zip containing VOTable

  • Table contents as tar.gz containing FITS
  • Table contents as tar.gz containing VOTable

  • Table metadata as FITS
  • Table metadata as VOTable

The service would publish this information as a list of Views in the provides section of the metadata for each Node.
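To show how the list of Views above is built up from a few independent choices (content, format, packaging), here is a small sketch that generates the same eight entries. It produces descriptive labels only; the View URIs a real service would publish are not defined here.

```python
from itertools import product

FORMATS = ["FITS", "VOTable"]
PACKAGINGS = [None, "zip", "tar.gz"]  # None means the bare file


def describe(content, fmt, packaging=None):
    """Build a label such as 'Table contents as zip containing FITS'."""
    label = "Table " + content + " as "
    if packaging:
        label += packaging + " containing "
    return label + fmt


def example_views():
    """Return labels for the Views offered by the example database service."""
    views = [describe("contents", fmt, pkg) for pkg, fmt in product(PACKAGINGS, FORMATS)]
    views += [describe("metadata", fmt) for fmt in FORMATS]
    return views
```

Nested formats fall out of the packaging choice, which suggests that a View identifier scheme needs some way to express the same combination.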

The VOSpace specification does not mandate what Views a service must provide. The VOSpace specification is intended to provide a flexible mechanism enabling services to describe a variety of different Views of data. It is up to the service implementation to decide what Views of the data it can accept and provide.

View identifiers

Every new type of View requires a unique URI to identify the View and its content.

The rules for the View identifiers are similar to the rules for namespace URIs in XML schema. The only restriction is that it must be a valid (unique) URI.

  • An XML schema namespace identifier can be just a simple URN, e.g. urn:my-namespace
  • Within the IVOA, the convention for namespace identifiers is to use a HTTP URL pointing to the namespace schema, or a resource describing it

The current VOSpace schema defines View identifiers as anyURI. The only restriction is that it must be a valid (unique) URI.

  • A View URI can be a simple URN, e.g. urn:my-view

This may be sufficient for testing and development on a private system, but it is not scalable for use on a public service.
For a production system, any new Views should have unique URIs that can be resolved into a description of the View.

Ideally, these should be IVO registry URIs that point to a description registered in the IVO registry.

  • e.g. ivo://my-registry/vospace/views/my-view

Using an IVO registry URI to identify Views has two main advantages

  • IVO registry URIs are by their nature unique, which makes it easy to ensure that different teams do not accidentally use the same URI.
  • If the IVO registry URI points to a description registered in the IVO registry, this provides a mechanism to discover what the View contains.

View descriptions

If the URI for a particular View is resolvable, i.e. an IVO registry identifier or a HTTP URL, then it should point to an XML resource that describes the View.

A ViewDescription should describe the data format and/or content of the view.

A ViewDescription should have the following members:

  • uri : the formal URI of the View
  • Description : Text block describing the data format and content of the View
  • DisplayName : Display name of the View

A ViewDescription may have the following optional members.

  • MimeType : the standard MIME type of the View, if applicable.

However, at the time of writing, the schema for registering ViewDescriptions in the IVO registry has not been finalised.

UI display name

If a client is unable to resolve a View identifier into a description, then it may just display the identifier as a text string.

  • Download as urn:table.meta.fits

If the client can resolve the View identifier into a description, then the client may use the information in the description to display a human readable name and description of the View.

  • Download table metadata as FITS header

MIME types

If a VOSpace service provides HTTP access to the data contained in a Node, then if the ViewDescription contains a MimeType field, this should be included in the appropriate header field of the HTTP response.
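In practice the header field in question would normally be Content-Type. A minimal sketch, assuming the ViewDescription has been loaded into a plain dictionary with the member names used above:

```python
def response_headers(view_description):
    """Build HTTP response headers for data served under a given View.

    `view_description` is a dict holding the ViewDescription members;
    MimeType is optional, so the header is only set when it is present.
    """
    headers = {}
    mime_type = view_description.get("MimeType")
    if mime_type:
        headers["Content-Type"] = mime_type
    return headers
```

If the description has no MimeType, the sketch leaves the header out and the choice of a default falls to the HTTP server.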

Topic revision: r7 - 2007-02-23 - 04:49:04 - DaveMorris
 