r9 - 09 May 2003 - 14:45:00 - ElizabethAuden

Registry Query Schema Debate

Iteration 02

Please continue debate about the registry query schema and registry query response schema here. The current schemas for Iteration 02 can be found at RegistryQuerySchema and RegistryQueryResponseSchema.

The registry will receive queries from several other Astrogrid components: the portal, the data mover, workflow, and the job scheduler. The following schema describes a query that could come from any Astrogrid component. Individual components have their own query types: portalQuery, dataMoverQuery, workflowQuery, and jobSchedulerQuery. Each of these query types has its own schema, and more query types can be added as different Astrogrid components need to query the registry. Currently only the portalQuery schema is filled out; its elements reflect the sample query drawn on the whiteboard at the registry workgroup meeting in Leicester on 15 April: type (star, galaxy, flare), wavelength (radio, optical, etc), and keywords. The other query type schemas are currently empty since no query information has been specified.

Example portal query:

<query>
  <queryType>
    <portalQuery>
      <typeList>
        <type>white dwarf star</type>
      </typeList>
      <wavelength>
        <optical/>
        <uv/>
      </wavelength>
      <keywordList>
        <keyword>BPM 16274</keyword>
        <keyword>GD 50</keyword>
        <keyword>HST photometric standards</keyword>
      </keywordList>
    </portalQuery>
  </queryType>
</query>
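
As a rough illustration of how a receiving component might read a portal query of this shape, here is a minimal Python sketch using the standard xml.etree.ElementTree module. The function name parse_portal_query is hypothetical, and the sketch assumes the wavelength bands are written as empty elements (<optical/>, <uv/>); nothing here is part of the Iteration 02 design.

```python
import xml.etree.ElementTree as ET

# Example portal query, with the wavelength bands written as empty elements.
PORTAL_QUERY = """
<query>
  <queryType>
    <portalQuery>
      <typeList>
        <type>white dwarf star</type>
      </typeList>
      <wavelength>
        <optical/>
        <uv/>
      </wavelength>
      <keywordList>
        <keyword>BPM 16274</keyword>
        <keyword>GD 50</keyword>
        <keyword>HST photometric standards</keyword>
      </keywordList>
    </portalQuery>
  </queryType>
</query>
"""


def parse_portal_query(xml_text):
    """Extract types, wavelength bands and keywords from a portal query.

    Hypothetical helper: raises ValueError if the query is not a portalQuery.
    """
    root = ET.fromstring(xml_text)
    portal = root.find("queryType/portalQuery")
    if portal is None:
        raise ValueError("query does not contain a portalQuery element")
    wavelength = portal.find("wavelength")
    # Each child of <wavelength> names one band through its tag.
    bands = [band.tag for band in wavelength] if wavelength is not None else []
    return {
        "types": [t.text for t in portal.findall("typeList/type")],
        "wavelengths": bands,
        "keywords": [k.text for k in portal.findall("keywordList/keyword")],
    }


result = parse_portal_query(PORTAL_QUERY)
```

Here result holds three lists, one per part of the whiteboard query (type, wavelength, keywords), which a registry implementation could then match against its entries.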

Query schema:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:include schemaLocation="portalQuery_v0_1.xsd"/>
    <xs:include schemaLocation="workflowQuery_v0_1.xsd"/>
    <xs:include schemaLocation="dataMoverQuery_v0_1.xsd"/>
    <xs:include schemaLocation="jobSchedulerQuery_v0_1.xsd"/>
    <xs:element name="queryType">
        <xs:complexType>
            <xs:choice>
                <xs:element maxOccurs="1" minOccurs="0" ref="portalQuery"/>
                <xs:element maxOccurs="1" minOccurs="0" ref="workflowQuery"/>
                <xs:element maxOccurs="1" minOccurs="0" ref="dataMoverQuery"/>
                <xs:element maxOccurs="1" minOccurs="0" ref="jobSchedulerQuery"/>
            </xs:choice>
        </xs:complexType>
    </xs:element>
    <xs:element name="query">
        <xs:complexType>
            <xs:sequence>
                <xs:element maxOccurs="1" minOccurs="1" ref="queryType"/>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>

Query response schema:

(includes service.xsd from the RegistrySchema page)

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:include schemaLocation="portalQuery_v0_1.xsd"/>
    <xs:include schemaLocation="workflowQuery_v0_1.xsd"/>
    <xs:include schemaLocation="dataMoverQuery_v0_1.xsd"/>
    <xs:include schemaLocation="jobSchedulerQuery_v0_1.xsd"/>
    <xs:include schemaLocation="service.xsd"/>
    <xs:element name="queryType">
        <xs:complexType>
            <xs:choice>
                <xs:element maxOccurs="1" minOccurs="0" ref="portalQuery"/>
                <xs:element maxOccurs="1" minOccurs="0" ref="workflowQuery"/>
                <xs:element maxOccurs="1" minOccurs="0" ref="dataMoverQuery"/>
                <xs:element maxOccurs="1" minOccurs="0" ref="jobSchedulerQuery"/>
            </xs:choice>
        </xs:complexType>
    </xs:element>
    <xs:element name="queryResponse">
        <xs:complexType>
            <xs:sequence>
                <xs:element maxOccurs="1" minOccurs="1" ref="queryType"/>
                <xs:element maxOccurs="1" minOccurs="1" ref="service"/>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>

Portal Query Schema:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="typeList">
        <xs:complexType>
            <xs:sequence>
                <xs:element maxOccurs="unbounded" minOccurs="0" name="type" type="xs:string"/>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
    <xs:element name="wavelength">
        <xs:complexType>
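            <!-- Repeatable choice, so that a query can list several bands as the example portal query does. -->
            <!-- The bands are declared locally because no global radio, optical, etc. elements exist to reference. -->
            <xs:choice maxOccurs="unbounded">
                <xs:element maxOccurs="1" minOccurs="0" name="radio"/>
                <xs:element maxOccurs="1" minOccurs="0" name="optical"/>
                <xs:element maxOccurs="1" minOccurs="0" name="uv"/>
                <xs:element maxOccurs="1" minOccurs="0" name="infrared"/>
                <xs:element maxOccurs="1" minOccurs="0" name="gammaRay"/>
                <xs:element maxOccurs="1" minOccurs="0" name="xRay"/>
            </xs:choice>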
        </xs:complexType>
    </xs:element>
    <xs:element name="keywordList">
        <xs:complexType>
            <xs:sequence>
                <xs:element maxOccurs="unbounded" minOccurs="0" name="keyword" type="xs:string"/>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
    <xs:element name="portalQuery">
        <xs:complexType>
            <xs:sequence>
                <xs:element maxOccurs="1" minOccurs="1" ref="typeList"/>
                <xs:element maxOccurs="1" minOccurs="1" ref="wavelength"/>
                <xs:element maxOccurs="1" minOccurs="1" ref="keywordList"/>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>

Data Mover Schema:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="dataMoverQuery">
        <xs:complexType>
            <xs:sequence>
                <!-- Placeholder: no query elements have been specified for the data mover yet. -->
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>

Workflow Schema:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="workflowQuery">
        <xs:complexType>
            <xs:sequence>
                <!-- Placeholder: no query elements have been specified for workflow yet. -->
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>

Job Scheduler Schema:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="jobSchedulerQuery">
        <xs:complexType>
            <xs:sequence>
                <!-- Placeholder: no query elements have been specified for the job scheduler yet. -->
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>

-- ElizabethAuden - 23 Apr 2003


Query elaboration

If I understand the intention of the query correctly, it could be expressed in the following pseudo-SQL:

SELECT * FROM <registry> WHERE 
   (
      TYPE="white dwarf star" AND 
      (WAVELENGTH="optical" OR WAVELENGTH="uv") AND 
      (KEYWORD="BPM 16274" OR 
         KEYWORD="GD 50" OR 
         KEYWORD="HST photometric standards"
      )
   )
If that is correct, I think the query itself would be better expressed as:
<query>
   <selection id="1">
      <item>type</item>
      <value>white dwarf star</value>
   </selection>
   <selection id="2-1">
      <item>wavelength</item>
      <value>optical</value>
   </selection>
   <selection id="2-2">
      <item>wavelength</item>
      <value>uv</value>
   </selection>
   <selection id="3-1">
      <item>keyword</item>
      <value>BPM 16274</value>
   </selection>
   <selection id="3-2">
      <item>keyword</item>
      <value>GD 50</value>
   </selection>
   <selection id="3-3">
      <item>keyword</item>
      <value>HST photometric standards</value>
   </selection>
   <selectionSequence>
      <selectionID>1</selectionID>
      <operator>AND</operator>
      <selectionSequence>
         <selectionID>2-1</selectionID>
         <operator>OR</operator>   
         <selectionID>2-2</selectionID>
      </selectionSequence>
      <operator>AND</operator>
      <selectionSequence>
         <selectionID>3-1</selectionID>
         <operator>OR</operator>   
         <selectionID>3-2</selectionID>
         <operator>OR</operator>   
         <selectionID>3-3</selectionID>
      </selectionSequence>
   </selectionSequence>
</query>
There are 2 basic constructs in the example and we may well need more. These are SELECTION and SELECTIONSEQUENCE. The former defines the data required and the latter the order in which the selection is constructed. Hopefully this is pretty trivial, but would provide the basis upon which to build a registry query schema.
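
To make the relationship between the two constructs concrete, here is a minimal Python sketch that resolves each selectionID against the declared selections and rebuilds the pseudo-SQL condition. The function names are hypothetical, the query is shortened to the type and wavelength selections, and the values are pasted into the string only for readability; a real implementation would presumably use bound parameters.

```python
import xml.etree.ElementTree as ET

# Shortened version of the query: one type selection and two wavelength selections.
QUERY = """
<query>
   <selection id="1">
      <item>type</item>
      <value>white dwarf star</value>
   </selection>
   <selection id="2-1">
      <item>wavelength</item>
      <value>optical</value>
   </selection>
   <selection id="2-2">
      <item>wavelength</item>
      <value>uv</value>
   </selection>
   <selectionSequence>
      <selectionID>1</selectionID>
      <operator>AND</operator>
      <selectionSequence>
         <selectionID>2-1</selectionID>
         <operator>OR</operator>
         <selectionID>2-2</selectionID>
      </selectionSequence>
   </selectionSequence>
</query>
"""


def sequence_to_where(sequence, selections):
    """Walk a selectionSequence in document order and emit one bracketed group."""
    parts = []
    for child in sequence:
        if child.tag == "selectionID":
            item, value = selections[child.text.strip()]
            parts.append(f'{item.upper()}="{value}"')
        elif child.tag == "operator":
            parts.append(child.text.strip())
        elif child.tag == "selectionSequence":
            # A nested sequence becomes a nested bracketed group.
            parts.append(sequence_to_where(child, selections))
        else:
            raise ValueError(f"unexpected element: {child.tag}")
    return "(" + " ".join(parts) + ")"


def query_to_where(xml_text):
    """Build the pseudo-SQL condition for a query of the form above."""
    root = ET.fromstring(xml_text)
    # SELECTION: what data is required, keyed by its id attribute.
    selections = {
        sel.get("id"): (sel.findtext("item"), sel.findtext("value"))
        for sel in root.findall("selection")
    }
    # SELECTIONSEQUENCE: the order in which the selections are combined.
    return sequence_to_where(root.find("selectionSequence"), selections)


where = query_to_where(QUERY)
# where == '(TYPE="white dwarf star" AND (WAVELENGTH="optical" OR WAVELENGTH="uv"))'
```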

-- KeithNoddle - 29 Apr 2003

What about this?

<query>
   <selectionSequence>
      <selection>
         <item>type</item>
         <value>white dwarf star</value>
      </selection>
      <operator>AND</operator>
      <selectionSequence>
         <selection>
            <item>wavelength</item>
            <value>optical</value>
         </selection>
         <operator>OR</operator>   
         <selection>
            <item>wavelength</item>
            <value>uv</value>
         </selection>
      </selectionSequence>
      <operator>AND</operator>
      <selectionSequence>
         <selection>
            <item>keyword</item>
            <value>BPM 16274</value>
         </selection>
         <operator>OR</operator>
         <selection>
            <item>keyword</item>
            <value>GD 50</value>
         </selection>
         <operator>OR</operator>   
         <selection>
            <item>keyword</item>
            <value>HST photometric standards</value>
         </selection>
      </selectionSequence>
   </selectionSequence>
</query>

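This removes the selectionID element. According to "XQuery: An XML Query Language" by D. Chamberlin, "Nodes have identity: that is, two nodes may be distinguishable even though their names and values are the same." Because of this, the nested selections can be told apart by their position in the document and need no explicit IDs.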
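
The point about node identity can be seen with a small Python sketch using the standard xml.etree.ElementTree module; this is only an illustration in a different toolkit, not XQuery itself, and the helper name content is hypothetical. Two selection elements with identical names and values are still separate nodes at different positions in the sequence.

```python
import xml.etree.ElementTree as ET

# Two selections with exactly the same item and value.
sequence = ET.fromstring(
    "<selectionSequence>"
    "<selection><item>keyword</item><value>GD 50</value></selection>"
    "<operator>OR</operator>"
    "<selection><item>keyword</item><value>GD 50</value></selection>"
    "</selectionSequence>"
)


def content(selection):
    """Name and values of a selection, ignoring which node it is."""
    return (selection.tag, selection.findtext("item"), selection.findtext("value"))


first, second = sequence.findall("selection")

same_content = content(first) == content(second)  # True: names and values match
same_node = first is second                       # False: they are distinct nodes

# Each node also has its own position among the children of the sequence.
children = list(sequence)
positions = (children.index(first), children.index(second))  # (0, 2)
```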

-- ElizabethAuden - 29 Apr 2003

New try at Query Schema:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="selection">
        <xs:complexType>
            <xs:sequence>
                <xs:element maxOccurs="1" minOccurs="1" name="item" type="xs:string"/>
                <xs:element maxOccurs="1" minOccurs="1" name="value" type="xs:string"/>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
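    <xs:element name="operator">
        <!-- The example queries carry the operator as text, e.g. <operator>AND</operator>, -->
        <!-- so it is declared as an enumerated string rather than a choice of child elements. -->
        <xs:simpleType>
            <xs:restriction base="xs:string">
                <xs:enumeration value="AND"/>
                <xs:enumeration value="OR"/>
                <xs:enumeration value="NOT"/>
            </xs:restriction>
        </xs:simpleType>
    </xs:element>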
    <xs:element name="selectionSequence">
        <xs:complexType>
            <!-- Selections, operators and nested sequences may be interleaved, as in the example query. -->
            <xs:choice maxOccurs="unbounded" minOccurs="1">
                <xs:element ref="selection"/>
                <xs:element ref="operator"/>
                <xs:element ref="selectionSequence"/>
            </xs:choice>
        </xs:complexType>
    </xs:element>
    <xs:element name="query">
        <xs:complexType>
            <xs:sequence>
                <xs:element maxOccurs="unbounded" minOccurs="1" ref="selectionSequence"/>
           </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>

-- ElizabethAuden - 30 Apr 2003


Possible simplification

Rather than depending on a specific sequence of elements (selection, operator, selection, etc.), we could make the 'and' and 'or' operators container elements. This might simplify the query and make it more human-readable.

The schema notation for this may be wrong, so where unclear, go by the comments and examples rather than the schema.

   <?xml version="1.0" encoding="UTF-8"?>
   <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

      <!--+
         | A field-name, just a string for now.
         | We may want to add restrictions to this later.
         +-->
      <xs:simpleType name="field-name">
         <xs:restriction base="xs:string"/>
      </xs:simpleType>

      <!--+
         | A field-oper, an enumeration of the allowed comparison operators.
         +-->
      <xs:simpleType name="field-oper">
         <xs:annotation>
            <xs:documentation xml:lang="en">
               A field-oper, an enumeration of the allowed comparison operators.
            </xs:documentation>
         </xs:annotation>
         <xs:restriction base="xs:string">
            <xs:enumeration value="EQUALS"/>
            <xs:enumeration value="NOT EQUALS"/>
            <xs:enumeration value="GREATER THAN"/>
            <xs:enumeration value="GREATER THAN EQUALS"/>
            <xs:enumeration value="LESS THAN"/>
            <xs:enumeration value="LESS THAN EQUALS"/>
            <xs:enumeration value="STARTS WITH"/>
            <xs:enumeration value="CONTAINS"/>
         </xs:restriction>
      </xs:simpleType>

      <!--+
         | A field-value, an extension of a string element, with two mandatory attributes 'name' and 'oper'.
         +-->
      <xs:complexType name="field-value">
         <xs:annotation>
            <xs:documentation xml:lang="en">
               A field-value, an extension of a string element, with two mandatory attributes 'name' and 'oper'.
            </xs:documentation>
         </xs:annotation>
         <xs:simpleContent>
            <xs:extension base="xs:string">
               <xs:attribute name="name" use="required" type="field-name"/>
               <xs:attribute name="oper" use="required" type="field-oper"/>
            </xs:extension>
         </xs:simpleContent>
      </xs:complexType>

      <!--+
         | A criteria block, an element containing any combination of criteria fields and nested criteria blocks.
         | Must contain at least one element from the set; a repeatable xs:choice expresses this.
         +-->
      <xs:complexType name="criteria-block">
         <xs:annotation>
            <xs:documentation xml:lang="en">
               A criteria block, an element containing any combination of criteria fields and nested criteria blocks.
               Must contain at least one element from the set.
            </xs:documentation>
         </xs:annotation>
         <xs:choice minOccurs="1" maxOccurs="unbounded">
            <!-- A field value specifier -->
            <xs:element name="field" type="field-value"/>
            <!-- An 'and' element, implies an SQL 'AND' over all the child elements -->
            <xs:element name="and"   type="criteria-block"/>
            <!-- An 'or' element, implies an SQL 'OR' over all the child elements -->
            <xs:element name="or"    type="criteria-block"/>
         </xs:choice>
      </xs:complexType>

      <!--+
         | A simple query.
         | Must contain one 'criteria' element.
         +-->
      <xs:complexType name="simple-query">
         <xs:annotation>
            <xs:documentation xml:lang="en">
               A query block, must contain one criteria element.
            </xs:documentation>
         </xs:annotation>
         <xs:sequence>
            <xs:element name="criteria" minOccurs="1" maxOccurs="1" type="criteria-block"/>
         </xs:sequence>
      </xs:complexType>

      <!--+
         | The list of fields to return.
         | A sequence of 'field' elements, with one mandatory attribute 'name'.
         | An empty list implies 'all fields'.
         +-->
      <xs:complexType name="field-list">
         <xs:annotation>
            <xs:documentation xml:lang="en">
               A sequence of 'field' elements, with one mandatory attribute 'name'.
               An empty list implies 'all fields'.
            </xs:documentation>
         </xs:annotation>
         <xs:sequence>
            <xs:element name="field" minOccurs="0" maxOccurs="unbounded">
               <xs:complexType>
                  <xs:attribute name="name" use="required" type="field-name"/>
               </xs:complexType>
            </xs:element>
         </xs:sequence>
      </xs:complexType>

      <!--+
         | An extended query, specifying the fields to return.
         | May contain one 'fields' element.
         | Must contain one 'criteria' element.
         | An empty or missing 'fields' element implies 'select all fields'.
         +-->
      <xs:complexType name="field-query">
         <xs:annotation>
            <xs:documentation xml:lang="en">
               An extended query, specifying the fields to request.
               May contain one 'fields' element.
               Must contain one 'criteria' element.
            </xs:documentation>
         </xs:annotation>
         <xs:sequence>
            <xs:element name="fields" minOccurs="0" maxOccurs="1" type="field-list"/>
            <xs:element name="criteria" minOccurs="1" maxOccurs="1" type="criteria-block"/>
         </xs:sequence>
      </xs:complexType>

      <!--+
         | A registry query.
         | Must contain one 'criteria' element.
         +-->
      <xs:element name="query" type="field-query">
         <xs:annotation>
            <xs:documentation xml:lang="en">
               A simple registry query, must contain one criteria element.
            </xs:documentation>
         </xs:annotation>
      </xs:element>

   </xs:schema>

For the example SQL :

   SELECT * FROM <registry> WHERE 
      (
      TYPE="white dwarf star"
      AND
         (
         WAVELENGTH="optical"
         OR
         WAVELENGTH="uv"
         )
      AND 
         (
         KEYWORD="BPM 16274"
         OR
         KEYWORD="GD 50"
         OR
         KEYWORD="HST photometric standards"
         )
      )

Maps to this in a registry query :

   <query>
      <criteria>
         <and>
            <field name="type" oper="EQUALS">white dwarf star</field>
            <or>
               <field name="wavelength" oper="EQUALS">optical</field>
               <field name="wavelength" oper="EQUALS">uv</field>
            </or>
            <or>
               <field name="keyword" oper="EQUALS">BPM 16274</field>
               <field name="keyword" oper="EQUALS">GD 50</field>
               <field name="keyword" oper="EQUALS">HST photometric standards</field>
            </or>
         </and>
      </criteria>
   </query>

If we want to specify the fields, then we could add the field list to this.

   SELECT type, coverage, format FROM <registry> WHERE 
      (
      TYPE="white dwarf star"
      AND
         (
         WAVELENGTH="optical"
         OR
         WAVELENGTH="uv"
         )
      AND 
         (
         KEYWORD="BPM 16274"
         OR
         KEYWORD="GD 50"
         OR
         KEYWORD="HST photometric standards"
         )
      )

Maps to this in a registry query

   <query>
      <fields>
         <field name="type"/>
         <field name="coverage"/>
         <field name="format"/>
      </fields>
      <criteria>
         <and>
            <field name="type" oper="EQUALS">white dwarf star</field>
            <or>
               <field name="wavelength" oper="EQUALS">optical</field>
               <field name="wavelength" oper="EQUALS">uv</field>
            </or>
            <or>
               <field name="keyword" oper="EQUALS">BPM 16274</field>
               <field name="keyword" oper="EQUALS">GD 50</field>
               <field name="keyword" oper="EQUALS">HST photometric standards</field>
            </or>
         </and>
      </criteria>
   </query>

If we make the default operation 'AND', this means that the enclosing 'and' can be dropped, simplifying the XML query to this

   <query>
      <fields>
         <field name="type"/>
         <field name="coverage"/>
         <field name="format"/>
      </fields>
      <criteria>
         <field name="type" oper="EQUALS">white dwarf star</field>
         <or>
            <field name="wavelength" oper="EQUALS">optical</field>
            <field name="wavelength" oper="EQUALS">uv</field>
         </or>
         <or>
            <field name="keyword" oper="EQUALS">BPM 16274</field>
            <field name="keyword" oper="EQUALS">GD 50</field>
            <field name="keyword" oper="EQUALS">HST photometric standards</field>
         </or>
      </criteria>
   </query>

Hopefully, this would be flexible enough to extend if we needed to, but not too difficult to implement in SQL.
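
To give a feel for how the nested form might be turned into SQL, here is a minimal Python sketch. The function names, the table name registry, the "?" placeholder style and the mapping from 'oper' values to SQL operators are all assumptions made for illustration; the children of 'criteria' are joined with the default 'AND' described above.

```python
import xml.etree.ElementTree as ET

# Assumed mapping from the 'oper' enumeration to SQL comparison operators.
SQL_OPERATORS = {
    "EQUALS": "=",
    "NOT EQUALS": "<>",
    "GREATER THAN": ">",
    "GREATER THAN EQUALS": ">=",
    "LESS THAN": "<",
    "LESS THAN EQUALS": "<=",
    "STARTS WITH": "LIKE",
    "CONTAINS": "LIKE",
}

QUERY = """
<query>
   <fields>
      <field name="type"/>
      <field name="coverage"/>
      <field name="format"/>
   </fields>
   <criteria>
      <field name="type" oper="EQUALS">white dwarf star</field>
      <or>
         <field name="wavelength" oper="EQUALS">optical</field>
         <field name="wavelength" oper="EQUALS">uv</field>
      </or>
      <or>
         <field name="keyword" oper="EQUALS">BPM 16274</field>
         <field name="keyword" oper="EQUALS">GD 50</field>
         <field name="keyword" oper="EQUALS">HST photometric standards</field>
      </or>
   </criteria>
</query>
"""


def checked_name(name):
    """Only plain identifiers are accepted, since names go into the SQL text."""
    if name is None or not name.isidentifier():
        raise ValueError(f"invalid field name: {name!r}")
    return name


def block_to_sql(block, joiner, params):
    """Turn a criteria block into a bracketed condition, collecting values in params."""
    parts = []
    for child in block:
        if child.tag == "field":
            oper = child.get("oper")
            value = child.text or ""
            if oper == "STARTS WITH":
                value = value + "%"
            elif oper == "CONTAINS":
                value = "%" + value + "%"
            params.append(value)
            parts.append(f"{checked_name(child.get('name'))} {SQL_OPERATORS[oper]} ?")
        elif child.tag in ("and", "or"):
            # Container elements recurse, joining their children with AND or OR.
            parts.append(block_to_sql(child, child.tag.upper(), params))
        else:
            raise ValueError(f"unexpected element: {child.tag}")
    return "(" + f" {joiner} ".join(parts) + ")"


def query_to_sql(xml_text, table="registry"):
    """Return (sql, params) for a query; a missing field list selects everything."""
    root = ET.fromstring(xml_text)
    names = [checked_name(f.get("name")) for f in root.findall("fields/field")]
    columns = ", ".join(names) if names else "*"
    criteria = root.find("criteria")
    if criteria is None:
        raise ValueError("query must contain one criteria element")
    params = []
    where = block_to_sql(criteria, "AND", params)  # default operation is AND
    return f"SELECT {columns} FROM {table} WHERE {where}", params


sql, params = query_to_sql(QUERY)
```

Keeping the values in a separate parameter list rather than in the SQL text is a design choice of this sketch, intended to avoid quoting problems with values such as "HST photometric standards".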

If we want external groups to use our query syntax, then we need to make it as simple as possible (well ...the query syntax is readable, the recursive schema syntax is a bit complex).

Possible extensions

Seeing as we are treating the data as nested XML, we could change the field 'name' to field 'path'. This would allow us to distinguish between fields with the same name. This would allow us to extend the data service schema to add more specific elements without breaking any existing queries.

For example :

  • We could add extra elements below the 'format' element to specify image formats and table formats.
  • We could add extra elements below the 'instrument' element to specify the instrument type.
  • By using 'field-path' rather than 'field-name' in the field list, we could distinguish between image format and table format without having to resort to complex field names like 'imageFormat'.
  • By using 'field-path' rather than 'field-name' in the criteria list, we could distinguish between the service type and a specific instrument type.

   <query>
      <fields>
         <field path="type"/>
         <field path="coverage"/>
         <field path="format/image"/>
         <field path="format/table"/>
      </fields>
      <criteria>
         <field path="type" oper="EQUALS">white dwarf star</field>
         <field path="instrument/type" oper="EQUALS">patrol</field>
         <or>
            <field path="wavelength" oper="EQUALS">optical</field>
            <field path="wavelength" oper="EQUALS">uv</field>
         </or>
         <or>
            <field path="keyword" oper="EQUALS">BPM 16274</field>
            <field path="keyword" oper="EQUALS">GD 50</field>
            <field path="keyword" oper="EQUALS">HST photometric standards</field>
         </or>
      </criteria>
   </query>
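
A minimal Python sketch of how a path could separate two fields that share a name. The registry entry, its element names and its values are invented purely for illustration, and the helper names are hypothetical; the slash-separated paths are resolved with the limited path support in the standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry entry; structure and values are invented for this example.
ENTRY = ET.fromstring("""
<service>
   <type>white dwarf star</type>
   <format>
      <image>FITS</image>
      <table>VOTable</table>
   </format>
   <instrument>
      <type>patrol</type>
   </instrument>
</service>
""")


def field_values(entry, path):
    """Return the text of every element reached by a slash-separated path."""
    return [(node.text or "").strip() for node in entry.findall(path)]


def matches_equals(entry, path, expected):
    """Evaluate a <field path="..." oper="EQUALS"> criterion against one entry."""
    return expected in field_values(entry, path)


service_type = field_values(ENTRY, "type")                # ['white dwarf star']
instrument_type = field_values(ENTRY, "instrument/type")  # ['patrol']
image_format = field_values(ENTRY, "format/image")        # ['FITS']
```

The path "type" only reaches the service type, while "instrument/type" reaches the instrument type, so neither field needs a compound name.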

We could also extend our syntax to ask for 'the format element and any of its child elements' by using wild cards.

   <query>
      <fields>
         <field path="type"/>
         <field path="coverage"/>
         <field path="format//*"/>
      </fields>
      <criteria>
         <field path="type" oper="EQUALS">white dwarf star</field>
         <field path="instrument/type" oper="EQUALS">patrol</field>
         <or>
            <field path="wavelength" oper="EQUALS">optical</field>
            <field path="wavelength" oper="EQUALS">uv</field>
         </or>
         <or>
            <field path="keyword" oper="EQUALS">BPM 16274</field>
            <field path="keyword" oper="EQUALS">GD 50</field>
            <field path="keyword" oper="EQUALS">HST photometric standards</field>
         </or>
      </criteria>
   </query>
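
One possible reading of the wild card, sketched in Python: a path ending in "//*" selects the named element together with everything nested beneath it. The registry entry and the function name select_nodes are hypothetical, and this interpretation of "//*" is an assumption rather than an agreed definition.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry entry; structure and values are invented for this example.
ENTRY = ET.fromstring(
    "<service>"
    "<type>white dwarf star</type>"
    "<format><image>FITS</image><table>VOTable</table></format>"
    "</service>"
)


def select_nodes(entry, path):
    """Resolve a field path; a trailing '//*' also pulls in all descendants."""
    if path.endswith("//*"):
        base_path = path[:-3] or "."  # '//*' on its own means the entry itself
        selected = []
        for base in entry.findall(base_path):
            # iter() yields the element itself, then its descendants in document order.
            selected.extend(base.iter())
        return selected
    return entry.findall(path)


wildcard_tags = [node.tag for node in select_nodes(ENTRY, "format//*")]
plain_tags = [node.tag for node in select_nodes(ENTRY, "format")]
```

With this reading, "format//*" returns the format element plus its image and table children, while the plain path "format" returns the format element only.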

-- DaveMorris - 30 Apr 2003

Topic attachments

| Attachment | Size | Date | Who | Comment |
| query_v0_1.xsd | 0.9 K | 23 Apr 2003 - 09:31 | ElizabethAuden | Query schema: v0.1 |
| portalQuery_v0_1.xsd | 1.6 K | 23 Apr 2003 - 09:33 | ElizabethAuden | Portal Query schema: v0.1 |
| dataMoverQuery_v0_1.xsd | 0.3 K | 23 Apr 2003 - 09:33 | ElizabethAuden | Data Mover Query schema: v0.1 |
| workflowQuery_v0_1.xsd | 0.3 K | 23 Apr 2003 - 09:34 | ElizabethAuden | Workflow Query schema: v0.1 |
| jobSchedulerQuery_v0_1.xsd | 0.3 K | 23 Apr 2003 - 09:34 | ElizabethAuden | Job Scheduler Query Schema: v0.1 |
Astrogrid.RegistryQuerySchemaDebate moved from Astrogrid.RegistryQuerySchema on 09 May 2003 - 14:43 by ElizabethAuden