r1 - 16 Aug 2002 - 13:59:17 - ClivePageYou are here: TWiki >  Astrogrid Web  >  DocStore > DataDocs > DbmsEvaluations > OracleEvaluation

Oracle Evaluation

Version tested Oracle 9i. Detailed version number: 9.2.0.1.0.
Test hardware Compaq XP900 workstation.
Test operating system Tru64 UNIX v5.1.
Test location Edinburgh

Time to download, install and start using

Oracle 9i was not downloaded. Rather, Oracle staff visited to install it from CD-ROM. It took a while for Oracle to arrange for someone to visit to perform the installation. Further, when the installation was attempted (in early July) there were problems with the CD-ROMs which, coupled with (unrelated) local network problems meant that it took about a week and several visits to get an apparently working system. A further visit was required a couple of weeks later to recover from a corrupt database (it was not clear whether I'd corrupted the system or it had never worked properly). Since that visit it has been fine.

Documentation

There are about 100 manuals, most of which seem to run to several hundred pages. On-line versions were included in the installation and can also be accessed from Oracle's home page. Both HTML and PDF versions are included; the latter are suitable for printing. Paper copies can be bought from Oracle.

Query Language

SQL92 limitations

Only a few minor omissions (see the table on pp28-30 of O'Reilly's `SQL in a Nutshell' for most of the DBMSs we've been looking at).

There is no boolean datatype.

Joins seem to be performed using the `theta style' rather than ANSI notation.

Programming language

PL/SQL.

Trig. and inverse trig. functions

Yes. Also has sinh, cosh and tanh.

Statistical functions

  AVG                         - mean
  CORR                        - correlation coefficient,
  COVAR_POP, COVAR_SAMP       - population and sample covariance,
  CUME_DIST, PERCENT_RANK     - cumulative distribution,
  DENSE_RANK                  - ranking,
  PERCENTILE_CONT, PERCENTILE_DISC - inverse distribution,
  RANK                        - ranking,
  REGR_*,                     - several linear regression functions,
  STDDEV, STDDEV_POP, STDDEV_SAMP - standard deviation,
  SUM                         - sum,
  VAR_POP, VAR_SAMP, VARIANCE - variance.

Joins

  cross join (but not recommended!),
  inner join,
  left outer join,
  right outer join,
  full join.

Uses the `theta style' rather than ANSI notation.

User-defined functions

PL/SQL or Java (that is what the manual says; I'm amazed that functions can't be written in C).

Data Types

Integer, Floating point

Oracle has a single generic numeric datatype, NUMBER(p,s). p is the precision and s the scale. p can range from 1 to 38, s from -84 to 127. The manual says: `The NUMBER datatype stores zero, positive, and negative fixed and floating-point numbers with magnitudes between 1.0x10**-130 and 9.9...9x10**125 ( ... ) with 38 digits of precision. If you specify an arithmetic expression whose value has a magnitude greater than or equal to 1.0x10**126, then Oracle returns an error.

Oracle allows ANSI SQL datatypes and represents them internally according to the following scheme:

  ANSI                             Oracle
  NUMERIC(p,s), DECIMAL(p,s)       NUMBER(p,s)
  INTEGER, INT, SMALLINT           NUMBER(38)
  FLOAT, DOUBLE PRECISION, REAL    NUMBER

As mentioned above there is no boolean data type. (I suppose that this should not have come as a surprise; it would probably be unrealistic to expect an oracle to give a straightforward yes/no answer.)

Date/time

  DATE,
  TIMESTAMP,
  TIMESTAMP WITH TIME ZONE,
  TIMESTAMP WITH LOCAL TIME ZONE,
  INTERVAL YEAR TO MONTH,
  INTERVAL DAY TO SECOND

String

VARCHAR2, NVARCHAR2, CHAR, NCHAR, LONG (long character data).

Spatial types

Yes. Oracle supports spatial data types through Oracle Spatial. The following section is based on a brief perusal of the Oracle Spatial manual. Unfortunately, time precluded carrying out any tests on Oracle Spatial, though such tests would have been useful.

Oracle Spatial is an integrated set of functions for storing, accessing and analysing spatial data. It operates on generalised spatial data, not just latitude and longitude and can be used, for example, to represent CAD/CAM type data. Thus it can handle Cartesian as well as spherical polar coordinates. It can represent polygons and more specialised shapes (rectangles, circles etc.) as well as points. Consequently, spatial is very complicated and contains lots of facilities that are unlikely to be mportant in astronomy, rather in the way that GIS systems do. It could certainly be used to represent astronomical coordinates, but it brings a lot of baggage with it.

Spatial does, however, support two-dimensional indices. Both R-tree and Quadtree indices are available. The R-tree seems to be more appropriate for our purposes, where tables are rarely updated. The size of an R-tree index is estimated to be about 70 bytes for every row in the table indexed. Thus, a table of a million rows would require an index of about 70 Mb. Considerably more temporary space is required whilst the index is being created.

User-defined?

Yes.

Null support

Full.

Indexing

B-tree, B-tree cluster, Hash cluster, Reverse key, Bitmap, Bitmap Join. Oracle Spatial has R-tree and Quadtree.

Software Interfaces

Language APIs: Java, C/C++, COBOL, Fortran, PL/I (deprecated).

Data Import/Export

ASCII import

There is a loader utility for importing text files. This a complex program which allows a great deal of flexibility in the input format and some pre-processing of the records as they are ingested. However, because it is so flexible it is rather complicated to use, and I found the manual somewhat opaque. For example, the input data file we used had null fields represented as `\N'. The loader allowed the representation of nulls to be specified in its control files. However, it took objection to the `\' character, presumably interpreting it as a control character and I had to put `\\N' in the control file. I only found this out by guesswork and interpreting the errors I was getting; I could find no mention of it in the manual.

ASCII export

Yes. Eg. the SQL interpreter allows output to be spooled to an output file.

Binary import, export

Binary data files can be imported using the same utility as ASCII files. It is possible to and export `binary' files, but these can only then be imported into another Oracle database (ie. it is an efficient mechanism for moving tables between different databases).

HTML export

Yes.

XML export

Not clear. Oracle certainly has facilities for creating and querying XML databases. It also has an XML data type. However, it is not clear that it is possible to generate an XML document from a conventional Oracle database in the way that it is possible to generate an HTML one.

Scalability

Mapping to host file system

Complex and opaque. The usual way to manage an Oracle database is to pre-create one or more large files which will hold all the tables etc. in the database. Individual tables are hidden inside these files, so it is not possible to estimate how much space each table takes. There are complex facilities (which I've not investigated) for spreading the container files across several disks in order to improve performance.

Oracle has an option to allow the container file(s) to grow as more space is needed in a database. Initially I did not use this option because (a) it is not the way Oracle is usually used and (b) it seems somewhat dangerous as it allows disks to fill up. Attempts to use it to monitor the disk space required by different sizes of table were unsuccessful.

Support for 64-bit CPUs

Yes; the tests were run on a Compaq Alpha.

Support for SMP systems

Yes (I think).

Max. table size

`There are no limits on the physical size of tables'.

Max. single field

A character field of type LONG can be up to 2 Gb in size. A raw binary field of type LONG RAW can also be up 2 Gb in size. Binary and character large objects of various sorts can be up to 4 Gb.

Max. no. of rows

? Almost certainly unlimited in practice.

Max. no. of columns

1000

Max. indexed columns

`A table can have any number of indexes.'

Functionality

Oracle is a large and complex system which has many features not touched on in this brief evaluation. For example, it is possible to specify the partitioning and segmentation of a table in order to control the way it is spread across one or more disks and hence give better performance.

Oracle controls access to tables through the concept of `users'. Users of an Oracle database are analogous to user accounts for an operating system. There are database management tools to create and delete users and to grant users access privileges to selected databases and subsets of databases.

In addition to the basic pre-defined datatypes Oracle databases can include programmer-defined data types and programmable triggers can be set on tables.

The above are just a few of Oracle's features; there are many more not mentioned here.

Usability

I found Oracle robust and reliable to use. The manuals were usually very good, though there were a few instances where I found them a bit opaque. However, starting with no knowledge of Oracle, by just reading the manuals I was reasonably quickly able to discover how to perform the tasks required to carry out the evaluations.

System Admin Impact

An Oracle database needs a `Database Administrator' (DBA) to look after it. However, I suspect that the database administration requirements for even a large repository of astronomical catalogues would be less onerous than those of a typical commercial installation. The DBA should be able to pick up most of what he needs to know from the manuals (there is a comprehensive manual for DBA functions), perhaps augmented by attending any suitable introductory courses. Some assistance may be required in initially tuning the databases, but once tuned the DBA should be able to keep them in tune.

Once the repository is set up I'd imagine that the recurrent, routine workload should be quite small and could probably be accommodated amongst a full roster of other duties.

Discussion

Oracle is an extensive, powerful and flexible DBMS. It is robust and reliable and the documentation is usually good. All of which is as would be expected from the leading commercial DBMS. I do not see any show-stoppers that would prevent it being used to hold a repository of large astronomical catalogues, such as the AstroGrid data warehouse. One significant cause for concern is the representation of numeric data, which has implications for both the type conversions needed (internally within Oracle) when accessing data and the storage space required to hold tables.

The number of statistical functions available in Oracle-SQL, much more than basic SQL, is a bonus. The Oracle DBMS and its associated tools are a large and complex system, many features of which have not been touched on in this brief evaluation. Some of them are unlikely to be used in an astronomical context. Conversely, Oracle has some features, such as support for XML databases and Web services, which are potentially relevant to AstroGrid but which have not been covered here. I hope that the parallel project to set up a demonstration system using Oracle, which is being carried out by Oracle staff, will address some of these areas.

-- Clive Davenhall, 2002 August 16.

Edit | Attach | Printable | Raw View | Backlinks: Web, All Webs | History: r1 | More topic actions
 
AstroGrid Service Click here for the
AstroGrid Service Web
This is the AstroGrid
Development Wiki
This site is powered by the TWiki collaboration platformCopyright © by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback