Oracle Evaluation
| Version tested | Oracle 9i. Detailed version number: 9.2.0.1.0. |
| Test hardware | Compaq XP900 workstation. |
| Test operating system | Tru64 UNIX v5.1. |
| Test location | Edinburgh |
Time to download, install and start using
Oracle 9i was not downloaded. Rather, Oracle staff visited to install
it from CD-ROM. It took a while for Oracle to arrange for someone
to visit to perform the installation. Further, when the installation
was attempted (in early July) there were problems with the CD-ROMs which,
coupled with (unrelated) local network problems meant that it took about
a week and several visits to get an apparently working system. A further
visit was required a couple of weeks later to recover from a corrupt
database (it was not clear whether I'd corrupted the system or it had
never worked properly). Since that visit it has been fine.
Documentation
There are about 100 manuals, most of which seem to run to several
hundred pages. On-line versions were included in the installation and
can also be accessed from Oracle's home page. Both HTML and PDF versions
are included; the latter are suitable for printing. Paper copies can be
bought from Oracle.
Query Language
SQL92 limitations
Only a few minor omissions (see the table on pp28-30 of O'Reilly's
`SQL in a Nutshell' for most of the DBMSs we've been looking at).
There is no boolean datatype.
Joins seem to be performed using the `theta style' rather than ANSI
notation.
Programming language
PL/SQL.
Trig. and inverse trig. functions
Yes. Also has sinh, cosh and tanh.
Statistical functions
AVG - mean
CORR - correlation coefficient,
COVAR_POP, COVAR_SAMP - population and sample covariance,
CUME_DIST, PERCENT_RANK - cumulative distribution,
DENSE_RANK - ranking,
PERCENTILE_CONT, PERCENTILE_DISC - inverse distribution,
RANK - ranking,
REGR_*, - several linear regression functions,
STDDEV, STDDEV_POP, STDDEV_SAMP - standard deviation,
SUM - sum,
VAR_POP, VAR_SAMP, VARIANCE - variance.
Joins
cross join (but not recommended!),
inner join,
left outer join,
right outer join,
full join.
Uses the `theta style' rather than ANSI notation.
User-defined functions
PL/SQL or Java (that is what the manual says; I'm amazed that functions
can't be written in C).
Data Types
Integer, Floating point
Oracle has a single generic numeric datatype, NUMBER(p,s). p is the
precision and s the scale. p can range from 1 to 38, s from -84 to 127.
The manual says: `The NUMBER datatype stores zero, positive, and
negative fixed and floating-point numbers with magnitudes between
1.0x10**-130 and 9.9...9x10**125 ( ... ) with 38 digits of precision.
If you specify an arithmetic expression whose value has a magnitude
greater than or equal to 1.0x10**126, then Oracle returns an error.
Oracle allows ANSI SQL datatypes and represents them internally according
to the following scheme:
ANSI Oracle
NUMERIC(p,s), DECIMAL(p,s) NUMBER(p,s)
INTEGER, INT, SMALLINT NUMBER(38)
FLOAT, DOUBLE PRECISION, REAL NUMBER
As mentioned above there is no boolean data type. (I suppose that this
should not have come as a surprise; it would probably be unrealistic
to expect an oracle to give a straightforward yes/no answer.)
Date/time
DATE,
TIMESTAMP,
TIMESTAMP WITH TIME ZONE,
TIMESTAMP WITH LOCAL TIME ZONE,
INTERVAL YEAR TO MONTH,
INTERVAL DAY TO SECOND
String
VARCHAR2, NVARCHAR2, CHAR, NCHAR, LONG (long character data).
Spatial types
Yes. Oracle supports spatial data types through Oracle Spatial. The
following section is based on a brief perusal of the Oracle Spatial
manual. Unfortunately, time precluded carrying out any tests on Oracle
Spatial, though such tests would have been useful.
Oracle Spatial is an integrated set of functions for storing, accessing
and analysing spatial data. It operates on generalised spatial data, not
just latitude and longitude and can be used, for example, to represent
CAD/CAM type data. Thus it can handle Cartesian as well as spherical
polar coordinates. It can represent polygons and more specialised shapes
(rectangles, circles etc.) as well as points. Consequently, spatial is
very complicated and contains lots of facilities that are unlikely to be
mportant in astronomy, rather in the way that GIS systems do. It could
certainly be used to represent astronomical coordinates, but it brings a
lot of baggage with it.
Spatial does, however, support two-dimensional indices. Both R-tree
and Quadtree indices are available. The R-tree seems to be more
appropriate for our purposes, where tables are rarely updated. The size
of an R-tree index is estimated to be about 70 bytes for every row in
the table indexed. Thus, a table of a million rows would require an
index of about 70 Mb. Considerably more temporary space is required
whilst the index is being created.
User-defined?
Yes.
Null support
Full.
Indexing
B-tree, B-tree cluster, Hash cluster, Reverse key, Bitmap, Bitmap Join.
Oracle Spatial has R-tree and Quadtree.
Software Interfaces
Language APIs:
Java, C/C++, COBOL, Fortran, PL/I (deprecated).
Data Import/Export
ASCII import
There is a loader utility for importing text files. This a complex
program which allows a great deal of flexibility in the input format and
some pre-processing of the records as they are ingested. However,
because it is so flexible it is rather complicated to use, and I found
the manual somewhat opaque. For example, the input data file we used had
null fields represented as `\N'. The loader allowed the representation
of nulls to be specified in its control files. However, it took
objection to the `\' character, presumably interpreting it as a control
character and I had to put `\\N' in the control file. I only found this
out by guesswork and interpreting the errors I was getting; I could find
no mention of it in the manual.
ASCII export
Yes. Eg. the SQL interpreter allows output to be spooled to an output
file.
Binary import, export
Binary data files can be imported using the same utility as ASCII files.
It is possible to and export `binary' files, but these can only then be
imported into another Oracle database (ie. it is an efficient mechanism
for moving tables between different databases).
HTML export
Yes.
XML export
Not clear. Oracle certainly has facilities for creating and querying
XML databases. It also has an XML data type. However, it is not clear
that it is possible to generate an XML document from a conventional
Oracle database in the way that it is possible to generate an HTML one.
Scalability
Mapping to host file system
Complex and opaque. The usual way to manage an Oracle database is to
pre-create one or more large files which will hold all the tables etc.
in the database. Individual tables are hidden inside these files, so
it is not possible to estimate how much space each table takes. There
are complex facilities (which I've not investigated) for spreading the
container files across several disks in order to improve performance.
Oracle has an option to allow the container file(s) to grow as more
space is needed in a database. Initially I did not use this option
because (a) it is not the way Oracle is usually used and (b) it seems
somewhat dangerous as it allows disks to fill up. Attempts to use it
to monitor the disk space required by different sizes of table were
unsuccessful.
Support for 64-bit CPUs
Yes; the tests were run on a Compaq Alpha.
Support for SMP systems
Yes (I think).
Max. table size
`There are no limits on the physical size of tables'.
Max. single field
A character field of type LONG can be up to 2 Gb in size.
A raw binary field of type LONG RAW can also be up 2 Gb in size.
Binary and character large objects of various sorts can be up to 4 Gb.
Max. no. of rows
? Almost certainly unlimited in practice.
Max. no. of columns
1000
Max. indexed columns
`A table can have any number of indexes.'
Functionality
Oracle is a large and complex system which has many features not
touched on in this brief evaluation. For example, it is possible to
specify the partitioning and segmentation of a table in order to
control the way it is spread across one or more disks and hence give
better performance.
Oracle controls access to tables through the concept of `users'. Users
of an Oracle database are analogous to user accounts for an operating
system. There are database management tools to create and delete users
and to grant users access privileges to selected databases and subsets
of databases.
In addition to the basic pre-defined datatypes Oracle databases can
include programmer-defined data types and programmable triggers can be
set on tables.
The above are just a few of Oracle's features; there are many more not
mentioned here.
Usability
I found Oracle robust and reliable to use. The manuals were usually
very good, though there were a few instances where I found them a bit
opaque. However, starting with no knowledge of Oracle, by just reading
the manuals I was reasonably quickly able to discover how to perform
the tasks required to carry out the evaluations.
System Admin Impact
An Oracle database needs a `Database Administrator' (DBA) to look after
it. However, I suspect that the database administration requirements
for even a large repository of astronomical catalogues would be less
onerous than those of a typical commercial installation.
The DBA should be able to pick up most of what he needs to know from
the manuals (there is a comprehensive manual for DBA functions), perhaps
augmented by attending any suitable introductory courses. Some
assistance may be required in initially tuning the databases, but once
tuned the DBA should be able to keep them in tune.
Once the repository is set up I'd imagine that the recurrent, routine
workload should be quite small and could probably be accommodated amongst
a full roster of other duties.
Discussion
Oracle is an extensive, powerful and flexible DBMS. It is robust and
reliable and the documentation is usually good. All of which is as
would be expected from the leading commercial DBMS. I do not see any
show-stoppers that would prevent it being used to hold a repository
of large astronomical catalogues, such as the
AstroGrid data warehouse.
One significant cause for concern is the representation of numeric data,
which has implications for both the type conversions needed (internally
within Oracle) when accessing data and the storage space required to
hold tables.
The number of statistical functions available in Oracle-SQL, much more
than basic SQL, is a bonus. The Oracle DBMS and its associated tools
are a large and complex system, many features of which have not been
touched on in this brief evaluation. Some of them are unlikely to be
used in an astronomical context. Conversely, Oracle has some features,
such as support for XML databases and Web services, which are potentially
relevant to
AstroGrid but which have not been covered here. I hope that
the parallel project to set up a demonstration system using Oracle,
which is being carried out by Oracle staff, will address some of these
areas.
-- Clive Davenhall, 2002 August 16.