Notes on the emerging CAS prototype
What?
It's the prototype of the proposed authorization system for AstroGrid and the VO.
Why this page?
These are my notes on the progress of this nano-project. They're here because:
- it's good to have the design decisions recorded;
- it would be good to have your feedback and I'd rather have that concentrated in one place;
- this is a convenient medium for notes;
- if I don't try to write all this down in intelligible form I don't know if I really understand it myself;
- if I go completely Upminster from the strain of thinking of all this at least the project has some output to fall back on.
Hence, this is all stream-of-consciousness. One day soon, this page may morph into a concise description.
Design decisions
(These decisions apply to the prototype, note, and might change for the production version...so don't panic yet.)
The basic product is a web/grid service that emits authorization data in response to queries. That is, using authorization via web services is more of a priority than granting authorization via said services. If we get stuck with only a local interface to CAS for granting authorization, then that may be OK.
The auth data are emitted in XML according to a schema TBD (and to be attached to this page when it emerges). This is (almost) inevitable, given that web services are involved. The only serious alternative is to emit CAS proxy certificates in the style of the original Globus CAS. That seems unnecessary for this prototype and is anyway Too Hard.
The auth data are stored in XML (same schema) in the Xindice DB. Using the same XML schema is a natural choice. Xindice is an attractive choice for storage because:
- it's free-as-in-beer, so readily available;
- it's easy to install (read: I already have it installed);
- it supports XPath for queries, which is a good match to the structured data;
- it is supported as a Grid database-service in the OGSA-DAI prototypes; I may be able to use this.
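As a rough illustration of the XPath point, here is a minimal sketch that uses Python's standard xml.etree.ElementTree in place of Xindice. The schema is still TBD, so every element and attribute name below (authorities, authority, privilege, commissioner, commissionee, resource, permission) and every DN except the ING archive one is invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical auth data. The real schema is TBD, so all element and
# attribute names here are assumptions made for illustration only.
AUTH_XML = """
<authorities>
  <authority commissioner="/O=example/CN=custodians"
             commissionee="/O=example/CN=observers">
    <privilege resource="/OU=archive.ast.cam.ac.uk/CN=ING archive/CN=W/42/02"
               permission="read"/>
  </authority>
  <authority commissioner="/O=example/CN=custodians"
             commissionee="/O=example/CN=everyone">
    <privilege resource="/O=example/CN=public data"
               permission="read"/>
  </authority>
</authorities>
"""


def authorities_for(root, commissionee):
    """Return the authority elements granted to one GridRole DN."""
    # ElementTree implements a subset of XPath that includes
    # attribute-equality predicates such as [@name='value'].
    return root.findall(f".//authority[@commissionee='{commissionee}']")


root = ET.fromstring(AUTH_XML)
found = authorities_for(root, "/O=example/CN=observers")
for authority in found:
    privilege = authority.find("privilege")
    print(privilege.get("permission"), privilege.get("resource"))
```

A real XML database would run the same kind of path expression on the server side; the point here is only that one short expression selects the structured records of interest.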
(Tentative) I may choose to abstract the auth data from the XML structure to a flat structure in an RDBMS. This would simplify the searching and might make it go faster. Is Xindice performant?
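For comparison, a flat layout might look something like the sketch below, which uses Python's built-in sqlite3 module purely as a stand-in RDBMS. The table name, column names and sample values are all assumptions; nothing here says whether this would actually be faster than Xindice.

```python
import sqlite3

# Hypothetical flat layout: one row per authority. Table and column
# names are assumptions, not part of any agreed design.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE authority (
           commissioner TEXT NOT NULL,
           commissionee TEXT NOT NULL,
           resource     TEXT NOT NULL,
           permission   TEXT NOT NULL
       )"""
)
# An index on the lookup columns is where any speed-up would come from.
conn.execute("CREATE INDEX idx_lookup ON authority (resource, commissionee)")
conn.executemany(
    "INSERT INTO authority VALUES (?, ?, ?, ?)",
    [
        ("g-archive", "g-observers", "r-W/42/02", "read"),
        ("g-archive", "g-everyone", "r-public", "read"),
    ],
)

# Searching becomes a simple equality match on two columns.
rows = conn.execute(
    "SELECT commissioner, permission FROM authority "
    "WHERE resource = ? AND commissionee = ?",
    ("r-W/42/02", "g-observers"),
).fetchall()
print(rows)
```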
Philosophy (and ontology too, God help us)
A GridRole is a group containing zero or more individuals, each of whom is a member of the GridCommunity defining the GridRole.
GridRoles follow the Party pattern (see IAAModelNotes3).
Each member of the GridCommunity is a member of a GridRole consisting of just themself. I.e. each individual in the community has one or more GridRoles. This means that privileges are always associated with GridRoles and not with individuals. A privilege given logically to an individual is technically given to their personal GridRole. This is a unifying and simplifying assumption.
GridRoles and Individuals both follow the Party pattern. Parties are named in Grids by DistinguishedNames. Hence, GridRoles need DNs too.
An Authority is the association of exactly two GridRoles with exactly one Privilege. One of these GridRoles is the Commissioner and the other is the Commissionee. The Commissioner grants the privilege to the Commissionee by recording the Authority in CAS. There may be many Authorities between the same Commissioner and Commissionee, but with different Privileges.
A privilege associates a Permission with a Resource.
A Resource is a very general idea. It could be a grid service; a major data-set (e.g. "The Sloan EDR"); a small data-set (e.g. a single observation); or something inside CAS, like a GridRole.
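The definitions so far can be written down as a small data model. This is only a sketch in Python dataclasses: the class and field names mirror the terms above, but the "/CN=personal" suffix for personal roles, the permission vocabulary and the example DNs under /O=example are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Privilege:
    """A Permission applied to a Resource."""
    permission: str   # e.g. "read"; the vocabulary is an assumption
    resource_dn: str  # the DN of the Resource


@dataclass
class GridRole:
    """A Party: a named group of zero or more individuals."""
    dn: str
    members: set = field(default_factory=set)  # DNs of individuals


@dataclass(frozen=True)
class Authority:
    """Exactly two GridRoles associated with exactly one Privilege."""
    commissioner_dn: str
    commissionee_dn: str
    privilege: Privilege


def personal_role(individual_dn):
    """Build the one-member GridRole that every individual has.

    The "/CN=personal" suffix is an invented naming convention.
    """
    return GridRole(dn=individual_dn + "/CN=personal", members={individual_dn})


archive = GridRole(dn="/O=example/CN=archive custodians")
pi = personal_role("/O=example/CN=A. Observer")
grant = Authority(
    commissioner_dn=archive.dn,
    commissionee_dn=pi.dn,
    privilege=Privilege(
        "read", "/OU=archive.ast.cam.ac.uk/CN=ING archive/CN=W/42/02"
    ),
)
print(grant)
```

Note that the Authority refers to the PI's personal GridRole rather than to the PI directly, which is the unifying assumption described above.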
The exact nature of a given Resource doesn't have to be obvious from the CAS records. That is, so long as the nature of the Resource is known to the service providing the resource, then it's OK. Example: let the resource with DistinguishedName /OU=archive.ast.cam.ac.uk/CN=ING archive/CN=W/42/02 stand for "all the data from PATT proposal W/42/02 (i.e. the 42nd approved proposal on the WHT in 2002) in the ING archive at Cambridge". CAS doesn't need to know the details (e.g. run numbers) of this data-set so long as archive.ast.cam.ac.uk knows to associate the DN with the right data. This is essential to avoid bloating the CAS by requiring that all resources be described down to the finest possible level of sub-division.
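On the provider's side, this could be as simple as a private lookup table keyed by the DN. The sketch below is hypothetical: the run identifiers are made up and the function name is invented; the only point is that the mapping lives at the archive and never appears in CAS.

```python
# Provider-side table held by the archive, never by CAS.
# The run identifiers are made up for illustration.
RUNS_BY_RESOURCE_DN = {
    "/OU=archive.ast.cam.ac.uk/CN=ING archive/CN=W/42/02": [
        "run-0001",
        "run-0002",
    ],
}


def runs_for(resource_dn):
    """Translate an opaque resource DN into the provider's own data units."""
    return RUNS_BY_RESOURCE_DN.get(resource_dn, [])


print(runs_for("/OU=archive.ast.cam.ac.uk/CN=ING archive/CN=W/42/02"))
```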
CAS is assumed to contain authorities for GridRoles. These say who is allowed to add individuals to GridRoles or to eject individuals. There should be a GridRole for the CAS administrators. They have automatic rights to control the membership of all the other GridRoles. However, they don't have any automatic authority over resources outside CAS.
The community as a whole is represented by a GridRole. Resource providers can grant privileges to the whole community using this.
Use-case stuff
Looking up an authority
When a resource (i.e. a service) on the grid gets a request to use a restricted facility, it looks up the authority of the requester in CAS. The resource needs answers to five questions.
- What authorities are there concerning me?
- What GridRoles are in these authorities?
- Does the requester have any of these GridRoles?
- How do these authorities constrain the job?
- Who commissioned the authorities?
The first three questions are obvious. Number four is a matter of extracting details like read vs. write access or maximum storage space allowed. Number five is subtle. Any member of CAS can record an authority on any resource name, since names are abstract and CAS explicitly doesn't know what they refer to. Any member who records an authority has their GridRole recorded as the commissioner of the authority. Therefore, a service can only trust authorities commissioned by a GridRole it recognizes and trusts.
Therefore, the requests to the CAS database for a given query to CAS will be:
- Fetch all GridRoles g pertaining to individual i.
- For each g, fetch all authorities for resource r commissioned to g by c.
The set of authorities is then compared to find the one that gives i the most access.
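The two fetches and the final comparison could be sketched as follows. This is an in-memory stand-in, not the CAS interface: the dictionaries and list replace the database, the names are hypothetical, and the ranking of "read-write" above "read" is an assumed way of deciding which authority gives the most access.

```python
from dataclasses import dataclass

# Assumption: permissions can be ranked; a higher number means more access.
ACCESS_RANK = {"read": 1, "read-write": 2}


@dataclass(frozen=True)
class Authority:
    commissioner: str
    commissionee: str
    resource: str
    permission: str


def roles_of(individual, memberships):
    """Fetch all GridRoles g pertaining to individual i."""
    return {role for role, members in memberships.items() if individual in members}


def best_authority(individual, resource, trusted, memberships, authorities):
    """Return the authority giving the most access, or None if there is none.

    Only authorities commissioned by the GridRole `trusted` are considered,
    which is how the service answers question five.
    """
    roles = roles_of(individual, memberships)
    candidates = [
        a for a in authorities
        if a.commissionee in roles
        and a.resource == resource
        and a.commissioner == trusted
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda a: ACCESS_RANK[a.permission])


memberships = {
    "g-alice": {"alice"},
    "g-observers": {"alice", "bob"},
    "g-mallory": {"mallory"},
}
authorities = [
    Authority("g-archive", "g-observers", "r", "read"),
    Authority("g-archive", "g-alice", "r", "read-write"),
    # Anyone can record an authority on any resource name ...
    Authority("g-mallory", "g-mallory", "r", "read-write"),
]

print(best_authority("alice", "r", "g-archive", memberships, authorities))
# ... but the service ignores it because it does not trust the commissioner.
print(best_authority("mallory", "r", "g-archive", memberships, authorities))
```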
Recording an authority
Presume that a service provider has a GridRole g in CAS whose members are all trusted custodians of the resource.
- A representative of the service provider logs in to CAS using his/her personal identity i, asking CAS to record an authority with g as the commissioner and some GridRole G as the commissionee.
- CAS checks:
- That i can be authenticated with GSI.
- That i is a member of g.
- That G exists.
- CAS records the authority if the checks are passed.
This makes g a kind of digital signature on the authority. The provider can then trust as genuine all authorities with g as commissioner if the provider also trusts that CAS authenticates i correctly and that CAS does not allow intruders to become members of g.
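The three checks can be sketched as below. GSI authentication is stubbed out as a set of already-authenticated identities, and the dictionary layout, the CasError class and the function name are assumptions for the sake of the example rather than a proposed interface.

```python
class CasError(Exception):
    """Raised when a request to record an authority fails a check."""


def record_authority(cas, individual, commissioner, commissionee, privilege):
    """Record an authority if the three checks pass.

    `cas` is a plain dict standing in for the CAS database:
      cas["authenticated"]  identities that passed GSI (stubbed as a set)
      cas["roles"]          mapping of GridRole name -> set of members
      cas["authorities"]    list of recorded authorities
    """
    if individual not in cas["authenticated"]:
        raise CasError(f"{individual} is not authenticated")
    if individual not in cas["roles"].get(commissioner, set()):
        raise CasError(f"{individual} is not a member of {commissioner}")
    if commissionee not in cas["roles"]:
        raise CasError(f"GridRole {commissionee} does not exist")
    authority = (commissioner, commissionee, privilege)
    cas["authorities"].append(authority)
    return authority


cas = {
    "authenticated": {"i", "mallory"},
    "roles": {"g": {"i"}, "G": {"alice", "bob"}},
    "authorities": [],
}
record_authority(cas, "i", "g", "G", ("read", "r"))

# An authenticated individual who is not a member of g cannot use g's name.
try:
    record_authority(cas, "mallory", "g", "G", ("read-write", "r"))
except CasError as err:
    print("rejected:", err)
```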
Revoking or changing an authority
Revoking an authority means removing it from CAS so that a subsequent query doesn't find it. Presume that CAS keeps some audit trail of past authorities.
Only the commissioner of an authority can change or remove it.
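A minimal sketch of revocation, assuming the audit trail is simply a second list that revoked authorities are moved to. The tuple layout and function name are hypothetical, and "acting role" here means the GridRole on whose behalf the authenticated individual is acting.

```python
def revoke_authority(active, audit_trail, authority, acting_role):
    """Move an authority from the active list to the audit trail.

    Only the commissioner recorded on the authority may revoke it.
    """
    commissioner = authority[0]
    if acting_role != commissioner:
        raise PermissionError("only the commissioner may revoke this authority")
    active.remove(authority)       # later queries no longer find it
    audit_trail.append(authority)  # but a record of it is kept


# Layout assumed here: (commissioner, commissionee, permission, resource).
authority = ("g", "G", "read", "r")
active = [authority]
audit_trail = []

try:
    revoke_authority(active, audit_trail, authority, acting_role="G")
except PermissionError as err:
    print("refused:", err)

revoke_authority(active, audit_trail, authority, acting_role="g")
print(active, audit_trail)
```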
Delegation by guest membership
Suppose that there is a GridRole for "observers on PATT proposal W/42/02". This is created by the archive that receives the data from W/42/02 and is used to control access to the data of that proposal during the one-year proprietary period. The data concerned are read-only, and read access is restricted to the given GridRole.
The proprietary period is a conventional courtesy given by the archivists to the observers. Ultimately, the archive doesn't care who reads these data, but the observers probably do. Therefore, the archive doesn't care who the observers delegate their access to.
Initially, the archive system creates the GridRole in CAS automatically when the data appear in the archive. The archivists initially put the PI into this role. The archivists mark the GridRole such that any party having the role can commission the role for another member of the community.
Subsequently, the PI brings in all the other observers, plus their collaborators, plus all the students of the above.
This is an important CAS function: an open group where members can introduce guests.
This simple case applies when the restrictions on the resource are for the benefit of the users, not the service providers.
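The open-group behaviour might be sketched like this. The class, the open_to_guests flag and the member names are hypothetical; the flag stands for whatever mark the archivists put on the GridRole.

```python
class GridRole:
    """A group that may optionally let its members introduce guests."""

    def __init__(self, name, members=(), open_to_guests=False):
        self.name = name
        self.members = set(members)
        self.open_to_guests = open_to_guests

    def introduce(self, sponsor, guest, community):
        """Let an existing member bring another community member into the role."""
        if not self.open_to_guests:
            raise PermissionError(f"{self.name} is a closed group")
        if sponsor not in self.members:
            raise PermissionError(f"{sponsor} does not hold {self.name}")
        if guest not in community:
            raise ValueError(f"{guest} is not a member of the community")
        self.members.add(guest)


community = {"pi", "co-observer", "student"}

# The archive creates the role with only the PI in it and marks it open.
observers = GridRole("observers on W/42/02", members={"pi"}, open_to_guests=True)

# The PI brings in a co-observer, who in turn brings in a student.
observers.introduce("pi", "co-observer", community)
observers.introduce("co-observer", "student", community)
print(sorted(observers.members))
```

Guests become full members in this sketch, so they can introduce further guests themselves, which matches the chain of observers, collaborators and students described above.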
Now, a more complicated example. Presume that there exists a research group g0 at MSSL which needs to use computers and discs, cd, at UCL.
Membership of g0 is fluid; group members can introduce their collaborators as guests without checking with any service providers.
The administrators of the resources, who are not members of the research group, care about who uses the resources since they don't want their system abused. Suppose that the UCL sysadmins are represented in CAS as GridRole g1, and the MSSL management as GridRole g2. Suppose, for the sake of argument, that the sysadmins trust the MSSL management to control who has access to the resources.
First, either g1 or g2 creates a GridRole g3 as a clone of g0 to represent the researchers. Initially, g3 has the same members as g0. Then g1 writes three authorities:
- for g3 to use cd;
- for g2 to change the membership of g3;
- for g1 to change the membership of g3.
Then g1 lets g2 add and remove members of g3 as necessary. Note that members of g3 cannot introduce collaborators without the cooperation of g1 or g2. Over time, the membership of g0 may diverge from g3.
This is a crucial operation for CAS: restricting resource usage by cloning a closed group from an open group. It absolves g1 of the need to know a priori the membership of g0 and also of the need to trust every individual member of g3. They just need to trust g2 to manage g3, and to know that g2 trusts everybody in g3.
This case applies when the restriction is for the benefit of the service provider.
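The cloning step and the three authorities might be sketched as below. Plain dictionaries stand in for CAS, the individual names are invented, and the function names are hypothetical.

```python
def clone_role(roles, source, clone):
    """Create `clone` with a copy of the current members of `source`."""
    roles[clone] = set(roles[source])


def change_membership(roles, managers, acting_role, target, add=(), remove=()):
    """Change the members of `target` if `acting_role` has authority to do so."""
    if acting_role not in managers.get(target, set()):
        raise PermissionError(f"{acting_role} may not change {target}")
    roles[target] |= set(add)
    roles[target] -= set(remove)


roles = {
    "g0": {"ann", "bob"},    # open research group at MSSL
    "g1": {"ucl-admin"},     # UCL sysadmins
    "g2": {"mssl-manager"},  # MSSL management
}
clone_role(roles, "g0", "g3")

# The three authorities written by g1:
usage = {("g3", "cd")}            # g3 may use cd
managers = {"g3": {"g1", "g2"}}   # g2 and g1 may change the membership of g3

# g0 is open, so it picks up a guest; g3 is unaffected.
roles["g0"].add("guest")

# g2 admits a new researcher to g3 on behalf of the sysadmins.
change_membership(roles, managers, "g2", "g3", add={"carl"})

# Members of g3 cannot extend g3 themselves.
try:
    change_membership(roles, managers, "g3", "g3", add={"guest"})
except PermissionError as err:
    print("refused:", err)

print(sorted(roles["g0"]), sorted(roles["g3"]))
```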
Finally, a mixed example. (Thanks to KonaAndrews for illuminating this.) Consider a user leasing some storage in MySpace. The MySpace provider owns the storage media, but the leasing user owns the data stored there. It is proper for the user to share read access to the data, but probably not to share write access to the storage; that is, allowing an arbitrary third party to put arbitrary data in the storage might contravene the service provider's terms of usage. Therefore, when the storage lease is granted, the service provider writes two authorities into CAS:
- read-write permission to a GridRole that is a closed group including the user;
- read-only permission to a GridRole that is an open group including the user.
The read-write permission goes naturally to the user's personal GridRole, the one containing only him or her. This role is thus assumed to be closed. By implication, there is another personal role of "user plus collaborators" that should exist for all users in the community, and this is an open role.
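A sketch of the two authorities for a storage lease follows. It assumes that the open "user plus collaborators" role is meant to receive read access only, as the argument about sharing read but not write access implies. The role names, permission names and the may() helper are hypothetical.

```python
# Assumption: each permission name expands to a set of allowed actions.
ALLOWED_ACTIONS = {
    "read-write": {"read", "write"},
    "read-only": {"read"},
}


def may(individual, action, roles, authorities):
    """True if any authority lets `individual` perform `action` on the storage."""
    return any(
        individual in roles[role] and action in ALLOWED_ACTIONS[permission]
        for role, permission in authorities
    )


roles = {
    "user-personal": {"user"},                            # closed, one member
    "user-plus-collaborators": {"user", "collaborator"},  # open to guests
}

# Written by the storage provider when the lease is granted.
authorities = [
    ("user-personal", "read-write"),
    ("user-plus-collaborators", "read-only"),
]

print(may("user", "write", roles, authorities))
print(may("collaborator", "read", roles, authorities))
print(may("collaborator", "write", roles, authorities))
```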
(Hmmmmmm....not really sure about all this......)
-- GuyRixon - 11 Jul 2002