Server side

Access controls

We have a basic access control system with three levels of permissions.
  • public - anyone can read and write
  • protected - anyone can read, owner can write
  • private - only owner can read or write
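
As a rough illustration, the three levels could be expressed as a small Java enum. This is only a sketch: the name AccessPolicy and the use of plain strings for the user and owner are assumptions made for the example, not the service's actual classes.

```java
/** Illustrative sketch of the three permission levels; not the service's actual classes. */
enum AccessPolicy {
    PUBLIC, PROTECTED, PRIVATE;

    /** Anyone can read unless the node is private, in which case only the owner can. */
    boolean canRead(String user, String owner) {
        return this != PRIVATE || owner.equals(user);
    }

    /** Anyone can write a public node; otherwise only the owner can. */
    boolean canWrite(String user, String owner) {
        return this == PUBLIC || owner.equals(user);
    }
}
```

For example, AccessPolicy.PROTECTED.canRead("toad", "frog") is true, while AccessPolicy.PROTECTED.canWrite("toad", "frog") is false.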

System identity

Create a 'system' account to enable background threads to modify data, e.g. the garbage collector needs to be able to delete orphaned files.

At the moment the access controls don't get in the way, but as I add more access controls, they are likely to cause side effects for the background threads.

Admin pages

Add tools to change the owner and policy to the JSP pages. Create an 'admin' account to allow system admin to make changes.

At the moment the only way to set the access controls is to log in to the database and set the values by hand.

Community

In order to meet our use cases, we need a couple more levels of access control.

One use case that isn't covered yet is how a Community service creates home spaces for new users. The plan is for Community to be configured with the URI of the top level '/home' directory for that community.

The permissions on that directory should be set to allow anyone from that community to create new directories, but once created they become private. This means that when a new user 'frog' is created, Community generates a new X.509 certificate for the user and then calls the VOSpace service to create the home directory.

  • Create user 'frog'
  • Create certificate 'frog'
  • Call vospace to create '/home/frog' using frog's certificate.
  • Owner on '/home/frog' set to frog's certificate.
  • Access controls on '/home/frog' set to private.

In order to do this, the access controls on the /home directory need to allow the Community service to create new directories using the new user's certificate. The details of how this should work are still to be defined. The simplest form would be to allow anyone who has been authenticated to create new directories in the home directory, but once they are created they become protected or private.
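
The simplest form could look something like the following sketch. Since the details are still to be defined, everything here is an assumption: the class name HomeDirectorySketch, the use of a plain string to stand in for the identity taken from a verified certificate, and the choice of private rather than protected for the new directory.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch only: '/home' lets any authenticated caller create a child, which is then private. */
class HomeDirectorySketch {

    /** A home directory entry: path, owning identity, and a private flag. */
    static final class Node {
        final String path;
        final String owner;
        final boolean isPrivate;

        Node(String path, String owner, boolean isPrivate) {
            this.path = path;
            this.owner = owner;
            this.isPrivate = isPrivate;
        }
    }

    private final Map<String, Node> children = new HashMap<>();

    /**
     * Creates '/home/<name>' for an authenticated identity. The identity string
     * stands in for the subject of a verified certificate (assumption); null
     * means the caller was not authenticated.
     */
    Node createHome(String identity, String name) {
        if (identity == null) {
            throw new SecurityException("authentication required");
        }
        String path = "/home/" + name;
        if (children.containsKey(path)) {
            throw new IllegalStateException("already exists: " + path);
        }
        Node node = new Node(path, identity, true); // owner = creator, policy = private
        children.put(path, node);
        return node;
    }

    /** Only the owner may access a private home directory. */
    boolean canAccess(String identity, String path) {
        Node node = children.get(path);
        return node != null && (!node.isPrivate || node.owner.equals(identity));
    }
}
```

With this rule, createHome("frog", "frog") yields a '/home/frog' node owned by frog that no other identity can access, and an unauthenticated call is rejected.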

Performance issues

Delete

The current service implementation treats the delete of a directory tree as a recursive operation.

When the top of a directory tree is deleted, the service iterates through all of the child nodes and deletes them first, before it deletes the top node. This means that a call to delete can take a long time for an arbitrarily deep directory tree.

Deleting a large and complex tree can cause the SOAP service call to fail. When this happens, the client thinks that the delete failed, but on the server the delete operation is still in progress, which causes problems when the delete finally completes.

The delete call was implemented as a recursive operation for two reasons.

  • To maintain backwards compatibility with the existing MySpace service API.

The MySpace service API assumes that a delete operation will delete all of the child nodes immediately. If the delete call does not delete all of the child nodes, then a client using the MySpace service interface could gain access to a node that has been marked for deletion. This is not an issue with the VOSpace service API because there is no way to access a node without going via the parent node(s) using the full path to the node.

  • To maintain equivalent behavior to the Unix rm command.

The Unix rm command, when run recursively (rm -r), deletes all of the child nodes before it removes the top node. In the process it checks the access permissions on each child node before deleting it. If the rm operation finds a protected file it will cancel the operation, leaving the protected file and its ancestors in place.

For example :

If a private file exists within the /var/local tree

    /var/local/path/path/private

Then deleting the top level directory will fail with an access permission error.

    rm -r /var/local
    Permission denied - can't delete /var/local/path/path/private

If this is not an issue for VOSpace, then the delete operation can be modified to be a single step operation rather than a recursive one. The initial delete operation just needs to unlink the top level node, mark it for deletion, and return. The service can have a background thread that picks up the nodes marked for deletion and performs the rest of the operation, including deleting the files from disk etc.

This would reduce the time cost of a complex delete from several minutes to a few milliseconds. This change would be internal to the service implementation and should not alter the service or client API in any way.
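
A minimal sketch of the single step delete, using an in-memory tree in place of the database tables. The class and method names (DeferredDeleteSketch, delete, purge) are hypothetical, and a real implementation would need proper locking around the tree and would delete database rows and files on disk in the background pass.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Sketch only: delete unlinks and queues a subtree; a background pass removes it later. */
class DeferredDeleteSketch {

    static final class Node {
        final String name;
        Node parent;
        final List<Node> children = new ArrayList<>();

        Node(String name) { this.name = name; }

        /** Links a child under this node and returns the child. */
        Node add(Node child) {
            child.parent = this;
            children.add(child);
            return child;
        }
    }

    /** Subtrees that have been unlinked and are waiting for the background pass. */
    private final Queue<Node> pending = new ConcurrentLinkedQueue<>();

    /** Client-facing delete: unlink the node from its parent, mark it, and return. */
    void delete(Node node) {
        if (node.parent != null) {
            node.parent.children.remove(node);
            node.parent = null;
        }
        pending.add(node);
    }

    /**
     * Background pass: walks every queued subtree without recursion and
     * returns the number of nodes removed. Order does not matter here
     * because the subtree is already unreachable from the live tree.
     */
    int purge() {
        int removed = 0;
        Deque<Node> stack = new ArrayDeque<>();
        Node root;
        while ((root = pending.poll()) != null) {
            stack.push(root);
            while (!stack.isEmpty()) {
                Node node = stack.pop();
                for (Node child : node.children) {
                    stack.push(child);
                }
                node.children.clear();
                // A real service would delete the database row and the file on disk here.
                removed++;
            }
        }
        return removed;
    }
}
```

The cost of delete no longer depends on the size of the subtree; only purge, which runs outside the client call, does.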

Copy

The copy operation is also recursive, and can take a long time to copy an arbitrarily deep directory tree.

There is no simple way to avoid the recursive operation in this case, but there is some room for optimizing the database indexing to make this operation faster.

In VOSpace 2.x, the copy operation will be asynchronous, so the client will poll the server to check when the operation has completed. This solves the problem of the SOAP or REST service call timing out if the operation takes too long.
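
The general shape of an asynchronous operation with client polling can be sketched as follows. This is not the VOSpace 2.x interface: the class AsyncCopySketch, the job id format and the three status values are assumptions chosen to keep the example small.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch only: submit returns at once with a job id; the client polls status. */
class AsyncCopySketch {

    /** Hypothetical status values for the example. */
    enum Status { PENDING, COMPLETED, FAILED }

    private final Map<String, Status> jobs = new ConcurrentHashMap<>();
    private final AtomicInteger nextId = new AtomicInteger();

    // Daemon thread so an unfinished copy never keeps the JVM alive in this sketch.
    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "copy-worker");
        t.setDaemon(true);
        return t;
    });

    /** Starts the copy in the background and returns a job id immediately. */
    String submit(Runnable copyWork) {
        String id = "job-" + nextId.incrementAndGet();
        jobs.put(id, Status.PENDING);
        worker.execute(() -> {
            try {
                copyWork.run();
                jobs.put(id, Status.COMPLETED);
            } catch (RuntimeException e) {
                jobs.put(id, Status.FAILED);
            }
        });
        return id;
    }

    /** What the client polls; returns null for an unknown job id. */
    Status status(String id) {
        return jobs.get(id);
    }

    void shutdown() {
        worker.shutdown();
    }
}
```

Because submit returns before the copy starts, the length of the copy no longer affects whether the initial service call times out.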

Database performance

There are a number of performance optimizations that could be applied to the database tables and Java code that accesses them.

One option that has not been properly investigated yet is the effect of adding a secondary cache between the Java code and the database. The best candidate for this seems to be memcached, although there are other similar tools available. In order to assess these properly, we need to deploy the service in a test environment with multiple clients accessing the system concurrently.
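
To make the idea concrete, here is a sketch of a read-through cache sitting between the Java code and the database. An in-process map stands in for memcached, and a Function stands in for the database query; the class name NodeCacheSketch and its methods are hypothetical. It does not address expiry or consistency across multiple servers, which is where the real evaluation would be needed.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Sketch only: look in the cache first, fall back to the database on a miss. */
class NodeCacheSketch {

    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> database; // stand-in for a database lookup
    private final AtomicInteger databaseReads = new AtomicInteger();

    NodeCacheSketch(Function<String, String> database) {
        this.database = database;
    }

    /** Returns the value for a path, reading the database only on a cache miss. */
    String get(String path) {
        String cached = cache.get(path);
        if (cached != null) {
            return cached;
        }
        databaseReads.incrementAndGet();
        String value = database.apply(path);
        if (value != null) {
            cache.put(path, value);
        }
        return value;
    }

    /** Must be called whenever the underlying row changes. */
    void invalidate(String path) {
        cache.remove(path);
    }

    /** Number of lookups that reached the database, useful when measuring the effect. */
    int databaseReads() {
        return databaseReads.get();
    }
}
```

Counting the reads that reach the database gives a simple measure of how much load the cache removes under a given access pattern.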

Any changes to the database indexing or caching would be internal to the service implementation and should not alter the service or client API in any way.

Background tasks

The VOSpace service already uses a background task to delete files from the disk.

There are a number of additional data cleaning tasks that are currently triggered manually via the JSP admin pages. For a fully operational system, these need to be implemented as background tasks.

  • Cancel unused transfer offers.
  • Cancel stalled transfers.

  • Delete orphaned child nodes (see recursive delete).
  • Delete orphaned files from the filesystem.
  • Delete completed transfers from the server history.

In addition, we need to add a hook to the servlet startup code that starts the background tasks automatically when the servlet container (Tomcat) is restarted. At the moment, it requires manual intervention via the admin JSP pages to restart the background tasks.
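
One possible shape for this, sketched with the standard java.util.concurrent scheduler. The class BackgroundTasksSketch and its methods are hypothetical. In a servlet container, start and stop would typically be called from a ServletContextListener (contextInitialized and contextDestroyed); that wiring is left out so the example has no servlet dependency.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Sketch only: starts the cleanup tasks on a schedule and stops them on shutdown. */
class BackgroundTasksSketch {

    private ScheduledExecutorService scheduler;

    /** Intended to be called once from the servlet startup hook. */
    synchronized void start(long periodMillis, Runnable... tasks) {
        if (scheduler != null) {
            return; // already running
        }
        scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "vospace-background");
            t.setDaemon(true);
            return t;
        });
        for (Runnable task : tasks) {
            scheduler.scheduleWithFixedDelay(guard(task), 0, periodMillis, TimeUnit.MILLISECONDS);
        }
    }

    /** Intended to be called from the servlet shutdown hook. */
    synchronized void stop() {
        if (scheduler != null) {
            scheduler.shutdownNow();
            scheduler = null;
        }
    }

    synchronized boolean isRunning() {
        return scheduler != null;
    }

    /**
     * A scheduled task that throws is not run again by the executor,
     * so each task is wrapped to keep the schedule alive.
     */
    private static Runnable guard(Runnable task) {
        return () -> {
            try {
                task.run();
            } catch (RuntimeException e) {
                // A real service would log the failure here.
            }
        };
    }
}
```

Each of the cleaning tasks listed above would be passed to start as a separate Runnable.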

Data migration

There are a number of older versions of the VOSpace server deployed as MySpace services. The database table structure has changed since many of these servers were deployed.

In order to update these services we will need to write, and test, a database migration script that updates the table structure while preserving the data.

The plan with these is to start with our own services at Leicester, Cambridge and ROE.

  • Take a backup of the current database.
  • Apply the patches and deploy as a new test service.
  • Verify that the data has been transferred to the new structure.
  • Apply the same changes to the live service.

Client side

Node properties

The server side implements last modified date and file size, but at the moment these are not propagated to the VODesktop client.

All that is required is the server side code to wrap these as node properties and the corresponding client side code to unwrap the properties and pass them to the VODesktop client.
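
The wrap and unwrap steps might look roughly like this. The property keys, the class name NodePropertiesSketch and the choice of epoch milliseconds and bytes as the encodings are all assumptions for illustration; the real property identifiers would need to be agreed between the server and the client plugin.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch only: the server wraps values as string properties, the client unwraps them. */
class NodePropertiesSketch {

    // Hypothetical property keys, for illustration only.
    static final String MODIFIED = "example.modified";
    static final String SIZE = "example.size";

    /** Server side: wrap last modified (epoch millis) and size (bytes) as properties. */
    static Map<String, String> wrap(long modifiedMillis, long sizeBytes) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put(MODIFIED, Long.toString(modifiedMillis));
        props.put(SIZE, Long.toString(sizeBytes));
        return props;
    }

    /** Client side: unwrap one numeric property, or return -1 if absent or malformed. */
    static long unwrap(Map<String, String> props, String key) {
        String value = props.get(key);
        if (value == null) {
            return -1L;
        }
        try {
            return Long.parseLong(value);
        } catch (NumberFormatException e) {
            return -1L;
        }
    }
}
```

Returning -1 for a missing property lets the client keep working against older servers that do not send these values.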

Copy/move between services

The current client side plugin does not use 3rd party transfers to copy and move between services. At the moment a copy between two remote services will transfer the data via the client, which works for small files but does not scale for large files.

The VOSpace server is capable of handling 3rd party transfers, but the client plugin needs to be updated to handle the transfer protocol negotiation.

Integration testing

Client side

As far as I know, we have not tested to see what happens when VODesktop is given a VOSpace URI as the user's home space. There may be some legacy code in VODesktop that attempts to interpret the VOSpace URI as a MySpace URI, mangling it in the process.

We need to test this to check that VODesktop works correctly when given a VOSpace home directory. This includes displaying the correct file chooser for CEA and DSA tasks, and passing the correct file location to the CEA and DSA services.

At the moment, the Community service does not return VOSpace identifiers for the user's home space. However, it should be possible to test the VODesktop/VOSpace integration by manually changing a user's home space URI to a VOSpace location.

Community service

The work to update the Community service to allocate VOSpace home directories is in progress. However, this may need changes to the access permissions in the VOSpace service to make this secure.

CEA/DSA service

The code to access data and store results in VOSpace has been implemented. However, apart from a few simple tests, it has not been verified.

In particular, the CEA and DSA services may need modifications to use a delegated certificate when attempting to access data in a secure VOSpace service with access controls.

Testing environment

In order to test the system end to end, and find and fix the inevitable bugs, we need to set up an end to end test environment with Registry, Community, CEA and DSA services.

-- DaveMorris - 21 Apr 2009

Topic revision: r1 - 2009-04-21 - 13:25:47 - DaveMorris
 