Thursday, August 28, 2008

Grid UI Update

Here are a couple more UIs for the Grid.

1) The first is one I found from the Australian Research Collaboration Service. It has the capability for GridFTP, third-party transfers, and a cool feature: multi-replica transfers. It's sort of like BitTorrent, with a variation: when we have a large file whose copies exist in several locations, we can specify multiple locations to get the file from, which can potentially create a local copy of the file faster.
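
To make the multi-replica idea concrete, here is a minimal sketch of how a client might divide one large file into byte ranges and assign each range to a different replica. This is only an illustration under my own assumptions, not how the tool above actually works: the names `RangeRequest` and `plan_multi_replica_fetch` and the replica URLs are hypothetical, and the actual transfer (for example over GridFTP) is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RangeRequest:
    replica: str  # hypothetical replica location
    start: int    # first byte of the range, inclusive
    end: int      # last byte of the range, inclusive


def plan_multi_replica_fetch(file_size: int, replicas: List[str]) -> List[RangeRequest]:
    """Split bytes 0..file_size-1 into contiguous ranges, one per replica."""
    if file_size <= 0 or not replicas:
        return []
    # Never use more replicas than there are bytes to fetch.
    n = min(len(replicas), file_size)
    base, extra = divmod(file_size, n)
    plan = []
    start = 0
    for i in range(n):
        # The first `extra` replicas each take one additional byte.
        length = base + (1 if i < extra else 0)
        plan.append(RangeRequest(replicas[i], start, start + length - 1))
        start += length
    return plan


if __name__ == "__main__":
    # Hypothetical replica locations, for illustration only.
    sites = ["gsiftp://site-a/data.bin", "gsiftp://site-b/data.bin"]
    for request in plan_multi_replica_fetch(1_000_000, sites):
        print(request)
```

Each range could then be fetched in parallel and written at its offset in the local copy. A real client would probably also rebalance ranges when one replica turns out to be slow, which this sketch does not attempt.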

2) The other one is a Commons VFS implementation from Apache. It provides support for Storage Resource Broker and GridFTP.


Ron said...

I think the public health professional needs a user-friendly GUI that makes their job easier and makes them more productive. For now I'm going to refer to this potential GUI as PH Grid GUI.

I believe with some effort it would be possible to combine something like VBrowser that Ken mentions above with Taverna. Taverna is a workflow tool that was recently adopted by caGrid and caBIG. Ravi Madduri's team of grid experts, working under Ian Foster at Argonne National Lab, has ported Taverna to caGrid, creating an amazingly easy-to-use workflow analysis tool.

For the PH Grid GUI I'm envisioning software that has a small footprint and a familiar look and feel, and that can be installed by an inexperienced computer person in less than 30 minutes. Under the hood will be several grid clients that handle security, various file transfer options, and services that handle the workflow/analysis functionality.

I have a couple of usage scenarios in my head; I may add those later if time permits.

Ron Price
Grid Developer
CoE University of Utah

Ron said...

One usage scenario would go like this: (this scenario is a draft, sorry for any typos)

* The user logs into the grid with the same user name and password that she uses for other work-related activities at her organization. Because her organization has achieved a high level of assurance, the company's internal security system has been combined with the grid's security system, so single sign-on has been realized. A grid proxy is created for the user after she has successfully logged in.

* Today on the agenda for the user is a bunch of PH analysis. The user notices that she can complete this analysis by creating a workflow out of several grid services. To get started, the user drags and drops a Virus grid service (that she has noticed to be semantically defined as the virus she needs to start her PH analysis) into the section of the GUI that allows her to create a workflow.

* The Virus grid service will do some preprocessing on the data she feeds to it. The user thinks to herself, "I'm so glad I was aware of that grid service, I didn't want to create that preprocessing in Excel!".

* For the output that the Virus grid service creates, she wants to create a fork in her analysis. She wants to expose the virus to antivirus c and antivirus b. She knows there is an antivirus c grid service that meets her needs, but she is not aware of one for antivirus b. She drags and drops the Antivirus C grid service into place just below Virus, and a connecting line illustrating the workflow is created. Now she searches for antivirus b via the grid service search (discovery service) functionality in the PH Grid GUI and two possibilities come up, so she expands the details of both. She takes a real hard look at the analysis that each service does and the semantics behind the terminology involved. The one near the bottom of her screen meets her needs, so she drags and drops it into the workflow and draws a line from Virus to the Antivirus B grid service, effectively creating the fork in her analysis/workflow. Little does she know that grid researchers spent a lot of time creating a common way to create and expose the analytical functionality of the service, and that defining the semantics took an army of professionals from various disciplines to create a process that uses ISO 11179, the standard caBIG uses to create meaning in its ontology. So caBIG is necessary to have a standard way to assign meaning to terms and to do the search that found the user the service she needed (essentially, discovery was done).

* Now the user wants to measure the effectiveness of each antivirus. This time she knows exactly which grid service will do so, and she drags and drops the Antiviral Effectiveness grid service into her workflow.

* Lastly, she presses the run button and is prompted for the input that the Virus data grid service requires for its preprocessing. Just to make sure everything is okay, she types in one entry by hand and presses "Run". The screen visually displays the progress being made, and soon the results are displayed and the answer to her public health question has been found. Now she needs to run several more cases, so she hits the "Run" button again and chooses to send a file of 100,000 viruses over to have the analysis run on them. Behind the scenes the file is moved via data grid transfer services (caGrid Transfer or GridFTP) and her analysis begins on all 100,000 viruses. (For now she elects not to take advantage of the parallel computing power offered under advanced features in the PH Grid GUI.)
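
The workflow in the scenario above is essentially a small directed graph: Virus feeds both antivirus services, and (this is my assumption, the scenario does not spell it out) both of those feed Antiviral Effectiveness. Here is a rough sketch of how a GUI could run such a graph in dependency order. It is not how Taverna or caGrid implement workflows; `run_workflow` is a hypothetical helper, and the "services" are local stand-in functions rather than real grid service calls.

```python
from graphlib import TopologicalSorter

# Each key is a workflow step; its value is the set of steps whose output it consumes.
workflow = {
    "Virus": set(),
    "Antivirus C": {"Virus"},
    "Antivirus B": {"Virus"},
    "Antiviral Effectiveness": {"Antivirus C", "Antivirus B"},
}

# Stand-ins for remote grid services (hypothetical). Each takes a list of inputs.
services = {
    "Virus": lambda inputs: f"preprocessed({inputs[0]})",
    "Antivirus C": lambda inputs: f"C({inputs[0]})",
    "Antivirus B": lambda inputs: f"B({inputs[0]})",
    "Antiviral Effectiveness": lambda inputs: "compare(" + ", ".join(inputs) + ")",
}


def run_workflow(graph, services, user_input):
    """Run every step after the steps it depends on and collect the results."""
    results = {}
    for step in TopologicalSorter(graph).static_order():
        # Gather upstream outputs in a stable (alphabetical) order.
        upstream = [results[dep] for dep in sorted(graph[step])]
        # A step with no upstream steps receives the user's input instead.
        inputs = upstream or [user_input]
        results[step] = services[step](inputs)
    return results


if __name__ == "__main__":
    results = run_workflow(workflow, services, "sample-1")
    print(results["Antiviral Effectiveness"])
```

The two antivirus steps do not depend on each other, so a real engine could run them in parallel; this sketch simply runs everything sequentially.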

So this is just an idea, but if attempted, I think the single sign-on, the computing grid feature, and the discovery option could be added after a basic successful PoC was done.

It would also be possible to do third-party file transfers and striped file transfers to improve performance (using GridFTP, RFT, and SRB). Anyway, basically it could be possible to move data in ways people have not seen yet.

Ron Price
Grid Developer
CoE University of Utah

Ron said...

More info on Taverna in caGrid can be found here

Brian Alexander Lee said...

I like your use case. I think it is going to be essential to have a straightforward way for developers (users) to tie together grid services to accomplish some job.

Between and collaborators we have a couple of services and different authorization schemes. What do we need to do to apply the steps you outlined to our existing set of services and security?

Ron said...

One could start by evaluating whether or not VBrowser has an API that is decoupled from the GUI. If so, this API may prove useful although it will be tied directly to Globus.

SIDE NOTE: I like Globus; it is an amazing piece of software, and the people who have worked on it and who currently work on it are amazing. They are trying to create middleware to solve a nontrivial problem and are doing a great job at it. But working with the Globus APIs directly is problematic. Also, creating a grid service without tools (gRAVI or Introduce) can be error prone. This is one reason I'm very excited about caGrid. I've been working with GRID for 5 years, and I've had the experience of administering my own test GRID (including GSI), working with the Globus APIs directly, and the pleasure of trying to create a Grid Service without tools. Now, caGrid tools can be used separately from caBIG. So caGrid is a layer around Globus that makes life for grid developers and grid administrators sane and more productive.

Over the next couple of months Taverna may be added to GAARDS. GAARDS already has a tab that takes the user's username and password as input and creates a user proxy.

Now for the file transfer portion of the GUI. One could further add to GAARDS, for instance by creating a file transfer tab. It would be possible to do this by laying a GUI on top of GridFTP or caGrid Transfer.

Hope that helps.

-Ron Price

Ron said...

I just wanted to mention that if a person does try to add a file transfer mechanism to GAARDS, they should coordinate their efforts with the Ohio State caGrid team.