Thursday, July 31, 2008

PHINMS Certificates in Globus

Thanks to Vaughn McMullin, we were able to come up with a repeatable process for installing existing PHINMS user certificates on Globus nodes. The following is a high level description of the process:

1. Export your PHINMS certificate with Internet Explorer using the Personal Information Exchange PKCS12 option.

2. Check the “Include all certificates and certificate paths” box. NOTE: This should be the only option checked.

3. Upload the exported certificates (Root, Intermediate, and Private) to the Globus node.

4. Use Portecle to view the exported certificates. Portecle is started using the following command: java -jar portecle.jar

5. Use the PEM Encoding option in Portecle to generate a PEM file that Globus can understand.

6. Create a hash name for the PEM file that was created using the following command: openssl x509 -in yourfile.pem -noout -hash

7. Rename the file to the hash number displayed, in the following format: hash.0

8. Manually create a signing policy named hash.signing_policy. Use the following link as a guide to create a signing policy for PHINMS certificates:

9. Copy the new files to /etc/grid-security/certificates

10. Verify proper installation by running the following command: openssl verify -verbose -CApath /opt/vdt/globus/TRUSTED_CA -purpose sslclient /home/your_user/your.pem
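
Steps 6 through 9 can be sketched in Python as below. This is only a sketch under assumptions, not the exact procedure we ran: the function names and the DN values passed in are hypothetical, and the signing policy layout follows the usual Globus access_id_CA / pos_rights / cond_subjects form, which should be checked against the guide for PHINMS certificates.

```python
import shutil
import subprocess
from pathlib import Path


def subject_hash(pem_path):
    """Return the hash printed by: openssl x509 -in <pem> -noout -hash"""
    result = subprocess.run(
        ["openssl", "x509", "-in", str(pem_path), "-noout", "-hash"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def signing_policy_text(ca_subject, allowed_subjects):
    """Build signing policy text for one CA (the DN values are placeholders)."""
    quoted = " ".join('"%s"' % subject for subject in allowed_subjects)
    return (
        "access_id_CA X509 '%s'\n"
        "pos_rights globus CA:sign\n"
        "cond_subjects globus '%s'\n" % (ca_subject, quoted)
    )


def install_ca(pem_path, cert_hash, policy_text, cert_dir):
    """Copy the PEM to <hash>.0 and write <hash>.signing_policy in cert_dir."""
    cert_dir = Path(cert_dir)
    cert_file = cert_dir / (cert_hash + ".0")
    policy_file = cert_dir / (cert_hash + ".signing_policy")
    shutil.copyfile(str(pem_path), str(cert_file))
    policy_file.write_text(policy_text)
    return cert_file, policy_file
```

In use, the value returned by subject_hash("yourfile.pem") would be passed as cert_hash, with cert_dir set to /etc/grid-security/certificates.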

Wednesday, July 30, 2008

NLP & Ramp-Up Progressing

I completed the installation of Ubuntu on my laptop last night and I have installed Globus locally. I installed the Simple CA client to keep everything self-contained. I will continue to install the other components in the installation guide before the end of the week.

I have access to both the GridMedLee and Zip Code projects on my local box. I will begin writing the GridMedLee client in the morning. Both of the above projects were created using the Introduce tool, which produces auto-generated client code. My intention is to produce a client outside of the Introduce framework.

Got the time form working

I cleaned up the spatial series test form and moved the code over to make a time series test form. I also talked with Jeremy a bit about his installations of globus and ogsadai.

Tonight I block out the changes for the admin screen and start coding it tomorrow.

Progress on SRGM and call with GridFTP guys

We had a productive call with the GridFTP guys (Raj, to be specific) and they were extremely helpful and prompt in responding. The couple of questions posed to them were:
1) Was there any way they were achieving encryption of the payload before putting it on the transport layer? Raj told us about the Globus XIO driver, which can be extended to do so. However, since it is not a built-in feature, we have decided to go with the same encryption used by PHINMS, with the help of Vaughn, since that algorithm is already approved by CDC standards. This will help us satisfy test cases 4G and 4I.

2) Raj also provided us a transactional flow diagram for RFT, to which some more details need to be added. I will be calling Raj to get more of that information.

Also, it came to light that the version of Globus installed on our nodes is 4.0.5, whereas they have already rolled out 4.2. Version 4.2 contains built-in persistent storage that can maintain the state of files being transferred through RFT. Dan mentioned that an upgrade of the toolkit version is on the agenda, which will definitely help our case.

Right now the remaining test case, which Vaughn and Dan are working on, entails installing a PHINMS digital certificate on the node and then doing a file transfer; details are in the blog post below.

Tuesday, July 29, 2008

Installing a PHINMS Certificate on Globus

I worked with Vaughn on getting an existing PHINMS digital certificate to work with Globus services. The digital certificate was extracted using Portecle 1.3 and installed on lab 1001. The first attempt was not successful. We will continue the installation in the morning.

When we attempted to verify the certificate, the system was unable to recognize the issuer of the certificate. We are currently troubleshooting the problem.

  • Verification Command: openssl verify -CApath /etc/grid-security/certificates -purpose sslclient /some-directory/cert.pem
  • Error Generated: error 20 at 0 depth lookup: unable to get local issuer certificate
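
Error 20 generally means openssl could not find the issuer's certificate in its trust store. With -CApath, openssl looks for a file named after the issuer's hash, so one check worth making is whether the hash-named files are actually in the directory. The sketch below is an illustration of that check, not a confirmed diagnosis of our problem; the function name is hypothetical, and the issuer hash is assumed to have been obtained separately with openssl.

```python
from pathlib import Path


def missing_ca_files(cert_dir, issuer_hash):
    """Return the hash-named files that are absent from a CA directory.

    openssl verify only needs <hash>.0; the signing policy file is
    included because Globus expects it alongside the certificate.
    """
    cert_dir = Path(cert_dir)
    expected = [issuer_hash + ".0", issuer_hash + ".signing_policy"]
    return [name for name in expected if not (cert_dir / name).is_file()]
```

An empty list means both files are present for that issuer; anything else names what still has to be installed.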

Form is coming along

I have managed to get the spatial series test JSP form updated so now dynamic inputs can be entered and different servers (that were stored in the properties file) can be picked from a drop down.
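
The idea of filling the drop down from the properties file can be sketched as follows. This is in Python for illustration only (the actual form is a JSP), and the "server." key prefix and the example entries are hypothetical, not the names used in the RODSAdai properties file.

```python
def load_servers(properties_text, prefix="server."):
    """Return {name: url} for every key that starts with prefix."""
    servers = {}
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and the two comment styles used in .properties files.
        if not line or line.startswith("#") or line.startswith("!"):
            continue
        key, separator, value = line.partition("=")
        key = key.strip()
        if separator and key.startswith(prefix):
            servers[key[len(prefix):]] = value.strip()
    return servers
```

The keys of the returned dictionary would become the drop down labels and the values the endpoints sent with the query.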

The next step is to replicate the changes for a time-series form, and then I will start modifying RODSAdai to have some administration classes that allow for editing of the server lists and the like using the properties files.

Monday, July 28, 2008

Therapeutics Research and Infectious Disease Epidemiology

Tom, Ken and I had a productive call with Jeff Brown, Richard Platt, Roseanne Kue and Suzan Mitchell from the Harvard CoE regarding using grid technology and the research grid with their current process for running SAS code against various SAS data sets.

Currently, SAS programs are emailed to data sources, who modify the SAS code, run it and email back the results. The idea is to use Globus to send jobs that run a SAS script on a Globus node and return the results. This is a big benefit to public health because it immediately simplifies the workflow by taking out email and manual modification of SAS code. The really big benefit is that it allows for remote analysis across remote data sets. No more copying data sets over to epis so they can use the data for research.

One specific point we covered is that Harvard's timeline is to demonstrate at least 4 nodes running a script and returning the results by January 2009.

VL-e Toolkit installed (Vbrowser)

The VL-e Toolkit version 0.8.0 has been installed on lab 1001. The VL-e toolkit is a front end GUI for Globus grid services. I'm currently testing the features of the Vbrowser included with the toolkit. Based on initial tests, Vbrowser can flawlessly create and destroy proxies, configure RFT, GridFTP and SRB servers. I will continue to evaluate and test the features and functionality of the software.

Wednesday, July 23, 2008

Now that a web client accessing Ogsa-Dai is working, now what?

Today I spent a lot of time firing the development machine back up and slightly expanding my test to make sure that the time series was working in addition to the spatial series. Yesterday, air conditioner installation ousted me from the lab for a good portion of the day, so I spent the time planning what I would do once things got running and running the plan by Jeremy. I am going ahead and posting a copy of that list here:

  1. Shorter Term, Higher priority:
    1. Expansion of the RODSAdai and RODSAdai-web projects so that instead of just an index and a page that loads a fixed spatial series… you get some pages that let you try different queries, different data-resources (selectable from a list generated from the properties file), and an admin screen for adding, removing, and modifying properties. (shorter term, higher priority)
    2. (anticipating) Work with Jeremy to set up finals of RODSAdai environment, in the meantime set up different RODS environments (3 computers, each with a subset of data) in the CDC lab that can connect to each other across the local globus network.
    3. Look into creating a local, database based security manager that uses encrypted user information within a database instead of a “logins.txt” file for OGSA-DAI database authentication/authorization.
  2. Shorter term, Lower Priority:
    1. Brush up documentation of RODS/Adai.
    2. Outline and fill as best possible, the “Should Peter walk into the path of a greyhound dream liner” document.
    3. See if insecure tomcat’s OGSA-DAI project can be moved into JBoss, if not, find a way for coexistence, or plan out having one insecure tomcat OGSA-DAI instance somewhere accessible for build stations.
  3. Longer term, Higher Priority:
    1. (anticipating) Discuss and document finer layers of authorization in addition to the authentication and encryption afforded by Globus and OGSA-DAI
    2. Work on a security management subsystem that allows one to check the validity of a proxy certificate and/or create them as needed, and find ways to work that into the rods portal. The idea being that if the User authenticates, a proxy certificate should be issued for however long they are logged in. This may mean user-specific proxy certificates and integration of things like PURSe and MyProxy.
  4. Longer term, Lower Priority:
    1. Continue researching security policies for Tomcat.

Many of these things are subject to change, but that is where I am right now. I also have a particular interest in integrating RODS and RODSAdai for the conference. I feel like the end goal is to have RODS with two different data sets on two machines, each able to see the other's data.

GridFTP Secure Test

We had a successful test of secure GridFTP and the data checksum functionality. This test ensures that the data payload is encrypted during transfer and that all files are received intact, without modification by a third party. Below is an excerpt from the Admin Guide describing the functionality tested:

-dcsafe | -data-channel-safe Sets data channel protection mode to SAFE.
Otherwise known as integrity or checksumming. Guarantees that the data channel has not been altered, though a malicious party may have observed the data. Rarely used as there is a substantial performance penalty.

-dcpriv | -data-channel-private Sets data channel protection mode to PRIVATE.
The data channel is encrypted and checksummed. Guarantees that the data channel has not been altered and, if observed, it won't be understandable. VERY rarely used due to the VERY substantial performance penalty.
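
As an illustration of how the two flags would be chosen, the sketch below builds a command line for a transfer. It assumes globus-url-copy is the client the flags are passed to; the helper name and the gsiftp URLs in the usage note are hypothetical, and the command is only built, not executed.

```python
def gridftp_copy_command(source_url, dest_url, protection="private"):
    """Build a globus-url-copy argument list with data channel protection.

    "safe" adds -dcsafe (integrity only); "private" adds -dcpriv
    (encrypted and checksummed).
    """
    flags = {"safe": "-dcsafe", "private": "-dcpriv"}
    if protection not in flags:
        raise ValueError("protection must be 'safe' or 'private'")
    return ["globus-url-copy", flags[protection], source_url, dest_url]
```

For example, gridftp_copy_command("gsiftp://node-b.example.org/data/in.txt", "gsiftp://node-a.example.org/data/in.txt") yields a list that could be handed to subprocess.run.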

Monday, July 21, 2008

RODSAdai on Jboss Worked!

So, this morning I finished up installing JBoss-4.2.2, got it running and then transferred over the Rodsadai-war from the maven target directory into the JBoss deploy directory.

Then, I went to the server endpoint, and tried running my little test JSP. And it worked! As simple as that. It didn't throw errors, it didn't say anything about failed path building, it just ran and gave me the output I wanted when I first tried all this Thursday week before last. So now I know it works and can start being integrated with RODS.

The downside of this, is that it's yet another server. Fortunately, I talked to Jeremy and he already had RODS running in a few JBoss instances, so it will not be an obstruction of his coding plans to say "this needs to run in JBoss to work"... going "RODS needs to become a gridsphere portlet" would be much less palatable. It might require some musical ports if tomcat and jboss need to run on the same server, but honestly the secure OGSA-DAI we are running would be in the globus container and not in tomcat, and other tomcat applications can run on the tomcat within jboss.

That doesn't mean I didn't spend a lot of time today trying to get tomcat to work securely based on how JBoss' tomcat did it. It appears that JBoss' tomcat uses an entirely different realm, passing itself to the security engine that JBoss manages. I tried pulling jars from JBoss into standalone tomcat and adjusting the realm, and started getting all sorts of "hey, you're trying to run me outside of JBoss" type errors and library failures.

Maybe Globus already has some sort of realm tweak you can perform for tomcat that makes it point to the globus container and security manager. Maybe one can be designed. Then again, maybe saying "it needs to run in JBoss" won't be a fatal problem.

Friday, July 18, 2008

ESP:SS Data Structure Discussions

I met with the CDC BioSense epi team on Wednesday to discuss the ESP:SS proposed data formats (described on the Harvard ESP wiki) and to get some feedback that will be useful to Dr. Lazarus.

Below is the feedback and questions from the BioSense subject matter experts...

For level 1 data, a few new fields were suggested:
-Facility zip - in place of the unspecific zip field
-Syndrome classifier author - "ESP", "BioSense", "ESSENCE", etc. Although for ESP this will always be the same value, for other providers it may differ.
-Denominator - patient visit count. The number of visits are based on total visits irrespective of whether the visit contained clinical data that binned to a syndrome.
-Facility count - number of facilities providing data for syndrome count & denominator.
-Patient Class/Type - A patient class/type of health indicator ‘category’ is needed. Again, we refer to this as “bucket” in today’s BioSense. See attached for the current bucket list. Buckets 11-19 represent the core 9 buckets that are used. That is, we may receive data that fall into the other buckets but said data is considered ineligible for use in analytics/BioSense. Buckets 11-19 cover the 3 core Patient Class settings (Emergency; Inpatient; Outpatient). Within each class setting, there are 3 buckets representing early indicator, working diagnosis, and final diagnosis (Emergency bucket 11-13; Inpatient bucket 14-16; Outpatient bucket 17-19).
-Age group - to be determined what the age groupings should be
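
The bucket numbering described for Patient Class/Type can be sketched as a small lookup. This assumes that, within each class, the three buckets follow the order listed above (early indicator, working diagnosis, final diagnosis); that ordering and the function name are assumptions to be checked against the current bucket list.

```python
PATIENT_CLASSES = ["Emergency", "Inpatient", "Outpatient"]
STAGES = ["early indicator", "working diagnosis", "final diagnosis"]


def describe_bucket(bucket):
    """Map a core bucket (11-19) to (patient class, stage), else None."""
    if not 11 <= bucket <= 19:
        # Buckets outside 11-19 are not used in analytics.
        return None
    offset = bucket - 11
    return PATIENT_CLASSES[offset // 3], STAGES[offset % 3]
```

Under that assumption, bucket 14 maps to the Inpatient early indicator and bucket 19 to the Outpatient final diagnosis.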

For level 2 data, a few new fields were suggested:
The new fields suggested for level 1 will all be included in level 2
-Patient zip - zip code of the patient residence (may be able to be derived from the geocode).
-Chief Complaint or Diagnosis - the chief complaint or final diagnosis text used to classify the patient encounter syndrome.
-Patient linker id - pseudo anonymous id that can link a unit record back to a patient by cross-referencing with source facility.

Also, for level 2 Dr. Lazarus asked for a few clarifications for fields:
-Geocode should be for the patient residence
-Age should be in years allowing decimal levels to provide provision for months/weeks/days.
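
One way to read the decimal age requirement is sketched below. The 12 months and 365.25 days per year used for the conversion are assumptions on my part, not something the group specified, and the function name is hypothetical.

```python
def age_in_years(years=0, months=0, weeks=0, days=0):
    """Express an age given in mixed units as decimal years."""
    return years + months / 12.0 + (weeks * 7 + days) / 365.25
```

A six-month-old would be recorded as 0.5 and a child of two years and three months as 2.25.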

For level 3, the only new fields suggested were the additional fields added to level 1 & level 2.

The group also had a few questions that I will forward on to ESP and see what their response is:
-How can we address the potential privacy issues with using patient geocode in level 2? There are several GIS techniques to provide controls for privacy, perhaps they should be discussed in the data field design.
-How to find the patient count denominator for level 2 data? Since level 2 is unit records, is there a separate query to find the denominator?
-How are multi-syndrome patient encounter events recorded?
-What are the use cases for the level 3 data?

Thursday, July 17, 2008

Grid service testing

The University of Utah Center of Excellence has developed a couple of services using cagrid's Introduce service tool.

Today we successfully tested calling out to the public services using the Introduce generated java client and also using a standard SOAP client (soapui) to call the service.

Utah is now applying authentication and authorization controls and we'll test secure invocation. Utah has established their own grid network using Globus/cagrid and we'll test interoperability with Globus certificates issued by Utah and with Globus certificates issued by PHIRG.

Gridsphere less promising, but what about JBoss

Well, I sent an email out to the globus user mailing list (at least I think I did, I'll have to check that later when I get home)... but I am stumped as to things to try with getting this client to run in tomcat.

Gridsphere, well first I think I installed the wrong version of it, as the newest Gridsphere doesn't seem to have grid portlets, but they might install, I will try that tomorrow.

Otherwise, I am going to try building and running RODSAdai out of JBoss... Dr. Espino indicated he has RODS running in JBoss (which makes it more ideal, should it work, than "some portal" which RODS would probably have to be re-engineered to use) and it seems to be a bit more heavyweight as an application server and might be able to deal with certification specification and security configurations (and may even have a shiny gui).

Wednesday, July 16, 2008

Questions to GridFTP guys

After having a discussion today with Vaughn, these are some of the questions we agreed to ask the GridFTP guys, pertaining to OCISO guidelines.
* What are they using for payload-level encryption? Right now we do double encryption, i.e. we encrypt the data first and then place it in the PHINMS queue, where there is transport-level encryption. Taking the same scenario, we know that transport-level encryption is achieved through SSL, but what is used for encryption before that? That could be outside the scope of GridFTP; however, it's worth asking whether they have been involved in a similar initiative.

* At present, PHINMS works through Request / Acknowledgement. RFT helps us achieve similar functionality in that we can track the status of a transfer (Active/Pending/Retrying/Completed). Can they provide us with a transactional or sequence diagram that would help us explain this in our documentation?

* What version of SSL is being used.

* What metadata fields can be configured

* As far as transmission integrity is concerned,
GridFTP is built as an extension of FTP which uses TCP/IP protocol for data transmission. Since TCP uses a checksum computed over the whole packet to verify that the protocol header and the data in each received packet have not been corrupted, this ensures data integrity on the whole packet. We can reconfirm with them on that.
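
For reference, the arithmetic behind that checksum (the 16-bit one's complement sum described in RFC 1071) can be sketched as below. This is a simplified illustration: the real TCP checksum also covers a pseudo-header, and the function name is mine. It catches accidental corruption but is not a cryptographic integrity check, which is part of why we want to reconfirm with them.

```python
def internet_checksum(data):
    """Compute the 16-bit one's complement checksum over a byte string."""
    if len(data) % 2:
        # Pad odd-length input with a zero byte.
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries above 16 bits back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF
```

Because the sum is just addition, swapping two 16-bit words leaves the checksum unchanged, so a deliberate modification can go undetected.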

Next step would be to set up a call with GridFTP to discuss that.

More dead ends, gridsphere installed

Today was a lot of halts and starts. We got new errors, and have some ideas how to get past the new errors, but we aren't totally sure.

Otherwise, I spent some time getting the gridsphere portal installed. It was easy... almost too easy. There doesn't seem to be any obvious indication of whether it is using the grid certificates or not. It looks like I will be spending a lot of time tomorrow with the admin and user guides, and probably looking at some of the code, to see if I can replicate whatever security management it does outside gridsphere.

Tuesday, July 15, 2008

More on SRGM Test cases

After talking to Brian, we decided to define some kind of timeline we're looking at, in terms of the test cases defined in the test cases document for SRGM_PoC and also identifying tasks vs research items. So here it goes

* Evaluate the GridFTP functionality against OCISO guidelines and requirements (Req. No 01)
* Evaluate reliable GridFTP and WS-RM against NIST guidelines and requirements (specifically, but not limited to, the FIPS 140-2 Cryptographic Module Validation Program; FIPS 200 Minimum Security Requirements for Federal Information and Information Systems) (Req No.02)
These research items need to be looked into by involving the GridFTP folks, maybe on a call but before that, I will get together with Vaughn and see how we can formulate specific questions which we would want to pose to them.

* Transfer file from partner node (node B) to CDC lab node (node A) using PHINMS issued digital certificates (Req. No. 04,05 & 06)
We will get hold of a PHINMS issued digital certificate and do a transfer and since the transfer is using SSL, we would not have to explicitly encrypt and decrypt the file.

* Test ability of Globus components to integrate into existing PHINMS infrastructure (transfer of program payloads through both PHINMS and Globus). (Req No.08)
To be further looked into

* Evaluate capabilities for end to end payload level encryption (Req. No 10)
Confirm if there is a switch to enable payload level encryption. This is different than TLS/SSL

* Evaluate capabilities for file payload encryption for strength and validity (Req. No 13)
If previous test is valid, then which algorithm is used to complete the encryption

* Evaluate capabilities for guaranteed once and only once delivery of file and the robustness of its duplicate file detection (Req No 11)
Confirm that a message Transport State is tracked by both Sender and Receiver (Request-Ack)

* Evaluate capabilities for reliable data exchange for once-and-only-once transport of payload data ( Req No 14)
Application Specific

* Evaluate reliable GridFTP for reliable messaging for reliability due to node availability ( Req No. 15)
Same as the test case where we brought down the node multiple times to check automatic restart. Need to check whether we can control transfer of time-sensitive material.

* Evaluate user interface projects for RFT management and operation (including, but not limited to, job starting/stopping/resuming, route management) ( Req. No 16)
To be further looked into.
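
To make the once-and-only-once item (Req No 11) more concrete, here is a sketch of the kind of receiver-side duplicate detection the test would look for. It is not how RFT or PHINMS implement it; the class name and the choice of SHA-256 over the file name and payload are assumptions for illustration.

```python
import hashlib


class DeliveryLog:
    """Remember which (file name, content) pairs have already been accepted."""

    def __init__(self):
        self._seen = set()

    def accept(self, file_name, payload):
        """Return True for a first delivery and False for a duplicate."""
        key = (file_name, hashlib.sha256(payload).hexdigest())
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

A receiver would call accept() for each incoming file and acknowledge, but not reprocess, anything that returns False.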

I will also be updating the test case doc with these additions.

Portals, they have all that cert management, but will they wrap a JSP easily?

I had some more discussions with some of the folks at OGSADAI and Pitt, in addition to some reviews with Brian. After some more eyes and some more cryptic errors, it looks like a portal might be the quick solution for getting all of the security managed.

The benefits include the ability to easily set up authentication and authorization schemes on the user end... and tie them into the authentication and (eventually) authorization schemes in the globus grid. It would also tie in easily with applications like PURSe.

The drawback is that RODSAdai and RODS might have to be modified to deal with the portal... especially things like P-Grade which is more focused on workflow registry and execution.

Right now the frontrunner looks like Gridsphere, but we are also thinking of other things to try (attached debugging of the web project) to see if we can get secure tomcat clients working.

Monday, July 14, 2008

Well, I got some documentation updated and MySQL on a server

Today I did a lot more research, tried getting tomcat secure access working (I did) to see if it would make secure client OGSA-DAI access from tomcat work (it didn't). I am out of ideas, but I sent out more feelers to see if other people could help me or provide working examples.

Otherwise, I documented and cleaned up a bunch of the RODSAdai code, and I am setting up a dedicated MySQL server just so we have a MySQL database in the lab that is not eaten by globus/vdt.

Friday, July 11, 2008

Still having trouble with Secure Access client

Today was more trying to get secure tomcat working using this guide:

I did not have much luck.

Dr. Espino has been talking to some people who have gotten tomcat clients to access secure grid webservices... and it seems that we are on the right track looking for ways to get the validation path working. So there is hope, and he will probably be tinkering with getting secure clients to work as well.

Otherwise, I have also been researching portal software, things like gridsphere which are designed to handle all the grid authentication mess complete with update grid cert files. It seems to be the toolkit that PURSe operates from, so I will ask Dan more about it since he was researching PURSe a while ago.

A portal to handle SSO and user accounts (including background process accounts) will probably end up being essential for maintaining secure access on steady-state servers that poll constantly or for long-running queries or sweeps... then we don't have to extend proxy deadlines to unreasonable levels for processes that will take a lot of time or run indefinitely.

Otherwise, Anurag got access to the lab today. I got to show him the summary of what I have been doing and chat with him through something other than a phone.

Thursday, July 10, 2008

Progress on SRGM_PoC test cases

We (Dan and I) finished one more test case, which required us to observe the behavior on multiple restarts; the system behaved as expected. Results are updated in the test document (available on the wiki). Another thing we observed is that right now, if we try to re-transfer a file with the same name once it has already been transferred, it overwrites the old file. That is not something we want. So the idea right now is to find a switch that gives us a warning that the filename already exists. We are also looking forward to the call with the GridFTP guys, which will definitely help us knock out a couple more test cases.
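
The behavior we are after can be sketched on the receiving side as a write that refuses to overwrite. This is not a GridFTP switch (we have not found one yet); it is just a Python illustration of the desired outcome, with a hypothetical function name.

```python
def write_new_file(path, payload):
    """Write payload to path only if no file with that name exists yet."""
    try:
        # Mode "xb" creates the file and fails if it is already there.
        with open(path, "xb") as handle:
            handle.write(payload)
    except FileExistsError:
        return False
    return True
```

A False return is the point at which we would want the warning that the filename already exists.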

RODSAdai Webapp not able to find certification path

Today was pretty much all day trying to hammer against tomcat and getting it to recognize the globus certificates for clients... and I have no idea where such things are controlled within tomcat, and the things I tried didn't fix the problem.

Thus, the "the secure client won't run through tomcat" problem is back, and I still am not sure how to tell tomcat "use these keystores that exist within globus when invoking this client which connects to a secure globus server". The error that is thrown is the familiar "unable to find certification path"

I really have two problems: The first is that I have no idea how to get tomcat to use a keystore other than the default java one for client programs. Ideally it should be made to pick up the certificates and look up the proxies like the command line clients... something about the tomcat container prevents security policies from being overwritten in the same way they are within the command line clients.

The second problem is that there is tons of stuff written about how to enable tomcat as a secure ogsadai/wsrf server, and it is blocking out google searches of how to enable tomcat to run secure clients which access other globus/wsrf servers with things other than the default keystore.

I have tried several things which haven't worked:

1. Moving client-config.wsdd into the WEB-INF/classes/ directory of the web-app.
2. Modifying server.xml in [tomcat-root]/conf to use the globus certificates for secure access as described by this website: '' and it was really more of an old way of setting up some sort of globus ws server with an old version of globus.
3. importing cog-tomcat.jar into the libs directory (of the webapp and the server), after finding that cog-tomcat seemed to hold several of the security handlers...

I am out of ideas at the moment... Dr. Jeremy Espino is working on setting up his globus/ogsadai environment and then he'll tinker with it too.

I figure there is a workaround to write clients that explicitly invoke the globus security elements... but I imagine I will have to pore over acres of ogsa-dai and wsrf source code to set such things up and any resulting code might be locked to very specific versions of ogsadai and/or globus.

I just feel like there is some "oh, you just set this property in the config file for the webapp" thing I don't know exists... I just have no idea where that is and my attempts to look for it keep getting flushed out by examples showing me how to enable https on port 8443.

Wednesday, July 9, 2008

New repository, building test code, and RODSAdai webapp is lining up

So... Dr. Jeremy Espino helped me get a LOT of stuff done today with RODSAdai.

First off, he had some shell scripts and a repository that is more amenable to deployment... so he just took all the suggested associated jars for ogsadai and put them in that repository... adjusted the pom to point to them, and committed the change.

This basically saved me from having to do a jar hunt, and saved Anurag from having to update our persnickety sourceforge repository.

That being done... I tried to do an mvn package... and it worked this time! So the new jars during packaging are now curing the "but it works fine in the ogsadai directory" problem. I can run secure queries in tests, and have them pass. Thus, I have written some more unit tests and have committed them.

The next thing Jeremy did is build a web project for RODSAdai called RODSAdai-web. I have written (but haven't had a chance to test yet, that will be tomorrow) a little JSP that will essentially call rodsadai, run a method that runs a query, and return the resulting spatial/time series.

Jeremy also set up the dependency to the jar that is built by the RODSAdai project... which includes all the libraries that RODSAdai needs. Hopefully, the inclusion of the jars that made the secure query operate from the junit test framework will also allow the test to operate within tomcat. If it does, then we will basically have a footprint for any jsp-container based ogsa-dai client.

So yeah, tomorrow are the tests... and since we have reached a bunch of milestones in a short time, I am going to start the haltingly-updated, rarely revised "If I get hit by a bus" guide document that should help anyone else who needs to set up RODSAdai do so within a day.

Tuesday, July 8, 2008

RODSAdai secure access from in the test context

Today I ran into the familiar syndrome... lovingly known as "But it works in the ogsadai directory".

So, secure access of the RODS database does work from the ogsadai directory... which has a file that sets the java classpath up with all sorts of jars and property files from globus and ogsadai.

However, when I try to make a secure call from a test context... I get the dreaded error of "unable to find valid certification path to requested target".. I am pretty sure this means that I need to include some more jars from the globus context, hopefully those jars will have something that imports the globus security infrastructure. If it does, then it might very well be included in the war file that eventually gets moved to tomcat and it will solve the "secure access doesn't work from tomcat" problem.

Many thanks to Dr. Jeremy for helping me get over my vacillations on what to try next. My next step will be to get the test for secure access running. Then I will start working on getting these clients called securely from within tomcat. Dr. Espino pointed out that everything should test and pass before moving on to other projects/discoveries... and if it's going to take some painful jar hunting to get it working, then so be it.

Monday, July 7, 2008

Secure traffic for rodsadai

Today we got Secure traffic for rodsadai working. Dan helped with several of the hostname issues (many thanks to him for that) and I got the client factory to pass the secure client instead of the generic sql client.

Tomorrow, I will be adjusting all of the tests to take into account proper spatial and time series formatting, and probably do a bit of re-architecting to turn secure access into a separate test (this will most likely involve passing the query client into the processor as opposed to having the processor call its own).

That and documenting, putting in descriptive errors, that sort of thing.

Friday, July 4, 2008

Teleconferences with collaborators

Tom, Ken and I talked with Ron Price from the University of Utah yesterday about the CoE activities. Ron has a non-secured service that we can start testing on Monday.

Utah is using many of the caBIG/caGrid grid components for security management and service development. Specifically they're using Introduce to speed up service development and GAARDS to work with security management.

Dan is starting to research GAARDS and how that can help with the security infrastructure of the PHIRG. We're going to start looking into Introduce for test/sample services.

Tom, Ken and I also spoke with the team at Ohio State that is responsible for the bulk of the caBIG infrastructure (including GAARDS and Introduce), and they were extremely helpful and open to collaboration. They also pointed us to a caGrid bulk data transfer service that uses web services and HTTPS to securely transfer large files and data sets.

So the PHIRG team has a lot of links to follow. Fortunately we have some deep expertise available to help with the tough questions.

Code of Conduct

Some may notice the new disclaimer posted to the wiki and blog. This disclaimer links to a Code of Conduct that has gone through the CDC clearance process to establish posting guidelines for CDC employees and contractors who contribute to the blog, wiki or sourceforge.

You can read the code of conduct for all the details, but basically it describes the rules as:

  • Members will act with integrity and adhere to the highest standards of personal and professional ethics. As collaborations tend to be self-correcting, active participation means both offering suggestions and accepting them with a focus on product improvement. Personal attacks, hidden destructive code or other forms of harassment or intimidation will not be tolerated. Collaboration is highly encouraged, and although this may not always be positive, it should always be respectful and

  • Only authorized committers are allowed to make changes to project related sites. Project leads determine authorized committers and assign permissions.

  • Any sites not hosted by CDC or CDC resources should use separate authorization so as not to compromise existing CDC authentication and authorization procedures. This means that committers should not use an ID/ user name or password currently in use as their CDC ID / user name or password.

  • No specific security related information should be shared. No information that would allow an unauthorized party to compromise CDC systems security shall be posted. This includes, but is not limited to: user names, user ids, passwords, IP addresses, private certificate information, specific system configuration.

  • No restricted or privileged information should be posted that is limited in distribution rights. For example, there are US export controls for encryption routines and algorithms that cannot be shared with specific countries.

  • No content in violation of US or international copyright shall be posted without explicit, written consent of the copyright holder.

  • All content in draft form must be clearly marked with the words “DRAFT” both within the content itself and in any specific sites referencing the content.

  • No personally identifiable health data shall be posted or stored on collaboration sites. This includes partially identifiable data containing fields such as age, race, sex, geolocation. Only sample, mock or test data shall be used.

  • No posting of source code or unpublished materials relating to CDC or NCPHI developed production systems and applications.

Thursday, July 3, 2008

Lots of stuff done, lots of stuff to do.

Things completed:
  • Confirmed that secure Ogsa-dai was working on the new 205 box. Many thanks to Dan for his help and awesomeness.
  • Have pulled the current RODSAdai files over to the 205 box, and linked the 205 box's maven to the repository, showing that I can recreate all the dev code and have it building in about 5 minutes in a new environment... and that is so cool.
  • Got the HL7_ADTS table on the 205 box.

Things to be done now:

  • Implement the secureClient for RODSAdai
  • Document
  • Expand tests to deal with time and spatial series data.
  • Work with Jeremy on integrating with RODS and getting the security to work with tomcat.

Cheers, things seem to be rolling along nicely. Have a good 4th of July weekend!

Grid Messaging Test Cases

I worked on finishing up the first presentable draft of the Secure Reliable Grid Messaging PoC test cases. It is currently located at
Please feel free to browse through the document and provide your feedback. So far we have tried to explain the tests that we have already performed, with the remaining ones outlined in the document. After we created the first draft, Brian recommended some additions, which have been incorporated. If you can think of a way to explain more clearly what has been tested so far, please let us know.

Public Health Case Reporting Use Case

As mentioned on the concall today, two of the AHIC use cases on the radar screen for this year that are of particular interest to PH are the case reporting (PHCR) use case and the vaccine adverse event reporting (VAER) use case. The latter is still in development, but the former was completed last spring. It may be useful to consider the biz processes and data elements in the PHCR use case document along with the BSV use case.

Unknown CA Error: Fix

Globus was recently reinstalled on the Ubuntu node. Instead of installing the certificates in the /etc/grid-security directory, I chose to use the non-root install and install the certificates in the $GLOBUS_LOCATION/etc directory.

I received the following error in the $GLOBUS_LOCATION/var/container.log file when I tried to start the Globus container:

Remote exception was ;nested exception is: org.globus.common.ChainedIOException: Authentication failed [Caused by:Failure unspecified at GSS-API level [Caused by: Unknown CA]]

I knew the certificates were good, but it seemed that Globus was unable to locate the installed certificates. After researching the issue, I came across Globus Bug #4303.

Based on the information in the bug report, I discovered that Globus was looking for the certificates in the default /etc/grid-security/certificates directory. To correct the problem, I created two symbolic links. The first was a link from $GLOBUS_LOCATION/etc to /etc/grid-security. The second was a link from $GLOBUS_LOCATION/etc/certificates to $GLOBUS_LOCATION/TRUSTED_CA.

After creating the links, the Globus container started up.
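
For anyone trying to reproduce the fix, here is a rough Python sketch of the two-link arrangement. It runs entirely inside a scratch directory, so nothing under the real /etc is touched. The description above does not pin down which end of each link is the link and which is the target, so the direction used here is an assumption: the default location is made to point into the non-root install, and the certificates entry is made to point at TRUSTED_CA. The helper name make_link and all the paths are made up for the illustration.

```python
import os
import tempfile


def make_link(link_path: str, target: str) -> None:
    """Create link_path as a symbolic link to target (like `ln -s target link_path`)."""
    if os.path.islink(link_path):
        os.remove(link_path)  # replace a stale link rather than fail
    os.symlink(target, link_path)


# Work in a scratch directory so the real /etc is never touched.
root = tempfile.mkdtemp()
globus_location = os.path.join(root, "globus")            # stands in for $GLOBUS_LOCATION
trusted_ca = os.path.join(globus_location, "TRUSTED_CA")
globus_etc = os.path.join(globus_location, "etc")
default_dir = os.path.join(root, "etc", "grid-security")  # stands in for /etc/grid-security

os.makedirs(trusted_ca)
os.makedirs(globus_etc)
os.makedirs(os.path.dirname(default_dir))

# Assumed direction: the default location points into the non-root install ...
make_link(default_dir, globus_etc)
# ... and the certificates entry inside it points at the trusted CA directory.
make_link(os.path.join(globus_etc, "certificates"), trusted_ca)

# The default certificates path should now resolve to TRUSTED_CA.
resolved = os.path.realpath(os.path.join(default_dir, "certificates"))
print(resolved == os.path.realpath(trusted_ca))
```

The point of the check at the end is the same as the real fix: whatever path Globus looks in by default should end up resolving to the directory where the CA certificates actually live.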

Wednesday, July 2, 2008

Now time series is processing, and the ogsadai errors are gone

Yay for troubleshooting and debugging.

The cause of the ogsadai errors was some transposed digits in the port number for Postgres.
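
A quick way to catch this kind of typo is to check whether anything is actually listening on the configured port before digging into the rest of the stack. The helper below is a minimal sketch; the function name is made up, and the host and port you pass in are whatever your own configuration says (5432 is the PostgreSQL default).

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False
```

For example, `port_is_open("localhost", 5432)` would tell you whether something is accepting connections on the default Postgres port. It does not confirm that the listener is really Postgres, only that the port is reachable.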

Otherwise, I managed to code up the time-series and write a quick console test, and it passed!

Now, to move those tests into the automatic build testing, and then document it and work on secure data transport now that we have two securable nodes.

Also, I got to play with an OLPC (One Laptop Per Child) laptop today. It is really neat, and there might be some cool grid computing or data reporting uses for it.

Harvard ESP activities

Dr. Lazarus from the Harvard CoE working on the ESP project posted some initial design comments on their ESP wiki.

Specifically, he is laying out the stakeholders, data format, metadata, data entities and the sensitivity levels of data. This is a great start.

I've forwarded his comments on to the biosurveillance team (specifically Dr. Rhodes and Dr. Tokars) so they can provide their input/feedback.

Tuesday, July 1, 2008

Lab (Re)Engagement

Tom Savel, Dan and I met yesterday with the NCPHI Lab management team to review the resources needed for the three Proof of Concept projects. The lab management team approved the activities and are working on allocating and configuring the additional resources for the 3 PoCs.

This means that we can begin to reinvestigate using VM appliances for distribution as PHGrid nodes. In the past, an exception was made to allow a single VM appliance to be used, but now the lab team is able to support using VM appliances for additional nodes.

Related to the lab engagement: the WS-RM portions of the Secure Reliable Grid Messaging PoC will not be posted directly to the blog/sourceforge/wiki until they are approved by Gautam Kesarinath, the NCPHI project steward responsible for the WS-RM portion of the PoC.