Friday, June 26, 2009

Training updated

The training node was updated with the new caching gridviewer and the listed schemas were also updated.

In gridviewer, Dallas has been disabled for the time being because it seems to be acting rather flaky (it will load fine for half an hour and then time out). If anything, it means I need to work in a good timer function to abandon a load and notify the user after a set amount of time.
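As a rough idea of what such a timer function could look like, here is a minimal sketch using the standard java.util.concurrent classes. The class name TimedLoad, the method name, and the fallback value are all hypothetical and not taken from the gridviewer code; the load is passed in as a plain Callable.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper: runs a server load and gives up after a fixed time. */
final class TimedLoad {

    /** Returns the load's result, or the fallback if it fails or exceeds timeoutMillis. */
    static <T> T loadOrFallback(Callable<T> load, long timeoutMillis, T fallback) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<T> future = executor.submit(load);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true);      // interrupt the stalled load
            return fallback;          // caller can use this to notify the user
        } catch (ExecutionException e) {
            return fallback;          // the load itself threw an exception
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        } finally {
            executor.shutdownNow();   // do not leave worker threads behind
        }
    }
}
```

A caller would wrap each server load in a Callable and treat the fallback value as the signal to show a "this server timed out" message. A real version would probably share one executor across loads rather than creating one per call.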

Otherwise, gridviewer next week will be given some UI tweaks. I am hoping to get better server allow/disallow logic, multiple loads on one map (with different pushpins depending on the load), and autopopulated classifier/indicator drop-downs that don't require server hits.

Cheers everyone, have a good weekend,
Peter

Thursday, June 25, 2009

Data is caching

Now the data for gridviewer is caching using OSCache. I may have to tweak it a bit to behave better as a singleton, but otherwise it is efficient and simple and seems to make everything quite snappy, especially now that gridviewer is not referencing servers for metadata in each load.

Otherwise, the server URLs are now pulled from the wiki page (or any page you wish to configure yourself). Thus, less database configuration is needed just for gridviewer.
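For illustration only, pulling service URLs out of a page could be as simple as the sketch below. It assumes the page text has already been fetched and that the service URLs appear as plain http/https links; the class name ServiceUrlScraper and the regular expression are my own assumptions, not the actual gridviewer implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: collects the distinct URLs found in a page's text. */
final class ServiceUrlScraper {

    // Stops at whitespace, quotes, and angle brackets so HTML markup is not swallowed.
    private static final Pattern URL = Pattern.compile("https?://[^\\s\"'<>]+");

    /** Returns each URL once, in the order it first appears on the page. */
    static List<String> extractUrls(String pageText) {
        Set<String> urls = new LinkedHashSet<String>();
        Matcher matcher = URL.matcher(pageText);
        while (matcher.find()) {
            urls.add(matcher.group());
        }
        return new ArrayList<String>(urls);
    }
}
```

In practice one would probably restrict the matches further (for example, to a known service path) so that unrelated links on the page are not treated as servers.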

Wednesday, June 24, 2009

Chronicles of NHIN CONNECT, volume 1

For the past few weeks, I have been in the process of becoming an expert in the NHIN CONNECT project. This project contains both the NHIN CONNECT Gateway and the NHIN CONNECT Adapter.

The NHIN Connect project is located at http://www.connectopensource.org and version 2.0 is the current release of the code in question (although 2.1 is supposed to be forthcoming in early July).

While I can now configure a gateway and an adapter in under 1 hour each, there are several 'gotchas' that are not addressed in the documentation. Using the 'pre-configured binaries' option (versus install-from-scratch), here are the 'gotchas' so far:

==>Adapter and Connect need different machines. I've tested them on the same one and functionality is mediocre at best. The lab has these set up nicely on 2 separate but equal machines.

==>Java is set to allocate 1.2 GB of memory from the start. Don't try this on a machine with less than 1 GB of memory... you want at least 2.

==>OID Registration doesn't work as advertised. Fortunately I haven't needed my OID for internal connectivity, but when I connect outside of the lab, I will need this.

==>c:\java is hard-coded as the java location. Make sure you install to this location. If you don't, some of the NHIN services break with odd errors. The documentation gives inaccurate pointers to the java locations. And with _some_ of the application hard-coded with this java location, better safe than sorry.


==>The NHIN documentation says in multiple places that port 9080 is the non-secure port and that 9081 is the secure port. DON'T BELIEVE IT! Port 8181 is the secure port.

Thus concludes volume 1 of the Chronicles of NHIN CONNECT. Stay tuned for updates as the PHGRID <--> NHIN CONNECT interoperability testing continues...

Data is combining, now to get it caching.

The data is now combining in the data model... and it seems to be doing it correctly.

Now, I am hoping to get a majority of it caching... and instead of trying to write my own caching, I am going to use a caching mechanism suggested by Chris: OSCache from the OpenSymphony project.

It seems to have generic back-end caching with configurable levels of intelligence and persistence (which is all I really needed)... but it also has the most potential for helping on the front end (like request caching: if a request looks like the exact same request that was sent a minute ago, it will return the exact same HTML that was sent back rather than hitting the server). It also allows for better and more fine-grained error handling.
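To make the request-caching idea concrete, here is a small self-contained sketch of time-based caching. It is deliberately not the OSCache API; TtlCache and its methods are hypothetical names, and the current time is passed in explicitly so the behavior is easy to test.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical sketch of time-based caching; this is NOT the OSCache API. */
final class TtlCache<K, V> {

    private static final class Entry<T> {
        final T value;
        final long storedAt;

        Entry(T value, long storedAt) {
            this.value = value;
            this.storedAt = storedAt;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<K, Entry<V>>();
    private final long ttlMillis;

    TtlCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    /** Returns the cached value if it is younger than the TTL, otherwise reloads it. */
    V get(K key, long nowMillis, Function<K, V> loader) {
        Entry<V> entry = entries.get(key);
        if (entry == null || nowMillis - entry.storedAt >= ttlMillis) {
            entry = new Entry<V>(loader.apply(key), nowMillis); // only hit the server here
            entries.put(key, entry);
        }
        return entry.value;
    }
}
```

With a one-minute TTL, two identical requests thirty seconds apart call the loader once; a third request after the minute is up calls it again. Two threads missing at the same moment may both reload, which is acceptable for a sketch but something a real cache handles more carefully.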

After that, it's some UI clean up, some javascript niftiness, and testing to see if GridViewer will run in the same area as an AMDS service.

Tuesday, June 23, 2009

A case for using grid architecture in state public health informatics: the Utah perspective

"This paper presents the rationale for designing and implementing the next-generation of public using grid computing concepts and tools. Our attempt is to evaluate all including data grids for sharing information and for accessing computation"

http://www.biomedcentral.com/content/pdf/1472-6947-9-32.pdf

New time series handling.

I have adjusted the time series handling of gmap-polygon and grid-viewer so that regional collections hold multiple time series lists (differentiated by server).

Right now, it seems that it is building, but when you open the flot plot, all the data is somehow being tagged with one date, and it is not immediately apparent where or how that data is being set like that. I think I might have run into this problem before and it might have to do with how Java Calendars increment... I hope to figure it out later today or tomorrow.
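I have not confirmed that this is the cause here, but one common way to end up with every data point on the same date is storing the same mutable Calendar object while incrementing it in a loop. The sketch below (class and method names are hypothetical) shows the pitfall and the usual fix of cloning per entry.

```java
import java.util.ArrayList;
import java.util.Calendar;
import java.util.List;

/** Hypothetical illustration of the shared-mutable-Calendar pitfall. */
final class CalendarPitfall {

    /** Buggy: every list entry is the same object, so all entries show the final date. */
    static List<Calendar> sharedInstance(Calendar start, int days) {
        List<Calendar> dates = new ArrayList<Calendar>();
        Calendar cursor = (Calendar) start.clone();
        for (int i = 0; i < days; i++) {
            dates.add(cursor);                     // same reference added each time
            cursor.add(Calendar.DAY_OF_MONTH, 1);  // mutates every "stored" date
        }
        return dates;
    }

    /** Fixed: each entry gets its own copy before the cursor moves on. */
    static List<Calendar> clonedPerDay(Calendar start, int days) {
        List<Calendar> dates = new ArrayList<Calendar>();
        Calendar cursor = (Calendar) start.clone();
        for (int i = 0; i < days; i++) {
            dates.add((Calendar) cursor.clone());
            cursor.add(Calendar.DAY_OF_MONTH, 1);
        }
        return dates;
    }
}
```

Starting from June 1 with three days, the buggy version yields three entries that all read June 4, while the cloned version yields June 1, 2, and 3.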

Otherwise, after that is finished tomorrow, I will implement the new caching structure we discussed with Brian, and start publishing arrays so that the drop-downs for indicators and classifiers can be dynamically populated (as discussed with Chris).

After that, I am going to start allowing for multiple polygon loading... thus you can do one search, then another, and click and compare graphs. I might also look into making histograms in addition to line charts (that way, one can see the different data brought back by the different servers).

maphiv.org

NPR did a piece this morning on HIV statistics in the US and mentioned this website as a source. Maphiv.org appears to be operated by Z-Atlas, which describes itself as "America's health index and online maps providing information about health and health care in the US." Z-Atlas uses ArcGIS, with which I am sure most of you are familiar. Though the site requires registration, I think it is worth it to get a look at the UI.


Saturday, June 20, 2009

GIPSEPoisonService

GIPSEPoisonService has been committed to the repository this week. The service has been created using Introduce and then modeled around Brian's AMDSService. I will begin writing test cases next week and hopefully not run into anything major.

Peter and Brian asked me to assist on a few things for the gridviewer application. I took a look at the code and the UI and have a few ideas about where I think we can add functionality to it in the distant future. Peter has worked through some tough requirements and developed some strong code to handle all the polygon/map manipulations.

1. MVC
I think it would be a good step to introduce a controller to the gridviewer. The controller would not only handle web user interface requests but also process remoting protocols or generate specific outputs on demand. To handle MVC, I believe we should look to frameworks such as Spring, JSF, Struts, etc. My preference is Spring-MVC, especially now that the 3.0M2 release will be REST compliant, which is another consideration for the gridviewer.

2. JSON/RSS/XML output
The controller mentioned above would facilitate the development of outputting additional response types per client requests. Reusing all the business logic Peter has developed we could "re-format" the output to a resemblance of the AMDS schema in JSON. The only requirement to consume the JSON request would be a simple html file with a few lines of javascript. We could even output RSS for notification purposes (Spring 3.0M2 has controllers designed especially for this purpose).

3. Authentication
The application currently does session based authentication. We could potentially look to Spring for handling authentication. We would gain persistence (remember me), an adaptor for authenticating with OpenID, LDAP, and an easier path to cross-domain authentication if that were to become a requirement in the future. Spring-Security (formerly ACEGI) also supports X.509 certificates.

4. UI
I showed Brian a mash-up that is a few years old called housingmaps.com. It basically takes craigslist and google maps and creates city-based maps of the current RSS listings. I think it is a good example of making the map the focal point of the UI. By making the map larger and moving the selection components to the right, I think we can really improve the UI.

5. REST
Also, I think we should look to using GET requests, which could facilitate additional functionality in the future such as remembering past searches or allowing users to easily find URLs in their address bar. We should always work towards a REST architecture.
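To illustrate the GET idea in item 5, a search could be expressed entirely in the URL so that it can be bookmarked or shared. The sketch below uses only the standard URLEncoder; the base path and the parameter names (indicator, region) are hypothetical and not part of the current gridviewer.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;

/** Hypothetical sketch: turns search parameters into a bookmarkable GET URL. */
final class SearchUrl {

    /** Appends the parameters to the base URL as an encoded query string. */
    static String build(String base, Map<String, String> params) {
        StringBuilder url = new StringBuilder(base);
        char separator = '?';
        for (Map.Entry<String, String> param : params.entrySet()) {
            url.append(separator)
               .append(encode(param.getKey()))
               .append('=')
               .append(encode(param.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    private static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }
}
```

Because the whole search lives in the address bar, "remembering" a past search is just a matter of saving the URL.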

These are just a couple initial thoughts for future development (post-August).

Friday, June 19, 2009

more services on gridviewer

There are now some more services on Grid-Viewer, and there are some visual tweaks.

Unfortunately, the data from multiple servers is not combining because of some back-end restrictions that I am going to work on next week. Some additional refactors include separating polygons from pinpoints (thus, coastal states won't have really dark shadows, the result of multiple pinpoints being drawn in the same spot) and storing data from multiple states in the markers (thus, flotplots should be able to distinguish which data came from which service in future versions).

Also coming: caching (with the help of Brian), fun Javascript and layout changes (with the help of Chris), and new multi-plotting and regional overlays.

Cheers everyone, have a good weekend!

Mini-GIPSE Store

I added a small GIPSE Store installation and GIPSE Service on the Dallas node to test out TX subset data so Peter can have some additional GIPSE services to display in Grid Viewer.

Ken created a new mailing list for those interested in GIPSE schema development. It is: gipse-schema@phgrid.net should anyone want to join the technical discussion around the GIPSE schema.

Interesting article on risk management in software development

In keeping with our agile development, I came across M. Jorgensen's article (direct link to pdf).

Abstract
Software professionals are, on average, over-optimistic about the required effort usage and the success probability of software development projects. This paper hypothesizes that common risk analysis processes may contribute to this over-optimism and over-confidence. Evidence from four experiments with software professionals, together with research in other domains, supports this hypothesis. The results of the experiments imply that in some situations more risk analysis leads to over-optimism and over-confidence, instead of the intended improvement of realism. Possible explanations of this counter-intuitive finding relate to results from cognitive science on “illusion-of-control,” “cognitive accessibility,” “the peak-end rule” and “risk as feeling.” The results suggest that it matters how risk analysis and effort estimation processes are combined. An approach is presented that is designed to avoid an increase in optimism and confidence with more risk analysis.

Thursday, June 18, 2009

grid-viewer now pulls its libraries from the phgrid repository

I uploaded the jars that grid-viewer uses to the phgrid repository, and updated the pom file to pull from the repositories.

So now, instead of having to track down the various files from globus, amds service, and introduce... and then install them into the local repository... one can just type "mvn package" and maven will download the files.

I think this will help me a lot, and now that several people are using my code I am hoping it will help them a lot.

The one thing that will still need to be written up is that some of the files (all the ones specified as "provided" in the POM file) will need to be copied into the web container's shared lib directory. I am debating whether it would be more useful to change them from provided to the default scope (meaning they will be included in the war). So long as whoever is trying to set up grid-viewer isn't trying to use multiple grid-viewers (or other things that will be using the secure globus libraries), it should work.

Otherwise, I have made some minor changes to grid viewer, but I'm planning to make some progress on gridviewer behavior and performance tomorrow and next week.

Presentations About LOINC

http://loinc.org/slideshows

Wednesday, June 17, 2009

poicondai-2.0-noprops now available in the repository

Poicondai-2.0-noprops is sitting happily in the repository section of the website. Complete with a POM… and maven is actually nifty enough that when npdsgmaps references poicondai-2.0-noprops, mvn will read the pom and download all the extra dependencies needed for poicondai in addition to those needed specifically for npdsgmaps.

Simpler… if Project A needs Jar B… but Jar B also relies on jars C, D, and E… maven will read the pom associated with Jar B in the repository and also go fetch C, D, and E, even though C, D, and E aren't listed as dependencies in Project A's pom file. At least it seems that way from experimentation.
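The behavior described above is what Maven calls transitive dependencies. As a toy model of the idea only (this is not Maven's code, and real Maven also deals with scopes, exclusions, and version conflicts), the resolution amounts to walking the declared dependencies until nothing new turns up. The class name and the map-based input are assumptions for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of transitive dependency resolution; not Maven's actual algorithm. */
final class TransitiveDeps {

    /**
     * @param root     the project being built
     * @param declared what each artifact's pom lists as its direct dependencies
     * @return every artifact reachable from the root, excluding the root itself
     */
    static Set<String> resolve(String root, Map<String, List<String>> declared) {
        Set<String> resolved = new LinkedHashSet<String>();
        Deque<String> pending = new ArrayDeque<String>();
        pending.push(root);
        while (!pending.isEmpty()) {
            List<String> direct = declared.get(pending.pop());
            if (direct == null) {
                continue; // this artifact declares no dependencies
            }
            for (String dependency : direct) {
                if (resolved.add(dependency)) {
                    pending.push(dependency); // first time seen: read its pom too
                }
            }
        }
        return resolved;
    }
}
```

With "A" declaring only "B", and "B" declaring "C", "D", and "E", resolving "A" yields B, C, D, and E, which matches what the experiment showed.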

So, if you want to use this jar, you will need to create a properties file with the url, systemuser, system name, and password (look into the poicondai project to see what the filter is filling), but it will possibly make it much easier for people needing to access poison to include this.

More importantly, it got me comfortable with moving things into a non-local repository (also many thanks to Felicia's posts), because I anticipate I will have to move more things into the remote repository for the sake of fixing a RODSA-DAI bug and for people who want to use grid-viewer (it will keep them from having to do a jar hunt like I have so many times).

Now the next step would be finding a cool way to get all the provided jars into the lib container of a webserver. Maybe there is a deploy plugin for that sort of thing.

Tuesday, June 16, 2009

grid-viewer updated and returning data

Yay! Grid-viewer and the Test Biosense GIPSE service have united on the training node to have grid viewer returning data (most of it is in January right now).

I'm happy because it is connecting over a secure connection to get both metadata and data, and it's relatively speedy, and allows me to move on with future plans for grid viewer and how it will behave.

First, I will be working with Chris to get a second GIPSE service testing, which will allow me to finish the server-selection logic (i.e., you cannot select servers if the data you are looking for is out of their range. You won't be able to scan the North Carolina AMDS source for results in California or try to scan a BioSense node for Poison-based indicators).
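A minimal sketch of that selection rule, assuming each server's metadata boils down to a set of regions and a set of indicators it can answer for. The class names, field names, and the sample values in the usage note are all hypothetical; the real GIPSE metadata is richer than this.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of server allow/disallow logic based on service metadata. */
final class ServerSelection {

    static final class Server {
        final String name;
        final Set<String> regions;     // regions the server holds data for
        final Set<String> indicators;  // indicators the server can report on

        Server(String name, Set<String> regions, Set<String> indicators) {
            this.name = name;
            this.regions = regions;
            this.indicators = indicators;
        }

        boolean canAnswer(String region, String indicator) {
            return regions.contains(region) && indicators.contains(indicator);
        }
    }

    /** Returns the names of the servers that may be selected for this query. */
    static List<String> selectable(List<Server> servers, String region, String indicator) {
        List<String> names = new ArrayList<String>();
        for (Server server : servers) {
            if (server.canAnswer(region, indicator)) {
                names.add(server.name);
            }
        }
        return names;
    }
}
```

For example, a query for a California region would leave a North Carolina-only source out of the list, and a poison indicator would leave out a node that only reports other indicators.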

Then, I am going to finish separating the selection bits from the mapping bits, which will require some generification. Right now, because it's based off of quicksilver, the map is expecting a series of states, a series of zip3s in a state, or a series of zip5s in a zip3. While it is already flexible enough to allow only a handful of states as opposed to all of them, I hope to expand it to allow for collections of zip3s (that may cross state borders) or zip5s (that may cross zip3 borders) that don't necessarily have to be related. It should also be possible to select states and zip3s at the same time (but I am not sure if that would be terribly helpful or just confusing).

Finally, I am hoping to allow for cumulative data loads. Thus, you can run one query, then run another query with different pinpoints, and click and compare the data between the two queries.

In addition to all this and in the process, I am hoping to clean up some of how gmap-polygon behaves so that less data has to be stored in databases and things aren't repeated as often.

It's going to be a lot of work, but I think the end results will be rather nifty, and I'm just happy to have this version all wired together and working so I can move on.

Conflicting PostgreSQL drivers

While deploying the AMDSService (soon to be renamed to GIPSEService) to the training node, we noticed this error:
"SET AUTOCOMMIT TO OFF is no longer supported" being thrown by the AMDSService operations.

This was caused because the training node uses PostgreSQL 8.1 yet the globus_database_common package deploys the JDBC driver for PostgreSQL 7.3 (pg73jdbc2.jar). This causes a problem with the way that ibatis handles postgres connections (specifically ibatis turns autocommit off to maximize performance).

The AMDSService includes the postgreSQL 8 driver (postgresql-8.3-604.jdbc3.jar). To get around this error, we removed the 7.3 driver from the training node's Globus and the training node's Tomcat. Globus is happy using the updated driver, AMDSService is happy using the updated driver and RODSA-DAI is still happy using the updated driver.

This is not a problem for WS-Core Globus installs (like our Windows nodes) since WS-Core doesn't include the 7.3 driver.

Something to keep in mind for any Linux PHGrid nodes that use AMDSService is that they will need to remove the 7.3 postgres driver after installing Globus and the AMDSService. (This is accomplished by renaming $GLOBUS_LOCATION/lib/pg73jdbc2.jar to pg73jdbc2.jar.old, then redeploying Globus to tomcat.)
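As a quick sanity check for this kind of conflict, one could scan a lib directory's file names for more than one PostgreSQL driver jar. The sketch below is an assumption-laden helper, not part of Globus or AMDSService: the class name is made up, and the name patterns are guessed from the two jar names mentioned in this post.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper: spots PostgreSQL JDBC driver jars among lib directory entries. */
final class PgDriverCheck {

    /**
     * @param fileNames names from a lib directory, e.g. the result of
     *                  new java.io.File(libDir).list()
     * @return sorted names that look like active PostgreSQL driver jars
     */
    static List<String> driverJars(String[] fileNames) {
        List<String> found = new ArrayList<String>();
        if (fileNames == null) {
            return found; // File.list() returns null for a missing directory
        }
        for (String name : fileNames) {
            String lower = name.toLowerCase();
            boolean isJar = lower.endsWith(".jar"); // renamed *.jar.old files are ignored
            boolean looksLikeDriver = lower.startsWith("postgresql-")
                    || lower.matches("pg\\d+jdbc\\d*\\.jar");
            if (isJar && looksLikeDriver) {
                found.add(name);
            }
        }
        Collections.sort(found);
        return found;
    }

    /** More than one driver on the same classpath is the situation described above. */
    static boolean hasConflict(String[] fileNames) {
        return driverJars(fileNames).size() > 1;
    }
}
```

Run against a lib directory before and after the rename, it should report a conflict first and a single driver afterwards.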

Friday, June 12, 2009

Gridviewer is close

So, I managed to get gridviewer working in a sense last night.

It was pulling data and displaying it Quicksilver style when installed in tomcat and connecting to the GIPSE Globus Service running on a non-secure globus container. In Windows.

Today, I tried to build it for a training deploy when connecting to secure GIPSE Globus service running in a secure tomcat container. I found some more dependencies I needed to add. And then I found out that the GIPSE globus service seemed to be having configuration issues.

Thus, I figure the next best step is to get that GIPSE Globus Service running on a secure tomcat container in Windows... or move development over to my old Ubuntu development box (where I would still have to get the GIPSE service running in Linux). Both would be beneficial and work towards the goal of getting an environment more like the training node.

Otherwise, I hope to get that done relatively soon, and then start focusing on the refactors I have been planning for Grid Viewer but haven't been able to really work on in the attempts to just get something Grid-Viewer-ey completed and working (hopefully correctly).

The other thing that needs to happen is more services. One that is starting development will provide NPDS data in GIPSE form, and we should probably also deploy a smaller GIPSE service some place like Dallas so that we can pretend to have a smaller service that only has data for a few states and/or a few conditions. Multiple services with varying metadata are where Grid Viewer should get most of its niftiness.

Cheers!

June 2009 GeoInformatics - Cloud computing and GIS

The June 2009 issue of GeoInformatics is here:

http://fluidbook.microdesign.nl/geoinformatics/04-2009/

If you go to page 36....there is an article on Neogeography and Google;

They interview Jack Dangermond (ESRI) as well as Intergraph and other GIS vendors;

Jack Dangermond makes an interesting comment about the future of Desktop GIS and the increase in demand for cloud computing....

DHS project aims to bring open-source software to state and local agencies

The Homeland Security Department is funding a program that will help federal, state and local agencies better understand their options for using open-source software. More information can be found here.

Thursday, June 11, 2009

Tomcat SSL

Testing the latest Windows installation document for the grid node. The goal is to create a clear and simple guide for users to get a grid node up and running quickly. The primary issues I am running into are all in the certificate/SSL area.

It seems to me we could link to the installation documents already written by Apache on configuring SSL with Tomcat in a pre-requisite section. There are so many variables like the user account running tomcat, where to store the keystore file, whether they can use a 3rd party certificate or self-signed, is there load-balancing, etc. Either way, once a user has Tomcat running with a certificate then they could begin stepping through our Globus installation.

Basically I think we should not duplicate existing documentation, and it will allow us to focus on Globus.

Cloud vs Grid Comment

Good post. Thanks for the post.

To further articulate the differences and similarities between the EGEE Grid, Amazon Cloud and the Public Health Grid...

The EGEE Grid service focuses on "short-lived batch-style processing (job execution)". The Public Health Grid has these services available and plans to research and deploy them in the future, but our current focus is on public health and health-specific services.

The Amazon Cloud (and other clouds) service is "long-lived services based on hardware virtualization". The Public Health Grid does have a virtualization appliance, which we ship on DVD, and we are currently researching Grid in the cloud methodologies (University of Utah and Argonne specifically are working in this area).

The Public Health Grid services are long-lived services built using service-oriented architecture (SOA) methodologies and technologies. The team has built the first services: GIPSE, Grid Publisher and Grid Viewer. Each of these services may be accessed through this web site and downloaded through our open source repository.

Grid vs Cloud

I was asked yesterday what the primary differences were between grid and cloud computing. Struggling for a good answer, I did some quick google searches and I came across this interesting paper called EGEE Comparative Study: Grids and Clouds Evolution or Revolution? done in Nov 2008. I imagine most people have read it, but thought I would post just in case.

Wednesday, June 10, 2009

building! testing! Deploying?

So, I got a list of the jars (which can be found in the $GLOBUS_LOCATION$\lib directory after an introduce install) which grid-viewer now needs to run the GIPSE client:

addressing-1.0.jar
caGrid-metadata-security-1.3.jar
caGrid-ServiceSecurityProvider-client-1.3.jar
caGrid-ServiceSecurityProvider-stubs-1.3.jar
caGrid-ServiceSecurityProvider-common-1.3.jar
axis.jar (this has to be the version that is shipped with globus)
jaxrpc.jar
saaj.jar
cog-axis.jar
cog-jglobus.jar
commons-logging.jar
commons-discovery.jar
wss4j.jar
wsdl4j.jar
wsrf_core.jar
wsrf_core_stubs.jar
jce-jdk13.jar (and this one seems to have those pesky bouncycastle.org security libs)
puretls (*for secure access*)
cryptix32 (*for secure access*)
cryptix-asn1 (*for secure access*)

These are stored in my private repository and downloaded by maven because they are classified as "provided" in the POM (which means they are needed for compilation and testing, but will be provided in the classpath when the war is installed). Thus, I copied them all over to the tomcat/common/lib directory.

Initial attempts seem to be resulting in NoClassDefFound errors. So it seems there may be a few jars needed still in tomcat alone... or that tomcat is not properly loading the jars in the common space (and that might be the case, I remember RODSA-DAI having issues running in tomcat because of library/classpath issues).
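One way to narrow down this kind of problem is to ask the running class loader directly which classes it can and cannot see, instead of waiting for the next NoClassDefFound error. The following is a small sketch; the class name ClasspathCheck is hypothetical, and the list of class names to probe would have to be filled in from the stack traces actually observed.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical diagnostic: reports which of the given classes cannot be loaded. */
final class ClasspathCheck {

    /** Returns the subset of classNames that the current class loader cannot find. */
    static List<String> missing(List<String> classNames) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClasspathCheck.class.getClassLoader();
        }
        List<String> notFound = new ArrayList<String>();
        for (String name : classNames) {
            try {
                // initialize=false: only check visibility, do not run static initializers
                Class.forName(name, false, loader);
            } catch (ClassNotFoundException e) {
                notFound.add(name);
            } catch (LinkageError e) {
                notFound.add(name); // found, but one of its own dependencies is missing
            }
        }
        return notFound;
    }
}
```

Called from inside the deployed webapp (for instance from a throwaway servlet or JSP), this shows what the webapp's class loader sees, which can differ from what is sitting in the common lib directory.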

Either way, I am hoping to get it sorted out early tomorrow and have it returning data to the map. Then I want to get it tested on a secure-globus environment (in case some more security libraries are needed and to make sure that it plays nice with other secure clients like RODSA-DAI), and then I'll be ready to continue with the refactors.

Right now, the main refactorings have to do with shifting from a "state, zip3, zip5" paradigm to a "region" paradigm. This should allow for showing zip3s and zip5s and states all on the same map. A complementary paradigm shift will be allowing multiple loads on one map (load one query with one set of pushpins, load another query with another set of pushpins... so you can click both sets of pushpins and compare data from two queries on the same map). All with more services and more realtime options.
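One possible shape for the "region" paradigm, sketched with hypothetical names (none of these classes exist in gmap-polygon as far as this post says): a region is just a type plus a code, so a single load can hold states, zip3s, and zip5s side by side.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a generic region model replacing separate state/zip3/zip5 lists. */
final class Regions {

    enum Type { STATE, ZIP3, ZIP5 }

    static final class Region {
        final Type type;
        final String code; // e.g. a state abbreviation or a zip prefix

        Region(Type type, String code) {
            this.type = type;
            this.code = code;
        }
    }

    /** Groups a mixed list of regions by type, e.g. to pick the right polygon source for each. */
    static Map<Type, List<String>> groupByType(List<Region> regions) {
        Map<Type, List<String>> grouped = new EnumMap<Type, List<String>>(Type.class);
        for (Region region : regions) {
            List<String> codes = grouped.get(region.type);
            if (codes == null) {
                codes = new ArrayList<String>();
                grouped.put(region.type, codes);
            }
            codes.add(region.code);
        }
        return grouped;
    }
}
```

Multiple loads on one map could then be a list of such region collections, each tagged with its own pushpin style, but that part is left out of the sketch.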

Examining the relationship between GIPSE and SDMX-HD

I was recently pointed to a very interesting entity known as SDMX-HD.

Google Groups for SDMX-HD

SDMX-HD is a Statistical Data and Metadata Exchange (SDMX)-based data exchange format intended to serve the needs of the Monitoring and Evaluation community. It has been developed by WHO and partners to facilitate exchange of indicator definitions and data in aggregate data systems.


Experience with the UNAIDS IXF version 2.0 provided a basis for developing the SDMX-HD. New features in SDMX-HD include better support for domain-specific content for various stakeholders, cross-domain 'Metadata Common Vocabulary', hierarchical codelists, and the ability to generate generic or compact XML from a common data model. It will be based on the ISO SDMX standard and be a 'Content-Oriented Guideline' for the exchange of public health indicator data typically for Monitoring and Evaluation (M&E) activities and international reporting, e.g. PEPFAR.

Tuesday, June 9, 2009

Jar hunt complete

So, I managed to find all the jars that are needed to test (and eventually run) the use of the GIPSE client (at least the metadata portion of it)... and it is a long list pulling from both the caGrid and globus libs (which thankfully end up in the globus libs). The very long list is in the pom file of the grid viewer project and I plan to post it in more plain text on the wiki tomorrow.

The thing is, I had to hand-install these items into the maven repository... which means that anyone wanting to test or build the gridviewer project at this point would have to do the same... and after setting up a grid-node and the GIPSE service, it is arduous and annoying.

Thus, I am hoping to get most of the jars needed into sourceforge. It was done once before for RODSA-DAI already, and it should allow for people to get all the jars needed for testing after a simple property file change (which should be easily scriptable).

Furthermore, I am going to test running grid-viewer from the tomcat-enabled globus container... which should have all the libraries co-located in the lib directory, which will hopefully make it much easier to install grid-viewer after setting up a tomcat instance.

I'm sure there will need to be some finessing either way... and I will still need to check all this in a secure globus environment and get the data returning in the grid view.

But, it's exciting. Nothing feels more gratifying than seeing your test code come back with something other than "NoClassDefFoundError: "

Monday, June 8, 2009

Compiling

So, I managed to take out all the conflicts and update all the code in grid viewer, so now everything is successfully compiling.

Tomorrow will be re-engineering the tests to match the new test cases, and then, hopefully, a war that can be installed and run (even if the initial phase is just having the service loader pull back the metadata options).

Then, it's implementing some new features in line with the current refactor, which will ultimately make things simpler.

I'm excited, I think grid viewer is turning into something that will work rather well, and still be flexible enough to work in ways not yet anticipated.

Sunday, June 7, 2009

week 1

I spent most of my first week browsing the repositories and trying to get a grasp of the overall project objectives. And, of course, trying to learn the litany of acronyms (maybe we should start a post just to start documenting them for quick reference).

We did get a simple landing page put together. The team has built some terrific demonstration applications, so a summary page was needed to briefly describe their respective functionality and also launch to the actual demo. It is a simple html page with a little added javascript functionality. You can find it here.

I am thrilled to be working on this open source project that has enormous potential for public health informatics. There are really great individuals on the team and I am excited to be working with them.

Friday, June 5, 2009

Service Changes

The spec for GIPSE has changed, and so has the client, since the last time I got grid viewer working. Thus, I have been doing a lot of refactoring, and planning more refactoring for some of the places we want to take grid viewer, and it's taking some time.

On the service side, all the names of the objects being returned have changed, and that means all the classes have to be changed and the ways they are loaded too. That also means the metadata has changed and needs to be set up.

Also, the grid service will now allow for cumulative loads, in that one can load more than one set of results onto the map. That is a completely new paradigm that will have to be handled, in addition to the already shaky paradigms of multiple, variable regions and the new idea of having services come from a central repository (like a wiki page or UDDI) instead of a database.

But, I think I have finally gotten it all mapped out in my mind, and have built the task lists, and have started the massive refactoring that will be needed.

The thing is, I know this will happen again (new service bits), so I need to keep in mind where all these changes occur and try to make them as obvious and as isolated as possible. Then it is more likely that a change in the GIPSE structure can be propagated without a change in the grid view dynamics, and generally the fewer changes the better.

Finally, I think I am going to introduce a simple "this is the data we got back from the service" page for the sake of debugging and sanity. It will help immensely to see what is coming back, to figure out how the grid viewer is interpreting it, and to illuminate other options or assumptions.

Wednesday, June 3, 2009

GIPSE Loader

I added the top 30 BioSense Sub-Syndromes (as determined by Roseanne and Jerry) to the amds-db project and tested a load using the past 40 days of sub-syndrome RT data. This is increasing the size of the test data set that we use for testing the GIPSE services.

PS- In case you haven't noticed from Tom's massive renaming of all the AMDS-related wiki pages, AMDS has been renamed to Geocoded Interoperable Population Summary Exchange (GIPSE) by the NCPHI Director. So whenever you see GIPSE think AMDS.

old code running on new box

A lot of strange issues knocked down, some new ones cropped up.

Issues knocked down:

- Some of the m2eclipse issues: jar projects will find and eat other jar projects just fine, war projects still go "I can't copy this" and give errors that force you to go to the command line where it works just fine.

- Quicksilver, gmap-poly-web showing on my new box: Meaning I got all of the geodata and user data transferred to the new database and connecting okay. I found some rows that got omitted in the transfer, added them, and replaced the CSVs.

- gmap-polygon and gmap-polyweb version 1.0 have been updated to reflect their proper version in their pom files (before they were considered 1.1, not 1.0).

New issues:

- For some reason M2Eclipse will look up maven artifacts in Brian's version of eclipse, but not the one I installed.

- IE on my dev box is super-duper-secure. Meaning it doesn't like downloading little things like jQuery or google-map javascript IDEs (but I can look at it from other sources of javascript)

- The GIPSE spec, and resulting client, have changed a lot from prior loads. This means lots and lots of code needs to be updated in gridviewer not just to use all the pieces, but to reflect all the metadata options and the like.

Tuesday, June 2, 2009

new box, old code.

So, today has been a lot of "Try and get the code from the old system onto the new box."

And while it has not been particularly difficult, it has been rather tedious.

Most of the morning has been spent getting SQL Server Management Studio and figuring out that the data export for the CSVs of locational data from Postgres was done in some weird format (i.e., dumped from the terminal client into a text file), which caused all sorts of padding issues that needed to be repaired, and then checking the repaired data back into the repository.
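For what it's worth, this kind of repair can be scripted. The sketch below assumes the dump looks like the Postgres terminal client's default aligned output (cells padded with spaces and separated by "|", a dashed line under the header, and a "(N rows)" footer); that format is my assumption, since the post does not show the file, and the class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical cleaner for padded, pipe-separated terminal dumps. */
final class PsqlDumpCleaner {

    /** Converts aligned dump lines into comma-separated rows with padding removed. */
    static List<String> toCsv(List<String> lines) {
        List<String> rows = new ArrayList<String>();
        for (String line : lines) {
            String trimmed = line.trim();
            boolean isSeparator = trimmed.matches("[-+]+");
            boolean isFooter = trimmed.matches("\\(\\d+ rows?\\)");
            if (trimmed.isEmpty() || isSeparator || isFooter) {
                continue; // layout-only lines carry no data
            }
            String[] cells = line.split("\\|", -1);
            StringBuilder row = new StringBuilder();
            for (int i = 0; i < cells.length; i++) {
                if (i > 0) {
                    row.append(',');
                }
                row.append(cells[i].trim()); // drop the alignment padding
            }
            rows.add(row.toString());
        }
        return rows;
    }
}
```

Note that this does not quote cells that themselves contain commas, so it is only safe for simple code-like columns such as zip prefixes and state abbreviations.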

A lot of the afternoon was spent figuring out that there is a strange M2Eclipse behavior with our projects that keeps installed jars from being found. In the Eclipse instance I would build gmap-polygon, install it, and then try to build gmap-poly-web, only to have it blow up because it couldn't find gmap-polygon.jar, even though it was installed... and when I did it from the command line, it worked just fine. So that is something I am going to have to debug, because really, it is much easier to just right-click and get the pretty GUI to do it for you.

Another portion of the afternoon was spent fixing a vice-versa error where I thought that a branch was the spot for new code, and not the trunk... thus, I was moving recent code from branch locations back into the trunks (luckily, the Eclipse/Subclipse SVN browser works rather well for that sort of thing... but SourceForge's SVN is rather slow regardless).

By the end of today, I had a few small issues that keep me from being able to run any cool web-code on this new box:

Gmap-Poly-Web is not set to work with gmap-polygon version 1.1. I either need to upgrade gmap-poly-web to deal with the 1.1 version of gmap-polygon, or I need to download the 1.0 version of gmap-polygon and install it long enough to build gmap-poly-web (I will probably do the latter as gmap-poly-web functionality can be seen in both quicksilver and grid-viewer).

I don't have the username/password tables for poicondai on this box. Thus, I will need to find them and import them before I can log into poicondai.

Finally, AMDSCore has been completely replaced with a new service which will need to be integrated. The main deploy target is going to be having gridviewer working with this new service and thus providing more reliable data pulls.

This, if anything, has illuminated several points where we needed to update our "this is how you get/build/install this software" entries and shown us a few spots where we need to work to make things more automatic and user friendly if we want non-experienced programmers to have an easy time with it.

Cancer Registry - Useful Information

As we look at ways that a PHGrid infrastructure could augment cancer registry activities - here are some very useful links:

NPCR-Advancing E-cancer Reporting and Registry Operations (AERRO) (previously MERP):
www.cdc.gov/cancer/npcr/informatics/merp. This site has updated Registry diagrams and use case documents that describe the cancer registration business.

North American Association of Central Cancer Registries (NAACCR) (umbrella organization to coordinate standards across the cancer registry community) http://www.naaccr.org/

Electronic Pathology Reporting standard developed by NAACCR that uses the HL7 2.3.1 standard (working on update to HL7 2.5.1) and HL7 Messaging WorkBench to validate the message content: http://www.naaccr.org/index.asp?Col_SectionKey=7&Col_ContentID=501

NAACCR transmission format that is used to transmit data from the hospital cancer registry and from the central cancer registry (state) to the national agencies can be found at: http://www.naaccr.org/index.asp?Col_SectionKey=7&Col_ContentID=133. On page 58 of this document is a table that lists all of the data elements with a cross mapping of the agency that requires collection.

You can find more information about all of the Registry Plus software tools at http://www.cdc.gov/cancer/npcr/tools/registryplus/. A link to additional information on Link Plus can be found at the same URL.

Monday, June 1, 2009

AMDS Service Beta updates

I worked with the CSC/NEDSS programmer to test out their usage of AMDSService. I left out some of the boilerplate jars, but after adding those they were able to configure, build and successfully test out the AMDSService running on their local node. So this is progress.

New box!

So, a large portion of today was spent moving from the Linux development station I am used-to to a new Windows development station I am not-as-used-to.

Some of the things were easier to set up. The biggest help was being able to share other windows development boxes and being able to nab their already-downloaded copies of Java, eclipse, and Globus-ws core files. The setup for globus also seems to be a lot snappier in Windows (but at the same time, I am doing a much less involved install, and this isn't the first time I've done it).

Otherwise, I was able to build and run the client from the AMDS Service that was pulled over from another computer. Now I need to build and run the client from a fresh client downloaded from SVN and configured myself... so I anticipate a few "I am not sure what this property is" issues, but then it will be configuring the client, building the jars that grid viewer needs on my box (gmap-polygon) and figuring out what is needed between those steps.

NHIN Specification Factory

At the end of April, the NHIN released a detailed document outlining current NHIN specifications. The NHIN Specification Factory has been uploaded to the wiki and can be downloaded at http://sites.google.com/site/phgrid/nhin-interoperability-1/SpecFactoryApprovedQ2Q3Tasking.ppt?attredirects=0 in the general section of NHIN Interoperability at http://sites.google.com/site/phgrid/nhin-interoperability-1 .

BioSense AMDS Service - Beta Release

The AMDS Service for BioSense is now in beta release. Please view details of how to download, configure, build, deploy and use it on the service registry page.

You can also download the raw gar from sourceforge. But I recommend getting the source and building with your own configuration.

This service uses the updated 5/31 AMDS draft.

This service is specifically developed to share BioSense aggregate data over PHGrid, but can actually be used for any JDBC data source that wants to be shared using the AMDS spec.

This release is significantly different than the 4/30 alpha release. Specifically we're using Introduce 1.3 (big improvement over 1.2) for service development and configuration management and iBATIS for easy db access / ORM. This release is smaller than the alpha release in size and lines of code so theoretically it will be easier to use. Please let me know any comments. We'll be following the weekly build schedule with a target of July 8 for code freeze.

Update: Tom asked me to explain that we're using iBATIS rather than Hibernate. Both are decent JDBC/ORM frameworks. I chose iBATIS because it has a lighter footprint and I got it working in about 15 minutes. This doesn't mean we won't use Hibernate in the future, but just that we're using iBATIS for now.