Friday, July 18, 2008

ESP:SS Data Structure Discussions

I met with the CDC BioSense epi team on Wednesday to discuss the proposed ESP:SS data formats (described on the Harvard ESP wiki) and to get some feedback that will be useful to Dr. Lazarus.

Below are the feedback and questions from the BioSense subject matter experts.

For level 1 data, a few new fields were suggested:
-Facility zip - in place of the nonspecific zip field.
-Syndrome classifier author - "ESP", "BioSense", "ESSENCE", etc. For ESP this will always be the same value, but for other providers it may differ.
-Denominator - patient visit count. The number of visits is based on total visits, irrespective of whether a visit contained clinical data that binned to a syndrome.
-Facility count - number of facilities providing data for syndrome count & denominator.
-Patient Class/Type - A patient class/type health indicator ‘category’ is needed. Again, we refer to this as a “bucket” in today’s BioSense. See attached for the current bucket list. Buckets 11-19 are the 9 core buckets that are used. That is, we may receive data that fall into the other buckets, but those data are considered ineligible for use in analytics/BioSense. Buckets 11-19 cover the 3 core Patient Class settings (Emergency, Inpatient, Outpatient). Within each class setting, there are 3 buckets representing early indicator, working diagnosis, and final diagnosis (Emergency buckets 11-13; Inpatient buckets 14-16; Outpatient buckets 17-19).
-Age group - the age groupings are still to be determined.
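
The bucket numbering described above is regular enough to decode arithmetically. The following is a minimal Python sketch, not BioSense code: the names `decode_bucket` and `Bucket` are hypothetical, and it assumes that within each patient class the three buckets appear in the order listed above (early indicator, working diagnosis, final diagnosis).

```python
from typing import NamedTuple, Optional

# Core patient class settings, in the order of their bucket ranges (11-13, 14-16, 17-19).
PATIENT_CLASSES = ("Emergency", "Inpatient", "Outpatient")
# Assumption: stages within a class follow the order given in the feedback.
STAGES = ("early indicator", "working diagnosis", "final diagnosis")


class Bucket(NamedTuple):
    patient_class: str
    stage: str


def decode_bucket(bucket: int) -> Optional[Bucket]:
    """Return the class and stage for core buckets 11-19, or None otherwise.

    None marks data that fall outside the core buckets and are therefore
    not eligible for analytics.
    """
    if not 11 <= bucket <= 19:
        return None
    class_index, stage_index = divmod(bucket - 11, 3)
    return Bucket(PATIENT_CLASSES[class_index], STAGES[stage_index])


print(decode_bucket(14))  # Inpatient, early indicator (under the ordering assumption)
print(decode_bucket(7))   # None: not a core bucket
```

If the attached bucket list orders the stages differently, only the `STAGES` tuple would need to change.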

For level 2 data, a few new fields were suggested:
All of the new fields suggested for level 1 will also be included in level 2, along with:
-Patient zip - zip code of the patient residence (may be able to be derived from the geocode).
-Chief Complaint or Diagnosis - the chief complaint or final diagnosis text used to classify the patient encounter syndrome.
-Patient linker id - pseudonymous id that can link a unit record back to a patient by cross-referencing with the source facility.

Also, for level 2, Dr. Lazarus asked for a few clarifications on fields:
-Geocode should be for the patient residence
-Age should be in years, allowing decimal values to provide for months/weeks/days.
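
To illustrate the decimal-age clarification, here is a small Python sketch of how an age recorded in months, weeks, or days might be expressed in years. The function name `age_in_years`, the unit labels, the 365.25-day year, and the three-decimal rounding are all assumptions made for the example; none of them come from the ESP:SS format.

```python
DAYS_PER_YEAR = 365.25  # assumption: average year length, including leap years

# How many of each unit make up one year.
UNITS_PER_YEAR = {
    "years": 1.0,
    "months": 12.0,
    "weeks": DAYS_PER_YEAR / 7,
    "days": DAYS_PER_YEAR,
}


def age_in_years(value: float, unit: str) -> float:
    """Convert an age in the given unit to decimal years, rounded to 3 places."""
    if unit not in UNITS_PER_YEAR:
        raise ValueError(f"unknown age unit: {unit!r}")
    return round(value / UNITS_PER_YEAR[unit], 3)


print(age_in_years(6, "months"))  # 0.5
print(age_in_years(10, "days"))   # 0.027
```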

For level 3, the only new fields suggested were the additional fields added to level 1 & level 2.

The group also had a few questions that I will forward on to ESP and see what their response is:
-How can we address the potential privacy issues with using patient geocode in level 2? There are several GIS techniques that provide privacy controls; perhaps they should be discussed in the data field design.
-How to find the patient count denominator for level 2 data? Since level 2 is unit records, is there a separate query to find the denominator?
-How are multi-syndrome patient encounter events recorded?
-What are the use cases for the level 3 data?
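
As a concrete reference point for the geocode privacy question, one simple masking approach is to reduce coordinate precision before a record leaves the source facility. The Python sketch below only illustrates that idea; it is not a technique the group or ESP has chosen, and the function name `coarsen_geocode` and the default of two decimal places are assumptions for the example.

```python
from typing import Tuple


def coarsen_geocode(lat: float, lon: float, decimals: int = 2) -> Tuple[float, float]:
    """Round a latitude/longitude pair to fewer decimal places.

    Two decimal places correspond to roughly 1.1 km of latitude, so the
    reported point no longer identifies an individual residence.
    """
    return (round(lat, decimals), round(lon, decimals))


# Hypothetical patient residence coordinates, used only for illustration.
print(coarsen_geocode(42.3601, -71.0589))  # (42.36, -71.06)
```

Rounding trades spatial resolution for privacy, so the right precision would depend on what the level 2 use cases need.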
