Proteomic Technologies Informatics Workshop - February 8-9, 2005 - The Fairmont Olympic Hotel,  Seattle, WA

Session 5: Requirements of a General Clinical Proteomics Informatics Resource

Discussion Group Leaders:
Stephen George Oliver, Ph.D., Faculty of Life Sciences, University of Manchester
Samir M. Hanash, M.D., Ph.D., Fred Hutchinson Cancer Research Center
Martin W. McIntosh, Ph.D., Fred Hutchinson Cancer Research Center

In this final session, the discussion leaders presented use-case scenarios for a general clinical proteomics resource and outlined what they saw as the resources needed to develop a general clinical proteomics data repository to support biomarker discovery.

Dr. Oliver:

The proteome is central to the functional genomics agenda; proteins are directly linked to the genome. However, characterizing the proteome is technically more challenging than characterizing the genome or the transcriptome. Dr. Oliver then discussed the Proteome Experimental Data Repository (PEDRo), a database model that was developed in the Consortium for Genomics of Microbial Eukaryotes (COGEME; www.cogeme.man.ac.uk). The PEDRo model was published (Taylor CF, et al. Nat Biotechnol 2003;21:247-254) following feedback from the wider community. At the time of publication, it contained no complete datasets. Recently, however, a database containing PEDRo proteomic data from seven species (Pierre) has been developed that will be online later this month.

PEDRo was designed to provide enough detail to allow analysis and comparison of results from different experiments, to allow the suitability of experiment design and implementation decisions to be assessed, and to allow protein identification to be rerun in the future using new databases or software. The system is not detailed enough to allow the experiments themselves to be rerun.

Dr. Oliver also discussed other resources, including the Genome Information Management System (GIMS; download at http://img.cs.man.ac.uk/gims), a Java-based tool that allows close integration of the programming language with the database. Using the object database FastObjects, GIMS allows rapid access to database data from application programs and allows data to be stored in a way that reflects the underlying mechanisms in the organism. The GIMS user interface allows the user to browse the database, ask canned queries, and store and combine datasets. Results may be saved as text, HTML, or XML.

He then highlighted several other resources, including myGrid (www.mygrid.org.uk), an open-source upper-middleware for bioinformatics, and the In Silico Proteome Integrated Data Environment Resource (iSPIDER; http://www.ispider.ac.uk), an integrated platform of proteomic data resources enabled as grid/web services. Existing infrastructure to support iSPIDER includes myGrid, AutoMed, PSI/PEDRo infrastructure and standards, and protein identification tools at the University of Manchester.

General Discussion:

Dr. Hartwell then reflected on the workshop, noting that much of the most interesting development activity is uncoordinated and most likely duplicative. He noted that this meeting will help enable these activities to collaborate, through caBIG and other means. Stressing that the activities and discussions should not end today, Dr. Hartwell reiterated Dr. Olson's challenge of having two or more groups analyze each other's raw data as a way to measure achievement of consensus. Also, the community must articulate a grand goal that is currently beyond reach, so that it defines a marker of success. He noted that no current proteomic activities espouse this type of goal, and he suggested using biomarkers for disease as an endpoint that will completely transform medicine.
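
As an illustration only, the challenge of having groups analyze each other's raw data could be scored with a simple set-overlap measure. The sketch below is not a method proposed at the workshop; it assumes each group reports a list of protein identifiers for the same raw dataset, and the function name `identification_overlap` and the identifiers are hypothetical placeholders.

```python
def identification_overlap(group_a, group_b):
    """Return the Jaccard index of two collections of protein identifiers.

    1.0 means both groups reported exactly the same proteins;
    0.0 means they share none. Two empty lists are treated as full agreement.
    """
    a, b = set(group_a), set(group_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical identifications from two groups analyzing the same raw data.
lab_1 = ["PROT_A", "PROT_B", "PROT_C", "PROT_D"]
lab_2 = ["PROT_A", "PROT_C", "PROT_D", "PROT_E"]

# Three shared proteins out of five distinct ones.
print(identification_overlap(lab_1, lab_2))  # 0.6
```

A single number like this hides which proteins disagree, so in practice the two difference sets would be worth reporting alongside it.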

A participant commented that human cancers arise from numerous mechanisms and are heterogeneous compared with genetically induced mouse cancers. Thus, to translate mouse proteomics to clinical studies, proteomic data must be linked to clinical information. Considering that only a subset of cancer patients may respond well to a particular therapy, the link between proteomic and clinical data will inform hypotheses for future clinical trials. Also, patient consent and confidentiality must be built into the system, and the consortia may serve as a model on which to build. Another participant noted that a database compiling data from a variety of cancers, limited to biomarkers that have passed some sort of empirical evaluation process, would be a valuable resource.

Dr. Downing noted that the EDRN has developed an architecture for discovery and validation of biomarkers. The NCI would like these consortia to be a pathway that enables discovery in a complementary, yet different, way. How will this new resource facilitate such discovery for the clinic? One attendee suggested defining a specific challenge goal for the consortia. In 2003, attendees at a HUPO/NIH meeting set the goal of reliably identifying and quantitating 5000 proteins in serum, plasma, and tissue in a three-year time frame. While this challenge has not yet been met, posing a similar challenge to detect and quantitate a number of proteins in mouse (or human) serum would represent a goal to be met in time.

Another attendee observed that the heterogeneity of human cancers will necessitate help from the NCI Specialized Programs of Research Excellence (SPOREs) and other members of the clinical community. Although it is possible to post raw data from human specimens, consent forms may prevent publishing of background data, even if de-identified. To this end, it was suggested that the NCI could assist, perhaps by making the data available to a small group, but not the public. Also, the message for the public must be controlled; e.g., a biomarker shall be defined as such only when it has been validated. Dr. Downing said that the NCI sees these projects as a path forward and that the Institute is actively engaged in a pilot project among its prostate cancer SPOREs for the National Biospecimen Network to develop a shared repository for specimens and data used in an inter-institutional biomarkers study. Several attendees commented on the natural synergy among different data types, noting that the proteomics enterprise is evolving toward a systems biology perspective.

National Cancer Institute
National Institutes of Health
U.S. Department of Health and Human Services