Meeting Summary
Overview of Mouse Proteomic Technology Consortia and Informatics Plans

Samir M. Hanash, M.D., Ph.D., Program Head, Molecular Diagnostics, Fred Hutchinson Cancer Research Center

In this session, principal investigators of the mouse proteomic technology consortia outlined their current experimental plans and introduced the data that they will generate and place into a public informatics data repository. The informatics platform currently being developed by the consortia was also outlined so that attendees could understand the scope (and limits) of the consortia informatics plans.

The "Eastern" Consortium (Samir Hanash, PI): Dr. Hanash noted that the "Eastern" Consortium (composed of the Fred Hutchinson Cancer Research Center, the Harvard Partners Center for Genetics & Genomics, Massachusetts Institute of Technology, the Dana Farber Cancer Institute, the Van Andel Research Institute, and Memorial Sloan-Kettering Cancer Center) emphasizes two fundamental questions: 1) Are proteomic technologies suitable for cancer marker discovery in sera from mouse models? and 2) Are mouse models suitable for discovering cancer markers that are applicable to humans? He stated that the consortium leverages existing expertise and resources to meet program objectives without duplicating work already done or in progress. Leveraged consortium resources include engineered mouse models of different types of adenocarcinoma with genomic and transcriptomic data, extensive studies of corresponding human adenocarcinomas (e.g., the Early Detection Research Network (EDRN) and genomic, transcriptomic, and proteomic data), and multi-investigator studies of human serum and plasma from the Plasma Proteome Project (PPP) of the Human Proteome Organization (HUPO). Mouse models used by the consortium span a range of adenocarcinomas, including colon/GI, pancreas, lung, and ovarian cancers.
Consortium members' experience with these human adenocarcinoma tumors will allow comparisons and transitions from mouse to human studies.

A continuum of technologies will be tested, spanning the range from "shotgun" proteomics to extensive fractionation of intact proteins. Antibody microarray-based technologies will also be applied for discovery and validation. One specific technology is the Whole Proteome Scan using an Intact Protein Analysis System (IPAS). IPAS employs dyes to label proteins, detecting and measuring low-abundance proteins using prostate-specific antigen (PSA) as a reference. This strategy has been tested in a mouse xenograft lung cancer model to search for human proteins in the mouse plasma following tumor development from implanted human cancer cells. Validation strategies used by the consortium include antibody microarrays and cross-validation with human tumors. The goal is to determine the relevance of specific proteins to human cancer.

The strategy will generate a large volume of data, which highlights the need to discern which information should be captured and placed into a repository. Using IPAS, processed samples are combined into a single set that is subjected to fractionation. These fractions are fractionated a second time, and molecular weight information is determined by gel analysis. Identified proteins are then digested, and the resultant peptides are analyzed using mass spectrometry (MS). An annotation database will combine all annotations in an Extensible Markup Language (XML) file that can be annotated and deposited in the database. The data system is currently in development and still lacks the annotations needed for storage and meaningful querying of these data. Dr. Hanash concluded by noting that the consortium wishes to make data publicly available, although the challenge is to determine which data need to be made available.

The "Western" Consortium (Martin McIntosh, PI): Dr. McIntosh discussed the "Western" Consortium, composed of the Fred Hutchinson Cancer Research Center (FHCRC; laboratory integration and informatics development), the Institute for Systems Biology (ISB; informatics tools and fractionation schemes), the Pacific Northwest National Laboratory (PNNL; fractionation and mass tags table), and the Plasma Proteome Institute (PPI; antibody enrichment, fractionation, and target database). The primary consortium goal is to develop public resources for mining the mouse model serum proteome. Other goals include proof-of-principle of biomarker discovery using high-resolution MS as a platform and analysis of the normal variability in serum protein concentrations among and between healthy mice. Deliverables include high-quality data, a public database and query tools, and an open-source pipeline for serum proteomics.

For mouse models, genetic variability is minimized through closely controlled breeding strategies. A two-stage sampling strategy for mouse model plasma includes a discovery cohort, with samples collected just prior to sacrifice for cases and controls, and a validation cohort, with samples collected four weeks apart up to sacrifice for cases and controls. This strategy is designed to mimic human studies. The mammary adenocarcinoma model will be profiled comprehensively, and validation samples will be banked for models of prostate, epithelial ovarian, GI adenoma, skin papilloma, lung, lung adenoma, and mammary carcinoma.

Other consortium resources include an accurate mass tag (AMT) table generated by high-resolution sequencing. MS platforms include a Micromass LCT Premier electrospray/time-of-flight (ESI/TOF) instrument and tandem instruments, including a Thermo Finnigan LTQ and a Fourier transform ion cyclotron resonance (FTICR) instrument for the AMT database. All algorithms are built into the open-source MS pipeline and allow the identification of the monoisotopic mass and hydrophobicity of discriminatory peptides.
Biomarker discovery will be conducted via generation of a peptide array following image and peptide alignment and normalization.

Consortium efforts during the first year of the funding period will emphasize platform establishment, and efforts during the following year will generate data for the mouse model database. Numerous fractionation schemes and quantitative approaches (e.g., isotope-coded affinity tags (ICAT), ¹⁸O reference standards, N-terminal labeling) will be evaluated for optimization. Evaluation criteria for fractionation and quantitation include the number of unique peptides and the reproducibility of signal intensity, as well as how best to allocate resources in the second year of funding. The consortium will generate a complete system of open-source tools. Data from both consortia will be stored at a single site and presented using a common analytic strategy.

Limitations of this plan include a central focus on MS, the use of a single organism and only a few well-defined protocols and platforms, and a focus on a subset of specific research questions. The consortium welcomes input from meeting participants on strategies to make its data and informatics resources as useful as possible to the scientific community, including identifying other uses of consortium data, the data elements critical to those use cases, and strategies to make this platform most applicable to clinical proteomics.

Discussion: One attendee inquired about the interchange of activities between the consortia, and consortia leaders noted that a central data repository will be created for disseminating results to the public. While the consortia have differing philosophies, the groups are currently identifying common principles regarding data entry into public databases. It was also noted that HUPO will discuss the challenges of developing common standards and supporting open-access models at its Proteomics Standards Initiative Spring Workshop, held in Siena, Italy on April 17-20, 2005.
Such issues are part of a larger challenge, and the consortium can be viewed as a model for other projects occurring globally.

Another participant asked whether the consortia plan to cross-validate data. While time constraints are limiting, the consortia will cross-validate to the extent possible.

One workshop participant asked whether consortia members use a common database and how such a database is updated. It was noted that significant but irreproducible observations are common in proteomic databases. Another challenge has been changes in gene models that result in different proteins being dropped from the International Protein Index. One participant suggested that reading frames that have been removed could be stored in a searchable archive that features a way to correlate archived and newer entries.

One participant asked whether there has been any discussion in the consortia with respect to informatics for protein separation/fractionation and microarray approaches. It was noted that the consortia wish to adapt tools for microarrays to the extent possible, although the consortia will process data to a point where they can be integrated into standard, available microarray approaches.

Another participant inquired whether the two consortia have agreed to exchange samples, and Dr. McIntosh commented that the consortia are in negotiation. Ultimately, tissues and samples from both consortia will be banked.

Another attendee commented that two of the issues currently encountered by the consortia, evaluation criteria and the indexing of various genes, were also faced by the Human Genome Project (HGP). It was suggested that the clinical proteomics community explore ways to avoid repeating the ever-shifting mapping issues that consumed resources during the evolution of the HGP. To this end, guidelines from the community will be useful regarding how soon users wish to see proteomic data following its generation.