Prime Time for Proteomics


By Kevin Davies
Editor-in-Chief

March 7, 2002 | In 1994, a young Australian biologist named Marc Wilkins was searching for a simple term to describe the complete set of proteins encoded in a genome. It took a few years for his suggestion to take hold, but "proteomics," the catchy counterpart to "genomics," has become a fixture of biotechnology-related research papers, press releases, and business plans. The industry (and media) spotlight that was once trained on the race for individual genes, and later for complete genomes, is now shifting to this new field of discovery; "Introducing the biology of the future," cried one recent press release. It is worth asking whether the science lives up to its breathless billing. Judging by a clutch of reports published recently in the journals Nature, Science, and Current Biology, the hyperbole may in fact be justified.

The term may be relatively new, but the basic prerequisites for proteomics have existed for several decades. Two-dimensional gel electrophoresis is the routine if tedious method for separating complex mixtures of proteins, while sequence and structural databases have been an indispensable component of protein research for years. Thanks to recent advances in mass spectrometry and genome sequencing, researchers finally have the tools to conduct systematic surveys of the proteomes of various molecular machines, tissues, and organisms. While the first priority is to take a full inventory of proteins expressed in a given cell or tissue, researchers are particularly interested in mapping the complex network of physical partners for each protein. The logic is simple: if two proteins specifically associate under physiological conditions, there is probably a functional reason.

While the goal for proteomic companies is to translate information on human protein pathways into drug targets, many groups are ramping up by studying model organisms with more tractable protein collections. The rationale is not unlike Celera's decision to sequence the DNA of the fruit fly Drosophila melanogaster before embarking on the human genome. But whereas fruit flies carry some 14,000 genes, the consensus choice for initial proteomic studies is the baker's yeast, Saccharomyces cerevisiae, the genome of which was completely sequenced back in 1996. Possessing a mere 6,000 genes, yeast has just a fraction of the roughly 30,000 to 40,000 genes in the human genome, not to mention a five-year head start in terms of functional analysis.


Introducing the Interactome 
Writing in the January 10 issue of Nature, two industrial/academic consortia (one featuring investigators at MDS Proteomics in Canada and Denmark, the other at the German company Cellzome AG) describe impressive progress in organizing the yeast proteome. "A formidable challenge of postgenomic biology," according to Anne-Claude Gavin, Giulio Superti-Furga and colleagues at Cellzome, "is to understand how genetic information results in the concerted action of gene products in time and space to generate function." The first step in that quest is to characterize about 30,000 protein-protein interactions (the "interactome") in yeast, assuming each protein has five partners on average. The Cellzome approach is called tandem-affinity purification (TAP), but the two groups' methods are quite similar. First, prepare a "bait" protein by attaching a chemical tag. Next, introduce the DNA encoding the bait into a yeast cell. Then, fish out the bait proteins along with any attached partners by running the purified protein mixture through an affinity column. The resulting protein complexes are fingerprinted using mass spectrometry (MS) and identified using bioinformatics.
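
The 30,000 figure appears to follow from simple multiplication, taking the roughly 6,000 yeast genes mentioned earlier as the number of proteins. That reading is an assumption, since the calculation is not spelled out here, and the symbols below are introduced only for illustration:

```latex
\[
  N_{\text{interactions}} \;\approx\; N_{\text{proteins}} \times \bar{k}
  \;=\; 6{,}000 \times 5 \;=\; 30{,}000
\]
```

Here \(N_{\text{proteins}}\) is the number of yeast proteins and \(\bar{k}\) is the assumed average number of partners per protein.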

The joint effort from Cellzome and the European Molecular Biology Laboratory in Heidelberg, Germany, studied more than 1,700 yeast genes and identified 589 purified tag proteins, 80 percent of which were associated with other proteins. Following MS of these proteins (some present in amounts as small as 15 copies per cell) and bioinformatic data searches to identify redundancies, Gavin's group was left with 232 distinct protein complexes, sometimes referred to as "molecular machines." Ninety-eight of these complexes had been previously catalogued and deposited in the yeast protein database, but over 130 complexes were previously unknown, and 91 percent contained at least one protein of unknown function. In several cases, the authors found large complexes that are virtually identical in yeast and human cells, not surprising given that 40 percent of yeast proteins are conserved through evolution.

Featured Reports
A-C. Gavin et al. "Functional organization of the yeast proteome by systematic analysis of protein complexes." Nature 415, 141-147 (2002).

Y. Ho et al. "Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry." Nature 415, 180-183 (2002).

A.H. Tong et al. "A combined experimental and computational strategy to define protein interaction networks for peptide recognition modules." Science 295, 321-324 (2002).

J.S. Andersen et al. "Directed proteomic analysis of the human nucleolus." Current Biology 12, 1-11 (2002).

A.H. Fox et al. "Paraspeckles: A novel nuclear domain." Current Biology 12, 13-25 (2002).

At first glance, renderings of the maze of protein-protein interactions pouring out of such studies are akin to some incoherent piece of modern art, but Superti-Furga suggests that a better analogy is to a French pointillist painting. "If you stand too close," he explains, "all you see are single-colored dots. As you move away, you begin to see a coherent picture."

A similar approach called high-throughput mass spectrometric protein complex identification (HMS-PCI) was taken by Yuen Ho and coworkers from the University of Toronto and MDS Proteomics. Working from 600 baits, they identified more than 1,500 distinct interacting proteins, or 25 percent of the yeast proteome. Data on these novel complexes have been entered into the recently created Biomolecular Interaction Network Database (BIND), produced by co-authors Gary Bader and bioinformatician Christopher Hogue, which stores data on protein-protein interactions. The database includes a tool called PreBIND, which can be used to search abstracts in the scientific literature for information on protein-protein binding. (Visit www.binddb.org for more information. Full details of the yeast proteome maps are available at yeast.cellzome.com and www.mdsproteomics.com/yeast.)


Collaborative Approach
Another multicenter analysis of the yeast proteome demonstrates the value of combining "wet lab" and computational approaches for protein identification, thereby helping to discard some of the inevitable false positives. Four groups—led by Charles Boone and Hogue (University of Toronto), Stanley Fields (University of Washington), and Gianni Cesareni (University of Rome)—used two complementary methods to identify proteins that bind to a well-known protein-binding domain called SH3. The first involved a computational search for ligands that could potentially bind one or more of the 24 yeast proteins containing the SH3 domain. The second used the classic two-hybrid method developed by Fields to identify gene products binding to the SH3 domain. Combining both datasets revealed 59 interactions in common.
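
The idea of keeping only interactions supported by both methods can be sketched as a set intersection. This is a minimal illustration, not the authors' actual pipeline: the function name `common_interactions` and the protein labels are hypothetical placeholders, and the example assumes an interaction counts as the same regardless of which partner is listed first.

```python
def common_interactions(predicted, observed):
    """Return the interactions reported by both methods.

    Each interaction is a (protein_a, protein_b) pair. Pairs are stored as
    frozensets so that (a, b) and (b, a) are treated as the same interaction.
    """
    predicted_set = {frozenset(pair) for pair in predicted}
    observed_set = {frozenset(pair) for pair in observed}
    return predicted_set & observed_set


# Hypothetical placeholder data, not results from the study.
predicted = [("SH3_A", "P1"), ("SH3_A", "P2"), ("SH3_B", "P3")]  # computational search
observed = [("P1", "SH3_A"), ("SH3_B", "P4"), ("SH3_B", "P3")]   # two-hybrid screen

shared = common_interactions(predicted, observed)
for pair in sorted(sorted(p) for p in shared):
    print(" - ".join(pair))
```

Interactions seen by only one method are not necessarily wrong; the intersection simply gives a smaller, higher-confidence list to follow up first.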

These important studies are simply the first round in attempts to functionally characterize the yeast proteome, but given that the human genome may contain only five times as many genes as yeast, the Cellzome group concludes the current technology "may provide drug discovery programmes with a molecular context for the choice and evaluation of drug targets."

Indeed, in results published contemporaneously in Current Biology, the first steps toward that goal have been taken. In what is the largest proteomic study so far for a single human organelle, the groups of Angus Lamond at the University of Edinburgh and Matthias Mann in Denmark have teamed up to compile an inventory of the components of the human nucleolus. Originally described more than 150 years ago by Rudolf Wagner, the nucleolus is a dynamic, membrane-free compartment of the cell nucleus where many components of the ribosome (the cell's protein-synthesizing machinery) are produced. Using nanoelectrospray tandem MS to analyze the purified components, the Lamond-Mann group identified 271 nucleolar proteins, of which fully 30 percent were novel. Many were quite unexpected, including factors typically associated with protein synthesis and the cytoskeleton. The authors also describe dynamic novel nuclear compartments called paraspeckles, which are thought to be involved in the processing of RNA.*

