Interview with John Doe 1 at JGI, Friday February 3, 2006
Interviewers: Anya, Annette
What is annotation?
There are two basic kinds of annotation, structural and functional. Structural annotation is deciding where the exons and introns, start and stop codons, and coding sequence (CDS) are located in the sequence, i.e., where the genes are located and what their structure is. Functional annotation is deciding what the protein coded by the gene does. Both types of annotation can be done by hand (manual annotation) or by any of various programs (automatic annotation). John Doe 1 doesn't do manual annotation; he only does automatic annotation. In fact, none of the other members of JGI's annotation group does routine manual annotation, though Andrea and Astrid did up until a year and a half ago.
Note on structure of genes: IMG covers almost exclusively prokaryotes, whose genes generally lack introns, so their gene structure is simpler.
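To make the distinction between structural and functional annotation concrete, here is a minimal sketch of how a gene model might be represented. It is not JGI's or IMG's data model; the class name GeneModel, its fields, the gene names, and the coordinates are all hypothetical, and the sketch assumes 1-based inclusive coordinates on the forward strand only.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical convention: 1-based, inclusive (start, end) on the forward strand.
Interval = Tuple[int, int]


@dataclass
class GeneModel:
    """Illustrative gene model; not the schema of any real annotation system."""
    name: str
    exons: List[Interval]       # structural annotation: exon positions
    cds_start: int              # structural annotation: first base of the start codon
    cds_end: int                # structural annotation: last base of the stop codon
    function: str = "unknown"   # functional annotation: what the protein does

    def introns(self) -> List[Interval]:
        """Introns are the gaps between consecutive exons."""
        ordered = sorted(self.exons)
        return [
            (prev_end + 1, next_start - 1)
            for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
        ]

    def cds_length(self) -> int:
        """Number of exon bases that fall inside the coding sequence."""
        total = 0
        for start, end in self.exons:
            lo = max(start, self.cds_start)
            hi = min(end, self.cds_end)
            if lo <= hi:
                total += hi - lo + 1
        return total


# A eukaryote-style model: three exons separated by two introns.
eukaryote_gene = GeneModel(
    name="hypothetical_gene_1",
    exons=[(100, 199), (300, 399), (500, 620)],
    cds_start=150,
    cds_end=598,
)

# A prokaryote-style model: a single uninterrupted coding region.
prokaryote_gene = GeneModel(
    name="hypothetical_gene_2",
    exons=[(1000, 1899)],
    cds_start=1000,
    cds_end=1899,
    function="putative kinase",  # made-up functional annotation
)

print(eukaryote_gene.introns())
print(prokaryote_gene.introns())
```

In this sketch the prokaryote-style model has a single interval and no introns, which is the sense in which its structure is simpler. Filling in the function field corresponds to functional annotation; everything else is structural.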
John Doe 1 showed us the manual annotation tool on JGI's Genome Portal. He wasn't sure whether or why anyone would have to fill in all the fields, or what the fields meant. The Genome Portal allows manual structural and functional annotations. You can even do ab initio annotations, creating a new gene model, but this is a complicated process and John Doe 1 wasn't confident about how to do it.
Who does the manual annotation now?
Collaborators. The model for jamborees has changed. JGI used to hold jamborees for collaborators to work together and annotate the genome. Now the annotation is done before the jamboree, and the jamboree itself is a regular scientific meeting, more like a conference. No manual annotation is done at the jamboree. John Doe 1 didn't know any more about who does what kind of annotation (besides the automatic annotations that he does).
Do you use IMG?
No. He hasn't really looked at it.
Why are people interested in annotations?
The JGI business model is to do sequencing and related services for collaborators, so annotation is one of the services. Collaborators come in several flavors, but they are all interested in specific organisms and what their genes do. Some collaborators are interested in a single organism that is different from its relatives. They study that organism to understand how the thing that's different about it works. They can do that by comparing the genes of the organism to the genes of its relatives and studying what the different genes do. Some collaborators are more interested in evolution, trying to determine how the branches of the tree of life evolved over time. They're interested in organisms that are at branch points of the tree and how the organisms on different joining branches differ. Other collaborators are interested in biochemical pathways, like photosynthesis. They study a model organism that is representative of organisms that use that pathway. Still other collaborators are interested in a particular protein family. They study what related proteins do in various different species, and how related proteins within a single species work together.
Are there competitors (other centers that have manual annotation tools)?
Sanger Center, UC Santa Cruz
Note: the lead investigator for the Chlamydomonas reinhardtii project is local (Stanford). That group is annotating its own genome.
What are the challenges in annotation?
John Doe 1 would like to be able to send email to whoever made an annotation. Lots of clicking is required to create one gene model; it's a tedious process. John Doe 1's one big complaint with the portal system is this: when you browse to a certain track in the browser, do something with it, and come back, the browser should put you back at the same track. Instead, it puts you back at the top of the page. John Doe 1 has the impression this would be an easy fix that just doesn't have priority for getting done. There are so many genes in a genome that most will never get manually annotated. The ones that do get annotated are usually annotated by only one person.