Figure 2

Composition of the Assemblage Genome Sequences as Determined by Similarity to Known DNA and Protein Sequences

(A) The percentage of “known” sequences, based on comparison to the SEED and environmental databases. A sequence was considered “known” if it had significant similarity (E < 10⁻⁵) to the SEED; otherwise “environmental” if it had similarity to any environmental database; and otherwise “unknown”.

(B) Breakdown of the “known” sequences into viral (both eukaryotic viruses and bacteriophages), prophage, Bacteria, Archaea, or Eukarya.
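
The three-way rule described for panel (A) can be sketched as a short decision cascade. This is only an illustrative sketch, not the authors' pipeline: the function names, the use of `None` for "no SEED hit", and the representation of an environmental match as a boolean are all assumptions, since the caption states a threshold only for the SEED comparison.

```python
from collections import Counter
from typing import Dict, Optional, Sequence

SEED_E_CUTOFF = 1e-5  # E-value threshold stated in the caption (E < 10^-5)
CATEGORIES = ("known", "environmental", "unknown")


def classify_sequence(seed_evalue: Optional[float],
                      has_environmental_hit: bool) -> str:
    """Assign one sequence to a panel (A) category.

    seed_evalue: best E-value against the SEED, or None if there was no hit
                 (hypothetical representation).
    has_environmental_hit: True if the sequence matched any environmental
                 database (assumed boolean; the caption gives no cutoff here).
    """
    # The SEED is checked first, so a SEED hit takes precedence.
    if seed_evalue is not None and seed_evalue < SEED_E_CUTOFF:
        return "known"
    if has_environmental_hit:
        return "environmental"
    return "unknown"


def percent_by_category(labels: Sequence[str]) -> Dict[str, float]:
    """Convert a list of category labels into percentages per category."""
    counts = Counter(labels)
    total = len(labels)
    return {c: (100.0 * counts[c] / total if total else 0.0)
            for c in CATEGORIES}
```

Because the checks run in order, a sequence that matches both the SEED and an environmental database is counted as “known”, which is how the caption's "otherwise" wording reads.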
