Yale Gerstein Lab




Pseudogenes in the Human Genome




Comprehensive Survey of Processed Pseudogenes in the Human Genome

We have identified ~8000 processed pseudogenes plus ~4000 duplicated pseudogenes in the latest GoldenPath human draft genome. You can either interactively search an online database (coming soon) or download the relevant data and texts from this page.



Analysis of Chromosome 22 Pseudogenes and Transcription

By integrating several sources of pseudogene annotation, we have identified 525 pseudogenes or pseudogene fragments on chromosome 22 of the human genome (NCBI Build 34). Using data from tiling microarrays and EST sequences, we found that about 5% of them were potentially transcribed. The relevant data and texts can be found here.



Analysis of Ribosomal Protein Pseudogenes

We have identified over 2000 ribosomal protein (RP) pseudogenes in the August 2001 freeze of the human genome draft. An interactive database holding all the results is being constructed. Meanwhile, the relevant data and texts can be found here.



Patterns of nucleotide substitution, insertion and deletion in the human genome

Using the recently identified human ribosomal protein (RP) pseudogene sequences, we have thoroughly studied DNA mutation patterns in the human genome. Overall, we found that nucleotide transitions are more common than transversions, by roughly a factor of two. Moreover, the substitution rates amongst the 12 possible nucleotide pairs are not homogeneous as they are affected by the type of immediately neighboring nucleotides and the overall local G+C content. We also found that deletions are about three times more common than insertions.
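As a rough illustration of the classification behind these numbers, the sketch below tallies transitions, transversions, insertions and deletions from a gapped pairwise alignment of a functional gene against one of its pseudogenes. It is a minimal example under stated assumptions, not the procedure used in the study: the function names, the toy sequences, and the choice to treat the functional gene as the reference are all hypothetical, and gaps are counted per alignment column rather than per indel event. Note that of the 12 possible substitutions, 4 are transitions and 8 are transversions, so a twofold excess of transitions is well above what uniform substitution would give.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_substitution(ref, alt):
    """Return 'transition' or 'transversion' for a single-base change."""
    if ref == alt:
        raise ValueError("ref and alt must differ")
    # A transition stays within purines (A<->G) or within pyrimidines (C<->T).
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def tally_differences(functional, pseudogene):
    """Count difference types column by column in a gapped alignment.

    Assumption: '-' marks a gap, and the functional gene is taken as the
    reference, so a gap in the pseudogene row is counted as a deletion.
    """
    if len(functional) != len(pseudogene):
        raise ValueError("aligned sequences must have equal length")
    counts = Counter()
    for ref, alt in zip(functional.upper(), pseudogene.upper()):
        if ref == alt:
            continue  # identical column, nothing to count
        if ref == "-":
            counts["insertion_column"] += 1
        elif alt == "-":
            counts["deletion_column"] += 1
        elif ref in BASES and alt in BASES:
            counts[classify_substitution(ref, alt)] += 1
        # Ambiguous bases such as N are skipped.
    return counts


# Toy alignment (made up for illustration only).
example_counts = tally_differences("ACGTACGT-AA", "ATGTGC-TCAC")
print(dict(example_counts))
```

Context effects such as the neighboring nucleotides or local G+C content would require recording the flanking bases for each substitution, which this sketch does not attempt.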



Analysis of Human Mitochondrial Ribosomal Protein (MRP) Pseudogenes

We have identified over 120 MRP pseudogenes in the August 2001 freeze of the human genome draft.



Analysis of Human Cytochrome c (cyc) Pseudogenes

We have identified over 49 cytochrome c (cyc) pseudogenes in the August 2001 freeze of the human genome draft.



Chromosome 21/22 Annotations

Specific pseudogene annotations for chromosomes 21 and 22 (Harrison et al., Genome Res., 2002):
  • pseudogenes whose primary match is a Swiss-Prot protein - chr21, chr22
  • pseudogenes whose primary match is an Ensembl protein - chr21, chr22
  • pseudogenes that are primarily Riken Centre or Sanger Centre annotations but are also detected by our procedures - chr21, chr22
  • a parseable table summarizing the above information - chr21, chr22. Please read the legend for this data.
  • SNPs - Analysis of single-nucleotide polymorphisms in human chromosomes 21 and 22.



Database (Under Construction)

Click here for the interactive database