Supplemental Tables Referenced in A Comprehensive Transcript Index of the Human Genome Generated Using Microarrays and Computational Approaches

Table S1. Complete list of the 48,614 transcripts in the Primary Transcript Index (PTI) described in the main text that were represented on the set of predicted transcript arrays (PTA), also described in the main text. The columns in this table are: 1) Rosetta Locus Projection (RLP) ID specific to the custom annotations associated with the PTI described in the main text; 2) category or class assigned to the custom annotations, as described in the main text; 3) RefSeq or UniGene (Build 138) accession number associated with the RLP (transcript and EST sequences from RefSeq and/or UniGene supporting the RLP); 4) official gene symbol, if available; and 5) chromosome on which the RLP is located.

Table S2. Complete list of 60 tissues and cell lines hybridized to the predicted transcript arrays described in the main text.

Table S3. List of 6 tissues and cell lines hybridized to the chromosome 20 genomic tiling arrays described in the main text.
Comparing the EVG Set with the Current Set of RefSeq Genes

To further validate the expression verified genes (EVGs) identified in the analysis presented in the main text, all probes from the Primary Transcript Index were mapped to the most recent set of RefSeq genes. For a probe to be assigned to a RefSeq sequence, 56 out of 60 bases had to match the positive strand of the RefSeq sequence with no gaps. Probes with hits to multiple RefSeqs from different LocusLink records were discarded. All locus projections containing probes mapping to the current RefSeq set were then summarized based on the EVG detection status and original locus projection category presented in the main text. The results of this summary are given in Supplemental Table S5.

Slightly more than 85% of the locus projections that mapped to the latest RefSeq gene set were detected as expression verified genes. This percentage is higher than expected based on the false negative predictions from the main manuscript, which suggests the estimates provided were somewhat conservative. However, because transcripts that are more highly expressed over a broader range of tissues are overrepresented in RefSeq and are therefore the easiest to detect using the microarray-based approach described in the main text, the conservative estimate provided in the main text is still warranted.

We also note that the percentage of EVGs in the known category (based on RefSeq alignments to the genome) is higher for those locus projections matching current RefSeqs (87%) than for all locus projections referenced in the paper (75%). This is likely due to the removal of incorrect provisional RefSeq sequences during the review process and again highlights the value of microarray validation pending full characterization of the human transcriptome.

The percentage of EVGs drops as the reliance on gene model predictions for the locus projection increases.
This is likely due to cases where the structure of the gene model was incorrect. Since one of the major criteria for determining an EVG is co-regulation of probes across conditions, probes designed against incorrect portions of a partially correct gene model will reduce the power to detect that gene.

Table S5. Comparison of expression verified gene (EVG) predictions with RefSeq sequences (March 2004). The first column represents the predicted gene categories as described in the main text. The second column gives the counts of PTI genes mapping to the current RefSeq set, by category, that were detected as EVGs. The third column gives the counts of those PTI genes mapping to the current RefSeq set that were not detected as EVGs. The fourth and fifth columns are the percentages associated with the second and third columns, respectively.
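
The probe assignment rule used for the RefSeq comparison (at least 56 of 60 bases matching the positive strand with no gaps, and probes hitting RefSeqs from more than one LocusLink record discarded) can be illustrated with a short sketch. This is only a minimal illustration under stated assumptions, not the pipeline used for the analysis: the function names, the dictionary layout, and the brute-force sliding-window search are all hypothetical, and a real run would presumably use an alignment tool rather than exhaustive comparison.

```python
# Minimal sketch of the probe-to-RefSeq assignment rule described in the text.
# All names and data structures here are hypothetical illustrations.

MIN_MATCHES = 56  # threshold stated in the text: 56 of 60 bases must match


def best_ungapped_match(probe, target):
    """Return the highest number of matching bases over all ungapped
    placements of `probe` along the positive strand of `target`.
    The reverse complement is deliberately not considered."""
    best = 0
    for start in range(len(target) - len(probe) + 1):
        window = target[start:start + len(probe)]
        matches = sum(1 for a, b in zip(probe, window) if a == b)
        best = max(best, matches)
    return best


def assign_probe(probe, refseqs):
    """Assign a probe to RefSeq sequences.

    `refseqs` maps an accession to a (locus_id, sequence) pair.
    Returns the set of accessions the probe is assigned to, or an empty
    set if the probe has no hit or hits RefSeqs from different loci.
    """
    hits = {
        accession: locus_id
        for accession, (locus_id, sequence) in refseqs.items()
        if best_ungapped_match(probe, sequence) >= MIN_MATCHES
    }
    # Discard probes whose hits span more than one locus record.
    if len(set(hits.values())) != 1:
        return set()
    return set(hits)
```

Hits to several RefSeqs from the same locus record are kept in this sketch, since the text discards only probes whose hits come from different LocusLink records.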
Acknowledgements for Supplemental Material