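A genome sequence contains all the information needed to build and run an organism. But what matters for understanding how an organism works is the code that determines when and where each gene is active. This is the genome's transcriptional regulatory code: the sequences that DNA-binding regulators use to control genome expression. A first draft of this code has now been compiled for yeast (Saccharomyces). It was obtained by combining data on the genomic binding locations of transcriptional regulators in yeast cells grown under different conditions with knowledge of genome sequence conservation and earlier evidence of regulator-DNA interactions. The resulting map of the regulatory code gives a partial view of how the regulatory potential contained in the genome is used in living cells.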
Transcriptional regulatory code of a eukaryotic genome
DNA-binding transcriptional regulators interpret the genome's regulatory code by binding to specific sequences to induce or repress gene expression [1]. Comparative genomics has recently been used to identify potential cis-regulatory sequences within the yeast genome on the basis of phylogenetic conservation [2-6], but this information alone does not reveal if or when transcriptional regulators occupy these binding sites. We have constructed an initial map of yeast's transcriptional regulatory code by identifying the sequence elements that are bound by regulators under various conditions and that are conserved among Saccharomyces species. The organization of regulatory elements in promoters and the environment-dependent use of these elements by regulators are discussed. We find that environment-specific use of regulatory elements predicts mechanistic models for the function of a large population of yeast's transcriptional regulators.
Figure 1 Discovering binding-site specificities for yeast transcriptional regulators. a, Cis-regulatory sequences likely to serve as recognition sites for transcriptional regulators were identified by combining information from genome-wide location data, phylogenetically conserved sequences and previously published evidence, as described in Supplementary Methods. The compendium of regulatory sequence motifs can be found in Supplementary Table 3. b, Selected sequence specificities that were rediscovered and were newly discovered are shown. The total height of the column is proportional to the information content of the position, and the individual letters have a height proportional to the product of their frequency and the information content [30].
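The height rule described for panel b can be illustrated with a minimal sketch. It assumes the common definition of information content for DNA (2 bits minus the Shannon entropy of the column) and omits any small-sample correction; the function name `logo_heights` and the input layout are hypothetical and are not taken from the paper.

```python
import math

def logo_heights(columns):
    """Return per-letter logo heights (in bits) for each motif position.

    columns: list of dicts mapping a base (A, C, G, T) to its observed count
    at that position. Assumes information content = 2 - entropy (bits),
    with no small-sample correction.
    """
    heights = []
    for column in columns:
        total = sum(column.values())
        freqs = {base: column.get(base, 0) / total for base in "ACGT"}
        # Shannon entropy of the column; zero-frequency bases contribute nothing.
        entropy = -sum(p * math.log2(p) for p in freqs.values() if p > 0)
        info = 2.0 - entropy  # total column height
        # Each letter's height is its frequency times the information content.
        heights.append({base: p * info for base, p in freqs.items()})
    return heights
```

Under these assumptions the letter heights in a column sum to that column's information content, so a perfectly conserved position reaches 2 bits and a uniformly mixed position has zero height.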
Figure 2 Drafting the yeast transcriptional regulatory map. a, Portions of chromosomes illustrating locations of genes (grey rectangles) and conserved DNA sequences (coloured boxes) bound in vivo by transcriptional regulators. b, Combining binding data and sequence conservation data. The diagram depicts all sequences matching a motif from our compendium (top), all such conserved sequences (middle) and all such conserved sequences bound by a regulator (bottom). c, Regulator binding site distribution. The red line shows the distribution of distances from the start codon of open reading frames to binding sites in the adjacent upstream region. The green line represents a randomized distribution.
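The three tiers in panel b and the distance distribution in panel c can be sketched as successive filters over motif matches. This is only an illustration: the `MotifMatch` record, its field names, and the bin size are assumptions introduced here, and the conservation and binding calls are taken as precomputed booleans rather than derived as in the paper.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class MotifMatch:
    regulator: str          # hypothetical regulator label
    gene: str               # hypothetical downstream gene label
    distance_upstream: int  # bp from the start codon to the site
    conserved: bool         # assumed precomputed conservation call
    bound: bool             # assumed precomputed in vivo binding call

def layer_sites(matches):
    """Return (all matches, conserved matches, conserved and bound matches)."""
    all_matches = list(matches)
    conserved = [m for m in all_matches if m.conserved]
    conserved_bound = [m for m in conserved if m.bound]
    return all_matches, conserved, conserved_bound

def distance_histogram(sites, bin_size=100):
    """Count sites per upstream-distance bin, keyed by the bin's lower edge."""
    return Counter((m.distance_upstream // bin_size) * bin_size for m in sites)
```

Applying `distance_histogram` to the conserved, bound sites gives the kind of distribution plotted as the red line; a randomized control like the green line would need a separate shuffling step that is not sketched here.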
Figure 3 Yeast promoter architectures: single regulator architecture, promoter regions that contain one or more copies of the binding site sequence for a single regulator; repetitive motif architecture, promoter regions that contain multiple copies of a binding site sequence of a regulator; multiple regulator architecture, promoter regions that contain one or more copies of the binding site sequences for more than one regulator; co-occurring regulator architecture, promoters that contain binding site sequences for recurrent pairs of regulators. For the purposes of illustration, not all sites are shown and the scale is approximate. Additional information can be found in Supplementary Tables 4–6.
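The architecture definitions above can be expressed as a small classifier. This is a sketch under stated assumptions: a promoter is represented as a list with one regulator name per binding site, the labels are treated as non-exclusive, and a pair of regulators counts as "recurrent" once it appears together in at least `min_promoters` promoters. That threshold and the function names are hypothetical, not the paper's criterion.

```python
from collections import Counter
from itertools import combinations

def architecture_labels(sites):
    """Label one promoter given a list of regulator names, one per site."""
    counts = Counter(sites)
    labels = set()
    if len(counts) == 1:
        labels.add("single regulator")
    if any(n > 1 for n in counts.values()):
        labels.add("repetitive motif")
    if len(counts) > 1:
        labels.add("multiple regulator")
    return labels

def recurrent_pairs(promoters, min_promoters=2):
    """Find regulator pairs that share at least min_promoters promoters.

    promoters: dict mapping a promoter name to its list of regulator names.
    """
    pair_counts = Counter()
    for sites in promoters.values():
        # Count each unordered pair once per promoter.
        for pair in combinations(sorted(set(sites)), 2):
            pair_counts[pair] += 1
    return {pair for pair, n in pair_counts.items() if n >= min_promoters}
```

Promoters containing any pair returned by `recurrent_pairs` would fall under the co-occurring regulator architecture in this simplified scheme; a real analysis would likely test co-occurrence against a background expectation rather than use a fixed count.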
Figure 4 Environment-specific use of the transcriptional regulatory code. Four patterns of genome-wide binding behaviour are depicted on the left, where transcriptional regulators are represented by coloured circles and are placed above and below a set of target genes/promoters. The lines between the regulators and the target genes/promoters represent binding events. Specific examples of the environment-dependent behaviours are depicted on the right. Coloured circles represent regulators and coloured boxes represent their DNA binding sequences within specific promoter regions. We note that regulators might exhibit different behaviours when different pairs of conditions are compared.