J Biomol Struct Dyn. 2010 Jun;27(6):821-41.
doi: 10.1080/073911010010524947.

Structure-based analysis of DNA sequence patterns guiding nucleosome positioning in vitro

Feng Cui et al. J Biomol Struct Dyn. 2010 Jun.

Abstract

Recent studies of genome-wide nucleosomal organization suggest that the DNA sequence is one of the major determinants of nucleosome positioning. Although the search for underlying patterns encoded in nucleosomal DNA has been going on for about 30 years, our knowledge of these patterns remains limited. Based on our evaluations of DNA deformation energy, we developed new scoring functions to predict nucleosome positioning. There are three principal differences between our approach and earlier studies: (i) we assume that the length of nucleosomal DNA varies from 146 to 147 bp; (ii) we consider the anisotropic flexibility of pyrimidine-purine (YR) dimeric steps in the context of their neighbors (e.g., YYRR versus RYRY); (iii) we postulate that alternating AT-rich and GC-rich motifs reflect sequence-dependent interactions between histone arginines and DNA in the minor groove. Using these functions, we analyzed 20 nucleosome positions mapped in vitro at single-nucleotide resolution (including clones 601, 603, 605, the pGUB plasmid, chicken beta-globin and three 5S rDNA genes). We predicted 15 of the 20 positions with 1-bp precision, and two positions with 2-bp precision. The predicted position of the '601' nucleosome (i.e., the optimum of the computed score) deviates from the experimentally determined unique position by no more than 1 bp, an accuracy exceeding that of earlier predictions. Our analysis reveals a clear heterogeneity of the nucleosomal sequences, which can be divided into two groups based on the positioning 'rules' they follow. The sequences of one group are enriched in highly deformable YR/YYRR motifs at the minor-groove bending sites SHL ±3.5 and ±5.5, which is similar to the alpha-satellite sequence used in most crystallized nucleosomes. Apparently, the positioning of these nucleosomes is determined by the interactions between histones H2A/H2B and the terminal parts of nucleosomal DNA.
In the other group (which includes the '601' clone) the same YR/YYRR motifs occur predominantly at the sites SHL ±1.5. The interaction between the H3/H4 tetramer and the central part of the nucleosomal DNA is likely to be responsible for the positioning of nucleosomes of this group, and the DNA trajectory in these nucleosomes may differ in detail from the published structures. Thus, from the stereochemical perspective, the in vitro nucleosomes studied here follow either an X-ray-like pattern (with strong deformations in the terminal parts of nucleosomal DNA), or an alternative pattern (with the deformations occurring predominantly in the central part of the nucleosomal DNA). The results presented here may be useful for genome-wide classification of nucleosomes, linking together structural and thermodynamic characteristics of nucleosomes with the underlying DNA sequence patterns guiding their positions.
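
To make point (ii) of the abstract concrete, the sketch below labels each YR step in a sequence by the purine/pyrimidine class of its flanking bases, so that a YYRR context can be told apart from an RYRY one. It is only an illustration of the idea of "YR steps in the context of their neighbors"; the function names are hypothetical and it is not the authors' scoring function.

```python
# Illustrative sketch only: function names are hypothetical, not from the paper.
PURINES = set("AG")      # R
PYRIMIDINES = set("CT")  # Y


def ry_class(base):
    """Map a single base to 'R' (purine) or 'Y' (pyrimidine)."""
    if base in PURINES:
        return "R"
    if base in PYRIMIDINES:
        return "Y"
    raise ValueError(f"unexpected base: {base!r}")


def yr_contexts(seq):
    """List (index, tetramer_class) for every YR step that has both neighbors.

    index is the 0-based position of the pyrimidine in the central YR step;
    tetramer_class is the R/Y pattern of the four bases around the step,
    e.g. 'YYRR' or 'RYRY'.
    """
    classes = "".join(ry_class(b) for b in seq.upper())
    hits = []
    # Start at 1 and stop 2 before the end so both flanking bases exist.
    for i in range(1, len(classes) - 2):
        if classes[i:i + 2] == "YR":
            hits.append((i, classes[i - 1:i + 3]))
    return hits


if __name__ == "__main__":
    for index, context in yr_contexts("TTAACGCG"):
        print(index, context)
```

YR steps at the very ends of the sequence are skipped here because their tetranucleotide context is incomplete; a real analysis would need to decide how to treat them.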

Figures

Figure 1
Comparison of the “mini-kink” model for nucleosomal DNA bending (22, 38), (a), and the crystallographic structure of a nucleosome (54), (b). (a) According to the model, bends into the major and minor grooves occur predominantly at the YR (red) and RY (blue) dimers, respectively. (b) Directionality and extent of the DNA bending in the nucleosomal X-ray structure (54) are generally consistent with the earlier model shown in (a). The DNA fragment shown corresponds to the superhelical locations SHL −4 and −3.5 in the X-ray structure 1KX3 (54). The color coding is the same as in (a): red for positive Roll (bending into the major groove) and blue for negative Roll (bending into the minor groove). The actual Roll values are about ±15-20°, which is close to ±22.5° estimated by the “mini-kink” model (22, 38).
Figure 2
Locations of the minor- and major-groove bending sites. (Left) The crystal structure of the 1KX5 nucleosome with 147-bp long DNA (54) shown schematically: the DNA fragment is divided into two halves, separated by the dyad (black ball and arrow). The base-pair centers in the ‘anterior’ half are represented by large balls, and the sugar-phosphate backbone is shown by a yellow ribbon. For the ‘posterior’ half of the nucleosome, the base-pair centers are connected by sticks. The fragments whose minor grooves face the histone octamer (grey cylinder) are colored in blue (minor-groove bending sites), while the fragments whose minor grooves face away from the histone are colored in red (major-groove bending sites). The minor- and major-groove sites are defined based on the Roll values of the 147-bp core DNA, shown in Figure S1. (Right) The exact location for each site in the ‘anterior’ half is given, which is symmetric to the corresponding site in the ‘posterior’ half with respect to position 74 (for the 147-bp template) or position 73.5 (for the 146-bp template). The sites with the highest Roll values (‘critical sites’) are indicated by larger letters (superhelical locations SHL −1.5, −2, −3.5, and −5.5).
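
The symmetry stated in this caption is simple arithmetic: with base pairs numbered from 1, a site at position i mirrors to 148 − i on the 147-bp template (dyad at 74) and to 147 − i on the 146-bp template (dyad at 73.5). A minimal sketch, assuming 1-based numbering and using hypothetical helper names:

```python
# Illustrative sketch only: helper names are hypothetical; positions are 1-based.

def dyad_center(template_length):
    """Dyad coordinate for a template: 74.0 for 147 bp, 73.5 for 146 bp."""
    return (template_length + 1) / 2


def mirror_position(pos, template_length):
    """Return the position symmetric to `pos` with respect to the dyad."""
    if not 1 <= pos <= template_length:
        raise ValueError("position outside the template")
    # Equivalent to 2 * dyad_center(template_length) - pos, kept in integers.
    return template_length + 1 - pos


if __name__ == "__main__":
    print(dyad_center(147), mirror_position(1, 147))
    print(dyad_center(146), mirror_position(1, 146))
```

On the 147-bp template the dyad base pair (74) maps onto itself, whereas on the 146-bp template the dyad falls between base pairs 73 and 74, so no base pair is its own mirror image.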
Figure 3
Positioning of the WW (AA:TT + AT + TA), SS (GG:CC + GC + CG) and YR (CA:TG + TA + CG) dimers in the 20 nucleosome sequences, each 147 bp in length (Table I). (a-c) Combined frequencies of occurrence for the three sets of dimers ‘symmetrized’ with respect to the dyad at base-pair step 73.5 (vertical dashed lines). (d-f) The distance autocorrelation function, P(n), represents the frequency of occurrence of two dimers from the same set (e.g., WW, SS or YR) with the distance n between them (70). Complete 147-bp fragments in both strands were used to calculate this function. The raw data are shown in thin lines and the 3-point averages are in thick lines (blue for WW, and red for SS). Note that this color coding is the same as in Figure 1, because the WW and SS dimers preferentially occur at the sites where DNA bends into the minor and major grooves, respectively (3, 34).
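
The two quantities plotted in this figure can be sketched in a few lines. The dimer sets follow the caption; everything else is an assumption made for illustration: the function names are hypothetical, symmetrization is implemented as averaging the profile with its mirror image about the central step, and the pair counts are left unnormalized on a single strand because the caption does not spell out the normalization of P(n).

```python
# Illustrative sketch only: function names, the symmetrization-by-averaging
# step and the unnormalized pair counts are assumptions, not the paper's code.
WW = {"AA", "TT", "AT", "TA"}
SS = {"GG", "CC", "GC", "CG"}
YR = {"CA", "TG", "TA", "CG"}


def step_indicator(seq, dimers):
    """Return 1/0 per base-pair step: 1 if the dimer at that step is in the set."""
    seq = seq.upper()
    return [1 if seq[i:i + 2] in dimers else 0 for i in range(len(seq) - 1)]


def symmetrized_profile(seqs, dimers):
    """Per-step frequency over equal-length sequences, averaged with its mirror."""
    n_steps = len(seqs[0]) - 1
    totals = [0] * n_steps
    for s in seqs:
        if len(s) - 1 != n_steps:
            raise ValueError("all sequences must have the same length")
        for i, hit in enumerate(step_indicator(s, dimers)):
            totals[i] += hit
    # Step i and step n_steps-1-i are mirror images about the central step.
    return [(totals[i] + totals[n_steps - 1 - i]) / (2 * len(seqs))
            for i in range(n_steps)]


def pair_counts(seq, dimers, max_distance):
    """Count pairs of in-set dimers separated by n steps, for n = 1..max_distance."""
    hits = step_indicator(seq, dimers)
    counts = [0] * (max_distance + 1)  # index 0 is unused
    for n in range(1, max_distance + 1):
        counts[n] = sum(hits[i] * hits[i + n] for i in range(len(hits) - n))
    return counts


if __name__ == "__main__":
    print(symmetrized_profile(["AATTGGCC"], WW))
    print(pair_counts("AATTGGCC", WW, 3))
```

Each of the three dimer sets is closed under reverse complementation (for example AA pairs with TT, and CA with TG), which is why averaging a profile with its mirror image gives the same result as counting the dimers on both strands.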
Figure 4
Profiles of the positioning scores combining individual sequence patterns. Solid curves (Set 1) represent the scores based on the ‘canonical’ sequence patterns YR, WW/WWW and SS/SSS. Dotted curves (Set 2) represent the scores based on the combined patterns including several new patterns (YYRR, RYRY and GC) in addition to the ‘canonical’ patterns. Weights of the patterns were pre-determined (see Methods and Table SIII, Sets 1 and 2). The experimental dyad positions are denoted by circles and squares. The filled and open circles represent the dyad positions in Groups I and II, respectively, while the squares represent the positions in Group III (see Table I).
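
As a rough illustration of what "combining individual sequence patterns" with pre-determined weights can mean, the sketch below forms a weighted sum of per-pattern score profiles and reports the position of the optimum. The profiles and weights in the usage example are made-up numbers, the function names are hypothetical, and a plain linear combination is an assumption; the actual pattern scores and weights are those given in the paper's Methods and Table SIII.

```python
# Illustrative sketch only: a linear combination is assumed, and all names
# and numbers here are hypothetical placeholders, not values from the paper.

def combine_profiles(profiles, weights):
    """Weighted sum of per-pattern score profiles, position by position.

    profiles: dict mapping pattern name -> list of scores per candidate position
    weights:  dict mapping pattern name -> weight for that pattern
    """
    names = list(weights)
    length = len(profiles[names[0]])
    if any(len(profiles[name]) != length for name in names):
        raise ValueError("all profiles must cover the same positions")
    return [sum(weights[name] * profiles[name][p] for name in names)
            for p in range(length)]


def best_position(combined):
    """0-based index of the highest combined score (first one on ties)."""
    return max(range(len(combined)), key=combined.__getitem__)


if __name__ == "__main__":
    toy_profiles = {"YR": [0, 1, 2, 1], "WW": [1, 0, 1, 0]}  # made-up scores
    toy_weights = {"YR": 1.0, "WW": 0.5}                     # made-up weights
    combined = combine_profiles(toy_profiles, toy_weights)
    print(combined, best_position(combined))
```

Comparing the index returned by `best_position` with an experimentally mapped dyad is the kind of check the circles and squares in this figure represent.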
Figure 5
Frequency of occurrence of WW (AA:TT + AT + TA) and SS (GG:CC + GC + CG) versus base-pair step position for the nucleosomal fragments in Groups I (a and c) and II (b and d). The combined frequencies are ‘symmetrized’ with respect to the dyad at base-pair step 73.5 (vertical dashed lines). The raw data are shown in thin lines and the 3-point averages are shown in thick lines. The lines are colored as in Figure 3.
Figure 6
The positioning score profiles combining sequence patterns with the optimized weights. The solid lines represent the scores based on the optimized weights for the positions of both Groups I and II. The dotted lines show the score profiles based on the weights for the Group I positions (a and e) or for the Group II positions (k); see Table II for the weight values. The experimental dyad positions are shown by circles and squares, as described in Figure 4.

References

    1. Kornberg RD, Lorch Y. Cell. 1999;98:285–294.
    2. Yuan GC, Liu YJ, Dion MF, Slack MD, Wu LF, Altschuler SJ, Rando OJ. Science. 2005;309:626–630.
    3. Segal E, Fondufe-Mittendorf Y, Chen L, Thåström A, Field Y, Moore IK, Wang JP, Widom J. Nature. 2006;442:772–778.
    4. Johnson SM, Tan FJ, McCullough HL, Riordan DP, Fire AZ. Genome Res. 2006;16:1505–1516.
    5. Lee W, Tillo D, Bray N, Morse RH, Davis RW, Hughes TR, Nislow C. Nat Genet. 2007;39:1235–1244.