|
| 1 | +Protein Secondary Structure |
| 2 | +=========================== |
| 3 | + |
| 4 | +## What is Protein Secondary Structure? |
| 5 | + |
| 6 | +Protein secondary structure (SS) is the general three-dimensional form of local segments of proteins, |
| 7 | +such as alpha helices and beta sheets. It can be formally defined by the pattern of hydrogen bonds |
| 8 | +observed in an atomic-resolution structure of the protein. |
| 9 | + |
| 10 | +More specifically, the secondary structure is defined by the patterns of hydrogen bonds formed between |
| 11 | +amine hydrogen (-NH) and carbonyl oxygen (C=O) atoms contained in the backbone peptide bonds of the protein. |
| 12 | + |
| 13 | + |
| 14 | + |
| 15 | +For more info see the Wikipedia article on |
| 16 | +[protein secondary structure](https://en.wikipedia.org/wiki/Protein_secondary_structure). |
| 17 | + |
| 18 | +## Secondary Structure Annotation |
| 19 | + |
| 20 | +### Information Sources |
| 21 | + |
| 22 | +There are various ways to obtain the SS annotation of a protein structure: |
| 23 | + |
| 24 | +- **Author assignment**: the authors of the structure describe the SS, usually identifying helices |
| 25 | +and beta-sheets, and they assign the corresponding type to each residue involved. The author assignment |
| 26 | +can be found in the `PDB` and `mmCIF` files deposited in the PDB, and it can be parsed in **BioJava** |
| 27 | +when a `Structure` is loaded. |
| 28 | + |
| 29 | +- **Prediction from atom coordinates**: there exist various programs to predict the SS of a protein. |
| 30 | +The algorithms use the atom coordinates of the amino acids to determine hydrogen bonds and geometrical patterns |
| 31 | +that define the different types of protein secondary structure. One of the first and most popular algorithms |
| 32 | +is `DSSP` (Define Secondary Structure of Proteins). **BioJava** has an implementation of the algorithm, |
| 33 | +written originally in C++, which will be described in the next section. |
| 34 | + |
| 35 | +- **Prediction from sequence**: other algorithms use only the amino acid sequence (primary structure) of the protein, |
| 36 | +and predict the SS using the SS propensities of each amino acid and multiple alignments with homologous sequences |
| 37 | +(e.g. [PSIPRED](http://bioinf.cs.ucl.ac.uk/psipred/)). At the moment **BioJava** does not have an implementation |
| 38 | +of this type, which would be more suitable for the sequence and alignment modules. |
| 39 | + |
| 40 | +### Secondary Structure Types |
| 41 | + |
| 42 | +Following the `DSSP` convention, **BioJava** defines 8 types of secondary structure: |
| 43 | + |
| 44 | + E = extended strand, participates in β ladder |
| 45 | + B = residue in isolated β-bridge |
| 46 | + H = α-helix |
| 47 | + G = 3-helix (3-10 helix) |
| 48 | + I = 5-helix (π-helix) |
| 49 | +    T = hydrogen-bonded turn |
| 50 | + S = bend |
| 51 | + _ = loop (any other type) |
| 52 | + |
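The eight one-letter codes above map naturally onto an enum. The following is a minimal, self-contained sketch of such a mapping; the names `SsCode`, `SsCodeDemo`, `fromSymbol`, `isHelix` and `count` are hypothetical and chosen for illustration only, so this should not be read as BioJava's own API. The example string is made up.

```java
import java.util.EnumMap;
import java.util.Map;

// Demo entry point: counts the SS types in a made-up per-residue string.
class SsCodeDemo {
    public static void main(String[] args) {
        String ss = "__EEEETT_HHHHHHHH_"; // invented assignment, one symbol per residue
        SsCode.count(ss).forEach((code, n) ->
                System.out.println(code.symbol() + "  " + n + "  " + code.description()));
    }
}

// Hypothetical illustration type; this is NOT BioJava's own enum.
enum SsCode {
    EXTENDED('E', "extended strand, participates in beta ladder"),
    BRIDGE('B', "residue in isolated beta-bridge"),
    ALPHA_HELIX('H', "alpha-helix"),
    HELIX_3_10('G', "3-helix (3-10 helix)"),
    PI_HELIX('I', "5-helix (pi-helix)"),
    TURN('T', "hydrogen-bonded turn"),
    BEND('S', "bend"),
    LOOP('_', "loop (any other type)");

    private final char symbol;
    private final String description;

    SsCode(char symbol, String description) {
        this.symbol = symbol;
        this.description = description;
    }

    char symbol() { return symbol; }

    String description() { return description; }

    // True for the three helical types (H, G, I).
    boolean isHelix() {
        return this == ALPHA_HELIX || this == HELIX_3_10 || this == PI_HELIX;
    }

    // Looks up a type by its one-letter symbol; unknown symbols are rejected.
    static SsCode fromSymbol(char c) {
        for (SsCode code : values()) {
            if (code.symbol == c) {
                return code;
            }
        }
        throw new IllegalArgumentException("Unknown SS symbol: " + c);
    }

    // Counts how many residues carry each type in a per-residue SS string.
    static Map<SsCode, Integer> count(String ss) {
        Map<SsCode, Integer> counts = new EnumMap<>(SsCode.class);
        for (char c : ss.toCharArray()) {
            counts.merge(fromSymbol(c), 1, Integer::sum);
        }
        return counts;
    }
}
```

Rejecting unknown symbols, rather than silently mapping them to loop, is a design choice of this sketch; a real parser may prefer to be more lenient.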
| 53 | +## Prediction of SS in BioJava |
| 54 | + |
| 55 | +### Algorithm |
| 56 | + |
| 57 | +The algorithm implemented in BioJava for the prediction of SS is `DSSP`. It is described in the 1983 paper by |
| 58 | +[Kabsch W. & Sander C.](http://onlinelibrary.wiley.com/doi/10.1002/bip.360221211/abstract) |
| 59 | +([PubMed](http://www.ncbi.nlm.nih.gov/pubmed/6667333)). |
| 60 | + |
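At the core of `DSSP` is an electrostatic estimate of the backbone hydrogen bond energy between a C=O group and an N-H group. As a rough sketch, assuming the constants given in the Kabsch & Sander paper (partial charges of 0.42e and 0.20e, a dimensional factor of 332, distances in Å, and a cutoff of -0.5 kcal/mol), the energy could be computed as shown below. The class and method names (`HBondEnergy`, `energy`, `isHBond`) are hypothetical and the coordinates are invented; this is not BioJava's implementation.

```java
// Demo entry point: evaluates two made-up, collinear C=O ... H-N geometries.
class HBondDemo {
    public static void main(String[] args) {
        double[] c = {0.00, 0.0, 0.0};
        double[] o = {1.23, 0.0, 0.0};
        double[] h = {3.13, 0.0, 0.0};
        double[] n = {4.14, 0.0, 0.0};
        System.out.println("close pair:   E = " + HBondEnergy.energy(c, o, n, h)
                + " kcal/mol, bonded = " + HBondEnergy.isHBond(c, o, n, h));

        // Same N-H group moved 10 Angstrom further away along x.
        double[] hFar = {13.13, 0.0, 0.0};
        double[] nFar = {14.14, 0.0, 0.0};
        System.out.println("distant pair: E = " + HBondEnergy.energy(c, o, nFar, hFar)
                + " kcal/mol, bonded = " + HBondEnergy.isHBond(c, o, nFar, hFar));
    }
}

// Hypothetical helper illustrating the DSSP electrostatic hydrogen bond model.
final class HBondEnergy {
    static final double Q1 = 0.42;      // partial charge on C and O, in units of e
    static final double Q2 = 0.20;      // partial charge on N and H, in units of e
    static final double F = 332.0;      // dimensional factor, gives kcal/mol for distances in Angstrom
    static final double CUTOFF = -0.5;  // energies below this count as a hydrogen bond

    private HBondEnergy() { }

    // Euclidean distance between two 3D points.
    static double distance(double[] a, double[] b) {
        double dx = a[0] - b[0];
        double dy = a[1] - b[1];
        double dz = a[2] - b[2];
        return Math.sqrt(dx * dx + dy * dy + dz * dz);
    }

    // E = q1 * q2 * f * (1/r_ON + 1/r_CH - 1/r_OH - 1/r_CN)
    static double energy(double[] c, double[] o, double[] n, double[] h) {
        double rON = distance(o, n);
        double rCH = distance(c, h);
        double rOH = distance(o, h);
        double rCN = distance(c, n);
        return Q1 * Q2 * F * (1.0 / rON + 1.0 / rCH - 1.0 / rOH - 1.0 / rCN);
    }

    // A pair is considered hydrogen bonded when the energy is below the cutoff.
    static boolean isHBond(double[] c, double[] o, double[] n, double[] h) {
        return energy(c, o, n, h) < CUTOFF;
    }
}
```

The sketch omits practical details such as placing the backbone hydrogen when it is missing from the coordinates and guarding against atoms that are very close together.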
| 61 | +### Data Structures |
| 62 | + |
| 63 | + |