Protein structure
Proteins are amino acid chains, built from 20 different L-α-amino acids (also referred to as residues), that fold into unique 3-dimensional structures. The shape into which a protein naturally folds is known as its native state, which is determined by its sequence of amino acids. Below about 40 residues the term peptide is frequently used. A certain number of residues is necessary to perform a particular biochemical function, and around 40-50 residues appears to be the lower limit for a functional domain size. Protein sizes range from this lower limit to several hundred residues in multi-functional proteins. Very large aggregates can be formed from protein subunits; for example, many thousands of actin molecules assemble into an actin filament. Large protein complexes with RNA are found in the ribosome particles, which are in fact 'ribozymes'.
Biochemists refer to four distinct aspects of a protein's structure:
- Primary structure: the amino acid sequence
- Secondary structure: highly patterned sub-structures--alpha helix and beta sheet--or segments of chain that assume no stable shape. Secondary structures are locally defined, meaning that there can be many different secondary motifs present in one single protein molecule
- Tertiary structure: the overall shape of a single protein molecule; the spatial relationship of the secondary structural motifs to one another
- Quaternary structure: the shape or structure that results from the union of more than one protein molecule, usually called protein subunits in this context, which function as part of the larger assembly or protein complex.
The primary structure is held together by covalent peptide bonds, which are made during the process of translation. The secondary structures are held together by hydrogen bonds. The tertiary structure is held together primarily by hydrophobic interactions, but hydrogen bonds, ionic interactions, and disulfide bonds are usually involved too.
The two ends of the amino acid chain are referred to as the carboxy terminus (C-terminus) and the amino terminus (N-terminus) based on the nature of the free group on each extremity.
The basic structure of an α-amino acid is quite simple. R denotes any one of the 20 possible side chains (see table below). The Cα atom has 4 different ligands (the H is omitted in the drawing) and is thus chiral. An easy trick to remember the correct L-form is the CORN rule: when the Cα atom is viewed with the H in front, the substituents read "CO-R-N" in a clockwise direction.
Amino acid structure
| Name (Residue) | 3-letter code | Single code | Relative abundance (%) E.C. | MW | pK | VdW volume (Å³) | Charged, Polar, Hydrophobic |
| Alanine | ALA | A | 13.0 | 71 | | 67 | H |
| Arginine | ARG | R | 5.3 | 157 | 12.5 | 148 | C+ |
| Asparagine | ASN | N | 9.9 | 114 | | 96 | P |
| Aspartate | ASP | D | 9.9 | 114 | 3.9 | 91 | C- |
| Cysteine | CYS | C | 1.8 | 103 | | 86 | P |
| Glutamate | GLU | E | 10.8 | 128 | 4.3 | 109 | C- |
| Glutamine | GLN | Q | 10.8 | 128 | | 114 | P |
| Glycine | GLY | G | 7.8 | 57 | | 48 | - |
| Histidine | HIS | H | 0.7 | 137 | 6.0 | 118 | P,C+ |
| Isoleucine | ILE | I | 4.4 | 113 | | 124 | H |
| Leucine | LEU | L | 7.8 | 113 | | 124 | H |
| Lysine | LYS | K | 7.0 | 129 | 10.5 | 135 | C+ |
| Methionine | MET | M | 3.8 | 131 | | 124 | H |
| Phenylalanine | PHE | F | 3.3 | 147 | | 135 | H |
| Proline | PRO | P | 4.6 | 97 | | 90 | H |
| Serine | SER | S | 6.0 | 87 | | 73 | P |
| Threonine | THR | T | 4.6 | 101 | | 93 | P |
| Tryptophan | TRP | W | 1.0 | 186 | | 163 | P |
| Tyrosine | TYR | Y | 2.2 | 163 | 10.1 | 141 | P |
| Valine | VAL | V | 6.0 | 99 | | 105 | H |
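As a small illustration of how the table can be used, the sketch below converts a sequence written in 3-letter codes to single-letter codes and estimates the fraction of hydrophobic residues. The codes and classes are copied from the table above; the names `RESIDUES`, `to_one_letter` and `hydrophobic_fraction` are hypothetical, and counting only class "H" as hydrophobic is an assumption of this sketch.

```python
# 3-letter code -> (single-letter code, class), copied from the table above.
# Classes: H = hydrophobic, P = polar, C+/C- = charged, "-" = glycine.
RESIDUES = {
    "ALA": ("A", "H"), "ARG": ("R", "C+"), "ASN": ("N", "P"),
    "ASP": ("D", "C-"), "CYS": ("C", "P"), "GLU": ("E", "C-"),
    "GLN": ("Q", "P"), "GLY": ("G", "-"), "HIS": ("H", "P,C+"),
    "ILE": ("I", "H"), "LEU": ("L", "H"), "LYS": ("K", "C+"),
    "MET": ("M", "H"), "PHE": ("F", "H"), "PRO": ("P", "H"),
    "SER": ("S", "P"), "THR": ("T", "P"), "TRP": ("W", "P"),
    "TYR": ("Y", "P"), "VAL": ("V", "H"),
}

# Single-letter code -> class, derived from the mapping above.
CLASS_BY_LETTER = {one: cls for one, cls in RESIDUES.values()}


def to_one_letter(three_letter_codes):
    """Convert a list of 3-letter codes (any case) to a single-letter string."""
    return "".join(RESIDUES[code.upper()][0] for code in three_letter_codes)


def hydrophobic_fraction(sequence):
    """Fraction of residues in a single-letter sequence that have class 'H'."""
    if not sequence:
        return 0.0
    hydrophobic = sum(1 for letter in sequence if CLASS_BY_LETTER[letter] == "H")
    return hydrophobic / len(sequence)


if __name__ == "__main__":
    seq = to_one_letter(["Met", "Lys", "Val"])
    print(seq, hydrophobic_fraction(seq))
```

Residues are written starting from the N-terminal end, so the first list element is the N-terminal residue.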
Two amino acids are combined in a condensation reaction. Notice that the peptide bond is in fact planar due to the delocalization of the electrons. The sequence of the different amino acids is considered the primary structure of the peptide or protein. Counting of residues always starts at the N-terminal end (NH2-group).
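Because each peptide bond is formed with the loss of one water molecule, the mass of a linear chain can be estimated as the sum of the residue masses plus one water for the free N- and C-terminal groups. The following is a minimal sketch, assuming the rounded residue masses in the MW column of the table above and a water mass of 18; `peptide_mass` is a hypothetical helper name, and the result is only as precise as the rounded table values.

```python
# Rounded residue masses (MW column of the table above), keyed by single-letter code.
RESIDUE_MASS = {
    "A": 71, "R": 157, "N": 114, "D": 114, "C": 103,
    "E": 128, "Q": 128, "G": 57, "H": 137, "I": 113,
    "L": 113, "K": 129, "M": 131, "F": 147, "P": 97,
    "S": 87, "T": 101, "W": 186, "Y": 163, "V": 99,
}

WATER_MASS = 18  # assumption: rounded mass of H2O


def peptide_mass(sequence):
    """Approximate mass of a linear peptide given as single-letter codes.

    Each residue mass already accounts for the water lost on condensation,
    so one water is added back for the free termini.
    """
    if not sequence:
        return 0
    return sum(RESIDUE_MASS[letter] for letter in sequence) + WATER_MASS


if __name__ == "__main__":
    # Glycine-alanine dipeptide, written from the N-terminal end.
    print(peptide_mass("GA"))
```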
In contrast to the rather rigid peptide bond angle ω (rotation about the bond between C1 and N), which is always close to 180°, the dihedral angles phi φ (about the bond between N and Cα) and psi ψ (about the bond between Cα and C1) can take a range of values. These angles are the degrees of freedom of a protein; they control its three-dimensional structure. They are restrained by geometry to allowed ranges typical for particular secondary structure elements, and are represented in a Ramachandran plot. A few important bond lengths are given in the table below.
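A dihedral angle such as φ, ψ or ω is defined by four consecutive atoms: for φ these are C of the previous residue, N, Cα and C; for ψ they are N, Cα, C and N of the next residue. The sketch below computes the angle from four Cartesian coordinates using the usual atan2 formulation. The function name `dihedral` and the example coordinates are illustrative only and are not taken from any real structure.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond.

    0 corresponds to a cis (eclipsed) arrangement of p0 and p3,
    +/-180 to a trans arrangement.
    """
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    b2_len = math.sqrt(_dot(b2, b2))
    y = _dot(b2, _cross(n1, n2)) / b2_len
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Made-up coordinates with the central bond along the z axis.
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)))
```

Computing φ and ψ for every residue in this way gives the points that are plotted in a Ramachandran plot.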
The polypeptide chain of a protein seldom forms just a random coil. Remember that proteins have either a chemical (enzymes) or structural function to fulfill.
High specificity requires an intricate arrangement of 3-dimensional interactions and therefore a defined conformation of the polypeptide chain. In fact, some neurodegenerative diseases like Huntington's may be related to random coil formation in certain proteins. The two most common secondary structure arrangements are the right-handed α-helix and the β-sheet, which can be connected into a larger tertiary structure (or fold) by turns and loops of a variety of types. These two secondary structure elements satisfy a strong hydrogen bond network within the geometric constraints of the bond angles ω, ψ, and φ. The β-sheets can be formed by a parallel or, most commonly, an antiparallel arrangement of individual β-strands.
Figure (helices.png): The left panel shows the hydrogen bonding in an actual α-helix backbone. Note that the nth residue O (Lys 153) bonds to the n+4th following residue's N (Arg 147). The actual values of some displayed H-bond distances give you some idea about the variations to expect within a helix. The center panel includes the side chains which were omitted in the left panel for clarity. You see the side chains pointing towards the N-terminal of the chain (lower residue numbers) and thus it is usually possible to determine the direction of the helix quite well during initial model building. A very nice 2Å electron density is shown in the right panel.
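The n to n+4 hydrogen-bonding pattern of the α-helix can be written out explicitly. The sketch below lists, for a helical segment, which residue's backbone C=O would pair with which residue's backbone N-H. The function name `helix_hbond_pairs` is hypothetical, and the sketch assumes an ideal helix in which every possible pair is formed, which real helices only approximate.

```python
def helix_hbond_pairs(first, last, offset=4):
    """List (acceptor, donor) residue numbers for an ideal helix.

    The C=O of residue n accepts a hydrogen bond from the N-H of
    residue n + offset. offset=4 describes the alpha-helix; offset=3
    would describe a 3-10 helix.
    """
    return [(n, n + offset) for n in range(first, last - offset + 1)]


if __name__ == "__main__":
    # An eight-residue helical segment numbered 1..8.
    for acceptor, donor in helix_hbond_pairs(1, 8):
        print(f"C=O of residue {acceptor} ... H-N of residue {donor}")
```

The first few N-H groups and the last few C=O groups of a helix have no partner within the helix, which is why the list is shorter than the segment itself.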
The polypeptide chain
| Peptide bond | Average length (Å) | Single bond | Average length (Å) | Hydrogen bond | Average length (Å, ±0.3) |
| Cα - C | 1.53 | C - C | 1.54 | O-H --- O-H | 2.8 |
| C - N | 1.33 | C - N | 1.48 | N-H --- O=C | 2.9 |
| N - Cα | 1.46 | C - O | 1.43 | O-H --- O=C | 2.8 |
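The hydrogen bond lengths in the table can serve as a simple distance test. The sketch below checks whether the distance between a donor atom and an acceptor atom falls within the tabulated average ±0.3 Å. It assumes the tabulated lengths are distances between the two heavy atoms; the function name, dictionary keys and coordinates are illustrative, and a real analysis would normally also check the bond angle.

```python
import math

# Average hydrogen bond lengths in Å, copied from the table above.
HBOND_LENGTH = {
    "O-H...O-H": 2.8,
    "N-H...O=C": 2.9,
    "O-H...O=C": 2.8,
}
TOLERANCE = 0.3  # the ±0.3 Å spread quoted in the table


def is_plausible_hbond(donor_xyz, acceptor_xyz, kind="N-H...O=C"):
    """True if the donor-acceptor distance lies within the tabulated range."""
    distance = math.dist(donor_xyz, acceptor_xyz)  # requires Python 3.8+
    return abs(distance - HBOND_LENGTH[kind]) <= TOLERANCE


if __name__ == "__main__":
    # Made-up coordinates in Å for a backbone N and a carbonyl O.
    print(is_plausible_hbond((0.0, 0.0, 0.0), (2.9, 0.0, 0.0)))
    print(is_plausible_hbond((0.0, 0.0, 0.0), (3.5, 0.0, 0.0)))
```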
Secondary structure elements
Turns, loops and a few other secondary structure elements such as a 3-10 helix complete the picture. We now have enough pieces to assemble a complete protein, displaying its typical tertiary structure.
Although there are about 100,000 different proteins expressed in eukaryotic systems, there are far fewer different structural motifs and folds, partly as a consequence of evolved pathways and mechanisms. Motif in this sense refers to a small specific combination of secondary structure elements (such as helix-turn-helix). Fold refers to a global type of arrangement, like helix-bundle or β-barrel. Structure motifs usually consist of just a few elements, e.g. the 'helix-turn-helix' has just three. Note that while the spatial sequence of elements is the same in all instances of a motif, they may be encoded in any order within the underlying gene. Protein structural motifs often include loops of variable length and unspecified structure, which in effect create the "slack" necessary to bring together in space two elements that are not encoded by immediately adjacent DNA sequences in a gene. Note also that even when two genes encode the secondary structural elements of a motif in the same order, they may nevertheless specify somewhat different sequences of amino acids. This is true not only because of the complicated relationship between tertiary and primary structure, but also because the size of the elements varies from one protein to the next.
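The idea of a motif as a fixed order of elements with variable sizes can be illustrated on a per-residue secondary structure string. The sketch below searches for a helix, a short turn and a second helix. The one-letter encoding (H for helix, E for strand, T for turn, C for coil) and the length limits in the pattern are assumptions made for this example, and it only covers the case where the elements are adjacent in the chain.

```python
import re

# Assumed encoding: H = helix, E = strand, T = turn, C = coil.
# Assumed limits: helices of at least 4 residues, a connecting
# turn or loop of 2 to 6 residues.
HELIX_TURN_HELIX = re.compile(r"H{4,}[TC]{2,6}H{4,}")


def find_helix_turn_helix(secondary_structure):
    """Return (start, end) index pairs of helix-turn-helix matches.

    Indices are 0-based and end is exclusive, as in Python slicing.
    """
    return [(m.start(), m.end()) for m in HELIX_TURN_HELIX.finditer(secondary_structure)]


if __name__ == "__main__":
    ss = "CCHHHHHHTTTHHHHHHCC"
    for start, end in find_helix_turn_helix(ss):
        print(start, end, ss[start:end])
```

Because the quantifiers allow a range of lengths, the same pattern matches instances of the motif whose helices and turns differ in size.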
Main article: Protein folding
The process by which the higher structures form is called protein folding and is a consequence of the primary structure. Although any unique polypeptide may have more than one stable folded conformation, each conformation has its own biological activity and only one conformation is considered to be the active, or native conformation.
Main article: Structural domain
Within a protein, a structural domain ("domain") is an element of overall structure that is self-stabilizing and often folds independently of the rest of the protein chain. Many domains are not unique to the protein products of one gene or one gene family but instead appear in a variety of proteins. Domains are often named and singled out because they figure prominently in the biological function of the protein they belong to; for example, the "calcium-binding domain" of calmodulin. Because they are self-stabilizing, domains can be "swapped" by genetic engineering between one protein and another to make chimeras. A domain may be composed of none, one, or several structural motifs.
Several ways of structurally classifying proteins have been developed. These seek to classify the data in the Protein Data Bank in a structured order. Several databases have been made which classify proteins with different methods. SCOP, CATH and FSSP are the largest ones. The methods used are purely manual, manual and automated, and purely automated. Work is being done to better integrate the current data. The classification is consistent between SCOP, CATH and FSSP for the majority of proteins which have been classified, but there are still some differences and inconsistencies.
Side chain conformation
The atoms along the side chain are named alpha, beta, gamma and so on, and the dihedral angles about the bonds between them are named chi1, chi2, chi3 and so on. For example, the first and second carbon atoms in the side chain of lysine are named alpha and beta, and the dihedral angle about the bond between them is named chi1. Side chains can adopt different conformations, called gauche(-), trans and gauche(+). Side chains generally tend to adopt a staggered conformation around chi2.
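A chi angle can be assigned to one of the three staggered conformations by binning it. The sketch below assumes that gauche(+) lies near +60°, trans near 180° and gauche(-) near -60°, with bin edges at 0° and ±120°. Naming conventions for the two gauche states differ between sources, so the labels here are an assumption of this example, and `rotamer_label` is a hypothetical name.

```python
def rotamer_label(chi_degrees):
    """Assign a chi angle in degrees to gauche(-), gauche(+) or trans.

    Assumed convention: gauche(+) near +60, trans near 180,
    gauche(-) near -60.
    """
    # Wrap the angle into the range [-180, 180).
    chi = (chi_degrees + 180.0) % 360.0 - 180.0
    if -120.0 <= chi < 0.0:
        return "gauche(-)"
    if 0.0 <= chi < 120.0:
        return "gauche(+)"
    return "trans"


if __name__ == "__main__":
    for angle in (62.0, -65.0, 178.0, -175.0):
        print(angle, rotamer_label(angle))
```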
