Proteinogenic amino acid

Amino acid that is incorporated biosynthetically into proteins during translation

Proteinogenic amino acids are a small fraction of all amino acids

Proteinogenic amino acids are amino acids that are incorporated biosynthetically into proteins during translation from RNA. The word "proteinogenic" means "protein creating". Throughout known life, there are 22 genetically encoded (proteinogenic) amino acids, 20 in the standard genetic code and an additional 2 (selenocysteine and pyrrolysine) that can be incorporated by special translation mechanisms.

In contrast, non-proteinogenic amino acids are amino acids that are either not incorporated into proteins (like GABA, L-DOPA, or triiodothyronine), misincorporated in place of a genetically encoded amino acid, or not produced directly and in isolation by standard cellular machinery (like hydroxyproline). The latter often results from post-translational modification of proteins. Some non-proteinogenic amino acids are incorporated into nonribosomal peptides which are synthesized by non-ribosomal peptide synthetases.

Both eukaryotes and prokaryotes can incorporate selenocysteine into their proteins via a nucleotide sequence known as a SECIS element, which directs the cell to translate a nearby UGA codon as selenocysteine (UGA is normally a stop codon). In some methanogenic prokaryotes, the UAG codon (normally a stop codon) can also be translated to pyrrolysine.

In eukaryotes, there are only 21 proteinogenic amino acids, the 20 of the standard genetic code, plus selenocysteine. Humans can synthesize 12 of these from each other or from other molecules of intermediary metabolism. The other nine must be consumed (usually as their protein derivatives), and so they are called essential amino acids. The essential amino acids are histidine, isoleucine, leucine, lysine, methionine, phenylalanine, threonine, tryptophan, and valine (i.e. H, I, L, K, M, F, T, W, V).

The proteinogenic amino acids have been found to be related to the set of amino acids that can be recognized by ribozyme autoaminoacylation systems. Thus, non-proteinogenic amino acids would have been excluded by the contingent evolutionary success of nucleotide-based life forms. Other reasons have been offered to explain why certain specific non-proteinogenic amino acids are not generally incorporated into proteins; for example, ornithine and homoserine cyclize against the peptide backbone and fragment the protein with relatively short half-lives, while others are toxic because they can be mistakenly incorporated into proteins, such as the arginine analog canavanine.

It has been suggested that certain proteinogenic amino acids were selected from the primordial soup during evolution because they are incorporated into a polypeptide chain more readily than non-proteinogenic amino acids.

Structures

The following illustrates the structures and abbreviations of the 21 amino acids that are directly encoded for protein synthesis by the genetic code of eukaryotes. The structures given below are standard chemical structures, not the typical zwitterion forms that exist in aqueous solutions.

Structures of the 21 proteinogenic amino acids with their three-letter and one-letter codes, grouped by side-chain functionality

L-Alanine (Ala / A), L-Arginine (Arg / R), L-Asparagine (Asn / N), L-Aspartic acid (Asp / D), L-Cysteine (Cys / C), L-Glutamic acid (Glu / E), L-Glutamine (Gln / Q), Glycine (Gly / G), L-Histidine (His / H), L-Isoleucine (Ile / I), L-Leucine (Leu / L), L-Lysine (Lys / K), L-Methionine (Met / M), L-Phenylalanine (Phe / F), L-Proline (Pro / P), L-Serine (Ser / S), L-Threonine (Thr / T), L-Tryptophan (Trp / W), L-Tyrosine (Tyr / Y), L-Valine (Val / V)

IUPAC/IUBMB now also recommends standard abbreviations for the following two amino acids: L-Selenocysteine (Sec / U) and L-Pyrrolysine (Pyl / O).

Chemical properties

The table below lists the one-letter symbols, the three-letter symbols, and the chemical properties of the side chains of the standard amino acids. The masses listed are based on weighted averages of the elemental isotopes at their natural abundances. Forming a peptide bond eliminates one molecule of water. Therefore, a protein's mass equals the sum of the masses of its constituent amino acids minus 18.01524 Da per peptide bond.

General chemical properties

| Amino acid | Short | Abbrev. | Avg. mass (Da) | pI | pK1 (α-COO−) | pK2 (α-NH3+) |
|---|---|---|---|---|---|---|
| Alanine | A | Ala | 89.09404 | 6.01 | 2.35 | 9.87 |
| Cysteine | C | Cys | 121.15404 | 5.05 | 1.92 | 10.70 |
| Aspartic acid | D | Asp | 133.10384 | 2.85 | 1.99 | 9.90 |
| Glutamic acid | E | Glu | 147.13074 | 3.15 | 2.10 | 9.47 |
| Phenylalanine | F | Phe | 165.19184 | 5.49 | 2.20 | 9.31 |
| Glycine | G | Gly | 75.06714 | 6.06 | 2.35 | 9.78 |
| Histidine | H | His | 155.15634 | 7.60 | 1.80 | 9.33 |
| Isoleucine | I | Ile | 131.17464 | 6.05 | 2.32 | 9.76 |
| Lysine | K | Lys | 146.18934 | 9.60 | 2.16 | 9.06 |
| Leucine | L | Leu | 131.17464 | 6.01 | 2.33 | 9.74 |
| Methionine | M | Met | 149.20784 | 5.74 | 2.13 | 9.28 |
| Asparagine | N | Asn | 132.11904 | 5.41 | 2.14 | 8.72 |
| Pyrrolysine | O | Pyl | 255.31 | | | |
| Proline | P | Pro | 115.13194 | 6.30 | 1.95 | 10.64 |
| Glutamine | Q | Gln | 146.14594 | 5.65 | 2.17 | 9.13 |
| Arginine | R | Arg | 174.20274 | 10.76 | 1.82 | 8.99 |
| Serine | S | Ser | 105.09344 | 5.68 | 2.19 | 9.21 |
| Threonine | T | Thr | 119.12034 | 5.60 | 2.09 | 9.10 |
| Selenocysteine | U | Sec | 168.053 | 5.47 | 1.91 | 10 |
| Valine | V | Val | 117.14784 | 6.00 | 2.39 | 9.74 |
| Tryptophan | W | Trp | 204.22844 | 5.89 | 2.46 | 9.41 |
| Tyrosine | Y | Tyr | 181.19124 | 5.64 | 2.20 | 9.21 |
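
The mass rule stated before the table (sum of the free amino acid masses minus one water per peptide bond) can be sketched in a few lines of Python. This is a minimal illustration, not a validated tool: the function name `peptide_average_mass` is hypothetical, the masses are copied from the table, and the chain is assumed to be linear and unmodified.

```python
# Average masses (Da) of the free amino acids, copied from the table above.
AVG_MASS = {
    "A": 89.09404, "C": 121.15404, "D": 133.10384, "E": 147.13074,
    "F": 165.19184, "G": 75.06714, "H": 155.15634, "I": 131.17464,
    "K": 146.18934, "L": 131.17464, "M": 149.20784, "N": 132.11904,
    "O": 255.31, "P": 115.13194, "Q": 146.14594, "R": 174.20274,
    "S": 105.09344, "T": 119.12034, "U": 168.053, "V": 117.14784,
    "W": 204.22844, "Y": 181.19124,
}
WATER_AVG = 18.01524  # Da eliminated per peptide bond (value from the text)


def peptide_average_mass(sequence: str) -> float:
    """Sum free amino acid masses and subtract one water per peptide bond.

    Assumes a linear, unmodified chain written in one-letter codes.
    """
    if not sequence:
        raise ValueError("sequence must contain at least one residue")
    total = sum(AVG_MASS[aa] for aa in sequence.upper())
    n_bonds = len(sequence) - 1  # n residues in a linear chain share n - 1 bonds
    return total - n_bonds * WATER_AVG


if __name__ == "__main__":
    print(round(peptide_average_mass("GG"), 5))  # glycylglycine
```

A single residue returns the free amino acid mass unchanged, because no peptide bond is formed.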

Side-chain properties

| Amino acid | Short | Abbrev. | Side chain | pKa§ | pH | Aromatic or aliphatic | van der Waals volume (Å³) |
|---|---|---|---|---|---|---|---|
| Alanine | A | Ala | -CH3 | - | - | Aliphatic | 67 |
| Cysteine | C | Cys | -CH2SH | 8.55 | acidic | - | 86 |
| Aspartic acid | D | Asp | -CH2COOH | 3.67 | acidic | - | 91 |
| Glutamic acid | E | Glu | -CH2CH2COOH | 4.25 | acidic | - | 109 |
| Phenylalanine | F | Phe | -CH2C6H5 | - | - | Aromatic | 135 |
| Glycine | G | Gly | -H | - | - | - | 48 |
| Histidine | H | His | -CH2-C3H3N2 | 6.54 | weak basic | Aromatic | 118 |
| Isoleucine | I | Ile | -CH(CH3)CH2CH3 | - | - | Aliphatic | 124 |
| Lysine | K | Lys | -(CH2)4NH2 | 10.40 | basic | - | 135 |
| Leucine | L | Leu | -CH2CH(CH3)2 | - | - | Aliphatic | 124 |
| Methionine | M | Met | -CH2CH2SCH3 | - | - | Aliphatic | 124 |
| Asparagine | N | Asn | -CH2CONH2 | - | - | - | 96 |
| Pyrrolysine | O | Pyl | -(CH2)4NHCOC4H5NCH3 | N.D. | weak basic | - | |
| Proline | P | Pro | -CH2CH2CH2- | - | - | - | 90 |
| Glutamine | Q | Gln | -CH2CH2CONH2 | - | - | - | 114 |
| Arginine | R | Arg | -(CH2)3NH-C(NH)NH2 | 12.3 | strongly basic | - | 148 |
| Serine | S | Ser | -CH2OH | - | - | - | 73 |
| Threonine | T | Thr | -CH(OH)CH3 | - | - | - | 93 |
| Selenocysteine | U | Sec | -CH2SeH | 5.43 | acidic | - | |
| Valine | V | Val | -CH(CH3)2 | - | - | Aliphatic | 105 |
| Tryptophan | W | Trp | -CH2C8H6N | - | - | Aromatic | 163 |
| Tyrosine | Y | Tyr | -CH2-C6H4OH | 9.84 | weak acidic | Aromatic | 141 |

§: Only ionizable residues have a meaningful pKa. Values for Asp, Cys, Glu, His, Lys & Tyr were determined using the amino acid residue placed centrally in an alanine pentapeptide. The value for Arg is from Pace et al. (2009). The value for Sec is from Byun & Kang (2011).

Note: the pKa value of an amino-acid residue measured in a small peptide typically differs slightly from its value inside a protein. Protein pKa calculations are sometimes used to estimate this shift.
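
To show how the side-chain pKa values relate to protonation at a given pH, here is a small sketch based on the Henderson–Hasselbalch relation. That relation is standard acid-base chemistry and is not stated in this article. The function name `fraction_protonated` is hypothetical, the pKa values are the model-peptide values from the table above, and, as the note says, values inside a real protein may differ.

```python
# Side-chain pKa values copied from the table above (model-peptide values).
SIDE_CHAIN_PKA = {
    "C": 8.55, "D": 3.67, "E": 4.25, "H": 6.54,
    "K": 10.40, "R": 12.3, "U": 5.43, "Y": 9.84,
}


def fraction_protonated(pka: float, ph: float) -> float:
    """Fraction of an ionizable group carrying its proton at the given pH.

    Henderson-Hasselbalch: [HA] / ([HA] + [A-]) = 1 / (1 + 10**(pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    # Compare histidine at a near-neutral and a mildly acidic pH.
    for ph in (7.4, 5.5):
        frac = fraction_protonated(SIDE_CHAIN_PKA["H"], ph)
        print(f"His side chain at pH {ph}: {frac:.2f} protonated")
```

When the pH equals the pKa, exactly half of the groups are protonated, which is why residues with a pKa near physiological pH, such as histidine, respond strongly to small pH changes.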

Gene expression and biochemistry

| Amino acid | Short | Abbrev. | Codon(s) | Occurrence in Archaean proteins (%)& | Occurrence in Bacteria proteins (%)& | Occurrence in Eukaryote proteins (%)& | Occurrence in human proteins (%)& |
|---|---|---|---|---|---|---|---|
| Alanine | A | Ala | GCU, GCC, GCA, GCG | 8.2 | 10.06 | 7.63 | 7.01 |
| Cysteine | C | Cys | UGU, UGC | 0.98 | 0.94 | 1.76 | 2.3 |
| Aspartic acid | D | Asp | GAU, GAC | 6.21 | 5.59 | 5.4 | 4.73 |
| Glutamic acid | E | Glu | GAA, GAG | 7.69 | 6.15 | 6.42 | 7.09 |
| Phenylalanine | F | Phe | UUU, UUC | 3.86 | 3.89 | 3.87 | 3.65 |
| Glycine | G | Gly | GGU, GGC, GGA, GGG | 7.58 | 7.76 | 6.33 | 6.58 |
| Histidine | H | His | CAU, CAC | 1.77 | 2.06 | 2.44 | 2.63 |
| Isoleucine | I | Ile | AUU, AUC, AUA | 7.03 | 5.89 | 5.1 | 4.33 |
| Lysine | K | Lys | AAA, AAG | 5.27 | 4.68 | 5.64 | 5.72 |
| Leucine | L | Leu | UUA, UUG, CUU, CUC, CUA, CUG | 9.31 | 10.09 | 9.29 | 9.97 |
| Methionine | M | Met | AUG | 2.35 | 2.38 | 2.25 | 2.13 |
| Asparagine | N | Asn | AAU, AAC | 3.68 | 3.58 | 4.28 | 3.58 |
| Pyrrolysine | O | Pyl | UAG* | 0 | 0 | 0 | 0 |
| Proline | P | Pro | CCU, CCC, CCA, CCG | 4.26 | 4.61 | 5.41 | 6.31 |
| Glutamine | Q | Gln | CAA, CAG | 2.38 | 3.58 | 4.21 | 4.77 |
| Arginine | R | Arg | CGU, CGC, CGA, CGG, AGA, AGG | 5.51 | 5.88 | 5.71 | 5.64 |
| Serine | S | Ser | UCU, UCC, UCA, UCG, AGU, AGC | 6.17 | 5.85 | 8.34 | 8.33 |
| Threonine | T | Thr | ACU, ACC, ACA, ACG | 5.44 | 5.52 | 5.56 | 5.36 |
| Selenocysteine | U | Sec | UGA** | 0 | 0 | 0 | 0 |
| Valine | V | Val | GUU, GUC, GUA, GUG | 7.8 | 7.27 | 6.2 | 5.96 |
| Tryptophan | W | Trp | UGG | 1.03 | 1.27 | 1.24 | 1.22 |
| Tyrosine | Y | Tyr | UAU, UAC | 3.35 | 2.94 | 2.87 | 2.66 |
| Stop codon† | - | Term | UAA, UAG, UGA†† | | | | |

* UAG is normally the amber stop codon, but in organisms containing the biological machinery encoded by the pylTSBCD cluster of genes the amino acid pyrrolysine will be incorporated.

** UGA is normally the opal (or umber) stop codon, but encodes selenocysteine if a SECIS element is present.

† The stop codon is not an amino acid, but is included for completeness.

†† UAG and UGA do not always act as stop codons (see above).

‡ An essential amino acid cannot be synthesized in humans and must, therefore, be supplied in the diet. Conditionally essential amino acids are not normally required in the diet, but must be supplied exogenously to specific populations that do not synthesize them in adequate amounts.

& Occurrence of amino acids is based on 135 Archaea, 3775 Bacteria, 614 Eukaryota proteomes and human proteome (21 006 proteins) respectively.
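
The codon assignments in the table can be turned into a lookup table for translating an mRNA string. The sketch below is illustrative only: `translate` and its two flags are hypothetical names, and the flags recode every UGA or UAG in the reading frame, whereas real cells recode UGA only when a SECIS element is present and UAG only when the pyrrolysine machinery is available.

```python
# Codon assignments copied from the table above ("*" marks the stop codons).
CODONS = {
    "A": "GCU GCC GCA GCG", "C": "UGU UGC", "D": "GAU GAC", "E": "GAA GAG",
    "F": "UUU UUC", "G": "GGU GGC GGA GGG", "H": "CAU CAC",
    "I": "AUU AUC AUA", "K": "AAA AAG", "L": "UUA UUG CUU CUC CUA CUG",
    "M": "AUG", "N": "AAU AAC", "P": "CCU CCC CCA CCG", "Q": "CAA CAG",
    "R": "CGU CGC CGA CGG AGA AGG", "S": "UCU UCC UCA UCG AGU AGC",
    "T": "ACU ACC ACA ACG", "V": "GUU GUC GUA GUG", "W": "UGG",
    "Y": "UAU UAC", "*": "UAA UAG UGA",
}
CODON_TO_AA = {codon: aa for aa, group in CODONS.items() for codon in group.split()}


def translate(mrna: str, recode_uga_as_sec: bool = False,
              recode_uag_as_pyl: bool = False) -> str:
    """Translate an mRNA (read in frame from position 0) into one-letter codes.

    The two flags are a simplification: they recode *every* UGA or UAG,
    ignoring the sequence context that real recoding depends on.
    """
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].upper()
        if codon == "UGA" and recode_uga_as_sec:
            protein.append("U")  # selenocysteine
        elif codon == "UAG" and recode_uag_as_pyl:
            protein.append("O")  # pyrrolysine
        else:
            aa = CODON_TO_AA[codon]
            if aa == "*":
                break  # stop codon ends translation
            protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("AUGGCUUGAGGU"))
    print(translate("AUGGCUUGAGGU", recode_uga_as_sec=True))
```

Keeping the recoding as explicit flags, rather than editing the lookup table, leaves the standard 64-codon mapping intact.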

Mass spectrometry

In mass spectrometry of peptides and proteins, it is useful to know the masses of the residues. The mass of a peptide or protein is the sum of its residue masses plus the mass of one water molecule (monoisotopic mass = 18.01056 Da; average mass = 18.0153 Da). The residue masses are calculated from the tabulated chemical formulas and atomic weights. Ions observed in mass spectrometry may also carry one or more protons (monoisotopic mass = 1.00728 Da; average mass* = 1.0074 Da).

* Strictly speaking, a proton has no average mass: an isotope-weighted average would count the deuteron as a proton, whereas the two are distinct species (see Hydron (chemistry)).

| Amino acid | Short | Abbrev. | Formula | Mon. mass§ (Da) | Avg. mass (Da) |
|---|---|---|---|---|---|
| Alanine | A | Ala | C3H5NO | 71.03711 | 71.0779 |
| Cysteine | C | Cys | C3H5NOS | 103.00919 | 103.1429 |
| Aspartic acid | D | Asp | C4H5NO3 | 115.02694 | 115.0874 |
| Glutamic acid | E | Glu | C5H7NO3 | 129.04259 | 129.1140 |
| Phenylalanine | F | Phe | C9H9NO | 147.06841 | 147.1739 |
| Glycine | G | Gly | C2H3NO | 57.02146 | 57.0513 |
| Histidine | H | His | C6H7N3O | 137.05891 | 137.1393 |
| Isoleucine | I | Ile | C6H11NO | 113.08406 | 113.1576 |
| Lysine | K | Lys | C6H12N2O | 128.09496 | 128.1723 |
| Leucine | L | Leu | C6H11NO | 113.08406 | 113.1576 |
| Methionine | M | Met | C5H9NOS | 131.04049 | 131.1961 |
| Asparagine | N | Asn | C4H6N2O2 | 114.04293 | 114.1026 |
| Pyrrolysine | O | Pyl | C12H19N3O2 | 237.14773 | 237.2982 |
| Proline | P | Pro | C5H7NO | 97.05276 | 97.1152 |
| Glutamine | Q | Gln | C5H8N2O2 | 128.05858 | 128.1292 |
| Arginine | R | Arg | C6H12N4O | 156.10111 | 156.1857 |
| Serine | S | Ser | C3H5NO2 | 87.03203 | 87.0773 |
| Threonine | T | Thr | C4H7NO2 | 101.04768 | 101.1039 |
| Selenocysteine | U | Sec | C3H5NOSe | 150.95364 | 150.0489 |
| Valine | V | Val | C5H9NO | 99.06841 | 99.1311 |
| Tryptophan | W | Trp | C11H10N2O | 186.07931 | 186.2099 |
| Tyrosine | Y | Tyr | C9H9NO2 | 163.06333 | 163.1733 |

§ Monoisotopic mass
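
The residue-mass rule from this section (sum of residue masses plus one water, plus one proton per charge for an ion) can be sketched as follows. The function names `peptide_monoisotopic_mass` and `mz` are hypothetical. The m/z formula assumes a positively charged ion whose charge comes only from added protons, which is a common convention but is an assumption here rather than something the article states.

```python
# Monoisotopic residue masses (Da), copied from the table above.
MONO_RESIDUE = {
    "A": 71.03711, "C": 103.00919, "D": 115.02694, "E": 129.04259,
    "F": 147.06841, "G": 57.02146, "H": 137.05891, "I": 113.08406,
    "K": 128.09496, "L": 113.08406, "M": 131.04049, "N": 114.04293,
    "O": 237.14773, "P": 97.05276, "Q": 128.05858, "R": 156.10111,
    "S": 87.03203, "T": 101.04768, "U": 150.95364, "V": 99.06841,
    "W": 186.07931, "Y": 163.06333,
}
WATER_MONO = 18.01056   # Da, monoisotopic mass of water (from the text)
PROTON_MONO = 1.00728   # Da, monoisotopic mass of a proton (from the text)


def peptide_monoisotopic_mass(sequence: str) -> float:
    """Neutral peptide mass: sum of residue masses plus one water."""
    if not sequence:
        raise ValueError("sequence must contain at least one residue")
    return sum(MONO_RESIDUE[aa] for aa in sequence.upper()) + WATER_MONO


def mz(sequence: str, charge: int) -> float:
    """m/z of the peptide carrying `charge` extra protons (assumed [M + zH]z+)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (peptide_monoisotopic_mass(sequence) + charge * PROTON_MONO) / charge


if __name__ == "__main__":
    print(round(peptide_monoisotopic_mass("GG"), 5))
    print(round(mz("GG", 1), 5), round(mz("GG", 2), 5))
```

Leucine and isoleucine have identical residue masses in the table, so a mass computed this way cannot distinguish them.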

Stoichiometry and metabolic cost in cell

The table below lists the abundance of amino acids in E. coli cells and the metabolic cost (in ATP) of synthesizing them. Negative numbers indicate that the metabolic process is energetically favorable and does not cost the cell net ATP. The abundance figures include amino acids in both free and polymerized (protein) form.

| Amino acid | Short | Abbrev. | Abundance (×10⁸ molecules per E. coli cell) | ATP cost in synthesis, aerobic conditions | ATP cost in synthesis, anaerobic conditions |
|---|---|---|---|---|---|
| Alanine | A | Ala | 2.9 | −1 | 1 |
| Cysteine | C | Cys | 0.52 | 11 | 15 |
| Aspartic acid | D | Asp | 1.4 | 0 | 2 |
| Glutamic acid | E | Glu | 1.5 | −7 | −1 |
| Phenylalanine | F | Phe | 1.1 | −6 | 2 |
| Glycine | G | Gly | 3.5 | −2 | 2 |
| Histidine | H | His | 0.54 | 1 | 7 |
| Isoleucine | I | Ile | 1.7 | 7 | 11 |
| Lysine | K | Lys | 2.0 | 5 | 9 |
| Leucine | L | Leu | 2.6 | −9 | 1 |
| Methionine | M | Met | 0.88 | 21 | 23 |
| Asparagine | N | Asn | 1.4 | 3 | 5 |
| Pyrrolysine | O | Pyl | - | - | - |
| Proline | P | Pro | 1.3 | −2 | 4 |
| Glutamine | Q | Gln | 1.5 | −6 | 0 |
| Arginine | R | Arg | 1.7 | 5 | 13 |
| Serine | S | Ser | 1.2 | −2 | 2 |
| Threonine | T | Thr | 1.5 | 6 | 8 |
| Selenocysteine | U | Sec | - | - | - |
| Valine | V | Val | 2.4 | −2 | 2 |
| Tryptophan | W | Trp | 0.33 | −7 | 7 |
| Tyrosine | Y | Tyr | 0.79 | −8 | 2 |
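
As a rough illustration of how the per-amino-acid costs add up, the sketch below sums the tabulated ATP costs over a sequence for both conditions. The function name `synthesis_cost` is hypothetical, the values are read from the table above, and the total covers only the synthesis of the amino acids themselves, not the cost of polymerizing them into protein. Pyrrolysine and selenocysteine are omitted because the table gives no values for them.

```python
# ATP cost of synthesizing each amino acid in E. coli, read from the table
# above as (aerobic, anaerobic). Pyl (O) and Sec (U) have no tabulated values.
ATP_COST = {
    "A": (-1, 1), "C": (11, 15), "D": (0, 2), "E": (-7, -1), "F": (-6, 2),
    "G": (-2, 2), "H": (1, 7), "I": (7, 11), "K": (5, 9), "L": (-9, 1),
    "M": (21, 23), "N": (3, 5), "P": (-2, 4), "Q": (-6, 0), "R": (5, 13),
    "S": (-2, 2), "T": (6, 8), "V": (-2, 2), "W": (-7, 7), "Y": (-8, 2),
}


def synthesis_cost(sequence: str) -> tuple[int, int]:
    """Return (aerobic, anaerobic) ATP totals for the amino acids in a sequence.

    Raises KeyError for residues without a tabulated cost (e.g. O or U).
    """
    residues = sequence.upper()
    aerobic = sum(ATP_COST[aa][0] for aa in residues)
    anaerobic = sum(ATP_COST[aa][1] for aa in residues)
    return aerobic, anaerobic


if __name__ == "__main__":
    aerobic, anaerobic = synthesis_cost("MKAG")
    print(f"aerobic: {aerobic} ATP, anaerobic: {anaerobic} ATP")
```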

Remarks

| Amino acid | Short | Abbrev. | Remarks |
|---|---|---|---|
| Alanine | A | Ala | Very abundant and very versatile, it is more stiff than glycine, but small enough to pose only small steric limits for the protein conformation. It behaves fairly neutrally, and can be located in both hydrophilic regions on the protein outside and the hydrophobic areas inside. |
| Asparagine or aspartic acid | B | Asx | A placeholder when either amino acid may occupy a position |
| Cysteine | C | Cys | The sulfur atom bonds readily to heavy metal ions. Under oxidizing conditions, two cysteines can join in a disulfide bond to form the amino acid cystine. When cystines are part of a protein, insulin for example, the tertiary structure is stabilized, which makes the protein more resistant to denaturation; therefore, disulfide bonds are common in proteins that have to function in harsh environments including digestive enzymes (e.g., pepsin and chymotrypsin) and structural proteins (e.g., keratin). Disulfides are also found in peptides too small to hold a stable shape on their own (e.g. insulin). |
| Aspartic acid | D | Asp | Asp behaves similarly to glutamic acid, and carries a hydrophilic acidic group with strong negative charge. Usually, it is located on the outer surface of the protein, making it water-soluble. It binds to positively charged molecules and ions, and is often used in enzymes to fix the metal ion. When located inside of the protein, aspartate and glutamate are usually paired with arginine and lysine. |
| Glutamic acid | E | Glu | Glu behaves similarly to aspartic acid, and has a longer, slightly more flexible side chain. |
| Phenylalanine | F | Phe | Essential for humans, phenylalanine, tyrosine, and tryptophan contain a large, rigid aromatic group on the side chain. These are the biggest amino acids. Like isoleucine, leucine, and valine, these are hydrophobic and tend to orient towards the interior of the folded protein molecule. Phenylalanine can be converted into tyrosine. |
| Glycine | G | Gly | Because of the two hydrogen atoms at the α carbon, glycine is not optically active. It is the smallest amino acid, rotates easily, and adds flexibility to the protein chain. It is able to fit into the tightest spaces, e.g., the triple helix of collagen. As too much flexibility is usually not desired, as a structural component, it is less common than alanine. |
| Histidine | H | His | His is essential for humans. In even slightly acidic conditions, protonation of the nitrogen occurs, changing the properties of histidine and the polypeptide as a whole. It is used by many proteins as a regulatory mechanism, changing the conformation and behavior of the polypeptide in acidic regions such as the late endosome or lysosome, enforcing conformation change in enzymes. However, only a few histidines are needed for this, so it is comparatively scarce. |
| Isoleucine | I | Ile | Ile is essential for humans. Isoleucine, leucine, and valine have large aliphatic hydrophobic side chains. Their molecules are rigid, and their mutual hydrophobic interactions are important for the correct folding of proteins, as these chains tend to be located inside of the protein molecule. |
| Leucine or isoleucine | J | Xle | A placeholder when either amino acid may occupy a position |
| Lysine | K | Lys | Lys is essential for humans, and behaves similarly to arginine. It contains a long, flexible side chain with a positively charged end. The flexibility of the chain makes lysine and arginine suitable for binding to molecules with many negative charges on their surfaces. E.g., DNA-binding proteins have their active regions rich with arginine and lysine. The strong charge makes these two amino acids prone to be located on the outer hydrophilic surfaces of the proteins; when they are found inside, they are usually paired with a corresponding negatively charged amino acid, e.g., aspartate or glutamate. |
| Leucine | L | Leu | Leu is essential for humans, and behaves similarly to isoleucine and valine. |
| Methionine | M | Met | Met is essential for humans. Always the first amino acid to be incorporated into a protein, it is sometimes removed after translation. Like cysteine, it contains sulfur, but with a methyl group instead of hydrogen. This methyl group can be activated, and is used in many reactions where a new carbon atom is being added to another molecule. |
| Asparagine | N | Asn | Similar to aspartic acid, Asn contains an amide group where Asp has a carboxyl. |
| Pyrrolysine | O | Pyl | Similar to lysine, but it has a pyrroline ring attached. |
| Proline | P | Pro | Pro contains an unusual ring to the N-end amine group, which forces the CO-NH amide sequence into a fixed conformation. It can disrupt protein folding structures like α helix or β sheet, forcing the desired kink in the protein chain. Common in collagen, it often undergoes a post-translational modification to hydroxyproline. |
| Glutamine | Q | Gln | Similar to glutamic acid, Gln contains an amide group where Glu has a carboxyl. Used in proteins and as a storage for ammonia, it is the most abundant amino acid in the body. |
| Arginine | R | Arg | Functionally similar to lysine. |
| Serine | S | Ser | Serine and threonine have a short group ended with a hydroxyl group. Its hydrogen is easy to remove, so serine and threonine often act as hydrogen donors in enzymes. Both are very hydrophilic, so the outer regions of soluble proteins tend to be rich with them. |
| Threonine | T | Thr | Essential for humans, Thr behaves similarly to serine. |
| Selenocysteine | U | Sec | The selenium analog of cysteine, in which selenium replaces the sulfur atom. |
| Valine | V | Val | Essential for humans, Val behaves similarly to isoleucine and leucine. |
| Tryptophan | W | Trp | Essential for humans, Trp behaves similarly to phenylalanine and tyrosine. It is a precursor of serotonin and is naturally fluorescent. |
| Unknown | X | Xaa | Placeholder when the amino acid is unknown or unimportant. |
| Tyrosine | Y | Tyr | Tyr behaves similarly to phenylalanine (precursor to tyrosine) and tryptophan, and is a precursor of melanin, epinephrine, and thyroid hormones. Naturally fluorescent, its fluorescence is usually quenched by energy transfer to tryptophans. |
| Glutamic acid or glutamine | Z | Glx | A placeholder when either amino acid may occupy a position |
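
The placeholder codes B, J, and Z each stand for one of two specific amino acids, so a sequence containing them represents a small set of concrete sequences. The sketch below enumerates that set. The function name `expand_ambiguous` is hypothetical, and X is deliberately left unexpanded because it can stand for any residue.

```python
from itertools import product

# Two-way placeholder codes from the table above:
# B = Asn or Asp, J = Leu or Ile, Z = Glu or Gln.
AMBIGUITY = {"B": "ND", "J": "LI", "Z": "EQ"}


def expand_ambiguous(sequence: str) -> list[str]:
    """List every concrete sequence a B/J/Z-containing sequence can denote.

    Other letters, including X, are passed through unchanged.
    """
    choices = [AMBIGUITY.get(aa, aa) for aa in sequence.upper()]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    for candidate in expand_ambiguous("ABJ"):
        print(candidate)
```

The number of candidates doubles with each placeholder, so this enumeration is only practical for sequences with a handful of ambiguous positions.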

Catabolism

Amino acids can be classified according to the properties of their main products:

  • Glucogenic, with the products having the ability to form glucose by gluconeogenesis
  • Ketogenic, with the products not having the ability to form glucose: These products may still be used for ketogenesis or lipid synthesis.
  • Amino acids catabolized into both glucogenic and ketogenic products

References

  1. (January 2007). "Natural expansion of the genetic code". Nature Chemical Biology.
  2. (August 2010). "Dual functions of codons in the genetic code". Critical Reviews in Biochemistry and Molecular Biology.
  3. (August 1994). "Adult amino acid requirements: the case for a major revision in current recommendations". The Journal of Nutrition.
  4. (August 2011). "A model of proto-anti-codon RNA enzymes requiring L-amino acid homochirality". Journal of Molecular Evolution.
  5. (2019-08-13). "Selective incorporation of proteinaceous over nonproteinaceous cationic amino acids in model prebiotic oligomerization reactions". Proceedings of the National Academy of Sciences.
  6. (May 2006). "pK values of the ionizable groups of proteins". Protein Science.
  7. (May 2009). "Protein ionizable groups: pK values and their contribution to protein stability and solubility". The Journal of Biological Chemistry.
  8. (May 2011). "Conformational preferences and pK(a) value of selenocysteine residue". Biopolymers.
  9. (August 2010). "Selenocysteine, pyrrolysine, and the unique energy metabolism of methanogenic archaea". Archaea.
  10. (January 2017). "Proteome-pI: proteome isoelectric point database". Nucleic Acids Research.
  11. "Atomic Weights and Isotopic Compositions for All Elements". NIST.
  12. (2013). "Physical biology of the cell". Garland Science.
  13. (2005). "Lippincott's Illustrated Reviews: Biochemistry (Lippincott's Illustrated Reviews)". Lippincott Williams & Wilkins.
Wikipedia Source

This article was imported from Wikipedia and is available under the Creative Commons Attribution-ShareAlike 4.0 License. Content has been adapted to SurfDoc format. Original contributors can be found on the article history page.
