- What is ProGlycProt?
-
ProGlycProt is a manually curated, exclusive repository of comprehensive information on experimentally characterized glycoproteins and protein glycosyltransferases that belong to eubacteria and archaea.
For the benefit of users, the ProGlycProt repository is arranged in two sub-databases, ProGPdb and ProGTdb.
ProGPdb: a compilation of experimentally validated bacterial and archaeal glycoproteins. Entries in ProGPdb are arranged in chronological order, and each entry has a unique identifier, the ProGP ID. ProGPdb is further arranged in two sections:
1) ProCGP (Prokaryotic Characterized Glycoproteins), a compilation of prokaryotic glycoproteins for which at least one glycosite, i.e. the glycosylated residue, has been identified through experiments such as Edman degradation, mass spectrometry or site-directed mutagenesis.
2) ProUGP (Prokaryotic Uncharacterized Glycoproteins), a list of prokaryotic glycoproteins for which glycosylation is known from experiments such as aberrant migration on SDS-PAGE, lectin binding or sugar-specific staining, but for which the glycosites are not known.
In addition, ProGlycProt provides a separate structure gallery, relevant links and two new tools developed with the new sequon information from the literature on prokaryotic glycoproteins in mind.
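The ProCGP/ProUGP split described above reduces to a single rule: an entry with at least one experimentally identified glycosite belongs to ProCGP, otherwise to ProUGP. A minimal Python sketch of that rule follows; the class, the field names and the ID strings are hypothetical illustrations, not ProGlycProt's actual schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GlycoproteinEntry:
    """Hypothetical record for one ProGPdb entry (not the real schema)."""
    progp_id: str  # placeholder identifier, format assumed
    glycosites: List[str] = field(default_factory=list)  # e.g. ["N42"]


def section_for(entry: GlycoproteinEntry) -> str:
    """Return the ProGPdb section implied by the definitions above."""
    # At least one experimentally identified glycosite -> characterized.
    return "ProCGP" if entry.glycosites else "ProUGP"


characterized = GlycoproteinEntry("EXAMPLE-1", ["N42"])
uncharacterized = GlycoproteinEntry("EXAMPLE-2")
print(section_for(characterized), section_for(uncharacterized))
```

The sketch deliberately ignores everything else an entry carries (organism, glycan structure, methods); it only shows how the presence of a known glycosite decides the section.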
ProGTdb: compiles and provides extensive information about experimentally characterized enzymes (GTs) involved in N-, O- and S-glycosylation in bacteria and archaea. It provides manually curated information about the native organism, genome, gene and protein, a detailed description of the substrate specificity, catalytic linkages, mechanism and structure, and the experimental strategies/methodologies used for the biochemical, genetic and/or biophysical characterization of the glycosyltransferase activity of the enzyme. ProGTdb contains not only entries that are cross-linked to the corresponding ProGP IDs but also entries for which the acceptor substrate is either a synthetic peptide or a eukaryotic protein. Each entry in ProGT_Main and ProGT_Accessory has a unique identifier, the ProGT ID, assigned chronologically. ProGTdb is composed of the following two sections:
1) ProGT_Main is a compilation of bacterial and archaeal protein glycosyltransferases for which at least one genetic or biochemical evidence of glycosyltransferase activity is validated in the published literature. Each entry in ProGT_Main has a unique identifier, the ProGT ID.
2) ProGT_Accessory is a compilation of bacterial and archaeal proteins/enzymes that have miscellaneous accessory roles in the protein glycosylation pathway of a characterized protein glycosyltransferase compiled in ProGT_Main. The qualifying criterion is at least one known genetic or biochemical evidence of accessory function in the literature. Each entry in ProGT_Accessory has a unique identifier, the ProGT ID.
- What is the rationale and vision behind creation of ProGlycProt?
- Once thought restricted to eukaryotes, glycoproteins are now known to exist in almost all major phyla of prokaryotes. In the last ten years, many new prokaryotic glycoproteins have been characterized for the precise location of glycosites, indicating a rising interest in the biology of protein glycosylation in prokaryotes. However, according to our search, no currently available protein informatics or glycoinformatics resource provides comprehensive and exclusive information about these experimentally identified or characterized glycoproteins of archaea and eubacteria. ProGlycProt is therefore designed to fill this void. A long-term vision of ProGlycProt is to provide as much as possible of the experimental information available in the published literature about these glycoproteins and to link it with the experimental information on the mechanisms and genetic machinery involved in glycosylation of these proteins.
- What is different or unique in ProGlycProt?
- To the best of our knowledge, ProGlycProt is the first manually curated, exclusive database of experimentally detected/characterized glycoproteins of prokaryotic origin. Though parts of the data available at ProGlycProt can be retrieved from SwissProt, PDB, BCSDB, O-GlycBase and CAZy, unlike all these repositories ProGlycProt is an exclusive compilation of experimentally validated glycoproteins, and of the enzymes instrumental in their glycosylation, in prokaryotes only. Furthermore, ProGlycProt provides additional information about a given glycoprotein, such as: the full structure of the attached glycan (IUPAC linear notation) as available from the literature, information about glycosylation-linked genes, the experimental methods used to characterize the glycoprotein, year of detection, year of characterization, observed sequon features, manual annotation of all characterized glycoprotein sequences to incorporate mutational changes, sequence conflicts and in vitro or in vivo engineered sequences, and a visual display of glycosites.
Similarly, the enzymes involved in glycosylation (and the glycosylation pathway) of a given protein are described in detail. Manually curated information is provided on the native organism, genome, gene and protein, together with a detailed description of the substrate specificity, catalytic linkages, mechanism and structure, and the experimental strategies/methodologies used for the biochemical, genetic and/or biophysical characterization of the glycosyltransferase activity of the enzyme. The first release of ProGlycProt contained information about at least 108 experimentally known glycoproteins that were absent from SwissProt because the genomes of the related organisms were unsequenced, and as many as 69 proteins, including peptides and engineered proteins, for which glycosites were not yet annotated in any of the above-mentioned protein databases.
The second release provides at least 48 unique entries for protein glycosyltransferases that are not yet listed in CAZy. Further, of the 181 characterized GTs listed in ProGTdb, only 46 are currently listed as characterized in CAZy. ProGlycProt additionally provides information about the in vitro engineered/evolved variants of these GTs, which are not compiled in CAZy.
- Does ProGlycProt represent all known prokaryotic glycoproteins?
- As of now and to the best of our knowledge, ProGlycProtdb represents the largest compilation of characterized prokaryotic glycoproteins available online, consisting of entries from 1968 to April 2017. We do not claim that ProCGP and ProGTdb are complete compilations, but for the second release we made our best effort to incorporate all data we could find in the literature, continuing until our searches, run up to April 30, 2017, stopped turning up further references. We have also tried to provide information that is as accurate as possible, yet we encourage users to refer to the original literature cited with each entry.
On the other hand, ProUGP is compiled from various brief seed compilations available in some of the published reviews (included in the bibliography) and from other available research publications. In view of the increasing availability of data on the detection of new glycoproteins using high-throughput techniques such as mass spectrometry, we believe ProUGP will keep growing exponentially in future updates. In the second release, we have aimed to bring together as much information as possible about protein glycosylation, the enzymes involved and the pathways defined, in one place and in a fully cross-referenced and searchable format, with additional information on several further aspects.
- What is the Scope of ProGlycProt?
-
In the last two decades, considerable interest has been generated in studying glycoproteins and the mechanisms of their glycosylation in bacteria and archaea. In the first release, we tried to provide all-round information about a number of glycoproteins that are implicated in virulence, host-pathogen interactions, immune modulation, disease diagnosis and vaccination. The second release provides a 42% increase in the total number of entries in the database, with a major expansion in the compilation of experimentally validated proteins/enzymes involved in protein glycosylation in prokaryotes.
Staying true to the promises we made at the first release of ProGlycProt, in the second release we have improved and expanded the repository to:
1. Include extensive experimental information about prokaryotic Oligosaccharyl Transferases (OSTs), Glycosyl Transferases (GTs) and other accessory proteins or enzymes.
2. Facilitate better retrieval of biologically relevant and experiment-oriented information about each entry. The second release provides Search by Features and Compare entries tools, as well as a Map/Location-based search of a database entry and its associated research groups/laboratories.
3. Address the gap in structural and image inputs for glycan entries corresponding to ProGlycProt entries: we are in the process of linking these entries to the recently released international glycan structure repository (https://glytoucan.org/).
- Future Plans for ProGlycProt?
- Apart from continuing the compilation of bacterial and archaeal glycoproteins and glycosyltransferases, in future updates we aim to expand the information on directed evolution and applications of the glycosyltransferases.
Glossary of Terms used in ProGlycProt
S.No. | Term/Acronym | Definition |
1. | AAL | Aleuria aurantia lectin |
2. | ABEE | p-Aminobenzoic acid ethyl ester |
3. | Amino sugar | Monosaccharide with one hydroxyl group (-OH) replaced by an amine group (-NH2). |
4. | Bac | Bacillosamine (2,4-diacetamido-2,4,6-trideoxyglucopyranose). |
5. | CAD/CID | Collision-Activated (or -Induced) Dissociation |
6. | CapLC-MS/MS | Capillary Liquid Chromatography-Tandem Mass Spectrometry |
7. | S (Cys) linked glycosylation | Refers to the covalent linkage between a glycan and the sulphur atom of a cysteine residue in a protein sequence |
8. | COSY | Correlation Spectroscopy |
9. | DATDH | 2,4-Diacetamido-2,4,6-trideoxyhexose |
10. | Deglycosylation | Removal of glycans from glycoproteins by chemical or enzymatic methods. |
11. | DIG Glycan detection | Method of detection of digoxigenin (DIG)-labeled glycoconjugates using enzyme immunoassay |
12. | Dolichol | An isoprenoid lipid with 15-19 isoprenoid units and a terminal phosphorylated hydroxyl group. Dolichol acts as a membrane-bound carrier for sugars in the synthesis of glycoproteins and glycolipids |
13. | DQF-COSY | Double Quantum Filtered Correlation Spectroscopy |
14. | ECD | Electron Capture Dissociation |
15. | Endo Hf | Endoglycosidase H; cleaves between the two GlcNAc residues of the N-glycan core, leaving one GlcNAc residue attached to Asn. |
16. | Engineered glycoprotein | A naturally unglycosylated protein or a synthetic peptide that is glycosylated in vitro or in vivo by chemical or enzymatic methods (usually after mutation of one or a few residues). Such proteins are also termed neoglycoproteins. |
17. | ESI Q-TOF-MS | Electrospray Ionization Quadrupole Time Of Flight Mass Spectrometry |
18. | ETD | Electron Transfer Dissociation |
19. | FAB-MS | Fast Atom Bombardment-Mass Spectrometry |
20. | FT-ICR-MS | Fourier Transform Ion Cyclotron Resonance Mass Spectrometry |
21. | Fuc | Fucose |
22. | FucNAc | N-Acetylfucosamine |
23. | GAGs | Glycosaminoglycans |
24. | Gal | Galactose |
25. | GalNAc | N-Acetyl-D-Galactosamine |
26. | GATDH | 2-Acetamido 4-glyceramido 2,4,6-trideoxyhexose or 2-glyceramido 4-acetamido 2,4,6-trideoxyhexose |
27. | GC | Gas Chromatography |
28. | GC-MS | Gas Chromatography-Mass Spectrometry |
29. | Glc | Glucose |
30. | GlcA | Glucuronic acid |
31. | GlcNAc | N-Acetyl-D-Glucosamine (NAG) |
32. | Glycoform | One of the differentially glycosylated forms of a glycoprotein. Glycoforms of a glycoprotein have the same protein sequence but differ in the number and/or structure of oligosaccharides attached |
33. | Glycoprotein | Protein with one or more covalently bound glycans added as a co-translational or post-translational modification. The glycan may be a monosaccharide, an oligosaccharide or a polysaccharide. |
34. | Glycosidase | Enzyme catalyzing the hydrolysis of a glycosidic linkage |
35. | Glycosidic linkage (bond) | The bond linking monosaccharides in disaccharides and polysaccharides. Formed by a condensation reaction between two OH groups, one from each of the two monosaccharides. |
36. | Glycosite | An amino acid residue where glycosylation occurs in a protein sequence. |
37. | Glycosyltransferase (GT) | Enzyme (with EC 2.4.X.X) catalyzing the transfer of a sugar from a nucleotide (nucleoside phosphate) sugar donor to an acceptor substrate to form a glycosidic linkage |
38. | HMBC | Heteronuclear Multiple Bond Coherence |
39. | HMQC | Heteronuclear Multiple Quantum Correlation |
40. | HPAEC | High-Performance Anion-Exchange Chromatography |
41. | HPLC | High Pressure Liquid Chromatography |
42. | HSQC | Heteronuclear Single Quantum Coherence |
43. | IdoA | Iduronic Acid |
44. | Lectin | A glycan-binding protein with a carbohydrate-recognition domain (CRD) homologous to the sugar-binding region of leguminous plant lectin. |
45. | LFA | Limax flavus agglutinin (sialic acid-specific lectin) |
46. | MALDI-TOF MS | Matrix Assisted Laser Desorption/Ionization Time Of Flight Mass Spectrometry |
47. | Man | Mannose |
48. | MS-MS | Tandem Mass Spectrometry |
49. | Nano-LC-MS/MS | Nano Liquid Chromatography-Tandem Mass Spectrometry |
50. | nESI-feCID-MS/MS | Nano-Electrospray Ionization-Front-End Collision-Induced Dissociation Tandem Mass Spectrometry |
51. | NeuNAc (NANA) | N-Acetyl Neuraminic Acid (Sialic acid) |
52. | N (Asn) linked glycosylation | Refers to the covalent linkage between a glycan and the amide nitrogen of an asparagine residue in a protein sequence |
53. | NMR | Nuclear Magnetic Resonance |
54. | NOESY | Nuclear Overhauser Effect Spectroscopy |
55. | O (Ser/ Thr/ Tyr) linked glycosylation | Refers to the covalent linkage between a glycan and the oxygen of the hydroxyl group of serine, threonine or tyrosine in a protein sequence |
56. | OST | Oligosaccharyl Transferase, the enzyme responsible for catalyzing the transfer of a precursor glycan from a sugar carrier to the protein or peptide |
57. | PAS staining | Periodic Acid-Schiff staining |
58. | PNGase F | Peptide N-glycosidase F, which cleaves between the innermost GlcNAc and the Asn residue of an N-linked oligosaccharide. |
59. | Pro-Q Emerald glycostaining | A derivative of dansyl hydrazide that provides for fluorescent staining of glycoproteins |
60. | PTM | Post Translational Modification |
61. | Reducing end (of the glycan) | The free end of a disaccharide, polysaccharide or oligosaccharide (glycan) that retains the carbonyl function and can act as a reducing agent. In glycoproteins, it is the end of the glycan attached to the protein or peptide. |
62. | RP-HPLC | Reversed Phase High Pressure Liquid Chromatography |
63. | SBA | SoyBean Agglutinin (lectin) |
64. | Sequon | Sequence of conserved amino acids around the residue that gets glycosylated |
65. | STT3 | Staurosporine- and Temperature-Sensitive mutant 3, the catalytic subunit of the eukaryotic multisubunit oligosaccharyl transferase |
66. | TFA | Trifluoroacetic Acid |
67. | TFMS | Trifluoromethanesulfonic Acid |
68. | TOCSY | Total Correlation Spectroscopy |
69. | Xyl | Xylose |
70. | β-elimination | Base-catalyzed nonhydrolytic cleavage of glycosidic bonds between O-linked glycans and the β-hydroxyl groups of serine or threonine residues of a protein or peptide. |
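
To illustrate the Sequon entry in the glossary, the following sketch scans a protein sequence for the classical N-linked sequon N-X-S/T, where X is any residue except proline. The choice of motif, the function name and the example sequence are assumptions made for illustration only; this is not one of the ProGlycProt tools, and sequons reported for prokaryotic glycoproteins in the literature may differ from this motif.

```python
import re
from typing import List

# Classical N-glycosylation sequon N-X-S/T (X = any residue except Pro).
# The lookahead keeps each match one residue wide, so overlapping
# motifs such as "NNST" are all reported.
N_SEQUON = re.compile(r"N(?=[^P][ST])")


def find_n_sequons(sequence: str) -> List[int]:
    """Return 1-based positions of Asn residues inside an N-X-S/T motif."""
    return [match.start() + 1 for match in N_SEQUON.finditer(sequence.upper())]


# Made-up example sequence: N3 (N-A-S) and N12 (N-V-T) match,
# while N7 is followed by proline and is skipped.
print(find_n_sequons("MKNASGNPTDQNVTA"))  # [3, 12]
```

A sequon match only marks a candidate residue; as the Glycosite entry notes, an actual glycosite is a residue where glycosylation has been shown to occur.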