HFV Ebola sequence database
 

N-GlycoSite

Purpose: Highlight and tally predicted N-linked glycosylation sites (Nx[ST] patterns, where x can be any amino acid; NP[ST] patterns are excluded by default).

Input
Paste your alignment here
or upload your file

Options
Exclude NP[ST] pattern
Group sequences: Do not group
Summarize results by grouped sequences according to:
first character(s) in sequence names
the column in field of sequence names delimited by
paste or upload grouped sequence names (see example below)

Details:
During glycosylation, an oligosaccharide chain is attached to asparagine (N) occurring in the tripeptide sequence N-X-S or N-X-T, where X can be any amino acid except Pro. This sequence is called a glycosylation sequon. The N-GlycoSite tool marks and tallies the locations where this pattern occurs.

The likelihood of N-linked glycosylation at a particular site can be influenced by the context in which the site is embedded. The sequon can therefore be extended to a 4-amino acid NX[ST]Z pattern, in which the amino acids at the X and Z positions can be important determinants of glycosylation efficiency. For example, a proline in position X or Z strongly disfavors N-linked glycosylation.
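As an illustration of the pattern described above, the following Python sketch scans an ungapped amino acid sequence for N-X-[ST] sequons and also reports the residue at the following (Z) position. It is not the N-GlycoSite implementation: the function name find_sequons, the exclude_np parameter, and the choice to report overlapping matches are assumptions made for this example.

```python
import re

def find_sequons(seq, exclude_np=True):
    """Return (position, motif, z_residue) for each N-X-[ST] match.

    Positions are 1-based. Assumes an ungapped, single-letter amino acid
    sequence. Overlapping matches are reported (an assumption of this sketch).
    """
    # X is "anything but proline" when NP[ST] is excluded, otherwise any residue.
    x_class = "[^P]" if exclude_np else "."
    # A lookahead lets the regex engine report overlapping sequons such as NNTS.
    pattern = re.compile(rf"(?=(N{x_class}[ST]))")
    seq = seq.upper()
    hits = []
    for match in pattern.finditer(seq):
        start = match.start()
        # Z is the residue after the sequon; empty at the end of the sequence.
        z_residue = seq[start + 3] if start + 3 < len(seq) else ""
        hits.append((start + 1, match.group(1), z_residue))
    return hits
```

Calling find_sequons("MNGTANPSNNTS") skips the NPS site by default; passing exclude_np=False includes it. The Z residue is returned so that, for example, sites followed by a proline can be flagged separately.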

O-linked glycosylation signals are more difficult to predict, but one can estimate their positions using the NetOGlyc program at the Center for Biological Sequence Analysis.

Input:
Input can be one amino acid sequence, or an alignment of amino acid sequences, from any organism. If you just want to tally the number of N-glycosylation sites, the protein sequences do not need to be aligned. Standard sequence alignment formats are recognized.
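For readers who want a rough offline tally, the sketch below counts sequons per sequence in FASTA-formatted text. It is only an approximation of what the tool does: the helper names parse_fasta and tally_sequons are hypothetical, only FASTA input is handled, NP[ST] is always excluded, and gap characters are removed before matching, which is an assumption about how aligned input should be treated.

```python
import re

# N, then any residue except proline, then S or T; lookahead allows overlaps.
SEQUON = re.compile(r"(?=N[^P][ST])")

def parse_fasta(text):
    """Parse FASTA text into {name: sequence}; sequence lines are concatenated."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}

def tally_sequons(text):
    """Return {name: number of N-X-[ST] sequons} for each FASTA record."""
    counts = {}
    for name, seq in parse_fasta(text).items():
        # Assumption: alignment gaps ('-' or '.') are dropped before matching.
        ungapped = re.sub(r"[-.\s]", "", seq).upper()
        counts[name] = len(SEQUON.findall(ungapped))
    return counts
```

Because gaps are stripped, the same function works for aligned and unaligned input when only the per-sequence totals are of interest.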

Exclude NP[ST] pattern:
A proline in the second position (site pattern NP[ST]) is strongly disfavored for glycosylation, so these patterns are excluded by default. You may uncheck the box to include them.

Grouped Sequence Names:
If you are analyzing multiple sequences, you can choose how to group them in the analysis. If you are analyzing a single sequence, or you do not want to group your sequences, just ignore these options. Your sequences can be grouped by the first character(s) of the sequence names, by a chosen field of the sequence names (split on a delimiter character you specify), or by providing a list of groups.
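The first two grouping modes can be pictured with a short Python sketch. The function names group_by_prefix and group_by_field, the 1-based column numbering, and the use of 'Others' for names with too few fields are assumptions for illustration, not the tool's documented behavior.

```python
from collections import defaultdict

def group_by_prefix(names, n_chars=1):
    """Group sequence names by their first n_chars characters."""
    groups = defaultdict(list)
    for name in names:
        groups[name[:n_chars]].append(name)
    return dict(groups)

def group_by_field(names, delimiter=".", column=1):
    """Group sequence names by one delimited field (column is 1-based)."""
    groups = defaultdict(list)
    for name in names:
        fields = name.split(delimiter)
        # Names with too few fields fall into 'Others' (an assumption here).
        key = fields[column - 1] if column <= len(fields) else "Others"
        groups[key].append(name)
    return dict(groups)
```

With names such as 1a.US.-.HCV-H, grouping on column 2 with '.' as the delimiter would separate sequences by the country code field.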

Each sequence must be on a separate line, and groups are separated by an empty line. The first item ending in ':' in a group will be taken as the group name, but this line is optional. If group names are omitted, names will be assigned as Group-1, Group-2, etc. Sequences that are not present in any group will be named 'Others' and colored gray. This is useful for highlighting some groups of sequences out of a target set.

The following can be pasted in as the "grouped sequence names" for testing with the Sample Input:

North America:
1a.US.-.HCV-H
1a.US.-.RBPRESC2C4
1a.US.-.US5
1a.US.-.SCPRESC2C9
1a.US.-.BCS1C13
1a.US.78.FM_78
1a.US.-.HCV-PT
1a.US.81.HW_81
1a.US.-.RHPRESC2D
1a.US.-.RJPRESC2D
1a.US.77.JL_77

Other:
1a.-.-.H77
1a.IT.-.I21
1a.-.-.COLONEL
1a.-.-.HCT23
1a.-.-.PHCV-1/SF9_A
1a.-.-.HCT18
1a.-.-.LTD6-2-XF224
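
The grouped sequence names format shown above (one name per line, groups separated by an empty line, optional group name ending in ':') could be parsed along the following lines. This is a sketch, not the tool's code: parse_groups and assign_groups are hypothetical names, and numbering unnamed groups by their position in the list is an assumption.

```python
import re

def parse_groups(text):
    """Parse a grouped-names list into {group_name: [sequence names]}."""
    groups = {}
    # Groups are separated by one or more empty lines.
    blocks = re.split(r"\n\s*\n", text.strip())
    for index, block in enumerate(blocks, start=1):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        group_name = f"Group-{index}"  # default when no 'Name:' line is given
        named = False
        members = []
        for line in lines:
            # The first item ending in ':' is taken as the group name.
            if not named and line.endswith(":"):
                group_name = line[:-1].strip()
                named = True
            else:
                members.append(line)
        if members:
            groups[group_name] = members
    return groups

def assign_groups(sequence_names, groups):
    """Map each sequence name to its group, or to 'Others' if ungrouped."""
    lookup = {member: g for g, members in groups.items() for member in members}
    return {name: lookup.get(name, "Others") for name in sequence_names}
```

Sequences in the input that do not appear in any parsed group map to 'Others', matching the behavior described above.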

References:

  1. Zhang M et al., Glycobiology. 14(12):1229-46 (2004) -- please cite this reference if you use our tool in a publication.
  2. Marshall RD, Biochem Soc Symp. 40:17-26 (1974)
  3. Kasturi et al., Biochem J. 323 (Pt 2):415-9 (1997)
  4. Mellquist JL et al., Biochemistry. 37(19):6833-7 (1998)



Questions or comments? Contact us at hfv-info@lanl.gov