The server: description
DISULFIND is a server for predicting the disulfide bonding state of
cysteines and their disulfide connectivity starting from sequence
alone. Disulfide bridges play a major role in the stabilization of the
folding process for several proteins. Prediction of disulfide
bridges from sequence alone is therefore useful for the study of structural
and functional properties of specific proteins. In addition, knowledge
about the disulfide bonding state of cysteines may help the experimental
structure determination process and may be useful in other genomic
annotation tasks.
Our server predicts disulfide patterns in two computational stages: (1) the
disulfide bonding state of each cysteine is predicted by a BRNN-SVM binary classifier; (2) cysteines that are known to participate in the formation of bridges are paired by a Recursive Neural Network to obtain a connectivity pattern.
Options:
- Predict both the bonding state of each cysteine in the query and the corresponding disulfide arrangement. Option: Predict bonding state + connectivity pattern.
- Predict only the disulfide pattern of some of the cysteines. Option: Predict connectivity from user specified bonding state.
Note: this last option covers a few possible situations:
- You know the bonding state of each cysteine in the chain and would like to know how the oxidized cysteines are connected.
- You know the bonding state of some of the cysteines and would like to know how they would be connected.
If you select this option, pressing the submit button displays a page with a list of checkboxes, one for each cysteine position in the submitted query. Please do not use the browser's navigation or reload buttons at this stage.
After you press the submit button on that page, connectivity is predicted only among the cysteines whose checkboxes you selected. An error is reported for obviously inconsistent selections, e.g. an odd number of selected cysteines or none at all.
Important Note:
To trade off computational burden against prediction
reliability, DISULFIND can assign at most 5 disulfide bridges. Two
remarks are relevant to this limitation. First, chains with more than
5 bridges are rare (no more than 10% of the SwissProt chains annotated
with disulfide bridges). Second, prediction accuracy is
already low for chains with 5 bridges because few training examples
are available; hence predictions of patterns with 6 or
more bridges would be very inaccurate and could not be trusted.
As a consequence, the connectivity predictor will not return an answer if it is given (either by the bonding state predictor or the user) more than 10 oxidized cysteines.
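To give a sense of the computational burden mentioned above: the number of ways to pair 2n oxidized cysteines into n bridges is (2n)!/(n! 2^n). The short Python sketch below evaluates this count; the function name is illustrative and is not part of DISULFIND.

```python
from math import factorial

def count_connectivity_patterns(n_bridges: int) -> int:
    """Number of ways to pair 2*n_bridges cysteines into n_bridges bridges."""
    if n_bridges < 0:
        raise ValueError("n_bridges must be non-negative")
    # (2n)! orderings, divided by n! bridge orders and 2 orientations per bridge
    return factorial(2 * n_bridges) // (factorial(n_bridges) * 2 ** n_bridges)

for n in range(1, 7):
    print(n, count_connectivity_patterns(n))
```

With 5 bridges there are 945 candidate patterns; a sixth bridge would raise this to 10395.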
Bonding State Predictor
DISULFIND employs an SVM
binary classifier to predict the bonding state of each cysteine [3,4], followed by a refinement stage that
collectively classifies all the cysteines in the chain. Global
dependencies represented by the collective bonding state are modelled
in two ways. First, the SVM receives both local and global
descriptive attributes as input. Second, we add a post-processing stage
consisting of a bidirectional recurrent neural network (BRNN),
in an attempt to recover some correlation, and a Viterbi
decoder that finds the maximum likelihood bonding state assignment
satisfying the constraint of an even number of disulfide-bonded
cysteines (interchain bonds are ignored).
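The even-number constraint can be illustrated with a small dynamic program over the parity of bonded cysteines. This is only a sketch of the idea, not the server's decoder: it assumes each cysteine comes with an independent probability of being bonded (a simplification), and the function name `decode_even_parity` is hypothetical.

```python
import math

def decode_even_parity(p_bonded):
    """Most likely 0/1 assignment with an even number of 1s.

    Assumes independent per-cysteine bonding probabilities (a simplification).
    """
    # best[parity] = (log-likelihood, assignment) of the best prefix so far
    best = {0: (0.0, []), 1: (-math.inf, [])}
    for p in p_bonded:
        log_free = math.log(1.0 - p) if p < 1.0 else -math.inf
        log_bond = math.log(p) if p > 0.0 else -math.inf
        new_best = {}
        for parity in (0, 1):
            # A free cysteine keeps the parity; a bonded one flips it.
            keep_score, keep_path = best[parity]
            flip_score, flip_path = best[1 - parity]
            cand_free = (keep_score + log_free, keep_path + [0])
            cand_bond = (flip_score + log_bond, flip_path + [1])
            new_best[parity] = max(cand_free, cand_bond, key=lambda c: c[0])
        best = new_best
    return best[0][1]

print(decode_even_parity([0.9, 0.8, 0.6]))
```

In this toy input the unconstrained choice would mark all three cysteines as bonded; the constraint flips the least confident one.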
Disulfide Connectivity Predictor
Predicting disulfide patterns can be seen as mapping an input sequence with
annotated cysteines into an output graph representing disulfide
connectivity. The input is formed by the annotated sequence and a candidate
connectivity pattern. The method applied in DISULFIND, documented in
[1], employs a bi-directional Recursive Neural
Network trained to predict the expected distance of candidate patterns to
the real one. Prediction is carried out by running the trained network on all
possible connectivity patterns and choosing the one yielding maximum
score.
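The exhaustive search described above can be sketched in Python as follows. The enumeration of pairings is generic; the scoring function `toy_score` is a made-up stand-in for the trained network (it simply prefers short bridges) and has nothing to do with the actual model.

```python
def all_pairings(items):
    """Yield every way to split an even-length list into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in all_pairings(remaining):
            yield [(first, partner)] + tail

def best_pattern(cysteines, score):
    """Return the candidate pattern with the maximum score."""
    if len(cysteines) == 0 or len(cysteines) % 2 != 0:
        raise ValueError("need a non-zero, even number of cysteines")
    return max(all_pairings(sorted(cysteines)), key=score)

def toy_score(pattern):
    # Hypothetical scorer: penalizes long sequence separation between partners.
    return -sum(b - a for a, b in pattern)

print(best_pattern([7, 13, 18, 19], toy_score))
```

For ten oxidized cysteines the generator yields 945 candidates, which is why the search stays tractable at the 5-bridge limit.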
Input formats
Email Address
The address to which the prediction will be delivered if the option Batch: send output to email address is selected.
NOTE: Check that you typed your address correctly.
Query Name
An optional name for your query. We strongly suggest that you use one, especially if sending more than one query. The order in which you send your queries may not correspond to the order in which you receive the answers.
Amino Acid Sequence
The sequence of amino acids:
- A bare sequence is accepted. Please do not use FASTA format.
- Spaces, newlines and tabs are simply ignored.
- Non-alphabetical characters cause the query to be rejected.
- Only the one-letter amino acid code is accepted. Please do not send nucleotide sequences; if you do, A will be treated as Alanine, C as Cysteine, etc.
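The input rules above can be checked locally before submitting a query. The sketch below is an assumption about how such a check might look, not the server's own code; the function names are hypothetical, and converting the sequence to upper case is an extra choice made here.

```python
def clean_sequence(raw: str) -> str:
    """Strip whitespace and reject anything that is not a bare sequence."""
    seq = "".join(raw.split())  # drops spaces, newlines and tabs
    if not seq:
        raise ValueError("empty sequence")
    if not (seq.isascii() and seq.isalpha()):
        # Also rejects FASTA headers, which contain '>' and other symbols.
        raise ValueError("sequence contains non-alphabetical characters")
    return seq.upper()  # assumption: case is normalized for convenience

def cysteine_positions(seq: str):
    """1-based positions of cysteines, as used in the output examples."""
    return [i for i, aa in enumerate(seq, start=1) if aa == "C"]

seq = clean_sequence("AVIT GAC\nERD\tLQC")
print(seq, cysteine_positions(seq))
```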
Output Options
Replies are sent either by email (option: Batch: send output to email address) or as an HTML page (option: Interactive: receive output on the browser). In the latter case, after pressing the send button, an intermediate page will appear and auto-refresh every 10 seconds until the final output is returned.
By default DISULFIND returns only the most likely connectivity pattern. By setting the number of alternatives to an integer k in the range [2,3], the k best ranking patterns will be returned.
EMAIL output example
Query Name: VPRA_DENPO
                           +----------------------+
                      +----|------------+         |                    +---------------+
                +-----|----|+           |         |                  +-|-----+         |
                |     |    ||           |         |                  | |     |         |
          .........10........20........30........40........50........60........70........
AA        AVITGACERDLQCGKGTCCAVSLWIKSVRVCTPVGTSGEDCHPASHKIPFSGQRKMHHTCPCAPNLACVQTSPKKFKCL
DB_state        1     1    11           1         1                  1 1     1         1
DB_conf         9     9    99           9         9                  9 9     9         9
DB_flip
          80
AA        SK
DB_state
DB_conf
DB_flip
DB_bond   bond(7,19)
DB_bond   bond(13,31)
DB_bond   bond(18,41)
DB_bond   bond(60,68)
DB_bond   bond(62,78)
Conn_conf 0.442914
----------------------------------------------------------------------------------------
SECOND best ranking connectivity pattern
                      +----+                      +------------------+ +---------------+
                +-----|----|+           +---------|------------------|-|-----+         |
                |     |    ||           |         |                  | |     |         |
          .........10........20........30........40........50........60........70........
AA        AVITGACERDLQCGKGTCCAVSLWIKSVRVCTPVGTSGEDCHPASHKIPFSGQRKMHHTCPCAPNLACVQTSPKKFKCL
          80
AA        SK
DB_bond   bond(7,19)
DB_bond   bond(13,18)
DB_bond   bond(31,68)
DB_bond   bond(41,60)
DB_bond   bond(62,78)
Conn_conf 0.422348
----------------------------------------------------------------------------------------
Abbreviations used:
AA        = amino acid sequence
DB_state  = predicted disulfide bonding state
            (1=disulfide bonded, 0=not disulfide bonded)
DB_conf   = confidence of disulfide bonding state prediction (0=low to 9=high)
DB_flip   = an asterisk (*) indicates that the Viterbi aligner overruled the
            most likely prediction for that residue in order to achieve a
            consistent prediction at the protein level (even number of disulfide
            bonded cysteines, as interchain bonds are ignored).
DB_bond   = position in sequence of a pair of cysteines predicted to form
            a disulfide bridge.
Conn_conf = confidence of connectivity assignment given the predicted disulfide
            bonding state (real value in [0,1])
----------------------------------------------------------------------------------------
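If you receive many replies by email, the DB_bond and Conn_conf lines can be pulled out with a few lines of Python. This is a minimal sketch that assumes the reply follows exactly the layout of the example above; the function name `parse_patterns` and the returned dictionary keys are hypothetical.

```python
import re

# Line formats inferred from the example output only (an assumption).
BOND_RE = re.compile(r"^DB_bond\s+bond\((\d+),(\d+)\)$")
CONF_RE = re.compile(r"^Conn_conf\s+(\d+(?:\.\d+)?)$")

def parse_patterns(report: str):
    """Return one dict per connectivity pattern found in the report text."""
    patterns, bonds = [], []
    for line in report.splitlines():
        line = line.strip()
        m = BOND_RE.match(line)
        if m:
            bonds.append((int(m.group(1)), int(m.group(2))))
            continue
        m = CONF_RE.match(line)
        if m:
            # A Conn_conf line closes the current pattern.
            patterns.append({"bonds": bonds, "conn_conf": float(m.group(1))})
            bonds = []
    return patterns

sample = """\
DB_bond bond(7,19)
DB_bond bond(13,31)
Conn_conf 0.442914
----
DB_bond bond(7,19)
DB_bond bond(13,18)
Conn_conf 0.422348
Conn_conf = confidence of connectivity assignment
"""
for pattern in parse_patterns(sample):
    print(pattern["conn_conf"], pattern["bonds"])
```

The legend lines (those with an equals sign after the label) do not match either expression and are skipped.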
HTML output example
Results for VPRA_DENPO
                           +----------------------+
                      +----|------------+         |                    +---------------+
                +-----|----|+           |         |                  +-|-----+         |
                |     |    ||           |         |                  | |     |         |
          .........10........20........30........40........50........60........70........
AA        AVITGACERDLQCGKGTCCAVSLWIKSVRVCTPVGTSGEDCHPASHKIPFSGQRKMHHTCPCAPNLACVQTSPKKFKCL
DB_state        1     1    11           1         1                  1 1     1         1
DB_conf         9     9    99           9         9                  9 9     9         9
          80
AA        SK
DB_state
DB_conf
DB_bond   bond(7,19)
DB_bond   bond(13,31)
DB_bond   bond(18,41)
DB_bond   bond(60,68)
DB_bond   bond(62,78)
Conn_conf 0.442914
Second best ranking connectivity pattern
                      +----+                      +------------------+ +---------------+
                +-----|----|+           +---------|------------------|-|-----+         |
                |     |    ||           |         |                  | |     |         |
          .........10........20........30........40........50........60........70........
AA        AVITGACERDLQCGKGTCCAVSLWIKSVRVCTPVGTSGEDCHPASHKIPFSGQRKMHHTCPCAPNLACVQTSPKKFKCL
          80
AA        SK
DB_bond   bond(7,19)
DB_bond   bond(13,18)
DB_bond   bond(31,68)
DB_bond   bond(41,60)
DB_bond   bond(62,78)
Conn_conf 0.422348
- Please cite:
- A. Ceroni, A. Passerini, A. Vullo and P. Frasconi. DISULFIND: a Disulfide Bonding State and Cysteine Connectivity Prediction Server,
Nucleic Acids Research, 34(Web Server issue):W177--W181, 2006.
- Contact information:
- Abbreviations used:
- AA: amino acid sequence
- DB_state: predicted disulfide bonding state (1=disulfide bonded, 0=not disulfide bonded)
- DB_conf: confidence of disulfide bonding state prediction (0=low to 9=high). A red colour means that the Viterbi aligner overruled the most likely prediction for that residue in order to achieve a consistent prediction at the protein level (even number of disulfide bonded cysteines, as interchain bonds are ignored). See papers for details.
- DB_bond: position in sequence of a pair of cysteines predicted to form a disulfide bridge.
- Conn_conf: confidence of connectivity assignment given the predicted disulfide bonding state (real value in [0,1])
References
Please cite:
[1] A. Ceroni, A. Passerini, A. Vullo and P. Frasconi. DISULFIND: a Disulfide Bonding State and Cysteine Connectivity Prediction Server, Nucleic Acids Research, 34(Web Server issue):W177-W181, 2006.
For the disulfide connectivity predictor see also:
[2] A. Vullo and P. Frasconi. Disulfide connectivity prediction using recursive neural networks and evolutionary information, Bioinformatics, 20, 653-659, 2004.
For the cysteine bonding state predictor see also:
[3] P. Frasconi, A. Passerini and A. Vullo. A two-stage SVM architecture for predicting the disulfide bonding state of cysteines, Proc. IEEE Workshop on Neural Networks for Signal Processing, pp. 25-34, 2002.
[4] A. Ceroni, P. Frasconi, A. Passerini and A. Vullo. Predicting the disulfide bonding state of cysteines with combinations of kernel machines, Journal of VLSI Signal Processing, 35, 287-295, 2003.
Copyright notice
The documents listed in this site are provided as a means to ensure timely dissemination
of scholarly and technical work on a noncommercial basis. Copyright and all rights therein
are maintained by the authors or by other copyright holders, notwithstanding that they have
offered their works here electronically. It is understood that all persons copying this
information will adhere to the terms and constraints invoked by each author's copyright.
These works may not be reposted without the explicit permission of the copyright holder.