2013-09-18 14:12:31 +00:00

STRIDE: Protein secondary structure assignment
from atomic coordinates

Dmitrij Frishman & Patrick Argos

European Molecular Biology Laboratory
Postfach 102209, Meyerhofstr. 1
69012 Heidelberg
Germany

FRISHMAN@EMBL-HEIDELBERG.DE
ARGOS@EMBL-HEIDELBERG.DE


CONTENTS

1. About the method

2. Copyright notice

3. Availability

4. Installation

5. Using STRIDE

6. Output format

7. Bug reports and user feedback

8. References

----------------------------------------------------------------------


1. About the method

STRIDE [1] is a program to recognize secondary structural elements in
proteins from their atomic coordinates. It performs the same task as
DSSP by Kabsch and Sander [2] but utilizes both hydrogen bond energy
and mainchain dihedral angles rather than hydrogen bonds alone. It
relies on database-derived recognition parameters with the
crystallographers' secondary structure definitions as a
standard-of-truth. Please see Frishman and Argos [1] for a detailed
description of the algorithm.


2. Copyright notice

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

For calculation of the residue solvent accessible area the program NSC
[3,4] is used; it was kindly provided by Dr. F. Eisenhaber
(EISENHABER@EMBL-HEIDELBERG.DE). Please direct all questions
specifically concerning accessibility calculations to him.


3. Availability

Executables of STRIDE for several UNIX platforms, VAX/VMS, OpenVMS,
DOS and Mac, together with documentation and source code, are
available by anonymous FTP from ftp.ebi.ac.uk (directories
/pub/software/unix/stride, /pub/software/dos/stride,
/pub/software/vms/stride, /pub/software/mac/stride). We are willing to
compile the program for other architectures if an interested user
grants us temporary access to them.

Data files with STRIDE secondary structure assignments for the current
release of the PDB [5] databank are in the directory
/pub/databases/stride of the same site. Atomic coordinate sets can be
submitted for secondary structure assignment through electronic mail
to stride@embl-heidelberg.de. A mail message containing HELP in the
first line will be answered with appropriate instructions. See also
the WWW page http://www.embl-heidelberg.de/stride/stride_info.html.


4. Installation

For UNIX, DOS and Mac no installation is needed. Just download the
executable corresponding to your platform, and you are all set. For
VAX and OpenVMS you need only to link the executable with a logical
name; for example:

    yourlogicalname:= $ $yourdiskname:[your.directory.name]stride.exe

and then use yourlogicalname as the program name.


5. Using STRIDE

The only required parameter for STRIDE is the name of the file
containing a set of atomic coordinates in PDB [5] format. By default
STRIDE writes to standard output, i.e. your screen. On systems that
support output redirection you can redirect the output to create a
disk file. Help is available if you just type STRIDE without
parameters. The following options are accepted:

  -fFilename    Write output to the file "Filename" rather than to
                stdout.

  -h            Report hydrogen bonds. By default no hydrogen bond
                information is included in the output.

  -o            Report secondary structure summary only.

  -rId1Id2..    Read only chains Id1, Id2 etc. of the PDB file *). All
                other chains will be ignored. By default all valid
                protein chains are read.

  -cId1Id2..    Process only chains Id1, Id2 etc. *). Secondary
                structure assignment will be produced only for these
                chains, but other chains that are present will be taken
                into account while calculating residue accessible
                surface and detecting inter-chain hydrogen bonds and,
                possibly, inter-chain beta-sheets. By default all
                protein chains read are processed.

  -mFilename    Generate a Molscript [6] file. Using the program
                Molscript by Per Kraulis you can create a PostScript
                picture of your structure. You can manually edit the
                Molscript file produced by STRIDE to achieve the
                desired orientation and to include additional details.

  -q[Filename]  Generate a sequence file in FASTA [7] format and exit.
                Filename is optional. If no file name is specified,
                standard output is used.

All options are case- and position-insensitive.

Examples:

1. Calculate secondary structure assignment for 1ACP including
   hydrogen bond information:

     stride 1acp.brk -h

2. Calculate secondary structure assignment for 4RUB and write the
   output to the file 4rub.str:

     stride 4rub.brk -f4rub.str

3. Calculate secondary structure assignment for chain B of 4RUB.
   Ignore all other chains. Generate a Molscript file 4rub.mol.

     stride 4rub.brk -rb -m4rub.mol

4. Calculate secondary structure assignment for chain C of 2GLS in
   the presence of chains A and B. Report secondary structure
   summary only.

     stride 2gls.brk -rabc -cc -o

Examples for contact order calculations:

1. Whole PDB entry:

     stride 1bed.brk -$ -k

2. Residue range:

     stride 2gsq.brk -$ -k -x76 -y202

3. Single chain:

     stride 1alv.brk -$ -k -ra

4. Residue range in a chain:

     stride 1bmt.brk -$ -k -ra -x651 -y740


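The command lines shown in this section can also be assembled from a
script. The following is a minimal Python sketch and not part of
STRIDE: it assumes an executable named "stride" is on the search path,
and the names stride_command, run_stride and their keyword parameters
are hypothetical. Only the documented options -h, -o, -r, -c and -f
are used.

```python
import subprocess


def stride_command(pdb_file, hbonds=False, summary_only=False,
                   read_chains="", process_chains="", out_file=None,
                   executable="stride"):
    """Build a STRIDE argument list from the documented options."""
    args = [executable, pdb_file]
    if hbonds:
        args.append("-h")                    # report hydrogen bonds
    if summary_only:
        args.append("-o")                    # secondary structure summary only
    if read_chains:
        args.append("-r" + read_chains)      # e.g. "-rabc"
    if process_chains:
        args.append("-c" + process_chains)   # e.g. "-cc"
    if out_file:
        args.append("-f" + out_file)         # write to a file, not stdout
    return args


def run_stride(pdb_file, **options):
    """Run STRIDE (assumed to be installed) and return its standard output."""
    result = subprocess.run(stride_command(pdb_file, **options),
                            capture_output=True, text=True, check=True)
    return result.stdout


# Example 4 of the first list: chain C of 2GLS with chains A and B present.
print(" ".join(stride_command("2gls.brk", read_chains="abc",
                              process_chains="c", summary_only=True)))
```

Because the options are position-insensitive, the order in which the
sketch appends them should not matter. Building the command as a list
avoids shell quoting problems with unusual file names.
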
6. Output format

STRIDE produces output that is easily readable both visually and with
computer programs. The side effect of this convenience is the larger
file size of individual STRIDE entries. Every record is 79 symbols
long and has the following general format:

  Position   Description

  1-3        Record code
  4-5        Not used
  6-73       Data
  74-75      Not used
  76-79      Four letter PDB code (if available)

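The general layout can be read with plain column slicing. The sketch
below is an illustration only, written in Python; the names Record and
split_record are hypothetical, and it assumes that the four letter PDB
code occupies the last four columns (76-79) of the 79-symbol record.
Column numbers in the layout are 1-based, whereas Python slices are
0-based.

```python
from typing import NamedTuple


class Record(NamedTuple):
    code: str      # columns 1-3
    data: str      # columns 6-73
    pdb_code: str  # columns 76-79 (assumed), empty if not available


def split_record(line: str) -> Record:
    """Split one STRIDE output line by fixed column positions."""
    line = line.rstrip("\n").ljust(79)   # pad short lines to full width
    return Record(code=line[0:3],
                  data=line[5:73],
                  pdb_code=line[75:79].strip())


# A made-up remark record laid out according to the general format.
example = "REM  " + "Some remark".ljust(68) + "  " + "1ACP"
record = split_record(example)
print(record.code, repr(record.data.rstrip()), record.pdb_code)
```

The data field is returned unstripped so that the column positions
given for the individual record types remain meaningful inside it.
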
Below follows the description of each record type.

  Code   Description and format of data

  REM    Remarks and blank lines

         Format: free

  HDR    Header. Protein name, date of file creation and PDB code

         Format: free

  CMP    Compound. Full name of the molecule and identifying
         information

         Format: free

  SRC    Species, organ, tissue, and mutant from which the molecule
         has been obtained

         Format: free

  AUT    Names of the structure authors

         Format: free

  CHN    File name and PDB chain identifier *).

         Format: File name beginning from position 6 followed
                 by one space and one-letter chain identifier

  SEQ    Amino acid sequence

         Format:  6-9    First residue PDB number
                  11-60  Sequence
                  62-65  Last residue PDB number

  STR    Secondary structure summary

         Format:  11-60  Secondary structure assignment **)

  LOC    Location of secondary structure elements

         Format:  6-17   Element name
                  19-21  First residue name
                  23-26  First residue PDB number
                  28-28  First residue chain identifier
                  36-38  Last residue name
                  42-45  Last residue PDB number
                  47-47  Last residue chain identifier

  ASG    Detailed secondary structure assignment

         Format:  6-8    Residue name
                  10-10  Protein chain identifier
                  12-15  PDB residue number
                  17-20  Ordinal residue number
                  25-25  One letter secondary structure code **)
                  27-39  Full secondary structure name
                  43-49  Phi angle
                  53-59  Psi angle
                  65-69  Residue solvent accessible area

  DNR    Donor residue

         Format:  6-8    Donor residue name
                  10-10  Protein chain identifier
                  12-15  PDB residue number
                  17-20  Ordinal residue number
                  26-28  Acceptor residue name
                  30-30  Protein chain identifier
                  32-35  PDB residue number
                  37-40  Ordinal residue number
                  42-45  N..O distance
                  47-52  N..O=C angle
                  54-59  O..N-C angle
                  61-66  Angle between the planes of donor
                         complex and O..N-C
                  68-73  Angle between the planes of acceptor
                         complex and N..O=C

  ACC    Acceptor residue

         Format:  6-8    Acceptor residue name
                  10-10  Protein chain identifier
                  12-15  PDB residue number
                  17-20  Ordinal residue number
                  26-28  Donor residue name
                  30-30  Protein chain identifier
                  32-35  PDB residue number
                  37-40  Ordinal residue number
                  42-45  N..O distance
                  47-52  N..O=C angle
                  54-59  O..N-C angle
                  61-66  Angle between the planes of donor
                         complex and O..N-C
                  68-73  Angle between the planes of acceptor
                         complex and N..O=C


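As an illustration of the column positions listed for the ASG record,
the following Python sketch reads one such record into a small data
class. It is not part of STRIDE; the names Assignment and parse_asg
are hypothetical, and the example line contains made-up values that
were laid out by hand according to the ASG format.

```python
from dataclasses import dataclass


@dataclass
class Assignment:
    residue: str     # columns 6-8
    chain: str       # column 10
    pdb_number: str  # columns 12-15, kept as text
    ordinal: int     # columns 17-20
    code: str        # column 25
    name: str        # columns 27-39
    phi: float       # columns 43-49
    psi: float       # columns 53-59
    area: float      # columns 65-69


def parse_asg(line: str) -> Assignment:
    """Parse one ASG record by fixed columns (1-based in the format list)."""
    if line[:3] != "ASG":
        raise ValueError("not an ASG record")
    line = line.ljust(79)
    return Assignment(
        residue=line[5:8].strip(),
        chain=line[9],
        pdb_number=line[11:15].strip(),
        ordinal=int(line[16:20]),
        code=line[24],
        name=line[26:39].strip(),
        phi=float(line[42:49]),
        psi=float(line[52:59]),
        area=float(line[64:69]),
    )


# Made-up ASG record assembled field by field.
example = ("ASG  " + "THR" + " " + "A" + " " + "   1" + " " + "   1"
           + "    " + "C" + " " + "Coil".rjust(13) + "   " + " 360.00"
           + "   " + " 137.48" + "     " + " 94.5" + "      " + "1CRN")
print(parse_asg(example))
```

The PDB residue number is kept as text rather than converted to an
integer, since residue numbers in PDB files are not guaranteed to be
plain integers.
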
HDR, CMP, SRC and AUT records are directly copied from the PDB file,
if supplied by the authors. If only the secondary structure summary is
requested, only CHN, SEQ, STR and LOC records will be output.
Hydrogen bond information (records DNR and ACC) is deliberately
redundant to facilitate human reading and is not reported by
default.


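The DNR and ACC records share one layout, differing only in which
residue comes first. A minimal Python sketch of a reader for both is
given below. The names HBond, parse_hbond and donor_acceptor are
hypothetical, and the example record is made up and laid out by hand
according to the DNR format.

```python
from typing import NamedTuple


class HBond(NamedTuple):
    record: str      # "DNR" or "ACC"
    residue: tuple   # (name, chain, PDB number, ordinal number)
    partner: tuple   # the same four items for the bonded residue
    distance: float  # N..O distance, columns 42-45
    angles: tuple    # the four angle fields, columns 47-73


def _residue(line, start):
    """Read name, chain, PDB number and ordinal number from one block."""
    return (line[start:start + 3].strip(), line[start + 4],
            line[start + 6:start + 10].strip(),
            int(line[start + 11:start + 15]))


def parse_hbond(line: str) -> HBond:
    """Parse one DNR or ACC record by fixed columns."""
    code = line[:3]
    if code not in ("DNR", "ACC"):
        raise ValueError("not a DNR or ACC record")
    line = line.ljust(79)
    angles = tuple(float(line[a:b])
                   for a, b in ((46, 52), (53, 59), (60, 66), (67, 73)))
    return HBond(code, _residue(line, 5), _residue(line, 25),
                 float(line[41:45]), angles)


def donor_acceptor(bond: HBond):
    """Return (donor, acceptor) regardless of the record type."""
    if bond.record == "DNR":
        return bond.residue, bond.partner
    return bond.partner, bond.residue


# Made-up DNR record assembled field by field.
example = ("DNR  " + "ALA" + " " + "A" + " " + "   5" + " " + "   4"
           + "     " + "GLU" + " " + "A" + " " + "   1" + " " + "   0"
           + " " + " 3.1" + " " + " 150.0" + " " + " 120.0"
           + " " + "  10.0" + " " + "  20.0")
print(donor_acceptor(parse_hbond(example)))
```

Normalizing every record to a (donor, acceptor) pair is one way to
deal with the redundancy when the same bond is met in both record
types.
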
*) IMPORTANT NOTE: if the protein chain identifier is ' ' (space), it
   will be substituted by '-' (dash) everywhere in the STRIDE output.
   The same is true for command line parameters involving chain
   identifiers where you have to specify '-' instead of ' '.

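In a script, the substitution described in this note could be handled
by two small helpers, sketched here in Python. Both function names are
hypothetical, and the reverse mapping assumes that no chain in the PDB
file uses '-' as its real identifier.

```python
def chain_option(flag: str, chains: str) -> str:
    """Build a -r or -c option, writing blank chain identifiers as '-'."""
    if flag not in ("-r", "-c"):
        raise ValueError("expected -r or -c")
    return flag + chains.replace(" ", "-")


def stride_chain_to_pdb(chain: str) -> str:
    """Map a chain identifier from STRIDE output back to the PDB one."""
    return " " if chain == "-" else chain


# A file with one unnamed chain and a chain A.
print(chain_option("-r", " A"))
```
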
**) One-letter secondary structure code is nearly the same as used in
    DSSP [2] (see Frishman and Argos [1] for details):

      H        Alpha helix
      G        3-10 helix
      I        PI-helix
      E        Extended conformation
      B or b   Isolated bridge
      T        Turn
      C        Coil (none of the above)

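The code table translates directly into a lookup table. The Python
sketch below, with the hypothetical names SS_NAMES and composition,
counts how often each state occurs in a string of one-letter codes,
for example codes collected from ASG records. Treating 'b' as
equivalent to 'B' is a simplification made for this illustration.

```python
from collections import Counter

# One-letter codes and their names, as listed in the table above.
SS_NAMES = {
    "H": "Alpha helix",
    "G": "3-10 helix",
    "I": "PI-helix",
    "E": "Extended conformation",
    "B": "Isolated bridge",
    "T": "Turn",
    "C": "Coil (none of the above)",
}


def composition(codes):
    """Count one-letter codes by name, treating 'b' the same as 'B'."""
    counts = Counter("B" if c == "b" else c for c in codes)
    unknown = set(counts) - set(SS_NAMES)
    if unknown:
        raise ValueError(f"unexpected codes: {sorted(unknown)}")
    return {SS_NAMES[c]: n for c, n in counts.items()}


# Made-up string of codes.
print(composition("HHHHTTCCEEb"))
```
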
For each record (data line) except those with codes REM and STR the
number of fields is consistent, so the records are readily suitable
for processing with external tools such as awk, perl, etc.


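The same distinction applies in any scripting language: records with a
consistent number of fields can be split on whitespace, as awk does by
default, while STR data are safer to read by column. The Python sketch
below illustrates both. The function names are hypothetical, the
example records are made up, and the sketch assumes that each STR
record follows the SEQ record it annotates and that STR data may
contain blanks. Records of all chains are simply concatenated.

```python
def fields(line: str) -> list:
    """Split a record on whitespace, as awk does by default."""
    return line.split()


def sequence_and_summary(lines):
    """Join SEQ and STR data (columns 11-60) across records."""
    seq_parts, str_parts = [], []
    for line in lines:
        if line.startswith("SEQ"):
            seq_parts.append(line[10:60].rstrip())
        elif line.startswith("STR"):
            # Pad or trim so the summary matches the preceding sequence.
            width = len(seq_parts[-1]) if seq_parts else 50
            str_parts.append(line[10:60].ljust(width)[:width])
    return "".join(seq_parts), "".join(str_parts)


# Made-up SEQ and STR records laid out by the documented columns.
seq_line = "SEQ  " + "1   " + " " + "TTCCPSIVAR".ljust(50) + " " + "  10"
str_line = "STR" + " " * 7 + " EE   HHHH"
sequence, summary = sequence_and_summary([seq_line, str_line])
print(sequence)
print(summary)
```

Padding the summary to the length of the sequence keeps the two
strings aligned position by position even when trailing blanks were
lost from the STR record.
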
7. Bug reports and user feedback

Please send your suggestions, questions and bug reports to
FRISHMAN@EMBL-HEIDELBERG.DE. Send your contact address to get
information on updates and new features.


8. References

1. Frishman, D. & Argos, P. (1995) Knowledge-based secondary structure
   assignment. Proteins: Structure, Function, and Genetics 23,
   566-579.

2. Kabsch, W. & Sander, C. (1983) Dictionary of protein secondary
   structure: pattern recognition of hydrogen-bonded and
   geometrical features. Biopolymers 22, 2577-2637.

3. Eisenhaber, F. & Argos, P. (1993) Improved strategy in
   analytic surface calculation for molecular systems: handling of
   singularities and computational efficiency. J. Comput. Chem. 14,
   1272-1280.

4. Eisenhaber, F., Lijnzaad, P., Argos, P., Sander, C. & Scharf,
   M. (1995) The double cubic lattice method: efficient approaches
   to numerical integration of surface area and volume and to dot
   surface contouring of molecular assemblies. J. Comput. Chem. 16,
   273-284.

5. Bernstein, F.C., Koetzle, T.F., Williams, G.J., Meyer, E.F.,
   Brice, M.D., Rodgers, J.R., Kennard, O., Shimanouchi, T. &
   Tasumi, M. (1977) The protein data bank: a computer-based
   archival file for macromolecular structures. J. Mol. Biol. 112,
   535-542.

6. Kraulis, P.J. (1991) MOLSCRIPT: a program to produce both
   detailed and schematic plots of protein structures. J. Appl.
   Cryst. 24, 946-950.

7. Pearson, W.R. (1990) Rapid and sensitive sequence comparison
   with FASTP and FASTA. Methods Enzymol. 183, 63-98.