You cannot select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
395 lines
13 KiB
Plaintext
395 lines
13 KiB
Plaintext
11 years ago
|
|
||
|
|
||
|
|
||
|
|
||
|
|
||
|
|
||
|
|
||
|
STRIDE: Protein secondary structure assignment
|
||
|
from atomic coordinates
|
||
|
|
||
|
Dmitrij Frishman & Patrick Argos
|
||
|
|
||
|
European Molecular Biology Laboratory
|
||
|
Postfach 102209, Meyerhofstr. 1
|
||
|
69012 Heidelberg
|
||
|
Germany
|
||
|
|
||
|
FRISHMAN@EMBL-HEIDELBERG.DE
|
||
|
ARGOS@EMBL-HEIDELBERG.DE
|
||
|
|
||
|
|
||
|
CONTENTS
|
||
|
|
||
|
|
||
|
1. About the method
|
||
|
|
||
|
2. Copyright notice
|
||
|
|
||
|
3. Availability
|
||
|
|
||
|
4. Installation
|
||
|
|
||
|
5. Using STRIDE
|
||
|
|
||
|
6. Output format
|
||
|
|
||
|
7. Bug reports and user feedback
|
||
|
|
||
|
8. References
|
||
|
|
||
|
|
||
|
----------------------------------------------------------------------
|
||
|
|
||
|
|
||
|
1. About the method
|
||
|
|
||
|
|
||
|
STRIDE [1] is a program to recognize secondary structural elements in
|
||
|
proteins from their atomic coordinates. It performs the same task as
|
||
|
DSSP by Kabsch and Sander [2] but utilizes both hydrogen bond energy
|
||
|
and mainchain dihedral angles rather than hydrogen bonds alone. It
|
||
|
relies on database-derived recognition parameters with the
|
||
|
crystallographers' secondary structure definitions as a standard-of-
|
||
|
truth. Please see Frishman and Argos [1] for detailed description of
|
||
|
the algorithm.
|
||
|
|
||
|
|
||
|
|
||
|
2. Copyright notice
|
||
|
|
||
|
|
||
|
All rights reserved, whether the whole or part of the program is
|
||
|
concerned. Permission to use, copy, and modify this software and its
|
||
|
documentation is granted for academic use, provided that:
|
||
|
|
||
|
|
||
|
i. this copyright notice appears in all copies of the software and
|
||
|
related documentation;
|
||
|
|
||
|
ii. the reference given below (Frishman and Argos, 1995) must be
|
||
|
cited in any publication of scientific results based in part or
|
||
|
completely on the use of the program;
|
||
|
|
||
|
iii. bugs will be reported to the authors.
|
||
|
|
||
|
The use of the software in commercial activities is not allowed
|
||
|
without a prior written commercial license agreement.
|
||
|
|
||
|
WARNING: STRIDE is provided "as-is" and without warranty of any kind,
|
||
|
express, implied or otherwise, including without limitation any
|
||
|
warranty of merchantability or fitness for a particular purpose. In no
|
||
|
event will the authors be liable for any special, incidental, indirect
|
||
|
or consequential damages of any kind, or any damages whatsoever
|
||
|
resulting from loss of data or profits, whether or not advised of the
|
||
|
possibility of damage, and on any theory of liability, arising out of
|
||
|
or in connection with the use or performance of this software.
|
||
|
|
||
|
For calculation of the residue solvent accessible area the program NSC
|
||
|
[3,4] is used and was kindly provided by Dr. F.Eisenhaber
|
||
|
(EISENHABER@EMBL-HEIDELBERG.DE). Please direct to him all questions
|
||
|
concerning specifically accessibility calculations.
|
||
|
|
||
|
|
||
|
|
||
|
3. Availability
|
||
|
|
||
|
|
||
|
Executables of STRIDE for several UNIX platforms, VAX/VMS, OpenVMS,
|
||
|
Dos and Mac together with documentation and source code are available
|
||
|
by anonymous FTP from ftp.ebi.ac.uk (directories
|
||
|
/pub/software/unix/stride, /pub/software/dos/stride,
|
||
|
/pub/software/vms/stride, /pub/software/mac/stride). We are willing to
|
||
|
compile the program for other architectures if temporary access to
|
||
|
them will be granted by an interested user.
|
||
|
|
||
|
Data files with STRIDE secondary structure assignments for the current
|
||
|
release of the PDB [5] databank are in the directory
|
||
|
/pub/databases/stride of the same site. Atomic coordinate sets can be
|
||
|
submitted for secondary structure assignment through electronic mail
|
||
|
to stride@embl-heildelberg.de. A mail message containing HELP in the
|
||
|
first line will be answered with appropriate instructions. See also
|
||
|
WWW page http://www.embl-heidelberg.de/stride/stride_info.html.
|
||
|
|
||
|
|
||
|
|
||
|
4. Installation
|
||
|
|
||
|
|
||
|
For UNIX, DOS and Mac no installation is needed. Just download the
|
||
|
executable corresponding to your platform, and you are all set. For
|
||
|
VAX and OpenVMS you need only to link the executable with a logical
|
||
|
name; for example:
|
||
|
|
||
|
yourlogicalname:= $ $yourdiskname:[your.directory.name]stride.exe
|
||
|
|
||
|
and then use yourlogicalname as the program name.
|
||
|
|
||
|
|
||
|
|
||
|
5. Using STRIDE
|
||
|
|
||
|
|
||
|
The only required parameter for STRIDE is the name of the file
|
||
|
containing a set of atomic coordinates in PDB [5] format. By default
|
||
|
STRIDE writes to standard output, i.e. your screen. On systems that
|
||
|
allow to redirect output you can do so to create a disk file. Help is
|
||
|
available if you just type STRIDE without parameters. The following
|
||
|
options are accepted:
|
||
|
|
||
|
|
||
|
-fFilename Write output to the file "Filename" rather than to
|
||
|
stdout.
|
||
|
|
||
|
|
||
|
-h Report hydrogen bonds. By default no hydrogen bond
|
||
|
information is included in the output.
|
||
|
|
||
|
-o Report secondary structure summary only.
|
||
|
|
||
|
-rId1Id2.. Read only chains Id1, Id2 etc. of the PDB file *). All
|
||
|
other chains will be ignored. By default all valid
|
||
|
protein chains are read.
|
||
|
|
||
|
-cId1Id2.. Process only chains Id1, Id2 ...etc *). Secondary
|
||
|
structure assignment will be produced only for these
|
||
|
chains, but other chains that are present will be taken
|
||
|
into account while calculating residue accessible
|
||
|
surface and detecting inter-chain hydrogen bonds and,
|
||
|
possibly, interchain beta-sheets. By default all
|
||
|
protein chains read are processed.
|
||
|
|
||
|
-mFilename Generate a Molscript [6] file. Using the program
|
||
|
Molscript by Per Craulis you can create a postscript
|
||
|
picture of your structure. You can manually edit the
|
||
|
Molscript file produced by STRIDE to achieve the
|
||
|
desired orientation and to include additional details.
|
||
|
|
||
|
-q[Filename] Generate sequence file in FASTA [7] format and die.
|
||
|
Filename is optional. If no file name is specified,
|
||
|
stdandard output is used.
|
||
|
|
||
|
All options are case- and position-insensitive.
|
||
|
|
||
|
Examples:
|
||
|
|
||
|
|
||
|
1. Calculate secondary structure assignment for 1ACP including
|
||
|
hydrogen bond information:
|
||
|
|
||
|
stride 1acp.brk -h
|
||
|
|
||
|
2. Calculate secondary structure assignment for 4RUB and write the
|
||
|
output to the file 4rub.str
|
||
|
|
||
|
stride 4rub.brk -f4rub.str
|
||
|
|
||
|
3. Calculate secondary structure assignment for chain B of 4RUB.
|
||
|
Ignore all other chains. Generate a Molscript file 4rub.mol.
|
||
|
|
||
|
stride 4rub.brk -rb -m4rub.mol
|
||
|
|
||
|
4. Calculate secondary structure assignment for chain C of 2GLS in
|
||
|
the presence of chains A and B. Report secondary structure
|
||
|
summary only.
|
||
|
|
||
|
stride 2gls.brk -rabc -cc -o
|
||
|
|
||
|
|
||
|
|
||
|
6. Output format
|
||
|
|
||
|
|
||
|
STRIDE produces output that is easily readable both visually and with
|
||
|
computer programs. The side effect of this conveniency is larger file
|
||
|
size of individual STRIDE entries. Every record is 79 symbols long and
|
||
|
has the following general format:
|
||
|
|
||
|
Position Description
|
||
|
|
||
|
1-3 Record code
|
||
|
4-5 Not used
|
||
|
6-73 Data
|
||
|
74-75 Not used
|
||
|
75-79 Four letter PDB code (if available)
|
||
|
|
||
|
Below follows the description of each record type.
|
||
|
|
||
|
|
||
|
Code Description and format of data
|
||
|
|
||
|
REM Remarks and blank lines
|
||
|
|
||
|
Format: free
|
||
|
|
||
|
HDR Header. Protein name, date of file creation and PDB code
|
||
|
|
||
|
Format: free
|
||
|
|
||
|
CMP Compound.Full name of the molecule and identifying
|
||
|
information
|
||
|
|
||
|
Format: free
|
||
|
|
||
|
SRC Species, organ, tissue, and mutant from which the molecule
|
||
|
has been obtained
|
||
|
|
||
|
Format: free
|
||
|
|
||
|
AUT Names of the structure authors
|
||
|
|
||
|
Format: free
|
||
|
|
||
|
CHN File name and PDB chain identifier*).
|
||
|
|
||
|
Format: File name beginning from position 6 followed
|
||
|
by one space and one-letter chain identifier
|
||
|
|
||
|
SEQ Amino acid sequence
|
||
|
|
||
|
Format: 6-9 First residue PDB number
|
||
|
11-60 Sequence
|
||
|
62-65 Last residue PDB number
|
||
|
|
||
|
STR Secondary structure summary
|
||
|
|
||
|
Format: 11-60 Secondary structure assignment **)
|
||
|
|
||
|
LOC Location of secondary structure elements
|
||
|
|
||
|
Format: 6-17 Element name
|
||
|
19-21 First residue name
|
||
|
32-26 First residue PDB number
|
||
|
28-28 First residue chain identifier
|
||
|
36-38 Last residue name
|
||
|
42-45 Last residue PDB number
|
||
|
47-47 Last residue chain identifier
|
||
|
|
||
|
ASG Detailed secondary structure assignment
|
||
|
|
||
|
Format: 6-8 Residue name
|
||
|
10-10 Protein chain identifier
|
||
|
12-15 PDB residue number
|
||
|
17-20 Ordinal residue number
|
||
|
25-25 One letter secondary structure code **)
|
||
|
27-39 Full secondary structure name
|
||
|
43-49 Phi angle
|
||
|
53-59 Psi angle
|
||
|
65-69 Residue solvent accessible area
|
||
|
|
||
|
DNR Donor residue
|
||
|
|
||
|
Format: 6-8 Donor residue name
|
||
|
10-10 Protein chain identifier
|
||
|
12-15 PDB residue number
|
||
|
17-20 Ordinal residue number
|
||
|
26-28 Acceptor residue name
|
||
|
30-30 Protein chain identifier
|
||
|
32-35 PDB residue number
|
||
|
37-40 Ordinal residue number
|
||
|
42-45 N..0 distance
|
||
|
47-52 N..O=C angle
|
||
|
54-59 O..N-C angle
|
||
|
61-66 Angle between the planes of donor
|
||
|
complex and O..N-C
|
||
|
68-73 angle between the planes of acceptor
|
||
|
complex and N..O=C
|
||
|
ACC Acceptor residue
|
||
|
|
||
|
Format: 6-8 Acceptor residue name
|
||
|
10-10 Protein chain identifier
|
||
|
12-15 PDB residue number
|
||
|
17-20 Ordinal residue number
|
||
|
26-28 Donor residue name
|
||
|
30-30 Protein chain identifier
|
||
|
32-35 PDB residue number
|
||
|
37-40 Ordinal residue number
|
||
|
42-45 N..0 distance
|
||
|
47-52 N..O=C angle
|
||
|
54-59 O..N-C angle
|
||
|
61-66 Angle between the planes of donor
|
||
|
complex and O..N-C
|
||
|
68-73 angle between the planes of acceptor
|
||
|
complex and N..O=C
|
||
|
|
||
|
|
||
|
|
||
|
|
||
|
|
||
|
HDR, CMP, SCR and AUT records are directly copied from the PDB file,
|
||
|
if supplied by the authors. If only the secondary structure summary is
|
||
|
requested, only CHN, SEQ, STR and LOC records will be output.
|
||
|
Hydrogen bond information (records DNR and ACC) was made very
|
||
|
redundant to facilitate human reading and will not be reported by
|
||
|
default.
|
||
|
|
||
|
|
||
|
*) IMPORTANT NOTE: if the protein chain identifier is ' ' (space), it
|
||
|
will be substituted by '-' (dash) everywhere in the STRIDE output.
|
||
|
The same is true for command line parameters involving chain
|
||
|
identifiers where you have to specify '-' instead of ' '.
|
||
|
|
||
|
**) One-letter secondary structure code is nearly the same as used in
|
||
|
DSSP [2] (see Frishman and Argos [1] for details):
|
||
|
|
||
|
H Alpha helix
|
||
|
G 3-10 helix
|
||
|
I PI-helix
|
||
|
E Extended conformation
|
||
|
B or b Isolated bridge
|
||
|
T Turn
|
||
|
C Coil (none of the above)
|
||
|
|
||
|
|
||
|
For each record (data line) except those with codes REM and STR the
|
||
|
number of fields is consistent and is readily suitable for processing
|
||
|
with external tools, such as awk, perl, etc.
|
||
|
|
||
|
|
||
|
|
||
|
7. Bug reports and user feedback
|
||
|
|
||
|
|
||
|
Please send your suggestions, questions and bug reports to
|
||
|
FRISHMAN@EMBL-HEIDELBERG.DE. Send your contact address to get
|
||
|
information on updates and new features.
|
||
|
|
||
|
|
||
|
|
||
|
8. References
|
||
|
|
||
|
|
||
|
|
||
|
1. Frishman,D & Argos,P. (1995) Knowledge-based secondary structure
|
||
|
assignment. Proteins: structure, function and genetics, 23,
|
||
|
566-579.
|
||
|
|
||
|
2. Kabsch,W. & Sander,C. (1983) Dictionary of protein secondary
|
||
|
structure: pattern recognition of hydrogen-bonded and
|
||
|
geometrical features. Biopolymers, 22: 2577-2637.
|
||
|
|
||
|
3. Eisenhaber, F. and Argos, P. (1993) Improved strategy in
|
||
|
analytic surface calculation for molecular systems: handling of
|
||
|
singularities and computational efficiency. J. comput. Chem. 14,
|
||
|
1272-1280.
|
||
|
|
||
|
4. Eisenhaber, F., Lijnzaad, P., Argos, P., Sander, C., and Scharf,
|
||
|
M. (1995) The double cubic lattice method: efficient approaches
|
||
|
to numerical integration of surface area and volume and to dot
|
||
|
surface contouring of molecular assemblies. J. comput. Chem. 16,
|
||
|
273-284.
|
||
|
|
||
|
5. Bernstein, F.C., Koetzle, T.F., Williams, G.J., Meyer, E.F.,
|
||
|
Brice, M.D., Rodgers, J.R., Kennard, O., Shimanouchi, T., and
|
||
|
Tasumi, M. (1977) The protein data bank: a computer-based
|
||
|
archival file for macromolecular structures. J. Mol. Biol. 112,
|
||
|
535-542.
|
||
|
|
||
|
6. Kraulis, P.J. (1991) MOLSCRIPT: a program to produce both
|
||
|
detailed and schematic plots of protein structures. J. Appl.
|
||
|
Cryst. 24, 946-950.
|
||
|
|
||
|
7. Pearson, W.R. (1990) Rapid and sensitive sequence comparison
|
||
|
with FASTP and FASTA. Methods. Enzymol. 183, 63-98.
|
||
|
|