SC (CCP4: Supported Program)
NAME
sc - Determine Sc shape complementarity of two interacting molecular surfaces
SYNOPSIS
sc XYZIN foo.pdb [ SCRADII radii.lib ] [ SURFIN1 foo1_in.srf SURFIN2 foo2_in.srf SURFOUT1 foo1_out.srf SURFOUT2 foo2_out.srf ]
The shape correlation statistic Sc (Lawrence and Colman, 1993) can be used to quantify the shape complementarity of protein/protein interfaces and give an idea of the "goodness of fit" between two protein surfaces. The program SC will calculate values of Sc and related statistics for the interface region between two molecules in a Brookhaven coordinate file.
The input comprises three sections:
The molecule definition commands are used to select which atoms in the input file are to make up the two individual molecules for the Sc calculation. Entries for this section appear twice, once for each molecule (see EXAMPLES):
The default values for the parameters are set inside the program at compilation time (in the file defaults.h), and should be suitable for most applications. In particular you should avoid using different values for PROBE_RADIUS, TRIM and WEIGHT if you intend to compare your values of Sc with the results of other calculations, or with values found in the literature.
These commands are only required if you want to merge the results of the Sc calculations with existing GRASP surface files for the purposes of graphical display.
See NOTES ON GRASP FILES if you intend to use the merging facility.
MOLECULE <n>
Selects which molecule the subsequent selection applies to; <n> is either 1 or 2. This keyword is followed by a combination of CHAIN, ZONE, AT_EXCL and/or AT_INCL keywords, which select the atoms to be included in the molecule. These subsequent keywords are applied sequentially, in the order given.
CHAIN <chn>
Include a particular chain. All atoms in chain <chn> will be included in the selected molecule.
ZONE [ <chn1> ] <res1> [ <chn2> ] <res2>
Include a zone of residues. All atoms in and between the named residues will be included in the selected molecule. The chain names <chn1> and <chn2> should be omitted if the chain identifier field is blank within the coordinate file. <res1> and <res2> define the residue names (not types) that delimit the selected zone.
AT_EXCL [ <chn> ] <res> <atm>
Exclude a particular atom. The atom identified by chain <chn>, residue name <res> and atom name <atm> will be excluded from the selected molecule. The chain name <chn> should be omitted if the chain identifier field is blank within the coordinate file.
AT_INCL [ <chn> ] <res> <atm>
Include a particular atom. The atom identified by chain <chn>, residue name <res> and atom name <atm> will be included in the selected molecule. The chain name <chn> should be omitted if the chain identifier field is blank within the coordinate file.
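The sequential nature of the selection keywords can be illustrated with a minimal Python sketch. This is not the code used by SC; the Atom record, the tuple form of the commands and the treatment of a zone as a contiguous run of atoms in file order are all assumptions made for illustration.

```python
from typing import NamedTuple

class Atom(NamedTuple):
    # Hypothetical record: chain identifier, residue name, atom name.
    chain: str
    res: str
    name: str

def select_atoms(atoms, commands):
    """Apply CHAIN/ZONE/AT_EXCL/AT_INCL commands one after another."""
    chosen = set()  # indices into `atoms`
    for kind, *args in commands:
        if kind == "CHAIN":
            chosen |= {i for i, a in enumerate(atoms) if a.chain == args[0]}
        elif kind == "ZONE":
            chn1, res1, chn2, res2 = args
            first = [i for i, a in enumerate(atoms) if (a.chain, a.res) == (chn1, res1)]
            last = [i for i, a in enumerate(atoms) if (a.chain, a.res) == (chn2, res2)]
            if not first or not last:
                raise ValueError("zone delimiter not found")
            # Assumption: the zone is everything between the delimiters in file order.
            chosen |= set(range(min(first), max(last) + 1))
        elif kind in ("AT_EXCL", "AT_INCL"):
            match = {i for i, a in enumerate(atoms) if tuple(a) == tuple(args)}
            chosen = chosen - match if kind == "AT_EXCL" else chosen | match
        else:
            raise ValueError(f"unknown keyword {kind}")
    return [atoms[i] for i in sorted(chosen)]

atoms = [Atom("A", "1", "N"), Atom("A", "1", "CA"), Atom("A", "2", "N"),
         Atom("A", "2", "CA"), Atom("B", "1", "N"), Atom("B", "1", "CA")]

# Chain A, minus one atom, plus one atom from chain B.
mol1 = select_atoms(atoms, [("CHAIN", "A"),
                            ("AT_EXCL", "A", "2", "CA"),
                            ("AT_INCL", "B", "1", "N")])

# The same exclusion placed before CHAIN has no effect: order matters.
mol1_reordered = select_atoms(atoms, [("AT_EXCL", "A", "2", "CA"),
                                      ("CHAIN", "A")])
```

Because each keyword acts on the selection built so far, an AT_EXCL is only useful after the CHAIN or ZONE that brought the atom in.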
PROBE_RADIUS
[Default: 1.7 Å]
Sets the radius of the probe sphere which is used to define the solvent excluded surface.
Note: You should avoid changing the probe radius if you intend to cross-compare the results of the Sc calculation with values obtained elsewhere, as the comparison will be invalid if different probe radii are used.
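DOT_DENSITY
[Default: 15 dots/Å²]
Sets the sampling density of the discrete points used to represent the molecular surfaces.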
TRIM <trim>
[Default: 1.5 Å]
Sets the distance used to generate the peripheral band.
The peripheral band consists of those surface points which are part of the buried portion of the molecular surface but which lie within a distance <trim> of the non-buried (i.e. solvent accessible) surface. Points in the peripheral band are omitted from the calculations.
Note: You should avoid changing the width of the peripheral band if you intend to cross-compare the results of the Sc calculation with values obtained elsewhere, as the value of Sc depends on the width of the excluded band.
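The effect of the peripheral band can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the program's own routine: surface points are plain (x, y, z) tuples, the function name is hypothetical, and a point exactly at distance <trim> is treated as lying inside the band.

```python
import math

def trim_peripheral_band(buried, accessible, trim=1.5):
    """Keep only buried points farther than `trim` from every accessible point."""
    return [p for p in buried
            if all(math.dist(p, q) > trim for q in accessible)]

# Three buried points at 0.4, 1.4 and 3.4 Å from a single accessible point.
buried = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
accessible = [(-0.4, 0.0, 0.0)]
kept = trim_peripheral_band(buried, accessible)
```

With the default width only the third point survives, which shows why a small or convoluted interface can lose a large fraction of its points to the band.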
INTERFACE
[Default: 8 Å]
Distance determining which atoms are used in the calculations. See PROGRAM FUNCTION for details about this parameter before changing it.
WEIGHT
[Default: 0.5 Å⁻²]
This sets the value of the weighting factor used in the calculation of the surface complementarity function S(A->B). (See PRINTER OUTPUT for the definition of S(A->B).)
GRASP_MATCH <tol>
[Default: 1.5 Å]
The tolerance for equivalencing GRASP and SC surface points. The strategy employed by the program is to assign to each GRASP surface vertex the weighted normal dot product associated with the nearest Connolly surface point to that vertex. If no point employed within the Sc calculation is found within a distance <tol> of the vertex then the vertex is deemed to be part of the non-interacting surface. The value of <tol> will depend on the dot density and resolution of the respective surfaces. The non-interacting surfaces are assigned a general property 1 value assigned by the GRASP_BACKGROUND keyword (below).
GRASP_BACKGROUND
General Property 1 value for vertices that lie more than GRASP_MATCH from any Connolly point within the interacting surfaces. The aim is simply to set a distinctly different value that can then be displayed in a separate colour within GRASP.
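The matching strategy described for GRASP_MATCH and GRASP_BACKGROUND amounts to a nearest-neighbour lookup with a distance cutoff. A minimal Python sketch follows, assuming vertices and surface points are (x, y, z) tuples; the function name and the background value of -2.0 used in the example are hypothetical and are not taken from the program.

```python
import math

def map_values_to_vertices(vertices, sc_points, sc_values, tol, background):
    """Give each vertex the value of its nearest SC point, or `background`
    if no SC point lies within `tol` of it."""
    mapped = []
    for v in vertices:
        # Brute-force nearest-neighbour search; adequate for a sketch.
        dist, value = min((math.dist(v, p), s)
                          for p, s in zip(sc_points, sc_values))
        mapped.append(value if dist <= tol else background)
    return mapped

vertices = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
sc_points = [(0.5, 0.0, 0.0), (1.0, 0.0, 0.0)]
sc_values = [0.9, 0.2]
mapped = map_values_to_vertices(vertices, sc_points, sc_values,
                                tol=1.5, background=-2.0)
```

The first vertex picks up the value of the point 0.5 Å away, while the second is too far from any point and receives the background value.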
The program output includes the following loggraph tables for each of the molecules.
The shape correlation Sc is then defined as

    Sc = ( {S(A->B)} + {S(B->A)} ) / 2

where the braces denote the median of the S(A->B), S(B->A) distributions. (See Lawrence and Colman, 1993 for more detailed descriptions of these functions.)
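As a rough illustration of how the medians combine, the following Python sketch evaluates Sc for two small sets of surface points. It assumes that the weighted normal product takes the form S(A->B) = (n_A . n'_A) exp(-w d^2), where d is the distance from a point on surface A to its nearest point on surface B, n_A is the outward normal at the point on A and n'_A is the inward normal at that nearest point; this form should be checked against Lawrence and Colman, 1993. The function names are hypothetical, and normals are stored as outward unit vectors for both surfaces.

```python
import math
from statistics import median

def s_values(points_a, normals_a, points_b, normals_b, weight=0.5):
    """Weighted normal products S(A->B) for every point on surface A."""
    values = []
    for xa, na in zip(points_a, normals_a):
        # Nearest point on the other surface (brute force).
        j = min(range(len(points_b)), key=lambda k: math.dist(xa, points_b[k]))
        d = math.dist(xa, points_b[j])
        dot = sum(u * v for u, v in zip(na, normals_b[j]))
        # The outward normal of B is negated to give the inward normal.
        values.append(-dot * math.exp(-weight * d * d))
    return values

def shape_correlation(points_a, normals_a, points_b, normals_b, weight=0.5):
    """Sc as the mean of the medians of S(A->B) and S(B->A)."""
    return 0.5 * (median(s_values(points_a, normals_a, points_b, normals_b, weight))
                  + median(s_values(points_b, normals_b, points_a, normals_a, weight)))

# Two flat patches in perfect contact, with opposed outward normals.
pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
up = [(0.0, 0.0, 1.0)] * 3
down = [(0.0, 0.0, -1.0)] * 3
sc_contact = shape_correlation(pts, up, pts, down)

# The same patches separated by a 1 Å gap.
pts_gap = [(x, y, z + 1.0) for x, y, z in pts]
sc_gap = shape_correlation(pts, up, pts_gap, down)
```

In this toy case the patches in contact give Sc = 1, and opening a gap reduces the value through the exponential distance weighting alone.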
Interfaces with Sc = 1 will mesh precisely; interfaces with Sc approximately zero will effectively be uncorrelated in their topography.
Note that Sc may become rather meaningless when the buried area becomes small, and hence it may not be a good measure for small crystal contacts. This is simply because as the overall buried area becomes smaller and/or more convoluted or disjointed in shape, the percentage removed as part of the peripheral band increases substantially.
This program computes Sc between two molecules in a numerical fashion. The algorithm is fully detailed in Lawrence and Colman, 1993. Briefly: the molecular surfaces are represented as a series of discrete points (Connolly, 1983) of sufficiently high surface sampling density (set by the DOT_DENSITY keyword) and S(1->2) and S(2->1) are then evaluated at these points.
The interface surfaces are defined as being the portion of the molecular surface of molecule 1 which is buried from solvent by its interaction with molecule 2 (and vice versa). The molecular surface itself is defined (Richards, 1977) as the union of contact and re-entrant portions demarcated by a probe sphere of a given radius (set by the PROBE_RADIUS keyword).
Only atoms within the INTERFACE distance of any "buried" atoms (defined in the Connolly sense) are selected for initial surface computation. This parameter does not enter formally into the evaluation of Sc; its purpose is simply to speed up the computation by excluding from consideration atoms remote from the interface. The program in reality computes not the entire surface for the individual molecules, but rather only the surface for the subset of atoms within the INTERFACE distance of the other molecule. A portion of this surface is non-physical, as it is buried within the core of the individual molecule; however, its presence does not affect the computation of Sc as it is remote from the interaction. If there is any doubt about the validity of this approach for a particular molecule, the program should be rerun with a larger value for this parameter to ensure that the computation is stable. Subsequently, a peripheral band of buried points is removed if they lie within a distance TRIM of any solvent accessible surface points.
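The pre-selection of atoms by the INTERFACE distance is a simple distance filter, sketched below in Python. The function name and the representation of atoms as bare (x, y, z) tuples are assumptions for illustration; the identification of "buried" atoms is taken as given.

```python
import math

def atoms_near_interface(atoms, buried_atoms, interface=8.0):
    """Keep atoms lying within `interface` of at least one buried atom."""
    return [a for a in atoms
            if any(math.dist(a, b) <= interface for b in buried_atoms)]

atoms = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
buried_atoms = [(0.0, 0.0, 0.0)]

near_default = atoms_near_interface(atoms, buried_atoms)        # 8 Å cutoff
near_larger = atoms_near_interface(atoms, buried_atoms, 25.0)   # wider cutoff
```

Rerunning with a larger cutoff, as in the second call, only enlarges the set of atoms considered; if the resulting Sc is unchanged, the default cutoff was sufficient.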
Cross-comparison of Sc values between proteins (i.e. characterisation of surfaces as more or less complementary than other types of surface) is the main use of SC. This is only valid if the same values of the critical parameters (probe radius, width of the peripheral band, atomic radii, weighting factor) are used in both computations. To this end it is recommended that the default values of PROBE_RADIUS, TRIM and WEIGHT, and the atomic radii set in the sc_radii.lib file, be used, so that the results are comparable with literature values.
The program includes a modified version of Michael Connolly's subroutine "mds" for calculating molecular surfaces; the original code can be obtained from his website at http://www.biohedron.com/msp.html. The version contained in SC is provided here with the consent of Michael Connolly. The modifications include a minor bug fix, and use of the CCP4 library routines for exiting on fatal errors ("CCPERR") and for calculating vector products ("CROSS").
Sc itself cannot be computed satisfactorily within GRASP, as GRASP uses a rather different approach to surface definition. However qualitative display of the weighted normal products S(A->B) is possible - this is achieved by a simple mapping of this value from the one surface to the other.
There are however some limits to SC's interaction with GRASP. See the NOTES ON GRASP FILES below.
To the best of our knowledge, GRASP is only available for Silicon Graphics machines, and since the surface files it produces contain unformatted data these files are not generally portable to other systems, e.g. Digital Alphas.
SC will make a check on the compatibility of input surface files before trying to read them in. In cases where it detects a problem, the files will not be read in, no merging will be performed, and no output surface files will be generated. In these cases, if GRASP output is required it will be necessary to run SC on another machine which has compatible conventions for reading and writing unformatted data.
There have been some reports of bugs in GRASP 1.3.6 which have caused problems with the GRASP output from SC. Please let us know if you experience problems which might be due to such bugs.
It will be necessary to edit the radii file used by the program, if your input file contains atoms which are not in the file already. It is not recommended that you change the values of radii already in the file, as this will compromise comparison of your calculated Sc values with values used in the literature.
Each entry in the file is a single line with three fields separated by spaces, of the format:
Residue_name Atom_name Radius
Either of the name fields can contain one or more wildcards (i.e. the asterisk character '*') to match to multiple residues or atoms, e.g. O* will match to O1, O2 etc. Unidentified residue/atom combinations will cause the program to stop.
The default radii file is sc_radii.lib in $CLIBD; to use a modified radii file in a different directory, assign the filename and path via the SCRADII logical name.
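The wildcard matching described above can be mimicked in Python with the standard fnmatch module. This sketch makes several assumptions: the radii shown are made-up values, not those in sc_radii.lib; the first matching entry wins; and fnmatch also gives special meaning to '?' and '[', which may not be true of the program's own matching.

```python
from fnmatch import fnmatchcase

def parse_radii(text):
    """Parse 'Residue_name Atom_name Radius' lines into a list of entries."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        res, atm, radius = line.split()
        entries.append((res, atm, float(radius)))
    return entries

def lookup_radius(entries, res, atm):
    """Return the radius of the first entry whose patterns match."""
    for res_pat, atm_pat, radius in entries:
        if fnmatchcase(res, res_pat) and fnmatchcase(atm, atm_pat):
            return radius
    # Mirrors the documented behaviour: unidentified combinations are fatal.
    raise KeyError(f"no radius for residue {res!r}, atom {atm!r}")

# Illustrative entries only; the numbers are hypothetical.
example = """
ALA CA 2.00
*   O* 1.50
"""
entries = parse_radii(example)
r_ca = lookup_radius(entries, "ALA", "CA")
r_o2 = lookup_radius(entries, "LIG", "O2")
```

An atom that matches no entry raises an error here, just as an unidentified residue/atom combination causes SC to stop, so new ligand atoms need an entry before the program is run.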
Two non-runnable Unix example scripts (using GRASP input) can be found in $CEXAM/unix/non-runnable/
Copyright Michael Lawrence,