OASIS4.2 in IPCAS1.0 (CCP4 compatible)

FEATURES

Combination of OASIS4.2 and other programs under the automatic control of IPCAS1.0 is capable of performing the following tasks:

  1. SAD phasing at low Bijvoet ratio (< 1%), low redundancy (< 4) and/or low resolution (below 3Å)
  2. Difficult SIR (or SAD) phasing with the aid of known phases at low resolution (~8Å)
  3. MR-model completion based on a starting model corresponding to ~20% of the whole structure
  4. Completion of structure models from different sources other than MR
  5. Structure solution by extension of known phases at low resolution (below 4Å)
For details see Examples in IPCAS GUI.

CONTENTS

  1. DESCRIPTION
  2. SYNOPSIS, INPUT AND OUTPUT FILES (for script users only)
  3. KEYWORDED INPUT (for script users only)
  4. IPCAS GUI
  5. AUTHORS
  6. REFERENCES
  7. ACKNOWLEDGEMENTS

DESCRIPTION

OASIS is a computer program for direct-method phase extension for proteins. The phase extension is based on either a set of known phases or a partial structure; the latter can be the heavy-atom substructure, fragments of the protein or a nearly complete structure model. SAD/SIR information is beneficial but not compulsory. The 0 to 2π phase problem is reduced to a plus-or-minus sign problem based on the known heavy-atom substructure [1] or, in the absence of SAD/SIR information, by artificially creating a phase doublet for each reflection [2]. The sign problem is then solved by a direct method via the P+ formula [1]. OASIS can be combined with density-modification and model-building/refinement programs to perform iterative dual-space fragment extension [2, 3]. This is particularly useful when the structure model obtained from a single-cycle run of OASIS or from other structure-solving packages is not satisfactory. The first edition of OASIS was released in 2000 [4]. Further developments and typical applications can be found in references [2, 3, 5-7] and will be described in forthcoming papers.

OASIS is part of the CCP4-compatible pipeline IPCAS (Iterative Protein Crystal-structure Automatic Solution). A GUI is provided for controlling and monitoring the pipeline, which involves locating heavy atoms, finding MR (molecular replacement) models, direct-method phase extension, density modification and model building/refinement. IPCAS can start from either .sca files or .mtz files. In favorable cases it can automatically arrive at a nearly complete structure model. Users of IPCAS should have CCP4, SHELXC/D, ARP/wARP and PHENIX installed on the computer.

If you use OASIS4.2 in your work, please cite: Zhang, T., He, Y., Wang, J.W., Wu, L.J., Zheng, C.D., Hao, Q., Gu, Y.X. and Fan, H.F. (2012). OASIS4.2 - a computer program of direct-method phase extension for proteins. Institute of Physics, Chinese Academy of Sciences, P.R. China (available at http://cryst.iphy.ac.cn)

Apart from running within IPCAS, OASIS can also be run under CCP4 in stand-alone mode. In this case OASIS can only be invoked by scripts with the keywords described below. Users only interested in running OASIS within IPCAS can jump to the section IPCAS GUI.

SYNOPSIS

oasis HKLIN foo_in.mtz HKLOUT foo_out.mtz [XYZIN foo_in.pdb or FRCIN foo_in.frc]
[Keyworded input]
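
As a rough illustration of the synopsis, the following Python sketch assembles the command line and a keyword block for a DMR run. The helper name build_oasis_job, the file names and the title text are hypothetical; the keyword lines follow the formats given under KEYWORDED INPUT. Nothing is executed here.

```python
def build_oasis_job(hklin, hklout, xyzin=None, frcin=None, keywords=()):
    """Return (argv, keyword_text) for one OASIS run; nothing is executed."""
    if xyzin and frcin:
        # The manual allows XYZIN or FRCIN in one command, but not both.
        raise ValueError("specify XYZIN or FRCIN, not both")
    argv = ["oasis", "HKLIN", hklin, "HKLOUT", hklout]
    if xyzin:
        argv += ["XYZIN", xyzin]
    if frcin:
        argv += ["FRCIN", frcin]
    # One keyword per line, as in the KEYWORDED INPUT section.
    keyword_text = "\n".join(keywords) + "\n"
    return argv, keyword_text


argv, keyword_text = build_oasis_job(
    "foo_in.mtz",
    "foo_out.mtz",
    xyzin="foo_in.pdb",
    keywords=[
        "TIT MR-model completion",            # optional title (hypothetical text)
        "DMR",                                # calculation mode
        "LABIN FP=FP SIGFP=SIGFP",            # column labels for DMR mode
        "CON C 770 N 185 O 231 H 1232 CU 1",  # contents of the asymmetric unit
    ],
)
print(" ".join(argv))
print(keyword_text, end="")
```

How the keyword text reaches the program (for example on standard input, as with many CCP4 programs) is an assumption that should be checked against the scripts distributed with OASIS.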

INPUT AND OUTPUT FILES

HKLIN
Input data file (CCP4 mtz format).
HKLOUT
Output data file (CCP4 mtz format).
(Either one but NOT both of the following two files can be specified in a single command that invokes OASIS)
 
XYZIN (optional)
This is a .pdb file for providing partial structure information, which is usually obtained from automatic model-building programs. The file should contain CRYST and SCALE information at the beginning. Dummy atoms will be rejected in subsequent processing.
FRCIN (optional)
This file contains partial structure information extracted from a *.pdb file using the script pdb2frc.csh. Atomic positional parameters are converted to fractional coordinates. Dummy atoms will be retained and assigned as carbon atoms.
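
The conversion from orthogonal (Å) to fractional coordinates mentioned above can be sketched in Python as follows. This is not the code of pdb2frc.csh; it is a minimal illustration that assumes the standard PDB orthogonalisation convention (x along a, y in the a-b plane). The function name cart_to_frac is hypothetical.

```python
import math


def cart_to_frac(xyz, cell):
    """Convert orthogonal coordinates (Angstrom) to fractional coordinates.

    cell = (a, b, c, alpha, beta, gamma), lengths in Angstrom, angles in degrees.
    Assumes the standard PDB convention for the orthogonal axes.
    """
    a, b, c, alpha, beta, gamma = cell
    ca = math.cos(math.radians(alpha))
    cb = math.cos(math.radians(beta))
    cg = math.cos(math.radians(gamma))
    sg = math.sin(math.radians(gamma))
    # Unit-cell volume divided by a*b*c.
    v = math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)
    # Upper-triangular orthogonalisation matrix M (cart = M . frac).
    m11, m12, m13 = a, b * cg, c * cb
    m22, m23 = b * sg, c * (ca - cb * cg) / sg
    m33 = c * v / sg
    # Solve M . frac = cart by back-substitution.
    x, y, z = xyz
    fz = z / m33
    fy = (y - m23 * fz) / m22
    fx = (x - m12 * fy - m13 * fz) / m11
    return fx, fy, fz


# An atom at the centre of a 10 x 20 x 30 Angstrom orthorhombic cell.
print(cart_to_frac((5.0, 10.0, 15.0), (10.0, 20.0, 30.0, 90.0, 90.0, 90.0)))
```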

KEYWORDED INPUT

Keywords in alphabetic order:

ANO, AOE, CON, CYC, FIX, FRA, KMI, LABIN, LIM, NFI, NFX, NHA, NHL, PHR, POS, SAD/SIR/DMR, SED, TIT

TIT <title string>

(optional)
This keyword heads a job-title string, which consists of up to 80 characters.

SAD/SIR/DMR

(compulsory)
This specifies the calculation mode:
SAD
Ab initio phasing and fragment extension with SAD data
SIR
Ab initio phasing and fragment extension with SIR data
DMR
Fragment extension without SAD/SIR information. This can be used for direct-method MR-model completion with native data in the absence of SAD signals.

LABIN FP=... SIGFP=... [FPH=... SIGFPH=...] DANO=... SIGDANO=...

(compulsory)
FP, SIGFP:
averaged structure-factor magnitude [F(+) + F(-)]/2 and its standard deviation in SAD mode;
Fobs of the native protein and its standard deviation in SIR or DMR mode.
FPH, SIGFPH:
Fobs of the derivative of SIR data and its standard deviation.
DANO, SIGDANO:
Bijvoet difference, F(+) - F(-), of SAD data and its standard deviation.
F(+), SIGF(+), F(-), SIGF(-):
structure-factor amplitudes and their standard deviations for a reflection hkl and its Friedel mate -h-k-l.

The acceptable formats for different calculation modes are as follows:

SAD:
LABIN FP=...  SIGFP=...  DANO=...  SIGDANO=...

   or

LABIN F(+)=...  SIGF(+)=...  F(-)=...  SIGF(-)=...
SIR:
LABIN FP=...  SIGFP=...  FPH=...  SIGFPH=...
DMR:
LABIN FP=...  SIGFP=...

Note: If the keyword LABIN is missing, the program will assume that SAD data with the format
"FP=FP SIGFP=SIGFP DANO=DANO SIGDANO=SIGDANO"
are to be input. Any conflict with this assumption will result in errors.
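
The acceptable LABIN formats listed above can be checked before a job is submitted. The Python sketch below is an illustration only: the function name labin_is_acceptable is hypothetical, and rejecting any label not listed for the mode is an assumption rather than documented program behaviour.

```python
# Label sets taken from the "acceptable formats" list for each mode.
REQUIRED_LABELS = {
    "SAD": [
        {"FP", "SIGFP", "DANO", "SIGDANO"},
        {"F(+)", "SIGF(+)", "F(-)", "SIGF(-)"},
    ],
    "SIR": [{"FP", "SIGFP", "FPH", "SIGFPH"}],
    "DMR": [{"FP", "SIGFP"}],
}


def labin_is_acceptable(mode, labin_line):
    """Return True if the LABIN line matches one listed format for the mode."""
    tokens = labin_line.split()
    if not tokens or tokens[0].upper() != "LABIN":
        return False
    # Keep only the program labels (left-hand side of each assignment).
    given = {tok.split("=", 1)[0] for tok in tokens[1:] if "=" in tok}
    # Assumption: the label set must match exactly, with no extra labels.
    return any(given == wanted for wanted in REQUIRED_LABELS[mode])


print(labin_is_acceptable("SAD", "LABIN FP=FP SIGFP=SIGFP DANO=DANO SIGDANO=SIGDANO"))
print(labin_is_acceptable("SIR", "LABIN FP=FP SIGFP=SIGFP"))
```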

CON <atom type> <number of atoms in ASU >

(compulsory)
Example:

  CON    C 770   N 185   O 231   H 1232   CU 1   

This keyword specifies the contents of the asymmetric unit, which can be estimated approximately as follows:
Let r be the number of residues in the asymmetric unit; the contents will then be C 5r, N 1.2r, O 1.5r, H 8r, plus any atoms of other chemical elements.
An alternative and more accurate way to calculate the contents of the asymmetric unit is to run the script seq2con.csh with the sequence file by issuing the command: seq2con.csh name.pir
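
The approximate estimate from the residue count can be written as a short Python sketch. The function name estimate_con is hypothetical and rounding to the nearest integer is an assumption; with r = 154 and one Cu atom the result reproduces the CON example above.

```python
def estimate_con(n_residues, extra=None):
    """Build an approximate CON line from the number of residues in the ASU."""
    # Per-residue factors quoted in the text: C 5r, N 1.2r, O 1.5r, H 8r.
    # Python's round() rounds exact halves to the nearest even integer,
    # which is harmless for an estimate of this kind.
    counts = {
        "C": round(5 * n_residues),
        "N": round(1.2 * n_residues),
        "O": round(1.5 * n_residues),
        "H": round(8 * n_residues),
    }
    # Atoms of other elements (e.g. metals) are supplied by the user.
    counts.update(extra or {})
    return "CON " + " ".join(f"{element} {number}" for element, number in counts.items())


print(estimate_con(154, {"CU": 1}))
# CON C 770 N 185 O 231 H 1232 CU 1
```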

NFI

(optional)
By default, COS(delta_phi) values calculated from experimental Bijvoet differences are modified to follow a uniform distribution within the range of -1 to 1. This copes with large experimental errors. The keyword NFI instructs the program NOT to fit COS(delta_phi) values to a uniform distribution. Instead, COS(delta_phi) values within the range of -1 to 1 are kept unchanged, values greater than 1 are set to 1 and values smaller than -1 are set to -1. Using NFI is NOT recommended unless you are very confident about the measured Bijvoet differences.
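
The two treatments of COS(delta_phi) can be contrasted with a small Python sketch. The clipping function follows the NFI description directly. The uniform fit shown here is a simple rank-based mapping chosen for illustration; it is an assumption, and the procedure actually used inside OASIS may differ. Both function names are hypothetical.

```python
def clip_cos(values):
    """NFI behaviour as described: cut values to the range -1 to 1."""
    return [max(-1.0, min(1.0, v)) for v in values]


def fit_to_uniform(values):
    """Illustrative rank-based fit of values to a uniform spread over (-1, 1).

    Assumption: the smallest value maps near -1 and the largest near 1,
    with equal spacing in between. Not necessarily the OASIS procedure.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    fitted = [0.0] * n
    for rank, index in enumerate(order):
        fitted[index] = -1.0 + 2.0 * (rank + 0.5) / n
    return fitted


raw = [3.0, -2.0, 0.1, 0.5]
print(clip_cos(raw))        # [1.0, -1.0, 0.1, 0.5]
print(fit_to_uniform(raw))  # [0.75, -0.75, -0.25, 0.25]
```

The sketch shows why the default is more tolerant of noisy data: clipping piles outliers up at exactly 1 and -1, whereas the fit keeps only their ordering.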

AOE <value>

(optional)
By default the average value of EXP[-(sigma_H)**2/2] in the P+ formula will be tuned automatically to 0.5. The user can use the keyword AOE to set the tuned average EXP[-(sigma_H)**2/2] to other values within the range of 0 to 1.

KMI <value>

(optional)
By default, for each particular set of diffraction data, the program automatically sets a suitable KMI value (minimum value of three-phase structure invariants to be used in direct-method phasing). The default KMI value is usually in the range of 0.02 to 0.04. The smaller the KMI value, the larger the number of structure invariants involved in the phasing process and the more reliable the phasing results. A KMI value of 0.01 may in most cases lead to better results than the default value. However, for a big protein, a KMI of 0.01 may lead to excessive computing time. If you would like to set the KMI value manually, check the total number of three-phase structure invariants (phase relationships) output by the program and adjust the value of KMI (kappa minimum) so that the total number of structure invariants stays within the range of 10,000,000 to 50,000,000.
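
The manual-tuning rule above can be summarised in a few lines of Python. The function name kmi_advice is hypothetical, and the invariant count has to be read by the user from the program output; the sketch only encodes the stated target window and the fact that a smaller KMI admits more invariants.

```python
# Target window for the total number of three-phase structure invariants.
LOW, HIGH = 10_000_000, 50_000_000


def kmi_advice(n_invariants):
    """Suggest how to change KMI given the invariant count from a previous run."""
    if n_invariants > HIGH:
        # Too many invariants: a larger KMI admits fewer of them.
        return "increase KMI"
    if n_invariants < LOW:
        # Too few invariants: a smaller KMI admits more of them.
        return "decrease KMI"
    return "keep KMI"


print(kmi_advice(72_000_000))  # increase KMI
print(kmi_advice(25_000_000))  # keep KMI
```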

CYC <ncycle>

(optional)
Number of cycles for P+ formula iteration (default: 2).

ANO <atom> <f">

(compulsory in SAD mode)
Chemical symbol of the anomalous scatterer and the corresponding f" value, i.e. the imaginary-part correction to the atomic scattering factor.
Example:

  ANO   HG 7.686
        ZN 0.678
The script crossec.csh can help with calculating anomalous corrections. For example, the command "crossec.csh Br 0.9191" will calculate the anomalous corrections to the atomic scattering factor of Br at a wavelength of 0.9191Å.

POS   <atom_1>   <1>   <x_1>   <y_1>   <z_1>     <occupancy_1>   <B-factor_1>
                .
                .
                .
          <atom_n>   <n>   <x_n>   <y_n>   <z_n>     <occupancy_n>   <B-factor_n>

(compulsory in SAD/SIR mode)
Fractional coordinates of anomalous scatterer(s) or replacing heavy atom(s) in the asymmetric unit.

FRA   <atom_1>   <1>   <x_1>   <y_1>   <z_1>     <occupancy_1>   <B-factor_1>
                .
                .
                .
          <atom_n>   <n>   <x_n>   <y_n>   <z_n>     <occupancy_n>   <B-factor_n>

(optional)
This keyword specifies atomic parameters of the known fragment(s) (in fractional coordinates in the asymmetric unit) for partial-structure iteration when the PDB file is not provided.

NHA <value>

(optional in DMR mode)
This specifies the percentage of atoms to be included in the artificial partial structure (default: 5).

SED <value>

(optional in DMR mode)
This specifies the seed value of the random-number generator used for creating the artificial partial structure (default: 1).

LIM <d1> <d2>

(optional)
This specifies the resolution range in Angstroms for reflections involved in the calculation. The two cutoffs may be given in either order: d1 is either the low- or the high-resolution cutoff and d2 is the other one. The default is to use all available reflections in the calculation.
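
Because d1 and d2 can be given in either order, a script that pre-filters reflections would normalise them first. The Python sketch below illustrates this; the function names are hypothetical and treating both cutoffs as inclusive is an assumption.

```python
def resolution_window(d1, d2):
    """Return (low_res, high_res) in Angstrom regardless of argument order."""
    # The low-resolution cutoff is the larger d-spacing.
    return max(d1, d2), min(d1, d2)


def in_resolution_range(d_spacing, d1, d2):
    """True if a reflection's d-spacing lies within the LIM window."""
    low_res, high_res = resolution_window(d1, d2)
    # Assumption: both cutoffs are inclusive.
    return high_res <= d_spacing <= low_res


print(resolution_window(2.0, 30.0))         # (30.0, 2.0)
print(in_resolution_range(3.5, 30.0, 2.0))  # True
print(in_resolution_range(1.8, 2.0, 30.0))  # False
```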

NHL

(optional)
In the SAD or SIR case, the program will output HL coefficients, which correspond to the original unresolved bimodal phase distributions. The keyword NHL will eliminate output of these HL coefficients.

PHR

(optional)
By default the program will calculate Sigma2 relations in each run and store them in the file SIGMA2.DAT. In the presence of the keyword PHR, the program will not calculate Sigma2 relations, but will instead read them from the existing file SIGMA2.DAT.

FIX

(optional)
Known phases of some reflections can be used in OASIS as seeds for direct-method phase extension. The seeds are kept fixed during phase derivation. When the keyword FIX is used, a text file named "FIXPHS.TM" containing phases and figures of merit must be placed in the directory where OASIS is executed. The first line of "FIXPHS.TM" should specify the format of the remaining lines in Fortran style. From the second line downward, each line contains

H, K, L, phase and FOM

of a single reflection.

Example of FIXPHS.TM:


 (3I5,2F10.4)
    0    0    3  300.8600    0.9600
    0    0    6   50.6700    0.8400
    0    0    9  161.6700    0.9300
                .
                .
                .
   57    1    2  217.9200    0.1600
   57    2    1  226.9800    0.1800
   58    1    0   31.9000    0.2100
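
A file in this layout can be produced with a few lines of Python. The sketch below is an illustration with a hypothetical function name; it writes the Fortran format (3I5,2F10.4) on the first line and formats each reflection with the matching field widths.

```python
def fixphs_lines(reflections):
    """Format (h, k, l, phase, fom) tuples as lines of a FIXPHS.TM file."""
    # First line: Fortran format of the records that follow.
    lines = ["(3I5,2F10.4)"]
    for h, k, l, phase, fom in reflections:
        # 3I5 -> three integers of width 5; 2F10.4 -> two reals of width 10.
        lines.append(f"{h:5d}{k:5d}{l:5d}{phase:10.4f}{fom:10.4f}")
    return lines


seeds = [(0, 0, 3, 300.86, 0.96), (0, 0, 6, 50.67, 0.84), (58, 1, 0, 31.9, 0.21)]
for line in fixphs_lines(seeds):
    print(line)
```

Writing "\n".join(fixphs_lines(seeds)) plus a final newline to a file named FIXPHS.TM in the working directory gives the input expected by the FIX keyword.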

NFX <value>

(optional)
This specifies the percentage of known phases (counted from the largest FOM downward) that will be kept fixed and involved in the direct-method phase extension.
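
The selection implied by NFX can be pictured with the following Python sketch, which ranks reflections by FOM and keeps the requested percentage from the top. The function name select_fixed is hypothetical, and rounding the count up is an assumption; the program's own rounding is not documented here.

```python
import math


def select_fixed(reflections, percent):
    """Keep the top `percent` of (h, k, l, phase, fom) tuples, ranked by FOM."""
    ranked = sorted(reflections, key=lambda r: r[4], reverse=True)
    # Assumption: a fractional count is rounded up.
    n_keep = math.ceil(len(ranked) * percent / 100.0)
    return ranked[:n_keep]


known = [
    (0, 0, 3, 300.86, 0.96),
    (0, 0, 6, 50.67, 0.84),
    (57, 1, 2, 217.92, 0.16),
    (58, 1, 0, 31.90, 0.21),
]
for reflection in select_fixed(known, 50):
    print(reflection)
```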

IPCAS GUI

AUTHORS

Tao Zhang1, Yao He1, Jia-wei Wang1, Li-jie Wu1, Chao-de Zheng1, Quan Hao2, Yuan-xin Gu1 & Hai-fu Fan1

1 Beijing National Laboratory for Condensed Matter Physics, Institute of Physics, Chinese Academy of Sciences, Beijing 100190, China
2 Department of Physiology, University of Hong Kong, Hong Kong, China

Emails: gu@cryst.iphy.ac.cn; fanhf@cryst.iphy.ac.cn

REFERENCES

     [1] Fan, H.F. & Gu, Y.X. (1985). "Combining direct methods with isomorphous replacement or anomalous scattering data III. The incorporation of partial structure information". Acta Cryst. A41, 280-284.
     [2] He, Y., Yao, D.Q., Gu, Y.X., Lin, Z.J., Zheng, C.D. & Fan, H.F. (2007). "OASIS and MR-model completion". Acta Cryst. D63, 793-799.
     [3] Wang, J.W., Chen, J.R., Gu, Y.X., Zheng, C.D. & Fan, H.F. (2004). "Direct-method SAD phasing with partial-structure iteration - towards automation". Acta Cryst. D60, 1991-1996.
     [4] Hao, Q., Gu, Y.X., Zheng, C.D. & Fan, H.F. (2000). "OASIS: a program for breaking phase ambiguity in OAS or SIR". J. Appl. Cryst. 33, 980-981.
     [5] Yao, D.Q., Huang, S., Wang, J.W., Gu, Y.X., Zheng, C.D., Fan, H.F., Watanabe, N. & Tanaka, I. (2006). "SAD phasing by OASIS-2004: case studies of dual-space fragment extension". Acta Cryst. D62, 883-890.
     [6] Wu, L.J., Zhang, T., Gu, Y.X., Zheng, C.D. & Fan, H.F. (2009). "Direct-method SAD phasing of proteins enhanced by the use of intrinsic bimodal phase distributions in subsequent phase-improvement process". Acta Cryst. D65, 1213-1216.
     [7] Zhang, T., Wu, L.J., Hao, Q., Gu, Y.X., Zheng, C.D. & Fan, H.F. (2010). "Combining SAD/SIR iteration and MR iteration in partial-model extension of proteins". Chin. Phys. B 19, 096101.

* PDF files of the REFERENCES are available at http://cryst.iphy.ac.cn

ACKNOWLEDGEMENTS

The authors are grateful to Dr T. C. Terwilliger for his very kind permission for the incorporation of the subroutine HENDFT.F in OASIS. Thanks are also due to Dr Z. Dauter, Professor B.D. Sha, Dr S.J. Gamblin & Dr B. Xiao, Professor D.C. Liang and Professor N. Watanabe for test data sets used in this and previous versions of OASIS. This work is supported by the Innovation Project of the Chinese Academy of Sciences and by the 973 Project (grants No. 2002CB713801 and No. 2011CB911101) of the Ministry of Science and Technology of China.