OASIS4.0 (CCP4 compatible)

NAME

oasis4.0 - a direct-methods program for SAD/SIR phasing and reciprocal-space fragment extension

SYNOPSIS

oasis4.0 HKLIN foo_in.mtz HKLOUT foo_out.mtz [XYZIN foo_in.pdb]
[Keyworded input]

CONTENTS

  1. DESCRIPTION
  2. INPUT AND OUTPUT FILES
  3. KEYWORDED INPUT
  4. OASIS GUI
  5. AUTHORS
  6. REFERENCES
  7. ACKNOWLEDGEMENTS

DESCRIPTION

OASIS is a computer program for ab initio SAD/SIR phasing and reciprocal-space fragment extension for proteins. The phase problem is reduced to a sign problem based on the known sites of the anomalous scatterers or replacing heavy atoms, and the sign problem is then solved by a direct method. OASIS can also perform reciprocal-space fragment extension with or without SAD/SIR information. Combined with density-modification and model-building programs, OASIS can perform iterative dual-space fragment extension. This is particularly useful when the total contribution of the known fragments is not large enough for Fourier recycling.

What's new in the present version?

1. The SAD-phasing power is significantly enhanced by outputting Hendrickson-Lattman coefficients of the intrinsic unresolved bimodal phase distributions in addition to the direct-method phases and figures of merit. This allows the phase ambiguity to be broken again by the progressively improving phases during subsequent density modification and model building.

2. A GUI compatible with CCP4i is provided for controlling and monitoring the OASIS dual-space iteration, which consists of phasing, density modification and model building. To use this GUI, the user should have CCP4, ARP/wARP and PHENIX installed on the computer.

INPUT AND OUTPUT FILES

HKLIN
Input data file (CCP4 mtz format).
XYZIN (optional)
This is a .pdb file providing partial-structure information, usually obtained from automatic model-building programs. The file should contain CRYST1 and SCALE records at the beginning. Dummy atoms are rejected in subsequent processing.
FRCIN (optional)
This file contains partial-structure information extracted from a *.pdb file using the script pdb2frc.csh. Atomic positional parameters are converted to fractional coordinates. Dummy atoms are retained and treated as carbon atoms.
HKLOUT
Output data file (CCP4 mtz format).

KEYWORDED INPUT

The possible keywords are:

TIT, SAD/SIR/DMR, LABIN, CON, NFI, AOE, KMI, CYC, ANO, POS, FRA, NHA, SED, LIM, NHL, PHR

TIT <title string>

(optional)
This keyword heads a job-title string, which consists of up to 80 characters.

SAD/SIR/DMR

(compulsory)
This specifies the calculation mode:
SAD
Ab initio phasing and fragment extension with SAD data
SIR
Ab initio phasing and fragment extension with SIR data
DMR
Fragment extension without SAD/SIR information. This can be used for direct-method completion of molecular-replacement (MR) models with native data when no SAD signal is available.

LABIN FP=... SIGFP=... [FPH=... SIGFPH=...] [DANO=... SIGDANO=...] [PHIC=...]

(compulsory)
FP, SIGFP:
averaged structure-factor magnitude [F(+) + F(-)]/2 and its standard deviation in SAD mode;
Fobs of the native protein and its standard deviation in SIR or DMR mode.
FPH, SIGFPH:
Fobs of the heavy-atom derivative (SIR data) and its standard deviation.
DANO, SIGDANO:
Bijvoet difference, F(+) - F(-), of SAD data and its standard deviation.
PHIC:
reference phase for comparison with the resulting phase.
F(+), SIGF(+), F(-), SIGF(-):
structure-factor amplitudes and their standard deviations for hkl and its Friedel mate -h-k-l.

The acceptable formats for different calculation modes are as follows:

SAD:
LABIN FP=...  SIGFP=...  DANO=...  SIGDANO=...  [PHIC=...]

   or

LABIN F(+)=...  SIGF(+)=...  F(-)=...  SIGF(-)=...  [PHIC=...]
SIR:
LABIN FP=...  SIGFP=...  FPH=...  SIGFPH=...  [PHIC=...]
DMR:
LABIN FP=...  SIGFP=...  [PHIC=...]

Note: If the keyword LABIN is missing, the program assumes that SAD data with the labels
"FP=FP SIGFP=SIGFP DANO=DANO SIGDANO=SIGDANO"
are to be input. Any conflict with this assumption will result in an error.
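
The label sets listed above can be checked before a job is submitted. The following is a minimal Python sketch, not part of OASIS: the names REQUIRED_LABELS, parse_labin and labin_covers_mode are hypothetical, and the check only confirms that the labels required for a mode are present (whether OASIS rejects extra labels is not stated here, so the sketch does not test for that).

```python
# Label sets required by each calculation mode, transcribed from the
# formats listed above. PHIC is optional in every mode.
REQUIRED_LABELS = {
    "SAD": [
        {"FP", "SIGFP", "DANO", "SIGDANO"},
        {"F(+)", "SIGF(+)", "F(-)", "SIGF(-)"},
    ],
    "SIR": [{"FP", "SIGFP", "FPH", "SIGFPH"}],
    "DMR": [{"FP", "SIGFP"}],
}


def parse_labin(line):
    """Split 'LABIN A=a B=b ...' into a {program_label: column_name} dict."""
    tokens = line.split()
    if not tokens or tokens[0].upper() != "LABIN":
        raise ValueError("line does not start with LABIN")
    pairs = {}
    for token in tokens[1:]:
        label, sep, column = token.partition("=")
        if not sep or not column:
            raise ValueError(f"malformed assignment: {token!r}")
        pairs[label.upper()] = column
    return pairs


def labin_covers_mode(mode, line):
    """Return True if the line assigns every label one accepted set requires."""
    labels = set(parse_labin(line))
    return any(required <= labels for required in REQUIRED_LABELS[mode.upper()])


# The default assumed by the program when LABIN is missing covers SAD mode.
print(labin_covers_mode("SAD", "LABIN FP=FP SIGFP=SIGFP DANO=DANO SIGDANO=SIGDANO"))
```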


CON <atom type> <number of atoms in ASU>

(compulsory)
Example:

  CON    C 770   N 185   O 231   H 1232   CU 1   

This keyword specifies the contents of the asymmetric unit, which can be estimated approximately as follows:
Let r be the number of residues in the asymmetric unit; the contents are then C 5r, N 1.2r, O 1.5r, H 8r, plus any atoms of other chemical elements.
A more accurate alternative is to run the script seq2con.csh on the sequence file with the command: seq2con.csh name.pir
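
The approximate rule above is simple arithmetic. The following Python sketch applies it; the function names estimate_contents and format_con are hypothetical helpers, not part of OASIS, and rounding to the nearest integer is an assumption. With r = 154 the rule gives the same C, N, O and H counts as the CON example above.

```python
def estimate_contents(n_residues, extra_atoms=None):
    """Estimate ASU contents from the residue count: C 5r, N 1.2r, O 1.5r, H 8r.

    extra_atoms is an optional {element: count} dict for atoms of other
    elements, e.g. the anomalous scatterers.
    """
    counts = {
        "C": round(5 * n_residues),
        "N": round(1.2 * n_residues),  # rounding to nearest integer is assumed
        "O": round(1.5 * n_residues),
        "H": round(8 * n_residues),
    }
    counts.update(extra_atoms or {})
    return counts


def format_con(counts):
    """Format the counts as a CON keyword line."""
    return "CON  " + "  ".join(f"{element} {n}" for element, n in counts.items())


print(format_con(estimate_contents(154, {"CU": 1})))
```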


NFI

(optional)
By default, COS(delta_phi) values calculated from the experimental Bijvoet differences are modified to obey a uniform distribution within the range -1 to 1, which helps to cope with large experimental errors. The keyword NFI tells the program NOT to fit the COS(delta_phi) values to a uniform distribution. Instead, values within the range -1 to 1 are kept unchanged, values greater than 1 are set to 1 and values smaller than -1 are set to -1. Using NFI is NOT recommended unless you are very confident in the measured Bijvoet differences.
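
The difference between the two treatments can be sketched in Python. The function clip_cos follows the NFI behaviour described above. The function fit_to_uniform is only an illustration of what fitting to a uniform distribution can look like: it uses a simple rank-based mapping, which is an assumption and not necessarily the procedure implemented in OASIS. Both function names are hypothetical.

```python
def clip_cos(values):
    """NFI behaviour as described above: keep values inside [-1, 1], clip the rest."""
    return [max(-1.0, min(1.0, v)) for v in values]


def fit_to_uniform(values):
    """Illustrative rank-based remapping onto evenly spaced values in (-1, 1).

    Assumption: the actual fitting procedure used by the program may differ.
    The ordering of the values is preserved; only their spacing changes.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    fitted = [0.0] * n
    for rank, i in enumerate(order):
        fitted[i] = -1.0 + 2.0 * (rank + 0.5) / n
    return fitted


raw = [1.7, 0.3, -2.4]  # made-up values, two of them outside [-1, 1]
print(clip_cos(raw))
print(fit_to_uniform(raw))
```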

AOE <value>

(optional)
By default, the average value of EXP[-(sigma_H)**2/2] in the P+ formula is tuned automatically to 0.5. The keyword AOE sets the tuned average of EXP[-(sigma_H)**2/2] to another value within the range 0 to 1.

KMI <value>

(optional)
By default, the program automatically sets a suitable KMI value (kappa minimum, the minimum value of the three-phase structure invariants to be used in direct-method phasing) for each set of diffraction data. The default KMI value is usually in the range 0.02 to 0.04. The smaller the KMI value, the more structure invariants are involved in the phasing process and the more reliable the phasing results. A KMI value of 0.01 leads in most cases to better results than the default value. For a large protein, however, a KMI of 0.01 may make the computing time excessive. When setting KMI manually, check the total number of three-phase structure invariants (phase relationships) output by the program and adjust KMI so that this total stays within the range 10,000,000 to 50,000,000.
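
The manual tuning rule above can be written as a small decision helper. This Python sketch is illustrative only: the function name kmi_advice is hypothetical, and the number of invariants must be read by the user from the program output. It relies on the relation stated above, that a smaller KMI admits more structure invariants.

```python
MIN_INVARIANTS = 10_000_000  # lower end of the recommended range
MAX_INVARIANTS = 50_000_000  # upper end of the recommended range


def kmi_advice(n_invariants):
    """Suggest how to change KMI given the reported number of invariants.

    A smaller KMI admits more invariants, so too many invariants calls
    for a larger KMI and too few for a smaller one.
    """
    if n_invariants > MAX_INVARIANTS:
        return "raise KMI"
    if n_invariants < MIN_INVARIANTS:
        return "lower KMI"
    return "keep KMI"


print(kmi_advice(72_000_000))  # made-up count above the recommended range
```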

CYC <ncycle>

(optional)
Number of cycles for P+ formula iteration (default: 2).

ANO <atom> <f">

(compulsory in SAD mode)
Chemical symbol of the anomalous scatterer and the corresponding f" value, the imaginary-part correction to the atomic scattering factor.

Example:

  ANO   HG 7.686
        ZN 0.678
The script crossec.csh can help to calculate anomalous corrections. For example, the command crossec.csh Br 0.9191 calculates the anomalous corrections to the atomic scattering factor of Br at a wavelength of 0.9191 Å.


POS   <atom_1>   <1>   <x_1>   <y_1>   <z_1>     <occupancy_1>   <B-factor_1>
                .
                .
                .
          <atom_n>   <n>   <x_n>   <y_n>   <z_n>     <occupancy_n>   <B-factor_n>

(compulsory in SAD/SIR mode)
Fractional coordinates of anomalous scatterer(s) or replacing heavy atom(s) in the asymmetric unit.


FRA   <atom_1>   <1>   <x_1>   <y_1>   <z_1>     <occupancy_1>   <B-factor_1>
                .
                .
                .
          <atom_n>   <n>   <x_n>   <y_n>   <z_n>     <occupancy_n>   <B-factor_n>

(optional)
This keyword specifies the atomic parameters of the known fragment(s), as fractional coordinates in the asymmetric unit, for partial-structure iteration when no PDB file is provided.
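
POS and FRA records share the same layout: atom symbol, serial number, fractional coordinates, occupancy and B-factor, with the keyword on the first record only. The Python sketch below writes such records from a list of sites. It is a hypothetical helper, not part of OASIS: the names Site and format_sites, the column widths and the numeric values are all assumptions made for illustration, and whitespace-separated free-format input is assumed.

```python
from typing import NamedTuple


class Site(NamedTuple):
    atom: str         # chemical symbol, e.g. "HG"
    x: float          # fractional coordinates
    y: float
    z: float
    occupancy: float
    b_factor: float


def format_sites(keyword, sites):
    """Write POS- or FRA-style records; the keyword heads the first line only."""
    lines = []
    for number, s in enumerate(sites, start=1):
        head = keyword if number == 1 else " " * len(keyword)
        lines.append(
            f"{head}  {s.atom:<2} {number:3d} "
            f"{s.x:8.4f} {s.y:8.4f} {s.z:8.4f} "
            f"{s.occupancy:6.2f} {s.b_factor:7.2f}"
        )
    return "\n".join(lines)


# Made-up sites, for illustration only.
sites = [
    Site("HG", 0.1, 0.2, 0.3, 1.0, 20.0),
    Site("HG", 0.4, 0.5, 0.6, 0.8, 25.0),
]
print(format_sites("POS", sites))
```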

NHA <value>

(optional in DMR mode)
This specifies the percentage of atoms to be included in the artificial partial structure (default: 5).

SED <value>

(optional in DMR mode)
This specifies the seed value of the random-number generator used for creating the artificial partial structure (default: 1).

LIM <d1> <d2>

(optional)
This specifies the resolution range in Angstroms for reflections involved in the calculation. The two cutoffs may be given in either order: d1 is either the low- or the high-resolution cutoff and d2 is the other. By default, all available reflections are used in the calculation.
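
Because the two cutoffs are order-independent, the larger d-spacing is the low-resolution limit and the smaller one the high-resolution limit. A minimal Python sketch of this interpretation follows. The function names are hypothetical, and treating both limits as inclusive is an assumption.

```python
def resolution_limits(d1, d2):
    """Return (low_resolution, high_resolution) d-spacings in Angstroms.

    The larger d-spacing is the low-resolution cutoff, whichever order
    the two values were given in.
    """
    return max(d1, d2), min(d1, d2)


def within_limits(d_spacing, d1, d2):
    """True if a reflection's d-spacing lies inside the range (inclusive, assumed)."""
    low, high = resolution_limits(d1, d2)
    return high <= d_spacing <= low


# "LIM 30.0 2.0" and "LIM 2.0 30.0" select the same reflections.
print(resolution_limits(30.0, 2.0) == resolution_limits(2.0, 30.0))
```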


NHL

(optional)
In SAD or SIR mode, the program outputs HL coefficients, which correspond to the original unresolved bimodal phase distributions. The keyword NHL suppresses the output of these HL coefficients.


PHR

(optional)
By default, the program calculates Sigma2 relations in each run and stores them in the file SIGMA2.DAT. When the keyword PHR is present, the program does not calculate Sigma2 relations but reads them from the existing file SIGMA2.DAT.

 

OASIS GUI


 

AUTHORS

Tao Zhang (2,1), Li-jie Wu (1), Yao He (1), Jia-wei Wang (1), Chao-de Zheng (1), Quan Hao (3), Yuan-xin Gu (1) & Hai-fu Fan (1)

(1) Beijing National Laboratory for Condensed Matter Physics, Institute of Physics, Chinese Academy of Sciences, Beijing 100190, China
(2) School of Physical Sciences and Technology, Lanzhou University, Gansu, Lanzhou 730000, China
(3) Department of Physiology, University of Hong Kong, Hong Kong, China
Emails: gu@cryst.iphy.ac.cn; fanhf@cryst.iphy.ac.cn


 

REFERENCES

  1. Fan, H.F. & Gu, Y.X. (1985). "Combining direct methods with isomorphous replacement or anomalous scattering data III. The incorporation of partial structure information". Acta Cryst. A41, 280-284.
  2. Hao, Q., Gu, Y.X., Zheng, C.D. & Fan, H.F. (2000). "OASIS: a program for breaking phase ambiguity in OAS or SIR". J. Appl. Cryst. 33, 980-981.
  3. Wang, J.W., Chen, J.R., Gu, Y.X., Zheng, C.D. & Fan, H.F. (2004). "Direct-method SAD phasing with partial-structure iteration - towards automation". Acta Cryst. D60, 1991-1996.
  4. He, Y., Yao, D.Q., Gu, Y.X., Lin, Z.J., Zheng, C.D. & Fan, H.F. (2007). "OASIS and MR-model completion". Acta Cryst. D63, 793-799.
  5. Wu, L.J., Zhang, T., Gu, Y.X., Zheng, C.D. & Fan, H.F. (2009). "Direct-method SAD phasing of proteins enhanced by the use of intrinsic bimodal phase distributions in subsequent phase-improvement process". Acta Cryst. D65, 1213-1216.
  6. Zhang, T., Wu, L.J., He, Y., Wang, J.W., Zheng, C.D., Hao, Q., Gu, Y.X. & Fan, H.F. (2009). "OASIS4.0: new version with improved SAD-phasing algorithm and a GUI for controlling and monitoring iterative processes". (In preparation).

* PDF files of the REFERENCES are available at http://cryst.iphy.ac.cn
 

ACKNOWLEDGEMENTS

The authors are grateful to Dr T. C. Terwilliger for his kind permission to incorporate the subroutine HENDFT.F in OASIS. Thanks are also due to Professor N. Watanabe for the test data for TTHA1012 and to Dr Z. Dauter for the test data for xylanase. This work was supported by the Innovation Project of the Chinese Academy of Sciences and by the 973 Project (grant No. 2002CB713801) of the Ministry of Science and Technology of China.