OASIS4.0 (CCP4 compatible)
NAME
oasis4.0 - a direct-methods program for SAD/SIR phasing and
reciprocal-space fragment extension
SYNOPSIS
oasis4.0 HKLIN foo_in.mtz HKLOUT foo_out.mtz [XYZIN foo_in.pdb]
[Keyworded input]
CONTENTS
- DESCRIPTION
- INPUT AND OUTPUT FILES
- KEYWORDED INPUT
- OASIS GUI
- AUTHORS
- REFERENCES
- ACKNOWLEDGEMENTS
DESCRIPTION
OASIS is a computer program for ab initio SAD/SIR phasing and reciprocal-space
fragment extension for proteins. The phase problem is reduced to a sign problem based on
the known sites of anomalous scatterers or replacing heavy atoms. The sign problem is
then solved by a direct method. OASIS can also perform reciprocal-space
fragment extension with or without SAD/SIR information. Combined with
density-modification and model-building programs, OASIS can perform iterative dual-space
fragment extension. This is particularly useful when the total contribution of the known
fragments is not large enough for Fourier recycling.
What's new in the present version?
1. The SAD-phasing power is significantly enhanced by outputting Hendrickson-Lattman coefficients of
the intrinsic unresolved bimodal phase distributions in addition to the direct-method phases
and figures of merit. This allows the phase ambiguity to be broken again by the progressively
improving phases during the subsequent density modification and model building.
2. A GUI compatible with CCP4i is provided for controlling and monitoring the OASIS dual-space
iteration processes. The dual-space iteration consists of phasing, density modification
and model building. To use this GUI, CCP4, ARP/wARP and PHENIX must already be
installed on the computer.
INPUT AND OUTPUT FILES
HKLIN
Input data file (CCP4 mtz format).
XYZIN (optional)
This is a .pdb file for providing partial structure information, which
is usually obtained from automatic model-building programs. The file should contain
CRYST and SCALE information at the beginning. Dummy atoms will be rejected in
subsequent processing.
FRCIN (optional)
This file contains partial structure information extracted from a *.pdb file
using the script pdb2frc.csh. Atomic positional parameters are converted to fractional
coordinates. Dummy atoms will be retained and treated as carbon atoms.
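The conversion to fractional coordinates mentioned above can be illustrated with a minimal Python sketch. It assumes the standard PDB orthogonalisation convention (x along a, y in the a-b plane). The function name orth_to_frac is hypothetical, and the script pdb2frc.csh may perform the conversion differently, for example by using the SCALE records of the file.

```python
import math

def orth_to_frac(xyz, cell):
    """Convert orthogonal coordinates (Angstrom) to fractional coordinates.

    cell = (a, b, c, alpha, beta, gamma); lengths in Angstrom, angles in degrees.
    Assumption: standard PDB convention (x along a, y in the a-b plane).
    """
    a, b, c, alpha, beta, gamma = cell
    ca, cb, cg = (math.cos(math.radians(t)) for t in (alpha, beta, gamma))
    sg = math.sin(math.radians(gamma))
    # Cell volume divided by a*b*c
    v = math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)
    # Non-zero elements of the upper-triangular orthogonalisation matrix
    m11, m12, m13 = a, b * cg, c * cb
    m22, m23 = b * sg, c * (ca - cb * cg) / sg
    m33 = c * v / sg
    x, y, z = xyz
    # Back-substitution solves M * frac = xyz
    zf = z / m33
    yf = (y - m23 * zf) / m22
    xf = (x - m12 * yf - m13 * zf) / m11
    return (xf, yf, zf)
```

For an orthorhombic cell the result reduces to dividing each coordinate by the corresponding cell edge.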
HKLOUT
- Output data file (CCP4 mtz format).
KEYWORDED INPUT
The possible keywords are:
TIT,
SAD/SIR/DMR,
LABIN,
CON,
NFI,
AOE,
KMI,
CYC,
ANO,
POS,
FRA,
NHA,
SED,
LIM,
NHL,
PHR
TIT <title string>
(optional)
This keyword heads a job-title string, which consists of up to 80 characters.
SAD/SIR/DMR
(compulsory)
This specifies the calculation mode:
- SAD
- Ab initio phasing and fragment extension with SAD data
- SIR
- Ab initio phasing and fragment extension with SIR data
- DMR
- Fragment extension without SAD/SIR information.
This can be used as Direct-method MR-model completion
with native data in the absence of SAD signals.
LABIN FP=... SIGFP=... [FPH=... SIGFPH=...] DANO=... SIGDANO=... [PHIC=...]
(compulsory)
- FP, SIGFP:
- In SAD mode, the averaged structure-factor magnitude [F(+) + F(-)]/2 and its
standard deviation; in SIR or DMR mode, Fobs of the native protein and its
standard deviation.
- FPH, SIGFPH:
- Fobs of the derivative in SIR data and its standard deviation.
- DANO, SIGDANO:
- Bijvoet difference, F(+) - F(-), of SAD data and its standard deviation.
- PHIC:
- Reference phase for comparison with the resultant phase.
- F(+), SIGF(+), F(-), SIGF(-):
- Structure-factor amplitudes of reflection hkl and of its Friedel mate -h-k-l,
with their standard deviations.
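The relation between the F(+)/F(-) labels and the FP/DANO labels defined above can be sketched in Python as follows. The mean and the difference follow the definitions given in the text. The standard deviations are obtained by ordinary error propagation for independent measurements, which is an assumption and not necessarily what the program does internally. The function name is hypothetical.

```python
import math

def mean_and_dano(f_plus, sig_plus, f_minus, sig_minus):
    """Return (FP, SIGFP, DANO, SIGDANO) from one Bijvoet pair."""
    fp = 0.5 * (f_plus + f_minus)      # [F(+) + F(-)]/2
    dano = f_plus - f_minus            # Bijvoet difference F(+) - F(-)
    # Assumption: F(+) and F(-) have independent errors
    sig_dano = math.sqrt(sig_plus ** 2 + sig_minus ** 2)
    sig_fp = 0.5 * sig_dano
    return fp, sig_fp, dano, sig_dano
```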
The acceptable formats for different calculation modes are as follows:
- SAD:
- LABIN FP=... SIGFP=... DANO=... SIGDANO=... [PHIC=...]
- or
- LABIN F(+)=... SIGF(+)=... F(-)=... SIGF(-)=... [PHIC=...]
- SIR:
- LABIN FP=... SIGFP=... FPH=... SIGFPH=... [PHIC=...]
- DMR:
- LABIN FP=... SIGFP=... [PHIC=...]
Note: If the keyword LABIN is missing, the program assumes SAD data
with the column assignment
"FP=FP SIGFP=SIGFP DANO=DANO SIGDANO=SIGDANO".
Any conflict with this assumption will result in errors.
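As an illustration, the acceptable LABIN formats listed above can be checked before a job is started. The Python sketch below is only a helper for preparing input and is not part of OASIS. The names ACCEPTED and build_labin are hypothetical, and the MTZ column names used with it are placeholders.

```python
# Label sets accepted for each calculation mode, as listed in the text.
ACCEPTED = {
    "SAD": [("FP", "SIGFP", "DANO", "SIGDANO"),
            ("F(+)", "SIGF(+)", "F(-)", "SIGF(-)")],
    "SIR": [("FP", "SIGFP", "FPH", "SIGFPH")],
    "DMR": [("FP", "SIGFP")],
}

def build_labin(mode, columns):
    """Build a LABIN line from a dict {program label: MTZ column name}.

    PHIC is optional in every mode; any other mismatch raises ValueError.
    """
    labels = set(columns) - {"PHIC"}
    for allowed in ACCEPTED[mode]:
        if labels == set(allowed):
            order = list(allowed)
            if "PHIC" in columns:
                order.append("PHIC")
            return "LABIN " + " ".join(f"{k}={columns[k]}" for k in order)
    raise ValueError(f"labels {sorted(labels)} are not acceptable in {mode} mode")
```

For example, build_labin("DMR", {"FP": "FNAT", "SIGFP": "SIGFNAT"}) gives the line "LABIN FP=FNAT SIGFP=SIGFNAT".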
CON <atom type> <number of atoms in ASU>
(compulsory)
Example:
CON C 770 N 185 O 231 H 1232 CU 1
This keyword specifies the contents of the asymmetric unit, which can be
estimated approximately as follows:
if r is the number of residues in the asymmetric unit, the contents
are C 5r, N 1.2r, O 1.5r and H 8r, plus any atoms of other
chemical elements.
A more accurate alternative is to run the script seq2con.csh on the
sequence file with the command:
seq2con.csh name.pir
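The rule-of-thumb estimate described above (C 5r, N 1.2r, O 1.5r, H 8r) can be sketched in Python as follows. The function name estimate_contents is hypothetical, and rounding to the nearest integer is an assumption. The script seq2con.csh remains the more accurate route.

```python
def estimate_contents(n_residues, extra=None):
    """Return an approximate CON line for n_residues residues.

    extra: optional dict of additional atom types, e.g. {"CU": 1}.
    """
    counts = {
        "C": round(5 * n_residues),
        "N": round(1.2 * n_residues),
        "O": round(1.5 * n_residues),
        "H": round(8 * n_residues),
    }
    counts.update(extra or {})
    return "CON " + " ".join(f"{element} {n}" for element, n in counts.items())
```

With r = 154 and one additional CU atom, this reproduces the example line given above: CON C 770 N 185 O 231 H 1232 CU 1.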
NFI
(optional)
By default, the COS(delta_phi) values calculated from experimental Bijvoet
differences are modified to obey a uniform distribution within the range
-1 to 1. This copes with large
experimental errors. The keyword NFI tells the program NOT to fit the
COS(delta_phi) values to a uniform distribution. Instead, values
within the range -1 to 1 are kept unchanged,
values greater than 1 are set to 1 and values smaller than -1 are set
to -1. Using the keyword NFI is NOT recommended unless you
are very confident in the measured Bijvoet differences.
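The two treatments of COS(delta_phi) described above can be contrasted with a short Python sketch. The function clip_cos follows the NFI behaviour exactly as stated in the text. The function fit_uniform is only one possible way to impose a uniform distribution (a rank-based mapping); the text does not say how OASIS performs the fit, so this part is an assumption for illustration. Both function names are hypothetical.

```python
def clip_cos(values):
    """NFI behaviour: keep values in [-1, 1], cut the rest to the nearest bound."""
    return [max(-1.0, min(1.0, v)) for v in values]

def fit_uniform(values):
    """Illustrative rank-based mapping onto evenly spaced values in (-1, 1).

    Assumption: not necessarily the fitting procedure used by OASIS.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    fitted = [0.0] * n
    for rank, i in enumerate(order):
        # Centre of the rank-th of n equal bins spanning [-1, 1]
        fitted[i] = -1.0 + (2.0 * rank + 1.0) / n
    return fitted
```

The rank-based mapping preserves the ordering of the values while discarding their magnitudes, which is why such a fit tolerates large errors in the measured differences.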
AOE <value>
(optional)
By default, the average value of EXP[-(sigma_H)**2/2] in the P+ formula is
tuned automatically to 0.5. The keyword AOE sets the tuned
average of EXP[-(sigma_H)**2/2] to another value within the range 0 to 1.
KMI <value>
(optional)
By default, the program automatically sets a suitable KMI value (the minimum
kappa value of the three-phase structure invariants to be used in direct-method
phasing) for each set of diffraction data. The default KMI value is usually
in the range 0.02 to 0.04. The smaller the KMI value, the larger
the number of structure invariants involved in the phasing process and the more
reliable the phasing results. A KMI value of 0.01 will in most cases lead to
better results than the default value. However, for a large protein, a KMI of
0.01 may lead to excessive computing time. When setting the KMI value
manually, it is recommended to check the total number of three-phase structure invariants
(phase relationships) output by the program and to adjust the value of KMI (kappa minimum)
so that the total number of structure invariants stays within the range
10,000,000 to 50,000,000.
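The manual adjustment of KMI described above can be summarised in a small Python sketch. It relies only on the stated facts that a smaller KMI admits more structure invariants and that the recommended total is 10,000,000 to 50,000,000. The function name advise_kmi is hypothetical, and the number of invariants must be read from the program output by the user.

```python
LOWER_BOUND = 10_000_000   # recommended minimum number of invariants
UPPER_BOUND = 50_000_000   # recommended maximum number of invariants

def advise_kmi(n_invariants):
    """Suggest how to change KMI given the reported number of invariants."""
    if n_invariants > UPPER_BOUND:
        return "raise KMI"   # a larger KMI admits fewer invariants
    if n_invariants < LOWER_BOUND:
        return "lower KMI"   # a smaller KMI admits more invariants
    return "keep KMI"
```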
CYC <ncycle>
(optional)
Number of cycles for P+ formula iteration (default: 2).
ANO <atom> <f">
(compulsory in SAD mode)
Chemical symbol of the anomalous scatterer and the corresponding f" value, that is,
the imaginary part of the anomalous correction to the atomic scattering factor.
Example:
ANO HG 7.686
ZN 0.678
The script crossec.csh can help to calculate anomalous corrections. For example,
the command:
crossec.csh Br 0.9191
calculates the anomalous corrections to the atomic scattering factor
of Br at a wavelength of 0.9191 Å.
POS <atom_1> <1>
<x_1> <y_1> <z_1>
<occupancy_1> <B-factor_1>
.
.
.
<atom_n> <n>
<x_n> <y_n> <z_n>
<occupancy_n> <B-factor_n>
(compulsory in SAD/SIR mode)
Fractional coordinates of anomalous scatterer(s) or replacing heavy atom(s)
in the asymmetric unit.
FRA <atom_1> <1>
<x_1> <y_1> <z_1>
<occupancy_1> <B-factor_1>
.
.
.
<atom_n> <n>
<x_n> <y_n> <z_n>
<occupancy_n> <B-factor_n>
(optional)
This keyword specifies atomic parameters of the known fragment(s)
(in fractional coordinates in the asymmetric unit) for
partial-structure iteration when the PDB file is not provided.
NHA <value>
(optional in DMR mode)
This specifies the percentage of atoms to be included in the artificial
partial structure (default: 5).
SED <value>
(optional in DMR mode)
This specifies the seed value of the random-number generator used for creating
the artificial partial structure (default: 1).
LIM <d1> <d2>
(optional)
This specifies the resolution range, in Angstroms, of the reflections used
in the calculation. d1 is either the low- or the high-resolution
cutoff and d2 is the other cutoff. By default, all available
reflections are used in the calculation.
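Because d1 and d2 may be given in either order, the resolution window is defined by the larger and the smaller of the two values. A minimal Python sketch, assuming that both limits are inclusive (the text does not state this), with hypothetical function names:

```python
def resolution_window(d1, d2):
    """Return (low-resolution limit, high-resolution limit) in Angstroms."""
    return max(d1, d2), min(d1, d2)

def in_range(d_spacing, d1, d2):
    """True if a reflection with the given d-spacing lies inside the window."""
    d_low, d_high = resolution_window(d1, d2)
    return d_high <= d_spacing <= d_low
```

With this definition, LIM 30.0 2.0 and LIM 2.0 30.0 select the same reflections.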
NHL
(optional)
In the SAD or SIR case, the program outputs HL (Hendrickson-Lattman) coefficients, which
correspond to the original unresolved bimodal phase distributions. The keyword NHL suppresses
the output of these HL coefficients.
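To show how a set of HL coefficients encodes a bimodal phase distribution, the Python sketch below evaluates the standard Hendrickson-Lattman form, P(phi) proportional to exp(A cos(phi) + B sin(phi) + C cos(2 phi) + D sin(2 phi)), on a grid of phase angles. The function name and the grid spacing are hypothetical choices; the sketch does not read an MTZ file.

```python
import math

def hl_distribution(a, b, c, d, step=1):
    """Return normalised phase probabilities at 0, step, 2*step, ... degrees."""
    weights = []
    for deg in range(0, 360, step):
        phi = math.radians(deg)
        exponent = (a * math.cos(phi) + b * math.sin(phi)
                    + c * math.cos(2.0 * phi) + d * math.sin(2.0 * phi))
        weights.append(math.exp(exponent))
    total = sum(weights)
    return [w / total for w in weights]
```

When the C and D terms dominate, the distribution has two peaks 180 degrees apart, which corresponds to an unresolved ambiguity; growing A and B terms favour one of the two peaks.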
PHR
(optional)
By default, the program calculates the Sigma2 relations in each run and stores them in the file
SIGMA2.DAT. If the keyword PHR is present, the program does not calculate the Sigma2 relations
but reads them from the existing file SIGMA2.DAT.
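Putting the keywords of this section together, the Python sketch below assembles the keyword lines of a minimal SAD job and the command line from the SYNOPSIS. The function names are hypothetical. The layout of one atom per line for POS is an assumption based on the keyword description, and the text does not say whether further lines (such as a terminating keyword) are required. The sketch only builds strings and does not run the program.

```python
def sad_keywords(title, labin, contents, anomalous, sites):
    """Return the keyword lines of a minimal SAD job as a list of strings.

    anomalous: (element, f") pair; sites: list of
    (element, x, y, z, occupancy, B-factor) in fractional coordinates.
    """
    element, f_double_prime = anomalous
    lines = [f"TIT {title}", "SAD", labin, contents,
             f"ANO {element} {f_double_prime}"]
    for n, (el, x, y, z, occ, bfac) in enumerate(sites, start=1):
        prefix = "POS " if n == 1 else ""   # assumed: keyword on first atom only
        lines.append(f"{prefix}{el} {n} {x} {y} {z} {occ} {bfac}")
    return lines

def oasis_command(hklin, hklout, xyzin=None):
    """Return the argument list following the SYNOPSIS."""
    cmd = ["oasis4.0", "HKLIN", hklin, "HKLOUT", hklout]
    if xyzin is not None:
        cmd += ["XYZIN", xyzin]
    return cmd
```

The keyword lines would normally be supplied to the program as keyworded input; how they are passed (for example on standard input) follows the usual CCP4 practice and is not specified here.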
AUTHORS
Tao Zhang [2,1], Li-jie Wu [1],
Yao He [1], Jia-wei Wang [1],
Chao-de Zheng [1], Quan Hao [3],
Yuan-xin Gu [1] & Hai-fu Fan [1]
[1] Beijing National Laboratory for Condensed Matter Physics,
Institute of Physics, Chinese Academy of Sciences, Beijing 100190, China
[2] School of Physical Sciences and Technology, Lanzhou University, Gansu,
Lanzhou 730000, China
[3] Department of Physiology, University of Hong Kong, Hong Kong, China
Emails: gu@cryst.iphy.ac.cn;
fanhf@cryst.iphy.ac.cn
REFERENCES
- Fan, H.F. & Gu, Y.X. (1985). "Combining direct
methods with isomorphous replacement or anomalous scattering data
III. The incorporation of partial structure information". Acta Cryst.
A41, 280-284.
- Hao, Q., Gu, Y.X., Zheng, C.D. & Fan, H.F. (2000). "OASIS: a program
for breaking phase ambiguity in OAS or SIR". J. Appl. Cryst.
33, 980-981.
- Wang, J.W., Chen, J.R., Gu, Y.X., Zheng, C.D. & Fan, H.F. (2004).
"Direct-method SAD phasing with partial-structure iteration - towards automation".
Acta Cryst. D60, 1991-1996.
- He, Y., Yao, D.Q., Gu, Y.X., Lin, Z.J., Zheng, C.D. & Fan, H.F. (2007).
"OASIS and MR-model completion". Acta Cryst. D63, 793-799.
- Wu, L.J., Zhang, T., Gu, Y.X., Zheng, C.D. & Fan, H.F. (2009).
"Direct-method SAD phasing of proteins enhanced by the use of intrinsic bimodal
phase distributions in subsequent phase-improvement process". Acta Cryst.
D65, 1213-1216.
- Zhang, T., Wu, L.J., He, Y., Wang, J.W., Zheng, C.D., Hao, Q., Gu, Y.X. &
Fan, H.F. (2009). "OASIS4.0: new version with improved SAD-phasing algorithm and
a GUI for controlling and monitoring iterative processes".
(In preparation).
* PDF files of the REFERENCES are available at
http://cryst.iphy.ac.cn
ACKNOWLEDGEMENTS
The authors are grateful to Dr T. C. Terwilliger for his very kind permission for the
incorporation of the subroutine HENDFT.F in OASIS. Thanks are also due to Professor N. Watanabe
for the test data for TTHA1012 and Dr Z. Dauter for the test data for xylanase. This work was
supported by the Innovation Project of the Chinese Academy of Sciences and by the 973 Project
(grant No. 2002CB713801) of the Ministry of Science and Technology of China.