Description of a FOCUS input file


For background information refer to the Dissertation ETH No. 11422, section 4.1.2.
For clarity, the example FOCUS input file has been divided into blocks separated by empty lines.

The general information supplied at the beginning of the file defines the space group and lattice constants as refined with GSAS.

The next two AtomType lines define the cell contents of the structure to be solved, as determined by chemical analysis or estimated by other means. The first item after the keyword AtomType is either "+" or "-". All atoms specified with an AtomType line are used in the calculation of F000 (the Fourier magnitude at the origin of reciprocal space), but only atoms with the "+" marker are considered in the atom and/or framework fragment recycling procedures. The next item is a "class label": Node, NodeBridge, or "*", where the latter is for non-framework atoms. After the class label, an "atom label" and the number of atoms of this type per unit cell are supplied. Three further items are optional and not used here: the occupancy factor to be used in the recycling (preset to 1.0), the isotropic temperature factor (preset to 0.035), and a "scattering factor label" (by default derived from the preceding atom label).

For example, the line

AtomType  -  *  Ow  20  1.25  0.05  O

describes an oxygen with an occupancy of 1.25 and an isotropic temperature factor of 0.05 Å², which is a commonly used approximation for water molecules in zeolite channels. The scattering factor used is that of oxygen, and 20 water molecules per unit cell are expected. However, experience has shown that recycling extra-framework atoms is not efficient, and for the calculation of F000 it would be sufficient to supply one AtomType line for oxygen and one for hydrogen using the default occupancy and temperature factors.
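
The field order described above can be illustrated with a small parser. This is only a sketch in Python, not part of FOCUS: the names AtomType (as a class), temp_factor, sf_label and parse_atom_type are hypothetical, and the derivation of the scattering factor label from the atom label is deliberately left open (None) because the text does not specify it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AtomType:
    recycle: bool                   # True for "+", False for "-"
    class_label: str                # "Node", "NodeBridge", or "*"
    atom_label: str
    n_per_cell: int
    occupancy: float = 1.0          # preset stated in the text
    temp_factor: float = 0.035      # preset stated in the text
    sf_label: Optional[str] = None  # None: to be derived from atom_label


def parse_atom_type(line: str) -> AtomType:
    """Split one AtomType line into its fields (hypothetical helper)."""
    fields = line.split()
    if len(fields) < 5 or fields[0] != "AtomType":
        raise ValueError(f"not an AtomType line: {line!r}")
    if fields[1] not in ("+", "-"):
        raise ValueError("marker must be '+' or '-'")
    if fields[2] not in ("Node", "NodeBridge", "*"):
        raise ValueError("unknown class label")
    atom = AtomType(fields[1] == "+", fields[2], fields[3], int(fields[4]))
    # Optional trailing fields, in the order given in the text.
    if len(fields) > 5:
        atom.occupancy = float(fields[5])
    if len(fields) > 6:
        atom.temp_factor = float(fields[6])
    if len(fields) > 7:
        atom.sf_label = fields[7]
    return atom


water = parse_atom_type("AtomType  -  *  Ow  20  1.25  0.05  O")
print(water)
```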

The next block of five lines is related to the atom recycling procedure. The Chemistry MinDistance lines define the individual minimum distances for each pair of atom types used in the atom recycling procedure. Following Chemistry MinDistance are two pairs of "class label" and "atom label", as defined on the AtomType lines, and the minimum distance for this pair of atom types in the same units as the lattice constants, usually Å. It should be noted that bonding is not considered in atom recycling mode. The minimum distances apply to all pairs of atoms, whether they are bonded or not.

Remark: since there is a "-" on the AtomType line for NodeBridge O, this atom type is not used in the atom recycling procedure. Therefore it would be sufficient to supply only the first Chemistry MinDistance line.

MaxPotentialAtoms gives the maximum number of peaks which are considered in the assignment algorithm. For example, with the value in the example, if the algorithm tries to assign a silicon atom to one of the peaks in the asymmetric unit, but is not able to find a valid position among the peaks in the asymmetric unit which generate the 102 highest peaks in the unit cell, the silicon is not assigned at all. MaxRecycledAtoms prescribes the maximum number of atoms in the unit cell that are actually assigned and is forced to be less than or equal to MaxPotentialAtoms.
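
The role of the pairwise minimum distances can be sketched as a lookup table keyed by unordered pairs of atom labels. The Python below is an illustration only: the distance values are made up, the function name is hypothetical, and the check works on plain Cartesian coordinates without the symmetry and lattice translations that a real implementation has to include.

```python
import math
from itertools import combinations

# Illustrative values only (in Å); the real ones come from the
# Chemistry MinDistance lines of the input file.
MIN_DISTANCE = {
    frozenset(["Si"]): 2.6,       # Si-Si
    frozenset(["Si", "O"]): 1.4,  # Si-O
    frozenset(["O"]): 2.3,        # O-O
}


def violates_min_distance(atoms):
    """Return True if any pair of atoms is closer than its minimum distance.

    atoms: list of (atom_label, (x, y, z)) with Cartesian coordinates in Å.
    Bonding is not considered; every pair is tested.
    """
    for (label_a, pos_a), (label_b, pos_b) in combinations(atoms, 2):
        limit = MIN_DISTANCE.get(frozenset([label_a, label_b]))
        if limit is not None and math.dist(pos_a, pos_b) < limit:
            return True
    return False


print(violates_min_distance([("Si", (0, 0, 0)), ("Si", (2.0, 0, 0))]))
print(violates_min_distance([("Si", (0, 0, 0)), ("O", (1.6, 0, 0))]))
```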

The following block specifies the parameters for the framework and framework fragment search procedure. FwSearchMethod is either FwTracking or AltFwTracking, which are simple backtracking and "colored" backtracking, respectively. When atoms are recycled and only complete frameworks are sought, MaxPeaksFwSearch defines the maximum number of peaks in the unit cell that are used in the backtracking procedure.

In framework fragment recycling mode, MaxPeaksFwFragmentSearch determines the maximum number of peaks. Since the fragment search is significantly slower than the search for complete frameworks only, it is sometimes necessary to set MaxPeaksFwFragmentSearch to a smaller value than MaxPeaksFwSearch in order to retain reasonable computing times.

MinNodeDistance and MaxNodeDistance establish the lower and upper limits for the node-node distances which are used in the preparation of the lists of potential node-node bonds. In this case, a tolerance of 0.5 Å around the "ideal" distance of 3.1 Å is set.
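
As a worked illustration of the distance window, the following Python sketch derives the two limits from the "ideal" distance and the tolerance quoted above and tests candidate distances against them. The variable and function names are hypothetical.

```python
IDEAL_NODE_DISTANCE = 3.1  # Å, "ideal" node-node distance quoted in the text
TOLERANCE = 0.5            # Å

min_node_distance = IDEAL_NODE_DISTANCE - TOLERANCE  # lower limit, about 2.6 Å
max_node_distance = IDEAL_NODE_DISTANCE + TOLERANCE  # upper limit, about 3.6 Å


def is_potential_bond(distance):
    """True if a node-node distance (in Å) falls inside the allowed window."""
    return min_node_distance <= distance <= max_node_distance


for d in (2.4, 3.05, 3.7):
    print(d, is_potential_bond(d))
```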

MinSymNodes and MaxSymNodes set the lower and upper limits for the number of framework nodes per unit cell. While MinSymNodes just prevents frameworks with too low a density from being evaluated and printed, MaxSymNodes cuts complete branches of the search tree. On the one hand, this can reduce the computing time for frameworks with a well-established low density, but on the other, one has to be careful not to prescribe a value that is too small. Normally, the choice for MaxSymNodes is based on the consideration that the number of T-sites per 1000 Å³ in a zeolite must be less than 20.
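
The density argument translates into simple arithmetic: compute the unit cell volume and scale the limit of 20 T-sites per 1000 Å³ to it. The Python sketch below uses the general triclinic volume formula; the hexagonal cell dimensions are hypothetical placeholders, not the values of the example file, and the function names are assumptions.

```python
import math


def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit cell volume in Å³ from lengths in Å and angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


def max_sym_nodes_bound(volume, max_density=20.0):
    """Upper bound on nodes per cell for a density limit per 1000 Å³."""
    return math.floor(max_density * volume / 1000.0)


# Hypothetical hexagonal cell, for illustration only.
volume = cell_volume(13.8, 13.8, 11.2, 90.0, 90.0, 120.0)
print(round(volume, 1), max_sym_nodes_bound(volume))
```

A value chosen for MaxSymNodes should not exceed this bound, and may be lower when the framework density is known to be low.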

The NodeType line defines the number of bonds for a given node type, the maximum number of nodes of this type in the asymmetric unit, and a list of the symmetry elements which cannot be occupied by a node of this type. In the example, only one node type with tetrahedral connectivity is defined. The asterisk "*" specifies that an unlimited number of nodes in the asymmetric unit can be of this type. The following numbers "-6 -3 -1 4 6" specify that this node type cannot be on a six- or threefold rotoinversion axis, an inversion center, or a four- or sixfold rotation axis.
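
To make the meaning of the fields concrete, here is a small Python sketch that splits a NodeType line and spells out the excluded symmetry elements. The line itself is reconstructed from the description above (four bonds, "*", then the list of codes) and the field order is an assumption; parse_node_type and SYMMETRY_ELEMENTS are hypothetical names.

```python
# Meaning of the codes mentioned in the text.
SYMMETRY_ELEMENTS = {
    -6: "sixfold rotoinversion axis",
    -3: "threefold rotoinversion axis",
    -1: "inversion center",
    4: "fourfold rotation axis",
    6: "sixfold rotation axis",
}


def parse_node_type(line):
    """Return (number of bonds, max nodes or None for '*', excluded codes)."""
    fields = line.split()
    if len(fields) < 3 or fields[0] != "NodeType":
        raise ValueError(f"not a NodeType line: {line!r}")
    n_bonds = int(fields[1])
    max_nodes = None if fields[2] == "*" else int(fields[2])
    excluded = [int(code) for code in fields[3:]]
    return n_bonds, max_nodes, excluded


# Line reconstructed from the description; assumed field order.
n_bonds, max_nodes, excluded = parse_node_type("NodeType  4  *  -6 -3 -1 4 6")
for code in excluded:
    print(code, SYMMETRY_ELEMENTS[code])
```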

Supplying a value greater than three for MinLoopSize has two consequences. When atoms are recycled and only complete frameworks are sought, frameworks which have loops with fewer than MinLoopSize members are rejected (that is, they are not printed). In framework fragment recycling mode, the fragments which are candidates for the "largest fragment" for recycling are checked for MinLoopSize. Unfortunately, the present implementation of the loop size test is very time consuming: the time spent on the fragment search increases by roughly 40%. In this example, MinLoopSize was therefore kept at its default value of three, although four is perhaps more appropriate for high silica frameworks. (However, the structure of the high silica ZSM-18 (MEI) does contain 3-rings.)

MaxLoopSize is less critical than MinLoopSize and just specifies the maximum loop size up to which the LC algorithm advances. The default value of 24 is sufficient for all known zeolite topologies. For loops with more than MaxLoopSize members, a "0" is printed. Cases where smaller values would result in a speed gain for the price of having some zeros in the LC are hardly imaginable.

In the example, EvenLoopSizesOnly is switched Off. This means that all loop sizes greater than or equal to MinLoopSize are allowed. The EvenLoopSizesOnly option was introduced for the search for frameworks where a strict alternation of two atom types is expected. In these cases, only even loop sizes are possible. A special problem arises for aluminophosphates. Since the scattering powers of Al and P are only slightly different, it is often not possible to determine the true space group from the powder profile. Only after the structure is known can one introduce the strict Al-P alternation, which in many cases reduces the symmetry. For this situation, EvenLoopSizesOnly provides a more robust alternative to AltFwTracking.

It has to be noted that in framework fragment search mode the impact of EvenLoopSizesOnly on the computing time requirements is similar to setting MinLoopSize to a value greater than three. However, since loop sizes have to be computed only once per framework or framework fragment, MinLoopSize greater than three does not result in more time consumption if EvenLoopSizesOnly is switched On.

Check3DimConnectivity is followed by one of the keywords On or Off. If On, a filter procedure is called for each framework topology found. Only 3-dimensionally connected frameworks can pass this filter; layer or chain structures are rejected.

IdealT_NodeDistance specifies the "ideal" node-node distance for four-connected nodes. This is the basic value for the geometrical tests, which are further specified by the CheckTetrahedralGeometry keyword, which is followed by Off, Normal, or Hard. For high silica and Si-Al frameworks like Dodecasil-1H, the Hard test is appropriate.

The next block with three input lines describes the initialization and development of the "trials". The keyword RandomInitialization is used to define the "seed" value for a portable pseudo random number generator, which is used to generate the starting phases. The special value Time tells FOCUS to use the machine time for the automatic determination of the seed value, which is then printed in the output file. This integer value, like any positive integer value, can be resupplied with RandomInitialization in order to rerun FOCUS with different output options or for testing or debugging purposes.

Each new starting phase set generated prior to the Fourier recycling procedure is considered to be a trial. The FeedBackCycles keyword is followed by an arbitrarily long sequence of nonnegative integers (including zero). The first integer specifies the number of times the atom recycling procedure is to be used in one trial, the second integer is for the number of framework fragment recycling loops, the third again for atom recycling, and so on. In the example, ten cycles with alternation of atom and framework fragment recycling are requested. However, as the next keyword FeedBackBreakIf indicates, the recycling is prematurely terminated if both the phase set and the RF residual value have converged. Another special situation arises when no fragment that can be recycled is found. In this case, a trial continues with atom recycling (but the cycle is still counted as a framework fragment recycling cycle).

Experience has shown that a simple alternation of atom recycling and framework fragment recycling, as in the example, is usually the most efficient approach.
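
The way the integer sequence after FeedBackCycles maps onto recycling cycles can be sketched in Python. The function name is hypothetical, and the sequence of ten ones is an assumed way of requesting ten strictly alternating cycles; early termination through FeedBackBreakIf and the fallback to atom recycling are not modeled.

```python
def feedback_schedule(counts):
    """Expand a FeedBackCycles-style sequence into a list of cycle types.

    Integers at even positions (first, third, ...) count atom recycling
    cycles; integers at odd positions count framework fragment cycles.
    """
    schedule = []
    for position, count in enumerate(counts):
        mode = "atom" if position % 2 == 0 else "fragment"
        schedule.extend([mode] * count)
    return schedule


# Assumed input for ten cycles with simple alternation.
print(feedback_schedule([1] * 10))
# A zero skips a stage: two atom cycles, no fragment cycle, one atom cycle.
print(feedback_schedule([2, 0, 1]))
```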

The next block concerns the layout of the electron density map and the characteristics of the peak search and refinement. In the example, the grid for the electron density map is defined such that a resolution of about 1/3 Å is achieved. One must also take care that all symmetry elements pass through grid points. In its present form, FOCUS does not automatically generate an appropriate grid, but it does refuse to work with grid sizes that do not conform to this requirement. For example, in space group P-1 the grid sizes for all directions have to be a multiple of two, in order to have all inversion centers lying on a grid point. In the present case of space group P6/mmm, the grid size in the z-direction has to be a multiple of two, and the grid sizes for the x- and y-direction have to be a multiple of six.
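
Since the grid has to be chosen by hand, a small helper can suggest the smallest grid size that meets both the resolution target and the divisibility requirement. This Python sketch is not part of FOCUS; the function name is hypothetical and the cell lengths are placeholder values, not those of the example file.

```python
import math


def grid_size(cell_length, multiple, points_per_angstrom=3):
    """Smallest multiple of `multiple` giving a spacing of at most
    1/points_per_angstrom Å along an axis of length cell_length (in Å)."""
    min_points = cell_length * points_per_angstrom
    return multiple * math.ceil(min_points / multiple)


# P6/mmm: x and y need a multiple of six, z a multiple of two.
a, c = 13.8, 11.2  # hypothetical cell lengths in Å
nx = ny = grid_size(a, 6)
nz = grid_size(c, 2)
print(nx, ny, nz)
print(round(a / nx, 3), round(c / nz, 3))  # resulting grid spacings in Å
```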

The eDensityCutOff value specifies the lower cut-off value for the peak search in the electron density maps. This specification can either be an absolute value, e.g. eDensityCutOff 1.0, or relative to the maximum value of the whole map, as in the example. The overall maximum value of MaxPotentialAtoms, MaxPeaksFwSearch, and MaxPeaksFwFragmentSearch is the maximum number of peaks in the unit cell which are put on the peaklist by the peaksearch procedure. However, if there are fewer than this number of peaks with a maximum peak height above the value set by eDensityCutOff, the list will contain fewer peaks.

The next three keywords, MinPfI, CatchDistance, and eD_PeaksSortElement, determine the behavior of the peaklist refinement procedures. MinPfI ("minimum number of points for interpolation") defines the minimum number of grid points with a positive electron density value surrounding a grid peak position. If the actual number is fewer than MinPfI, no interpolation for the peak position is carried out and the coordinates of the grid point are retained. CatchDistance is the minimum distance a peak must have to all of its symmetrically equivalent peaks (self-distance). For self-distances smaller than CatchDistance, a procedure is activated which moves the peak onto the symmetry element that is responsible for the close contact.

After all peak positions have been refined, the peaklist is sorted according to eD_PeaksSortElement, which can be specified as Grid_eD, Maximum, or Integral.

The last block specifies treatment and usage of the extracted intensities. First of all, the wavelength used in the diffraction experiment is specified with Lambda, followed by either a decimal value for the wavelength (in the same units as the values supplied with UnitCell) or one of the codes for the internally stored wavelengths (which are in Å units). FobsMin_d sets the minimum d-spacing for the reflections to be used. FobsScale defines the scale factor, which was determined with the Xtal GENEV module. SigmaCutOff is set to zero in this example, because the GSAS REFLIST command does not produce standard deviations for the extracted intensities. If standard deviations are available, reflections with an intensity smaller than SigmaCutOff times their standard deviation can be excluded.
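
The SigmaCutOff criterion amounts to a simple filter on the reflection list. The Python sketch below is an illustration under stated assumptions: the tuple layout and the function name are hypothetical, and a missing standard deviation is represented by None.

```python
def apply_sigma_cutoff(reflections, sigma_cutoff):
    """Drop reflections whose intensity is below sigma_cutoff * sigma.

    reflections: list of (hkl, intensity, sigma); sigma may be None when
    no standard deviation is available. A cutoff of zero keeps everything.
    """
    kept = []
    for hkl, intensity, sigma in reflections:
        weak = (
            sigma_cutoff > 0
            and sigma is not None
            and intensity < sigma_cutoff * sigma
        )
        if not weak:
            kept.append((hkl, intensity, sigma))
    return kept


data = [((1, 0, 0), 50.0, 5.0), ((1, 1, 0), 4.0, 3.0), ((0, 0, 1), 9.0, None)]
print(apply_sigma_cutoff(data, 0.0))  # nothing is excluded
print(apply_sigma_cutoff(data, 2.0))  # (1, 1, 0) is excluded: 4.0 < 2 * 3.0
```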

The OverlapFactor together with the individual FWHM for each reflection is used to determine the overlap groups, which are then processed according to OverlapAction, which is one of NoAction, EqualF2, or EqualMF2.

ReflectionUsage specifies the number of reflections that are actually used. This can be absolute, for example ReflectionUsage 80 will select the 80 strongest reflections, or it can be relative, as in the example. In the latter case, reflections are selected in descending order of (equipartitioned) intensity times multiplicity (M·F²) until the prescribed percentage of the total sum of M·F² over all input reflections is accumulated.
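
The relative selection rule can be written down in a few lines of Python. This is a sketch, not the FOCUS implementation: the function name and tuple layout are hypothetical, and ranking the absolute selection by the same intensity-times-multiplicity product is an assumption.

```python
def select_reflections(reflections, usage, relative=True):
    """Select reflections by absolute count or by accumulated percentage.

    reflections: list of (hkl, multiplicity, f_squared).
    usage: a percentage if relative is True, otherwise a reflection count.
    """
    ranked = sorted(reflections, key=lambda r: r[1] * r[2], reverse=True)
    if not relative:
        return ranked[: int(usage)]
    target = usage / 100.0 * sum(m * f2 for _, m, f2 in ranked)
    selected, running = [], 0.0
    for refl in ranked:
        if running >= target:
            break
        selected.append(refl)
        running += refl[1] * refl[2]
    return selected


# Products of multiplicity and intensity: 60, 30 and 10 (total 100).
data = [((0, 0, 1), 2, 5.0), ((1, 0, 0), 6, 10.0), ((1, 1, 0), 6, 5.0)]
print(select_reflections(data, 80))                  # relative: 80 %
print(select_reflections(data, 1, relative=False))   # absolute: 1 reflection
```

With these numbers the relative selection stops after two reflections, because their accumulated product (90) already exceeds 80 % of the total.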

The last part of the input file is a listing of the extracted Fourier magnitudes. The data are given as reflection indices hkl, observed relative Fourier magnitude, the estimated standard deviation of the Fourier magnitude and the FWHM as derived from the refined profile parameters. If estimated standard deviations are not available, asterisks can be supplied instead.