Q: How do I pick the correct parameters for automasking during refinement, or when using the mask.auto3d processor?
Additional note: the new e2filtertool.py program provides an excellent way to determine good mask parameters interactively!
A: When you produce a 3-D structure using single particle analysis, there is inevitably residual noise left in the final reconstruction. While there is nothing you can do about the noise on top of the particle density itself, noise which is clearly outside the particle volume, in the solvent region, can and should be removed. When used as part of an iterative refinement process, this can reduce the overall noise level in the projections and produce more accurate particle orientations, and hence more accurate reconstructions. However, if not done carefully, this process can also contribute to model bias and cause resolution exaggeration.
The automasking procedure used in both EMAN1 and EMAN2 is safe for most purposes. If, however, you have a structure with a highly mobile peripheral domain, or some other low density peripheral feature, it must be used with caution to avoid 'chopping off' parts of the structure you wish to visualize. In EMAN2.1 the automask parameters are determined automatically, and you should only specify this option if the automatic parameters don't work well.
The basic algorithm is fairly straightforward. Starting in the middle of the map, we first define a sphere which is inside, but touching, some portion of the actual model. This is defined by the 'radius=' parameter (in pixels). That is, choose the radius of a sphere, centered on the map, which is large enough to contact some portion of your model's density, but NOT large enough to extend outside your model.
This sphere is used as a seed for a flood-filling algorithm. The program will find any density above a specified threshold which is in contact with this sphere. For this, you specify the density threshold with the 'threshold=' parameter. This number should generally be similar to the isosurface value you use when visualizing your map (perhaps slightly lower).
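The seeding and flood-filling steps described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not EMAN2's implementation: the function name `flood_fill_mask`, the nested-list volume layout `vol[z][y][x]`, the use of 6-connected neighbors, and the choice to include the whole seed sphere in the mask are all assumptions made for the example.

```python
from collections import deque

def flood_fill_mask(vol, radius, threshold):
    """Return the set of (z, y, x) voxels reachable from a central seed sphere.

    vol is a nested list indexed as vol[z][y][x] (hypothetical layout).
    Assumption: the seed sphere itself is always part of the mask, and
    growth uses 6-connected (face-sharing) neighbors.
    """
    nz, ny, nx = len(vol), len(vol[0]), len(vol[0][0])
    cz, cy, cx = nz // 2, ny // 2, nx // 2

    # Seed: every voxel within `radius` pixels of the box center.
    seeds = [
        (z, y, x)
        for z in range(nz) for y in range(ny) for x in range(nx)
        if (z - cz) ** 2 + (y - cy) ** 2 + (x - cx) ** 2 <= radius ** 2
    ]
    mask = set(seeds)
    queue = deque(seeds)
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

    # Breadth-first growth: add any neighbor at or above the threshold.
    while queue:
        z, y, x = queue.popleft()
        for dz, dy, dx in steps:
            n = (z + dz, y + dy, x + dx)
            if n in mask:
                continue
            if not (0 <= n[0] < nz and 0 <= n[1] < ny and 0 <= n[2] < nx):
                continue
            if vol[n[0]][n[1]][n[2]] >= threshold:
                mask.add(n)
                queue.append(n)
    return mask
```

Because growth only proceeds through voxels connected to the seed, a bright noise voxel floating in the solvent is left out of the mask even if its value is well above the threshold.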
Now, if this were used by itself to mask out the map, your map would have a sharp edge, you would get model bias, and the sharp edge could produce resolution exaggeration with the even/odd test. To avoid this, there are two additional parameters. The first is 'nshells=', which is the number of 1-voxel shells by which to extend the mask outward from the initial 'tight' mask we just determined around the surface. This produces a mask which is still shaped like your particle, but is far enough away from the surface that it isn't cutting through any important densities. This value is typically ~5% of the box size (in pixels).
The final parameter is similar to nshells. The 'nshellsgauss=' parameter specifies an additional number of shells to extend the mask outward from the model, but in this case, the mask is combined with a Gaussian decay. This 'soft edge' on the mask prevents resolution exaggeration.
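To make the combined effect of the hard shells and the soft edge concrete, here is a minimal sketch of the mask value as a function of distance from the tight mask. The source does not state the width of the Gaussian that EMAN2 uses, so the falloff below (3 sigma reached at the outer edge) is purely an assumption, as is the helper name `mask_weight`.

```python
import math

def mask_weight(dist, nshells, nshellsgauss):
    """Mask value for a voxel `dist` shells outside the tight mask.

    dist = 0 means the voxel is inside the tight (flood-filled) mask.
    Assumption: the Gaussian reaches 3 sigma at the outer edge of the
    soft region; the real processor may use a different width.
    """
    if dist <= nshells:
        return 1.0   # inside the tight mask or the hard expansion shells
    if dist > nshells + nshellsgauss:
        return 0.0   # beyond the soft edge: fully masked out
    sigma = nshellsgauss / 3.0
    return math.exp(-((dist - nshells) ** 2) / (2.0 * sigma ** 2))

# Radial profile for nshells=8, nshellsgauss=8 (values from the example command).
profile = [mask_weight(d, 8, 8) for d in range(0, 20)]
```

The profile stays at 1.0 through the first 8 shells, decays smoothly over the next 8, and is 0 beyond that, so the mask never introduces a sharp density step at the particle surface.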
Combined, this mask produces something very similar to 'solvent flattening' in x-ray crystallography.
In early 2010, a 5th parameter, 'nmaxseed=', was added. This can replace or supplement the sphere used to seed the flood-filling process. Rather than using a sphere, it finds the N largest values in the map and uses these as seeds. This is particularly helpful for large spherical particles, like viruses, and permits inside as well as outside masking of the structure. However, if some of the noise outside the particle happens to have a high value, this mechanism will work poorly. You can use both mechanisms together, though generally choosing one and setting the other to 0 makes the most sense.
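Seeding from the N largest values is easy to illustrate with the standard library. This is a sketch of the idea only; the function name `nmax_seeds` and the nested-list volume layout `vol[z][y][x]` are assumptions for the example, not part of EMAN2.

```python
import heapq

def nmax_seeds(vol, nmaxseed):
    """Return the (z, y, x) positions of the `nmaxseed` highest-valued voxels.

    vol is a nested list indexed as vol[z][y][x] (hypothetical layout).
    Positions are returned from highest value to lowest.
    """
    voxels = (
        (value, (z, y, x))
        for z, plane in enumerate(vol)
        for y, row in enumerate(plane)
        for x, value in enumerate(row)
    )
    return [pos for _, pos in heapq.nlargest(nmaxseed, voxels)]
```

This also shows why the approach is sensitive to outliers: a single very bright noise voxel in the solvent would be selected as a seed just like a genuine peak in the particle.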
To summarize, there are 5 parameters:
radius | radius big enough to touch the inside of the model
nmaxseed | number of highest-value voxels to use as seeds for the model
threshold | isosurface threshold (maybe slightly lower)
nshells | number of shells to expand away from the structure, ~5% of box size in pixels
nshellsgauss | number of shells to expand with Gaussian decay, ~5% of box size in pixels
You can test it using (for example):
e2proc3d.py bdb:refine_01#threed_filt_01 testvolume.mrc --process mask.auto3d:radius=30:threshold=.8:nshells=8:nshellsgauss=8:nmaxseed=0
These parameters can also be specified in the refinement dialog in the workflow. Note that in EMAN2.1 the automask option in e2refine_easy takes a processor-like specification, as above. In EMAN2.0x, however, you had to specify the numbers in a specific order instead (see e2refine.py --help for details).
For viruses, we suggest using nmaxseed=60 and radius=0. For other particles, the advantage of nmaxseed over a radius is that it is generally fairly particle independent. Note, however, that if you have something like RNA, the highest densities in your map may be in the core, not the capsid shell. In that case, you may need to set a post-processor in the 3-D reconstruction step using the mask.soft processor with a fixed inner and outer radius.
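The ~5% rule of thumb and the processor syntax shown in the e2proc3d.py example can be combined in a small helper that builds the option string for you. The helper name `automask_spec` and the box sizes used below are hypothetical; the rounding of the 5% figure is also an assumption, and the result is only a starting point to be checked visually (for example in e2filtertool.py).

```python
def automask_spec(box_size, radius, threshold, nmaxseed=0, shell_fraction=0.05):
    """Build a mask.auto3d processor string from the rules of thumb above.

    box_size is the box edge length in pixels. nshells and nshellsgauss
    are both set to ~5% of the box size (rounded, at least 1).
    """
    nshells = max(1, round(box_size * shell_fraction))
    return (
        f"mask.auto3d:radius={radius}:threshold={threshold}"
        f":nshells={nshells}:nshellsgauss={nshells}:nmaxseed={nmaxseed}"
    )

# Sphere-seeded mask for a hypothetical 160-pixel box.
spec_particle = automask_spec(160, radius=30, threshold=0.8)

# nmaxseed-seeded mask for a virus in a hypothetical 256-pixel box.
spec_virus = automask_spec(256, radius=0, threshold=0.8, nmaxseed=60)
```

The resulting string can be passed to e2proc3d.py after `--process`, exactly as in the example command.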