Frequently Asked Questions

Here are detailed answers to some questions asked by users. If you don't find what you need here, email sludtke@bcm.tmc.edu.
  • I'm trying my first EMAN refinement, but the refine command crashes or produces black images in classes.1.hed. What's going on?

    There are several possible causes for this. The first possibility is that your particles in start.hed and your initial 3D model in threed.0a.mrc aren't the same size. Do an 'iminfo start.hed' and 'iminfo threed.0a.mrc' and make sure they're the same size in pixels.

    Aside from that, by far the most likely problem is that you used the 'ctfc=' or 'ctfcw=' options in the 'refine' command, but didn't properly prepare the input particles with 'ctfit', 'fitctf' and/or 'applyctf'. The answer here is RTFM (read the _ manual). To see whether your particles are prepared for CTF correction, run the 'eman' file browser and look at the start.img file. In the text box below the image, you should see something like:

    '!--2.38 252 1.22 0.15 0 6.84 2.54 1.43 400 4.1 2.7'

    Your numbers will be different, of course, but there should be a row of numbers beginning with '!-'. If you don't see this, then your images haven't been properly phase flipped.
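
    If you have copied that text out of the browser, the same check can be sketched in a few lines of Python. This is only an illustration, not part of EMAN: the function name parse_ctf_label is made up here, and it assumes the marker is the two characters '!-' followed by space-separated numbers (so the second '-' in the example above is the minus sign of the first value).

```python
def parse_ctf_label(label):
    """Return the numbers that follow a '!-' marker, or None if the marker is missing.

    Assumption: the label is '!-' followed by space-separated numbers,
    as in the example shown in this FAQ.
    """
    text = label.strip().strip("'")
    if not text.startswith("!-"):
        return None
    try:
        return [float(token) for token in text[2:].split()]
    except ValueError:
        # Something other than numbers followed the marker.
        return None


example = "!--2.38 252 1.22 0.15 0 6.84 2.54 1.43 400 4.1 2.7"
values = parse_ctf_label(example)
print(values)
print(parse_ctf_label("no CTF information here"))
```

    A result of None means the label does not carry the '!-' row of numbers described above.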

  • How can I read images in format ABC? What file formats does EMAN support?

    EMAN supports many different formats, and it does so transparently. That is, in general, any EMAN program can read an image in any of a wide variety of formats without you having to do anything special. EMAN currently supports reading SPIDER, IMAGIC, MRC, Gatan DM2, Gatan DM3, PIF and ICOS formats. TIFF images are now natively supported using libtiff, so you should be able to read 16 bit TIFFs directly. Most generic image formats (TIFF, GIF, PGM, BMP, PNG, etc.) are also supported if you have the ImageMagick package installed on your machine. Because Gatan constantly changes things, we cannot guarantee that DM3 file reading will be perfect.

    For image writing, EMAN supports most of the above formats as well. However, most EMAN programs default to IMAGIC format (for 2D) and MRC format (for 3D). To convert to a different format, use 'proc2d' for 2D images and 'proc3d' for 3D images.

    Some of you may also be aware of the 'byte ordering' issue. Different machines (SGI vs Intel, for example) store numbers in opposite byte orders. Often this means files generated on one machine will be unreadable on machines using the opposite convention. However, EMAN handles this problem as well. Any supported image can be read regardless of byte order. When writing images, EMAN uses the native byte order of the machine the software is being run on.
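
    To make the byte-order issue concrete, here is a small Python sketch. It relies on one fact about the MRC format (the header starts with three 32-bit integers nx, ny, nz) and guesses the byte order by checking which interpretation gives plausible dimensions. The function name read_mrc_dims and the plausibility limit are assumptions made for this illustration; this is not necessarily how EMAN itself detects byte order.

```python
import struct


def read_mrc_dims(header_bytes):
    """Guess the byte order of an MRC header and return (order, (nx, ny, nz)).

    '<' means little-endian (Intel), '>' means big-endian (SGI).
    Heuristic (an assumption of this sketch): the correct byte order is the
    one in which all three dimensions fall between 1 and 65535.
    """
    for order in ("<", ">"):
        dims = struct.unpack(order + "3i", header_bytes[:12])
        if all(0 < d < 65536 for d in dims):
            return order, dims
    raise ValueError("dimensions are implausible in either byte order")


# The same 128 x 128 x 128 volume size written with each convention.
little = struct.pack("<3i", 128, 128, 128)
big = struct.pack(">3i", 128, 128, 128)

print(read_mrc_dims(little))
print(read_mrc_dims(big))

# Reading big-endian bytes with the wrong convention gives a nonsense value.
misread = struct.unpack("<i", big[:4])[0]
print(misread)
```

    The misread value shows why a file written on one kind of machine looks unreadable on the other unless the software swaps the bytes.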

  • Symmetry determination - Do you have a facility for calculating rotational power spectra? How can I determine the symmetry of my particles?
    added 2/4/2004, EMAN 1.5

    This can be done in EMAN, though it doesn't use rotational power spectra. Real-space approaches are more accurate, though proper centering is critical. Past attempts at the rotational power spectrum approach (on several test cases) showed it to be unreliable and imprecise.

    First, center the particles:
    cenalignint particles.hed maxshift=<pixels>
            (warning, this can use a lot of memory. You should have 3x
            as much ram as the size of the file you operate on. If not,
            use the frac= option)
    - or -
    proc2d particles.hed centered.hed <center | acfcenter>
    

    One of those three should do a decent job centering your particles (they do not need to be in the same orientation).

    Then take the centered data and run :
    startcsym centered.hed <# top view particles to keep> sym=<trial symmetry>
    

    While this is also designed to look for side views, it will find top views (with the corresponding symmetry) very nicely.

    So, pick a trial symmetry, and run startcsym. Then look at the first 2 images in classes.hed and the first image in sym.hed. The first image in classes.hed is an unsymmetrized particle with the strongest specified symmetry. The first image in sym.hed is a symmetrized version. If the two look the same and have a visible symmetry, you've probably got the right answer. Repeat for all possible symmetries. The answer will usually stand out very clearly, and can be presented in publication by showing the 2 images side-by-side for each trial symmetry. Note that there are some known situations (detached virus portal complexes, for example) where a single data set may contain particles with multiple symmetries.
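
    Since the same command has to be repeated for each trial symmetry, it can help to write the command lines out first. The Python sketch below only builds the command strings and does not run anything. The helper name startcsym_commands is made up, and the 'c<n>' form of the sym= value is an assumption; check the startcsym usage message for the symmetry names it accepts. Because each run writes classes.hed and sym.hed, inspect (or copy) them before starting the next run.

```python
def startcsym_commands(particle_file, n_keep, max_fold):
    """Build one startcsym command line per trial symmetry from 2-fold up to max_fold.

    Assumption: trial symmetries are written as sym=c2, sym=c3, ...
    """
    return [
        f"startcsym {particle_file} {n_keep} sym=c{fold}"
        for fold in range(2, max_fold + 1)
    ]


commands = startcsym_commands("centered.hed", 20, 8)
for command in commands:
    print(command)
```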

    Also see related question below

  • CTF Correction - I can't do an x-ray solution scattering experiment on my specimen. Is there some way I can get an approximate structure factor to use in fitting and CTF correction?

    The documentation really needs to address this, but doesn't. There are two reasons for this, though. First, it is really difficult to describe this adequately textually. Second, you really need to have a sound understanding of the mathematics being used in CTF correction to use this method properly and avoid doing bad things to your structure (without realizing it). That said, I've found that it may not be quite so bad as I make it out to be.

    One other note. Many people (myself included) have suggested generating a structure factor curve computationally from a PDB structure of a similar protein. As it turns out, this is a very difficult thing to do, largely because solvent effects have a profound effect on the overall shape of this curve. Current software (2003) used by the solution scattering community can accurately predict peak locations, etc., but the curve it produces doesn't have the correct overall shape, and should not be used for EM work. Perhaps this situation will improve in the future.

    Still, there is a way to get the necessary curve. It isn't perfect, but it's probably adequate in most cases. The basic idea is to use several sets of particles from images at different defocuses. You then simultaneously fit the CTF of these data sets such that the CTF curve is a reasonable fit, and simultaneously the predicted structure factor for all of the curves matches pretty well at low resolution. This process must be done manually using ctfit, but once you have a result, you should be able to do most of your fitting with the automated program 'fitctf'.

    The optimal way to approach this problem is to have some sort of solution scattering curve on hand. This curve is simply used for scaling the data, and for getting some general idea of a reasonable B-factor and amplitude contrast. It will not impose its features on the final structure factor. It is also not strictly necessary; it is possible to proceed without one. The 'groel.sm' curve (native GroEL structure factor) is probably adequate for most cases. Then do the following:

    1. Load 3 or 4 particle sets into ctfit
    2. Select each set in turn, then select the 'From File' button in the 'Structure Factor' section. Select groel.sm or some other structure factor you will use as a model. (you can skip this if you like)
    3. Go to the 'Advanced' menu and select 'Change background mode'. This will change the model used for the background noise. In this model, only the first parameter 'N/A' has any effect. The remainder of the background is fit based on the zeros of the CTF. Note that this mode is currently INCOMPATIBLE with doing actual CTF correction of the data, but it can be used for this task of producing a structure factor.
    4. Set 'Amp' to 0, then adjust 'N/A' to make the background curve look fairly continuous. Note that the background curve should always be lower than the data curve. Try and make the curve somewhat continuous, DON'T try to fit the data curve.
    5. Now adjust the remaining parameters to get the best possible fit. If you are using groel.sm, this fit will not be good at all at low spatial frequency. The peaks won't match up even vaguely in most cases. This isn't the point. The point is to get the overall scale of the curve to match reasonably well.
    6. Now select the 'Struc Fac' button in the 'Display 2' section of the plot window. This will make a second plot appear, containing the predicted structure factor for all visible data sets. This is calculated from the data itself, based on the fit you have done. Note that it will act poorly around the CTF zeros. Don't worry about it. Zoom in (drag right mouse on the plot) to the low resolution range, from the x origin to around the first zero of the CTF.
    7. This is the tricky part. You need to adjust the CTF parameters in such a way that the low resolution predicted structure factors match each other as well as possible, while simultaneously not making the fit bad in the other window. Generally Amp is the most useful parameter to adjust.
    8. When everything looks good, make a note of the resolution just below the first zero that disturbs the structure factor. Hopefully this will be somewhere in the 1/20 - 1/30 1/A range (20 - 30 A resolution).
    9. On the 'Advanced' menu, select 'Save 1 Column'. Give it a filename, and tell it to save column 11.
    10. Almost done. Exit ctfit. Now use 'sfmerge.py' to combine the file you just created (at low resolution) with some other structure factor (like groel.sm) at high resolution. Type 'sfmerge.py' for usage. Note that the cutoff frequency is in 1/A, i.e., if you found the cutoff resolution to be at 25 A, specify 0.04.
    11. That's it. Use the new predicted structure factor file you just made to fit all of your data with either ctfit or fitctf. Then specify this structure factor using ctfcw= in the refine command. Remember you cannot currently use the alternate background mode in ctfit when fitting the data you will be reconstructing.
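
    The conversion in step 10 is simply the reciprocal of the resolution. A minimal Python sketch follows; the function name resolution_to_frequency is made up for this example and is not part of EMAN.

```python
def resolution_to_frequency(resolution_angstrom):
    """Convert a resolution in Angstroms to a spatial frequency in 1/A."""
    if resolution_angstrom <= 0:
        raise ValueError("resolution must be positive")
    return 1.0 / resolution_angstrom


# The range mentioned in step 8, plus the 25 A example from step 10.
for resolution in (20.0, 25.0, 30.0):
    frequency = resolution_to_frequency(resolution)
    print(f"{resolution:g} A -> {frequency:.4f} 1/A")
```

    For a cutoff resolution of 25 A this gives the 0.04 quoted in step 10.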


  • How can I find out how many particles were used in a reconstruction? How can I look at the particles that were discarded?

    Good questions! To find out how many particles were included in the class-averages, type (for example) 'iminfo classes.4.hed all'. The last number on each line is the number of particles included in that class-average. At the end a total number of particles included in the classes file will be shown.

    Now this is where it gets tricky. If all of the class-averages were used in the reconstruction, you'd be done. However, some class-averages may get excluded (depending on the value you select for hard=). In addition, if you use 3dit= or 3dit2=, some class-averages may get excluded. However, they are not necessarily the same class averages that are excluded from the original make3d reconstruction. make3d will output how many original particles were included in the final reconstruction as part of its output on the screen. This is probably the best answer you'll get, but it isn't stored anywhere. Generally when I talk about the number of particles used in a reconstruction, I'm referring to the 'iminfo classes.4.hed all' method.

    The next part is a little trickier. A complete record of particles excluded from the class-averages is kept (along with classification information) for all iterations, in 'particle.log'. This file contains several kinds of information, distinguished by the first character of each line. Lines starting with 'X' indicate excluded particle numbers. If going through this file is too much of a pain, you can rerun the 'classalignall' command with the 'badfile' option. This will create a set of files containing the excluded images for whatever options you provide to classesbymra.
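
    If you do want to go through particle.log yourself, pulling out the 'X' lines takes only a few lines of Python. The sketch below relies only on what is stated above (lines whose first character is 'X' record excluded particles). The function name excluded_lines is made up, and the sample text is invented for illustration; the real file's layout after the first character may differ, so the rest of each line is returned unparsed.

```python
def excluded_lines(log_text):
    """Return the remainder of every line whose first character is 'X'."""
    return [
        line[1:].strip()
        for line in log_text.splitlines()
        if line.startswith("X")
    ]


# Invented sample text; only the leading 'X' convention comes from this FAQ.
sample = "\n".join([
    "A hypothetical line of some other type",
    "X 17",
    "X 42",
    "B another hypothetical line",
])

excluded = excluded_lines(sample)
print(len(excluded), "excluded entries:", excluded)
```

    To use it on a real run, read the file with open('particle.log').read() and pass the text to the function.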

    Note that the particle.log file can also be used to recreate the 'cls' files from any particular iteration using the 'clsregen' command.