$0
$1
In command descriptions, square brackets '[]' indicate optional
parameters. The brackets should NOT be typed as part of the command. Angle
brackets '<>' denote the name of a parameter you should replace with a value.
Italics often represent the same thing.
You can examine these instructions at any time by pressing the 'step 1' button again. All of the parameters are saved. Alternatively, if you want to print them out, or see them rendered a little better, these instructions are stored in 'step1.html' in the local directory.
The individual particles must be located in each micrograph. In EMAN, the program for doing this is called 'boxer'. Note that this program uses a lot of memory (it loads the entire image into memory). To determine how much memory an image will require, type:
    iminfo <imagefile>
Your machine should have about 1.5-2 times this much memory. If your machine has less than this, do NOT run the following command, or your machine will begin to swap heavily and will generally become very slow. If you DO have enough memory, run the following command:
    boxer <imagefile>
On an SGI, you can find out how much memory your machine has with the 'hinv' command. On Linux machines, just 'cat /proc/meminfo'. If your machine doesn't have enough memory, boxer may allow you to split the image into pieces. See the boxer documentation for more info on this, and for detailed instructions on using boxer.
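On Linux, the memory comparison described above can be scripted. The following is only a sketch: the function name is hypothetical, it assumes you have read the image size in megabytes off the 'iminfo' output yourself (it does not parse that output), and it assumes /proc/meminfo reports 'MemTotal' in kB. It uses the conservative 2x end of the 1.5-2x rule of thumb.

```shell
#!/bin/sh
# Hypothetical helper (not part of EMAN): compare an image's size with installed RAM.
#   $1 = image size in MB, read by hand from the iminfo output
#   $2 = meminfo file to read (defaults to /proc/meminfo on Linux)
enough_memory_for_boxer() {
    image_mb=$1
    meminfo=${2:-/proc/meminfo}
    # MemTotal is reported in kB; convert to whole MB.
    total_kb=$(awk '/^MemTotal:/ {print $2}' "$meminfo")
    total_mb=$(( total_kb / 1024 ))
    # Conservative end of the 1.5-2x rule of thumb.
    needed_mb=$(( image_mb * 2 ))
    if [ "$total_mb" -ge "$needed_mb" ]; then
        echo "ok: need ${needed_mb} MB, have ${total_mb} MB"
    else
        echo "too little memory: need ${needed_mb} MB, have ${total_mb} MB"
        return 1
    fi
}
```

For a 400 MB micrograph you would call 'enough_memory_for_boxer 400' and only start boxer if it reports ok.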
Select the particles from each micrograph using boxer. When you save the particles, use a different file for the particles from each micrograph. They will be combined later. At this point, you also want to use a box size that's about 25-50% larger than your particles. That is, if the smallest box that you could possibly use is 80 pixels, you should start with a box size of 100-120 pixels. Any box size can be used, but things will run faster if the box size is a multiple of 8, or at least 4. Do not use a box size smaller than ~32 pixels. $3n
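The box-size rule of thumb for boxer (pad the smallest usable box by 25-50%, prefer a multiple of 8, never go below about 32 pixels) is simple integer arithmetic. A minimal sketch, with a hypothetical function name, that rounds the padded size up to the next multiple of 8:

```shell
#!/bin/sh
# Hypothetical helper (not part of EMAN): suggest a starting box size.
#   $1 = smallest box that could hold the particle, in pixels
#   $2 = padding in percent (the text suggests 25-50)
suggest_box_size() {
    min_box=$1
    pad_pct=$2
    # Padded size, rounded up to a whole pixel.
    raw=$(( (min_box * (100 + pad_pct) + 99) / 100 ))
    # Round up to the next multiple of 8.
    box=$(( (raw + 7) / 8 * 8 ))
    # Enforce the ~32 pixel lower limit.
    if [ "$box" -lt 32 ]; then box=32; fi
    echo "$box"
}

suggest_box_size 80 25   # prints 104
suggest_box_size 80 50   # prints 120
```

Any multiple of 8 between the two printed values (104, 112 or 120 for an 80 pixel particle) fits the guidance.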
When CTF correction is not being performed, only micrographs with similar defocuses should be used in the reconstruction. The data from each micrograph is typically filtered at the first zero of the CTF, which should be at roughly the same position in each micrograph.
Run ctfit and first make sure the microscope and image parameters in the center of the control panel are correct. Then, using the 'Open Particle Set' item on the 'file' menu, read the particles from each micrograph into ctfit. Each file you read will appear as a separate item in the list in the upper left of the control panel. Determine the defocus of each micrograph as described in the ctfit manual. Any micrograph that varies by more than ~10% from the desired defocus should not be used in the reconstruction. In addition, any micrographs with a significant amount of drift or astigmatism should be discarded and not used in any reconstruction.
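Once you have noted a defocus for each micrograph, the ~10% test is easy to apply mechanically. The sketch below is an illustration only: the function name is hypothetical, the input format (one "name defocus" pair per line) is an assumption, and the numbers in the example are made up. It compares absolute differences, so it works whether you record defocus as positive or negative values.

```shell
#!/bin/sh
# Hypothetical helper (not part of EMAN): list micrographs whose defocus is
# within a tolerance of the target value.
#   stdin = lines of "<name> <defocus>" (values you noted in ctfit)
#   $1    = target defocus
#   $2    = tolerance in percent (defaults to 10)
filter_by_defocus() {
    awk -v target="$1" -v tol="${2:-10}" '
        {
            diff = $2 - target
            if (diff < 0) diff = -diff
            limit = target * tol / 100
            if (limit < 0) limit = -limit
            if (diff <= limit) print $1
        }'
}

# Made-up example values: 2346 is too far from the -2.0 target and is dropped.
printf '2345 -2.1\n2346 -2.6\n2347 -1.95\n' | filter_by_defocus -2.0
```

The drift and astigmatism checks still have to be done by eye in ctfit; this only covers the defocus criterion.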
Now is a good time to determine the low-pass filter radius you will use. Drag the left mouse button in the plot window to determine the radius in pixels of the 1st zero (dark ring) of the CTF. Make a note of this value. Then drag with the left button in the plot window and determine the location of the first zero in A (the 'X=' number in the upper right corner). This is the best resolution you can hope to get from this reconstruction. To go to higher resolutions, you will need to perform CTF correction.
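The two numbers you just noted are related, which gives a quick sanity check. Assuming the pixel radius in the plot counts Fourier pixels of an unpadded box (an assumption about ctfit's display, not something stated here), a radius of r pixels in a box of N pixels sampled at a given A/pix corresponds to N * (A/pix) / r Angstroms. A minimal sketch with a hypothetical function name and made-up example values:

```shell
#!/bin/sh
# Hypothetical helper (not part of EMAN): convert a Fourier-space radius in
# pixels to a resolution in Angstroms, assuming an unpadded transform.
#   $1 = box size in pixels, $2 = sampling in A/pix, $3 = radius in pixels
first_zero_angstroms() {
    awk -v n="$1" -v apix="$2" -v r="$3" 'BEGIN { printf "%.1f\n", n * apix / r }'
}

first_zero_angstroms 128 2.5 16   # 128 * 2.5 / 16, prints 20.0
```

If the result disagrees badly with the 'X=' readout, trust the readout and recheck the A/pix value entered in ctfit.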
Make sure all of the data you use is at a similar defocus. The resolution of your reconstruction will be limited by the farthest from focus image you use in the reconstruction. Do not include any data farther from focus than usable for your target resolution. Closer to focus images are all right to an extent, but if the defocus range is too great, it will be impossible to perform trivial low resolution CTF correction on the final 3D model. $3c
This is currently the most difficult part of performing a reconstruction in EMAN. We are actively working on a fully automated solution for this process, but for now, some manual effort is involved. You should probably read the manual section on CTF Correction before trying to proceed.
Run ctfit and make sure the microscope voltage, Cs and A/pix values are correct. Then, using the 'Read Clip Set' item on the 'file' menu, read the particles from each micrograph into ctfit. You don't need to worry about memory usage in ctfit, since only the average power spectrum is kept in memory for each particle set. Each file you read will appear as a separate item in the list in the upper left of the control panel. For each file displayed in this list, 2 lines will be drawn in the plot window. One line will be smooth, and one line will be somewhat jagged. The smooth line represents the current CTF model based on the parameters set with the top 9 sliders in the control panel. The jagged line represents the power spectrum of the images you read in. You will probably want to read the manual section on CTF parameter determination in ctfit before proceeding.
You will now need to determine the 8 CTF parameters for each micrograph. This is a nontrivial process, and is difficult to describe. The best description currently exists in the ctfit documentation mentioned above. The suggested method is to use x-ray scattering data for your specimen if you have it. Of course, you probably don't, in which case, for optimal results, you'll need an estimated structure factor calculated from a PDB file. This can be prepared with the EMAN software, but that description is beyond the scope of this document. Reasonable results can be obtained without this file using one of the other techniques outlined in the fitting document.
EMAN does not currently do astigmatism correction, so if some images are astigmatic or have a significant amount of drift, they should be excluded from the reconstruction.
Once the parameters have been determined, highlight the first data set, and use the 'Phase Correct' item on the 'Process' menu. Repeat this process for the other data sets. This will generate a new file for each data set with '.fix' inserted in the name. All of the data in these files has now been phase corrected, and the CTF parameters you determined have been stored in the headers of each particle image. You will now use these '.fix' images for the remainder of the processing.
At this point, you should also make a note of the maximum resolution of your images. One of the 8 parameters you determined for each image is the envelope function width (which can also be displayed as a B factor). When displayed as 'Envelope', this number represents approximately the highest resolution you are likely to achieve in a reconstruction using this data set. Record the average value of this number for a few of the close to focus images for use in step 2. $4
This is a simple step. Take all of the image files you are going to use in your reconstruction and combine them into a file called 'start.hed'. For example, if you have data files: 2345.fix.hed, 2346.fix.hed and 2347.fix.hed, you would do (proc2d appends to output files):
    rm start.hed start.img
    proc2d 2345.fix.hed start.hed
    proc2d 2346.fix.hed start.hed
    proc2d 2347.fix.hed start.hed
Just to keep things neat, at this point, you might want to make a subdirectory for all of the raw data, e.g.:
    mkdir raw-data
    mv * raw-data              # ignore the warning message this produces
    mv raw-data/start.* .
$4.1n
This is a simple step. Take all of the image files you are going to use in your reconstruction and combine them into a file called 'start.hed'. For example, if you have data files: 2345.hed, 2346.hed and 2347.hed, you would do (proc2d appends to output files):
    rm start.hed start.img
    proc2d 2345.hed start.hed
    proc2d 2346.hed start.hed
    proc2d 2347.hed start.hed
Just to keep things neat, at this point, you might want to make a subdirectory for all of the raw data, e.g.:
    mkdir raw-data
    mv *.hed *.img *.mrc raw-data
    mv raw-data/start.* .
$5n
Now we need to filter the particles at the first zero. Keep in mind that this is NOT CTF correction, it simply prevents phase errors from causing distortions at high resolution. There are still low resolution amplitude effects due to the CTF which are NOT compensated for. This will cause certain features in your map to be expanded or reduced. This method is fine for generating a first model for a new protein, or generating a preliminary model to use for a later CTF corrected reconstruction.
It is also a good idea to perform a slight high-pass filter to eliminate the strong incoherent scattering very close to the origin in Fourier space. Typically a 1 pixel radius is sufficient, but for small particles (box size <64 pixels) even this may be too much. To do the filtering (with the low pass radius you determined above for the 1st zero):
    proc2d start.hed start.hed hp=1 lp=<radius in pixels> inplace [invert]
The hp option does high-pass filtering, the lp option does low-pass filtering, and the inplace option tells proc2d not to append to the output file, but to overwrite the input images in the same location in the file. If necessary, add the invert option to reverse the density of your particles. EMAN assumes that positive values (white) indicate high density. For cryo images, that means the protein should appear white against the water background. Use invert if your protein looks darker than the background.
At the end of the reconstruction, the unhp= option in proc3d can be used to undo the high-pass filter. This can potentially have dramatic effects on the appearance of the model. For example, without restoring this term, an otherwise solid object may appear hollow. In some cases it has virtually no effect. Of course, in this case, the lack of amplitude CTF correction at low resolution outweighs this effect, so you should probably make a CTF corrected model before worrying too much about the unhp option. $5c
Near the origin in Fourier space, there is a very strong component due to the structure factor and incoherent scattering. This term is so strong that interpolation errors here may interfere with alignment in the reconstruction process. For this reason it's generally a good idea to apply a small high-pass filter to the particles. This filter may potentially have adverse effects on the model, especially if the particle box size is smaller than ~64 pixels. Nonetheless, unless it turns out to clearly cause problems, it's a good idea to start with some filtering, usually 1 pixel is sufficient:
    proc2d start.hed start.hed hp=1 inplace [invert]
The hp option does high-pass filtering, and the inplace option tells proc2d not to append to the output file, but to overwrite the input images in the same location in the file. If necessary, add the invert option to reverse the density of your particles. EMAN assumes that positive values (white) indicate high density. For cryo images, that means the protein should appear white against the water background. Use invert if your protein looks darker than the background.
At the end of the reconstruction, the unhp= option in proc3d can be used to undo the highpass filter. This can potentially have dramatic effects on the appearance of the model. For example, without restoring this term, an otherwise solid object may appear hollow. In some cases it has virtually no effect. $6
Note that an alternative to this centering technique is to use the new multireference-based automatic boxing routine in 'boxer', which does a pretty good job of centering. Unfortunately, generating the appropriate references requires a preliminary 3D model, and running several programs in sequence. (It can make a dramatic improvement in reconstruction resolution, so ask me if you're interested) If you decide to use cenalignint, run:
    cenalignint start.hed mask=<mask> [frac=<num>/<denom>]
This program will read ALL of the particles into memory, and effectively make 2 copies of each. That means your computer should have 3 times as much physical memory as the size reported by 'iminfo start.hed'. If this is not the case, you should use the frac=<n>/<d> option. This causes only a fraction of the data to be processed. For example, if you have 1/3 as much memory as you need, you'd do:
    cenalignint start.hed maxshift=<max> frac=0/3
    cenalignint start.hed maxshift=<max> frac=1/3
    cenalignint start.hed maxshift=<max> frac=2/3
Replace <max> with the maximum shift, in pixels, that should be used to center the particles. If you don't specify one, 1/4 of the box size will be used. If the particles are already fairly well centered, using a small value here will prevent erroneous centering with large translations.
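When more than a few fractions are needed, typing the commands by hand gets tedious. The sketch below only prints the command lines so you can review them before running anything. Both function names are hypothetical, and the only cenalignint options used are the maxshift= and frac= forms shown above. The fraction count is the 3x memory requirement divided by the available memory, rounded up.

```shell
#!/bin/sh
# Hypothetical helpers (not part of EMAN).

# Number of fractions needed, given the size reported for start.hed and the
# physical memory available, both in MB. cenalignint needs about 3x the size.
#   $1 = size of start.hed/img in MB, $2 = available memory in MB
fractions_needed() {
    echo $(( (3 * $1 + $2 - 1) / $2 ))
}

# Print (do not run) one cenalignint command per fraction.
#   $1 = number of fractions, $2 = maximum shift in pixels
cenalign_commands() {
    n=$1
    maxshift=$2
    i=0
    while [ "$i" -lt "$n" ]; do
        echo "cenalignint start.hed maxshift=$maxshift frac=$i/$n"
        i=$(( i + 1 ))
    done
}

# Example: 500 MB of particles, 512 MB of RAM, 10 pixel maximum shift.
cenalign_commands "$(fractions_needed 500 512)" 10
```

Pipe the output to 'sh' once the printed commands look right.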
This program will generate 3 new image files: ali.img contains the centered particles after processing. bad.img contains the particles that were rejected because of ambiguous alignment. Finally, avg.img contains the average images after each iteration of the alignment. This third file can be examined to determine the size of your particle. Find the radius a pixel or two outside the outermost whitish ring in the last image in avg.hed. This is the mask radius you should use from here on. Go back to 'step 1' in eman and enter the correct value if you haven't already.
Once you're satisfied with the results of the centering, copy ali.hed/img over start.hed/img. If you're concerned about being able to retrace your steps, you may wish to make a copy of start first. The main reason for this step is to get the centering good enough that you can reduce the box size somewhat. You probably still want to leave about 15% padding around your particle. So, if your maximum particle dimension is 64 pixels, and you used a 100 pixel box, you might reduce this to 80 pixels now (remember this number should be divisible by 8), like so:
    proc2d ali.hed start.hed clip=80
    rm ali* avg* bad*
Note that it's not necessary to get perfect centering, just good enough so the particles don't get chopped off at the edge of the box. The smaller box size is very important for speed. A 20% box size reduction may mean as much as a factor of 2 increase in reconstruction speed. $7