Also, in command descriptions, square brackets '[]' indicate optional parameters. The brackets should NOT be typed as part of the command.
The individual particles must be located in each micrograph. In EMAN, the program for doing this is called 'boxer'. To use this program, just type:
boxer imagefile
Note that this program uses a lot of memory (it loads the entire image into memory). To determine how much memory an image will require, type:
iminfo imagefile
Your machine should have about 1.5-2 times this much memory. On an SGI, you can find out how much memory your machine has with the 'hinv' command. On Linux machines, just 'cat /proc/meminfo'. If your machine doesn't have enough memory, boxer may allow you to split the image into pieces. See the boxer documentation for more info on this, and for detailed instructions on using boxer.
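The memory check above can be sketched as a small shell function for Linux machines. This is only an illustration: the function name enough_memory is made up for this example, and it assumes you read the image's memory requirement off the iminfo output yourself and pass it in as a number of kB. It uses the upper end (2x) of the guideline.

```shell
# Sketch: compare the machine's RAM against twice the image size.
# $1 : memory the image requires, in kB (taken by hand from iminfo)
# $2 : meminfo-style file to read (optional, defaults to /proc/meminfo)
# enough_memory is a hypothetical helper name, not part of EMAN.
enough_memory() (
    need_kb=$1
    meminfo=${2:-/proc/meminfo}
    # The "MemTotal:" line reports total physical memory in kB
    total_kb=$(awk '/^MemTotal:/ {print $2}' "$meminfo")
    want_kb=$((need_kb * 2))
    echo "have ${total_kb} kB, want ${want_kb} kB"
    # The exit status is success only if there is enough memory
    [ "$total_kb" -ge "$want_kb" ]
)
```

For example, 'enough_memory 500000 || echo "consider splitting the image"' prints a warning when less than about 1 GB of RAM is installed.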
Select the particles from each micrograph using boxer. Save the particles from each micrograph in a separate file at this point. They will be combined later. At this point, you also want to use a box size that's about 25-50% larger than your particles. That is, if the smallest box that you could possibly use is 80 pixels, you should start with a box size of 100-120 pixels. Any box size can be used, but things will run faster if the box size is a multiple of 8, or at least 4. Do not use a box size smaller than ~32 pixels.
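The box size rule of thumb above can be worked out with shell arithmetic. The following is a minimal sketch, not an EMAN tool: suggest_box is a hypothetical helper that takes the smallest usable box size and prints a range that is 25-50% larger, snapped to multiples of 8 and never below 32 pixels.

```shell
# Sketch: suggest a starting box size range for boxer.
# $1 : smallest box (in pixels) that would just contain the particle
# Prints "<low> <high>". suggest_box is a hypothetical helper name.
suggest_box() (
    min=$1
    lo=$(( (min * 125 + 99) / 100 ))   # 25% larger, rounded up
    hi=$(( min * 150 / 100 ))          # 50% larger, rounded down
    lo=$(( (lo + 7) / 8 * 8 ))         # round up to a multiple of 8
    hi=$(( hi / 8 * 8 ))               # round down to a multiple of 8
    if [ "$lo" -lt 32 ]; then lo=32; fi          # boxes under ~32 pixels are discouraged
    if [ "$hi" -lt "$lo" ]; then hi=$lo; fi      # keep the range well ordered
    echo "$lo $hi"
)
```

With the 80 pixel example from the text, 'suggest_box 80' prints '104 120', that is, the 100-120 pixel range narrowed to multiples of 8.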
This is currently the most difficult part of performing a reconstruction in EMAN. We are actively working on a fully automated solution for this process, but for now, some manual effort is involved. You should probably read and understand the manual section on CTF Correction before trying to proceed.
Run ctfit and first make sure the microscope and image parameters in the center of the control panel are correct. Then, using the 'Read Clip Set' item on the 'File' menu, read the particles from each micrograph into ctfit. You don't need to worry about memory usage in ctfit, since only the average power spectrum is kept in memory for each particle set. Each file you read will appear as a separate item in the list in the upper left of the control panel. For each file displayed in this list, 2 lines will be drawn in the plot window. One line will be smooth, and one line will be somewhat jagged. The smooth line represents the current CTF model based on the parameters set with the top 9 sliders in the control panel. The 'jagged' line represents the power spectrum of the images you read in. You will probably want to read the manual section on CTF parameter determination in ctfit before proceeding.
You will now need to determine the 8 CTF parameters for each micrograph. ctfit has an experimental automatic routine for doing this, but it won't work on all data sets. First, make sure only the data sets you want to determine the CTF for are being displayed (for example, you can delete the new curve the program starts with). Next, press the QFIT button. If everything goes well, that should determine a good set of parameters for all of the data sets. If it doesn't work (this should be pretty obvious), you'll have to determine the parameters manually (or wait for a new version :^) ).
Once the parameters have been determined, highlight the first data set, and use the 'Phase Correct' item on the 'Process' menu. Repeat this process for the other data sets. This will generate a new file for each data set with '.fix' inserted in the name. All of the data in these files has now been phase corrected, and the CTF parameters you determined have been stored in the headers of each particle image. You will now use these '.fix' images for the remainder of the processing.
At this point, you should also make a note of the maximum resolution of your images. One of the 8 parameters you determined for each image is the envelope function width (which can also be displayed as a B factor). When displayed as 'Envelope', this number represents the highest resolution you are likely to achieve in a reconstruction using this data set. Record the average value of this number for a few of the close-to-focus images for use in step 2.
This is a simple step. Take all of the image files you are going to use in your reconstruction and combine them into a file called 'start.hed'. For example, if you have data files: 2345.fix.hed, 2346.fix.hed and 2347.fix.hed, you would do (proc2d appends to output files):
rm start.hed start.img
proc2d 2345.fix.hed start.hed
proc2d 2346.fix.hed start.hed
proc2d 2347.fix.hed start.hed
Just to keep things neat, at this point you might want to make a subdirectory for all of the raw data, e.g.:
mkdir raw-data
mv * raw-data
mv raw-data/start.* .
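With many micrographs, typing one proc2d command per file gets tedious. The combining step described above can be sketched as a small shell function that only prints the commands, so they can be inspected before anything is run. The name combine_cmds is made up for this example, and 'rm -f' is used instead of 'rm' so that a missing start.hed is not reported as an error.

```shell
# Sketch: print the commands that rebuild start.hed/start.img from the
# phase-corrected files given as arguments. Nothing is executed here.
# combine_cmds is a hypothetical helper name, not part of EMAN.
combine_cmds() {
    # Remove old output first, because proc2d appends to existing files
    echo "rm -f start.hed start.img"
    for f in "$@"; do
        echo "proc2d $f start.hed"
    done
}

# Usage (run before moving the raw data away):
#   combine_cmds *.fix.hed        to review the commands
#   combine_cmds *.fix.hed | sh   to execute them
```

Printing rather than executing is a deliberate choice here: an accidental extra file in the list is easy to spot before it is appended to start.hed.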
Near the origin in Fourier space, there is a very strong component due to the structure factor and incoherent scattering. This term is so strong that it may interfere with alignment in the reconstruction process. For this reason it's generally a good idea to apply a small high-pass filter to the particles. This filter may potentially have adverse effects on the model, especially if the particle box size is smaller than ~64 pixels. Nonetheless, unless it turns out to clearly cause problems, it's a good idea to start with some filtering, usually 1 or 2 pixels is sufficient:
proc2d start.hed start.hed hp=1 inplace [invert]
The hp option does high-pass filtering, and the inplace option tells proc2d not to append to the output file, but to overwrite the input images in the same location in the file. If necessary, add the invert option to reverse the density of your particles. EMAN assumes that positive values (white) indicate high density. For cryo images, that means the protein should appear white against the water background. Use invert if the particles in your cryo images look darker than the background.
At the end of the reconstruction, the unhp= option in proc3d can be used to undo the highpass filter. This can potentially have dramatic effects on the appearance of the model. For example, without restoring this term, an otherwise solid object may appear hollow. In some cases it has virtually no effect.
cenalignint start.hed mask=<mask> [num=#1] [denom=#2]
This program will read ALL of the particles into memory, and effectively make 2 copies of each. That means that if you do an 'ls -l start.img' (the .img file holds the image data), your computer should have 3 times this much physical memory. If this is not the case, you should use the num= and denom= options. These allow a fraction of the data to be processed at once. For example, if you have 1/3 as much memory as you need, you'd do:
cenalignint start.hed mask=<mask> num=0 denom=3
cenalignint start.hed mask=<mask> num=1 denom=3
cenalignint start.hed mask=<mask> num=2 denom=3
Replace <mask> with a 'safe' mask radius (in pixels). This should be several pixels larger than the actual radius of the particles.
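The choice of denom= follows from the memory rule above: the particle data needs about 3 times its file size in physical memory, so the data must be split into enough pieces for each piece to fit. A minimal sketch of that arithmetic follows. The function name cenalign_cmds is hypothetical, the sizes are supplied by you (for example from 'ls -l' and /proc/meminfo), and the commands are only printed, not run.

```shell
# Sketch: print the cenalignint commands needed for a given data size.
# $1 : size of the particle data in bytes
# $2 : physical memory in bytes
# $3 : 'safe' mask radius in pixels
# cenalign_cmds is a hypothetical helper name, not part of EMAN.
cenalign_cmds() (
    data_bytes=$1
    mem_bytes=$2
    mask=$3
    need=$((data_bytes * 3))                          # data plus 2 copies
    denom=$(( (need + mem_bytes - 1) / mem_bytes ))   # ceiling division
    if [ "$denom" -le 1 ]; then
        # Everything fits in memory at once
        echo "cenalignint start.hed mask=$mask"
    else
        num=0
        while [ "$num" -lt "$denom" ]; do
            echo "cenalignint start.hed mask=$mask num=$num denom=$denom"
            num=$((num + 1))
        done
    fi
)
```

For instance, 600 MB of particle data on a machine with 1 GB of RAM needs 1.8 GB, so the sketch prints two commands with denom=2.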
This program will generate 3 new image files: ali.hed contains the centered particles after processing. bad.hed contains the particles that were rejected because of ambiguous alignment. Finally, avg.hed contains the average images after each iteration of the alignment. This third file can be examined to determine the size of your particle. Find the radius a pixel or two outside the outermost whitish ring in the last image in avg.hed. This is the mask radius you should use from here on. Go back to 'step 1' in eman and enter the correct value if you haven't already.
Once you're satisfied with the results of the centering, copy ali.hed/img over start.hed/img. If you're concerned about being able to retrace your steps, you may wish to make a copy of start first. The main reason for this step is to get the centering good enough that you can reduce the box size somewhat. You probably want to still leave about 10% padding around your particle. So, if your maximum particle dimension is 70 pixels, and you used a 100 pixel box, you might reduce this to 80 pixels now, like so:
proc2d ali.hed start.hed clip=80
rm ali* avg* bad*
Since proc2d appends to existing output files, remove or rename the old start.hed and start.img before running the first command. Note that it's not necessary to get perfect centering, just good enough so the particles don't get chopped off at the edge of the box. The smaller box size is very important for speed. A 20% box size reduction may mean a factor of 2 increase in reconstruction speed.
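The reduced box size in the example above can be derived from the particle size. This sketch assumes about 10% padding, as suggested in the text, and rounds up to a multiple of 8 in line with the earlier speed advice. The helper name clip_size is made up for illustration.

```shell
# Sketch: pick a clip= value from the maximum particle dimension.
# $1 : maximum particle dimension in pixels
# clip_size is a hypothetical helper name, not part of EMAN.
clip_size() (
    dim=$1
    padded=$(( (dim * 110 + 99) / 100 ))   # add 10% padding, rounded up
    echo $(( (padded + 7) / 8 * 8 ))       # round up to a multiple of 8
)
```

For the 70 pixel particle in the example, 'clip_size 70' prints 80, which matches the clip=80 used above.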