$\underbrace{(>10^4)}_{\text{Num. images}} \times \underbrace{10^3}_{\text{Num. projections}} \times \underbrace{10^2}_{\text{Num. rotations and translations}}$ evaluations of $\| R_{\theta_i,t_i} I_i - C_i P_{n_i} S^k \|^2$ for every execution of line 4 of the algorithm. Moreover, calculating $\| R_{\theta_i,t_i} I_i - C_i P_{n_i} S^k \|^2$ requires one image rotation plus translation. This is computationally expensive. A number of "tricks" have been developed to keep the computational cost manageable. I will discuss these below. Another problem with the simple algorithm is the structure update step of Eq. (8). The matrix representations of the operators in this equation are far too large to compute with, and, in practice, tricks are also used to simplify this calculation. I will discuss these below as well.

Technology, Science and Culture - A Global Vision, Volume II

4.1 Speeding up alignment

Approaches to speed up the alignment step are:

1. Multiresolution alignment: alignment begins on very coarse grids. When the algorithm converges on the coarse grids, the grids are refined and the alignment is carried out on the refined grids in a local neighborhood of the coarse grid solution.

2. Polar coordinates: image rotation is computationally expensive; however, if the image is represented in polar coordinates, then rotation is just a translation along the angle axis, an operation that is computationally far less expensive. SPIDER uses polar representations of images in the Fourier domain [5].

3. Branch and bound: the main idea behind branch and bound is to find a computationally simple lower bound for $\| R_{\theta_i,t_i} I_i - C_i P_{n_i} S^k \|^2$ (for mathematical details see the supplementary information in [10]). The lower bound is evaluated on the vertices of a coarse alignment grid. Then, the $\| R_{\theta_i,t_i} I_i - C_i P_{n_i} S^k \|^2$ term is evaluated exactly at the vertex that has the smallest lower bound. All vertices whose lower bound is greater than the exact value of $\| R_{\theta_i,t_i} I_i - C_i P_{n_i} S^k \|^2$ cannot contain a better alignment and are ignored. The grid is then refined at the surviving vertices, and the procedure is repeated at the refined vertices. cryoSPARC introduced this method in best-alignment reconstruction [10].

4.2 Speeding up structure update

The structure update of Eq. (8) has a simpler form in the Fourier domain. The Fourier slice theorem [4, 12] implies that the projection and back-projection operators simplify to 2D slice extraction and slice insertion operators in the Fourier domain. The CTF filter operator also reduces to a point-wise multiplication by the CTF. With these simplifications, Eq. (8) becomes tractable.

Inserting or extracting a 2D slice from a 3D volume requires careful numerical interpolation. A method called gridding is used for this [13, 14]. All cryo-EM reconstruction packages use some form of gridding. RELION uses a […]

4.3 Comments

Cryo-EM reconstruction packages differ in the tricks they use to speed up computation. The combination of tricks can occasionally be overwhelming to understand, but it helps to keep in mind that the underlying algorithm is just Algorithm 1 or a minor variation of it. SPIDER, EMAN, FREALIGN, and cryoSPARC implement versions of best-alignment reconstruction.
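To make one of these tricks concrete, the Fourier slice theorem [4, 12] behind the structure update can be checked numerically. The sketch below is only an illustration under simplifying assumptions: the volume is a small random array, the projection direction is aligned with a grid axis (so no gridding or interpolation is needed), and the function name `central_slice_matches_projection` is hypothetical rather than taken from any package.

```python
import numpy as np

def central_slice_matches_projection(volume: np.ndarray) -> bool:
    """Check the discrete Fourier slice theorem for projection along axis 0."""
    # Real-space projection: sum the volume along the projection direction.
    projection = volume.sum(axis=0)
    # Fourier-space slice: the plane of the 3D DFT where the frequency
    # along the projection direction is zero.
    central_slice = np.fft.fftn(volume)[0]
    # The 2D DFT of the projection should equal that central slice.
    return bool(np.allclose(np.fft.fft2(projection), central_slice))

rng = np.random.default_rng(0)
volume = rng.standard_normal((8, 8, 8))
slice_ok = central_slice_matches_projection(volume)
print(slice_ok)
```

For an oblique projection direction the slice no longer passes through grid points of the 3D transform, which is where the interpolation (gridding) step comes in.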

The simplifications I made to describe best-alignment reconstruction are easy to discard. It is straightforward to allow for unknown and unequal noise variances for different images and also to account for nonwhite noise.

Some packages (e.g., SPIDER, EMAN) carry out the alignment step by maximizing the correlation coefficient between $R_{\theta_i,t_i} I_i$ and $C_i P_{n_i} S^k$ rather than minimizing $\| R_{\theta_i,t_i} I_i - C_i P_{n_i} S^k \|^2$. This corresponds to using the signal model of Eq. (3) in the alignment step.
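The relation between the two criteria can be seen in a toy setting. The sketch below is a 1D analogue, not the procedure of any package: circular shifts of a vector stand in for rotations and translations of an image, and a fixed random vector stands in for the projection $C_i P_{n_i} S^k$. All names are hypothetical. Because a circular shift preserves the norm, mean, and variance of the image, both criteria select the same shift here.

```python
import numpy as np

def best_shift_least_squares(image: np.ndarray, template: np.ndarray) -> int:
    """Circular shift s minimizing ||roll(image, s) - template||^2."""
    costs = [np.sum((np.roll(image, s) - template) ** 2)
             for s in range(image.size)]
    return int(np.argmin(costs))

def best_shift_correlation(image: np.ndarray, template: np.ndarray) -> int:
    """Circular shift s maximizing the correlation coefficient with template."""
    scores = [np.corrcoef(np.roll(image, s), template)[0, 1]
              for s in range(image.size)]
    return int(np.argmax(scores))

rng = np.random.default_rng(1)
template = rng.standard_normal(64)      # stand-in for the projected structure
true_shift = 17
# Observed "image": the template shifted back by true_shift, plus mild noise.
image = np.roll(template, -true_shift) + 0.1 * rng.standard_normal(64)

ls_shift = best_shift_least_squares(image, template)
cc_shift = best_shift_correlation(image, template)
print(ls_shift, cc_shift)
```

The agreement follows from $\|a - b\|^2 = \|a\|^2 + \|b\|^2 - 2\langle a, b\rangle$: when both norms are fixed, minimizing the distance is the same as maximizing the inner product. The two criteria can rank candidates differently once the candidates have different norms, for example when different projection directions are compared.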

In classical statistics, the estimate of any set of parameters improves as more data are added, provided that the number of parameters stays fixed. The number of parameters in best-alignment reconstructions does not stay fixed; the number of parameters in $(N, T)$ increases linearly with the number of images. This can potentially limit the asymptotic accuracy of best-alignment reconstructions. Expectation-maximization algorithms attempt to overcome this limitation by treating $(N, T)$ as latent variables, i.e., as variables that influence the likelihood, but which are not estimated. Only S is estimated.
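The difference between estimating an alignment and treating it as a latent variable can be sketched in the same kind of 1D toy setting. The code below is an assumption-laden illustration, not the algorithm of any specific package: circular shifts stand in for the alignment parameters, the noise is white Gaussian with a known standard deviation `sigma`, the prior over shifts is uniform, and all function names are hypothetical. Best alignment keeps the single most probable shift, whereas an expectation-maximization style update weights every shift by its posterior probability.

```python
import numpy as np

def shift_posterior(image: np.ndarray, template: np.ndarray, sigma: float) -> np.ndarray:
    """Posterior probability of each circular shift (uniform prior, white noise)."""
    costs = np.array([np.sum((np.roll(image, s) - template) ** 2)
                      for s in range(image.size)])
    # Subtract the minimum cost before exponentiating for numerical stability.
    weights = np.exp(-(costs - costs.min()) / (2.0 * sigma ** 2))
    return weights / weights.sum()

def soft_aligned_image(image: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Posterior-weighted average of all shifted copies of the image."""
    return sum(w * np.roll(image, s) for s, w in enumerate(weights))

rng = np.random.default_rng(2)
sigma = 1.0
template = np.sin(2 * np.pi * np.arange(32) / 32)          # toy "structure"
image = np.roll(template, -5) + sigma * rng.standard_normal(32)

weights = shift_posterior(image, template, sigma)
hard_shift = int(np.argmax(weights))            # what best alignment would keep
soft_image = soft_aligned_image(image, weights) # what a latent-variable update uses
print(hard_shift, weights.max())
```

In this sketch no shift is ever committed to; the shifts only enter through the weights, so the quantity being estimated is the structure alone.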
