Mechanisms and Rates of Nucleation of Amyloid Fibrils

Classical nucleation theory predicts a nucleation rate proportional to the monomer concentration raised to a power equal to the `critical nucleus size', ${n_c}$. The implicit assumption that amyloids nucleate in the same way has recently been challenged by an alternative two-step mechanism, in which the soluble monomers first form a metastable aggregate (micelle) and then undergo conversion into a conformation rich in ${\beta}$-strands, which is able to form a stable growing nucleus for the protofilament. Here we put together the elements of extensive knowledge about aggregation and nucleation kinetics, using the specific case of the A${\beta_{1\mathrm{-}42}}$ amyloidogenic peptide for illustration, to find theoretical expressions for the effective rate of amyloid nucleation. We find that at low monomer concentration in solution, and also at low interaction energy between the two peptide conformations in the micelle, nucleation occurs via the classical route. At higher monomer concentration, and for a range of other interaction parameters between peptides, the two-step `aggregation-conversion' mechanism of nucleation takes over. In this regime, the effective rate of the process can be interpreted as a power of monomer concentration in a certain range of parameters; however, the exponent is determined by a complicated interplay of interaction parameters and is not related to the minimum size of the growing nucleus (which we find to be ${\sim}$ 7-8 for A${\beta_{1-42}}$).


INTRODUCTION
Amyloid fibrils are insoluble, linear, ordered aggregates of misfolded proteins or peptides, which are closely connected with neurodegenerative disorders [1][2][3][4]. As more evidence emerges that oligomers produced at the early stages of amyloid aggregation could be the most toxic species [5][6][7], researchers have been keen to understand the details of the nucleation mechanism of fibrils, in particular to determine the critical nucleus size $n_c$ of primary nucleation: the minimum size that enables the extension of amyloid fibrils. Yet, due to the transient nature of critical nuclei and their low concentration over the whole aggregation time course, there have been no direct experimental methods of observing the amyloid nucleation process [8].
For now, experimental studies monitor the total fibril mass in real time (e.g. by optical experiments [10][11][12][13][14]), obtaining kinetic plots with a characteristic sigmoidal shape [15][16][17]. An important quantity, called the lag time $t_{lag}$ [18,19], can then be extracted; it is defined as the waiting time before a sharp increase of fibril mass appears in the sigmoidal plot. This lag time approximately follows a power-law relationship with the initial monomer concentration $C_1$, as originally suggested by the Oosawa model [20], which considers only primary nucleation and irreversible elongation in aggregation kinetics. More advanced models that further incorporate fragmentation and annealing of filaments [18,21,22] all retain this characteristic relationship. Without secondary nucleation, the power-law exponent in $t_{lag} \sim C_1^{\gamma}$ is approximately $\gamma = -n_c/2$, which should then enable experimental determination of $n_c$ by plotting $\ln t_{lag}$ against $\ln C_1$, or through a global fitting scheme of the total fibril mass plots against time with different initial monomer concentrations, as used in ref. [23]. However, the validity of such methods of obtaining the critical nucleus size $n_c$ depends on how closely the assumed microscopic aggregation mechanisms and kinetic equations match the actual ones in experiments.
Before forming an amyloid, monomeric subunits in solution have to switch from their native soluble structure into a partially unfolded intermediate state, which has a higher free energy in solution [24][25][26]. There are many possible configurations that a soluble peptide can adopt in solution: the recent simulation study [9] finds a whole hierarchy, from compact conformations rich in α-helix to a fully unfolded random coil, and definitively finds that the random coil has the lower free energy. This challenges the earlier assumption that the soluble monomer state of A$\beta_{1-42}$ is α-helical [9,27], see Fig. 1.
However, we find it sufficient to use a two-state simplification to capture the essence of the amyloid aggregation mechanism, as has been suggested in molecular simulations [28,29]: we denote the soluble monomer as the 'α-mer' (for its assumed increasing content of α-helix when forming micelles), while the 'β-mer' is the monomeric unit of mature amyloid fibrils. We later use this two-state simplification to show schematically two different nucleation mechanisms, as an aid to pointing out the weakness of previous theoretical kinetic models in determining the critical nucleus size.
Conventional theoretical models used to find the critical nucleus size are usually based on the 'nucleated polymerization' (NP) concept (see Fig. 2). Other, more complicated models that add secondary nucleation, fragmentation and annealing processes to this NP model [18,21,22] were formulated based on classical nucleation theory, where the primary nucleation rate is proportional to $C_1^{n_c}$, and further assumed a fixed $n_c$ value throughout all monomer concentration regimes. However, already in 1984, Ferrone pointed out that the critical nucleus size $n_c$ should change with monomer concentration to account for observations of the sickle hemoglobin polymerization rate [30]. The assumption of a fixed $n_c$ value at varying monomer concentration must therefore be re-examined.
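To make concrete how a fixed $n_c$ is read off under the NP assumptions, the log-log procedure described above can be sketched as a straight-line fit. This is a minimal illustration only: the function names are hypothetical, the data are synthetic (generated from an exact power law, not measured), and the conversion uses $\gamma = -n_c/2$, which holds only without secondary nucleation.

```python
import math

def fit_loglog_slope(concentrations, lag_times):
    """Least-squares slope of ln(t_lag) versus ln(C_1)."""
    xs = [math.log(c) for c in concentrations]
    ys = [math.log(t) for t in lag_times]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx

def apparent_nucleus_size(concentrations, lag_times):
    """Apparent n_c from gamma = -n_c / 2 (assumes no secondary nucleation)."""
    return -2.0 * fit_loglog_slope(concentrations, lag_times)

# Synthetic data, NOT measurements: t_lag = t0 * C_1**(-1.5), i.e. n_c = 3.
concs = [0.5, 1.0, 2.0, 4.0, 8.0]           # arbitrary concentration units
lags = [100.0 * c ** (-1.5) for c in concs]  # arbitrary time units
n_c_apparent = apparent_nucleus_size(concs, lags)
```

The fit returns a single exponent by construction, so it cannot by itself reveal whether $n_c$ drifts with concentration or whether the exponent reflects a nucleus size at all.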
In recent single-molecule experiments [31][32][33][34] , an alternative two-step nucleation mechanism has been suggested: micellation of α-mers (monomers in the native soluble state) followed by a gradual conversion into β-mers within the dense micelle (Fig. 2). Under this scenario, the nucleation rate may or may not take the power-law scaling of the initial monomer concentration as would follow from NP and its derivative models. This may in turn change the previously predicted power-law relationship of t lag . Even if the power-law scaling is found to exist in practice, the physical meaning of its exponent and its relation to the actual nucleus or micelle size become unclear. All these questions challenge the validity of employing these NP models in all monomer concentration regimes.
Several theoretical works have been proposed for the two-step nucleation mechanism. The work by Lomakin et al. in 1997 considered the process of producing critical nuclei from pre-formed micelles, under the assumption of a fast thermal equilibrium between monomers and micelles before any nucleation events happen 35 . A more recent theoretical framework of two-step mechanism (the term 'nucleation-conversion-polymerization' model is used there) considered not only nucleation of micelles but also the step-by-step conversion of this micelle into its fibrillar form 36 . However, both these studies make an assumption of a single micelle size, which was questioned by the observed presence of multi-size micelles in the coarse-grained molecular simulations of the Aβ 1−42 system 29 . Later, Auer et al. derived theoretical expressions for nucleation rates of both NP and two-step nucleation mechanisms, predicting the monomer concentration at crossover between these two mechanisms 37 . However, it was not clear what is the micelle size that optimizes the nucleation rate of the two-step mechanism.
The main aim of our work is first to derive the nucleation rates, then to estimate the critical nucleus size in the NP mechanism and the micelle size that maximizes the nucleation rate of the two-step mechanism, and finally to compare the nucleation rates of the NP and two-step mechanisms to estimate the crossover monomer concentration (similar to the one in ref. [37]). In order to fully test the capability of the two-step model, we deliberately stay with the basic mechanism, without considering secondary pathways (secondary nucleation and fragmentation): these are not expected to contribute significantly at the early nucleation stage.
We choose the free energy approach, originally developed by Ferrone et al. 38 , to obtain free energy functions of intermediate states, final products of amyloid aggregation, and compare the free energy landscape of both nucleation mechanisms. This not only allows the analytical calculation of nucleation rates of variable micelle and nucleus size, but also makes possible a simplified kinetic analysis of nucleation rates. Although this generic scheme of using free energy landscape of nucleation to find nucleation kinetics is formally similar to the work by Auer and Kashchiev 37,39 , our work does investigate how different micelle sizes can facilitate the conversion process, which was not addressed before.
The A$\beta_{1-42}$ peptide is used as our model system, since it has been studied more than any other amyloidogenic protein; its small size makes molecular simulations affordable and therefore more reliable, which in turn provides more information on the thermodynamic parameters needed to build free energy functions. However, our model is generic, and only the binding energy and geometric parameters would differ for other amyloid systems. For simplicity and clarity of the main text, many of the derivations, and the justification of parameter values in our free energy calculation, are moved to the Appendices.
Within this direct nucleation route, monomeric A$\beta_{1-42}$ peptides first undergo structural conversion from the α-state into the β-state, and spontaneously stack along one direction to construct a nucleus of a protofilament, or of a fibril with more than one protofilament (typically between 2 and 6) [40-44]. These fibrils have a twisted linear structure along the fibril axis, making a complete pitch every 33 β-mers [45][46][47]. Since our aim is to investigate the critical nucleus size, which will be much less than 33 subunits, the twisted fibril structure and its effect on the later free energy calculations can be neglected.
Simulation and cryo-electron microscopy of a short fibril segment further indicate that the protofilaments of a fibril are aligned in the same plane [48]. To portray a coarse-grained fibril structure, it is necessary to clarify the relative position of one protofilament with respect to another when they are associated. Although most of the detailed structural studies [40,49] do not resolve this longitudinal aspect of monomer packing in protofilament pairs, there are clear indications of a period shift. As concluded in a computational study of the periods of helical twisting of single and paired protofilaments [50], the second protofilament is shifted by half a period. That is, the lateral monomer binds in the middle between two subunits of the existing filament, making equal-strength diagonal bonds with both. Based on these facts, we construct the coarse-grained structures of single, paired, and multiple protofilaments, as illustrated in Fig. 3. Two types of bonds are then involved in these fibril structures: the end-to-end bond between two β-mers, with a binding energy $\Delta_\beta$, and the lateral (side) bond between two neighboring monomers from two different protofilaments, with a binding energy $\Delta_s$ each.

Free energy functions of fibrils
To construct the free energy of a protofilament, we use the approach originally used by Ferrone [30,38]. This method follows the free energy change of a given species along the reaction path, relative to the reference state of pure α-monomers in solution. The formation of an $N$-size aggregate is a process in which $N$ monomers undergo an internal transition into the β-state, with a conversion free energy penalty $\Delta_c$, and then come together to form bonds within the $N$-size aggregate. In this process, the translational and rotational free energies of free monomers in solution are lost, but this loss is offset by the translational and rotational free energy of the $N$-size aggregate as a whole. For simplicity, the difference in rotational and standard translational free energies between a monomer and the $N$-size aggregate is neglected, as this difference is of the magnitude $\ln N$ with a coefficient of only $3\,k_BT$ [38], while all other interaction free energies have a stronger dependence on $N$, with coefficients of tens of $k_BT$ (see Appendix A).
From the bond scheme in Fig. 3, the free energy function of an $N$-size aggregate of a single protofilament is given by Eq. (1). In Eq. (1), $\mu_0$ is the sum of translational and rotational free energies of a single α-mer at a standard concentration of 1 mM. Strictly, $\mu_0$ includes many internal rotational degrees of freedom of a peptide subunit in solution, most of which become frozen when the monomer adopts the closely-packed β-sheet configuration in the filament. For this reason, it is hard to accept any theoretical estimate for $\mu_0$ based on ideal-gas statistics. Therefore, we shall use an estimate for $\mu_0$ based on experimental measurement of the elongation free energy (see Appendix A for details on all material parameters). The estimate gives $\mu_0 = -34.5\,k_BT$ at room temperature, $T = 25\,^{\circ}$C, and we will use this value in the remainder of this work. The initial monomer concentration in solution, $C_1$, is expressed in units of mM (relative to the reference monomer concentration of 1 mM that we shall use throughout this work).
Values of the free energy of the longitudinal β-bond and of the conversion free energy in solution are also discussed in Appendix A: $\Delta_\beta = -44\,k_BT$ [51] and $\Delta_c = 20\,k_BT$ [28]. Similarly, the free energy of a paired protofilament is given by Eq. (2). Here $N_s(N)$ is the number of lateral (side) bonds in an $N$-size protofilament pair: it is zero for $N \leq 2$, and we assume a symmetric $N_s = (N-1)$ for larger $N$ values. The free energy of a side bond is $\Delta_s = -22\,k_BT$ [52]. Finally, $N_\beta(N)$ is the number of longitudinal β-bonds of energy $\Delta_\beta$ in a paired protofilament of length $N$, which is zero at $N = 1$, equal to 1 at $N = 2$, and then $(N-2)$ for other $N$ values (cf. Fig. 3). It is noticeable that lateral addition of one β-mer to a pre-formed protofilament would give an interaction energy of $2\Delta_s$: almost exactly the same magnitude as $\Delta_\beta$. This immediately suggests that lateral addition is as likely as addition along the fibril axis, and should be experimentally observed in fibril formation. This indication is indirectly supported by the fact that no mature fibrils have frayed ends in amyloid aggregation, which means that a protofilament pair is likely to form at the nucleation stage, and is further reinforced by observations of lateral addition at the early stage of fibril formation of human amylin [43,53].
Obviously, multi-protofilaments will have more protofilaments joined through lateral addition. Initiating the growth of an extra protofilament by side addition of one β-mer to the pre-existing β-aggregate contributes a binding energy of only $2\Delta_s$, which is weaker than the $(\Delta_\beta + \Delta_s)$ gained by the alternative addition of one β-mer at the fibril end, which keeps the original protofilament number. Therefore, multi-protofilament cases will not have a more favorable bond free energy, and the aggregation free energy would not favor the multi-protofilament over the paired case at the nucleation stage. The difference between the protofilament pair and the multi-protofilament is that a larger critical nucleus and a higher nucleation free energy barrier are required for multi-protofilament nucleation. The presence of multi-protofilament cases in mature A$\beta_{1-42}$ fibrils is due to other free energy contributions that originate from the twisted structure of fibrils and become significant as fibrils grow longer. This effect does not contribute at the nucleation stage, where the aggregate size is small, and its analysis is outside the scope of this work. Consequently, multi-protofilament cases will not be discussed further.
The comparison of the free energies of single and paired protofilaments is given in Fig. 4. Due to the increase in the number of bond sites per monomer addition in the protofilament pair when $N$ exceeds 3 (one at the end and the other on the side of the protofilament), the protofilament pair has a lower free energy than a single protofilament with the same number of units. A thermally stable aggregate is one that has a larger population than monomers when thermal equilibrium is reached, i.e. a negative free energy with respect to the reference state, chosen to be α-monomers in solution. Single protofilaments cannot be thermally stable and must transform into protofilament pairs through lateral addition at the early stage of nucleation. We therefore conclude that the critical nucleus size is always $n_c = 3$ in the paired protofilament. The free energy barrier of aggregation in the NP model is then equal to $F_{\beta,2}(3, C_1)$.
The $n_c$ value we predict here is close to, but not the same as, the $n_c = 2$ obtained by recent experimental studies of A$\beta_{1-42}$ [8,54]. This inconsistency may be due to the denaturant used in experiments (i.e. sodium azide) to initiate amyloid aggregation, which can significantly change the interactions between monomers; our interaction parameters do not cover this effect. Kashchiev and Auer also analyzed the nucleation free energy of β-strands in a 2D nucleus model [39], yet they found a variable critical nucleus size, which we do not observe in the examined concentration regime. In their work, the $n_c$ value starts from a rather large size (over 40 β-strands) at low concentration, and then shrinks as $1/(\ln C_1)^2$. In this sense, our result, as well as the experimental work by Knowles [54], is likely to fall within the high concentration/saturation regime, where a further increase in concentration has much less effect on the critical nucleus size than in the low-concentration regime that Kashchiev and Auer were interested in. However, later in this paper we will demonstrate that the two-step mechanism of nucleation becomes prevalent at higher monomer concentrations.
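The bond counting above can be checked numerically. The sketch below assumes that Eqs. (1) and (2) take the form $F = N\Delta_c + N_\beta\Delta_\beta + N_s\Delta_s - (N-1)(\mu_0 + k_BT\ln C_1)$, with no side bonds for the single protofilament. This form is a reading of the bond scheme that agrees with the $-2\ln C_1$ dependence of $F_{\beta,2}(3, C_1)$, not a quotation of the equations; the function names are illustrative.

```python
import math

# Parameter values quoted in the text, in units of k_B T (C_1 in mM).
MU_0 = -34.5        # translational + rotational free energy of an alpha-mer at 1 mM
DELTA_BETA = -44.0  # longitudinal beta-bond
DELTA_S = -22.0     # lateral (side) bond
DELTA_C = 20.0      # alpha -> beta conversion penalty

def n_beta(n):
    """Longitudinal bonds in a paired protofilament: 0, 1, then N - 2."""
    return 0 if n == 1 else (1 if n == 2 else n - 2)

def n_side(n):
    """Side bonds in a paired protofilament: 0 for N <= 2, then N - 1."""
    return 0 if n <= 2 else n - 1

def f_single(n, c1_mM):
    """Assumed form of Eq. (1): single protofilament with N - 1 beta-bonds."""
    return n * DELTA_C + (n - 1) * DELTA_BETA - (n - 1) * (MU_0 + math.log(c1_mM))

def f_paired(n, c1_mM):
    """Assumed form of Eq. (2): paired protofilament."""
    return (n * DELTA_C + n_beta(n) * DELTA_BETA + n_side(n) * DELTA_S
            - (n - 1) * (MU_0 + math.log(c1_mM)))

def critical_nucleus_size(c1_mM, n_max=12):
    """Size at which the paired-protofilament free energy peaks."""
    return max(range(1, n_max + 1), key=lambda n: f_paired(n, c1_mM))
```

Under these assumptions the peak of `f_paired` stays at $N = 3$ when `c1_mM` is varied between 1 µM and 10 mM, while `f_single` remains positive at 1 mM for every size.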
Our conclusion that critical nuclei are exclusively in the form of paired protofilaments is supported by the fact that only multi-protofilament aggregates are observed at neutral pH and high ionic concentration 43 . It is arguable that our conclusion cannot be applied at low pH and ionic strength, where single protofilaments do appear 43 , because molecular simulations giving bond free energy parameters, ∆ β and ∆ s , were only implemented under neutral pH values so far (and so we do not know the values of these energy parameters in other situations). We shall not worry about this limitation of our model since it is under physiological conditions, at neutral pH, that we intend to investigate the problem.

A. Nucleation rate of the NP mechanism
Since the rate-limiting (slow) process is the nucleation itself, while the subsequent elongation is fast, we can take the rate of producing $(n_c + 1)$-mers, i.e. tetramers here, as the measure of the nucleation rate in the NP model. We assume a thermal pre-equilibrium for the population of critical nuclei, with $n_c = 3$ here, and then use the theoretical expression for the elongation rate of a pre-existing fibril obtained in ref. [55].
Two consecutive processes are involved in elongation: the diffusive arrival of a monomer at the fibril end, and the attempt to cross an additional free energy barrier to achieve internal conversion, leading to a frequency factor $\exp(-\Delta F_{el}/k_BT)/(\tau_D + \tau_I)$ [55]; see Appendix B for the derivation. Here $\Delta F_{el}$ is the barrier to overcome in elongation, roughly $3.42\,k_BT$ for the A$\beta_{1-42}$ peptide [56,57]. The time scale $\tau_I$ is that of the internal α-to-β rearrangement of the amyloidogenic species (estimated as $10^{-5}$ s in Appendix A), while $\tau_D$ is the arrival time at the fibril end. The rate of producing tetramers, $k_1 C_1$, is the product of this frequency factor and the concentration of trimers, Eq. (3). The rate constant $k_1$ is interpreted as the nucleation rate constant for the NP mechanism. The arrival time $\tau_D$ is set by the Smoluchowski diffusion rate in solution (assuming no crowding effects). Taking into account the non-spherical shape of critical nuclei, one can modify the Smoluchowski theory for bimolecular reaction rate constants [58], as well as the Stokes-Einstein equation for diffusion coefficients, producing the estimate $1/\tau_D = 4k_BT\,C_1 f_{geo}/3\eta r_1$, where $\eta$ is the solvent viscosity and $r_1$ is the hydrodynamic radius of one α-mer in solution. Note that $4k_BT\,C_1/3\eta r_1$ is the classical Smoluchowski result, modified here by the geometrical factor $f_{geo}$ that accounts for the shape of the β-aggregate (this factor makes only a small correction and will not influence the conclusions of this paper). With this expression for the arrival time, the rate constant of the NP mechanism, $k_1$ in Eq. (3), is expressed as Eq. (4). Equation (4) shows that $k_1$ is proportional to $C_1^3$ when the monomer concentration $C_1$ is low and the term $f_{geo} C_1 \tau_I$ can be neglected in the denominator (remember that $F_{\beta,2}(3, C_1)$ is proportional to $-2\ln C_1$). In contrast, when the term $f_{geo} C_1 \tau_I$ dominates in the denominator at high monomer concentration, only the square of the concentration is left in the rate constant: $k_1 \propto C_1^2$.
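The two limiting exponents can be seen from the concentration dependence alone. The sketch below assumes that, up to a constant prefactor, $k_1 \propto C_1^3/(1 + C_1/C_\times)$, which is what results from combining $1/\tau_D \propto C_1$ with a trimer weight $\propto C_1^2$. The crossover concentration $C_\times$ (where $\tau_D = \tau_I$) is a hypothetical lumped parameter introduced here for illustration, and its value is not computed.

```python
import math

def k1_shape(c1, c_cross):
    """Concentration dependence of k_1 up to a constant prefactor (assumed form)."""
    return c1 ** 3 / (1.0 + c1 / c_cross)

def local_exponent(c1, c_cross, rel_step=1e-4):
    """Numerical d ln k_1 / d ln C_1 by a central difference in ln C_1."""
    up = k1_shape(c1 * (1.0 + rel_step), c_cross)
    down = k1_shape(c1 * (1.0 - rel_step), c_cross)
    d_ln_c = math.log(1.0 + rel_step) - math.log(1.0 - rel_step)
    return (math.log(up) - math.log(down)) / d_ln_c

# Concentrations are measured in units of the hypothetical crossover value.
low = local_exponent(1e-4, 1.0)   # far below the crossover
mid = local_exponent(1.0, 1.0)    # at the crossover
high = local_exponent(1e4, 1.0)   # far above the crossover
```

The local exponent moves smoothly from 3 to 2 across the crossover, so a log-log fit over a window that straddles $C_\times$ would return a non-integer apparent exponent.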

Micellation of soluble peptides
Several experiments have reported that amyloidogenic proteins or peptides in their native state can coalesce into a single micelle (some papers may use the term oligomer in this context) utilizing solvent-exposed hydrophobic patches on the surface of monomers [59][60][61][62] . For example, taking lysozymes (a type of amyloidogenic proteins), micelles have a lower content of α-helix and a higher content of β-sheet yet with the majority being disordered loops compared with native monomeric units 62 . Therefore, aggregated micelles are often classified as amorphous. It is suggested that these amorphous micelles serve as intermediates, or initiation states for amyloid nucleation 63 . This generic hypothesis of the role of micelles is indirectly hinted by the increase in β-sheet structural content while losing their original α-helix content 62 , and is further supported by molecular simulation of coarse-grained peptides 29,64 . These facts imply an alternative aggregation pathway for amyloid fibrils: the two-step nucleation mechanism as described in Fig. 2.
In our work, micelles are defined as being composed of a few α-mers aggregated from solution. Though micelles could have a variety of structures [65], we will assume them to be amorphous globular aggregates of densely packed monomers for simplicity. The driving force for micellation of A$\beta_{1-42}$ peptides originates mainly from the hydrophobic interaction between α-mers [29,66]. We define one α-bond as the bond formed between a pair of α-mers inside a micelle. The total number of such α-bonds, $N_\alpha(N)$, for a spherical $N$-size micelle can be assumed to have a bulk and a surface term. The requirement that the expression for $N_\alpha(N)$ reduce to $N_\alpha(2) = 1$ fixes one of the two coefficients. With only hydrophobic attraction accounted for, an infinite micelle size could be expected in equilibrium. However, aggregation is constrained by other, unfavorable free energy factors, e.g. the electrostatic repulsion due to accumulation of negative charge on the surface [67], and the entropic loss from the compact packing of several α-mers into a micelle [68]. Therefore, a large micelle size is never observed in experiment.
A crude but convenient way to estimate the electrostatic repulsion is to assume that the electrostatic charges are distributed evenly on the spherical surface of the packed micelle. The repulsion energy is then $(N q_e)^2/8\pi\epsilon\epsilon_0 r_N$ [69], where $q_e$ is the effective charge on a single α-mer, $\epsilon$ and $\epsilon_0$ are the relative dielectric constant and the vacuum permittivity, and $r_N$ is the radius of the $N$-size micelle, proportional to $N^{1/3}$ if we assume it is roughly spherical. Accordingly, the electrostatic potential energy has the overall scaling of $N^{5/3}$. The screening effect of counter-ions in solution may challenge the validity of using the expression $(N q_e)^2/8\pi\epsilon\epsilon_0 r_N$ for the potential energy of electrostatic repulsion. However, Debye screening requires mobile charges in thermal motion. It would undoubtedly exist between any two charged micelles or monomers in solution. Yet the potential energy quoted above refers to the energy needed to confine charges in a small volume when starting from a well-separated distance (i.e. the interaction of immobile surface charges across the packed micelle itself). In this case, no counter-charges exist inside micelles, and even if there were some, they would have very little mobility due to the compactness of the internal structure. Therefore, Debye screening cannot play a role. This enables the use of the electrostatic potential in an effective dielectric medium in our case.
In addition, there is an entropic cost of forcing polar amino acid groups to the micelle surface, an effect well studied in the formation of micelles of polar surfactants [68]. The simplest expression for the free energy of this reduction in conformational freedom turns out to scale as $N^{5/3}$, in analogy with the electrostatic repulsive energy of a sphere with evenly distributed surface charge [70]. Although the actual entropic repulsion term can take more complex forms, this $N^{5/3}$ scaling can be the leading term, and helps to correctly predict the experimentally determined concentration threshold at which monomers start aggregating into micelles [68]. Therefore, we account for these two repulsive free energy contributions (electrostatic and entropic) as a single term $hN^{5/3}$, with its parameter $h$ to be determined.
Assembling the free energy of the α-bonds and the repulsive free energy terms, the micellation free energy $F_{mic}(N, C_1)$ can be written as Eq. (5). Note the close resemblance between this expression and Eqs. (1) and (2) for protofilaments. Here the important parameter is $\Delta_\alpha$, the free energy of an attractive α-bond, which is approximately $-17\,k_BT$, estimated from the work of Hills and Brooks [71] (see Appendix A). $F_{mic}(N, C_1)$ is analogous to the free energy of micelle formation in surfactant solutions, or of flocculation in colloids. At a certain threshold monomer concentration, monomers and micelles of a specific size are equally favored at thermal equilibrium; this is the 'critical micelle concentration' (cmc), and the corresponding micelle size is the 'critical micelle size' (cms). In fact, the presence of critical micelles in the Aβ peptide system with different solvents and pH values has already been reported [35,72-74]. The micellation free energy $F_{mic}(N, C_1)$ and its slope with respect to $N$ are both zero at $N$ = cms when $C_1$ = cmc, as in any coexistence equilibrium. These two independent conditions let us evaluate the parameters $A$ and $h$ from the experimentally determined values cmc = 17.6 µM and cms = 25, which were measured in the solvent system that most closely reproduces physiological conditions [61]. In this way we obtain the parameters to be used in the rest of this work: $A = 4.86$ (dimensionless) and $h = 1.6\,k_BT$. Both values make good physical sense, although we shall not spend any more time on this discussion.
We can now plot $F_{mic}(N, C_1)$, at the cmc and at several other values of monomer concentration, in Fig. 5. At low monomer concentrations (below approximately 1 µM for our chosen set of parameters), $F_{mic}(N, C_1)$ is a monotonically increasing function of $N$, and no metastable micelles can exist. This threshold concentration can be easily derived from Eq. (5). It is therefore impossible to have the two-step nucleation mechanism for amyloid aggregation below 1 µM in the A$\beta_{1-42}$ system. When the monomer concentration exceeds 1 µM, $F_{mic}(N, C_1)$ has a metastable state, and a free energy barrier that must be crossed to reach a micelle. We define the micelle size that gives the barrier position as $N_h$, and the size that sits at the lowest point of the free energy trap as $N_l$, both labelled in Fig. 5. Only micelles with a size between $N_h$ and $N_l$ are metastable and can follow the two-step nucleation mechanism. Since the barrier occurs at relatively low $N$, the stabilizing effect of the repulsive $h$-term is not yet important there, and we can find an approximate expression for the barrier position, useful in the subsequent analysis; $N_h$ varies between $\sim 4.5$ at the lowest $C_1$ and $\sim 2$ at the highest.
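The $N^{5/3}$ scaling of the electrostatic part can be checked directly from the surface-charge expression $(Nq_e)^2/8\pi\epsilon\epsilon_0 r_N$ with $r_N = r_1 N^{1/3}$. In the sketch below the physical constants are standard, but the effective charge per α-mer, the relative dielectric constant and the monomer radius are hypothetical inputs chosen only for illustration; they are not values from this work.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge, C
EPS_0 = 8.8541878128e-12    # vacuum permittivity, F/m
K_B = 1.380649e-23          # Boltzmann constant, J/K

def shell_energy_kBT(n, q_eff, eps_r, r1_m, temp_K=298.15):
    """Energy of charge N*q_eff spread evenly on a sphere of radius r1*N^(1/3), in k_B T."""
    charge = n * q_eff * E_CHARGE
    radius = r1_m * n ** (1.0 / 3.0)
    energy_J = charge ** 2 / (8.0 * math.pi * eps_r * EPS_0 * radius)
    return energy_J / (K_B * temp_K)

# Hypothetical inputs: one effective charge per alpha-mer, eps_r = 80, r_1 = 1 nm.
e_1 = shell_energy_kBT(1, 1.0, 80.0, 1e-9)
e_8 = shell_energy_kBT(8, 1.0, 80.0, 1e-9)
ratio = e_8 / e_1  # scaling check: should equal 8**(5/3)
```

The ratio depends only on the scaling law, not on the hypothetical inputs, which is why the prefactor can be absorbed into the single fitted parameter $h$.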
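The role of the cmc and cms conditions and the roughly 1 µM threshold can be reproduced in a few lines. The sketch assumes $N_\alpha(N) = AN - (2A-1)(N/2)^{2/3}$ (a bulk term plus a surface term, constrained by $N_\alpha(2) = 1$) and $F_{mic} = \Delta_\alpha N_\alpha(N) + hN^{5/3} - (N-1)(\mu_0 + k_BT\ln C_1)$. These forms are a reconstruction consistent with the quoted parameter values, not a quotation of Eq. (5), and the function names are illustrative.

```python
import math

# Parameter values quoted in the text, in units of k_B T (C_1 in mM).
MU_0 = -34.5
DELTA_ALPHA = -17.0
A = 4.86
H = 1.6

def n_alpha(n):
    """Assumed alpha-bond count: bulk term minus surface term, with N_alpha(2) = 1."""
    surface_coeff = (2.0 * A - 1.0) / 2.0 ** (2.0 / 3.0)
    return A * n - surface_coeff * n ** (2.0 / 3.0)

def f_mic(n, c1_mM):
    """Assumed form of the micellation free energy, Eq. (5)."""
    return (DELTA_ALPHA * n_alpha(n) + H * n ** (5.0 / 3.0)
            - (n - 1) * (MU_0 + math.log(c1_mM)))

def has_metastable_micelle(c1_mM, n_max=80):
    """True if F_mic decreases somewhere for N = 2..n_max, i.e. is not monotonic."""
    values = [f_mic(n, c1_mM) for n in range(2, n_max + 1)]
    return any(later < earlier for earlier, later in zip(values, values[1:]))

CMC_mM = 0.0176  # 17.6 micromolar
f_at_cms = f_mic(25, CMC_mM)
slope_at_cms = (f_mic(25.01, CMC_mM) - f_mic(24.99, CMC_mM)) / 0.02
```

With the rounded parameters, both `f_at_cms` and `slope_at_cms` come out within a fraction of $k_BT$ of zero, and `has_metastable_micelle` switches from false at 0.5 µM to true at 2 µM, in line with the threshold quoted above.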

Conversion of micelles
Based on two structural facts about fibrils, we can assume that the α-β conversion happens on the surface of the remaining α-mers in the packed micelle, as schematically illustrated in Fig. 2. First, due to the twisted structure of β-aggregates, it would cost more free energy to twist inside a dense amorphous micelle than to be unconstrained on the surface of this micelle [47]. Second, α-mers interact with the pre-formed β-aggregate mainly at the end of the emerging fibril. This is because the region of exposed hydrophobic groups that attaches to the β-aggregate is at the end of a β-stack. Accordingly, α-mers tend to gather near the end of the fibril to gain more interaction energy.
We choose $x \in (0, N)$, the number of converted α-mers, as the reaction coordinate in a micelle of size $N$. The free energy along the conversion path, $F_c(N, x, C_1)$, is composed of four contributions: the free energy of the remaining $(N-x)$-size micelle, Eq. (5); the free energy of the emerging $x$-size β-aggregate, Eq. (2); the bonding free energy of the interface between the micelle and the aggregate, which will be detailed later; and an additional free energy term $-(\mu_0 + k_BT\ln C_1)$ for translational and rotational motion, which compensates for one center-of-mass degree of freedom that is present in the separate expressions for the α-micelle and the β-aggregate, but is removed when they are bound and move as one entity. Organizing these separate terms gives the conversion free energy $F_c(N, x, C_1)$, in which $\Delta_{\alpha\beta}$ is the free energy per αβ bond, defined as the bond formed between an α-mer and a β-mer. This is the parameter we know the least about; the range of reasonable $\Delta_{\alpha\beta}$ values is given in Appendix A. We now proceed to find the expression for $N_{\alpha\beta}(N, x)$, the total number of αβ bonds in this intermediate aggregate. Under the assumption that αβ bonds originate from the replacement of pre-existing α-bonds, and that the emerging β-aggregate is located on the surface of the micelle, we can compare two expressions for $N_\alpha$: one counting the β-mers at the contact interface as not yet converted α-mers, and the other with the actual remaining $(N-x)$ α-mers. The resulting expression is a non-linear function of $N$ due to the surface term present in the definition of $N_\alpha$. When $x$ is smaller than 3 (giving only a single-protofilament geometry), the number of αβ bonds follows directly from this comparison; for larger $x$ the expression contains an additional $-1$, which is due to the replacement of one original α-bond with the β-bond that bridges two protofilaments on the micelle surface instead of contributing to $N_{\alpha\beta}(N-x)$.
In Fig. 6 we examine the conversion free energy of a hexamer ($N = 6$) as an illustration of two key features of the free energy evolution during conversion: the 'peak' and the 'trap' in $F_c(N, x)$. The peak of the conversion free energy is always at $x = 2$, no matter how strong the coupling $\Delta_{\alpha\beta}$ is. This is a generic characteristic for micelle sizes larger than 4, due to the presence of the second protofilament at $x = 3$, which almost doubles the number of αβ bonds and has an enormous stabilizing effect, dragging down the value of $F_c$. A local free energy minimum, which we call a 'trap', could exist just before this barrier, at $x = 1$ (a minimum appears at $\Delta_{\alpha\beta} = -22\,k_BT$ in Fig. 6). The difference $N_{\alpha\beta}(N, 2) - N_{\alpha\beta}(N, 1)$ is negative, while the difference in $N_{\alpha\beta}$ between $x = 1$ and $x = 0$ is positive; this ensures that the trap can only be at $x = 1$. It should be noted that even though the local minimum appears only at $\Delta_{\alpha\beta} = -22\,k_BT$ in the hexamer case of Fig. 6, it can be present at other $\Delta_{\alpha\beta}$ values for a larger micelle size, when the value of $N_{\alpha\beta}(N, 1) - N_{\alpha\beta}(N, 0)$ and the magnitude of the negative difference $F_c(N, 1, C_1) - F_c(N, 0, C_1)$ increase.
The conversion barrier for an N -size micelle, ∆F c , is defined as the free energy difference between the peak and the minimum (the trap or x = 0) along the reaction coordinate of conversion. In other words, the barrier ∆F c is evaluated either as the difference F c (N, 2) − F c (N, 0), or instead as F c (N, 2) − F c (N, 1) when a trap is present, as eventually always happens at a sufficiently large N . As a result, ∆F c is not a simple single-valued function.

Nucleation rate of the two-step nucleation mechanism
Each individual nucleation path in the two-step mechanism is characterized by the final size N of the micelle that forms along the pathway. The nucleation free energy landscape is plotted by combining F mic and F c . For a micelle size between N h and N l (see Fig. 5), the nucleation free energy landscape has two barriers, one for micelle formation and the other for the α-β conversion, with a metastable (intermediate) state in between. This pathway is illustrated in Fig. 7 for the case of a hexamer micelle (N = 6) and C 1 = 1 mM. The plot starts with the expression for the micelle free energy F mic (N ), which passes over the micelle nucleation barrier at N h ≈ 3 and then starts decreasing towards the micelle minimum (which for this concentration is at N l = 35). However, at the micelle size N = 6 the conversion starts, and the remaining part of the plot gives the free energy F c (6, x). This part is not a continuous but a piecewise function of x, the same as in Fig. 6, because the expressions for the bond counts N s , N β , and N αβ are all piecewise. This, however, does not affect the conclusions on the barrier height or kinetic parameters calculated below. This overall free energy profile allows the use of a three-state kinetic model.
Several approaches can be used for this three-state kinetics problem, the most direct being the application of the general Kramers-style analysis of steady-state flux 75,76 . However, even a simpler problem, with just one variable (degree of freedom) and a potential with two maxima surrounding the intermediate state, cannot be solved analytically except in limiting cases when one barrier is much higher than the other. In this work, this is further complicated by the fact that we have two separate variables, N for the micelle size and x for the degree of conversion, and in most cases the free energy barriers in Fig. 7 are not high when viewed from the intermediate state. Therefore, no formal analytical solution is anticipated in this diffusion-type scheme.
Instead, we choose a different method of evaluating the overall nucleation rate. We separate the nucleation process into three distinct elements: the transition rate of association from the monomer into the intermediate state (written as k + C 1 ), the reverse transition rate of dissociation from the intermediates back to monomers (as k − C int ), and the transition rate of conversion from the intermediate state into the final β-aggregate state (k c C int ), where C int is the concentration of micelles (the intermediate species). Then we derive the effective nucleation rate from monomers to the final aggregate, defined as k 2 C 1 , and its rate constant k 2 .
Strictly, there has to be a reverse process from the β-aggregate back to the intermediate micelle state, and also a process of explosive instant dissociation of the β-aggregate into monomers. These two processes are necessary if one is to find an equilibrium steady-state solution to this problem. However, our interest is to study the amyloid nucleation from solution, which is a very non-equilibrium process: the first stage of a fibril growth process leading to much lower free energy states where one would need to investigate the possible equilibration 22 . Besides, the subsequent elongation of the aggregate larger than the critical nucleus size is usually rather rapid, and therefore the reverse conversion from the final aggregate state back to the metastable state is slow and can be ignored. Consequently, we ignore the two very low-probability processes and regard the final β-aggregate state as irreversible.
We use a strategy similar to the probabilistic single-molecule approach by King and Altman for tackling the Michaelis-Menten enzymatic reaction 77 . We evaluate the average time to reach the final state of nucleation over an infinite set of paths that differ in the number of repeated micellation and dissociation events. This gives the effective nucleation rate constant k 2 for one two-step pathway (see Appendix C for derivation):

$$k_2 = \frac{k_+ k_c}{k_c + k_- + 2k_+}. \qquad (7)$$

Although this effective rate constant k 2 is slightly different from the traditionally used expression (see ref. 78 for instance), Eq. (7) recovers the average rate constant of the steady-state approximation when k c + k − ≫ k + , giving the more familiar expression k 2 = k + k c /(k c + k − ). Two limiting cases further validate Eq. (7). When k c ≫ k − , k + , the rate-limiting process is the micelle formation, and k 2 = k + . This result is reasonable since the metastable species quickly converts into the aggregate. The other limiting case is k − ≫ k + , k c . In this case k 2 = k + (k c /k − ) ≪ k + , which reflects the repeated alternation between the monomeric and intermediate states: completion of the reaction becomes a rare event. Both are also limiting cases of the steady-state kinetics.
We now proceed to find explicit expressions for the rate constants k + , k − and k c . A convenient way to estimate the micellation rate of N monomers is to separate the formation of an N -size micelle into the formation of one (N − 1)-size pre-micelle followed by one α-mer attachment. In this scheme, we use the Smoluchowski rate of forming a spherical micelle of size N : 4πD m (r 1 + r N −1 )C N −1 C 1 , where D m is the mutual diffusion coefficient, C N −1 the concentration of (N − 1)-size micelles, and r 1 and r N −1 the radii of one monomer and an (N −1)-size micelle 79 . To form an (N −1)-size micelle, there is a free energy barrier ∆F mic to cross, since (N −1) is always larger than N h in the three-state kinetic model. To find the concentration C N −1 , we separate it into two parts: a pre-thermal equilibrium concentration of N h -size micelles is first assumed to be reached, and then the simultaneous adsorption of (N − N h − 1) monomers to this pre-micelle takes place, with the pre-micelle assumed to act as a deep adsorbing sink (see Appendix D for derivation). Assembling the terms, k + takes the form:

$$k_+ = \omega_+ \, e^{-\Delta F_{mic}/k_B T}, \qquad (8)$$

where the prefactor ω + collects the diffusion and monomer-adsorption factors derived in Appendix D. Consequently, the product k + C 1 implicitly has the factor $C_1^N$ in it, and therefore corresponds to the kinetic expression for the N -particle collision, which validates our formulation for the constant k + .
In a system where the monomer concentration is maintained constant (either by re-supplying the depleted molecules from a reservoir, or simply ignoring the small change at the nucleation stage), the micelle dissociation rate constant k − can be derived from the thermal equilibrium condition that requires the dissociation rate to equal the micellation rate, namely the equation k − = k + exp(F m /k B T ). The expression for k − is then:

$$k_- = \omega_+ \, e^{-(\Delta F_{mic} - F_m)/k_B T}. \qquad (9)$$

Here F m is the free energy of the metastable state: the micelle of size N , see Fig. 7. The expression for F m has the term (1 − N )k B T ln C 1 arising from Eqs. (5) or (6), and therefore the product of $C_1^{N-N_h}$ in ω + and exp[−(∆F mic − F m )/k B T ] does not have an explicit power-law dependence on C 1 . The rate constant k − depends on C 1 only implicitly, and weakly, through N h that appears in ∆F mic . At a fixed monomer concentration, an increase in N decreases the value of k − because the α and αβ bond free energies in ∆F mic − F m become more negative, which results in a smaller probability of breaking all the bonds to release monomers.
The rate constant of the conversion reaction k c is found directly from the Kramers escape theory, and is the product of the activation factor over the barrier ∆F c and the attempt frequency from thermal fluctuations of positions of the Aβ 1−42 segments, 1/τ I , which first appeared in the derivation of k 1 , Eq. (3):

$$k_c = \frac{1}{\tau_I} \, e^{-\Delta F_c/k_B T}. \qquad (10)$$

With the three rate constants determined, we can write down the overall rate constant of amyloid nucleation k 2 for a given size N . The full expression is cumbersome, but let us examine k 2 in two limiting cases. One is k 2 ≈ k + when k c ≫ k − , k + (micelle formation from solution is the rate-limiting process); the other is k 2 ≈ k + (k c /k − ) when k − ≫ k + , k c (conversion of a micelle is the rate-limiting process). These two extreme cases give the expressions $\omega_+ e^{-\Delta F_{mic}/k_B T}$ and $e^{-(F_m+\Delta F_c)/k_B T}/\tau_I$, respectively. When conversion is the major barrier, k 2 depends on the highest free energy, i.e. F m + ∆F c , cf. Fig. 7, and is the product of a quasi-equilibrium population of the metastable state and the probability of crossing the conversion barrier. On the other hand, when the micellation process is slow, k 2 depends on the ∆F mic barrier only. In general, Eq. (7) is a mixture of these three kinetic processes.
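The two limiting regimes described above can be illustrated numerically. The following is a minimal sketch, not the calculation used for the figures: it assumes the Arrhenius-type forms implied by the limiting expressions quoted in the text (k + = ω + exp(−∆F mic /k B T ), k − = k + exp(F m /k B T ), k c = exp(−∆F c /k B T )/τ I ), and it combines them through the steady-state formula k 2 = k + k c /(k c + k − ) rather than the full Eq. (7). The function names and the numerical values of ω + and of the free energies are hypothetical placeholders; only τ I = 10 −5 s is taken from Appendix A.

```python
import math

def rate_constants(omega_plus, dF_mic, F_m, dF_c, tau_I):
    """Return (k_plus, k_minus, k_c). All free energies are in units of k_B T."""
    k_plus = omega_plus * math.exp(-dF_mic)   # micelle formation
    k_minus = k_plus * math.exp(F_m)          # dissociation: k- = k+ exp(F_m / k_B T)
    k_c = math.exp(-dF_c) / tau_I             # Kramers-type conversion rate
    return k_plus, k_minus, k_c

def k2_steady_state(k_plus, k_minus, k_c):
    """Steady-state effective rate constant k2 = k+ kc / (kc + k-)."""
    return k_plus * k_c / (k_c + k_minus)

TAU_I = 1e-5        # s, value quoted in Appendix A
OMEGA_PLUS = 1e9    # 1/s, hypothetical prefactor chosen only for illustration

# Conversion-limited case (k- >> kc): shallow metastable state, F_m = +25 k_B T.
kp, km, kc = rate_constants(OMEGA_PLUS, dF_mic=30.0, F_m=25.0, dF_c=20.0, tau_I=TAU_I)
k2_conv = k2_steady_state(kp, km, kc)
k2_conv_limit = math.exp(-(25.0 + 20.0)) / TAU_I   # exp(-(F_m + dF_c)) / tau_I

# Micellation-limited case (kc >> k-): deep metastable state, F_m = -20 k_B T.
kp2, km2, kc2 = rate_constants(OMEGA_PLUS, dF_mic=30.0, F_m=-20.0, dF_c=20.0, tau_I=TAU_I)
k2_mic = k2_steady_state(kp2, km2, kc2)

print(f"conversion-limited:   k2 = {k2_conv:.3e}, limit = {k2_conv_limit:.3e}")
print(f"micellation-limited:  k2 = {k2_mic:.3e}, k+ = {kp2:.3e}")
```

In the first case the prefactor ω + cancels and k 2 is set by F m + ∆F c alone; in the second case k 2 follows k + and is insensitive to the conversion barrier.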

The fastest growing nuclei
Since the monomer concentration in experiments is mostly within the range from µM to a few mM, and using the standard concentration of 1 mM, we let ln C 1 vary from −6.5 (i.e. C 1 ≈ 1.5 µM, approximately the threshold monomer concentration to initiate the two-step mechanism, discussed in the section on micellation of soluble peptides, Fig. 5) up to ln C 1 = 2 (corresponding to C 1 ≈ 7.4 mM). Values of C 1 outside this common experimental range will not be discussed further. From Eqs. (4) and (7)-(10), we evaluate $\ln(k_2/k_1^\circ)$ for different micelle sizes at several monomer concentrations, where the reference rate constant $k_1^\circ$ is the value of the rate constant k 1 of the NP mechanism at ln C 1 = −6.5, the lowest monomer concentration we investigate. In this way, the fastest rate of two-step nucleation can be detected, identifying the dominant micelle species N * in this nucleation mechanism. The x-axis in the ln k 2 plot varies from N h to N l , which is the range of the metastable micelle sizes that can adopt the two-step mechanism for each chosen monomer concentration C 1 . For clarity and convenience, we normalize the plots of ln k 2 to extend between 0 and 1. The width of the peak of nucleation rate, if such a peak exists, is defined as the range of micelle sizes at the level of 0.5 on the y-axis. All the parameters involved in the estimation of k 2 are listed in Appendix A, with room temperature for T and the viscosity of water at room temperature for η. The results for ln k 2 (N ), at ∆ αβ = −18 k B T as an illustrative example, are shown in Fig. 8. The plots in Fig. 8 show two main features: the peak of the nucleation rate that occurs at a certain micelle size N * (C 1 ), and the width of this peak, which increases as the monomer concentration increases.
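The conversion between ln C 1 and absolute concentration used above is simple arithmetic, shown here as a short sketch. The only input is the standard concentration of 1 mM stated in the text; the function name is ours.

```python
import math

C_STANDARD_MM = 1.0  # standard concentration of 1 mM, as stated in the text

def concentration_mM(ln_c1):
    """Monomer concentration in mM for a given dimensionless ln C1."""
    return C_STANDARD_MM * math.exp(ln_c1)

for ln_c1 in (-6.5, 0.0, 2.0):
    c_mM = concentration_mM(ln_c1)
    print(f"ln C1 = {ln_c1:5.1f}  ->  C1 = {c_mM:.4g} mM = {1000.0 * c_mM:.4g} uM")
```

This reproduces the quoted limits of the examined range, about 1.5 µM at ln C 1 = −6.5 and about 7.4 mM at ln C 1 = 2.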
To understand the fastest growing rate, let us examine the contributing effects of micelle formation at a rate k + , and its conversion at a rate k c . For the parameters corresponding to Aβ 1−42 (Appendix A), the conversion barrier ∆F c is always between 10 and 25 k B T for micelle sizes below 40 (which is clearly the maximum micelle size that we will encounter). On the other hand, the barrier to form a micelle, ∆F mic , starts from roughly 50 k B T at ln C 1 = −6.5 and decreases to 20 k B T at ln C 1 = 2, meaning that the rate k + at N = N h is very small at low concentration, and then sharply increases as C 1 grows to finally become comparable with k c at high concentrations. The value of k − at N = N h is very high, which is easy to see by inserting F m = ∆F mic at N = N h in Eq. (9). Then k − decreases dramatically as N grows from N h , due to the $C_1^{N} e^{F_m/k_B T}$ factor in (9). Hence there must exist a particular micelle size N * at which k − ≈ k c . When N < N * , we have the nucleation rate constant $k_2 \approx k_c k_+/k_- = k_c e^{-F_m/k_B T}$, which is a growing function of N . On the other side, at N > N * , k − is small and k 2 ≈ k + , which is a decreasing function of N . Between these regimes we will always find the point N = N * of the fastest rate of two-step nucleation.
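The emergence of a peak at the crossover k − ≈ k c can be reproduced with a toy calculation. The sketch below is purely illustrative: the linear dependences of ln k + and ln k − on N and the constant ln k c are hypothetical choices that only mimic the qualitative trends described above (k − falling steeply with N , k + falling slowly, k c roughly constant), and the steady-state form k 2 = k + k c /(k c + k − ) is used in place of the full Eq. (7). It also applies the normalization to [0, 1] and the half-height width described for Fig. 8.

```python
import math

def ln_k2_steady_state(ln_k_plus, ln_k_minus, ln_k_c):
    """ln k2 for the steady-state form k2 = k+ kc / (kc + k-)."""
    return ln_k_plus + ln_k_c - math.log(math.exp(ln_k_c) + math.exp(ln_k_minus))

def peak_and_width(sizes, ln_k2):
    """Return N* and the (smallest, largest) size with normalized ln k2 >= 0.5."""
    lo, hi = min(ln_k2), max(ln_k2)
    normalized = [(v - lo) / (hi - lo) for v in ln_k2]
    n_star = sizes[ln_k2.index(hi)]
    above_half = [n for n, v in zip(sizes, normalized) if v >= 0.5]
    return n_star, (above_half[0], above_half[-1])

# Hypothetical trends, chosen only to mimic the qualitative behaviour in the text.
sizes = list(range(3, 36))                       # from N_h = 3 to N_l = 35
ln_k_plus = [-10.0 - (n - 3) for n in sizes]     # slowly decreasing with N
ln_k_minus = [20.0 - 4.0 * (n - 3) for n in sizes]  # steeply decreasing with N
LN_K_C = -5.0                                    # taken as constant

ln_k2 = [ln_k2_steady_state(p, m, LN_K_C) for p, m in zip(ln_k_plus, ln_k_minus)]
n_star, half_range = peak_and_width(sizes, ln_k2)
print("fastest-nucleating micelle size N* =", n_star)
print("sizes above half height:", half_range)
```

With these toy inputs the condition ln k − = ln k c is met between N = 9 and N = 10, and the maximum of k 2 falls at the adjacent integer size, as the argument in the text predicts.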
To put it more physically, when a micelle is too small, it suffers a rather large dissociation rate and cannot undergo the full subsequent conversion into fibrils. As micelles grow bigger, they become less prone to dissociation and have adequate time to complete the conversion process. However, a further growth of micelles will decrease the micellation rate. In this case, micellation is the rate-limiting process and the total time of the two-step nucleation will therefore increase with micelle size. Basically, this N * value is the size where the effect of the slow micellation process starts to take over from the other two kinetic processes. Figure 9 illustrates this effective size of the critical nucleus N * (the 'peak' in Fig. 8) against the monomer concentration C 1 , using three values of the α-β interaction parameter ∆ αβ from its plausible range. At very low monomer concentrations, this nucleation size N * starts from a larger value, gradually decreasing to finally reach a plateau that spans a broad range of concentrations. (There is an additional feature: the re-increase of N * for a low ∆ αβ = −18 k B T at very high concentrations, which we do not fully understand and will not discuss further.) These values of the 'critical nucleus' size are within the range obtained by the coarse-grained molecular simulations of Sarić et al. 29 : from 2 to 14; they also fall within the experimentally observed values (2 to roughly 25) 80 . Notably, the fact that N * is not a constant value in Fig. 9 indicates the weakness of the assumption of a fixed, concentration-independent micelle size before conversion, which was employed in the previous two-step kinetic model 36 . Nevertheless, a typical prediction of the critical nucleus size N * ∼ 7-8 over a broad range of monomer concentrations seems to be appropriate. We believe this N * value is reasonable, and is close to the previous work on the two-step nucleation kinetics of amyloid peptides by Lomakin et al. 35 , which gave a nucleus size of 10, although they did not consider the conversion process from micelles into fibrils.
To qualitatively understand the dependence of N * on concentration C 1 , we can use the approximate condition for the peak of the total rate constant k 2 , namely k − ≈ k c , where we take k c to be a constant (it has only a slow variation with N compared with k − ). From Eq. (9) we know that k − does not have a power-law dependence on C 1 , and we further ignore the electrostatic/entropic repulsion term in F mic since the micelle size is quite small and this repulsion free energy is relatively low in this region. The difference N − N h , required for this ratio to reach a constant value, depends on the choice of N h due to the nonlinear term in N α . A small N h requires a larger difference compared with higher N h values. One should further notice that N h depends on C 1 and gradually decreases as C 1 increases (see Fig. 5). Therefore, N * is controlled by two opposite effects: the shift of N h to a lower value with an increased ln C 1 , and the growing N − N h difference required for ln k − to reach a specific constant as ln C 1 increases. At a small C 1 , the shift of N h wins, causing a larger N * value. As C 1 increases, these two effects may offset each other, explaining the plateau region in Fig. 9. The re-increase of the N * value with ∆ αβ = −18 k B T at ln C 1 ≥ 2 is likely due to the other peak condition, k + ≈ k c . At such a high C 1 , the micelle formation rate k + decreases slowly with N , which allows the micelle size N to grow more significantly before k + reaches the required k c value, producing a noticeable increase in N * .
Strictly, the total nucleation rate in the two-step mechanism, k 2 is the sum of all k 2 values of different individual pathways labelled by the micelle size N , at a given monomer concentration C 1 . However, due to the presence of the sharp peak in k 2 (N ), we may ignore the spread of different nucleus sizes and only represent nucleation by the peak rate k 2 (N * ). A comparison of these two values is given in the inset of Fig. 10, proving that the error in ignoring nucleation pathways other than N * is very small. This allows us to investigate how k * 2 changes with increasing monomer concentration, and propose possible explanations for the features present in Fig. 10.

COMPARISON OF NUCLEATION RATES
With Eqs. (4) and (7), a comparison of rate constants of the two competing mechanisms, k 1 (C 1 ) and k 2 (C 1 ), is plotted in Fig. 10. Both rate constants are normalized by the same factor: the k 1 value at ln C 1 = −6.5, the lowest monomer concentration we investigate. The ln k 1 curve is not perfectly linear on the log-log plot of Fig. 10 due to the two competing time-scales in (4). For the concentration regime we investigate, the diffusive arrival time τ D , which is inversely proportional to C 1 , is smaller than the internal re-arrangement time of peptides, τ I . The τ D factor is hence screened off, giving roughly a dependence $k_1 \propto C_1^2$ in Eq. (4). However, in a traditional approach to nucleation, with the critical nucleus size in the NP mechanism at n c = 4 (since we used the rate of producing tetramers as the measure for our nucleation rate), one expects a cubic dependence, $k_1 \propto C_1^3$; this will be observed when the diffusion time is dominant, but not in the relevant concentration regime.
For the three versions of k 2 , for three values of the conversion penalty ∆ αβ , there are two features to observe. First, a linear trend line could be used to fit ln k 2 in the range of ln C 1 between −6.5 and 0, which suggests an effective power-law dependence $k_2 \propto C_1^n$. However, the exponent of this power law does not match the value of the critical nucleus size for the two-step mechanism, N * ≈ 7. Moreover, this slope is not universal and clearly varies with the value of the ∆ αβ parameter.
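The 'apparent exponent' mentioned above is simply the slope of a straight-line fit of ln k 2 against ln C 1 . A minimal least-squares sketch is given below; the data points are synthetic (generated from a hypothetical slope of 4.2 with a small curvature added) and do not represent the curves of Fig. 10. They only show how a slightly curved log-log dependence can still return a well-defined fitted exponent.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Synthetic data on the range -6.5 <= ln C1 <= 0 (hypothetical values).
ln_c1 = [-6.5 + 0.5 * i for i in range(14)]
ln_k2 = [4.2 * x + 1.0 - 0.02 * x * x for x in ln_c1]  # mild curvature

apparent_exponent, intercept = fit_line(ln_c1, ln_k2)
print(f"apparent power-law exponent n = {apparent_exponent:.2f}")
```

The fitted n is an effective quantity of the chosen concentration window; as stressed in the text, it need not coincide with the nucleus size N * .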
We first focus on how ∆ αβ comes into play. At low monomer concentration, we found that k 2 is first determined by the relation k 2 = k + k c /k − , which then changes to k 2 = k + (the crossover at k − = k c is what determines the fastest growing size N * ). Accordingly, an increase in ∆ αβ has a strong effect on the rate k 2 at low C 1 by decreasing the conversion rate k c . In contrast, at higher C 1 we have k 2 = k + , which does not depend on ∆ αβ : the different k 2 lines overlap in the high-concentration regime in Fig. 10.
The crossover concentration at which the two-step mechanism starts to overtake the direct NP mechanism of amyloid nucleation is also strongly dependent on the α-β interaction energy ∆ αβ , which is why it is important to have better estimates of its value. But in all cases this happens in the low- to medium-concentration regime. From Fig. 10, we find that the two-step mechanism takes over from ln C 1 = −6 (2.5 µM) for ∆ αβ = −22 k B T , with this crossover concentration increasing to ln C 1 = −3 (50 µM) for the lower bound ∆ αβ = −18 k B T . The crossover concentration was given as approximately 8 µM by Auer et al. 37 , where the two-step mechanism was also concluded to take place at a higher concentration than the NP mechanism.

CONCLUSIONS
We elucidate the kinetics of two alternative amyloid nucleation mechanisms: the direct (NP) nucleation and the two-step nucleation via aggregation into an α-micelle and its subsequent conversion into the β filament. We used an equilibrium free energy approach to consider the possibility that the critical nucleus and the favorable micelle size change with monomer concentration. This analysis of the composite free energy landscape in the two-step mechanism allows the use of a simplified three-state kinetic model and thus the analytical derivation of the effective nucleation rate constant. We define the critical nucleation size by the fastest growing rate criterion, having verified that this rate dominates a practically measured total rate of nucleation.

[Fig. 10 caption, partial: … is not perfectly linear, but has the slope exponent of approximately 2 at higher C 1 . For comparison, we show the slopes of $C_1^7$ and $C_1^3$, where the exponents are the critical nucleus sizes in the two mechanisms. The inset shows the comparison of the total rate ln k 2 (squares) and the peak rate ln k * 2 (lines), demonstrating that the difference is very small.]
We find that theoretically predicted variation of nucleation rate with monomer concentration could be easily interpreted as a power law in the analysis of experimental data, even though the actual expression is not. For the NP mechanism, the critical nucleus size is determined to be the trimer of the protofilament pair, and this does not change within the monomer concentration range we investigate. On the other hand, in the two-step mechanism, the micelle size with the largest nucleation rate is dependent not only on monomer concentration, but also on the strength of the interaction between α-and β-mers (∆ αβ ). Generally speaking, the optimized micelle size decreases from around 10 to around 7, and then remains at this value for a wide concentration range, pointing out the inappropriateness of the pre-assumption of the fixed micelle size in the previous kinetic approaches.
Although a nearly power-law dependence of both nucleation rates on monomer concentration is observed, the exponent of the 'apparent power law' in the two-step mechanism has no simple relationship to the critical nucleus size, as would be expected from the kinetic formalism of classical nucleation theory. The exponent of $k_2 \propto C_1^n$ also depends on ∆ αβ , and is determined by a complex interplay between micellation, conversion and dissociation processes. Unfortunately, such complexity makes a rigorous analysis of these effective power laws hard to carry out. But it is certain that the critical nucleus size obtained by fitting the Oosawa model, or other kinetic models that all assume the classical nucleation formula in amyloid aggregation, is not actually the real critical nucleus size. One may think of some direct experimental methods that could extract this important nucleation size (in both mechanisms), and thus allow comparison with our theoretical predictions. Unfortunately, no such experiments are yet available, and molecular simulations would instead be a more plausible way to understand amyloid nucleation.
It is also important that the theoretical prediction for the crossover concentration, where the two-step mechanism takes over from the direct filament nucleation, depends on the poorly known interaction strength ∆ αβ . Experiments or simulations are needed to find a more accurate value of this parameter, in order for our predictions to become more quantitative.
The value of ∆ α cannot be determined from experiments and can only be investigated by molecular simulations. Nevertheless, different results have been obtained depending on the constraints on the conformation of peptides that are used in simulations. In ref. 29 and 28 , ∆ α has the value of −8.4 k B T and −6 k B T respectively, but these values may be under-estimated since each α-mer is assumed to still hold the same structure as before micellation, and can only interact through one hydrophobic patch on the surface of the native monomeric state. This scenario is unrealistic because micellation usually involves a structural change to a more stable state, such as a collapsed amorphous micelle. On the other hand, ref. 71 does not make this prior assumption and allows the peptides to reorganize their structure to form an amorphous micelle. However, it only models the peptides with the core sequence that forms the β-sheet in the aggregation, and may thus also undervalue the ∆ α term. Despite this defect, we will use the values given in ref. 71 to estimate the value of ∆ α . The bond interaction is approximately −19.2 kcal/mol, which is taken from the amorphous dimer case of Fig. 3 in ref. 71 . The free energy change due to the entropic loss is evaluated as 9.2 kcal/mol in the supplement of ref. 71, under the assumption that the peptide loses all its conformational entropy upon bonding. Combining these two values, ∆ α is roughly −17 k B T .
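The unit conversion behind the estimate of ∆ α can be written out explicitly. The sketch below assumes room temperature to mean T = 298.15 K (the text specifies only 'room temperature') and uses the molar gas constant in kcal/(mol K); the two energy inputs are the values quoted above from ref. 71.

```python
R_KCAL_PER_MOL_K = 1.987204e-3   # molar gas constant in kcal/(mol K)
T_ROOM = 298.15                  # K, assumed value for room temperature

bond_energy = -19.2              # kcal/mol, bond interaction quoted from ref. 71
entropic_cost = 9.2              # kcal/mol, entropic loss quoted from ref. 71

kT_kcal_per_mol = R_KCAL_PER_MOL_K * T_ROOM          # about 0.59 kcal/mol
delta_alpha = (bond_energy + entropic_cost) / kT_kcal_per_mol

print(f"k_B T at {T_ROOM} K = {kT_kcal_per_mol:.4f} kcal/mol")
print(f"Delta_alpha = {delta_alpha:.1f} k_B T")
```

The net free energy of −10 kcal/mol corresponds to about −16.9 k B T , consistent with the rounded value of −17 k B T used in the model.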
τ I is the time-scale for internal α to β rearrangement of amyloidogenic proteins, which cannot be theoretically evaluated, as no simple theoretical models are proposed to include internal friction in a dense protein structure. Although τ I for Aβ 1−42 peptide is still unknown, it can be approximated from the known value for insulin (in ref. 55 ) that has the closest protein length to Aβ 1−42 peptide, giving τ I as 10 −5 s.
For the ∆ αβ value, although all-atom simulations of the free energy of one αβ bond were carried out in ref. 29 , they imposed constraints on the conformation of α-mers, which were not allowed to change from their native monomeric structure in solution. The resulting ∆ αβ value is therefore of doubtful use in the micelle model of this work, where α-mers within a micelle may not have the same conformation as in the monomeric state. Unfortunately, no other specific experiments or all-atom molecular simulations considering this structural change of monomers into a micelle have been conducted to determine the value of this coupling energy.
We set the lower limit of ∆ αβ by considering its role in facilitating the conversion process. Eq. (5) can be used to obtain the free energy change from an N -size micelle into the conversion intermediate with only one α-mer converted. If we first ignore the electrostatic and entropic repulsion and focus on the role of the αβ bond free energy, the result gives: If ∆ αβ is equal to ∆ α , then conversion of one α-mer on the surface of a micelle and conversion of one α-mer in solution are equally likely to happen because the free energy change is the same, i.e. ∆ c , in both cases. In other words, micellation cannot help facilitate the conversion of α-mers and lower the free energy barrier along the conversion process. The lower limit of ∆ αβ is therefore set to be −18 k B T in our model, a little higher than ∆ α . On the other hand, we consider the final conversion step, in which an intermediate with one α-mer left fully becomes the β-aggregate. This step involves the same free energy states as the elongation process: an α-mer first attaches to the end of the pre-existing β-aggregate, and then converts into a β-mer. The free energy of the intermediate at x = N − 1 must be higher than the final free energy of the β-aggregate (x = N ), so that the final β-aggregate is favored over partially converted intermediates, and the elongation process does not get thermodynamically trapped in the intermediate state.
We thus have the constraint on the upper limit of ∆ αβ : 0 > ∆ β + ∆ s + ∆ c − 2∆ αβ . Therefore, ∆ αβ should be weaker than -23 k B T and has its lower limit of -17 k B T as discussed earlier.

Appendix B: Derivation of k1
The shape of monomers can be approximated by a sphere with a radius of 1 nm, from the hydrodynamic radius experiment on Aβ 1−42 83 , while the trimer of the paired protofilament (the critical nucleus in the NP mechanism) takes roughly the shape of a cuboid with dimensions of 46 Å, 46 Å and 8.4 Å, based on the structure of a single β-mer given in ref. 84 . In this scenario, it is clear that the collision between monomers and the critical nuclei is a non-spherical case, and in fact, no theoretical formula for the collision rate is available for such a complicated shape. Yet, a convenient way to estimate this structural effect is to approximate it by the elongated ellipsoid case, as illustrated in Fig. 11(a), where the collision is considered effective as long as the center of mass of one monomer enters the surface of this ellipsoid (with the trimer at its center).
In this case, the inverse of the time required for one monomer to diffuse to or hit the surface of the trimer, 1/τ D , is written as 58 : Here, A x1 and A x2 are the major and minor semi-axes of the ellipsoid, respectively. We further set A x1 equal to half the sum of the length of the trimer and the monomer diameter, which is (46 + 20)/2 Å and refers to the case where the monomer contacts the top surface of the trimer.
Similarly, A x2 is equal to half the sum of the height and the monomer diameter, giving (8.4 + 20)/2 Å and corresponding to the case of side attachment of one monomer to this trimer. A comparison of the Smoluchowski equation with (B1) gives the geometric factor f geo used in obtaining the k 1 value as: With A x1 of 33 Å and A x2 of 14.2 Å, f geo is estimated roughly as 26.4 Å. The constant D m is assumed to be roughly twice the diffusion coefficient of the spherical monomer: although we know that the trimer has a different volume, the difference in the diffusion constant between it and the monomer is minor, and no simple calculation for an accurate mutual diffusion coefficient is available. From the Stokes-Einstein equation, D m is expressed in terms of the monomer size r 1 and solvent viscosity η as k B T /3πηr 1 . Putting this relation and (B2) into (B1), we obtain: We take the rate of linear aggregation as the rate of progressing from N = 3 to N = 4 (the process repeats with the same parameters further on, due to the linear N -dependence of the free energy, Fig. 4). Therefore, this rate is determined by the barrier to reach the critical nucleation size N = 3 of a paired protofilament, plus the small free energy barrier for monomer aggregation in regular filament elongation, see Fig. 11(b) and the detailed simulations 57 . Putting (B3) into Eq. (3) we finally obtain Eq. (4).
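The quoted numbers can be checked with a short calculation. Because the explicit form of Eq. (B2) is not reproduced here, the sketch below makes an assumption: it uses the standard electrostatic-capacitance analogue for diffusion to an oblate spheroid (two equal long semi-axes A x1 and one short semi-axis A x2 ), f = sqrt(A x1 ^2 − A x2 ^2) / arccos(A x2 /A x1 ). This choice is ours; it is adopted only because it matches the 46 Å × 46 Å × 8.4 Å geometry and reproduces the quoted value of roughly 26.4 Å, and it should not be read as the authors' stated formula.

```python
import math

def f_geo_oblate(a_major, a_minor):
    """Capacitance-type geometric factor of an oblate spheroid (assumed form).

    a_major: the two equal long semi-axes; a_minor: the short semi-axis.
    Returns a length in the same units as the inputs.
    """
    c = math.sqrt(a_major ** 2 - a_minor ** 2)
    return c / math.acos(a_minor / a_major)

MONOMER_DIAMETER = 20.0   # angstrom, from the 1 nm monomer radius
TRIMER_LENGTH = 46.0      # angstrom
TRIMER_HEIGHT = 8.4       # angstrom

a_x1 = (TRIMER_LENGTH + MONOMER_DIAMETER) / 2.0   # major semi-axis
a_x2 = (TRIMER_HEIGHT + MONOMER_DIAMETER) / 2.0   # minor semi-axis
f_geo = f_geo_oblate(a_x1, a_x2)

print(f"A_x1 = {a_x1:.1f} A, A_x2 = {a_x2:.1f} A, f_geo = {f_geo:.1f} A")
```

In the spherical limit A x2 → A x1 this factor tends to the sphere radius, which recovers the usual Smoluchowski rate.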
Why do the filaments not effectively depolymerize after reaching and exceeding the critical nucleus size? The rate constant of depolymerization k d (that is, the transition from N = 4 back to N = 3) is controlled by the activation energy F β,2 (3) − F β,2 (4) + ∆F el , and its ratio to k p , the further polymerization rate constant, is proportional to the exponential factor using Eq. (2) with N ≥ 3 and the values of the energy parameters in Table 1. At the lowest limit of our examined concentrations, ln C 1 = −6.5, the ratio of reverse to forward rates is ∝ exp(−5): even this upper bound is low enough to consider only the forward rate of filament polymerization.

Appendix C: Derivation of k2
We consider a system of only three states: S mono , monomers; S int , the intermediate species; and S aggr , the final β-aggregate. This system can go from S mono to S int with the micellation rate constant (k + ), from S int to S aggr with the conversion rate constant (k c ), and from S int to S mono with the dissociation rate constant (k − ). At S int , the system can either dissociate back into monomers or fully convert into the β-aggregate. The inverse conversion from S aggr is not considered, based on the fact that the elongation rate of any β-aggregate that exceeds the critical nucleus size is rather fast, leaving almost no chance to reverse the aggregation. The set of kinetic equations for the three states is written as:

$$\frac{dC_1}{dt} = -k_+ C_1 + k_- C_{int}, \qquad \frac{dC_{int}}{dt} = k_+ C_1 - (k_- + k_c)\, C_{int}, \qquad \frac{dC_{aggr}}{dt} = k_c C_{int}, \qquad (C1)$$

where C 1 , C int and C aggr are defined as the concentrations of the S mono (monomers), S int (intermediate) and S aggr (the final β-aggregate) states, respectively. Our aim is to find the average time for the system to first reach the final state S aggr considering all the possible paths. That is, to find the average nucleation rate constant k 2 defined by:

$$\frac{dC_{aggr}}{dt} = k_2 C_1. \qquad (C2)$$

All the rate constants defined in Eqs. C1 and C2 have the dimension of inverse time, which can then be used to find the time scales involved in this nucleation process. The time required to travel from S mono to S int is defined as τ + , which is simply 1/k + . Similarly, τ − , the time from S int to S mono , is 1/k − , while τ c , the time from S int to S aggr , is equal to 1/k c . To find the average time for nucleation, we need to find the time required for each path. For example, the first path is S mono → S int → S aggr and the second path corresponds to S mono → S int → S mono → S int → S aggr . All other paths involve two or more rounds of dissociating back and associating into a micelle again. For the first path, the total travel time is (τ + + τ c ), whereas the second path requires [1 × (τ + + τ − ) + (τ + + τ c )].
The general form for the travel time of the $(n+1)$th path, $\tau_{n+1}$, which contains $n$ dissociation-reassociation cycles before the final conversion, is then

$$\tau_{n+1} = n(\tau_+ + \tau_-) + (\tau_+ + \tau_c). \qquad \text{(C3)}$$

In this system, the decision to fully convert to the aggregate or to dissociate back into monomers is made only at $S_{int}$. The probability to convert is denoted as $P_c$ and is proportional to $k_c$, while the probability to dissociate back is $P_-$ and is proportional to $k_-$. The normalized probabilities of conversion and dissociation are then:

$$P_c = \frac{k_c}{k_c + k_-}, \qquad P_- = \frac{k_-}{k_c + k_-}. \qquad \text{(C4)}$$

For the $(n+1)$th path, the system needs to dissociate back $n$ times before a conversion finally follows, and the probability for this to happen, $P_{n+1}$, is readily written as

$$P_{n+1} = P_-^{\,n} P_c. \qquad \text{(C5)}$$

The average time to reach $S_{aggr}$, denoted as $\tau_{ave}$, is the sum over all paths of the time required for each path multiplied by its probability:

$$\tau_{ave} = \sum_{n=0}^{\infty} P_{n+1}\, \tau_{n+1}. \qquad \text{(C6)}$$

Inserting Eqs. C3, C4 and C5 into (C6) and taking the inverse, we arrive at the formula of (7).
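The path sum described above can be checked numerically. The following is a minimal Python sketch, not the authors' code: it assumes that a path with $n$ dissociation events costs $n(\tau_+ + \tau_-) + (\tau_+ + \tau_c)$, which reproduces the two explicit paths listed in the text, and it weights that path by $P_-^{\,n} P_c$. The function names and the numerical rate constants are hypothetical and chosen only for illustration. The closed form used for comparison follows from the geometric sums $\sum_n P_-^{\,n} P_c = 1$ and $\sum_n n P_-^{\,n} P_c = k_-/k_c$.

```python
def mean_nucleation_time_series(k_plus, k_minus, k_c, n_max=10_000):
    """Sum path time x path probability over paths with n = 0..n_max-1 dissociations."""
    tau_plus, tau_minus, tau_c = 1.0 / k_plus, 1.0 / k_minus, 1.0 / k_c
    p_c = k_c / (k_c + k_minus)          # probability to convert at S_int
    p_minus = k_minus / (k_c + k_minus)  # probability to dissociate at S_int
    total = 0.0
    weight = p_c                         # probability of the path with n = 0
    for n in range(n_max):
        # assumed path time: n dissociation cycles, then one successful conversion
        path_time = n * (tau_plus + tau_minus) + (tau_plus + tau_c)
        total += weight * path_time
        weight *= p_minus                # one more dissociation event
    return total


def mean_nucleation_time_closed(k_plus, k_minus, k_c):
    """Closed form of the same sum, obtained from the geometric series."""
    return 1.0 / k_plus + 2.0 / k_c + k_minus / (k_plus * k_c)


if __name__ == "__main__":
    # Hypothetical rate constants in arbitrary inverse-time units.
    k_plus, k_minus, k_c = 2.0, 3.0, 1.0
    tau_series = mean_nucleation_time_series(k_plus, k_minus, k_c)
    tau_closed = mean_nucleation_time_closed(k_plus, k_minus, k_c)
    print(tau_series, tau_closed, 1.0 / tau_closed)
```

The inverse of the returned time is the effective rate constant within this path-counting picture. The sketch only verifies the summation and makes no claim about the parameter values relevant to A$\beta_{1-42}$.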
Appendix D: Derivation of $k_+$

The rate of collisions between a monomer and a micelle of size $(N-1)$ is expressed by the Smoluchowski relation using the mutual diffusion coefficient $D_m = D_1 + D_{N-1}$:

$$4\pi D_m (r_1 + r_{N-1})\, C_1 C_{N-1}.$$

Here $r_1$ is the radius of a monomer, while $r_{N-1}$ is the radius of an $(N-1)$-size micelle. $C_{N-1}$ is the concentration of the $(N-1)$-size micelle, whose size lies between $N_h$ and $N_l$ in the three-state kinetic model (so that a metastable intermediate state exists). This rate is, by definition, equal to $k_+ C_1$, which is how we determine this rate constant.
Because monomers have to cross the micellation free energy barrier in order to aggregate into this $(N-1)$-size micelle, we assume a thermal pre-equilibrium for the concentrations of the micelles whose size is smaller than $N_h$, the micelle size at which the micellation free energy has its barrier. Accordingly, the $C_{N_h}$ term can be written as in (D1), where $\Delta F_{mic}$ refers to the micellation free energy and is simply (5) at $N = N_h$.

In order for this $N_h$-size micelle to grow to size $(N-1)$, it has to adsorb an additional $(N - N_h - 1)$ monomers. We further assume that this $N_h$-size micelle acts as a deep absorbing sink. Monomers are adsorbed as soon as they contact the surface of the $N_h$-size micelle, which requires the center of mass of a monomer to fall inside the spherical volume $4\pi (r_{N_h} + r_1)^3/3$, a region bounded by the sum of the radii of one monomer and the pre-existing micelle of size $N_h$. Volume conservation is used to estimate $r_{N_h}$ as $r_1 \sqrt[3]{N_h}$. In this way, the probability of adsorption of $n$ monomers to this pre-micelle, $P_{a,n}$, is given by (D2). With (D1) and (D2), $C_{N-1}$ follows.

To estimate the mutual diffusion coefficient $D_m$, the Stokes-Einstein equation is used to express $D_m$ in terms of the solvent viscosity $\eta$, the temperature $T$, and the sizes of the micelle and the monomer. Together with $r_{N-1}$ expressed in terms of $r_1$, $r_{N-1} = r_1 \sqrt[3]{N-1}$, it gives:

$$D_m = \frac{k_B T}{6\pi\eta}\left(\frac{1}{r_1} + \frac{1}{r_{N-1}}\right) = \frac{k_B T}{6\pi\eta\, r_1}\left[1 + (N-1)^{-1/3}\right].$$

Inserting (D1) and (D3) into $4\pi D_m (r_1 + r_{N-1})\, C_{N-1}$, we obtain $k_+$ as written in (8).
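The diffusion-limited prefactor $4\pi D_m (r_1 + r_{N-1})$ that multiplies $C_1 C_{N-1}$ in the Smoluchowski relation can be evaluated with a short Python sketch. It assumes the Stokes-Einstein form $D = k_B T / (6\pi\eta r)$ for a sphere and the volume-conservation estimate $r_{N-1} = r_1 \sqrt[3]{N-1}$. The function names and the numerical inputs (monomer radius, temperature, viscosity, micelle size) are hypothetical values for illustration, not parameters taken from the paper. The concentration $C_{N-1}$ is deliberately left out, since it depends on the micellation free energy.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def stokes_einstein(radius, temperature, viscosity):
    """Diffusion coefficient of a sphere, D = k_B T / (6 pi eta r), in m^2/s."""
    return K_B * temperature / (6.0 * math.pi * viscosity * radius)


def collision_prefactor(r1, micelle_size, temperature, viscosity):
    """Return 4 pi D_m (r_1 + r_micelle) in m^3/s for a monomer meeting a micelle.

    micelle_size is the number of monomers in the micelle (N - 1 in the text);
    its radius is estimated by volume conservation, r = r1 * micelle_size**(1/3).
    """
    r_micelle = r1 * micelle_size ** (1.0 / 3.0)
    d_mutual = (stokes_einstein(r1, temperature, viscosity)
                + stokes_einstein(r_micelle, temperature, viscosity))
    return 4.0 * math.pi * d_mutual * (r1 + r_micelle)


if __name__ == "__main__":
    # Hypothetical inputs: 1 nm monomer radius, 300 K, water-like viscosity of 1 mPa s.
    prefactor = collision_prefactor(r1=1e-9, micelle_size=8,
                                    temperature=300.0, viscosity=1e-3)
    print(prefactor)
```

Within these assumptions the monomer radius cancels, leaving $(2 k_B T / 3\eta)\,[1 + (N-1)^{1/3}]\,[1 + (N-1)^{-1/3}]$, so the prefactor depends only on temperature, viscosity and micelle size.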