Department of Physics, Lehigh University, Bethlehem, United States

Institut Jacques Monod de Paris Diderot/CNRS, Paris, France

Department of Chemistry, Imperial College London, London, United Kingdom

The prevalence of multicellular organisms is due in part to their ability to form complex structures. How cells pack in these structures is a fundamental biophysical issue, underlying their functional properties. However, much remains unknown about how cell packing geometries arise, and how they are affected by random noise during growth, especially in the absence of developmental programs. Here, we quantify the statistics of cellular neighborhoods of two different multicellular eukaryotes: lab-evolved ‘snowflake’ yeast and the green alga

The evolution of multicellularity was transformative for life on Earth, occurring in at least 25 separate lineages (

Recent work has shown that extant multicellular organisms can either suppress (

Multicellular organisms also exhibit diverse growth morphologies; for example, cells can remain attached through incomplete cytokinesis (

Here, we provide experimental evidence that, rather than being context-dependent, fluctuations in cell packing geometry instead follow a universal distribution, independent of the presence or absence of developmental regulation. We quantify the distributions of cellular space in two different types of organisms: experimentally evolved multicellular yeast (

Within statistical physics, the maximum entropy principle relates randomness in low-level units (e.g. cells) to the properties of the assembly (e.g. a multicellular group). It works by enumerating all low-level configurations that conform to a set of constraints. Any particular group-level property can be generated by many different low-level configurations, but some group-level properties may correspond to more low-level configurations than others. Those that are generated by many configurations are more likely to be observed than those that correspond to relatively few configurations; in this way, the maximum entropy principle allows one to calculate the probability of observing different group properties, given a set of constraints. Multicellular groups obey a simple but universal constraint: each group has some total volume,

Consider the ensemble of all possible cellular configurations in a simple group. As first derived by

where _{c} is the minimum cell neighborhood volume, _{c},

In practice, we divide the total group volume or area into Voronoi polytopes, one per cell, and take $v_c$ to be the volume of a single cell without any intercellular space (or $a_c$, the area of a single cell).
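For reference, the k-gamma distribution invoked throughout can be written in the form commonly used in the granular-packing literature. The notation below is assumed for illustration and may differ from the symbols of the original equation: $v$ is a cell neighborhood volume, $\bar{v}$ its mean, $\sigma^2$ its variance, and $v_c$ the minimum neighborhood volume.

```latex
P(v) = \frac{k^{k}}{\Gamma(k)}\,
       \frac{(v - v_c)^{k-1}}{(\bar{v} - v_c)^{k}}\,
       \exp\!\left(-k\,\frac{v - v_c}{\bar{v} - v_c}\right),
\qquad
k = \frac{(\bar{v} - v_c)^{2}}{\sigma^{2}}
```

Written this way, the distribution is a gamma distribution shifted by $v_c$, with shape $k$ and scale $(\bar{v} - v_c)/k$, so its mean is $\bar{v}$ and its variance is $\sigma^2$ by construction.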

To test whether different kinds of multicellular groups pack their cells according to the maximum entropy principle, we investigated cell packing in two different multicellular organisms. First, we used experimentally evolved ‘snowflake’ yeast (

Snowflake yeast grow via incomplete cytokinesis, generating branched structures in which mother-daughter cells remain attached by permanently bonded cell walls (

(_{c} for snowflake yeast. In orange is the histogram for all cells; the other three distributions correspond to different subsections of Voronoi volumes. The cells were grouped into spherical shells with radius



To determine the distribution of cell neighborhood volumes, we first must measure the position of every cell in a cluster. It is difficult to image individual cells within snowflake yeast clusters due to excessive light scattering. Instead, we used a serial block face scanning electron microscope equipped with a microtome to scan and shave thin (

We define the group volume as the smallest convex hull that surrounds all cells in the cluster and computed the 3D Voronoi tessellation of cell centers within that (

The influence of the convex hull on these results was investigated by using an alternative procedure in which the Voronoi volumes were binned into shells centered at the cluster’s center of mass (
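The shell-based binning can be sketched as follows. This is a minimal illustration rather than the authors' procedure: the function name `bin_by_shell`, the `shell_edges` argument, and the use of the unweighted mean of cell centers as the center of mass are all assumptions.

```python
import numpy as np

def bin_by_shell(centers, volumes, shell_edges):
    """Group Voronoi volumes by each cell's radial distance from the center of mass.

    shell_edges is an increasing list of radii (hypothetical input); shell i
    collects cells with shell_edges[i-1] <= r < shell_edges[i].
    """
    centers = np.asarray(centers, dtype=float)
    volumes = np.asarray(volumes, dtype=float)
    com = centers.mean(axis=0)  # assumes all cells have equal mass
    radii = np.linalg.norm(centers - com, axis=1)
    shell_index = np.digitize(radii, shell_edges)
    return {i: volumes[shell_index == i] for i in range(1, len(shell_edges))}
```

Each shell's volumes can then be histogrammed separately and compared with the distribution for the whole cluster.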

To test if cell neighborhood volumes in extant multicellular organisms are consistent with maximum entropy cell packing predictions, we examined cell packing within the green microalga

To determine the distribution of

We next investigated if maximum entropy predictions are more accurate within subregions with similar mean solid angles; specifically we examine regions whose mean is

We next used simulations to investigate the impact on cell packing of four different growth morphologies: growth via incomplete cell division (

First, we performed geometric simulations of multicellular groups that grow via incomplete cell division; these simulations were inspired by previous simulations of snowflake yeast (

We simulated four different growth morphologies: (



We show the percent error of the estimated skewness (


Inspired by

Next, we simulated organisms that stick together via reformable cell-cell adhesions, a mechanism of group formation that is common in biofilms and extant aggregative multicellular life (

Finally, we modeled cells undergoing palintomic division within a maternal cell wall, as is common in green algae (

Taken together, the results of these simulations suggest that a broad distribution of cell neighborhood sizes is a general feature of multicellular growth morphologies. In particular, when cell locations are random under these rules, cell neighborhood size distributions closely follow the k-gamma distribution.

While we have shown that the distribution of cell neighborhood volumes closely follows the k-gamma distribution in two very different organisms, we have also seen that in some cases maximum entropy predictions are more accurate in sub-sections of an organism than across its entirety. For instance, in

The spatial correlations in the cellular areas in

where

(_{0} value. (_{0}. As the subsection size increases (including more and more uncorrelated Voronoi areas), the deviation from predictions first decreases until

A natural question is whether maximum entropy predictions are more accurate within correlated subregions of an organism. We measured the Voronoi distribution in subregions with similar mean solid angles across six organisms and, for each subregion, a central node and its neighbors up to _{0} were identified. We varied _{0} from 3 (corresponding to, on average,

How much randomness is necessary for the k-gamma distribution to predict cell neighborhood size distributions? Our analysis of the solid angle distribution of

In A-C are PP plots of the observed vs. predicted cumulative distribution function for three different simulations. The colors correspond to increasing levels of noisiness in the simulations, from red (strongest correlations/determinism) to blue (strongest noise). The dashed black line in each represents

The impact of heritable size polydispersity was investigated by simulating aggregative groups consisting of large and small cells. All simulations were seeded with one small cell and one large cell. We then varied the probability

Next we investigated groups with varying amounts of noise on top of defined growth patterns. In these simulations, new cells bud in precise positions; the first daughter at the position (

Finally, we investigated groups with localized and random cell death. In these simulations, 50 cells were confined to the surface of a sphere of unit radius following the protocol described above. One cell is randomly selected to die. Centered at this cell, a spherical region of radius

In summary, absent randomness, spatial correlations lead to large deviations from the k-gamma distribution. Yet, with even a small amount of randomness, the k-gamma distribution holds significant predictive power. These simulations suggest that maximum entropy predictions are likely to be robust against even moderate correlations.

So far, we have shown that randomness in cellular packing leads to highly predictable packing statistics. Here we show that maximum entropy statistics can directly impact the emergence of a highly heritable multicellular trait, organism size.

Prior work has shown that the size of snowflake yeast at fragmentation is remarkably heritable – higher, in fact, than the traits of most clonally reproducing animals (

Before addressing how fracture impacts the distribution of cluster sizes by impacting the number of cells within a group, we first must address fluctuations in size among clusters with the same number of cells. Given a number of cells


To predict the group size distribution, we consider the probability of fragmentation via a weakest-link model of fracture. As the location of new cells is random (see

As each cell in a cluster of

As we do not model the fate of products of fragmentation (i.e. the size of the separate pieces post-fracture), we expect the weakest link model to be more accurate for larger clusters than it is for smaller clusters.

We measured group size for approximately

For context, we compared the distribution of group size in snowflake yeast to that of flocculating yeast, which forms multicellular groups via aggregation. The multicellular size of flocculating yeast depends on the rate of collisions with other cells and groups of cells. The growth rate of aggregates is thus typically proportional to their size, as larger aggregates are more likely to contact more cells (

One of the issues arising from the existence of the broad distribution of somatic cell areas in

A heuristic explanation for the smoothness of the flows can be developed by noting first that the flow arising from each flagellum, beating close to the no-slip surface of the colony, will fall off only as an inverse power of distance

In the squirmer model, the swimming speed

where

where

we see immediately that the contributions from all modes

Thus, within the squirmer model, motility is essentially insensitive to area inhomogeneities. This result does not preclude effects of those higher modes, only that such effects will be on quantities other than the swimming speed, such as the nutrient uptake rate (

In the shear-stress, no-slip model, the velocity field in the region between the colony radius

The swimming speed again depends only on the lowest-order mode in this expansion,

and we again have insensitivity of

In this paper, we demonstrated that universal cellular packing geometries are an inevitable consequence of noisy multicellular assembly. We measured the distribution of Voronoi polytope sizes in both nascent and extant multicellular organisms, and showed that they are consistent with the k-gamma distribution, which arises via maximum entropy considerations. Using simulations, we demonstrated that k-gamma distributions arise in many different growth morphologies, and do so requiring only a relatively small amount of structural randomness. Further, we showed that the distribution of cell neighborhood sizes can be used to distinguish the effects of randomness from the effects of developmental patterning. Finally, we demonstrated that consistent packing statistics can lead to highly reproducible, and thus heritable, multicellular traits, such as group size in snowflake yeast. Altogether, these results indicate that entropic cell packing is a general organizing feature of multicellularity, applying to multicellular organisms with varying growth morphologies, connection topologies, and dimensionalities.

One of the strengths of the packing-based maximum entropy framework employed here is its simplicity. We have demonstrated that the distribution of cell neighborhood sizes can be predicted with high accuracy, in many different multicellular morphologies, from only the first two moments of the distribution. Deviations from maximum entropy predictions therefore encode important information about additional correlations that can arise via a variety of sources, such as developmental regulation or interactions with the environment. These additional correlations could be explored via, for example, higher order structures to the maximum entropy model (

The effect of random noise has been an important area of research in developmental biology (

Our observation that heritable properties can arise from random processes is reminiscent of the reproducible structures and phenomena generated by random noise in a wide range of physical (

An example of one possible advantage granted by entropic packing is the parent-offspring fidelity that arises from its ensemble statistics. Since both parents and their offspring are assembled through similar noisy processes, they achieve similar cell packing distributions. This statistical similarity therefore details at least one heritable multicellular trait that does not rely on genetically regulated multicellular development. Other multicellular traits that build on the cell packing distribution are similarly affected by this emergent process and could become heritable as well. Such parent-offspring heredity could play a crucial role in the evolutionary transition to multicellularity, providing a mechanism for nascent multicellular organisms to participate in the evolutionary process without first having to possess genetically regulated development. Over time, developmental innovation may arise via multicellular adaptation, modifying or replacing entropic cell packing as a mechanism of multicellular heredity. Consistent with this hypothesis, maximum entropy retains considerable predictive power in extant multicellular organisms such as

The broad distributions in cellular volumes we have found in two very different types of organisms, with two very different modes of reproduction and growth, suggest that noise in developmental geometry may be an inevitable consequence of almost any microscopic mechanism. In this sense, they may be just as unavoidable in biological contexts as thermal fluctuations are in systems that obey the rules of equilibrium statistical physics. As an example, we recall the ‘flicker phenomenon’ of erythrocytes, in which the red blood cell membrane exhibits stochastic motions around its equilibrium biconcave discoid shape. Thought for many years to be a consequence of specific biochemical processes associated with living systems, flickering was eventually shown by quantitative video microscopy (

These results on equilibrium fluctuations provide a conceptual precedent for the results reported here. A central issue that then arises from our results is how to connect any given stochastic biochemical growth process defined at the microscopic level to the more macroscopic probability distribution function observed for cellular volumes. Mathematically, this is the same question that arises in the theory of random walks, wherein a Langevin equation defined at the microscopic level leads, through suitable averaging, to a Fokker-Planck equation for the probability distribution function of displacements. Can the same procedure be implemented for growth laws?

Multicellular yeast groups were constructed from initially unicellular

All experiments were performed on yeast grown for approximately 24 hr in 10 mL of yeast peptone dextrose (YPD; 10 g/L yeast extract, 20 g/L peptone, and 20 g/L dextrose) liquid medium at 30 °C, shaken at 250 rpm in a Symphony Incubating Orbital Shaker model 3500I. All cultures were therefore in the stationary phase of growth at the time of experiments.

Since yeast cells have thick cell walls that limit the effectiveness of optical microscopy, we used a Zeiss Sigma VP 3View scanning electron microscope (SEM) equipped with a Gatan 3View SBF microtome installed inside a Gemini SEM column to obtain high resolution images of the internal structure of snowflake yeast groups and locate the positions of all cells. All SEM images were obtained in collaboration with the University of Illinois’s Materials Research Laboratory at the Grainger College of Engineering. Snowflake yeast clusters were grown overnight in YPD medium, then fixed, stained with osmium tetroxide, and embedded in resin in an Eppendorf tube. A cube of resin

Custom image analysis scripts were written for the SEM datasets. First, a local adaptive threshold was used to binarize each image. A distance transform was then used to identify the center of each cell slice in a particular 2D image. A watershed algorithm was seeded with the cell slice centers, followed by a particle tracking algorithm to label cells across image slices. After labeling, the boundary of each cell was found, resulting in a point cloud of the exterior of each cell. Each cell was then fitted with an ellipsoid with nine fit parameters:

We measured cellular volumes from SEM images by ellipsoid fits. The average cellular volume of petite yeast was

We next measured the typical size of bud scars on the surface of Y55 yeast cells. Single cells were stained with calcofluor to highlight the chitinous bud scars (

We measured bud scar positional distributions for petite yeast Ace2KO. Since the SEM does not image chitinous bud scars, we approximated bud scar positions as the closest point on a mother cell’s surface to the corresponding daughter cell’s proximal pole. We recorded 1990 bud scar positions in polar coordinates, as defined in

The

Positions of cells were registered based on fluorescence intensity using custom MATLAB scripts. This was achieved by carrying out a 2D convolution of each frame of the

We used a Voronoi tessellation algorithm to measure the distribution of cell neighborhood sizes in groups. We computed both 3D and 2D Voronoi tessellations.
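A minimal sketch of measuring Voronoi cell sizes from a list of cell centers is shown below. It uses SciPy rather than the Voro++ code named in the text, and it does not impose a bounding surface: cells at the edge of the group are simply returned as NaN. The function name `voronoi_volumes` is hypothetical.

```python
import numpy as np
from scipy.spatial import ConvexHull, Voronoi

def voronoi_volumes(points):
    """Return the Voronoi cell volume for each point (NaN where the cell is unbounded).

    Works for 2D input as well, in which case the returned sizes are areas.
    """
    points = np.asarray(points, dtype=float)
    vor = Voronoi(points)
    volumes = np.full(len(points), np.nan)
    for i, region_index in enumerate(vor.point_region):
        region = vor.regions[region_index]
        if len(region) == 0 or -1 in region:
            continue  # unbounded cell: no boundary is imposed in this sketch
        # Voronoi cells are convex, so the hull of the cell's vertices is the cell itself.
        volumes[i] = ConvexHull(vor.vertices[region]).volume
    return volumes
```

To reproduce a bounded tessellation, the unbounded cells would need to be clipped against the chosen boundary, which this sketch leaves out.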

First, we computed 3D Voronoi tessellations within a defined boundary. These tessellations were performed for experimental snowflake yeast data from the SEM and simulations of 3D groups using the open-source Voronoi code Voro++ (

The boundary sphere was centered on the cluster’s center of mass. Its radius was the distance to the farthest cell center plus an additional

For Voronoi tessellations of cells on the surface of simulated spheres (see

We also computed 2D Voronoi tessellations on surfaces embedded in 3D space using custom-written MATLAB functions. This approach was used for

The first step toward generating the proper Voronoi tessellation was computing the Delaunay triangulation of the cells on the surface (the Voronoi tessellation is the dual of the Delaunay triangulation). First, we found the Cartesian coordinates of each somatic cell (as described above), and normalized these coordinates so that all cell centers lay on the unit sphere. Then, a Delaunay triangulation of the normalized points was calculated. Edges of the triangulation that cut through the unit sphere were eliminated, and edges that lay along the sphere surface were kept. This Delaunay triangulation therefore mapped out the connectivity of the somatic cells. We then projected that triangulation onto the lumpy surface. The Voronoi polygon vertices are the circumcenters of the Delaunay triangles. Further, any edge shared between two Delaunay triangles denotes an edge shared between the Voronoi vertices associated with those two triangles. We found all edges connecting the Voronoi vertices. Next, connected edges were flattened so that each Voronoi cell was a 2D polygon. This step eliminates the curvature associated with the surface of the organism. However, we found that the distribution of Voronoi areas was the same whether we used the planar approximation or took the local curvature into account: the average difference between Voronoi areas when approximating the surface as a plane
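The unit-sphere stage of this construction can be illustrated with SciPy's `SphericalVoronoi`. This is only a sketch under stated assumptions: it normalizes cell centers about their centroid, measures areas on the unit sphere itself, and omits the projection back onto the lumpy surface and the polygon flattening described above. The function name `spherical_voronoi_areas` is hypothetical.

```python
import numpy as np
from scipy.spatial import SphericalVoronoi

def spherical_voronoi_areas(cell_centers):
    """Project cell centers onto the unit sphere and return each cell's Voronoi area there."""
    pts = np.asarray(cell_centers, dtype=float)
    pts = pts - pts.mean(axis=0)  # assumption: the centroid is used as the sphere center
    unit = pts / np.linalg.norm(pts, axis=1, keepdims=True)
    sv = SphericalVoronoi(unit, radius=1.0, center=np.zeros(3))
    # Areas of the spherical polygons; they sum to the sphere's surface area.
    return sv.calculate_areas()
```

Because the areas tile the sphere, their sum equals 4π, which gives a quick sanity check on any input.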

In all cases, the output of the Voronoi algorithm is a list of Voronoi polytope sizes: in 3D, the measurements were the final Voronoi polyhedron volumes, while in 2D the measurements were polygon areas. Histograms of these sizes were generated to compare with the k-gamma distribution. As we observe cells in direct contact with each other, the minimum size of a Voronoi volume or area was defined by single cell measurements. For petite yeast cells, the mean cell size was calculated from the ellipsoid fits described above to be

We then calculated the expected maximum entropy distribution using only the mean and variance of the observed Voronoi volumes, _{c}, these measurements define

Along the surface of the

In

To test how well the k-gamma distribution performed as a predictive distribution, we systematically compared its performance to three other distributions: the normal (Gaussian) distribution, the log-normal distribution, and the beta prime distribution. We used two of our datasets for this comparison: experimental values of Voronoi volumes for snowflake yeast, and simulations of snowflake yeast (which provided us with more datapoints for comparison). First, we used the measured mean

Distributions compared: k-gamma; normal; log-normal; beta prime.

The three comparison distributions were chosen for their properties. All three have two parameters and either a semi-infinite or an infinite domain. We chose the normal distribution because it is the limiting distribution, by the central limit theorem, for a process with additive random errors, and the log-normal distribution because it is the corresponding limiting distribution for a process with multiplicative random errors. We chose the beta prime distribution arbitrarily; the goal was to compare against a distribution that we have no reason to believe would accurately describe this dataset.
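A hedged sketch of this kind of two-moment matching is given below. It assumes the k-gamma distribution is a gamma distribution shifted by a minimum size `v_c`, and that the log-normal is matched without a shift. The function names are hypothetical, and the beta prime case is left out for brevity.

```python
import numpy as np
from scipy import stats

def k_gamma(mean, var, v_c):
    """Frozen k-gamma distribution fixed by the mean, the variance, and the minimum size v_c."""
    k = (mean - v_c) ** 2 / var
    return stats.gamma(a=k, loc=v_c, scale=(mean - v_c) / k), k

def predicted_skewness(mean, var, v_c):
    """Skewness implied by each family once its first two moments are matched."""
    _, k = k_gamma(mean, var, v_c)
    s2 = np.log(1.0 + var / mean**2)  # squared shape parameter of the matched log-normal
    return {
        "k-gamma": 2.0 / np.sqrt(k),
        "normal": 0.0,
        "log-normal": (np.exp(s2) + 2.0) * np.sqrt(np.exp(s2) - 1.0),
    }
```

Once the first two moments are fixed, each family makes a parameter-free prediction for the third, which is what makes skewness a useful discriminator between them.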

To systematically compare the distributions, we sampled

Next, we used the four distributions to predict the skewness of the simulation data. We empirically determined the first two moments of the dataset and took these to be the first two moments of each distribution. We then used these two moments to predict the skewness, which is related to the third moment of the distribution, and compared the prediction to the measured skewness. The k-gamma distribution estimated the observed skewness with only a 7% error; the other three distributions estimated the skewness with percent errors of

As an additional test, we used least-squares fitting to test the performance of the k-gamma distribution compared to the other three distributions for simulation data. In this case, we let the least-squares fitting algorithm find the best parameters to fit each distribution; then, we compared the least-squares-achieved mean and standard deviation of the fit distribution to the measured mean and standard deviation of the population. We found that the k-gamma fit reproduced the closest values to the measured mean and standard deviation (

Cluster sizes were measured using a Beckman Coulter Multisizer 4e particle analyzer in the Cellular Analysis and Cytometry Core of the Shared User Management System located at the Georgia Institute of Technology. Petite Ace2KO clusters were taken from steady state concentration in YPD and then submerged in electrolytic fluid and passed through a _{c} is the average cell volume from SEM measurements,

To quantify goodness-of-fit for predicted maximum entropy distributions, we compared the predicted cumulative distribution function (CDF),
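A comparison of predicted and observed CDFs can be sketched as below. The root-mean-square deviation used here is an illustrative summary statistic chosen for this example, not necessarily the metric used by the authors, and the function names are hypothetical.

```python
import numpy as np

def pp_points(samples, predicted_cdf):
    """Return (predicted, observed) CDF values at each sorted sample, as plotted in a P-P plot."""
    x = np.sort(np.asarray(samples, dtype=float))
    observed = np.arange(1, len(x) + 1) / len(x)  # empirical CDF at the sorted samples
    predicted = predicted_cdf(x)
    return predicted, observed

def rms_deviation(samples, predicted_cdf):
    """Root-mean-square distance of the P-P points from the diagonal."""
    predicted, observed = pp_points(samples, predicted_cdf)
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))
```

A perfect prediction places every P-P point on the diagonal and gives a deviation of zero; `predicted_cdf` can be any callable, for example the `cdf` method of a frozen SciPy distribution.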

From the light sheet images of

where

In

where

Simulations of snowflake yeast groups were adapted from previously published work by

To compare exhaustively the distribution of Voronoi volumes between simulations and the k-gamma distribution, we simulated

We simulated

We next sought to model two additional classes of growth morphologies: sticky aggregates and cells contained within a maternal membrane. In both simulations, cells were modeled as spheres with unit radius.

First, we considered a multicellular model of sticky aggregates, mimicking group formation in, for example, flocculating yeast and bacterial aggregates. In our simulations, groups were grown from a single cell. New spherical daughters appeared at a polar angle _{3000} measurements of single-cell volume.

Cells interacted with both steric and attractive interactions in overdamped dynamics. Steric interactions were modeled through a harmonic potential when two cells overlapped, with a cutoff once cells were no longer overlapping. That is, for two cells

Attractive interactions (i.e. sticky, aggregative bonds) were also modeled through a harmonic potential, but these interactions had both a lower bound and upper bound cutoff.

where
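The two interaction types and the overdamped update can be sketched as follows. All numerical values and names here (`k_rep`, `k_att`, `reach`, `dt`, `mobility`) are hypothetical, and taking the lower cutoff of the attraction to be the contact distance is an assumption made for this illustration.

```python
import numpy as np

def pair_force(xi, xj, ri, rj, k_rep=1.0, k_att=0.1, reach=0.2):
    """Force on cell i from cell j: harmonic repulsion on overlap, harmonic attraction within reach."""
    sep = xi - xj
    d = np.linalg.norm(sep)  # assumed nonzero (no two cells share a center)
    contact = ri + rj
    unit = sep / d
    if d < contact:  # overlapping: push the cells apart
        return k_rep * (contact - d) * unit
    if d < contact + reach:  # between the lower and upper cutoffs: pull together
        return -k_att * (d - contact) * unit
    return np.zeros_like(sep)  # beyond the upper cutoff: no interaction

def overdamped_step(positions, radii, dt=0.01, mobility=1.0):
    """One explicit Euler step of overdamped dynamics: velocity is proportional to net force."""
    positions = np.asarray(positions, dtype=float)
    forces = np.zeros_like(positions)
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            f = pair_force(positions[i], positions[j], radii[i], radii[j])
            forces[i] += f
            forces[j] -= f  # equal and opposite force on cell j
    return positions + dt * mobility * forces
```

Iterating `overdamped_step` until the net forces are negligible corresponds to the mechanical relaxation performed between rounds of cell division.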

In simulations in which we introduced size polydispersity, cells were allowed to reproduce into two separate sizes, _{1} or _{2}, independent of the radius of the mother. Simulations were seeded with a pair of contacting cells, one each of the two radii. The simulation then proceeded with subsequent rounds of cell division and mechanical relaxation. We varied the polydispersity parameter

In another common mode of group formation, cells divide repeatedly within a confining membrane. This type of group formation has been observed in experimentally-evolved multicellular algae derived from unicellular

Steric forces between a cell and the maternal cell wall were modeled as being proportional to the non-overlapping volume of the cell and the maternal cell wall. In other words, if a cell is not contacting the membrane, there is no force acting on it. However, if the cell is contacting the membrane, the force is proportional to how much of the cell volume lies outside the membrane. Each cell was assigned volume _{i}. The force the cell experiences from the membrane is then

where

We simulated

Some groups form by arranging cells around a central core of extracellular matrix (ECM). To simulate such groups, we modeled a sphere of ECM with cells arranged randomly along the surface. Cell positions were chosen by selecting a position in spherical coordinates from uniform polar

First, we chose to place

In simulations with apoptosis events, cell death occurred after group generation. Briefly, groups were generated by iterated generations of cell division starting from a single cell. After this process, one cell was chosen at random to die. Then, all cells within a localization radius

We also investigated groups with precisely defined growth patterns. The spherical cells were held together with fixed, chitin-like bonds. The first cell was placed at the origin. It then proceeded to bud 3 daughter cells, each of which also budded subsequent cells. The exact budding pattern is described below.

Daughter cells were placed as follows. In spherical coordinates on the surface of the mother cell, the first daughter cell was placed at

The first daughter cell’s coordinate system was rotated

After each round of cell division, cells were allowed to relax mechanically in overdamped dynamics according to steric repulsive interactions and sticky, rigid bond interactions to their mother cell. The steric interactions were the same as described above. Fixed bond interactions were modeled as follows. When new cells appear, they incur a bud scar on the mother cell’s surface and a birth scar on the daughter cell’s surface. The positions

where

For

No competing interests declared


Reviewing editor, eLife

Conceptualization, Data curation, Formal analysis, Investigation, Software, Visualization, Writing – original draft, Writing – review and editing, Methodology

Conceptualization, Funding acquisition, Investigation, Methodology, Project administration, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review and editing

Investigation, Methodology

Formal analysis, Investigation, Validation, Writing – review and editing

Data curation, Investigation, Writing – review and editing

Data curation, Formal analysis, Investigation, Software, Writing – review and editing

Conceptualization, Formal analysis, Investigation, Methodology, Software

Data curation, Investigation, Methodology, Software, Visualization

Investigation, Software

Conceptualization, Formal analysis, Funding acquisition, Resources, Supervision, Writing – original draft, Writing – review and editing

Conceptualization, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Validation, Writing – original draft, Writing – review and editing

Conceptualization, Formal analysis, Funding acquisition, Resources, Supervision, Writing – original draft, Writing – review and editing

The cells grow at exact positions/angles with noise strength determined by input parameters.

Figures 1 & 3 source data: Experimental data files enumerating the cell centers positions for each organism sampled. The folders are subdivided into those on snowflake yeast (from SEM studies) and Volvox (from lightsheet studies). Each subdirectory contains an explanatory README file. Figures 2, 4 & 5 source data: Simulation data (enumerating the cell center positions) for the six classes of numerical studies; aggregation, apoptosis, polydispersity, snowflake yeast growth, tree-like growth, and Volvox growth. Each subdirectory contains an explanatory README file.

Core Facilities at the Carl R Woese Institute for Genomic Biology. WCR was supported by NIH grant 1R35GM138030. This work was funded in whole, or in part, by the Wellcome Trust (Grant 207510/Z/17/Z; REG & SSH). For the purpose of open access, the authors have applied a CC BY public copyright license to any Author Accepted Manuscript version arising from this submission. This work was also supported in part by Established Career Fellowship EP/M017982/1 from the Engineering and Physical Sciences Research Council (REG).

It may appear surprising that the distribution of cell volumes is not governed by the Central Limit Theorem (CLT), i.e. the volumes are not distributed normally. After all, Voronoi polytope volumes are generated from many randomly interacting pieces - should not these many different random fluctuations sum to a CLT-like scenario? A simple comparison between the modified gamma distribution, a normal distribution, and a log-normal distribution shows in fact that both the normal distribution and the log-normal distribution fail to capture essential characteristics of the volume packing, while the k-gamma distribution does. For snowflake yeast, the reason for this disagreement is that as each new cell is added to a cluster, it changes the entire volume distribution, since the new cell occupies space which was previously unoccupied. It therefore changes the volumes of all its nearest neighbors; if they flex to accommodate the new cell, then those neighbors change the Voronoi volumes of their neighbors, and so on. Therefore, adding a new cell does not sample the same distribution as before - the distribution itself changes, rendering the limit inapplicable.

In the case of the

This work uncovers a simple but far-reaching statistical principle that describes the geometry of cell packing in snowflake yeast and green algae. It draws on ideas from granular physics to offer new insight into universal rules of multicellular geometry that are otherwise easily obscured by the cell-scale idiosyncrasies of the different biological systems.

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Thank you for submitting your article "Cellular organization in lab-evolved and extant multicellular species obeys a maximum entropy law" for consideration by

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Essential revisions:

Both reviewers praise the quality of the paper and the relevance of the results. There are not, in their opinion, critical aspects in the manuscript to be further addressed. However, they suggest a number of revisions, which would improve the clarity and presentation of the work. In particular, both reviewers think that more discussion is needed of the maximum entropy model, e.g. whether additional information on higher order structures or morphology related correlations might lead to more effective statistical models. Referee 2 also advises more contextualization of the results, and a wider discussion about their generality.

Below the full reports, the authors are kindly invited to take into account the referees' comments in a revised version.

The manuscript by Day and colleagues investigates the geometry of cell packing in two multicellular eukaryotes (snowflake yeast and green algae). Using a combination of experiments and models drawn from statistical physics, they show that the distribution of cellular neighborhood volumes follows a simple universal form – a modified gamma function – that arises from a maximum entropy argument. Using simulations of different growth processes, they then show that these universal distributions are ubiquitous, arising, for example, even in correlated systems as long as there is a minimal level of noise. Finally, they show how these principles contribute to emergent evolutionary features (specifically group size distributions) in snowflake yeast, and use simple theoretical models to argue that fluctuations, while inherently stochastic, give rise to robust structures that do not depend sensitively on the microscopic and biological features of the system.

This paper is a beautiful example of how simple biophysical models can provide fundamental and unifying insight into complex biological systems. It is well written, addresses an important and timely topic, and raises intriguing questions about the balance between "regulated" biology and simple statistical physics as selective forces for evolution.

I have several comments for the authors to consider, at their discretion. Overall, I really enjoyed this paper and learned a great deal from it.

– The manuscript offers an interesting guiding principle that describes two considerably different biological systems. As the authors show in simulations, the principle is expected to hold over a broad range of conditions, but of course not universally (though even small levels of stochasticity broaden the range of applicability). I think the paper could be improved by expanding on the discussion of these limitations. In particular, it is not clear to me exactly how surprising it is to see "good" fits to a 2-parameter distribution of this sort (or more generally, what level of "good" we should expect of the fits in finite data sets like these). The authors address this issue in part by showing fits to other distributions, which is nice. But I wonder if it would be helpful to also include (or at least discuss) more systematic model selection. To be clear, I find the analysis quite convincing as is. But I am trying to get my head around the limitations, and in particular, to get a feel for how likely one is to see similar "goodness of fit" results using other distributions with a relatively small number of parameters.

– Related to the previous point: one approach might be to construct a type of "null" model from the data, perhaps by systematically shuffling the data in some way and then bootstrapping to evaluate the likelihood of achieving fits of similar quality.

– Have the authors considered trying to systematically quantify the impact of including higher-order structures in the max ent model? For example, one could perhaps use multi-information metrics (https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.91.238701) to evaluate the extent to which higher-order features of the data are relevant / necessary; the idea is essentially to construct maximum entropy models with various levels of data complexity "built in" and then evaluate (perhaps with an info-theoretic style metric) the extent to which that complexity improved the model. Perhaps something similar has been done for granular materials to capture higher-order statistics of packing? I ask this primarily out of curiosity, not as a serious criticism of the current approach. A discussion of this point might add to the paper.

Day, Thomas C. et al. investigate the geometrical statistics of cell packing in multicellular organisms. Using a maximum entropy prediction originally developed in the study of granular materials, the authors show that the statistics of cell packing impose a robust physical, entropic constraint on the geometrical arrangement of cells. Strikingly, the authors show that snowflake yeast evolved under lab conditions and wild-type Volvox, which develop according to very different processes and have disparate overall morphology, both exhibit cell packing statistics consistent with the maximum entropy predictions. They then use simulations to show that entropic cellular packing can arise from various modes of multicellular development due to randomness in cell positions and that substantial deviations from entropic packing arise only in the case of low developmental noise (randomness) and strong correlations in cell positions. Finally, the authors use theory and measurements from experiments with yeast to show how maximum entropy statistics dictate size heritability in simple multicellular systems. Together, their results support the perhaps counterintuitive result that developmental randomness can actually underpin developmental reproducibility, in this case reproducibility in the geometry of cell packing in terms of the free space associated with individual cells within a multicellular structure. This work contributes the identification and new consideration of a fundamental physical constraint of particular relevance to the evolutionary origins of multicellularity and to multicellular morphogenesis in general.

The conclusions of this paper are well supported by the rigorous analysis of data and simulations.

Work on the evolution of multicellularity has traditionally focused on molecular and genetic mechanisms, but because multicellular morphogenesis is an inherently physical process, biophysical studies provide an important complementary perspective. A particular strength of this paper is that insights are derived from theory that requires few, but specific, conditions be met in order to be satisfied, and therefore stands to apply generally to diverse multicellular systems, irrespective of many differences between them. The combination of empirical results from disparate multicellular systems in conjunction with simulations encompassing an expanded set of multicellular morphologies and growth processes compellingly support the generality of the insights. Beyond simply speculating about the implications of entropic packing on the function of multicellular systems, the authors demonstrate impact or lack thereof on aspects of form and function in multicellular yeast and Volvox. Importantly, simulations allowed the authors to investigate in detail the robustness of theoretical predictions in terms of deviations from theory arising from developmental processes. In addition to providing new insight, this work lays the foundation for the exciting possibility of inferring aspects of developmental dynamics and regulation simply by observing the statistics of cell packing in an organism, which could be of great use in comparative evo-devo studies where developmental processes are difficult or impossible to observe.

While the work is very strong overall, there are a few caveats to consider, primarily concerning the simulations. Multicellularity takes many forms by many different processes among eukaryotes. While simulations do cover a range of different morphologies and developmental processes found in nature, factors not explicitly addressed such as constraint or patterning by secreted extracellular matrix, differences in cell shape, cell migration, and others can lead to different kinds of multicellular form. The extent to which potential correlations imposed by diverse morphologies might lead to deviations from theoretical maximum entropy predictions, and how robust those deviations might be to noise is not entirely clear. Additionally, the randomness strength in simulations from high to low, while reasonable, does not appear to be grounded in empirical characterization of randomness strength in developmental processes across biological systems. Ultimately, although they leave some uncertainty as to the generality of the results, these limitations do not contradict or significantly diminish the key claims of the paper.

Comments for the authors:

1) The simulation results are compelling but left me with some questions. To what extent do the morphologies and processes investigated by simulations address the diverse forms of multicellularity encountered across eukaryotes? To what extent does the overall shape of the multicellular structure affect the cell packing distribution (e.g. multi-lobed structure as in Zoothamnium niveum, dichotomous branching as in Dinobryon, something with an undulating boundary)? Are there any examples of simple multicellular eukaryotes that might exhibit very strongly correlated cell positions? What is known about randomness strength or precision in developmental processes in biological systems, and if anything is known, how does this compare to values in simulations? Providing a bit more contextualization or motivation for specific choices in simulations could help address these questions and would support the generality of conclusions drawn from the simulations. Although I am convinced that the results hold for a broad range of multicellular architectures and do not think that the possible existence of a few edge cases contradicts the main conclusions of the work, it is not entirely clear to me that the effects of growth morphology, connection topology, and dimensionality have been accounted for.

2) The section titled "Multicellular motility is robust to cellular area heterogeneity" starting on p. 11 is slightly perplexing. It is certainly interesting, and I see that it addresses a question that may arise from analysis of Volvox cell packing, but in its current form, I do not believe it contributes substantially to the key points of the paper. The introduction section seemed to imply that the results would demonstrate that fluctuations in cell packing may play a role in the evolution of multicellular systems, but as I understood them, the results suggest that fluctuations do not affect motility, at least implying that there should be little to no effect on any aspect of fitness related to motility. It is possible that there could be other aspects of organismal fitness related to cell packing, so while these results are consistent with cell packing fluctuations not necessarily impeding or constraining the evolution of multicellularity, they do not strongly support that conclusion. Perhaps contextualizing the results a bit more in terms of key points of the paper while reporting them and referring to them in the Discussion section might help the reader better appreciate their significance within the context of the paper overall.

3) I might suggest removing or otherwise modifying the phrase "highly-evolved" (p.14) as its meaning is unclear, has connotations of evolutionary teleology, and clashes with the fact that all extant organisms share an evolutionary history of equal length. Maybe something such as "organisms with highly-regulated development" may be more appropriate.

4) Is anything known about the source of correlated subregions of cells in Volvox? Do the authors have any ideas about this? Either way, it would be interesting to know and may warrant at least a small comment in the text.

5) In the author list, SSH is missing an asterisk to denote corresponding authorship.

6) An "e" is missing in "surface" in the caption for Figure 2B.

7) I believe the dotted red line in Figure 4B should be a solid line to match those in panels A and C.

Reviewer #1:

[…] I have several comments for the authors to consider, at their discretion. Overall, I really enjoyed this paper and learned a great deal from it.

– The manuscript offers an interesting guiding principle that describes two considerably different biological systems. As the authors show in simulations, the principle is expected to hold over a broad range of conditions, but of course not universally (though even small levels of stochasticity broaden the range of applicability). I think the paper could be improved by expanding on the discussion of these limitations. In particular, it is not clear to me exactly how surprising it is to see "good" fits to a 2-parameter distribution of this sort (or more generally, what level of "good" we should expect of the fits in finite data sets like these). The authors address this issue in part by showing fits to other distributions, which is nice. But I wonder if it would be helpful to also include (or at least discuss) more systematic model selection. To be clear, I find the analysis quite convincing as is. But I am trying to get my head around the limitations, and in particular, to get a feel for how likely one is to see similar "goodness of fit" results using other distributions with a relatively small number of parameters.

We completely agree that assessing goodness of fit is crucial, despite the fact that doing so is difficult for complex, non-linear, non-monotonic functions. Thanks to these comments, we have clarified our discussion of the topic and added a new analysis (described in detail in response to the next question). Thank you for this suggestion; we believe addressing it has strengthened the manuscript.

We agree that it is, in general, unclear how well one should expect 2-parameter distributions to perform and what exactly distinguishes a good 2-parameter distribution from a bad 2-parameter distribution for our datasets. To rectify this, we consider four different 2-parameter distributions that use the empirically measured mean and standard deviation (as well as the measured minimum cell size); they are the k-gamma, normal, log-normal, and beta-prime distributions (in the original text, we compared the data only to the normal and log-normal distributions). We do not calculate least-squares fits for these distributions, but simply use the empirically measured values of the mean and variance to extract the two relevant parameters of each distribution. For both experiments and simulations, the k-gamma distribution gives the best match to our data (Figure 2 Supplement 1). Further, as our simulations are less data-limited, we can confirm that the k-gamma distribution is significantly more accurate than the other distributions at predicting the frequency of large Voronoi volumes.
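The moment-matching step described above can be sketched as follows. This is a minimal illustration rather than the analysis code used for the manuscript: the function names are hypothetical, the beta-prime shape parameters are written in the standard textbook form (`a`, `b`), and applying the log-normal and beta-prime formulas to the unshifted volumes is an assumption.

```python
import math

def kgamma_params(mean, var, v_min):
    """Shape k and scale of a gamma distribution shifted to start at v_min."""
    excess = mean - v_min              # mean volume above the minimum volume
    k = excess ** 2 / var              # shape parameter
    return k, excess / k               # (shape, scale); shape * scale == excess

def normal_params(mean, var):
    """Location and standard deviation of the matching normal distribution."""
    return mean, math.sqrt(var)

def lognormal_params(mean, var):
    """mu and sigma of log(X) such that X has the given mean and variance."""
    sigma2 = math.log(1.0 + var / mean ** 2)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)

def beta_prime_params(mean, var):
    """Shape parameters (a, b), using mean = a/(b-1) and var = mean*(mean+1)/(b-2)."""
    b = 2.0 + mean * (mean + 1.0) / var
    a = mean * (b - 1.0)
    return a, b
```

No optimizer appears anywhere in this sketch: each distribution is pinned down by the measured moments alone, so differences in agreement with the data reflect the shape of each distribution rather than the quality of a fit.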

These four distributions were chosen purposefully. The rationale for the k-gamma distribution with regard to maximum entropy packing is detailed in the text. The normal, log-normal, and beta-prime choices are discussed below.

We chose the normal distribution as it is the maximum entropy distribution given only the mean and variance of a population; furthermore, it sits as the limiting distribution according to the central limit theorem. Therefore, should random fluctuations add together completely independently, we might expect to observe a (truncated) normal distribution of the volumes. The absence of agreement between the data and this distribution implies that there is an additional feature that must be considered; namely, that cell volumes are not completely independent of one another but rather must sum to match a total volume (this is, of course, a requirement for the k-gamma maximum entropy distribution).

Similarly, the log-normal distribution was chosen for comparison due to its MaxEnt and central-limit-like properties. In essence, the log-normal distribution is the central limit theorem result for the logarithm of a variable. Further, volumes must be positive numbers, and the log-normal distribution only exists in the positive domain. Log-normal distributions are observed in many fields, including ecology, physiology, and geology, and have recently been shown to describe organism swimming speeds quite well [6]. They define a good null model for any complex process with many interacting elements, particularly when there is multiplicative noise. By observing that the k-gamma distribution outperforms the log-normal distribution, we recognize that the volumes are not multiplicatively independent of one another.

Last, we chose to compare the performance of the k-gamma distribution to an arbitrary distribution with two parameters: we chose the beta-prime distribution, which has two parameters (

Therefore, showing these four distributions together allows us to compare four “minimal” models, and to demonstrate that the k-gamma distribution is the most accurate match for our experiments and simulations.

– Related to the previous point: one approach might be to construct a type of "null" model from the data, perhaps by systematically shuffling the data in some way and then bootstrapping to evaluate the likelihood of achieving fits of similar quality.

Thank you for this suggestion. We agree that this is an excellent approach to systematically justify the chosen distributions. For four 2-parameter distributions (k-gamma, normal, log-normal, and beta-prime), we applied a bootstrapping/resampling approach to understand how much error one might expect to see in these fits given the finite data sets. We applied this approach to experimental and simulated snowflake yeast Voronoi volumes. We took subsamples of varying size from these datasets and measured how the four 2-parameter distributions perform. For each sample size, we computed the subsample's mean and standard deviation, then used these values to calculate the necessary parameters for each of the distributions (i.e., we again did not do any “fitting” in a least-squares sense). By calculating the root-mean-square residual of the cumulative distribution functions, we found that the k-gamma distribution always outperforms the others (see Figure 2 Supplement 1c). Further, the k-gamma distribution error continually decreases with increasing sample size (albeit slowly for large sample sizes), while the error plateaus at a larger value for all other distributions.
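The resampling procedure can be sketched as below. This is an illustrative reconstruction under stated assumptions, not the code behind the supplementary figure: only the k-gamma and normal distributions are compared, resampling is done with replacement, the empirical CDF uses midpoint plotting positions, and all function names are hypothetical.

```python
import math
import random

def gamma_cdf(x, k, scale):
    """Regularized lower incomplete gamma function P(k, x/scale), by power series."""
    if x <= 0.0:
        return 0.0
    z = x / scale
    term = 1.0 / k
    total = term
    n = 1
    while term > 1e-12 * total:
        term *= z / (k + n)
        total += term
        n += 1
    return total * math.exp(-z + k * math.log(z) - math.lgamma(k))

def normal_cdf(x, mean, std):
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def rms_cdf_residuals(sample, v_min):
    """RMS gap between the empirical CDF and each moment-matched model CDF."""
    n = len(sample)
    xs = sorted(sample)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)
    k = (mean - v_min) ** 2 / var          # k-gamma shape from the two moments
    scale = (mean - v_min) / k
    sq_kgamma = sq_normal = 0.0
    for i, x in enumerate(xs):
        ecdf = (i + 0.5) / n               # midpoint plotting position (assumed)
        sq_kgamma += (gamma_cdf(x - v_min, k, scale) - ecdf) ** 2
        sq_normal += (normal_cdf(x, mean, math.sqrt(var)) - ecdf) ** 2
    return math.sqrt(sq_kgamma / n), math.sqrt(sq_normal / n)

def bootstrap_rms(data, v_min, sample_size, n_resamples=20, seed=0):
    """Average (k-gamma, normal) RMS residuals over resampled subsets."""
    rng = random.Random(seed)
    tot_kgamma = tot_normal = 0.0
    for _ in range(n_resamples):
        sub = [rng.choice(data) for _ in range(sample_size)]
        r_kgamma, r_normal = rms_cdf_residuals(sub, v_min)
        tot_kgamma += r_kgamma
        tot_normal += r_normal
    return tot_kgamma / n_resamples, tot_normal / n_resamples
```

Calling `bootstrap_rms` over a range of `sample_size` values gives one error curve per distribution, which is the kind of comparison described in the text.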

Next, we used the four distributions to predict the skewness of the simulation data. We empirically determined the first two moments of the dataset and took these to be the first two moments of each of our distributions. Then, we used the first two moments to predict the skewness, which is related to the third moment of the distribution. We then compared this value to the measured skewness, finding that the k-gamma distribution estimated the observed skewness with an error of only 7%, while the other three distributions (normal, log-normal, and beta-prime) estimated the skewness with percent errors of −100%, 96%, and 174%, respectively.
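A sketch of how skewness follows from the first two moments for each distribution is given below. The function names are hypothetical, and applying the log-normal and beta-prime formulas to unshifted volumes is an assumption. The beta-prime value is computed from its raw moments and requires `b > 3`.

```python
import math

def predicted_skewness(mean, var, v_min):
    """Skewness implied by each two-parameter distribution matched to (mean, var)."""
    cv2 = var / mean ** 2                      # squared coefficient of variation
    k = (mean - v_min) ** 2 / var              # k-gamma shape parameter
    # Beta-prime shape parameters from the first two moments.
    b = 2.0 + mean * (mean + 1.0) / var
    a = mean * (b - 1.0)
    # Raw moments E[X^2] and E[X^3] of the beta-prime distribution (b > 3).
    m2 = a * (a + 1.0) / ((b - 1.0) * (b - 2.0))
    m3 = m2 * (a + 2.0) / (b - 3.0)
    mu3 = m3 - 3.0 * mean * m2 + 2.0 * mean ** 3   # third central moment
    return {
        "k-gamma": 2.0 / math.sqrt(k),
        "normal": 0.0,
        "log-normal": (cv2 + 3.0) * math.sqrt(cv2),
        "beta-prime": mu3 / var ** 1.5,
    }

def percent_error(predicted, observed):
    """Signed percent error of a prediction relative to the observed value."""
    return 100.0 * (predicted - observed) / observed
```

Because a normal distribution is symmetric, its predicted skewness is zero, so its percent error is −100% for any nonzero observed skewness, consistent with the value reported above.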

As an alternative approach, we tested the ability of our selected distributions to perform in a least-squares fit. In other words, we fit the k-gamma, log-normal, normal, and beta-prime distributions to the observed dataset. Then, we compared the values of the fitted parameters (namely, the mean and variance) to the observed values. Distributions that more closely fit the observed data are expected to produce closer estimates of the observed moments. Indeed, upon least-squares fitting the simulated dataset, we observed that the fitted k-gamma distribution most closely matches both the observed mean and the observed standard deviation, indicating that it not only best describes the tail of the data (as visualized in Supplementary Figure 2) but is also the best fit when considering the first two moments.

Figure additions: We revised and appended two supplemental figures for this response and the previous one. We included one figure on the probability distribution and cumulative distribution functions for all datapoints (Figure 2—figure supplement 1a,b), and then an additional panel on the bootstrapping analysis (Figure 2—figure supplement 1c). Then, we added a figure detailing the comparison of the estimated skewness from each distribution to the true measured values, and then of the measured mean and variance to the least-squares fit mean and variance for all four distributions (Figure 2—figure supplement 2).

Text additions: We added a subsection to the Methods section titled “Goodness-of-fit analysis” in which we present these results. In this section, we describe our reasons for choosing the three comparison distributions, indicate the parameters involved in each, and finally compare the performance of each.

– Have the authors considered trying to systematically quantify the impact of including higher-order structures in the max ent model? For example, one could perhaps use multi-information metrics (https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.91.238701) to evaluate the extent to which higher-order features of the data are relevant / necessary; the idea is essentially to construct maximum entropy models with various levels of data complexity "built in" and then evaluate (perhaps with an info-theoretic style metric) the extent to which that complexity improved the model. Perhaps something similar has been done for granular materials to capture higher-order statistics of packing? I ask this primarily out of curiosity, not as a serious criticism of the current approach. A discussion of this point might add to the paper.

Thank you for these thought-provoking resources and ideas. We agree that this would be a very interesting next step to take. In particular, in Figures 1, 3, and 4 we probe cases where our predictions and observations differ. We think that future work could build upon these observations, though the framework to do so for maximum entropy packing may not yet exist. Nevertheless, it would be very interesting to use higher-order correction terms to understand exactly what kinds of new considerations are relevant for organisms that achieve various levels of morphological complexity.

We now directly address this topic in the discussion, where we discuss both the suggested paper on higher-order structure and recently published work that uses topological information about the contact network of bacterial cells to uncover universal motif distributions. By combining topological information with further extensions of geometric models, future research may build upon the platform constructed here.

Text Additions: Inspired by this comment, we added a new paragraph (the second) to the Discussion section.

Reviewer #2:

[…] While the work is very strong overall, there are a few caveats to consider, primarily concerning the simulations. Multicellularity takes many forms by many different processes among eukaryotes. While simulations do cover a range of different morphologies and developmental processes found in nature, factors not explicitly addressed such as constraint or patterning by secreted extracellular matrix, differences in cell shape, cell migration, and others can lead to different kinds of multicellular form. The extent to which potential correlations imposed by diverse morphologies might lead to deviations from theoretical maximum entropy predictions, and how robust those deviations might be to noise is not entirely clear. Additionally, the randomness strength in simulations from high to low, while reasonable, does not appear to be grounded in empirical characterization of randomness strength in developmental processes across biological systems. Ultimately, although they leave some uncertainty as to the generality of the results, these limitations do not contradict or significantly diminish the key claims of the paper.

Comments for the authors:

1) The simulation results are compelling but left me with some questions. To what extent do the morphologies and processes investigated by simulations address the diverse forms of multicellularity encountered across eukaryotes? To what extent does the overall shape of the multicellular structure affect the cell packing distribution (e.g. multi-lobed structure as in Zoothamnium niveum, dichotomous branching as in Dinobryon, something with an undulating boundary)? Are there any examples of simple multicellular eukaryotes that might exhibit very strongly correlated cell positions? What is known about randomness strength or precision in developmental processes in biological systems, and if anything is known, how does this compare to values in simulations? Providing a bit more contextualization or motivation for specific choices in simulations could help address these questions and would support the generality of conclusions drawn from the simulations. Although I am convinced that the results hold for a broad range of multicellular architectures and do not think that the possible existence of a few edge cases contradicts the main conclusions of the work, it is not entirely clear to me that the effects of growth morphology, connection topology, and dimensionality have been accounted for.

As this is a detailed comment with many independent questions, we interleave them and our responses below.

“To what extent do the morphologies and processes investigated by simulations address the diverse forms of multicellularity encountered across eukaryotes?”

Broadly, multicellular growth morphologies can be sorted into two general classes: intercellular bonds may be reformable, or they can be permanent (i.e., un-reformable). Our manuscript addresses both classes.

“Permanent” intercellular bonds are not reformable if broken; mother and daughter cells remain physically attached after cell division is complete. This process occurs in many clades of multicellularity, including plants, green algae, brown algae, red algae, fungi, bacteria, and in some stages of animal development. Incomplete cell separation is one of the oldest forms of multicellular assembly and one of the most successful, dominating the planet’s biomass [1]. Thus, we have primarily focused on this class of growth morphologies, which includes experiments on snowflake yeast and

Of course, multicellularity is rich in form and function, and we cannot capture the growth morphologies of all multicellular organisms. However, sampling a range of growth morphologies with “reformable” and “non-reformable” bonds provides at least an initial sampling of the diversity of growth morphologies that exist. We specifically do not simulate growth morphologies of confluent tissues, whether these be assembled through permanent bonds or sticky cells. As confluent tissues exhibit packing fractions of nearly 1.0, the distribution of Voronoi volumes or areas is highly impacted by the cell size distribution, which is regulated at the cellular level.

Text additions: We added the following sentences to clarify the motivation for our simulations:

“These simulations captured a couple of basic properties of multicellularity: organisms may grow in both two and three dimensions, and they may assemble with two different classes of bonds: bonds that are reformable, and bonds that are not reformable. […] Finally, palintomy, or growth confined inside a maternal membrane, is also common from algae to animals.”

“To what extent does the overall shape of the multicellular structure affect the cell packing distribution (e.g. multi-lobed structure as in Zoothamnium niveum, dichotomous branching as in Dinobryon, something with an undulating boundary)? Are there any examples of simple multicellular eukaryotes that might exhibit very strongly correlated cell positions?”

This is a very interesting question. What is the role played by the boundary between ‘inside’ and ‘outside’ an organism? Geometric constraints, by definition, limit how cells can be arranged, and sufficiently strong geometric constraints, i.e., constraints that precisely determine cellular positions, could dampen or eliminate fluctuations in cell packing volumes. So, how do boundary conditions impact the cell packing distribution?

First, in the case of groups with unreformable bonds, such as snowflake yeast, Zoothamnium, and Dinobryon, cells are “frozen” into place (subject to minor displacements due to, for example, mechanical deformations). However, so long as the exact location is subject to random fluctuations, each organism will have a similar but different cell spatial structure. The maximum entropy principle thus suggests that the ensemble of all organisms precisely follows the most likely cell packing distribution, even though each individual organism deviates slightly from this distribution.

Second, we would like to clarify how we define the boundary of the organism. This is a non-trivial point: where should the boundary of a multicellular organism be drawn? In some cases, for example cells contained within an epithelial sheet (such as

Having clarified those points, we now more directly address the referee’s question: could a particular boundary condition make cell positions more correlated, damping fluctuations and producing deviations from maximum entropy predictions? While we cannot completely address this question, at a minimum it is known that packing statistics in thermal systems are different near and far from boundaries. This phenomenon is well demonstrated in soft matter physics, impacting everything from colloidal particles to polymers [3, 5, 7, 8, 9]. Thus, we expect that boundaries can and do impact packing statistics in multicellular organisms. In particular, it is likely that the maternal membrane we simulate causes similar effects. To test this hypothesis, we returned to our simulations of palintomy within a confining membrane to examine how the boundary impacted cell packing statistics. We bin cells into shells of finite width centered at the center of mass of the organism. As expected, we find that the distribution of Voronoi volumes depends on the distance from the boundary; cells located near the boundary are packed more densely than cells deeper in the interior. However, we find that packing within the ‘core’ and within the ‘shell’ still follow separate maximum entropy predictions (see Figure 2—figure supplement 3).
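The shell-binning step can be sketched as follows. This is a minimal illustration, not the simulation code: it assumes equal-mass cells, equal-width shells, radii normalized by the most distant cell, and at least one cell away from the center of mass. The function name is hypothetical.

```python
import math

def bin_by_radial_shell(positions, volumes, n_shells):
    """Group Voronoi volumes by normalized distance from the center of mass."""
    n = len(positions)
    # Center of mass, assuming every cell has the same mass.
    com = [sum(p[i] for p in positions) / n for i in range(3)]
    radii = [math.dist(p, com) for p in positions]
    r_max = max(radii)                     # normalization radius (assumed choice)
    shells = [[] for _ in range(n_shells)]
    for r, v in zip(radii, volumes):
        # Cells at r == r_max are placed in the outermost shell.
        idx = min(int(n_shells * r / r_max), n_shells - 1)
        shells[idx].append(v)
    return shells
```

Each list in `shells` can then be compared separately against a k-gamma distribution matched to that shell's own mean and variance, mirroring the separate 'core' and 'shell' predictions described above.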

There are, in fact, a number of simple multicellular organisms with highly correlated cell positions.

Text additions: To make this subject clearer to the reader, we have revised the first paragraph of the subsection “Snowflake Yeast” in the section “Experimental tests of multicellular maximum entropy predictions”. We have changed this paragraph to read as follows:

“Snowflake yeast grow via incomplete cytokinesis, generating branched structures in which mother-daughter cells remain attached by permanently bonded cell walls (Figure 1A). […] Conversely, if there are strong correlations in the locations of daughter cells, then we will observe deviations from maximum entropy predictions regarding the cell neighborhood volumes.”

We also added a sentence about confluent cell layers: “However, we would not expect this prediction to hold for confluent cell layers (i.e. packing fraction

Figure additions: We have added Figure 2—figure supplement 3, in which we show that, in simulations of cells confined by a maternal membrane, the membrane imposes boundary effects on the packing of the group. In panel A, we show the Voronoi volumes as a function of the distance from the center of mass (in normalized units). Then, we partition the space enclosed by the membrane into spherical shells and show that within each shell, the distribution of Voronoi volumes is well described by the k-gamma distribution (panels B-E).

“What is known about randomness strength or precision in developmental processes in biological systems, and if anything is known, how does this compare to values in simulations?”

A number of studies have shown that fluctuations can be crucial to highly functioning developmental processes, for instance in sheet folding and sepal growth [2, 4]. Such studies tell us three important things: (i) randomness can be just as integral a component of development as directed growth; (ii) it may be difficult to disentangle exactly which reproducible qualities and traits stem from a regulated developmental process, and which stem from random processes; and (iii) random processes may be more prevalent in development than we currently know, as there are not many tools by which we can actually measure the effect of randomness. With this in mind, our manuscript provides a mechanism for testing the effect of randomness on cell packing distributions. Our simulations provide test cases for what we might expect to observe in the presence or absence of randomness. We show how deviations from maximum entropy packing predictions occur, and that they indicate the presence of correlations. With this information, future work can use this tool to quantify the randomness strength of various developmental processes.

Text additions: To better highlight this point, we have modified a paragraph of the text. The first paragraph of the section “The crucial role of randomness” now reads:

“How much randomness is necessary for the k-gamma distribution to predict cell neighborhood size distributions? […] We do not attempt to recapitulate exactly any naturally-occurring levels of randomness strength; rather, we are looking to investigate the limits about how well maximum entropy predictions can measure a deviation from pure randomness.”

2) The section titled "Multicellular motility is robust to cellular area heterogeneity" starting on p. 11 is slightly perplexing. It is certainly interesting, and I see that it addresses a question that may arise from analysis of Volvox cell packing, but in its current form, I do not believe it contributes substantially to the key points of the paper. The introduction section seemed to imply that the results would demonstrate that fluctuations in cell packing may play a role in the evolution of multicellular systems, but as I understood them, the results suggest that fluctuations do not affect motility, at least implying that there should be little to no effect on any aspect of fitness related to motility. It is possible that there could be other aspects of organismal fitness related to cell packing, so while these results are consistent with cell packing fluctuations not necessarily impeding or constraining the evolution of multicellularity, they do not strongly support that conclusion. Perhaps contextualizing the results a bit more in terms of key points of the paper while reporting them and referring to them in the Discussion section might help the reader better appreciate their significance within the context of the paper overall.

We agree that this section could be made clearer to the reader. Broadly, random perturbations could be detrimental, beneficial, or neutral to an organism. Naively, we may expect that perturbations push an organism further from an ‘ideal’ morphology and thus are largely detrimental. With snowflake yeast, we show that randomness in cell positions can lead to highly repeatable group level traits, which allows emergent multicellular traits to become remarkably heritable (Zamani-Dahaj et al., 2021: doi.org/10.1101/2021.07.19.452990). In this way, entropic cell packing imbues nascent multicellular organisms like snowflake yeast with an evolutionary benefit, allowing natural selection to act upon multicellular traits that emerge from the mechanics of cellular packing. Our analysis of

Text additions: We make the following additions to the text to help the reader contextualize these results:

“However, the effect of heterogeneity in cell area on motility is as yet unknown; naively, we might expect that heterogeneities in flagella packing may lower the motility of the organism in comparison to a highly regular arrangement. […] The absence of such a barrier may represent a scenario where trait optimization can be achieved without first evolving sophisticated developmental genes.”

3) I might suggest removing or otherwise modifying the phrase "highly-evolved" (p.14) as its meaning is unclear, has connotations of evolutionary teleology, and clashes with the fact that all extant organisms share an evolutionary history of equal length. Maybe something such as "organisms with highly-regulated development" may be more appropriate.

We agree and have made the suggested change.

4) Is anything known about the source of correlated subregions of cells in Volvox? Do the authors have any ideas about this? Either way, it would be interesting to know and may warrant at least a small comment in the text.

We agree that this question should be addressed in the manuscript.

Typically, an anterior/posterior polarity is measured by observing a moving organism; however, in our snapshot data, the organism is stationary. In other cases, the distribution of the germ cells within the interior of the organism can be used, at some stages of the

Text additions: We have revised the first paragraph of the subsection “The role of spatial correlations” to read:

“While we have shown that the distribution of cell neighborhood volumes closely follows the k-gamma distribution in two very different organisms, we have also seen that in some cases maximum entropy predictions are more accurate in sub-sections of an organism than across its entirety. […] This polarity may affect the somatic cell packing distribution in different subregions of the organism, leading to deviations from maximum entropy predictions.”

5) In the author list, SSH is missing an asterisk to denote corresponding authorship.

Corrected.

6) An "e" is missing in "surface" in the caption for Figure 2B.

Corrected.

7) I believe the dotted red line in Figure 4B should be a solid line to match those in panels A and C.

We appreciate your close scrutiny of our figures. Here, the dotted red line signifies that there is only a small, discrete number of observed cell volumes when there is no noise added to the system. We modified the caption to emphasize this point and ensure it is clear to the reader.

Text additions: We added the following sentences to the figure caption:

“Note: the discrete points in (B) arise because, absent randomness in the cell locations, the exact same cellular structure is achieved by every simulation. […] Upon adding randomness, the cellular structure is altered between successive simulations, and the distribution again achieves continuity.”

References:

1. Yinon M. Bar-On, Rob Phillips, and Ron Milo. “The biomass distribution on Earth”. In: Proceedings of the National Academy of Sciences 115.25 (2018), pp. 6506–6511. issn: 10916490. doi: 10.1073/pnas.1711842115.

2. Pierre A. Haas et al. “The noisy basis of morphogenesis: Mechanisms and mechanics of cell sheet folding inferred from developmental variability”. In: PLoS Biology 16.7 (2018), pp. 1–37. issn: 15457885. doi: 10.1371/journal.pbio.2005536.

3. K. Harth, A. Mauney, and R. Stannarius. “Frustrated packing of spheres in a flat container under symmetry-breaking bias”. In: Physical Review E – Statistical, Nonlinear, and Soft Matter Physics 91.3 (2015), pp. 1–5. issn: 15502376. doi: 10.1103/PhysRevE.91.030201.

4. Lilan Hong et al. “Variable Cell Growth Yields Reproducible Organ Development through Spatiotemporal Averaging”. In: Developmental Cell 38.1 (2016), pp. 15–32. issn: 18781551. doi: 10.1016/j.devcel.2016.06.016.

5. Sára Lévay et al. “Frustrated packing in a granular system under geometrical confinement”. In: Soft Matter 14.3 (2018), pp. 396–404. issn: 17446848. doi: 10.1039/c7sm01900a.

6. Maciej Lisicki et al. “Swimming eukaryotic microorganisms exhibit a universal speed distribution”. In:

7. Joerg Reimann et al. “Pebble bed packing in prismatic containers”. In: Fusion Engineering and Design 88.9-10 (2013), pp. 2343–2347. issn: 09203796. doi: 10.1016/j.fusengdes.2013.05.100.

8. Mohammad Mahdi Roozbahani, Bujang B.K. Huat, and Afshin Asadi. “Effect of rectangular container’s sides on porosity for equal-sized sphere packing”. In: Powder Technology 224 (2012), pp. 46–50. issn: 00325910. doi: 10.1016/j.powtec.2012.02.018.

9. Yu G. Stoyan and G. N. Yaskov. “Packing identical spheres into a cylinder”. In: International Transactions in Operational Research 17.1 (2010), pp. 51–70. issn: 14753995. doi: 10.1111/j.1475-3995.2009.00733.x.