The authors have declared that no competing interests exist.

In this article we use geometric microliths (a specific type of arrowhead) and Approximate Bayesian Computation (ABC) to evaluate possible origin points and expansion routes for the Neolithic in the Iberian Peninsula. To do so, we divide the Iberian Peninsula into four areas (Ebro river, Catalan shores, Xúquer river and Guadalquivir river) and sample the geometric microliths from the sites with the oldest radiocarbon dates in each zone. On these data, we perform a partial Mantel test with three matrices: a geographic distance matrix, a cultural distance matrix and a chronological distance matrix. We then simulate a series of partial Mantel tests in which the chronological matrix is altered by an expansion model with randomised origin points, using the distribution of the observed partial Mantel test results as a summary statistic within an Approximate Bayesian Computation-Sequential Monte Carlo (ABC-SMC) framework. Our results point clearly to a Neolithic expansion route along the Northern Mediterranean, whilst a Southern Mediterranean route also finds some support and should be discussed further. The most probable origin points are concentrated in the Xúquer river area.

The Neolithic arrival at the Iberian Peninsula has been explained through a mixed model triggered by the demic expansion across the Mediterranean. At this point, the seminal works of Ammerman and Cavalli-Sforza [

The Neolithic package, including domestic plants, animals and cultural items such as impressed pottery ware, polished stone and sickles, among others, was introduced into the Iberian Peninsula circa 7600 cal BP. Its fast dispersal around the Mediterranean corridor reflects the punctuated appearance of some pioneering areas from the northeast to the southwest. Current radiocarbon records confirm the closeness between the oldest dates in the coastal territory and some sites in the Ebro valley, considering dates from domestic samples [

In a brief appraisal, from the Ebro river to roughly the Segura river there is evidence of a Mesolithic record belonging to the blades and trapezes technocomplex, also extended along the Ebro river and the Atlantic coast of Portugal [

From the North to the South, it is possible to recognise some Neolithic pioneering areas in Catalonia, the middle Ebro river, the Serpis valley/Cap de la Nao region, and the coast of Málaga and Cádiz, coincident with areas with a scarce or non-existent Mesolithic population (

Only the sites containing geometric microliths for the time-span under study are shown. Maps modified from ESRI World Terrain Base Map.

Regarding the phenomenon of the first

The Cueva de Chaves (Bastarás, Huesca) constitutes a pioneer site that opens the question of the advance to the inner Iberia through the Pre-Pyrenees mountains from the Mediterranean coast [

Moving to the eastern shore of the Iberian Peninsula, the area between the mouth of the Serpis river and the Cap de la Nau, including the Serpis valley from its headwater, constitutes a genuine core area for the Neolithic dispersal. The sites of Cova de les Cendres, Barranquet, Cova d’En Pardo, Cova de l’Or, Benàmer, Mas d’Is, Falguera and Cova de la Sarsa report radiocarbon dates centred around 7500 cal BP [

From the south, a more punctuated dispersal record is registered in a long territory from the Segura river to the Cádiz coast in Andalusia [

In a general overview, the model postulated by Zilhão [

Consequently, examining the evolutionary processes behind cultural variability at the time of the Neolithic expansion is a challenge for exploring patterns and processes of cultural change. In this paper we turn to geometric tools, considering the potential of stylistic traits to account for cultural change [

The potential of stylistic traits for similarity and culture transmission analysis has been noted by several authors [

The first step has been to sample the most suitable sites in order to capture the early moments of the Neolithization process in the Iberian Peninsula. With this aim, we have divided the zone under study into four areas, from each of which we have selected the oldest radiocarbon date obtained from short-lived single samples of domestic plants or animals, around the middle of the VIII millennium cal BP. The zones selected loosely correspond to the current regions of Aragon, Catalonia, Valencia and Andalusia and, broadly speaking, to the Ebro, northeast shores, Xúquer and Guadalquivir fluvial basins respectively. The oldest dates for each zone belong to the sites of Chaves [

The geometric microliths are a specific type of arrowhead used both by the communities of the last hunter-gatherers and the first farmers of the Iberian Peninsula (

The geometrics belong to the sites of Cova de les Cendres (1,2), Barranquet (3,4), Cova de l’Or (5–11), Mas d’Is (12) and Cueva de Nerja (13–16). Full information on the geometric microliths used (including their provenance) can be found in

In this work, under the assumption that stylistic traits can account for cultural variance (see [

Trait data were collected automatically, although every geometric microlith was individually checked to ensure the consistency of the method. To do so, we used the R package GeomeasuRe [

In order to avoid size problems derived from different raw material sources, the geometrics have been scaled, after which we have filtered the data in three main steps: (1) removing the geometric microliths which did not meet the reliability criteria, (2) removing the L-lines without presence of the geometric and (3) performing a PCA in order to reduce the dimensionality of our dataset. After obtaining the PCA values for the filtered data, we have categorised our numeric data and added categorical traits relating to retouch mode and direction, in order to obtain representation values of each trait for each site. Since the site of Can Sadurní had only one geometric, and it did not meet the reliability criterion, this site has been removed from our study. All in all, we are left with a total of 13 sites and 146 geometric microliths, accounting for 19 variables (data in

As explained above, our main goal is to understand the cultural similarities of the geometric microliths from the sites under study and, based on those similarities, evaluate different possible origin points and routes of expansion for the Neolithic in the Iberian Peninsula. In order to do so, we have relied on Mantel Tests and Approximate Bayesian Computation, more specifically Sequential Monte-Carlo algorithms (SMC-ABC) [

The rationale behind partial Mantel tests is similar to the one above, except that a third matrix is introduced whose effect is controlled for. This third matrix is held constant while the relationship between the other two is determined [
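The logic of a partial Mantel test can be illustrated with a minimal sketch in plain Python. This is not the implementation used in the study: the function names, the one-sided permutation scheme and the toy matrices (five hypothetical sites on a line, with invented dates) are all assumptions made for illustration. The sketch correlates the cultural and geographic distances after removing the linear effect of the chronological distances, and obtains a p-value by jointly permuting the rows and columns of the cultural matrix.

```python
import math
import random


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_r(a, b, c):
    """Correlation of a and b with the linear effect of c removed."""
    r_ab, r_ac, r_bc = pearson(a, b), pearson(a, c), pearson(b, c)
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


def permute(matrix, order):
    """Reorder rows and columns of a matrix with the same permutation."""
    return [[matrix[i][j] for j in order] for i in order]


def partial_mantel(cultural, geographic, chronological, n_perm=999, seed=0):
    """Return (observed partial r, one-sided permutation p-value)."""
    rng = random.Random(seed)
    geo = upper_triangle(geographic)
    chrono = upper_triangle(chronological)
    observed = partial_r(upper_triangle(cultural), geo, chrono)
    hits = 0
    for _ in range(n_perm):
        order = list(range(len(cultural)))
        rng.shuffle(order)
        shuffled = upper_triangle(permute(cultural, order))
        if partial_r(shuffled, geo, chrono) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: five sites on a line and invented dates (cal BP).
positions = [0, 1, 3, 6, 10]
dates = [7500, 7450, 7480, 7300, 7350]
n = len(positions)
geographic = [[abs(positions[i] - positions[j]) for j in range(n)] for i in range(n)]
chronological = [[abs(dates[i] - dates[j]) for j in range(n)] for i in range(n)]
# Cultural distance grows with geographic distance plus a small perturbation.
cultural = [[0.1 * geographic[i][j] + 0.01 * ((i * j) % 3) for j in range(n)]
            for i in range(n)]

observed_r, p_value = partial_mantel(cultural, geographic, chronological)
print(round(observed_r, 3), p_value)
```

Because the permutations are random, repeated runs with different seeds give slightly different p-values, which is why a distribution of test results, rather than a single value, can be collected.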

Finally, because Mantel tests focus on population structure, it has also been questioned whether they can actually account for cultural diversity and affiliation, and some possible corrections have been proposed [

As briefly mentioned above, we have used the third matrix (the chronological Matrix C) of the partial Mantel tests to control for possible expansion routes. Recall that, in our case, this third matrix contains the oldest short-lived radiocarbon date for each site. Because the permutation process means that the values of the Mantel-t statistic are not always the same, we have created a distribution of n = 1000 significant Mantel-t values from our data. This distribution is used as the observed summary statistic in the SMC-ABC process (see next paragraph). After obtaining the summary statistic, we modify the 14C matrix in order to propose different points and modes of expansion. To do so, we randomly select one possible starting point from the 13 sites under study and assign it its actual 14C date (calibrated in the way explained above). Because we are not considering a very large geographic area, we have decided to use the actual sites rather than simulated origin points, as has been proposed for larger areas [
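The step of altering the chronological matrix from a randomly chosen origin can be sketched as follows. The expansion rule used here (a front advancing at constant speed in a straight line from the origin) is an assumption made only for illustration and is not necessarily the model used in the study. The coordinates, dates, speed value and function names are likewise hypothetical.

```python
import math
import random


def distance_km(a, b):
    """Straight-line distance between two (x, y) points given in km."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def simulated_dates(coords, origin, origin_date_bp, speed_km_per_year):
    """Arrival date (cal BP) at every site for a front leaving `origin`.

    Assumption: constant-speed, straight-line spread. Later arrivals have
    smaller cal BP values, so travel time is subtracted from the origin date.
    """
    return [
        origin_date_bp - distance_km(coords[origin], xy) / speed_km_per_year
        for xy in coords
    ]


def chronological_matrix(dates):
    """Pairwise absolute differences between site dates."""
    return [[abs(a - b) for b in dates] for a in dates]


# Hypothetical planar coordinates (km) and observed dates (cal BP).
coords = [(0, 0), (300, 0), (300, 400), (0, 800)]
observed_dates = [7450, 7500, 7480, 7400]

rng = random.Random(42)
origin = rng.randrange(len(coords))      # randomly selected starting site
sim_dates = simulated_dates(
    coords,
    origin,
    origin_date_bp=observed_dates[origin],  # origin keeps its observed date
    speed_km_per_year=1.0,                  # hypothetical parameter value
)
sim_matrix = chronological_matrix(sim_dates)
print(origin, [round(d) for d in sim_dates])
```

The simulated matrix would then replace the observed chronological matrix in a partial Mantel test, so that each candidate origin produces its own test statistic.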

Being

The model has been applied to one or two possible origins. In the case of two origins, the model remains the same for each origin (again assigning the actual 14C date to each of the two sites). In this case, the two origins would produce a possible

The Approximate Bayesian Computation approach [_{i} [_{1}. Then, this posterior distribution _{1} becomes the prior distribution for the next particle _{2}, and this operation is repeated _{i} is accepted and becomes the prior distribution for the next iteration, the threshold proposed for _{i}, the algorithm samples one observation from the previous particle _{i−1}, which becomes the candidate for acceptance under this simulation. The next step is to modify the parameters of the candidate observation by a distribution kernel, frequently ~_{o} (the observed summary statistic) and _{s} (the simulated summary statistic) is evaluated again, and accepted only if it improves the results of the previous particle. Because the threshold value is smaller for each _{i}, each particle provides a better convergence regarding the prior than the previous one.

We have used this method also due to the randomness introduced when selecting each calibrated radiocarbon date, taking into account the increase in computational efficiency. We have used one first rejection algorithm, and three more particles, where _{i} = 1000 observations. Because we deal with the distributions of the Mantel-t, we do not have a single _{o} value to compare with. Thus, for the initial rejection algorithm, we have accepted only significant values falling within the (_{1}, _{3}) from _{o}. For the next three particles, the simulations accepted range between (_{30}, _{70}), (_{35}, _{65}) and (_{40}, _{60}). In order to compare different possibilities, we have decided to use two transition kernels, a strict transition kernel
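The idea of an initial rejection step followed by particles with progressively narrower acceptance bands can be sketched with a toy example. This is a simplified illustration and not the algorithm as implemented in the study: it omits the importance weights of a full ABC-SMC, uses a single Gaussian perturbation kernel, and replaces the Mantel-based simulation with a hypothetical one-parameter simulator. Reading the first band as the interquartile range (25th to 75th percentile) is also an assumption. The later bands follow the percentile pairs quoted in the text.

```python
import random
import statistics


def percentile(values, p):
    """Linear-interpolation percentile (p in 0..100) of a list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * p / 100
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def simulate(theta, rng):
    """Toy simulator: the summary statistic is theta plus small noise."""
    return theta + rng.gauss(0, 0.02)


def abc_smc(observed, bands, n_accept, kernel_sd, rng):
    """Return one list of accepted parameter values per acceptance band."""
    particles = []
    for step, (p_lo, p_hi) in enumerate(bands):
        lo, hi = percentile(observed, p_lo), percentile(observed, p_hi)
        accepted = []
        while len(accepted) < n_accept:
            if step == 0:
                theta = rng.uniform(0, 1)  # draw from the uniform prior
            else:
                # Resample the previous particle, then perturb with the kernel.
                theta = rng.choice(particles[-1]) + rng.gauss(0, kernel_sd)
                if not 0 <= theta <= 1:
                    continue  # outside the prior support
            if lo <= simulate(theta, rng) <= hi:
                accepted.append(theta)
        particles.append(accepted)
    return particles


rng = random.Random(1)
# Hypothetical distribution standing in for the observed summary statistic.
observed = [rng.gauss(0.4, 0.07) for _ in range(1000)]
bands = [(25, 75), (30, 70), (35, 65), (40, 60)]
particles = abc_smc(observed, bands, n_accept=200, kernel_sd=0.01, rng=rng)

for band, sample in zip(bands, particles):
    print(band, round(statistics.mean(sample), 3), round(statistics.pstdev(sample), 3))
```

In this sketch a smaller `kernel_sd` plays the role of a stricter kernel and a larger one that of a more relaxed kernel, since it controls how far a candidate can move away from the previous particle.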

The nuances of these methods are explained in more detail at

We have constructed _{o} from the results of the permutations on the chronological matrix of the Mantel test. In this sense, we have obtained a distribution from the significant values, accounting for a dissimilarity measure with a mean of 0.37 and the thresholds 0.23–0.51 for a 95% confidence interval under a normal distribution, thus being able to distinguish significant differences among the sites under study. These values have been the key which the posterior analyses have relied on, as they are used as the ‘target’ similarity measure for the simulation process. Two possible hypotheses have been considered, one where the Neolithic would have a single origin area in the Iberian Peninsula, and another one where there would be two possible synchronic–
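As a quick arithmetic check of the reported interval, the standard deviation implied by a mean of 0.37 and 95% limits of 0.23 and 0.51 under a normal distribution can be recovered as follows. The variable names are illustrative, and the calculation assumes only that the interval is symmetric and normal, as the text states.

```python
from statistics import NormalDist

mean = 0.37
lower, upper = 0.23, 0.51

# Under a normal distribution the 95% interval is mean +/- z * sd.
z = NormalDist().inv_cdf(0.975)        # about 1.96
implied_sd = (upper - mean) / z        # about 0.071

# Rebuild the interval from the implied parameters as a consistency check.
target = NormalDist(mean, implied_sd)
interval = (target.inv_cdf(0.025), target.inv_cdf(0.975))
print(round(implied_sd, 4), tuple(round(v, 2) for v in interval))
```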

After the development of the process, one of the first outputs calling our attention is the fact that the distribution of the parameter

Confidence intervals offered at 80% and 95%. For both kernels, from the initial uniform distribution, the parameter value offers better convergence for each _{i}.

Our variable of interest is the one containing the origin points for each simulation. In this sense, if we look at _{ret})<P(_{or})<P(_{or})<P(

Site abbreviations from left to right: Chav = Chaves, Ben = Benàmer, Tor = Cueva del Toro, Valm = Valmayor XI, Or = Cova de l’Or, Nerj = Cueva de Nerja, Cen = Cova de les Cendres, Gui = Guixeres, Bar = Barranquet, Cast = Los Castillejos, Ret = El Retamar, Mas = Mas d’Is, Fal = Abric de la Falguera.

These specific results cannot be interpreted in a strict sense; rather, they account for spread signals within each area. Thus, we have treated them as possible origins and, mainly, as aggregated values. We have done this in two ways: first, we have considered regional divisions based on fluvial basins, which broadly coincide with current regional administrations, as stated in the methods section; second, we have considered three possible expansion routes. In the first case, we consider four possible nuclei: the Aragon nucleus (Ebro basin), the Catalan nucleus (Northeast basin), the Valencian nucleus (Xúquer basin) and the Andalusian nucleus (Guadalquivir basin). In order to avoid problems due to sample size (as some nuclei have more sites than others), we have applied a correction in which the total aggregated probability is divided by the number of sites producing that probability. After performing these operations, we can observe that the Xúquer basin nucleus offers the highest probability (50%), followed by the Ebro basin, Northeast and Guadalquivir basin nuclei, all of them with similar probabilities (20.25%, 15.7% and 14.05% respectively) under the restrictive kernel, whereas the results do not vary substantially under the relaxed kernel (48.39%; 19.35%; 18.55%; 13.71%) (
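The site-count correction described above can be illustrated with a short sketch. The site labels and accepted-origin counts below are hypothetical placeholders and are not the values from the study. The final renormalisation to a total of 1 is an assumption, although it is consistent with the reported percentages, which sum to 100.

```python
from collections import defaultdict


def corrected_region_probabilities(site_counts, region_of_site):
    """Average per-site posterior frequency per region, renormalised to 1."""
    total = sum(site_counts.values())
    summed = defaultdict(float)
    n_sites = defaultdict(int)
    for site, count in site_counts.items():
        region = region_of_site[site]
        summed[region] += count / total   # aggregated probability per region
        n_sites[region] += 1
    # Correction: divide each regional total by its number of sites.
    per_site = {region: summed[region] / n_sites[region] for region in summed}
    norm = sum(per_site.values())         # assumed renormalisation step
    return {region: value / norm for region, value in per_site.items()}


# Hypothetical accepted-origin counts per site (placeholder site labels).
site_counts = {"x1": 300, "x2": 250, "x3": 200, "e1": 100, "g1": 80, "g2": 70}
region_of_site = {
    "x1": "Xúquer", "x2": "Xúquer", "x3": "Xúquer",
    "e1": "Ebro",
    "g1": "Guadalquivir", "g2": "Guadalquivir",
}

probabilities = corrected_region_probabilities(site_counts, region_of_site)
for region, p in sorted(probabilities.items(), key=lambda kv: -kv[1]):
    print(region, round(100 * p, 2))
```

Without the division by the number of sites, a region would gain probability simply by contributing more candidate origins to the random selection.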

(A) relaxed kernel. (B) strict kernel. Higher intensity and size of circles indicate higher probability. Maps modified from ESRI World Terrain Base Map.

As for the possible expansion routes, we have considered three main possibilities, all of them supported to a greater or lesser extent in the archaeological literature. First, we have considered a

(A) relaxed kernel. (B) strict kernel. Stronger colour indicates higher probability. Maps modified from ESRI World Terrain Base Map.

We have also considered the possibility that the arrival of the first farmers to the Iberian Peninsula occurred following two different paths of entrance and, thus, two possible origins. Again, the posterior distributions of the parameter _{1} and _{2}, we have decided to attribute the paired origins of each simulation to the different regions/routes. Thus, we consider all possible pairs of origins, including the possibility of a repeated origin; that is, two starting points chosen in the same region.

As we can see in

Higher intensity and size of circles indicate higher probability. Left column represents relaxed kernels; right column strict kernels. From left to right, barplot abbreviations stand for: AR-CAT = Ebro-Northeast, AR-PV = Ebro-Xúquer, AR-AND = Ebro-Guadalquivir, CAT-PV = Northeast-Xúquer, CAT-AND = Northeast-Guadalquivir, PV-AND = Xúquer-Guadalquivir, PV-PV, AR-AR, CAT-CAT, AND-AND. Maps modified from ESRI World Terrain Base Map.

Stronger colour indicates higher probability. Left column represents relaxed kernels; right column strict kernels. From left to right, barplot abbreviations stand for: Py-Med = Inner Pyrenees-Northern Mediterranean, Py-S = Inner Pyrenees-Southern Mediterranean, Med-S = Northern Mediterranean-Southern Mediterranean, Py-Py = Pyrenees-Pyrenees, Med-Med = Northern Mediterranean-Northern Mediterranean, S-S = Southern Mediterranean-Southern Mediterranean. Maps modified from ESRI World Terrain Base Map.

Having explained the statistical meaning of the results, we must highlight two interesting points: a) the existence of signals of cultural variability in the stylistic traits of geometric projectiles according to geographical distance and b) the significant outputs obtained when considering specific points of origin. The hypotheses tested are relevant in the framework of recent archaeological literature on the Neolithization process in the Western Mediterranean, with the novelty of a methodological approach that crosses cultural variability data with chronological and spatial information. As we can see, three main routes have been explored: the Northern Mediterranean, the Pyrenees and the Southern route (considering a possible dispersal from the Northern African shores). Although the importance of sea travel for the expansion of the Neolithic in the Mediterranean has been widely accepted by the archaeological literature [

In this matter, and following the results of this study, we have considered different possibilities for the expansion of the Neolithic in the Iberian Peninsula, which are not necessarily mutually exclusive. To do so, we have combined our current information on 14C dating and material culture, focusing on geometric microliths. Our analyses point to the Northern Mediterranean as the most probable candidate for the Neolithic expansion, which also agrees with most of the archaeological literature [

Perhaps the case of the Southern route is the most complicated one. In our analysis, the probability of a second Neolithic origin point in the South of the Iberian Peninsula, which could imply a South-North Mediterranean advance from Northern Africa, is relatively high. João Zilhão [

These arguments are consistent at the current state of research and, yet, the possibility of a Southern origin is still present in our results. After examining our data, we must consider the fact that we have included the site of Retamar in the analysis. This site was considered to contain Neolithic levels in its first interpretation [

In any case, we must also consider the fact that having one possible origin in the Southern part of the Iberian Peninsula does not necessarily imply a South Mediterranean expansion route. If we consider an average expansion rate of at least 300 km per generation, with each generation lasting 32 years [

Finally, we must take into account that, although the origin probabilities for each area have been weighted by the number of sites per zone, the number of sites present in each region may still affect the analysis. However, the number of sites per region is also part of the Neolithization process, even more so when considering that all of these areas have seen active archaeological research in recent years [

In any case, this study clearly supports one possible expansion route following the Northern Mediterranean, in agreement with current archaeological knowledge. As for the Southern Mediterranean possibility, although it could be considered in future studies, there are currently other elements which must be disentangled in order to understand it better and, ultimately, the only answer could come from an increase in the well-stratified archaeological record from the Northern African shores.

Cultural diversity in the Neolithic spread along the Western Mediterranean has been studied extensively at different scales of analysis, from a geographical and/or diachronic point of view. The first Neolithic package includes ceramic pots, new knapped lithic tools and polished stone, among other cultural productions. In general terms, the early Neolithic groups have been characterized from pottery remains within a tradition defined by impressed pottery ware. As we have noted, recent archaeological discoveries and renewed radiocarbon records have revealed the oldest advance of pioneer

In any case, if we were to focus on the possible reasons accounting for this variability in evolutionary terms, some questions emerge. Indeed, the closeness among the current radiocarbon dates raises some issues. Namely, the speed of the expansion and the proposed leapfrog movement [

The radiocarbon record, although obviously related, does not seem to be in full accordance with the Neolithic origins modelled here. Indeed, we should probably regard it rather as reflecting the consolidation of the farmer groups in some regions. If this were the case, we would expect to find older Neolithic radiocarbon dates at some specific locations, as the discovery of the early

After the initial Neolithic demic contribution, the interaction between Mesolithic and Neolithic groups could have been significant, considering the weight of cultural transfers and borrowings in a lithic pool shared by both. To confirm this point, it would be necessary to add new parameters and methods able to explore diversity while taking into account different regional outputs [

Obviously, the nature and size of the sample analysed must be improved and increased, while it is also necessary to test the method including other cultural items. In fact, some works designed to explore cultural diversity in a diachronic view, and focussing on pottery and ornament have revealed different mechanisms of cultural transmission [

In summary, we have designed this work as an analytical procedure to look into cultural patterns in the spread of agriculture in Iberia. The combined approach of cultural and geo-chronological similarity analysis, aided by the computational power of ABC-SMC methods, opens a new window to simulate and model crucial questions about cultural diversity in a human dispersal scenario. Improving the method, exploring new parameters, and extending the spatial and temporal contexts are among the challenges in building new evolutionary histories of the Neolithic transition in the western Mediterranean.

(DOCX)

It includes a readme file with further indications.

(ZIP)

We would like to thank Prof. Stephen Shennan and Dr. Enrico R. Crema for their insightful comments and observations on previous drafts of the manuscript, as well as an anonymous reviewer, whose comments greatly improved the quality of this paper.