![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190409071159301-0045:S0003598X18001874:S0003598X18001874_figU1g.jpeg?pub-status=live)
The recognition and selection of high-quality stone raw materials over those of lower quality would have provided a selective advantage to the prehistoric makers and users of stone tools. Ethnographic studies of contemporary small-scale societies who produce stone tools have shown that sourcing high-quality raw materials is one of the most important and difficult aspects of the stone-tool production process, especially for the unskilled (Stout 2002; Weedman Arthur 2010, 2018). For expert stone toolmakers, the use of high-quality stone, free of impurities, would have reduced the chance of production failure due to flaws in the material, thus saving time and energy (Goodyear 1989; Whittaker 1994; Patten 2009). For novice stone toolmakers, using high-quality raw material would have facilitated production and improved the learning processes. Results of specific steps in the production process could be associated directly with controllable factors, such as the actions performed by a novice that increased bio-feedback, rather than with uncontrollable factors, such as stone flaws or variations in the stone matrix (Roux et al. 1995; Stout 2002, 2005; Stout & Semaw 2006). For stone-tool users and producers, it is thought that high-quality raw materials influenced toolkit design, increased tool use-life and augmented the ability to alter tool design and, hence, tool function (Bamforth 1986; Goodyear 1989; Andrefsky 1994). For these reasons, high-quality raw materials would have increased a lithic technology's portability, providing an asset for mobile foragers (Kelly & Todd 1988; Goodyear 1989).
The concept of ‘quality’ is often a subjective and poorly defined characteristic of knappable stone (Brantingham et al. 2000). Humans perceive quality in varying ways, including colour, shape, patterning, availability, translucence, function, brittleness and durability, among other possible traits. In recent years, however, one quantifiable definition of quality advanced by a number of scholars involves fracture predictability (Domanski et al. 1994; Brantingham et al. 2000; Doelman et al. 2001; Braun et al. 2009; Eren et al. 2014). Rocks that fracture predictably possess few impurities that could potentially interfere with fracture propagation (Whittaker 1994; Brantingham et al. 2000; Stout et al. 2005; Bamforth 2009; Braun et al. 2009). The results of several studies have quantitatively supported the hypothesis that, given a variety of rock types, Homo sapiens and other hominins were able to recognise types with greater fracture predictability. Lower Palaeolithic hominins at the African Oldowan sites of Gona and Kanjera South, for example, selected toolstones with fewer impurities (Stout et al. 2005; Braun et al. 2009), as did Middle Palaeolithic hominins in Northeast Asia (Brantingham et al. 2000). Indeed, Brown et al. (2009) demonstrate that Middle Stone Age people at the South African site of Pinnacle Point, c. 164–72ka, systematically manipulated stone raw materials with heat in order to reduce internal flaws and improve the materials’ workability, as did other prehistoric groups (e.g. Domanski & Webb 1992; Schmidt & Morala 2018). This is not to say that fracture predictability was universally desired, only that it was often preferred. It seems, for example, that in certain contexts, quartz was purposefully selected because it shatters upon impact, or is simply conducive to breakage, regardless of whether that breakage is predictable (e.g. Gurtov & Eren 2014).
The North American Holocene archaeological record provides the opportunity to examine stone raw material recognition and selection in a substantially different context from that of Old World early hominins and early Homo sapiens. Unlike many Old World Palaeolithic contexts, the North American Holocene provides a continuous, high-resolution record of temporally diagnostic projectile points (e.g. Justice 1987). Additionally, stone was used throughout the entire North American Holocene period, whereas in Africa, Europe and Asia, stone was often replaced by other raw materials, such as metals, during the Holocene (Tylecote & Tylecote 1992; Mei & Rehren 2009). Finally, due to decreasing lithic procurement and transportation distances after the North American Palaeoindian period (c. 13 500–10 000 BP) (Meltzer 2002), there are individual stone outcrops from which the entire Holocene sequence of point types has been knapped. This allows for the assessment of selectivity over time, from within a single stone raw material type.
Here, we present an assessment of raw material selection spanning the Holocene, by quantitatively assessing the quality (fracture predictability) of a large sample of Holocene flaked stone projectile points from the Welling site (33-Co-2) in Coshocton County, Ohio (Figure 1). Welling is a multicomponent site, although it is best known for its Late Pleistocene Clovis occupation (Prufer & Wright 1970; Lepper 2005; Miller et al. 2018). Welling's Holocene point record, however, spans from the North American Early Archaic to the Late Prehistoric periods (9000 BP to AD 1600). Thus, we can investigate the human recognition and selection of stone raw material quality over time from a single, highly localised chert outcrop.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190409071159301-0045:S0003598X18001874:S0003598X18001874_fig1g.jpeg?pub-status=live)
Figure 1. The Welling site (A) and central Ohio chert outcrops: Delaware (B), Flint Ridge (C), Upper Mercer (D) and Plum Run (E). Welling sits within the Upper Mercer outcrop. Shown on the right are examples of Welling Holocene chert projectile points from various time periods.
Materials and methods
Archaeological and geological chert samples
Fifty-nine projectile points from the Welling site were submitted for chert-quality (fracture predictability) analysis (see below). A recent survey of the complete assemblage of more than 56 000 lithic specimens indicates that these 59 points represent most, if not all, of the Holocene-era projectile point forms in the Kent State University archaeological collections (Miller et al. 2018). This sample comprises point forms dating to every major archaeological period in the North American Holocene except the Middle Woodland (Figure 1): Early Archaic (n = 12); Middle Archaic (n = 5); Late Archaic (n = 16); Early Woodland (n = 8); and Late Woodland (n = 16). While two points were too fragmented to be identified specifically, they are thought to be Holocene in age, due to their form and particular features.
Points were assigned to type, following a commonly used regional guide (Justice 1987). Point-type designation, weight and basic morphometric data are available in Table S1 of the online supplementary material (OSM). Some of the point types slightly overlap temporally (Table S1) and individual specimens of the same type may represent different time periods (e.g. Buchanan et al. 2018; Eren et al. 2018; Maguire et al. 2018; O'Brien et al. 2018a & b). The diversity of assigned point types at Welling, however, is clearly indicative of specific and unique periods of time spanning the entire Holocene.
All of the Welling Holocene projectile points are made on chert lithologies that are macroscopically consistent with those outcropping locally in central Ohio (DeRegnaucourt & Georgiady 1998) (Figure 1). Most of the points are made from Upper Mercer chert (n = 42, 71.1 per cent); the Welling site is situated at an exposure of the Upper Mercer outcrop. The least-cost path between the site and the centre of that outcrop measures 14km (Figure 1A–D). Other point specimens are made on chert types outcropping nearby, including Flint Ridge (n = 9, 15.3 per cent), Delaware (n = 7, 11.9 per cent) and Plum Run (n = 1, 1.7 per cent). Stone raw material assignments are shown in Table S1. Least-cost paths between Welling and these other outcrops are 97km to Delaware, 40km to Flint Ridge and 107km to Plum Run (Figure 1).
In order to establish a range of quality for the central Ohio cherts on which the Welling points were produced, 17 geological chert specimens were also submitted for fracture predictability analysis. These geological specimens were selected from the Kent State University chert reference collections. The majority of these specimens are Upper Mercer (n = 11, 64.7 per cent), with smaller quantities of other chert types (Flint Ridge, n = 3, 17.6 per cent; Plum Run, n = 2, 11.8 per cent; Delaware, n = 1, 5.9 per cent). Some of the Upper Mercer specimens were not provenanced beyond the outcrop itself (n = 5), while others were given county-level provenance (Hocking County, n = 3; Coshocton County, n = 2); one was given a specific provenience (approximately three miles south-west of Nellie village). One Flint Ridge specimen was not provenanced beyond the outcrop itself, while the other two were given county-level provenance (Licking County and Tuscarawas County). The Plum Run and Delaware specimens were provenanced only to their respective outcrops.
Stone raw material fracture predictability analysis
Previous quantitative methods for measuring stone quality, here defined as fracture predictability (e.g. Brantingham et al. 2000; Braun et al. 2009; Eren et al. 2014), include tabulation of visible impurities and assessments of rebound hardness. Neither of these methods, however, was appropriate for the raw material quality assessment here. Impurities can be uneven in their distribution (Stout et al. 2005: 373), and within an already ‘high-quality’ chert such as Upper Mercer, impurities are unlikely to be macroscopically visible. For an accurate appraisal of rebound hardness, Schmidt hammers (a common geological tool) generally require specimens that are at least several centimetres thick (Proceq 2017); neither our archaeological nor geological specimens fulfil this criterion. Also, while central tendencies of rebound hardness have been discerned among vastly different rock types, including chert, obsidian and basalt, there is still substantial overlap in these measures (e.g. Braun et al. 2009: tab. 1; Eren et al. 2014: fig. 8). Thus, identifying differences in rebound hardness among individual artefacts or geological specimens made from the same raw material is precarious, and, in our case, was inappropriate given the dimensions of our specimens.
To assess fracture predictability of the Welling points and comparative geological specimens, we used the loss on ignition (LOI) method (Dean 1974; see also the OSM). Although this method is used as a direct estimation of organic and inorganic carbon in soils, sediments and sedimentary rocks, we use it here to quantify the amount of impurities within the geological and archaeological chert specimens. These impurities stem from in situ volatile elements and secondary mineralisation (see the OSM). Silica is not considered a volatile compound, as it is conservative in nature (Taylor & McLennan 1985). Silicon dioxide (SiO2) content and LOI values, therefore, are approximately inversely related to each other, where the content of SiO2 in cherts is directly associated with the primary sources, such as an igneous rock or siliceous organisms. The impurities probably form during and after the deposition of cherts in sedimentary basins. This inverse relationship can serve as a quantitative model for knapping fracture predictability, and hence, knapping quality. In this way, we establish ranges of geological and archaeological chert quality (Figure 2). Comparison of these two ranges allows us to assess from where along the geological chert-quality range the archaeological specimens were being selected.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190409071159301-0045:S0003598X18001874:S0003598X18001874_fig2g.jpeg?pub-status=live)
Figure 2. Geological (light grey) and archaeological (dark grey) chert-quality ranges for all chert specimens (left, n = 59) and Upper Mercer only (right, n = 42). The red star represents a theoretically ‘perfect’ chert sample composed of 100 per cent SiO2 and no impurities.
Archaeological and geological specimens were analysed on a Panalytical Benchtop Epsilon 3XLE Energy Dispersive X-ray Fluorescence (ED-XRF) spectrometer in the Geology Department at Kent State University (see the OSM). For ED-XRF analyses, bulk samples were powdered using a Spex® Ball mill. Powdered samples (<60 mesh) (2–3g) were converted to ash using LOI. Ash samples were fused into glass beads with a 1:10 ratio of lithium tetraborate iodide flux on a Claisse LeNeo Fusion Fluxer. Glass beads were measured for major and minor oxide values employing the manufacturer method for glass beads under helium (He) purge. We monitored the accuracy and precision of the ED-XRF using USGS Granodiorite, Silver Plume, Colorado (GSP-2) as a standard. ED-XRF measurements of GSP-2 were within six per cent error of the certified USGS GSP-2 SiO2 value (ED-XRF = 62.8 per cent; GSP-2 certified value 66.6 per cent).
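The accuracy check against the GSP-2 standard can be reproduced with a short calculation. The sketch below assumes that ‘error’ means the absolute relative difference from the certified value; the function name is illustrative and not taken from the authors' workflow.

```python
def relative_error_percent(measured: float, certified: float) -> float:
    """Absolute relative error of a measurement against a certified value, in per cent."""
    return abs(measured - certified) / certified * 100.0


# SiO2 values (per cent) for the GSP-2 standard, as reported in the text.
error = relative_error_percent(measured=62.8, certified=66.6)
print(f"{error:.1f} per cent")  # prints "5.7 per cent"
```

Under that assumption, the result of about 5.7 per cent is consistent with the stated six per cent bound.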
Results
Geological sample quality
When the SiO2 and LOI values of the 17 geological specimens are plotted, a wide range of quality is established (r² = 0.8051) (Figure 2: top left). Chert specimens with higher amounts of impurities, and thus lower fracture predictability, have lower SiO2 and higher LOI values, tending towards the left and upper part of the graph. Chert specimens with lower amounts of impurities, and thus higher fracture predictability, have higher SiO2 and lower LOI values, tending towards the right and lower part of the graph. If we only assess the 11 Upper Mercer geological specimens, the results are similar (r² = 0.7975) (Figure 2: top right).
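As a rough illustration of how a coefficient of determination for the SiO2 versus LOI relationship could be computed, the sketch below fits a simple least-squares line. It assumes the reported r² values come from an ordinary linear fit, which the text does not state, and the five data pairs are hypothetical placeholders rather than measurements from the study.

```python
def r_squared(x: list[float], y: list[float]) -> float:
    """Coefficient of determination for a simple least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # For simple linear regression, r^2 equals the squared Pearson correlation.
    return sxy ** 2 / (sxx * syy)


# Hypothetical SiO2 (per cent) and LOI values, for illustration only.
sio2 = [88.0, 91.5, 94.0, 96.5, 98.5]
loi = [6.1, 4.9, 3.0, 2.2, 0.8]
print(f"r^2 = {r_squared(sio2, loi):.4f}")
```

A value close to 1 indicates that LOI falls almost linearly as SiO2 rises, which is the pattern the quality range relies on.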
Archaeological sample quality
The 59 Holocene projectile points clearly cluster towards the right and lower part of the graph, and within a narrow portion of the geological chert-quality range. This indicates that they possess the highest possible fracture predictability (Figure 2: bottom left). Indeed, the vast majority of point specimens have SiO2 values greater than 95 per cent and LOI values less than 2.0; no point is close to reaching the minimum quality value indicated by the geological specimens. The distribution of archaeological specimens is similar if only the 42 Upper Mercer points are examined (Figure 2: bottom right). One projectile point made on Flint Ridge (specimen #34, an Early Archaic Kirk Corner Notched) exhibits an abnormally high LOI value, suggesting a large amount of impurities. Even this specimen, however, possesses a more than 2 per cent greater SiO2 content than the lowest-quality geological specimens.
We further analysed our chert-quality assessments statistically by converting the SiO2 and LOI values into a Euclidean distance between each archaeological specimen and a theoretically ‘perfect’ chert specimen, composed of 100 per cent SiO2 and thus 0 LOI (indicated by the red star on Figure 2). Next, we assessed the shape of the distribution of these distance measures using the Shapiro-Wilk test for normality (Figure 3). The results show that the distance measures for the archaeological points are significantly positively skewed (n = 59; skewness = 1.96; W = 0.792; p < 0.001) (Figure 3: left). The distribution remains significantly positively skewed if we examine only the Upper Mercer archaeological specimens (n = 42; skewness = 1.58; W = 0.808; p < 0.001) (Figure 3: right) (Field 2013).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190409071159301-0045:S0003598X18001874:S0003598X18001874_fig3g.jpeg?pub-status=live)
Figure 3. Histograms of each archaeological specimen's Euclidean distance from the theoretically ‘perfect’ chert sample for all chert types (left, n = 59) and Upper Mercer only (right, n = 42). The dotted line represents the natural geological range of chert quality.
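The distance measure and the skewness statistic described above can be sketched as follows. The ‘perfect’ reference point (100 per cent SiO2, 0 LOI) comes from the text; the choice of the adjusted Fisher-Pearson skewness formula is an assumption, since the formula used is not stated, and the specimen values are hypothetical.

```python
import math

PERFECT_CHERT = (100.0, 0.0)  # (SiO2 per cent, LOI), the red star in Figure 2


def distance_to_perfect(sio2: float, loi: float) -> float:
    """Euclidean distance of a specimen from the theoretically 'perfect' chert."""
    return math.hypot(PERFECT_CHERT[0] - sio2, PERFECT_CHERT[1] - loi)


def sample_skewness(values: list[float]) -> float:
    """Adjusted Fisher-Pearson sample skewness (assumed formula, needs n >= 3)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return (m3 / m2 ** 1.5) * math.sqrt(n * (n - 1)) / (n - 2)


# Hypothetical (SiO2, LOI) pairs, for illustration only.
specimens = [(98.9, 0.6), (98.2, 0.9), (97.5, 1.2), (97.9, 1.0), (93.0, 4.5)]
distances = [distance_to_perfect(s, l) for s, l in specimens]
print([round(d, 2) for d in distances])
print(f"skewness = {sample_skewness(distances):.2f}")
```

Most hypothetical specimens sit close to the reference point and one lies far from it, so the distances form a long right tail and the skewness is positive, mirroring the shape reported for the Welling points.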
Archaeological stone raw material selection through time
To assess the chert-quality selection process through time, we analysed the distances of the archaeological points from the theoretically ‘perfect’ chert specimen, according to the five time periods represented (Early Archaic (n = 12); Middle Archaic (n = 5); Late Archaic (n = 16); Early Woodland (n = 8); and Late Woodland (n = 16)) (Figure 4). As the distance measures did not conform to an underlying normal distribution, we used the non-parametric Kruskal-Wallis test to compare the medians of the specimens from the five time periods. The results demonstrate that none of the periods differ in their median values (H = 0.508; p = 0.973; Figure 4: left). A similar result is obtained if we examine only the points made from Upper Mercer chert (H = 1.55; p = 0.817; Figure 4: right). Throughout the Holocene, therefore, flintknappers at Welling selected the highest-quality chert to produce their points.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20190409071159301-0045:S0003598X18001874:S0003598X18001874_fig4g.jpeg?pub-status=live)
Figure 4. Box plots of chert quality through time for all chert types (left, n = 57) and Upper Mercer only (right, n = 40).
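The reported Kruskal-Wallis p-values can be checked from the H statistics alone. The sketch below assumes the usual chi-square approximation with degrees of freedom equal to the number of groups minus one (here 5 − 1 = 4); whether the original analysis used this approximation is not stated. The helper name is illustrative, and it relies on the closed-form upper-tail probability that exists for even degrees of freedom.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable, closed form for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2.0
    # For even df: exp(-x/2) * sum over i < df/2 of (x/2)^i / i!
    series = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * series


groups = 5  # five time periods compared
df = groups - 1
for h in (0.508, 1.55):  # H values reported in the text
    print(f"H = {h}: p = {chi2_sf_even_df(h, df):.3f}")
```

Under these assumptions the sketch gives p ≈ 0.973 and p ≈ 0.818, in line with the reported 0.973 and 0.817; the difference in the third decimal place is consistent with H having been rounded before reporting.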
In addition, we used Fligner-Killeen tests comparing the coefficient of variation measures for each of the time periods. The results were non-significant (p-values range from 0.216 to 0.953), suggesting that the levels of variation in the distance measures are statistically similar. We obtained similar results comparing the levels of variation by time period for the Upper Mercer specimens: all of the comparisons were non-significant (p-values range from 0.237 to 0.972), except for the comparison of the Early Archaic and the Early Woodland specimens (p = 0.006). The Early Woodland Upper Mercer specimens (coefficient of variation (CV) = 41.95) had significantly more variation than the Early Archaic Upper Mercer specimens (CV = 19.25).
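For readers unfamiliar with the statistic, the coefficient of variation can be computed as in the sketch below. It assumes the CV is expressed as a percentage using the sample standard deviation, which matches the scale of the reported values but is not confirmed in the text; the two lists of distance measures are hypothetical.

```python
import statistics


def coefficient_of_variation(values: list[float]) -> float:
    """Sample standard deviation as a percentage of the mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0


# Hypothetical distance measures for two periods, for illustration only.
tight_period = [1.1, 1.3, 1.2, 1.4, 1.0]
variable_period = [0.6, 1.9, 1.2, 2.4, 0.9]
print(f"CV = {coefficient_of_variation(tight_period):.2f}")
print(f"CV = {coefficient_of_variation(variable_period):.2f}")
```

Because the CV scales dispersion by the mean, it allows variation to be compared between periods whose average distance from the ‘perfect’ chert differs.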
Discussion
The idea that North American Holocene foragers, at discrete times and places, selected high-quality raw materials for the production of stone tools is not a new one (e.g. Bamforth 1986; Andrefsky 1994; MacDonald & Andrefsky 2008; Speth et al. 2013). It is widely accepted that high-quality stone would have provided substantial benefits to stone toolmakers and users in terms of time and energy conservation, learning, tool design and portability (Kelly & Todd 1988; Goodyear 1989; Whittaker 1994; Stout et al. 2005; Stout & Semaw 2006). Previous assessments of North American stone quality, however, have been subjective and untethered to natural geological variation in stone quality (e.g. Callahan 1979; Tsirk 2014).
Our results demonstrate quantitatively that when North American Holocene foragers were presented with stone raw materials of a range of quality, they consistently recognised and selected those that were most conducive to maximising fracture predictability. It is notable that the central Ohio cherts from which the Welling points were manufactured are already considered to be of ‘excellent’ (Upper Mercer, Flint Ridge, Plum Run) or ‘good’ quality for stone-tool production (DeRegnaucourt & Georgiady 1998: 48, 56, 76, 81). Thus, our results also indicate that the selectivity of Holocene foragers’ stone raw material was not limited to broad-quality scales ranging from bad to excellent; rather, they discerned differences in stone quality on a finer scale, ranging from excellent to near perfect. In toto, our results demonstrate that at Welling, foragers selected the ‘best of the best’, and they did it consistently over the course of the entire Holocene.
We envision three broad avenues of research emerging from our results that warrant further consideration. First, how did Holocene foragers distinguish excellent chert from ‘merely’ very good chert? It is possible that visual or tactile cues from the stone itself, or proprioceptive cues upon striking it, provided some clue to its quality (Clarke 1935; Callahan 1979; Binford & O'Connell 1984; Whittaker 1994; DeForrest 2006; Weedman Arthur 2018). Alternatively, modern flintknappers often describe hearing a ‘good strike’ vs a bad one, or ‘listening for cracks’ natural to the stone (Crabtree 1967; Patten 2009). Perhaps cherts of different qualities also produce distinct sounds upon striking, a hypothesis for which DeForrest (2006) has found quantitative support in both Burlington chert and Paiute Agate, although the ‘high’ and ‘low’ quality of each rock type were assigned subjectively. By systematically linking macroscopic, tactile or aural properties of different cherts to the objective methods used here to determine chert quality, researchers will be better situated to infer what types of informational cues about stone selection were being transmitted by prehistoric people, and seemingly fixed in their cultural repertoire, over the course of 9000 years (Lycett 2010, 2011, 2013).
A second potential avenue of investigation involves the increased variation of Upper Mercer chert quality manifested in the Early Woodland, relative to the Early Archaic period (Figure 4: right). This significant spike in variability may be due to sample size, although we record no such increase in variability in the Middle Archaic, which is also represented by a small sample size. We wonder whether the advent of horticulture in the Ohio region during the Early Woodland, and the concomitant increase in population size, sedentism and territoriality, prevented consistent access to the highest-quality Upper Mercer cherts, relative to the Early Archaic (Bamforth 1986; Andrefsky 1994; Manninen & Knutsson 2014; Smith 2015). Conversely, if future analyses show that different quality Upper Mercer cherts possess different colours, perhaps the increased sedentism and territoriality of the Early Woodland encouraged procurement of greater chert variability as a mechanism of social signalling or symbolism (Ellis 1989; Bamforth 2009; Speth et al. 2013). Alternatively, it is possible that other properties of toolstone, beyond fracture predictability, were preferred during the Early Woodland. Several researchers have shown that there are functional costs and benefits associated with different types of toolstone, including durability, sharpness and the like (Webb & Domanski 2008; Braun et al. 2009; Loendorf et al. 2018). Perhaps cherts of lower fracture predictability possessed other functional advantages desired by Early Woodland people.
A third avenue of research involves investigating whether the stone raw material selectivity apparent in the projectile points is present throughout the entire Welling assemblage. Alternatively, were sub-optimal raw materials selected for tool types with different production or functional constraints, such as endscrapers? Co-variation between tool type and subjective assessments of raw material quality has been archaeologically documented or asserted (e.g. Bamforth 1986, 2009; Andrefsky 1994). With the quantitative approach to stone raw material quality that we have developed here, we are well positioned to target future research on the Welling assemblage to address this important question.
Acknowledgements
This research was funded by a Kent State University Research Seed Grant awarded to J.C.W., M.R.B. and M.I.E. M.R.B. is financially supported by the Kent State University Biomedical Sciences (Biological Anthropology) Program. M.I.E. is supported by the Kent State University College of Arts and Sciences, and M.I.E. and B.B. are supported by the National Science Foundation (NSF Award ID: 1649395).
Supplementary material
To view supplementary material for this article, please visit https://doi.org/10.15184/aqy.2018.187