1. Introduction
If epistemic norms are collectively accepted evaluative standards governing epistemic judgments and procedures, then children become capable of understanding and enforcing epistemic norms during middle childhood (roughly, 6 to 10 years). In Western cultures, it is at this age that children are first introduced to the collectively accepted evaluative standards governing the ways in which scientists collect and evaluate observational data, as well as to something like the rules of evidence used in courts of law and historical tribunals attempting to determine exactly what happened in the past. In traditional small-scale cultures, it is at this age that children are first introduced to collectively accepted evaluative standards governing the epistemic dimensions of such activities as tracking game and planting crops.
The issue is not children learning how to do science or how to track game; children can socially learn many things from very early in childhood. But reflecting on the methods of data collection and evaluation – and evaluating those with socially normative standards – is something different. For this, children must come to understand and appreciate that things are done in a certain way – and not other ways – because participants have collectively agreed to do them in this way: one should believe X because one can justify X to others with reasons that are grounded in procedures that we collectively accept as valid and reliable generators of knowledge. And children must also recognize that whatever is based on collective agreement can be changed by collective agreement as well. Only when they understand these things can children fully participate in activities governed by epistemic norms.
Henderson (2020, this volume) argues that, in principle, an individual could create epistemic norms for herself, but that, in practice, humans tend to do this collectively. This may in some sense be true – a teenager could in principle create her own evaluative standards for collecting stamps, for instance – but that would be possible, I would argue, only if she had already experienced and understood other socially created evaluative standards – whose methods she then appropriated for her own individual use. Moreover, and even more important, I would argue that building up the major constituent concepts enabling children to fully participate in either public or individual epistemic norms, namely, beliefs and reasons for beliefs, is not just in practice but also in principle a social process. A human being who grew up on a desert island without any social intercourse would not come to understand beliefs, much less reasons for beliefs, and so would not have the conceptual building blocks for understanding and enforcing epistemic norms. Epistemic norms are social at least several layers down.
My goal in this paper is to provide an account of how human individuals get to the point where they can learn, understand, and operate effectively with collectively understood epistemic norms, for example, in scientific activities. The model relies on children's growing proficiency with three key concepts: beliefs, reasons for beliefs, and social norms in general (i.e., those governing behavior). The hypothesis is that all of these are constructed through, and only through, particular types of social interactions, namely, those structured by humans’ species-unique skills and motivations of shared intentionality. When children are exposed to epistemic norms in school, they begin to understand them through sharing experience with others via these three concepts.
2. Understanding beliefs
Beliefs would seem to be wholly private phenomena. Individuals have beliefs – in the privacy of their own minds, so to speak – and these may differ among individuals. But the process by which individuals construct a concept of belief is an inherently social process. That is, it is a social-interactive process in which individuals compare and coordinate their perspectives on things, and thereby come to realize the basic distinction between individual subjective perspectives (or beliefs), on the one hand, and an objective perspective that these may or may not match, on the other.
2.1. Evolutionary background
Surprising almost everyone, many studies over the last two decades have established beyond a reasonable doubt that humans’ closest primate relatives, the great apes, can track the epistemic states of others. They know when others can and cannot see things, when others can and cannot hear things, and when others do and do not know things (in the sense of knowledge by acquaintance, i.e., when others have or have not seen things in the immediate past). They also can identify in many contexts the goals that others are trying to achieve; for example, they distinguish what an individual was trying to do versus what it actually did in failed attempts and accidents. Putting these together, great apes are operating with a kind of perception-goal psychology (a limited version of a belief-desire psychology) in which they can predict the behavior of another individual based on what it perceives and what it wants (see Tomasello 2014, for a review of the relevant experimental evidence).
But apes only do this reliably in competition with others, when they are trying to predict the behavior of others as a means for outcompeting them. In such competitive contexts, individuals are trying to “read” the minds of others but they would prefer that those others not be able to read theirs (as this one-way reading gives them a competitive advantage). But at some point in human evolution (for current purposes the exact time does not matter, but let us just say a few hundred thousand years ago) the ecology of the human ape changed (for current purposes the exact change does not matter, but let us just say that other primate species started outcompeting them for their normal diet of fruits and leaves). The result was that humans had to find a new feeding niche, and that turned out to be foods that were only readily obtainable via collaboration with others. Such obligate collaborative foraging meant that human individuals were now interdependent with one another in much more immediate and urgent ways than were other apes, and so there was selective pressure to evolve new psychological mechanisms to meet the needs of this new socio-ecological situation.
Cognitively, what evolved were evolutionarily new skills and motivations of joint intentionality: the skills and motivations for forming with a partner a joint agent “we” that pursued collaborative goals. This created what we may call the dual-level structure of joint intentionality (or joint agency): the individuals had both a joint goal as well as their own individual roles. Epistemically, this meant that they participated together in acts of joint attention to a common focus (relevant to the joint goal), but at the same time knew that they each had their own individual perspectives on that joint focus. This type of interaction laid the foundation for the development of a notion of belief. In this type of interaction, the individual “triangulates” (to use Davidson's (2001) term) with another individual on some common outside entity, but at the same time realizes that the two of them have different perspectives on this common outside entity. Moll and Tomasello (2007) argue that, indeed, the whole idea of a perspective requires that individuals be focused on something in common, about which they have different views; otherwise, without a joint focus, they just see different things. As Davidson argues, this dual-level functioning provides the first glimmering, the first possibility, of the idea that individuals may have different perspectives on things.
Then came the second key step in human evolution: the cultural organization of social life. Again focusing on the cognitive side of things, human individuals at this point needed to coordinate their attention and knowledge with the whole panoply of other individuals of the same cultural group. To facilitate the process they came to construct special supra-individual social structures, namely, conventions, norms, and institutions, based on skills and motivations for collective intentionality. There thus emerged a sense of something in common to everyone intersubjectively: triangulation ramped up to the group level. But it was not just triangulation among all the members of a group, but rather, because individuals actually identified with the group – we are those who do things this way and not that way and who believe these things and not those things – the something in common meant independent of everyone's perspective, that is, not just all of us but anyone who would be one of us rational creatures (members of other groups not being rational). What this amounts to is a sense that there are “objective” states of affairs that are independent of what I, you, or any rational being might perceive or think on a particular occasion.
Why do chimpanzees not come to make this distinction between subjective perspectives (or beliefs) and the “objective” situation? The answer is that one cannot come to this distinction on one's own or in competition with others. One might think that a chimpanzee could on some occasion see a stick and approach it but then, on closer inspection, see that it was a snake. Why is this not enough to distinguish subjective perspectives (or beliefs) from the objective situation? While in principle it could be enough, apparently it is not, because while chimpanzees track epistemic states such as perception and knowledge by acquaintance, they do not understand false beliefs (Tomasello 2018). The reason is that, unlike in the chimpanzee example, the triangulation involved in human joint intentional interactions contrasts two subjective perspectives that are simultaneously available. If we are cooperating by coordinating our behavior and mental states, we must somehow coordinate our different views to achieve our joint goal. The chimpanzee on its own is just thinking “stick” until it gets closer and then it thinks “snake”; there is no need to coordinate differing perspectives.
Tomasello (2018) thus speculates that species in which individuals operate more or less independently of one another, or who only compete with one another, will never come to this dual-level perspective on things in which two or more individuals attempt to align or coordinate their perspectives as they work together toward a shared goal. A key point evolutionarily is that whereas in competition, individuals are reading the minds of their competitors against their will – when we are competing, I want to conceal my mental states from you – in cooperation and coordination individuals want their partner to read their minds – when we are cooperating and coordinating I do everything I can to display or advertise to you my mental states – to facilitate the process. And so arose cooperative communication in which human individuals made clear to their cooperative partner what they were thinking (Tomasello 2008). Thus, in joint attention, I do what I can to help you attend to what I am attending to – perhaps via cooperative communication – and you work toward this same goal as well. But, as emphasized by Sperber et al. (2010), in the context of such strong assumptions of cooperation I must also develop “epistemic vigilance” to protect against individuals who might exploit my cooperativeness by deceiving me. Perhaps paradoxically, it is only in highly cooperative social contexts, where altruism and truthfulness – and so gullibility – are the norm, that there arises the need to monitor one's social partners for attempts to bend the truth.
2.2. Ontogeny
Humans’ evolutionary heritage obviously plays a crucial role in the ontogeny of the concept of belief in individuals today; this is basically the maturational component of the ontogenetic process. But, in complex human phenomena such as an understanding of beliefs, maturation is never enough. Maturation provides a capacity that is realized in human psychology only as the individual exercises that capacity in interaction with others (Tomasello 2019).
From before their first birthdays, human infants are tracking the (non-perspectival) epistemic states of others: like apes, they know what others see or have seen, heard, etc. But at around the same time, that is, at around nine months of age, human infants also begin acting together with others to jointly attend to things. With joint attention we may say that the infant and partner understand themselves to be attending to the same thing together, but at the same time they understand that they are doing so from different perspectives; they are triangulating on it. As noted above, this manner of social engagement may be called the “dual-level structure” of joint intentionality because it simultaneously encompasses a shared focus of attention among partners on something as well as their individual perspectives on it.
In joint intentional interactions, partners are constantly attempting to align their goals and attention. The aligning of attention may happen as one individual simply follows into the attention of the other, and then they somehow acknowledge that they are now in joint attention (e.g., by a mutual look). But often one individual actively attempts to align attention with the other via referential communication. In the prototypical situation with infant and adult, one of the partners initiates things by offering an object to the other, or showing an object to the other, or pointing to some interesting event. In such acts, the communicator has a goal that the recipient attend to what he, the communicator, is already attending to; his (referential) goal is the aligning of their attention in joint attention. The recipient, if she accedes, goes from her own individual attention on something to jointly attending with her partner. The interpersonal negotiation thus involves each partner's sequential shifting from individual to joint attention, as either communicator or recipient. Unlike simply imagining what another person is seeing or attending to, with no attention to one's own seeing or attending, negotiating joint attention brings into focus the relation between the two. The two perspectives were not aligned at first, and to know that they are now aligned there must be at least some imagining of the content of both perspectives and their relationship. This would seem to require an executive level of cognitive functioning (a “bird's-eye-view”) in which the two perspectives may be compared in the same representational format to see if there is alignment or not.
Then, during the 1 to 3 year age period, children begin learning to communicate in the medium of a conventional language. Their earliest language is organized mainly at the level of the individual utterance, but by around 2.5 years of age they start to participate in relatively extended conversations in which partners take turns making comments about a mutually understood topic. Conversations in which the topic has been linguistically expressed thus involve joint attention on a new level: “joint attention to mental content”, defined as a shared focus on a mental construal of something, about which we express different perspectives or attitudes (O'Madagain and Tomasello 2019). The topic-comment structure of discourse may thus be seen as another instantiation of the dual-level structure of simultaneous sharedness and individuality: you make an utterance expressing some kind of mental content – “Look at that dog” – and I respond with a comment on the same mutually understood topic: “It's an Afghan”. You may then respond with “It's my sister's”. We are jointly attending to a topic, the dog, and we are expressing different attitudes and/or perspectives on it. It is this kind of triangulation in discourse that is the raw material from which young children discover that mental perspectives themselves – the mental content of conventional linguistic expressions – may be looked at from different perspectives.
Of special importance in constructing the notion of belief are children's conversations in which the topic is a proposition, that is, some kind of truth-bearing assertion such as That dog is sick, to which the reply may be No, it's not or You're wrong. In such exchanges, there is a linguistically expressed statement of fact and then the expression of some kind of conflicting attitudes (or perspectives) that are about the mental content of that statement of fact; and both cannot be right. Conversations of this type become adult-like only when the child can take an objective perspective and then assess the assertion with respect to the objective situation. A wealth of data suggests that in many activities this occurs at around 3 years of age; that is, it is at this age that children begin to understand things objectively, from the perspective of “anyone”. For example, they understand that some pieces of knowledge are possessed by everyone in the culture, even strangers (Liebal et al. 2013); they understand that everyone in the culture knows the same linguistic conventions (Diesendruck et al. 2010); they understand pedagogy to be conveying culturally general knowledge (Csibra and Gergely 2011); they normatively correct people who make false statements (Rakoczy and Tomasello 2009); and they enforce social norms and show other signs of understanding normativity, which applies to everyone in the culture alike (Schmidt and Tomasello 2012). Children's notion of an objective perspective, which enables discourse about the truth of propositions, emerges at around 3 years of age.
Coordination to a satisfactory conclusion in such discourse thus involves the coordination of three perspectives – yours, mine, and the objective perspective – and it often relies on the recognition, for example, that some perspectives are inaccurate, or that the different perspectives may not be incompatible after all because we are talking about different dogs or different criteria for sickness. This manner of functioning is also crucial to children's mastery of so-called propositional attitude constructions of the type “He believes the dog is sick” or “I hope the dog is not sick” (also called sentential complement constructions). In these constructions, the speaker formulates a proposition but embeds it within a propositional attitude such as “I think that …”. Diessel and Tomasello (2001) found that although 3-year-olds use such constructions, they mostly do so in very formulaic ways that do not require a conceptualization of mental states or perspectives (e.g., “I think it's raining” just means, for them, “Maybe it's raining”). It is not until they are 4 or 5 years of age that children understand the coordination of perspectives involved (i.e., the dog is objectively sick or not, and this is independent of the attitude about this fact that the speaker expresses in the main clause). With fully understood propositional attitude constructions, young children have a single representational format for expressing both the objective perspective as well as some subjective attitude about it (de Villiers 2000). The theoretical claim is thus that exchanges of perspective in linguistic discourse about truth-bearing propositions – made possible by the emergence of an objective perspective – begin at about 3 years of age and are crucial in children's coming to distinguish between the situation as it is objectively and the situation as different individuals subjectively believe it to be.
To summarize: apes only imagine or track epistemic states; they do not understand different perspectives on a common situation. This means that there is no possibility for a mismatch between a subjective perspective and the objective situation, and no coordinating of different perspectives into new understandings. These limitations arise because apes do not “triangulate” on situations by engaging with others in joint attention with the dual-level structure of sharedness (joint focus) and individuality (individual perspectives on that joint focus). Human infants are initially the same. Then they begin to engage in joint attention with others, relating the two perspectives involved in their common focus. But it takes much social and communicative interaction with others before they can construct an objective perspective and then coordinate their own perspective both with the other person's and with that objective perspective appropriately. These constructive processes are mainly realized in communicative interactions aimed at aligning perspectives in joint attention and in linguistic discourse involving joint attention to (conflicting) mental content.
3. Reasons for beliefs
When someone asserts a belief, they want to be believed. Most often they are believed (based on mutual assumptions of cooperation), but sometimes there is not enough trust on the recipient's part. In such cases the communicator gives reasons for the recipient to believe her assertion. In this context, a reason is typically some fact about the world on which both communicator and recipient can agree and which, if true, implies the truth of the original assertion. For example, if I assert “Penguins have feathers” and you disagree, I can simply say “But penguins are birds” which, given our common ground assumption that all birds have feathers, settles the issue (i.e., we each believe individually for logical or empirical reasons that assumption, and we know together in our common ground that we share that assumption). Or perhaps I tell you that I examined a penguin closely and found feathers, which grounds my assertion in our common ground trust in observation (i.e., we each individually trust observation and note together in our common ground that we do). Reasons for beliefs thus work because, and only because, they can be connected inferentially to that form of shared intentionality known as common ground (aka: mutual knowledge).
3.1. Evolutionary background
Traditionally, human reasoning was seen as an individual affair. But Mercier and Sperber (2011, 2017) have recently recast the process in terms of communication and discourse, specifically argumentative discourse in which individuals make explicit to others their reasons for believing something to be the case. (They contrast such explicit reason-giving with private processes of inference and thinking in general.) They focus on cases where interlocutors have differing interests, and so the communicator is attempting to convince the recipient of something that furthers her interests.
The proposal that human reasoning – in the sense of explicit reason-giving – has a social-communicative origin is almost certainly correct. But Mercier and Sperber's account tends to background the cooperative processes involved. An alternative evolutionary account that foregrounds these processes might go like this. The key social context for reason-giving discourse is joint or collective decision-making, as it occurred regularly in early humans’ collaborative activities. Thus, on a hunting trip, perhaps you think we should hunt for antelopes in this direction, and I think we would be better off going in that direction. To make your case, you make your reasoning more explicit in our conventional language by, for instance, noting that there is a watering hole to the south. I counter, also in language, by making explicit my reasoning that at this time of day it is likely that lions will be at the watering hole and so no antelopes will be there – and besides, here are some antelope tracks going to the north. You say these tracks look old, but I think that is because they were in the direct sunlight this morning and actually they are from around dawn or so. And on and on.
The normative dimension of the process arises precisely from the fact that we are collaborating in an interdependent fashion (i.e., we are engaged in an act of shared intentionality with a common goal and individual roles within it). Because we depend on one another for instrumental success, each of us has the right to demand that the other think together with us in effective ways. As Darwall (2006: 14) puts it:
It is only in certain contexts, say, when you and I are trying to work out what to believe together, that either of us has any standing to demand that one another reason logically.
Such cooperative argumentation, as we may call it, may be modeled in game theory as a Battle of the Sexes: our overall goal is collaborative – we will hunt together under all circumstances because otherwise there is zero hope of success – but within that cooperative framework we each argue our case. In this context, neither of us wants to convince the other of our view if we are in fact wrong about the location of antelopes; each would rather lose the argument and eat tonight than win the argument and go hungry. And so a key dimension of our cooperativeness is that we both have agreed ahead of time, so to speak, that we will go in the direction for which there are the “best” reasons. That is what being reasonable is all about.
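The game-theoretic framing can be made concrete with a small sketch. The payoff numbers below are illustrative assumptions, not values from the text: hunting apart yields nothing for either hunter, while hunting together yields something for both, with each hunter doing slightly better in the direction he or she argued for. Under these assumed payoffs, the only stable outcomes (pure-strategy Nash equilibria) are the two in which the hunters go the same way, which is why the argument concerns only which direction to take, never whether to stay together.

```python
from itertools import product

# Hypothetical payoffs (assumed for illustration; not given in the text).
# Each entry maps (hunter A's choice, hunter B's choice) to
# (payoff to A, payoff to B). Hunting apart yields nothing for either.
ACTIONS = ("south", "north")
PAYOFFS = {
    ("south", "south"): (2, 1),  # together, in the direction A argued for
    ("north", "north"): (1, 2),  # together, in the direction B argued for
    ("south", "north"): (0, 0),  # apart: no hope of success
    ("north", "south"): (0, 0),  # apart: no hope of success
}


def pure_nash_equilibria(actions, payoffs):
    """Return the action pairs from which neither hunter gains by deviating alone."""
    equilibria = []
    for a, b in product(actions, repeat=2):
        payoff_a, payoff_b = payoffs[(a, b)]
        # A cannot do better by switching while B's choice stays fixed.
        a_stays = all(payoffs[(alt, b)][0] <= payoff_a for alt in actions)
        # B cannot do better by switching while A's choice stays fixed.
        b_stays = all(payoffs[(a, alt)][1] <= payoff_b for alt in actions)
        if a_stays and b_stays:
            equilibria.append((a, b))
    return equilibria


EQUILIBRIA = pure_nash_equilibria(ACTIONS, PAYOFFS)
print(EQUILIBRIA)
```

The sketch captures only the coordination structure. It does not model the further point that each hunter would rather lose the argument than go hungry, which depends on where the antelopes actually are.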
An appeal to “best” reasons invokes what Sellars (1963) calls “common standards of correctness and relevance, which relate what I do think to what anyone ought to think”, that is, to collectively accepted epistemic norms. Our cooperative argumentation in the context of joint or collective decision-making is thus premised on a shared metric that we both use in determining which reasons are indeed “best”. For example, if we all know and accept ahead of time that if there are lions at the watering hole then there will be no antelopes there (what Toulmin (1958) would call the warrant for my reason), then my statement “There are lions at the watering hole this time of day” implies that there are no antelopes there (and so we do not need to search there). Overall, this ability to connect thoughts to other thoughts (both those of others and one's own) by various inferential relations – prototypically by providing explicit reasons and justifications – leads to a kind of interconnection among all of an individual's beliefs in a kind of holistic “web of beliefs”. The key points here are that arguing in this way (i) assumes a cooperative context in which we are less concerned with winning the argument than with getting the right answer (which makes us motivated to be “reasonable”), and (ii) relies on us both being able to make inferential connections to a set of background beliefs or valuations of evidence that we share in our common ground. These two processes of shared intentionality may be seen as, respectively, the motivational basis (a cooperative attitude of argumentation) and the cognitive basis (common ground beliefs or valuations) for cooperative reason-giving.
To be in an argument involving appeals to best reasons also means accepting as infrastructure certain “rules of the game”, namely, certain norms for arguing cooperatively. The early Greeks made explicit some of the most important of these norms of argumentation in Western culture, for example, the law of non-contradiction (a disputant cannot hold the same statement to be both true and false at the same time), and the law of identity (a disputant cannot change the identity of A during the course of the argument). Even before the Greeks, we can imagine that individuals who, for example, held the same statement to be both true and false at the same time, were either ignored by others or else exhorted to argue “properly”. The cooperative infrastructure was thus decisive in determining what it means to reason at all. The natural world itself may be totally “is” – the antelopes are where they are – but the culturally embedded discourse processes by which we determine what that “is” in fact is – in the space of reasons, to use Sellars’ evocative phrase – are fraught with ought.
The capstone of all of this – recognized by all modern thinkers who take a socio-cultural view of human thinking – is the internalization of these various interpersonal processes of making things explicit into individual rational thinking or reasoning. Making things explicit to facilitate the comprehension of a recipient leads the communicator to simulate, before actually producing an utterance, how his planned communicative act might be comprehended – perhaps in a kind of inner dialogue. Making things explicit to persuade someone in an argument leads the disputant to simulate ahead of time how a potential opponent might counter his argument, and so to make ready, in thought, an interconnected set of reasons and justifications – again, perhaps, in a kind of inner dialogue. As Brandom (1994: 590–1) describes the process:
The conceptual contents employed in monological reasoning … are parasitic on and intelligible only in terms of the sort of content conferred by dialogical reasoning, in which the issue of what follows from what essentially involves assessments from the different social perspectives of scorekeeping interlocutors with different background commitments.
The norms of human reasoning are thus at least implicitly agreed upon in the dyad or the community, and individuals follow them in making assertions and in providing reasons and justifications as ways of convincing “any rational person”. Human reasoning, even when it is done internally with the self, is therefore shot through and through with a kind of shared intentionality, indeed normativity, in which the individual regulates her actions and thinking based on the group's normative conventions and standards (which may, of course, be based on some “natural”, individual processes of categorization and inference).
3.2. Ontogeny
It is obvious that there must be at least some understanding of beliefs for there to be an understanding of reasons for beliefs. It is thus at around the time that children first start understanding something about differing beliefs that they begin to understand and operate with reasons in argumentative discourse.
First of all, there are two studies on how children comprehend and respond to reasons given to them. First, Mercier et al. (2014) found that 4- and 5-year-old children (and to a lesser degree 3-year-old children) could identify when someone attempted to give them a poor reason for believing something (a circular argument, e.g., the dog went this way because he went in this direction). Tellingly, they also found that children gave more credence to a proposal backed by a poor circular reason than to a proposal backed by no reason at all. Second, Schmidt et al. (2016) had a puppet approach and request resources from children at 3, 5, and 8 years of age. In all conditions the puppet gave a reason for requesting the resources, but in one case it was a personal reason – I want it – which, in this context, is not really a good reason. In three other conditions the puppet gave a much better reason – that is, one that fit this at least partially moral context – in one case need (I haven't eaten in a long while; I'm very hungry; I need some); in another case fairness (you have some and I have none; that's not fair); and in another case rule (the rule says that you have to share). The study found that 8-year-olds (but not the younger children) gave more items to the requesting puppet than to a neutral puppet only for the three good reasons, not for the selfish reason. The older age of reason appreciation in this study might be due to the fact that the puppet always gave a reason and then was compared with a puppet who gave no reason, and, following the study of Mercier et al. (2014), it might be that until this older age children think a poor reason is better than no reason at all. They get the idea of reason-giving, but are poor in its implementation.
In terms of actually producing reasons for others, Köymen et al. (2014) had pairs of 3- and 5-year-old children making joint decisions about where to place toy animals and other objects in a toy zoo. The question was whether they could produce reasons flexibly depending on the knowledge they shared with their partner in common ground (since convincing reasons bottom out in common ground beliefs or valuations of certain facts or epistemic procedures). The trick was that some of the toys represented things that children knew about (and both knew they both knew about them) and are conventionally found in a zoo. Other items were things that children might have known about but that are conventionally not found in a zoo. The 5-year-old children, and to a lesser degree the 3-year-old children, gave reasons differently in these two situations. For example, if the item to be placed was a polar bear, one child would simply point out the location of an area with ice and a frozen pond, which was sufficient because they both assumed in common ground that polar bears live on ice. But when the item was a toy piano there was little common ground to rely on relevant to their decision about its placement. They could not just point out a location and expect their partner to accept it. They had to give a reason for the connection between the piano and, for example, a place next to a bench: because people sit here and so could listen to the whole song. (See Köymen et al. (2016) for a demonstration in which children's common ground was experimentally manipulated.) Giving reasons appropriate to one's common ground with a partner demonstrates at least some understanding of how reasons function: they justify a belief by connecting it to other beliefs or valuations that are already mutually accepted (‘warrants’) in their store of common ground assumptions.
Relatedly, Köymen and Tomasello (2018) found that when each member of a pair of 5-year-olds was given different information, from sources of different reliability, they were nevertheless able to successfully come to an appropriate conclusion by comparing the reliabilities of the information. For example, if one child reported that Wugs eat rocks – and he knows this because he saw them doing it – the children were more likely to accept this as a fact than the partner's assertion that Wugs eat sand – and he knows this because someone told him that they had heard that. The important point here is that we are not talking about an absolute judgment of reliability of any particular source of evidence, but rather that children share a common ground understanding of a ranking of different sources of evidence (e.g., such that direct observation is more reliable than hearsay). Finally, further demonstrating their skills of collaborative reasoning, the 5-year-olds in these dyads often engaged in explicit “meta-talk” aimed at determining with more precision such things as the strength of evidence and so the validity of reasons (e.g., Did you see it clearly? More than once?).
It is thus during the 3 to 6 year age period that young children begin to get the idea of why and how to give reasons to others in arguing for certain beliefs. Following Piaget (1965 [1995]), we would stress once again the necessity of social interaction with others: without social interactants, especially peers, a developing child's thinking would be plagued by a kind of inertia toward her own parochial perspective (aka, childhood egocentrism). Thus, when talking to themselves, children typically have a hard time being coherent and consistent, contradicting themselves regularly; they need an interlocutor to keep them on track. With specific reference to older children, who have already internalized important aspects of the dialogic process, Kuhn (2015: 12) says that “the comparative merit of the dialogic form is that it inserts the missing interlocutor … to remedy the weakness … of … ignoring or dismissing opposing perspectives and restricting one's interpersonal exchanges to the echo chamber of one's own ideas”. While we humans can engage in some kinds of thinking on our own, the types that we consider rational and reasonable, the types that make sense, come out of our dialogic, perspective-shifting interactions with others. Such rational dialogue is fundamentally cooperative in nature because, at bottom, being reasonable means precisely being cooperative in one's epistemic interactions with others: all participants basically agree to yield to reason, as impersonal arbiter, as it were, when that is appropriate. Cooperative reasoning is thus an especially interesting and complex form of shared intentionality.
Unlike the case of beliefs, there are no empirical studies with children aimed specifically at discovering how they learn to operate with reasons. But on analogy with the way they learn about beliefs through perspective-shifting discourse, a plausible hypothesis is that children learn about reasons and reason-giving by having others address them with reasons. Perhaps of special importance are utterances with because-clauses, for example, “Penguins must have feathers because they are birds”. On analogy with propositional attitude constructions – which express as one thought both a fact and an attitude toward that fact – such utterances express as one thought both the belief and a reason for that belief (O'Madagain and Tomasello 2019).
To summarize: through their dialogical interactions with others involving joint attention to mental content, young children construct a notion of belief, and then, based on their cooperative motives in cooperative argumentation with others, they become able to justify their beliefs to others by providing reasons that are grounded in common ground beliefs and valuations that they all share. Reasons for beliefs thus involve another level of shared intentionality: assessing the validity of reasons based on how certain facts (e.g., “Penguins are birds”) connect to common ground assumptions (“All birds have feathers”), as well as on collective agreements on the relative ranking of procedures for determining facts (e.g., direct experience is better than hearsay) and on certain standards of argumentation (e.g., the law of non-contradiction).
4. Social norms
An understanding of beliefs and reasons for beliefs could, perhaps, lead an individual to create her own epistemic norms. But during this same developmental period children are also learning about collectively accepted norms, specifically collectively accepted norms of proper behavior. These have the same basic structure as epistemic norms – collectively created and accepted standards – and children learn these at around the same time they are learning about beliefs and reasons for beliefs.
Evolutionarily, great apes sometimes retaliate against those who have harmed them directly, but they do not punish or intervene against an individual who is harming a third party (Riedl et al. 2012). Human social norms, in contrast, take a thoroughly third-party perspective: they apply to everyone in the group alike – including myself as just one among many – since they express the group's expectations for how anyone who would be one of “us” should act, on pain of admonishment, punishment, or ostracism. Social norms are the paradigm social structure of group-level processes of collective intentionality, and children come to understand how they work as collectively accepted standards for individual behavior soon after 3 years of age.
According to Piaget (1932 [1965]), the reason children respect and follow social norms is obvious: because they respect the adults from whom they come. But, in addition, from as young as 3 years of age, children will actively intervene to sanction others for social norm violations. For example, Vaish et al. (2011) found that if a puppet begins attempting to destroy someone else's property, 3-year-olds will intervene by protesting to stop the transgression. Because the child herself is not being affected, this is not second-personal protest; she is not protesting how “you” are treating “me”. What she is protesting is a lack of conformity to the group-minded social norm for how one should treat others. This interpretation is bolstered by the observation that young children also intervene against individuals who violate mere conventions (albeit with less emotion; Hardecker et al. 2016). Thus, Rakoczy et al. (2008, 2010) found that if 3-year-olds learn that on this table we play the game this way (while on another table we play it differently), and then a puppet plays the game the wrong way for this table, children intervene and stop him, even though no harm is being done to anyone. The child is not defending either her own or any other individual's self-interest; the immediate goal is simply for the wayward actor to conform to the way “we” (should) do it.
Importantly, in all types of third-party intervention 3-year-olds quite often use generic normative language, as in “One can't do it like that” or even “That's wrong!” Such generic language suggests that the norm enforcer is not just acting as an individual expressing a personal opinion, but rather, as in the case of intentional pedagogy, as a kind of representative of the cultural group conveying impartial and objective knowledge (in this case about how “we” act). In principle, anyone in the culture may enforce social norms; in principle, anyone in the culture may be their target (perhaps within some demographic or contextual specifications); and, in principle, the standards themselves are objective (not subjective). Norm enforcers are thus, in effect, referring the violator to an objective world of values that he himself may consult to see that his behavior is wrong. Norm enforcement is thus not a personal act, but a group-minded act of collective intentionality – the goal is to bring others into line with how “we” do things – and 3-year-old children have begun to understand this.
And norm enforcement of this type is distinctly group-minded. Schmidt et al. (2012) used the basic norm enforcement experimental paradigm but with an in-group/out-group manipulation, as well as a moral/conventional manipulation. They found that 3-year-olds enforced moral norms equally often on both in-group and out-group violators, reinforcing the finding that children of this age see moral norms as universally applicable. But they enforced conventional norms selectively on in-group members because conventional norms apply only to “us”, who should know better (the so-called “black sheep” effect). Presumably, underlying this practice was an implicit understanding that our conventional norms were created by us for us, and so they are good for the group and its functioning, and this makes it a good thing, a legitimate group-minded thing, for me to enforce them on members of our group (whereas we do not care what out-group individuals do). Based on this rationale, it is not surprising that Vaish et al. (2016) found that 5-year-olds expressed approval of, and preferred to interact with, individuals who enforced social norms in the group over those who did not (even though they were acting somewhat aggressively), presumably because they saw such enforcement as a good thing that signals concern for the good of the group.
In their first 3 years of life children experience adults enforcing norms all the time, and so solely on the basis of their following and enforcing social norms (since even enforcing could, at least in some ways, come from adults), one could reasonably remain somewhat skeptical about the depth of 3-year-olds’ understanding. But by the time children are 5 years of age, such skepticism is no longer reasonable because children can now create and enforce their own social norms in novel (play) contexts with peers, without any form of adult guidance. This new level of competence is important for a full understanding of norms as social agreements that can at least potentially be changed by new and revised social agreements (which will be important in the acquisition of epistemic norms as well).
Several recent experiments have observed children in situations in which there are no established norms or rules, adult or otherwise, and so, for social control, they invent some for themselves. For example, in a recent experiment Göckeritz et al. (2014) exposed triads of 5-year-old children to a complex game apparatus and only told them that the goal was to have the balls come out the end into a bucket. If the children asked any questions about how the game was or should be played, the experimenter professed ignorance. In playing the game repeatedly there were certain recurrent obstacles that the children had to overcome (e.g., the balls kept falling out of a tube on the way). Children's reaction to these obstacles was not just to try to overcome them, but, over time, to create normative rules for how to do this. Thus, when it later came time to show naïve individuals how to play the game, the children did so using generic normative language, as in “You have to do it like this” or “It works like this”. They made up their own rules and then enforced them as normatively binding, suggesting an understanding that self-created rules are as authoritative as any other.
But it is possible that the children in this experiment actually thought about things in a slightly different way. Because the apparatus was presumably set up by adults ahead of time, the children might have thought that there were indeed proper rules for how one played with the apparatus, even though the adult in the room did not know them. Hardecker et al. (2017) thus simply gave groups of three 5-year-olds a set of “junk” objects (e.g., sticks and a box) and explicitly told them that there were no goals or rules; they should just play as they wished. If the children asked how they should play, the experimenter said that there were no rules. But even in this situation, the children invented their own rules (e.g., a rule about where one must stand in a throw-sticks-into-the-box game) and continued to pass them on to naïve peers in an explicitly normative fashion. They knew for certain that they had invented the rules out of whole cloth, and yet still they saw them as normative and binding for all who would play the game. This suggests a basic understanding of social norms as based on social agreement.
But 5-year-olds do not understand social norms in a totally adult-like way. There was another condition in this experiment. Hardecker et al. (2017) had adults teach children the exact same rules that the peers had just made up (in a yoked design), and, as expected, the children again enforced them on naïve others in an explicitly normative fashion. The difference between children's behavior in the peer and adult conditions was that when, later, a naïve individual balked at the rule (experimentally controlled), 5-year-olds were more flexible in changing the rules that they had made up themselves with peers, as opposed to the adult rules, perhaps suggesting that children at this age still view adult authority as at least a partial source of normative force. In contrast, 7-year-olds saw their own self-created rules and the adult-prescribed rules as equally rigid; normative force comes only from the social agreement to which individuals bind themselves, no matter who that is. (See Riggs and Young (2016) for some similar findings.) Five-year-old children are thus able to make up their own social norms, and from the beginning they enforce these norms on others using the normative language of should, must, and ought. By school age, they even have an adult-like understanding that social norms can be changed if everyone agrees to it, no matter their source.
With regard to our particular question here, it is clear that these are not epistemic norms but rather social norms for proper behavior in the social group or in the particular play context. But still, the fact that children can make up their own rules or norms in game contexts in interaction with peers, without adult guidance, indicates an understanding of the way that social norms work, and this sets them up for understanding other kinds of norms, specifically, institutionalized epistemic norms for how we gather and assess information and claims.
5. Putting it all together
One may attempt to teach epistemic norms to a 2-year-old (or to a chimpanzee), but the results will be disappointing because these creatures do not have the concepts necessary for taking advantage of the teaching. Our argument here is that young children can come to understand epistemic norms as they are exposed to them only after they have the prerequisite concepts of belief, reason for belief, and social norm. Our more contentious claim is that children come to these prerequisite concepts through, and only through, social interaction with others that is structured by the skills and motivations, first, of joint intentionality, and then second, of collective intentionality.
We may thus propose the following ontogenetic timeline. Young children begin to understand beliefs in a more or less adult-like way in terms of the distinction between subjective perspectives and the objective situation by around 4 years of age. During this same general age period they become capable of giving reasons for their beliefs (and understanding others’ reasons for their beliefs) in their cooperative argumentation with others. And by around 5 years of age, children understand social norms in general as collective agreements for how we in this group ought to behave in certain situations. These are the building blocks that will enable children to argue with others about what we ought to believe, bolstering their arguments with reasons that are connected to beliefs we all share, all in the context of collective agreements for how such arguments should go – how we all evaluate evidence, connect ideas logically, etc. – if these arguments are to be successful cooperative enterprises leading to the kinds of valid knowledge on which we may base effective action.
The concept of belief is constructed through shared attentional interactions, especially those involving cooperative communication, in which children attempt to relate the content of their perspective to that of others. Simulating others’ thoughts, or creating a theory of their thoughts, is simply not possible without such shared intentional interactions in which we have a common focus of attention but at the same time we have different perspectives on it. In the case of reason (for belief) as well, key to the constructive process is dual-level social interaction involving a common focus on some proposition, and our mutual apprehension of different attitudes about it – constrained by our mutually held beliefs and valuations in which valid reasons are grounded. It is only through such dialogue and mutually held beliefs that the notion of reason for belief can take root. Finally, through unknown social-interactive processes, children come to understand social norms as collective agreements that they can actually create themselves with peers. Children then somehow – we know nothing of the specific processes involved – put these three components together as they are exposed to epistemic norms in the context of science and other similar activities.
And so, while it may be true that as mature adult thinkers we may in some contexts create for ourselves epistemic norms in some solitary activity to guide and self-regulate the knowledge involved in an effective direction, that does not gainsay the fact that our ability to do this was made possible through various social-interactive processes in which the relating, matching, and coordinating of perspectives was needed. The kinds of rationality displayed in humans’ construction of and deference to epistemic norms come about through processes derived from humans’ species-unique abilities to engage with one another in acts of shared intentionality.