
More “us,” less “them”: An appeal for pluralism – and stand-alone computational theorizing – in our science of social groups

Published online by Cambridge University Press: 07 July 2022

David Pietraszewski*
Affiliation:
Center for Adaptive Rationality, Max Planck Institute for Human Development, Lentzeallee 94, 14195 Berlin, Germany davidpietraszewski@gmail.com https://www.mpib-berlin.mpg.de/en/staff/david-pietraszewski

Abstract

The target article is an appeal to allow explicit computational theorizing into the study of social groups. Some commentators took this proposal and ran with it, some had questions about it, and some were confused or even put off by it. But even the latter did not seem to outright disagree – they thought the proposal was mutually exclusive with some other enterprise, when in fact it is not. Unfortunately, scientists studying social groups have not yet avoided the threadbare trope of the blind men studying the different parts of the elephant: We see mutual exclusivity when we should see complementarity. I hope we can all take the next steps of examining how the different enterprises and approaches within our area of research might all fit together into a unified whole.

Type
Author's Response
Copyright
Copyright © The Author(s), 2022. Published by Cambridge University Press

R1. Impressions

The target article presents two arguments: (1) the study of social groups has not yet been explicit about what constitutes a group representation in the human mind and (2) the roles within triadic interaction types account plausibly describes a group representation within the bounds of conflict. No one really took issue with the first argument. Instead, most of the action surrounded the second. A few commentaries accused the triadic account of being just as slippery as past accounts, while more had questions about its explanatory scope and adequacy. Another handful ran with the account in exciting and interesting ways.

While I'm anxious to get to argument two, it's remarkable that no one took issue with argument one. We've been studying social groups for 100 years, yet there seems to be general agreement that this work has failed to produce a single non-metaphorical description of what constitutes a group representation in the mind.

So why aren't we inundated with plausible alternative accounts of what constitutes a group representation? Marvin Minsky captures the essence of the problem: “You can't look for something until you have the idea of it” (2011). Currently, psychology is largely intolerant of stand-alone computational theorizing, particularly in the absence of accompanying data, and is pinning its hopes on experimental effects that will – like iron filings tossed into a magnetic field – somehow all congeal around a theory of how the mind works (van Rooij & Baggio, 2021).

But this has never worked: Computational theories rarely, if ever, fall out of the data. Instead, they inform what data to look for in the first place (e.g., Chomsky, 1959, 1980; Gardner, 1985; Minsky, 1961; Wiener, 1948/1961). Indeed, the same lesson occurs across the history of science: Yes, you need data to arbitrate between theories. But the data themselves are not sufficient for guiding what questions to ask in the first place. For that, you need independent theory (Heisenberg, 1983; Kuhn, 1962/1970).

This brings us to the third and most important argument in the paper: Researchers studying groups must start tolerating stand-alone computational theorizing (theorizing about what information-processing problems exist and how they might be solved) – which includes concerning ourselves with how the mind solves the reduction problem of group representation (as the target article does). If we don't, then we will continue to (i) confuse experimental effects for computational theories (as I worry some of the commentators did), and (ii) limit the information-processing problems that we investigate to the narrow set that have obvious links to experimental effects or trivial solutions.

Finally, a number of other commentators picked up on the fact that a good computational theory suggests a large number of other information-processing problems that must also be solved. But this was sometimes treated as a problem with the theory, as opposed to being the point of the theory (see Box R1). For example, Ratner, Hamilton, and Brewer (Ratner et al.) chide me for leaving “vague how people reason about groups when they are not privy to observing behaviors,” deferring “the hard work to future directions.” But this is like criticizing a restaurant for only cooking the food and not eating it for you too. I agree that being specific about what the end-state mental representation is highlights a problem. Namely, how that representation can come to be. And I agree it is a hard problem (even when it is based on behaviors). But that's the point of the computational theory: Making the problem specific enough that we can begin to tackle it. This is exactly what others do in their commentaries (e.g., Leibo, Sasha Vezhnevets, Eckstein, Agapiou, & Duéñez-Guzmán [Leibo et al.]).

Box R1. Problems aren't a problem, they're the point.

Computational theories provide conceptual solutions: they turn abstract, vague notions into information-processing problems. Yet computational theories are not theories of how that problem is solved. Rather, they identify what problem we need, as a community of researchers, to solve. They are the analog of an engineer first analyzing a problem before coming up with a solution to it.

The point of presenting a computational theory is that it allows a larger community of researchers to conduct a task analysis of that theory (Marr, 1982; Minsky, 1974). A task analysis involves asking what additional information-processing problems must be solved for the computational theory to be executed in the real world. That is, the computational theory is broken down into more specific tasks and subtasks. (In the case of the computational theory of scene analysis for vision, for example, subtasks include depth perception, color constancy, object-feature binding, etc.)

Once a task analysis is conducted, researchers start to propose multiple, competing accounts of how these problems may be solved by the mind, and then begin to test for the existence of these solutions. (For example, one problem in color constancy is correcting for reflectance, which in turn requires representing if a surface is smooth or rough, and so on [e.g., Maloney & Brainard, 2010].) In short, the process has three steps: computational theory, then task analysis, then proposed and tested solutions.

Fundamentally, then, the target article is an appeal for theoretical diversity. Allowing for explicit, stand-alone computational theorizing does not displace other approaches, but complements them. The title of Wiles, Haslam, Steffens, and Jetten's (Wiles et al.) commentary captures it perfectly: “A computational model of the group needs a psychology of ‘us’ (not ‘them’).” I couldn't agree more.

R2. Responses to individual commentaries

R2.1. Confusions

Ratner et al. were far and away the most critical. They accuse me of not holding myself to the same standards by which I criticize others – which I agree would indeed be deeply unfair, if only it were true. Their argument is that my theory is intuitive because it is not based on empirical evidence: “Pietraszewski provided no empirical evidence to support his assertions. It is unclear why, for instance, he assumes that perceivers inherently view the behaviors in his primitives as evidence for intergroup behavior instead of a string of dyadic interpersonal behaviors. In Figure 3 he circumvents this ambiguity by labeling some positions in the diagram as ingroup and some as outgroup. However, this solution is as tautological as the container metaphor he chastises.”

There's a lot to unpack here. First, a false choice: either participants view interactions as intergroup behaviors or as a string of dyadic interpersonal behaviors. But the target article argues that intergroup behavior is a string of dyadic behaviors (contingent upon a prior dyadic event involving a third agent; a point revisited with Simandan). Second, tautology is not equivalent to a lack of computational adequacy, and I only claim the latter about the container metaphor. Third, I agree that labeling agents who occupy group-constitutive roles as members of the same group (as I do in Fig. 3) constitutes a tautology – which is why I say so on page 7. But that was the whole point of this section. Figures 3–5 present the theory's definition, and when one presents a definition one is necessarily presenting a tautology. (If it's not clear why, go look up the definition of a tautology.) Ratner et al. are confusing the process of defining something with the content of the definition itself, and they present no argument that the definition is itself tautological.

Ratner et al. further criticize me for not providing any empirical evidence. But this criticism misses the point. The target article is a generative theory that points out what evidence we should be looking for in the first place. Ratner et al. also suggest I'm being hypocritical by saying that past definitions of groups are intuitive while my own definition is also intuitive. But they're conflating different senses of intuitive. My concern is not with intuitive theories in the sense of not being based on direct experimental evidence, but with definitional content that requires already having an intuition about the entity being defined. For example, in, “A group exists when two or more people define themselves as members of it,” there is little in the definition that does not feed back onto the very notion that is supposed to be defined. It is this sense of intuitive that I take issue with, and it is the standard to which I am holding myself and others.

There's more to address in Ratner et al. (e.g., they accuse me of sleight of hand when in fact I'm putting my cards on the table) but space is limited, and you get the idea. But I do want to note that later in their commentary they strike a more conciliatory tone: “Pietraszewski's theorizing does not supplant existing work. … We believe that theoretical integration rather than competition between models of ‘groupness’ is the best path forward.”

I agree: The target article is a plea to allow explicit computational theorizing into the study of groups, not at the expense of other approaches, but to complement them. I suspect, then, that we have a case here of misunderstanding born out of interdisciplinarity. Or, as Ratner et al. put it, an integration of a Marr-ian style levels-of-analysis framework with social cognition. These misunderstandings can be seen as the growing pains of that integration.

Two authors, Levine and Philpot – without any apparent appreciation of the irony – next attack me for suggesting that whenever three people get together, the only thing that can happen is a conflict. If that were my theory, it would be a good candidate for the worst theory ever.

To be clear, the account I put forward does not predict that conflict is the only polyadic behavior that can occur. Rather, it articulates what constitutes group membership (to the mind) when polyadic conflict does occur. That is, a group representation is in part a representation of roles within polyadic conflict.

The source of misunderstanding may be that the information-processing machinery required to “see” groups is complex, and conflict is but one element of it. A different matter is whether any one group token out in the world is characterized by conflict alone (a distinction that seems to have also tripped up Ratner et al., Elad-Strenger & Kessler, and Thomsen). As an analogy, we can ask how the visual system represents the world. One element of the visual system is machinery for representing lines and edges. That such mechanisms exist is not undermined if we go out into the world and discover that a token of a scene (e.g., an elephant standing in a field) is not exclusively made up of lines and edges.
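To make the distinction between the representation and any particular group token concrete, here is a minimal sketch of what a roles-within-triadic-interaction-types representation might look like as a data structure. This is my own illustrative gloss, not the target article's formalism: the type names other than Alliance and Displacement, the role labels, and the toy same_group criterion are all placeholder assumptions.

```python
from dataclasses import dataclass
from enum import Enum

class TriadicType(Enum):
    # "Alliance" and "Displacement" are named in the commentaries; the
    # remaining labels are placeholders for the other triadic types.
    ALLIANCE = "alliance"
    DISPLACEMENT = "displacement"
    TYPE_3 = "type-3"
    TYPE_4 = "type-4"

@dataclass(frozen=True)
class TriadicEvent:
    """One observed or expected conflict event among three agents.

    The group representation is not any single event, but the pattern of
    role occupancy across such events: agents who are substitutable within
    group-constitutive roles are represented as same-group.
    """
    event_type: TriadicType
    actor: str        # agent imposing the cost
    target: str       # agent on whom the cost falls
    third_party: str  # agent whose prior dyadic involvement makes the event polyadic

def same_group(events, a, b):
    """Toy criterion: a and b are same-group if they are substitutable
    as actors against a common target across observed events."""
    targets_a = {e.target for e in events if e.actor == a}
    targets_b = {e.target for e in events if e.actor == b}
    return bool(targets_a & targets_b)

events = [
    TriadicEvent(TriadicType.ALLIANCE, actor="A", target="C", third_party="B"),
    TriadicEvent(TriadicType.ALLIANCE, actor="B", target="C", third_party="A"),
]
print(same_group(events, "A", "B"))  # True: A and B occupy the same role toward C
```

Nothing in this sketch requires that A, B, and C only ever fight; it specifies what, computationally, their group membership consists of when conflict does occur.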

I do agree with Levine and Philpot that conflict protects cooperation. The ethological work out of which the present framework emerged (e.g., Chase, 1985; Strayer & Noel, 1986) makes this very point, and the present account predicts when avoidance of conflict (and therefore conciliation and de-escalation) will in fact occur (see, e.g., Pietraszewski, 2016).

Levine and Philpot also worry that I have failed to mention existing work that acknowledges the relational elements of group membership – namely, the meta-contrast principle, which states that: “individuals tend to be categorized as a group…[when] the perceived differences between them are less than the perceived differences between them and other people (outgroups) in the comparative context.”

The meta-contrast principle is a lovely description of the categorization process itself, which is, as Bruner, Goodnow, and Austin (1956), Bruner (1957), and Taylor, Fiske, Etcoff, and Ruderman (1978) put it, the accentuation of between-category differences and the minimization of within-category differences. As such, the meta-contrast principle is a descriptive framework. It acknowledges that categorization is a function of context and who is around, but it is not – nor does it try to be – a theory of which particular contexts and which particular people will be categorized. To get that, you need a theory of what the categorization is for.

The target article argues that categorization is relational because ultimately what categorization is for is to predict who will take whose side in a conflict. So, I was less concerned with theories that acknowledge that categorization is relational, and more concerned with theories in which the relational property emerges out of the functioning of the system. That said, I agree the meta-contrast principle dovetails nicely with the present account, and is worth including as a way of conceptualizing (and giving a vocabulary for) the context and target specificity of categorization.

R2.2. Questions, including “what about X?”

Simandan finds much to like, but worries that delay, proximity, and loyalty/disloyalty need to be added to the pile of problems to be solved if the present account is to work. I agree. The point of presenting the target article is to provoke exactly what Simandan is doing here: decomposing the account into a number of subproblems. Simandan's deconstruction of the problem of loyalty is beautiful, and should be pushed much further.

Simandan also worries that we do not have a necessary and sufficient theory of groups in the context of conflict. What I meant by necessary and sufficient was whether a system that could produce the representations described verbally in the target article would have a representation of groups in the context of conflict. The argument is that it would. A different issue is whether the target article describes everything that you would need to put into a robot to get that robot executing the verbal description – and I would emphatically agree it does not (that's sort of the point). This is what I meant by computational adequacy. So I would put it this way: Simandan and I agree that the present account is not computationally adequate, and that an entire universe of the kinds of considerations Simandan presents needs to be brought to bear if we are to succeed.

Simandan (joining Delton and Moffett) also worries about relegating certain group attributes to “ancillary” status. For him, spatial proximity is just too important to be on that list (a point echoed by Moffett).

To clarify, I don't mean intrinsically ancillary, but ancillary with respect to the specific conflict representation in the target article. I'm even happy to stipulate that you might need some kind of proximity representation to get a complete, non-impaired group representation (Lewin's field theory comes to mind; see also Thomsen). But crucially, space as a concept is not sufficient to produce the kinds of inferences described in the target article (the inference is not that one is literally close; the inference is how agents will interact with one another in terms of costs and benefits).

Delton makes a similar point about cooperation, and I agree. Cooperation is not intrinsically ancillary to groups; it is only ancillary with respect to what counts as a group in the context of conflict.

Delton also worries I'm too harsh on past theories of obligation and interdependence (e.g., Balliet, Tybur, & Van Lange, 2017), and that I've got a few black boxes of my own (such as what counts as a “cost”). While I take the point, I would push back a little on this: The Balliet paper describes evolutionarily recurrent dynamics, but it's not yet a computational theory of something in the mind – and it is only along this dimension that I am evaluating it. I do think that it can be turned into a computational theory, and I even take a stab at that in the target article: describing it as a set of modifiers to the polyadic event types, rather than the event types themselves.

With respect to black boxing costs, I emphatically agree, and I'm glad somebody noticed. Theorizing that highlights additional problems to be solved is what we want (a bizarre thing about current psychological theorizing is a desire for descriptions of the mind that don't highlight problems). So, yes, by all means my account – like many others – depends intrinsically upon a psychology of representing costs (see also Wittmann, Faber, & Lamm [Wittmann et al.]), and in no way tackles that problem, aside from highlighting that it is a problem. My only other comment, though, is that this is not what the current paper is about: it is about group representations. But to the degree that we have precise theories that do depend on costs (as in this work, and in Delton's other work), we can have more precise theories (and dependent measures) of what constitutes a cost.

Greenburgh and Raihani are similarly constructive in pointing out that cues about what groups exist may be sparse and even hidden, which suggests the existence of additional mechanisms that have to guess about the existence of group-based intentions. Greenburgh & Raihani describe their work on paranoia as a window into how these systems work – including how they calibrate and even break. One notable highlight is that paranoia can produce delusions of the Alliance type, which hints at a possible research program in which the triadic interaction types can be used as dependent measures with which to study both normal and clinical-level paranoia.

The tension between the opacity of available cues and the need to predict the future also lies at the heart of Phillips's commentary, in which he is concerned that intentionality may be even more important than my triadic group roles.

While I agree that intentionality is important, I think he has things backwards: The function of the cognitive system is to predict agents' occupation of triadic group roles over phylogenetic and ontogenetic time. Intentions are representations that allow this system to represent if a one-time event is diagnostic of future events. Therefore, it doesn't make sense to say that intentions are more important than the triadic group roles – because what those intentionality representations are pointing to (i.e., the aboutness of the intentions) is whether those triadic group roles are likely to occur in the future. Perhaps he thought I was referring to behaviors and not mental representations? (Phillips also gets the rock-throwing example wrong: The claim is that all four classes of roles need to be seen or expected; not just one.)

I also think Phillips has it backwards when he treats intentionality representations as “thick,” and the triadic-roles event grammar as “thin.” The systems underwriting the triadic-roles event grammar constitute the bulk of the information-processing, whereas intentionality representations are icing-on-the-cake representations that switch some of those calculations on or off. They are placeholders for the fact that appearances can be deceiving.
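To illustrate the claim that intentions function as diagnosticity switches rather than as the core machinery, here is a minimal sketch; the update rule, learning rate, and numbers are purely illustrative assumptions, not anything from the target article.

```python
# Intentions as gates: the triadic-role machinery does the heavy lifting,
# while an intentionality flag merely determines whether a one-off event
# updates predictions of future role occupancy.

def update_role_expectation(prior: float, observed_attack: bool,
                            judged_intentional: bool, lr: float = 0.5) -> float:
    """Expected probability that the actor will occupy an attacker role
    toward this target in the future."""
    if not judged_intentional:
        return prior  # accidental harm: not diagnostic, so the update is switched off
    return prior + lr * (float(observed_attack) - prior)

print(update_role_expectation(0.1, True, judged_intentional=True))   # 0.55
print(update_role_expectation(0.1, True, judged_intentional=False))  # 0.1
```

The substance lives in the role-expectation representation itself; the intentionality judgment only toggles whether an observation counts.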

I liked Wiles et al.'s commentary, and not only because of the expletive. They helpfully point out additional problems that need to be solved – although they again seem to think this is a problem with my theory, as opposed to a problem posed by my theory (see Simandan and Delton, above). They highlight four problems, with which I emphatically agree (I also make similar points elsewhere; e.g., Pietraszewski, 2020a, 2020b).

First, members have to experience that some entity is a group (“accessibility”) and know in which contexts it is relevant (“fit”). Second, groups don't work if the members don't have an “internalized sense of group membership.” Third, “a satisfactory model requires an appreciation of …the norms that…dictate the different ways in which they [group members] should relate to one another.” Fourth, “that ‘who we are’ and what it means to be member of a given group varies across situations.”

Again, I agree with all of this. I suspect that Wiles et al. think that my robot would be a social imbecile waiting around for some “ossified” set of predetermined “external exigencies,” whereas their robot would “get it” and be just fine in the gritty realities of the police locker room. But I think my robot needs what they're describing (to be flexible and to actually engage with specific group tokens), and that their robot needs what I'm describing (to have a representation of the type [group-in-conflict] in the first place).

To be clear, subjective context dependence exists because of the problem of having to apply a static, objective inference (the roles within triadic interaction types). An analogy with danger is apt: Suppose that [X will lead to entropy] is the mental representation of what it means for X to be dangerous. To be useful, such a representation has to be “shuttled around” computationally (or “protected”) from circumstances under which it is not relevant. So if we wanted to build a robot that can see danger, it would need to have additional machinery for being “flexible” and “context-specific” about when X does and does not lead to entropy (a toaster is perfectly safe, but not when you stick a butter knife in it or bathe with it). The same applies to the roles within triadic interaction types account: Different group tokens are going to occupy group-constitutive roles under different circumstances.
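As a minimal sketch of this “shuttling,” consider the following; the harm table, names, and default rule are illustrative assumptions of my own, not the target article's machinery.

```python
# A fixed, objective rule ([X leads to entropy]) plus gating machinery that
# supplies the flexibility: the rule never changes, only its applicability.

HARM_CONDITIONS = {
    ("toaster", "making toast"): False,
    ("toaster", "knife inserted"): True,
    ("toaster", "in bathtub"): True,
}

def is_dangerous(obj: str, context: str) -> bool:
    # Default to safe when the (object, context) pair is not a known harm condition.
    return HARM_CONDITIONS.get((obj, context), False)

assert not is_dangerous("toaster", "making toast")
assert is_dangerous("toaster", "in bathtub")
```

The static representation does the explanatory work; the context gating is what makes its deployment look flexible and subjective from the outside.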

For this reason, I'm a bit skeptical about directly pitting objective against subjective, fixed against flexible. After all, objective and fixed information-processing (or developmental) rules cause our flexible, internal subjective states. So our task as scientists is to explain flexibility and subjectivity as outcomes of objective and fixed computational procedures.

R2.3. Clarifying levels of analysis

One way to think about all of this is with respect to the three different levels of reduction or levels of analysis at which one can describe the mind (see Pietraszewski & Wertz, 2021): The current framework adopts the middle functional level of analysis, in which there are only rule-governed mechanisms. The subjective, flexible experiences that Wiles et al. describe then live at the higher intentional level of analysis.

Wittmann et al. also make this levels-of-analysis distinction and go on to decompose the problems entailed by the target article into functions (somewhat) known to neuroscientists, noting that (i) “many component processes underlying social cognition are shared between social and non-social domains,” and (ii) motivations and emotions are important.

I agree, and have two things to say. First – and I don't think we disagree on this point – motivations and emotions can be understood at all three levels of analysis: We can have a mechanistic account of their computational logic (functional level), along with their neuroscientific (implementational) and subjective (intentional) descriptions. Second, Wittmann et al. suggest that “by employing a neurocomputational perspective, we gain more precise information on which aspects of group representations may be genuinely social.” I have no doubt that this is true. But I'd offer that such a thing is a stop along the way rather than the destination. By way of analogy with an automobile, it's helpful to know if turniness is separate from stoppiness. But that's not our final destination in reverse-engineering how a car works. So I'd put the enterprise somewhat differently: We have to specify the processes and procedures that make all of this group stuff work, both in terms of their abstract functional logic and their physical realization.

I worry there is an outright levels-of-analysis confusion lurking in Gelpi, Allidina, Hoyer, and Cunningham (Gelpi et al.); they seem to think that because I'm being concrete I'm both (1) adopting a “bottom-up approach” and (2) claiming that the triadic roles can only be inputs. As such, they think I'm being unnecessarily narrow and leaving out important “top-down” things like categorization, induction, and inference.

But they're wrong on all counts. First, what I'm describing here is the functional role semantics of a group representation, which is orthogonal to the top-down/bottom-up distinction. Second, I do think the group-constitutive roles can be inferential outputs in response to abstract category information, which is why I say so in the target article.

The only way I can make sense of Gelpi et al.'s comments is if they are confusing intentional and functional levels of analysis. They seem to be adopting a Fodorian view of the mind, in which concrete things are low-level inputs, and abstract things belong to the gooey-center homunculus (see Pietraszewski & Wertz, 2021), whereas in fact concrete things (in the context of a computational theory) are descriptions of everything the mind does at a functional level of analysis – gooey center included.

The funny thing is, I don't disagree with anything that Gelpi et al. are saying about categorization or induction. And what they are explaining to me about categorization – such as its being highly flexible – is something I show in my own empirical work. So my issue is with their inferences and argument, and that they seem to be equating categorization with induction. My claim is that the containment metaphor doesn't get you induction. I'm saying, “We don't have a theory of what induction is happening until we know what representations are internal to those induction processes.” Gelpi et al. are responding with, “But induction happens!” I know; the problem is how (mechanistically). They seem to use a description of a phenomenon-to-be-explained as an alternative to explaining how the phenomenon happens.

R2.4. Complementary approaches

Fog, Suchow, and Moffett likewise say things that are wonderful and that I agree with – but that they seem to think pose a problem for my theory (are we detecting a theme?).

Fog describes research exploring the attributes that make group tokens successful and enduring. This is great, but the target article isn't a theory of that. So I'm confused why he thinks the existence of this research poses a problem for my theory. In fact, the two approaches are complementary: The triadic framework articulates what conflict-related dynamics successful group tokens avoid within their own ranks and provides a way to measure “infighting” (i.e., to what degree group tokens are composed of many other smaller group tokens).

Suchow notes that agents may have diverging representations of what group memberships exist, which requires meta-representational machinery for representing what others are representing groupwise. He also notes that group representations are intersubjective – that representations need to be somewhat coordinated to get a group off the ground. I agree (similar points appear in the target article). So, if Suchow thinks I wasn't thinking this, I'm glad we could clarify. I also agree that recursive mentalizing is an important and under-studied aspect of social group cognition. I'd add that having more precise accounts of what cues lead to group representations in the first place gives us traction on such mentalizing procedures.

Moffett worries that I'm shoe-horning one particular meaning onto the word “group” – namely, cooperation. (And to think, Levine & Philpot were worried that I was only focusing on conflict!)

I agree that “group” is – like any word – a “suitcase” of meanings (as Minsky puts it), and I don't want to lose the suitcase. What about food groups, after all? I also agree that (i) pragmatic and context clues narrow down referents, (ii) that polysemy is a feature and not a bug, and (iii) thinking there's only one computational theory for a word is likely misguided. So, I view the enterprise here as the analog of pulling one sock out of the suitcase: but where I'm pulling out a type, not a token.

Moffett – echoing Barth, Sapir, and others – then suggests that large-scale social identities are phylogenetically recurrent entities that also deserve to be called groups, and that socially aligned groups or SAGs (his moniker for the target article's “groups”) are somewhat orthogonal to this. I agree and have made a similar point elsewhere (Pietraszewski & Schwartz, 2014).

My only point of disagreement with Moffett is when he suggests that certain group representations “needn't be built on anything more complex…than how we distinguish tiger from panda, with our fear of the other developing toward the former; whether our minds represent any collection of things as a group—humans included—isn't necessarily determined by calculations around whether, and how, they might cooperate.”

I worry that we're confusing experience with computation here. If I stand on the ledge of a tall building I experience fear. But that doesn't mean that I'm not calculating that a fall will lead to the entropic disordering of my body. (Or, if you prefer, natural selection “saw” this relationship and tuned the systems that comprise me to intrinsically fear this kind of situation; similar to what the target article called intrinsic ancillary cues.) So it's not clear why Moffett wants to jettison calculations about behavior. Does he think that any intact human would fail to understand that members of some feared outgroup would occupy group-constitutive roles? And what is the fear about anyway, if not possible cost infliction?

Oláh and Király similarly point out that there are reasons for the mind to attend to collectives (such as genders, sexes, languages, etc.) aside from inferences about who will take whose side in a conflict. Again, I agree and make similar points elsewhere (Pietraszewski & Schwartz, 2014). They propose calling collectives involving social interactions (of the cooperation/competition sorts; Moffett's SAGs) social groups, and collectives that warrant inferences (e.g., sex, language, etc.) social categories. Cikara (2021) does a lovely job of articulating this distinction, so I'll simply point to that paper, rather than trying to cram in something here.

What I will say about Oláh and Király's commentary is that it highlights the importance of distinguishing between two different enterprises within our science of groups. One describes the information-processing underwriting each collective studied under the rubric of “social groups.” The second describes the information-processing common and universal to the folk-notion “group.” The first requires a computationally adequate account of every group token – an array of computational types. The second requires explaining why people think there is an over-arching category “group,” and why they agree about a continuum of groupishness across group tokens. (This distinction is well captured by Ratner et al., when they mention Caporael and Lickel's group types and entitativity continua.)

The target article is concerned with the second enterprise: the representation(s) applicable to the type that captures intuitions about each token being more-or-less-a-group (i.e., the degree of “entitativity”). The argument is that certain collectives are conceivable as “groups” (either by scientists or laypeople) if they have attributes that make them probabilistically informative with respect to conflict expectations (i.e., the event grammar). It's also worth considering whether taxonomies like Lickel's – things like task groups, social categories, intimacy groups, and weak social relationships – describe a factor analysis of tokens, or a set of computational types. It's important work either way, but we shouldn't confuse the two. It would be nice to see work addressing this in the future.

Like Oláh and Király, Pun and Baron review relevant developmental work. It's wonderful, I agree with everything they say, and they're doing some of the best current developmental work related to “core” group inferences. So I have nothing more to say.

Likewise, Cikara is doing some of the best research on adults' social group representations and inferences, and I'm glad the latent structure learning work is reviewed in her commentary. However, I'm puzzled as to why she's pitting her computational model directly against the target article's computational theory, as they're two very different things.

Cikara's work shows that relative similarity of expressed opinions (how close you and X are, compared to Y) is linked to minimal group or coalitional or alliance-based inferences and motivations (as opposed to absolute similarity on opinions; i.e., how close you and X are). I'm a fan of this work, and it validates past theorizing. But I'm puzzled why Cikara seems to assert that this experimental effect – a computational model, which is a mathematical description of the relationship between dependent and independent measures – is any kind of computational theory of a group representation. I also don't understand why she implies that ancillary cues can't be relative. I think they often are, and have been explicit on this point both in the past (e.g., Pietraszewski, 2013, 2020b) and also in the target article.

Finally, in Cikara's effect, similarity (of both the relative and absolute kind) is specific to sharing an opinion (what we might call epistemic coordination), and is not similarity writ large. I bring this up because distinguishing between different kinds of similarity is crucial, as there are a number of similarity-based theories that simply don't work. For instance, Ratner et al. incorrectly argue that I “rely on perceivers inferring similar fate…when analyzing the group-constitutive roles” – when in fact not only do I not do this, I don't even think it works. If you look at the triadic interaction types within the target article, the agents who all share a common fate are those who are attacked. But who's attacked and who's in the same group clearly do not map onto one another. So group membership can't be isomorphic with shared fate.
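A toy example makes the non-mapping concrete. The event encoding below is my own illustration, not the target article's notation:

```python
# A quick check that "shared fate" (here: being attacked) and role-based
# group membership pick out different partitions of the same agents.
events = [
    ("A", "attacks", "B"),
    ("C", "attacks", "B"),  # A and C co-attack B, so they are role-substitutable
    ("B", "attacks", "A"),
    ("B", "attacks", "C"),
]

attacked = {target for (_, _, target) in events}
print(attacked)  # {'A', 'B', 'C'}: everyone eventually shares the fate of being attacked

co_attackers = {(a1, a2) for (a1, _, t1) in events for (a2, _, t2) in events
                if a1 != a2 and t1 == t2}
print(co_attackers)  # {('A', 'C'), ('C', 'A')}: only A and C are same-group
```

Shared fate picks out everyone; the role-based criterion partitions the agents into {A, C} versus B. The two notions come apart even in a three-agent world.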

R2.5. Describing computations

Other commentaries tackle the computations determining which triadic interactions are more or less likely to happen. These include Fischer, Levin, Rubenstein, Avrashi, Givon, and Oz (Fischer et al.), Redhead, Minocher, and Deffner (Redhead et al.), Qi, Vul, Schachner, and Powell (Qi et al.), and Radkani, Thomas, and Saxe (Radkani et al.).

The Fischer et al. commentary asks, in essence, what are the minimal numerical libraries needed to keep track of and produce social interactions? They offer a taxonomy covering all possible degrees-of-freedom for social interactions involving costs and benefits – one element of which is Fischer et al.'s specific notion of similarity: subjective Expected Relative Similarity, or SERS, the degree to which I think my behavior will be yoked to yours.

I view Fischer et al.'s taxonomy a bit like latitude and longitude coordinates: It describes all possible locations, and the target article describes a particular location. On this account, “group” is a special subset of a larger possibility space. That the target article's proposal fits within this larger possibility space (and that all possible triadic interactions fit within a single figure) is no small feat.

I agree with nearly everything Fischer et al. say, but I think they sell themselves short when they say “we do not assume that the applied model embodies a representation of the mind, but expect it to provide testable and valid hypotheses.” While they're being careful, I'm happy to speculate (someone has to!) that what Fischer et al. present is represented in the mind – if for no other reason than that it does describe a space of all possible social interactions involving costs and benefits.

Redhead et al. add a latent positive tie between the two agents not in conflict within each of the four triadic interaction types. This addition adds tractability from both a network science and an on-the-ground-measurement perspective, and opens the door for polyadic benefit-conferral frameworks (something colleagues and I are also starting to work on; e.g., Conroy-Beam, Ghezae, & Pietraszewski, 2021). This notion of ties is helpful in that it allows for dyadic and polyadic continuity, and starts to get at the nature of the underlying mental calculations.

Qi et al. suggest that the triadic framework can be reduced to dyadic welfare trade-off ratios (WTRs) – a representation of how much weight one agent places on another's welfare relative to their own – and therefore may be a more parsimonious group representation than what is presented in the target article. While I'm sympathetic, I worry this may be a case of what Dennett (1995) called greedy reductionism. Yes, you can reduce a computer program to 0s and 1s, but that doesn't mean the program is just 0s and 1s. In other words, we shouldn't confuse the ability to describe something in a particular language with the claim that the existence of the language is itself sufficient for the creation of what we just said. (If that were true, we should all stop typing and go home now.)

Here's one argument (of many) for why WTRs probably aren't sufficient: Many species have dyadic relationships. Far fewer have polyadic relationships. Yet both dyadic and polyadic relationships can be described in terms of WTRs. If WTRs are all that are necessary, then we shouldn't see a discrepancy between dyadic and polyadic capacities.
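Here is a minimal sketch of that argument. The WTR values and the intervention rule are illustrative assumptions; the point is only that identical dyadic values are compatible with both dyadic-only and polyadic architectures:

```python
# Two systems with the same dyadic WTR matrix. Whether third-party
# intervention (a polyadic behavior) occurs depends on machinery beyond
# the WTR values themselves, so the values underdetermine the architecture.

WTR = {("A", "B"): 0.8, ("A", "C"): 0.1,
       ("B", "A"): 0.8, ("B", "C"): 0.1,
       ("C", "A"): 0.0, ("C", "B"): 0.0}

def dyadic_only_response(observer, attacker, victim):
    # A purely dyadic system: third parties never enter others' conflicts,
    # no matter how high their WTR toward the victim.
    return "ignore"

def polyadic_response(observer, attacker, victim, threshold=0.5):
    # A polyadic system composes the very same WTR values into intervention.
    return "defend" if WTR[(observer, victim)] > threshold else "ignore"

print(dyadic_only_response("A", "C", "B"))  # ignore
print(polyadic_response("A", "C", "B"))     # defend
```

Same inputs, different behavioral repertoires: the discrepancy lives in the architecture that consumes the WTRs, not in the WTRs.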

That said, I have no doubt WTRs can describe the triadic interactions (with some exceptions; see Pietraszewski, 2016), if for no other reason than conflict on behalf of another is a benefit to that other (generally), and to impose a cost on another is, well, to impose a cost on another. WTRs are also likely both inputs and outputs to the triadic architecture – that is, WTRs can be calculated as a result of the triadic interactions, and are also values that can inform which agents occupy which roles (I say as much in the target article). But again, we shouldn't confuse the values that the polyadic interaction systems take in and generate as being the same thing as the representation itself.

If you want to argue instead that WTRs play out in a particular way in polyadic conflict (i.e., the group-constitutive roles within the triadic interactions), then you've redescribed the present account with different language, which is of course no alternative at all. The upshot of all of this is that WTRs are something different than the target article's roles within triadic interaction types.

The same comments apply to Radkani et al.'s proposal of a recursive utility calculus: It's a theory of values, not a theory of what gets done with those values, or of how those values coalesce into or map onto a representation of a “group,” so it's a category mistake to directly contrast the utility calculus with the proposal in the target article. Otherwise, I agree with pretty much everything else that Radkani et al. say. They note, for example, that groups are a special case of symmetry relationships, but not all relationships will be symmetric – and I agree (indeed, the more general asymmetric cases are covered in Pietraszewski, 2016; a similar observation is made by Ho, Rosenthal, Fox, Garry, Gopang, Rollins, Soliman, & Swain [Ho et al.]). I also agree that the scaling-up architecture may often get things wrong (a point also made by Boyer). I'm not sure they quite understood which elements of the group membership machinery I was claiming could be avoided within the scaling-up architecture, but I don't think either they or I have enough to go on to have that conversation here.

Leibo et al. explore how we might get the representations described in the target article into the head of an organism via reinforcement learning (RL) – (thankfully) not using RL as an explanation, but as a starting framework for specifying the things you would need to put in the system (or see in the environment) to get the necessary representations (and motivations).

I like the commentary so much that I only have a few things to say. First, I appreciate that they point out that RL does not imply a blank-slate approach. Second, it's heartening to see that the reinforcement learning methods proposed take the credit assignment problem seriously (if you haven't tackled credit assignment, then you don't really have an RL solution). Leibo et al. in particular describe a hierarchical structure featuring a manager (who would get rewards from the world) and a worker (for whom the manager creates intrinsic rewards, and who works on more immediate tasks and time-frames). The decoupling between the manager and worker allows for longer time-span credit assignment, among other things. I think this is all great, and also a place for a learnability or task analysis of the environment, where one can ask things like: What outcomes would the manager be exposed to, in principle? What errors or mistakes would it need to protect the learner from? I also agree that it will be helpful to apply these methods to both real-world and artificial toy-world scenarios, and then have the two approaches meet in the middle. I hope the proposed research program comes to fruition.
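For readers unfamiliar with the manager/worker idea, here is a minimal sketch of the decoupling. Everything in it – the subgoals, reward shapes, and update rule – is an illustrative stand-in of my own, not Leibo et al.'s actual method:

```python
import random

def environment_reward(episode_outcome: bool) -> float:
    # Sparse reward from the world, visible on the manager's slow timescale.
    return 1.0 if episode_outcome else 0.0

class Manager:
    def __init__(self):
        self.value = 0.0
    def set_subgoal(self):
        # The manager decomposes the long-horizon task into subgoals.
        return random.choice(["approach_ally", "avoid_rival"])
    def intrinsic_reward(self, subgoal, worker_action):
        # Manager-defined reward: pay the worker for serving the subgoal.
        return 1.0 if worker_action == subgoal else 0.0
    def learn(self, env_reward, lr=0.1):
        # Slow-timescale credit assignment from sparse environment reward.
        self.value += lr * (env_reward - self.value)

class Worker:
    def act(self, subgoal):
        # Fast-timescale policy; here it just (noisily) pursues the subgoal.
        return subgoal if random.random() > 0.2 else "do_nothing"

manager, worker = Manager(), Worker()
for _ in range(5):
    subgoal = manager.set_subgoal()
    action = worker.act(subgoal)
    r_intrinsic = manager.intrinsic_reward(subgoal, action)
    manager.learn(environment_reward(r_intrinsic > 0))
print(round(manager.value, 3))
```

The point of the split is visible even in this toy: the worker only ever sees the manager's intrinsic reward, while the manager absorbs credit over episodes, which is what buys the longer time-span credit assignment.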

I also really liked DeDeo's commentary, not least because he suggested my framework may still be too folk psychological (!) in that it may be part of the “talking,” “public representation” kind of mental representation – the kind that explains things to others, but that doesn't necessarily do most of the work. DeDeo also mentions sparse coding, a way for a system to determine points of maximum inferential leverage at different levels of abstraction. (Essentially, if you and your spouse produce nearly identical social inferences for a third party, then that third party's mind might chunk you and your spouse together for the purposes of making calculations about the social world.) This suggests that my overly granular “atomic-compositional” framework may be better thought of as a population of representations that “mix different levels” of granularity.

I agree with all of this, and have been thinking along similar lines. But to clarify on the first point: The current framework suggests many group summary representations will be private and not communicable (which also means there are far more group representations than first-person phenomenology would suggest). So I might reframe it this way: There are hierarchical and nested sets of representations that chunk the triadic state space in precise ways. This machinery is not hooked up to the “talky” parts of the mind. And the talky parts have a simpler scale-up homogenization architecture, so as to not contaminate the actual computational, precise system.

With respect to sparse coding, I would put it that the atomic system spawns sparse coding representations, which are time-consuming, but once made can be used quickly. So the two are not mutually exclusive. Sparse coding produces efficient “chunked” representations that explain a lot without having to repeat the calculations that made the chunk in the first place. I suspect probable events and behaviors get run through an off-line version of this triadic state-space inference machine (tuned according to current relationships, and modified according to changes to relationships and events). Relationships and events that do not qualitatively shift outcomes are then more readily chunked. In other words, the system is constantly asking “What's the least detailed representation that can be used to generate and predict behavior effectively?”
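DeDeo's chunking intuition can be sketched in a few lines. The inference vectors, tolerance, and chunking rule below are all illustrative assumptions:

```python
# Toy chunking: if two agents yield nearly identical social-inference
# vectors, represent them as one chunk and reuse the cached inference.

def chunk_agents(inferences, tol=0.05):
    """inferences: dict mapping agent -> tuple of inference values.
    Returns chunks (frozensets of agents) whose inference vectors agree
    to within tol on every dimension."""
    chunks = []  # list of (set_of_agents, reference_vector) pairs
    for agent, vec in inferences.items():
        for chunk, ref in chunks:
            if all(abs(a - b) <= tol for a, b in zip(vec, ref)):
                chunk.add(agent)  # close enough: fold into existing chunk
                break
        else:
            chunks.append(({agent}, vec))  # otherwise start a new chunk
    return [frozenset(c) for c, _ in chunks]

inferences = {
    "you":    (0.9, 0.1, 0.8),
    "spouse": (0.9, 0.1, 0.8),   # same predictions -> chunked with "you"
    "rival":  (0.1, 0.9, 0.2),
}
print(chunk_agents(inferences))  # [{'you', 'spouse'}, {'rival'}]
```

Once formed, the chunk's cached vector stands in for its members – the “compute once, reuse quickly” economy described above.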

R2.6. Additional (computational) considerations

Bryant and Bainbridge suggest that social interactions offer readily available cues – either by design, or by accident – for inferring group membership. They note work on auditory communication that suggests yelling and laughter (to take just two examples) have design features that make them surprisingly appropriate for making social inferences of the kind in the target article. This work suggests how the mind comes to wrest group membership representations from an opaque and cue-stingy environment, and is relevant to both the Leibo et al. and DeDeo commentaries.

Boyer's commentary is profound in two ways. First, he points out that intuitive, folk theories distort our scientific intuitions – which means that one of the best ways to improve science is to render their operations explicit. I couldn't agree more, and this is one of the themes of the target article.

The second point follows from the first: Boyer asks to what degree the representation [groups as agents] is ever veridical, particularly when (1) the scale of interaction goes beyond what the cognitive systems evolved to deal with (e.g., modern nation-states), and (2) when methodological individualism no longer holds (i.e., even if we knew everything about how the mind works, we would still have to capture higher-level social dynamics to understand and predict behavior).

Both points are germane and pressing. I would add that the scaling-up architecture probably also gets things wrong even at relatively small scales (does Vanessa really hate Anne, or is that just a facile inference generated by the fact that Vanessa hates Rachel, and that Anne and Rachel are good friends?). Furthermore, because everyone shares the same scaling-up architecture, these inaccuracies can become self-fulfilling (Did “America” really want to go to war with North Vietnam? Probably not. Yet it did). The outputs of the scaling-up architecture are also likely easier to coordinate around than the messiness of reality – because the outputs are for communication and coordination (see also DeDeo). This means that we constantly over-homogenize collectives and their relationships with one another. As much as our group psychology allows us to “see” groups, it blinds us to the many realities underlying them.

Tatone points out what he calls the arbitration problem: that there are often competing interpretations (or frames) for understanding any particular event – triadic interaction types included. For instance, Displacement may reflect a dominance hierarchy and not a shared group membership. Therefore, computational systems must exist for arbitrating between which event framing is correct (or should be favored).

I agree and would note that what Tatone is expressing here is also mirrored in the commentary of Radkani et al., among others, which is that group relationships are a special class of relationships that determine what kind of triadic interactions are likely. However, they are not the only relationships that do so – and we need theories of all the relationships that determine behavior within triadic interactions. Tatone's frames are also the kind of latent representations that will be necessary in any theory of how group representations can come to be and remain accurate (as described by Leibo et al. or Simandan).

I'd add that I'm not sure that things always get arbitrated cleanly between different possible framings. Given that there is a plurality of systems within the mind, there needn't be a single-point decision between competing frames – particularly when the contents of the competing frames (e.g., dominance vs. group membership) are not themselves mutually exclusive out in the world. Such simultaneous perceptions may then account for why we can speak, for instance, of ethnic group memberships simultaneously being about group memberships and dominance relationships within a society.
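As a minimal sketch of such non-exclusive frame evaluation (the event features, scores, and threshold are illustrative assumptions of my own, not Tatone's proposal):

```python
# Competing framings of one event each receive a score, and multiple
# frames can remain active when neither excludes the other.
event = {"type": "displacement", "repeated": True, "asymmetric_power": True}

def frame_scores(e):
    dominance = 0.6 if e["asymmetric_power"] else 0.2
    group_membership = 0.5 if e["repeated"] else 0.2
    return {"dominance": dominance, "group_membership": group_membership}

active = {f: s for f, s in frame_scores(event).items() if s > 0.4}
print(active)  # both frames stay active; no single-point arbitration
```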

Thomsen suggests that containment, unity, and oneness is a computational theory of groups. I fear this reflects a confusion between types and tokens, and simply retraces what I say about intuition in the target article: “Intuition highlights what is variable about groups (who belongs to what group, and what individual tokens of groups exist) while blinding us to what is universal about group memberships (what constitutes a group, and what is done within cognitive architecture once a group is detected).”

In other words, while I agree with Thomsen on most issues, I think she is pushing us in the wrong direction here: rehashing the intuitive output of an essentialized “oneness” as the computational theory itself – rather than as the intuitive output that needs to be explained by the computational theory. To make just one point: There are an infinite number of ways to treat ourselves and each other as “one.” Only a vanishingly small subset generates coherent and functional outcomes. It is our job to specify this subset, not to let intuition silently eliminate all but the coherent and functional.

To be clear, I do agree that group tokens (intuitively) involve containment metaphors and oneness. But these do not describe what is done by the cognitive system(s) once group members are so contained or unified. The target article suggests that containment or oneness corresponds to agents being substitutable within particular event grammars or inference engines. And without this notion of what is done with containment or oneness, there is no computational theory. So Thomsen has a perfectly good intentional-level theory of what constitutes a group (i.e., what, at an intuitive level, the “person” represents), but not a functional-level theory (what, at a functional level, mechanisms represent; Pietraszewski & Wertz, 2021) – which is our concern in the target article.

R2.7. Real-world applications

Finally, both Allen and Richardson and Deminchuk and Mishra explore how the target article's roles within triadic interaction types account may inform real-world phenomena, such as riot contagion and polarization on social media. They find two elements of the account particularly appealing: that (i) identities emerge out of collective behaviors, and (ii) which identities emerge depends on how bystanders and uninvolved third parties become involved. These commentaries demonstrate how explicit computational theorizing provokes new research questions and ways of looking at old problems.

Allen and Richardson do worry that certain behaviors are not captured by the present account. But this concern amounts to what was already addressed in response to Levine and Philpot: To propose a theory of the conflict-related element of a group representation is not to claim that particular group tokens will be characterized only by conflict behaviors. This point – together with the distinctions between outright behaviors and internal representations, and between ancillary and direct cues (the latter of which can be present in different combinations even within the same group for different members) – addresses their concerns, which are similar to Simandan's in that they (helpfully) point out additional information-processing problems that have to be solved if the present account is to work.

R3. Conclusion: The study of groups needs more and better conceptual distinctions

In closing, I think everyone who studies social groups understands that we're not only dealing with fascinating science, we're dealing with matters of life and death. Right now, people all over the earth are cowering in fear as they are being bombed and shot at, hurt and abused as a direct result of our species' group psychology. So, as much as we have to get anything right in science, we have to get the psychology of social groups right. In this light, I'm more convinced than ever that what we scientists of social groups need most right now is not more data, nor better methods, but more and better conceptual distinctions. I am grateful to the commentators for starting a collective discussion about what these might be (see Box R2 for a summary). I hope the discussion continues.

Box R2. Conceptual distinctions and principles raised by the commentaries

Funding

This work was supported by funding from the Max Planck Society.

Conflict of interest

None.

References

Balliet, D., Tybur, J. M., & Van Lange, P. A. M. (2017). Functional interdependence theory: An evolutionary account of social situations. Personality and Social Psychology Review, 21, 361–388. https://doi.org/10.1177/1088868316657965
Bruner, J. S. (1957). On perceptual readiness. Psychological Review, 64, 123–152.
Bruner, J. S., Goodnow, J., & Austin, G. A. (1956). A study of thinking. Wiley.
Chase, I. D. (1985). The sequential analysis of aggressive acts during hierarchy formation: An application of the “jigsaw puzzle” approach. Animal Behaviour, 33, 86–100.
Chomsky, N. (1959). A review of B. F. Skinner's Verbal behavior. Language, 35, 26–58.
Chomsky, N. (1980). Rules and representations. The Behavioral and Brain Sciences, 3, 1–61.
Cikara, M. (2021). Causes and consequences of coalitional cognition. Advances in Experimental Social Psychology.
Conroy-Beam, D., Ghezae, I., & Pietraszewski, D. (2021). A sufficiency test of the alliance hypothesis of race. Talk presented at Human Behavior and Evolution Society (Virtual).
Dennett, D. (1995). Darwin's dangerous idea: Evolution and the meanings of life. Simon & Schuster.
Gardner, H. (1985). The mind's new science. Basic Books.
Heisenberg, W. (1983). Encounters with Einstein: And other essays on people, places, and particles. Princeton University Press.
Kuhn, T. S. (1962/1970). The structure of scientific revolutions. University of Chicago Press.
Maloney, L. T., & Brainard, D. H. (2010). Color and material perception: Achievements and challenges. Journal of Vision, 10(9), 19.
Marr, D. (1982). Vision: A computational investigation into the human representation and processing of visual information. Henry Holt and Co.
Minsky, M. (1961). Steps toward artificial intelligence. Proceedings of the IRE, 49, 8–30.
Minsky, M. (1974). A framework for representing knowledge. Artificial Intelligence Memo No. 306.
Minsky, M. (2011). The society of mind, Fall 2011. MIT OpenCourseWare. Retrieved from https://www.youtube.com/watch?v=-pb3z2w9gDg&list=PLUl4u3cNGP61E-vNcDV0w5xpsIBYNJDkU
Pietraszewski, D. (2013). What is group psychology? Adaptations for mapping shared intentional stances. In M. Banaji & S. Gelman (Eds.), Navigating the social world: What infants, children, and other species can teach us (pp. 253–257). Oxford University Press.
Pietraszewski, D. (2016). How the mind sees coalitional and group conflict: The evolutionary invariances of coalitional conflict dynamics. Evolution and Human Behavior, 37, 470–480.
Pietraszewski, D. (2020a). The evolution of leadership: Leadership and followership as a solution to the problem of creating and executing successful coordination and cooperation enterprises. The Leadership Quarterly, 31, 101299.
Pietraszewski, D. (2020b). Intergroup processes: Principles from an evolutionary perspective. In P. Van Lange, E. T. Higgins, & A. W. Kruglanski (Eds.), Social psychology: Handbook of basic principles (pp. 373–391). Guilford.
Pietraszewski, D., & Schwartz, A. (2014). Evidence that accent is a dedicated dimension of social categorization, not a byproduct of coalitional categorization. Evolution and Human Behavior, 35, 51–57.
Pietraszewski, D., & Wertz, A. E. (2021). Why evolutionary psychology should abandon modularity. Perspectives on Psychological Science. doi: 10.1177/1745691621997113
Strayer, F. F., & Noel, J. M. (1986). The prosocial and antisocial functions of preschool aggression. In C. Zahn-Waxler, E. M. Cummings, & R. Iannotti (Eds.), Altruism and aggression: Biological and social origins (pp. 107–131). Cambridge University Press.
Taylor, S. E., Fiske, S. T., Etcoff, N. L., & Ruderman, A. J. (1978). Categorical and contextual bases of person memory and stereotyping. Journal of Personality and Social Psychology, 36, 778–793.
van Rooij, I., & Baggio, G. (2021). Theory before the test: How to build high-verisimilitude explanatory theories in psychological science. Perspectives on Psychological Science, 16(4), 682–697.
Wiener, N. (1948/1961). Cybernetics: Or control and communication in the animal and the machine (2nd ed.). MIT Press.