1. INTRODUCTION
In recent years Artificial Intelligence (AI) has found an increasing role in the creative arts, indicating a growing interest among researchers and artists to collectively explore the possibilities. The field of Creative AI broadly looks at how AI techniques can be applied to the design and generation of creative artefacts, or to support human creators in their creative practices.
This article considers a particular subset of AI technologies for music that enable real-time interactive improvisation, co-creation and performance to take place live between people and machines. We address a range of theoretical and methodological issues for designing and developing these systems. By doing so we hope to support the development of a stronger design framework for this exciting subfield of AI for creative applications.
Through our interactions with these systems, our perception of them shifts from being tools or instruments that passively support human creativity, to a new kind of active creative partner. In this mode, the machine initiates new creative ideas, supports the human performer’s creative practice in meaningful ways, and develops and adapts to the individual mode and style of artistic creation ‘in the moment’ with a specific human partner. While mechanical and computational tools have, in the past, supported these concepts individually, we argue that it is the combination of these modes of engagement that leads to the perception of the machine as an ‘intelligent’ creative partner.
Our interest in this class of real-time AI systems for creative collaboration – whose design requires combining artistic challenges and scientific goals – has several motivations:
Performance and improvisation are among the most challenging creative activities undertaken by humans. Doing them successfully requires a great deal of proficiency and virtuosity, which typically takes many years of practice and experience before one can claim anything close to mastery (Ericsson, Krampe and Tesch-Römer 1993; Gladwell 2008). For machines to operate successfully in this situation provides us with a clear challenge, one that is quite unlike that of automatically generating finished artworks, mimicking an existing style, or classifying creative outputs.
For the first time in history, technology can provide artists with an opportunity to improvise, perform and co-create with intelligent machines. Co-creative machines suggest a new kind of agency: one that enables interactions that are fundamentally new and different from our previous creative interactions with human-made tools. This opens up exciting opportunities for human–machine cooperation that have never before been possible.
The successful design of systems that have to create in real-time requires us to consider specific aspects of the human creative process in new ways. These include issues of teamwork, trust, cooperation, shared knowledge, social accountability, shifting goals and evaluation ‘in the moment’.
These challenges provide us with a reality check on the limitations and affordances we require from AI to support and understand human creative processes. Moreover, they operate in contexts which are highly familiar to audiences, artists and performers alike.
1.1. A practice-driven approach to creative AI
Artists have long used technologies to help them create, whether that is the pen, the piano or the computer. While technology often plays an important role, in our view the artistic work’s meaning and relevance is foremost, regardless of the technology, methodology or tool being used. This means that it is important to begin with the artist’s creative goals and understand how technology can assist them rather than the other way around. For creative works to have sustained appeal, they should be driven primarily by artistic, not technological goals: what can be termed a practice-driven approach.
Using the latest AI techniques and demonstrating their effectiveness by generating outputs is insufficient without understanding the creative intention of human artists and the audience expectations built around them (O’Hear 1995; d’Inverno and McCormack 2015; Still and d’Inverno 2016; Yee-King and d’Inverno 2016). We therefore consider creativity to be determined not only by examining the final output, but also through the process and experience of making it.
Despite AI’s enormous creative potential, many remain sceptical or wary of its ability to play any useful role in human creative practice or shed new light on our creativity beyond surface mimicry (O’Hear 1995). Issues for current creative AI research include:
over-emphasising the final product rather than the underlying creative process;
developing techniques for single, specific individuals, performances or outcomes, that are not designed to elicit sustained interest or longer-term creative development and do not shed light on the design of collaborative AI systems in general;
using systems that cannot articulate or explain the decisions they have made, making understanding by both performers and audiences opaque and difficult.
To address these issues, we need a design process specifically oriented around collaboration and teamwork between human and machine performers. Real-time co-creative systems provide the opportunity for artists to develop collaborations with a creative agency rather than just work with a tool. Such collaborations require sustained, multiple encounters where each participant – human and machine – learns about the other. These interactions aim to build trust and familiarity through each exchange, balanced with a mutual openness to go into unexplored territory.
1.2. Artistic and technological co-development
Artistic practice is not static: it develops and changes over time, driven by culture, economics, personal development, experience and technology (Gombrich 1995; Freeland 2001; Thornton 2009). We think the most successful outcomes are those where the boundaries of creative practice and AI system design are simultaneously expanded by interplay with each other (Figure 1). In this view, AI has multiple roles to play. One critical role is as a ‘cultural influencer’ where, for example, new creative practices that are unique to human–AI collaborations are adopted across different creative domains – a type of transformational creativity in Boden’s terminology (Boden 1991). These practices further inform the design of new AI systems, and the process repeats, leading to a symbiotic co-development between artistic and technological imperatives.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20200303134544353-0942:S1355771819000451:S1355771819000451_fig1.png?pub-status=live)
Figure 1. Creative practice and AI system design work best when they mutually inform each other in human–machine collaborations.
Beyond cultural change, there is evidence to suggest the field can have a wider societal relevance. In contemporary society, ‘creativity’ is a highly sought-after commodity for both business and education. Regular music learning, for example, is associated with increases in IQ and general academic achievement (Schellenberg 2004; Schellenberg and Weiss 2013) and creative thinking abilities have been posited as the key to overcoming decline and stagnation for new economies (Pink 2006).
In popular media, there is currently a huge interest in AI and what it might be able to achieve in the near future. But there is also anxiety about AI’s potential negative consequences, such as job redundancies, biased decision making and loss of privacy. Opportunities for the public to better understand the reality of current AI research can be assisted through familiar scenarios such as music performance and improvisation. These use-cases demonstrate people working productively with AI, rather than being subsumed, beaten or replaced by it. They emphasise the positive value of AI and how it can contribute to a richer culture.
1.3. Overview of the article
We now have the opportunity to improvise, perform and co-create with intelligent machines. We think that these kinds of creative interactions are fundamentally new, due to the increased scope for agency and autonomy that computing brings over previous human-made tools (Bown and McCormack 2010). We will examine the relationships between improvisation, interaction and co-creation with an AI system in detail, specifically targeting AI techniques that support performing and co-creating with a creative human musical partner.
Section 2 provides an overview of Creative AI systems, illustrating the range of applications actively researched. This section also looks at some of the criticisms of Creative AI from creative arts and critical perspectives. Following that, Section 3 sets out in more detail what we believe are the main theoretical and technological design issues that need to be addressed when designing collaborative AI systems. Drawing from research across psychology, musicology, improvisation theory and social informatics, we explore what can be learnt from these disciplines when designing AI systems that support real-time human collaboration with shared creative goals.
In the final section we discuss the important considerations for a methodological framework, assisting the field to develop in more coherent and informed ways.
2. BACKGROUND AND RELATED WORK
The application of AI methods for real-time musical performance and improvisation has become prominent of late, due to increased accessibility of open source, re-usable software components for analysis and classification using techniques such as Deep Learning, combined with the increase in readily available computer resources and large data sets. However, real-time music performance systems have existed for nearly half a century (Eigenfeldt 2007). Gifford, Knotts, McCormack, Kalonaris, Yee-King and d’Inverno (2018) undertook a survey of computational systems for music improvisation, and developed a taxonomy through a detailed examination of 23 indicative systems, covering all major approaches. Their key findings included the idea that system complexity had little influence on the perceived creative agency of the artificial improviser, and that the conceptualisation of the system as a creative partner dates back more than 30 years.
2.1. Artificial creativity
Many different AI techniques have been applied to tasks typically associated with human creativity, including the production of visual and moving image art (Avila and Bailey 2016), literature and music (Cope 1989), but also more broadly in areas such as games (Chen 2016) and game shows (Best 2013).
The development of artificial creative systems has generally been a bespoke and multifaceted effort driven by several competing objectives. These range from understanding human creativity, to assisting human creators through the practical generation of specific artefacts, to building systems that are deemed – to varying degrees – as autonomously creative. A recurring challenge for AI is ‘to beat the best human’ at some specific task or problem. But this approach seems counter to the goal of expanding and supporting all human creativity.
An alternative is to develop new models of interaction and co-creation that are designed to nurture and enhance the user’s creativity and creative practice. In this mode of AI-as-collaborator, human–machine collaborations have been shown to foster human creativity in specific contexts (Liapis, Yannakakis and Lopes 2016; McCormack, Gifford, Hutchings, Rodriguez, Yee-King and d’Inverno 2019). One approach is to parameterise systems for the generation and curation of output. In music composition, collaborators specify parameters such as the types of melodies and themes for generated compositions to be based on, time and key signatures to constrain the number of melodies, and so on (Morgan, Ackerman and Cassion 2018; Cassion, Ackerman, Loker and Palkki 2017). Higher levels of autonomy have also been seen in systems that make decisions based on metadata-level criteria. For example, the goal-awareness approach of Hantula and Linkol (2018) models its collaborators by measuring their musical potential in order to make decisions about how to carry on with the collaboration.
2.2. Beyond mimicry: collaborative co-creation
A recurring question for many creative AI systems, trained on an existing artistic corpus or style, is whether they have simply learned how to imitate an artist’s style but cannot go beyond it. An alternate approach is to assist with the generation of new artefacts rather than mimicry of existing artistic styles (Elgammal, Liu, Elhoseiny and Mazzone 2017). Researchers can imbue their systems with the basic aspects of music theory – such as the way some combinations of sounds share a particular meaning and trigger particular emotions – in order to provide a general tool that musicians can use to create soundtracks, rather than relying only on big data analysis of famous composers’ corpora or using machine learning methods to develop statistical models of musical genres (Hutchings 2018).
These kinds of systems have enabled successful creative collaborations with artists. Flow Machines (Ghedini, Pachet and Roy 2016), the technology behind Daddy’s Car (‘the first structured AI pop song’), is a prominent example of AI for musical co-creation. In such systems musical styles – including melodic phrases and harmonisation and progression patterns – are represented as computational objects, allowing users to explore, manipulate and change styles in order to create their own.
Research has emphasised the value in developing systems that focus on the creative process rather than just the artefact output (Iba 2010). Examples include giving more autonomy to the creative process by allowing systems to ‘inspire themselves’ as part of the artefact generation process (Colton 2012), or enabling them to write their own code (Cook, Colton and Gow 2017). The underlying rationale is that by allowing greater autonomy into the decision-making process, we (and therefore our AI systems) can better understand what aspects of the creative process lead to successful outcomes. Some of these interactions are criticised because the software ‘merely follows a random process’ to generate suggestions (Liapis et al. 2016), and it is actually the artist’s decisions that shape the space from which the software generates alternatives. Nonetheless, researchers claim these approaches increase the distance between developer intentions and the system’s creative process, possibly even uncovering novel and interesting ways of artefact generation, even if the resulting artefact is of poor quality, which is also quite common in human creative practices.
2.3. Evaluation
Creative activity is our great need, but criticism, self-criticism is the road to its release. (John Dewey 1930: 141)
To deem a system ‘creative’ and for it to improve and grow creatively, evaluation is necessary (Sawyer 2011). This may consist of machine reflection and self-criticism or may arise through mutual interplay and feedback of a human–machine partnership where the human is performing real-time aesthetic evaluation.
Agres, Forth and Wiggins (2016) classified creative systems according to the degree to which they can reflect about their own output, suggesting three main categories: (i) purely generative; (ii) with internal or external feedback; and (iii) with capabilities of reflection and self-reflection.
External feedback through human–machine interaction can be considered as negotiated evaluation of the creative partnership. Additionally, the primary mode of communication may in fact be via the creative medium itself. In human–human music improvisation it is through the music that communication and negotiation primarily take place. Whilst a plethora of extra-musical communication channels are involved, such as physical gestures, eye contact, and even verbal cues, these are often seen as secondary across jazz (Hagberg 2017), free (Nunn 1998) and electroacoustic (Nort 2018) improvisation genres. Similarly, interactive music systems such as Cypher (Rowe 1992), OSCAR (Beyls 1988), Voyager (Lewis 1999) and CIM (Brown, Gifford and Voltz 2013) privilege this mode of ‘performance-as-interface’ (Brown 2018) whether or not some additional parametric controls are exposed. Thus, human evaluation in the human–machine creative partnership can enter through the creative improvisation itself. The challenge is for the machine to successfully interpret and to act on it so as to improve.
User experience can form part of the evaluation of creative systems, and recent studies have supported a co-creational approach by placing emphasis on usability evaluation, specifically geared towards a subjective and experiential analysis rather than on task-based usability found in traditional HCI research (Brown, Nash and Mitchell 2017b). Accordingly, it is important to consider how human improvisations are evaluated. In iterative models of the creative process, evaluation plays a key role in the iterative cycle and the creator is often in a constant process of evaluation (Guilford 1967; Mumford, Baughman and Sager 2003; Sawyer 2011).
One important methodological consideration that has received much support from researchers is the use of subjective aesthetic assessment by ‘domain insiders’ as an instrument for evaluating improvisation (Stowell, Robertson, Bryan-Kinns and Plumbley 2009; O’Modhrain 2011; Linson, Dobbyn and Laney 2012). This method parallels similar arguments in creativity assessment theory (Amabile 1996; Eisenberg and Thompson 2003).
In designing AI collaborators, it may be that the designer themselves plays the role of expert judge. For example, in the context of designing Digital Musical Instruments (DMIs), Jordà and Mealla comment:
[Just as] much research in HCI culminates in lists of guidelines and/or principles for design (and/or evaluation of design) based on research or practical experience relating to how people learn and work, it comes as no surprise that the first tentative NIME design frameworks have been mostly proposed by experienced digital luthiers. (Jordà and Mealla 2014: 234)
As all creative domains have some culture of critique, and corresponding critical values, embedding criticism in AI design seems an obvious fit. However, making those values explicit remains challenging. As Eisenberg and Thompson note, ‘the assessment of improvised music is particularly mysterious’ (Eisenberg and Thompson 2003: 287). Consequently, implicit approaches leveraging the subjective real-time evaluation of a ‘human-in-the-loop’ are fundamental to good system design.
2.4. AI-as-collaborator
There has been a significant amount of research into designing competent improvisational companions. The goal of these systems is to contribute creatively during performances in order to stimulate the human collaborator(s) into new expressions or creative territory. Approaches include the use of a predefined palette of subroutines (Lewis 2000; Gifford and Brown 2011; Brown, Gifford and Voltz 2017), grammars (Keller and Morrison 2007; McCormack 1996), or a combination of both (Kitani and Koike 2010). While these approaches require a human expert to design the subroutines or grammar rules, other approaches employ machine learning of performance patterns, which are then used during real-time performance to generate novel compositions (Biles 1994; Thom 1999; Assayag and Chemillier 2006; Weinberg, Raman and Mallikarjuna 2009). Collectively, these systems tend towards being bespoke to their designer(s), making general or sustained creative development difficult (cf. section 1.1). Our own projects have incorporated AI techniques for real-time co-creational applications, including Eden (McCormack 2009), Reflexive Looper (Pachet, Roy, Moreira and d’Inverno 2013) and Controlling Interactive Music (CIM) (Brown, Gifford and Voltz 2013).
2.5. Summary: towards a truly collaborative AI
The notion of AI-as-collaborator is being increasingly studied within the field of creative AI, becoming (in our view) a fruitful and productive approach as evidenced by the creative sophistication and audience interest in the results (Stocker 2019).
However, many efforts in creative AI are driven by the goal of attaining expert-level human performance rather than by understanding how a system can support and enhance human creative activity collaboratively. While this has driven important technological advances, in order to advance the concept of AI-as-collaborator we need to follow a practice-driven approach (section 1.1), which furthers human creative practice as its first and foremost design goal. Only then, by challenging, provoking, stimulating and pushing the process and experience of human creative activity, do new artistic achievements become possible.
Steps are now being taken towards this goal: researchers have begun moving away from approaches that simply mimic or parody existing artistic canons, focusing instead on the creative process rather than the output and favouring evaluation through self-reflection and ecological experience over traditional task-based measurements.
Improvisation, performance and co-creativity provide ideal activities in which to explore the role of machines as creative partners, that is, creative collaborators that are engaged with an artistic goal but that have the freedom to explore different possibilities to achieve it.
In the next section we explore some of the considerations that can assist in building models that foster an ongoing, sustained dialogue for real-time human–machine collaborations. It seems logical that human–machine partnerships can be better understood through the lens of how humans collaborate with one another. To this end, we study collaborations from various perspectives in the human context, mainly surrounding the role of teamwork (or lack thereof) within these interactions, and propose how these different perspectives can be applied to the design of AI collaborators.
3. DESIGN CONSIDERATIONS FOR COLLABORATIVE AI SYSTEMS
Here we present a set of design considerations for creating collaborative real-time AI systems. We take the position that collaboration between AIs and people shares characteristics with collaboration between people, and therefore that the nature and mechanisms of successful human teams can inform the design of AI collaborators.
We begin by considering the characteristics of a dysfunctional team, based on a widely used model from management studies put forward by Lencioni. We contextualise this model by imagining a perfectly dysfunctional band of musicians, then a perfectly dysfunctional AI collaborator.
Based on this knowledge of how teams can fail, we develop a set of considerations for people interested in designing AI systems that can operate successfully as team members in creative collaborations.
3.1. How do dysfunctional teams behave?
Lencioni, a theorist in management studies, identified five areas of dysfunction in teams: absence of trust, fear of conflict, lack of commitment, avoidance of accountability and inattention to results (Lencioni 2006). Imagine a band of musicians who exhibit all these problems. According to Lencioni, lack of trust leads to undesirable behaviours such as concealment of weaknesses. The drummer does not admit that they cannot play in 7/8 time and refuses to explain why they come in at the wrong time with the wrong beat. Fear of conflict leads to stagnant music because nobody is prepared to criticise anyone else’s playing, and therefore to encourage improvement. Lack of commitment leads to nobody taking the instrumental solo as they want to get through the performance with minimal risk. Avoiding accountability means nobody wants to admit why the music is stagnant, or why the drummer came in at the wrong time or why nobody took the solo. Finally, inattention to results leads to the band walking off stage at the end and none of them commenting on the performance and why it was so terrible, or why the audience left early. Even the success of the guitarist, who is very goal driven and who set fire to his guitar and destroyed his amplifier very effectively, is not commended by his team-mates in the van on the way home.
A dysfunctional band of musicians is one thing, but what would be the experience of interacting with a Lencioni dysfunctional AI collaborator? We shall consider a scenario in which a person is co-composing a piece of music with an AI. The AI should be able to provide feedback on the musical score and to suggest changes. What kind of suggestions might an untrustworthy AI make? It might heap positive praise on everything the human writes, or it might criticise without justification. If the AI fears or is programmed to avoid conflict, it avoids any critique or suggestion at all if it challenges the human musician. Since the AI does not contribute any real critique to the composition, it has no ‘skin in the game’, and therefore bears no responsibility or accountability for the results. Finally, the AI attends to all results with the same, positive response, so it does not recognise the significance of different results, be they good or bad music.
So what makes for a good team and how should we design collaborative AI systems, so they are good team members? We have explored the characteristics we should avoid, but let us now consider some other research into teamwork that explains mechanisms we can use to help us avoid these problems.
3.2. Designing for trust
A strong and productive human creative partnership is fundamentally founded on trust between the participants: performers trust their collaborators to share, contribute and participate towards a common creative goal. Yet this goal is not always formalised or articulated explicitly beforehand; rather, it is often negotiated during performance. This makes it very difficult to present it to an AI for consideration. An improvisational collaboration takes place ‘in the moment’ of performance – in real-time with little or no time for conscious reflection or planning – meaning that it becomes very easy to disrupt if performers are not all invested in an overall creative outcome. Hence, trust in an exemplary performance plays over many different relationships and levels: trust in each participant’s intention and creative virtuosity (Pachet 2012); trust that each articulation has a purpose in the overall performance; trust in the expectations of a response put ‘out there’ by another; trust in the understanding of the current creative direction of the work as it progresses; and trust that allows participants ‘to risk everything in the moment of performance’ (Waterman 2015: 59).
A lack of trust in an improvisational partner can foster fears of failure in performance, amplifying worries about one’s own ability to perform competently with others or undermining the willingness to act freely. On the other hand, performers must take risks that embolden them into new creative ideas and territories rather than remain in stasis or isolation during a performance or improvisation. Many see this negotiation between trust and risk within improvisation and collaborative performance as social interaction and accountability (Waterman 2015), which raises interesting questions on how to ensure trust between humans and creative machines. Certainly, trust in computing and automated systems is well studied, where similar issues of social interaction and accountability are acknowledged as fundamentally important (Lee and See 2004). Typically, however, social interactions with machines are asymmetrical, with an unbalanced awareness of each other’s behaviour and intentions (Deutsch 1960), making building the necessary trust problematic. As more complex AI systems become part of those interactions, this asymmetry may increase.
Trust can be attributed through direct observation of three layers of abstraction: performance (behaviour), process (underlying mechanisms) and purpose (system intent). Building trust with a non-human agent also requires calibration between a person’s expectations of the agent and the agent’s capabilities (Muir 1987; Lee and Moray 1994; Lee and See 2004). Exposing these three layers in any collaborative AI system can serve as a useful design goal, with the constraint of designing for an appropriate level of trust rather than trying to maximise it. An ‘appropriate’ level balances trust in the system with the necessary risk to push the interaction creatively.
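To make the idea of calibration concrete, the following is a minimal sketch rather than a proposed metric. It assumes, purely for illustration, that a performer's expectation of the agent and the agent's demonstrated capability can each be scored between 0 and 1 on the three layers; the class name `TrustCalibration`, the labels and the `tolerance` threshold are all hypothetical.

```python
from dataclasses import dataclass

# The three layers of abstraction named in the text.
LAYERS = ("performance", "process", "purpose")


@dataclass
class TrustCalibration:
    """Hypothetical comparison of expectation and capability per layer."""
    expected: dict  # performer's expectation of the agent, 0..1 per layer
    observed: dict  # agent's demonstrated capability, 0..1 per layer

    def gaps(self):
        # Positive gap: the performer expects more than the agent delivers.
        return {layer: self.expected[layer] - self.observed[layer]
                for layer in LAYERS}

    def status(self, tolerance=0.2):
        # The tolerance is an arbitrary illustrative threshold.
        result = {}
        for layer, gap in self.gaps().items():
            if gap > tolerance:
                result[layer] = "over-trust"
            elif gap < -tolerance:
                result[layer] = "under-trust"
            else:
                result[layer] = "calibrated"
        return result


cal = TrustCalibration(
    expected={"performance": 0.9, "process": 0.5, "purpose": 0.2},
    observed={"performance": 0.5, "process": 0.5, "purpose": 0.6},
)
print(cal.status())
```

The sketch aims for agreement between the two scores on each layer rather than for a high trust value, which mirrors the design constraint of seeking an appropriate level of trust instead of maximising it.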
Another complex factor in building trust is the use of anthropomorphisms. Overuse of anthropomorphic skeuomorphisms, such as human-like voices, facial expressions, or bodies, leads to false assumptions about underlying mechanisms and capabilities of the system, because we expect it to be more human-like than it really is. This parallels the ‘uncanny valley’ effect well known in animation, though originally discussed in the context of humanoid robotics (Mori, MacDorman and Kageki 2012).
While impressions of creative trust in a performative or improvisational context can be evaluated effectively, formalising them from an AI perspective is problematic. Hence, we see a measure of trust as a possible evaluative outcome of a human–machine co-creation, rather than something that is directly built into a design in an engineering sense. Some possible mechanisms for building trust in human–machine partnerships are outlined in the sections that follow.
3.3. Designing for team cognition
We can think of machine–human improvisation or co-creation as teamwork, which is a well-studied area in psychology. What, according to psychologists, are the factors that affect the success of teams and how might we consider them in the design process for collaborative AIs? In a meta-analysis of the cognitive underpinnings of effective teamwork, DeChurch and Mesmer-Magnus (2010) highlight the importance of team cognition. They identify two critical elements of team cognition: team mental models and transactive memory.
Team mental models are defined as ‘organized mental representations of the key elements within a team’s relevant environment that are shared across team members’ (Mohammed, Ferzandi and Hamilton 2010). Essential elements here are the roles and capabilities of the team members and the expected procedure for the work. Wegner defines transactive memory as a cognitively independent system for encoding, storing and retrieving information that combines the knowledge possessed by each individual with a shared awareness of who knows what (Wegner 1987). Trust will emerge only when team members are confident that there are shared models and shared information: these must exist, and the team must know that they exist.
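The structure of transactive memory described above, individual knowledge combined with a shared awareness of who knows what, can be sketched as a small data structure. This is an illustrative sketch only: the class `TransactiveMemory`, its method names and the example topics are hypothetical, and it is not a model proposed by Wegner.

```python
from collections import defaultdict


class TransactiveMemory:
    """Individual knowledge stores plus a shared 'who knows what' directory."""

    def __init__(self):
        self._knowledge = defaultdict(dict)  # member -> {topic: content}
        self._directory = defaultdict(set)   # topic -> members who hold it

    def encode(self, member, topic, content):
        """Store content with one member and record that fact in the directory."""
        self._knowledge[member][topic] = content
        self._directory[topic].add(member)

    def who_knows(self, topic):
        """Shared awareness: which members hold knowledge about a topic."""
        return sorted(self._directory.get(topic, set()))

    def retrieve(self, topic):
        """Retrieve a topic from every member the directory lists for it."""
        return {m: self._knowledge[m][topic] for m in self.who_knows(topic)}
```

For example, after `tm.encode("agent", "tempo", "beat tracker estimate")`, any team member can call `tm.who_knows("tempo")` to learn that the agent holds that information without holding it themselves.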
The ideas of shared dynamic models and a shared space for information storage (which is how one might paraphrase team mental models and transactive memory) are also apparent in the literature on music improvisation. In a review paper considering the literature relating specifically to free improvisation, Ng identifies the themes of situated, collaborative knowledge construction, and the real-time emergence of means of working towards shared goals (Ng 2019). It seems that musical interaction theorists take for granted the ability to share information and to be aware of others’ goals and capabilities, suggesting an implicit acceptance of shared mental models and shared information. The theory of social constructivism seems to neatly encompass shared models, shared information and the emergence of a meaningful, trusted collaboration (Berger and Luckmann 1967).
So how do we unlock this implicit information and expose it to team members in a human–AI collaboration? Lewis, who developed the Voyager computer music improviser, considered how computer systems might communicate their internal state to human band members during an improvisation. He stated that ‘the nature of the internal representation used by a system will be audible to the trained improviser based on the system’s performance and the improviser’s experiences with it’ (Lewis 1999: 106). Therefore the shared awareness required for successful teamwork does not necessarily need to be explicitly expressed, for example through visualisation or numerical display, as it will be apparent in the output the system is designed to generate (music in this case).
In summary, we are interested in systems which work alongside people to enhance, provoke and challenge their creativity. In real-time, interactive contexts, trust is a critical element of the interaction, and it is a complex phenomenon to describe and model. We do know that trust emerges when the collaborators are aware that they have shared goals and direction.
Psychologists interested in teamwork have described the mechanisms and content of these shared goals and how they are communicated, in the form of shared dynamic models of the team members’ roles and capabilities, shared awareness of the workflow they need to engage in to achieve the task at hand, and some sort of shared information space.
3.4. Designing for feedback and discussion during the creative process
As discussed in section 1, a common issue with computational creative systems is that their sole focus is on the end product. This reduces the interaction between a person and the AI to a request followed by an acceptance or rejection, rather like ordering something from an online shop, or Thorndike’s Stimulus-Response model of learning.
Producing an artefact is only one of many possible creative and artistic activities. People create various versions of their work, discuss the ideas, intentions and influences behind it, compare it to previous works and so on. These types of communication have been identified as an intrinsic component of creativity: ‘attempting to communicate a creative work often feeds back to fundamentally transform the creative work itself’ (Sawyer 2011: 110). Enabling systems with communication and dialogue capabilities is thus increasingly seen as paramount. This dialogue is also an enabler of trust, as opinions can be presented, justified and interrogated. Lencioni tells us that the opportunity to confront is critical to excellent teamwork, as is accountability, both of which are enabled by dialogical systems.
We can imagine an AI system that holds a useful mental model of its human collaborator and which can reason about that model and express that reasoning. Such a system would need to understand the domain of work, and the conventions governing how one operates in that domain. For example, imagine a musical bandleader demonstrating on the piano how the pianist should emphasise a particular phrase – that is quite normal. Now imagine an AI painter-collaborator that paints all over the section of the painting it thinks should be emphasised: the same kind of demonstration would violate the conventions of that domain.
Effective communication would enable systems to interact with human collaborators at new levels. A system that can communicate its own mental models and those it holds of its collaborators would engender increased trust from these collaborators (see section 3.3). As described by Morgan et al. (2018), ‘collaboration embraces flaws’, and two-way communication would also provide systems with the ability to reflect on conflicts, mistakes and past versions. In this kind of model, the systems would become active participants as opposed to only supporting their human collaborators. Finally, having the ability to explain their contributions would also mean giving collaborative AI systems the ability to argue for them and achieve a more balanced partnership.
3.5. Designing for agency, autonomy and reflection
Like intelligence, creativity has historically been difficult to define, and indeed, has had different definitions at different points across the history of the term’s use (Still and d’Inverno 2016). Considerations of agency, autonomy and reflection provide the designer with more straightforward ways to characterise the nature of the interaction between human and machine.
Recent work unpicks a distinction between the autonomous heroic lone agent that produces content without interaction with a human agent and the collaborative creative computing agent that interacts with a human user to create new kinds of behaviours and performances (d’Inverno and McCormack 2015). It outlines the differences between the two approaches in terms of the perspective of the human artist/performer, the aims and motivations of the human designer/software-engineer and the experiences of an audience. This approach draws from concepts of agency and autonomy (Luck and d’Inverno 1995) where non-autonomous agents serve the artistic motivations of another, and autonomous agents have their own motivations and so can generate their own artistic goals. Currently, questions about the level of an AI’s autonomy are primarily used in the characterisation of systems rather than in directing the creative process and generating artistic or performative ‘goals’, what we have termed ‘creative agency’ (Bown and McCormack 2009). Shifting the emphasis to both developing and supporting creative goals seems a more beneficial approach.
3.5.1. Reflection
We consider reflection to be the ability of an agent to look back upon the process and results of its collaborations. Currently, designs rarely consider the opportunity to grow and adapt through evaluation of, and reflection on, past successes or failures. The majority of creative systems are switched on to create something and then stopped (Cook and Colton 2018). We can connect this design flaw to one of the characteristics of dysfunctional teams – inattention to results – emphasising the importance of considering reflection in the design of collaborative AI systems.
One way to focus on process and encourage growth is to reconceptualise creative agent systems from a two-step, action–reaction model to agents as systems of continuous, cyclical coordination (Figure 2). The term ‘co-agency’ has been used to describe the bi-directional relationship of agents with cyclical coordination between the processes of ‘intentionality, (re)action and reflexivity’ (Glăveanu 2015: 258). This model highlights a limitation of current creative systems, in that they are typically excluded from establishing intention.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20200303134544353-0942:S1355771819000451:S1355771819000451_fig2.png?pub-status=live)
Figure 2. The internal processes that drive agency and autonomy in collaborative agents modify the ‘goal space’ which is expanded through divergent thought and reflected on through action and reference to knowledge. These reflections guide intentions to converge on specific goals, establish plans and act on them.
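The contrast between a two-step action–reaction model and a continuous cycle of intention, (re)action and reflection can be made concrete with a toy agent loop. This is a minimal sketch under stated assumptions: the class `CoAgent`, the weighted 'goal space', the feedback range and the update rule are all hypothetical illustrations, not the mechanism shown in Figure 2 or any published system.

```python
import random


class CoAgent:
    """Toy agent cycling through intention, (re)action and reflection."""

    def __init__(self, goal_space, seed=0):
        self.rng = random.Random(seed)
        # Every candidate goal starts with the same neutral weight.
        self.weights = {goal: 1.0 for goal in goal_space}
        self.history = []

    def intend(self):
        """Converge on one goal, favouring goals that reflection has reinforced."""
        goals = list(self.weights)
        return self.rng.choices(goals, weights=[self.weights[g] for g in goals])[0]

    def act(self, goal, partner_input):
        """Produce a response to the partner that is shaped by the current goal."""
        return f"{goal} in response to {partner_input}"

    def reflect(self, goal, feedback):
        """Adjust the goal weight by the feedback value; weights stay positive."""
        self.weights[goal] = max(0.1, self.weights[goal] + feedback)
        self.history.append((goal, feedback))

    def step(self, partner_input, evaluate):
        """One full cycle: intend, act, then reflect on the evaluated result."""
        goal = self.intend()
        action = self.act(goal, partner_input)
        self.reflect(goal, evaluate(action))
        return action
```

A purely reactive system would consist of `act` alone; here the `reflect` step feeds back into later calls to `intend`, so the agent's choices change over the course of the interaction rather than being fixed at the outset.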
3.5.2. Goals and intention
Both agency and autonomy pertain to goals, either in how they are formed, or how they are attempted and solved. Goals can only be formed with the knowledge the system has access to, and intention can only be established with the tools to form plans to reach these goals and commitment to carry them out (Cohen and Levesque 1990). Autonomy to develop new goals is bounded by knowledge. As such, exploring knowledge from different domains and translating it to a desired context is a fertile area for expanding the goals of creative systems and giving them the tools to act with intention (Pereira 2007).
Interesting translations of knowledge are typically discovered by people and used to establish the knowledge base and toolset of their systems. For example, DeepDream (Mordvintsev, Olah and Tyka 2015) and neural style transfer (Gatys, Ecker and Bethge 2016) repurpose neural networks trained for object recognition to generate interesting visual artefacts. AI systems with knowledge outside of the domain of application can help to expand the search for novel ideas and then narrow down those ideas by finding patterns and algorithms that have been useful or interesting in other domains.
Creating goals and establishing intention to act on them can be analysed through the concept of ‘Convergent and Divergent thinking’, where new ideas and possibilities are explored and then the best ideas are selected, focused on and developed. This concept has been used by psychologists to explain creative mental processes in individuals and groups (McCrae 1987; Runco 2014).
Divergent thinking is most commonly associated with creativity, with studies showing that creative professionals have increased mental activity on tasks associated with divergent thinking (Gibson, Folley and Park 2009). However, developing creative thoughts into tangible artefacts requires evaluation of novelty and realisation, making convergent thinking also essential in creative practice (Cropley 2006). Balancing the flow of converging and diverging states, building goals and intention to reach them, is a skill that creative people excel at, making it an important process to consider in the design of creative AI systems.
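The alternation between divergent and convergent states can be illustrated as a generate-then-select loop over short melodic motifs. This is a minimal sketch under stated assumptions: the functions `diverge`, `score` and `converge`, the C major pitch set and the scoring heuristic are hypothetical and stand in for whatever generation and evaluation methods a real system would use.

```python
import random

C_MAJOR = [60, 62, 64, 65, 67, 69, 71, 72]  # MIDI note numbers, one octave


def diverge(rng, n_ideas=20, length=5):
    """Divergent phase: generate many candidate motifs without judging them."""
    return [[rng.choice(C_MAJOR) for _ in range(length)] for _ in range(n_ideas)]


def score(motif):
    """Hypothetical evaluation: reward pitch variety, penalise large leaps."""
    variety = len(set(motif))
    leaps = sum(abs(a - b) > 5 for a, b in zip(motif, motif[1:]))
    return variety - leaps


def converge(ideas, keep=3):
    """Convergent phase: evaluate the candidates and keep the strongest few."""
    return sorted(ideas, key=score, reverse=True)[:keep]
```

Calling `converge(diverge(random.Random(1)))` returns the three highest-scoring motifs from twenty random candidates. In a collaborative setting, the selected motifs could seed the next divergent phase, so the two states alternate rather than running once.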
4. CONCLUSION
There is little doubt that AI and machine learning are changing the way that improvisation and performance are conceptualised and realised. We have undertaken an analysis of the role of AI systems that interact with humans in real time, on a moment-to-moment basis over a sustained period to produce creative content and encourage human creative development. We have synthesised a set of design considerations for researchers wishing to create AI systems that can collaborate with people. The key elements discussed are trust, team cognition, feedback and autonomous agency with reflection.
We are interested in these systems because they focus our attention on specific design challenges for the AI researcher and the creative practitioner: how and why would we interact with a machine? What is the payoff for the engineer, the artist co-creating with the system, and the audience? What can be generated that could not be generated without the active participation and creative agency of the machine? To what extent – if at all – does the audience need to know about the agency or design of the system in order to appreciate the content?
Throughout our discussions in this article we have tried to show that it is important to understand the experience of the human co-creator and their relationship to the artificial creative partner, so as to design systems that have genuine creative agency (unlike the piano or pen) that can be sustained beyond a single performance or fixed interaction. The considerations presented provide practitioners with a basis for analysing, describing and creating systems with individual merit, supporting a wide breadth of scientific and artistic aims whilst contributing to the field more generally. However, we recognise that this field is in its infancy. Many systems have been designed for a single performance or performer, rather than conceptualised as a creative collaborator that can build a sustained and maturing creative relationship like those which exist between successful human collaborators.
We have offered a set of design considerations, based around successful human collaborative teams, as simple but effective criteria from which to consider the design of collaborative AI systems. Our vision is for an expanded notion of creativity: one that supports new possibilities for the human creative artist through a productive partnership with AI, rather than trivialising, superseding or replacing human creativity. We look forward to seeing this vision unfold over the coming decades.
Acknowledgements
This research was supported by Australian Research Council grants DP160100166 and FT170100033. Many of us wish to acknowledge our gratitude to Arthur Still for introducing us to the work of John Dewey.