Introduction
The character of the defendant has attracted considerable attention in legal literature. Its relevance and significance for decisions on preventive means, on guilt and on sentencing have been widely debated.Footnote 1 These debates have been mirrored in legal developments and reforms in all these areas.Footnote 2
This paper discusses one narrow but persisting question: whether evidence of what is often called the defendant's ‘bad character’ should be admissible for the purpose of proving the defendant's criminal liability.Footnote 3 The focus will be on evidence of the defendant's previous misconduct, and more specifically, on evidence of previous commission of comparable crimes.Footnote 4 Against the background of indecisive literature,Footnote 5 the paper will identify new grounds for the continuing legal ambivalence about the evidence and will propose a way forward. Paying respects to Professor Mike Redmayne, whose work has always been a compass for understanding evidence law, the paper has been written with his last book, Character in the Criminal Trial, vividly in mind.Footnote 6
The paper begins with a brief account of the development of the law of ‘bad character evidence’. The transition from the common law regime of narrow categories of admissibility to the Criminal Justice Act 2003 regime of probative value is noted, and the ground is set for its examination.
The paper then proposes a new analysis of the legal ambivalence about ‘bad character evidence’. It is suggested that false convictions based on such evidence are profoundly tragic in the Aristotelian sense: the defendant had made a tragic error that put him in the claws of the gloomy recidivism statistics, but he nevertheless managed to stay away from crime against all odds – only to then be falsely convicted based on the very odds he had almost heroically beaten. It is further suggested that the common law's categories approach may have captured this aesthetic characteristic of the evidence.
Next, the paper examines the likelihood of profoundly tragic false convictions under a rule of admission and under a rule of exclusion. It is demonstrated that juries’ interpretations of ‘bad character evidence’ or its absence may affect the likelihood of tragic false convictions. However, we are deeply ignorant about the way juries interpret the evidence or its absence. The maximin rule for decision-making in conditions of deep ignorance is then applied, leading to the conclusion that exclusion should be preferred.
Last, the conclusions suggest how this analysis could affect the interpretation of the Criminal Justice Act 2003.
1. ‘Bad character evidence’ in the common law and in the Criminal Justice Act 2003
Since the early nineteenth century, the common law had been ambivalent about evidence of ‘bad character’. A rule of exclusion emerged,Footnote 7 which was initially associated with the probative value of the evidence and then with the possibility of prejudice.Footnote 8 Yet the common law never resorted to blanket exclusion. Rather, the rule of exclusion ‘was subject to exceptions from the moment of its birth’.Footnote 9 At first these took the form of categories of admissibility based on the purpose for which the evidence was introduced. Relying on a list devised by Cross, Redmayne notes that ‘bad character evidence’ was admissible for the purposes of ‘rebutting a defence of accident or involuntary conduct, rebutting the accused's plea of ignorance or mistake of fact; rebutting an innocent explanation of a particular act or of the possession of incriminating material; proving identity; and rebutting a defence of innocent association’.Footnote 10
Most of the categories had something in common. Redmayne suggests that apart from proving identity, all the categories allowed admission of ‘bad character evidence’ for the purpose of rebutting ‘confession and avoidance’ defences.Footnote 11 In ‘confession and avoidance’ the defendant admits facts that the prosecution seeks to prove, but she argues for the existence of additional facts that preclude liability. Significantly, in such cases the defendant bluntly and transparently fails herself. The defendant: (1) puts herself in a difficult position that allows the prosecution to easily connect her to the offence at issue; and (2) subsequently decides to make a partial confession, endorsing some of the circumstances of the offence and thereby completing a significant part of the task for the prosecution. Through her miscalculated moves the defendant thus puts herself on the defensive: she now needs to provide independent evidence to rebut the charges. In most of the common law categories, this transparent self-failing opened the door to the introduction of further prosecution evidence of the defendant's ‘bad character’. The significance of transparent self-failing for decisions about admissibility of ‘bad character evidence’ will be further discussed in section 3.
Notably, such failing of oneself could also be traced in one subcategory of proof of identity, namely in modus operandi cases. The clearest example was Straffen,Footnote 12 which concerned a ‘singularly bizarre crime which Straffen had himself committed twice the year before’, to use John Spencer's lucid description.Footnote 13 The defendant in this case thus devised a ‘criminal fingerprint’. This is a transparent act of self-failing; it involves taking a clear risk that one would be directly connected to specific past or future offences. In the similar ‘brides in the bath’ case of Smith,Footnote 14 three of the defendant's wives died of drowning in bathtubs shortly after making financial arrangements that would benefit the defendant if they died. Here, transparent self-failing consisted of one of the following: either devising a ‘criminal fingerprint’ or failing to correct a clear impression that such a ‘fingerprint’ has been devised. In these modus operandi cases the error is so remarkably obvious that self-failing can be identified even without partial confession.Footnote 15
The only subcategory that did not require the defendant to have bluntly and transparently failed herself was proving identity in cases where a witness's testimony was available against the defendant. Here, the defendant did not self-failingly connect herself to the offence at issue; rather, the alleged connection between the defendant and the offence was mediated by an external indicator (the testimony). This subcategory can be understood as a first sign of future legal developments: it signalled that the defendant's conduct in the run up to trial is to be set aside, and instead the focus should be on probative value. Specifically, this subcategory focused on the probative value of one type of incriminating evidence that existed in addition to bare past misconduct, namely the testimony.
This shift of focus continued to unfold in Boardman,Footnote 16 where the common law categories were replaced by a test of ‘striking similarity’ between past misconduct and the conduct at issue. The ‘striking similarity’ test focused on the probative value of incriminating data that could not be introduced without revealing bare ‘bad character evidence’, namely data regarding a similar pattern of commission that the defendant endorsed. Boardman established that only where such data were highly probative could the defendant's past misconduct be revealed to the jury. Yet the decision did not complete the shift of focus to considerations of probative value: it neither considered the independent value of bare past misconduct nor generalised to all incriminating data that might exist in addition to bare past misconduct. It has been criticised on this ground.Footnote 17
It was the Criminal Justice Act 2003 (CJA 2003) that concluded the shift of focus from the defendant's blunt and transparent self-failing to considerations of probative value. The 2003 Act defines evidence of ‘bad character’ in s 98 as ‘evidence of, or of a disposition towards, misconduct’; and ‘misconduct’ is defined in s 112(1) as ‘the commission of an offence or other reprehensible behaviour’.
Section 101(1) sets out seven gateways for admission of ‘bad character evidence’. For current purposes, the relevant gateway is gateway (d), which makes the evidence admissible if ‘it is relevant to an important matter in issue between the defendant and the prosecution’. This gateway has relaxed the common law's striking similarity (or enhanced relevance) requirement, and in this way it has signified completion of the transition to a strict regime of probative value. According to s 103(1)(a), matters in issue between the defendant and the prosecution include, among other things, ‘the question whether the defendant has a propensity to commit offences of the kind with which he is charged, except where his having such a propensity makes it no more likely that he is guilty of the offence’. Section 103(2) further mentions one way of proving propensity, namely by introducing evidence of previous convictions for an offence of the same description or category as the offence with which the defendant is charged. Admission is subject to s 101(3), which requires exclusion where the evidence would have an adverse effect on the fairness of the proceedings.
Other gateways allow the introduction of ‘bad character evidence’ where all parties agree to it;Footnote 18 where it is the defendant who introduces itFootnote 19 or who raises the issue of his characterFootnote 20 or of another person's ‘bad character’;Footnote 21 where it is important explanatory evidence;Footnote 22 and where it has substantial probative value for an important matter in issue between the defendant and a co-defendant.Footnote 23
These provisions changed the exclusion regime in more than one way. The definition of ‘bad character evidence’ that they adopt is exceptionally broad, at least in some respects. Significantly, Federico Picinali has noted that in addition to evidence of previous commission of comparable crimes, propensity evidence might include evidence of other indicators of propensity, such as affiliations and group memberships.Footnote 24 As already noted, the discussion in this paper focuses on evidence of previous commission of comparable crimes; extending the argument to other indicators of propensity, and indeed to naked statistical evidence that does not indicate propensity,Footnote 25 remains for future work.Footnote 26
As for the scope of admissibility, Hanson Footnote 27 was one of the first decisions to interpret the CJA 2003 provisions, and it remains relevant. The decision focuses on probative value. Analysing the probative value of bare evidence of past misconduct for proof of propensity, the Court of Appeal noted the significance of the number of instances of past misconduct. Analysing the possible impact of admission on the fairness of proceedings, the court addressed mainly the probative value of other available incriminating data. Here, the court referred to the level of similarity of past commissions to the offence at issue and, more generally, to the strength of the prosecution case. The court also mentioned the importance of the respective gravity of past and present offences. Last, Hanson outlined a warning that the trial judge should give when admitting propensity evidence.Footnote 28
It is widely agreed that by completing the shift of focus to probative value, the 2003 Act has extended the scope of admissible evidence of previous misconduct.Footnote 29 Yet examination of the case law that followed Hanson has led Redmayne to conclude that it is impossible to describe this case law with much precision, especially given that the Court of Appeal has granted trial judges a broad discretion in applying the law.Footnote 30 Redmayne notes that where probative value in the circumstances of the case is the main consideration, this is not necessarily a bad thing;Footnote 31 after all, these circumstances change from one case to another in a way that may require particularistic judgment.
2. Tragic errors
(a) ‘Case-specific evidence’, ‘bad character evidence’ and tragic errors
The law of ‘bad character evidence’ has thus been characterised by constant movement and unrest. I would like to suggest that the law has been traditionally ambivalent about ‘bad character evidence’ because of the nature of (entirely possible) errors made based on this evidence: such errors are profoundly tragic in the Aristotelian sense. This argument focuses on the nature of the error (as profoundly tragic); it does not focus on the nature of the inference from the evidence, nor does it focus on the consequences of convicting based on the evidence, such as creating inappropriate incentives or indeed causing unnecessary suffering in cases of error.Footnote 32
To unfold this argument, we can start by comparing errors based on different types of evidence. Assume first that we have ‘case-specific’ CCTV evidence, in which a person can be seen committing robbery, and that based on this evidence the defendant can be identified to a likelihood of 98%. The defendant is convicted, alas wrongfully – unfortunately, his conviction falls within the 2% likelihood of error. The error is of course tragic in the plain sense of the term, ie unfortunate and regrettable; but this is what it amounts to, and nothing more than this. The meaning of this error amounts only to the fact that CCTV evidence has been wrongly assumed to prove that which it did not prove. In such circumstances the error is a highly unfortunate chance event.
The same is true with respect to some statistical evidence that the legal system easily endorses, such as DNA evidence introduced for identification purposes.Footnote 33 To see this, assume again that DNA evidence is available against the defendant, and it matches a DNA sample taken from the defendant, again to a likelihood of 98%. The defendant is again convicted, alas wrongfully, as his conviction falls within the 2% of errors. Here too, the error only amounts to the unfortunate actualisation of the very low likelihood of, for example, misidentification following DNA testing or mistake in DNA testing. Here too, it is an unfortunate chance event.Footnote 34
Now assume that the same defendant is convicted based on evidence that he previously committed robberies; assume further, unrealistically, that the evidence creates a merely 2% probability of error.Footnote 35 Again, the conviction falls within that 2%, as the defendant actually did not commit this robbery. But here, the meaning of the false conviction is different and more profoundly tragic: it is a false conviction based on past misconduct whose statistical implications the defendant has, in fact, almost heroically beaten when refraining from crime. It can be suggested that this is a tragedy at a scale better left for the theatre. It is hard enough for the legal system to live with the disaster of any false conviction; living with the overwhelming tragedy of falsely convicting those who have beaten all odds, based on the very odds they have beaten, is more than the legal system might be willing to do.
(b) The defendant's story unfolded
To clarify this argument, let us first examine how the story unfolds for a defendant who is falsely and tragically convicted based on previous misconduct. The starting point of this story is the actor's past choice to commit some crimes. This choice puts the actor in the claws of the gloomy recidivism statistics. And the statistics are gloomy. They are gloomy not because they show that an actor who previously committed crime is almost certain to commit further such crimes (they do not show thisFootnote 36); the statistics are gloomy because they show that an actor who previously committed crime is much more likely to commit further such crimes than an actor who did not previously commit crime. This is not the place to analyse all existing findings,Footnote 37 but a few of Redmayne's conclusions can nevertheless be mentioned. Redmayne compared conviction rates among adults released from custody or commencing a court order under probation supervision over a follow-up period of one year with conviction rates in the general population over that year. When numbers of offences were measured, a previously convicted offender was 49 times more likely to be convicted of any offence than a member of the general population; an offender previously convicted of a violence offence was 98 times more likely to be convicted of a further violence offence; an offender previously convicted of burglary was 773 times more likely to be convicted of a further burglary; and an offender previously convicted of a sexual offence was 2,353 times more likely than a member of the general population to be convicted of a further sexual offence over the one year follow-up period.Footnote 38 These figures dropped but remained significant where a study measured the numbers of offenders who reoffend (rather than numbers of offences) over the course of two years (rather than over the course of one year).Footnote 39
These statistics have two meanings. First, they have a purely evidential meaning: they simply indicate that the actor is statistically more likely than a random person to have committed the crime at issue.Footnote 40 Second, the statistics have a metaphysical meaning: they indicate that the choice to refrain from crime is more challenging and difficult for this actor than for others who did not previously commit crime. Compared with others, this actor has apparently been subject to particularly strong forces that push towards further crime commission. At the very least, reality has presented these forces to the actor as traps to be avoided, and this in itself is challenging. As for the nature of these forces, it can be speculated that the actor may have developed the self-perception of a criminal, may have crossed the internal Rubicon that usually stops people from crime commission, may have befriended other criminals who encourage further crime commission, etc. The operation of such forces must be presumed if the statistics are not to be taken as mere ‘spurious’ generalisations.Footnote 41 And while not all the forces are relevant for all actors who have previously committed crime, at least some of them are relevant for many of these actors, and at least one of them will have been relevant for our actor.Footnote 42 This last metaphysical meaning of the statistics, and only this meaning, is morally significant: it implies that our actor faces a difficult moral challenge.
But as he is faced with this difficult moral challenge, the actor then almost heroically stands up to it. He fights the forces that push towards crime commission, whatever these may be, and wins; he beats the odds. And while many in his position manage to do the same, his achievement is still impressive (and so is that of similar others). They should all be compared with others who do not face similar challenges at all. The achievement of each and every actor who manages to evade the traps of past misconduct and thereby wins over the statistics is impressive.
Alas, against all expectations, the statistics then nevertheless chase the actor all the way to his false conviction for other crimes. Just as it seems that the actor has remarkably situated himself next to law-abiding citizens and escaped the fate of a criminal, the fate of a criminal knocks on his door. It suddenly becomes apparent that while he managed to beat the morally significant metaphysical implications of the error, its morally insignificant evidential implications cannot be evaded. The evidence created by his original misconduct brings about a false conviction.Footnote 43 His story is, in the end, a story of an actor who finds himself unable to escape the punitive implications of the statistics he had erroneously put himself into, and whose human journey ends in a disproportionate disaster.
A couple of words of caution are due when considering this story. First, only previous convictions have been statistically connected with recidivism.Footnote 44 Thus, the broader category of previous comparable crime commission can be connected with recidivism only if it is assumed that the previous choice to commit comparable crime can help to explain high recidivism rates among those previously convicted.Footnote 45 While this assumption may seem reasonable, it is as yet unproven and indeed is hard to prove. If rejected, the defendant's story as outlined above remains relevant only for those with previous convictions. Second, the available statistics about previous convictions have their own limitations. As it is hard to isolate and measure the effect of previous convictions on recidivism, other factors, such as criminal justice policies, can affect results. Redmayne's conclusions about the available statistical data thus seem valid:
… no great claims are being made about the comparative figures presented here. They give us some idea about the differences between those with previous convictions and those without, but cannot be taken as anything like exact. However, even if the figures are all reduced by a factor of ten, they still represent significant differences. If a person with previous convictions is only twice as likely to offend as someone without, that is still significant in terms of probative value.Footnote 46
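The arithmetic behind this passage can be made concrete with a minimal sketch. It takes the four comparative figures reported above (one-year follow-up, measured by numbers of offences) and reduces each by a factor of ten, as the quoted passage contemplates. The function names, the reading of a comparative figure as a likelihood ratio, and the prior odds of 1 to 1,000 are all assumptions introduced here for illustration only; none of them comes from Redmayne's analysis or from the statistical sources.

```python
# Comparative figures reported in the text above: how many times more likely
# a previously convicted offender was to be convicted again than a member of
# the general population (one-year follow-up, numbers of offences).
REPORTED_MULTIPLIERS = {
    "any offence": 49,
    "violence": 98,
    "burglary": 773,
    "sexual offence": 2353,
}


def reduce_by_factor(multipliers, factor=10):
    """Scale every multiplier down by `factor` (hypothetical helper)."""
    return {offence: value / factor for offence, value in multipliers.items()}


def posterior_odds(prior_odds, multiplier):
    """Odds form of Bayes' theorem.

    ASSUMPTION: the comparative figure is read as a likelihood ratio.
    This is a simplification made for illustration only.
    """
    return prior_odds * multiplier


reduced = reduce_by_factor(REPORTED_MULTIPLIERS)
for offence, value in reduced.items():
    print(f"{offence}: {value:g} times more likely after a tenfold reduction")

# Hypothetical prior odds of 1 to 1,000, chosen only for illustration.
illustrative_prior = 1 / 1000
print(posterior_odds(illustrative_prior, 2))
```

On these figures, every reduced multiplier remains above the ‘twice as likely’ threshold mentioned in the quotation, and even a multiplier of two doubles whatever prior odds are assumed.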
Another qualifying point is that, as presented, the story of a defendant who is falsely and tragically convicted is a rare occurrence. First, in reality ‘bad character evidence’ can hardly bring about false convictions on its own. Given its more limited probative value, it can merely contribute to false convictions in combination with other circumstantial evidence. I will have more to say about this below,Footnote 47 but for now it is enough to note that in most cases this does not change the way in which the defendant's story unfolds.Footnote 48 As far as the narrative is concerned, the impact of this last piece of ‘bad character evidence’ that may well have sealed the false conviction remains significant. It still produces astonishment in the face of a disproportionate disaster. Indeed, the question what would have happened if only the defendant had refrained from committing the past crime remains as nagging as ever, if not more than ever (due to the unsettling effect of uncertainty).
Another rare aspect of the story is the defendant's presumably full reform as he refrains from crime and wins over the statistics. Reality rarely presents an actor who is fully reformed, and even more rarely does it present an actor who is fully reformed and is nevertheless falsely and tragically convicted. We can thus now observe the story of a defendant who has not gone through the unlikely process of full reform, but rather did not commit the crime on the occasion at issue for other reasons, for example because she was too tired or busy doing something else.
The story of such a defendant is not significantly different from the story of the fully reformed defendant. To see this, we should think of the probability of commission per opportunity to commit comparable crime (usually expressed in Bayesian termsFootnote 49). For a defendant who has already committed crimes, the probability of commission per opportunity to commit a comparable crime is comparatively high. Whenever such an offender chooses not to take an opportunity to commit such a crime – whatever the reason for that choice may be – she beats the odds. Every such choice involves a small but nevertheless impressive victory over the statistics. Alas, as with the fully reformed defendant, the statistics still chase her all the way to her false conviction.
Arguably, there is a difficulty with the defendant who has not fully reformed, when her case is observed through the prism of probability per time unit (rather than per opportunity).Footnote 50 According to this argument, while such a defendant did not commit the crime on the relevant opportunity, she did not thereby confront and beat the high odds of crime commission over the same week or month. As the relevant time period has not been exhausted, the same odds are still there, waiting to be confronted. Thus, when probabilities are observed per time unit, this defendant has not yet beaten the odds.
This argument relies on several inaccuracies. First, such a defendant is actually expected to commit fewer crimes per (sufficiently long) time unit than a similar fellow defendant who has taken an early opportunity to commit crime. This indicates an initial process of distancing herself from crime, or of reform. Second, with every day that goes by without her committing crime, this defendant's likelihood of repetition drops (as recidivism rates drop with timeFootnote 51). For this reason, too, postponing criminal activity indicates initial reform. Accordingly, even if in the end this defendant does not reform, her story is still similar to that of the reformed defendant: she is falsely convicted for something that took place exactly in the time period when she opened the door for reform, and based on her low likelihood of reform. Last, observing probabilities per time unit only provides part of the picture, and thus we must have regard also to the abovementioned probabilities per opportunity.
The only cases where the story of a false conviction of a defendant based on bare ‘bad character evidence’ unfolds differently are the following: (1) where the defendant did not commit the relevant crime because she was busy committing another comparable crime; and (2) where the defendant did not commit the relevant crime but had already committed a comparable recidivist crime, say on the day before. For current purposes, it is enough to note that were it possible to identify such cases, it would be hard to say whether using evidence of previous misconduct would still cause unrest.Footnote 52
(c) The Aristotelian tragedy
Let us now examine the materials tragedies are made of, with the prospect of highlighting the strong tragic aspects of our defendant's story. Nowadays the word ‘tragedy’ is used loosely to refer to bad events that cause suffering and regret. Any false conviction would therefore easily qualify as a tragedy in this sense. However, the theoretical-theatrical concept, and particularly the original Greek theoretical-theatrical concept, has a narrower meaning. To comprehend this narrower meaning, we can observe Aristotle's account of tragedy in his Poetics.Footnote 53 While probably non-exhaustive,Footnote 54 this account is illuminating.
In terms of content, Aristotle's account captures what we can call the profound tragedy, where a disproportionate disaster or total destruction is brought about on the hero not by some arbitrary fate but by her own misjudgement or by her human weaknesses (known as hamartia in Greek tragedy).Footnote 55 The Aristotelian tragedy is thus made of an imperfect protagonist making what initially looks like a limited error, but which ends up having extremely destructive consequences.
The structural components of the Aristotelian tragic plot are the following:Footnote 56 a sequence of events that are connected as causes and effects;Footnote 57 an element of surprise;Footnote 58 reversal of the situation/intention, where an event ends up having the opposite effect to that which was intended or expected;Footnote 59 a moment of recognition, where the protagonist comes to realise his error;Footnote 60 and a scene of suffering (or a tragic incident), where the painful impact of the disaster manifests itself.Footnote 61
In terms of its effects, the tragic course of events stimulates pity and fear.Footnote 62 Aristotle further elaborates the means by which this effect is produced, or the circumstances that in his mind strike us as most terrible or pitiful. According to Aristotle, pity and fear are stimulated when the tragedy happens between persons who are near or dear to one another;Footnote 63 and where the protagonist is above common level and is characterised by goodness and propriety,Footnote 64 but is nonetheless true to life and fairly consistent.Footnote 65 The tragic hero should remind us of ourselves,Footnote 66 ‘a man who is not eminently good and just, yet whose misfortune is brought about not by vice or depravity, but by some error or frailty’.Footnote 67 In this way tragedies demonstrate that we are our biggest enemies; and this insight is terrible and frightening.
(d) Application of the definition of tragedy to Oedipus Rex and to the defendant's story: a comparison
We can now examine how these characteristics present themselves in Sophocles’ Oedipus Rex, which Aristotle considered the purest example of a tragedy,Footnote 68 and, by way of comparison, how they present themselves in the story of the defendant who is falsely convicted based on previous misconduct. The examination will start with contents and then move to the structure of the plot and the means by which the tragic effect is achieved.
The content of Oedipus Rex is familiar enough: Oedipus has been given a prophecy according to which he will kill his father and sleep with his mother. Trying to escape this prophecy, he leaves his home in Corinth and the people he believes are his biological mother and father. Alas, his fate (or the prophecy) cannot be escaped, and Oedipus ends up discovering that after leaving his home he committed these horrors against his biological parents without realising who they were. Notably, the origins of the disaster trace back to the crime of killing a stranger at a crossroads.Footnote 69 The implications of this crime seemed to have been limited, but eventually revealed themselves as disastrous (the stranger was later revealed as Oedipus’ biological father).
This story bears a strong similarity to the story of the defendant that has already been outlined. Like Oedipus, this defendant lives in the shadow of a prediction: the statistics predict that she will end up being destroyed by repeated convictions for further crimes. In what initially looks like a successful attempt to escape this prediction, she refrains from further crime commission. But, like Oedipus, this defendant's fate cannot be escaped, and she ends up being falsely convicted based on the statistics and thereby being destroyed. Here too, the origins of the disaster trace back to crime commission whose implications seemed to have been limited but eventually revealed themselves as disastrous.Footnote 70
The plots of both Oedipus Rex and the defendant's story have the abovementioned structural characteristics of the Aristotelian tragedy. As already demonstrated, in both stories the chain of events flows as a causal chain. Furthermore, in both stories the destruction of the protagonist is surprising: Oedipus seems to have evaded the prophecy when leaving Corinth for the city of Thebes, and his discovery that the prophecy has been realised is astonishing. Similarly, the defendant seems to have escaped the claws of statistics when she refrains from further crime commission, and her false conviction based on these statistics is not less astonishing. Moreover, in both cases there is the ‘reversal of situation’. In Oedipus Rex the messenger brings information that is supposed to free Oedipus from his worries (namely that Merope was not his real mother), but this message produces the opposite effect, leading to the discovery that Jocasta is his mother. In the defendant's case, the trial is supposed to reveal the truth and free the defendant from false suspicions, but it ends up producing the opposite effect when the defendant is falsely convicted. In addition, in both stories there is a moment of recognition. Aristotle notes that ‘[t]he best form of recognition is coincident with a Reversal of the Situation, as in the Oedipus’.Footnote 71 The same applies for the defendant, who comes to realise her disaster as the evidence is presented against her and seals her false conviction. Last, in both stories there is a scene of suffering, where the protagonist is mortified. Oedipus plunges Jocasta's pins into his eyes, and the defendant's suffering is, presumably, apparent as the court finds her guilty.
As for the tragic effect, while both stories stimulate extreme pity and fear, the means that achieve the tragic effect are admittedly more strictly Aristotelian in Oedipus Rex than they are in the defendant's story. Unlike in Oedipus Rex, the defendant's tragedy does not involve family or friends. Furthermore, unlike Oedipus, who comes from the ruling classes and who accordingly is considered above common level, the defendant is not above common level and is not necessarily characterised by goodness and propriety. But at least to the extent that this defendant is fully or partly reformed, her intentions are good and we can identify with her. Moreover, like Oedipus, the defendant is true to life and fairly consistent, and this too allows identification.
To conclude, the defendant's story has the content and structure of an Aristotelian tragedy. The absence of some characteristics that Aristotle considered as best producing the tragic effect does not necessarily prevent the development of this effect. The defendant who is falsely convicted based on her past is therefore very similar to Aristotle's tragic hero.
(e) Back to the law
We can now go back to observing the law. The analysis proposed so far can shed light on the common law exceptions to the rule of exclusion of ‘bad character evidence’. It was noted in section 1 that almost all of these exceptions involved cases where the defendant failed herself one more time, above and beyond her bare past misconduct, and this time bluntly and transparently so. But this additional moment of blunt and transparent self-failing affects our evaluation of the defendant's entire erroneous course of conduct and its outcomes in several ways.
First, it deprives the defendant's original error (the bare past misconduct) of its limited scope, turning it into one in a sequence of risky behaviours. Secondly, a subsequent false conviction based partly on the original error no longer comes as a complete surprise: the defendant who has transparently turned herself into a suspect has also openly taken all the risks associated with a criminal justice process, including that of a false conviction. That this might actually lead to her being falsely convicted is hardly astonishing. Thirdly, the theatrical disproportionality of a false conviction is no longer measured only against the single original error, but against the combination of this error with the following act of blunt self-failing. Accordingly, the disaster seems less disproportionate. Fourthly, her blunt and transparent self-failing makes the defendant significantly harder to identify with. The defendant therefore loses many of the characteristics of a tragic protagonist, and the false conviction loses many of the characteristics of a tragic outcome.
It can thus be suggested that the common law categories captured cases where the admission of ‘bad character evidence’ no longer had the potential to lead to profoundly tragic false convictions. Only in these cases was admission exceptionally allowed. Clearly, this is not to suggest that the common law categories exhausted the cases where admission of ‘bad character evidence’ would not lead to tragedies; neither is it to suggest that all the categories were accurate and fully explicable along the suggested lines. Still, the proposed analysis does demonstrate that the common law's approach may have had its grounds. In addition, this analysis can explain why the categories only partly overlap with the pool of cases in which other strong evidence is available against the defendant.Footnote 72 Last, the analysis sheds different light on the transition completed by CJA 2003 into a regime of probative value: this transition has removed the distinction between profoundly tragic errors and other errors.
(f) A brief comment on the connection between aesthetics and ethics
Some questions nevertheless remain: can the proposed analysis offer more than aesthetic observations? Can the 2003 Act be criticised for disregarding the distinction between tragic and non-tragic errors? Why should the law care about tragedies at all? Is ‘tragedy’ a legally relevant category, and if so, how exactly? These questions concern the possible connection between aesthetics and ethics: whether there is such a connection at all; and if there is, whether it is such that good aesthetics accurately captures ethically relevant distinctions and categories. That positive answers are far from impossible is evident already in Kant's suggestion in his Critique of the Power of Judgement that taste is ‘at bottom a faculty for the estimation of the sensible rendering of moral ideas’.Footnote 73 Still, the questions run deep and merit separate attention.Footnote 74 The following paragraphs merely propose initial routes to explore.
We can assume for now the existence of only a loose connection between aesthetics and ethics, of the type proposed in Kant's third Critique.Footnote 75 For Kant, morality is based on reason and is independent from aesthetics; still, there is an analogy between the aesthetic and the moral; this analogy is essential for an account of moral feeling; and moral feeling, in turn, makes morality possible for human beings.Footnote 76 Aesthetics is thus not a source of moral duties; however, the aesthetic symbolises the moral.
Even under these relatively weak assumptions, tragedy emerges as a morally and legally relevant category: the aesthetic impact of tragedy is powerful and timeless because tragedy symbolises morally relevant categories. Tragedy produces such strong unrest because it is an aesthetic representation of moral failure. Arguably, tragedy symbolises a human failure to prevent certain misfortunes that must be prevented (as a matter of moral duty) whenever possible. Since tragedy symbolises such moral failure, it engages not only aesthetic taste but also the comparable and closely connected moral feeling (indeed, this is the source of its power).
For current purposes, the significant conclusion is that tragedy symbolises failure to fulfil duty. Exploring the independent moral derivation of this duty is a different matter that cannot be comprehensively pursued here. A couple of initial possibilities can nevertheless be briefly proposed.
Evidently enough, tragedies do not necessarily indicate a particularly high level of suffering by the defendant. The amount of suffering or loss that any event involves is contingent and depends on its particular circumstances. A tragic course of events may well involve less suffering than other extremely unfortunate courses of events. The suffering of a parent who is falsely convicted of the murder of her child is almost beyond imagination, and it is far greater than the suffering of a defendant who is falsely convicted of burglary based on previous burglaries she committed.
Yet it can be tentatively suggested that failing to prevent tragedies is unfair to the defendant:Footnote 77 it puts on her shoulders more than her fair share of arbitrariness. According to this suggestion, while arbitrariness cannot be completely avoided, human choice usually serves to minimise it and distribute it fairly.Footnote 78 But the defendant's choice does the opposite of this: it frustrates minimisation and fair distribution of arbitrariness. By depriving the defendant's subsequent good choice of any potential to shield from arbitrariness, it leaves the defendant exposed to a tremendous and indeed unfair share of arbitrariness.Footnote 79 And where individual choice proves counter-productive, justice kicks in: it requires that society bear some arbitrariness of false acquittals for the sake of preventing the defendant from carrying this excessive share of arbitrariness on her shoulders.
It can be added in a similar tentative spirit that allowing tragic courses of events to unfold might also be plainly disrespectful of defendants: if the duty of respect that the state owes to its citizens entails acknowledging their exercises of autonomy,Footnote 80 then allowing tragic false convictions is a failure to respect the defendant's impressive and almost heroic manifestation of autonomy (through change).Footnote 81
If failure to prevent tragedies indeed has distinct negative value that is ethically and legally relevant, there is at least an a priori reason to criticise the 2003 Act for not taking account of this negative value.
(g) Interim conclusions
To conclude the discussion so far, the legal system's traditional reluctance to admit evidence of previous misconduct can be defended with reference to the nature of entirely possible errors based on such evidence: false convictions based on such evidence are profoundly tragic in the Aristotelian sense. Possibly, this means that they are also unfair and disrespectful of the defendant. The 2003 Act reflects a transition from a regime that takes into account this special nature of possible errors, to a regime that is concerned solely with probative value.
But this analysis only takes us so far in deciding whether to admit or exclude the evidence even in those cases where the defendant is comparable to a tragic protagonist. Examination of possible interpretations by juries of the evidence or of its absence obscures the picture. Mainly, a risk of profoundly tragic errors may arise not only where the evidence is admitted, but also where it is excluded. Thus, the possibility of profoundly tragic errors does not yet necessarily call for exclusion. The next section addresses this problem.
3. The likelihood of errors
(a) The problem
The decision whether to admit or exclude ‘bad character evidence’ can be made based on the expected negative values of errors associated with each rule: the rule that is associated with lower negative values should be preferred. For the purposes of convenience, I will refer to such negative values hereinafter as ‘costs’. Estimating the cost of errors involves identifying the different costs associated with different types of errors and the likelihoods of different types of errors under the two types of rules (admission and exclusion). Likelihoods of errors, in turn, are not determined directly by the evidence or its absence, but rather by the factfinders’ interpretations of the evidence or its absence. This section demonstrates that we are deeply ignorant about factfinders’ interpretations of ‘bad character evidence’ or its absence. Derivatively, we are also deeply ignorant about the overall costs associated with a rule of admission and with a rule of exclusion. The section then uses decision theory and the maximin rule to advance the argument that, considering the difference in the costs of the possible errors under a rule of admission and under a rule of exclusion, a rule of exclusion should be preferred.
It is often argued that juries overestimate the probative value of evidence of previous misconduct, or alternatively convict in order to punish for previous misconduct.Footnote 82 One of Redmayne's significant contributions to the research of evidence of previous misconduct was his careful and thorough analysis of the existing empirical research on bias. While indecisive, the conclusions of this analysis are far from being insignificant. Redmayne concludes that we cannot know whether evidence of previous misconduct is likely to be biasing, and that there are reasons to think either way.Footnote 83 Furthermore, Redmayne notes that in some cases, evidence of previous misconduct might even have the unexpected outcome of reducing the likelihood of conviction.Footnote 84 This might be the case where introduction of evidence of previous misconduct seems to the jury to be unfair;Footnote 85 or, alternatively, where absent the evidence the jury would have made unfounded and more condemning speculations.Footnote 86
The picture we end up with regarding the expected outcomes of admission or exclusion thus becomes remarkably obscure: both a rule of admission and a rule of exclusion might or might not hamper juries’ judgement in either of two ways (for or against the accused) and to an unknown extent.Footnote 87 Accordingly, we cannot know how many errors and of which type (profoundly tragic or not) would be brought about by each of the rules. It is therefore impossible to know which rule might minimise the cost of errors.
The difficulty is a general one in the law of evidence. It occurs where there are reasons to think that the evidence is overall valuable and other reasons to think it is damaging overall, and we cannot evaluate the probability of it being valuable or damaging. The evidence (might) pull in different directions.
Indeed, this difficulty goes beyond the law of evidence; it is a difficulty long known to decision theorists and referred to as ‘deep ignorance’.Footnote 88 Deep ignorance is a condition where the likelihoods of different outcomes of different courses of action are unknown and unknowable, since a sufficiently broad range of evaluations can be supported by good reasons.Footnote 89 Thus, unlike in conditions of mere uncertainty, cost-expectancies cannot be calculated, and a decision about the appropriate course of action cannot be made based on a principle of cost minimisation. As the principle of cost minimisation is inapplicable, different principles of decision are needed for conditions of deep ignorance.
(b) Decision theory: the maximin rule for decisions in conditions of deep ignorance
Decision theory proposes several decision rules for conditions of deep ignorance.Footnote 90 The two dominant rules in decision theory, and those which have already been applied convincingly in moral and political theory,Footnote 91 are the maximin rule and the lexical maximin rule (leximin). The following discussion applies these rules to the problem of bias in the context of evidence of previous misconduct.
The maximin rule requires maximising the minimal level of value obtainable with each decision. Accordingly, this rule works where we can discern one potential decision whose worst outcome has more value than the worst outcome of every other potential decision. If this is not the case, and the minimal obtainable value is identical for all potential decisions, the lexical maximin rule comes into play: if the worst outcomes are equal, we should maximise the value of the second worst outcome; if the second worst outcomes are equal, we should maximise the value of the third worst outcome, and so on.
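By way of illustration only, the two rules can be sketched in a few lines of Python. The sketch assumes that each decision is described by one value per possible state (higher is better) and that all decisions list the same number of states; the function names `maximin` and `leximin`, the decision labels and the numbers are hypothetical placeholders rather than estimates drawn from the discussion.

```python
from typing import Dict, List


def maximin(decisions: Dict[str, List[int]]) -> List[str]:
    """Return the decision(s) whose worst outcome has the highest value."""
    best_floor = max(min(values) for values in decisions.values())
    return [name for name, values in decisions.items() if min(values) == best_floor]


def leximin(decisions: Dict[str, List[int]]) -> List[str]:
    """Compare worst outcomes first, then second worst, and so on."""
    # Sorting ascending puts the worst outcome first; Python compares lists
    # element by element, which matches the lexical ordering of the rule.
    profiles = {name: sorted(values) for name, values in decisions.items()}
    best_profile = max(profiles.values())
    return [name for name, profile in profiles.items() if profile == best_profile]


# Hypothetical example: both decisions share the same worst outcome (1),
# so maximin cannot separate them, but leximin can (second worst: 6 > 5).
example = {"a": [1, 5, 9], "b": [1, 6, 7]}
print(maximin(example))  # ['a', 'b']
print(leximin(example))  # ['b']
```

Where maximin returns more than one decision, the tie is passed on to leximin; where leximin also returns more than one decision, neither rule can guide the choice.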
The following analysis examines the possible outcomes of a rule of admission and a rule of exclusion. It does so with reference to the types of possible errors under each regime, given the ways juries might interpret the evidence or its absence. Subsection (c) demonstrates that the range of possible outcomes under a rule of admission and under a rule of exclusion is, in fact, qualitatively equal: considering the possibility that juries might ‘fill in the gap’ where the evidence is excluded and presume propensity, profoundly tragic errors are possible not only under a rule of admission, but also under a rule of exclusion. Next, subsection (d) demonstrates that the possible outcomes of a rule of admission and of a rule of exclusion are nevertheless quantitatively different: profoundly tragic errors made under a rule of admission are more tragic than those made under a rule of exclusion. Accordingly, principles offered by decision theory lead to the conclusion that exclusion is preferable to admission.
(c) Initial application: errors and profoundly tragic errors
To apply maximin to admission of evidence of previous misconduct clearly, two unrealistic assumptions are made: first, it is assumed that evidence of the defendant's previous misconduct is the only available evidence; and secondly, it is assumed that juries’ possible estimations of the likelihood of recidivism are radical, so that overestimation means estimation of 100% likelihood and underestimation means estimation of 0% likelihood.
The two relevant acts are admission and exclusion of the evidence. The three relevant states are: (1) juries overestimate the likelihood of recidivism given the defendant's past; (2) juries estimate correctly the likelihood of recidivism given the defendant's past; and (3) juries underestimate the likelihood of recidivism given the defendant's past. Where the evidence is excluded, these three states are further differentiated: under each of them, juries might assume no past, correct past, or excessive past. The five relevant possible outcomes are the following: true acquittal, false acquittal, true conviction, false conviction, and tragic false conviction.
Tragic false conviction is a false conviction based on the defendant's previous misconduct, whether or not it involves overestimation of the likelihood of recidivism. It has been demonstrated in section 2 that such a conviction is tragic where juries estimate the likelihood of recidivism correctly. Overestimation of this likelihood by the juries does not negate the tragic nature of the conviction: it is the reliance on the defendant's poor odds, rather than the accuracy of the estimation of these poor odds, that makes the false conviction tragic.
Tragic false conviction can occur either where evidence of previous misconduct has been introduced or where it has not been introduced, but juries assume rightly or wrongly that the defendant has a past of misconduct. We have already seen why tragic errors are possible where the evidence is introduced. Where the evidence is not introduced, but juries fill the gap by rightly assuming that the defendant has a past of previous misconduct, the outcome is similar to that of admission, and there is a similar risk of tragic errors. Where the evidence is not introduced, and juries wrongly assume that the defendant has a past of misconduct, erroneous convictions would still be tragic. Unfortunate as this may be, the statistics indicate that, at the very least, defendants are indeed more likely to have previous convictions than random citizens (and to the extent that they have such previous convictions, they are more likely to have committed the offence at issue).Footnote 92 Accordingly, the fact that a person has been accused of crime indicates that there are strong forces pushing that person to commit crime; indeed the forces pushing such a person to commit crime are stronger than the forces pushing random citizens to commit crime. Given these statistical realities, where the defendant has beaten the odds by not having committed past crimes and accordingly not having such a past (and hence she is no longer more likely to have committed the offence at issue), her false conviction based on the odds that she has beaten is still profoundly tragic.
As for the costs of each outcome, it is assumed that true results have no relevant costs; that false convictions are worse or more costly than false acquittals; and that tragic false convictions are worse or more costly than other non-tragic false convictions. The first and second assumptions do not require explanation. The third assumption (that tragic false convictions are worse than other non-tragic false convictions) attributes additional costs to profound tragedy. It relies on our natural strong response to tragedies – a response that implies such additional costs. In addition, this assumption can rely on the additional unfairness, beyond that of any false conviction, that tragic false convictions involve, as well as on the disrespect towards the defendant that the legal system arguably demonstrates when it allows tragic false convictions to happen.Footnote 93
Now, let us observe the respective acts, states, and outcomes as summarised in Tables A–D in the Appendix below. The observation leads to the following conclusions: if juries do give some weight to evidence of previous misconduct, but where the evidence is excluded they assume that the defendant has no history of previous misconduct, then maximin would recommend excluding the evidence. The reason is that under this assumption, the worst possible outcome of exclusion is false acquittal, and this outcome is better than the worst possible outcome of admission, namely tragic false conviction.
However, assuming again that juries give some weight to the evidence, but that following exclusion juries might actually assume that the defendant does have some history of previous misconduct, maximin becomes less helpful. The reason is that under this assumption, the worst possible outcomes of exclusion and of admission are equally bad (tragic false convictions). In such circumstances, we should apply leximin and examine the second worst possible outcomes. Yet this too does not take us far, as the second worst possible outcomes of admission and of exclusion are also equal (false acquittal); and other outcomes are true outcomes and therefore good outcomes.
The same is true assuming that juries do not attribute any weight to the evidence: the worst outcomes under a rule of admission and of exclusion are equally bad (false acquittals); and other possible outcomes are true outcomes and therefore good outcomes.
Thus, under such assumptions, maximin and leximin do not take us far in deciding the question of admission.
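The three scenarios just described can be restated as a small, purely illustrative Python sketch. It assumes ordinal values that follow the cost ordering adopted above (true outcomes best, then false acquittals, then false convictions, with tragic false convictions worst); the specific numbers, the name `choose` and the outcome sets are assumptions read off the prose of this subsection, not reproductions of Tables A–D.

```python
# Ordinal values (higher is better); only the ordering matters, the numbers are placeholders.
VALUE = {
    "true outcome": 0,
    "false acquittal": -1,
    "false conviction": -2,
    "tragic false conviction": -3,
}


def choose(rules):
    """Apply maximin, then leximin, to the distinct outcomes possible under each rule.

    Returns the preferred rule, or None where the rules cannot be separated.
    """
    # Distinct outcome values sorted from worst to best.
    profiles = {rule: sorted({VALUE[o] for o in outs}) for rule, outs in rules.items()}
    best = max(profiles.values())  # lists compare worst outcome first
    winners = [rule for rule, profile in profiles.items() if profile == best]
    return winners[0] if len(winners) == 1 else None


# Juries weigh the evidence; when it is excluded they assume no past misconduct.
no_gap_filling = {
    "admission": ["true outcome", "false acquittal", "tragic false conviction"],
    "exclusion": ["true outcome", "false acquittal"],
}
# Juries weigh the evidence; when it is excluded they may assume past misconduct.
gap_filling = {
    "admission": ["true outcome", "false acquittal", "tragic false conviction"],
    "exclusion": ["true outcome", "false acquittal", "tragic false conviction"],
}
# Juries give the evidence no weight at all.
no_weight = {
    "admission": ["true outcome", "false acquittal"],
    "exclusion": ["true outcome", "false acquittal"],
}

print(choose(no_gap_filling))  # exclusion
print(choose(gap_filling))     # None
print(choose(no_weight))       # None
```

Only the first scenario yields a recommendation; in the other two the worst and second worst outcomes coincide, which is the impasse that the next subsection seeks to resolve.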
(d) Taking the analysis forward: the role of incentives
Reconstruction of the incentive argument originally proposed by SanchiricoFootnote 94 and further developed and extended by Enoch, Spectre and FisherFootnote 95 can get the analysis going. According to this argument, admission of evidence of previous misconduct creates a disincentive for potential defendants with a past of misconduct to refrain from the proscribed act. This argument is consequentialist and tentative in nature, but it can be integrated in a non-consequentialist argument about the duty to prevent the most tragic of errors based on maximin.
Where evidence of previous misconduct is admissible, potential defendants with a past of misconduct know that their past would be considered during trial; they know that it might be overestimated by the juries; and they know that they might be falsely convicted on the basis of their past. Thus, the possibility of a tragic false conviction becomes visible to them. It is this visibility that reduces potential defendants’ motivation to refrain from further crime, with the implication that they are more likely to commit such crime. Arguably, this visibility does more than that: it actually increases the motivation to commit further crime; it creates a positive incentive for further crime commission. If potential defendants know that their prospects of a good life are diminished because of the possibility of tragic false convictions, they have a positive incentive to balance this out by taking what they consider as ‘more’ of life (for example by way of stealing property or taking criminal revenge etc) in the time that is left before the expected false conviction. Thus, the visibility of possible tragic errors puts potential defendants deeper in the claws of the recidivism statistics. But the deeper potential defendants are trapped by these claws, the more heroic is their eventual decision to refrain from further crime, and the more profoundly tragic is their eventual false conviction. If this is true, then a rule of admission involves a risk of errors whose tragic nature is particularly profound; it involves a risk of errors that are more profoundly tragic than other tragic errors. A decision to opt for a legal rule of admission, therefore, has particularly low minimal value obtainable with it.
The same does not apply to the minimal value obtainable with a decision to opt for a rule of exclusion. Where the rule is one of exclusion, the possibility of tragic false convictions is far less visible to potential defendants. Unless highly sophisticated or unusually pessimistic, they are not expected to think that juries would assume they have a past of misconduct absent evidence of such misconduct (and indeed, this is the assumption that lies at the heart of any incentive-based analysis of evidence of previous misconduct). Thus a decision to opt for a rule of exclusion does not reduce potential defendants’ motivations to refrain from the proscribed act in a way similar to a decision to opt for a rule of admission. A rule of exclusion therefore does not increase the odds of commission and put potential defendants deeper in the recidivism statistics, and accordingly it does not make their eventual decision to refrain from crime more heroic. Thus, a false conviction reached in an exclusion regime may well be profoundly tragic, but it is not as profoundly tragic as a false conviction reached in an admission regime. A decision to opt for a legal rule of exclusion therefore does not have particularly low minimal value obtainable with it.
If this is the case, then maximising the minimal value obtainable with the admissibility decision, as required by the maximin rule, entails opting for exclusion. The worst outcome of exclusion (profoundly tragic false conviction) is still better than the worst outcome of admission (more profoundly tragic false conviction).
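The comparison can be made concrete in a short Python sketch, offered under stated assumptions only. It adds one ordinal level below the tragic false conviction to stand for the ‘more profoundly tragic’ false conviction that the incentive argument associates with admission; the numbers and the names `VALUE`, `worst_case` and `maximin_choice` are hypothetical, and only the ordering carries any meaning.

```python
# Ordinal values (higher is better). The lowest level is the assumption introduced
# by the incentive argument: errors under admission are more profoundly tragic.
VALUE = {
    "true outcome": 0,
    "false acquittal": -1,
    "tragic false conviction": -3,
    "more profoundly tragic false conviction": -4,
}

# Outcomes assumed possible under each rule once juries may fill the gap on exclusion.
POSSIBLE = {
    "admission": ["true outcome", "false acquittal",
                  "more profoundly tragic false conviction"],
    "exclusion": ["true outcome", "false acquittal", "tragic false conviction"],
}

# Maximin: find the worst value under each rule, then pick the rule with the best worst value.
worst_case = {rule: min(VALUE[o] for o in outs) for rule, outs in POSSIBLE.items()}
maximin_choice = max(worst_case, key=worst_case.get)

print(worst_case)      # {'admission': -4, 'exclusion': -3}
print(maximin_choice)  # exclusion
```

The recommendation turns entirely on the assumed gap between the two worst outcomes; if that gap were denied, the two rules would again be tied.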
Note that in this analysis contingent incentives play a role, but it is not a justifying role: the way it is applied here, maximin does not recommend exclusion based on the added value of discouraging crime (which is a contingency). Rather, the justification for exclusion is the profoundly tragic and unfair nature of possible errors.
Conclusions
This paper has argued that where the defendant does not connect herself to the offence at issue, admission of ‘bad character evidence’ can lead to the most profoundly tragic of false convictions. It has further proposed that this aesthetic characteristic of the evidence can explain the legal ambivalence about it; and that to the extent that this characteristic has moral grounds and moral bearings on the fair treatment of defendants, a rule of exclusion can be justified. Such a rule would also be justifiable, based on maximin, if juries’ possible interpretations of the evidence or its absence are taken into consideration.
CJA 2003, however, sets a regime of broad admissibility based on probative value. It can now be cautiously suggested that in such conditions, there may be a reason to use the unfairness provision in s 101(3) in order to reintroduce into the law the distinction between profoundly tragic false convictions and other false convictions. The common law categories that seem to have captured this distinction could serve as guidance. Thus, where the defendant transparently connects herself to the offence at issue in ways comparable to those covered by the common law categories, admission of ‘bad character evidence’ no longer has the potential to lead to profoundly tragic false convictions, and accordingly it can be sustained. In all other cases exclusion should be preferred.Footnote 96
Appendix: the range of possible outcomes under a rule of admission and under a rule of exclusion
Table A: possible outcomes under a rule of admission
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20191106124351069-0057:S0261387518000569:S0261387518000569_tab1.gif?pub-status=live)
Table B: possible outcomes under a rule of exclusion, assuming juries’ overestimation
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20191106124351069-0057:S0261387518000569:S0261387518000569_tab2.gif?pub-status=live)
Table C: possible outcomes under a rule of exclusion, assuming juries’ correct estimation
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20191106124351069-0057:S0261387518000569:S0261387518000569_tab3.gif?pub-status=live)
Table D: possible outcomes under a rule of exclusion, assuming juries’ underestimation
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20191106124351069-0057:S0261387518000569:S0261387518000569_tab4.gif?pub-status=live)