Auceps syllabarum: A Digital Analysis of Latin Prose Rhythm

Tom Keeline; Tyler Kirby

doi:10.1017/S0075435819000881


Published online by Cambridge University Press: 24 September 2019


Tom Keeline (tkeeline@wustl.edu), Department of Classics, Washington University in St Louis
Tyler Kirby (tyler.kirby9398@gmail.com), Independent scholar


Abstract

In this article we describe a series of computer algorithms that generate prose rhythm data for any digitised corpus of Latin texts. Using these algorithms, we present prose rhythm data for most major extant Latin prose authors from Cato the Elder through the second century a.d. Next we offer a new approach to determining the statistical significance of such data. We show that, while only some Latin authors adhere to the Ciceronian rhythmic canon, every Latin author is ‘rhythmical’ — they just choose different rhythms. Then we give answers to some particular questions based on our data and statistical approach, focusing on Cicero, Sallust, Tacitus and Pliny the Younger. In addition to providing comprehensive new data on Latin prose rhythm, presenting new results based on that data and confirming certain long-standing beliefs, we hope to make a contribution to a discussion of digital and statistical methodology in the study of Latin prose rhythm and in Classics more generally. The Supplementary Material available online (https://doi.org/10.1017/S0075435819000881) contains an appendix with tables, data and code. This appendix constitutes a static ‘version of record’ for the data presented in this article, but we expect to continue to update our code and data; updates can be found in the repository of the Classical Language Toolkit (https://github.com/cltk/cltk).

Keywords

Latin prose rhythm; clausulae; Latin prose style; digital analysis; statistical analysis; Cicero

Type: Articles
Information: The Journal of Roman Studies, Volume 109, November 2019, pp. 161–204

DOI: https://doi.org/10.1017/S0075435819000881
Copyright: Copyright © The Author(s) 2019. Published by The Society for the Promotion of Roman Studies

To sum up, we may accept that Zielinski's statistics, while they are far from perfect, do nevertheless give a tolerably accurate picture of Cicero's clausulae … It is conceivable that in the future computer technology may allow accurate statistics to be produced for large amounts of material, such as whole authors, at the touch of a button. But until that day arrives, Zielinski's figures for Cicero's speeches … may suffice. They are the best we have and, until computers come to our aid, will not be improved upon.Footnote ¹

For over a century, scholars studying Latin prose rhythm have relied on the statistics generated by Theodor Zielinski's pioneering Das Clauselgesetz in Ciceros Reden.Footnote ² They have also complained about his methodology and its inadequacies:Footnote ³ Zielinski read his own Russian translations of Cicero's speeches out loud in order to develop a feel for where sense breaks (and so clausulae) occurred in the Latin;Footnote ⁴ he arbitrarily decided that the cretic was the basis for Latin prose rhythm;Footnote ⁵ he did not compare his observed frequencies of clausular patterns to any expected values, thus ignoring the naturally occurring rhythms of the Latin language;Footnote ⁶ he came up with dubious rules for word division and resolutions within his clausular categories.Footnote ⁷ But this was path-breaking scholarship, for Zielinski had no real predecessors — and he has had no successors either.Footnote ⁸ In the 115 years since Das Clauselgesetz, no scholar has had the Sitzfleisch to do what Zielinski did: he counted, by hand, 17,902 clausulae in Cicero's speeches. He analysed his results in detail and produced elegant summary tables, all without the aid of electronic calculators. The result is an imposing and apparently authoritative monument.

The real problem with Zielinski's analysis, however, is not its methodological basis. About his methodology Zielinski is an exemplar of openness and honesty: he lays out his assumptions and reasoning at every step of the process, and in exhaustive detail. While all of these have been questioned, no one would expect the first explorer of uncharted terrain to map it perfectly. A bigger problem is that Zielinski's results are unverifiable and unreproducible. He seems to provide a deluge of data, but readers must trust that he has scanned and counted and tabulated correctly, for he provides comprehensive scansion for only one speech. Zielinski was a great scholar, but in most fields of scientific inquiry we do not simply accept unverifiable pronouncements. And yet it is not just Zielinski: all scholars of Latin prose rhythm who give even partial statistics have presented their varying results and varying methodologies from a black box that could not be inspected or verified.Footnote ⁹ Furthermore, there looms a potentially even bigger problem: if one wanted to modify Zielinski's methodology — disregarding some of his strictures on word division and resolution, say — it would require recounting everything from scratch.

Fortunately, computers have come to our aid.Footnote ¹⁰ In this article, we describe a series of interrelated algorithms and modules that can produce a comprehensive analysis of the prose rhythms of a given corpus of Latin literature with a few keystrokes. This digital approach presents entirely new possibilities for the study of prose rhythm. With complete openness and transparency, we can calculate prose rhythm statistics from across the whole of extant Latin literature. Furthermore, we can be absolutely consistent in our procedures and confident in our statistics, and yet we are not bound to any one methodology. If it becomes clear, for example, that we should treat elision differently, we can do so and generate new numbers and new statistics — instantly. Zielinski laboriously counted 17,902 clausulae by hand over years: we count hundreds of thousands of clausulae in seconds. Furthermore, all of our results are verifiable from the highest to the lowest level: we can show how any individual phrase has been scanned and categorised, and all of our code and data are open source. We can thus answer fundamental and challenging questions about prose rhythm, and answer them with speed, consistency and transparency.

I METHODOLOGY

Latin prose rhythm sometimes looks like a species of philological witchcraft, albeit one without the seductive power of most black magic. In part this is because the ancient testimonia on the subject are confused or confusing, and ancient theory does not always seem to match ancient practice.Footnote ¹¹ But it is clear that ancient orators and rhetoricians perceived prose rhythm as a real phenomenon, and they cannot be faulted for failing to reduce a complex and intuitively felt system to a set of clear rules. Indeed, it was not just prose rhythm that caused headaches for ancient linguistic theorists: everything from the Latin stress accent to its ablative case proved obstacles for ancient authorities trying to systematise the properties of their language.Footnote ¹² Self-diagnosis is hard.

With the distance of two millennia and a bevy of statistics, we may actually stand a better chance today of describing the practice of ancient prose rhythm. Modern theories continue to proliferate, and we do not propose to adjudicate among them here.Footnote ¹³ After much trial and error, we have settled on a system that both seems generally reasonable and accounts for the data. Our typology accords fairly well with modern scholarly approaches, but it is fundamentally a pragmatic choice, adopted because it yields useful and interesting results.Footnote ¹⁴ It is not meant to be the last word. Again, a virtue of the digital approach is that we can adjust — and have adjusted — our methods and classification as our understanding improves. Looking at the data that we provide, new readers may detect other points of interest which have eluded us.

We divide all possible clausular patterns into seven main categories. Of these seven categories, the first four — cretic-trochaic, double cretic/molossus cretic, ditrochaic and hypodochmiac — are traditionally considered ‘rhythmic’. These are the rhythmic preferences that seem to have been developed for Greek prose by the shadowy Hegesias (third century b.c.) and are exemplified in Latin by Cicero.Footnote ¹⁵ When scholars talk about ‘rhythmic’ authors, they usually mean those who follow this system.Footnote ¹⁶ Hegesias’ doctrines were very influential and found a number of adherents; as we will see, looking at authors’ differing preferences for so-called ‘rhythmic’ and ‘non-rhythmic’ clausulae has real explanatory power. But we will also see that all Latin authors have their own rhythmic preferences, even those who do not follow this artificial system. Thus, in a slight but significant terminological shift, we will avoid calling authors ‘rhythmic’ and ‘non-rhythmic’, even as we still find it useful to compare the artificially ‘artistic’ rhythms (the first four categories below) with ‘non-artistic’ rhythms (the last three).

1. Cretic-Trochaic: –⏑– –×
   Resolved:
   a. ⏑⏑⏑– –×
   b. –⏑⏑⏑–×
   c. –⏑–⏑⏑×
2. Double cretic/molossus cretic: –⏑– –⏑× or – – – –⏑×
   Resolved:
   a. ⏑⏑⏑– –⏑×
   b. –⏑⏑⏑–⏑×
   c. –⏑–⏑⏑⏑×
   d. ⏑⏑– – –⏑×
   e. –⏑⏑– –⏑×
   f. – –⏑⏑–⏑×
   g. – – –⏑⏑⏑×
   h. –⏑– – –⏑×Footnote ¹⁷
3. Double trochee: –⏑–×
   Resolved:
   a. ⏑⏑⏑–×Footnote ¹⁸
   b. –⏑⏑⏑×Footnote ¹⁹
4. Hypodochmiac: –⏑–⏑×Footnote ²⁰
   Resolved:
   a. ⏑⏑⏑–⏑×
   b. –⏑⏑⏑⏑×
5. Spondaic: – – –× (no resolutions)
6. Heroic: –⏑⏑–× (no resolutions)
7. Miscellaneous (everything else)Footnote ²¹

In the first four categories, we allow one resolution of a long into two shorts. Despite some temptation, we have nowhere permitted two or more resolutions in a single clausula. Once you allow more than one resolution, clausulae quickly begin to lose their individual character: should ⏑⏑⏑⏑⏑–× count as a twice-resolved cretic-trochaic or a once-resolved double trochee? There are ways around this problem, but complications immediately multiply, and we doubt whether something like ⏑⏑⏑⏑⏑⏑⏑× could ever be felt as anything other than a very long series of shorts.Footnote ²²
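The typology and the one-resolution rule can be illustrated with a small pattern matcher. What follows is a minimal sketch and not the code released with the article. The ASCII notation (`-` long, `u` short, `x` for the final anceps), the names `CATEGORIES`, `matches` and `classify`, and the rule that categories are tried in the order listed are all assumptions made for the example.

```python
# Assumed notation: "-" long, "u" short, "x" final syllable (anceps).
# Patterns are transcribed from the seven-category list; the order in which
# categories are tried is an assumption of this sketch.
CATEGORIES = [
    ("cretic-trochaic", ["-u--x", "uuu--x", "-uuu-x", "-u-uux"]),
    ("double cretic/molossus cretic",
     ["-u--ux", "----ux", "uuu--ux", "-uuu-ux", "-u-uuux", "uu---ux",
      "-uu--ux", "--uu-ux", "---uuux", "-u---ux"]),
    ("double trochee", ["-u-x", "uuu-x", "-uuux"]),
    ("hypodochmiac", ["-u-ux", "uuu-ux", "-uuuux"]),
    ("spondaic", ["---x"]),
    ("heroic", ["-uu-x"]),
]


def matches(scansion, pattern):
    """True if the end of `scansion` fits `pattern` ("x" fits either quantity)."""
    if len(scansion) < len(pattern):
        return False
    tail = scansion[-len(pattern):]
    return all(p == "x" or p == s for p, s in zip(pattern, tail))


def classify(scansion):
    """Return the first category whose pattern fits, else 'miscellaneous'."""
    for name, patterns in CATEGORIES:
        if any(matches(scansion, p) for p in patterns):
            return name
    return "miscellaneous"


print(classify("-uuu--"))  # scansion of esse videatur
print(classify("----"))    # scansion of omnes essent
```

Because only singly resolved forms appear in the pattern lists, a sequence such as `uuuuu--` can only be matched as a once-resolved double trochee; a scheme that allowed two resolutions would need an explicit tie-breaking rule.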

We have used the Packard Humanities Institute (PHI) Latin texts as our corpus of data.Footnote ²³ These texts are of high quality and freely available, although they require extensive preprocessing for machine analysis. First they must be reformatted to Unicode and extra spaces and line breaks must be removed, along with section numbers and book divisions and so forth. Then their orthography must be made uniform: we have converted consonantal i and u to j and v throughout, and systematically incorporated certain unusual features of Latin prosody (for example, huius → hujjus). Then the texts must be ‘macronised’: vowels that are long by nature must be so marked. This is a non-trivial process for which we have used the excellent tool of Johann Winge, which shows a remarkably high degree of accuracy for classical Latin texts (95–98 per cent).Footnote ²⁴ This done, the texts must be syllabified, i.e., separated out into their constituent syllables; here again we have made use of an open-source tool, this time from the Classical Language Toolkit (CLTK).Footnote ²⁵ Finally, problematic elements must be removed from our sample and tracked separately: we exclude clausulae that contain abbreviations (most notably proper names), Roman numerals, textual corruptions marked by editors (daggers, brackets and the like) or fewer than four syllables.

After preprocessing, by default we collect up to thirteen syllables' worth of clausular data before every mark of ‘heavy’ punctuation, viz. full stops, semicolons, colons, question marks and exclamation marks (. ; : ? !). This is not a perfect method, since clausulae can and do occur where editors tend to punctuate with commas, as well as in places where there is no punctuation at all.Footnote ²⁶ Furthermore, many previous scholars have only looked at clausulae before full stops, question marks and exclamation marks.Footnote ²⁷ Including semicolons and colons by default seems best to us, but within our framework users can decide for themselves and set which punctuation they would like to consider, and so results with different punctuation patterns can easily be generated.Footnote ²⁸
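The collection step can be sketched as follows. This is an illustration under stated assumptions, not the released implementation: the input format (a flat list in which each item is either one syllable or one punctuation character) and the names `collect_clausulae`, `heavy` and `max_syllables` are hypothetical.

```python
HEAVY_PUNCTUATION = set(".;:?!")  # the default set named in the text


def collect_clausulae(tokens, heavy=HEAVY_PUNCTUATION, max_syllables=13):
    """Return the syllables preceding each mark of heavy punctuation.

    `tokens` is an assumed input format: every item is either a single
    syllable or a single punctuation character.
    """
    clausulae = []
    current = []  # syllables seen since the last heavy punctuation mark
    for token in tokens:
        if token in heavy:
            if current:
                clausulae.append(current[-max_syllables:])
            current = []
        elif token.isalpha():  # commas and other light marks are skipped
            current.append(token)
    return clausulae


tokens = ["a", "b", "c", ",", "d", ";", "e", "f", "."]
print(collect_clausulae(tokens))                     # semicolon counts as a break
print(collect_clausulae(tokens, heavy=set(".?!")))   # semicolon ignored
```

Passing a different `heavy` set is how a user would restrict the analysis to full stops, question marks and exclamation marks. The exclusion of short or otherwise problematic clausulae is a separate step and is not shown here.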

Then these data must be scanned, sorted and counted. On the one hand, it is easy to write a programme to scan macronised Latin texts. The basic rules are straightforward: if a syllable is closed or ends with a long vowel, it is long. If a syllable is open and ends with a short vowel, it is short. But there are a variety of subtleties that must be accounted for, including elision, instances of mute + liquid and cases of short open syllables before s impura (sc, sm, sp, sq, st, z; so ipse sceleratus); in Cato, at least, one might even countenance ‘sigmatic ecthlipsis’, or loss of final s.Footnote ²⁹ By default we do elide, do lengthen a short final open syllable followed by an s impura, but do not lengthen a short vowel followed by a mute + liquid. We think that this is the most accurate representation of classical Latin pronunciation.Footnote ³⁰ But we also allow users to set these parameters for themselves, and we try to track more fine-grained data as well: so we record whether an elision is of a long vowel/diphthong or of -m or of a short vowel and allow users to choose to elide or not elide in any of these categories.Footnote ³¹ We furthermore track word division/word shape and word accent, which may be relevant if we wish to consider iambic shortening or rules for resolutions that depend on word division or hypothetical ‘prose ictus’.Footnote ³² In sum, we have built in flexibility to allow users to set their own preferred parameters and slice the data differently.
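The basic scansion rules and the default settings for elision and s impura can be sketched as below. This is a simplified illustration, not the released scanner. It assumes lowercase, already syllabified input with long vowels written as precomposed macron characters; it treats only ae, au and oe as diphthongs; and it leaves mute + liquid to the syllabifier (pa-tris gives a short first syllable, pat-ris a long one). The function and parameter names are hypothetical.

```python
LONG_VOWELS = set("āēīōūȳ")
VOWELS = LONG_VOWELS | set("aeiouy")
CONSONANTS = "bcdfghjklmnpqrstvxz"
DIPHTHONGS = ("ae", "au", "oe")                  # assumption: only these three
S_IMPURA = ("sc", "sm", "sp", "sq", "st", "z")   # list given in the text


def long_nucleus(syllable):
    """True if the syllable's vowel is long by nature (macron or diphthong)."""
    core = syllable.rstrip(CONSONANTS)
    return core.endswith(DIPHTHONGS) or core[-1:] in LONG_VOWELS


def starts_with_vowel(word):
    text = "".join(word)
    return text[0] in VOWELS or (text[0] == "h" and text[1:2] in VOWELS)


def can_elide(word, next_word):
    """Final vowel or vowel + m before an initial vowel or h + vowel."""
    last = word[-1]
    ends_in_vowel = last[-1] in VOWELS
    ends_in_vowel_m = last.endswith("m") and last[-2:-1] in VOWELS
    return (ends_in_vowel or ends_in_vowel_m) and starts_with_vowel(next_word)


def scan(words, elide=True, lengthen_before_s_impura=True):
    """Scan a list of words (each a list of syllables) into '-' and 'u'."""
    marks = []
    for i, word in enumerate(words):
        next_word = words[i + 1] if i + 1 < len(words) else None
        for j, syllable in enumerate(word):
            word_final = j == len(word) - 1 and next_word is not None
            if word_final and elide and can_elide(word, next_word):
                continue  # an elided syllable is not counted
            coda = len(syllable) - len(syllable.rstrip(CONSONANTS))
            closed = coda > 0
            if (word_final and coda == 1 and syllable[-1] != "x"
                    and starts_with_vowel(next_word)):
                closed = False  # a single final consonant joins the next word
            is_long = closed or long_nucleus(syllable)
            if (word_final and not is_long and lengthen_before_s_impura
                    and "".join(next_word).startswith(S_IMPURA)):
                is_long = True
            marks.append("-" if is_long else "u")
    return "".join(marks)


print(scan([["ēs", "se"], ["vi", "de", "ā", "tur"]]))
print(scan([["ip", "se"], ["sce", "le", "rā", "tus"]]))
print(scan([["ip", "se"], ["sce", "le", "rā", "tus"]], lengthen_before_s_impura=False))
```

Keeping each treatment behind its own switch mirrors the design described above: changing one assumption means rerunning the scan rather than recounting by hand.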

With these tools, we can generate all manner of reports in seconds.Footnote ³³ After preprocessing, we can show the complete syllabification, scansion and accentuation of any Latin text; we can show those results divided into clausulae; we can produce data on numbers and percentages of individual clausulae within a text; and of course we can combine all this information to yield comprehensive data on the prose rhythms of any set corpus of Latin literature, as we do below. Such reports allow us to ask and answer with ease questions that would have taken weeks and months and years of tedious (and error-prone) calculation before.

Some Limitations

Our method certainly is not perfect. For example, we currently assume that Latin prosody showed no variation or evolution over time. This is manifestly untrue, most obviously perhaps in the case of final -o. We know from verse evidence that in the first century b.c., final -o in most words was regularly long (for example, ergō, so always in Vergil). But by the time of Lucan, and still more so by that of Martial, final -o was usually short. We treat such cases as invariably long. In our current model, we likewise ignore effects like iambic shortening, which presumably was in operation in all ages on at least some words at least some of the time.

With further modifications we could allow users to consider different treatments of prosody, but two risks immediately present themselves: first, if final -o is treated differently in, for example, Cicero and Pliny, it may no longer be legitimate to compare results between the two. Second, and perhaps more seriously, it is hard to know in any individual case whether -o is pronounced short or long. In Silius Italicus we find only ergŏ — except at 16.217 ‘cui nescire licet? quin ergō tristia tandem’. For Silius, metre guarantees prosody in each instance, including when it differs from our expectations. But what do we do with Pliny the Younger? At Ep. 6.19.5 ‘concursant ergo candidati’, ergō gives a ‘better’ clausula (molossus ditrochee), and so perhaps -ō should be preferred there, but there is no metrical guarantee. For now consistent practice throughout seems methodologically safest.Footnote ³⁴

There is also the fact that our output will necessarily be determined by our input. The PHI texts are meticulous reproductions of standard print editions, but they do not include a critical apparatus, and so we cannot take account of variant readings. More importantly, for the past century editors have made decisions among variant readings and competing emendations at least in part based on their understanding of prose rhythm. Indeed, they have also considered prose rhythm in how they punctuate their texts. Thus, to some degree, prose rhythm has already been ‘baked in’ to these texts, and our results could be circular. While this is admittedly true at a local level — that is, in the case of any given sentence — over a large corpus, the vast majority of clausulae will be free from textual troubles, and most editorial decisions concerning choice of reading and choice of punctuation will not have hinged on prose rhythm. This objection is thus more potent in theory than practice.

Finally, the various component parts of our programme occasionally err. Although the macroniser returns correct results 95–98 per cent of the time, the rest of the time it does not.Footnote ³⁵ Even more rarely, sometimes the u/v and i/j converter makes a mistake, as does the syllabifier.Footnote ³⁶ While we have tried to make our algorithms as accurate as we can, some error inevitably remains, and we have not adjusted any of our results by hand. We plead the following:

1. The error is very small by comparison with the enormous amounts of data that we can consider. Our sample size is large enough that we can rely on the central limit theorem to justify our statistical analysis. Put plainly: Big Data eliminates small error as a practical issue.
2. The error will be the same in all of our tests. That is to say, we expect that the same types and proportions of error will be present in a text of Cicero or Caesar or Apuleius. Since we use a uniformly consistent methodology, we will always be comparing like with like.
3. It seems very likely that those who count by hand make mistakes too, although because their results are not easily reproducible, it is very hard to determine what kinds of mistakes they have made and how often they have made them.Footnote ³⁷

Our method is not perfect, but we believe that the advantages of getting very accurate — but not perfect — results on large swaths of data in an instant are bigger than the advantages of getting ‘perfect’ results on small amounts of data that take a long time to compile, which cannot be verified and from which it is hard to generalise.Footnote ³⁸

II DATA

Without further ado, we present in tabular form some of the data that our algorithms have generated. We give first a table of the prose rhythms of most major extant Latin prose authors through the age of Trajan, with Suetonius, Gellius and Apuleius appended. There follow tables of Cicero's speeches, his rhetorical and philosophical works, and his letters. Finally we give detailed results for Tacitus and Pliny, which we will discuss in the next section. The arrangement within each table is broadly chronological, although perfect consistency in arrangement has proved neither possible nor desirable.

Fragmentary and incomplete works have generally been excluded.Footnote ³⁹ We have removed passages in verse from Seneca's Apocolocyntosis and Petronius’ Satyrica, but otherwise have not systematically taken special account of verses or quotations.Footnote ⁴⁰ In some authors and works particular caution must be exercised. Given the nature of the Suasoriae and Controuersiae, for example, the statistics for Seneca the Elder are probably of little value and are included only for the sake of completeness. Similar warnings apply to certain texts with particularly small sample sizes or those with unusual transmissions. Numbers never absolve readers of the responsibility to think critically, but with the appropriate caveats in mind, we hope that these numbers will be useful.

The columns in the tables are as follows:

A. Author and title of work.
B. Total number of clausulae detected in the work.
C. Total number of clausulae excluded from consideration (those containing abbreviations, editorially marked textual corruptions, fewer than four syllables and so forth).
D. Total number of clausulae considered (= B - C).
E. Percentage of cretic trochees (including resolved forms).
F. Percentage of double cretics and molossus cretics (including resolved forms).
G. Percentage of double trochees (including resolved forms).
H. Percentage of hypodochmiacs (including resolved forms).
I. Percentage of double spondees (no resolutions).
J. Percentage of heroic clausulae (no resolutions).
K. Total percentage of ‘artistic’ clausulae (= E + F + G + H).
L. Total percentage of double spondees and heroic clausulae (= I + J).
M. Total percentage of miscellaneous (that is, all other) clausulae.
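The relationships among the columns can be made concrete with a short sketch. The function name `summarise`, the category labels and above all the counts are invented for illustration and are not taken from the tables.

```python
ARTISTIC = ("cretic-trochaic", "double cretic/molossus cretic",
            "double trochee", "hypodochmiac")


def summarise(detected, excluded, counts):
    """Derive columns D, K, L and M from raw counts per category."""
    considered = detected - excluded                   # D = B - C
    pct = {name: 100.0 * n / considered for name, n in counts.items()}
    return {
        "D": considered,
        "K": sum(pct[name] for name in ARTISTIC),      # K = E + F + G + H
        "L": pct["spondaic"] + pct["heroic"],          # L = I + J
        "M": pct["miscellaneous"],                     # M = everything else
    }


# Invented counts, for illustration only.
example = summarise(1050, 50, {
    "cretic-trochaic": 300,
    "double cretic/molossus cretic": 200,
    "double trochee": 250,
    "hypodochmiac": 50,
    "spondaic": 80,
    "heroic": 20,
    "miscellaneous": 100,
})
print(example)
```

Since every considered clausula falls into exactly one category, K, L and M always sum to 100 per cent.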

More detailed tables will be found in the Supplementary Material online (https://doi.org/10.1017/S0075435819000881).

Table 1 All authors.

Table 2 Cicero's speeches.

Table 3 Cicero's rhetorica and philosophica.

Table 4 Cicero's letters.

Table 5 Tacitus.

Table 6 Pliny the Younger.

III ANALYSIS

The foregoing tables provide an order of magnitude more information about Latin clausulae than has been available before, and they provide it all in one place with a consistent methodology. We hope that they will prove useful in a variety of research questions, and we give a sample of such questions below. These only scratch the surface of what we think is possible. We begin with a new approach to determining statistical significance in prose rhythm data, and then proceed to specific questions about the prose rhythm practices of individual authors like Cicero, Sallust, Tacitus and Pliny the Younger.

How Do You Tell If Any of These Data Are Meaningful? A New Approach

It is not necessarily obvious that the use of particular sequences of short and long syllables should be regarded as a consciously sought artistic phenomenon in Latin prose. After all, every Latin syllable is long or short, and so every sentence must end with some pattern of longs and shorts.Footnote ⁴¹ Furthermore, the character of the Latin language itself will dictate that some patterns occur more frequently than others: long syllables are more common than short, for example, and so it would surprise no one to hear that – – –× is more common than ⏑⏑⏑×. Likewise, many authors favour verbs at the ends of clauses (i.e., in an important clausular position), and the third person and past tense are disproportionately represented in our surviving texts. These and many other tightly intertwined biases make it extremely hard — we think impossible — to establish any kind of ‘baseline’ expected distribution of rhythms. There is simply no way to say that you would ‘expect’ Latin sentences to end with a cretic-trochee 6 per cent of the time: what do you base your expectations on?

Scholars have generally taken one of three approaches to this question. Some, like Zielinski, ignored it altogether, and simply presented absolute numbers and percentages. But from the beginning it was objected that, for example, reporting that clausulae of the ēssĕ uĭdĕātūr type occur with 4.7 per cent frequency in Cicero's speeches whereas the type ōmnēs ēssēnt occurs 6.4 per cent of the time is not in itself useful. What if ēssĕ uĭdĕātūr–type clausulae naturally occur in Latin 2.4 per cent of the time, while the type ōmnēs ēssēnt naturally occurs 23.5 per cent of the time? Then the real point of interest would be that Cicero sought out the former and deliberately avoided the latter, but this is hidden behind the absolute frequencies ‘4.7 per cent’ and ‘6.4 per cent’.Footnote ⁴² To determine the significance of any observed frequency, we must somehow compare it against an expected baseline.

A second approach has been to calculate an expected value based on a ‘neutral’ sample of Latin. Albert De Groot at one point tried sampling scholarly translations of Greek texts made in the nineteenth century, but it is almost impossible to say how such scholarly Latin would map onto a native speaker's intuitions about rhythm.Footnote ⁴³ François Novotný looked at the distribution of syllables not in clausular position, but this is to compare different things.Footnote ⁴⁴ Others have tried still other approaches: Henri Bornecque, for example, considered the proportions of various patterns in authors whom he deemed unlikely to be rhythmic.Footnote ⁴⁵ But this is arbitrary at best, and circular at worst; deciding that Sallust, say, is unrhythmic, and using his numbers as a baseline, is simply to assume your desired conclusion, and it is not much helped if you add a few other authors into the mix.Footnote ⁴⁶

In response to the problems of external comparison, Tore Janson and his student Hans Aili pioneered a form of ‘internal comparison’.Footnote ⁴⁷ They looked at a sample of an individual author's clausulae and determined the frequency of longs and shorts in each position (that is, what percentage of penultimate syllables are long, what percentage of antepenultimate syllables are long and so forth). From this they calculated an expected frequency for each type of clausula in that author, which is simply the product of the observed frequencies for each individual syllable.Footnote ⁴⁸ Then they could compare the observed percentage of a given clausula with its expected value and run statistical tests on their results. This method is ingenious, but it has a fundamental weakness that vitiates any statistics derived from it: these scholars base their ‘expected’ values on the very material that they are trying to observe. If an author systematically seeks certain clausulae and avoids others, those preferences will already be part of the ‘expected’ values and so cannot be called neutral or natural. It is a circular procedure.Footnote ⁴⁹
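The arithmetic of internal comparison is simple enough to show in a few lines. The sketch below follows the description just given (expected frequency as the product of per-position frequencies); it is not Janson's or Aili's own procedure in every detail, and the function name and the toy sample are invented for illustration.

```python
def expected_frequency(sample, pattern):
    """Expected share of `pattern` if syllable positions were independent.

    `sample` is a list of equal-length strings of "-" (long) and "u" (short)
    covering the positions under study; `pattern` has the same length.
    """
    n = len(sample)
    expected = 1.0
    for position, quantity in enumerate(pattern):
        share = sum(1 for s in sample if s[position] == quantity) / n
        expected *= share  # product of the observed positional frequencies
    return expected


# Toy sample, invented for illustration.
sample = ["-u-", "-u-", "---", "u--"]
observed = sample.count("-u-") / len(sample)
print(observed, expected_frequency(sample, "-u-"))
```

The circularity is visible in the code: `expected_frequency` reads nothing but the sample itself, so an author's preferences feed directly into the figure that is supposed to serve as a neutral baseline.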

We propose a new approach to the question of expected values. We think that the only secure basis for comparison is to look at the tendencies of individual authors and attempt to determine whether there are statistically significant differences in their practices. If so, then we can at least say that the differences among authors are unlikely to be due to random chance. Until now, this task was more or less impossible, because while there exist studies of individual authors’ rhythmic tendencies, the scholars carrying out these studies made different assumptions and employed different methodologies. Our data, by contrast, allow a comparison of like with like across all of Latin prose. Furthermore, in authors with sufficiently large corpora, we can also consider a portion of the corpus and determine whether its rhythmic practices match the rest of the corpus. So with Cicero's speeches, for example, we can consider each individual speech separately and compare it to the rest of his corpus with that speech removed. Indeed, such a comparison can even be applied to individual letters of Cicero's to determine whether it is likely that he paid extra attention to rhythm in them, or, with some further work, to compare the rhythmic practices of speeches and narrative in a historian. We will carry out all of these tests in the following sections.
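The comparison of one work against the rest of its corpus amounts to building one two-row table per work. A minimal sketch, with hypothetical names and invented counts, might look like this:

```python
def leave_one_out_tables(works):
    """Yield (title, table) pairs comparing each work with all the others.

    `works` maps a title to (artistic, non_artistic) counts. Each table is
    [[work_artistic, work_non_artistic], [rest_artistic, rest_non_artistic]].
    """
    total_artistic = sum(a for a, _ in works.values())
    total_non = sum(n for _, n in works.values())
    for title, (artistic, non) in works.items():
        rest = [total_artistic - artistic, total_non - non]
        yield title, [[artistic, non], rest]


# Invented counts for three placeholder works.
works = {"work A": (60, 40), "work B": (70, 30), "work C": (50, 50)}
for title, table in leave_one_out_tables(works):
    print(title, table)
```

Each resulting table can then be passed to whatever significance test is chosen; removing the work from the totals keeps it from being compared with itself.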

Any such statistical tests must be used with appropriate caution, for their results are wholly determined by the data input. Take Varro and his two substantially extant works, De lingua Latina and De re rustica. We could consider his distribution of clausulae in five categories (including resolutions in each): cretic-trochaic, double cretic or molossus cretic, double trochee, hypodochmiac, and ‘everything else’. We would then have a table of data like this:

The most appropriate statistical test to analyse such data, and one with a long history in studies of prose rhythm, is the chi-square test.Footnote ⁵⁰ The details are available in any statistical handbook,Footnote ⁵¹ but in essence, the chi-square test applied to these data will test the null hypothesis that the two rows of data come from the same distribution and that variation between the two is merely due to chance. (This is not a measure of degree of difference between two samples, but a test of whether these differences are unlikely to arise by chance if both samples were drawn from identical populations.) From our chi-square test statistic is derived a p-value; if our p-value is below a certain threshold (in this paper, as often, .05), we reject the null hypothesis and conclude that there is a statistically significant difference between the two rows of data.Footnote ⁵² Put plainly, the chi-square test allows us to say whether an apparent difference in authors’ use of particular clausulae is in fact statistically significant.Footnote ⁵³
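For readers who want to see the computation, the Pearson chi-square statistic for a table of counts can be sketched as follows. The function name and the counts are illustrative, and no continuity correction is applied; whether a given published figure used one is a separate question that this sketch does not settle.

```python
def chi_square(table):
    """Return the Pearson chi-square statistic and degrees of freedom.

    `table` is a list of rows of observed counts, one row per work or author
    and one column per category of clausula.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected if both rows shared a single distribution.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Invented counts: two works, two categories.
print(chi_square([[10, 20], [20, 10]]))
```

Two rows and five columns give four degrees of freedom; two rows and two columns give one.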

If we run a chi-square test on the above five columns of data, we get χ² = 39.796; with four degrees of freedom this results in a p-value near zero.Footnote ⁵⁴ Such a value indicates that it is almost impossible for the prose rhythms of these two works to belong to the same distribution. But a priori this is very unlikely; Varro wrote both of them, and the rhythms of neither look to be ‘artistically’ rhythmic in the Ciceronian sense of the term. A test treating these five columns of data appears too sensitive. If, however, we pool the data differently and group our ‘artistic’ clausulae (cretic trochees, double cretics, ditrochees and hypodochmiacs) together and our ‘non-artistic’ clausulae (double spondees, heroic clausulae and everything else) together, we can look instead at the two columns of the following table:

A glance at these proportions will show that they are very similar. It is no surprise, then, that a chi-square test on these data yields χ² = 0.977, producing a p-value of about 0.32294. This p-value, by contrast, indicates that it is reasonable to conclude that any deviation in the prose rhythms of these two works is due to random chance. We get the same result if we compare the individual books of De lingua Latina and De re rustica using a chi-square test of ‘artistic’ vs ‘non-artistic’ clausulae: there are no statistically significant differences in preferences for artistic and non-artistic clausulae among the various books.
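The step from test statistic to p-value can be checked with the standard library alone. The sketch below uses the closed-form upper-tail probabilities of the chi-square distribution for one and for four degrees of freedom, the two cases that arise for a two-column and a five-column comparison of two works. The function names are hypothetical; in practice a statistics package would normally be used.

```python
import math


def p_value_df1(statistic):
    """Upper-tail probability for chi-square with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


def p_value_df4(statistic):
    """Upper-tail probability for chi-square with 4 degrees of freedom."""
    half = statistic / 2.0
    return math.exp(-half) * (1.0 + half)


print(p_value_df1(0.977))    # two pooled columns: well above .05
print(p_value_df4(39.796))   # five columns: far below .05
```

The same statistic is judged against a different distribution depending on the number of columns, which is one reason the way the categories are pooled matters so much to the outcome.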

These two very different results are a salutary warning that statistical tests must be used cautiously, and always with an eye on the underlying data and reasonable expectations.Footnote ⁵⁵ The choice of collapsing our data into two categories of artistic and non-artistic clausulae is, again, fundamentally a pragmatic one. It produces sensible and interesting results. It has the further virtue of agreeing with many of the theoretical models that have been constructed for Latin prose rhythm. But there may be better — and there are certainly other — ways of dividing the data, and binary tests between ‘artistic’ and ‘non-artistic’ clausulae should simply be seen as one useful tool, not as some kind of definitive measure.

This test also suggests that we should adjust certain assumptions, as another example will make clear. We can compare Varro's De re rustica and Cato's De agri cultura using our ‘artistic’ vs ‘non-artistic’ model as follows:

χ² = 14.463, p-value ≈ 0.00014. These two authors, according to our test, almost certainly show different propensities to artistic clausulae. The commonly accepted prior assumption is that neither Varro nor Cato cares about prose rhythm, but we suggest that this assumption is wrong. It is all but certain that any Latin author had intuitive preferences for some rhythms and unconsciously avoided others. Indeed, this is borne out by our data: when we look at our tables for all authors’ prose rhythm preferences, we nowhere see, even in supposedly ‘unrhythmic’ authors, convergence around particular baseline numbers. This should not be surprising: in English no one would expect Jonathan Franzen and David Foster Wallace to share the same rhythmic tendencies, even if they were contemporaries and friends who wrote in the same genres for similar audiences. All Latin authors have their own rhythmic profiles, and thus no universal expected values can be established. But authors can be compared with each other, and furthermore, authors can be compared with the artificial system of ‘artistic’ clausulae adopted by Cicero and many later writers.

So Varro is consistent with Varro, and Caesar is consistent with Caesar:

χ² = 1.173, p-value ≈ 0.27879: any variation in Caesar's tendency toward artistic clausulae between the Bellum Gallicum and the Bellum ciuile is not statistically significant. By contrast, Varro and Caesar clearly differ from each other:

χ² = 148.224, p-value ≈ 0: these two authors do not have the same preferences at all. If we say that they are ‘not rhythmic’, what we really mean is that they do not follow the distribution of clausulae characteristic of Cicero, because they clearly have their own tendencies in how they distribute longs and shorts.Footnote ⁵⁶

It is pretty clear from our data that no two authors show the same rhythms, although many authors are consistent with themselves in their preferences (so, for example, Sallust). What also seems pretty clear is that some authors deliberately avoid spondaic, heroic and other unusual clausulae in favour of forms of the ‘artistic’ four (including resolved forms), viz. cretic trochees, double cretics (or molossus cretics), double trochees, and hypodochmiacs. Latin teems with long syllables, and authors who have a markedly lower proportion of – – – × are probably avoiding it deliberately. The effects can be pervasive: in Cicero, for example, audistis is found 72 times, audiuistis 2, audisti 16, audiuisti 0.Footnote ⁵⁷ Cicero seems to avoid the sequence of four long syllables. So too does Pliny the Younger show a marked aversion to double spondaic clausulae, which occur in his writings only around 6 per cent of the time. Authors like Tacitus, by contrast, are much less averse to double spondees, which comprise nearly a quarter of his clausulae.

In addition to double spondees, it is especially relevant to consider the frequency of heroic clausulae (that is, hexameter endings). In most authors these are not very frequent, but in certain authors, like Cicero, they are exceptionally rare.Footnote ⁵⁸ The sum of double spondaic and heroic clausulae thus provides an approximate index for how ‘artistically’ rhythmic an author is; adding in the rare miscellaneous clausulae makes this measure the precise complement of the artistic four.Footnote ⁵⁹ Authors who clearly pay attention to the canons of an artificial doctrine of ‘artistic’ prose rhythm include (in parentheses is given the author's percentage of artistic clausulae):

1. Cicero (e.g. 83.42 per cent in the speeches taken together)Footnote ⁶⁰
2. Velleius Paterculus (79.68 per cent)Footnote ⁶¹
3. Seneca the Younger (e.g. 80.92 per cent in the Epistulae morales)Footnote ⁶²
4. Q. Curtius Rufus (85.29 per cent)Footnote ⁶³
5. Pomponius Mela (82.62 per cent)Footnote ⁶⁴
6. Pliny the Younger (84.87 per cent in Epist. 1–9; 85.36 per cent in Pan.)Footnote ⁶⁵
7. Suetonius (80.57 per cent in the Vitae)Footnote ⁶⁶
8. Apuleius (in some works; e.g. 78.50 per cent in Met.)Footnote ⁶⁷
9. [Quintilian], Declamationes maiores (84.03 per cent)Footnote ⁶⁸
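
The percentages in this list follow from pooling clausula counts into the ‘artistic four’ and everything else. A minimal sketch of that bookkeeping is given below; the category labels and the counts are hypothetical, and resolved forms are assumed to have been folded into their parent category already.

```python
# Hypothetical labels for the 'artistic four'; resolved forms are assumed to
# be counted under their parent category before this step.
ARTISTIC = {
    "cretic_trochee",
    "double_cretic_or_molossus_cretic",
    "double_trochee",
    "hypodochmiac",
}


def artistic_share(counts):
    """Percentage of clausulae that fall into the artistic categories."""
    total = sum(counts.values())
    artistic = sum(n for name, n in counts.items() if name in ARTISTIC)
    return 100 * artistic / total


# Invented counts for illustration only, not data from the article.
counts = {
    "cretic_trochee": 300,
    "double_cretic_or_molossus_cretic": 250,
    "double_trochee": 200,
    "hypodochmiac": 50,
    "double_spondee": 120,
    "heroic": 30,
    "miscellaneous": 50,
}

share = artistic_share(counts)
# Double spondees, heroic and miscellaneous clausulae form the exact complement.
non_artistic_share = 100 - share
print(share, non_artistic_share)
```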

In the main, our results confirm earlier scholars’ smaller, sample-based studies of individual authors; such replication and verification has long been missing in studies of prose rhythm.Footnote ⁶⁹ So, for example, Velleius Paterculus shows a remarkable affection for double cretic and molossus cretic rhythms, which comprise some 40 per cent of his clausulae. This striking preference is unexpected, unprecedented and not imitated by later authors. Aili looked at a sample of 500 Velleian clausulae, and, although counting only six syllables and presenting his data somewhat differently, found essentially the same tendency.Footnote ⁷⁰

The great bulk of Latin prose authors, however, seem to have followed their own rhythmical preferences, not a set of Hellenistic precepts. One exception to this generalisation should be noted: both Sallust and especially Livy must have consciously sought out heroic and spondaic rhythms, and to an extraordinary degree (Sall., Iug.: 34.93 per cent, Cat.: 33.73 per cent; Livy: 43.99 per cent). Livy's preferences moreover intensified over time, being least marked in the first decade (35.41 per cent) but increasingly so in Books 21–30 (48.37 per cent) and 31–40 (49.72 per cent). These authors have deliberately chosen to go in precisely the opposite direction to the Ciceronian system.Footnote ⁷¹ Whether Livy's and Sallust's predilection for non-artistic clausulae constitutes a ‘historical style’ is unclear; Tacitus, at any rate, does not follow their example.Footnote ⁷²

In sum, ‘expected values’ for the distribution of rhythms in unmarked Latin prose simply cannot be established on the basis of surviving evidence, for all authors have their own rhythmic preferences. But there are statistically significant differences in these authorial preferences. Furthermore, an important subset of Latin authors adhered in some fashion to a particular ‘artistic’ rhythmic canon, and at least a couple deliberately rebelled against it. It is in this sense that we can claim that Latin prose rhythm is not just a chimera that scholarly syllable counters have been chasing after in vain for over a century.

Authorial Variation and ‘Spurious’ Compositions

Cicero has always provided the notional benchmark against which Latin prose rhythm has been measured, but Cicero's own rhythmical practices vary widely over time, across genres and even within an individual work. One often reads, for example, that Cicero was less attentive to prose rhythm in his correspondence. While this claim can and should be nuanced, it is clearly right, as can be seen by comparing Cicero's speeches with the Epistulae ad Atticum:

χ² = 856.038, p-value ≈ 0: these distributions are very different. The letters are markedly less concerned with artificially artistic prose rhythm.

Of course, not all letters are created equal.Footnote ⁷³ When Cicero is writing for a wider audience, as in his long letter of advice to Quintus during the latter's time as a provincial administrator in Asia, he uses markedly different rhythms than when he writes for his brother's ears alone:

χ² = 35.94, p-value ≈ 0. The polished and public Q. fr. 1.1 was composed with much more attention to pretty clausulae.

Furthermore, it should be observed that even within Cicero's corpus of speeches we find considerable variation. Pro Roscio comoedo, for example, is notably non-artistic in its rhythms, perhaps showing a ‘studied negligence’ in imitation of comedy.Footnote ⁷⁴ While general trends can be descried — the earliest speeches show fewer cretic trochees, say — there exist occasional counter-examples to almost all of them (so the later Pro Rabirio Postumo shows a very low percentage of cretic trochees). Given all this variation, can we even talk about Cicero's ‘prose rhythm preferences’ as some kind of Platonic form? We are sceptical.

Studies of prose rhythm often hold out the promise of uncovering an author's unique rhythmic fingerprint, a sort of unchanging stylistic essence. Such a fingerprint could be of enormous use in questions of authenticity. Some authors, as we have seen, do present a very consistent fingerprint: Caesar is consistent with Caesar; Varro is consistent with Varro. Other authors, however, are chameleons, adapting their rhythms to circumstances. Cicero is a chameleon. Such authorial variation and adaptability means that we cannot naively rely on prose rhythm to distinguish between genuine and spurious compositions.

This claim is most easily demonstrated by using our artistic vs non-artistic test for each of Cicero's speeches set against the corpus of the rest of his speeches. In effect, we are conducting a thought experiment in which we ask, ‘If this work were not known to be Cicero's, would it fit rhythmically with the rest of his corpus?’ Table 7 shows Cicero's surviving speeches, sorted from most to least artistically rhythmic.

Table 7 Cicero's speeches (ranked).

The test that we have just described would identify fully twenty-two of these speeches as suspect:

• Non-artistic to a statistically significant degree (9): Quinct., Rosc. Am., Caecin., Tul., Verr. 2.1 and 2.2, Q. Rosc., Rab. Post., Phil. 8.
• Artistic to a statistically significant degree (13): Leg. Man., Catil. 2 and 4, Arch., Dom., Vat., Prov. cons., Cael., Balb., Planc., Marcell., Phil. 3 and 4.
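
The thought experiment described above amounts to a leave-one-out comparison: each work is tested against the pooled counts of all the others. The following sketch shows one way this could be set up. It assumes the same uncorrected 2×2 chi-square as before and a conventional 0.05 significance threshold; the threshold, the function names and the counts are all assumptions made for illustration, not details confirmed by this passage.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


def flag_outliers(works, alpha=0.05):
    """Test each work against the pooled rest of the corpus.

    `works` maps a title to (artistic, non_artistic) counts. Returns the works
    whose artistic share differs significantly from the rest, with direction.
    """
    total_art = sum(art for art, _ in works.values())
    total_non = sum(non for _, non in works.values())
    flagged = {}
    for title, (art, non) in works.items():
        rest_art, rest_non = total_art - art, total_non - non
        p = p_value_df1(chi_square_2x2(art, non, rest_art, rest_non))
        if p < alpha:
            own = art / (art + non)
            rest = rest_art / (rest_art + rest_non)
            flagged[title] = "more artistic" if own > rest else "less artistic"
    return flagged


# Invented counts for four hypothetical speeches, not data from Table 7.
works = {
    "Speech A": (800, 200),
    "Speech B": (790, 210),
    "Speech C": (40, 40),
    "Speech D": (60, 2),
}
flagged = flag_outliers(works)
print(flagged)
```

Because every work is compared with a pool that includes the other outliers, a single very unusual text can shift the baseline for all the rest, which is one more reason to read such flags with caution.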

Now these data are not without use. We have already commented on the exceptional Pro Roscio comoedo, which is, rhythmically speaking, far and away Cicero's ‘least Ciceronian’ speech. It is probably not coincidence that most of the other less ‘artistic’ speeches cluster at the beginning of Cicero's career; it would not be surprising to find that his rhythmic preferences evolved and were refined over time, and any such change has been flattened out in this test. And yet Philippic 8 is rather unexpected; Cicero's tendency towards more artistic clausulae is hardly a fixed law. On the other hand, sometimes Cicero seems to have gone out of his way to be especially ‘artistic’ in his rhythms. Such speeches include some of Cicero's most important, like the Catilinarians (a sign of careful revision?), as well as particularly literary efforts like Pro Archia and Pro Caelio.Footnote ⁷⁵

But while the data are not useless, a test showing that fully 38 per cent of Cicero's speeches appear ‘non-Ciceronian’ is clearly not the appropriate instrument to determine authorship of a potentially Ciceronian speech.Footnote ⁷⁶ For Cicero, prose rhythm is not just a signature of authorship; it is in fact a form of content. A too simple application of statistical tests to prose rhythm to resolve questions of authenticity risks conflating variation in content with variation in authorship.

We still think that such tests can sometimes be applied with profit, but they must be applied very carefully. They work best with authors who do not appear to vary their rhythmic practices depending on content, like Sallust. As our tables show, Sallust exhibits the same rhythmic profile in all of his historical works, and we shall soon see that he does not evince any differences between his narrative and set-piece speeches within those works either. The author of the pseudo-Sallustian Inuectiua in Ciceronem, on the other hand, has a markedly different set of preferences for artistic and non-artistic clausulae:

χ² = 4.549, p-value ≈ 0.03293.Footnote ⁷⁷ One might still try to argue that this is simply an instance of generic differences dictating different rhythms, but in any case we can say that overall propensity to artistic clausulae does not encourage belief in Sallustian authorship.Footnote ⁷⁸ By contrast, preferences for artistic clausulae at least do not militate against the claim that Sallust wrote the Epistulae ad Caesarem:

χ² = 0.404, p-value ≈ 0.52503. The rhythms of the Epistulae ad Caesarem are indistinguishable from Sallust in his historical works; if they are not genuine, the imitator showed a remarkably accurate knowledge of Sallust's unusual rhythmic tendencies.Footnote ⁷⁹

But such applications are perhaps more limited than we might want. Rhetorica ad Herennium, for example, is rhythmically indistinguishable from De inuentione, but this is not a function of Ciceronian authorship: one might guess that similarity in content is the reason that their rhythms converge. Tests using this method can measure real differences between texts, and this is of value, but such variation may be tied to any number of factors, most notably variation in content. While in certain circumstances, particularly when an author shows very stable rhythmic practices, these tests can be a piece of evidence in the discussion of authenticity, prose rhythm is very far from a panacea for resolving the attribution of a disputed work.

Variation Within a Text: Speeches vs Narrative in Sallust and Tacitus

We have just seen that some authors vary their prose rhythm practices in different genres (private letters vs public speeches, say), and that indeed some authors show remarkable variation even within a single broad genre (Cicero's orations). This naturally leads to the question of whether authors show different rhythmic practices within an individual work. In Latin historiography, for example, is there a difference in prose rhythm between narrative and inset speeches?Footnote ⁸⁰ We have looked at the cases of Sallust and Tacitus. For Sallust, the answer is a clear no. For Tacitus the situation is more complex: Tacitus does seem to have different rhythmic profiles, and they do sometimes correlate with the distinction between narrative and speeches — but not always.

To arrive at these answers we must first separate the historians’ corpora into speeches and narrative. While it is perhaps not impossible to do this programmatically, it is a challenge,Footnote ⁸¹ and we have simply segregated by hand. We have included only longer instances of direct speech, excluding both short utterances and all indirect discourse.Footnote ⁸² Our corpora of speeches are as follows:Footnote ⁸³

Sall., Cat. 20, 33, 51, 52, 58; Iug. 10, 14, 31, 85, 102, 110; Hist. Or. Lepidus, Philippus, Cotta, Macer.

Tac., Agr. 30–2, 33–4; Hist. 1.15–16, 29–30, 37–8, 83–4; 2.47, 76–7; 3.2, 20; 4.32, 42, 58, 64–5, 73–4, 77; 5.26; Ann. 1.22, 28, 42–3, 58; 2.37–8, 71, 77; 3.12, 16, 46, 50; 4.8, 34–5, 37, 40; 6.6, 8; 11.24; 12.37; 13.21; 14.43–4, 53–4, 55–6; 15.2, 20; 15.22, 31.

Table 8 Sallust and Tacitus, speeches vs narrative.

For Sallust the results are plain.Footnote ⁸⁴ For example, in the Bellum Iugurthinum:

χ² = 0.977, p-value ≈ 0.32294. The Bellum Catilinae shows an even greater similarity:

χ² = 0.043, p-value ≈ 0.83573. Even the longer speeches of Sallust's Historiae seem to fit this pattern. We here compare them with the Bellum Iugurthinum and Bellum Catilinae, because the fragmentary state of the remainder of the Historiae makes any inferences drawn against them unreliable at best:

χ² = 0.328, p-value ≈ 0.56684. Sallust shows an apparently unshakable consistency in his preferences for artistic and non-artistic clausulae, both across his various works and within them, making no distinctions between speeches and narrative.

For Tacitus the story is more nuanced. In the Annales, he shows a slight tendency toward more artistic clausulae in speeches, but it is slight and not statistically significant:

χ² = 1.523, p-value ≈ 0.21717. In his last work, it appears that Tacitus did not differentiate speeches from narrative rhythmically, or at any rate that any differentiation is so small that it may well have arisen by chance.

But in his earlier works the tendency toward artistic clausulae in speeches is more pronounced. So in the Agricola:

χ² = 1.31, p-value ≈ 0.25239. The chi-square test statistic here is small both because the difference in the proportion of artistic clausulae is not large and, importantly, because the sample size of speeches in the Agricola is so small. But these proportions are very nearly what we see in the Historiae, where the larger sample size allows for more statistical confidence:
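
The dependence on sample size can be made concrete with a small worked example. For a fixed set of proportions, the Pearson chi-square statistic grows in direct proportion to the total number of observations, so the same difference in artistic share can be insignificant in a short text and highly significant in a long one. The counts below are invented, and the uncorrected 2×2 statistic is an assumption of this sketch.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Invented counts: speeches 60% artistic, narrative 50% artistic.
small = (30, 20, 500, 500)
# The same proportions with ten times as many clausulae in every cell.
large = tuple(10 * x for x in small)

stat_small = chi_square_2x2(*small)
stat_large = chi_square_2x2(*large)

print(round(stat_small, 3), round(p_value_df1(stat_small), 5))
print(round(stat_large, 3), round(p_value_df1(stat_large), 5))
```

Multiplying every cell by ten multiplies the statistic by ten, while the proportions being compared are unchanged.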

χ² = 14.969, p-value ≈ 0.00011. The difference between speech and narrative here is large and statistically significant. The narrative portion of the Historiae shows almost the exact same propensity to artistic clausulae as the narrative of the Annales (and the Agricola). The speeches of the Historiae, however, resemble nothing so much as the Dialogus and Germania, from which they are indistinguishable in their preferences for artistic clausulae.

χ² = 0.207, p-value ≈ 0.64913.

χ² = 0.157, p-value ≈ 0.69193.

What do all these numbers mean? They seem to indicate that while Sallust has a uniformly consistent set of (dis)preferences for artistic clausulae, Tacitus has at least two separate rhythmic profiles that he can use. These two separate profiles sometimes correlate with the distinction between speech and narrative (so in the Dialogus, Agricola and Historiae), but not always: in the Annales, Tacitus shows roughly the same proportion of artistic clausulae in both speech and narrative, and in the Germania, which is exclusively narrative, Tacitus exhibits the rhythmic preferences that he shows elsewhere for speeches. More investigation is needed here, but it is plain that prose rhythm is part of Tacitus’ literary artistry, and that he sometimes varies his practice for some kind of effect. It would certainly be a mistake to claim, as many scholars have, that Tacitus is indifferent to prose rhythm.Footnote ⁸⁵

Tacitus, Dialogus de oratoribus

We have just seen that Tacitus makes use of a particular rhythmic profile in the Dialogus de oratoribus. Now in that work he imitates Cicero in numerous and varied points of diction. He postpones igitur to second position; he uses the word autem some twenty times (compared to six instances in all of the Historiae and Annales); he indulges in a number of synonymous doublets.Footnote ⁸⁶ One might wonder whether his rhythmic preferences in the Dialogus are a sought-out imitation of Cicero too, as Gregory Hutchinson claims.Footnote ⁸⁷

It is in some sense true that the Dialogus is Tacitus’ ‘least Tacitean’ work in its propensity to artistic clausulae. A test of its numbers of artistic and non-artistic clausulae against those of the rest of Tacitus’ corpus marks it as a clear outlier:

χ² = 27.483, p-value ≈ 0. But as we have already seen, that is only part of the story. The Germania too, for example, shows the same rhythmic profile, as do the speeches in the Agricola and the Historiae.

Moreover, this propensity to artistic clausulae is not necessarily ‘Ciceronian’. The best point of comparison between the Dialogus and ‘Cicero’ is not completely clear. Does the Dialogus map onto the prose rhythm of Cicero's speeches?

χ² = 51.135, p-value ≈ 0. No, it is not even close. What about Cicero's own dialogues? Here it is hard to know what corpus to pick, but the Dialogus is less artistically rhythmic than any of Cicero's surviving dialogues. If we compare it, for example, with all of Cicero's extant rhetorical and philosophical works pooled together, we get:

χ² = 62.125, p-value ≈ 0. Again, not even close; even further away, in fact.

The rhythms of the Dialogus are clearly different from the narrative portions of Tacitus’ historical works, but they resemble the Germania and the speeches of the Agricola and the Historiae. What Tacitus is doing with this varying propensity toward artistic clausulae calls for further study, but we can say with confidence that neither in the Dialogus nor anywhere else does he even approach a true rhythmic imitation of Cicero.

Pliny the Younger

Pliny the Younger offers an interesting test case for a variety of questions, not least because he, like Sallust, presents such a consistent set of rhythmic preferences. We can thus use our statistical tests to answer questions such as: do Pliny's private letters (Ep. 1–9) differ from his correspondence with Trajan (Ep. 10)? Is there any variation within the books of private correspondence? Does Trajan's prose rhythm in Ep. 10 differ from Pliny's? And what of the rhythms of the Panegyricus, an epideictic speech perhaps liable to entirely different generic conventions from a book of stylish letters?

In the first instance we can observe that Pliny is an author with a marked preference for artistic rhythms. He shuns spondaic and heroic clausulae (even more than Cicero did in his speeches, to say nothing of his letters), and he favours cretic-trochaic rhythms to an almost unprecedented degree and with remarkable consistency across the private correspondence: they comprise some 40 per cent of the clausulae in Ep. 1–9.Footnote ⁸⁸ These preferences combine to yield an extraordinarily stable rhythmic profile across the private letters. Indeed, those similarities extend even to the Panegyricus. Consider a detailed chi-square test of the sort that showed different distributions for Varro's two works:

χ² = 5.902, p-value ≈ 0.20658. The Panegyricus, even on a very fine-grained test, cannot be distinguished from the letters, and the individual books of letters are themselves all but indistinguishable from each other.Footnote ⁸⁹
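
A finer-grained test of this kind generalises the 2×2 comparison to a table with more columns. The sketch below is a hedged illustration: the reported p-value is what a chi-square of 5.902 yields with four degrees of freedom, which would fit a two-row table with five rhythm categories, but the exact categories are not specified in this passage and the counts used here are invented. The p-value function uses the closed form that holds for even degrees of freedom only.

```python
import math


def chi_square_table(rows):
    """Pearson chi-square statistic and degrees of freedom for an r x c table
    given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    n = sum(row_totals)
    stat = 0.0
    for row, row_total in zip(rows, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / n
            stat += (observed - expected) ** 2 / expected
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, df


def p_value_even_df(chi2, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form requires a positive even df")
    half = chi2 / 2
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# Invented counts for two works across five hypothetical rhythm categories.
table = [
    [300, 250, 200, 50, 200],
    [150, 140, 95, 30, 85],
]
stat, df = chi_square_table(table)
print(round(stat, 3), df, round(p_value_even_df(stat, df), 5))

# The statistic quoted in the text, evaluated at df = 4.
print(round(p_value_even_df(5.902, 4), 5))
```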

The exception, of course, is Book 10. Trajan's replies show a clearly different rhythmic fingerprint. If we compare the pooled artistic and non-artistic patterns in Ep. 1–9 with Trajan's replies to Pliny in Book 10, the latter are conspicuously less artistic:

χ² = 33.083, p-value ≈ 0. Trajan's rhythms in Book 10 are completely different from Pliny's in Books 1–9. Indeed, Trajan's rhythms in Book 10 are completely different from Pliny's in Book 10:

χ² ≈ 10.53, p-value ≈ 0.00117. Trajan (or his chancery secretary) speaks in his own voice and with his own cadences.

The prose rhythm of Pliny's own letters in Book 10 is only slightly less ‘artistic’ than that of Books 1–9, although the difference does rise to statistical significance:

χ² = 5.066, p-value ≈ 0.02439. Nevertheless, prose rhythm appears to have been a natural part of Pliny's composition process in a way that it was not for Cicero in his letters, although it must still be a learned part, because his preferences are so distinctive — or, just maybe, he revised Book 10 for publication himself and took some care for its rhythmic properties.Footnote ⁹⁰

Finally, as we have already seen, although the Panegyricus is a speech, in it Pliny uses almost exactly the same rhythmical patterns as he does in the Epistulae. But to think of the rhythmic preferences of the Panegyricus as the same as those of the Epistulae is probably to put the cart before the horse. In his own lifetime, Pliny was above all an orator, and it is a simple twist of fate that we happen to have ten books of Pliny's letters and only one preserved speech. It seems very likely that the prose rhythms we find in his letters have their origin in the preferences that he developed for his speeches. This is probably a deliberate (and artistic) affectation, since one might have expected his correspondence, like Cicero's, to be looser about such details, and it is another reason we should consider Pliny's letters highly polished literary compositions.

IV CONCLUSIONS

Our algorithms and the data that they generate provide a powerful tool to answer questions like the ones posed above, a list which can be extended indefinitely. Because we are using computers and code, we can change assumptions or look at different texts or divide our existing texts up differently — and immediately generate refreshed data for the entirety of the corpus that we are considering. Furthermore, although it is in most cases impossible to replicate previous scholars’ methodologies with absolute precision, in broad outline we can nevertheless check their results almost instantaneously. This process of replication and verification has long been absent from studies of Latin prose rhythm. Since all our code and data are open source and publicly available, our own results can also be easily checked (and perhaps improved).

Improvements and extensions of these data may take a variety of forms. A different approach to locating clausulae, one that does not rely on punctuation, might help advance exploration of ‘internal’ clausulae, a topic which has thus far resisted rigorous analysis. More extensively marked up texts would facilitate other kinds of investigations: for example, does Cicero use different rhythms in his exordia, or narrationes or perorationes? Annotating his speeches with consistent metadata would allow for more detailed study. More sophisticated data manipulation techniques, like Principal Component Analysis, might give us other profitable ways to categorise our data beyond just ‘artistic’ and ‘non-artistic’.Footnote ⁹¹ And this is to say nothing of further work that can be done with the data that we have already collected, like that on word division and word accent in clausulae, which would necessarily be crucial in studying the rhythms of late antique texts as the cursus begins to develop.

Of course, none of the broad brush pictures painted by statistical analysis can give insights at the level of an individual clausula in an individual sentence in an individual author's text. Such an analysis of the details of prose rhythm in the context of a speech or a letter is eminently worthwhile and can have great explanatory power.Footnote ⁹² So when Cicero describes the same event twice in almost the same words in Pro Milone, he once writes ‘respondit triduo illum aut summum quadriduo esse periturum’ (Mil. 26), but later ‘audistis … periturum Milonem triduo’ (Mil. 44). It seems likely that he wrote esse periturum in the first case because it was in clausular position (= esse uideatur), whereas in the second the infinitive came in the middle of the phrase and so he preferred simply periturum. Prose rhythm is one of the keys to unlocking the secrets of Latin word order and word choice, revealing points of emphasis and rhetorical artifice, and understanding it at the local level is essential for appreciating an author's verbal artistry. Much of this artistry must have been put into practice subconsciously or unconsciously (see, for example, Quint., Inst. 9.4.119–20), and we remain sceptical of accounts that attempt to quantify the force of any individual clausula, but it is clear that ancient authors and ancient audiences could perceive and appreciate rhythmic prose.Footnote ⁹³ Today, without native speaker Sprachgefühl, we can only recover these effects by philological analysis.

While interpreting prose rhythm at the level of the sentence and clause requires close reading and analysis, at the global level, questions of prose rhythm cry out for an open-source, Big Data approach. We have offered one such approach, producing algorithms to detect and categorise the rhythms of any Latin prose text, providing comprehensive data generated by these algorithms for most of extant classical Latin prose, presenting a new statistical approach to analysing the significance of those data, and giving several examples of how to use our data and procedures to answer particular questions about authors’ propensity toward artistic rhythms. For example, we can confirm that Cicero's letters are significantly less concerned with ‘artistic’ prose rhythm than are his speeches, but we can also show how certain letters, like the lengthy and polished Q. fr. 1.1, take particular care to be artistically rhythmical. We can with a few clicks compare the prose rhythms of the perhaps spurious Inuectiua in Ciceronem or Epistulae ad Caesarem senem with those of the undisputedly genuine Sallust: the former does not look at all Sallustian, but the latter actually does. We can compare the rhythms of speeches and narrative in authors like Sallust and Tacitus: Sallust's rhythms never change, but Tacitus has at least two distinct rhythmic profiles (neither of which, even in the Dialogus, counts as ‘Ciceronian’). We can see almost at a glance that Trajan's replies to Pliny's letters in Book 10 have an entirely different rhythmic fingerprint from Pliny's, while in the Panegyricus Pliny mirrors the rhythmic preferences that he shows in the Epistulae. It may be an exaggeration to claim that technology will revolutionise the study of Latin prose rhythm — the fundamental insights as worked out over a century ago seem to stand correct and confirmed — but it will certainly replough the entire field, offering fresh data and the possibility of countless new results. 
Nothing will ever make the study of Latin prose rhythm easy, but computers will certainly make it a lot easier.

SUPPLEMENTARY MATERIAL

For Supplementary Material for this article please visit https://doi.org/10.1017/S0075435819000881.

Footnotes

We thank Kathleen Coleman, Myles Lavan, Tim Moore, Christopher Whitton and the Journal’s anonymous readers: all have substantially improved this article. We also thank Christopher Kelly for his friendly and efficient editorial work.

¹ Berry 1996a: 50.

² Although born Tadeusz Stefan Zieliński, when publishing in German he went by Theodor Zielinski. For a full biography, see Srebrny 2013, with prose rhythm discussed at 149–51. An accessible introduction to Zielinski's methods and their results is provided by Clark 1905.

³ For a comprehensive critique of Zielinski, see Oberhelman 2003: 90–106.

⁴ Zielinski 1904: 7. See Laurand 1936–38: 2.199–200; Berry 1996a: 49–50 with details on what Zielinski did (and did not) count using this method.

⁵ He also arbitrarily allowed molossi, choriambs and epitrites to be substituted for a cretic. See Shewring 1930: 165; cf. Aili 1979: 67–8. Cicero himself points to the double trochee as the fundamental unit (Orat. 212–15); see further Winterbottom 2011.

⁶ See, for example, De Groot 1921: 18–20; Shewring 1930: 165; Berry 1996a: 48.

⁷ See, for example, Shewring 1930: 165; Berry 1996b: 52 n. 253.

⁸ So Winterbottom 2011: 265 n. 17 on Zielinski's results: ‘still the only source for complete figures on Cicero's speeches’. Zielinski was not in fact the first to study prose rhythm (see Novotný 1929: 2–16), but his results were so novel and comprehensive that, for all intents and purposes, they sprang fully formed from his head and revolutionised the field.

⁹ So, for example, Bornecque 1907; De Groot 1921; 1926; Broadhead 1922; Primmer 1968; Aili 1979; Aumont 1996.

¹⁰ An earlier digital tool developed especially for the analysis of cursus rhythms is described in Spinacce Reference Spinacce, Tomasi, Rosselli del Turco and Tammaro2014 (and is online at: http://cursusinclausula.uniud.it/public/). Because this tool requires users to intervene manually in cases of potentially ambiguous prosody, and because so many clausulae contain instances of such prosody (for example, puellă vs puellā), this tool is of only the most limited application. (In the exordium of Pro Milone, which consists of about 575 words, twelve user interventions were required, and several other errors were generated. Each user intervention must be hand encoded into the text being scanned.) Furthermore, it is hard to get the programme to work, and the results it produces are not presented in a useful format. A truly remarkable early pioneer in applying computers to the study of Greek prose rhythm was McCabe Reference McCabe1981; the understated description of what he managed to accomplish with the technology of forty years ago (at 82–118) is awe-inspiring.

¹¹ Most notably Cicero's: see, for example, De or. 3 and Orat. 168–238. Further testimonia are collected in Bornecque Reference Bornecque1907: 5–166; Clark Reference Clark1909; recent discussion in Oberhelman Reference Oberhelman2003: 27–67. Zielinski Reference Zielinski1904: 4, among many others, reasonably concluded that Cicero did not know his own practice. A few scholars have tried to show that Cicero's theory and practice do align, most notably Laurand Reference Laurand1936–38: 2.159 and passim; Schmid Reference Schmid1959 (followed by, for example, Koster Reference Koster2011), but this requires exceptional creativity. Quintilian likewise seems hopeless; in the Institutio oratoria, ‘there is hardly a single type of ending to a Latin sentence that is not recommended’ (Winterbottom Reference Winterbottom, Obbink and Rutherford2011: 263). On these issues, see sensibly Hutchinson Reference Hutchinson2018: 5–10, 16–19.

¹² Latin accent: Allen Reference Allen1978: 83–4; ablative case: Taylor Reference Taylor1991.

¹³ For a summary of research from the Renaissance through the early twentieth century, see Novotný Reference Novotný1929: 2–33. Useful sketches of work from the nineteenth century onwards include Wilkinson Reference Wilkinson1963: 237–40; Aili Reference Aili1979: 8–15; Aumont Reference Aumont1996: 11–58. A comprehensive survey and evaluation of all major modern studies of prose rhythm is provided by Oberhelman Reference Oberhelman2003: 69–184.

¹⁴ The system that we describe here is very similar to (but not exactly identical with) the schemata of Nisbet 1990; Hutchinson 2018: 11–12.

¹⁵ For Hegesias and his system, see concisely Hutchinson 2018: 5–10, 16–19; full testimonia in FGrH 142, RE 7.2, cols 2607–8. Hutchinson 2013: 233–5 argues that Cicero introduced these rhythms to Latin; this is at least a plausible suggestion, but given the fragmentary evidence for Latin prose before Cicero, certainty is impossible. The seemingly Ciceronian rhythmic propensities of the non-Ciceronian Rhet. Her. may raise some doubts (on which see Hutchinson 2013: 235).

¹⁶ So explicitly Hutchinson 2015: 789.

¹⁷ Epitrite substitution: Zielinski 1904: 85–92; Berry 1996b: 51.

¹⁸ The pattern –⏑⏑⏑–× is counted as a resolved cretic trochee, not a resolved double trochee.

¹⁹ The patterns –⏑–⏑⏑⏑× and – – – ⏑⏑⏑× are counted as resolved double cretics/molossus cretics, not resolved double trochees.

²⁰ Hypodochmiac clausulae are rare, occurring less frequently than double spondees even in authors with a predilection for ‘artistic’ rhythms. But Hutchinson 1995: 485–6, looking at the alternation of atque/ac before consonants in Cicero, is a simple and persuasive piece of evidence in favour of treating them as artistic. If they were treated as non-artistic, however, very little would change in the following discussion.

²¹ We do track certain forms of ‘everything else’ individually, for example, first paeons (–⏑⏑×) that do not constitute parts of a once-resolved cretic trochee (–⏑–⏑⏑×), or choriamb-trochees (–⏑⏑– –×), but their numbers are generally so small that it makes most sense to lump them all together in a miscellaneous group.

²² In cases of ambiguity, Zielinski 1904 took the most rigorous line, attempting to determine the appropriate category for a multiply resolved clausula by considerations of word division, accent and supposed ictus. Even if he managed to be consistent in his choices (unverifiable), the problems with this approach are so considerable as to render it of little practical value.

²³ http://latin.packhum.org/.

²⁴ https://github.com/Alatius/latin-macronizer; see Winge 2015. The challenges involved are considerable: puellă (nom.) vs puellā (abl.), incīdo (‘I cut into’) vs incĭdo (‘I fall upon’), omnĭs (nom. and gen. sg.) vs omnīs (acc. pl.), etc. These problems present a major obstacle for automating scansion, and Winge has done groundbreaking work. His approach uses the RFTagger (http://www.cis.uni-muenchen.de/~schmid/tools/RFTagger/) for part-of-speech tagging trained on the Perseus Latin Dependency Treebank (https://github.com/PerseusDL/treebank_data) and PROIEL (https://github.com/proiel). Larger data-sets of training data and other machine-learning approaches are possible and may increase accuracy still further.

²⁵ Johnson et al. 2014–.

²⁶ See, for example, Nisbet 1990 and the earlier investigations of Fraenkel 1968 (building on his own previous work); Habinek 1985. Restricting ourselves to clausulae before punctuation here ensures consistency in our results. Note too that the placement of commas differs widely in different critical texts: compare, for example, the practice of German and English editors.

²⁷ So, for example, Aili 1979, among others; cf. Berry 1996a: 64, who does include colons and semicolons.

²⁸ In our testing, as you might expect, considering clausulae only before full-stops, exclamation marks and question marks increases the proportion of artistic rhythms, whereas including commas decreases it.

²⁹ On this feature of archaic prosody see Allen 1978: 36–7, and the references collected in Butterfield 2008: 188 n. 4. We have disregarded the possibility of weakening or loss of final s.

³⁰ See Allen 1978: 78–82, 89–90; on s impura, Cser 2012 with somewhat different conclusions. Different scholars have treated these cases differently. Aili 1979: 48–9 excludes all such potentially ambiguous prosody from his corpus (although he is content to include instances of aphaeresis like factumst), attempting to limit his investigation to cases of certainty. The number of clausulae that he is forced to exclude, however, is enormous, amounting to nearly half of the total in Cicero. Most other scholars have tended to treat these ambiguities on a case-by-case basis, deciding in cases of uncertainty based on an idea of which potential clausula would be ‘better’; the criteria tend to the subjective: so Zielinski 1904: 174–5 had proclaimed that syllables before s impura are lengthened ‘without exception’, but Nisbet replied that ‘when I read “ipse sceleratus” before a pause (Pis. 28), I hear esse videatur’ (Nisbet 1990: 359). Ancient precepts on these questions are often frustratingly vague, for example, Quint., Inst. 9.4.36: ‘nonnumquam hiulca etiam decent faciuntque ampliora quaedam.’

³¹ Our tests also show that eliding maximises artistically rhythmic patterns: if you exclude all elisions, you vastly increase the number of clausulae in our non-artistic ‘miscellaneous’ category at the expense of artistic clausulae. Similarly, if you allow mute + liquid to lengthen the preceding syllable, you simply increase the number of long syllables, thus favouring more spondaic cadences.

³² As did Zielinski 1904. Prose ictus, a supposed accent on the first long syllable of each metrical foot in the clausula, continues to play a part in numerous studies of Latin prose rhythm (for example, Aumont 1996: 211–17), despite the lack of any ancient evidence for such a thing. It seems very likely that any apparent tendencies toward coincidence or clash of ‘ictus’ and word accent in prose are epiphenomenal; see especially Oberhelman 2003: 106–10. A similar case has been made (less persuasively) about ictus in Latin poetry: Stroh 1990; Zeleny 2008; Fortson 2011.

³³ It is important to note that each of these tools is modular and can be reused for other purposes; furthermore, it is easy to make changes to one module without affecting the rest of the system and then rerun tests and reports.

³⁴ You can imagine an algorithm that decides prosody (and elision and so forth) in ambiguous cases so as always to produce the ‘more preferred’ clausula. Such a system, however, almost immediately becomes circular and self-reinforcing. This is, writ very large, the methodological difficulty that more subjective scholars face in their categorisation of doubtful clausulae.

³⁵ In fact the macroniser tends toward the upper-end of the accuracy range on syllables in clausular position because of the types of words that tend to be found there.

³⁶ Determining algorithmically when i and u are consonantal is surprisingly hard: consider, for example, ui, iui, ii, II (Roman numeral), ius, Seruius, uua, fluuius, mortuus, quid (this last betwixt and between, a digraph). These problems are bound up with syllabification, which also comes with its own challenges: disyllabic lin-gua vs trisyllabic ar-gu-o, or sua-de-o vs su-a and su-ap-te. And sometimes Latin orthography simply does not represent pronunciation: abicio = abjicio, cuius = cujjus, etc. This is to say nothing of truly edge cases, where algorithmic perfection is almost impossible: for example, thy-i-o (Prop. 3.7.49, if right), Thyi-as (Verg., Aen. 4.302)!

³⁷ For mistakes that can be detected, see, for example, Oberhelman 2003: 92 n. 36: ‘Zielinski's percentages … are typically at variance with my calculation of the data (from 1 to 2 percent)’; earlier Axer 1980: 21 n. 1. We have observed similar errors in the Plinian rhythms tabulated by Hofacker 1903.

³⁸ The limit case of ‘perfect’ data that cannot be generalised is shown in Koster 2011, where Pro Roscio Amerino is laid out in scanned cola according to the ideas of Schmid 1959, but without statistics or further commentary; cf. too, this time with abundant statistics, the massive study of Sträterhoff 1995, over 900 pages devoted to just De imperio Cn. Pompei and Livy 1.1.1–26.8. A somewhat different approach is illustrated by Vretska and Vretska 1979, where colometry and rhythmic analysis of the Pro Archia is a fully integrated part — but only one part — of a broader commentary.

³⁹ We have also excluded Augustus’ Res Gestae, which is a forest of brackets and editorial reconstructions. For some comments on its prose rhythm, see Zwierlein 2002: 43–5.

⁴⁰ Note that in Petronius we have not separated narration from dialogue; Müller 1983: 449 claims that the former is rhythmic but the latter is not, and at 449–70 analyses Petronian rhythm in detail. Similar questions may be asked of speeches compared to narrative in historiography, on which see our comments on Sallust and Tacitus below.

⁴¹ cf. Quint., Inst. 9.4.61: ‘neque enim loqui possum nisi e syllabis breuibus ac longis, ex quibus pedes fiunt.’

⁴² So De Groot 1926: 20–1, whose examples we have borrowed here. On Cicero's preference for the esse uideatur type, see, for example, Quint., Inst. 9.4.73, 10.2.18, Tac., Dial. 23.1; for Zielinski's weakness on this score, see, for example, Bornecque 1907: 212–14; Shewring 1930: 165; Oberhelman 2003: 98–9; cf. Aumont 1996: 13–14.

⁴³ De Groot 1921; detailed criticism in Aili 1979: 21–5. See also Wilkinson 1963: 140–1.

⁴⁴ Summarised in Novotný 1929: 25–7; in more detail Novotný 1926.

⁴⁵ Sall., Iug., Tac., Ann. 1, Brutus’ letters to Cicero, Trajan's letters to Pliny, Fronto's letters to Marcus Aurelius: Bornecque 1907: 216; critique in, for example, Oberhelman 2003: 115–16. Aumont 1996 still uses Bornecque's ‘non-metrical’ data (esp. at 67).

⁴⁶ For critique of these and other methods, see Aili 1979: 21–32; Orlandi 2005: 396–401.

⁴⁷ Janson 1975, esp. 10–34 (applying the method to the medieval cursus); Aili 1979, esp. 32–9.

⁴⁸ So, labelling the first long of a cretic trochee position 5, the following short position 4, and so on: expected frequency of –⏑– –× = observed percentage of long in position 5 multiplied by observed percentage of short in position 4 multiplied by observed percentage of long in position 3 multiplied by observed percentage of long in position 2 multiplied by 1 (since the last syllable is indifferent). An example is provided by Aili 1979: 36, another by Oberhelman 2003: 177–9.

⁴⁹ As observed by Gotoff 1981: 337; cf. Janson 1975: 26–8. Detailed further criticism in Aumont 1996: 47–57.

⁵⁰ The test is used by, for example, Janson 1975; Aili 1979; McCabe 1981; Aumont 1996; Hutchinson 2015; 2018.

⁵¹ See, for example, https://onlinecourses.science.psu.edu/stat500/node/56/; a very useful online calculator is Preacher 2001. Hutchinson 2018: 20 tries to explain the test for classicists; similarly Hutchinson 2015: 792, and earlier Aili 1979: 37–9; McCabe 1981: 176–83; Aumont 1996: 69–72.

⁵² Note that the chi-square test statistic is also correlated with sample size: the larger the samples, the more statistically significant will be the variation between them. (More random variation is possible in a smaller sample: if paragraph A has two artistic and one non-artistic clausulae, while paragraph B has one artistic and two non-artistic clausulae, the variation may be due to chance. If, on the other hand, text A has 2,000 artistic clausulae and 1,000 non-artistic, whereas text B has 1,000 artistic compared to 2,000 non-artistic, chance is a much less likely explanation for the observed variance.) The chi-square test has certain minimum requirements on sample size, which are met in this paper.
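
The effect of sample size can be checked with a short Python sketch. It is an illustration only: the helper name is hypothetical, it computes the plain Pearson statistic for a 2×2 table without any continuity correction, and it obtains the p-value for one degree of freedom from the complementary error function. The three-clausula paragraphs serve purely to show the arithmetic, since counts that small fall below the minimum sample size the test requires.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value (1 d.f.) for a 2x2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = rows[i] * cols[j] / n   # count expected if rows and columns are independent
            stat += (observed - expected) ** 2 / expected
    # For one degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))


# Rows are texts A and B; columns are artistic and non-artistic clausulae.
small_stat, small_p = chi_square_2x2([[2, 1], [1, 2]])
large_stat, large_p = chi_square_2x2([[2000, 1000], [1000, 2000]])
print(f"paragraphs: chi2 = {small_stat:.3f}, p = {small_p:.5f}")
print(f"texts:      chi2 = {large_stat:.3f}, p = {large_p:.5f}")
```

The proportions are the same in both comparisons, yet the statistic rises from about 0.67 for the paragraphs to about 666.67 for the full texts, and only the latter yields a p-value below any conventional threshold.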

⁵³ The statistically savvy may wonder about a philosophical question: the chi-square test is usually used to compare two random samples in order to infer whether the populations from which they were drawn are different. Here, however, we might be thought not to have a sample but rather the entire population (all clausulae), thus obviating the need for such a test. In a real sense, however, we do not have the whole population: most of classical literature has perished. Since much of Tacitus’ Annales and Historiae has been lost, for example, what we have is a sample of all of Tacitus. Is our sample random? Admittedly not in the way a statistician would prefer, but it is random in the sense that the works that have been preserved were not preserved because of their rhythmic properties (although those properties could sometimes be correlated with other reasons that they were preserved, like ‘literary quality’). We thank one of the anonymous JRS readers for insightful comments on this issue, which we hope to explore further elsewhere.

⁵⁴ In this article, we will give p-values to five decimal places, hence here p ≈ .00000. With more decimal places, here p = .00000005.

⁵⁵ See the sensible preliminary cautions of Aumont 1996: 9.

⁵⁶ And this is still to say nothing of tendencies within the works: does Caesar, for example, pay more attention to ‘artistic’ prose rhythm in speeches? The question has not been sufficiently investigated; see, for example, Gaertner and Hausburg 2013: 71 n. 207; Börner 2016. We will discuss Sallust's and Tacitus’ rhythmic tendencies in speeches vs narrative below.

⁵⁷ Cicero accounts for the overwhelming majority of audi(ui)sti(s) in classical Latin, and so comparisons with other authors are not especially fruitful.

⁵⁸ See, for example, Zielinski 1904: 163–6; Shipley 1911; Laurand 1911; 1936–38: 2.179–80; Adams 2013.

⁵⁹ Similarly Hutchinson 2018: 19 on using double spondees and their resolved forms (including the heroic clausula) as a gauge for how rhythmic an author is.

⁶⁰ The literature on prose rhythm in Cicero is too vast to cite here; see the references collected in Berry 1996b: 49 n. 247, to which can be added Sträterhoff 1995; Hutchinson 1995; 1998: 9–12; Oberhelman 2003; Koster 2011; Winterbottom 2011.

⁶¹ See Bornecque 1907: 571–4; Aili 1979: 126–7; Oakley forthcoming.

⁶² See especially Axelson 1933: 7–16; 1939: 23–48; earlier Bourgery 1910 and, unhelpfully, Zander 1910–14: 2.65–121.

⁶³ See Müller 1954: 755–82.

⁶⁴ See Havet 1904; Parroni 1984.

⁶⁵ See Hofacker 1903; Bornecque 1907: 323–40; Whitton 2013: 28–32 and his index s.v. ‘rhythm’, and our comments below.

⁶⁶ See, for example, Macé 1900: 379–400; Bornecque 1907: 574–8; Fry 2009: 19–20, and further references in Power 2014: 76 n. 47.

⁶⁷ ‘Apuleius’ in the tables above includes works of disputed authorship (the ‘preface’ to De deo Socratis, De mundo and De Platone). For Apuleian prose rhythm generally, see Bernhard 1927; in Metamorphoses, Hijmans 1978; Nisbet 2001. In the philosophical works, where the accentual cursus mixes with quantitative rhythm, see Axelson 1987; Redfors 1960: 75–113; Stover 2016: 42–4.

⁶⁸ Not only is [Quintilian] not Quintilian, it is not even just one author. For a minutely detailed study of prose rhythm in the Declamationes maiores, see Håkanson 2014.

⁶⁹ Understandably so: if someone else has already spent a long time counting something, there would seem to be little earthly reward for taking a similarly long time to check the work and pronounce it sound.

⁷⁰ Aili 1979: 126–7.

⁷¹ See especially Aili 1979: 69–130.

⁷² It also raises the question of whether their prose rhythm is in some sense ‘epic’: ‘historia … proxima poetis’ (Quint., Inst. 10.1.31)?

⁷³ See especially Hutchinson 1998: 9–12; earlier, for example, Bornecque 1907: 565–70.

⁷⁴ Von Albrecht 2003: 23 n. 72. For full details of Cicero's prose rhythm practices in this speech, see Axer 1980: 21–4.

⁷⁵ For prose rhythm in the Pro Archia, see Vretska and Vretska 1979.

⁷⁶ The only speech in the above results whose authorship has been seriously questioned is De domo sua. While we find that the speech is ‘artistic’ to a statistically significant degree (contra Zielinski 1904: 218–19; Nisbet 1939: xxxii–xxxiii), these would be very weak grounds to reject Ciceronian authorship in any event, and would not go along with supposed stylistic defects in other aspects of the speech. For a recent explanation of some of the apparent oddity of this speech, see Kenty 2018.

⁷⁷ Note that the chi-squared test statistic, while yielding a statistically significant p-value, is still relatively small here because of the small sample size of the Invective; see n. 52 above.

⁷⁸ Few scholars believe that Sallust wrote the Invective; see Novokhatko 2009: 111–29; Santangelo 2012: 29–32.

⁷⁹ Similarly few scholars believe that Sallust wrote the Epistulae, but see Posadas 2016, who does, with further bibliography on the question in his n. 2. For the other side, see Mastrorosa 2017, with comprehensive bibliography on both sides of the debate in her nn. 2–3. The extent to which later imitators perceived and replicated the prose rhythm of their models, and whether (or how) such sensitivity changed over the centuries, merits further investigation. As we will see below, Tacitus, for one, is not concerned to be especially Ciceronian in his ‘Ciceronian’ Dialogus. The pseudo-Ciceronian In Sallustium (of uncertain date) would be Cicero's least artistically rhythmic speech, the Pro Roscio comoedo excepted; the Epistula ad Octauianum (also of uncertain date), on the other hand, one of his most artistically rhythmic letters.

⁸⁰ This question has been explored to some degree for Sallust, Livy and Tacitus (see Ullmann 1925; Aumont 1996: 383–7, including also Caesar, for whom see further n. 56 above), but most extensively for Tacitus. A summary of scholarship on Tacitean prose rhythm is provided by Hellegouarc'h 1991: 2437–45. Discussions of narrative vs speech in Tacitus include Ullmann 1925; 1931; Salvatore 1950: 143–68; Andreoni 1968; Dangel 1991: 2496–504. None of these treatments has been able to perform a consistent comparison on all the clausulae in question, leading to unreliable conclusions.

⁸¹ Editors typically denote the beginning of direct speech with ‘ and its end with ’, but they differ in how they treat a single speech that continues over multiple paragraphs (for example, repeat the ‘ at the beginning of each paragraph or not?), and ’ is also sometimes used for other purposes (for example, M.’ = Manius). This is one of many cases where a corpus marked up with metadata would prove useful; see our remarks in conclusion.

⁸² Our reasons for not considering indirect discourse separately from narrative and direct speech are strictly pragmatic: it is much harder to find and segregate such instances of reported speech. Their rhythms thus remain an open question, but note that if they agree with the rhythms of direct speech, then all the rhythmic differences between speech and narrative found here will be magnified.

⁸³ We have listed the section or section range where the speech is found, but we have only included in our corpus the portion of that section which contains direct speech. Our corpora are similar but not identical to those of Ullmann 1925: 67, 72; 1931: 72; Andreoni 1968: 304–5.

⁸⁴ Full data for both Sallust's and Tacitus’ speech and narrative prose rhythms are available in the Supplementary Material online.

⁸⁵ Starting with Norden 1918 [1898]: 2.942: ‘Dagegen [sc. in contrast to Pliny the Younger] ignoriert Tacitus … den Rhythmus der Klausel durchaus.’ Further references in Aili 1979: 128–9; Hellegouarc'h 1991: 2445; Dangel 1991: 2496.

⁸⁶ For Cicero and Ciceronianisms in the Dialogus, see van den Berg 2014: esp. 208–40; Keeline 2018: 223–76, neither considering prose rhythm.

⁸⁷ Hutchinson 2018: 9.

⁸⁸ The figures in Whitton 2013: 29, reporting 29 per cent cretic trochees, do not seem to include resolutions. Pliny's only predecessor to show such a love for cretic-trochaic rhythms is Quintus Curtius.

⁸⁹ In a comparison of all five rhythmic categories, only Book 3 stands out slightly, where Pliny has a particular preference for double trochees and lower than usual affection for cretic trochees and double cretics. This difference disappears, however, in a pooled comparison of artistic vs non-artistic categories. Interestingly, in the latter comparison it is Book 9 that looks slightly unusual, because it is overall a bit less artistically rhythmic, and yet when comparing all five categories it looks normal.

⁹⁰ To us this hypothesis seems unlikely (see especially Coleman 2012), but it is currently in vogue: see, for example, Gibson and Morello 2012: 259–64; Woolf 2015, with further references.

⁹¹ In this paper we consciously chose to group all possible clausulae into seven patterns; we then sub-divided those seven into ‘artistic’ and ‘non-artistic’. Principal Component Analysis, by contrast, is a data reduction technique that would ignore ancient and modern prose rhythm classifications and instead seek algorithmically to group together clausular patterns into the ‘principal components’ (whatever those may be) that best account for the observed variance between samples: see Jolliffe 2002.

⁹² See, for example, Vretska and Vretska 1979; Hutchinson 1995; 1998: 9–12; 2015; 2018; Riggsby 2010.

⁹³ For an attempt at measuring the Schlußwert of individual clausulae, see Primmer 1968 with the critique of Aili 1979: 25–32. On ancient audiences’ perception of prose rhythm, from the Greek side, see Vatri forthcoming.

References

BIBLIOGRAPHY

Adams, E. D. 2013: Esse videtur: Occurrences of the Heroic Clausulae in Cicero's Orations, MA thesis, University of Kansas.

Aili, H. 1979: The Prose Rhythm of Sallust and Livy, Stockholm.

Allen, W. S. 1978: Vox Latina: A Guide to the Pronunciation of Classical Latin (2nd edn), Cambridge.

Andreoni, E. 1968: ‘Le clausole nei discorsi dell’Agricola, delle Historiae e degli Annales’, Rivista di cultura classica e medioevale 10, 299–320.

Aumont, J. 1996: Métrique et stylistique des clausules dans la prose latine. De Cicéron à Pline le Jeune et de César à Florus, Paris.

Axelson, B. 1933: Senecastudien. Kritische Bemerkungen zu Senecas Naturales quaestiones, Lund.

Axelson, B. 1939: Neue Senecastudien. Textkritische Beiträge zu Senecas Epistulae morales, Lund.

Axelson, B. 1987: ‘Akzentuierender Klauselrhythmus bei Apuleius. Bemerkungen zu den Schriften De Platone und De mundo’, in Axelson, B. (Önnerfors, A. and Schaar, C. (eds)), Kleine Schriften zur lateinischen Philologie, Stockholm, 233–45. (Reprint of Vetenskapssocieteten i Lund, Årsbok 1952, 3–20.)

Axer, J. 1980: The Style and Composition of Cicero's Speech Pro Q. Roscio comoedo: Origin and Function, Warsaw.

Bernhard, M. 1927: Der Stil des Apuleius von Madaura. Ein Beitrag zur Stilistik des Spätlateins, Stuttgart.

Berry, D. H. 1996a: ‘The value of prose rhythm in questions of authenticity: the case of De optimo genere oratorum attributed to Cicero’, Papers of the Leeds International Latin Seminar 9, 47–74.

Berry, D. H. 1996b: Cicero: Pro P. Sulla oratio, Cambridge.

Bornecque, H. 1907: Les clausules métriques latines, Lille.

Börner, K. 2016: ‘Klauselrhythmus in den direkten Reden des Corpus Caesarianum’, Acta Antiqua Academiae Scientiarum Hungaricae 56, 81–92.

Bourgery, A. 1910: ‘Sur la prose métrique de Sénèque le philosophe’, Revue de philologie 34, 167–72.

Broadhead, H. D. 1922: Latin Prose Rhythm: A New Method of Investigation, Cambridge.

Butterfield, D. J. 2008: ‘Sigmatic ecthlipsis in Lucretius’, Hermes 136, 188–205.

Clark, A. C. 1905: ‘Zielinski's Clauselgesetz’, Classical Review 19, 164–72 (review of Zielinski 1904).

Clark, A. C. 1909: Fontes prosae numerosae, Oxford.

Coleman, K. M. 2012: ‘Bureaucratic language in the correspondence between Pliny and Trajan’, Transactions of the American Philological Association 142, 189–238.

Cser, A. 2012: ‘Resyllabification and metre: the issue of s impurum revisited’, Acta Antiqua Academiae Scientiarum Hungaricae 52, 363–73.

Dangel, J. 1991: ‘Les structures de la phrase oratoire chez Tacite. Étude syntaxique, rhythmique et métrique’, Aufstieg und Niedergang der römischen Welt 2.33.4, 2454–538.

De Groot, A. W. 1921: Der antike Prosarhythmus. Zugleich Fortsetzung des Handbook of Antique Prose-Rhythm, Groningen.

De Groot, A. W. 1926: La prose métrique des anciens, Paris.

Fortson, B. W. 2011: ‘Latin prosody and metrics’, in Clackson, J. (ed.), A Companion to the Latin Language, Malden, MA, 92–104.

Fraenkel, E. 1968: Leseproben aus Reden Ciceros und Catos, Rome.

Fry, C. 2009: ‘De Tranquilli elocutione. Suétone en utilisateur de sa langue’, in Poignault, R. (ed.), Présence de Suétone. Actes du colloque tenu à Clermont-Ferrand, 25–27 novembre 2004, Tours, 15–29.

Gaertner, J. F. and Hausburg, B. 2013: Caesar and the Bellum Alexandrinum: An Analysis of Style, Narrative Technique, and the Reception of Greek Historiography, Göttingen.

Gibson, R. K. and Morello, R. 2012: Reading the Letters of Pliny the Younger: An Introduction, Cambridge and New York.

Gotoff, H. C. 1981: ‘The prose rhythm of Sallust and Livy’, Classical Philology 76, 335–40 (review of Aili 1979).

Habinek, T. N. 1985: The Colometry of Latin Prose, Berkeley.

Håkanson, L. 2014: ‘Der Satzrhythmus der 19 Größeren Deklamationen und des Calpurnius Flaccus’, in Håkanson, L. (Santorelli, B. (ed.)), Unveröffentlichte Schriften, Vol. 1. Studien zu den pseudoquintilianischen Declamationes maiores, Berlin, 47–130.

Havet, L. 1904: ‘La prose de Pomponius Méla’, Revue de philologie 28, 57–9.

Hellegouarc'h, J. 1991: ‘Le style de Tacite. Bilan et perspectives’, Aufstieg und Niedergang der römischen Welt 2.33.4, 2385–453.

Hijmans, B. L. 1978: ‘Asinus numerosus’, in Hijmans, B. L. and van der Paardt, R. T. (eds), Aspects of Apuleius’ Golden Ass: A Collection of Original Papers, Groningen, 189–209.

Hofacker, C. 1903: De clausulis C. Caecili Plini Secundi, Bonn.

Hutchinson, G. O. 1995: ‘Rhythm, style, and meaning in Cicero's prose’, Classical Quarterly 45, 485–99.

Hutchinson, G. O. 1998: Cicero's Correspondence: A Literary Study, Oxford.

Hutchinson, G. O. 2013: Greek to Latin: Frameworks and Contexts for Intertextuality, Oxford.

Hutchinson, G. O. 2015: ‘Appian the artist: rhythmic prose and its literary implications’, Classical Quarterly 65, 788–806.

Hutchinson, G. O. 2018: Plutarch's Rhythmic Prose, Oxford.

Janson, T. 1975: Prose Rhythm in Medieval Latin from the 9th to the 13th Century, Stockholm.

Johnson, K. et al. 2014–: CLTK: The Classical Language Toolkit, https://github.com/cltk/cltk, DOI 10.5281/zenodo.593336.

Jolliffe, I. T. 2002: Principal Component Analysis (2nd edn), New York.Google Scholar

Keeline, T. J. 2018: The Reception of Cicero in the Early Roman Empire: The Rhetorical Schoolroom and the Creation of a Cultural Legend, Cambridge.Google Scholar

Kenty, J. 2018: ‘The political context of Cicero's De domo sua’, Ciceroniana online 2, 245–64.Google Scholar

Koster, S. 2011: Ciceros Rosciana Amerina. Im Prosarhythmus rekonstruiert, Stuttgart.Google Scholar

Laurand, L. 1911: ‘Les fins d'hexamètre dans les discours de Cicéron’, Revue de philologie 35, 75–88.Google Scholar

Laurand, L. 1936–38: Études sur le style des discours de Cicéron. Avec une esquisse de l'histoire du ‘cursus’ (4th edn, 3 vols), Paris.Google Scholar

Macé, A. 1900: Essai sur Suétone, Paris.Google Scholar

Mastrorosa, I. G. 2017: ‘Les épîtres à César du Pseudo-Salluste. Des conseils pour gouverner dans l'antiquité tardive?’ in Gavoille, É. and Guillaumont, F. (eds), Conseiller, diriger par lettre, Tours, 155–72.Google Scholar

McCabe, D. F. 1981: The Prose-Rhythm of Demosthenes, New York.Google Scholar

Müller, K. 1954: Geschichte Alexanders des Grossen. Lateinisch und deutsch, Munich.Google Scholar

Müller, K. 1983: Petronius Satyrica: Schelmenszenen. Lateinisch – deutsch, Munich.Google Scholar

Nisbet, R. G. 1939: M. Tulli Ciceronis De domo sua ad pontifices oratio, Oxford.Google Scholar

Nisbet, R. G. M. 1990: ‘Cola and clausulae in Cicero's speeches’, in Craik, E. M. (ed.), Owls to Athens: Essays on Classical Subjects Presented to Sir Kenneth Dover, Oxford, 349–59. (Reprinted in R. G. M. Nisbet (S. J. Harrison (ed.)), Collected Papers on Latin Literature, Oxford, 1995, 312–24.)Google Scholar

Nisbet, R. G. M. 2001: ‘Cola and clausulae in Apuleius’ Metamorphoses 1.1’, in Kahane, A. and Laird, A. (eds), A Companion to the Prologue of Apuleius’ Metamorphoses, Oxford, 16–26.Google Scholar

Norden, E. 1918: Die antike Kunstprosa vom VI. Jahrhundert v. Chr. bis in die Zeit der Renaissance (3rd edn, 2 vols), Leipzig.Google Scholar

Novokhatko, A. A. 2009: The Invectives of Sallust and Cicero, Berlin.Google Scholar

Novotný, F. 1926: ‘Le problème des clausules dans la prose latine’, Revue des études latines 4, 221–9.

Novotný, F. 1929: État actuel des études sur le rhythme de la prose latine, Lwów.

Oakley, S. P. forthcoming: ‘Point and periodicity: the style of Velleius Paterculus and other Latin historians writing in the early Principate’.

Oberhelman, S. M. 2003: Prose Rhythm in Latin Literature of the Roman Empire: First Century B.C. to Fourth Century A.D., Lewiston, NY.

Orlandi, G. 2005: ‘Metrical and rhythmical clausulae in medieval Latin prose: some aspects and problems’, in Reinhardt, T., Lapidge, M. and Adams, J. N. (eds), Aspects of the Language of Latin Prose, Oxford, 395–412.

Parroni, P. 1984: Pomponii Melae De chorographia libri tres, Rome.

Posadas, J. L. 2016: ‘Los consejos de Salustio a César antes de la guerra civil’, Florentia Iliberritana 27, 195–205.

Power, T. 2014: ‘The endings of Suetonius’ Caesars’, in Power, T. and Gibson, R. K. (eds), Suetonius the Biographer: Studies in Roman Lives, Oxford, 58–77.

Preacher, K. J. 2001: ‘Calculation for the chi-square test: an interactive calculation tool for chi-square tests of goodness of fit and independence’, http://www.quantpsy.org/chisq/chisq.htm.

Primmer, A. 1968: Cicero numerosus. Studien zum antiken Prosarhythmus, Vienna.

Redfors, J. 1960: Echtheitskritische Untersuchung der apuleiuschen Schriften De Platone und De mundo, Lund.

Riggsby, A. 2010: ‘Form as global strategy in Cicero's Second Catilinarian’, in Berry, D. H. and Erskine, A. (eds), Form and Function in Roman Oratory, Cambridge, 92–104.

Salvatore, A. 1950: Stile e ritmo in Tacito, Naples.

Santangelo, F. 2012: ‘Authoritative forgeries: late republican history re-told in Pseudo-Sallust’, Histos 6, 27–51.

Schmid, W. 1959: Über die klassische Theorie und Praxis des antiken Prosarhythmus, Wiesbaden.

Shewring, W. H. 1930: ‘Prose-rhythm and the comparative method’, Classical Quarterly 24, 164–73.

Shipley, F. W. 1911: ‘The heroic clausula in Cicero and Quintilian’, Classical Philology 6, 410–18.

Spinacce, L. 2014: ‘“Cursus in clausula”, an online analysis tool of Latin prose’, in Tomasi, F., Rosselli del Turco, R. and Tammaro, A. M. (eds), Proceedings of the Third AIUCD Annual Conference on Humanities and their Methods in the Digital Ecosystem, https://dl.acm.org/citation.cfm?id=2802635, New York.

Srebrny, S. 2013: ‘Tadeusz Zieliński (1859–1944)’, Eos C (fasciculus extra ordinem editus electronicus), 118–63 (http://eos.uni.wroc.pl/wp-content/uploads/2016/07/118_EOS-2013-CD.pdf).

Stover, J. A. 2016: A New Work by Apuleius: The Lost Third Book of the De Platone, Oxford.

Sträterhoff, B. 1995: Kolometrie und Prosarhythmus bei Cicero und Livius. De imperio Cn. Pompei und Livius I, 1–26, 8 kolometrisch ediert, kommentiert und statistisch analysiert (2 vols), Oelde.

Stroh, W. 1990: ‘Arsis und Thesis, oder: Wie hat man lateinische Verse gesprochen?’ in von Albrecht, M. and Schubert, W. (eds), Musik und Dichtung. Neue Forschungsberichte, Viktor Pöschl zum 80. Geburtstag gewidmet, Frankfurt, 87–116. (Reprinted in W. Stroh (J. Leonhardt and G. Ott (eds)), Apocrypha. Entlegene Schriften, Stuttgart, 2000, 193–216.)

Taylor, D. J. 1991: ‘Latin declensions and conjugations: from Varro to Priscian’, Histoire Épistémologie Langage 13, 85–109.

Ullmann, R. 1925: ‘Les clausules dans les discours de Salluste, Tite Live et Tacite’, Symbolae Osloenses 3, 65–75.

Ullmann, R. 1931: ‘Les clausules dans les discours de Tacite’, in Holst, H. and Mørland, H. (eds), Serta Rudbergiana, Oslo, 72–9.

van den Berg, C. S. 2014: The World of Tacitus’ Dialogus de Oratoribus: Aesthetics and Empire in Ancient Rome, Cambridge.

Vatri, A. forthcoming: ‘The nature and perception of Attic prose rhythm’, Classical Philology.

von Albrecht, M. 2003: Cicero's Style: A Synopsis, Leiden.

Vretska, H. and Vretska, K. 1979: Marcus Tullius Cicero Pro Archia Poeta. Ein Zeugnis für den Kampf des Geistes um seine Anerkennung, Darmstadt.

Whitton, C. 2013: Pliny the Younger: Epistles Book II, Cambridge.

Wilkinson, L. P. 1963: Golden Latin Artistry, Cambridge.

Winge, J. 2015: Automatic Annotation of Latin Vowel Length, BA thesis, Uppsala University.

Winterbottom, M. 2011: ‘On ancient prose rhythm: the story of the Dichoreus’, in Obbink, D. and Rutherford, R. (eds), Culture in Pieces: Essays on Ancient Texts in Honour of Peter Parsons, Oxford, 262–76.

Woolf, G. 2015: ‘Pliny/Trajan and the poetics of empire’, Classical Philology 110, 132–51.

Zander, C. 1910–14: Eurhythmia vel compositio rhythmica prosae antiquae (3 vols), Leipzig.

Zeleny, K. 2008: Itali modi. Akzentrhythmen in der lateinischen Dichtung der augusteischen Zeit, Vienna.

Zielinski, T. 1904: Das Clauselgesetz in Ciceros Reden. Grundzüge einer oratorischen Rhythmik, Leipzig.