1. Introduction
This journal recently introduced a new series, International Law and its Methodology, inviting scholars and practitioners to ‘lean in’.Footnote 1 The series is motivated by an ongoing disciplinary struggle between the fields of international law and international relations. These disciplines have become increasingly intertwined in ways which have arguably provoked a crisis in the field of international law and prompted calls to rearticulate its methodology.
Thus far, the debate has not been especially constructive or rewarding. By accelerating into an argument about inter-disciplinarity and its methodological implications, it has presented international legal scholars with two equally unappealing options. The first is to defend the autonomy of international law and its legal methodology; the second is to let the discipline be colonized by the theories and empirical methodology of political and social scientists.Footnote 2 The aim of this article is to show that there is a third methodological route which preserves the main features of the distinct legal approach, complementing it with empirical tools and techniques that enhance its validity, transparency and reliability. Our main claim is that tools and techniques originally developed to study citation networks, as well as corpus linguistic tools developed to study natural written and spoken language, provide a deeper understanding of how international courts develop international law. They can be calibrated and attuned to complement legal doctrinal methodology without turning it into empiricism.
We ask a methodological question and a substantive legal question:
1. How can citation network analysis and corpus linguistics complement the study of (international) case law?
2. How do European courts broaden legal concepts and introduce new legal categories?
The questions are connected insofar as the first question's more general methodological discussion prepares a framework for understanding the working methods employed when investigating the second question. Both questions are formulated to help demonstrate the new insights that quantitative methods can generate when integrated into a legal study.
We answer both questions by considering three components: the structure of the case law, the language of a court (reoccurring and co-occurring expressions), and legal arguments. First, we map the structure of the case law by automatically extracting direct links between cases: namely, case citations. If case A cites case B, then there is a direct link between cases A and B. Case B receives one citation and has an ‘in degree’ of 1. Cases A and B are likely to belong to the same group of cases if subsequent cases refer to both of them. They will thus form a cluster with those cases. Second, we record linguistic particularities of judicial decisions by looking at the dominance of specific words and word combinations that permeate the cases over time. Third, we show how legal (qualitative) analysis can proceed on the basis of the findings of citation and linguistic analysis, obtained with quantitative methods.
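The first component can be illustrated with a minimal sketch. It assumes that citations have already been extracted as pairs of the form (citing case, cited case); the case labels A to D and the function names are hypothetical and serve only to show how an in degree and a co-citation count are obtained.

```python
from collections import Counter
from itertools import combinations

# Hypothetical citation pairs (citing case, cited case); labels are placeholders.
citations = [
    ("A", "B"),
    ("C", "A"),
    ("C", "B"),
    ("D", "A"),
    ("D", "B"),
]

def in_degree(edges):
    """Count how many citations each case receives."""
    return Counter(cited for _, cited in edges)

def co_citation(edges):
    """Count, for each pair of cases, how many cases cite both of them."""
    cited_by = {}
    for citing, cited in edges:
        cited_by.setdefault(citing, set()).add(cited)
    pairs = Counter()
    for cited_set in cited_by.values():
        for pair in combinations(sorted(cited_set), 2):
            pairs[pair] += 1
    return pairs

degrees = in_degree(citations)
shared = co_citation(citations)
```

In this toy network, case B has an in degree of 3, and cases A and B are cited together by two subsequent cases (C and D), which is the kind of signal that suggests they belong to the same cluster.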
Our empirical material consists of the case law of two European international courts: the CJEU, and the ECtHR.Footnote 3 We selected the materials for a number of reasons. First, these courts have the largest repositories of case law, which are free, easily searchable in their databases, and available in a format readable by standard computer software. Second, their decisions are not excessively long in the manner of judgments delivered by some other international courts, such as the International Criminal Tribunal for the former Yugoslavia (ICTY). Yet, judgments from our chosen courts are long and complex enough for the automated text analysis to make sense. Third, these courts have a relatively well-established citation practice and refer to past decisions as a rule. This is helpful in building citation networks and essential for the accuracy of the ensuing analysis (bearing in mind that complete accuracy can never be achieved). Fourth, the CJEU and the ECtHR are very different in terms of tasks and competence, meaning that these institutions provide a diverse test for our approach. The CJEU is a generalist court with a broad competence, while the ECtHR is a court that interprets one legal instrument: the Convention for the Protection of Human Rights and Fundamental Freedoms (ECHR). Each institution also has a distinct organizational structure: a large secretariat performs much of the drafting at the ECtHR, while at the CJEU smaller chambers of three to five judges draft the report for the hearing as well as the final judgment. Dissent is allowed at the ECtHR, while the rule of strict collegiality is in place at the CJEU. The courts also differ greatly with regard to procedure, the preliminary reference procedure in particular being specific to the CJEU.
These differences attest to the broad applicability of the approach developed here: offering the prospect of fruitful work on other international courts. This potential breadth matters when entering the debate about the need to rearticulate the methodology of international law, and it will also matter to our conceptions of how this might be done. Certainly, every international court has unique features. These include factors such as the length of judgments, as noted with the ICTY earlier; or the low number of cases adjudicated by courts such as the International Criminal Court (ICC) and the East African Court of Justice (EACJ); or a diverse and complex jurisdiction such as the one presided over by the hybrid Caribbean Court of Justice. Yet there are factors at work other than uniqueness or exceptionalism. The citation network of the EACJ might contain only 52 judgments and two advisory opinions at present, but the numbers are growing fast and some cases are more important than others.Footnote 4 Moreover, even though the ICC rendered only two convictions by the end of 2014, the trials produced a rather important corpus of 2,300 decisions which were collected and studied using the network approach.Footnote 5 That study shed light on the neglected procedural aspects of the ICC and on themes addressed by this jurisdiction such as the evolution of the status of victims.Footnote 6 Hence, it can be argued that unique features such as low case numbers, complex jurisdictions, and the relative youth of international courts do not compromise the accuracy of the citation network's findings and certainly do not speak against the use of the approach. Rather, the challenges presented by some characteristics of individual courts should prompt researchers toward further adjustments of the existing methods, and toward the development of new measurement techniques.Footnote 7
From a substantive perspective, we first demonstrate how the CJEU constructs new legal areas. Interestingly, a systematized investigation of the CJEU's language on the basis of direct links between cases and groups of cases uncovers categories which have not hitherto been recognized in generalist European law scholarship, but which might be distinct and significant from the perspective of courts and practitioners in a specific area. These findings are confirmed by corpus linguistic analysis and legal literature that specializes in the field. In contrast, the area is not (yet) part of the general discussion of European Union (EU) law. Second, we scrutinize the process in which the ECtHR gradually extends its own jurisdiction under the umbrella of Article 14 of the ECHR. We demonstrate which new substantive rights questions the ECtHR subsumed under Article 14 ECHR, which cases were central in this respect, and in what periods they were the dominant cases.
Most importantly, we examine whether the game is worth the candle. We argue that quantitative methods, such as corpus linguistics and citation network analysis, ensure the reproducibility, generalizability, and empirical validity of doctrinal studies. They add to the transparency of legal methodology while substantially clarifying the legal method. They can provide empirical evidence to validate hunches and prove legal intuitions correct. Furthermore, they effectively address the limitations of traditional legal scholarship, including a lack of precision, subjectivity, a surplus of anecdotal evidence, and a tendency to succumb to herd behavior. Quantitative methods set objective benchmarks from which legal scholarship can, when required, criticize the practice of international courts for a lack of coherence of legal reasoning, for unjustified breaks with established case law, or for deviations from precedent which exceed judicial powers and competences; yet such methods also provide a necessary means of critically evaluating the research practice of the discipline itself. These are methods through which normative questions of judicial legitimacy, continuity, and legal certainty can be discussed from the same vantage point. They also represent one way of rendering future debates concerning the methodology of international law and the method of international courts significantly more productive. Law remains an argumentative practice, which can be analyzed empirically without compromising its normative core.
A final remark on technical skills and technology: network analysis and corpus linguistics rely heavily on technology to enable a faster, more methodical, and more efficient leafing through case law. Nonetheless, they are in no way futuristic or exclusionary methods. Search engines and databases in many domestic courts (and almost all international courts) have become of paramount importance for the visibility and legitimacy of these courts,Footnote 8 an indispensable research tool for legal scholars, and a must for legal practitioners and civil servants working under intensifying time pressure. In this sense, technology and big data have already changed the way in which valid law is established.Footnote 9 Plenty of technical tools have been developed for the study of case citation networks and corpus linguistics. The newest software is for the most part free, and it is increasingly user-friendly.
Our argument proceeds as follows. In Section 2, we map the tools of network studies and corpus linguistics and discuss their contribution to doctrinal research methods in more abstract terms. In Section 3, we illustrate the general discussion by conducting a closer study of the CJEU and the ECtHR, thereby demonstrating the usefulness of quantitative methods for legal studies of international case law. In Section 4, we briefly address the remaining challenges and avenues for future research.
2. Tools, methods and methodological synergies
There are important and unexplored methodological synergies between citation networks, language analysis, and legal analysis. All methods target the content of judicial opinions, albeit in different ways and different aspects of that content. Citation network analysis and doctrinal legal studies share the idea of webs and patterns, hence there is a structural similarity between these two types of analysis. Just as citation network analysis connects cases through their citations and orders the network in clusters, so too the doctrinal scholar organizes legal sources in a conceptual and systematic order, where different cases belong to different compartments of the system. Linguistic analysis and legal analysis focus on the structure of the legal argument in a series of judicial opinions. Unlike case citation networks, linguistic analysis is not preoccupied with the structure of the case law and with how cases hang together to form a coherent whole. Rather, it is interested in different usages of words, word combinations, and phraseology. It can be argued that linguistic analysis and legal analysis both look for conspicuous patterns in the language of courts.
Most importantly, when used in a complementary manner, legal analysis, citation network analysis, and corpus linguistic analysis can each correct the methodological shortcomings of the others.
Legal analysis is often focused on the analysis of legal problems which occur in a single case or reoccur in a small group of cases. Especially on the continent, legal commentary orders individual legal events (cases) according to more general rules and principles.Footnote 10 Its aim is to seek and ensure the logic and coherence of law,Footnote 11 relying on prior systematizations and established legal categories.Footnote 12 This method poses the risk that inquiry will be disproportionately shaped by predefined categories and that, therefore, cases which do not fit into these categories will be either left out as irrelevant or misrepresented. Furthermore, the accounts of the case law will most likely not be based on exhaustive reading of all potentially relevant cases, or on research into all possible connections between these cases.
By comparison, computer-assisted analysis of legal texts (cases) can process a very large number of texts. One can start bottom-up (the so-called corpus-driven approach), investigating the language of one part of the corpus and comparing it with the language of the rest of the corpus, or comparing the language of courts in earlier and later periods. Alternatively, one can look for conspicuous concepts which herald judicial lawmaking in a specific legal context, for concepts that legal scholarship has identified as such, or for phrases that indicate novel developments (the corpus-based approach). Corpus linguistic analysis does not provide any conclusive answers as to the legal importance and relevance of individual texts. Nor does it distinguish between more and less important cases and more and less legally relevant concepts. Often, too, the software does not structure the documents historically, but weights them equally and orders them by linguistic similarity. It looks for individual patterns in the case law but not at its structure. To answer any questions about why specific patterns occur, the findings must be interpreted.
Citation network analysis orders the documents (cases) into a structure based on the objective fact of case citation (a link between two cases), and it identifies cases that are central to this structure. It does not, however, uncover why they are central to this structure legally speaking (i.e., why case A cites case B). The content of citations and the reasons behind citations remain obscure. Moreover, citation network analysis gives a static image of a dynamic process: in reality law is not a perfect spider web, but an ongoing Scrabble game, constantly evolving.Footnote 13
The legal method is best placed to further qualitatively examine the findings of quantitative investigation. In what follows, the tools and methods of citation analysis and corpus linguistic analysis are presented in more detail.
2.1. Case citations networks: Concepts and use
The tools and methods of citation networks are not new to the study of courts and judicial behavior. Political scientists have successfully exploited case to case citations to map the structure of the case law of the United States Supreme Court (USSC).Footnote 14 They have also used the method to inquire into concepts and processes that are highly relevant to law such as judicial activism,Footnote 15 the historical rise of stare decisis, the depreciation of precedents,Footnote 16 and the strategic behaviour of individual justices of the USSC.Footnote 17 A few studies have explored citation networks and citation strategies of international courts,Footnote 18 showing that international courts develop precedent in the same way as their national counterparts: they cite precedents to legitimize their decisions to nation states. At the same time, citation patterns suggest that international judges are addressing legal actors (the national courts) rather than political actors (national governments).Footnote 19
Such research has clearly demonstrated the face validity of the network approach, and it has contributed to the advance of research agendas which take the content of judicial opinions as a starting point for analysis. This body of research has also shown that real world networks such as legislative networks, the World Wide Web, and scientific citation networks share fundamental similarities with case law citation networks. The evolution of law mimics the evolution of other network phenomena and can be studied in the same way.Footnote 20
Numerous tools and techniques to analyze citation networks are being developed and applied. For legal scholars, the study of the structure of these networks, the connections between cases and groups of cases and their evolution over time, is perhaps the most relevant aspect of the approach. On the basis of incoming citations it is possible to systematically determine which cases have a higher (mathematical) importance score: namely, which cases are central to the structure. The tools to measure case importance differ greatly. Some are basic, like citation counts (in degree, out degree); others are more advanced and analogous to the measures of social status of individuals in social networks, or the measures of importance of academic papers in academic citation networks (hub and authority scores, or the very similar PageRank).Footnote 21
Different measures should be used to answer different research questions. For instance, to study the relevance of individual cases in a citation setting, or in specific periods, legal scholars might simply need to count all instances in which cases are cited to identify the most frequently cited cases and their legal context. This measure is clearly biased towards older cases that have had more time to accumulate citations, and might not lead to convincing answers to the question of their landmark status. By contrast, a further legal inquiry into the content of those citations will lead to a more productive study of how the concept (first used in the studied case) developed over a series of cases and how it affected subsequent case law.
To calculate the landmark status of cases, existing studies of case law use a more complex measure, based on the so-called eigenvector centrality,Footnote 22 which takes into account the importance of citations as well as the number of citations and eliminates the bias of time. The importance of a case is based on the quantity (the number) as well as the quality of citations (their importance). By way of analogy, the importance of a given scientific paper is proportionate to the importance of papers that cite it, or the importance of the journal in which it is cited (i.e., the number of citations the citing paper itself receives). It depends on both the number of citations and their relative importance. No matter how large the network, or how many citations we record in absolute terms, there will always be some judgments in the network that will be more central and others which will remain more peripheral.Footnote 23
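The difference between counting citations and weighting them can be made concrete with a minimal sketch. It implements a plain power iteration for PageRank, one member of the family of eigenvector-based measures mentioned above; it is not the specific measure used in the studies cited, and the damping value of 0.85, the function name, and the toy network are assumptions made for illustration.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Power iteration for PageRank on a directed citation graph.

    edges: list of (citing case, cited case) pairs.
    """
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for citing, cited in edges:
        out_links[citing].append(cited)

    n = len(nodes)
    score = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Cases that cite nothing spread their score evenly over all cases.
        dangling = sum(score[v] for v in nodes if not out_links[v])
        new = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for target in out_links[v]:
                new[target] += damping * score[v] / len(out_links[v])
        score = new
    return score

# Hypothetical network: X and Y each receive exactly one citation,
# but X is cited by H, which is itself cited by three cases.
edges = [("a", "H"), ("b", "H"), ("c", "H"), ("H", "X"), ("d", "Y")]
scores = pagerank(edges)
```

Here X and Y have the same in degree, yet X obtains the higher score because the case citing it is itself frequently cited. This is the sense in which such measures take the quality as well as the quantity of citations into account.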
The findings of the citation network analysis will rarely stand alone or provide answers to legal questions. The studies noted above combined citation network tools with statistical tools to investigate questions related to law. By contrast, we turn now to the language of courts.
2.2. Corpus linguistic analysis as an extension of legal analysis
Language analysis is in many ways the most natural complement of doctrinal legal analysis. On a theoretical plane, legal scholars recognize that the meaning of legal concepts can only be captured by examining the usages of a concept or a word in larger language structures, like sentences or phrases.Footnote 24 On a practical plane, courts, national and international, use highly specialized language to construct legal concepts and legal doctrines.Footnote 25 International case law is imbued with pre-fabricated phrases which spread throughout the lexicon of international courts. Such phrases also cross-pollinate from international courts to national courts and vice versa.Footnote 26 On a methodological plane, recent empirical studies emphasize the importance of the language of legal doctrine and language development in judicial opinions.Footnote 27
Linguistics has undergone a similar turn in its investigation of meaning, from individual words to phraseology,Footnote 28 spurred by Sinclair's seminal studies of word sequences.Footnote 29 Since then, the study of structure, meaning, and word combinations of natural language has grown rapidly. By comparison, research in specialized language, like the language of law and courts, has lagged behind.Footnote 30 Thus, the role of key phrases in jurisprudence in the process of doctrine building remains insufficiently explored.
In brief, modern corpus linguistic analysis is an approach to the study of language that uses computer assisted methods to detect multi word units, word sequences, and phrases. Empirical materials can consist of any kind of text such as judgments of international courts, legislative proposals, or statutes. The findings obtained on this basis are objective: an argument does not depend entirely on the authority of the interpreter but on the persuasiveness of the findings.Footnote 31 One of the most widespread methods of studying meaning is through analyzing collocations. Collocation is the statistical tendency of words to co-occur. For instance, the words ‘big’ and ‘great’ will occur with the word ‘deal’ to form a stable semantic unit (big deal, great deal), which will acquire a specific meaning (big deal will signify something of importance while great deal will signify magnitude). More importantly, ‘big’ paired with ‘deal’ will disambiguate the word deal.Footnote 32
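The idea of a collocate can be sketched in a few lines. The example below simply counts the words that appear within a fixed window around a search word; the sample sentences are invented for illustration, and the tokenizer, function name, and window size are assumptions rather than features of any particular corpus tool.

```python
import re
from collections import Counter

def collocates(text, node, window=2):
    """Count words occurring within `window` tokens left or right of `node`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != node:
            continue
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        counts.update(left + right)
    return counts

# Invented toy sentences, used only to show the mechanics.
sample = (
    "That is a big deal for the parties. "
    "The court spent a great deal of time on it. "
    "A big deal was made of the ruling."
)
deal_collocates = collocates(sample, "deal", window=1)
```

With a window of one word on each side, 'big' is counted twice and 'great' once as collocates of 'deal'. Raw counts of this kind are only a starting point; collocation strength is normally assessed with a statistical measure on top of them.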
In the following section, we illustrate how the methodological synergies between social networks, corpus linguistics, and legal analysis discussed above can be exploited to answer legal questions.
3. Quantitative basis, qualitative analysis: Stronger together
3.1. Application: The effectiveness of European Union law
Courts everywhere invoke past decisions to justify their rulings, citing them as precedents that propel specific solutions. When they do so, it is possible to build a citation network and visualize the structure of the case law, including the cases that are central to this structure. When courts use past decisions implicitly by using the same arguments, which tend to be expressed in standardized linguistic formulations or phrases, the analysis of language rather than citations will be more informative to legal scholars. Similarly, language analysis based on citation networks will be a more productive method of investigating the underlying legal reasons for citations, their specific type, or normative force.Footnote 33 Together, these quantitative methods will unpack the process in which international courts develop international law through practice.
As noted above, a corpus based approach is helpful in analyzing how and when specific concepts developed. For example, the CJEU regularly weaves effectiveness into its decisions. Linguistically, effectiveness is asserted in terms of effective protection of individual rights, effective remedy, the full effect of secondary law, Treaty articles or European policies, and useful effect. It is epitomized in the so-called effectiveness and equivalence requirement. Previous literature has identified effectiveness as a concept that is often associated with legal innovation. Literature typically distinguishes between effectiveness in the sense of full effect of EU law, and effectiveness in the procedural sense. Full effect is usually defined as a legal principle, which is closely related to the CJEU's methods of interpretation, in particular with the liberal statutory interpretation.Footnote 34 Procedural effectiveness is discussed in terms of the principle of effectiveness and equivalence of EU remedies in procedures in which EU rights are enforced before national courts. Occasionally, the application of procedural effectiveness has been problematized because of its possible negative impact on the autonomy of legal remedies to secure EU rights and incursions into the autonomy of the member states (the principle of procedural autonomy).Footnote 35
To examine how the CJEU uses effectiveness in its case law, we constructed a citation network of all cases that contain the word effectiveness (1,707 in total) from the network of all cases of the CJEU (approximately 10,000 judgments in total). We extracted all the information from EUR-Lex, which is a publicly available database of all legislation and case law of the EU.Footnote 36 We then stored it, and further processed it to obtain more granular data on the cited paragraphs: that is, their consecutive number and text.Footnote 37 The network presented in Figure 1 is a network of individual paragraphs of cases, which contain the word effectiveness in the text, composed of 18,247 paragraphs and 15,615 links or citations between those paragraphs.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170428123823-78195-mediumThumb-S0922156517000085_fig1g.jpg?pub-status=live)
Figure 1: Paragraph to paragraph case citation network of the case law of the CJEUFootnote 54
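The selection step behind a network of this kind can be sketched as follows. The sketch assumes that paragraphs are stored under a (case identifier, paragraph number) key and that citations are pairs of such keys; the records, identifiers, and function name are hypothetical and do not reproduce the processing pipeline actually used for Figure 1.

```python
def keyword_subnetwork(paragraphs, links, keyword):
    """Keep paragraphs whose text mentions `keyword` and the links between them."""
    keyword = keyword.lower()
    kept = {pid for pid, text in paragraphs.items() if keyword in text.lower()}
    kept_links = [(src, dst) for src, dst in links if src in kept and dst in kept]
    return kept, kept_links

# Hypothetical records: (case id, paragraph number) -> invented paragraph text.
paragraphs = {
    ("case-1", 12): "The effectiveness of Community law would otherwise be impaired.",
    ("case-2", 30): "National rules must not undermine the effectiveness of the directive.",
    ("case-3", 5): "The costs are reserved.",
}
# Hypothetical paragraph-to-paragraph citations (citing, cited).
links = [
    (("case-2", 30), ("case-1", 12)),
    (("case-3", 5), ("case-1", 12)),
]
nodes, edges = keyword_subnetwork(paragraphs, links, "effectiveness")
```

In this toy example two paragraphs and one citation survive the filter. Whether the keyword test is applied to whole judgments or to individual paragraphs is a design choice that changes the size of the resulting network.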
The additional information points to specific parts and aspects of cases which the CJEU cites most often, indicating that they might also be more important from the legal point of view to the CJEU or the parties to the case. The visualization of the network in Figure 1 is an overview of the contexts in which the concept of effectiveness appears. This will be immediately obvious to a legal scholar who specializes in EU law, but will also give a good general impression to non-specialists in the field. In Figure 1, we can observe that effectiveness features in cases often associated with the full effect of EU law (Factortame,Footnote 38 Simmenthal,Footnote 39 Francovich,Footnote 40 BrasserieFootnote 41) and in cases that concern the procedural type of effectiveness, meaning national procedural arrangements, rules, and remedies in place for the enforcement of European rights (Rewe/Comet).Footnote 42 However, the network in Figure 1 also provides more detailed information about the specific sub-areas of the procedural type of effectiveness such as time limits (Palmisani),Footnote 43 rules of evidence (San Giorgio),Footnote 44 effective judicial control and the principle of effective protection of rights (Johnston,Footnote 45 Heylens),Footnote 46 the importance of co-operation of national courts with the CJEU (Peterbroeck),Footnote 47 and the principle of conform interpretation or the so-called indirect effect (Marleasing,Footnote 48 Von Colson).Footnote 49 It also points to sub-areas related to procedural effectiveness, like the limits of the CJEU's jurisdiction with regard to preliminary references from national courts. In the latter sense, PreussenElektra (para. 39)Footnote 50 and Bosman (paras. 59 and 61)Footnote 51 deal with conditions under which a reference is necessary. Skatteverket v. A.Footnote 52 and Compagnie St. Gobain Footnote 53 concern tax law.
On the basis of direct citations and the network structure, we searched for conspicuous linguistic patterns or unexpected formulations in the sub-groups. Effectiveness of fiscal supervision appears as a separate and linguistically stable category, as presented in Figure 2. Figure 2 is a so-called collocate cloud which efficiently summarizes and visualizes the use of the concept of effectiveness in our corpus of all 1,707 effectiveness judgments and provides a focused view of this corpus. Collocate clouds generally contain words and phrases that co-occur with the search word (before it, to the left; after it, to the right; or within a specified interval). Unlike a typical word cloud, a collocate cloud does not summarize the entire texts of judgments. Figure 2 below is an extract of the collocate cloud, meaning that it contains only the most frequently used collocates of effectiveness which are still readable in print.Footnote 55 The size of the text in Figure 2 reflects the frequency of particular words and phrases that co-occur with effectiveness, while the level of textual brightness is proportionate to the collocation strength, defined as the tendency of a word or phrase to occur with the search word.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170428123823-64566-mediumThumb-S0922156517000085_fig2g.jpg?pub-status=live)
Figure 2: Effectiveness word cloud (extract)
The effectiveness of fiscal supervision is a long-established and well-accepted justification of restrictions to the free movement of goods, services and capital. A corpus linguistic analysis also shows that this category of effectiveness has gained increasing importance over four consecutive decades, from the 1970s to the 2000s, as Table 1 below summarizes. Table 1 gives an overview of the most frequent collocates, and the most likely collocates of effectiveness. The collocates are ordered (ranked) according to their t-score. For instance, in the 1980s (columns five to eight from the left), the CJEU most often used effectiveness in connection with community law, and in formulations such as deprive the effectiveness of the Treaty, or of the system of protection, or the effectiveness of national rules. The words community, law, deprive, system, treaty, and national are the most frequent (column six from the left) and have the highest t-score (column seven from the left). By way of comparison, in the 2000s, effectiveness was most often associated with the words principle, directive, article, and principles, as shown in the first column from the right. Apart from these expected and most likely collocates, Table 1 also lists some less expected collocates and some changes in the linguistic patterns, as expressed through collocates, over four decades.Footnote 56
Table 1: Effectiveness collocates (extract)Footnote 59
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170428123823-06770-mediumThumb-S0922156517000085_tab1.jpg?pub-status=live)
The most conspicuous example is that the effectiveness of fiscal supervision has become a more frequent collocate of effectiveness since the 1970s. This is demonstrated by its t-value,Footnote 57 listed in column three from the left for the 1970s, in column seven for the 1980s, in column 11 for the 1990s, and in column 15 for the 2000s. It is also demonstrated by the frequency with which it occurred in those periods (columns two, six, 10, and 14 from the left). The collocates fiscal and supervision are ranked higher (they have a higher t-score) in the 1990s and the 2000s than they were in the 1970s and the 1980s, as is clear from columns one, five, nine and 13. Whereas the word fiscal ranked 188th by t-score in the 1980s, it ranked 64th in the 1990s. The expressions fiscal and supervision are also closer together in the 2000s than they were in earlier periods (ranked 19 and 20, as presented in the first column from the right), which further indicates their frequent reoccurrence.
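For readers unfamiliar with the measure, the following sketch shows one commonly used formulation of the t-score for a collocation: the observed co-occurrence frequency minus the frequency expected by chance, divided by the square root of the observed frequency. This is an assumption about the general form of the statistic, not a reproduction of the exact settings behind Table 1 (corpus tools differ, for instance, in how they adjust for window size), and all counts below are hypothetical.

```python
import math

def t_score(pair_freq, node_freq, collocate_freq, corpus_size):
    """t-score of a collocation: (observed - expected) / sqrt(observed).

    Simplified form without a window-size adjustment (an assumption).
    """
    if pair_freq <= 0:
        raise ValueError("pair_freq must be positive")
    expected = node_freq * collocate_freq / corpus_size
    return (pair_freq - expected) / math.sqrt(pair_freq)

# Hypothetical counts: the search word occurs 1,000 times and the collocate
# 200 times in a corpus of 1,000,000 tokens.
frequent_pair = t_score(pair_freq=30, node_freq=1000, collocate_freq=200, corpus_size=1_000_000)
rare_pair = t_score(pair_freq=2, node_freq=1000, collocate_freq=200, corpus_size=1_000_000)
```

Under these hypothetical counts, the pair that co-occurs 30 times obtains a markedly higher t-score than the pair that co-occurs twice. This illustrates why a ranking by t-score tends to favour frequent and stable combinations.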
These findings imply that when scholars examine the principle of effectiveness in legal literature they would be wise to pay special attention to how this principle operates as a justification of state measures, and how this impacts, first, the balance of EU and member state competences, and second, the substantive law in different areas in which the justification is asserted and accepted or rejected. The findings also suggest that the current typology of effectiveness is imprecise and cannot accommodate a category that is different from both the procedural aspect of effectiveness and effectiveness as a principle of interpretation in the context of protection of individual rights and the corresponding duties of national courts. We might be dealing with a new category dating back to the famous Cassis de Dijon Footnote 58 case that has not been discussed as such because it is usually subsumed under free movement.
Firstly, the example shows that the combination of citation analysis and corpus linguistic analysis offers a basis that can complement legal analysis and systematization – especially when a more granulated approach is applied to individual paragraphs and to the investigation of stable formulations in a smaller sets of texts, which we believe share an important legal concept such as effectiveness. Secondly, it demonstrates why qualitative analysis and quantitative methods are stronger together, and it notably highlights the continued relevance of legal analysis. Namely, while the findings of corpus linguistic analysis strongly indicate the increasing importance of a new category of effectiveness in legal discourse they are inconclusive in themselves. Is the category becoming legally significant from the perspective of the Court, or is it only increasingly invoked by the legal actors, and especially the member states, while its legal merit or relevance remain unchanged? Whichever the case, the quantitative findings are interesting, but must be interpreted qualitatively.
The method can also detect significant communities in the case law that have not yet been awarded standing in the general textbooks systematizing the existing case law, but which might become significant in the future, or are already significant from the perspective of the CJEU and practitioners.
The next example takes into account direct links between cases, linguistic similarity, and legal arguments. We built a network of all CJEU cases decided from 1954 to 2014 (approximately 10,000 judgments). To visualize the structure of the network, we used a method that identifies groups of cases, or clusters, on the basis of their similarity. The so-called Louvain method for detecting clusters belongs to a family of techniques usually referred to as modularity maximization, and it is the method most commonly used to structure and analyze large networks.Footnote 60 Cases are organized into groups or clusters on the basis of their links, that is, direct citations. The cases in the same cluster cite each other more than they cite other cases in the network, most likely because they relate to the same or a similar legal issue. In Figure 3, groups of cases are visualized as large dots composed of cases that spread around a central case, which forms the center of the cluster. The method identifies well-established communities such as the prohibition of measures having equivalent effect to quantitative restrictions, freedom of movement of persons, and staff cases. These center around well-known cases like Dassonville,Footnote 61 Cassis Footnote 62 and Bosman.Footnote 63 Those cases are the most cited cases in their communities.
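The quantity that modularity maximization seeks to increase can be illustrated with a short sketch. It computes the modularity score Q of a given partition, treating citations as undirected links, which is a simplifying assumption. It does not implement the Louvain method itself, which repeatedly moves cases between clusters and merges clusters in order to raise this score. The case labels and links below are hypothetical.

```python
from collections import defaultdict

def modularity(edges, community_of):
    """Modularity Q of a partition: sum over clusters of L_c/m - (d_c/2m)^2.

    L_c = links inside cluster c, d_c = total degree of its cases, m = all links.
    """
    m = len(edges)
    internal = defaultdict(int)    # links with both ends in the same cluster
    degree_sum = defaultdict(int)  # summed degree of the cases in each cluster
    for a, b in edges:
        ca, cb = community_of[a], community_of[b]
        degree_sum[ca] += 1
        degree_sum[cb] += 1
        if ca == cb:
            internal[ca] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)

# Hypothetical network: two tightly linked groups joined by a single citation.
edges = [
    ("A1", "A2"), ("A2", "A3"), ("A1", "A3"),
    ("B1", "B2"), ("B2", "B3"), ("B1", "B3"),
    ("A3", "B1"),
]
two_clusters = {"A1": "A", "A2": "A", "A3": "A", "B1": "B", "B2": "B", "B3": "B"}
one_cluster = {case: "all" for case in two_clusters}

q_split = modularity(edges, two_clusters)
q_single = modularity(edges, one_cluster)
```

In this toy network the two-cluster partition scores higher than lumping every case together, which is the sense in which cases in one cluster cite each other more than they cite the rest of the network.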
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170428123823-94810-mediumThumb-S0922156517000085_fig3g.jpg?pub-status=live)
Figure 3: Clustering of the case law of the CJEU from 1954 – 2013Footnote 68
One (perhaps surprising) center also identified with this method is Halifax,Footnote 64 a case decided by the CJEU in 2006 (bottom left corner in Figure 3). According to the classification that was assigned to the case by the Registry, Halifax concerned the Sixth VAT Directive, and in particular transactions designed solely to obtain a tax advantage. The CJEU explicitly referred to Halifax in 50 subsequent cases, and when measured by its authority score the case ranks as number 243. This could indicate that the CJEU has subsequently developed a line of case law around its decision in Halifax, which is becoming more important as the case law network continues to grow. A review of the literature suggests that tax law scholars saw Halifax as a redefining moment in tax law, striking down transactions that were designed for the purposes of tax evasion and avoidance.Footnote 65 By contrast, the case went largely unnoticed in generalist textbooks of EU law.Footnote 66
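For readers unfamiliar with authority scores, the sketch below shows one widely used way of computing them, the hubs-and-authorities (HITS) iteration. It is an assumption that the score reported here follows this approach. The citation pairs, the labels and the number of iterations are hypothetical.

```python
def hits(citations, iterations=50):
    """Return (hub, authority) scores for a list of (citing, cited) pairs."""
    nodes = sorted({case for pair in citations for case in pair})
    hub = {case: 1.0 for case in nodes}
    auth = {case: 1.0 for case in nodes}
    for _ in range(iterations):
        # A case is a good authority if good hubs cite it.
        auth = {case: 0.0 for case in nodes}
        for citing, cited in citations:
            auth[cited] += hub[citing]
        # A case is a good hub if it cites good authorities.
        new_hub = {case: 0.0 for case in nodes}
        for citing, cited in citations:
            new_hub[citing] += auth[cited]
        # Normalize so that each set of scores sums to one.
        a_total = sum(auth.values()) or 1.0
        h_total = sum(new_hub.values()) or 1.0
        auth = {case: value / a_total for case, value in auth.items()}
        hub = {case: value / h_total for case, value in new_hub.items()}
    return hub, auth

# Hypothetical citations: three later cases cite L, two cite M.
citations = [("C1", "L"), ("C2", "L"), ("C3", "L"), ("C3", "M"), ("C4", "M")]
hub, auth = hits(citations)
ranking = sorted(auth, key=auth.get, reverse=True)
```

Unlike a simple count of incoming citations, the authority score also reflects which cases do the citing, so a case's rank can differ from its position by citation count.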
To investigate the contribution of Halifax to the case law, employing a method similar to the above example, we combined its citation pattern with the analysis of collocations. As is well known, the CJEU developed an abuse of rights test in Halifax, and the court distinguished tax avoidance cases from tax planning cases.Footnote 67 The CJEU added that (provided that the effectiveness of Community law was not undermined) national courts had to verify whether action constituting such an abusive practice had taken place in the case before them in accordance with the rules of evidence of national law. Nonetheless, it gave the referring national court precise guidelines, from which it was obvious that the situation amounted to an abuse of rights in the field of VAT.Footnote 69
A table of collocates (see Table 1 above) confirms and contextualizes the findings obtained on the basis of citation network analysis. We used the same corpus of texts as previously, since effectiveness of Community law is a standard against which the national law will be assessed when national judges apply the text in their jurisdictions. We checked whether abuse of Community rules, tax avoidance, or any other words that might point in the direction of a developing doctrine of abuse of rights in tax law, especially regarding the Sixth VAT Directive, gained significance. In fact, words associated with the abuse of rights doctrine in the area of taxation (tax, evasion, avoidance, taxation, sixth – marked in bold in Table 1) appear only in the 2000s, which is the decade of the CJEU's decision (see first column from the right in Table 1). The words tax, evasion, and avoidance exhibit an especially strong relation to the concept of effectiveness, as demonstrated by a very high t-score in the 2000s (third column from the right) and the rank in Table 1 (fourth column from the right). The findings of the quantitative analysis confirm a clear trend, which offers a basis for further legal analysis and for the investigation of nuances in the legal development of the abuse of rights doctrine.
3.2. Application: Expanding the reach of non-discrimination under Article 14 ECHR
This example demonstrates that citation network analysis can both structure and extend legal analysis. We focus on the ECtHR and demonstrate how it gradually included new categories under Article 14 of the ECHR. The network analysis first shows the development of the case law over successive periods and, second, identifies cases that have served as stepping stones in this development. Qualitative analysis, which accompanies quantitative findings, shows that the ECtHR substantially extended its own jurisdiction by broadening the scope of protection of Article 14 ECHR to include new legal categories and rights.
In the 1970s, and onward through the 1980s, the ECtHR used Article 14 in association with Article 1 of Protocol 1 to the ECHR (hereinafter P1-1) in relation to discrimination issues that concerned inheritance rights (where the right to inherit was interpreted as falling within the ambit of the right to property) and tax issues (the duty to pay or deduct tax being interpreted by the ECtHR as the right to property). In the 1990s, the ECtHR started to take on various cases relating to pension rights and other welfare allowances. The first of these cases concerned the right to receive a pension from so-called contributory pension schemes (i.e., pensions that were paid out to persons who over the years had contributed to the fund that was paying out the pensions), but this case law gradually spread to non-contributory pension schemes (i.e., pension schemes that are funded through general taxation and where recipients have not necessarily contributed).
To chart legal development, we used a part of the citation network of all cases that refer to Article 14 ECHR. The data was collected from HUDOC, a publicly available database. The selection of cases was made on the basis of the classification in the database: in this instance, metadata. The final network includes 14 cases that refer to Article 14 and P1-1 (Figure 4). Because we investigated the development of the case law that followed Marckx v. Belgium,Footnote 71 Figure 4 displays only those cases which: i) refer to Marckx; or ii) reference another case that cites Marckx (i.e., cases indirectly linked to Marckx via another case).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170428123823-37390-mediumThumb-S0922156517000085_fig4g.jpg?pub-status=live)
Figure 4: Case citation network consisting of cases that refer to Marckx v. Belgium Footnote 70
In Marckx, the ECtHR held that the Belgian law which limited the inheritance of children born out of wedlock violated Article 14, in conjunction with P1-1. While admitting that P1-1 did not grant a right to inheritance, the ECtHR held that, in conjunction with Article 14, it prohibited the state from discriminating between children born to married couples and children born out of wedlock.Footnote 72 It found a violation of Article 8, Article 8 + 14 and Article 14 + P1-1. The next case, James and Others v. the United Kingdom,Footnote 73 decided in 1986, allowed the United Kingdom to maintain its legislation (which let the long-term tenants concerned buy the land they had rented) in spite of the fact that the legislation discriminated on the basis of wealth. The ECtHR found that the legislation served a legitimate purpose and hence did not violate the Convention rights.Footnote 74 Inze v. Austria (1987) concerned, in a manner similar to Marckx, differential treatment with regard to inheritance rights. The ECtHR found a breach of Article 14 + P1-1, because there was no legitimate reason for the differential treatment.Footnote 75 As indicated in Figure 4, both cases explicitly refer to Marckx.
Karlheinz Schmidt v. Germany (1994) touched upon a discriminatory duty to pay a fire service levy, imposed on men only.Footnote 76 In the judgment, the ECtHR applied Article 14 in conjunction with Article 4 (freedom from slavery and forced labor), citing, however, cases dealing with the right to property. Gaygusuz v. Austria (1996) dealt with discrimination in respect of property rights on grounds of nationality and set an important precedent in that social benefits fell within the scope of P1-1 and had to respect Article 14.Footnote 77 In 1997, in Van Raalte v. The Netherlands, the ECtHR found a breach of Article 14 + P1-1. This was a case concerning differential treatment in granting exemptions from taxation to childless women over 45 but not to childless men of the same age.Footnote 78 Neither Karlheinz Schmidt nor Gaygusuz cites Marckx; they refer instead to Inze. Van Raalte cites Karlheinz Schmidt but not Inze, Gaygusuz, or Marckx. This is indicative of the developments within the case law. A case setting a precedent (such as Marckx) will not be continuously cited as the case law evolves but will nevertheless be treated as a leading case from the network citation perspective, because of its connections with cases that replaced it as precedent. Stec and Others and Andrejeva show the extent to which Article 14 + P1-1 has developed throughout the last three decades.Footnote 79 Neither case cites Marckx but – like the leading cases of the nineties – they expand upon previous precedents. Through this charting method, it becomes easier to accurately envisage the expansion in the network and the development in the Article 14 + P1-1 case law. The benefit of this method is further strengthened by subsequent developments.
In Willis v. the United Kingdom (2002) the ECtHR held that the entitlement to the widow's benefit was discriminatory since it only applied to women.Footnote 80 Willis cites Gaygusuz multiple times and is a continuation of the precedent set in this judgment. Stec and Others v. the United Kingdom (2006) concerned differential treatment with regard to Reduced Earnings Allowance, which was a benefit given to people who could no longer work full-time, granted at different ages depending on gender.Footnote 81 The differential treatment was found to be justified as it was grounded in a legitimate aim.Footnote 82 However, although the ECtHR did not find a breach in the specific case, it did find that the allowance fell within the right to property. This extends the reach of the ECHR to any state benefit. Despite being a recent case, Stec and Others is one of the most cited cases in the Article 14 network, further indicating the significance of the judgment. A further development took place in Andrejeva v. Latvia (2009), which concerned discrimination on the grounds of nationality in the calculation of pensions. While the ECtHR found that the different treatment between citizens and non-citizens pursued a reasonable aim, it held that the treatment was not reasonably proportionate to that aim and therefore constituted a breach of Article 14 + P1-1.Footnote 83 Stec cites James and Others, and Andrejeva cites both Stec and Gaygusuz. This latest extension of the case law therefore continues the citation patterns in the network.
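The citation pattern described in this subsection can be retraced with a short sketch that measures how many citation steps separate each case from Marckx. The list contains only the citations explicitly mentioned in the text above and is therefore a partial illustration, not a reconstruction of the full 14-case network in Figure 4. The function name and the shortened case labels are our own.

```python
from collections import defaultdict, deque

# (citing case, cited case) pairs as reported in the text above.
citations = [
    ("James and Others", "Marckx"),
    ("Inze", "Marckx"),
    ("Karlheinz Schmidt", "Inze"),
    ("Gaygusuz", "Inze"),
    ("Van Raalte", "Karlheinz Schmidt"),
    ("Willis", "Gaygusuz"),
    ("Stec and Others", "James and Others"),
    ("Andrejeva", "Stec and Others"),
    ("Andrejeva", "Gaygusuz"),
]

def citation_distance(target, citations):
    """Shortest number of citation steps from each case back to `target`."""
    cited_by = defaultdict(list)
    for citing, cited in citations:
        cited_by[cited].append(citing)
    distance = {target: 0}
    queue = deque([target])
    while queue:  # breadth-first search along reversed citations
        case = queue.popleft()
        for citing in cited_by[case]:
            if citing not in distance:
                distance[citing] = distance[case] + 1
                queue.append(citing)
    return distance

distance = citation_distance("Marckx", citations)
direct = sorted(case for case, d in distance.items() if d == 1)
```

On these reported citations, only James and Others and Inze cite Marckx directly, while the later cases reach it through one or two intermediate judgments. This mirrors the observation that a leading case remains central to the network even when it is no longer cited.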
3.3. In parentheses: One method for all courts?
A criticism that is regularly mounted against the use of citation network analysis is that it can only be applied to common law courts with a developed and rigorous citation practice. The critique is twofold: it targets the reliability of the method (are we measuring what we think we are measuring?) and it also questions whether the method will yield (more) accurate results in the case of common law courts, such as the USSC, while yielding inaccurate or less reliable results in the case of international courts modeled after continental style courts like the CJEU.
As comparative studies attest, explicit references to case law are a matter of judicial style, which varies among countries and courts.Footnote 84 Some courts will follow precedent without citing it, referring to the established case law instead. For instance, especially in older judgments, the CJEU might make no reference to a specific case, but choose to phrase matters as follows:
as the Court has repeatedly held in its decided cases [in French: ‘comme la Cour l'a dit dans une jurisprudence constante’] it should be stressed that, as far as concerns general acts, especially regulations, the requirements of Article 190 of the Treaty are satisfied if the statement of reasons given explains in essence the measures taken by the institutions and that a specific statement of reasons in support of all the details which might be contained in such a measure cannot be required, provided such details fall within the general scheme of the measures as a whole.Footnote 85
The problem is not impossible to overcome. Even in the case of the CJEU the overlap between qualitative and quantitative methods of identifying important precedent is significant.Footnote 86 Thus, while the case citation network of the CJEU might contain fewer citations (meaning fewer links between the cases than that of the USSC), the relationship between more-cited and less-cited cases will be stable. The mere fact that the USSC cites more cases per judgment on average does not mean that the method is not valid for the CJEU, which cites fewer cases per judgment. More telling is the fact that the structure of the case citation network of the CJEU is almost identical to that of the USSC: both networks contain more important cases and less important cases. The cases group into clusters, meaning that some cases exhibit many links to certain cases and have fewer links to other cases. Most often, this will be due to the fact that they address the same subject, such as free movement of goods in the case of the CJEU or the right to privacy in the case of the USSC. However, to find groups of cases which exhibit greater similarity, or which have a specific feature in common, researchers might use different techniques that are better suited for specific types of citation networks. This is commonly referred to as clustering analysis,Footnote 87 and comprises various methods and techniques.Footnote 88
The network approach relies on observable relations between cases, meaning that individual cases have to be connected either by citations, legal concepts, linguistic expressions, or references to legislative instruments. While some international courts might not have a developed case citation practice, nearly all courts will refer to legislative instruments and treaties, and use the same linguistic expressions or concepts. This will certainly have an impact on the set of questions that researchers will be able to answer. At the same time, it might increase the variety of questions asked – and their originality – leading in turn to new research agendas.
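As an illustration of how cases can be linked without case citations, the sketch below connects two cases whenever they refer to at least one common legislative instrument, and weights the link by the overlap between their sets of references (Jaccard similarity). The case names, the instruments and the choice of Jaccard similarity are hypothetical assumptions made for the example.

```python
from itertools import combinations

def shared_reference_links(references, min_shared=1):
    """Link cases that cite common instruments; weight = Jaccard overlap."""
    links = {}
    for a, b in combinations(sorted(references), 2):
        shared = references[a] & references[b]
        if len(shared) >= min_shared:
            combined = references[a] | references[b]
            links[(a, b)] = len(shared) / len(combined)
    return links

# Hypothetical cases and the instruments each of them refers to.
references = {
    "Case A": {"Treaty Art. 1", "Regulation X"},
    "Case B": {"Treaty Art. 1"},
    "Case C": {"Directive Y"},
    "Case D": {"Regulation X", "Directive Y"},
}

links = shared_reference_links(references)
```

The resulting weighted links can be analyzed with the same network techniques as direct citations, although the questions they can answer will differ.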
4. Conclusion
This article contributes to the methodological debate surrounding the study of international law and the case law of international courts. It focuses on how empirical quantitative methods, especially citation network analysis and corpus linguistics, enhance the validity, reliability, and transparency of the established legal method. By analyzing the output of two European courts, it demonstrates how quantitative methods complement legal studies and contribute to a fuller understanding of the law making of international courts.
Our conclusions are two-fold. Firstly, citation network analysis and corpus linguistics do not dispense with traditional legal studies, which remain highly relevant. They do, however, reduce the traditional (ideological) influence of legal scholars by providing a tool to reproduce their studies and test their arguments. This in turn creates a platform for legal debate, which can now be substantiated with more than a handful of selected examples or judicial big bangs. Furthermore, even when the results of citation network analysis point in the same direction as legal studies, they can further refine such studies and render them reproducible (namely, processes that are often discussed in legal scholarship such as precedent construction, justification, and legal interpretation, can be made explicit). In brief, legal scholars gain a stable and comprehensive quantitative basis for a qualitative study of case law, precedent and interpretation. Additional benefit stems from a set of transparent criteria by which to criticize the jurisprudence of international courts. Firmer ground emerges from which to evaluate the courts’ role in the political process, their societal impact, and their legitimacy.
Secondly, a quantitative approach can serve to (re)articulate the methodology of international law rather than simply add to the mounting literature produced by social and political scientists. The quantitative techniques of the citation network approach are relevant for the study of international case law primarily because (unlike many methods used by political scientists) they clearly shift the focus from the questions related to law to the content of judicial decisions, meaning the law itself. Network analysis is interested in structures, just like legal science, which is sometimes (mockingly) labeled as an ordering exercise. Legal scholars stand to gain very substantially from systematically mapping structures as they emerge through direct relations between cases and from investigating how individual cases relate to (and fit into) this structure. This potential gain is at the heart of the network approach. Case citation network analysis supplements – or at least provides a mode of checking – traditional criteria such as coherence and consistency. These criteria continue to be widely used to assess and criticize the case law of courts, yet in an unaltered state they remain open to being critiqued as open ended, formal (lacking content)Footnote 89 and ideological.Footnote 90
Finally, the overall contribution of this article stems from practice – it operationalizes the theoretical and epistemological debate concerning inter-disciplinarity. It contributes to the debate by putting inter-disciplinarity into practice.