Introduction
‘An Invincible Memory’ (in Portuguese, Viva O Povo Brasileiro) is the masterpiece of the Brazilian writer João Ubaldo Ribeiro (Ribeiro, Reference Ribeiro2009). In his book, Ribeiro discusses the formation of the identity of the Brazilian people, passing through all the main political periods of the country’s modern history: colonization by Portugal (1500–1822), empire (1822–1889) and republic (1889–now). The beginning of both stories (Ubaldo Ribeiro’s novel and Brazilian history) occurs in a territory currently known as the Brazilian Northeast – one of the five geographic regions of the largest country in South America.
During the colonial period, the Brazilian Northeast was the main economic and administrative centre of the country. This region was also where the first admixture occurred among the three main peoples that currently constitute the genetic background of the Brazilian population: Europeans (mainly Portuguese), Africans and Native Americans (Pena et al., Reference Pena, Bastos-Rodrigues, Pimenta and Bydlowski2009; Manta et al., Reference Manta, Pereira, Vianna, Araújo, Leite and Silva2013; Ruiz-Linares et al., Reference Ruiz-Linares, Adhikari, Acuña-Alonzo, Quinto-Sanchez, Jaramillo and Arias2014). The Northeast is the second most populous Brazilian region at present (more than 53 million inhabitants), and it presents the third largest geographic area (more than 1.5 million km²) and the lowest Human Development Index (HDI: 0.663) (IBGE, 2010).
Classic and recent studies have shown that the Brazilian Northeast presents one of the highest rates of inbreeding in the country, associated with a higher incidence of autosomal recessive diseases (Freire-Maia, Reference Freire-Maia1957, Reference Freire-Maia1990; Weller et al., Reference Weller, Tanieri, Pereira, Almeida, Kok and Santos2012; Cardoso et al., Reference Cardoso, de Oliveira, Paixão-Côrtes, Castilla and Schuler-Faccini2018). The relationship between endogamy and health outcomes such as the prevalence of congenital anomalies and agglomeration of cases of rare genetic illnesses has been investigated in some Northeastern municipalities (Santos et al., Reference Santos, Kok, Weller, Paiva and Otto2010; Weller et al., Reference Weller, Tanieri, Pereira, Almeida, Kok and Santos2012). A recent study showed that this region presents the greatest number of rumours of clusters of genetic diseases in Brazil (Cardoso et al., Reference Cardoso, de Oliveira, Paixão-Côrtes, Castilla and Schuler-Faccini2018).
The elaboration of public health policies focused on the epidemiological surveillance of congenital defects and rare genetic diseases in the Northeast is urgently needed. This need is even more striking due to the vast territory and the low number of professionals with expertise in medical genetics present in the region (Novoa & Burnham, Reference Novoa and Burnham2011). The study of surnames can provide a cheap and in-depth tool for decision-making in this field; such studies have already been conducted in some populations from the region, but not always for health purposes and never for the Northeast as a whole (Azevedo, Reference Azevedo1980; Monasterio, Reference Monasterio2017).
Surnames constitute a unique cultural inheritance system of the human species. Family names can be powerful tools in population research, combining genealogy, local history, linguistics and genetics (Redmonds et al., Reference Redmonds, King and Hey2011). The analysis of surname distribution via an isonymic method can provide quantitative information in the assessment of population structure. This method, developed by Crow and Mange (Reference Crow and Mange1965), is based on the fact that surnames are not distributed homogeneously in different places and among different social groups. Although they do not always reflect ancestry, population-level surname studies can provide large-scale information on population structure. Several studies involving surname distribution have shown this approach to be widely applicable, with surnames constituting useful indicators of migration, miscegenation or isolation at several levels: communal, regional, national or continental (Madrigal & Ware, Reference Madrigal and Ware1997; Dipierri et al., Reference Dipierri, Alfaro, Scapoli, Mamolini, Rodriguez-Larralde and Barrai2005, Reference Dipierri, Rodriguez-Larralde, Alfaro, Scapoli, Mamolini and Salvatorelli2011; Bedoya et al., Reference Bedoya, García, Montoya, Rojas, Amézqita and Soto2006; Scapoli et al., Reference Scapoli, Mamolini, Carrieri, Rodriguez-Larralde and Barrai2007; Tarskaia et al., Reference Tarskaia, El’chinova, Scapoli, Mamolini, Carrieri and Rodriguez-Larralde2009; Liu et al., Reference Liu, Chen, Yuan and Chen2012; De Oliveira et al., Reference De Oliveira, Schüler-Faccini, Demarchi, Alfaro, Dipierri and Veronez2013; Carrieri et al., Reference Carrieri, Sans, Dipierri, Alfaro, Mamolini and Sandri2019).
Therefore, the main aim of this study was to analyse surnames as a proxy for population isolation in the Brazilian Northeast, retrieving historical aspects that may have been key to the formation of the communities that exist in this region and comparing them with health statistics on congenital anomalies and population medical genetics.
Methods
The Brazilian Northeast
The Brazilian territory is divided into five main geographic regions and 26 states plus the Federal District (Fig. 1). The Northeast region is divided into nine states, which differ in terms of territory, socioeconomic indicators and settlement history. Until 2010, they were subdivided into 1794 municipalities as minor administrative divisions (IBGE, 2010).
Data sources
Voting is mandatory for literate Brazilian citizens over 18 and under 70 years of age and optional for those who are 16 or 17 years old, over 70 years old or illiterate. Therefore, the voter registries include the majority of adult Brazilian citizens. The voter registry dataset for all the states of the Northeast region included 37,410,645 individuals. The surnames of men and women were analysed together within each state as well as within each of the municipalities. Table 1 summarizes the number of inhabitants, the number of voters and the percentage of the total that they represent.
a According to the 2010 National Census from IBGE (Brazilian Institute of Geography and Statistics); b according to the Electoral Registry from 2010.
Demographic information was extracted from the 2010 Census available from IBGE (IBGE, 2010). Information on the absolute number of live births and live births with congenital anomalies was obtained from the Live Birth Information System (SINASC) – a tool of DATASUS (Brasil, 2017). Health data were collected over a period of 10 years (2005–2014) before the outbreak of microcephaly associated with Zika virus infection that occurred, beginning in 2015, in some areas of the Brazilian Northeast (Schuler-Faccini et al., Reference Schuler-Faccini, Ribeiro, Feitosa, Horovitz, Cavalcanti and Pessoa2016) to avoid possible direct or indirect interference from the teratogenic effects of the infection. The frequency of live births with congenital anomalies (FLBCA) was calculated as the ratio between the absolute number of births with congenital anomalies and the total number of births in the studied period, expressed per 10,000 births.
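The FLBCA described above is a simple rate, and it can be sketched in a few lines of Python. The function name `flbca` and the counts in the usage note are hypothetical illustrations, not SINASC figures.

```python
def flbca(births_with_anomalies: int, total_live_births: int) -> float:
    """Frequency of live births with congenital anomalies per 10,000 live births."""
    if total_live_births <= 0:
        raise ValueError("total_live_births must be positive")
    # Ratio of affected births to all live births, scaled to 10,000 births.
    return 10_000 * births_with_anomalies / total_live_births
```

For a hypothetical municipality with 25 affected births among 4,000 live births over the study period, `flbca(25, 4000)` gives 62.5 per 10,000.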
Isonymy analysis
In Brazil, each person receives the surname of each parent: the mother’s name first followed by the father’s name (double surnames). However, this inheritance system is not strictly bilateral, since women only pass on to their children the family name that they received from their own fathers. Additionally, it is possible to have just the father’s surname, or after marriage, people may adopt their spouse’s family name, with women usually dropping their mother’s surname and adding the husband’s (De Oliveira et al., Reference De Oliveira, Schüler-Faccini, Demarchi, Alfaro, Dipierri and Veronez2013; Monasterio, Reference Monasterio2017). Due to these particularities of the Brazilian surname inheritance system and for homogenization of the global calculations, it was decided to work only with the paternal last name of each person because this is consistently inherited by the next generation. Thus, the surname distribution of all 1794 municipalities from the Northeast was obtained.
The calculation of unbiased isonymy (I NS ), defined as the probability that two people share the same surname by common ancestry, was performed according to a previous report (Rodríguez-Larralde et al., Reference Rodríguez-Larralde, Barrai and Alfonzo1993):
$$I_{NS} = \sum_{k} \left(\frac{N_{ki}}{N_{i}}\right)^{2} - \frac{1}{N_{i}}$$
where N ki is the frequency (number of bearers) of surname k for population i; the sum is over all surnames; and N i represents the size of the corresponding population. The final values can vary between 0 and 1, and the closer they are to zero, the greater the variability of the population will be since the probability of two people sharing a near common ancestor will be lower.
From I NS , Fisher’s alpha index (α) was calculated. This indicator was first defined as a measure of species richness in a sample (Fisher et al., Reference Fisher, Corbet and Williams1943). In surname studies, the α index is used to evaluate the surname diversity of each locality. Low values correspond to more-isolated populations, indicating a high rate of inbreeding and drift, while high values are observed in localities with higher immigration rates (Rodríguez-Larralde et al., Reference Rodríguez-Larralde, Barrai and Alfonzo1993).
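As an illustration of these two indicators, the sketch below computes unbiased isonymy and Fisher's α from a list of surnames. It assumes the form commonly used in isonymy studies, namely the sum of squared relative surname frequencies minus 1/N, and it assumes that α is taken as the reciprocal of isonymy; neither formula is spelled out in this text, so both should be read as assumptions. The function names and the toy population are hypothetical.

```python
from collections import Counter


def unbiased_isonymy(surnames):
    """Sum of squared relative surname frequencies minus 1/N (assumed form)."""
    counts = Counter(surnames)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("empty surname list")
    return sum((c / n) ** 2 for c in counts.values()) - 1.0 / n


def fisher_alpha(isonymy):
    """Fisher's alpha taken as the reciprocal of isonymy (assumed relation)."""
    if isonymy <= 0:
        raise ValueError("isonymy must be positive to compute alpha")
    return 1.0 / isonymy


# Hypothetical toy population: 10 people sharing 3 surnames.
toy = ["Silva"] * 5 + ["Santos"] * 3 + ["Souza"] * 2
i_ns = unbiased_isonymy(toy)
alpha = fisher_alpha(i_ns)
```

In this toy case the squared frequencies sum to 0.38, so isonymy is 0.28 and α is about 3.57, the kind of low α expected when a few surnames dominate.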
The sedentary estimator (index A ) is the percentage of the population covered by single surnames with only one bearer. High values are registered when natural population growth is negative. Men and women who do not have children and who are beyond reproductive age may be the only and sometimes last remaining representatives of their surnames. The isolation estimator (index B ) is the percentage of the population covered by the seven most frequent surnames (Rodríguez-Larralde, Reference Rodríguez-Larralde1990). High values of B are observed in small or isolated populations where some very common surnames are found that are shared by a large number of inhabitants. This situation may be a consequence of the initial establishment of a few families in the past whose surnames became very frequent over the years. Both estimators provide very informative data about demographic dynamics and must be correctly interpreted within the particular context of each locality.
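The two estimators can be read directly off a surname count table. The sketch below follows the definitions given in this paragraph (A as the percentage of people whose surname has a single bearer, B as the percentage covered by the seven most frequent surnames); the function name and the `top` parameter are hypothetical conveniences.

```python
from collections import Counter


def sedentary_and_isolation(surnames, top=7):
    """Return (index A, index B) as percentages of the population."""
    counts = Counter(surnames)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("empty surname list")
    # Index A: share of people bearing a surname that occurs exactly once.
    single_bearers = sum(1 for c in counts.values() if c == 1)
    index_a = 100.0 * single_bearers / n
    # Index B: share of people covered by the `top` most frequent surnames.
    index_b = 100.0 * sum(c for _, c in counts.most_common(top)) / n
    return index_a, index_b
```

With a hypothetical list of 16 people in which three surnames have 4, 3 and 2 bearers and seven surnames have one bearer each, the function returns A = 43.75 and B = 81.25.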
To detect isolation by distance, a matrix correlation analysis was used. Using the Euclidean formula, matrixes of paired isonymic distances were calculated between all municipalities inside each state. When two groups or municipalities share no surnames, distance is equal to unity. The Euclidean distance (Cavalli-Sforza & Edwards, Reference Cavalli-Sforza and Edwards1967) between groups I and J is defined as follows:
$$E_{IJ} = \sqrt{1 - \sum_{k} \sqrt{p_{ki}\, p_{kj}}}$$
where p ki and p kj are the relative frequencies of surname k in groups I and J, and the summation is over all surnames.
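A minimal sketch of this distance is given below, assuming the square-root form often used in isonymy studies: one minus the sum, over surnames, of the square root of the product of the two relative frequencies, all under a square root. This form equals one when no surnames are shared, as stated above. The function names are hypothetical.

```python
from collections import Counter
from math import sqrt


def relative_frequencies(surnames):
    """Map each surname to its relative frequency in the group."""
    counts = Counter(surnames)
    n = sum(counts.values())
    return {s: c / n for s, c in counts.items()}


def isonymic_distance(surnames_i, surnames_j):
    """Euclidean isonymic distance between two groups (assumed form)."""
    p_i = relative_frequencies(surnames_i)
    p_j = relative_frequencies(surnames_j)
    # Only surnames present in both groups contribute to the sum.
    shared = p_i.keys() & p_j.keys()
    overlap = sum(sqrt(p_i[s] * p_j[s]) for s in shared)
    # Guard against tiny negative values caused by floating-point rounding.
    return sqrt(max(0.0, 1.0 - overlap))
```

Two groups with no surname in common are at distance 1, two groups with identical surname distributions are at distance 0 (up to rounding), and intermediate overlap falls in between.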
Matrixes of paired geographic distances were constructed by measuring the distances in kilometres between the administrative seats of all municipalities. The significance of the correlations was evaluated by the Mantel test (Mantel, Reference Mantel1967). The relationships of the surname distributions between municipalities were graphically expressed by means of a dendrogram based on the UPGMA (unweighted pair group method with arithmetic mean) algorithm using NTSYSpc 2.11SR software (Rohlf, Reference Rohlf2000).
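The logic of the Mantel test can be sketched as a permutation procedure. The version below is a generic illustration, not the software or settings used in this study: it correlates the upper triangles of two symmetric distance matrices with Pearson's r, then relabels the rows and columns of one matrix at random to build a null distribution. The function names, the number of permutations and the one-sided p-value convention are all assumptions.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(d1, d2, n_perm=999, seed=0):
    """Return (r, one-sided p) for two symmetric distance matrices."""
    n = len(d1)
    pairs = list(combinations(range(n), 2))  # upper-triangle index pairs
    x = [d1[i][j] for i, j in pairs]
    r_obs = pearson(x, [d2[i][j] for i, j in pairs])
    rng = random.Random(seed)
    idx = list(range(n))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(idx)  # relabel the units of the second matrix
        y = [d2[idx[i]][idx[j]] for i, j in pairs]
        if pearson(x, y) >= r_obs:
            hits += 1
    # Add one to numerator and denominator so the observed value is counted.
    return r_obs, (hits + 1) / (n_perm + 1)
```

Here `d1` would hold geographic distances and `d2` isonymic distances between the same municipalities, in the same order; a small p-value indicates that surname distance increases with geographic distance more than expected under random relabelling.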
Spatial and statistical analysis
Spatial analyses were conducted according to the methodology detailed by Cardoso-Dos-Santos et al. (Reference Cardoso-Dos-Santos, Boquett, Oliveira, Callegari-Jacques, Barbian and Sanseverino2018). Briefly, they were conducted in ArcGIS version 10.3 software using the IBGE cartographic database. The I NS index for each municipality within a state or within the Northeast region was plotted in maps following cluster and outlier analysis (Anselin Local Moran’s Index). This analysis generated maps indicating statistically significant hot spots of municipalities with high isonymy rates surrounded by other municipalities with high I NS values (high–high areas) as well as cold spots (low–low areas) and spatial outliers (high–low and low–high). A similar cluster map was generated from the frequency of live births with congenital malformations (2005–2014). For each map, the Global Moran Index (GMI) was calculated to test the global spatial dependence of the indicators (Anselin, Reference Anselin2010).
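To make the GMI concrete, the sketch below computes the standard global Moran's I for a set of values and a spatial weights matrix. It is an illustration of the statistic only and does not reproduce the ArcGIS tool, its weighting scheme or its significance test; the binary neighbour weights mentioned in the usage note are an assumption, and the function name is hypothetical.

```python
def global_morans_i(values, weights):
    """Global Moran's I for `values` given a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    w_total = sum(sum(row) for row in weights)
    if denom == 0 or w_total == 0:
        raise ValueError("values must vary and weights must not all be zero")
    # Weighted cross-products of deviations between every pair of units.
    num = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    return (n / w_total) * (num / denom)
```

With binary weights (1 for neighbouring municipalities, 0 otherwise), values near +1 indicate that similar values cluster in space, values near -1 indicate a checkerboard pattern, and values near 0 indicate no spatial dependence. For four units in a row with values 1, 2, 3, 4 the statistic is 1/3; with alternating values 1, -1, 1, -1 it is -1.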
Central tendency values, confidence intervals and distribution graphs were obtained using SPSS v18.0 software. In this software, Pearson’s correlation coefficient (r) between the I NS index and the frequency of live births with congenital malformations (2005–2014) was also calculated. A final map correlating the two normalized indicators at the municipality level was obtained with the cluster and outlier analysis tool, applying a spatial correlation index with the maximum (max) and minimum (min) values of both indicators using the following formula:
In all analyses, the threshold for statistical significance was set to p < 0.05.
Results
Among more than 37 million voters, 74,714 different surnames were registered in the Brazilian Northeast. ‘Silva’ was the most frequent surname in the region as a whole (19.2%) as well as in all the states except for Bahia and Sergipe, where the most common surname was ‘Santos’, which was the second most frequent surname in the region (11.8%). The surnames ‘Sousa/Souza’ were the third most frequent (5.4%), and they were considered together because this common variation can be found within the same family and they have the same ancestry (Moser, Reference Moser1960). Figure 2 shows the frequency of the 100 main surnames across the region: the y-axis shows the logarithm of the total number of bearers of each surname (occurrence) and the x-axis shows their rank. The graph is a sharp, exponential decay-like curve. There is a noticeable difference between the occurrence of the first ten surnames and the following forty. This difference is even more remarkable in comparison with the other fifty surnames. Together, the ten most frequent surnames were borne by more than half of the voters included in all Northeast databases (51.7%).
I NS was calculated for the region as a whole and for individual states and municipalities. The values of the whole region and states are summarized in Table 2. Isonymy was 0.059 for the Northeast, and among the states it ranged from 0.0407 in Ceará to 0.1401 in Alagoas. At the municipality level, this index varied widely, ranging from 0.022 to 0.359. Consequently, α also presented wide variation, from 3 in municipalities from Pernambuco and Alagoas to 46 in Penalva, a municipality from Maranhão. A pattern of the spatial distribution of surnames (and, consequently, of the population that inherits them) may be inferred from the Mantel test correlation. The highest correlation values were registered in Pernambuco, followed by Ceará and Alagoas. The sedentary estimator (index A) presented low values for all states. This indicates that the natural growth of the population is quite high and that there would be no entry of new family names via migration. Consequently, index B showed higher values: most of the current inhabitants bear only a few different surnames.
a Isonymy value; b Fisher’s α index; c Mantel test correlation between geographic and surname distance matrixes (all the states presented a correlation test with p < 0.001); d Sedentary estimator (index A); e Isolation estimator (index B); f Frequency of live births with congenital anomalies (per 10,000 live births).
According to the spatial analysis of I NS in the region as a whole, a significant clustered pattern was found (GMI = 0.58; p < 0.001) (Fig. 3a). Clusters of municipalities with high I NS values were found in the eastern portion of the region, mainly in the states of Alagoas, Sergipe and Pernambuco. As shown in Fig. 4, the first quartile of I NS values in Alagoas, for example, was greater than practically all values of isonymy for the municipalities of Ceará, Maranhão and Piauí, which are states located farther north in the region.
Considering the wide variation of I NS among Northeastern municipalities, a separate analysis was carried out among those with the highest values to investigate their spatial distribution. Thus, the 2.0% of municipalities with the highest I NS values were arbitrarily selected, resulting in a total of 37 municipalities distributed according to Fig. 5. Among these localities, only five were located outside of Alagoas and Pernambuco states (one in Paraíba and four in Sergipe).
More specifically, there was a spatial agglomeration of those municipalities with the highest isonymy values within the region that is historically equivalent to the Quilombo dos Palmares – the largest conglomerate of escaped slaves in Latin America (Gomes, Reference Gomes2011). To conclude this analysis, the coverage area of the Quilombo dos Palmares from the work by Anderson (Reference Anderson1996) was georeferenced, using the official cartographic database of the Brazilian territory publicly available on the IBGE website. Then, a map was created in which the area equivalent to the Palmares and the 37 municipalities with the highest levels of isonymy were overlapped (Fig. 5).
Regarding the health indicators, Table 2 summarizes the main findings. From 2005 to 2014, more than 8.6 million live births were recorded in the Brazilian Northeast, among which approximately 62.65/10,000 presented birth defects (Fig. 3b). Sergipe, Pernambuco and Paraíba presented the highest frequencies of congenital birth defects among live births. A positive Pearson’s correlation between the I NS values and the frequency of live births with congenital malformations at a municipal level was found (r = 0.268; p < 0.001). A spatial correlation between the two indicators revealed overlap between them (Fig. 3c) with a statistically significant spatial concentration (GMI = 0.50; p < 0.001). A statistically significant difference (ANOVA test, p = 0.002) was found between the average frequency of live births with congenital malformations among the 37 municipalities with the highest values of I NS (mean: 69.79/10,000) and the rest of the Northeast region (mean: 53.06/10,000). In those 37 municipalities, the most frequent congenital anomalies were ‘other congenital malformations and deformations of the musculoskeletal system’ (mean: 19.60/10,000) and ‘congenital deformities of feet’ (mean: 13.09/10,000); the same pattern was found for the Northeast region as a whole, as well as for other Brazilian regions, as shown in Table 3.
a: Spina bifida (Q05); b: other congenital malformations of the nervous system (Q00–Q04,Q06–Q07); c: congenital malformations of the circulatory system (Q20–Q28); d: cleft lip and cleft palate (Q35–Q37); e: congenital absence, atresia and stenosis of small intestine (Q41); f: other congenital malformations of the digestive system (Q38–Q40,Q42–Q45); g: undescended and ectopic testicle (Q53); h: other congenital malformations of genital organs (Q50–Q52,Q54–Q64); i: congenital deformities of hip (Q65); j: congenital deformities of feet (Q66); k: other congenital malformations and deformations of the musculoskeletal system (Q67–Q79); l: other congenital malformations (Q10–Q18,Q30–Q34,Q80–Q89); m: chromosomal abnormalities, not elsewhere classified (Q90–Q99); n: haemangioma and lymphangioma (D18).
Discussion
The modern history of Brazil can be told beginning with the contact between Portuguese colonizers and Native Americans in the early 16th century in a region currently known as the Brazilian Northeast. Since then, settlement of this region has progressed from the oceanic coast to the interior, relying mainly on economic activities related to agriculture and livestock (Menezes, Reference Menezes1970; Filho, Reference Filho1977). In addition, there has been strategic occupation of certain territories by specific population groups, such as slaves fleeing mistreatment by their masters (Anderson, Reference Anderson1996; Gomes, Reference Gomes2011), Sephardic Jews who fled Europe during the Portuguese Inquisition (Freyre, Reference Freyre2004) and others. These historical and other more recent demographic movements have profoundly affected the social and health characteristics of the current Brazilian Northeastern society, some of which have been investigated in this work through surname analysis.
First, the most common surnames in the region as a whole and in all of its states were of Iberian origin, mainly ‘Silva’, ‘Santos’ and ‘Sousa/Souza’. This is a clear reflection of the strong economic and cultural power that the European colonizers had over other people who have co-existed in this territory. Indeed, according to genetic research focused on autosomal markers or linked to Y-chromosome markers, European ancestry (over 50%) is the most prevalent ancestry among people from the Northeast, followed by African (below 30%) and Native American (below 20%) ancestry (Manta et al., Reference Manta, Pereira, Vianna, Araújo, Leite and Silva2013; Ruiz-Linares et al., Reference Ruiz-Linares, Adhikari, Acuña-Alonzo, Quinto-Sanchez, Jaramillo and Arias2014). However, considering markers linked to mitochondrial inheritance, African ancestry was predominant in the Brazilian Northeast (44%) (Pena et al., Reference Pena, Bastos-Rodrigues, Pimenta and Bydlowski2009). Taken together, these results point to an admixture scenario involving European males and African and Native females in the Brazilian Northeast, which would account for the higher prevalence of European surnames given the patrilineal surname transmission system. A patrilineal society transmits family names in a regular pattern associated with the father’s Y-chromosome (Azevedo, Reference Azevedo1980).
In particular, the surname ‘Silva’ was the most popular surname in the Brazilian Northeast in this work, which is in agreement with other studies that have reached the same conclusion not only for the region but for Brazil as a whole (Azevedo, Reference Azevedo1980; Monasterio, Reference Monasterio2017). This surname first appeared in the 10th century in the region of Galicia in Spain, and in 1090, it was established in Portugal. The surname came to Brazil at the beginning of Portuguese colonization in the 16th century and has become the most prevalent surname in the country as a whole (Monasterio, Reference Monasterio2017). In addition to Portuguese migrants, Jewish migrants also arrived in Brazil as New Christians. One of the notable characteristics of forced conversions was the adoption of a new name, and these names can be found in the Brazilian population today in large numbers, such as ‘Silva’, ‘Fernandes’, ‘Nunes’ (Glasman, Reference Glasman2006) and names with origins in plant names, such as ‘Carvalho’ and ‘Figueira’ (Moser, Reference Moser1960; Filho, Reference Filho1977).
However, it is worth noting that not everyone with a surname of European origin necessarily presents European ancestry. The conquest, colonization and settlement of Brazil were complex socioeconomic and cultural processes in which a huge percentage of the population (mostly of African or Native American origin) received a European surname in a compulsory manner. The administrative changes that occurred from the colonial period, through independence from Portugal in 1822, to the Republican consolidation in 1889 generated a great documentary disparity in the registration and inheritance of surnames (Marcilio, Reference Marcilio1972). This lack of continuity and homogeneity in historical records and inheritance of surnames was also a common phenomenon in the Iberian Peninsula and in the Azores archipelago, whose populations contributed significantly in migration waves to Brazil (Feijó, Reference Feijó1987; Fuster et al., Reference Fuster, Mesa, Jiménez, Jerez and Morales1996; Branco & Mota-Vieira, Reference Branco and Mota-Vieira2003; Santos et al., Reference Santos, Abade, Cantons, Mayer, Aluja and Lima2005; Román et al., Reference Román, Guardado Moreira, Zuliaga, Blanco Villegas, Colantonio and Fuster2007). Some documented facts, in addition to the purposeful change of surnames among Jewish families, are the horizontal transmission of surnames between landowners and slaves or free workers and the adoption of devotional surnames – names of saints, religious symbols, ceremonies or festivities – related to conversion to Catholicism (Filho, Reference Filho1977; Azevedo, Reference Azevedo1980; Freyre, Reference Freyre2004). Slavery in Brazil lasted until 1888 and, at the time of its abolition, around 800,000 slaves were freed and at once acquired some kind of surname (Salzano & Freire-Maia, Reference Salzano and Freire-Maia1967). Part of the current Northeast population is descended from slaves freed in the 19th century and has inherited their family names (Tavares-Neto & Azevedo, Reference Tavares-Neto and Azevedo1977).
This polyphiletism (along with population isolation for extended periods of time) may help to explain the low diversity of surnames found in this work. A different situation is found in some other South American countries such as Bolivia, where surnames of indigenous origin are more frequent than those of Hispanic origin (Rodriguez-Larralde et al., Reference Rodriguez-Larralde, Dipierri, Gomez, Scapoli, Mamolini and Salvatorelli2011). Surname diversity values are high in Chile, where Amerindian family names are also registered, mainly from the Mapuche and Aymara ethnicities, with a very specific spatial distribution, the former being concentrated mostly in the south of the country (Barrai et al., Reference Barrai, Rodríguez-Larralde, Dipierri, Alfaro, Acevedo and Mamolini2012). A similar situation can be described in Argentina (Dipierri et al., Reference Dipierri, Alfaro, Scapoli, Mamolini, Rodriguez-Larralde and Barrai2005). By contrast, in Paraguay, there is a near absence of indigenous surnames (Dipierri et al., Reference Dipierri, Rodriguez-Larralde, Alfaro, Scapoli, Mamolini and Salvatorelli2011).
In the present study, the 100 most common surnames account for more than 80% of all voters in the entire region. Compared with other places in the Americas, this index might be considered very high, as it is only 16% in the United States (Barrai et al., Reference Barrai, Rodriguez-Larralde, Mamolini, Manni and Scapoli2001) and 29.5% in Argentina (Dipierri et al., Reference Dipierri, Alfaro, Scapoli, Mamolini, Rodriguez-Larralde and Barrai2005). High values are reported for isolated populations on the Caribbean Coast of Honduras, where the 100 most common surnames are shared by 74.7% of voters belonging to Garifuna communities, descendants of African slaves and Native Americans (Herrera-Paz, Reference Herrera-Paz2013). This over-representation of certain family names can be related to a preferential choice and horizontal transmission of certain surnames. This situation might also have contributed to another interesting result found for the Brazilian Northeast population: high isonymy. In the present study, the I NS index was 0.059. This value is higher than that found in any other similar geographic area studied to date. However, these results are not unexpected. Freire-Maia, with his pioneering studies using surnames from ecclesiastical marriage records, analysed the frequencies of several subtypes of consanguineous unions in the Brazilian population and their possible genetic consequences (Freire-Maia, Reference Freire-Maia1952, Reference Freire-Maia1958). Since then, the Northeast has been categorized as the region with the highest inbreeding coefficient in Brazil (Freire-Maia, Reference Freire-Maia1957, Reference Freire-Maia1990; Azevedo et al., Reference Azevedo, Morton, Miki and Yee1969).
In addition, the distribution of I NS throughout the Brazilian Northeast was notably heterogeneous. Clusters of municipalities with high values of isonymy were found in the eastern states, mainly in Alagoas, Sergipe and Pernambuco. This territory historically corresponds to a hot spot of the main economic activity in Brazil (a Portuguese colony at that time) between the 16th and 18th centuries: the production and export of sugar. At this time, the commercialization of sugar took place almost exclusively as an economic activity (monoculture) and was based on the concentration of great extensions of land as the main form of property and on the enslavement of people from Africa (mainly Central Africa) as a social class institution. Endogamy was a common practice among families who owned the means of production (the Brazilian colonial ‘aristocracy’), leading to sedentary and hierarchical communities where it was also common for workers and ex-slaves (from the ending of slavery in 1888) to inherit the surnames of their masters (Arcanjo, Reference Arcanjo1996; Filho, Reference Filho1977; Freyre, Reference Freyre2004; Gomes, Reference Gomes2011).
However, from the 16th century onward, many communities of escaped slaves formed in the Latin American colonies. Studies conducted in Venezuela have also shown high values of isonymy in populations derived from these communities (called cumbes) – settlements in mountain sites where the slaves hid from their persecutors. These populations are still under relative isolation, with more homogeneity in surnames throughout time (Castro de Guerra et al., Reference Castro de Guerra, Pinto Cisternas and Rodríguez Larralde1990, Reference Castro de Guerra, Arvelo and Pinto Cisternas1999). The largest of these Latin American communities was named Quilombo dos Palmares (a Maroon society known as Palmares) and was located in Brazil. This huge Maroon society officially lasted until the 18th century and consisted of many mocambos (Maroon settlements). They were distributed over a wide area that was difficult to access due to the steepness of the landform and the closed forest located between present-day northern Alagoas and southern Pernambuco (the current territory of Alagoas was linked to the captaincy of Pernambuco until the beginning of the 19th century) (Anderson, Reference Anderson1996; Gomes, Reference Gomes2011). The population of Palmares increased through unions with other escaped slaves as well as through endogenous growth among the inhabitants themselves, who were mainly African people from Angola and Congo (Filho, Reference Filho1977; Gomes, Reference Gomes2011).
Even after the official extinction of Palmares, there is evidence that remnants of its population spread throughout the region, where many of their descendants remain today, sometimes forming isolated (or even marginalized) communities (Chagas & Nunes, Reference Chagas and Nunes2016; Ribeiro, Reference Ribeiro2018; Rodrigues, Reference Rodrigues2011a, b). According to this work, the municipalities with the highest I NS values (37 municipalities) were concentrated in an area corresponding to the Quilombo dos Palmares. These results, together with high B index values and remarkably low Fisher’s α values, point to population dynamics involving inbreeding tendencies and possible consanguinity where migration is not a significant phenomenon. These findings, together with anthropological/sociological studies (Gomes, Reference Gomes2011; Ribeiro, Reference Ribeiro2018) and newspaper reports (Rodrigues, Reference Rodrigues2011b), lead us to believe that the population of these areas should be the focus of population medical genetic analyses.
For instance, according to a 2011 report from the O Estado de São Paulo newspaper, the municipalities of União dos Palmares and Santana do Mundaú (both in Alagoas) harbour isolated populations descended from Quilombo dos Palmares (Rodrigues, Reference Rodrigues2011b). According to the report, these populations show recurrence of marriage between relatives, accompanied by the familial transmission of albinism, achondroplasia and congenital malformations. In this study, the group of 37 municipalities with the highest values of isonymy presented an average frequency of live births with congenital malformations that was significantly higher than the frequency for the rest of the Northeast region.
In fact, the heritage of the Quilombo dos Palmares has extended across the centuries and can be seen today in the names of some municipalities in the region, such as União dos Palmares (in Alagoas) and Palmares (in Pernambuco). Even among the municipalities that are outside the geographical area of the Quilombo dos Palmares, some maintained historical relations with Palmares, either by facilitating access to it (such as Jacuípe and Campestre) or by maintaining economic relations with people who lived there (such as Jundiá) (IBGE, 2010). Curiously, among the four municipalities with the highest levels of isonymy located in Sergipe, only one (Divina Pastora) was outside the area of the remaining officially recognized quilombo communities (Palmares, Reference Palmares2019).
In addition, in three of these municipalities, there are records of clusters of recessive genetic diseases according to the National Census of Isolates (CENISO) – a registry that aims to map geographic isolates of rare diseases across Brazil (Cardoso et al., Reference Cardoso, de Oliveira, Paixão-Côrtes, Castilla and Schuler-Faccini2018). These registries record oculocutaneous albinism (OMIM: 203100) in Santana do Mundaú (Alagoas) and Quipapá (Pernambuco) and Verma-Naumoff Syndrome (OMIM: 613091) in Gameleira (Pernambuco). Currently, most of the remaining communities of quilombos recognized by the Federal Government are located in the Brazilian Northeast (Palmares, Reference Palmares2019). Genetic studies have been conducted in some of these communities and have shown that geographic isolation, consanguinity and genetic diseases are common (Auricchio et al., Reference Auricchio, Vicente, Meyer and Mingroni-Netto2007; Soares et al., Reference Soares, Lima, Silva, Fernandes, Silva and Lins2017). Similar results have been reported for other Latin American populations sharing the same characteristics of a colonial past, low population density, low migration and high inbreeding (Morera & Barrantes, Reference Morera and Barrantes2004; Rodríguez-Acevedo et al., Reference Rodríguez-Acevedo, Morales, Durango and Pineda–Trujillo2012; Dipierri et al., Reference Dipierri, Rodríguez-Larralde, Barrai, Camelo, Redomero and Rodríguez2014; Pacheco-Orozco et al., Reference Pacheco-Orozco, Torres and Velasco2019).
The methodology used in the present spatial analysis allowed the identification of outliers as well; that is, Northeastern municipalities with high I NS values surrounded by municipalities with low values (or vice versa). From these findings, interesting cases arise such as the high–low outliers of municipalities with high isonymy values in the West Potiguar Mesoregion, in the south-west of Rio Grande do Norte. In two municipalities from this region, an autosomal recessive neurodegenerative disease known as SPOAN (OMIM: 609541) was discovered in 2005 because of a high prevalence of affected individuals; this situation is strongly influenced by inbred marriages, which are very common in the region, and genetic drift (Santos et al., Reference Santos, Kok, Weller, Paiva and Otto2010).
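The idea of a high–low outlier can be sketched as follows. This is a simplified illustration in the spirit of a local Moran analysis, not the procedure used in this study: each municipality’s value is standardized and compared with the mean standardized value of its neighbours, and the pair of signs gives the quadrant. The neighbour lists, the equal neighbour weights and the example values are assumptions, and the significance testing that a full analysis would apply (typically by permutation) is omitted.

```python
from statistics import mean, pstdev


def moran_quadrants(values, neighbours):
    """Label each area by its own value versus its neighbours' mean.

    values:     dict mapping area name -> isonymy value
    neighbours: dict mapping area name -> list of adjacent area names
    Returns a dict mapping area name -> 'high-high', 'low-low',
    'high-low' or 'low-high'. No significance test is performed.
    """
    mu = mean(values.values())
    sd = pstdev(values.values())
    # Standardize so that 0 separates "high" from "low".
    z = {name: (v - mu) / sd for name, v in values.items()}
    labels = {}
    for name, adjacent in neighbours.items():
        # Spatial lag: average standardized value of the neighbours.
        lag = mean(z[other] for other in adjacent)
        own = "high" if z[name] >= 0 else "low"
        around = "high" if lag >= 0 else "low"
        labels[name] = f"{own}-{around}"
    return labels


# Hypothetical example: area A has high isonymy amid low-isonymy neighbours.
values = {"A": 0.9, "B": 0.1, "C": 0.1, "D": 0.1}
neighbours = {"A": ["B", "C", "D"], "B": ["A", "C"],
              "C": ["A", "B", "D"], "D": ["A", "C"]}
labels = moran_quadrants(values, neighbours)
```

In this toy configuration, area A falls in the high–low quadrant, which is the pattern described above for the municipalities of the West Potiguar Mesoregion.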
The settlement of the Northeast sugar region was very different from that of states such as Ceará and Piauí and other more internal areas, sometimes referred to as ‘the other Northeast’ (Menezes, Reference Menezes1970). This region is characterized by a dry climate and soil that is unfavourable to the cultivation of sugarcane and was initially intended primarily for livestock rearing. The socioeconomic reality in this area favoured the formation of an unstable society centred on clan families and nomadic populations (Menezes, Reference Menezes1970; Arcanjo, Reference Arcanjo1996). In the present study, this part of the Northeast presented the lowest rates of isonymy, although the spatial analysis of the states revealed some isolated clusters of municipalities with higher rates.
The present study combined accessible and inexpensive methodologies to assess socio-demographic and cultural factors that may influence health indicators in a large and diverse region. The Brazilian Northeast is a poor region in several regards (reflecting the lowest HDI among the five Brazilian regions), including a low density of professionals specialized in medical genetics (Novoa & Burnham, Reference Novoa and Burnham2011). Thus, this work can contribute to strategic planning and the allocation of efforts with regard to public policies focused on population medical genetics. A similar approach might be especially useful in countries with socio-demographic characteristics similar to those of Brazil.
In conclusion, this work provides a detailed view of the wide isonymic landscape of the most economically disadvantaged and inbred region in Brazil – the Brazilian Northeast. A heterogeneous distribution of isonymy values was found across the region, with a remarkable spatial pattern of distribution and low diversity of surnames. Along with the evaluation of information from historical, epidemiological and journalistic sources, it was possible to list those municipalities with the highest rates of isonymy. These municipalities are located in an area historically equivalent to the largest quilombo in Latin America, where there are currently cases of transmission of rare genetic diseases and congenital defects, possibly associated with the inbreeding tendency among local families. Thus, this region can be a strategic target for the elaboration of public health policies aimed at the epidemiological surveillance of congenital defects and issues related to population medical genetics.
Acknowledgments
The authors wish to acknowledge in memoriam the significant contribution of ideas from Francisco Mauro Salzano, Gabriela Costa Cardoso and Eduardo E. Castilla. This work was supported by the National Institute of Population Medical Genetics (INAGEMP; CNPq 465549/2014-4).
Funding
This study was funded by the National Institute of Population Medical Genetics (INAGEMP; CNPq 465549/2014-4).
Conflicts of Interest
The authors have no conflicts of interest to declare.
Ethical Approval
For surname data, the 2010 national electoral registry was used. Permission to work with these documents was obtained from the Brazilian national authorities (Tribunal Superior Eleitoral – TSE) under Protocol #81693/13. All data were treated anonymously and globally and were considered non-binding, thus respecting the right to privacy. Data from the Brazilian Institute of Geography and Statistics (IBGE) and DATASUS (a database that compiles the statistical records of the Brazilian Unified Health System) were also consulted. Both databases are fully anonymous and publicly available; therefore, ethical approval was not required. The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008.