On April 12 and 13, 2014, Valparaiso suffered the worst urban fire in its history. On this occasion, a recurrent situation in Valparaiso, the spread of a wildfire into poor peripheral urban areas, went further than usual. More than 3000 houses burned in 2 days. The fire also destroyed the family health care facility that served a high proportion of the affected population. Two of the authors were part of a team that created a dedicated website on the Ushahidi/Crowdmap (Nairobi, Kenya) platform to collect data in real time. This allowed data capture from the field from April 14 through June 26. Environmental epidemiological risk factors, described previously by other authors,Reference Astorga, Eguren and Espinoza 1 were explored and displayed.
It is well known that disasters are related to a broad spectrum of health problems, such as diarrhea, fever, and respiratory diseases in children.Reference Datar, Liu and Linnemayr 2 Environmental risk factors like access to potable water, wastewater disposal, solid waste management, food security, vector exposure, and weather conditions play a key role in the risk of developing those diseases in the affected population. 3
Health conditions during and after a disaster are a key part of the disaster’s impact assessment and must be studied carefully because their effects are not immediately evident in the early stages of a disaster.Reference Noji, Eric 4 Data on the health of a population and the interpretation of these data are fundamental for decision-making about the whole response to a disaster, including preventive actions for medium- and long-term effects. Epidemiological data and information management, bias control, and sanitary intelligence are basic for public health decisions, their implementation, and communication.Reference Zook, Graham and Shelton 5
Because the health care sector is responsible for only a small share of the factors that influence a population's health, data must be useful as feedback for decision-making at all territorial scales and across disciplines and management areas. The data and information generated for health monitoring also support further knowledge production through more detailed analysis and interpretation.
Disaster learning requires that researchers use these rare, grave real-life situations to test new conceptual, technological, and operational developments. Too often, the valuable data collected are not used to generate learning at a later stage. This is an unjustifiable waste of such scarce opportunities, because previous experience is one of the most important sources of preventive information, allowing future scenarios to be modeled for better preparedness.Reference Van den Berg, Grievink and Gutschmidt 6
Some similar experiences of data collection during uncontrolled wildfires affecting houses do exist, but the processing of these data was usually qualitative.Reference Petersen 7 These examples show the need for the mapping process to be collective instead of being a centralized effort restricted to first responders or official managers. Experiences describing the use of information technologies in collective mapping of disasters clearly show that Internet users are faster than seismological agencies in providing updated data.Reference Zook, Graham and Shelton 5 Vector-transmitted diseases can account for an important part of the consequences of a disastrous event.Reference Korteweg, van Bokhoven and Yzermans 8 Furthermore, although most authors agree that a sizable portion of the effects of a disaster emerge in the medium and long term, most epidemiological studies only consider short-term effects.
The system disruption inherent to disasters demands tools that can handle high levels of uncertainty in both managing and describing them. It is unrealistic to expect a detailed description of highly controlled variables. Uncertainty will be part of data management, from capture to interpretation. This complicates the work of managers, because the subject itself places uncertainty at the center. Any model intended to describe such chaotic situations must account for uncertainty.
Materials and Methods
An observational, descriptive, cross-sectional study was performed. Because of the diversity of the data generators and the potential users, the monitoring system was based on crowdsourcing/crowdfeeding. All 243 reports were collected in a census-sampling mode.Reference Astorga, Eguren and Espinoza 1
During the emergency, different stakeholders, such as medical doctors, nurses, physical therapists, and students of these disciplines, collected data. The data were uploaded in real time through a website/map generated in Crowdmap (http://incendiovalpo.crowdmap.com; Ushahidi, Nairobi, Kenya). This free map-based tool lets users create a thematic mapping project to which data are uploaded in a distributed way, and it is hosted on a cloud-based server. Administrators can define and organize categories and manage users’ privileges and displays. Users can create their own reports as geographic objects (point, line, polygon) and associate attributes and media with them.
The authors began creating the website on April 14, 2 days after the start of the emergency. The website was started simultaneously with intentional data collection in the field. Ushahidi’s plasticity allowed us to enter the initial data “on the run,” while the website was still in a basic configuration. We initially used categories defined by the Pan American Health Organization and Chile’s Ministry of Health.Reference Van den Berg, Grievink and Gutschmidt 6 The categories were later reorganized according to feedback in the field and the characteristics of the collected data.
Ten categories were finally defined to better capture the different types of reports: (1) chemical toilets, (2) garbage, (3) feces, (4) damaged infrastructure, (5) health care facilities, (6) water, (7) landslides, (8) donation center, (9) vectors, and (10) others. Ushahidi’s customization properties were used to set 2 levels of access privileges. Reports on collective situations and areas were left open access. Personal disease surveillance data are covered by the Personal Data Protection Law; 9 thus, access to these data was restricted to the Sanitary Authority and system administrators. Surveillance data were also transmitted directly to the Sanitary Authority under a protocol agreed upon on the third day of operation, safeguarding all ethical research criteria.
For the present research, a new database was created to ease descriptive analysis. The categories in each report and the content described by the user were coded as 10 dichotomous variables based on the presence or absence of the relevant category in the report. The analysis did not consider whether a variable had a positive or negative impact on the population but only whether it was mentioned in each report. Because a report could contain data from more than one category, all categories present were included. Data were processed by using STATA 13.0 (StataCorp, College Station, TX) to characterize the proportions of the different factors.
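The presence/absence coding described above can be illustrated with a minimal Python sketch. The analysis itself was run in STATA; this is only an illustration under stated assumptions. The category names come from the text, whereas the sample reports, the function names `encode_report` and `proportion_summary`, and the choice of a Wald 95% confidence interval are hypothetical.

```python
import math

# Category names follow the ten categories listed in the text.
CATEGORIES = [
    "chemical toilets", "garbage", "feces", "damaged infrastructure",
    "health care facilities", "water", "landslides", "donation center",
    "vectors", "others",
]

def encode_report(tags):
    """Return a 0/1 indicator per category: 1 if the report mentions it."""
    present = set(tags)
    return {cat: int(cat in present) for cat in CATEGORIES}

def proportion_summary(rows, category, z=1.96):
    """Proportion of reports mentioning a category, with SE and a Wald CI.

    The Wald interval is an assumption; the article does not state
    which interval was used.
    """
    n = len(rows)
    p = sum(row[category] for row in rows) / n
    se = math.sqrt(p * (1 - p) / n)
    return {"p": p, "se": se, "ci": (max(0.0, p - z * se), min(1.0, p + z * se))}

# Hypothetical reports (invented for illustration, not study data).
sample_reports = [
    ["garbage", "vectors"],
    ["water", "chemical toilets"],
    ["garbage"],
    ["donation center"],
]
rows = [encode_report(tags) for tags in sample_reports]
garbage = proportion_summary(rows, "garbage")
print(garbage["p"], garbage["se"])
```

Because a report can mention several categories, the ten indicators of one row may sum to more than 1, which is why the per-category proportions are not expected to add up to 100%.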
Results
A total of 243 reports were analyzed, with a mean of 1.72 (±1.44) categories per report. Almost one-third of the reports presented data about garbage (30%) and chemical toilets (29%). About one-fourth reported situations linked to donation centers (25%), water (24%), and infrastructural damage (24%). Notably, 12% of the reports included data about vectors, the same percentage as those labeled “others.” Landslides and feces were included in few reports (7% each). Just 2% of the reports included health facilities (Table 1). When only the positive observations in each category were counted (n=491), about three-fourths (76%) corresponded to 5 categories: chemical toilets (17%), garbage (17%), donation centers (14%), water (14%), and infrastructural damage (14%) (Figure 1).

Figure 1 Frequency Plot of the Impact of Each Category on the Total Reports. Abbreviations: Chem. Toilets, chemical toilets; D. Centers, donation centers; Health Fc., health facilities; Inf. Damage, infrastructural damage.
Table 1 Presence of Categories in the Reports (a)

a Abbreviations: CI, confidence interval; Fr, frequency; SE, standard error.
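As a quick arithmetic check on the percentages given in the Results text, the mean number of categories per report equals the sum of the per-category proportions. The sketch below uses only the rounded percentages quoted in the text, so the derived share is approximate; the variable names are hypothetical.

```python
# Rounded per-category percentages as quoted in the Results text.
reported_pct = {
    "garbage": 30, "chemical toilets": 29, "donation centers": 25,
    "water": 24, "infrastructural damage": 24, "vectors": 12,
    "others": 12, "landslides": 7, "feces": 7, "health facilities": 2,
}

# Each report contributes 1 to every category it mentions, so the mean
# number of categories per report is the sum of the proportions.
total_pct = sum(reported_pct.values())
mean_categories = total_pct / 100

# Approximate share of all positive observations held by the 5 most
# frequent categories (approximate because the inputs are rounded).
top5 = sorted(reported_pct.values(), reverse=True)[:5]
top5_share = sum(top5) / total_pct

print(mean_categories, round(top5_share, 3))
```

The sum of the rounded percentages reproduces the reported mean of 1.72 categories per report, and the five most frequent categories account for roughly three-fourths of the positive observations.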
Pairwise contrasts were also performed to check for dependence between study variables. Reports related to water, infrastructural damage, and garbage had significant associations with 4 categories. These were followed by reports about chemical toilets, which were associated with 3 categories. Donation centers and vectors were associated with 2 categories. Health facilities showed no associations (Table 2).
Table 2 Significant Contrasts Performed on the Reports Showing Dependency of Study Variables

a P value by chi-square test.
b P value by Fisher’s exact test.
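The pairwise contrasts summarized in Table 2 can be sketched for a single pair of dichotomous variables as follows. This is a minimal standard-library illustration, not the STATA procedure used in the study. The 2x2 counts are invented, and the rule for choosing between the two tests (Fisher's exact test when any expected cell count is below 5) is a common convention assumed here, not one stated in the article.

```python
import math

def chi_square_p(table):
    """Pearson chi-square test (1 df, no continuity correction), 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # With 1 degree of freedom the upper tail equals erfc(sqrt(chi2 / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))

def fisher_exact_p(table):
    """Two-sided Fisher exact test: sum the probabilities of all tables
    with the same margins that are no more likely than the observed one."""
    (a, b), (c, d) = table
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2

    def prob(x):
        return math.comb(r1, x) * math.comb(r2, c1 - x) / math.comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

def contrast(table):
    """Pick the test by the smallest expected count (assumed convention)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    expected = [(a + b) * (a + c) / n, (a + b) * (b + d) / n,
                (c + d) * (a + c) / n, (c + d) * (b + d) / n]
    if min(expected) < 5:
        return "fisher", fisher_exact_p(table)
    return "chi-square", chi_square_p(table)[1]

# Hypothetical counts: rows = category A present/absent,
# columns = category B present/absent.
print(contrast([[20, 10], [10, 20]]))
print(contrast([[8, 2], [1, 5]]))
```

Running `contrast` over every pair of the ten indicators would give the kind of dependence screen reported in Table 2; with 45 pairs, some adjustment for multiple comparisons may be worth considering, although the article does not mention one.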
The study of these associations was expanded by using a logistic regression model. For instance, for the dependent variable “chemical toilets,” only the variable “water” was explanatory in the model (P value: 0.00; model P value: 0.00; R2: 0.117). For the dependent variable “garbage,” the model confirmed the predictor variables “infrastructural damage” (P value: 0.00), “water” (P value: 0.028), and “vectors” (P value: 0.00) (model P value: 0.00; R2: 0.2309). The other variables were not explanatory.
Finally, a multiple correspondence analysis suggested that the variables “feces” and “landslides” were closely related to each other, as were “vectors,” “infrastructural damage,” and “garbage.” There was also an association between “chemical toilets” and “water,” and between “donation centers” and “others.” The health facilities variable did not seem to be related to any other category (Figure 2).
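A logistic regression of one dichotomous category on another can be sketched with a small Newton-Raphson fit in plain Python. This is an illustration only: the data are invented, the function names are hypothetical, and reading the reported R2 as McFadden's pseudo-R2 (the statistic STATA's logit output labels "Pseudo R2") is an assumption.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def log_likelihood(X, y, beta):
    """Bernoulli log-likelihood under a logit link."""
    ll = 0.0
    for xi, yi in zip(X, y):
        eta = sum(b * v for b, v in zip(beta, xi))
        ll += yi * eta - math.log1p(math.exp(eta))
    return ll

def fit_logit(X, y, iterations=25):
    """Newton-Raphson fit; each row of X starts with 1.0 for the intercept."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for xi, yi in zip(X, y):
            eta = sum(b * v for b, v in zip(beta, xi))
            p = 1 / (1 + math.exp(-eta))
            w = p * (1 - p)
            for i in range(k):
                grad[i] += (yi - p) * xi[i]
                for j in range(k):
                    hess[i][j] += w * xi[i] * xi[j]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta

# Hypothetical data: predictor "water" (0/1), outcome "chemical toilets".
# water=1: 6 reports with toilets, 4 without; water=0: 3 with, 7 without.
X = [[1.0, 1.0]] * 10 + [[1.0, 0.0]] * 10
y = [1] * 6 + [0] * 4 + [1] * 3 + [0] * 7

beta = fit_logit(X, y)
null_beta = fit_logit([[1.0]] * len(y), y)
pseudo_r2 = 1 - log_likelihood(X, y, beta) / log_likelihood(
    [[1.0]] * len(y), y, null_beta)
print(math.exp(beta[1]), pseudo_r2)
```

With a single binary predictor the fitted odds ratio, `exp(beta[1])`, matches the cross-product ratio of the 2x2 table, which is a convenient sanity check before adding further predictors as extra columns of `X`.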

Figure 2 Multiple Correspondence Analysis Coordinate Plot of Category Reports. Abbreviations: D. Centers, donation centers; Health Fc., health facilities; Inf. Damage, infrastructural damage.
Discussion
Implementing a sampling plan in an emergency or disaster situation is a complex task. On the one hand, resources are committed to the event itself; on the other, the unique conditions of each disaster often make it impossible to have a predefined research protocol before the event and its consequences. Because complete sampling frames or clusters are practically impossible to obtain after the event, inference through probabilistic sampling is not possible. Other authors have described this constraint.Reference Astorga, Eguren and Espinoza 1, Reference Zook, Graham and Shelton 5, Reference Van den Berg, Grievink and Gutschmidt 6 Collaborative sampling offers promising solutions, but challenges remain regarding bias control. Standardization is severely restricted when collectors with very different backgrounds go to the field for other purposes and collect data only secondarily. Detailed written instructions can help to reduce bias but have a limited effect.
Census-based sampling, like that explored in this research, could solve such difficulties. Nevertheless, such sampling can entail problems with capturing an emergency’s variability or its evolution. Also, volunteers may feel uncomfortable and even frustrated when they carry out data collection instead of physical, immediate-impact work.Reference Astorga, Eguren and Espinoza 1
The literature shows an imperative need for tools for rapid data collection in emergency and disaster situations.Reference Datar, Liu and Linnemayr 2 Recent experiences show the potential utility of distributed knowledge in meeting that need. This research confirms previous findings in the literature.Reference Noji, Eric 4, Reference Petersen 7
The co-occurrence of variables could be used as a predictor to guide future preparedness. Ways of assessing this possible innovation should be designed in advance as part of a preparedness process. Probabilistic models could be implemented for specific scenarios. These research designs must account for modeling uncertainty and for the diversity of such scenarios.
The flexibility and improvisation that the whole process of data management demands, from capture to decision-making and learning, in turn require careful and detailed preparation in this specific area, which cannot itself be improvised. A well-trained team should be able to start data collection in a predefined, flexible way.Reference Dominici, Levy and Louis 10
Evidence of linkage among some categories, especially chemical toilets, water, damaged infrastructure, and vectors, calls for further exploration of disease incidence and prevalence from a geostatistical perspective. However, most of the patients’ personal data files were destroyed together with the Primary Health Centre, so current data do not provide a reliable baseline.
The use of internationally accepted categories eases the reproducibility of studies. However, it is not certain that during the development of the emergency such categories will be stabilized.
Health facilities were not related to the presence of any other variable. This could be explained by the fact that health facilities are attended to early in an emergency, which reduces the probability of finding sanitary risk conditions there.
Emergency managers have to monitor the whole landscape of the emergency and thus need data feeding in from the field. Official field staff usually cannot cover the whole territory, and the bureaucratic pattern of data management reduces management capabilities during emergencies. Distributed-knowledge-based technologies can be effective tools for increasing the flow of up-to-date data. Categorization and filtering, like those shown in this article, can be automated for better results.
Limitations
The main limitations of this study stemmed from the recategorization of variables and from not distinguishing between the negative and positive connotations of each report. This could bias the presence of the variables, considering that in emergencies attention is focused on deficits much more than on well-functioning aspects.
Conclusions
During the uncontrolled wildfire of April 2014 in Valparaiso, the most frequent environmental risk factors in the reports uploaded to a Crowdmap-based platform were garbage, chemical toilets, and donation centers. Multiple correspondence analysis sheds light on the emergence of environmental risk factors after an uncontrolled wildfire. The strongest association found was among damaged infrastructure, vectors, and garbage. Maps are a fundamental tool for disaster management, allowing a deeper and broader understanding of a disaster’s development and territorial impact. The pyramidal paradigm of data flow and data management must be replaced by one based on acknowledging the distributed nature of knowledge and data. This is the greatest challenge for disaster and emergency information systems.