Introduction
Organic farming emerged in the second half of the last century as an alternative to industrial agriculture and its negative externalities (Campanelli et al., 2015). The US market for organic produce has seen double-digit growth over the last 20 years. Likewise, local food sales increased from US$1 billion in 2005 to US$6.1 billion in 2012 (USDA-ERS, 2015). Nonetheless, an estimated 95% of varieties grown on organic farms were not bred for organic environments, even though research has shown that cultivars developed in conventional systems often under-perform in organic systems (Lammerts van Bueren et al., 2011; Reid et al., 2011; Renaud et al., 2014; Hoagland et al., 2015). Furthermore, many breeding programs focus on broad adaptation across a wide range of environments, reflecting a long-term trend in plant breeding towards breeding for larger seed markets (Hoagland et al., 2015). Such an approach has been shown to jeopardize important traits that may enhance performance in local organic systems (Dawson et al., 2008; Annicchiarico et al., 2012). To optimize productivity on organic farms, farmers need varieties bred to fit their environments, management realities and customers’ preferences. There is a need for public plant breeders to focus research on breeding organic crops in partnership with local farmers, seed companies and other food system stakeholders.
In 2012, Wisconsin organic vegetable farmers identified tomatoes as a top priority for research and breeding initiatives (Lyon et al., 2015). Farmers selling direct to consumers or local markets can often charge a premium for high quality organic tomatoes. Accordingly, certified organic tomato production increased 277% from 2007 to 2011 (USDA-ERS, 2013). Heirloom and other open-pollinated tomatoes are prized for their flavor and eclectic colors, but are frequently perceived as low-yielding and susceptible to splitting and disease (Coolong, 2009; Hoagland et al., 2014). Tomatoes bred for high yield, disease resistance and uniform ripening often lack the fresh-eating and nutritional quality customers prefer (Powell et al., 2012). Combining superior quality and flavor with agronomic performance and regional adaptation is an important priority for Wisconsin organic farmers.
High tunnels, or hoop houses, have gained popularity in recent years as tools for extending the tomato season by as much as 2 months, and potentially obtaining higher yields and higher quality than from field-grown tomatoes. Because hoop house plants are watered using drip irrigation and sheltered from rain and wind, they are protected from common pathogens that depend on leaf moisture to reproduce, such as septoria leaf spot and bacterial spot and speck (Kaiser and Ernst, 2012). A study in North Carolina compared a popular heirloom tomato cultivar, Cherokee Purple, in hoop house and field conditions, and found that the hoop house tomatoes produced 3–4 weeks earlier and achieved 33% higher yield than the field tomatoes (O'Connell et al., 2012). Hoop houses also increase temperature and humidity, potentially encouraging certain diseases, such as leaf mold, and physiological disorders associated with these factors, such as blossom end rot. For organic farmers to take full advantage of hoop houses, and capture the market benefit, there is a need for more tomato varieties adapted to organic hoop house conditions.
The main objectives of this study were to measure the effect of hoop house versus field production on certain traits of interest to local organic farmers and plant breeders, to characterize genetic variation for those traits, and to compare market classes of tomatoes within and across management systems. Testing a broad range of heirloom and modern genetic backgrounds in a hoop house versus field comparison is unique in the literature. Previous studies have examined conventional hoop house versus field environments or focused on a small selection of varieties in organic hoop house versus field comparisons (Taber et al., 2007; Hunter et al., 2010; O'Connell et al., 2012). We prioritized testing more varieties and included varieties in different market classes with 100% heirloom, 100% modern and 50% modern/50% heirloom parentage in order to better understand genetic variation for traits of interest, and to be able to generalize results more accurately for market classes and management systems, rather than focusing on specific pairwise comparisons among varieties. Our objective was to complement previous studies and respond to regional organic farmers’ requests for research that is representative of the different management systems and types of varieties that they may choose to grow.
Trial varieties, priorities and criteria were established collaboratively with a group of plant breeders, farmers and seed companies working in the upper Midwest, and chefs serving the Madison, WI area. Trial varieties were grown in side-by-side field and hoop house management conditions at the West Madison Agricultural Research Station (WMARS). Yield, disease resistance, harvest period and flavor components (sugar and acidity) data were collected at WMARS in 2014 and 2015.
Materials and Methods
Plant material
Plant breeders, seed companies and farmers submitted varieties for the trial. Varieties were selected based on purported high quality for fresh eating, novelty, adaptation to organic environments and lack of previous trialing in the upper Midwest. Varieties were grouped by market class, color, growth habit and breeding type (F1 or open pollinated, OP), and they came from eight seed companies and five individual breeders (Table 1). Market classes correspond to genetic parentage: heirlooms have 100% heirloom parentage (5 varieties), large slicers (5 varieties) and small slicers (5 varieties) have 100% modern parentage, and crosses have 50% modern/50% heirloom parentage (4 varieties) and generally the appearance of a small slicer. We trialed these 19 varieties in both 2014 and 2015. All seed entries were either certified organic or verified as untreated.
Table 1. Tomato trial varieties grown in hoop house and field plots in 2014 and 2015.

OP, open pollinated; F1, first generation hybrid; i, indeterminate; d, determinate; s, semi-determinate; u, unknown.
1 Market classes are determined based on variety parentage. Heirlooms are varieties with 100% heirloom parentage, including crosses between two heirlooms. Crosses are varieties with half heirloom and half modern parentage, including F1 hybrids and crosses that have been selfed. Large and small slicers have modern variety parentage, with large slicers being a beefsteak type and small slicers being slightly larger than a cocktail tomato. Plum Regal was considered a small slicer because, while it is a plum tomato, it is purported to have good fresh eating quality.
Experimental design
In both 2014 and 2015, all trial varieties were grown at WMARS on land certified by Midwest Organic Services Association (MOSA) since 2008. The hoop house was built in the winter of 2014 on a level area oriented north-south with the rows running east-west in the direction of the prevailing winds. The outdoor tomato field was located 30 feet west of the hoop house with identical rows also oriented east-west. The hoop house and field areas were each roughly 32 × 88 feet.
Within each environment, trials were conducted as a completely randomized design with two replications of three-plant plots of each variety and four replications of the check varieties, Big Beef and Pruden's Purple. Big Beef is a common hybrid cultivar among organic farmers and is valued for its relative consistency and productivity. Pruden's Purple is a Brandywine-type heirloom that is also very commonly grown by organic farmers and gardeners in the region. Within the rows, plants were spaced 2 feet (0.6 m) apart, and the beds measured 5 feet (1.5 m) from bed-center to bed-center, with 2-foot (0.6 m) aisles. Three-plant plots were considered the experimental unit for analysis.
The total area and number of plants were limited by the dimensions of the hoop house. We favored testing more varieties over increasing replication of a few varieties because we were interested primarily in differences among management systems and market classes/genetic backgrounds rather than pairwise comparisons of specific variety means. We chose an experimental design that would allow for better estimation of the comparisons of greatest interest.
Management
In 2014 and 2015, soil testing at the UW Madison soil and plant analysis laboratory assessed soil organic matter and nutrient availability in each location (Table 2). Before each planting, a cover crop of winter rye (a winter-hardy grass that provides fall and early spring cover) was grown and incorporated into each plot. Compost was applied to each plot and incorporated the week before planting. Pelleted chicken manure was also applied and incorporated to achieve a total rate of 150 lbs N per acre (168 kg/ha) from all sources, according to recommendations for tomato production.
Table 2. Soil test results for 2014 and 2015 from West Madison Agricultural Research Station.

Transplants were grown in an organically managed greenhouse at the Arlington Agricultural Research Station in 2014, and in certified organic greenhouses at West Star Farms in 2015. Hoop house tomatoes were planted on May 13, 2014 and May 1, 2015. Field tomatoes were planted on June 2, 2014 and May 28, 2015. The 2014 plantings were delayed due to cold, rainy weather. Before planting, all transplants were watered with a 2% solution of fish emulsion to reduce transplant shock.
Beds were slightly mounded, laid with drip irrigation and covered in black plastic mulch. Straw mulch was used to control weeds in aisles. Hoop house and field plants received roughly 0.75–1.5 inches of water per week, with two waterings per week, tapering down starting at peak harvest as recommended to encourage ripening (Whiting et al., 2015). Rainfall was monitored, and field tomatoes were drip irrigated to reach 0.75–1.5 inches of water per week when rainfall did not provide sufficient water.
Hoop house plants were trellised using tomato clips to guide the plants up lengths of twine attached to cross-beams. Field tomatoes were trellised using t-posts every 3–4 plants and a basket weave with tomato twine. Indeterminate tomatoes were pruned according to convention, with 2–3 leaders established and suckers removed until fruit started to ripen. Determinate tomatoes were lightly pruned, with excess foliage removed up to 6 inches up the stem to encourage airflow. Breeding lines of uncertain growth habit were pruned minimally to begin, and more if they appeared to be indeterminate.
Temperature and relative humidity were monitored in the hoop house and the field using a Spectrum Technologies Watchdog 1000 microstation (model number 1450) in each environment. Hoop house side-walls were opened and closed manually from planting until nighttime temperatures were consistently above 65° Fahrenheit, and again in the fall when temperatures dropped. Human error may have contributed to occasional spikes in temperature, as discussed in the results section.
Harvest
In 2014 tomatoes were harvested twice weekly as we attempted to pick at peak ripeness for weekly flavor evaluations with a panel of local chefs. In 2015, chefs evaluated tomato flavor only twice throughout the season, and we were able to reduce harvest frequency to once a week. In 2015, fruits were harvested at the USDA ‘turning stage’ rather than full ripeness to minimize loss due to splitting. We found that harvesting some tomatoes slightly under-ripe did not significantly impact harvest weight, though only fully ripe fruit was used for quality analysis, described below.
Yield was recorded on a per-plot basis. At the time of harvest, tomatoes were sorted into ‘marketable’ and ‘unmarketable’ categories. Any fruits damaged by splitting, disease, insects, rodents or weather were considered unmarketable. Natural cat-facing on heirlooms or small dry cracks due to rapid growth were still considered marketable. Total weight and number of marketable tomatoes were recorded for each plot, as were total weight of unmarketable tomatoes and causes of damage. Perfectly ripe tomatoes were put aside for quality evaluation. The percent of unmarketable fruit by weight was also calculated for each plot.
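The percent-unmarketable calculation described above can be sketched as follows. This is a minimal illustration, assuming the percentage is taken over the combined marketable and unmarketable weight of a plot; the function name and the example weights are hypothetical and not taken from the trial data.

```python
def percent_unmarketable(marketable_g, unmarketable_g):
    """Percent of a plot's total harvest weight that was unmarketable.

    Assumes the denominator is marketable + unmarketable weight (grams).
    """
    total = marketable_g + unmarketable_g
    if total <= 0:
        raise ValueError("plot has no recorded harvest weight")
    return 100.0 * unmarketable_g / total


# Hypothetical plot: 7500 g marketable and 2500 g unmarketable fruit.
example = percent_unmarketable(7500.0, 2500.0)
print(f"{example:.1f}% unmarketable by weight")
```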
Quality components
In 2014, 8–10 tomato varieties from both management systems were evaluated weekly by a panel of five local chefs for flavor and culinary quality. This frequency proved to be a burden on tasters, leading to gaps in the data. In 2015, tomatoes were evaluated in bi-weekly ‘crew tests’ by members of the WMARS field crew. Popular tomatoes from the ‘crew tests’ and a few unique specimens were evaluated by the chefs at two tastings during peak season. These evaluations were done exclusively on hoop house tomatoes as there were limitations on sample size due to taster fatigue after about six samples. In 2015, it was also difficult to find unblemished tomatoes in the field due to the incidence of bacterial speck. In their evaluations, chefs used a modified version of the ‘crew test’ questionnaires, rating tomatoes for flavor intensity and preference, while providing detailed qualitative information about flavor profiles and potential culinary applications. Flavor intensity ratings (on a 1–8 scale) are discussed in the results section. Additional data from flavor evaluations will be presented in a separate publication.
Whenever a tomato plot was evaluated for flavor, a bulked sample of three to six ripe tomatoes from that plot was brought to the laboratory for °Brix (as a measure of total dissolved solids) and titratable acidity analysis to measure citric acid by volume (CA). This way, we could look for relationships between tasters’ flavor perceptions and the basic elements contributing to flavor: sugar and acid. For °Brix and CA, tomato juice samples were filtered through cheesecloth to remove excess solids. °Brix levels were tested using a Sper Scientific digital refractometer (model number 30051). Citric acid content by volume was measured using a Hanna Instruments automatic titrator (model 902). pH was also measured for each tomato juice sample, though it was of lesser interest than CA, since the latter has been shown to correlate with how acid is perceived as a flavor in tomato (Baldwin et al., 1998).
Disease incidence and severity
Disease incidence and severity were measured three times in each year in each environment. Severity measurements were based on nearest percentage estimates, in 5% increments (0–100%), of leaf area covered with disease. Plants that received 100% ratings were completely dead. Area under the disease progress curve was calculated for each variety in each environment using July 1, with a disease incidence of 0%, as the first data point. The same researcher always measured disease incidence and severity to prevent variation due to observer.
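The text does not state how the area under the disease progress curve was integrated. A minimal sketch, assuming the standard trapezoidal rule over the July 1 baseline and the three rating dates, is given below; the dates and severity values are hypothetical.

```python
from datetime import date


def audpc(dates, severities):
    """Trapezoidal area under the disease progress curve, in percent-days.

    dates: rating dates in increasing order; severities: percent leaf area
    diseased (0-100) on each date.
    """
    if len(dates) != len(severities) or len(dates) < 2:
        raise ValueError("need at least two matched dates and severities")
    area = 0.0
    for i in range(len(dates) - 1):
        days = (dates[i + 1] - dates[i]).days
        if days <= 0:
            raise ValueError("dates must be strictly increasing")
        # Mean severity over the interval times the interval length.
        area += 0.5 * (severities[i] + severities[i + 1]) * days
    return area


# Hypothetical plot: July 1 baseline of 0%, then three ratings 21 days apart.
rating_dates = [date(2015, 7, 1), date(2015, 7, 22),
                date(2015, 8, 12), date(2015, 9, 2)]
ratings = [0, 10, 35, 60]
total = audpc(rating_dates, ratings)
print(total)
```

Because severities are multiplied by days, plots rated on different schedules are only comparable if they share the same first and last dates, which the common July 1 starting point helps ensure.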
Statistical analysis
Data were analyzed with the goal of characterizing management effects, genotype × environment interactions and genetic variation for specific traits of interest (i.e. yield, disease, and quality components). Data were analyzed using the function lmer in the R package lme4 (Bates et al., 2014; Walker et al., 2015). Analysis of variance (ANOVA) was performed for all traits of interest in each environment for each year.
A fixed effects model was created using year, management, variety and their interactions as sources of variation. We ran this model separately from the market class model (below) because we wished first to assess the contribution of varieties to variation for traits of interest without imposing any structure on the data.
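Assuming a standard full-factorial linear model, this description corresponds to the form sketched here. The response y, overall mean μ, residual ε and replicate index r are notation assumed for illustration; the factor symbols follow the definitions given after the equation.

```latex
y_{ijkr} = \mu + V_i + M_j + Y_k + (VM)_{ij} + (VY)_{ik} + (MY)_{jk} + (VMY)_{ijk} + \varepsilon_{ijkr}
```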

where V_i = variety i, i = 1–19; M_j = management j, j = hoop house or field; and Y_k = year k, k = 2014 or 2015.
We then analyzed the effect of market classes by nesting variety within market class, and testing for the effects of market class and the interaction of market class with year and management.
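One form consistent with this description is sketched below. It assumes a standard nested factorial model; y, μ, ε and the replicate index r are assumed notation, and any interactions of variety within class with management or year that may have been fitted are not shown.

```latex
y_{ijklr} = \mu + C_l + V(C)_{i(l)} + M_j + Y_k + (CM)_{lj} + (CY)_{lk} + (MY)_{jk} + (CMY)_{ljk} + \varepsilon_{ijklr}
```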

where C_l = market class l, l = heirloom, cross, large slicer or small slicer; V(C) = variety within market class; and M and Y are as above. The effect of variety mating type (F1 hybrid vs open pollinated) was assessed using the same model, substituting variety mating type for market class. The package lsmeans (Lenth, 2016) was used to calculate means and confidence intervals for market classes and variety types (F1 and OP). We assessed the relative magnitude of the effects using the partial omega squared metric as described in Carrol and Nordholm (1975).
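For readers unfamiliar with partial omega squared, the sketch below uses one common textbook form for fixed-effects designs, ω²ₚ = (SS_effect − df_effect × MS_error) / (SS_effect + (N − df_effect) × MS_error). Whether this matches the cited source term for term is an assumption, and the sums of squares in the example are hypothetical.

```python
def partial_omega_squared(ss_effect, df_effect, ms_error, n_total):
    """Partial omega squared for one effect in a fixed-effects ANOVA.

    ss_effect: sum of squares for the effect
    df_effect: degrees of freedom for the effect
    ms_error:  residual mean square
    n_total:   total number of observations (plots)
    """
    numerator = ss_effect - df_effect * ms_error
    denominator = ss_effect + (n_total - df_effect) * ms_error
    return numerator / denominator


# Hypothetical effect: SS = 120 on 1 df, residual MS = 4, 100 plots.
omega_p = partial_omega_squared(120.0, 1, 4.0, 100)
print(round(omega_p, 3))
```

Unlike a P-value, this quantity does not grow simply because more plots were measured, which is why it is useful for ranking sources of variation against one another.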
Due to an error in the 2015 hoop house plot design, blocking within environments was not included as a fixed source of variation in the ANOVA, and the experiment was therefore analyzed as a completely randomized design. Comparison of Akaike information criterion values for replication, management and variety in 2014 and for the 2015 field data indicated that the model did not suffer from removing block from the fixed effects model. For all analyses, means and totals were adjusted to account for missing plants in one plot in 2014 and one plot in 2015 (due to low germination or seedling disease) in each environment.
To determine the relative proportion of the total variance contributed by variety, repeatability (broad sense heritability) was calculated using the variety model above, but with all effects considered random in order to estimate variance components. These components were then used in the formula (Holland et al., 2003, pg. 88):
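Written out from the variance component definitions that follow, and using H as an assumed symbol for repeatability, the formula takes the form:

```latex
H = \frac{V_G}{V_G + \dfrac{V_{GM}}{2} + \dfrac{V_{GY}}{2} + \dfrac{V_{GMY}}{4} + \dfrac{V_e}{8}}
```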

where V_G = variance due to genotype (variety); V_GM/2 = variance due to genotype × management interaction divided by the number of management treatments (2); V_GY/2 = variance due to genotype × year interaction divided by the number of years (2); V_GMY/4 = variance due to genotype × year × management interaction divided by the number of years times the number of management treatments (4); and V_e/8 = variance due to error divided by the number of years, number of management systems and number of reps within year and management (8).
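The calculation defined by these terms can be sketched numerically as follows. The divisors follow the definitions above (2 management systems, 2 years, 2 replications); the function name and the variance component values are hypothetical.

```python
def repeatability(v_g, v_gm, v_gy, v_gmy, v_e,
                  n_mgmt=2, n_years=2, n_reps=2):
    """Repeatability as genotypic variance over variance of a variety mean."""
    phenotypic = (
        v_g
        + v_gm / n_mgmt
        + v_gy / n_years
        + v_gmy / (n_mgmt * n_years)
        + v_e / (n_mgmt * n_years * n_reps)
    )
    if phenotypic <= 0:
        raise ValueError("variance of the variety mean must be positive")
    return v_g / phenotypic


# Hypothetical variance components (not estimates from this trial).
h = repeatability(v_g=8.0, v_gm=2.0, v_gy=2.0, v_gmy=4.0, v_e=8.0)
print(round(h, 3))
```

When the estimated genotypic variance is close to zero, the ratio is close to zero regardless of the other components.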
Results
Temperature
The hoop house was significantly warmer than the field in both years. Between May and October 2014, mean temperatures were 70.2/21.2° (F/C) in the hoop house and 62.75/17.08° (F/C) in the field; between April and October 2015, they were 69.5/20.8° and 62.5/16.9° (F/C), respectively. While the outdoor temperature peaked in the mid 90s (°F) in 2015, hoop house temperatures occasionally spiked above 100°F, reaching a maximum of 112/44.4° (F/C) in 2014 and 114/45.6° (F/C) in 2015. Relative humidity was also significantly higher in the hoop house than in the field in both years, with spikes into the 90% range in both 2014 and 2015.
Yield, disease and flavor components
The ANOVA results in Table 3 identify the primary sources of variation for each trait of interest in the 19 tomato varieties present in both years of the trial. Results from the model with variety nested within market class are shown, as variety within class in the market class model showed the same degree of significance as variety in the variety-only model. Results from the mating type model were very similar to the market class model, even though the F1/OP classes do not correspond perfectly to the market classes.
Table 3. Combined ANOVA table of traits of interest for 19 varieties trialed in 2014 and 2015, hoop house and field. Top panel gives effect size (partial ω²), F-test and P-values for market class model ANOVA.

NS, not significant; mgmt, management (hoop house or field).
+P ≤ 0.10, *P ≤ 0.05, **P ≤ 0.01, ***P ≤ 0.001.
Management (hoop house versus field) had the largest effect size in determining marketable yield, total area under the disease progress curve (AUDPC), and °Brix. Year and management × year interactions also had a large effect on AUDPC. Year was the most significant source of variation in percent unmarketable yield and CA, and had a significant effect on all traits except for the °Brix/CA ratio.
Variety within market class and market class were the primary sources of variation determining average fruit weight, CA and the ratio of °Brix/CA, though variety within market class also contributed significantly to variation in all traits of interest. Market class was not significant for marketable yield, AUDPC or °Brix.
Management × market class interactions were only significant for °Brix, and AUDPC at the P < 0.1 level but the magnitude of the interaction effect was small compared with main effect sizes. Market class × year interactions were only significant for AUDPC, and fruit size at the P < 0.1 level. Again, these effect sizes were small compared with the main effect sizes. Management × year interactions were significant for AUDPC, and for average fruit weight, but of smaller effect size for fruit weight than market class or variety within market class. The interaction between market class, management and year was significant only for average fruit weight, with a smaller effect size than the management × year interaction.
Repeatability
Calculations of repeatability (Table 4) followed expectations for trait broad sense heritability. Average fruit weight had the highest repeatability, with a value of 0.96. Total marketable yield and percent unmarketable had moderate repeatability, at 0.36 and 0.27, respectively. AUDPC had a repeatability of near zero (5.0 × 10⁻¹³), which is not surprising given the non-significant effects of variety and the strong effects of management and year. For quality-related traits, the °Brix/CA ratio had the highest repeatability at 0.83, followed by CA at 0.77, with °Brix having very low repeatability at 0.08, consistent with the major effects of management on this trait.
Table 4. Repeatability measures for traits of interest. Repeatability was calculated based on formulas in Holland et al. (2003, pg. 88).

Discussion
Marketable yield was significantly higher in the hoop house than in the field in both years of the study and across the four market classes: heirloom, crosses, large slicer and small slicer (Fig. 1). The difference in yield between the two environments was greater in 2015 than in 2014. Market class did not have a significant effect on marketable yield, but variety within market class did, leading to the conclusion that there is more variability for productivity within market classes than between them.

Fig. 1. Market class mean values for hoop house and field, 2014–2015. Error bars are +/− the standard error of the mean. F, field; H, hoop house. (a) Marketable Yield (g/plot) for all varieties grown in both years, by market class. (b) Proportion unmarketable for all varieties grown in both years, by market class. (c) °Brix/CA ratio for all varieties grown in both years, by market class. (d) AUDPC for all varieties grown in both years by market class.
Looking at these results, an important question arose: could the higher yield numbers in the hoop house simply be the result of the longer season in the hoop house, or is there an additional bump in yield from hoop house production over and above the season extension effect?
To answer this question, we looked at weekly averages in hoop house and field yield for both 2014 and 2015. In 2014 the weekly average tomato production in the field was 504.8 grams per plant over the 11-week season. In the hoop house, weekly average production was 473.7 grams per plant over the 15-week season. These numbers were not significantly different, so we can infer that the increased yield in the hoop house in 2014 was mostly due to the longer season.
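The reasoning can be checked with simple arithmetic. The sketch below assumes each weekly average is the per-plant season total divided by the number of harvest weeks, so the season totals it produces are implied values, not figures reported in the study.

```python
def season_total(weekly_avg_g_per_plant, weeks):
    """Implied per-plant season yield (g) from a weekly average and season length."""
    return weekly_avg_g_per_plant * weeks


# 2014 weekly averages and season lengths as stated in the text.
field_2014 = season_total(504.8, 11)
hoop_2014 = season_total(473.7, 15)
ratio = hoop_2014 / field_2014

print(f"field: {field_2014:.1f} g/plant, hoop house: {hoop_2014:.1f} g/plant")
print(f"hoop house / field = {ratio:.2f}")
```

Under this assumption the hoop house total is roughly 28% higher even though its weekly rate is slightly lower, so the 2014 advantage comes from the four additional harvest weeks.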
In 2015, the result was different. Average weekly production in the field was 330.4 grams per plant, while in the hoop house it was 451.3 grams per plant. These numbers were significantly different, suggesting an additional boost from the hoop house above the effect of season extension. This opposite result from 2014 may be attributable to the elevated levels of disease in the field in 2015, due to an outbreak of bacterial speck. This suggests that while the primary benefit of hoop houses may be a longer season when field conditions are good, they may offer an additional benefit to tomato growers when field conditions are sub-optimal.
Spikes in temperature and high relative humidity, however, may have negatively affected plants in the hoop house. Research has shown that temperatures above 104/40° (F/C) for as few as 4 h can cause damage to tomato plants, including flower abortion (Ozores-Hampton and McAvoy, 2015). These temperature spikes might have been avoided in a hoop house with automated sidewalls. High relative humidity can cause problems with pollen viability and pathogen development (Harel et al., 2014). Hoop houses can produce benefits for organic growers, but optimizing this technology requires farmers to invest in safeguards against extremes of temperature and humidity.
Like marketable yield, average fruit weight is an important characteristic for growers selling direct to consumers. At informal meetings in January 2014 and 2015, chef and farmer participants expressed a need for a smaller-sized heirloom-type tomato. Since tomatoes at farmers’ markets are often sold by the pound, a single large organic tomato can easily cost US$5.00–7.00, which according to participating farmers, deters customers. Researching farmers’ ideal fruit size range and looking for varieties consistently within that range would be an interesting topic for future inquiry. For this study, we analyzed sources of variation in fruit size, and found that market class and variety contribute much more to average fruit size (in grams) than management or year. This is to be expected since one of the distinguishing characteristics of varieties and one of the defining characteristics of market class is fruit size. This finding also suggests that farmers need not worry about large differences in fruit size when deciding whether to grow tomatoes in a hoop house or open field.
AUDPC varied significantly between hoop house and field in both 2014 and 2015, and this difference was particularly strong in 2015 (Fig. 1). This is likely due to different leaf moisture levels in the two environments. Leaves remained dry in the hoop house, limiting the spread of fungal and bacterial diseases. In 2015 the disease differential between hoop house and field was more dramatic than in 2014 because of a crippling outbreak of bacterial speck in the field, which barely affected plants in the hoop house. This finding suggests that farmers experiencing bacterial or fungal disease pressure in open tomato fields might enjoy a significant benefit from adding hoop house production to their operation, provided that rotating crops within the hoop house or moving the hoop house to avoid disease buildup is possible. These data also demonstrate that while heirlooms have a reputation for disease susceptibility, they are not always the most heavily impacted. Neither market class nor variety within market class had a significant effect on AUDPC. This may be due in part to relatively low disease pressure in 2014 and very high disease pressure in 2015 in the field, which made it difficult to detect varietal differences in sensitivity to the foliar diseases that were present. More research is needed to understand disease susceptibility in the two environments, and to identify genotypes that perform well in both. Hoop house production will likely be able to use varieties with less resistance to the common foliar diseases we found in the field trial: early blight, septoria and bacterial speck. There are also other diseases, such as molds and mildews, that are more prevalent in some hoop houses and may require varietal resistance, but we did not observe them in our trial (Miller, 2015).
Hoop house management led to significantly higher °Brix measurements than field management across market classes in both years of the study. This is perhaps due to consistently higher heat and/or lower soil moisture in the hoop house than in the field. Interestingly, we did not observe significant differences in °Brix levels between market classes, but variety within market class was significant. This means that there is more variation within market classes than between them for °Brix content, making it hard to generalize about differences between heirloom and modern varieties. Modern varieties with higher °Brix levels may be useful for breeding for higher quality and production. However, as Baldwin et al. (1998) found, °Brix measurements alone may not be useful in anticipating tomato flavor. CA and the ratio between °Brix and CA are equally important (Bucheli et al., 1999).
For CA, unlike for °Brix, management did not have a large effect, and while year was significant, its effect size was much smaller than those of variety and market class (Table 3). The parity between hoop house and field is interesting and suggests that hoop house conditions influence sugar accumulation much more than CA, which may affect perceptions of flavor between management systems.
Baldwin et al. (1998) reported that the ratio of soluble sugars to CA correlates better with overall flavor acceptability than either flavor component in isolation. Bucheli et al. (1999) came to similar conclusions, suggesting that the sugar to acid ratio may be a reliable method for anticipating tomato flavor, though more research is needed to confirm this proposition. °Brix/acid ratios are more commonly used to anticipate flavor at peak ripeness in grapes (Liu et al., 2006; Jayasena and Cameron, 2008). In this study, variety and market class had the most influence over the °Brix/CA ratio, while management was also significant. As we would expect given the high °Brix levels in the hoop house and consistent CA between management systems, the °Brix/CA ratios were significantly higher in the hoop house than in the field in both years. This may indicate a flavor difference between hoop house and field-grown fruits.
In their tomato flavor evaluations, Baldwin et al. (Reference Baldwin, Scott, Einstein, Malundo, Carr, Shewfelt and Tandon1998) found a correlation of r = 0.71 between flavor intensity ratings and overall flavor acceptability, suggesting that consumers prefer a more intensely flavored fruit. During our flavor evaluations, chefs were not asked to rate overall flavor acceptability but they were asked to rate flavor intensity on a 1–8 scale. From comments on the rating sheets it was clear that more intense flavors were preferred. These flavor intensity ratings were plotted against the °Brix/CA ratios to yield a slope of −0.13 and an R-squared value of 0.22 (Fig. 2). While the correlation is not strong, it does suggest some relationship between the chefs’ flavor intensity ratings, and therefore potentially their preferences, and the °Brix/CA ratios observed in the laboratory. Higher ratios in this case correlated with lower flavor intensity. The large slicer category had the highest ratio, meaning potentially the least intense flavor, while crosses, heirlooms and small slicers had similar, lower ratios. This aligns with crew and chef preferences for varieties from the different categories, with very few large slicers marked as having intense flavor.

Fig. 2. °Brix to CA ratio versus chefs’ intensity ratings for 21 varieties trialed in the hoop house in 2015.
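A slope and R-squared value like those reported for Fig. 2 can be obtained from an ordinary least squares fit of intensity rating on °Brix/CA ratio. The sketch below is only illustrative: the software used for the original fit is not stated here, the function name `fit_line` is hypothetical, and the six data points are invented placeholders rather than values from the trial.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar, y_bar = mean(x), mean(y)
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain more than one distinct value")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # For a simple linear regression, R-squared equals the squared Pearson r.
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical placeholder data, NOT the study's measurements:
# one Brix/CA ratio and one mean intensity rating (1-8 scale) per variety.
ratios = [8.0, 9.5, 11.0, 12.5, 14.0, 16.0]
intensity = [6.0, 5.0, 6.5, 4.5, 5.5, 4.0]

slope, intercept, r_squared = fit_line(ratios, intensity)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, R^2 = {r_squared:.2f}")
```

A negative slope means that varieties with higher ratios tended to receive lower intensity ratings, and the R-squared value gives the share of variation in ratings that the fitted line accounts for.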
The relatively high values of repeatability are to be expected in a trial with such a broad range of varieties. In a breeding program with a single market class and more uniform entries, these would likely be reduced. However, they provide an indication that there is significant genetic variation for traits of interest including marketable yield, the proportion of yield that is marketable and acidity as a component of quality. AUDPC and °Brix both showed much stronger environmental effects. The challenge in selecting for the traits with high repeatability is that even with strong genetic effects, it is difficult to find significant differences among a larger number of breeding lines and varieties. These are relatively expensive traits to measure (labor-intensive) and it is difficult to achieve a high enough level of statistical power to detect significant differences.
Certain varieties emerged as high performers for each trait of interest and a select few varieties did well across traits. While testing for significant pairwise differences among varieties was not a primary goal of this study, Table 5 describes four notable varieties from the 2014 and 2015 trials. Plum Regal, a small slicer hybrid tomato from Bejo Seeds, did well by a broad range of metrics. This tomato is a plum-type used for both paste and fresh market. It was in the top three for marketable yield in both environments in 2015 and in the field in 2014. Plum Regal also exhibited significant disease and was in the mid-range for weight, which participating farmers reportedly prefer. Plum Regal was in the mid-range for °Brix/CA ratio, and was not rated as a highly flavorful tomato by tasters. Nonetheless, Plum Regal's productivity characteristics make it a good candidate for future breeding, particularly for dual-purpose tomatoes that can be used both fresh and for small-scale processing.
Table 5. Characterization of four notable varieties as high, low or middle according to five traits of interest. High and Low designations may apply to both years, or 1 year in the management system indicated.

* Significantly different from at least one other variety in the same environment and the same year at P ≤ 0.05 using Tukey's honest significant difference.
H, hoop house; F, field.
Garden Gem, an heirloom/modern cross from Dr. Harry Klee at the University of Florida, was similarly productive to Plum Regal in terms of numbers of fruit, but because it is a small tomato it was only in the top three for yield by weight once, in the 2015 field. Nonetheless, it had a consistently low percent unmarketable yield, ranking in the bottom three in both the hoop house and field in 2015 and in the field in 2014. Garden Gem also had a consistently low °Brix/CA ratio, by a significant margin in both environments in 2014 and in the field in 2015. This aligns with our taste testers' qualitative analysis of the tomato, which identified it as a common favorite with intense flavor among chefs and the crew.
A6 is a large heirloom-type open-pollinated tomato from local Wisconsin breeder Dr. Craig Grau. Though this tomato might be too big for some market growers, and it did exhibit the unremarkable yields often expected of heirlooms, it also displayed some disease tolerance in the field in 2015. Additionally, it was well-rated for flavor by the field crew and chefs and had a low °Brix/CA ratio. A6 may be a good candidate for developing disease tolerance traits in heirloom-like varieties.
Caiman is a large red slicer from Vitalis Organic Seeds. It had a high marketable yield in the hoop house and field in both years of the study, and exhibited a low percent unmarketable yield in the hoop house in both years. The large slicer market class was generally less preferred for flavor than the other market classes, and this was true for Caiman; it was middling in our flavor evaluations. But its productivity characteristics out-ranked the check variety, Big Beef, suggesting that Caiman, if crossed with a more flavorful parent to generate a breeding population, could be useful to develop a higher-yielding alternative to Big Beef.
Conclusion
To meet the needs of organic direct-market farmers, it is important to develop tomato varieties that perform exceptionally well in organic conditions. As hoop house production is a tool commonly used on organic farms, it is important for plant breeders to understand the effects of hoop house management on variety performance, particularly genotype × management system interactions. More research is also needed on this management system, especially as it relates to tomato flavor development and the prevalence of particular diseases and physiological disorders such as blossom end rot.
Based on the results of this study, the hoop house outperformed an adjacent and identically managed field plot by a significant margin, with higher marketable yield and lower disease incidence. It also resulted in higher °Brix levels compared with field-grown fruit. The levels of citric acid by volume were consistent between the hoop house and the field in both years, and the °Brix/CA ratio was significantly higher in the hoop house than in the field. This is interesting given the significant negative correlation between °Brix/CA ratio and chefs’ flavor intensity perceptions during taste tests. This suggests that °Brix/CA ratio may be a simple proxy for how a tomato will taste to the eater. While previous research has also demonstrated this correlation, more research is needed to determine whether a sugar to acid ratio can be used to predict flavor acceptability. For this study, the existence of this correlation implies that tomatoes grown in a hoop house may have lower flavor intensity than field-grown tomatoes, although this remains to be confirmed with further research.
We did not find evidence of genotype × management system interactions of a similar magnitude to the main effects of management, year, market class and variety within market class. This is encouraging as breeders can develop lines in one system and they are likely to also have good yield and quality in the other system. The only interaction of consequence was that of management × year for AUDPC, which given the difference in field disease incidence in 2015 is not surprising.
Interestingly, market class did not have a significant effect on yield or disease incidence, two of the main characteristics supposedly differentiating heirloom and modern varieties. We found more variation within these market classes than between them, with some heirlooms having yields and disease incidence similar to those of modern varieties. For quality characteristics, while the large slicer category had less desirable °Brix/CA ratios, the other classes, including crosses with modern and heirloom parentage and small slicers, had similar values to heirlooms. This is promising as it means that breeders may be able to identify excellent parents within both modern and heirloom market classes and combine productivity and quality traits within a single variety. While none of the varieties in this study were optimal for all traits, we did not find evidence of significant negative associations between important traits of interest. Further screening and selection at research stations and on farms will help identify varieties that optimize productivity as well as flavor under the heterogeneous management realities of working organic farms.
Acknowledgements
We would like to thank all the individuals and companies that provided seed for the project (listed in Table 1); Janet Hedtcke and the staff at the West Madison Agricultural Research Station; field crew members Terri Theisen, Natalie Cotter, Thomas Hickey, Mariana Debernardini, Bradley Melinger, Laura Jacobson, Marissa Nix, Lucas Holiday, Sarah Lee, Molly Kreykes, Jamie Lovely and Maya Reese; Tricia Bross at Luna Circle Farm; the farmers and chefs who participated in the project; the Sustainable Agriculture Research and Education Program of the United States Department of Agriculture (USDA) (LNC14-357); and the Hatch Program of the USDA (WIS01775).