ABSTRACT The Asian hornet (Vespa velutina) is a major invasive predator of honeybees. Theoretical ecology techniques were used to give managers insight into a potential invasion, and to allow managers to formulate counter-strategies. An invasion of North America (Canada and the United States) was simulated using the standard, four-stage model of invasion: transportation phase; establishment phase; growth and spread phase; and impact phase. The transportation phase was modeled with pathway analysis. Pathway analysis showed 5.461% of US imports by value, and 9.345% of Canadian imports by value, are potential vectors for invasion. The establishment phase was modeled using niche analysis. Niche analysis showed the western coast and eastern coast of North America, and the southern United States east of the Mississippi River, are highly suitable to V. velutina invasion, while the middle of the continent is inhospitable. All ports (n=24) studied in the United States and Canada were suitable for invasion except Anchorage, AK. Growth and Spread was simulated using a continuous Fisher-Skellam ReactionDiffusion model, and a discrete Markov model. The continuous model projects a mean nest population of 222.745 nests after ten years, while the discrete model projects a mean nest population of 289.823 nests after ten years. The impact phase was modeled using estimates of losses to agricultural output. The United States agricultural industry could face a loss of $565,181,398.135 USD, and Canada could face a loss of at least $6,475,218.878 CAD. Monitoring invasion vectors, and prioritizing certain ports above others, were deemed impractical for managers. The distribution of suitable habitat suggests North i American managers have the option, unavailable to managers in Europe and Asia, of containing an invasion to a single coast. ii MODELING AN ASIAN HORNET (Vespa velutina) INVASION IN NORTH AMERICA A THESIS SUBMITTED TO THE SCHOOL OF GRADUATE STUDIES of BLOOMSBURG UNIVERSITY OF PENNSYLVANIA IN PARTIAL FULFILLMENT FOR THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE PROGRAM IN BIOLOGY DEPARTMENT OF BIOLOGY BY THOMAS A. O’ROURKE BLOOMSBURG PENNSYLVANIA 2020 iii ACKNOWLEDGEMENTS I would like to thank my Advisor Dr. John M. Hranitz, my committee members Dr. Abby Hare-Harris and Dr. Clay Corbin, and Graduate Chair Dr. Thomas Klinger, for all of their hard work. I would like to thank Bloomsburg University’s Dr. Nonlhe Mziniso for her advice on constructing the statistical models. I would like to thank Dr. Allen Smith-Pardo at USDA-APHIS, Carrie Lima-Brown at Cornell University’s New York Invasive Species Institute, and Belle Bergner at the North American Invasive Species Management Association for putting me in contact with people who can use this information. v Table of Contents Abstract……………………………………………………………………..……………..i Title Page…………………………………………………………………………...……iii Approval Form………………………………………………………………………,,,..iv Acknowledgements……………………………………………………………………....v Table of Contents……………………………………………………………….……….vi List of Tables………………………………………………………………………….....ix List of Figures…………………………………………………………………..………..x List of Appendices……………………………………………………………………..xii Main Work 1-Introduction…….…………………………………………………………………..…1 1.1-Life History of Vespa velutina…………………………………..………….2 1.2-Invasion Biology of the Asian Hornet………………………………………4 1.3-Ecology and Natural History of the Asian Hornet………………….…....10 1.4-Invasion as Habitat Distribution: Niche Modeling ………………………15 1.5-Invasion as Population Dynamics: Growth and Spread………………....16 1.5.1-Continuous Growth and Spread……………………………………...…17 1.5.2- Stochastic Growth and Spread………………………………………....20 1.5.3-Discrete Growth…………………………………………………………..22 1.5.4-Discrete Spread…………………………………………………………...24 2-Aims…………………………………………………………………………………..26 3-Materials and Methods……………………………………………………………...27 vi 3.1-Transportation Phase……………………………………………………...28 3.2-Establishment Phase……………………………………………………….,29 3.3-Growth and Spread Phase………………………………………………..31 3.3.1-Refining Model…………………………………………………………..32 3.3.2-Minor Variables………………………………………………………….34 3.3.3-Input Dating and Processing……………………………………………35 3.3.4-Trial Ports………………………………………………………………..37 3.3.5-Methods of Calculating r(x,y)…………………………………………...38 3.3.6-Measurements ……………………..…………………………………….39 3.3.7-Experimental Control and Statistical Tests …………..……………….40 3.4-Impact Phase………………………...…………………………………….43 3.5-Model Validation……………….…………………………………………44 4-Results………………………………………………………………………………..43 4.1-Transportation Phase…………………………………………………..…46 4.2-Establishment Phase…………………………………………………….…47 4.3-Growth and Spread Phase……………………………………………..…47 4.3.1-Continuous Growth and Spread……………………………………….49 4.3.2 – Discrete Growth and Spread………………………………………….53 4.3.3-Both Discrete and Continuous Results………………………………….56 4.4-Impact Phase………………………………………………………………57 4.5-Model Validation…………………………………………………………..59 5-Discussion………………………………………………………………….………...59 LITERATURE CITED ………………………………………………………………65 vii TABLES…………………………………….…………………………………………71 FIGURES……………………………………………………………………………...98 APPENDICES APPENDIX A: PYTHON SOURCE CODE………………………………………120 APPENDIX II: STATISTIAL ANALYSIS………………………………………..176 APPENDIX III: RAW OUTPUT…………………………………………………..213 viii LIST OF TABLES 1. Hulme’s Transportation Criteria with Respect to V. velutina Biology 2. Table of Equations Used in this Study. 3. Summary of Constants Used for the Growth and Spread Simulation. 4. Ports in the Test Group with GPS Coordinates and Matrix Indices. 5. Locations in the Positive and Negative Control Groups with GPS Coordinates and Matrix Indices. 6. Model Statements and Calculations. 7. Honey Impact by State 8. Summary of Transportation Calculations for the United States. 9. Summary of Transportation Calculations for Canada. 10. Summary Statistics for N10 Projections for the Test Group. 11. Summary Statistics for N10 Projections for the Positive Control Group. 12. Summary Statistics for N10 Projections for the Negative Control Group. 13. Summary Statistics for Mean Nest Density for the Test Group. 14. Summary Statistics for Mean Nest Density for the Positive Control Group. 15. Summary Statistics for Mean Nest Density for the NegativeControl Group 16. ANOVA Results for Sites, Model Statements, and Trial Groups. 17. Summary of Economic Impact Calculations for the United States. 18. Summary of Economic Impact Calculations for Canada. ix LIST OF FIGURES 1. Life Cycle of V. velutina. 2. Flow Chart of Study Aims. 3. V. velutina Occurrence in Europe. 4. V. velutina Occurrence in Asia. 5. World V. velutina Occurrence Probability Raster. 6. Program Architecture of Niche Analysis Model. 7. Program Architecture of Growth and Spread Simulations. 8. Human Population Density of North America. 9. Major River Presence in North America. 10. Elevation (m) in North America. 11. V. velutina Occurrence Probability by State. 12. Example of Negative Nest Density-St. Paul, MN. 13. Trial Means with Standard Error. 14. Boxplots of Trials for the Test Group, Positive Control Group, and Negative Control Group. 15. Distribution of N10 Projections by Site. 16. Comparison of Discrete and Continuous Geometry – New Orleans, LA. 17. Boxplot of Continuous and Discrete Trial Means. 18. Boxplot of Continuous and Discrete N10 Population Projections Based on Occurrence. 19. Boxplot of N10 Population Projections for the Test Group, Negative Control Group, and Positive Control Group. x 20. Principal Components Analysis. 21. Correlation between Human Population Density and Land Area 22. M. apicalis Occurrence Probability xi LIST OF APPENDICES A. Python Source Code A.1 Script 1-Niche Analysis Modeling A.2 Script 2-Continuous Growth and Spread Model A.3 Script 3- Discrete Growth and Spread Model A.4 Script 4-Statistical and Arithmetic Function Module “funx” A.5 Script 5-Matrix Manipulation Function Module “mfunx” A.6 Script 6- String Functions from Useful Functions Module “ufunx B. Statistical Analysis and Scripts B.1 R Script for Statistical Analysis B.2 Python Script for Statistical Analysis C. Raw Data C.1 N10 Projections for Test Groups. C.2 N10 Projections for Positive Control Groups. C.3 N10 Projections for Negative Control Groups. C.4 Mean Nest Density for Trial Groups. C.5 Mean Nest Density for Positive Control Groups. C.6 Mean Nest Density for Negative Control Groups. C.7 Z-Score for the Test Group C.8 Z-Scores for the Positive Control Group C.9 Z-Scores for the Negative Control Group. xii 1-INTRODUCTION Vespa velutina, commonly known as the Asian hornet, or the yellow-legged Asian hornet, is one of the greatest threats faced by pollinators in the world today. Native to southern Asia, V. velutina has been pushing north into Korea and Japan as climate change expands this hornet’s range. On the Korean Peninsula, V. velutina’s range has been pushing north at a rate of 12 km per year, on average. In Busan, South Korea, V. velutina is now the dominant hornet species after ten years of invasion, accounting for 37% of observed hornet nests. V. velutina accounted for 47% of emergency calls for nest removal, and in one Busan apiary 50 of 300 beehives were destroyed by V. velutina (Choi et al. 2012). V. velutina was first observed on Tsushima Island, Nagasaki Prefecture, Japan, near Tsushima City, in 2012. Impacts of V. velutina in Japan have not been observed at this time (Ueno 2014). In approximately 2004, likely due to the increase of international trade, V. velutina became invasive in France (Arca et al. 2015). The impacts of the Asian hornet in France, where these wasps have no natural predators, no known pathogens, and prey with no natural defenses, have been even more devastating than those in Korea, as Asian honeybees (Apis cerana) have evolved defensive strategies against Asian hornets (Choi et al. 2012). V. velutina has experienced explosive growth in France, with the front of the invasion wave sweeping northward at an average rate of 60 km per year (Monceau et al. 2014), much faster than the 12 km rate of spread observed in South 1 Korea (Choi et al. 2012), though the reason for this faster spread is not understood at this time (Monceau et al. 2012; Robinet et al. 2017). In recent years, the French V. velutia population has begun to spread into the Iberian and Italian Peninsulas (Bertolini et al. 2016). In 2016, V. velutina was observed in Gloucestershire, UK: Keeling et al. (2017) speculate V. velutina crossed the English Channel by flight from France. The international trade that probably brought V. velutina to France will likely bring V. velutina to North America (Canada and the United States) and, if it is just a matter of time until V. velutina becomes in North America, then it behooves American and Canadian authorities to investigate the possibilities, and formulate counterstrategies. This study is intended to be an important first step in that process. 1.1-The Life History of Vespa velutina V. velutina’s life cycle begins when a fertilized gyne emerges from winter hibernation and builds a primary nest. The new queen then lays its first round of eggs in the primary nest. Once these first-generation workers have reached maturity, the colony then builds a much larger secondary nest. In the secondary nests, more workers and males come to maturity, and the queen begins laying gynes, potential future queens. Adults consume carbohydrates, chiefly nectar, but the adults feed the larvae with proteins, chiefly proteins from insects. One third to two thirds of this protein come from honeybees. As autumn comes to an end, the gynes migrate out of the nest, mate, and find 2 a place to hibernate for the winter as the parent colony dies, and the cycle begins again (Monceau et al. 2014). This life cycle is summarized in Figure 1. V. velutina does best in urban and suburban environments where human structures likely provide safe microhabitat for overwintering, and primary nest construction (Monceau et al. 2014). In Busan, South Korea, V. velutina nests were observed under the eaves of buildings, possibly due to the lack of tall trees near the center of the city (Choi et al. 2012). In France, they have been observed in the rafters of sheds (Monceau et al. 2014). V. velutina’s secondary nests are often found in proximity to rivers, through the reason for this preference is unknown at this time (Bessa et al. 2015). Data from the French population (Franklin et al. 2016) suggests that V. velutina does quite well where there is an active fishing industry, as V. velutina seems to have an affinity for seafood protein. The native range of V. velutina is quite warm, but the population data from Korea suggests that V. velutina can do well in more temperate climatic zones (Choi et al. 2017). This ability to adapt to cooler temperatures may be due a preference for highland areas in the tropical native range (Bessa et al. 2015). In the Iberian peninsula, Bessa et al. (2015) observe V. velutina occurring at a temperature range of 15.2℃ - 30.2 ℃. The one environmental condition seemingly repulsive to V. velutina is aridity: V. velutina was not observed in the arid regions of southern Spain, and were only observed to occur in regions with annual rainfall ranging from 410 mm – 1572 mm (Bessa et al. 2015). Thus, the ideal location for a V. velutina population might be a warm, humid port city located at a river mouth with a bustling seafood trade and expansive suburban and agricultural development outside the city. 3 1.2-Invasion Biology and the Asian Hornet Invasion is a biological and ecological process. It is therefore necessary to discuss invasion biology theory generally, and apply that theory to the context of V. velutina. Firstly, an invasive species is a non-native species that has substantial potential to grow, spread and outcompete endemic species in the introduced habitat: while all invasive species are non-native species, most non-native species are not invasive species (Sakai et al. 2001). Sakai, et al. (2001) identify four major stages of biological invasion: Transportation, Establishment, Growth and Spread, and Impact. The transportation phase is defined as the period during which the non-native species is transferred from its previous habitat (to which the species may have been native, though the species could have been non-native or invasive there as well) to the new (“introduced”) habitat (Sakai et al. 2001). When this new habitat is far from the species’ native range, transportation usually happens because of human economic activity, either intentionally (e.g. an exotic plant brought for cultivation for sale as ornamentation (Sakai et al. 2001)) or unintentionally, as V. velutina was likely transported to France in a shipping container (Arca et al. 2015). As trade frequency and efficiency have increased, biological invasions have increased through these economic pathways (Hulme 2009). Hulme (2009) identifies ten important factors associated with working out the pathways of invasion: a) the strength of the association between the species and the vector (commodity, mode of transport, etc.) at the point of export; b) volume of vector imports at the point of interest; c) frequency of importation; d) survivorship and growth 4 during transport; e) suitability of the importing point to species establishment; f) appropriateness of the time of year for the establishment of the species; g) the ease of containing the species within the vector; h) effectiveness of management measures; i) distribution of the vector post importation; and j) the likelihood of post-importation transport to suitable habitat. Unfortunately, there is no literature on what commodities may be favored by V. velutina, and while the vector of invasion in the French case has been speculated to be shipping containers (Arca et al. 2015), this is not certain. There is also no literature on the validity of control measures while a V. velutina gyne is in transit. While it is not possible to ascertain the rate of survivorship in transit, it can be inferred from the life history of V. velutina that growth in transit is not very likely: Growth in colonies occurs only once a year, and supporting a nest without a reliable source of protein will not be possible. It can be likewise inferred from life history that the most dangerous time for transit would be in late winter or early spring: From late spring through early summer, the gyne is building her primary nest; from the middle of summer through early fall, the queen is growing her secondary nest; and in late fall, the colony is in the process of die-back, and the offspring gynes are in the process of mating and preparing for hibernation. If transportation takes place too early in the winter, e.g November or December, the gyne may not survive exposure to the winter in the introduced habitat. If some material containing a hibernating gyne is transported in late winter or early spring, the gyne could take full advantage of settling the introduced habitat at the right time in its life cycle. The biology of V. velutina in relation to Hulme’s criteria is summarized in Table 1. 5 In the establishment phase, after the non-native species has disembarked its vector and settled in the introduced habitat, the non-native species produces its first generation. If the introduced habitat is not suitable, the invader will not be able to effectively establish itself and go to extinction. It is common for an invasive population to remain in the area of establishment for several generations before spreading to new areas-known in the literature as lag time - due to adaptation, inbreeding depression, and the need to reach some critical population density (Sakai et al. 2001). The critical population density of V. velutina is currently unknown, but this population-density dependent growth and spread is known as the Alee effect, the decrease in per capita growth due to lower population density. The Alee effect has numerous potential causes: inbreeding depression (lower fecundity due to high levels of inbreeding); lack of available mates; and a lack of sufficient competition to incentivize emigration from the establishment habitat, inter alia (Monceau et al. 2014). In V. velutina, inbreeding depression has been observed in the invasive population in the form of diploid males, though there is no evidence that this effect negatively impacted their rate of spread in France. Hymenopteran males are usually haploid, as hymenopterans exhibit haplodiploid sex determination (Daurrouzet et al. 2015). What habitats are suitable for establishment of V. velutina? Given that cities are ideal environments for V. velutina (Choi et al. 2012; Franklin et al. 2016), and most ports are located in large cities, it is reasonable to conclude that any port which is sufficiently warm and wet (Bessa et al. 2015) would suit an invasive V. velutina population. 6 In the growth and spread phase of the invasion, the population moves out of its newly established range and expands to a new range. This can happen because of the organism’s innate ability to spread in the new environment or by human-mediated transport (Sakai et al. 2001). V. velutina gynes are capable of flying up to 200 km to establish new nest locations, and human-mediated dispersal has been discussed as a possibility for rapid and incongruous jumps in range in the context of the French population (Robinet et al. 2017). Spread and growth of the population is not merely a function of whether a habitat is suitable or unsuitable, but also how that suitability affects growth and spread: Growth and Spread must be interpreted as a function of habitat suitability as a continuous variable, rather than a binary variable. Mathematical, population-dynamical models have been historically used to describe how a population reproduces and migrates during the growth and spread phase (Archer 1985; Franklin et al. 2017; Keeling et al. 2017; Renshaw 1991; Robinet et al. 2016; Shigesada and Kawasaki 1997; Varley et al. 1973). These models are discussed at length in Section 1.5. Finally, after the invasive population has increased in number and expanded into its new, more extensive range, it is necessary to consider the ecological and economic impacts of the invasion, and if countermeasures have not yet been employed, countermeasures must be considered to stop further losses and further growth and spread of the invasive population. These calculations constitute the impact phase of the biological invasion model (Sakai, et al. 2001). At this time, there is no widely agreed upon way to quantify ecological impacts. Much work has been done in recent years to integrate ecological concerns into economic 7 calculations (Gowdy and Erickson 2005). A considerable problem for ecological economics is adapting ecological concepts to the assumptions of neoclassical economics: neoclassical economics assumes value-monism, that all values in a system are reducible to a single unit of account (to wit, everything can be assigned a currency value), and assumes that consumers make rational choices based on self-interest. Gowdy and Erickson (2005) propose making allowances for a plurality of valuations, and for the consumer to be viewed as a citizen making responsible choices, but these augmentations do nothing to improve the ease of calculation. There is not much literature on how ecological-economic impact can be calculated. For example, Limnois et al. (2009) provide one of the few examples of explicit calculation: the authors demonstrate a method to calculate the ecological impacts of the production of goods. Regarding invasive species specifically, Zhang and Boyle (2010) relate the impact of invasive waterfowl to the decline in lakefront property values. In an exhaustive survey of the ecological-economic impacts of invasive species, Pimentel et al. (2005) note that calculating the ecological-economic impact of invasive species is nearly impossible, and most of the impacts calculated are from losses to agriculture or removal cost. The ecological economics of invasive species requires some mapping from some ecological unit to a financial unit. Even if a mapping existed for V. velutina, data on ecological impacts of V. velutina in Europe or Asia would be required to project those impacts onto North America, and no such data on ecological impacts in Europe and Asia exist in the literature. Fortunately, there is literature on the agricultural impacts of V. velutina, making agricultural impact calculations possible. 8 Pollination services contributed $10.95 billion to US agriculture in 2009, the last year for which there is data (Hein 2009). While pollination is doubtlessly important to wild plants, the value of pollinators to the ecosystem is impossible to calculate precisely (Monceau et al. 2014). Surveys of beekeepers in invaded regions found a total honeybee hive loss (complete destruction of the hive) of between 5.0-7.5% (Monceau et al. 2014). Colony losses are usually calculated once a year in the spring, and lost colonies are replaced, usually by brood splitting (Smith 2013). Apiaries in the United States produced $353 million in honey production in 2016, the most recent year for which there is data (FAOSTAT 2020). Higher resolution on impacts of honey production is possible: forty states have published honey production in those states for the year 2017 (Flottum 2017). Finally, V. velutina has been responsible for three human deaths, but it is impossible to calculate the probability of death by V. velutina sting due to a lack of comprehensive envenomation statistics (Monceau et al. 2014). These impacts must be mitigated with countermeasures. The primary countermeasure used in both South Korea and France is nest destruction. This countermeasure requires involvement from the general public: nest sightings must be reported to authorities, who then investigate. In France, nearly a third of reports are false alarms. Nest destruction is conducted by either government authorities or private companies. The costs of nest removal ranged from €130-€500 per nest in 2013. Trapping has also been employed, but trapping is optimally efficient when a trapped wasp dosed with a pesticide returns to the nest, killing the nest. This requires a pesticide 9 sufficiently specific to kill the nest without killing other insects in the environment. Such a pesticide has not been found in the case of the V. velutina (Monceau et al. 2014). Sakai et al. (2001) observe that invasiveness is not merely a function of habitat suitability and mechanical growth and spread. Invasiveness is fundamentally an evolutionary question. Invasive species are thrust into new environments. How quickly can the new species adapt? If invasive species begin with a small initial population, how susceptible are those populations to the founder effect and inbreeding depression? How does interspecies competition effect invasion, and how do pathogens effect the invader? Studying the ecology and evolutionary natural history of V. velutina, and of vespids generally, may provide insight into the V. velutina invasion. 1.3-Ecology and Natural History of the Asian Hornet V. velutina has no known natural predators in the European or Asian ranges, and no pathogens are known to effect V. velutina in the European range, though there have been anecdotally observed instances of predation by the European Honey Buzzard (Pernis apivorus), and the domestic chicken (Gallus gallus domesticus) (Monceau et al. 2014). V. velutina sampled in China tested positive for the Israeli Acute Paralysis Virus (IAPV), acquired from feeding on infected A. melifera, though the virus has no known effect on the infected V. velutina (Yañez et al. 2012) While it is not possible to say what predators and pathogens will affect V. velutina in North America, it is reasonable to assume the North American case will be analogous 10 to the European case: no predator or pathogen will have a substantial effect on V. velutina in North America. The main prey of V. velutina in its native range is the Asian honeybee (Apis cerana,) (Tan et al. 2005). A. cerana has developed a robust system of defense from V. velutina . V. velutina hunts primarily by sending scouts to identify prey items. When a worker bee is identified by the scout, the scout follows the bee back to the hive, and secretes a pheromone on the entrance of the hive. The scout then returns to her nest, and the V. velutina nest assaults the marked beehive. This practice is known as bee hawking (Tan et al. 2005). The beehive has little hope of surviving the assault. The hive’s only hope is to take out the scout before she can recruit her sisters. When the worker bee detects the V. velutina scout stalking her, the worker may send the scout an “I-See-You" signal, for example a unique wingbeat, which may cause the scout to disengage from the hunt (Tan et al. 2012). The worker bee’s physiology changes, releasing pheromones and triggering a pheromone increase put out by guard bees, rallying the whole hive and bringing reinforcements to the guard bee position. The bees begin raising their body temperature by vibrating their wings, which also spreads alert pheromones (Tan et al. 2005). The physiology of all the bees in the hive changes. When the scout arrives at the beehive, she is ambushed by the hive. The hive forms a ball around the scout, and the bees begin vibrating their wings (Tan et al. 2005). The temperature inside the ball raises to 46oc, and the CO2 rises to 4%, which kills the scout but not the bees, who are able to survive the high temperatures. This behavior is known as heat balling (Sugahara 2012; Tan et al. 2005). 11 While some strains of the European honeybee (Apis melifera ligustica) have been known to exhibit similar behavior (though to a lesser degree) in areas with the predatory European hornet (Vespa crabo), most strains of European honeybee are unable to implement this defense, or have only a primitive ability to do so. It is possible that all Apidae have the genetic basis to evolve this defense, as heat balling has also been observed in Apis dorsata (Tan et al. 2005). This may account in part for the much faster rate of spread of V. velutina in Europe compared to the Korean population: Korea has an A. cerana population (Choi et al. 2012). It is reasonable to believe that the relative lack of A. cerana in North America would similarly affect an invasive V. velutina population in North America. The evolutionary history of the Asian hornet is included here for completeness. Hymenopterans first evolved in the Triassic period, with sawflies (suborder Symphyta) being the most basal form. The suborder Apocrita, containing wasps, bees and ants, arose between late Triassic and early Jurassic. Bees and ants split from the wasp lineage in the early Cretaceous. (Ward 2014). The most well-known and controversial issue in hymenopteran evolution is the evolution of eusociality. Eusociality is an arrangement where adults in social organisms are divided into reproductive castes: some members reproduce, while other members raise the brood of the reproductive members, without undergoing reproduction (Nowak et al.2010). Eusociality affords some flexibility in adapting to new environments through social intelligence (Moller 1996), so an understanding of how eusociality evolved may provide insight into this valuable technique for invasion survival. The classic model of eusocial evolution is the Hamilton model, also called the inclusive fitness model (Nowak, 12 et al. 2010; Queller, 2011). According to this model, ancestral hymenopterans developed an altruistic phenotype, which caused the hymenopterans who express the altruistic phenotype to give up their reproductive fitness for their siblings. These closely related siblings had a probability of carrying the allele for altruism, and as such the phenotype survives according to Hamilton’s inequality, to wit: the product of the fitness benefit to the beneficiary and the relatedness coefficient between the altruist and the beneficiary, is greater than the fitness cost to the altruist. There is evidence for the accuracy of Hamilton’s model. Waibel et al. (2011) simulated the evolution of altruism using artificially intelligent robots, capable of foraging for food and given the choice to share food or keep it to themselves. The authors found that, in every case (of 500 simulations) where altruism evolved, Hamilton’s Rule predicted it. However, even among hymenopterans, relatedness is no guarantee of eusociality or even altruistic behavior: for example, males of the fig wasp (family Agaonidae) compete for mates aggressively regardless of relatedness (West et al. 2001). Nowak, et al. (2010) propose an alternative model for the evolution of eusociality. The authors –including the eminent E.O. Wilson- note that, in every known instance of eusociality, the organism builds a defensible nest. If some of the offspring of the founder develop a mutation that causes them to not leave the nest, the founder will have more aide in defending and maintaining the nest, and rearing future young. The “eusociality” gene emerges in some of the species, and the mutants help their peers without respect to relatedness (Hughes, et al. 2008). . The authors concede that, while relatedness was important, it becomes the enforcer and consequence of this behavior, rather than the driver (Nowak, et al. 2010; Wilson and Wilson. 2007). This model has been extremely 13 controversial (Abbot, et al. 2011), and has been undermined experimentally: Genetic analysis of hymenopteran lineages finds that that eight of the nine eusocial lineages practiced monandry (females mating with only one male), a practice that maximizes relatedness (Hughes et al. 2008). A third hypothesis is developed by Wheeler (1986), Hunt and Amdam (2005), Toth et al. (2007), and Jeanne and Suryanarayanan (2011). Caste determination is a developmental question. It is believed that caste determination occurs by developmental ques passed from mother to daughter during larval feeding, including chemical ques, food quantity (Hunt and Amdam 2005), and vibrational cues (Toth et al. 2007). Many species of hymenopterans practice bivoltinism, or the laying of two brood during a single season. Many insects enter stages of diapause, or developmental delay. It may be that ancestral wasps could suppress the reproductive potential of their daughters by inducing diapause in the early brood. On this view, eusociality might have evolved as a form of social parasitism by queens on their daughters, a “Bad Mother” hypothesis. Under social parasitism, benefit accrue to the parasite by exploiting the social structure of a community. Social parasitism has been observed in hymenopterans: Slave-maker ants (e.g. Harpagoxenus sublaevis) drive off adults from a host colony, and use the host brood to serve the queen of the slave-maker ants (D’Ettorre and Heinze 2001). Whatever the ultimate cause of hymenopteran eusociality evolution was, hymenopteran capacity to implement this strategy makes them formidable invaders. 14 1.4-Invasion as Habitat Distribution: Niche Modeling The suitability of a habitat to an organism is easy to determine if the organism is already living there. How does one determine the suitability of habitat where an organism has never lived, but one day might? Niche modeling attempts to quantify the suitability of habitat for an organism. First, variables relevant to habitat suitability must be identified from the literature: for example, rainfall, average temperature, flora and fauna, etc. Secondly, habitats where the organism lives (and does not live) must be identified. Thirdly, a method of comparing habitats based on the environmental variables must be developed. Finally, variables from occurrence habitats and candidate habitats are compared using the developed model (Villemant et al. 2011). Statistical regression is the most common model of comparison. The relationship between variables and habitat suitability is assumed to fit some function with coefficients. These coefficients are then estimated computationally, finding the coefficient values that minimize the error in the model. The function is then used to project suitability of the new habitat (Villemant et al. 2011). For the sake of example, let the relationship between habitat and n variables Vn, be linear: the probability of occurrence increases directly to the sum of variables with coefficients (Equation 1): 𝑃(𝑂𝑐𝑐𝑢𝑟𝑟𝑒𝑛𝑐𝑒) = 𝐵 + 𝐶/ 𝑉/ + 𝐶1 𝑉1 + ⋯ + 𝐶3 𝑉3 where B is the y-intercept, the probability of occurrence when all values V1,…,Vn =0, and C1,…,Cn are the regression coefficients. Relationships need not be 15 linear: the relationships between probability and variables may assume any function (Villemant et al. 2011). 1.5-Invasion as Population Dynamics: Growth and Spread The standard model of the growth and spread of invasive species is the FisherSkellam reaction-diffusion model (Shigesada and Kawasaki 1997), which has been wellsupported in the study of V. velutina in France (Robinet et al. 2017). Section 1.5.1 contains a derivation of the Fisher-Skellam model. There are limitations to the FisherSkellam model: the model is deterministic-the model says the population will definitely be of this density at this time and place- while real organisms vary with some probability (Renshaw 1991); and the model assumes continuous growth and spread, while vespids spread during discrete times and grow in discrete generations (Archer 1985). To counter the stochasticity objection, Keeling et al. (2017) and Franklin et al. (2017) develop stochastic models of V. velutina growth and spread. The work of these authors is detailed in section 1.5.2. In response to the discreteness objection, Archer (1985) develops a discrete model of vespid growth, discussed in section 1.5.3. To model discrete spread of vespids, this study develops a spread model using the mathematical object known as the Markov Chain, a set of weighted nodes, and a set of paths weighted with the probability of moving from one node to another (Hill et al. 2004), discussed in section 1.5.4. The stochastic models developed by Keeling et al. (2017) and Franklin et al. (2017) (section 1.5.2) require positions of observed nests as inputs. In North America, there are no observed nests. To use the Keeling and Frankel models in North America, it would be 16 necessary to lay down some initial probability distribution. The Fisher-Skellam model can take a probability distribution as an input, meaning the output of a Fisher-Skellam model taking a probability distribution as an input can itself be interpreted as a probability distribution: thus, despite being a deterministic model, the Fisher-Skellam model can account for stochasticity, given the proper input (Shigesada and Kawasaki 1997). The Markov Chain object contains a probability component, thereby also accounting for stochasticity (Hill et al. 2004). Given the stochasticity implicit in the Fisher-Skellam model and the Markov Chain model, the Keeling and Frankel models are presented here for the sake of completeness: only the Fisher-Skellam model and the Archer-Markov models were used in this study. All Equations described in this study are summarized in Table 2. 1.5.1-Continuous Growth and Spread The growth of a population over the time interval [t, t+1] will depend in part of the population at time t: the bigger the population, the more reproduction to contribute to the population at time t+1. The population will also depend in part on the average reproductive contribution of each member of the population, the intrinsic rate of growth. For eusocial organisms like V. velutina, it is sufficient to count the growth of nests, the reproductive unit. If no other factors (e.g. death) affect the rate of growth over the time interval [t,t+1], then the population N at time t+1 is given by (Equation 2): 𝑁56/ = 𝑁5 + 𝑟𝑁5 where N is the population, and r is the innate rate of growth. This sort of growth is discrete. Growth is measured in discrete time units, often the time of whole generations. 17 If the time interval, rather than consisting of discrete time units, is the variable interval [t, t+h], and h is allowed to approach zero, then the limit becomes (Equation 3): 𝑑𝑁 = 𝑟𝑁 𝑑𝑡 This differential equation describes the exponential (or Malthusian) growth curve. No population experiences literal continuous growth, but for large, rapidly reproducing populations for very small values of t, this curve is a useful model (Renshaw 1991). As t increases, N tends to hit restrictions in its growth. This restriction, the maximal population an environment can sustain, is known as carrying capacity K. For these large values of t, the rate of growth is also proportional to the difference between the current population and the carrying capacity. The resulting equation describes a bound or logistic growth curve (Renshaw 1991) (Equation 4): 𝑁 𝑁 ∗ = 𝑟𝑁(1 − ) 𝐾 This paper adopts the notation Y* to mean the derivative of Y with respect to time, the complete derivative for univariate functions, and the partial derivative for multivariate functions. Invasive populations do not merely grow. Invasive populations also spread. Let U(x,y,t) be the concentration of the population at the point (x,y) at time t. If the population is spreading only, then the rate of change in U is driven entirely the rate of diffusion. If it is assumed that U at t=0 follows a Gaussian distribution around the origin, then the rate of diffusion from the origin is given by the Heat Equation (Equation 5): 𝑈 ∗ = 𝐷( 𝜕1𝑈 𝜕1𝑈 + ) 𝜕𝑥 1 𝜕𝑦 1 18 D is the innate rate of diffusion, in units of area per time (Shigesada and Kawasaki 1997). If a population is both growing and spreading, then the rate of concentration change will be the sum of the rate of diffusion and the rate of growth (Equation 6): 𝜕1𝑈 𝜕1𝑈 𝑈∗ = 𝐷 B 1 + 1 C + 𝑁 ∗ 𝜕𝑥 𝜕𝑦 Expanding this equation with the logistic growth equation (Equation 4), gives (Equation 7): 𝜕1𝑈 𝜕1𝑈 𝑁 𝑈 = 𝐷 B 1 + 1 C + 𝑟𝑁(1 − ) 𝜕𝑥 𝜕𝑦 𝐾 ∗ Equation 7 is known as the Fisher-Skellam equation (Shigesada and Kawasaki 1997). This model assumes continuous growth and continuous spread. A population may grow only during a discrete mating season, and may spread only during a discrete migratory season. Such a model would therefore be inaccurate for those populations. This will be addressed in sections 1.5.3 and 1.5.4. There is another assumption which limits the effectiveness of the Fisher-Skellam model: The population in this model is growing and moving over a homogeneous, smooth environment. The population everywhere has the same growth rate and the same rate of spread. Heterogeneous or patched models assume a different value of r, D and K at each point (x,y) (Equation 8): 𝜕1𝑈 𝜕1𝑈 𝑁 𝑈 ∗ = 𝐷(𝑥, 𝑦) B 1 + 1 C + 𝑟(𝑥, 𝑦)𝑁(1 − ) 𝜕𝑥 𝜕𝑦 𝐾(𝑥, 𝑦) 19 The smooth version of Fisher-Skellam has algebraic solutions, but the patched version requires numerical methods to solve. Numerical solutions to differential equations require specifying starting conditions. The starting conditions are described with the three-dimensional Gaussian distribution (Shigesada and Kawasaki 1997), the general formula for which is given by (Equation 9): / HJLM J BK O 1 𝑃(𝑥, 𝑦) = 𝑒 1 NM 2𝜋𝜎H 𝜎I P 6Q IJLR P S C NR where 𝜇H and 𝜎H are the mean and standard deviation in the x direction, respectively, and 𝜇I and 𝜎I are the mean and standard deviation in the y direction, respectively. 1.5.2- Stochastic Growth and Spread It may be objected that the Fisher-Skellam model is too deterministic. The model predicts that the population density must be some value at time t and point (x,y). Real organisms, it might be argued, do not behave so deterministically. The growth and movement of real organisms, it might be argued, should be measured stochastically. Franklin et al. (2016) conducted a lengthy study on the V. velutina population in the vicinity of the French seaside commune of Andernos-les-Bains. On the basis of these data, the authors formulate a stochastic model for growth as a Poisson Distribution with a mean, an expected population value, at the ratio of prior population growth to the prior population’s density dependence, to wit (Equation 10): 𝑟𝑁5J/ 𝑁5 = 𝑃𝑜𝑖𝑠𝑠𝑜𝑛 X Y 𝑁5J/ 1+ 𝐾 20 Keeling et al. (2017) extended this stochastic model to describe the stochastic spread of the V. velutina population, and model the spread of the population into Great Britain. Let Pjk be the probability that a queen spawned at location j will establish a nest at location k (Equation 11): 𝑃Z[ = 𝑒 J \]^ L 𝑇𝑒𝑟𝑟𝑎𝑖𝑛 [ where ΔZ[ is the distance between the two locations, µ is the mean flight distance, and Terraini is the suitability of the ith terrain. The suitability of the ith terrain is given by Franklin et al. (2016), where terrain is defined by type: city, park, coniferous forest, deciduous forest, and shrubland. This suitability varies between terrains, e.g. deciduous forests are better than coniferous ones. The probability that a nest existed at the ith location in year y, given that nests were discovered at locations r1 and r2 in year y+1, and assuming there is no density dependent competition, is given by (Equation 12): 𝑃(𝑖, 𝑦) = 𝑛𝑟𝐸d 𝑓(𝑙𝑎𝑡d ) g𝑒 J \hij L 𝑇𝑒𝑟𝑟𝑎𝑖𝑛 kj l g𝑒 J \hiP L 𝑇𝑒𝑟𝑟𝑎𝑖𝑛 kP l where n is a normalizing parameter, r is the growth rate, Ei is the effect of local habitat as defined by habitat distribution modeling (see section IV of this introduction), f(lati) is the effect of the position of the ith location (latitude, in this formulation), and e is Euler’s constant (approximately 2.718). The authors then define the probability that a nest exists at location q at time y+1 given that a nest existed at location j in year y given p environments (again, assuming no density dependent competition) is (Equation 13): 21 g𝑒 J \hp L 𝑃(𝑞|𝑗, 𝑦 + 1) = 𝑛𝑟𝐸d 𝑓(𝑙𝑎𝑡d ) ∑𝑒 J 𝑇𝑒𝑟𝑟𝑎𝑖𝑛q l \hs L 𝑇𝑒𝑟𝑟𝑎𝑖𝑛 t The authors then compute the probability of finding a nest at location q and year y+1 by integrating the above probabilities over all k nests, given the posterior probability distribution of r, Post(r) (Equation 14): 𝑃(𝑞|𝑦 + 1) = u 𝑃𝑜𝑠𝑡(𝑟) vw 𝑝d I y 𝑝q|d I6/ While these stochastic models may provide useful insight, they are presented here for the sake of completeness. These models require observations of actual nests (Keeling et al. 2017), which would not be possible in North America, as there are presently no nests in North America. The Gaussian nature of the initial conditions of the FisherSkellam model, and the stochastic nature of the Markov Chain discrete spread model (Section 1.5.3; Section 1.5.4) already account for the randomness found in nature. 1.5.3-Discrete Growth As discussed in Section 1.1, V. velutina undergoes discrete growth and spread. It will be therefore necessary to refine the models further, in order to more closely tailor them to the biology of the Asian hornet. Archer (1985) proposes a model for the discrete growth of eusocial wasps in the related genus Vespula (Equation 15): 𝑁56/ = 𝑁5 𝑄𝑆 22 Where N is the number of nests, Q is the number of gynes produced per hive, and S is the fraction of Q which survives to the spring. Because the population of the previous generation has completely died off by time t+1, the generations do not sum. Rome et al. (2015) offers the first experimental measure of Q. Each V. velutina hive produces an average of 560 gynes. This measure of Q was obtained by sampling and freezing V. velutina nests throughout the year, and dissecting the nests. The value of S has not yet been determined for V. velutina, but Archer (1985) gives a value of S for the related genus Vespula: S=0.02. The value of S=0.02 for Vespula was confirmed by Plunkett et al. (1989). The literature is silent on values of S for the Vespa genus. The Archer model, like the Fisher-Skellam model, also assumes that populations have a smooth growth rate. It is entirely possible that Asian hornets may produce fewer gynes under austere conditions, and that fewer gynes survive overwintering in austere conditions, but there is no literature to suggest that Q varies with environmental conditions, while the work of Monceau et al. (2014) and Bessa, et al. (2015) implies that overwintering survival S may vary with environmental conditions (Section 1.1). For these reasons, and for simplicity of calculation, the Archer model must be patched: it may be assumed that Q is a constant and all variation is due to S (Equation 16): 𝑁56/ = 𝑁5 𝑄𝑆(𝑥, 𝑦) 23 1.5.4-Discrete Spread: Markov Chains The Archer Model offers a model of vespid growth rooted in the biology of vespids, but hitherto there has been no proposed model of spread based on the biology of vespids. This paper proposes the use of Markov Chains to model the spread of a discretely growing, discretely spreading population. A Markov Chain is a graph (a mathematical object comprising a set of vertices and a set of edges) wherein the vertices (“nodes”), and their weights, represent states, while the weight of the directional edges represent the probabilities of state change. For example, the nodes of a Markov Chain can represent the nucleotide bases (adenine, guanine, cytosine, thymine) and the edges of the graph represent the probability of mutation from one base to another (Hill et al. 2004). Consider a population that undergoes a distinct growth phase and a distinct spread phase. Let E0 , E1 , E2 , … En be the discrete environments over which the population will range, beginning at E0 and spreading to new environments during each spread phase. Let ri be the growth rate of the population contained entirely within the ith environment and, if j and k are environments, then let Ojk be the probability of a member of the population at the jth environment will move to the kth environment during the spread phase. Let Oii denote the probability that a member of the population in the ith environment remains at the ith environment. Finally, let Pjk denote the edge, the “path” from environments j to k. From these definitions, the Markov Chain M may be formally defined (Equation 17): 𝑀 = [𝐸d ] ∪ [𝑟d ] ∪ €𝑃Z[ • ∪ €𝑂Z[ • 24 Where [𝐸d ] is the set of environments, [𝑟d ] is the set of growth rates at each environment, €𝑃Z[ • is the set of paths between environments and [Ojk] is the set of the probabilities of movement along the paths between the environments. Let 𝑁(𝑒, 𝑡) be the population at environment e and at time t. This function may be defined piecewise. If t falls during the growth phase, a modified version of equations 15 and 16 may be used (Equation 18): 𝑁 = 𝑁‚ 𝑟ƒ where Ns is the population in environment e at the end of the last spread phase, and re is the intrinsic growth rate at environment e. If t falls during the spread phase, then (Equation 19): [ [ 𝑁(𝑒, 𝑡 + 1) = w 𝑁(𝑖, 𝑡)𝑂dƒ − w 𝑁(𝑒, 𝑡)𝑂ƒd (d„ƒ) d d Where N(i,t) is the population in the ith environment at the end of the last growth phase and Oie is the probability that a member of the ith population will enter environment e. This Markov Chain Model requires only one modification to make it applicable to vespid biology: if t falls during the growth phase (Equation 20): 𝑁 = 𝑁‚ 𝑄𝑆ƒ Growth is here defined according to the Archer Model: Se is the overwintering survival in environment e. Moreover, let h be the interval tg + ts, where tg is the growth phase and ts is the spread phase. As h approaches zero and the number of environments per unit distance approach infinity, N(e,t) will approximate U(x,y,t), a Fisher- 25 Skellam model of any patching. Thus, the Archer-Markov model can itself be used as a numerical approximation. 2-AIMS The purpose of this study is to model the four stages of a hypothetical invasion of North America, both to advance the field of invasion biology and to provide useful information to authorities, that those authorities may plan countermeasures. The transportation phase will be modeled by an analysis of transportation vectors to determine the most-likely method by which V. velutina might make its way to North America. These most-likely channels of invasion might warrant special attention from management authorities. The establishment phase will be modeled using niche analysis to determine the suitability of habitat to V. velutina establishment. Such analysis may tell management authorities what areas are at greatest risk of invasion, areas for managers to focus their efforts, and what areas, being inhospitable, should be given lesser priority among managers, making for the most efficient use of limited time and resources. The Spread phase will be modeled using the continuous Fisher-Skellam model and the discrete Archer-Markov model. The niche analysis will be used as the basis for the patching on both models. If an invasion begun at one port were to produce much greater nest density than others, this port would warrant extra attention from management authorities. Further, if the more traditional Fisher-Skellam model and the more biologically accurate Archer-Markov model produce statistically significantly different 26 predictions of nest density, it will be necessary to work out which model managers should favor, but if the difference in predictions is not statistically significant, then managers may use either model freely. Finally, the impact phase will be modeled by examining impacts in the European range and extrapolating impacts to the hypothetical North American range. It may then be possible for managers to understand the magnitude of the danger, and prepare accordingly. Managers may use these figures to raise awareness, for example. Beekeepers may also use these data to plan financially for the loss resulting from invasion. The conceptual model of invasion theory used in this study is summarized in Figure 2. 3-MATERIALS AND METHODS This study is committed to the principles of the Open Source movement: When possible, software used in this study were those distributed under the Creative Commons License: the source code is freely available and free to be modified (Bonaccorsi and Rossi 2006). Any software developed during this project will similarly be released under the Creative Commons license. In the spirit of citizen science, this study was conducted at no cost, using widely commercially available hardware: an HP laptop with an Intel © CORE i5 7th Gen processor with 4 GB of RAM and running a Windows 10 OS. All scripts were programmed in Python 3.7.4 (Van Rossum et al. 2009), or R 3.6.2 (R Core Team 2019). Scripts for habitat distribution modeling, Fisher-Skellam modeling, Archer-Markov modeling, and basic data processing were composed in Python 3.7.4 (Van Rossum et al. 2009). Arithmetic, statistical, and calculus functions were saved 27 in the module “funx” while matrix manipulation functions were saved in the module “mfunx” (Appendix A). The only third-party Python module used was “random,” a module packaged with base Python (Van Rossum et al. 2009). Additional data processing, statistical tests, and graphics were done in R 3.6.2 (R Core Team 2019). R packages used in this study were “dplyr” (Wickham et al. 2018) for data manipulation, “ggplot2” (Wickham 2016) and “reshape” (Wickham 2007) for graph generation, and “Stargazer” (Hlavac 2018) for table generation. 3.1-Transportation Phase There is no solid information on what vectors will be favored by V. velutina. It has been speculated that V. velutina came to France in a shipping container (Arca et al. 2015), but this is not known with certainty. Given that transport most likely occurs during the hibernation state and given gynes can take advantage of human structures (Monceau et al. 2014), it is entirely conceivable that a gyne could hibernate in some compartment on an oil tanker. Any form of import could be a vector. As such, it is impossible to define a formal vector: transoceanic shipping, airfreight, land travel, or even the luggage of commercial airline passengers could be possible vectors. The relevant factor is the amount of imports from infected sites during the ideal transport phase (The first economic quarter, January through March. See Table 1, and Section 1.2 of the Introduction). Fortunately, the U.S. Census Bureau’s FT900 U.S. International Trade in Goods and Services Report 2020 accounts for imports by value, source, and time, for all potential vectors (transoceanic cargo, commercial air travel, etc.). 28 The FT900 U.S. International Trade in Goods and Services Report 2020 does not contain data on when items were packed for shipping, so for example if a hibernating gyne were packaged in November and not shipped until April, there would be no way to account for it in this model. U.S. Census Bureau statistic on imports were accessed from the FT900 U.S. International Trade in Goods and Services Report 2020 for the first quarter (January through March) from China, South Korea, and France for 2019. Canadian Trade statistics were accessed from the Statistics Canada database (Accessed 2020) for yearly imports from China and the European Union for 2018. Finer resolution on Canadian imports was not available. 3.2-Establishment Phase Occurrence Coordinates for V. velutina were extracted from the Global Biodiversity Information Facility (GBIF) database (GBIF.org, 2019), while climactic variables were extracted from rasters obtained from the WorldClim database (Hijmans et al. 2005). The European Occurrence is presented in Figure 3, while the Asian Occurrence is presented in Figure 4. It was determined that the optimal algorithm for computing occurrence probability would be a logistic regression test performed on the rainfall variable and a temperature variable (Mdziniso, Bloomsburg University, Personal Communication). Since overwinter survival is such an essential component of vespid population growth (Archer 1985), mean yearly winter temperature was chosen for the temperature variable, and aridity being such a critical factor affecting V. velutina habitat suitability (Bessa et 29 al. 2015), average yearly rainfall was chosen for the rainfall variable. Bessa et al. (2015) find a minimal suitable amount of rainfall to be 410 mm per year, and a minimum temperature of 15.2℃ , though this temperature was recorded while the colony was active, and does not reflect overwintering temperature. Rainfall and winter temperature values were sampled at the occurrence points using a script in Python (Appendix A) to generate PRESENCE values. ABSENCE values were extracted by constructing the smallest rectangle possible around the range, and values were sampled at random from the regions of the rectangle containing no known V. velutina presence. The occurrence range combined both the European, North Asian, and South Asian range. Owing to the random sampling of the ABSENCE statistic, there will be some variation between repeated samplings of ABSENCE. Logistic coefficients were approximated using JMP Pro, Version 14.3 (SAS Institute Inc., 1989-2019). The rainfall coefficient was 0.157869, the winter temperature coefficient was 0.0158029, and the intercept coefficient was –5.45707014. These coefficients were used to generate the logistic regression equation (Equation 21): 𝑃(𝑂𝑐𝑐𝑢𝑟𝑟𝑒𝑛𝑐𝑒) = 1 1+ 𝑒 J….‡…ˆ‰ˆ‰/‡6‰.‰/…Š‰1‹Œ6‰./…ˆŠ•‹t Where w is the mean winter temperature and p is the average yearly precipitation. A Python script (Appendix A) was then used to extract the variable values for all points, compute the probability of occurrence using Equation 20, and write those values to a raster file (Figure 5). All probabilities are values between 0.0 (no probability of occurrence) and 1.0 (100% probability of occurrence). All rasters in this paper are standardized using QGIS 3.8.3 (QGIS Development Team 2019) to 1500 by 3600 pixels, each representing one 10th of a degree by one 10th of a degree square of Earth’s 30 surface, an area equal to 36 nautical miles2. See Section 3.3.3 for more information on the standardization process. The program architecture of the Python Script is summarized in Figure 6. 3.3-Growth and Spread Phase The continuous population density U at point (x,y) and time t is given by (Equation 22; from Equation 8): 5 𝜕1𝑈 𝜕1𝑈 𝑈 𝑈(𝑥, 𝑦, 𝑡) = u 𝐷(𝑥, 𝑦) B 1 + 1 C + 𝑟(𝑥, 𝑦)𝑈 Q1 − S 𝜕𝑈 𝜕𝑥 𝜕𝑦 𝐾(𝑥, 𝑦) ‰ This equation requires numerical techniques to solve (Shigesada and Kawasaki 1997). Erdmann (2009) and Herman (2014) offer such numerical techniques. An initial population of 1 nest is assumed. While the exact location of the nest is unknown, it is assumed that the probability of finding a nest is a three-dimensional Gaussian distribution with a mean at the point of introduction. A modified Gaussian distribution (from Equation 9) describes these initial conditions (Equation 23): P / HJ• J BK O 1 𝑈(𝑥, 𝑦, 0) = 𝑒 1 1.‡ 11.52𝜋 IJ‘ P 6K O C 1.‡ where [I,J] is the matrix index representing the starting location of the invasion. See Section 3.3.1 for a calculation of σ. This probability is treated as the initial population density. The values of the patching are calculated from the input of the environmental rasters discussed in Section 3.2 above. An algorithm then cycles rapidly between a growth phase and a spread phase according to the logistic growth equation and the diffusion equation. Five cycles per annum were used, for a total of fifty cycles over a tenyear simulation. In testing the scripts, additional cycles did not produce meaningfully different projections (varying only in decimal values, biologically meaningless for 31 discrete objects like nests) while occupying valuable computer time. The five cycles per annum therefore represent a compromise between accuracy and efficiency. The Program Architecture, sensu lato, of this Growth and Spread simulation is summarized in Figure 7. While this model can account for multiple nests at the same origin (the initial probability distribution is simply multiplied by the number of nests), the model cannot account for nests introduced at two separate locations without specifying a new initial probability distribution. 3.3.1-Refining the Model To calculate the standard deviation, it is assumed that the edge of the invasion wave in the first year (in this case, a radius of 60 km around the central point (Robinet et al. 2017; Monceau et al. 2015;Franklin, et al. 2017) is three standard deviations away from the source, so that 99.9% of the probability (to wit, nest density) lies within that radius. Thus, one standard deviation is a third of that distance, equivalent to 2.4 map units. This value is calculated by taking the square of 60 km, converting this value to map unit area, taking the square root of that value, and dividing that value by three. The diffusion equation itself requires the solution of a derivative of spatial change. A very small amount of discrete spatial change (h) must be specified. This value is arbitrary so long as it is very small (Erdmann 2009). A value of 0.000001 units was specified. The French population data suggest that the front of the Asian hornet territory expands by 60 km per year on average (Monceau et al. 2015: Franklin, et al. 2017). If the population expands by 60 km in the x direction and 60 km in the y, then the area expands at a rate of 3600 km2 per year, or 51.84 map units2, an experimental value for D. There is 32 no literature suggesting the rate of spread is patched, so D was assumed to be constant for all patches (Robinet et al. 2017). Carrying Capacity (K) was calculated from the European population (Robinet et al. 2017) through unit conversion. A K of 7.397 nests per unit was calculated. In testing the prototype draft of the script, after running the first continuous simulation and the first discrete simulation, no population in any cell exceeded the calculated density of 7.397 nests per unit. Varying K would have provided insight into how reaching the maximum population density in an environment would have changed the dynamics of growth and spread, but as no population tested reached carrying capacity, patching K would have offered no new insight while occupying valuable computer time, though studying how varying K affects overall predictions of Fisher-Skellam models may make for interesting future work. Thus, the only patched variable required is growth, r(x,y). The rate of growth,r(x,y), when P(occurrence)=1 was assumed to be the same as that projected by the Archer model, where S is assumed to be the same as in the Vespula genus: 0.02. Logistic growth is approximately equal to Malthusian growth for small values of t (Renshaw 1991). Assuming the continuous and discrete growth rates are approximately equal, and approximately described by Malthusian growth, at least between t0 and t1, r can be calculated as: 1𝑒 /k = 560 × 0.02 = 11.2 ln 𝑒 k = 𝑟 = ln(11.2) = 2.415 Solving for r by taking the natural logarithm of 11.2 gives a value of 2.415. This rvalue causes the number of nests to increase by an order of magnitude per generation. This assumption is reasonable, as the European population increased by approximately an 33 order of magnitude each generation, until reaching carrying capacity (Robinet et al. 2017). This value of r is close to the May Threshold (r >2.692), the value for r at which a logistic equation becomes chaotic. At 2.415, a logistic differential equation becomes cyclical. However, this only applies to logistic differential equations where the population of the previous generation is added to the new generation (May 1974): for V. velutina, the prior population dies off (Monceau et al. 2014). For methods of calculating r(x,y) for minor variables, see section 3.3.5, below. The Fisher-Skellam equation may therefore be simplified to (Equation 24): 5 𝜕1𝑈 𝜕1𝑈 𝑈 𝑈(𝑥, 𝑦, 𝑡) = u 51.84 B 1 + 1 C + 2.415𝑃˜™™šk (𝑥, 𝑦)𝑈 Q1 − S 𝜕𝑈 𝜕𝑥 𝜕𝑦 7.397 ‰ The constant values of Equation 23, their method of calculation, and sources are summarized in Table 3. 3.3.2-Minor Variables In addition to the variables from the establishment analysis (rainfall and overwinter temperature, here combined into the single “occurrence” variable), human population density (Choi et al. 2015), the presence or absence of a major river (Bessa et al. 2015), and elevation (Robinet et al. 2017) are reported as potentially important secondary variables of habitat suitability. Urban and suburban environments (Choi et al. 2015), and environments with a major river (Bessa et al. 2015) benefit V. velutina , while environments with elevation over 791 m are detrimental to V. velutina (Robinet et al. 2017). These secondary variables, henceforth minor variables, were left out of the logistic regression because the literature (Choi et al. 2015; Bessa et al. 2015; Robinet et al. 2017) 34 does not suggest an equal importance to climatic variables in limiting the range of V. velutina, but these variables are accounted for by additional model statements. Elevation data was drawn from the U.S. Geological Survey National Map Digital Elevation Model (Archuleta et al. 2017). River shapefiles were drawn from Natural Earth 1.2 (Kelso and Patterson 2009). Human Population Density will be drawn from NASA Socioeconomic Data and Applications Center (SEDAC) (CIESIN, 2018). These raster and vector files were then standardized to the standard raster format using QGIS 3.8.3 (QGIS Development Team 2019). Human Population Density is presented in Figure 8. Major River Presence is presented in Figure 9. Elevation is presented in Figure 10. 3.3.3-Input Data and Processing The details of the specific models will be discussed thoroughly, but it may be useful to describe the general workflow here. For each model statement, environmental data rasters were loaded into the program. The patching r(x,y) was calculated from these rasters. The script then carried out the appropriate calculations, using Equations 19 and 20 for the discrete simulation, and Equation 22 for the continuous simulation. These calculations output a raster file showing the population density at each point, and a statistics file containing the total number nests after 10 years. This process was carried out for each specified port, and the new model statement was specified. Input rasters were of different dimensions, x by y. Dimensions need to be standardized so that a single coordinate system of matrix indices can be used to pull data from the same cell in multiple raster. If the dimensions of the rasters are too large, the 35 rasters exceed the 4 GB of RAM used in this study. Standardizing raster size to 3600 x by / 1500y allows for each cell to be /‰žŸ of one degree. There is a downside to this standardization: Decreasing the dimensions of a raster decreases the resolution. Consider a raster of a map of a coastline. When the resolution is reduced, the QGIS algorithm must make a choice as to whether a set of pixels containing land and water will now be represented by a single land pixel or a single water pixel. Consequently, a GPS coordinate pair that once represented a land pixel may now represent a water pixel. Several matrix indices for ports represented water pixels after the rasters were standardized. In the standardized rasters, water pixels are assigned NODATA values. Points surrounding the indices were pulled to determine if they were land or water pixels. In these cases, the nearest index representing a land point was used. GPS coordinates were converted to matrix indices using the CoordConvert function in the Python Module “funx” (Appendix A). A Python script (Appendix A) was used to read in the standardized rasters. To simplify the calculations, each raster combination was used to calculate a growth raster, which was then written out and saved to an ASCII text file. These saved ASCII text growth rasters were then loaded into the script, rather than recreating the growth matrix each time. The simulation was then performed for each point in the raster, with a mean at the test point. The resulting raster was then written out, along with a statistics file containing the final nest density at the test point, and a sum of the total population density within a search area of 121 square units (4,356 square nautical miles) centered at the test point. While the script was being developed, Because it was possible for land area at ports to effect nest projections, a control for land area was added. 36 For the discrete simulation, the process is the same as the continuous simulation, except that the alternating growth and spread phases are taken for each year, and spread is defined according to a matrix of distances between points. The probability that a member from one population will emigrate to another is a Gaussian probability of the distance between the first and the second, in standard deviations. More memory for a matrix of all points to be computed was required than the hardware could access, so a smaller matrix of points within 441 square units (15,876 square nautical miles) was constructed for this purpose. 3.3.4-Trial Ports Thirty-two sites were chosen for simulation, 24 test ports and 8 control sites (some of which are not ports: See section 3.3.6). The test sites were chosen by studying a Google Map of the United States and Canada for major port cities. GPS coordinates for these locations were obtained from Google Maps. GPS coordinates were then converted to matrix indices of the standardized raster matrix. The standardization process in QGIS (QGIS Development Team 2019) assigned NODATA values to some of those indices that are on coastal areas. In those cases, the nearest index containing a data value was used instead. The test ports are summarized in Table 4, while control sites are summarized in Table 5. 37 3.3.5-Methods of Calculating r(x,y) Rate of growth for human population density was assumed to be maximal (2.415) in the areas of highest population density, declining linearly as population declines. “Urban” human population density was found to be favorable to V. velutina (Choi et al. 2012), but there is no literature suggesting that increased population density within an urban site has an effect (e.g. between 2,000 people per km2 and 2,500 people per km2) Here, urban populations were defined as those greater than or equal to 2,000 people. Population was corrected so that maximal population density was 2,000 people per km2 . The variable pop_val was defined as Human_Population_Densty(x,y) divided by 2,000. The r-value, r(x,y) for the human population density model was calculated by multiplying pop_val by 2.415. The river presence raster contains binary values: 1 for presence, 0 for absence. V. velutina can clearly grow in the absence of rivers (Bessa et al. 2015), these binary values needed to be altered: a value of growth when river presence=0 needed to be assumed. It was assumed that V. velutina grew half as well in areas without a river. The variable riv_val was defined as 1 when river presence= 1, and riv_val=0.5 when river presence =0. The r-value, r(x,y) for the river presence trial was calculated by multiplying riv_val by 2.415. Elevation acts as a governor, such that growth does not occur above 791 m. The variable elev_val was defined as 1 when elevation(x,y) was less than 791 m, and 0 when elevation was greater than or equal to 791 m. The r-value r(x,y) for the elevation trial was calculated by multiplying elev_val by 2.415. 38 Growth values for trials combining multiple variables were calculated by multiplying the trial variables, e.g. 2.415 * riv_val*elev_val for minor variables, and 2.415*Poccur(x,y)*riv_val*elev_val for trials combining major and minor variables. Model Statements and methods of calculation are summarized in Table 6. 3.3.6-Measurements Fifteen Continuous trials and fifteen Discrete trials were run for 24 test ports and 8 control sites, for a total of 960 simulations. The output of the simulation, for a single port, is a raster representing nest density distribution after ten years, U(x,y,10). The simulation could have been run for any number of years, but ten years was chosen for the duration of the invasion in Europe (Robinet et al. 2017) and South Korea (Choi et al. 2012). Outputting a raster for every year was not feasible, given the amount of computing time and storage required for constructing and saving these rasters. Nest density output is a distribution of decimal values. This output is not useful for statistical analysis: one value, or a small set of values, per port is needed to summarize the trial, and properly test hypotheses. For this purpose, projected nest population after 10 years (N10) was extracted instead. There is only one generation of V. velutina secondary nests per year (Monceau et al. 2014). This extraction was done by summing up nest density within 121 square map units (plus or minus ten map units in the x and y directions, centered at the point of origin). This measurement may not represent the entire N10 population, but the measurement represents the majority of the population, and the measurement allows for quantitative comparisons. Means, standard deviations, minima, maxima, ranges, and z- 39 scores for each port were calculated with a Python script (Appendix A) from these N10 projections. The number of map units, within the 121 square map unit search area, positive for land (rather than ocean) were extracted, and the N10 projection was divided by the total land area to calculate a Mean Nest Density per map unit for each site. Mean nest density allows for comparisons between coastal sites and sites more inland. Statistical tests (see Section 3.3.7) were done with an R (R Core Team 2019) script (Appendix B). 3.3.7-Experimental Controls and Statistical Tests There are three sorts of controls in the growth and spread simulation: controls for land topography, negative controls (sites Albuquerque, NM; Barry County, MI; St. Paul, MN) and positive controls (sites Busan, SK; Montreal, QC; Nerac, FR; Tsushima City, JP; Walhalla, SC). In testing the scripts, a much smaller set of ports, entirely within the contiguous United States, variances in N10 were found that could not be explained by differences in the Major Variable. Interestingly, ports set further inland along rivers, e.g. the river port of Philadelphia, PA, had higher projections of N10 than oceanfront ports, e.g. Charleston, SC. Intuitively, it was possible that more land around the initial invasion point provided more opportunity for settlement and growth. To test this hypothesis, a raster was created with an r(x,y) =1 assigned to each point, and the simulation was run with the prototype draft of the script. Z-Scores were taken and compared. Land control accounted for all this outstanding variance. For this reason, a control trial for land shape, Land Control, was added to the slate of tests. 40 Negative controls denote sites where growth is expected to be much lower than the test sites. Very arid sites and very cold sites are expected to very small N10 projections (see Section 3.3.2), so Albuquerque, NM (very arid) and St. Paul, MN (very cold) were added to negative controls. Sites which are not extremely cold nor extremely arid, but are predicted by the Niche Analysis (see Section 3.2) to be habitat of intermediate quality for V. velutina, are likely to produce less growth than a test site in a region identified by Niche Analysis to be ideal habitat, so Barry County , MI was added to Negative controls (see Section 4.2). Positive controls denote sites where V. velutina is known to occur, or where growth is likely to be very high. The V. velutina invasion in Korea began in Busan (Choi et al. 2012). An invasion in Japan began in 2012 in Tsushima City, Nagasaki Provence (Ueno 2014). While the precise location of the origin of the French invasion is unknown, the first identified specimen was documented in the town of Nerac, Lot-et-Geronne, France (Haxaire et al. 2006). Using these starting sites where an invasion was known to begin, will provide a strong basis for comparison between these models and observed nest populations. Every port except Anchorage, AK is located in highly suitable habitat but abutting water. What might an invasion be like if the invasion began in highly suitable habitat, but completely surrounded by land? For this reason, Walhalla, SC was added to the positive control group. What might an invasion begun in a highly suitable but completely inland port, far from the ocean, look like? Thus, Montreal, QC was added. The addition of these Positive controls is to contextualize the results of the test, e.g. that an invasion beginning at port A was a statistical outlier greater than the mean, but not greater than the control population emanating from Walhalla, SC. 41 This context is useful for certain aspects of hypothesis testing: the hypotheses being statistically tested are 1) that some ports will produce statistically greater or lesser invasive populations; 2) that discrete and continuous simulations will produce statistically similar predictions; 3) that the addition of the minor variables to model statements will not significantly change projections of N10 ; and 4) that the test ports are similar to the positive controls and statistically significantly different from the negative controls. The first hypothesis was tested by examining the test data for statistical outliers. Sites generating an outlier projection of N10 for a given trial were identified by manually looking at z-score output tables (Appendix C) for a z-score of 3.0 or greater. If an outlier was identified using z-score, that site’s outlier status was confirmed by a One-Way ANOVA, conducted in R (R Core Team 2019), comparing that site to all other sites in that site’s trial group, across all trials for both N10 and Mean Nest Density.. To test the second hypothesis, the trial means for the fifteen continuous test trials, and the trial means for the fifteen discrete trials (30 trial means in total) were compared using the Unequal Variance T-Test in base R (R Core Team 2019). Then, the continuous N10 projections for all test sites were compared to the discrete N10 projections for all test sites using an Unequal Variance T-Test, and finally all N10 projections (test group, negative control group, positive control group) for continuous and discrete trials were compared using an Unequal Variance T-Test. In this latter comparison, Walhalla, S.C. provides an upper bound of the distribution of possible N10 projections (see Section 5). To test the third hypothesis, if the minor variables in themselves, or in combination with the major variable Occurrence, produce significantly different results from the major variable alone, a One-Way ANOVA was conducted comparing 42 Occurrence other trial variables for all trial groups. Finally, to test hypothesis four, if the assumptions of the model are valid (for more on model validation, see Section 3.4), the projections of the test group should not differ, in a statistically significant way, from the projections made for the positive control group, but should differ significantly from the projections made for the negative control group. A One-Way ANOVA comparing the test group to the control groups was conducted (R Core Team 2019). By default, R (R Core Team 2019) regards a p-value of 0.05 or less to be statistically significant. For this reason, this study adopts that a p-value of 0.05 or less is statistically significant. 3.4-Impact Phase The literature only provides a means of calculating values for two impacts: the value of implementing nest destruction; and the value of economic loss to the agriculture industry. Monceau (2014) gives a cost of nest destruction ranging from €130 to €500 ($144-$555 in 2020 US Dollars) per nest. These values were multiplied by the range of estimated population growth (from the Occurrence simulation) to arrive at a projected cost range for the complete removal of these nests. Total economic loss is the loss of economic productivity from the loss of pollinator services, honey production, and the loss to beekeepers accrued from replacing lost hives. Monceau et al. (2014) reports a rate of total hive loss from V. velutina attacks at 5%. Heim (2009) estimates pollinator services are worth 5% of the total agricultural production (which was $219 Billion in the US in 2009). The total value of honey produced in 2009 was $147 Million, while production in 2016, the most recent year for 43 which there is data, was $162 million (FAOSTAT 2020). Statistics on the total number of hives in the United States was acquired from the U.S. Department of Agriculture’s National Agricultural Statistics Service (NASS) (2019). Multiplying total hives by 0.05 gives the total number of hives expected to be lost, a 5% loss in honey production, and a 5% loss of pollinator-generated agricultural value. It is possible to calculate impacts on honey production at a finer resolution: forty states have published the value of honey production in the state (Flottrum 2017). A vector map of the United States was obtained from The National Weather Service (Accessed 15 June 2020). This map of the United States was overlaid on the heat map produced from the Habitat Distribution modeling in the Establishment analysis (Figure 11). Whether a state contained suitable habitat for V. velutina was determined by manually looking at Figure 11, and states with suitable habitat were assigned a binary variable value of 1, while states without suitable habitat were given a binary variable value of 0. Honey production for states with a binary variable value of 1 were multiplied by 0.05, to obtain the USD value of the loss. These results were summarized in Table 7. 3.5-Model Verification: The Case of the Centaurea Leafcutting Bee (Megachile apicalis) Megachile apicalis, known as the Centaurea leafcutting bee, or the apical leafcutter bee, is an invasive solitary bee introduced in the area of Santa Barbara, CA in the 1980’s. Since M. apicalis’s introduction, M. apicalis has become invasive throughout the United States, and especially the west coast, making it as far as Washington state in 2003 (Barthell et al. 2003). 44 M. apicalis is trivoltine, laying three nests per season (Kim 1997). Each nest contains approximately 12 brood, of which only 49% percent survive (Hranitz et al. 2009; Hranitz Personal Communication 2020); approximately 40% of offspring are female, due to the greater nutritional need of females (Kim 1997). 102 nests were observed in a field of 3.5 km2 (Kim 1997). M. apicalis may be a useful organism to verify the V. velutina models used. A maximal growth rate for M. apicalis of 1.954 was calculated from fecundity rates (Kim 1997; Hranitz et al. 2009; Hranitz. Personal Communication. 2020). A maximal carrying capacity of 2023.140 nests per map unit was calculated from data provided by Kim (1997). A diffusion rate of 63.722 square map units was calculated from Barthell et al. (2003). Major bioclimactic variables were determined to be average yearly rainfall and mean yearly temperature (Hranitz et al. 2009). A habitat distribution raster was obtained using the open-source habitat distribution modeling software Wallace (Kass et al. 2018). Bioclimactic variables were automatically accessed from WorldClim (Hijmans et al. 2005) by the Wallace package, and M. apicalis occurrence was automatically accessed from GBIF (GBIF.org accessed 26 January 2020) by the Wallace software. Wallace can only run a BioClim model at this time (Kass et al. 2018). The output raster was standardized using QGIS 3.8.3 (QGIS Development Team 2019) as described in Section 3.2). A starting index representing Santa Barbara was chosen at a matrix index of [555,603]. This standardized raster and the constants were entered into the discrete and continuous model scripts. 45 4-RESULTS 4.1-Transportation Phase 105.973 billion US Dollars of commerce were imported from China in the first quarter of 2019, while 19.883 billion were imported from South Korea and 14.139 billion were imported from France during the same period, for a total of 542.791 billion US Dollars in trade imported from countries with source populations of V. velutina during the biologically appropriate period. 5.461% of all US imports by value are potential vectors of transmission. This calculation is summarized in Table 8. In Canada, 82.653 billion Canadian Dollars in goods and services were imported from Europe and 128.300 billion Canadian Dollars were imported from Asia, accounting for 37.383% of Canadian import values (Statistics Canada. Accessed 2019). Assuming an even distribution of trade across months, potentially invasive trade would occur during the first quarter, accounting for 0.25 of that import value, a value of 52.783 billion Canadian Dollars, to wit 9.345% of Canadian import values could be potential vectors of a V. velutina invasion. This calculation is summarized in Table 9. The lack of reliable vectors in the literature prohibits the most effective analyses of the risk of exposure, so these claims must be tempered by their generality. An exposure of 5.461% of import value to the United States shows V. velutina to be a strong candidate for invasion. In Canada, 9.345% of trade import values are potential vectors, but this is less likely to be accurate than the 5.461% for the United States. The Canadian measure includes trade from “Asia” and “Europe” rather than the specific countries with host populations (Statistics Canada. Accessed 2019), and it assumed an equal distribution of 46 trade across economic quarters, which is not necessarily the case. Therefore, it may not be the case that Canada’s exposure is greater than the United States. 4.2-Establishment Phase An occurrence raster was obtained with occurrence probability values were obtained for the whole world, based on the regression performed on the mean yearly rainfall and mean winter temperature. A QGIS projection of this raster can be found in Figure 5 (QGIS Development Team 2019). Much of the west coast of North America is suitable for invasion, and much of the east coast and Gulf of Mexico. The interior of the Southern United States is also highly suitable for invasion, but the center of the continent is not suitable for invasion. A QGIS projection a vector political map of the United States (National Weather Service 2020) overlaid on this raster can be found in Figure 11 (QGIS Development Team 2019). The raster output of the habitat shows that every port studied in the test group was maximally suitable (p(occur)=1.0) for establishment of an invasive V. velutina population, except for Anchorage, AK (p(occur)= 0.048), as a function of the occurrence variable. 4.3-Growth and Spread Phase The projected populations of nests after ten years, N10, the Mean Nest Density, and the z-scores of these populations, for all trials and controls, are presented as Raw Output in Appendix C. Summary statistics of N10 for all trials in the test group can be 47 found in Table 10. Summary statistics of N10 for all trials for the positive control sites ca be found in Table 11. Summary Statistics of N10 for the negative control group can be found in Table 12. Summary Statistics of the Mean Nest Density for the test group, positive control group, and negative control group can be found in Tables 13, 14, and 15, respectively. ANOVA outputs can be found in Table 16. The Growth and Spread phase, in addition to generating predictions, statistically t tested four hypotheses. The first hypothesis was that some of the test sites produce statistically different N10 projections – outliers – from the rest of the test sites. These outliers would then warrant special attention from managers, or could be eliminated as a site of concern. Anchorage, AK was the only outlier from among the test sites (for Continuous Occurrence, N10: 5.454 nests, Mean Nest Density: 0.07575 nests, Z-score: 3.29739), and differed significantly all sites in the test group except San Francisco CA, St. Petersburg FL, and Vancouver BC, per the One-Way ANOVA (R Core Team 2019). The Mean Nest Density ANOVA showed Anchorage, AK to be statistically different from all test sites except San Francisco CA, Savannah GA, St. Petersburg FL, and Vancouver BC. The Second Hypothesis was that continuous and discrete trials would not produce statistically significantly different results. If the two models do not produce different results, it would not matter which model managers used to predict population. The trials were marginally statistically significantly different (p-value= p= 0.02698 ) when comparing just the test group, but that difference went away when all groups together were compared (p-value= 0.1084). 48 The third hypothesis was that the major variable Occurrence would be different from the minor variables. The results of the One-Way ANOVA (R Core Team) Continuous Occurrence (O) differed significantly from all but Human Population Density x River Presence (HR), Occurrence x Human Population Density x River Presence (OHR), Occurrence x Human Population Density x Elevation (OHE) and All Variables (ALL). Discrete Occurrence differed significantly from all discrete trials except Human Population Density (DH), Occurrence x Elevation (DOE), and Occurrence x Human Population Density x Elevation (DOHE). The fourth hypothesis was that the test group would be significantly different from the negative control group, but not the positive control group, tested by a One-Way ANOVA (R Core Team 2019). The test group was not significantly different from the positive control group (p-value=0.301), but did differ significantly from the negative control group (p-value <2e-16). 4.3.1-Continuous Growth The Simulation based on occurrence alone (O) had a mean total population (in the search area) of 222.75 nests after ten years (N10), and a standard deviation of 64.5 nests. Anchorage, AK is an outlier with a population of 5.47 nests (z score: -3.368). The rest range from 164.2 nests (San Fransisco, CA) to 311.96 nests (Portland, OR). Due to the mechanics of the model, in areas of low growth, it is possible for the spread phase to overwhelm the growth phase and produce negative population densities. An illustration of this distribution geometry can be found in Figure 12. This is the case for Anchorage: -2.43 nests at the point of origin. The rest had 5.17 nests at the point of origin. Figure 13 shows the means of all trials, together with the standard deviation. 49 Figure 14 contains the Boxplot distributions of all trials for the test group, positive control group, and negative control group. Figure 15 summarizes the distribution of N10 projections across all sites. Human Population Density produced wildly different results from those of Occurrence, with a mean of 14.698 nests (compare Occurrence, mean: 222.75) and a standard deviation of 29.809 (Occurrence: 64.5). The minimum population was –16.916 nests (Port Charlotte, FL) and the maximum population was 99.880 nests (Los Angeles, CA). None of these values are outliers (z score > ±3.000). In the other trials where Human Population Density is a variable (Trials: OH, HR, HE, OHR, OHE, HRE, ALL), similarly low results were obtained. HR and HRE had identical means (-5.455 nests), standard deviations (6.833 nests), minimum populations (-18.325 nests) both in PortCharlotte, and maximum populations (5.563 nests) in Portland, OR. OHR and ALL had nearly identical means (-7.673 and –7.674 nests respectively), identical standard deviations (4.893 nests), identical minimum populations (-18.453 nests), both at PortCharlotte, FL, and identical maximum populations (0.001 nests) both at San Fransisco, CA. OH and HE also have very similar means (14.698 and 14.819 nests, respectively), similar standard deviation (29.809 and 29.725 nests, respectively), identical minimum populations (-16.916) both at Port Charlotte – Fort Meyers, FL and identical maximum populations (99.869 nests) at Los Angeles, CA. Finally, OHE is much lower than O, with a lower mean (4.689 to 222.75 nests, respectively). Major River Presence also produces lower population projections than the occurrence variable. The calculation based solely on river presence (R) had a mean of 37.702 nests (compared to a mean of 222.75 nests in O), a standard deviation of 50 22.415 nests (compared to a standard deviation of 64.5 nests in O), a minimum population projection of 17.357 nests in Miami FL, and a maximum population projection of 92.180 nests in New Orleans, LA. When river presence is combined with occurrence (OR), the mean is reduced to 17.787 nests, a standard deviation of 13.586 nests, a minimum population of –1.684 nests in Anchorage, AK, and a maximum population of 52.285 nests, also in New Orleans, LA. Elevation had a mixed effect on population projections. Calculations based on elevation alone (E) had a higher mean than O (237.367 to 222.75 nests, respectively), a lower standard deviation (49.010 to 64.5 nests, respectively), a higher minimum population (129.298 to 5.47 nests, respectively), in Vancouver, BC, and a nearly identical maximum population (311.192 to 311.964), both in Portland, OR. The E calculation assigns maximal growth rate (2.415) to all points except those over 791 meters in elevation, so Anchorage Ak is stronger in this trial than it is in O, accounting for the higher mean, smaller standard deviation, and higher minimum population. When occurrence is added to the elevation calculation (OE), the mean is reduced (123.878 nests), the standard deviation is reduced (35.025 nests), the minimum population is nearly identical (17.310 nests) at Anchorage AK, and a reduced maximum population of 174.983 nests, also in Portland, OR. The positive control sites showed a smaller mean (211.362 nests) and larger standard deviation (124.719 nests) in the continuous major variable trial (O). This is likely accounted for by the lower population projection from Tsushima City (27.357 nests). Human Population Density (H) had a low mean (-1.059) and standard deviation (22.111 nests). River Presence (R) had a mean nest density of 53.94 nests with a standard 51 deviation of 39.453/ Elevation (E) had a mean of 220.9579 nests and a standard deviation of 129.3804 nests. Occurrence and Elevation (OE) had a mean of 120.3965 nests and a standard deviation of 70.64999 nests. The remaining combined trials continuous trials had a mean no greater than 52.977 nests and a standard deviation no greater than 39.261 (RE). The negative control sites showed a broader range of values than the test sites in O, with a much smaller minimum projection (-9.400 nests in Albuquerque, NM compared to 5.454 nests in Anchorage, AK), a much larger maximum projection (439. 680 nests in Barry County, MI compared to 311.91 nests in Portland, OR), and a lower Mean (3.244 nests) and Standard Deviation (26.665 nests). The Controls in the minor variables followed a similar pattern as the test sites. The European population grew by approximately an order of magnitude each year until reaching a population of 330 nests (Robinet et al. 2017), while the Korean population grew to 453 nests after ten years of invasion (Choi et al. 2012) The precise nest population near Tsushima City is unknown. (Ueno 2014). Only projections based on Occurrence (O) and Occurrence with Elevation (OE) produced populations in the hundreds. The elevation variable only acts as a cap on growth in OE. The multiplicative proportionality assumption used to integrate the minor variables may account for the failure of these trials to adequately add predictive power. It may be that the minor variables need to be added to the regression model in future analysis. Growth and spread based on a continuous model rooted in the major variable of habitat suitability shows, if no countermeasures are implemented, populations of several hundred nests are possible at all test locations (except Anchorage, AK) within ten 52 years. When Anchorage is removed, the mean prediction rises from 222.745 nests to 232.193 nests. The variance in nest population (excluding Anchorage, which has inferior growth conditions) may be accounted for by the land area surrounding the port. Intuitively, an inland port has more land area and therefore more spaces for nests to colonize, relative to a port located close to the water. Walhalla, SC was included as a control site because the location, like all ports save Anchorage, have p(occur)=1.0, but unlike the ports, Walhalla is landlocked, and the land that surrounds it also has a p(occur) of 1.0. Walhalla has more land area than any port. The occurrence projection for Walhalla was 439.680 nests. This pattern holds for nest density as well: Trial O had a mean of 2.949 nests per unit, with a standard deviation of 0.613. while Trial H had a mean of 0.191 nests per unit, with a standard deviation of 0.396. Trial E had a mean of 3.009 and a standard deviation of 0.239. Occurrence with Elevation (OE) had a mean of 1.637 nests per unit, and a standard deviation of 0.342. The combined continuous trials did not have a mean greater than 0.484 (RE) or a standard deviation greater than 0.396 (HE). 4.3.2-Discrete Growth The geometry the discrete model produces is a rectangle of homogeneously filled cells with an even density of nests. An example of this geometry is provided by Figure 16. The discrete model does not produce negative nest densities: the model rather produces very small nest density projections (e.g. on the order 1.0 * 10-31 nests) in the same cases where the continuous model produces negative nest densities. 53 The discrete model produced similar results to the continuous trials. In the major variable trial DO, the mean population was 289.823 nests, with a standard deviation of 123.440 nests, a minimum projection of 0.138 nests (Anchorage, AK) and a maximal projection of 439.821 nests (Portland, OR). The minor variables, accounting for the difference in the way the models handle cases where spread exceeds growth, showed the same pattern as the continuous variables: trials containing the Human Population Density variable (DH, DOH, DOHR, DOHE, DHRE, DALL) have means ranging from 0.083 to 0.087 nests. Elevation acts as a control, DE reducing the mean to 262.715 nests from 289.823 nests in DO. The river presence variable reduces the mean further, to 3.506 nests in RO, 3.135 nests in RE, and 0.597 nests in OR. The positive control sites showed a smaller mean (211.3624 nests; compare 289.823 nests) and larger standard deviation (124.719 nests) in the discrete major variable trial (DO). This is likely accounted for by the lower population projection from Tsushima City (33.684 nests). Human Population Density (DH) had a low mean (0.001 nests) and standard deviation (0.002 nests). Elevation (DE) had a mean of 291.142 nests and a standard deviation of 170.714 nests. Occurrence and Elevation (DOE) had a mean of 100.8953 nests and a standard deviation of 52.438 nests. The remaining combined trials continuous trials had a mean no greater than 4.396 nests and a standard deviation no greater than 1.252 nests (DRE). The negative control projections in DO ranged from 5.079e-19 nests in St. Paul MN to 439.680 nests in Barry County, MI. The Land Control Trial LC had a Standard 54 Deviation of 0.0, making z-scores impossible to calculate for the control populations. Maps were very homogenous across all invasive territory. The discrete model predicts a higher population and more homogeneous distribution. The Discrete model is more biologically accurate than the Continuous model, but adding more biological accuracy may not produce a statistically significantly different prediction from the less biologically accurate continuous model. An Unequal Variance T-Test was performed in R (R Core Team 2019). The script and outputs can be found in Appendix B. In the first test, the means of the discrete and continuous trials (see Figure 12) were compared. The t-test showed no statistically significant differences (p=.099) in the predictions made by both models. Figure 17 represents the two datasets, continuous means and discrete means, as boxplots. A second Unequal-Variance T-test was performed comparing the projections made by trials O and DO, the most biologically relevant trials. The sets of data containing only the test sites did generate statistically significant differences (p= 0.02698) at a 95% confidence interval, but when the control groups were added –two sets of all sites – the significance went away (p = 0.1084). Figure 18 represents the population projections of O and DO with a boxplot. This difference is unlikely to be biologically significant: whereas the continuous model deals with spread exceeding growth by predicting negative nest densities, the discrete model deals with these cases by generating very small nest densities. Both models are also functions, generating geometries in three-dimensional space. The geometry of the continuous model is a Gaussian surface expanding outward over a circular base with a center at the starting point. The discrete model forms a cube centered 55 at the starting point. The search area is a square centered at the starting point. It may be that the discrete model fills the search area more completely. The mean projection of the discrete model in DO projects a mean which differs from the continuous projection by only 67 nests (289.827 nests to 222.745 nests) and while that is slightly more than one standard deviation greater (64.510 nests) than the mean of the continuous model, both results are in the same order of magnitude. The pattern holds for nest density in continuous trials as well: Trial DO had a mean of 3.785 nests per unit, with a standard deviation of 1.461, while Trial DH had a mean of 0.001 nests per unit, with a standard deviation of 0.005. Trial DE had a mean of 3.423 and a standard deviation of 1.533. Occurrence with Elevation (DOE) had a mean of 1.596 nests per unit, and a standard deviation of 0.949. The combined continuous trials did not have a mean greater than 0.041 (DRE) or a standard deviation greater than 0.037 (DRE). Regardless of which model – discrete or continuous – is ultimately more accurate, both models project a substantial invasive population within ten years of invasion. 4.3.3- Both Discrete and Control Trials There was a significant difference (p-value <2e-16) between the negative control group and the test group for all trials, while no significant difference (p-value=0.301) exists between the test group and the control group for all trials. These groups are summarized with boxplots in Figure 19. 56 Minor variables had the effect of dragging down N10 and Mean Nest Density. This may have been caused by a failure to properly integrate these variables in the model. In future analyses, it may be useful to find a way to bring the minor variables into the logistic regression model. A visual inspection of the data suggests that Human Population Density had the strongest effect in dragging down the nest projections and mean nest density. This observation was confirmed by Principle Components Analysis in R (R Core Team 2019). The analysis showed that Human Population Density had the largest effect on the distribution of data. See Figure 20. It might be asked if Human Population Density is a function of Land Area, as ports tend to have high populations. A correlation analysis was run in R (R Core Team 2019). There was no statistically significant difference between Human Population Density and Land Area (p-value=0.134, Correlation=0.314). See Figure 21. 4.4-Impact Phase Based on a five percent loss to the beekeeping industry if the values of the most recent years are taken as representative of any given year, and the cost of nest removal in 2020 dollars, the total potential cost of an V. velutina invasion is projected to range from 565,181,398.135 USD to 565,307,127.539 USD, including a nest removal cost ranging from 31,398.135 USD to 157,127.539 USD. In Canada, a value for pollination services could not be found, but Canadian honey production for the last year for which there is data, 2016, was $97,931,320 USD (FAOSTAT 2019), equal to 129,504,377.57 CAD (Bank of Canada Accessed 2/21/2020). A loss of five percent honey production would be a loss of 6,475,218. 878 CAD. 57 Converting estimated nest removal const from USD to CAD gives a range of 41,526.350207,397.338 CAD (Bank of Canada Accessed February 21st 2020), for a total range of economic impact of 6,516,745.228 -6,682,616.216 CAD. Tables 17 and 18 summarize these calculations for the United States and Canada, respectively. Ecological and human health impacts, while probably significant, are not possible to calculate due to lack of data. Economic impact was possible to estimate based on some available data: five percent total loss to the main services provided by honeybees (pollination and honey production) is possible to calculate, though it may be unlikely that the kind of complete invasion assumed in the calculation would materialize. Nest removal cost, ranging from 31,398.135 USD to 157127.539 USD, is more likely to be a more accurate reflection, calculated from mean projected population and nest removal cost in France, assuming a cost in line with 2020 exchange rates. In Canada, the full measure of cost is less accessible than in the United States because pollinator values are not available. In the analysis of impacts of honey production by state revealed a total of $9,904,350 USD in loss of honey production from all states with suitable habitats. Of the potentially infected states, California had the greatest impact, with a loss of 1,435,300 USD. See Table 7. A potential invasion cost of over half a billion USD to American agriculture, and a loss of 6,475,218. 878 CAD to Canadian honey production, is nonetheless substantial, and it demonstrates the value of countermeasures to a V. velutina invasion. 58 4.5-Model Validation The heat map of M. apicalis occurrence, generated with the software package Wallace, can be found in Figure 22. The continuous model projects a population of 4.219 nests within the search area, while the discrete model projects a population of 2.308e-08 nests. In cases where the rate of spread eclipses the rate of growth, the continuous model assigns negative nest projections while the discrete model assigns very small nest projections. In the M. apicalis case, it is clear the rate of spread exceeds the rate of growth in both the discrete and continuous model. The M. apicalis invasion was therefore not an effective means of model validation. 5-DISCUSSION The purpose of this study was to determine what a V. velutina invasion would look like in quantitative terms: How likely was an invasion to occur? What habitats were the hornets likely to occupy, and how many nests could be expected in these habitats? What would the financial and ecological impacts of the invasion be? This study was successful in generating a prediction of what habitats were likely to support V. velutina, and that prediction raised an important insight: An invasion beginning on the west coast, while potentially calamitous for ecology and agriculture on the West Coast, is unlikely to cross the center of the continent and settle in the East. A large band of inhospitable territory splits the continent between the two hospitable coats. An invasion begun on the east coast would be significantly more devastating in terms of its potential to spread, given suitable habitat from Florida to the Canadian 59 Maritimes, through the Midwest as far as Michigan. Habitats inhospitable to V. velutina separate the coasts, so V. velutina will not migrate naturally. An additional transportation event will be required for V. velutina to move from the invaded coast to the non-invaded coast. Goods shipped from the invaded coast to the other could be screened for the presence of V. velutina gynes. This study was also successful in generating predictions of nest density and nest populations (N10). In France, Robinet et al. (2017) report an observed nest population N10 of 330 nests, while in South Korea, an observed N10 of 453 nests was reported (Choi et al. 2012). The Positive control site Nerac had a projected N10 of 311.962 based on the major variable, within 5.4% of the observed value reported by Robinet et al. (2017). The discrete trial based on the major variable at the same location produced a projection of 440.829 nests, consistent to within 2.8% of the observed nest population at Busan (Choi et al. 2012). Busan itself, however, was consistent only to within order of magnitude, being less than half the population observed at Busan. Projections of nest population (N10) in North America based on the major variable alone, using both the continuous and discrete models, are consistent with the what has been observed in Europe (Robinet et al. 2017) and Korea (Choi et al. 2012), save Anchorage. Of note is Walhalla, SC, chosen to represent maximal land area, had the same continuous and discrete projections as Nerac. Observed invasive populations range from 330 to 453 nests. For the major variable, continuous projections predict a value within 5.4% of the lower bound, and discrete projections predict a value within 2.8% of the upper bound, when land shape is not an issue. This result suggests that the main limitation of both the discrete and continuous models is its inability to account for land shape. The model is 60 accurate when land shape is not an issue. This result also implies that the assumptions of the model were justified: 2.415 is an appropriate value for r; the logistic growth model was approximately equal to Malthusian growth, which was approximately equal to Archer’s (1985) discrete growth model; the calculated model values from the French population (Robinet et al. 2017) were accurate in the map unit basis coordinates; and the discrete model laid out here is validated. A second goal of the population simulation was to determine whether the continuous and discrete models produced statistically significantly different population projections. The two models did not produce statistically significantly different results. This result implies that either model could be used for further predictions of nest density and population, giving allowances for land shape. The tendency of the continuous model to produce negative results, and of the discrete model to produce very small results, when spread exceeds growth, also presents a problem for the future usage for the model. Adding the minor variables took the N10 projections far from the observed values of 330 nests and 453 nests in France and Korea (Robinet et al. 2017; Choi et al. 2012), respectively. This may be attributed to the failure of integrating the minor variables into the model, something that may be fixed by including the minor variables into the habitat suitability regression analysis. This study was unsuccessful in validating the population models for other species. The biology of M. apicalis was not compatible with the models used. This species reproduces far too slowly relative to its rate of spread. This fact of M. apicalis generated negative population densities. The continuous model used in this study has been validated 61 for V. velutina in Europe (Robinet et al. 2017), and these results suggest that both models have been validated, giving allowances for land shape. A major goal of the population projection study was to determine which port would produce the highest nest population (N10) given an invasion beginning at that port. This study was less successful in achieving this goal: all ports studied except Anchorage, AK occurred in habitats that were highly hospitable to V. velutina. Variation in nest population (N10) can be explained by the shape of the land surrounding it. More land available to settle leads higher populations. It is therefore not possible, based on these data, to say which port should be the focus of heightened attention from authorities. In the continuous major variable trial, Anchorage AK is a statistical outlier (Z<-3.0, Z>3.0) based on z-score, but only that trial. The results of the ANOVA indicate that all the distributions are statistically significantly different from each other across all trials, but not that any given port was worthy of any special attention from authorities. This conclusion is supported by the Mean Nest Density results: the majority of the mean nest density for ports in the continuous, major variable trials (25th-75th percentile) ranged from 2.935-3.123 nests per square unit. This study was less successful in ascertaining the risk of V. velutina invasion itself. The lack of literature on invasion vectors accounted for this lack of success. This study was able to generate an estimate of the maximal share of goods that could be potential vectors, but a more-refined estimate was not possible. This study was also unsuccessful in quantifying ecological damage caused by a V. velutina invasion. A lack of literature on the ecological impacts of V. velutina in Europe and Northern Asia, and a lack of literature concerning methods for calculating ecological 62 impacts, accounted for this lack of success. This study was marginally more successful in generating predictions of economic costs of a V. velutina invasion. Economic impact data was available from Europe and Northern Asia (Monceau et al. 2014), and economic data from the United States was available (Smith et al. 2009), enabling calculations of economic impact. These calculations represent an estimate of the maximal economic impact, assuming an invasion of the entire United States. A full invasion of the United States is unlikely, given that several large regions of the United States are inhospitable to V. velutina (See Figures 5, 7), so this value is necessarily an overestimate. Literature on pollinator impact by state is not available, but Flottum (2017) provided a means of calculating impacts to honey production by state. A study of the efficacy of countermeasures would logically be the next step in researching a biological invasion of North America. Such a study would likely want to look at the effects of increasing the proportion of resistant strains of bees (A. cerana and A. melifera ligustica) in North American apiaries, or selective breeding programs, both on V. velutina resistance and the ecological impacts such organisms must have. More research is also needed into potential vectors of V. velutina, and more research is needed into ecological impacts of V. velutina. The theoretical ecology techniques used in this study raise a very powerful prospect for the future of invasive species management: species likely to become invasive in new habitats can be studied by habitat distribution modeling and population-dynamic modeling years in advance of them actually becoming invasive. Modelers can give these data, and their recommendations, to management authorities years in advance of the 63 invasion. In cases where inhospitable habitat separates suitable habitat, such habitats could serve as crucial lines of redoubt for containing invasions. Given the volume of trade serving as potential vectors of V. velutina invasion, it is not practical to screen every possible potential vector. It is also not practical to prioritize a port for targeted enforcement. There is no evidence that one port above the others will lead to a worse invasion. It is however possible to eliminate Anchorage, AK as a potential port of invasion. The most practical techniques for containing a V. velutina invasion are those implemented in Europe and Korea: nest documentation and destruction. North American managers do have access to a strategy not available in Europe and Asia: screening goods shipped from the invaded coast to the non-invaded coast. 64 LITERATURE CITED Abbott P, Abe J, et al. 2011. Inclusive fitness theory and eusociality. Nature. 471: E1-E4. Annual Exchange Rate. Bank of Canada. https://www.bankofcanada.ca/rates/exchange/annual-average-exchange-rates/. Arca M, Mougel F, Guillemaud T, Dupas S, Rome Q, Perrard A, Muller F, Fossoud A, Capdevielle-Dulac C, Torres-Leguizamon M, et al. 2015. Reconstructing the invasion and the demographic history of the yellow-legged hornet, vespa velutina, in 65oncat. Biol Invasions 17(8):2357. Archer M, Population dynamics of the social wasps vespula vulgaris and vespula germanica in England. 1985. J Anim Ecol :473. DOI: 10.2307/4492 Archuleta, Christy-Ann M.; Constance, Eric W.; Arundel, Samantha T. ; Lowe, Amanda J.; Mantey, Kimberly s.; Phillips, Lori A. 2017. The National Map Seamless Digital Elevation Model Specifications. https://viewer.nationalmap.gov/basic/ Barthell JF, Thorpe RW, Frankie GW, Kim JY, Hranitz JM. 2003. Impacts of introduced solitary bees on natural and agricultural systems: the case of the leafcutting bee, megachile apicalis (hymenoptera: megachilidae). In: Strickler K, Cane JH, editors. From nonnative crops, when pollinators of the future? Lanham, MD: Entomological Society of America. P. 151-162. Bertolino S, Lioy S, Laurino D, Manino A, Porporato M. 2016. Spread of the invasive yellow-legged hornet vespa velutina (hymenoptera: vespidae) in Italy. Appl Entomol Zool 51(4):589. DOI: 10.1007/s13355-016-0435-2 Bessa AS, Carvalho J, Gomes A, Santarém F. 2016. Climate and land-use drivers of invasion: Predicting the expansion of vespa velutina nigrithorax into the 65oncate peninsula. Insect Conservation and Diversity 9(1):27. doi: 10.1111/icad.12140 Bonaccorsi A, and Rossi C. 2006. Comparing motivations of individual programmers and firms to take part in the open source movement: From community to business. Knowledge, Technology & Policy. 18(4):40-64. Center for International Earth Science Information Network – CIESIN – Columbia University. 2018. Gridded Population of the World, Version 4 (GPWv4): Administrative Unit Center Points with Population Estimates, Revision 11. Palisades, NY: NASA Socioeconomic Data and Applications Center (SEDAC). https://doi.org/10.7927/H4BC3WMT. Choi MB, Martin SJ, Lee JW. 2012a. Distribution, spread, and impact of the invasive hornet vespa velutina in south korea. Journal of Asia-Pacific Entomology 15(3):473. DOI: 10.1111/icad.12140 65 Darrouzet E, Gévar J, Guignard Q, Aron S. 2015. Production of early diploid males by 66oncaten colonies of the invasive hornet vespa velutina nigrithorax. PloS One 10(9). doi: 10.1371/journal.pone.0136680 D’Ettore P, Heinze J. 2001. The sociobiology of slave-making ants. Acta Ethologica. 3(2): 67-82. Erdmann, R. 2009. Finite-Difference Solutions to the 2-D Heat Equation. http://www.u.arizona.edu/~erdmann/mse350/_downloads/2D_heat_equation.pdf EUR-USD X-Rate. Bloomberg.com. https://www.bloomberg.com/quote/EURUSD:CUR. Flottum K. 2017.U.S. Honey Industry Report. https://www.beeculture.com/u-s-honeyindustry-report-2017/ Food and Agriculture Organization Corporate Statistical Database. http://www.fao.org/faostat/en/#home. Foreign Trade. FT900 U.S. International Trade in Goods and Services Report 2020. https://www.census.gov/foreign-trade/PressRelease/current_press_release/index.html. Franklin DN, Brown MA, Datta S, Cuthbertson AGS, Budge GE, Keeling MJ. 2017. Invasion dynamics of asian hornet, vespa velutina (hymenoptera: Vespidae): A case study of a commune in south-west France. Appl Entomol Zool 52(2):221. GBIF.org .GBIF Occurrence Download. https://www.gbif.org/. Hein L. 2009. The economic value of the pollination service, a review across scales. The Open Ecology Journal 2(1):74. Gowdy, J. and Erickson, J.D., 2005. The approach of ecological economics. Cambridge Journal of economics, 29(2), pp.207-222. Haxaire, J., Tamisier, J.P. and Bouguet, J.P., 2006. Vespa velutina Lepeletier, 1836, une redoutable nouveauté pour la faune de France (Hym., Vespidae). Bulletin de la Société entomologique de France, 111(2), pp.194-194. Herman, R.L. 2014. Numerical Solution to 1D Heat Equation. http://people.uncw.edu/hermanr/pde1/NumHeatEqn.pdf Hijmans, R. J., S.E. Cameron, J.L. Parra, P.G. Jones and A. Jarvis, 2005. Very high resolution interpolated climate surfaces for global land areas. International Journal of Climatology 25: 1965-1978. Hill MF, Witman JD, Caswell H. 2004. Markov chain analysis of succession in a rocky subtidal community. Am Nat 164(2):E46 66 Hlavac M. 2018. stargazer: Well-Formatted Regression and Summary Statistics Tables. R package version 5.2.1. https://CRAN.R-project.org/package=stargazer Hranitz JM, Barthell JF, Thorp RW, Overall LM, Griffith JL. 2009. Nest site selection influences mortality and stress responses in developmental stages of megachile apicalis spinola (hymenoptera: mechachilidae) . Environmental Entomology. 38(2): 484-492. DOI: 10.1603/022.038.0223 Hughes WO, Oldroyd BP, Beekman M, Ratnieks FL. 2008. Ancestral monogamy shows kin selection is key to the evolution of eusociality. Science 320(5880):1213-6. DOI: 10.1126/science.1156108 Hulme PE. 2009. Trade, transport and trouble: Managing invasive species pathways in an era of globalization. J Appl Ecol 46(1):10. Doi:10.1111/j.13652664.2008.01600.x Hunt JH and Amdam GV. 2005. Bivoltinism as an antecedent to eusociality in the paper wasp genus polistes. Science 308(5719):264-7. DOI: 10.1126/science.1109724 International Trade. Statistics Canada. https://www150.statcan.gc.ca/n1/en/subjects/international_trade. Jeanne RL and Suryanarayanan S. 2011. A new model for caste development in social wasps. Communicative & Integrative Biology 4(4):373. Doi:10.4161/cib.15262 JMP ®, Version <14.3 >. SAS Institute Inc., Cary, NC, 1989-2019. https://www.jmp.com/en_us/support/jmp-documentation.html Kass JM, Vilela B, Aiello-Lammens ME, Muscarella R, Merow C, Anderson RP. 2018. Wallace: a flexible platform for the reproducible modeling of species niches and distributions built for community expansion. Methods in Ecology. 9(4): 11511156. Doi:10.1111/2041-210X.12945 Keeling MJ, Franklin DN, Datta S, Brown MA, Budge GE. 2017. Predicting the spread of the asian hornet (vespa velutina) following its incursion into great britain . Scientific Reports (Nature Publisher Group) 7(1). Kelso NV, Patterson T, 2009. Natural Earth: free vector and raster map data. https://www.naturalearthdata.com/blog/miscellaneous/natural-earth-version-1-2release-notes/ Accessed 1 June 2019. Ken T, Hepburn HR, Radloff SE, Yusheng Y, Yiqiu L, Danyin Z, Neumann P. 2005. Heat-balling wasps by honeybees. Naturwissenschaften 92(10):492. DOI 10.1007/s00114-005-0026-5 Kim JY. 1997. Female size and fitness in the feafcutter bee megachile apicalis. Ecological Entomology. 22: 275-282. Limnios, E.A.M., Ghadouani, A., Schilizzi, S.G. and Mazzarol, T., 2009. Giving the consumer the choice: A methodology for Product Ecological Footprint calculation. Ecological Economics, 68(10): 2525-2534. 67 May R. 1974. Biological populations with nonoverlapping generations: stable points, stable cycles, and chaos. Science. 186:645-647. Moller H. 1996. Lessons for invasion theory from social insects. Biol Conserv 78(12):125. Doi: 10.1016/0006-3207(96)00022-5 Monceau K and Thiéry D. 2017. Vespa velutina nest distribution at a local scale: An 8year survey of the invasive honeybee predator: vespa velutina nest distribution at a local scale. Insect Science = 昆虫科学(英文版) 24(4):663. DOI 10.1111/17447917.12331 Monceau K, Bonnard O, Thiéry D. 2014. Vespa velutina: A new invasive predator of honeybees in europe. Journal of Pest Science 87(1):1. DOI 10.1007/s10340-0130537-3 Monceau K, Maher N, Bonnard O, Thiéry D. 2013. Predation pressure dynamics study of the recently introduced honeybee killer vespa velutina: Learning from the enemy. Apidologie 44(2):209. DOI: 10.1007/s13592-012-0172-7 National Weather Service. 2020. https://www.weather.gov/source/gis/Shapefiles/County/s_11au16.zip Nowak MA, Tarnita CE, Wilson EO. 2010. The evolution of eusociality. Nature 466(7310):1057. Pimentel D, Zuniga R, and Morrison D. 2005. Update on the environmental and economic costs associated with alien-invasive species in the United States. Ecological economics, 52(3): 273-288. Plunkett GM, Moller H, Hamilton C, Clapperton BK, and Thomas CD. 1989. Overwintering colonies of German (Vespula germanica) and common wasps (Vespula vulgaris)(Hymenoptera: Vespidae) in New Zealand. New Zealand journal of zoology, 16(3): 345-353. Queller D, 2011. Expanded social fitness and Hamilton’s rule for kith, kin and kind. Proceedings of the National Academy of Sciences 108(2) 1079210799. DOI: 10.1073/pnas.1100298108 R Core Team. 2019. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.Rproject.org/. Renshaw, E. 1991. Modelling Biological Populations in Space and Time. Cambridge University Press. Robinet C, Suppo C, Darrouzet E. 2017. Rapid spread of the invasive yellow-legged hornet in france: The role of human-mediated dispersal and the effects of control measures. J Appl Ecol 54(1):205. doi: 10.1111/1365-2664.12724 68 Rome Q, Muller FJ, Touret-Alby A, Darrouzet E, Perrard A, Villemant C. 2015. Caste differentiation and seasonal changes invespa velutina(hym.: vespidae) colonies in its introduced range. Journal of Applied Entomology = Zeitschrift Für Angewandte Entomologie 139(10):771. doi: 10.1111/jen.12210 Sakai AK, Allendorf FW, Holt JS, Lodge DM, Molofsky J, With KA, Baughman S, Cabin RJ, Cohen JE, Ellstrand NC, et al. 2001. The population biology of invasive species. Annu Rev Ecol Syst 32(1):305. DOI: 0066-4162/01/1215-0305$14.00 Shigesada, N., Kawasaki, K. 1997. Biological Invasions: Theory and Practice. Oxford University Press Smith KM, Loh EH, Rostal MK, Zambrana-Torrelio CM, Mendiola L, Daszak P. 2013. Pathogens, pests, and economics: drivers of honey bee colony declines and losses. EcoHealth 10(4):434. DOI: 10.1007/s10393-013-0870-2 Statistical Summary: Honeybees. 2019. U.S. Department of Agriculture’s National Agricultural Statistics Service (NASS). https://www.nass.usda.gov/Publications/Highlights/2019/2019_Honey_Bees_Stati sticalSummary.pdf. Sugahara M, Nishimura Y, Sakamoto F. 2012. Differences in heat sensitivity between honeybees and hornets under high carbon dioxide and humidity conditions inside bee balls. Zool Sci 29(1):30. DOI: 10.2108/zsj.29.30 Tan K, Li H, Yang MX, Hepburn HR, Radloff SE. 2010. Wasp hawking induces endothermic heat production in guard bees. J Insect Sci 10(142):1. DOI: 10.1673/031.010.14102 Tan K, Wang Z, Li H, Yang S, Hu Z, Kastberger G, Oldroyd BP. 2012. An ‘I see you’ prey–predator signal between the asia honeybee, apis cerana, and the hornet, vespa velutina. Anim Behav 83(4):879. DOI: 10.1016/j.anbehav.2011.12.031 Toth AL, Varala K, Newman TC, Miguez FE, Hutchison SK, Willoughby DA, Simons JF, Egholm M, Hunt JH, Hudson ME, et al. 2007. Wasp gene expression supports an evolutionary link between maternal behavior and eusociality. Science 318(5849):441-4. DOI: 10.1126/science.1146647 Ueno T. 2014. Establishment of the invasive hornet Vespa velutina (Hymenoptera: Vespidae) in Japan. International Journal of Chemical, Environmental & Biological Sciences, 2(4): 3 Varley, G.C., Gradwell, G.R., Hassell, M.P. 1973. Insect Population Ecology. University of California Press. ISBN: 0-520-02667-5 Van Rossum, G. Python Development Team. 2018. Python Tutorial: Release 3.7.3. https://bugs.python.org/file47781/Tutorial_EDIT.pdf Villemant C, Barbet-Massin M, Perrard A, Muller F, Gargominy O, Jiguet F, Rome Q. 2011. Predicting the invasion risk by the alien bee-hawking yellow-legged hornet 69 vespa velutina nigrithorax across 70oncat and other continents with niche models. Biol Conserv 144(9):2142. DOI:10.1016/j.biocon.2011.04.009 Waibel M, Floreano D, Keller L. 2011. A quantitative test of 70amilton’s rule for the evolution of altruism. PloS Biology 9(5). DOI: 10.1371/journal.pbio.1000615 Ward PS. 2014. The phylogeny and evolution of ants. Annual Review of Ecology, Evolution and Systematics 45(1):23. DOI: 10.1146/annurey-ecolsys-120213091824 West SA, Murray MG, Machado CA, Griffin AS, Herre EA. 2001. Testing Hamilton’s rule with competition between relatives. Nature 409(6819):510. Wheeler DE. 1986. Developmental and physiological determinants of caste in social hymenoptera: Evolutionary implications. Am Nat 128(1):13. Wickham H. 2007. “Reshaping Data with the reshape Package.” Journal of Statistical Software, 21(12): 1–20. http://www.jstatsoft.org/v21/i12/. Wickham H. 2016. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org. Wickham H, Francois R, Henry L, and Müller K. 2018, dplyr: A Grammar of Data Manipulation. R Package Version 0.7.6. https://CRAN.Rproject.org/package=dplyr. Yañez O, Zheng HQ, Hu FL, Neumann P, and Dietemann V. 2012. A scientific note on Israeli acute paralysis virus infection of Eastern honeybee Apis cerana and vespine predator Vespa velutina. Apidologie, 43(5): 587-589. Zhang, C. and Boyle, K.J., 2010. The effect of an aquatic invasive species (Eurasian watermilfoil) on lakefront property values. Ecological Economics. 70(2):394-404. 70 TABLES TABLE 1: Summary of analysis comparing transportation criteria (Hulme 2009) to the known biology of V. velutina. Table contains the criteria, conclusions of the criteria, and the reasoning for those conclusions. Hulme’s (2009) Criterion The strength of the association between the species and the vector at the point of export. Volume of vector imports at the point of interest Frequency of importation V. velutina Biology Unknown. Reasoning No literature on Potential Vectors. Unknown Survivorship and growth during transport Suitability of the importing point to species establishment Appropriateness of the time of year for the establishment of the species The ease of containing the species within the vector Effectiveness of management measures Distribution of the vector post importation The likelihood of postimportation transport to suitable habitat Survivorship Unknown; Growth =0 in transit. Warm, wet urban ports No literature on Potential Vectors. No literature on Potential Vectors. Inferred from life cycle of V. velutina Choi et al. 2012; Bessa et al. 2015; Franklin et al. 2016. Unknown January through the end of March Inferred from life cycle of V. velutina Unknown No literature on Potential Vectors. Monceau et al. 2014. Limited to Nest Destruction at present. unknown Very Likely, as most ports studied are already in suitable habitat. No literature on Potential Vectors. Results of the Establishment Phase Simulation. 71 TABLE 2: Table of Equations, together with legends of variable meanings. Y*=derivative or partial derivative of Y with respect to time, for all Y. Unless otherwise specified, e=2.718 (Euler’s Constant). Equation Number 1 Equation 𝑃(𝑂𝑐𝑐𝑢𝑟𝑟𝑒𝑛𝑐𝑒) = 𝐵 + 𝐶/ 𝑉/ + 𝐶1 𝑉1 + ⋯ Description Variables Example Linear Regression B=intercept Cn=nth regression coefficient Vn=nth variable Malthusian Difference Equation Malthusian Differential Equation Logistic Differential Equation Nt+1=population at t+1 Nt=population at time t r=innate rate of growth N=population r= innate rate of growth 2D Heat Equation D=innate rate of diffusion U=population density at point (x,y) and time t. 2D Heat Equation plus a differential equation of Growth. D=innate rate of Diffusion U=population density at point (x,y) and time t. N=Population 2D Heat Equation plus Logistic Growth D=innate rate of Diffusion U=population density at point (x,y) and time t. N=Population + 𝐶3 𝑉3 2 𝑁56/ = 𝑁5 + 𝑟𝑁5 3 𝑁 ∗= 𝑟𝑁 4 𝑁 𝑁 ∗ = 𝑟𝑁(1 − ) 𝐾 5 𝜕1𝑈 𝜕1𝑈 𝑈 = 𝐷( 1 + 1 ) 𝜕𝑥 𝜕𝑦 6 7 8 9 ∗ 𝑈∗ = 𝐷 B 𝜕1𝑈 𝜕1𝑈 + C + 𝑁∗ 𝜕𝑥 1 𝜕𝑦 1 𝜕1𝑈 𝜕1𝑈 𝑁 𝑈 ∗ = 𝐷 B 1 + 1 C + 𝑟𝑁(1 − ) 𝜕𝑥 𝜕𝑦 𝐾 𝑈 ∗ = 𝐷(𝑥, 𝑦) B 𝜕1𝑈 𝜕1𝑈 𝑁 + C + 𝑟(𝑥, 𝑦)𝑁(1 − ) 𝜕𝑥 1 𝜕𝑦 1 𝐾(𝑥, 𝑦) P / HJLM J BK O 1 𝑃(𝑥, 𝑦) = 𝑒 1 NM 2𝜋𝜎H 𝜎I 6Q IJLR P S C NR N=Population r= innate rate of growth K=Carrying Capacity r= innate rate of growth K=Carrying Capacity Patched FisherSkellam Differential Equation 3D Gaussian Curve D=innate rate of Diffusion U=population density at point (x,y) and time t. N=Population r= innate rate of growth K=Carrying Capacity 𝜎H =Standard deviation of x 𝜎I =standard Deviation of y 𝜇H =Mean of x 𝜇I =Mean of y 72 Equation Number 10 Equation 𝑟𝑁5J/ 𝑁5 = 𝑃𝑜𝑖𝑠𝑠𝑜𝑛 X Y 𝑁5J/ 1+ 𝐾 11 12 𝑃Z[ = 𝑒 J \]^ L 𝑇𝑒𝑟𝑟𝑎𝑖𝑛 [ 𝑃(𝑖, 𝑦) = 𝑛𝑟𝐸d 𝑓(𝑙𝑎𝑡d ) g𝑒 \hi J j L 𝑇𝑒𝑟𝑟𝑎𝑖𝑛 kj l g𝑒 13 g𝑒 \hi J P L 𝑇𝑒𝑟𝑟𝑎𝑖𝑛 \hp J L 𝑃(𝑞|𝑗, 𝑦 + 1) = 𝑛𝑟𝐸d 𝑓(𝑙𝑎𝑡d ) ∑𝑒 kP l 𝑇𝑒𝑟𝑟𝑎𝑖𝑛q l \hs J L 𝑇𝑒𝑟𝑟𝑎𝑖𝑛 Description Variables Franklin et al. (2016) Equation Nt-1=population at t-1 Nr=population at time t r=innate rate of growth K=Carrying Capacity Keeling et al. (2017) equation 1 µ=mean flying Distane ∆jk=distance between both sites. Terraink=quality of kth terrain. Keeling et al. (2017) equation 2 N=normalizing coefficient r=innate rate of growth Ei=Suitability of ith environment f(lati)=position function µ=mean flying Distane ∆jk=distance between both sites. Keeling et al. (2017) equation 3 N=normalizing coefficient r=innate rate of growth Ei=Suitability of ith environment f(lati)=position function µ=mean flying Distane ∆jk=distance between both sites. t 14 𝑃(𝑞|𝑦 + 1) = u 𝑃𝑜𝑠𝑡(𝑟) vw 𝑝d I y 𝑝q|d I6/ Keeling et al. (2017) equation 4 Post(r)=Posterior Distribution 15 𝑁56/ = 𝑁5 𝑄𝑆 Archer (1985) equation 16 𝑁56/ = 𝑁5 𝑄𝑆(𝑥, 𝑦) Patched Archer (1985) equation 17 𝑀 = [𝐸d ] ∪ [𝑟d ] ∪ €𝑃Z[ • ∪ €𝑂Z[ • Formal Definition of a Markov Chain 18 𝑁 = 𝑁‚ 𝑟ƒ Discrete Growth Phase Nt+1=population at t+1 Nt=population at time t Q=number of gynes produced per hive S=overwinter survival rates. Nt+1=population at t+1 Nt=population at time t Q=number of gynes produced per hive S=overwinter survival rates. [Ei]=set of environments [ri]=set of growth rates [pjk]=set of paths connecting environments j and k. [Ojk]=set of probabilities of transmission from j to k. Ns=Population at the end of the Spread Phase re=rate of growth at environment e. 19 [ [ 𝑁(𝑒, 𝑡 + 1) = w 𝑁(𝑖, 𝑡)𝑂dƒ − w 𝑁(𝑒, 𝑡)𝑂ƒd (d„ƒ) d d Discrete Spread Phase N(e,t+1)=Population at environment e at time t+1 N(I,t)=Population at the ith environment at time t. Oie=probability of a member of a population at I migrating to e. N(e,t) =population at e at time t. 73 Oei=probability that a member of the population at e will leave e. Equation Number 20 21 Equation 𝑁 = 𝑁‚ 𝑄𝑆ƒ 𝑃(𝑂𝑐𝑐𝑢𝑟𝑟𝑒𝑛𝑐𝑒) = 22 1 1+ 𝑒 J….‡…ˆ‰ˆ‰/‡6‰.‰/…Š‰1‹Œ6‰./…ˆŠ•‹t 5 𝑈(𝑥, 𝑦, 𝑡) = u 𝐷(𝑥, 𝑦) B ‰ 𝜕1𝑈 𝜕1𝑈 + C 𝜕𝑥 1 𝜕𝑦 1 + 𝑟(𝑥, 𝑦)𝑈 Q1 − 23 𝑈(𝑥, 𝑦, 0) P / HJ• J BK O 1 = 𝑒 1 1.‡ 11.52𝜋 24 𝑈 S 𝜕𝑈 𝐾(𝑥, 𝑦) IJ‘ P 6K O C 1.‡ 5 𝜕1𝑈 𝜕1𝑈 𝑈(𝑥, 𝑦, 𝑡) = u 51.84 B 1 + 1 C 𝜕𝑥 𝜕𝑦 ‰ + 2.415𝑃˜™™šk (𝑥, 𝑦)𝑈 Q1 − Description Variables Combination of equations 15 and 18 Logistic Regression Equation Ns=Population at the end of the Spread Phase Se=Overwinter survival at environment e. Integral of the restatement of equation 8 in terms of U D=innate rate of Diffusion U=population density at point (x,y) and time t. N=Population r= innate rate of growth K=Carrying Capacity Restatement of Equation 9 with values [J,K]=mean located at the point (J,K) W=mean winter temperature P=Mean yearly rainfall Restatement of Equation 22 with values. 𝑈 S 𝜕𝑈 7.397 74 TABLE 3: Summary of the constants and their values used in the Growth and Spread Simulation, together with their units, the method of their calculation, and the sources of the value. Constant D Value 51.84 Unit Unit/Year K 7.397 Nests/Unit Max. r 2.415 NA t Standard Deviation h Max Pop. Density Elevation above which AH habitation is unlikely River Presence and Absence 10 2.4 years Units 0.000001 2000 Square Units humans/unit 791 meters 1,0 NA Calculation Source Unit Conversion Robinet et al. 2017 Unit Conversion Robinet et al. 2017 Archer 1985; Rome 2015 From max yearly Robinet et al. invasion wave 2017 Definition Definition Robinet et al. 2017 Definition 75 TABLE 4: Sites in the Test Group, together with GPS coordinates and the Matrix indices used to represent the port. Site # Site 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Anchorage AK Baltimore MD Biloxi MS Boston MA Charleston SC Houston TX Jacksonville FL Los Angeles CA Miami FL Mobile AL New Orleans LA New York NY Pensacola FL Philadelphia PA Port Charlotte-Ft. Myers FL Portland ME Portland OR Providence RI San Francisco CA Savannah GA Saint John NB St. Petersburg-Tampa FL Seattle WA Vancouver BC 16 17 18 19 20 21 22 23 24 GPS Coordinates Long,Lat -148.9, 61.216 - 76.6122, 39.2904 - 88.8853, 30.3960 -71.0589, 42.3601 - 79.9311, 32.7765 - 95.3698, 29.7604 - 81.6557, 30.3322 - 118.2437, 34.0522 - 80.1918, 25.7617 - 88.0399, 30.6954 - 90.0715, 29.9511 - 74.0060, 40.7128 - 87.2169, 30.4213 - 75.1652, 39.9526 - 81.8606, 26.6031 Matrix Index Used [Row, Column] [287, 301] [506, 1033] [595, 911] [476, 1089] [572, 1000] [602, 846] [596, 983] [559, 617] [642, 997] [593, 918] [600, 899] [492, 1059] [595, 927] [500, 1048] [632, 981] - 70.2568, 43.6591 -122.6750, 45.5051 - 71.4128, 41.8240 -122.4194, 37.7749 - 81.0912, 32.0809 - 66.0633, 45.2733 - 82.6403, 27.7676 - 122.3321, 47.6062 - 123.1207, 49.2827 [463, 1097] [444, 573] [481, 1085] [522, 575] [578, 989] [447, 1139] [620, 975] [422, 576] [407,568] 76 TABLE 4: A) Sites in the Positive Control Group, together with their GPS coordinates and the Matrix Index used to represent it. B) Sites in the Negative Control Group, together with their GPS coordinates and the Matrix Index used to represent it. A) Site # Site GPS Coordinate Used Long., Lat. Index Used [Row, Column] 25 Montreal, QC - 73.5673, 45.5017 [ 448,1068] 26 Walhalla SC - 83.0640, 34.7648 [555,966] 27 Busan, South Korea 179.0667,35.1667 [548,3090] 28 Nerac, France 0.3342, 44.1316 [458,1803] 29 Tsushima City, 129.2833, 34.200 [558,3092] Japan B) Site # Site Index Used [Row, Column] Albuquerque NM GPS Coordinate Used Long., Lat. - 106.6504, 35.0844 30 31 Barry County MI - 85.3550, 42.5354 [475, 946] 32 St Paul MN - 93.0900, 44.9537 [450,869] [549, 734] 77 TABLE 6: Model Statements with Trial Codes, Trial Numbers, and the Method each Model uses to calculate r(x,y). Trial codes presented are for continuous trials. Discrete trials add D to the beginning of the continuous trial code. Odd trial numbers represent the continuous version of the model statement, while even numbers represent the discrete version of the model statement. These trial numbers are represented by the x-axis of Figure 13. Trial # Trial Code Model Statement Formula for Calculating r(x,y) 1,2 C Land Control =1 for all cells 3,4 O Occurrence Probability Only Poccur(x,y) *2.415 5,6 H Human Population Density Only 7,8 R River Presence Only Riv(x,y)*2.415 9,10 E Elevation Only Elev(x,y)*2.415 11,12 OH Occurrence x Human Population Density Poccur(x,y)*Pop(x,y)*2.415 13,14 OR Occurrence x River Presence Poccur(x,y)*Riv(x,y)*2.415 15,16 OE Occurrence x Elevation Poccur(x,y)*Elev(x,y)*2.415 17,18 HR Human Population Density x River Presence Pop(x,y)*Riv(x,y)*2.415 19,20 HE Human Population Density x Elevation Pop(x,y)*Elev(x,y)*2.415 21,22 RE River Presence x Elevation Riv(x,y)*Elev(x,y)*2.415 23,24 OHR Occurrence x Human Population Density x Rivers Poccur(x,y)*Pop(x,y)*Riv(x,y)*2.415 25,26 OHE Occurrence x Human Population Density x Elevation Poccur(x,y)*Pop(x,y)*Elev(x,y)*2.415 27,28 HRE Human Population Density x Rivers x Elevation Pop(x,y)*Riv(x,y)*Elev(x,y)*2.415 29,30 ALL Occurrence x Human Population Density x Rivers x Elevation Poccur(x,y)*Pop(x,y)*Riv(x,y)*Elev(x,y)*2.415 ˜t(H,I) 1‰‰‰ *2.415 78 TABLE 7: Impact on Honeybee production by State. The “State” column contains the sta ndard abbreviations for the U.S. State. “Product” is the Honey Production by State in 1,0 00 USD (Flottum 2017). “Suitable” is =1 if there is suitable habitat in that state, and =0 if there is no suitable habitat in that state (See Figure 7) Total impact from these states is $9 ,904,350 USD. Table Generated by R Package “Stargazer” (Hlavac 2018). Impact State # State Product 1000 USD Suitable 1000 USD 1 AL 873 1 43.650 2 AZ 1,725 1 86.250 3 AR 4,265 1 213.250 4 CA 28,706 1 1,435.300 5 CO 2,923 1 146.150 6 FL 21,156 1 1,057.800 7 GA 9,377 1 468.850 8 HI 2,374 1 118.700 9 ID 7,482 1 374.100 10 IL 2,409 1 120.450 11 IN 1,324 1 66.200 12 IA 4,507 1 225.350 13 Ks 2,312 1 115.600 14 KY 775 1 38.750 15 LA 6,548 1 327.400 16 ME 2,158 1 107.900 17 MI 9,435 1 471.750 18 MN 4,530 0 0 19 MS 1,882 1 94.100 20 MO 1,836 1 91.800 21 MT 24,012 1 1,200.600 22 NE 5,266 0 0 23 NJ 2,861 1 143.050 79 Impact State # State Product 1000 USD Suitable 1000 USD 24 NY 9,608 1 480.400 25 NC 1,957 1 97.850 26 ND 63,636 0 0 27 OH 3,416 1 170.800 28 OR 5,897 1 294.850 29 PA 2,502 1 125.100 30 SC 1,665 1 83.250 31 SD 27,762 0 0 32 TN 1,343 1 67.150 33 TX 16,711 1 835.550 34 UT 1,724 1 86.200 35 VT 1,314 1 65.700 36 VA 1,003 1 50.150 37 WA 7,796 1 389.800 38 WV 924 1 46.200 39 WI 8,221 0 0 40 WY 3,287 1 164.350 80 TABLE 8: Summary of the calculations of the risk of transportation of V. velutina into the United States. Source Population China Korea France Total: 1st Quarter Total Yearly Trade 2019 (BillionsTrade 2019 of US Dollars) (Billions of US Dollars) 105.9739 418.575 19.8839 70.7191 14.1397 53.4974 139.995 542.791 % of Total Yearly Trade 25.32 28.11 26.43 N/A % of Total US Imports 2018 (2563.651 Billion US Dollars) 4.134 0.775 0.552 5.461 81 TABLE 9: Summary of the calculations of the risk of transportation of V. velutina into Canada. Import Values from Europe (1000 Canadian Dollars) 82,653,655 Total Imports ¼ 20,663,413.75 of imports Import Values from Asia (1000 Canadian Dollars) 128,300,426 Total % of Total Imports (1000 Canadian (564,297,051,000 CD) Dollars) 210,954,081 37.383% 32075106.50 52,738,520.25 9.345% 82 TABLE 10: Summary Statistics of Nest Projections after 10 years (N10) of the test groups. All units are in Nests. Table Generated by R Package “Stargazer” (Hlavac 2018). Trial C DC O DO H DH R DR E DE OH DOH OR DOR OE DOE HR DHR HE DHE RE DRE OHR DOHR OHE DOHE HRE DHRE ALL DALL Means St.Dev Min Max Range 9.206083 2.647439 2.647439 12.845 10.19756 0.299113 0.218101 0.173784 1 0.826216 222.7439 65.89757 5.454 311.964 306.51 289.8275 126.1021 0.138655 439.8218 439.6832 14.82025 30.36555 -16.916 99.88 116.796 0.087967 0.281478 1.19E-14 1 1 37.7025 22.89726 17.357 92.18 74.823 3.506447 2.658928 1 11.7447 10.7447 227.3673 50.06505 50.06505 311.192 261.1269 262.7155 129.0834 1 431.9165 430.9165 14.69858 30.45055 -16.916 99.88 116.796 0.087938 0.281485 8.32E-32 1 1 17.78671 13.87871 -1.684 52.285 53.969 0.597163 0.434697 0.004421 1.84934 1.844919 123.8779 35.77859 17.31 174.983 157.673 120.5644 71.20098 0.155001 204.2721 204.1171 37.20604 23.1366 14.707 92.18 77.473 0.083338 0.282328 1.17E-17 1 1 14.81967 30.36432 -16.916 99.869 116.785 0.087906 0.281492 1.19E-14 1 1 37.20604 23.1366 14.707 92.18 77.473 3.135698 2.932463 0.122799 11.7447 11.6219 -7.67393 4.998902 -18.4536 4.998902 23.45247 0.083338 0.282328 8.12E-35 1 1 4.689659 18.45213 -17.5556 53.96497 71.52054 0.083334 0.28233 2.09E-21 1 1 -5.456 6.518497 -18.3252 6.518497 24.84371 0.083338 0.282328 1.17E-17 1 1 -7.67407 4.998869 -18.4536 4.998869 23.45244 0.083334 0.28233 2.09E-21 1 1 83 TABLE 11: Summary Statistics of Nest Projections after 10 years (N10) of the positive control group. All units are in Nests. Trial Means St.Dev Min Max Range C 8.686036 6.296198 -1.35245 12.84592 14.19837 DC 0.324116 0.21772 0.21489 0.711775 0.496885 O 211.3624 124.7196 27.357 311.9645 284.6075 DO 295.4947 182.0664 33.68451 440.829 407.1445 H -1.05997 22.11139 -14.6894 37.73063 52.42006 DH 0.000944 0.002111 0 0.004721 0.004721 R 53.09421 39.45305 0.46196 DR 5.171332 2.540185 2.540185 9.296599 6.756414 E 220.9579 129.3804 27.357 311.9617 284.6047 DE 291.142 170.714 33.68451 440.829 407.1445 OH -1.0921 22.12406 -14.6894 37.72569 52.41512 DOH 0.000761 0.001703 0 0.003807 0.003807 OR 26.59586 21.64653 -1.35245 54.58461 55.93706 DOR 0.716341 0.419361 0.419361 1.437427 1.018066 OE 120.3965 70.64999 14.84971 175.389 160.5393 DOE 100.8953 52.43802 26.60819 166.0678 139.4596 HR -5.99571 12.21505 -14.7538 14.46248 29.2163 DHR 8.64E-06 1.93E-05 0 4.32E-05 4.32E-05 HE -1.05997 22.11139 -14.6894 37.73063 52.42006 DHE 0.000944 0.002111 0 0.004721 0.004721 RE 52.97759 39.26183 0.46196 104.3679 103.9059 DRE 4.396096 1.25215 1.25215 5.778534 4.526384 OHR -7.94798 9.247058 -14.7757 9.247058 24.02272 DOHR 7.37E-06 1.65E-05 0 OHE -4.98596 14.99171 -14.7266 20.71365 35.44024 DOHE 0.000128 0.000286 0 0.000639 0.000639 104.951 3.68E-05 104.489 3.68E-05 84 Trial Means St.Dev Min Max Range HRE -5.99571 12.21505 -14.7538 14.46248 29.2163 DHRE 8.64E-06 1.93E-05 0 4.32E-05 4.32E-05 ALL -7.94798 9.247058 -14.7757 DALL 1.20E-06 2.69E-06 0 9.247058 24.02272 6.02E-06 6.02E-06 85 TABLE 12: Summary Statistics of Nest Projections after 10 years (N10) of the negative groups. All units are in Nests. Table made with “Stargazer” (Hlavac 2018). Trial Means C 12.84592 0 0 12.84592 12.84592 0.21489 0 0 DC St.Dev Min Max 0.21489 Range 0.21489 O 3.244912 26.66573 -14.7457 33.88079 48.62649 DO 6.733086 11.66205 5.08E-19 20.19926 20.19926 H -0.24212 12.0827 -13.1205 12.0827 25.2032 DH 5.09E-09 8.81E-09 0 1.53E-08 1.53E-08 R 70.26567 35.22918 30.389 97.167 66.778 DR 3.472664 2.062487 1.971025 5.824315 3.853291 E 203.0277 188.6841 -14.8459 311.9645 326.8105 DE 293.1201 253.8495 0 439.6802 439.6802 OH -14.6566 0.297953 DOH 2.61E-29 OR -2.63198 1.259693 DOR 0.010769 0.017208 0.000719 0.030638 0.029919 OE 21.54878 41.0471 -14.8459 66.04125 80.88719 DOE 7.569119 12.91092 0 22.47677 22.47677 HR -6.42195 10.49922 -14.4772 10.49922 24.97647 DHR 2.05E-10 HE -5.70718 14.36045 DHE 5.09E-09 RE 37.57046 56.35099 -14.8459 97.16766 112.0136 DRE 2.598447 2.962416 0 5.824315 5.824315 OHR -13.7749 1.13904 -14.6694 DOHR 1.39E-30 2.40E-30 0 -13.633 1.57345 -14.8459 OHE 4.51E-29 3.55E-10 8.81E-09 -14.8328 0.297953 0 7.82E-29 15.1308 7.82E-29 -3.66749 1.259693 4.927183 0 6.15E-10 6.15E-10 -14.8459 14.36045 29.20639 0 1.53E-08 1.53E-08 1.13904 15.80844 4.16E-30 4.16E-30 1.57345 16.41939 86 Trial Means St.Dev Min Max Range DOHE 3.35E-14 5.80E-14 0 1.00E-13 1.00E-13 HRE -7.95699 11.61418 DHRE 2.05E-10 ALL -14.0026 1.310719 DALL 3.35E-14 3.55E-10 5.80E-14 -14.8459 11.61418 26.46011 0 6.15E-10 6.15E-10 -14.8459 1.310719 16.15666 0 1.00E-13 1.00E-13 87 TABLE 13: Summary Statistics for Mean Nest Density of the Test Groups. All units are in nests. Table was made with “Stargazer” (Hlavac 2018). Trial Means St.Dev Min Max Range C 0.124029 0.035277 0.035277 0.192892 0.157615 DC 0.004235 0.003688 0.002165 0.016667 0.014502 O 2.948504 0.631156 DO 3.784934 H 0.190696 0.395825 -0.21687 1.280513 1.497385 DH 0.001394 0.004516 1.66E-16 0.016667 0.016667 R 0.491018 0.261401 0.253051 1.197143 0.944092 DR 0.046413 0.033197 0.015385 0.152529 0.137144 E 3.008939 0.239235 0.239235 3.512986 DE 3.423084 1.532854 0.015385 4.407311 4.391926 OH 0.189006 0.397015 -0.21687 1.280513 1.497385 DOH 0.001394 0.004516 1.16E-33 0.016667 0.016667 OR 0.231173 0.165216 -0.02339 0.679026 0.702415 DOR 0.008095 0.005791 6.14E-05 0.024017 0.023956 OE 1.637377 0.323778 0.240417 1.959986 1.719569 DOE 1.596143 0.948873 0.002153 2.726958 2.724806 HR 0.483616 0.262584 0.188551 1.197143 1.008592 DHR 0.001336 0.004528 1.62E-19 0.016667 0.016667 HE 0.190688 0.39581 -0.21687 1.280372 1.497244 DHE 0.001394 0.004516 1.66E-16 0.016667 0.016667 RE 0.483616 0.262584 0.188551 1.197143 1.008592 DRE 0.041268 0.036909 0.001574 0.152529 0.150954 OHR -0.10154 0.063072 -0.23658 0.063072 0.299657 OHE 0.059322 0.242268 -0.22507 0.691859 DOHE 0.001335 0.004528 2.91E-23 0.016667 0.016667 HRE -0.07329 0.084532 -0.23494 0.084532 0.07575 3.512986 3.437236 1.46112 0.001926 4.407311 4.405385 3.27375 0.91693 0.31947 88 Trial Means St.Dev Min Max Range DHRE 0.001336 0.004528 1.62E-19 0.016667 0.016667 ALL -0.10154 0.063072 -0.23658 0.063072 0.299656 DALL 0.001335 0.004528 2.91E-23 0.016667 0.016667 89 TABLE 14: Summary Statistics for Mean Nest Density of the Positive Controls. All units are in nests. Table was made with “Stargazer” (Hlavac 2018). Trial Means St.Dev Min Max Range C 0.067765 0.132411 DC 0.020118 0.038512 0.002149 0.088972 0.086823 O 3.011135 0.311897 0.311897 3.419625 3.107727 DO 4.070482 0.398905 0.398905 H -0.03591 0.46782 DH 1.85E-05 4.14E-05 R 0.63314 0.41144 0.057745 -0.16906 0.132411 0.301467 4.40829 4.009386 -0.53158 0.739816 1.271401 0 9.26E-05 9.26E-05 1.04951 0.991765 DR 0.148674 0.181153 0.028427 0.466614 0.438187 E 3.120518 0.195267 0.195267 3.419625 3.224358 DE 4.101137 0.454023 0.454023 OH -0.03624 0.467829 DOH 1.49E-05 OR 0.284958 0.298709 DOR 3.34E-05 4.40829 3.954268 -0.53158 0.739719 1.271304 0 7.47E-05 7.47E-05 -0.16906 0.545846 0.714902 0.02436 0.036324 0.004349 0.088972 0.084623 OE 1.699962 0.118569 0.118569 1.856214 1.737645 DOE 1.797211 0.925386 0.878765 3.326024 2.447259 HR -0.12999 0.288762 DHR 1.69E-07 3.79E-07 HE -0.03591 0.46782 DHE 1.85E-05 4.14E-05 RE 0.631974 0.40997 0.057745 1.043679 0.985934 DRE 0.140922 0.184952 0.028427 0.466614 0.438187 OHR -0.1645 0.236645 OHE -0.10787 0.334293 DOHE 2.51E-06 HRE -0.12999 0.288762 5.60E-06 -0.53168 0.288762 0.820439 0 8.47E-07 8.47E-07 -0.53158 0.739816 1.271401 0 9.26E-05 9.26E-05 -0.53169 0.236645 0.768336 -0.53162 0 0.40615 0.937769 1.25E-05 1.25E-05 -0.53168 0.288762 0.820439 90 Trial DHRE ALL DALL Means St.Dev Min Max Range 1.69E-07 3.79E-07 0 8.47E-07 8.47E-07 -0.1645 0.236645 2.36E-08 5.28E-08 -0.53169 0.236645 0.768336 0 1.18E-07 1.18E-07 91 TABLE 15: Summary Statistics for Mean Nest Density of the Negative Controls. All units are in nests. Table was made with “Stargazer” (Hlavac 2018). Trial Means St.Dev Min Max Range C 0.128459 0 0 0.128459 0.128459 DC 0.002149 0 0 0.002149 0.002149 O 0.032449 0.266657 0.14746 0.338808 0.486265 DO 0.067331 0.11662 5.08E21 0.201993 0.201993 H -0.00242 0.120827 -0.1312 0.120827 0.252032 DH 5.09E-11 8.81E-11 0 1.53E-10 1.53E-10 R 0.702657 0.352292 0.30389 0.97167 0.66778 DR 0.034727 0.020625 0.01971 0.058243 0.038533 E 2.030277 1.886841 0.14846 3.119645 3.268105 DE 2.931201 2.538495 0 4.396802 4.396802 OH -0.14657 0.00298 0.14833 DOH 2.61E-31 4.51E-31 0 0.00298 0.151308 7.82E-31 7.82E-31 OR -0.02632 0.012597 0.03667 0.012597 0.049272 DOR 0.000108 0.000172 7.19E06 0.000306 0.000299 OE 0.215488 0.410471 0.14846 0.660413 0.808872 DOE 0.075691 0.129109 0 0.224768 0.224768 HR -0.06422 0.104992 0.14477 0.104992 0.249765 DHR 2.05E-12 HE -0.05707 0.143605 DHE 5.09E-11 3.55E-12 8.81E-11 0 6.15E-12 6.15E-12 0.14846 0.143605 0.292064 0 1.53E-10 1.53E-10 92 Trial Means St.Dev Min Max Range RE 0.375705 0.56351 0.14846 0.971677 1.120136 DRE 0.025984 0.029624 0 0.058243 0.058243 OHR -0.13775 0.01139 OHE -0.13633 0.015734 DOHE 3.35E-16 5.80E-16 HRE -0.07957 0.116142 DHRE 2.05E-12 3.55E-12 ALL -0.14003 0.013107 DALL 3.35E-16 5.80E-16 0.14669 0.01139 0.158084 0.14846 0.015734 0.164194 0 1.00E-15 1.00E-15 0.14846 0.116142 0.264601 0 6.15E-12 6.15E-12 0.14846 0.013107 0.161567 0 1.00E-15 1.00E-15 93 TABLE 16: Outputs of the three One-Way ANOVAs. The first ANOVA compares N10 for Trial O to the other continuous trials, while comparing N10 for Trial DO to all other discrete trials. The second ANOVA compares N10 and Mean Nest Density of Anchorage, AK, to the N10 and Mean Nest Density to all other sites in the test group. The third ANOVA compares N10 of the test group for all trials to N10 of the positive and negative control groups. The One-Way ANOVA model (R Core Team) requires one variable designated for comparison. The p-values indicate that Alaska is significantly different than the other ports for all trials. Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Trial P-Value – N10 Projections Site 1.08e-13 *** C Anchorage 1.47e-06 *** DC P-Value – N10 Projections P-Value Mean Nest Density Trial Group P-value N/A N/A Positive Control 0.301 < 2e-16 *** 3.15e-15 *** Negative Control <2e-16 *** Baltimore O N/A Biloxi 5.25e-07 *** 9.38e-09 *** DO N/A Boston 9.93e-08 *** 7.72e-10 *** 0.000454 *** 4.24e-13 *** Charleston 1.85e-10 *** 4.10e-09 *** 9.69e-15 *** H 0.963430 DH Houston R 8.28e-13 *** Jacksonville 3.27e-09 *** 4.79e-08 *** DR 0.024648 * Los Angeles < 2e-16 *** 8.31e-15 *** E < 2e-16 *** Miami 0.967191 0.385207 3.82e-07 *** 2.80e-06 *** DE 0.000194 *** Mobile OH < 2e-16 *** New Orleans 1.71e-08 *** 2.05e-07 *** DOH 0.860158 New York 5.25e-07 *** 4.09e-06 *** OR 1.27e-14 *** Pensacola 9.93e-08 *** 8.41e-07 *** Philadelphia 0.000454 *** 0.000685 *** 2.00e-11 *** 6.04e-10 *** 2.23e-08 *** DOR < 2e-16 *** OE Port Charlotte DOE 0.571501 Portland ME 1.22e-13 *** 7.72e-12 *** HR 0.949 Portland OR < 2e-16 *** 1.56e-14 *** DHR 0.744362 Providence 5.81e-07 *** 4.73e-06 *** 94 Site Trial P-Value – N10 Projections P-Value – N10 Projections P-Value Mean Nest Density HE 2.87e-07 *** San Francisco 0.115240 0.146754 DHE 0.568416 Savannah 0.093864 0.124283 1.13e-11 *** 3.73e-10 *** 0.08795 0.08043 0.000123 *** 0.000403 *** 0.882858 0.901461 1.66e-10 *** RE DRE Trial Group P-value SaintJohn 0.002876 ** St Petersburg 0.177 OHR Seattle 0.0124 * DOHR Vancouver OHE 0.276 DOHE 0.521720 HRE 6.70e-05 *** DHRE 4.983e11*** ALL 0.858 DALL 4.983e11*** 95 TABLE 17: Summary of Economic Impact Cost Calculations in the United States. Calculations are based on trial means from O and DO, being the highest trial means. Calculation Steps Pollinator Services (U.S.D.) Total Value 10.95 billion Means ContinuousDiscrete Low Estimate ContinuousDiscrete High Estimate [ 5% of the value Total Impact Honey Production (U.S.D) 353 million Removal Cost (U.S.D) 140.96-542.15 (accessed 2.21.20, Bloomberg.com) 222.745-289.823 31,398.135 – 40,853.450 120,761.202- 157,127.539 547,500,000.0 17,650,000.0 565,150,000.0 565,181,398.135 – 565190853.539 565,270,761.202 – 565,307,127.539 96 TABLE 18: Summary of Economic Impact Cost Calculations in the Canada. Calculations are based on trial means from O and DO, being the highest trial means. Calculation Steps Total Value Means Continuous-Discrete Low Estimate Continuous-Discrete High Estimate [ 5% of the value Total Impact Honey Production (C.A.D) 129.5 million Removal Cost (C.A.D.) 186.429-694.04 (accessed 2.21.20, Bank of Canada) 222.745-289.823 41,526.350- 54,031.701 154,593.9398- 207,397.338 6,475,218. 878 6,475,218. 878 6,516,745.228 – 6,529,250.579 6,629,812.816 – 6,682,616.216 97 FIGURES FIGURE 1: Diagram of V. velutina life cycle (Monceau et al. 2014). 98 FIGURE 2: An illustration of the phases of biological invasions, sensu lato (Sakai et al. 2001). Phases were modeled for a hypothetical biological invasion by V. velutina: Transportation from importation records, Establishment in an ecological niche model, Spread predicted by FisherSkellam and Archer-Markov simulations, and invasive species impact estimated using estimated economic values. 99 FIGURE 3: V. velutina Occurrence in Europe (GBIF.org. Accessed 2019). Dark Red is positive for an V. velutina nests. Map created in QGIS (QGIS Development Team 2019). 100 FIGURE 4: V. velutina Occurrence in Europe (GBIF.org. Accessed 2019). Dark Red is positive for an V. velutina nests. Map created in QGIS (QGIS Development Team 2019). 101 FIGURE 5: Heat map of V. velutina habitat suitability, defined as the probability of occurrence. Map was generated in QGIS (QGIS Development Team 2019) from the output raster of the Niche Analysis, Darkest red indicates the probability of occurrence =1, to wit highly suitable habitat, while light pink indicates probability of occurrence=0, highly inhospitable habitat. 102 FIGURE 6: Program Architecture for the Niche Analysis script. 103 FIGURE 7: Program Architecture of the Continuous Growth and Spread Simulation. The Discrete Simulation differs only in that the Main Algorithm carries out the Markov Chain Calculation rather than solving a differential equation. 104 FIGURE 8: Human Population Density in the North America. (CIESIN, 2018). Darkest Red have the maximum Hunan Population Density (> 2000 people per km). Original map contained population density for the entire world. This map was clipped from the larger one. Map created in QGIS (QGIS Development Team 2019). 105 FIGURE 9: Major River Courses in North America. (Kelso and Patterson 2009). Dark red indicates River Presence =1.while pink indicates River Presence=0. Map was of the entire world. This map was clipped from that larger map. Map created in QGIS. (QGIS Development Team 2019). 106 FIGURE 10: Elevation (m) in North America. (Archuleta et al. 2017). The input raster contained elevation data from the entire world. This map was clipped from a portion of that map. Darkest red has the highest elevation. Map created in QGIS (QGIS Development Team 2019).. 107 FIGURE 11: Habitat suitability of V. velutina by US State. US vector shapefile obtained from the National Weather Service (Accessed 15 June 2020). Map generated in QGIS 3.8.3 (QGIS Development Team 2019). 108 FIGURE 12: Geometry of a distribution of a Continuous Trial with negative nest density at the point of origin. Map was created from the output raster of trial O beginning in St. Paul, MN. Map was created in QGIS (QGIS Development Team 2019). 109 FIGURE 13: Nest Projection (N10) Means for Each Trial with Standard Error. Character denotes whether the trial was Continuous or Discrete. Trial Number is listed in Table 5. Graph was generated using the “ggplot2” package (Wickham 2016) in R (R Core Team 2019). 110 FIGURE 14: A) Boxplot of N10 Distributions for all trials for the Test Group; B) Boxplot of N10 Distributions for all trials for the Positive Control Group; C) Boxplot of N10 Distributions for all trials for the Negative Control Group. All Boxplots were made with ggplot2 (Wickham 2016) and reshape (Wickham 2007). 111 A) B) FIGURE 15: A) Boxplots of N10 projections for all sites across all trials. B) N10 projections by site for Trials O (teal) , DO (red), Mean Nest Density for Trial O (purple), and Mean Nest Density for Trial DO (Green). Site numbers can be found in Tables 4 and 5. All Boxplots were made with ggplot2 (Wickham 2016) and reshape (Wickham 2007). 112 A) Continuous Geometry B) Discrete Geometry FIGURE 16: A comparison of the geometries of the distributions produced by (A) the continuous simulation and (B) the discrete simulation. Both maps were generated in QGIS (QGIS Development Team 2019) from the output rasters of the trials O and DO respectively. 113 FIGURE 17: Boxplots of distributions of the means of the continuous trials (Column1) and the discrete trials (Column 2). The y-axis represents population projections after ten years, N10, in nests. The boxplots was generated using the “boxplot” command (R Core Team 2019) in base R. 114 A-Test Sites B-All Sites FIGURE 18: Boxplots of distributions of the continuous trial O (Column1) and the discrete trial DO (Column 2). The y-axis represents population projections after ten years, N10, in nests. (A) Comparison of the distributions of the test groups between the two trials. (B) Comparison of the distributions of all sites, test group and control group together. Boxplots were generated using the “boxplot” command (R Core Team 2019) in base R. 115 FIGURE 19: Boxplot of N10 Population Projections for the Test Group, Negative Control Group, and Positive Control Group across all Trials. All Boxplots were made with ggplot2 (Wickham 2016) and reshape (Wickham 2007). 116 A) B) C) FIGURE 20: A) Biplot of the Principle Components Analysis (PCA) of the Nest Projections after 10 years (N10) for the test group. PCA and Biplot performed in R (R Core Team 2019). B) Elbow Plot of the variances explained by Principle Components. Elbow plot created in R (R Core Team 2019). C) Plot of Principle Component 1 by Principle Component 2, colored for Human Population Density. PCA showed Human Population Density (variable name: “hpop”) the major driver of PC1. Plot made with ggplot2 (Wickham 2016). 117 FIGURE 21: Correlation plot with Linear Regression Line, Land Area by Human Population Density at the Starting Point. Plot created with ggplot2 (Wickham 2016). Correlation= 0.314, p-value=0.134. 118 FIGURE 22: Heat map of M. apicalis suitability, defined as the probability of occurrence. Probability was calculated with Wallace (Kass et al. 2018). The map image was generated using QGIS (QGIS Development Team 2019). Darkest red areas have the highest occurrence probability, to wit the highest environmental suitability. 119 APPENDIX A PYTHON SOURCE CODE A.1 Script 1-Habitat distribution modeling ## Modules ## import funx as f #Stats and Calculus Library import mfunx as m #Matrix Manipulation Library ## Data ## #Header for the Output Raster 120onca=”ncols 3600\nnrows 1500\nxllcorner -180.000000000000\nyllcorner -60.000000000000\ncellsize 0.100000000000\nNODATA_value 3.4028234663852885981e+38\n” ## Extract Occurance Data ## #Open I Data(.csv) tfile=open(“test_extract_form.csv”,”r”,errors=”ignore”) rfile=tfile.readlines() #Open Empty List of Coords coordlist=[] #Make List of Ordered Points #Make Sure to Check Indices for I in rfile: coordcell=[] #open new cell 120 coordline=i.split(“,”)#split line at comma coordcell.append(coordline[17])#latitude string to cell coordcell.append(coordline[18])#longitude string to cell coordlist.append(coordcell) #add cell to list #Clip First Entry – Pulled From the Header – From List del coordlist[0] #Close .csv File tfile.close() ## Write I Points Out ## # Write the raw points to a .txt file for quick access if needed later extractfile=open(“extracted_coordinates.txt”,”w”) extractfile.write(“Latitude Longitude\n”) extractfile.write(“\n”) for I in coordlist: extractfile.write(i[0]+” “+i[1]+”\n”) extractfile.close() ## Clean Up I Points ## #Convert Matrix of Strings to Matrix of Floats #Filter Out And Report Corrupted Data coord_fl=[] #float matrix opened for I in coordlist: coordcell=[] #open new cell ecount=0 #initiate error count errorlist=[] #open list of corrupted cells 121 e_index_list=[] #open list of corrupted indicies try: x=float(i[0]) y=float(i[1]) coordcell.append(x) coordcell.append(y) coord_fl.append(coordcell) except: ecount=ecount+1 #increase error counter by 1 errorlist.append(i)#add corrupted cell to list e=coordlist.index(i) #index of corruption e_index_list.appendI #add index to list ## Write Error Report ## #Write an Error Report #Error defined as an invalid [x,y] point ##efile=open(“Occurance_Error_Report.txt”,”w”) ##efile.write(“Occurance Points Error Report: “) ##er_count=”Number of Errors: “+str(ecount)+” “ ##efile.write(er_count) ##if ecount!=0: ## for I in e_index_list: ## efile.write(“Index of Errors: “+str(i)+” “) ## for j in errorlist: ## efile.write(“Error Cells: “+j[0]+” “+j[1]+” “) ##efile.close() ## Convert GPS to Matrix Indicies ## 122 ##Convert GPS points to Matrix indicies #Open Empty List Of Coordinates coord_index=[] #Create Points for I in coord_fl: coord_index.append(f.CoordConvert(i)) ## Eliminate Duplicate Points ## #Get Rid of Duplicate indicies ucoords=m.NoDupRows(coord_index) ## Read In Rasters ## #Build Matrix of Minimum Temperature Values tmin_file=open(“raster_library_avg_min_temp.txt”,”r”) #open avg min temp data tmin_mat=m.RasterToMat(tmin_file)#convert raster to matrix of floats from “mfunx” tmin_file.close()#close File #Build Matrix of Precip Values precip_file=open(“raster_library_avg_precip_mm.asc”,”r”) #open avg precip precip_mat=m.RasterToMat(precip_file) #convert raster to matrix of floats precip_file.close() #close file ## Extract Values at Occurance Points ## tmin_occur_list=[] #Initialize list of temperatures for I in ucoords:#iterate through points 123 tmin_occur_list.append(m.valxy(tmin_mat,i[0],i[1])) #append val at (x,y) to list. Precip_occur_list=[] #initialize list of temperatures for I in ucoords: precip_occur_list.append(m.valxy(precip_mat,i[0],i[1])) ## #### Process NODATA values ## #filter out NODATA error points from tmin nu_o_list_tmin=[] for I in tmin_occur_list: if i!=-3.4028234663852886e+38: nu_o_list_tmin.append(i) #filter out NODATA errors from precip nu_o_list_precip=[] for I in precip_occur_list: if i!=-3.4028234663852886e+38: nu_o_list_precip.append(i) ## Write Out Raw Data ## ###Write Values at Points to Report ##raw_precip=open(“raw_precip_data.txt”,”w”) ##raw_precip.write(“Raw Total Yearly Precipitation Data (mm): \n”) ##raw_precip.write(“\n”) ##for I in nu_o_list_precip: ## raw_precip.write(str(i)+”\n”) ## ##raw_precip.close() 124 ## ##raw_tmin=open(“raw_tmin_data.txt”,”w”) ##raw_tmin.write(“Raw Avg Minimun Temperature (degrees C*10):\n”) ##raw_tmin.write(“\n”) ##for I in nu_o_list_tmin: ## raw_tmin.write(str(i)+”\n”) ## ##raw_precip.close() ## #### Compute Statistics ## ###tmin ##mu=f.Mu(nu_o_list_tmin) ##me=f.Median(nu_o_list_tmin) ##sd=f.SD(mu,nu_o_list_tmin) ##range_list=f.MaxMinRange(nu_o_list_tmin) ##occur_dis_tmin=f.DistroMat(nu_o_list_tmin) ##mode_l=f.Mode(occur_dis_tmin) ## ###precip ##mu_p=f.Mu(nu_o_list_precip) ##me_p=f.Median(nu_o_list_precip) ##sd_p=f.SD(mu,nu_o_list_precip) ##range_list_p=f.MaxMinRange(nu_o_list_precip) ##occur_dis_precip=f.DistroMat(nu_o_list_precip) ##mode_l_precip=f.Mode(occur_dis_precip) ## #### Write Stats Report ## ## 125 ###Write tmin Stats to Report ##tmin_stats=open(“Occurance_Statistics_Min_Temp.txt”,”w”) ##tmin_stats.write(“Occurance Statistics: \n”) ##tmin_stats.write(“Average Minimum Temperature (degrees c*10): \n”) ##tmin_stats.write(“\n”) ##tmin_stats.write(“Mean: “+str(mu)+”\n”) ##tmin_stats.write(“Median: “+str(me)+”\n”) ##tmin_stats.write(“Mode: “+str(mode_l)+”\n”) ##tmin_stats.write(“Standard Deviation: “+str(sd)+”\n”) ##tmin_stats.write(“Range: “+str(range_list)+”\n”) ##tmin_stats.write(“\n”) ##tmin_stats.write(“Minimum Temperature Frequency\n”) ##tmin_stats.write(“\n”) ## ##for I in occur_dis_tmin: ## tmin_stats.write(str(i[0])+” “+str(i[1])+”\n”) ## ##tmin_stats.close() ## ## ###Write Precip Stats to Report ##precip_stats=open(“Occurance_Statistics_Precip.txt”,”w”) ##precip_stats.write(“Occurance Statistics: \n”) ##precip_stats.write(“Total Yearly mm Precipitation: \n”) ##precip_stats.write(“\n”) ##precip_stats.write(“Mean: “+str(mu_p)+”\n”) ##precip_stats.write(“Median: “+str(me_p)+”\n”) ##precip_stats.write(“Mode: “+str(mode_l_precip)+”\n”) 126 ##precip_stats.write(“Standard Deviation: “+str(sd_p)+”\n”) ##precip_stats.write(“Range: “+str(range_list_p)+”\n”) ##precip_stats.write(“\n”) ##precip_stats.write(“Minimum Temperature Frequency\n”) ##precip_stats.write(“\n”) ## ##for I in occur_dis_precip: ## precip_stats.write(str(i[0])+” “+str(i[1])+”\n”) ## ##precip_stats.close() ## Create Output Rasters ## #Open a Matrix with 1500 r0ws and 3600 columns n_mat=m.NbyOMat(1500,3600) #Add NODATA Value -3.4028234663852886e+38 to every cell niche_mat=m.AddMat(n_mat,-3.4028234663852886e+38) #Unique Points pz=m.ListOfPoints(1500,3600)#total list of points pre_pz=[] land_pz=[] for I in pz: if m.valxy(precip_mat,i[0],i[1])!= -3.4028234663852886e+38: pre_pz.append(i)#append all points with precip vals for I in pre_pz: if m.valxy(tmin_mat,i[0],i[1])!=-3.4028234663852886e+38: 127 land_pz.append(i) #Compute the Final Matrix ##for I in land_pz: ## tm=m.valxy(tmin_mat,i[0],i[1]) ## pc=m.valxy(precip_mat,i[0],i[1]) ## pval=f.LogOdds(tm,pc) ## if pval>2.0: ## pval=2 ## m.ChangeMat(niche_mat,i[0],i[1],pval) #Compute occurrence matrix for I in land_pz: m.ChangeMat(niche_mat,i[0],i[1],0) for I in ucoords: m.ChangeMat(niche_mat,i[0],i[1],1) #Write the Matrix to the Output Raster ##mout=open(“Niche_Analysis_Output_Raster”,”w”) ##mout.write(128onca) ##moutstr=m.MatToString(niche_mat) ##mout.write(moutstr) ##mout.close() #Write occurrence raster to out mout=open(“Vespa_Occurrence”,”w”) mout.write(128onca) moutstr=m.MatToString(niche_mat) 128 mout.write(moutstr) mout.close() ##Write land points to .txt file ##need it for Growth / Spread ##pzout=open(“Land_Point_Values_Output_List.txt”,”w”) ##for I in land_pz: ## pzout.write(str(i[0])+” “+str(i[1])+”\n”) ##pzout.close() ## A.2 Script 2-Continuous Growth and Spread Simulation # Import Modules # import funx as f # Stats, Arithmatic, and Calculus Library import mfunx as m # Matrix Manipulation Library ## Header ## #Header for the Output Rasters 129onca=”ncols 3600\nnrows 1500\nxllcorner -180.000000000000\nyllcorner -60.000000000000\ncellsize 0.100000000000\nNODATA_value 3.4028234663852885981e+38\n” ### Read in Rasters # #####Niche Model Output ##nish=open(“Niche_Analysis_Output_Raster.asc”,”r”) ##nish_f=m.RasterToMat(nish) ##nish.close() #######Population Density 129 ##popfile=open(“Population_Density_Adjusted.asc”,”r”) ##pop_mat=m.RasterToMat(popfile) ##popfile.close() #####River Course ##rivfile=open(“World_River_Raster.asc”,”r”) ##riv_mat=m.RasterToMat(rivfile) ##rivfile.close() ###Elevation ##elevfile=open(“World_Elevation_Raster.asc”,”r”) ##elev_mat=m.RasterToMat(elevfile) ##elevfile.close #M apicalis ##nish=open(“Bee_Niche_Processed.asc”,”r”) ##nish_f=m.RasterToMat(nish) ##nish.close() ## Growth Matrices ## #Growth Matrix Control ##growfile=open(“Growth Matrix Control.asc”,”r”) ##rlist=m.RasterToMat(growfile) ##growfile.close() #Growth Matrix – niche only ##growfile=open(“Growth Matrix Occur Only.asc”,”r”) ##rlist=m.RasterToMat(growfile) ##growfile.close() #Growth Matrix – pop only 130 ##growfile=open(“Growth Matrix Pop.asc”,”r”) ##rlist=m.RasterToMat(growfile) ##growfile.close() #Growth Matrix – rivers only ##growfile=open(“Growth Matrix Rivers.asc”,”r”) ##rlist=m.RasterToMat(growfile) ##growfile.close() #Growth Matrix -Elevation only ##growfile=open(“Growth Matrix Elev Only.asc”,”r”) ##rlist=m.RasterToMat(growfile) ##growfile.close() #Growth Matrix-Occur and Pop ##growfile=open(“Growth Matrix Occur and Pop v3.asc”,”r”) ##rlist=m.RasterToMat(growfile) ##growfile.close() #Growth Matrix-Occur and River ##growfile=open(“Growth Matrix Occur River.asc”,”r”) ##rlist=m.RasterToMat(growfile) ##growfile.close() #Growth Matrix-Occur and Elevation ##growfile=open(“Growth Matrix Occur Elev.asc”,”r”) ##rlist=m.RasterToMat(growfile) 131 ##growfile.close() #Growth Matrix -Pop and Rivers ##growfile=open(“Growth Matrix Pop River.asc”,”r”) ##rlist=m.RasterToMat(growfile) ##growfile.close() #Growth Matrix-Pop and Elevation ##growfile=open(“Growth Matrix Pop Elev.asc”,”r”) ##rlist=m.RasterToMat(growfile) ##growfile.close() #Growth Matrix-River and Elevation ******* ##growfile=open(“Growth Matrix River Elev.asc”,”r”) ##rlist=m.RasterToMat(growfile) ##growfile.close() #Growth Matrix-Occur Pop River ##growfile=open(“Growth Matrix Occur Pop River.asc”,”r”) ##rlist=m.RasterToMat(growfile) ##growfile.close() #Growth Matrix-Occur Pop Elev* ##growfile=open(“Growth Matrix Occur Pop Elev.asc”,”r”) ##rlist=m.RasterToMat(growfile) ##growfile.close() 132 #Growth Matrix-Occur River Elevation ##growfile=open(“Growth Matrix Occur River Elev.asc”,”r”) ##rlist=m.RasterToMat(growfile) ##growfile.close() #Growth Matrix-Pop River Elev ##growfile=open(“Growth Matrix Pop River Elev.asc”,”r”) ##rlist=m.RasterToMat(growfile) ##growfile.close() #Growth Matrix-Occur River Pop Elevation *Done* ##growfile=open(“Growth Matrix Occur Pop River Elev.asc”,”r”) ##rlist=m.RasterToMat(growfile) ##growfile.close() #Growth Matrix Bee growfile=open(“Growth Matrix Bee.asc”,”r”) rlist=m.RasterToMat(growfile) growfile.close() #### Read in List Of Points ## ###open list of points pz=open(“Land_Point_Values_Output_List.txt”,”r”) pzlist=pz.readlines() pointmat=[] for I in pzlist: p=i.split(“ “) p[0]=int(p[0]) 133 p[1]=int(p[1]) pointmat.append(p) pz.close() ## #### Growth Matrix ## ###Ideal Growth Rate ###e^r = 560*.02 = 11.2 ###r=ln(11.2)=2.415 ###r=(p-1)*x ###r=2-1)x=2.415 ###x=2.415 #for M apicalis #r=1.954 ##grow_mat=m.NbyOMat(1500,3600) ##rlist=m.AddMat(grow_mat,-3.4028234663852886e+38) ##for p in pointmat: ## pval=m.valxy(nish_f,p[0],p[1]) ## growval=pval*1.954 ######## proval=pval-1.0 #### popdenval=m.valxy(pop_mat,p[0],p[1]) #### popval=float(popdenval/2000) ###### g_val=m.valxy(nish_f,p[0],p[1]) #### 134onca=m.valxy(riv_mat,p[0],p[1]) #### if 134onca==0: #### 134onca=.5 134 ########## growval=g_val*135onca #### e_val=m.valxy(elev_mat,p[0],p[1]) #### if e_val>791.5: #### egval=0 #### if e_val<=791.5: #### egval=1 ########## growval=g_val #### growval=egval*135onca*popval*2.415 ###### growval=1 ## m.ChangeMat(rlist,p[0],p[1],growval) ###### Write Out Growth Raster ## #####Do this only once #######then comment out grow_out=open(“Growth Matrix Bee.asc”,”w”) grow_out.write(135onca) growstr=m.MatToString(rlist) grow_out.write(growstr) grow_out.close() #### Initial Parameters ## # k=0.06 per square km. #Must Convert to Square Nautical Models #.205/nautical_mile^2 * 36 = 7.397 #k=7.397 #m apicalis k=2023.140 135 #Rate of Spread #D=km^2/year #D=7.2^2 #D=51.84 #D for M. Apicalis #7.982 ^2 D=63.722 #Standard Deviation # 3 standard deviations in any direction contains >99% of data #sd_x=2.4 #sd_y=2.4 #7.982/3 = 2.650871 sd_x=2.650871 sd_y=2.650871 #Number of Years Running the Simulation #t=10 #M. apicalis t=20 #Spacial Change h=.000001 136 # Run Scripts for All Points # #port_list=[[506,1033],[595,911],[476,1089],[600,899],[602,846],[578,989],[559,617],[5 72,1000],[596,983],[642,997],[593,918],[492,1059],[595,927],[500,1048],[632,981],[463 ,1097],[444,573],[481,1085],[522,575],[422,576],[620,975],[407, 568],[287, 301],[447, 1139]] port_list=[[555,603]] for I in port_list: #set starting point po=i mu_x=po[0] mu_y=po[1] ########################### ###### Initialize Matrix ## ########################### #Define the Size of the Rows and Columns col_val=1500 row_val=3600 # Open a col*row Matrix with All Vals =0 emptym=m.NbyOMat(col_val,row_val) #add nodata to every cell emptymat=m.AddMat(emptym,-3.4028234663852886e+38) #Change Each Value in “emptymat” to the 3D Gaussian #Value at that Position. For p in pointmat: gval=f.Gauss3D(mu_x,mu_y,sd_x,sd_y,p[0],p[1]) emptymat[p[0]][p[1]]=gval ############################### ###### Growth and Spread Loop # 137 ############################### #Compute Fisher Value at Each Point P after t years for p in pointmat: rval=m.valxy(rlist,p[0],p[1]) #pull r from list at point p fval=f.Fisher(mu_x,mu_y,sd_x,sd_y,p[0],p[1],D,h,5,rval,k) emptymat[p[0]][p[1]]=fval ############################### ## Compute Growth Statistics ## ############################### #Sum of Population within +/- 5 units mincol=po[0]-5 maxcol=po[0]+5 minrow=po[1]-5 maxrow=po[1]+5 nsum=m.MatSubSum(emptymat,mincol,maxcol,minrow,maxrow,3.4028234663852885981e+38) centerval=m.valxy(emptymat,po[0],po[1]) ########################### ##### Write to Raster Out # ########################### #Write the Result of the Growth/Spread Loop to an Output Raster #Make a String of the Final Matrix Final_Val_String=m.MatToString(emptymat) #Create a File Name outfile_name=”Bee_Fin_cond_mu_x_”+str(mu_x)+”_mu_y_”+str(mu_y)+”_D_”+str(D) +”_t_”+str(t)+”Pop_River_Elev.txt” #Open a Writable File fil=open(outfile_name,”w”) 138 #Write Header to File fil.write(139onca) #Write Final Value String to File fil.write(Final_Val_String) #Close the File fil.close() ############################### ## Append Statistics to File ## ############################### statfile=open(“Statistics_Data_Bee.asc”,”a”) statfile.write(outfile_name+”\n”) statfile.write(str(nsum)+”\n”) statfile.write(str(centerval)+”\n”) statfile.close() ## Run Script for All Control Locations ## ##control_list=[[450, 869],[555,966],[448, 1068],[549, 734],[475, 946]] ##for I in control_list: ## #set starting point ## po=i ## mu_x=po[0] ## mu_y=po[1] ## ########################### ## ###### Initialize Matrix ## ## ########################### ## #Define the Size of the Rows and Columns ## col_val=1500 139 ## row_val=3600 ## # Open a col*row Matrix with All Vals =0 ## emptym=m.NbyOMat(col_val,row_val) ## #add nodata to every cell ## emptymat=m.AddMat(emptym,-3.4028234663852886e+38) ## #Change Each Value in “emptymat” to the 3D Gaussian ## #Value at that Position. ## for p in pointmat: ## gval=f.Gauss3D(mu_x,mu_y,sd_x,sd_y,p[0],p[1]) ## emptymat[p[0]][p[1]]=gval ## ############################### ## ###### Growth and Spread Loop # ## ############################### ## #Compute Fisher Value at Each Point P after t years ## for p in pointmat: ## rval=m.valxy(rlist,p[0],p[1]) #pull r from list at point p ## fval=f.Fisher(mu_x,mu_y,sd_x,sd_y,p[0],p[1],D,h,5,rval,k) ## emptymat[p[0]][p[1]]=fval ## ############################### ## ## Compute Growth Statistics ## ## ############################### ## #Sum of Population within +/- 5 units ## mincol=po[0]-5 ## maxcol=po[0]+5 ## minrow=po[1]-5 ## maxrow=po[1]+5 ## nsum=m.MatSubSum(emptymat,mincol,maxcol,minrow,maxrow,3.4028234663852885981e+38) ## centerval=m.valxy(emptymat,po[0],po[1]) 140 ## ########################### ## ##### Write to Raster Out # ## ########################### ## #Write the Result of the Growth/Spread Loop to an Output Raster ## #Make a String of the Final Matrix ## Final_Val_String=m.MatToString(emptymat) ## #Create a File Name ## outfile_name=”Contr_cond_mu_x_”+str(mu_x)+”_mu_y_”+str(mu_y)+”_D_”+str(D)+” _t_”+str(t)+”Control.txt” ## #Open a Writable File #### fil=open(outfile_name,”w”) #### #Write Header to File #### fil.write(141onca) #### #Write Final Value String to File #### fil.write(Final_Val_String) #### #Close the File #### fil.close() #### ############################### ## ## Append Statistics to File ## ## ############################### ## statfile=open(“Control_Statistics_Control.asc”,”a”) ## statfile.write(outfile_name+”\n”) ## statfile.write(str(nsum)+”\n”) ## statfile.write(str(centerval)+”\n”) ## statfile.close() 141 A.3 Script 3 – Discrete Growth and Spread Simulation # Import Modules # import funx as f # Stats, Arithmatic, and Calculus Library import mfunx as m # Matrix Manipulation Library ## Header ## #Header for the Output Rasters 142onca=”ncols 3600\nnrows 1500\nxllcorner -180.000000000000\nyllcorner -60.000000000000\ncellsize 0.100000000000\nNODATA_value 3.4028234663852885981e+38\n” ## Initialize Constants ## K=7.319 gen_num=10 ### Read in Rasters # #Growth Matrix Control ##growfile=open(“Growth Matrix Control.asc”,”r”) ##rlist=m.RasterToMat(growfile) ##growfile.close() #Growth Matrix – niche only 142 ##growfile=open(“Growth Matrix Occur Only.asc”,”r”) ##rlist=m.RasterToMat(growfile) ##growfile.close() #Growth Matrix – pop only ##growfile=open(“Growth Matrix Pop.asc”,”r”) ##rlist=m.RasterToMat(growfile) ##growfile.close() #Growth Matrix – rivers only ##growfile=open(“Growth Matrix Rivers.asc”,”r”) ##rlist=m.RasterToMat(growfile) ##growfile.close() #Growth Matrix -Elevation only ##growfile=open(“Growth Matrix Elev Only.asc”,”r”) ##rlist=m.RasterToMat(growfile) ##growfile.close() #Growth Matrix-Occur and Pop ##growfile=open(“Growth Matrix Occur and Pop v3.asc”,”r”) ##rlist=m.RasterToMat(growfile) ##growfile.close() #Growth Matrix-Occur and River ##growfile=open(“Growth Matrix Occur River.asc”,”r”) ##rlist=m.RasterToMat(growfile) 143 ##growfile.close() #Growth Matrix-Occur and Elevation growfile=open(“Growth Matrix Occur Elev.asc”,”r”) rlist=m.RasterToMat(growfile) growfile.close() #Growth Matrix -Pop and Rivers ##growfile=open(“Growth Matrix Pop River.asc”,”r”) ##rlist=m.RasterToMat(growfile) ##growfile.close() #Growth Matrix-Pop and Elevation ##growfile=open(“Growth Matrix Pop Elev.asc”,”r”) ##rlist=m.RasterToMat(growfile) ##growfile.close() #Growth Matrix-River and Elevation ##growfile=open(“Growth Matrix River Elev.asc”,”r”) ##rlist=m.RasterToMat(growfile) ##growfile.close() #Growth Matrix-Occur River Pop ##growfile=open(“Growth Matrix Occur River Pop.asc”,”r”) ##rlist=m.RasterToMat(growfile) ##growfile.close() #Growth Matrix-Occur Pop Elev* 144 ##growfile=open(“Growth Matrix Occur Pop Elev.asc”,”r”) ##rlist=m.RasterToMat(growfile) ##growfile.close() ###Growth Matrix-Occur Pop River *Ignore This one ##growfile=open(“Growth Matrix Occur Pop River.asc”,”r”) ##rlist=m.RasterToMat(growfile) ##growfile.close() #Growth Matrix-Occur River Elevation ##growfile=open(“Growth Matrix Occur River Elev.asc”,”r”) ##rlist=m.RasterToMat(growfile) ##growfile.close() #Growth Matrix-Pop River Elev ##growfile=open(“Growth Matrix Pop River Elev.asc”,”r”) ##rlist=m.RasterToMat(growfile) ##growfile.close() #Growth Matrix-Occur River Pop Elevation ##growfile=open(“Growth Matrix Occur Pop River Elev.asc”,”r”) ##rlist=m.RasterToMat(growfile) ##growfile.close() #Growth Matrix Bee ##growfile=open(“Growth Matrix Bee.asc”,”r”) ##rlist=m.RasterToMat(growfile) ##growfile.close() 145 #### Read in List Of Points ## ###open list of points pz=open(“Land_Point_Values_Output_List.txt”,”r”) pzlist=pz.readlines() pointmat=[] for I in pzlist: p=i.split(“ “) p[0]=int(p[0]) p[1]=int(p[1]) pointmat.append(p) pz.close() # Run Scripts for All Points # #port_list=[[506,1033],[595,911],[476,1089],[600,899],[602,846],[578,989],[559,617],[5 72,1000],[596,983],[642,997],[593,918],[492,1059],[595,927],[500,1048],[632,981],[463 ,1097],[444,573],[481,1085],[522,575],[422,576],[620,975],[407,568],[287,301],[447,11 39]] #port_list=[[506,1033]]#diagnostic to run one trial #port_list=[[600,899]] for I in port_list: #set starting point# #################### 146 po=i#itial point mu_x=po[0]#x of initial point mu_y=po[1]#y of initial points #set region around starting point #MemmoryError occurs for all pointmat #A smaller zone is required #This defines that smaller zonw mincol=po[0]-10 maxcol=po[0]+10 minrow=po[1]-10 maxrow=po[1]+10 ################################ ## Generate Pobability Matrix ## ################################ small_pz=m.SubListOfPoints(mincol,maxcol,minrow,maxrow)#define smaller search area small_pointmat=[] for I in small_pz:#eliminate water units for j in pointmat: if i==j: small_pointmat.append(i) meta_pz=m.MetaMat(small_pointmat,small_pointmat)#all combinations of points for j in meta_pz: 147 dval=f.PointDist(j[0],j[1])#distance between points in combinations if dval>6.0: probval=0.0#w/o this, spread is too large. If dval<=6.0: probval=f.gauss(0,2.4,dval)#gauss. Prob. Of distance j.append(probval)#add probability to the combination ########################### ###### Initialize Matrix ## ########################### #Define the Size of the Rows and Columns col_val=1500 row_val=3600 # Open a col*row Matrix with All Vals =0 emptym=m.NbyOMat(col_val,row_val) #add nodata to every cell emptymat=m.AddMat(emptym,-3.4028234663852886e+38) #Change Each Value in “emptymat” to the 3D Gaussian #Value at that Position. For p in pointmat: emptymat[p[0]][p[1]]=0 #make the starting square =1 emptymat[mu_x][mu_y]=1 ############################ ## Growth and Spread Loop ## 148 ############################ it_num=0#initialize iterator while it_num6.0: ## ## ## ## probval=0.0 if dval<=6.0: probval=f.gauss(0,2.4,dval) j.append(probval) ## ## ## ## ########################### ## ###### Initialize Matrix ## ## ########################### ## #Define the Size of the Rows and Columns ## col_val=1500 ## row_val=3600 ## # Open a col*row Matrix with All Vals =0 ## emptym=m.NbyOMat(col_val,row_val) ## #add nodata to every cell ## emptymat=m.AddMat(emptym,-3.4028234663852886e+38) ## #Change Each Value in “emptymat” to the 3D Gaussian ## #Value at that Position. ## for p in pointmat: ## emptymat[p[0]][p[1]]=0 ## #make the starting square =1 ## emptymat[mu_x][mu_y]=1 152 ## ## ############################ ## ## Growth and Spread Loop ## ## ############################ ## it_num=0 ## while it_num 1/M[I,j]” newmat=[] #open matrix for I in M: #column iterator newcell=[] #open row for j in i: #row iterator if j!=0: #avoid “Divide by Zero” Error 168 newcell.append(1/j) if j==0: newcell.append(0)#make =0 when “div. by Zero error” newmat.append(newcell)#append row to matrix return newmat #Matrix of inverted values #Add constant “c” to every element in the matrix def AddMat(M,c): “Add constant ‘c’ to Every Element of Matrix M” newmat=[] #open new matrix for I in M: newcell=[] #open new cell for j in i: newcell.append(j+c)#add c to j newmat.append(newcell)#add row to matrix return newmat #Replace the entry in a specific cell [I,j] with constant c def ChangeMat(M,I,j,c): “Replace the entry at [I,j] in Matrix ‘M’ with ‘c’” M[i][j]=c return M #Make every element of a matrix <=0 Positive def MakePositiveMat(M,c): “If M[I,j] >=0, replace with Constant ‘c’” newmat=[] 169 for I in M: newcell=[] for j in i: if j>0: #leave this value alone newcell.append(j) if j<=0:#write new value newcell.appendI newmat.append(newcell) return newmat ## Matrix to Raster ## ##Matrix to String def MatToString(M): “Convert Matrix ‘M’ to Writable Raster String” mastr=”” #New String: Repository for substrings for I in M: #define column iterator linstr=”” #open substring for j in i: #define row iterator linstr=linstr+str(j)+” “ #add matrix element to substringstring linstr=linstr+” \n” #add line break to substring mastr=mastr+linstr #add substring to master string return mastr #master string ## Matrix Statistics ## 170 ##Delete Dublicate Rows From Matrix def NoDupRows(M): “Return Matrix ‘m’ with Duplicate Rows Removed” uniq_rows=[[0,0]] #repository for unigue rows for I in M: row_count=0#counts the number of occurances of a point for j in uniq_rows:#compare I to unique vals list if i==j:#if I has I already, row_count=row_count+1#add one to the occur counter if row_count==0: #ie if the row is unique uniq_rows.append(i)#add the row to the list del uniq_rows[0]#get rid of dummy row return uniq_rows#list of unique rows #Compute Mean of All Elements in a Matrix def MeanMat(M): “Find the Mean of All Values in a Matrix ‘M’ “ ent_count=0 val_count=0 for I in M: ent_count=ent_count+len(i) for j in i: val_count=val_count+j mu=val_count/ent_count return mu ## Matrix Analysis ## #sum up a specific subset of a matrix 171 #used for data analysis in growth/spread def MatSubSum(M,min_col,max_col,min_row,max_row,no_data): “sum of all entries in matrix ‘M’ bound by four vals. Ignore NODATA val” subsum=0 for I in range(min_col,max_col): for j in range(min_row,max_row): if M[i][j]!=no_data: subsum=subsum+M[i][j] return subsum #Pull All Indicies Containing Specific Values def MatIndex(M,val): “return the index [I,j] of the occurrances of ‘val’ in matrix ‘M’ “ oclist=[] i_count=0 for I in M: i_count=i_count+1 j_count=0 for j in i: j_count=j_count+1 if j==val: occell=[] i_val=i_count-1 j_val=j_count-1 occell.append(i_val) occell.append(j_val) oclist.append(occell) return oclist 172 A.6 Script 6-String Functions from my Useful Function Library “ufunx” ## String Functions ## #Given a string of mixed letters and numbers #pull numbers, decimals, and any other specified #symbols “exceptors” to a new string def NumbersFromString(s,*args): “pull numbers and excepted symbols to new string” new_s=”” #open new string base_float=[“0”,”1”,”2”,”3”,”4”,”5”,”6”,”7”,”8”,”9”,”.”]#numbers/decimals for arg in args: base_float.append(str(arg))#add exceptors to list for e in s: for f in base_float: if e==f:#if element in string is in exceptor list new_s=new_s+e#add to new list return new_s #Replace Every Instance of one character in a string with another #the same as s.replace(c1,c2), but I like this version better. Def ReplaceAllInstances(s,c1,c2): “Replace every instance of c1 in string s w/ c2” new_s=””#open new string 173 for e in s: if e!=c1: new_s=new_s+e if e==c1: new_s=new_s+c2 return new_s #Better Split Function #split can be tempermental. I like this version better. Def RobustSplit(s,c): “split strings at c. better than s.splitI” new_list=[]#repository of strings #find out how many c’s. split_counter=0#Initialize counter. For e in s: if e==c:#if element is splitter split_counter=split_counter+1#raise split_count by +1 #open as many cells in new_list to I strings for I in range(split_counter+1):#number of splits new_cell=[]#open new cell new_list.append(new_cell)#add to repository #Sort string elements into lists cell_num=0#cell the text goes in for e in s: if e!=c: new_list[cell_num].appendI#add to the cell_num cell if e==c: cell_num=cell_num+1#move to next cell 174 #Eliminate empty cells short_list=[]#list containing only full cells for cell in new_list: if len(cell)>0:#empty cells have len=0 short_list.append(cell)#add to new matrix #Concatinate strings in cells to one string string_list=[]#list containing only 175oncatenated strings for cell in short_list: string_cell=[]#new cell cell_str=””#open new string for e in cell: cell_str=cell_str+e#add elements to one string string_cell.append(cell_str)#append string to cell string_list.append(string_cell)#append cell to list str_list=[] for I in string_list: for k in i: str_list.append(k) return str_list 175 APPENDIX B STATISTICAL CALCULATIONS AND SCRIPTS B.1 R Script for Growth and Spread Analysis ################## #Import Libraries# ################## library(dplyr) #Data Mannipulation library(ggplot2)#graphs library(stargazer)#tables library(readxl)#for Excel sheets library(reshape)#need for boxplot ############# #Import Data# ############# #Test Data test_sites<-c("Anchorage AK","Baltimore MD","Biloxi MS","Boston MA","Charleston SC","Houston TX","Jacksonville FL","Los Angeles CA","Miami FL","Mobile AL","New Orleans LA","New York NY","Pensacola FL","Philadelphia PA","Port Charlotte FL","Portland ME","Portland OR","Providence RI","San Francisco CA","Savannah GA","Saint John NB","St. Petersburg FL","Seattle WA","Vancouver BC") O