Shaylie Augustine
California University of Pennsylvania

Honors Advisory Board Committee:
Dr. Craig Fox
Dr. Michelle Valkanas
Dr. Min Li
Dr. Peter Cormas

Keywords: Donora Smog, Donora Zinc Works, zinc production, zinc resistance, soil bacteria, zinc, heavy metal pollution, organic matter, moisture, growth curves, genome assembly, Serratia liquefaciens, Bacillus cereus, Alcaligenes faecalis, Delftia acidovorans, Stenotrophomonas maltophilia

ACKNOWLEDGEMENTS

The author acknowledges the following individuals for their significant contributions to this study: Dr. Robert Whyte, who provided assistance and supervision in processing soil samples for moisture and organic content; Dr. Min Li, who provided assistance and supervision in processing soil samples for zinc content; and Dr. Peter Cormas, who acted as an Honors Advisory Board committee member. The author would also like to acknowledge the administration of the Honors Program, Dr. Craig Fox, Associate Director, and Dr. Mark Aune, Director, for overseeing not only this project but the entirety of the author's undergraduate studies. Gratitude is extended to James Augustine, Sonya Augustine, Carlene Pugh, Austin Smith, and Madeline Sanders for their constant motivation and encouragement over the course of the study.

A special acknowledgement is extended to Dr. Michelle Valkanas, the advisor of this study. Her first-hand experience with and knowledge of soil microorganisms, the instruction and supervision she provided over a multitude of experiments and many written drafts, and her endless support and encouragement contributed significantly to the overall project. Without her time, motivation, and instruction, this study would most certainly not have produced such a wonderful product. Her selflessness and dedication are greatly appreciated and will never be forgotten.
TABLE OF CONTENTS

Acknowledgements
List of Tables
List of Figures
Abstract
Introduction
Materials and Methods
Results
Discussion
Conclusion
References
Appendices

LIST OF TABLES

Table 1: Zinc Content in the Soil
Table 2: Results from DNA Extraction

LIST OF FIGURES

Figure 1: A map of Donora, PA
Figure 2: American Steel and Wire - Donora Wire Works and Zinc Works
Figure 3: A mill worker stands behind newly minted zinc ingots
Figure 4: The calculated average moisture content of three samples from 7 experimental sites
Figure 5: The calculated average organic content of three samples from 7 experimental sites
Figure 6: Control 2 Isolate Zinc Growth Curves
Figure 7: Tryptic Soy Media Growth Curves for Donora, Pollution, and Inorganic Farm Sites

ABSTRACT

THE DEEP-ROOTED DAMAGE OF THE DONORA SMOG DISASTER

By Shaylie Augustine
April 2022

Honors thesis supervised by Dr. Michelle M.
Valkanas

The Donora smog is an infamous event caused by the Donora Zinc Works, which operated for thirty-three years before the disaster, when a foggy haze mixed with atmospheric pollutants smothered the surrounding area. The Donora smog occurred over the Donora and Webster areas on October 29th, 1948. Soil samples were collected, in triplicate, from the location where the plant used to stand, as well as from surrounding areas not impacted by industrialization. The soil samples were analyzed for pH, moisture content, organic matter content, and zinc concentration. Bacteria isolated from the soil were tested for their ability to grow in high zinc concentrations, and DNA was extracted from isolates exhibiting zinc resistance. The extracted DNA underwent shotgun sequencing to generate genome assemblies of the isolates. The genomes identified included Serratia liquefaciens, Alcaligenes faecalis, and Stenotrophomonas maltophilia, which have previously shown zinc resistance. The soil samples collected from Donora displayed high concentrations of zinc, and the microbes isolated from the samples displayed the ability to grow in high zinc concentrations. The results from this study provide evidence that industrialization has long-lasting effects on the bacterial communities within the soil.

1. Introduction

1.1 Donora Smog Background

Donora is a small town located south of Pittsburgh, settled along a horseshoe bend in the Monongahela River; the town across the river is Webster (Figure 1). Two major plants are pictured in Figure 2: the American Steel and Wire plant (left) and the Donora Zinc Works (right). Both were nestled in these small towns, and the air pollution they produced was nothing new to the residents. Construction of the infamous Donora Zinc Works began in 1915, and on October 29th the plant produced its first zinc; it ran until its closure in 1957 (Donora Historical Society and Smog Museum). Shortly after production began, the residents of Donora and Webster, especially the land-owning farmers, started to notice a difference in the atmosphere. At the beginning of the 1920s, residential farm owners sued for damages such as loss of crops, livestock, and topsoil, and even destruction of fences and houses. During the height of the Great Depression, many local residents came together to sue Donora Zinc Works for damages to their health due to the atmospheric effluent (Snyder, 1994).

Figure 1: A map of Donora, PA. Source: (Ivel).

Figure 2: American Steel and Wire - Donora Wire Works and Zinc Works. The photo shows the American Steel and Wire (left) and Donora Wire Works and Zinc Works (right) in the 1930s, as taken from across the Monongahela River in Webster. Source: (Donora Historical Society and Smog Museum).

One of the largest of its time, the Donora Zinc Works sat on 4,000 feet of land and played a vital role in the National Defense efforts of World War II. Known for its protective properties, zinc, a component of brass, was used in manufacturing cartridges, shells, fuses, and detonators. When alloyed with aluminum, magnesium, and manganese, it was used in manufacturing shafts, propellers, and bearings for aircraft, as well as marine hardware, cables, canisters, and drums for the Navy (Donora Historical Society and Smog Museum). Sulfuric acid, one of the by-products of the zinc plant, was used in manufacturing explosives; the cadmium and zinc by-products, however, were less useful. For decades, Zinc Works was viewed as a threat to public health, and that would become undeniably true in October 1948.

On Tuesday, October 26th, 1948, the fog from the industrial production lingered longer than usual. By Wednesday it had become more noticeable, and by Thursday it did not burn off at all (Snyder, 1994). When Friday evening arrived, visibility was dwindling, and many elderly residents reported signs of respiratory distress. Volunteer firefighters navigated the lingering fog on foot to administer oxygen to elderly residents and asthmatics. Despite this, Donora citizens cancelled neither their annual Halloween parade on Friday, October 29th, 1948, nor a high school football game the following day. As the smog got thicker, more citizens began to exhibit signs of respiratory distress, and, as if from the plot of a horror movie, on October 30th, 1948, at around 2:00 AM, the smog took its first victim. Twelve hours later, another 17 Donora and Webster residents would be pronounced dead. Of the 14,000 Donora residents, estimates of the number of deceased range from 20 to 26 (Jacobs et al., 2018; Donora Historical Society and Smog Museum). Many others were affected: 1,440 individuals reported serious illness and 4,470 reported moderate symptoms (Jacobs et al., 2018). These numbers do not include individuals who suffered long-term effects or died within the following weeks, but they convey the scale of destruction caused by the smog. The mills did not cease production until Sunday, October 31st, 1948, at which point a heavy rain had begun to dissipate the lethal smog (Baranauskas, 2017).

After the poisonous gases cleared and the town quieted back down, the United States Public Health Service conducted an investigation to determine the cause of the smog (Jacobs et al., 2018). A preliminary report of the investigation, released in 1949, concluded that the smog was caused by a combination of three factors. The first, and probably most obvious, contributor was the air pollution emitted by both the American Steel and Wire company and the Donora Zinc Works, with Zinc Works named as the major polluter due to its emission of hydrogen fluoride, carbon monoxide, nitrogen dioxide, and multiple sulfur-containing compounds, as well as heavy metal particulates (Jacobs et al., 2018). The preliminary report listed the other contributing factors as the unusual weather system and the geography of Donora, both of which aided the smog formation. The temperature inversion trapped the cooler surface air and smog beneath a layer of warmer air, between the high mountains of the horseshoe bend in the Monongahela River (Jacobs et al., 2018). This investigation and its findings contributed to the implementation of the 1955 Air Pollution Control Act, and eventually the 1970 Clean Air Act, which authorized the development of federal and state regulations that limit emissions from industrial and mobile sources of air pollution (Helfand et al., 2001).

Figure 3: A mill worker stands behind newly minted zinc ingots. Source: (Bleiwas & DiFrancesco, 2010).

After the United States Public Health Service released the report to the public, many individuals began to identify gaps in the investigation and long-term health effects that had not been accounted for. An editorial published by the New England Journal of Medicine described the report as a missed opportunity for more detailed research on the health effects of long-term pollution and on severe acute effects (Jacobs et al., 2018).

This event is known today as the Donora smog, and since then much research has been conducted on the surrounding areas to further uncover the negative effects that the smog caused. In 1961, two biostatisticians from the University of Pittsburgh conducted a study and found a higher-than-expected mortality rate for cardiovascular disease and cancer in Donora during the decade following the smog (Jacobs et al., 2018). An additional study published in 2017 details the results from sediment cores retrieved from a lake 6 miles northeast of Donora. The cores showed that after the opening of Zinc Works in 1915 there was a substantial increase in cadmium, lead, and zinc levels, all known carcinogens, and that the levels did not subside following the closing of the plant in 1957. The cadmium and lead contaminants were found to have remained at a higher than recommended concentration for 70 years following the smog event. It was noted that disturbance of the soil can release these contaminants back into the water, and because of this, the pollution from the event remains a risk even today (Rossi et al., 2017).

1.2 Effects of Zinc Industrialization on the Environment

Zinc is a biologically essential element that aids many crucial processes such as DNA and RNA metabolism, protein processing, and neural response modulation. As with most biologically essential elements, there are serious detrimental effects at decreased or heightened levels of zinc, ranging from diarrhea and urinary problems to neurological problems, organ damage, and genetic defects (Environmental Pollution Centers, 2022). Unnaturally occurring zinc, such as that accumulated from zinc mining and industrialization, can easily contaminate soil and water.

There are two processes available for zinc manufacturing, and both begin with roasting or sintering to remove the sulfur from the mined zinc (zinc blende), producing zinc oxide and releasing sulfur dioxide as a by-product (Greenspec, 2022). In the hydrometallurgical process, the zinc oxide is then separated from other calcines using sulfuric acid, which dissolves the zinc, leaving iron, lead, and silver precipitates that are later removed using zinc dust. In the pyrometallurgical process, the zinc oxide is mixed with crushed coke and heated at extreme temperatures to reduce the zinc oxide to metallic zinc. The metal is condensed, leaving behind cadmium, lead, and iron impurities (Greenspec, 2022).

Zinc industrialization has many negative effects on the environment, including the production of sulfur dioxide, which can cause acid rain. Other unwanted by-products of manufacturing include cadmium vapor, sulfur oxide, carbon monoxide, carbon dioxide, and other heavy metals. Many of the vapors released have negative effects; carbon monoxide, for example, is known to be ozone forming. Many sites of previous zinc mining or manufacturing have been found to leach significant amounts of zinc, cadmium, and other heavy metals into the soil and surrounding waterways, posing a health threat to humans and the environment (Greenspec, 2022).

1.3 Microorganisms and Zinc Pollution

Zinc plays a vital role in plant nourishment at low levels. At higher concentrations zinc can become phytotoxic and persist for a long period of time, unlike other pollutants that can be chemically and biologically degraded (Kour et al., 2020). Bioremediation is a process that uses living organisms, such as soil bacteria, to detoxify heavy metals from the environment (Kour et al., 2020).

High concentrations of zinc within the soil affect the flora and fauna that inhabit it. Excess zinc can alter plant development, although a few species of plants have developed the ability to grow in high zinc concentrations. The bioavailability of zinc to plants and soil bacteria at high concentrations depends on a combination of factors including, but not limited to, microbial community structure and available organic matter (Balafrej et al., 2020).

High concentrations of zinc in soil kill the majority of the microflora, creating a selective pressure for the emergence of heavy metal resistant strains of microorganisms. Heavy metal resistant microorganisms affect the mobility of the heavy metals and detoxify the soil by converting the toxic form of heavy metals into a nontoxic form through the production of metabolites (Kour et al., 2019).

Donora is home to a monumental event caused by zinc production and, because of this, was chosen as the target area of study. Soil was collected from Donora and surrounding areas to analyze the zinc, moisture, and organic matter content, as well as the capabilities of microorganisms isolated from the soil. If the industrial pollution in this area still exists today, then the soil collected from Donora will contain a higher zinc concentration than that collected from the other test sites. It is also hypothesized that bacteria isolated from the Donora site will exhibit a greater ability to grow in high zinc concentrations when compared to other test sites.

2. Materials and Methods

2.1 Soil Collection

Soil samples were collected in triplicate at seven sites within southwestern Pennsylvania to explore the metabolic capabilities of the microorganisms within them. The sites were chosen based on their proximity and relationship to the location where the Zinc Works production plant stood before its demolition.

Soil collected from Donora Industrial Park (40°10’41” N 79°51’11” W) is referred to as the “Donora” site and represents soil directly affected by the Zinc Works production. Soil from Nemacolin Park (39°52’44” N 79°55’13” W) in Carmichaels, PA, which is located downhill from Hilltop Energy Center, was collected to represent soil affected by a currently active power plant and is referred to as the “Pollution” site. Soil from both the organic orchard and organic produce plots of California University of Pennsylvania’s SAI Farm (40°02’50” N 79°54’49” W) was chosen to study agriculture located near the affected area; these sites are referred to as “Organic Orchard” and “Organic Farm” (labeled “SAI Orchard” and “SAI Farm” in the figures), respectively. Additional samples from inorganic orchard and produce plots (40°04’48” N 79°08’20” W) in Somerset, PA were obtained to study an agricultural environment distant from the affected area and are referred to as “Inorganic Orchard” and “Inorganic Farm”, respectively. Finally, samples were obtained from an old forest located on a hill above the SAI Farm (40°02’40” N 79°54’51” W); this site represents the control group and is referred to as “Control” (Supp. Table 1).

On May 17th, 2021, samples were collected from the Donora, Pollution, SAI Orchard, and SAI Farm sites; on May 20th, 2021, from the Inorganic Orchard and Inorganic Farm sites; and on May 24th, 2021, from the Control site (Supp. Table 1).

The soil was collected using a pre-sterilized soil corer, and 5% Lysol was used to clean the device between sites to ensure that there was no cross-contamination between the samples obtained from each site. The soil samples were refrigerated in storage between tests.

2.2 Moisture and Organic Content Test

Soil samples were weighed and dried to calculate moisture and organic content. The moisture content test was conducted by weighing at least 5 grams, but not more than 28 grams, of soil from each sample and placing it in an aluminum foil boat; both the weight of the foil and the wet weight of the soil were recorded. The samples were then placed into a dehydration machine located in the biology department of California University of Pennsylvania. After approximately 24 hours at 105°C, the samples were removed from the dehydrator and weighed again. The moisture content for each sample was then calculated by first finding the percent soil content (the dry weight as a percentage of the wet weight) and subtracting it from 100.

Similarly, the organic content was tested using the dehydrated soil from the moisture test. Approximately 5 grams of soil was weighed, added to a pre-weighed crucible, and arranged in the furnace. After the oven was loaded, the samples were heated for approximately 24 hours at 350°C, then removed, reweighed, and recorded. The organic matter content was calculated by first finding the percent mineral content (the weight remaining after heating as a percentage of the dry weight) and subtracting it from 100.

2.3 Bacterial Isolation and Broth Cultures

Soil bacteria were cultivated using three types of agar to diversify the organisms collected. Lima bean, tryptic soy, and nutrient agar (Fisher Bioreagents) plates were made, and the soil samples were swabbed with a sterile cotton swab and plated onto the prepared plates. Individual colonies of unique colony morphology were selected for additional screening and were restreaked for purification. Once the bacteria were purified, they were transferred from the plates to a liquid broth culture of the appropriate medium using aseptic technique.

2.4 Zinc Content Test

Soil samples from the Donora, Pollution, and Control sites were processed for zinc concentration by first drying the soil samples for 48 hours at 65°C. The three samples collected for each site were combined to provide better quality data, and approximately five grams of dried soil from each site were digested using 20 mL of nitric acid, 50 mL of hydrochloric acid, and 4 mL of hydrogen peroxide. The solutions were then filtered through 12.5 cm diameter filter paper using gravity filtration, and the filtrate was added to deionized water to achieve a diluted solution of 250 mL. Zinc calibration standards of 0.1 PPM, 0.5 PPM, and 1 PPM were prepared from a 1000 PPM zinc stock solution for a total volume of 50 mL. The prepared solutions were then analyzed using an atomic absorption spectrometer (AAS) located in the Chemistry Department at California University of Pennsylvania. The zinc content of the samples was determined using the average of three reads from each sample, and the results from the calculation were graphed.
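The percentage calculations in Section 2.2 and the standard preparation in Section 2.4 reduce to a few lines of arithmetic. The sketch below is illustrative only. The function names and the example weights are hypothetical, all weights are assumed to be net of the foil boat or crucible, and preparing each standard by a single direct dilution of the stock (C1V1 = C2V2) is an assumption rather than the documented bench procedure.

```python
def moisture_content_percent(wet_g, dry_g):
    """Moisture (%) = 100 - percent soil content.

    Percent soil content is taken as the dry weight expressed as a
    percentage of the wet weight (both net of the foil boat).
    """
    if wet_g <= 0 or not 0 <= dry_g <= wet_g:
        raise ValueError("expected wet_g > 0 and 0 <= dry_g <= wet_g")
    percent_soil = 100.0 * dry_g / wet_g
    return 100.0 - percent_soil


def organic_matter_percent(dry_g, ash_g):
    """Organic matter (%) = 100 - percent mineral content.

    Percent mineral content is taken as the weight remaining after
    heating, expressed as a percentage of the dry weight (both net of
    the crucible).
    """
    if dry_g <= 0 or not 0 <= ash_g <= dry_g:
        raise ValueError("expected dry_g > 0 and 0 <= ash_g <= dry_g")
    percent_mineral = 100.0 * ash_g / dry_g
    return 100.0 - percent_mineral


def stock_volume_ml(target_ppm, stock_ppm=1000.0, final_volume_ml=50.0):
    """Volume of stock needed for one standard, from C1 * V1 = C2 * V2.

    Assumes a single direct dilution of the stock into the final volume.
    """
    if target_ppm < 0 or stock_ppm <= 0 or final_volume_ml <= 0:
        raise ValueError("concentrations and volumes must be positive")
    return target_ppm * final_volume_ml / stock_ppm


if __name__ == "__main__":
    # Hypothetical weights, for illustration only (not study data).
    print("moisture %:", moisture_content_percent(wet_g=10.0, dry_g=8.0))
    print("organic matter %:", organic_matter_percent(dry_g=5.0, ash_g=4.5))
    # Standards named in Section 2.4: 0.1, 0.5, and 1 PPM in 50 mL.
    for ppm in (0.1, 0.5, 1.0):
        print(ppm, "PPM standard needs", stock_volume_ml(ppm), "mL of stock")
```

Under these assumptions, a 10.0 g wet sample that dries to 8.0 g has a moisture content of 20%, and the 0.1 PPM standard would need 0.005 mL (5 µL) of the 1000 PPM stock. Volumes that small are one reason an intermediate dilution is often preferred in practice.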
2.5 Zinc Growth Curves

Isolates from the Donora, Pollution, and Control sites, along with one isolate from the Inorganic Farm site (n=28), were tested for their ability to grow in high zinc environments. The Inorganic Farm isolate was included because it produced a purple color that was not observed in any of the other bacterial cultures. A 96-well plate was used to simulate microenvironments, which were then observed using a plate reader (BioTek Instruments Inc). The absorbance was measured at 600 nm every 3 hours for 27 hours. The isolates were exposed to 80 mM, 40 mM, 20 mM, 10 mM, 5 mM, and 0 mM concentrations of zinc in their corresponding media, using 20 μL of inoculum from the broth cultures, for a final volume of 220 μL. Wells containing only media, and wells containing only media plus zinc, were included as controls to ensure there was no contamination. Each sample was tested in duplicate, including controls, to ensure the quality of the data collected. Raw data for the zinc growth curves can be found in Supp. Tables 7-14.

2.6 DNA Extraction

Eight isolates that displayed zinc resistance were selected for DNA extraction: C2-NB, C2-LB, P1-TSB, C2-TSB, D1-TSB, D2-TSB, D3-TSB, and IO-TSB. DNA extraction was completed using a DNeasy® UltraClean® Microbial Kit (Qiagen, Hilden, Germany). The extracted DNA was analyzed using a Broad Range Assay kit on a Qubit 4 Fluorometer (Invitrogen, United States) to determine the DNA concentration of each sample.

2.7 Genome Assembly

The extracted DNA samples were sent to the Microbial Genome Sequencing Center (Pittsburgh, PA), where shotgun sequencing was performed on the samples. The FASTQ files received back from sequencing were scaffolded, assembled, and annotated for each genome using the Kbase database (https://www.kbase.us/). The sequences were imported as FASTQ files to obtain a paired-end library for each sample. A FASTQC report (FASTQC was written by Simon Andrews of Babraham Bioinformatics) was then generated for the samples to assess the read quality. From this quality control check, the per base sequence quality (Phred score), nucleotide distribution, unidentified bases and adapters, and Kmer content were analyzed to assess the quality of the paired-end library. The Trimmomatic program, written by Anthony Bolger, Marc Lohse, and Bjoern Usadel, was used on each sample to remove any poor-quality sequences, using the results from the Kmer data to determine the Head Crop length adjustments. The Trimmomatic parameters used were a sliding window size of 4 and a sliding window minimum quality of 15. An additional FASTQC report was obtained, and the results were compared to those of the first report.

The reads were then assembled de novo using both the Velvet and SPAdes assemblers. A QUAST report was generated for both the Velvet and SPAdes assemblies, and the reports were compared to determine which assembly would be used moving forward. The chosen assemblies were then annotated using the Annotate Microbial Contigs tool, which uses Rapid Annotations using Subsystems Technology (RAST). The quality of the assembly was then analyzed using the CheckM tool. The RAST-annotated genome assembly for each sample was then used to build a metabolic model using the Build Metabolic Model tool. The Insert Genome into Species Tree tool was used to determine the identity of the assembled genomes and compare taxonomic relationships. The 16S gene nucleotide sequence was then compared against another nucleotide sequence database, BLAST (https://blast.ncbi.nlm.nih.gov/Blast.cgi), to confirm the identities of the isolates.

3. Results

[Figure 4: bar chart of Moisture Content (%) by site: Control, Donora, Pollution, SAI Orchard, SAI Farm, Inorganic Orchard, Inorganic Farm]

Figure 4: The calculated average moisture content of three samples from 7 experimental sites. The red line indicates optimal agricultural moisture content (Blumberg, 1982).
3.1 Moisture and Organic Content Test

The results from the moisture content test are the average of the three samples from each site (Figure 4 and Supp. Table 1). All the samples had similar calculated moisture percentages. The Pollution site had the lowest moisture content, with a calculated average of 19.48%, and the Donora site had the highest, with a calculated average of 23.56%.

The results from the organic matter content test are displayed as the average of the three samples from each site (Figure 5 and Supp. Table 2). Overall, the samples collected from the organic SAI Farm plots had the lowest amount of organic matter: the SAI Farm site averaged 4.84% and the SAI Orchard site 5.56%. The Control site had the most organic matter, with a recorded average of 19.71%.

[Figure 5: bar chart of Organic Matter Content (%) by site: Control, Donora, Pollution, SAI Orchard, SAI Farm, Inorganic Orchard, Inorganic Farm]

Figure 5: The calculated average organic content of three samples from 7 experimental sites. The red line indicates optimal agricultural moisture content (Blumberg, 1982).

3.2 Zinc Content Test

The zinc content for the Control site was the lowest of the three sites tested, with a zinc concentration of 3.945 PPM (Table 1). The soil from the Pollution site was determined to contain 21.645 PPM of zinc (Table 1). The Donora site contained a substantially higher amount of zinc, with a concentration of 51.48 PPM (Table 1).

Table 1: Zinc Content in the Soil

Site Name    Zinc Content (PPM)
Control      3.945
Donora       51.48
Pollution    21.645

3.3 Zinc Growth Curves

The results obtained from the growth curve plates were based on the ninth read from every test (27 hours in total). The absorbance for each well was averaged with its duplicate, and the results are displayed in Figures 6 and 7.
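The duplicate averaging described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis used in the study: the function names and all absorbance values are hypothetical placeholders, each well is assumed to be stored as a list of nine absorbance reads, and the final-read ratio is one possible way to summarize inhibition relative to the 0 mM control rather than a metric taken from the thesis.

```python
def average_duplicates(well_a, well_b):
    """Average two duplicate wells read by read."""
    if len(well_a) != len(well_b):
        raise ValueError("duplicate wells must have the same number of reads")
    return [(a + b) / 2.0 for a, b in zip(well_a, well_b)]


def final_read_ratio(treated_curve, control_curve):
    """Final-read absorbance of a zinc-treated curve relative to its control.

    A value near 1.0 suggests little inhibition at the final read and a
    value well below 1.0 suggests inhibition (illustrative metric only).
    """
    if not treated_curve or not control_curve:
        raise ValueError("curves must contain at least one read")
    if control_curve[-1] == 0:
        raise ValueError("control absorbance at the final read is zero")
    return treated_curve[-1] / control_curve[-1]


if __name__ == "__main__":
    # Hypothetical absorbance values (nine reads, one every 3 hours).
    zinc_rep1 = [0.10, 0.15, 0.25, 0.40, 0.60, 0.80, 0.95, 1.05, 1.10]
    zinc_rep2 = [0.12, 0.17, 0.27, 0.42, 0.58, 0.82, 0.97, 1.03, 1.14]
    ctrl_rep1 = [0.10, 0.20, 0.40, 0.70, 1.00, 1.30, 1.50, 1.60, 1.65]
    ctrl_rep2 = [0.12, 0.22, 0.38, 0.72, 1.02, 1.28, 1.52, 1.62, 1.67]

    zinc_curve = average_duplicates(zinc_rep1, zinc_rep2)
    ctrl_curve = average_duplicates(ctrl_rep1, ctrl_rep2)
    print("averaged zinc curve:", zinc_curve)
    print("final-read ratio vs control:", final_read_ratio(zinc_curve, ctrl_curve))
```

Keeping the whole averaged curve, rather than only the ninth read, also allows comparisons at earlier reads of the kind made in the descriptions of individual isolates.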
In lima bean media, isolate Control 2 displayed the least inhibition at 80 mM zinc when compared to its control (Figure 6A). The isolate performed better at the 80 mM zinc concentration, with a faster growth rate than its control from reads 1-6 (Figure 6A). Because of this, the isolate was selected for further investigation.

The nutrient broth plate for Control 2 shows that the isolate performed better than its control at a 5 mM concentration of zinc from reads 1-3 (Figure 6B). The results also show that the isolate had a faster growth rate at 80 mM than at 40 mM when compared to its control; because of this, it was also chosen for further investigation.

Overall, isolates growing in tryptic soy media grew at a faster rate than those in the other two media (Figures 6 and 7). Isolate Control 2 in tryptic soy media displayed minimal inhibition at 5 mM, 10 mM, and 20 mM when compared to its control. The isolate experienced inhibition beginning at 40 mM, displaying slower growth rates at the 40 mM and 80 mM concentrations when compared to its control (Figure 6C). Due to the minimal inhibition observed at lower concentrations, the isolate was selected for use in future experiments.

[Figure 6: line plots, panels A-C, of absorbance (0 to 2.5) versus read (Read 1 to Read 9); series: C2 80 mM, C2 40 mM, C2 20 mM, C2 10 mM, C2 5 mM, C2 0 mM]

Figure 6: Control 2 Isolate Zinc Growth Curves. Growth curves for Control 2 in lima bean media (A), nutrient broth (B), and tryptic soy media (C) were averaged with their duplicates, and the recorded absorbance was plotted against read time for a total of 27 hours. The control is denoted as “C2 0 mM” and can be visualized as the green curve in panels A-C.
1.5 1 0.5 0 Read Read Read Read Read Read Read Read Read 1 2 3 4 5 6 7 8 9 8 C2 80 mM C2 40 mM C2 20 mM C2 10 mM C2 5 mM C2 0 mM A 2.5 B 2.5 2 ABSORBANCE ABSORBANCE 2 1.5 1 0.5 1.5 1 0.5 0 0 Read Read Read Read Read Read Read Read Read 1 2 3 4 5 6 7 8 9 Read Read Read Read Read Read Read Read Read 1 2 3 4 5 6 7 8 9 D1 80 mM D1 40 mM D1 20 mM D2 80 mM D2 40 mM D2 20 mM D1 10 mM D1 5 mM D1 0 mM D2 10 mM D2 5 mM D2 0 mM D 2.5 2 2 ABSORBANCE C 2.5 ABSORBANCE 1.5 1 1 0.5 0.5 0 0 Read Read Read Read Read Read Read Read Read 1 2 3 4 5 6 7 8 9 Read Read Read Read Read Read Read Read Read 1 2 3 4 5 6 7 8 9 E 1.5 D3 80 mM D3 40 mM D3 20 mM P1 80 mM P1 40 mM P1 20 mM D3 10 mM D3 5 mM D3 0 mM P1 10 mM P1 5 mM P1 0 mM 2.5 ABSORBANCE 2 1.5 1 Figure 7: Tryptic Soy Media Growth Curves for Donora, Pollution, and Inorganic Farm Sites. Growth curves for Donora 1 (A), Donora 2 (B), Donora 3 (C), Pollution 1 (D), and Inorganic Farm (E) were averaged with their duplicate and the recorded absorbency was plotted against read time for a total of 27 hours. The control is denoted at “*Isolate abbreviation* 0 mM” and can be visualized as the green curve in figures A-E. 0.5 0 Read Read Read Read Read Read Read Read Read 1 2 3 4 5 6 7 8 9 IO 80 mM IO 40 mM IO 20 mM IO 10 mM IO 5 mM IO 0 mM 9 Isolate Donora 1 in tryptic soy media observed inhibition at 40 mM. For concentrations of 5 mM, 10 mM and 20 mM, the isolate grew with minimal inhibition, however, much slower growth rates were observed for 40 mM and 80 mM concentration when compared to its control (Figure 7A). Due to the minimal inhibition at lower concentrations, the isolate was selected to investigate further. Table 2: Results from DNA Extraction For both isolate Donora 2 and Donora 3 growing in tryptic soy media, a faster growth rate was observed at 40 mM from reads 1-2 when compared to their controls (Figure 7B & 7C). 
Isolate Donora 2 displayed inhibition at 40 mM (Figure 7B) and isolate Donora 3 displayed inhibition at 20 mM (Figure 7C) when compared to their controls after 27 hours. Minimal inhibition was observed at lower concentrations of zinc (5 mM and 10 mM) when compared to their controls and therefore both isolates were selected for further testing (Figure 7B & 7C). Sample Name Qubit Reading (ng/µL) TSB C2 15.2 TSB P1 29.6 NB C2 33.3 TSB D1 280 TSB D2 240 TSB D3 33.4 LB C2 39 TSB Purple 328 3.4 Genome Assembly All isolate identities were determined by cross analyzing 16s sequences with the BLAST database and only those containing a 100% query cover that scored higher than 90% for the percent identity of the genome were recorded. The Inorganic Farm genome in tryptic soy media had 1,984,094 paired end sequences. Most sequences had a length of 147.68 base pairs and a mean Phred score of 33.36. After quality filtering, 644 base pairs were dropped. The SPAdes assembly contained longer and higher quality contigs when compared to the Velvet assembly and was used for downstream annotation and analysis (Supp. Figure 1). Isolate Pollution 1 growing in tryptic soy media displayed some degree of inhibition at 10 mM, 20 mM, 40 mM and 80 mM concentrations when compared to its control (Figure 7D). At reads 8-9 the isolate performed better than its control at 5 mM and was selected for further testing. Isolate Inorganic Farm growing in tryptic soy experienced inhibition at 40 mM when compared to its control. At 40 mM the isolate performed better than its control for reads 1-2 (Figure 7E). Due to the results provided from the growth curve, the isolate was chosen to continue investigating. The longest contig was 538,056 base pairs and the total genome size was 51,013,143 base pairs in length with a 58.16% GC content. The genome was assembled from 2,271 contigs and the genome was determined to have 100% completeness based on marker lineage across 5,656 genomes (56 markers). 
DNA Extraction
The Qubit results of the extracted DNA can be found in Table 2.

The Donora 1 genome in tryptic soy media had 1,759,608 paired-end sequences with a mean read length of 146.44 base pairs. The genome had a mean Phred score of 33.38, and 591 base pairs were dropped after quality filtering. The SPAdes assembly contained longer, higher quality contigs than the Velvet assembly and was used for downstream annotation and analysis (Supp. Figure 2).

The Donora 3 isolate cultivated in tryptic soy media had 1,665,436 paired-end sequences with a mean read length of 149.53 base pairs. The genome had a mean Phred score of 33.30, and 590 base pairs were dropped after quality filtering. The SPAdes assembly contained longer and higher quality contigs than the Velvet assembly (Supp. Figure 4) and was used for downstream analysis and annotation.

The longest contig was 64,486 base pairs and the total genome was 3,854,901 base pairs with a 66.63% GC content. The genome was assembled from 233 contigs and was determined to have 100% completeness based on marker lineage across 5,449 genomes (104 markers).

The longest contig was 467,849 base pairs and the total genome was 13,635,030 base pairs with a 47.12% GC content. The genome was assembled from 3,548 contigs and was determined to have 100% completeness based on marker lineage across 5,656 genomes (56 markers).

The results from KBase identified the isolates Inorganic Farm (98.426 ANI) and Donora 1 (98.4671 ANI) growing in tryptic soy media as closely related to Serratia liquefaciens, which was supported by the BLAST database, with the isolates scoring 99.61% and 99.15% identity, respectively. Both isolates had 100% query cover. The BLAST database identified the Donora 2 isolate as also being Serratia liquefaciens or a close relative, with 99.92% identity and 100% query cover.

KBase results showed that the Donora 3 (89.6983 ANI) isolate in tryptic soy media was most closely related to Bacillus mycoides.
A BLAST comparison of the 16S sequence matched the isolate to Bacillus cereus, or a close relative, with 97.08% identity and 100% query coverage.

Donora 2 in tryptic soy media contained 1,468,406 paired-end sequences with a mean read length of 149.16 base pairs and a mean Phred score of 33.31. After performing quality filtering, 301 pairs were dropped. The SPAdes assembly contained longer and higher quality contigs than the Velvet assembly (Supp. Figure 3) and was used for downstream annotation and analysis.

The Pollution 1 isolate growing in tryptic soy media contained 1,412,215 paired-end sequences with a mean read length of 142.42 base pairs. The mean Phred score was 33.31. After performing quality filtering, 474 base pairs were dropped. The SPAdes assembly contained better quality contigs (Supp. Figure 5) and was used for downstream annotation and analysis.

The longest contig was 68,093 base pairs and the total genome was 16,334,954 base pairs with a 59.1% GC content. The genome was assembled from 2,345 contigs and was determined to have 100% completeness based on marker lineage across 5,656 genomes (56 markers).

The longest contig was 531,839 base pairs and the total genome was 18,024,254 base pairs with a 55.91% GC content. The genome was assembled from 6,810 contigs and recorded 100% completeness based on marker lineage across 5,656 genomes (56 markers).

The KBase results for Donora 2 (98.1358 ANI) in tryptic soy media were inconclusive; however, the BLAST database confirmed the identity.

completeness based on marker lineage across 5,656 genomes (56 markers).

The isolate from Control 2 growing in tryptic soy media contained 2,214,563 paired-end sequences, with a mean read length of 140.98 base pairs and a mean Phred score of 33.49. After performing quality filtering, 527 base pairs were dropped. When compared to the Velvet assembly, the SPAdes assembly contained longer and higher quality contigs (Supp.
Figure 6) and was chosen for downstream annotation and analysis.

The isolate from Control 2 (92.5459 ANI) in nutrient broth was closely related to Alcaligenes faecalis, which was further confirmed by BLAST with a percent identity of 99.16% and 100% query coverage. Due to contamination, an additional 16S sequence was identified and run through BLAST, which identified it as Delftia acidovorans (97.6674 ANI) or a close relative, with a percent identity of 99.70%.

The longest contig was 390,323 base pairs and the total genome was 10,278,979 base pairs with a 55.81% GC content. The genome was assembled from 1,209 contigs and had 100% completeness based on marker lineage across 5,656 genomes (56 markers).

Isolate Control 2 grown in Lima bean media contained 1,694,457 paired-end sequences, with a mean read length of 149.27 base pairs and a mean Phred score of 33.32. Quality filtering resulted in 675 base pairs being dropped. The SPAdes assembly contained longer and higher quality contigs than the Velvet assembly (Supp. Figure 8) and was chosen for downstream annotation and analysis.

The KBase results for the identity of isolates Pollution 1 (97.9503 ANI) and Control 2 in tryptic soy media were inconclusive; however, the BLAST database suggested that both isolates were closely related to Alcaligenes faecalis, with 100% query coverage and percent identity scores of 99.11% and 99.16%, respectively.

The longest contig was 261,853 base pairs and the total genome was 6,083,381 base pairs with a 58.41% GC content. The genome was assembled from 2,100 contigs and had 100% completeness based on marker lineage across 5,449 genomes (104 markers).

Isolate Control 2 growing in nutrient broth contained 1,868,298 paired-end sequences with a mean read length of 149.30 base pairs. The mean Phred score was 33.34, and 699 base pairs were dropped after quality filtering.
The SPAdes assembly contained longer and higher quality contigs than the Velvet assembly (Supp. Figure 7) and was used in downstream analysis and annotation.

The isolate from Control 2 (91.2187 ANI) in Lima bean media was closely related to Stenotrophomonas maltophilia as suggested by KBase, and this was further confirmed by BLAST, with a percent identity of 93.54% and 100% query coverage.

During the genome assembly, it was discovered that all isolates contained some degree of contamination (Supp. Table 7). Because of this, the genome assemblies were not used for discussion purposes.

The longest contig was 688,228 base pairs and the total genome was 16,324,272 base pairs with a 62.55% GC content. The genome was assembled from 2,833 contigs and was determined to have 100% completeness based on marker lineage across 5,656 genomes (56 markers).

4. Discussion

4.1 Moisture and Organic Content
Organic matter is a major source of energy for microorganisms within the soil and can stimulate their growth and development. Changes in the microbial composition will directly affect organic matter measurements because of changes in the rate of carbon and nutrient cycling. Organic content can also reflect the activity of other soil organisms, such as earthworms, which are involved in carbon and nitrogen recycling through shedding organic residues and promoting microbial decomposition (Raj & Syriac, 2017).

Both the moisture and the organic content of soil directly affect the microbial communities that reside within it. The ideal agricultural soil composition is 45% mineral content, 5% organic matter content, and 50% pore space. For optimal plant growth, the pore space should be filled with equal parts water and air (Blumberg, 1982).

With an optimal moisture content of 25%, all the samples fell within normal ranges. The Donora site recorded the moisture content closest to optimal, at 23.56%. The Inorganic orchard recorded a slightly higher moisture content, 22.06%, than the Organic orchard, which recorded 21.07%.
However, the Organic farm recorded a higher moisture content, 23.48%, than the Inorganic farm, which recorded 20% (Figure 4). These results suggest that the soil collected from these sites is relatively healthy in composition.

The Control and Pollution sites recorded the lowest moisture content, 19.71% and 19.48% respectively (Figure 4). Although these values are lower than those of the other samples, the data do not indicate a substantial difference in soil composition across the samples. Given the similarities observed in moisture content across all sites, it can be concluded that moisture content is not a major contributor to the structure of microbial communities at these sites.

Organic matter can facilitate the release of heavy metals as well as immobilize them. This immobilization helps to prevent plant uptake of toxic metals, decreasing pollutant transport and redistribution from contaminated sites. Organic matter has been used as a soil amendment in polluted soil for this reason. Increased soil organic matter does not reduce the amount of heavy metal pollutants but does reduce their bioavailability to plants. An increase in soil organic matter has been found to decrease the amount of cadmium, zinc, and lead in tested plants (Kwiatkowska-Malina, 2018).

Herbicide application has been found to increase or decrease the amount of organic matter present and can be a contributing factor to abnormally high and low organic matter scores (Raj & Syriac, 2017).

With an optimal organic matter content of 5%, the SAI Organic site reported the closest values, with the Organic orchard scoring 5.56% and the Organic farm slightly lower at 4.84% (Figure 5). Combined with the moisture content results, these scores indicate that, of the sites tested, the SAI Organic sites contained the soil that was most agriculturally optimal in composition.
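The moisture comparison in this section amounts to measuring how far each site average sits from the 25% optimum. As a minimal sketch, using only the site averages quoted in the text (Figure 4) and an illustrative function name, the ranking can be reproduced as follows.

```python
OPTIMAL_MOISTURE_PERCENT = 25.0  # half of the 50% pore space filled with water

# Site averages as quoted in the text (Figure 4).
moisture_percent = {
    "Donora": 23.56,
    "Organic farm": 23.48,
    "Inorganic orchard": 22.06,
    "Organic orchard": 21.07,
    "Inorganic farm": 20.0,
    "Control": 19.71,
    "Pollution": 19.48,
}


def rank_by_distance_from_optimum(values, optimum):
    """Return (site, absolute gap) pairs, closest to the optimum first."""
    gaps = [(site, round(abs(value - optimum), 2)) for site, value in values.items()]
    return sorted(gaps, key=lambda pair: pair[1])


ranking = rank_by_distance_from_optimum(moisture_percent, OPTIMAL_MOISTURE_PERCENT)
for site, gap in ranking:
    print(f"{site}: {gap:.2f} percentage points from optimum")
```

Under this measure the Donora site is closest to the optimum and the Pollution site is furthest from it, which agrees with the ordering described in the text.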
Organic content of soil can be used as an indicator of soil health because it fluctuates to reflect biological changes induced by pollution and contamination, although an excess of organic matter is not favorable to plant growth and development.

The Inorganic site was slightly less optimal, with the Inorganic farm sample recording 6.89% and the Inorganic orchard 10.30% organic matter (Figure 5). This increase in organic matter could be due to the use of herbicides, which were not present at the SAI Organic site. Overall, the moisture and organic content results suggest that the soil from the Inorganic farm was more agriculturally optimal than that from the Inorganic orchard; however, both were within normal ranges, suggesting that a robust soil microbiome should be present at both sites.

The Donora site received a score of 8.95% organic matter, which is comparable to the optimal score (Figure 5). Together, the organic matter and moisture content data provide evidence that the soil from the Donora site is relatively healthy in composition. The higher than optimal score for organic content could reflect higher microbial biomass. It has been found that an increase in organic matter in soil is indicative of increased enzyme activity. This suggests that larger microbial communities are present within the soil (Bending et al., 2002).

The Control and Pollution sites contained the highest amount of organic matter, scoring 19.71% and 14.84% respectively (Figure 5). These scores are much higher than the optimal score and, combined with the moisture content data, suggest that the soil collected from these two sites may contain higher microbial biomass. The excessive amount of organic matter could indicate increased enzyme activity due to the availability of nutrients for soil microorganisms (Li et al., 2018). One data point for Control 2 was an extreme outlier, recording 37.35% organic matter; if this data point is excluded when averaging, the Control site receives a score of 10.89% organic matter (Figure 5, Supp. Table 4, & Supp. Table 5). This data point may have been caused by a failure to remove large rocks and other debris from the soil sample prior to drying the samples in the oven, causing an inaccurate score. Even with this data point excluded, the soil composition of the Control site is still less optimal than that of the Donora, SAI Organic, and Inorganic sites, but slightly more optimal than that of the Pollution site. It is likely that the agricultural soil samples were closest to optimal levels due to human intervention and maintenance. The Control and Pollution sites lacked human intervention, allowing organic matter to accumulate over time and creating an environment in which a diverse community of microorganisms can thrive.

The elevated levels of organic matter at the Donora, Pollution, and Control sites likely reflect heavy metal contamination within these soils. The adaptation of the soil to accumulate organic matter would benefit both plants and soil microorganisms in polluted areas. An increase in soil organic matter would immobilize toxic metals, preventing their uptake by plants, but would also provide microorganisms with abundant resources to grow and develop (Kwiatkowska-Malina, 2018).

4.2 Zinc Content
A 1998 study determined the heavy metal content of Pennsylvania soils based on samples collected in the 1980s. It reported a mean zinc content of 53.44 PPM for Washington County and 37.29 PPM for Greene County. The study mentions that zinc concentrations were higher in surface horizons than in subsurface horizons and that this is most likely due to man-made pollutants. The zinc concentrations were found to be correlated with both organic matter content and pH, among other factors (Ciolkosz et al., 1998).

Zinc concentrations were measured at the Control, Pollution and Donora sites.
The Control site, found within Washington County, contained the lowest amount of zinc (3.945 PPM) (Table 1). This value is unexpected given the site's proximity to the Donora site, as well as the reported baseline zinc content for that area. It is suspected that, due to the geology of the Control site, the low amount of zinc could be attributed to runoff of heavy metals released during rainfall (Wei et al., 2019).

The Pollution site, found within Greene County, contained 21.645 PPM of zinc, which is broadly comparable to the expected value (Table 1). This value is lower than the previously reported data, but the difference could be attributed to the time gap between studies (Ciolkosz et al., 1998). It is possible that the difference between the results and the expected value is due to the man-made contribution. Although there is a pollution source actively affecting this soil, it may not be emitting the same contaminants. Power stations have been found to emit differing contaminants such as sulfur, nitrogen, and carbon oxides, none of which were examined in this study, and this may explain the decrease in zinc observed in the results (Minnikova et al., 2017).

The Donora site, found within Washington County, contained the highest concentration of zinc, 51.48 PPM (Table 1). Although this value is the highest of the sites examined within this study, it is very similar to the previously reported values (Ciolkosz et al., 1998). Zinc levels differ greatly across soils (10-300 PPM), and many other factors contribute to zinc toxicity, including soil composition and organic matter content. Although the zinc level at the Donora site was the highest of the experimental sites, the value still falls within a normal range (Wade, 2019). Given the almost 40-year difference between the soil collections, a change in the zinc content of the soil would be expected, but none was observed, unlike at the Control and Pollution sites, where a large drop in zinc concentration was observed. This provides evidence to support the claim that zinc pollution is persistent across many decades.

4.3 Zinc Growth Curves
Heavy metal contamination of soil dramatically affects the microbial communities that live within it. High levels of heavy metal contamination in soil can substantially reduce the total microbial biomass as well as significantly alter the bacterial community structure (Sandaa et al., 1999). A 2011 publication on the bacterium S. pneumoniae provided evidence that bacterial growth was completely inhibited at zinc concentrations of < 1 mM (Sandaa et al., 1999). Another study from 1998 notes a decrease in biomass production at a zinc concentration of just 0.153 mM (Sandaa et al., 1999).

The results from the Lima bean plate (Figure 6A) suggest that the isolate from Control 2 exhibited the most resistance to zinc at 80 mM. The isolate at 80 mM performed better than at all other concentrations, including its control. This suggests that the isolate may possess a metabolic capability that allows it to utilize zinc at high concentrations to increase its growth and development. At such a high concentration, it was expected that none of the bacteria would be able to grow. Because this isolate performed better at the highest concentrations of zinc than at lower concentrations, it can be suggested that the isolate has become adapted to a high zinc environment.

The Donora isolates grew in at least 20 mM, or 130.78 PPM, of zinc, with the Donora 1 and Donora 2 isolates (Figure 7A and 7B) showing inhibition at 40 mM, or 261.56 PPM. These values are much higher than the amount of zinc found within the soil and again speak to the adaptability of the isolates.

The results from the nutrient broth plate (Figure 6B) show that the isolate experienced inhibition at 40 mM. Initial reads indicate that the isolate thrived in a 5 mM concentration, performing better than its control.
This suggests that the bacterial communities from the Control site possess zinc resistance capabilities.

The results from the tryptic soy plate indicate that the isolate can perform just as well as its control at the three lowest experimental concentrations of zinc (Figure 6C). These results display the zinc resistance of this isolate.

The results for the Control site were unexpected because the site was not anticipated to contain high levels of zinc contamination (Figure 6). These results, in conjunction with the low zinc levels found at the site (Table 1), were a surprising discovery. The findings from the zinc growth curve experiment suggest that zinc resistant bacteria found within uncontaminated sites are still able to persist within high concentrations of zinc contamination.

The Pollution 1 isolate cultivated in tryptic soy media (Figure 7D) displayed a more uniform level of inhibition than the other tested isolates. The isolate appears to be affected incrementally as the zinc concentration increases, whereas other isolates appeared to be affected only between the 20 mM and 40 mM zinc concentrations (Figure 7). The zinc content experiment showed that the soil obtained from the Pollution site contained 21.645 PPM zinc (Table 1). At 5 mM, or 32.695 PPM, the isolate experienced very little inhibition. These results were expected given the concentration of zinc found within the soil.

The results from the tryptic soy plate (Figure 7A-C) show that, of the Donora isolates, Donora 3 (Figure 7C) had the lowest zinc resistance. All three Donora isolates cultivated in tryptic soy media exhibited the ability to grow in 5 mM zinc with little to no inhibition (Figure 7A-C). Even at 5 mM zinc, growth was not expected; therefore, these results provide evidence that the Donora isolates are not inhibited by relatively high zinc concentrations.
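One simple way to put a number on "inhibition compared to its control" is to average the duplicate wells at each read and express the final reading as a percentage of the 0 mM control. The sketch below uses made-up absorbance values, and the final-read ratio is an assumed metric rather than the analysis performed in this study; the function names are hypothetical.

```python
def average_duplicates(replicate_a, replicate_b):
    """Average two duplicate absorbance series read by read."""
    if len(replicate_a) != len(replicate_b):
        raise ValueError("duplicate series must have the same number of reads")
    return [(a + b) / 2 for a, b in zip(replicate_a, replicate_b)]


def final_read_percent_of_control(treated, control):
    """Final absorbance of a treated curve as a percentage of the control curve."""
    return 100.0 * treated[-1] / control[-1]


# Placeholder absorbance values (not measurements from this study).
control_curve = average_duplicates([0.25, 0.75, 1.5, 2.0], [0.25, 1.25, 1.5, 2.0])
treated_curve = average_duplicates([0.25, 0.5, 0.75, 1.0], [0.25, 0.5, 1.25, 1.0])

growth_percent = final_read_percent_of_control(treated_curve, control_curve)
print(f"Final read is {growth_percent:.1f}% of control")
```

A value near 100% would correspond to the "little to no inhibition" described in the text, and a value above 100% to an isolate that outperformed its control; comparing whole curves, rather than only the final read, would be needed to capture the early-read differences noted for reads 1-2.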
The zinc content test revealed that the soil from the Donora site contained 51.48 PPM of zinc (Table 1). The isolates obtained from the Donora site displayed the capacity to grow at higher concentrations than this.

The isolate from the Inorganic site, designated “IO”, experienced minimal inhibition at lower concentrations of zinc (Figure 7E). Although zinc content was not measured for the soil obtained from this site, these results show that the isolate also retained some degree of zinc resistance, comparable to that of isolates from other sites.

The data from this experiment revealed the remarkable capability of the bacterial communities to persist in extremely high concentrations of zinc. Of the 28 isolates screened for zinc resistance, 8 exhibited the ability to grow in elevated levels of zinc (Figure 6 and 7).

Based on the results of the zinc content test (Table 1), it was anticipated that the Donora isolates would be able to grow in at least 5 mM, or 32.695 PPM, and 10 mM, or 65.39 PPM. Not only did the Donora isolates display this ability, but they also showed very minimal inhibition at these concentrations, which was an interesting result.

Again, based on the results of the zinc content test (Table 1), it was expected that the Pollution isolates would be able to grow in at least 5 mM concentrations of zinc. The results were consistent with this expectation, and even displayed the isolate’s ability to grow at higher concentrations. Although the isolate experienced a degree of inhibition at higher concentrations, its ability to grow at 40 mM, or 261.56 PPM, was not expected and demonstrates zinc resistance.

The initial hypothesis states that microbial communities exposed to long-term zinc pollution will exhibit less growth inhibition in higher zinc concentrations.
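Because this section compares millimolar concentrations in growth media with PPM values measured in soil, a quick unit check is useful. The sketch below is offered as a check rather than a correction. It assumes a zinc molar mass of 65.39 g/mol and the common dilute-solution convention that 1 PPM is about 1 mg/L; the function and variable names are illustrative. Under those assumptions the direct product is ten times the PPM figures quoted in the text, which may reflect a dilution step or a basis not stated in this section, so the quoted values are worth confirming against the original protocol.

```python
# g/mol; assumed value, numerically the same as the 65.39 that appears in the text.
ZINC_MOLAR_MASS_G_PER_MOL = 65.39

# PPM values as quoted in the text for each millimolar concentration.
QUOTED_PPM = {5: 32.695, 10: 65.39, 20: 130.78, 40: 261.56}


def zinc_mm_to_mg_per_l(conc_mm: float) -> float:
    """Convert mmol/L of zinc to mg/L (mmol/L times mg/mmol)."""
    return conc_mm * ZINC_MOLAR_MASS_G_PER_MOL


def ratio_to_quoted(conc_mm: int) -> float:
    """Ratio of the direct mg/L product to the PPM value quoted in the text."""
    return zinc_mm_to_mg_per_l(conc_mm) / QUOTED_PPM[conc_mm]


for conc in sorted(QUOTED_PPM):
    print(f"{conc} mM -> {zinc_mm_to_mg_per_l(conc):.2f} mg/L "
          f"(quoted: {QUOTED_PPM[conc]} PPM, ratio {ratio_to_quoted(conc):.1f})")
```

Whichever basis applies, the quoted values scale linearly with concentration, so the relative comparisons between isolates in this section are unaffected; only the comparison against soil zinc content in PPM depends on the conversion.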
4.4 Taxonomic Identification of Isolates
Confirming the identity of the isolates that exhibited resistance to zinc can help to infer characteristics of the soil and to predict the metabolic potential of the microbial community within it.

Samples TSB Inorganic Farm “IO”, TSB Donora 1, and TSB Donora 2 were all identified as Serratia liquefaciens. Abundant research confirms that S. liquefaciens is commonly found in industrial effluents known to be polluted with heavy metals (Kumar et al., 2019; Ramya & Boominathan, 2017; Zagui et al., 2021). This species has been reported to exhibit resistance to cadmium as well as zinc and has been used in trial studies as an agent for bioremediation, in which the bacterium was isolated from a sample of polluted water and cultivated in laboratory conditions. The bacterium displayed resistance to heavy metals and was further shown to significantly reduce heavy metal concentration, by 44.46% when compared to its control (Kumar et al., 2019).

No growth was expected for the Control isolates under any concentration of zinc based on the results of the zinc content test (Table 1). The ability of the isolates to grow at high concentrations of zinc was surprising and suggests that the isolates retain zinc resistance capabilities.

Zinc resistance was observed in isolates across all experimental sites, in both polluted and unpolluted soils. This speaks to the resilience of microorganisms and to their ability to persist in contaminated environments.

Of all the sites tested, the Control site is the closest in proximity to the Donora site and can be considered soil that may have been affected by the smog. Given the low amount of zinc found within the Control site soil (3.945 PPM; Table 1), it can be stated that the isolates were not forced to exist within a zinc contaminated environment.
The proximity to the pollution event can help to explain the zinc resistance of these isolates in the Lima bean and nutrient broth plates. For this reason, the zinc resistance observed in the Control isolates corroborates the initial hypothesis.

A study conducted to investigate the capabilities of Serratia liquefaciens identified a total of 110 genes related to iron acquisition and metabolism. Of these genes, 49 were associated with siderophore biosynthesis, secretion, and iron internalization. The other 61 genes were associated with metal metabolism. It was determined through experimentation that a specific strain within the Serratia liquefaciens clade, SlFG3, produces two siderophores, one of which has chemical characteristics of catecholates and produces a purple color after development. This strain was also found in extreme environmental conditions and displayed high adaptability (Caneschi et al., 2019). This research helps to explain the purple hue emitted from the TSB Inorganic farm colony and displays the unique capabilities and adaptability of this strain of bacteria.

While all three isolates were determined to be Serratia liquefaciens, only one of the three, the Inorganic Farm isolate, exhibited purple growth. It is also important to note that the three isolates displayed differing degrees of zinc resistance (Figure 7). Isolate Donora 1 displayed a higher degree of inhibition when compared to the other two isolates. While all isolates were able to grow at the 40 mM concentration, isolate Donora 1 performed better relative to its control than the other two isolates (Figure 7). These findings suggest that the isolates are different strains of Serratia liquefaciens.

The TSB Donora 3 bacterium was determined to be Bacillus cereus, a common soil bacterium that has been identified as having resistance to lead, cadmium, and chromium. A study conducted on the NWUAB01 strain of Bacillus cereus discovered an abundance of heavy metal resistance genes against arsenic, cadmium, copper, cobalt, and zinc within the genome. These defining characteristics make this species another promising contender for use in bioremediation techniques and help to explain its presence in areas affected by heavy metal industrial pollution (Ayangbenro & Babalola, 2020).

The bacteria from TSB Pollution 1 and TSB Control 2 were identified as Alcaligenes faecalis, which has been reported to demonstrate tolerance to heavy metal micropollutants such as nickel, cadmium, copper, zinc, lead, and chromium. A strain of Alcaligenes faecalis subsp. phenolicus, MB207, underwent genome sequencing and annotation to document its genes and the capabilities they confer. Genes commonly used in metal transport into and out of the cell, in metal detoxification, and in survival in highly metal-stressed environments were detected within the analyzed genome. The genome has also been associated with nanoparticle production as well as the conversion of the most toxic form of arsenic to its less dangerous form (Basharat et al., 2018). The results from the zinc growth curves show differing inhibition for these isolates: Control 2 LB showed a preference for the 80 mM concentration, performing better than the control (Figure 6), whereas Pollution 1 did not show this preference (Figure 7). This suggests that these isolates may be differing strains of Alcaligenes faecalis. This species is another contender for use in bioremediation due to its zinc resistance. Although it has metal detoxifying properties, this bacterium is known to be a common soil dweller, so its presence was not unexpected. Because it is ubiquitous, it is expected to be found in all environments, polluted and nonpolluted.

NB Control 2 returned two high-scoring hits from BLAST; therefore, the identity of this isolate could not be determined. The top hits were Alcaligenes faecalis and Delftia acidovorans. As mentioned above, Alcaligenes faecalis has previously been shown to have metal resistance capabilities. The Delftia genus has also been found to be resistant to zinc, lead, selenium, copper, aluminum, and nickel. A specific Delftia species has also been shown to have lead and zinc sorption capacities (Bautista-Hernández et al., 2012). These findings are supported by the zinc growth curve obtained in this study, which showed that the isolate performed better at the 80 mM concentration than at the 40 mM concentration.

The LB Control 2 sample was positively identified as Stenotrophomonas maltophilia, a heavily studied soil bacterium with remarkable abilities. It has been classified as an opportunistic pathogen that causes nosocomial infections and has also displayed antibiotic resistance (Sanchez, 2015). S. maltophilia has also been shown to tolerate high levels of toxic metals including cadmium, lead, copper, zinc, mercury, silver, selenite, tellurite, and uranyl (Pages et al., 2008). A 2008 study provides data suggesting that, in addition to its high tolerance of antibiotics, the bacterium has developed at least two different mechanisms to overcome metal toxicity, including the detoxification of cadmium to cadmium sulfide (Pages et al., 2008).

In uncontaminated environments, these bacteria may be less prevalent. Further research involving the overall microbial community structure in contaminated and uncontaminated environments could provide insight into the competitiveness of these bacteria and their persistence.

5. Conclusion
The findings of this study suggest that industrial pollution can persist for decades following the pollution event.
The bacteria isolated from the soil have displayed the capability to grow in heavy metal contaminated environments, especially those contaminated with zinc, which suggests that this characteristic has aided their survival within the polluted soil. The soil from Donora contained the highest zinc content, providing evidence of persisting pollution. The results of this study show that the damage caused by the Donora Smog event and the long years of industrialization in Donora still affects the soil and the microorganisms within it to this day.

While zinc resistance genes within the isolates’ genomes could not be conclusively confirmed due to contamination, previously published literature supports the zinc resistance of the identified organisms. The ability of all of the species identified to resist heavy metals is consistent with the expected results.

Zinc resistance was found across varying soil conditions, not only in contaminated environments. This suggests that these microorganisms do not depend on the contamination for survival, but rather persist within a wide range of environments, utilizing their zinc resistance capabilities should that environment become contaminated. This ability to survive within polluted soils would allow them to outcompete more metal-sensitive microorganisms within contaminated sites, like Donora, shifting the microbial community structure.

6. References

1948 smog. Donora Historical Society and Smog Museum. (n.d.). Retrieved April 28, 2022, from https://www.sites.google.com/site/donorahistoricalsociety/donora-history/1948-smog

Ayangbenro, A. S., & Babalola, O. O. (2020). Genomic analysis of Bacillus cereus NWUAB01 and its heavy metal removal from polluted soil. Scientific Reports, 10(1). https://doi.org/10.1038/s41598-020-75170-x

Balafrej, H., Bogusz, D., Triqui, Z.-E. A., Guedira, A., Bendaou, N., Smouni, A., & Fahr, M. (2020). Zinc hyperaccumulation in plants: A Review. Plants, 9(5), 1–22. https://doi.org/10.3390/plants9050562

Baranauskas, L.
(2017, November 29). The historically hazy story of Donora's deadly smog. Atlas Obscura. Retrieved April 25, 2022, from https://www.atlasobscura.com/articles/donorasmog-1948

Basharat, Z., Yasmin, A., He, T., & Tong, Y. (2018). Genome sequencing and analysis of Alcaligenes faecalis subsp. phenolicus MB207. Scientific Reports, 8(1). https://doi.org/10.1038/s41598-018-21919-4

Bautista-Hernández, D. A., Ramírez-Burgos, L. I., Duran-Páramo, E., & Fernández-Linares, L. (2012). Zinc and lead Biosorption by Delftia tsuruhatensis: A bacterial strain resistant to metals isolated from mine tailings. Journal of Water Resource and Protection, 04(04), 207–216. https://doi.org/10.4236/jwarp.2012.44023

Bending, G. D., Turner, M. K., & Jones, J. E. (2002). Interactions between crop residue and soil organic matter quality and the functional diversity of soil microbial communities. Soil Biology and Biochemistry, 34(8), 1073–1082. https://doi.org/10.1016/s0038-0717(02)00040-8

Ivel, J. (n.d.). Donora, Pennsylvania Smog Event of 1948. Retrieved April 25, 2022, from http://www.soe.uoguelph.ca/webfiles/gej/AQ2017/Ivel/index.html

Jacobs, E. T., Burgess, J. L., & Abbott, M. B. (2018). The Donora Smog Revisited: 70 years after the event that inspired the Clean Air Act. American Journal of Public Health, 108(S2). https://doi.org/10.2105/ajph.2017.304219

Kour, R., Bhojiya, A. A., Meena, R. H., Singh, A., Mohanty, S. R., Rajpurohit, D., & Ameta, K. D. (2020). Zinc tolerant plant growth promoting bacteria alleviates phytotoxic effects of zinc on maize through zinc immobilization. Scientific Reports, 10(1), 1–13. https://doi.org/10.1038/s41598-020-70846-w

Kour, R., Jain, D., Bhojiya, A. A., Sukhwal, A., Sanadhya, S., Saheewala, H., Jat, G., Singh, A., & Mohanty, S. R. (2019). Zinc biosorption, biochemical and molecular characterization of plant growth-promoting zinc-tolerant bacteria. 3 Biotech, 9(11), 1–17.
https://doi.org/10.1007/s13205-019-1959-2 Bleiwas, D. I., & DiFrancesco, C. (2010). Historical zinc smelting in New Jersey, Pennsylvania, Virginia, West Virginia, and Washington, D.C., with estimates of atmospheric zinc emissions and other materials. U.S. Geological Survey Open File Report, 1131. https://doi.org/10.3133/ofr20101131 Kumar, P., Gupta, B. S., Anurag, & Soni, R. (2019). Bioremediation of Cadmium by Mixed Indigenous Isolates Serratia liquefaciens BSWC3 and Klebsiella Pneumoniae RpSWC3 Isolated from Industrial and Mining Affected Water Samples. Pollution, 5(2), 351–360. https://doi.org/10.22059/poll.2018.268603.533 Blumberg, B. (1982). An Introduction to Soils of Pennsylvania (dissertation). The Pennsylvania State University, University Park, PA. Caneschi, W. L., Sanchez, A. B., Felestrino, É. B., Lemes, C. G., Cordeiro, I. F., Fonseca, N. P., Villa, M. M., Vieira, I. T., Moraes, L. Â., Assis, R. de, do Carmo, F. F., Kamino, L. H., Silva, R. S., Ferro, J. A., Ferro, M. I., Ferreira, R. M., Santos, V. L., Silva, U. de, Almeida, N. F., … Moreira, L. M. (2019). Serratia liquefaciens FG3 isolated from a metallophyte plant sheds light on the evolution and mechanisms of adaptive traits in extreme environments. Scientific Reports, 9(1), 1–16. https://doi.org/10.1038/s41598-019-54601-4 Kwiatkowska-Malina, J. (2018). Functions of organic matter in polluted soils: The effect of organic amendments on phytoavailability of heavy metals. Applied Soil Ecology, 123, 542–545. https://doi.org/10.1016/j.apsoil.2017.06.021 Li, L., Xu, M., Eyakub Ali, M., Zhang, W., Duan, Y., & Li, D. (2018). Factors affecting soil microbial biomass and functional diversity with the application of organic amendments in three contrasting cropland soils during a field experiment. PLOS ONE, 13(9). https://doi.org/10.1371/journal.pone.0203812 Ciolkosz, E. J., Stehouwer, R. C., & Amistadi, M. K. (1998). Metals Data for Pennsylvania Soil (dissertation). 
Pennsylvania State University, University Park, PA. Environmental Pollution Centers. (2022). Zinc poisoning. Environmental Pollution Centers. Retrieved April 25, 2022, from https://www.environmentalpollutioncenters.org/zi nc/ Minnikova, T. V., Denisova, T. V., Mandzhieva, S. S., Kolesnikov, S. I., Minkina, T. M., Chaplygin, V. A., Burachevskaya, M. V., Sushkova, S. N., & Bauer, T. V. (2017). Assessing the effect of heavy metals from the Novocherkassk power station emissions on the biological activity of soils in the adjacent areas. Journal of Geochemical Exploration, 174, 70–78. https://doi.org/10.1016/j.gexplo.2016.06.007 Greenspec. (2022). Zinc Production & Environmental Impact. Greenspec. Retrieved April 25, 2022, from https://www.greenspec.co.uk/building-design/zincproduction-environmental-impact/ Pages, D., Rose, J., Conrod, S., Cuine, S., Carrier, P., Heulin, T., & Achouak, W. (2008). Heavy Metal Tolerance in Stenotrophomonas Maltophilia. PLoS ONE, 3(2). https://doi.org/10.1371/journal.pone.0001539 Helfand, W. H., Lazarus, J., & Theerman, P. (2001). Donora, Pennsylvania: An environmental disaster of the 20th Century. American Journal of Public Health, 91(4), 553–553. https://doi.org/10.2105/ajph.91.4.553 20 Raj, S. K., & Syriac, E. K. (2017). Herbicidal effect on the bioindicators of soil health- A Review. Journal of Applied and Natural Science, 9(4), 2438–2448. https://doi.org/10.31018/jans.v9i4.1551 Maltophilia. Frontiers in Microbiology, 6. https://doi.org/10.3389/fmicb.2015.00658 Wade, K. M. (2019). Zinc. plantprobs.net. Retrieved April 28, 2022, from https://plantprobs.net/plant/nutrientImbalances/zi nc.html Ramya, R., & Boominathan, M. (2017). Isolation of Serratia Liquefaciens as Metal Resistant Bacteria from Industrial Effluent. International Journal of Advance Research, Ideas and Innovations in Technology, 3(6), 1272–1275. Wei, L., Liu, Y., Routh, J., Tang, J., Liu, G., Liu, L., Luo, D., Li, H., & Zhang, H. (2019). 
Release of heavy metals and metalloids from two contaminated soils to surface runoff in southern China: A simulated-rainfall experiment. Water, 11(7), 1339. https://doi.org/10.3390/w11071339 Rossi, R. J., Bain, D. J., Hillman, A. L., Pompeani, D. P., Finkenbinder, M. S., & Abbott, M. B. (2017). Reconstructing early industrial contributions to legacy trace metal contamination in southwestern Pennsylvania. Environmental Science & Technology, 51(8), 4173–4181. https://doi.org/10.1021/acs.est.6b03372 Zagui, G. S., Moreira, N. C., Santos, D. V., Darini, A. L., Domingo, J. L., Segura-Muñoz, S. I., & Andrade, L. N. (2021). High occurrence of heavy metal tolerance genes in bacteria isolated from wastewater: A new concern? Environmental Research, 196, 110352. https://doi.org/10.1016/j.envres.2020.110352 Sandaa, R.-A., Torsvik, V., Enger, Ã. I., Daae, F. L., Castberg, T., & Hahn, D. (1999). Analysis of bacterial communities in heavy metal-contaminated soils at different levels of resolution. FEMS Microbiology Ecology, 30(3), 237–251. https://doi.org/10.1111/j.15746941.1999.tb00652.x Snyder, L. P. (1994). “The death-dealing smog over Donora, Pennsylvania”: Industrial Air Pollution, public health policy, and the politics of expertise, 1948–1949. Environmental History Review, 18(1), 117–139. https://doi.org/10.2307/3984747 Sánchez, M. B. (2015). Antibiotic resistance in the opportunistic pathogen Stenotrophomonas 21 APPENDICES A.1 Soil Collection Data Supplemental Table 1. Soil Collection Data for the Seven Sample Sites Sample Name Donora Control Pollution Organic Farm Organic Orchard Inorganic Farm Inorganic Orchard Collection Date 05/17/2021 05/24/2021 05/17/2021 05/17/2021 05/17/2021 05/20/2021 05/20/2021 GPS Coordinate Address (40°10’41” N 79°51’11” W) (40°02’40” N 79°54’51” W) (39°52’44” N 79°55’13” W) (40°02’50” N 79°54’49” W) (40°02’50” N 79°54’49” W) (40°04’48” N 79°08’20” W) (40°04’48” N 79°08’20” W) 470 Galiffa Dr. Donora PA 377 E Malden Dr. 
Coal Center PA 41 Haig Ave. Carmichaels PA 377 E Malden Dr. Coal Center PA 377 E Malden Dr. Coal Center PA 1665 Coxes Creek Rd. Somerset PA 745 Edie Rd. Somerset PA A.2 Soil Moisture Raw Data Supplemental Table 2: Raw Data Moisture Content Test Sample Name Inorganic Orchard 1 Inorganic Orchard 2 Inorganic Orchard 3 Inorganic Farm 1 Inorganic Farm 2 Inorganic Farm 3 Organic Farm 1 Organic Farm 2 Organic Farm 3 Pollution 1 Pollution 2 Pollution 3 Organic Orchard 1 Organic Orchard 2 Organic Orchard 3 Control 1 Control 2 Control 3 Donora 1 Donora 2 Donora 3 Weight of the foil 2.01 2.02 2.02 2.01 2.03 1.98 1.99 2 2.02 2 2.02 2.02 2.02 2 2.02 2.02 2 2.02 2.01 2.04 2.01 Wet Weight of soil 17.68 10.44 12.54 22.25 25.87 26.85 27.68 23.64 26.12 10.02 13.48 15.22 21.42 24.76 21.01 14.43 8.95 13.81 10.42 18.22 16.97 22 Dry Weight of soil with foil 16.16 10.07 11.64 20.67 21.85 23.33 22.58 20.54 22.06 9.58 13.37 14.46 18.28 22.08 18.78 13.34 8.57 14.31 9.73 16.16 15.2 Dry weight of soil only Calculated Moisture % 14.15 8.05 9.62 18.66 19.82 21.35 20.59 18.54 20.04 7.58 11.35 12.44 16.26 20.08 16.76 11.32 6.57 12.29 7.72 14.12 13.19 19.97 22.89 23.29 16.13 23.39 20.48 25.61 21.57 23.27 24.35 15.81 18.27 24.09 18.9 20.23 21.55 26.59 11.01 25.91 22.5 22.27 Supplemental Table 3: Average Moisture Content Test and Calculated Standard Deviation Site Name Inorganic Orchard Inorganic Farm Organic Farm Pollution Organic Orchard Control Donora Average Moisture % 22.06 20.00 23.48 19.48 21.07 19.71 23.56 Standard Deviation 1.789450567 3.653724128 2.028431249 4.396013345 2.695817749 7.950153038 2.03840624 A.3 Soil Organic Matter Raw Data Supplemental Table 4: Raw Data Organic Matter Content Test Sample Name Inorganic Orchard 1 Inorganic Orchard 2 Inorganic Orchard 3 Inorganic Farm 1 Inorganic Farm 2 Inorganic Farm 3 Organic Farm 1 Organic Farm 2 Organic Farm 3 Pollution 1 Pollution 2 Pollution 3 Organic Orchard 1 Organic Orchard 2 Organic Orchard 3 Control 1 Control 2 Control 3 Donora 1 
Donora 2 Donora 3 Weight of crucible 7.18 17.56 17.51 29.46 17.58 29.25 16.62 16.68 30.48 27.37 16.58 15.96 16.54 17.16 16.68 28.62 9.44 10.58 6.57 10.23 7.28 Weight of soil 5.31 5.57 5.76 5.15 5.24 5.05 5.81 5.69 5.63 5.22 5.27 5.44 5.6 5.83 5.28 5.59 5.06 5.23 5.11 5.3 5.25 Dry Weight of soil and crucible 12.05 22.46 22.66 33.96 22.44 34.26 22.09 22.17 35.82 31.86 20.88 20.74 21.85 22.64 21.67 33.42 12.61 15.41 11.15 15.02 12.17 23 Dry weight of soil only 4.87 4.9 5.15 4.5 4.86 5.01 5.47 5.49 5.34 4.49 4.3 4.78 5.31 5.48 4.99 4.8 3.17 4.83 4.58 4.79 4.89 Calculated Organic Matter % 8.29 12.03 10.59 12.62 7.25 0.79 5.85 3.51 5.15 13.98 18.41 12.13 5.18 6.00 5.49 14.13 37.35 7.65 10.37 9.62 6.86 Supplemental Table 5: Average Organic Matter Content Test and Calculated Standard Deviation Site Name Inorganic Orchard Inorganic Farm Organic Farm Pollution Organic Orchard Control Donora Average Organic Matter % 10.30 6.89 4.84 14.84 5.56 19.71 8.95 Standard Deviation 1.8878457 5.922661268 1.199290492 3.223336909 0.416337915 15.61777227 1.851225103 A.4 Zinc Content Raw Data Supplemental Table 6: Absorbency readings and calculated zinc concentrations Site Name Control Donora Pollution Absorbency Concentration Calculated Zinc Content (PPM)* -0.0086 0.0789 3.945 0.087 1.0295 51.48 0.027 0.4329 21.645 *The calculated average absorbency of isolates from Pollution, Control, and Donora sites after 9 reads. 
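The calculated columns in Supplemental Tables 2 through 6 follow from simple arithmetic on the raw weights and concentrations. The sketch below is a minimal Python illustration, not the procedure used in the study. The function names are hypothetical and the formulas are inferred from the tabulated values. The factor of 50 relating concentration to zinc content is an assumption read off the ratio of the two columns in Supplemental Table 6.

```python
from statistics import mean, stdev


def moisture_percent(wet_soil, dry_soil_with_foil, foil):
    """Moisture lost on drying, as a percentage of the wet soil weight."""
    dry_soil = dry_soil_with_foil - foil  # "Dry weight of soil only"
    return (wet_soil - dry_soil) / wet_soil * 100


def organic_matter_percent(soil, dry_soil_with_crucible, crucible):
    """Weight lost from the sample, as a percentage of the starting soil weight."""
    dry_soil = dry_soil_with_crucible - crucible  # "Dry weight of soil only"
    return (soil - dry_soil) / soil * 100


def zinc_ppm(concentration, dilution_factor=50):
    """Scale a measured concentration to soil zinc content.

    The default factor of 50 is an assumption inferred from the table,
    not a value stated in the text.
    """
    return concentration * dilution_factor


# Control 1 row of Supplemental Table 2: wet soil, dry soil with foil, foil.
print(round(moisture_percent(14.43, 13.34, 2.02), 2))

# Donora 1 row of Supplemental Table 4: soil, dry soil with crucible, crucible.
print(round(organic_matter_percent(5.11, 11.15, 6.57), 2))

# Site summary from the three tabulated Control moisture percentages,
# using the sample standard deviation (n - 1 in the denominator).
control_moisture = [21.55, 26.59, 11.01]
print(round(mean(control_moisture), 2), round(stdev(control_moisture), 4))

# Donora row of Supplemental Table 6.
print(zinc_ppm(1.0295))
```

Applied to the Control 1 and Donora 1 rows, the first two functions return about 21.55 and 10.37, in line with the tabulated percentages. The sample standard deviation of the three Control moisture values comes out at about 7.95, which agrees with Supplemental Table 3 and suggests that the n - 1 form was used there.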
A.5 Growth Curve Raw Data

Supplemental Table 7: Absorbance readings, Control 2 LB Results
Supplemental Table 8: Absorbance readings, Control 2 NB Results
Supplemental Table 9: Absorbance readings, Control 2 TSB Results
Supplemental Table 10: Absorbance readings, Donora 1 TSB Results
Supplemental Table 11: Absorbance readings, Donora 2 TSB Results
Supplemental Table 12: Absorbance readings, Donora 3 TSB Results
Supplemental Table 13: Absorbance readings, Pollution 1 TSB Results
Supplemental Table 14: Absorbance readings, Inorganic Farm TSB Results

A.6 Genome Assembly Data

Supplemental Table 15: Shotgun Sequencing Data Assembly Results

| Sample | Total Sequences | Mean read length | Phred score | Completeness | Contamination | ANI |
|---|---|---|---|---|---|---|
| Inorganic Farm - TSB | 1,984,094 | 147.68 | 33.36 | 100% | 258.71 | 98.426 |
| Donora 1 - TSB | 1,759,608 | 146.44 | 33.38 | 100% | 243.76 | 98.4671 |
| Donora 2 - TSB | 1,468,406 | 149.16 | 33.31 | 100% | 206.13 | 98.1358 |
| Donora 3 - TSB | 1,665,436 | 149.53 | 33.30 | 100% | 145.47 | 89.6983 |
| Pollution 1 - TSB | 1,412,215 | 142.42 | 33.31 | 100% | 265.16 | 97.9503 |
| Control 2 - TSB | 2,214,563 | 140.98 | 33.49 | 100% | 133.26 | 98.0441 |
| Control 2 - NB | 1,868,298 | 149.30 | 33.34 | 100% | 237.97 | 92.5459 |
| Control 2 - LB | 1,694,457 | 149.27 | 33.32 | 100% | 34.96 | 91.2187 |

Supplemental Figure 1: Inorganic Farm TSB QUAST Comparison
Supplemental Figure 2: Donora 1 TSB QUAST Comparison
Supplemental Figure 3: Donora 2 TSB QUAST Comparison
Supplemental Figure 4: Donora 3 TSB QUAST Comparison
Supplemental Figure 5: Pollution 1 TSB QUAST Comparison
Supplemental Figure 6: Control 2 TSB QUAST Comparison
Supplemental Figure 7: Control 2 NB QUAST Comparison
Supplemental Figure 8: Control 2 LB QUAST Comparison
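The Phred scores in the Shotgun Sequencing Data Assembly Results table can be read as approximate per-base error probabilities through the standard relation Q = -10 log10(P). The sketch below is a rough reading aid that treats each mean score as a single Q value, which is a simplification. The function names are hypothetical and do not come from the study's pipeline.

```python
def phred_to_error_probability(q):
    """Convert a Phred quality score Q to an error probability P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def total_bases(total_sequences, mean_read_length):
    """Approximate sequenced bases as read count times mean read length."""
    return total_sequences * mean_read_length


# Inorganic Farm TSB row: 1,984,094 reads, mean length 147.68, Phred 33.36.
p_error = phred_to_error_probability(33.36)
print(f"error probability per base: {p_error:.2e}")
print(f"approximate bases sequenced: {total_bases(1_984_094, 147.68):,.0f}")
```

On this reading, a mean score near 33 corresponds to roughly one expected miscalled base in every 2,000, so read quality alone is unlikely to explain the high contamination values reported in the same table.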