Fri, 02/09/2024 - 19:54
Abstract
This thesis will use data analysis to explore changes in voting in Allegheny
County over the last six years. Allegheny County provides information on polling
locations, voting district boundaries, congressional district boundaries, and House of
Representatives district boundaries. In addition to these political data sets, demographic
information such as school district boundaries, police zones, and census data will also
be considered. Using data from 2014 to the present, analysis will be performed to see what
has changed about voting in Allegheny County over time. In addition to this data
analysis, this paper will explore trends in voting in metropolitan areas, such as Atlanta,
and how those compare to Pittsburgh. Atlanta has had statistically significant changes
in recent history and will serve as a useful case study to compare to Allegheny County.
Introduction
The goal going into this project was to perform data analysis using publicly
accessible data to see how voting in Pittsburgh compares to other areas, and how it has
changed over time. Initially, the intent was to focus on analyzing information related
to polling booth locations and voting districts, but the formatting of that data did not fit
the methodology that was used for data analysis. The thesis still includes an analysis of
demographic data describing Allegheny County and discusses what that information
implies about voter suppression.
Background
Discourse about the 2020 election was nearly impossible to avoid prior to the
election. Discussion of voter suppression, voter fraud, and accessibility to voting was
prominent during this time. As a nation there were debates over how to make the
process of voting accessible during the pandemic, and that conversation included how
to avoid voter suppression that was already widespread in America. These current
events led to the idea of conducting data analysis on these topics, using the data
available on Allegheny County to learn what it implies
about the state of voter suppression there. The location was chosen partly because of
our proximity to it and partly because Pittsburgh is in Allegheny County, so the
metropolitan area could provide more interesting results.
Literature Review
The purpose of this literature review is to get background information on what
voter suppression is, what caused it in the past, and what contributes to it today. Having
sufficient background knowledge on voter suppression will be necessary for further data
analysis. “Passive Voter Suppression: Campaign Mobilization and the Effective
Disfranchisement of the Poor” looks at Old vs. New Voter Suppression, Passive Voter
Suppression, and how to combat passive voter suppression. “The Politics of Voter
Suppression: Defending and Expanding Americans’ Right to Vote” is a broader
overview of how voter suppression affects voting in America and how voter suppression
affects elections. Both resources were helpful in gaining a base understanding of what
may indicate voter suppression in my research.
Historical Context
The United States has a long history of discrimination affecting voting. When
America was first founded, voting laws were decided by each state, meaning only
white men who owned land had the right to vote in most cases (The Library of
Congress). “African Americans, women, Native Americans, non-English speakers, and
citizens between the ages of 18 and 21 had to fight for the right to vote in this country”
(The Library of Congress). The Voting Rights Act was enacted in 1965, making sure
people of every race had the right to vote (Parish, 2014, p. 1). Even though
everyone now legally has the right to vote, there were and still are ways to suppress
votes.
Voter Suppression
Voter suppression is the attempt to “restrict the right to vote” (Wang, 2016, p.
x). There have been many attempts at voter suppression throughout American history;
examples include poll taxes, grandfather clauses, and literacy tests (Wang, 2016,
p. 20). Voter suppression is very harmful to democracy, both in the United States and
elsewhere. “Every effort at vote suppression harms democracy, and it harms
democratic citizenship” (Wang, 2016, p. 109).
Access to Voting
Voting continues to get more accessible with each election. In the 2008
presidential election, North Carolina introduced in-person early voting and allowed
people to register and vote at the polling place during the early voting period (Wang,
2016, p. 91). North Carolina had 236,700 new voters who took advantage of same-day
registration and had the largest increase of new voters in the country (Wang, 2016, p.
91).
Analysis Plan
For the data collection process, Data.gov, “The Home of the U.S. Government’s
open data,” was used to find data about Allegheny County (U.S. General Services
Administration). It provided Census data from 2000 and 2010, information on school
districts in Allegheny County, and employment data. There was also data about voting
districts and polling locations, but unfortunately the formatting did not work with the
program that was used to perform data analysis.
For the analysis plan, demographic data describing Allegheny County was used
to determine what may be implied about voting. Another goal of the analysis plan
was to determine how Pittsburgh compares to other areas in relation to voter
suppression. On its own, the information from Allegheny County does not say much
about voting, so comparing that information to other places is a
useful part of the analysis.
Data Design
After finding data to use and developing a plan for how to analyze it, a data
design was established to perform that analysis. The program used to analyze the data
required a connection to a server, so it was necessary to emulate one. Oracle VM
VirtualBox runs a virtual machine that acts as that server, so SAS can run without
connecting to a specific network or a specific server. The virtual machine runs in its
own window on the host computer, and SAS is then used through a web browser.
The data collected about Allegheny County was uploaded into SAS. Most of the
data sets came with a data dictionary, which lists each field and gives a brief
description of the data in that field. Since the data came from a few different sources,
it was important to read each data dictionary and understand the data before trying
to do anything with it. Once the data was uploaded into SAS, different data models were
used to visualize each component of the data to make it easier to understand.
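As an illustration only, the data dictionary check described above can be sketched in a few lines. The thesis performed this step in SAS; the Python below is a stand-in, and the field names, descriptions, and sample rows are hypothetical rather than taken from the Allegheny County data sets.

```python
import csv
import io


def check_against_dictionary(data_csv, dictionary):
    """Compare a data set's column names with the fields its data dictionary documents."""
    reader = csv.DictReader(io.StringIO(data_csv))
    columns = reader.fieldnames or []
    return {
        # Columns present in the data but not described in the dictionary
        "undocumented": [c for c in columns if c not in dictionary],
        # Fields the dictionary describes that the data does not contain
        "missing": [f for f in dictionary if f not in columns],
    }


# Hypothetical data dictionary: field name -> brief description
ridership_dictionary = {
    "route": "Bus route identifier",
    "month": "Month of service (YYYY-MM)",
    "avg_riders": "Average riders for that route and month",
}

# Hypothetical extract with one column the dictionary does not describe
sample_csv = "route,month,avg_riders,garage\nRoute A,2020-01,5200,Garage 1\n"

report = check_against_dictionary(sample_csv, ridership_dictionary)
print(report)
```

Running a check like this for each source before analysis flags any column whose meaning is undocumented, which matters when data sets from several sources are combined.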
Results
This graph shows the average number of riders on different public transportation routes
each month. Each symbol represents a different bus route. While this may not initially
seem relevant to voting, we can look at where each of these routes is located and see how
that compares to the voting districts. Is access to voting proportionate to the
number of people who live and work in these areas? Will people who rely on public
transportation be able to reach their assigned voting locations? We also see that, because
of the pandemic, the average number of riders drops significantly in the time leading up to
the election. Will these people still be able to visit their voting locations, or will they be
able to vote remotely?
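The monthly averages behind a graph like this one, and the size of the drop in ridership, could be computed along the following lines. This is a minimal sketch in Python rather than the SAS procedure used for the thesis, and the route names and rider counts are invented for illustration; they are not the county's figures.

```python
from collections import defaultdict
from statistics import mean


def monthly_average_by_route(records):
    """Average the rider counts for each (route, month) pair."""
    grouped = defaultdict(list)
    for route, month, riders in records:
        grouped[(route, month)].append(riders)
    return {key: mean(values) for key, values in grouped.items()}


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Hypothetical (route, month, riders) observations
records = [
    ("Route A", "2020-02", 5000), ("Route A", "2020-02", 5200),
    ("Route A", "2020-10", 2000), ("Route A", "2020-10", 2080),
    ("Route B", "2020-02", 9000), ("Route B", "2020-10", 4500),
]

averages = monthly_average_by_route(records)
for route in ("Route A", "Route B"):
    change = percent_change(averages[(route, "2020-02")], averages[(route, "2020-10")])
    print(f"{route}: {change:.1f}% change from February to October")
```

Expressing the drop as a percentage per route makes it possible to compare routes of very different sizes, which is the comparison that matters when asking which voting districts lost the most transit access.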
“In the 2016 presidential election, there was, according to the United States
Census, a 30% reported turnout gap between the wealthy and the poor” (Ross & Spencer, 2019, p.
656). In addition to looking at bus usage throughout Pittsburgh, we can also look at
property values to build a stronger profile of what the demographic information tells
us about voting activity. In this graph each dot represents a different property, and
the properties are sorted by their value and their location. Income is a strong indicator of
whether someone votes. If there is a clear disparity between property values in Allegheny
County, that may indicate that there is also a disparity in income and, in turn, a
disparity in voting.
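One simple way to put a number on such a disparity is to compare the median property value in each area. The sketch below does this in Python as an assumption about how the comparison could be made; it is not the method used in the thesis, and the area names and values are hypothetical.

```python
from collections import defaultdict
from statistics import median


def median_value_by_area(properties):
    """Median property value for each area."""
    by_area = defaultdict(list)
    for area, value in properties:
        by_area[area].append(value)
    return {area: median(values) for area, values in by_area.items()}


def disparity_ratio(medians):
    """Ratio of the highest area median to the lowest area median."""
    return max(medians.values()) / min(medians.values())


# Hypothetical (area, property value in dollars) pairs
properties = [
    ("Area 1", 80_000), ("Area 1", 95_000), ("Area 1", 110_000),
    ("Area 2", 240_000), ("Area 2", 285_000), ("Area 2", 310_000),
]

medians = median_value_by_area(properties)
ratio = disparity_ratio(medians)
print(medians, ratio)
```

The median is used instead of the mean so that a handful of very expensive properties does not dominate an area's figure. A ratio well above 1 would point to the kind of disparity discussed above, although property value is only a proxy for income.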
This chart shows the annual salary in 2017 and 2020 of workers in Pittsburgh,
sorted by their starting date. The different symbols represent different departments. Aside
from some outliers, no significant difference between the two charts
stands out enough to draw a conclusion from. The visualization of annual salaries
in Allegheny County supports some assumptions about voting, but little more than
that.
Discussion
Based on the information found through data analysis, there were no
statistically significant results implying that voter suppression was happening in
Allegheny County. While none of the data collected implies that there is voter
suppression, that does not mean that it does not happen. In addition, Allegheny
County does not represent the United States as a whole, and there may still be voter
suppression elsewhere.
Comparison to Atlanta
Recently state lawmakers in Georgia made provisions to restrict access to early
or absentee voting and require an approved form of identification. Senate Bill 202 has
been compared to the voting laws present during the Jim Crow Era (Cox, 2021). While
some of the data that was analyzed may indicate signs of voter suppression, that seems
insignificant when there are clear signs of it occurring right now. While this thesis was
looking for de facto signs of voter suppression in Allegheny County, a movement for de jure
voter suppression is underway in Georgia, the state in which Atlanta is located.
The initial plan was to do a case study comparing results about Allegheny
County to information on Atlanta, but as of right now that is not a fair comparison to
make.
Conclusion
Going into this, I expected to find gradual changes over the years that might indicate a mild amount
of voter suppression. While I did not find that in Allegheny County, there is still clear evidence
of voter suppression, just not in Pittsburgh. While I was looking for subtle trends in
voting that changed gradually, major changes to suppress voting were being made in
plain sight. Going into my thesis, this is not the conclusion that I imagined I would be giving,
but it feels fitting. While data analysis and data mining can be used to find deeper insights and
structure in the data that we already have, they are not tools to predict the future. My research did
not go as planned, but that is part of the learning experience.
References
Cox, C. (2021, April 10). Georgia voting law explained: Here’s what you need to know
about the state’s new election rules. USA Today.
https://www.usatoday.com/story/news/politics/2021/04/10/georgia-new-votinglawexplained/7133587002/.
The Library of Congress. (n.d.). The Founders and the Vote.
https://www.loc.gov/classroommaterials/elections/right-to-vote/the-founders-and-the-vote/
Parish, H. (2014). Voting Rights Act: Historical Context and Associated Issues and
Trends. Nova Science Publishers, Inc.
Roiger, R. J. (2017). Data Mining: A Tutorial-Based Primer, Second Edition. Chapman
and Hall/CRC.
Ross II, B. L., & Spencer, D. M. (2019). Passive Voter Suppression: Campaign
Mobilization and the Effective Disfranchisement of the Poor. Northwestern
University Law Review, 114(3), 633–703.
U.S. General Services Administration. (n.d.). The home of the U.S. Government’s open data.
https://www.data.gov/
Wang, T. (2016). The Politics of Voter Suppression: Defending and Expanding
Americans’ Right to Vote. Cornell University Press.