Money is a finite resource, and so is our time. Progressive-leaning groups can invest in only so many places during a given election cycle, so we must spend strategically to make the most impact across the country, in our states, and in our local communities. Data-driven decision making is crucial. In this case, we needed to target districts within the state of North Carolina for strategic investment.
Case Study: North Carolina State Legislature
We were asked to look at North Carolina’s State Legislature and identify rural districts where some targeted investment would make the most impact. The approach seemed simple: find districts that have been competitive over the last few elections, throw in a few uncontested districts that may have demographics that could be won over with a good persuasion campaign, and we are on our way to winning some state legislature seats.
Unfortunately, as is common with electoral data, the result was much messier than expected.
There are many places to get electoral data, including the Secretary of State, which is where we started. The North Carolina Secretary of State's website had every race broken down by precinct (with some statistical noise thrown in to remove identifying characteristics). However, the data was particularly messy, and some precincts were listed under inconsistent identifiers.
This led us to use data from the MIT Election Data and Science Lab (MEDSL). MEDSL provides a cleaned and standardized version of precinct-level election data from around the country, using the same information across years to identify precincts and districts.
Districts Change Fast...
North Carolina has had several more district shakeups than the usual 10-year Census update, due to repeated gerrymandering by the Republican state legislature and subsequent judicial interventions. New maps were created in 2017, 2018, 2019, 2021, and 2022, and more are coming soon.
...But Data is Forever
Luckily, precincts, the small areas with under 1,000 voters that sort voters into polling locations on election day, have changed far less radically. A precinct may have moved to a new district, but the shapes and sizes of precincts have largely stayed the same over the last few elections. Therefore, we should be able to "join" past election results into new districts using these common precincts. This would then allow us to analyze past elections, establish some trends, and hand off a set of target districts to the client.
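The "join" described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the project: the precinct IDs, district names, and vote counts are made up, and `results_in_current_districts` is a hypothetical helper name. It assumes we already have a lookup from each precinct to its current district.

```python
from collections import defaultdict

# Hypothetical precinct-level results: (year, precinct_id) -> votes by party.
precinct_results = {
    (2020, "P1"): {"DEM": 400, "REP": 350},
    (2020, "P2"): {"DEM": 200, "REP": 300},
    (2020, "P3"): {"DEM": 150, "REP": 450},
}

# Hypothetical lookup from precinct to its district under the current map.
precinct_to_current_district = {"P1": "HD-1", "P2": "HD-1", "P3": "HD-2"}


def results_in_current_districts(results, mapping):
    """Sum past precinct votes into the districts those precincts sit in today."""
    totals = defaultdict(lambda: defaultdict(int))
    for (year, precinct), votes in results.items():
        district = mapping.get(precinct)
        if district is None:
            continue  # precinct missing from the lookup; skipped in this sketch
        for party, count in votes.items():
            totals[(year, district)][party] += count
    return {key: dict(value) for key, value in totals.items()}


district_results = results_in_current_districts(
    precinct_results, precinct_to_current_district
)
```

Here the 2020 votes from precincts P1 and P2 are added together under HD-1, as if the current district had existed in 2020. The whole approach hinges on the precinct-to-district lookup.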
If only it were that easy...
Unfortunately, the pesky data wrangling didn't end there, because a "join" of the data requires us to know which precinct belongs (or belonged) to which district, both now and in the past. That data was difficult to find. The North Carolina Secretary of State has a wonderful interactive website that relies on this exact mapping: you can enter a precinct, address, county, ZIP code, or municipality, and it will show you on a map what district it is in. However, the website does not allow for exporting these district-precinct mappings. So we had to find a way to determine which district a given precinct is in and associate the election results from that precinct with the current district map.
Mapping with Shapefiles
Shapefiles are geospatial vector files used to process geographic data. Basically, they are fancily encoded maps. The federal government provides many types of shapefiles, and many states do as well (including North Carolina). The North Carolina SoS provides many shapefiles, but the two types we were interested in were the precincts and the districts. Collecting both precinct and state legislative district shapefiles allowed us to overlay them and find the connections between them, i.e. which precinct fell under which district for each redistricting cycle.
We automated this overlay to determine when a precinct was inside a district, then passed the output along to the rest of the election analysis. Programmatically, a precinct was considered “in” a district if more than 50% of the precinct’s area was contained in the district. In the real world, maps are drawn so that they do not cut precincts in half, but the 50% rule accounted for any precincts whose boundaries changed during the years under study.
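To make the 50% rule concrete, here is a toy sketch of it. It is not the project's code: real precinct and district shapes are irregular polygons read from shapefiles and intersected with a geospatial library, whereas this example assumes simple axis-aligned rectangles written as `(xmin, ymin, xmax, ymax)` so that the overlap arithmetic is easy to follow. The function names and coordinates are hypothetical.

```python
def rect_area(rect):
    """Area of an axis-aligned rectangle (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = rect
    return max(0.0, xmax - xmin) * max(0.0, ymax - ymin)


def overlap_area(a, b):
    """Area shared by two rectangles; 0.0 if they do not intersect."""
    shared = (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))
    return rect_area(shared)


def assign_district(precinct, districts, threshold=0.5):
    """Return the district holding more than `threshold` of the precinct's area."""
    area = rect_area(precinct)
    if area == 0:
        return None
    for name, shape in districts.items():
        if overlap_area(precinct, shape) / area > threshold:
            return name
    return None  # no district holds a majority of this precinct


# Toy stand-ins for real geometries: two side-by-side districts.
districts = {"HD-1": (0, 0, 10, 10), "HD-2": (10, 0, 20, 10)}
precincts = {
    "P1": (1, 1, 3, 3),    # entirely inside HD-1
    "P2": (8, 4, 11, 6),   # two thirds inside HD-1
    "P3": (9, 0, 11, 2),   # split exactly in half
}

assignments = {pid: assign_district(shape, districts) for pid, shape in precincts.items()}
```

P1 and P2 are assigned to HD-1, while P3 is left unassigned because neither district holds more than half of it. With non-overlapping districts, at most one district can clear the 50% bar, so returning the first match is safe.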
The Results: North Carolina State Legislative District Analysis Overview
Below are the core findings for the State House and State Senate of North Carolina. Each section also includes maps of the last three elections, drawn using the current districts. This helps illustrate how results within the current districts have changed over time.
In the State House, there are 30 districts where the Republican candidate was uncontested:
There are 13 districts where Republicans won by less than 5% in 2022:
Dem Win: 2.51%
Dem Win: 5.78%
Dem Win: 1.19%
Dem Win: 2.77%
Dem Win: 1.50%
Dem Win: 44.16%
Dem Win: 0.67%
Dem Win: 5.32%
Dem Win: 9.71%
Dem Win: 19.24%
Dem Win: 2.78%
Dem Win: 1.86%
Dem Win: 2.38%
Dem Win: 1.29%
In the State Senate, there are 14 districts where the Republican candidate was uncontested:
There are four districts where Republicans won by less than 5% in 2022:
Dem Win: 1.63%
Dem Win: 3.74%
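The two filters used in the lists above (uncontested Republican seats and Republican wins by less than 5%) could be expressed roughly as follows. This is a sketch under stated assumptions, not the analysis code: it assumes the margin is measured as a percentage of the two-party vote, and it treats a race with zero Democratic votes as uncontested, which is only a proxy for checking candidate filings. The function name and labels are hypothetical.

```python
def classify_race(dem_votes, rep_votes, close_margin=5.0):
    """Label a district race for targeting purposes.

    Assumptions (not from the original analysis): the margin is a share of the
    two-party vote, and zero Democratic votes means no Democrat ran.
    """
    total = dem_votes + rep_votes
    if total == 0:
        return "no_data"
    if dem_votes == 0:
        return "uncontested_rep"
    margin = (rep_votes - dem_votes) / total * 100  # Republican margin in points
    if 0 < margin < close_margin:
        return "narrow_rep_win"
    return "other"


# Hypothetical vote totals for three districts.
labels = {
    "HD-A": classify_race(0, 5000),
    "HD-B": classify_race(4900, 5100),
    "HD-C": classify_race(4000, 6000),
}
```

In this example HD-A is flagged as uncontested, HD-B as a narrow Republican win (a 2-point margin), and HD-C falls outside both target groups.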
One of the most important takeaways from the data above is that there are many districts where a Democrat should run simply to give voters an alternative to the Republican. There are only 50 state Senate seats in North Carolina, and with 14 of those races uncontested, Democrats did not even attempt to contest 28% of the Senate.
Second, there are several districts in both the House and the Senate that are quite competitive. Five of the eight rural House seats were won by a Democrat in 2018 and 2020 and were narrowly lost in 2022. These would be great places for organizations to target.
Most of the problem-solving for this project went into figuring out how to translate old election results into the current districts. Much of that code has now been written, which makes it easier to apply this approach to other states and data sets.
A good extension of this project would be to apply turnout and voter file information to the above districts to further narrow the targeting range. For example, if those closely contested districts were lost on turnout, a strong get-out-the-vote (GOTV) campaign might be able to flip them. However, that cannot be concluded from election results alone; it would take more data about the voter makeup of the specific districts.