Part 1 - How Census Data is Gathered
A Three-Part Series on the Scope, Tech, and Impact of the U.S. Census Bureau
By: Isaac D. Tucker-Rasbury, Data Fellow '22 | Bluebonnet Data
“The scope of the census is overwhelming at first – almost like if you tried to count every star in the sky.”
Every 10 years, the United States (U.S.) Census Bureau hires approximately 500,000 temporary workers on top of its regular staff of four thousand employees and volunteers to take on the behemoth task of conducting the Decennial Census: recording, organizing, analyzing, and disseminating data to the public about the approximately 332 million people living within the 50 states, the District of Columbia, Puerto Rico, and other territories. If that weren’t enough, the Census Bureau also conducts other censuses, surveys, population estimates, and projections, and creates a number of publicly available data products and applications. While that’s a mouthful, the short and sweet of it is that “the census” is an incredibly vast resource that is vetted for quality by the U.S. government, regularly refreshed, and available for data practitioners everywhere to access.
Personally, I knew none of this coming out of college, and I still didn’t know much, if any, of it until about three years later, when I began working as a data analyst at a consulting company, where I conducted global strategic growth research on its Financial Planning and Analysis (FP&A) team. Even then, my knowledge only scratched the surface, as we downloaded only what we needed and moved on to collect other data.
Enter stage left, Bluebonnet Data! As a Data Fellow with Bluebonnet, I was fortunate enough to receive training on the data sources that progressive campaigns and organizations make use of, and the U.S. census was prominently discussed among them. The more I learned about the census itself and all of the incredible resources the Census Bureau shares, the more excited the data nerd in me became. And that’s what I’m interested in exploring in more depth with you today!
We’ll explore the richness of the U.S. Census in this three-part series, which falls into the following buckets:
Examining how census data is gathered and how its quality holds up,
Reviewing a list of free tools anyone can leverage to access and use the underlying data in the census,
And, discussing the wide-ranging impacts of the results of the census.
Reviewing the different censuses, surveys, estimates, and projections published by the U.S. Census Bureau is a greater task than we have space for now, but we will start by trying to boil it down.
Imagine a map of the United States. Then break that up into the 50 states. Things get harder when we dig a little deeper and consider the approximately 90,000 state and local governments regulating virtually every industry and geographic area. Take that a step further and layer in what the U.S. population looks like and how people live: income levels, poverty, education, employment, health and health insurance coverage, housing status and quality, crime victimization, computer usage, social security benefits, food security, household and consumer spending, etc.
The scope of the census is overwhelming at first – almost like if you tried to count every star in the sky. Yet, the U.S. Census Bureau not only imagines it all but regularly produces and refreshes this data for public use. All this makes the census an incredible resource for researchers and data practitioners interested in the United States, domestically and abroad.
Prior to Bluebonnet, I only had the opportunity to work with city-level data from the U.S. and other countries’ censuses, so the full scope of censuses escaped me. Since then, it has been incredibly enriching to learn about the U.S. Census. It’s an intricate and impactful endeavor at scale. Given that I was, and still am, fairly new to the census, I had a hard time trusting that this all went smoothly. I wanted to know the answer to “how does the U.S. Census Bureau ensure the quality of the data it compiles and the analyses it conducts?” I couldn’t leave that question alone, so here’s what I learned!
Data Quality Considerations
Over time, the Census Bureau has improved the data collection process by iterating on survey and questionnaire design as well as different modes of data collection, analysis, and information dissemination. It has simultaneously strived to curate a dataset that addresses the fundamental criteria of data quality, i.e., accuracy, completeness, reliability, relevance, and timeliness. The table below defines these factors, and we close with a quick walkthrough of how the census meets the respective criteria.
Factor: How we'll define it
Accuracy: Is the information correct in every detail?
Completeness: How comprehensive is the information?
Reliability: Does the information contradict other trusted resources?
Relevance: Do you really need this information?
Timeliness: How up-to-date is the information? Can it be used for real-time reporting?
Accuracy. As of March 2022, the Post-Enumeration Survey (PES), which measures the accuracy of the census by independently surveying a sample of the population, found that the 2020 Census had neither a statistically significant undercount nor overcount for the nation. In the Bureau’s words, the census had an “estimated net coverage error of -0.24% (or 782,000 people) with a standard error of 0.25% for the nation, which was not statistically different from zero.” To put that into perspective, that estimated net error is roughly the size of West Coast cities like San Francisco or Seattle, and slightly larger than the Boise, Idaho, metropolitan area, which is home to approximately 765,000 people. And, as you’ll see below, this is part of a broader trend toward an increasingly accurate census. However, this trend’s hopeful direction glosses over the inaccurate counting of ethnic minorities and undocumented communities in the latest census.
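For readers who want to see why an error of -0.24% with a standard error of 0.25% counts as “not statistically different from zero,” here is a minimal sketch of the arithmetic. It assumes a simple two-sided z-test and a 90 percent confidence level (critical value of about 1.645); both are assumptions made for illustration, not a description of the Bureau’s actual PES estimation procedure. The variable and function names are made up for this example.

```python
from math import erf, sqrt

# Figures quoted from the Post-Enumeration Survey release.
net_coverage_error_pct = -0.24
standard_error_pct = 0.25

# Assumption: a 90 percent confidence level, two-sided.
CRITICAL_Z_90 = 1.645


def two_sided_p_value(z_value: float) -> float:
    """Return the two-sided p-value for a standard normal z-score."""
    normal_cdf = 0.5 * (1.0 + erf(abs(z_value) / sqrt(2.0)))
    return 2.0 * (1.0 - normal_cdf)


# How many standard errors the estimate sits away from zero.
z = net_coverage_error_pct / standard_error_pct
p_value = two_sided_p_value(z)

# The estimate is "statistically different from zero" only if it is
# further from zero than the critical value.
is_significant = abs(z) > CRITICAL_Z_90

print(f"z-score: {z:.2f}")
print(f"two-sided p-value: {p_value:.3f}")
print(f"statistically different from zero: {is_significant}")
```

The estimate sits less than one standard error away from zero, so under these assumptions it would not clear the bar at a stricter 95 percent level (critical value of about 1.96) either.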
Completeness. We have already covered the vast array of data points and statistics that the Census Bureau compiles. In short, it is fair to say that the census has captured a thorough picture of the people of the United States and how they live their lives. However, "completeness" has become a shakier factor over the last decade, as politics has aggressively bled into how the count is conducted and who gets counted.
Reliability. The Census Bureau details at length the possible survey-method errors affecting its statistics, as well as its broader methodology. By being extremely transparent about its methods, the Bureau engenders trust in its reliability. This, in turn, allows those who come after and use the data to make their own decisions about what the upstream methods may or may not mean for their own analyses.
Relevance. Do you imagine that the only people using this data are a few number crunchers with thick calculators in a dark room at the back of a Kafkaesque government building? If so, you are very mistaken. The census is outlined in our Constitution, so conducting it fulfills part of our civic duty to the nation, but we as a nation also use it to decide where stores, schools, hospitals, and businesses will go. Associations, corporations, government agencies, lawmakers, and individuals alike rely on the census to give them a clear picture of what is going on in areas of interest. At a high level…
“Census results have several high profile applications: they are used to reapportion seats in the House of Representatives, to realign congressional districts, and as a factor in the formulas that distribute hundreds of billions of dollars in federal funds each year. Because of the importance of this population count, procedural changes in the decennial census often reflect larger organizational shifts at the Census Bureau.” - U.S. Census Bureau Website
Timeliness. The census does not meet the criteria for real-time reporting. The decennial count doesn’t even meet the criteria for daily, weekly, monthly, or annual reporting. The Constitution lays out a requirement to do the Population and Housing Census once a decade, i.e., counting everyone and every household, which is what we typically refer to as “the census.” Then, there’s the business side with the Economic Census. “The Economic Census is the U.S. Government's official five-year measure of American business and the economy for planning and key economic reports, and economic development and business decisions.” And then, there is the Census of Governments, also conducted every five years, which “identifies the scope and nature of the nation's state and local government sector; provides authoritative benchmark figures of public finance and public employment; classifies local government organizations, powers, and activities; and measures federal, state, and local fiscal relationships.” Then there is the most timely of the bunch, the American Community Survey. This is an ongoing survey sent out each month to a sample of addresses (roughly three and a half million per year), with results published annually, about things like education, employment, internet access, and transportation.
With all this in mind and using a (not so) strict eyeball approximation, the census meets about ⅘ of the criteria we laid out for a high-quality dataset, barely a B-. However, more important than a loose score, the census chronicles the lived experiences and circumstances of millions of people, businesses, and government agencies residing stateside - a laudable, behemoth-sized endeavor.
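If you’d like to check the eyeball math, here is a rough sketch of that scorecard. It rests on two assumptions that are mine for illustration: that timeliness is the one criterion marked as unmet (one reading of the walkthrough above), and that letter grades follow a common U.S. scale where 80 to 82 percent is a B-. The names are hypothetical.

```python
# Hypothetical pass/fail reading of the five data quality criteria.
# Assumption: timeliness is the only criterion treated as unmet.
scorecard = {
    "accuracy": True,
    "completeness": True,
    "reliability": True,
    "relevance": True,
    "timeliness": False,
}

# Assumption: one common U.S. grading scale (lower bound, letter).
GRADE_SCALE = [
    (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (60, "D"),
]


def letter_grade(percent_score: float) -> str:
    """Map a percentage to a letter using the assumed scale above."""
    for lower_bound, letter in GRADE_SCALE:
        if percent_score >= lower_bound:
            return letter
    return "F"


criteria_met = sum(scorecard.values())
percent = 100 * criteria_met / len(scorecard)
grade = letter_grade(percent)

print(f"{criteria_met} of {len(scorecard)} criteria met: {percent:.0f}% ({grade})")
```

On this assumed scale, 80 percent lands exactly on the lower edge of the B- band, which is where the “barely” comes from.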
In Part 2, we’ll dive deeper into how you can access the data underlying the census and the tools you can use to do it!
Follow the conversation on LinkedIn here.
About The Author
Isaac D. Tucker-Rasbury (he/him) was a Bluebonnet Data Fellow for 2Million Texans’ Blue Action Network in 2022. He is currently working remotely from Los Angeles as a data analyst at Slalom and as a financial technology tutor for edX, which manages coding boot camps for continuing learners. When Isaac isn’t neck-deep in spreadsheets, he is dancing on rollerskates or practicing photography. If you would like to get in touch, he can be contacted via LinkedIn.