Speaking of the Economy
July 27, 2022

The Pursuit of Data

Audiences: Business Leaders, Community Advocates, General Public, Policymakers

Sierra Latham and Stephanie Norris discuss the challenges of collecting socioeconomic data on small towns and rural areas, how data analysts address them, and the value of that work in analyzing and improving the Richmond Fed's understanding of these communities. Latham and Norris are senior research analysts on the Regional and Community Analysis team at the Richmond Fed.


Transcript


Tim Sablik: Hello, and welcome to Speaking of the Economy. I'm your host, Tim Sablik, an economics writer at the Richmond Fed.

My guests today are Sierra Latham and Stephanie Norris. They are both senior research analysts in the Regional and Community Analysis group at the Richmond Fed. We're going to be talking about the challenges smaller communities face when it comes to gathering and analyzing data about their economy.

Longtime listeners will know that one of the Richmond Fed's research initiatives is to better understand the challenges and opportunities facing rural and small towns in our region. If you haven't already, I'd encourage you to listen to the episode we recorded with Richmond Fed President Tom Barkin at the beginning of this year, where he discusses this research focus. We'll put a link up to that episode in the show notes for anyone who's interested.

Sierra and Stephanie, you're both deeply involved in this process of trying to gather better data about rural and small towns. We're going to get into some of the challenges of that work. But first I wanted to ask, Sierra, how do you define data in this context? I think it's a word that gets used a lot these days. But I believe we all have a different definition in mind when we hear it.

Sierra Latham: Sure. When we talk about data, we are referring to both quantitative and qualitative data. Quantitative data, I'd say, is more familiar to folks. Quantitative data may come in the form of a single metric for a specific county — for example, the median household income for Richmond, Va., is about $47,250 — or in the form of a dataset containing fields or metrics and records. A dataset we might be interested in is spending per pupil at schools within a county. Spending per pupil is the field and the records contain data for each individual school.

In addition to quantitative data, we're also interested in qualitative data. Qualitative data represents knowledge generated through conversations, focus groups, open-ended survey questions, and other sources where the information collected is narrative in nature. At the Richmond Fed, we collect these data through a variety of sources like our Community Conversations, where the objective is to learn about emerging economic issues from members of the community. These data are valuable, either in isolation or alongside quantitative data, because they provide context and nuance regarding the economic conditions at the local level. They help tell stories that might otherwise be lost if we were looking at quantitative data alone.

Sablik: Yeah, that's a great point. You can sometimes miss a lot of things if you're only looking at the big quantitative datasets. Stephanie, maybe you can talk a bit about where you actually find the data that you use in your work.

Stephanie Norris: There are three primary tools for gathering the quantitative data that we use when we're looking at communities. Those are administrative records, the census, and representative surveys.

Administrative records are typically collected by government agencies, companies and other organizations, largely for recordkeeping and reporting purposes. This would include data on, for example, unemployment insurance applications. But it would also include data on school enrollment, and it would include vital statistics records like birth records. These data are generally reliable but not always representative and have pretty limited applicability in our work.

When we talk about data collection through [the U.S. Census Bureau], most people think about the decennial census. Most people know every 10 years, census takers try to talk to every household in the country to provide a full count of the U.S. population, along with other information on household characteristics. As you can imagine, this is a pretty expensive and time-consuming process, which is why it only happens every 10 years. But it's great because, in theory, every household participates, so it's a full capture of the U.S. population.

Representative surveys, on the other hand, rely on samples. Instead of collecting data on everyone, these surveys rely on a portion of the population to represent the full population. Because you're relying on a small group of respondents to represent everyone, selecting the right sample can be challenging. But this process is easier and less time consuming to do than asking everyone, like in the census.

We use a lot of survey data. The American Community Survey, which is produced by the U.S. Census Bureau, and the Current Population Survey, conducted by the Census Bureau and the U.S. Bureau of Labor Statistics, are two that we use pretty frequently.

Sablik: Thanks for sharing that. How do you use that data in your day-to-day work once you've collected it?

Latham: For starters, researchers like Stephanie and me rely on these data for the research we do on rural communities. All of the writing we do on rural communities at the Fed is grounded in evidence, which we get from analyzing data, both quantitative and qualitative.

Businesses use data on local demographic characteristics and market conditions. For instance, when deciding where to locate a new manufacturing operation, a company will want to explore how many potential employees are located in a community. As another example, a retail establishment will want to learn about potential customers' demographic characteristics as well as how many competitor businesses they might have in the community.

Another group that might want to use data are community-based organizations, which rely on quantitative data alongside qualitative information to determine where the populations they intend to serve are located. Like local governments, they also use these data to apply for grant funding and conduct program evaluations.

Sablik: Yeah, that's a great point. Obviously, our researchers here at the Bank are very interested in this data for better understanding the conditions in our [Federal Reserve] District. But it's also very important to the businesses and people on the ground.

How about the local policy leaders in these communities? How do they use this data?

Norris: Well, like Sierra said, local government agencies and local community leaders do rely on these datasets for program evaluations. For example, if they institute a program aimed at improving health care access for their residents, they'd want to collect data before and after implementation to see if it's working. That would require administrative data, but they also would want to gather demographic and socioeconomic data that would come from larger surveys.

Local government agencies also use these data to apply for grants and state funding. The data are important for local economic development authorities who might want to use the data on, say, educational attainment, housing availability and other market data to attract businesses and investment to the area. Importantly, these data can also drive policy decisions and resource allocation decisions at the state and federal levels.

Sablik: Right. I think now's a good time to get into one of the big questions that I wanted to talk to you both about, which is the challenges of collecting and producing this data, particularly when you're trying to collect information on smaller communities. Sierra, maybe you could go first.

Latham: Sure. With all surveys, bias is a concern. There are two types of bias that I wanted to talk about today: sampling bias and nonresponse bias. Sampling bias occurs when a subset of the population you're hoping to learn about is excluded from the sample in the first place. Nonresponse bias occurs when the people who do respond to a survey are meaningfully different from those who didn't respond. As a result, responders appear to represent a larger share of the population than they actually do.

Sampling bias may occur in a rural area for a number of reasons. It may be more difficult to physically reach the people due to geographic isolation and low population density. One of the most talked about factors making it difficult to reach rural populations is a lack of broadband access. Survey data collection is moving to an online format because it is less costly than other methods and collected data are easier to clean and process. Because rural households are less likely to have broadband access, they're more likely to miss out on the opportunity to participate in surveys.

Even if survey researchers are able to reach the people they're hoping to collect data from, they could run into nonresponse issues. If rural people are less willing to disclose their information on a survey than those living in urban areas, the aggregated data will look more urban than the true values.
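As a rough illustration of the nonresponse effect described above, the sketch below computes what share of survey respondents would be rural when rural residents respond at a lower rate than urban residents. The population shares and response rates are hypothetical numbers picked for the example, not figures from the conversation, and the function name is made up for this sketch.

```python
def respondent_share(pop_share, response_rate):
    """Share of each group among respondents, given population shares and response rates."""
    responding = {group: pop_share[group] * response_rate[group] for group in pop_share}
    total = sum(responding.values())
    return {group: responding[group] / total for group in responding}

# Hypothetical values, for illustration only.
pop_share = {"rural": 0.20, "urban": 0.80}      # share of the true population
response_rate = {"rural": 0.30, "urban": 0.60}  # share of each group that responds

shares = respondent_share(pop_share, response_rate)
for group, share in shares.items():
    print(f"{group}: {pop_share[group]:.0%} of population, {share:.1%} of respondents")
```

Under these assumed numbers, rural residents make up 20 percent of the population but only about 11 percent of respondents, so unadjusted results would lean urban.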

Norris: One of the reasons that local governments in rural areas rely on these larger scale survey data collection sources is the time and resources that it takes to collect good data. Local rural governments often have limited capacity to collect data on their own. They also may lack the financial resources to outsource formal data collection, and they often don't have the professional expertise in house to do it.

Sablik: Assuming that local governments are able to get the data that they need, what happens next? You mentioned that they might lack the expertise or the resources to do the processing and putting the data into practice. What are the issues there?

Norris: In some cases, even if government agencies do successfully collect data on rural communities, they might not make that data available to the public due to small sample size. Data at the county or zip code level might be suppressed if there's a concern that data users could identify respondents based on the survey data. This is especially true when you're working with sensitive data like data on health conditions or even personal income data.
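A minimal sketch of how this kind of suppression might work, assuming a simple rule that withholds any published value based on fewer than a set number of respondents. The threshold, field names, and data below are all hypothetical; agencies set their own disclosure rules, which are not described in this conversation.

```python
# Hypothetical disclosure threshold; real agencies set their own rules.
MIN_RESPONDENTS = 10

def suppress_small_cells(rows, min_n=MIN_RESPONDENTS):
    """Return copies of rows with 'value' set to None when too few people responded."""
    published = []
    for row in rows:
        keep = row["respondents"] >= min_n
        published.append({**row, "value": row["value"] if keep else None})
    return published

# Made-up survey results for two made-up areas.
raw = [
    {"area": "Large county", "respondents": 850, "value": 52_000},
    {"area": "Small county", "respondents": 6, "value": 41_000},
]

for row in suppress_small_cells(raw):
    print(row["area"], row["value"])
```

In this sketch the small county's value is withheld from the published table even though it was collected, which is the gap rural data users run into.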

Because of this small sample size problem, rural communities are often grouped with urban areas for data reporting. For example, a metropolitan statistical area is a geography that typically includes at least one major city and then surrounding counties that include what we would consider small towns or rural places.

So, if you're looking at data on home prices for the Charlotte metropolitan statistical area, that includes the city of Charlotte but it also includes the small town of Wadesboro and Anson County, which is pretty different from downtown Charlotte. If you live in Wadesboro, you're likely facing a very different housing market than you do if you live in downtown Charlotte. But that's not clear from data reported at the metropolitan statistical area level.

Latham: Another issue associated with a small sample size has to do with the precision of estimates generated based on the data collected. Say, for example, we wanted to know about the median household income for all counties in the United States. It's not feasible to ask 100 percent of the households what their income is, so the Census Bureau asks the question of about 1 percent of U.S. households every year as part of the American Community Survey or ACS. Using sample data to represent all households in each county, survey administrators calculate the median household income for each county.

Because the survey administrators know the 1 percent sample that they pick might not look like the full population, they also calculate a margin of error, which indicates the level of confidence in their estimates of the median household income. To minimize the margin of error, the Census Bureau publishes five-year estimates, which pull data from five consecutive sample years and allow for estimates to be based on 5 percent of the population.

Let's compare the city of Richmond to rural Northampton County, which is on Virginia's Eastern Shore, to show how this plays out. Based on 2019 ACS five-year estimates, Richmond has a median household income of about $47,250. Northampton County has a median household income of about $47,230. Those are both pretty close to one another. But the margin of error for Richmond is $1,316 while the margin of error for Northampton County is greater at $4,230. This is because a 5 percent sample from Richmond, which has a population of about 225,000, will contain more households than a 5 percent sample from Northampton County, which has a population of about 11,700 people. A larger sample size means our estimated statistics such as median household income will be more precise. From a practical standpoint, this means that statistics for small towns and rural areas reflect the truth less precisely than the same statistics on larger areas.
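The Richmond and Northampton County comparison can be checked against a common rule of thumb: sampling error shrinks roughly in proportion to one over the square root of the sample size. The sketch below applies that rule to the figures quoted above. It is only an approximation under stated assumptions: it treats the quoted populations as if they were counts of sampled units, applies a flat 5 percent sampling rate to both places, and ignores differences in income spread, none of which matches how the ACS actually computes its margins of error.

```python
import math

SAMPLE_FRACTION = 0.05  # the pooled five-year sample share mentioned in the discussion

# Populations and margins of error as quoted in the conversation.
places = {
    "Richmond": {"population": 225_000, "moe": 1_316},
    "Northampton County": {"population": 11_700, "moe": 4_230},
}

def approx_sample_size(population, fraction=SAMPLE_FRACTION):
    """Rough sample size, assuming a flat sampling rate (a simplifying assumption)."""
    return population * fraction

n_large = approx_sample_size(places["Richmond"]["population"])
n_small = approx_sample_size(places["Northampton County"]["population"])

# Rule of thumb: margin of error scales with 1 / sqrt(sample size).
predicted_ratio = math.sqrt(n_large / n_small)
observed_ratio = places["Northampton County"]["moe"] / places["Richmond"]["moe"]

print(f"Approximate sample sizes: {n_large:.0f} vs. {n_small:.0f}")
print(f"Predicted margin-of-error ratio: {predicted_ratio:.2f}")
print(f"Observed margin-of-error ratio:  {observed_ratio:.2f}")
```

The rule of thumb predicts a margin of error roughly 4.4 times larger for Northampton County, while the quoted figures differ by a factor of about 3.2. The direction matches, and the gap between the two numbers is a reminder that the sketch leaves out much of the real methodology.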

Sablik: Yeah, that definitely illustrates what a challenge it can be to get the data from the smaller communities. I'm sure that also plays into your ability to get timely data on these metrics that you're interested in as well.

Norris: Yeah, definitely. The time lag between data collection and when the data are released is a concern for all data users. Data that reflects current, on-the-ground experience is pretty hard to come by because it takes time to collect, clean, and then distribute robust data to the public. Some of the economic data that we use today were collected months or even years ago and may not reflect current economic conditions that folks are experiencing now. This is particularly true in rural areas, which tend to be more dependent on a few big businesses and institutions compared to large urban centers.

One example of how this can be especially problematic in rural places is looking at hospital closures. In addition to being the only source of critical health care in many places, a hospital tends to employ a lot of people and has a large effect on the overall economic activity of a rural region. So, say a major employer like a community hospital closed in 2021 in a rural place. If we're using data that were collected in 2019 to describe how the community looks right now, we aren't seeing the ripple effect of that hospital closing. Things might look better on paper than they actually do on the ground. A hospital closing in a big city has consequences too, of course. But since large metropolitan areas tend to have several hospitals and lots of other employers, the local economy is in a better position to absorb the shock of the hospital closure.

Latham: Another topic I want to talk about is how small towns and rural governments might not have the capacity necessary to analyze and interpret data available to them.

For starters, administrative data may be difficult to analyze if the community is storing data on outdated systems. Combining data from sources that weren't designed to talk to each other is challenging, which makes it difficult to use those data to answer policy questions. Local governments might also not employ a data analyst in house who would be able to reconcile data from a variety of sources. In most cases, this is either because a data analyst would be needed too infrequently, or because their salary would be too expensive.

Sablik: Right. Do you have any thoughts on some improvements that small towns or rural places could adopt to try and alleviate these challenges, maybe drawing from any sort of strategies that you've adopted in your own practice?

Latham: Sure, I can start with how small town and rural governments use data. One thing they do is combine their resources to either outsource data analysis tasks or to hire someone to serve the larger region.

For example, we recently wrote about the New River Valley Regional Commission's housing study that was released last year. This study was a joint effort between the Virginia Center for Housing Research at Virginia Tech and HousingForward Virginia. Funded by a grant from Virginia Housing, the state's housing authority, housing experts at the Virginia Center for Housing Research conducted an extensive housing study for the 13 local governments that make up the New River Valley. HousingForward Virginia developed accompanying policy recommendations for the region as a whole and for each of the local governments. By combining their efforts, these 13 local governments were able to benefit from expert data and policy analysis that would have been difficult for them to accomplish in isolation.

Norris: There are other efforts at the local, state and national level that can improve rural data collection. One strategy for increasing rural representation in large surveys is oversampling rural residents. In populations with high rates of nonresponse, which Sierra talked about earlier, using a larger sample of that population can increase overall rural responses in the survey.
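To make the oversampling idea concrete, the sketch below compares the expected number of completed rural responses when survey invitations are split in proportion to population versus when rural residents are deliberately given a larger share of invitations. All of the numbers (total invitations, rural population share, rural response rate, and the oversampled share) are hypothetical.

```python
def expected_responses(invited, response_rate):
    """Expected completed surveys from a group, given invitations and response rate."""
    return invited * response_rate

# Hypothetical design parameters, for illustration only.
total_invites = 1_000
rural_pop_share = 0.20      # rural share of the population
rural_response_rate = 0.30  # share of invited rural residents who respond
oversampled_share = 0.40    # share of invitations sent to rural residents when oversampling

proportional = expected_responses(total_invites * rural_pop_share, rural_response_rate)
oversampled = expected_responses(total_invites * oversampled_share, rural_response_rate)

print(f"Proportional design: about {proportional:.0f} rural responses")
print(f"Oversampled design:  about {oversampled:.0f} rural responses")
```

With these assumed values, doubling the rural share of invitations doubles the expected rural responses, from about 60 to about 120. In practice, oversampled groups are usually weighted back down to their population share when results are aggregated, a step this sketch leaves out.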

Partnering with local leaders can reduce nonresponse rates in rural communities as well. A strategy that the decennial census used in 2020 was to encourage the development of Complete Count Committees in communities across the country. These were volunteer committees established by tribal, state and local governments, and community leaders and organizations to increase awareness of the census and to motivate residents to respond. Engaging with trusted voices in communities can help reduce hesitancy that some people might have around providing data to survey collectors.

In our business surveys at the Richmond Fed, we do look to partner with membership organizations and groups who are directly connected to the respondents that we want to include in our surveys. We are always looking for ways to increase the number of rural businesses that participate in the surveys. For example, we might connect with a local chamber of commerce to recruit survey participants. That might help us reach businesses that we might otherwise miss and can also improve the likelihood of response.

Sierra also mentioned broadband as a challenge for rural communities earlier in the podcast. We know expanding broadband is important for economic development overall in rural areas. It can also reduce sample selection issues for online surveys by connecting small town and rural residents with the opportunities to participate in these surveys.

We recently highlighted a case study of an innovative partnership to expand broadband to residents and businesses on Maryland's Eastern Shore in our rural Spotlight Series. There are examples from across the [Fifth] District of towns working together to expand broadband.

But even with improved data collection strategies in place, nothing can replace the value of our community outreach and engagement. Like Sierra mentioned at the top of the episode, we are constantly meeting with people in the District to get a sense of what's happening on the ground in their communities. These are our subject matter experts. We rely on the information we get directly from businesses and community leaders to add important context to the data that we use from other sources that we've talked about.

Particularly in rural communities, it's critical to get this information to help us understand the complex economic challenges that towns are experiencing. But it also gives us the opportunity to hear about success stories and initiatives that are really making a difference in rural communities that we can share.

Sablik: Yeah, and we can definitely add some links in the show notes for any listeners out there who are interested in participating in those conversations.

As always, it's an opportunity for me to encourage our listeners who are interested in learning more about these topics to head over to our Research page at Richmondfed.org. There, you can find all kinds of economic indicators collected by our research analysts like Sierra and Stephanie, including some recently released interactive maps comparing rural and urban regions in the District across a variety of metrics.

That's all we have time for today. Sierra and Stephanie, thanks very much for joining me to talk about this.

Latham: Thank you.

Norris: Thank you for the opportunity.
