Development of the BFAL

Introduction

IMPAQ’s Barriers to Food Access Locator (BFAL) is an interactive, online data visualization tool that allows users to identify communities and neighborhoods with households likely experiencing barriers to food access. The BFAL is unique in that it is the first attempt at incorporating the multitude of barriers to accessing adequate food. Given the exacerbation of hunger and food insufficiency that has accompanied the COVID-19 pandemic, this localized approach to identifying at-risk communities will facilitate more targeted strategies for improving food access for those most in need.

This document describes the development of the BFAL, including measure selection, data manipulation, and analyses that led to the classification of Census tracts into groups experiencing low, moderate, and high barriers to food access.

Methodology

Overview

The BFAL was created using 17 variables that reflect five themes associated with barriers to food access, as identified in the analysis: community characteristics, financial disadvantage, populations at heightened risk, income disparities, and proximity to food outlets (Table 1). The final variable list was determined through a multistage process: identification of variables through a priori association with food access and food insecurity, and with anticipated challenges in food access that could arise during the COVID-19 pandemic; consultation with food insecurity and hunger experts; and analytic data reduction techniques, including principal component analysis[1]. After finalizing the variable list, we used cluster analysis[2] to classify Census tracts into groups experiencing low, moderate, or high barriers to food access. The sections below detail each of these steps.

Table 1. Variables Used in BFAL Classification

Theme | Variable | Source
1. Community characteristics | Population density | ArcGIS
1. Community characteristics | Percentage of occupied housing units that are renter-occupied | ACS 2015-2019
1. Community characteristics | Percentage of households (HHs) with residents speaking English less than very well | ACS 2015-2019
1. Community characteristics | Percentage of population using public transportation to get to work | ACS 2015-2019
1. Community characteristics | Percentage of HHs with high housing burden (rent or mortgage >30% of income) | ACS 2015-2019
1. Community characteristics | Percentage of HHs with high transportation burden (>30% of rent/mortgage) | Center for Neighborhood Technology, 2017
2. Financial disadvantage | Percentage of population in poverty | ACS 2015-2019
2. Financial disadvantage | Percentage of HHs receiving cash public assistance | ACS 2015-2019
2. Financial disadvantage | Percentage of population with a disability | ACS 2015-2019
2. Financial disadvantage | Percentage of HHs without a computer | ACS 2015-2019
3. Populations at heightened risk | Percentage of low-income jobs lost due to COVID-19 | Urban Institute
3. Populations at heightened risk | Percentage of population over age 65 | ACS 2015-2019
3. Populations at heightened risk | Percentage of adults age 65 and older living alone | ACS 2015-2019
4. Income disparities | Black to White income disparity ratio | ACS 2015-2019
4. Income disparities | American Indian/Alaska Native to White income disparity ratio | ACS 2015-2019
4. Income disparities | Hispanic to White income disparity ratio | ACS 2015-2019
5. Proximity to food outlets | Percentage of population not within 1 mile of food store in urban areas or 10 miles in rural areas | ERS, Food Atlas, 2015

Note: ACS = American Community Survey; ERS = Economic Research Service.

Variable identification

The study team identified variables based on a priori association with food access, food insecurity, or other elements related to COVID-19 that might impact food access (e.g., government restrictions or rapidly changing economic circumstances). In order to develop the classifications at the most granular level possible, we limited our search to data available at the Census tract level. The initial set of variables was shared with experts in the field of food insecurity and hunger, and variables were added and eliminated based on their feedback. In total, we identified 28 variables with Census tract-level data for all 50 states plus the District of Columbia.

Many of the variables reviewed in the construction of the index were downloaded from the original data sources as percentages. In some cases, it was necessary to sum percentages to create variables (for instance, the percentage of adults aged 65 and older was the sum of the percentages aged 65-74, 75-84, and 85 and older). For estimates of high housing burden, it was necessary to total the number of households for which housing costs exceeded 30% of income and then divide by the total number of households in the Census tract. We calculated three racial income disparity ratios by dividing the median income of minority-headed households (HHs; numerator) by the median income of White-headed HHs (denominator), using estimates provided by the American Community Survey (ACS):

  • Black to White income disparity ratio
  • American Indian/Alaska Native to White income disparity ratio
  • Hispanic to White income disparity ratio

A disparity ratio of less than one indicates that minority-headed HHs typically had a median income less than White-headed HHs; a ratio of one indicates equal median income between minority- and White-headed HHs; and a ratio greater than one indicates that minority-headed HHs earn more than White-headed HHs.

Some Census tracts did not have a median income reported for a given race/ethnicity due to having no HHs headed by an individual who identified with that race/ethnicity, or due to too few HHs to reliably estimate the group’s median income.  For the purposes of this analysis, missing data were treated as follows:

  • Tracts with no persons of a given race/ethnicity were assigned a disparity ratio of 1 (no disparity)
  • Tracts missing income data for minority-headed households were assigned a disparity ratio equal to the median income for the Census tract divided by the White-headed household income
  • Tracts missing income data for White-headed households were assigned a disparity ratio equal to the minority-headed household income divided by the median income for the Census tract
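
The ratio definition and the three missing-data rules above can be sketched as a single function. This is a minimal illustration, not the study team's code: the function name, argument names, and the use of `None` for a missing median are all assumptions. The case where both the minority and White medians are missing is not described in the text, so the sketch returns `None` rather than guess.

```python
from typing import Optional


def disparity_ratio(
    minority_income: Optional[float],
    white_income: Optional[float],
    tract_income: Optional[float],
    minority_households: int,
) -> Optional[float]:
    """Minority-to-White median income ratio for one Census tract.

    All names here are hypothetical; None stands for a missing median.
    """
    # Rule 1: no households of this race/ethnicity -> ratio of 1 (no disparity).
    if minority_households == 0:
        return 1.0
    # Both medians missing: not covered by the rules in the text.
    if minority_income is None and white_income is None:
        return None
    # Rules 2 and 3: substitute the tract-wide median for the missing side.
    numerator = minority_income if minority_income is not None else tract_income
    denominator = white_income if white_income is not None else tract_income
    if numerator is None or not denominator:
        return None  # nothing to substitute, or a zero denominator
    return numerator / denominator


# Toy values, not real ACS estimates.
complete = disparity_ratio(40000, 80000, 60000, 25)      # 40000 / 80000
no_households = disparity_ratio(None, None, 60000, 0)    # rule 1
missing_minority = disparity_ratio(None, 80000, 60000, 3)  # 60000 / 80000
missing_white = disparity_ratio(40000, None, 50000, 3)     # 40000 / 50000
```

A ratio below 1 in this sketch corresponds to minority-headed households having a lower median income than White-headed households, matching the interpretation given earlier.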

While race and ethnicity have both been associated with higher risk of food insecurity and exposure to COVID-19, we did not include the proportions of minority populations in the classification because they are mutually exclusive by definition. In other words, a Census tract with a very high proportion of non-Hispanic Black individuals could not also have a very high proportion of Hispanic or American Indian/Alaska Native individuals, so the variables could potentially confound one another in our analysis. However, because of the disproportionate impact of COVID-19 on minority populations, we included the three racial income disparity variables, the values of which are independent of one another. After classifying Census tracts as having low, moderate, or high barriers, we found positive associations with the omitted race and ethnicity variables: as expected, tracts with greater barriers to food access had greater proportions of minority populations.

Variable Reduction

The study team reviewed each of the potential variables for inclusion in the index, seeking to eliminate redundancy and simplify the creation of the index. We first looked at correlations between variables, resulting in the following:

  • Percentage of HHs with no access to broadband internet was eliminated for having a very strong correlation with the percentage of HHs with no computer
  • Percentage of HHs receiving Supplemental Nutrition Assistance Program (SNAP) benefits, percentage of HHs with no vehicles, and median household income were eliminated for having moderate to strong correlations with at least four other variables
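
A correlation screen of this kind can be sketched as follows. The text does not state the correlation measure or the cutoffs behind "very strong" and "moderate to strong", so the Pearson coefficient and the 0.8 threshold below are assumptions, as are the variable names and the toy values.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def strongly_correlated_pairs(columns, threshold=0.8):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold.

    The threshold is a hypothetical cutoff, not one reported in the text.
    """
    flagged = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Toy values for five hypothetical tracts (not real data).
tracts = {
    "pct_no_broadband": [10.0, 20.0, 30.0, 40.0, 50.0],
    "pct_no_computer": [8.0, 18.0, 33.0, 41.0, 47.0],
    "pct_over_65": [30.0, 12.0, 25.0, 9.0, 21.0],
}
flagged = strongly_correlated_pairs(tracts)
```

In this toy data only the broadband and computer variables are flagged, which mirrors the reason one of the two was dropped.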

We used principal component analysis (PCA) as the final data reduction step. Prior to conducting the PCA, we standardized the remaining 21 variables by creating ranked variables: each Census tract’s values were ranked low to high and assigned a value of 1 (low) to 100 (high) based on the placement of their rank. The three disparity ratios were coded in reverse (100 to 1), so that lower values (closer to zero) were ranked higher than larger values. We included the 21 ranked variables in the PCA and reviewed the results to determine which variables loaded favorably on components with an eigenvalue greater than 1. From these results, we identified 5 themes (listed in Table 1) and 17 variables that loaded heavily within one of these themes. In this step, we eliminated an additional four variables from the final variable list:

  • Percentage of HHs with children under age 18
  • Percentage of jobs in retail and trade
  • Percentage of jobs in arts and entertainment
  • Percentage of jobs in other services industries
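
The ranking step that precedes the PCA can be sketched as below. The text says only that values were ranked low to high and mapped to 1-100, with the disparity ratios reversed. The exact binning and tie handling here (the share of tracts at or below a value, rounded up) are assumptions, and the PCA itself is not shown.

```python
from bisect import bisect_right


def percentile_ranks(values, reverse=False):
    """Map each value to a rank from 1 (low) to 100 (high).

    With reverse=True, smaller values receive higher ranks, as described
    for the three income disparity ratios. Tied values share a rank.
    """
    # Reversal is done by negating values, so the smallest ranks highest.
    keyed = [-v for v in values] if reverse else list(values)
    ordered = sorted(keyed)
    n = len(ordered)
    ranks = []
    for v in keyed:
        at_or_below = bisect_right(ordered, v)     # tracts with value <= v
        ranks.append((100 * at_or_below + n - 1) // n)  # integer ceiling
    return ranks


# Toy values for four hypothetical tracts.
poverty_ranks = percentile_ranks([5.0, 20.0, 10.0, 40.0])
ratio_ranks = percentile_ranks([0.5, 1.0, 0.8, 1.2], reverse=True)
```

With this orientation, a higher rank is the less favorable value for a disparity ratio, since the tract with the lowest ratio receives 100.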

Food Need Identification

We anticipate that tracts ranking high on many of these 17 variables have greater barriers to food access than tracts that rank lower on these variables or rank high on only a few of them. We used cluster analysis to group tracts based on similarities across these 17 characteristics. Since urban and rural communities face different barriers to food access (e.g., access to transportation, proximity to food outlets) and have different COVID-related challenges (e.g., aging populations, job loss, population density), we conducted the cluster analysis separately for urban and rural tracts. Doing so resulted in three distinct clusters, or groups, each for urban and rural tracts.
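
Running the cluster analysis separately by geography can be sketched with a small k-means routine. The Limitations section names k-means, and the text gives three clusters per geography. The farthest-point initialization, Euclidean distance, iteration cap, and two-variable toy data below are assumptions made to keep the sketch short and deterministic.

```python
from math import dist


def farthest_point_init(points, k):
    """Deterministic start: the first point, then the farthest remaining."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def k_means(points, k=3, iterations=100):
    """Lloyd's algorithm; returns (labels, centroids)."""
    centroids = farthest_point_init(points, k)
    labels = [-1] * len(points)
    for _ in range(iterations):
        # Assignment step: each tract joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy ranked values (two variables per tract) for hypothetical tracts.
tracts_by_geography = {
    "urban": [(10, 12), (12, 9), (11, 11), (50, 52), (48, 49),
              (51, 50), (90, 88), (92, 91), (89, 93)],
    "rural": [(5, 80), (7, 78), (40, 40), (42, 43), (85, 10), (83, 12)],
}
# One independent clustering per geography, three clusters each.
clusters = {
    geo: k_means(points, k=3)[0] for geo, points in tracts_by_geography.items()
}
```

The cluster numbers carry no meaning on their own. Deciding which cluster is "low", "moderate", or "high" requires reviewing each cluster's characteristics.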

The characteristics of the distinct clusters were then reviewed to identify those that likely have high, moderate, and low barriers to food access. Table 2 demonstrates how the mean for each ranked variable within each cluster compares to the median for the nation:

  • “B” indicates that mean Census tract rankings are better than the median for the nation; a B+ indicates that values are better than the 40th percentile for the nation.
  • “W” indicates that mean Census tract rankings are worse than the median for the nation; a W+ indicates that values are worse than the 60th percentile for the nation.

For instance, the urban “high barriers” cluster had mean rankings that were worse than the national median for 14 of 17 indicators, 11 of which were worse than the 60th percentile for the nation. The rural “high barriers” cluster contained Census tracts with mean rankings that were worse than the median for 10 of 17 indicators, 7 of which were worse than the 60th percentile.
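
The B/W labeling can be sketched as a comparison of each cluster's mean rank against fixed cutoffs. This assumes that ranks are oriented so that higher is worse, that the national median corresponds to a rank of 50, and that the 40th and 60th percentiles correspond to ranks of 40 and 60. None of these cutoffs is stated numerically in the text, and the variable names and values are illustrative only.

```python
from statistics import mean


def grade(mean_rank, median=50.0, band=10.0):
    """Label a cluster's mean rank relative to the national median."""
    if mean_rank > median + band:
        return "W+"  # worse than the 60th percentile
    if mean_rank > median:
        return "W"   # worse than the median
    if mean_rank < median - band:
        return "B+"  # better than the 40th percentile
    if mean_rank < median:
        return "B"   # better than the median
    return "="       # exactly at the median; not covered by the legend


def cluster_profile(ranked_tracts):
    """ranked_tracts: list of dicts mapping variable name -> rank (1-100)."""
    variables = list(ranked_tracts[0])
    return {v: grade(mean(t[v] for t in ranked_tracts)) for v in variables}


# Toy ranks for three tracts in one hypothetical cluster.
example_cluster = [
    {"pct_poverty": 88, "pct_over_65": 41},
    {"pct_poverty": 79, "pct_over_65": 47},
    {"pct_poverty": 93, "pct_over_65": 38},
]
profile = cluster_profile(example_cluster)
```

Counting the "W" and "W+" labels in such a profile gives the kind of summary quoted above, such as the number of indicators worse than the national median.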

It is worth noting that each group of Census tracts, including those with anticipated low barriers to food access, has some characteristics that may indicate challenges in accessing food. For instance, urban “low barrier” Census tracts have high transportation cost burden and high job loss due to COVID-19 relative to the rest of the country (Table 2). We expect, however, that this group’s relative financial advantage, better-than-average community characteristics, and lack of income disparities serve as protective factors compared to those in other barriers to food access classifications.

It is also worth noting that these classifications are at the community level, not the individual level. Not every resident of a high-need community will have significant barriers to food access, nor will every resident of a low-need community have no barriers. The BFAL is a strategy to identify areas where we would expect relatively larger proportions of residents to face barriers to food access.

Table 2. Mean Rankings for Analytic Variables Compared to the National Median, by Rural/Urban Geography and BFAL Classification

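Theme | Variable | Urban: Low | Urban: Moderate | Urban: High | Rural: Low | Rural: Moderate | Rural: High
Community Characteristics | Population density | B | W+ | W | B+ | B+ | B+
Community Characteristics | % of homes rented | B+ | W+ | W+ | B+ | B+ | B
Community Characteristics | Non-English speakers | B | W+ | W | B+ | B+ | B+
Community Characteristics | Reliance on public transit | B+ | W+ | W+ | B | B+ | B+
Community Characteristics | High housing burden | B+ | W+ | W+ | B+ | B+ | B+
Community Characteristics | High transportation burden | W | B+ | B+ | W | W+ | W+
Financial Disadvantage | % in poverty | B+ | W | W+ | B+ | B | W+
Financial Disadvantage | % HHs receiving cash assistance | B+ | W | W+ | B+ | B | W
Financial Disadvantage | % of population with disabilities | B | B+ | W+ | B+ | W+ | W+
Financial Disadvantage | % HHs with no computer | B+ | B+ | W+ | B+ | W+ | W+
Heightened Risk Due to COVID | Job loss due to COVID-19 | B | W | W | W | B+ | B+
Heightened Risk Due to COVID | % age 65+ | W+ | B+ | B | B | W+ | W
Heightened Risk Due to COVID | Older adults living alone | W | B+ | W+ | B+ | W+ | W
Income Disparities | Black to White income disparity ratio | B | B | W+ | B | B+ | W+
Income Disparities | American Indian/Alaska Native to White income disparity ratio | B | B | W+ | B | B+ | W+
Income Disparities | Hispanic to White income disparity ratio | B | B | W+ | B | B+ | W+
Proximity to Food Stores | Percentage of population not within 1 mile of food store in urban areas or 10 miles in rural areas | W+ | W | B | B+ | B+ | B+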

W = worse than the median for the nation; W+ = worse than the 60th percentile for the nation.

B = better than the median for the nation; B+ = better than the 40th percentile for the nation.

Limitations

General

The general limitations of our work lie in the areas of data availability and validation. An initial scan of publicly available data yielded over 40 measures; however, many were highly correlated, had missing data, or could not provide the granularity we sought. We decided to draw primarily from the ACS because it provides Census tract-level data and allows the BFAL to be updated consistently as new ACS data become available.

Validation using a comprehensive food access measure not in our BFAL would have been ideal; however, such food access data are not readily available at the Census tract level or for all parts of the country. Our validation focused on a few areas of the country that have measures available beyond those used in our index. These comparisons were favorable to our tool; however, feedback on our initial version of the BFAL may ultimately provide the best validation information.

Principal Component Analysis

Although PCA reduces redundant (or highly correlated) variables, PCA results are less interpretable than more direct measures or linear combinations of the original measures. However, we used PCA solely as a data reduction technique to inform the final inclusion of variables in the cluster analysis.

Cluster Analysis

Cluster analysis (CA) provides an intuitive, conceptually straightforward approach to identifying similar groups based on the analytic inputs (in our case, Census tracts with similar characteristics related to barriers to food access). The main limitation of CA in developing the BFAL classification is that results are sensitive to the type of clustering algorithm chosen and to decisions made by the analytic team. An exhaustive sensitivity analysis of all CA procedures was beyond the scope of this project; however, our large data set, single-type (continuous) variables, and goal of identifying non-overlapping rather than nested clusters made k-means clustering a reasonable choice.

References

  1. C. E. Friesen, P. Seliske, and A. Papadopoulos, “Using Principal Component Analysis to Identify Priority Neighbourhoods for Health Services Delivery by Ranking Socioeconomic Status,” Online J. Public Health Inform., vol. 8, no. 2, 2016, doi: 10.5210/ojphi.v8i2.6733.
  2. N. Sauvageot et al., “Stability-based validation of dietary patterns obtained by cluster analysis,” Nutr. J., vol. 16, no. 1, pp. 1–13, 2017, doi: 10.1186/s12937-017-0226-9.

Acknowledgments

Our team wishes to thank the following individuals for providing valuable insight in the early stages of the BFAL tool development:

Godfred Boateng is an Assistant Professor at the University of Texas at Arlington, and is the Director of the Global & Environmental Health Lab at the College of Nursing and Health Innovation. Dr. Boateng is an expert in the design and application of culturally relevant scalable methodologies to understand global health challenges in the area of maternal and child health, cardiometabolic diseases, and resource insecurity. His research aims to examine the interconnections that exist between food, water, and energy insecurity and associated health implications, particularly those that produce health inequities among women, infants, children and ethnic minority populations.

Hilary Seligman is a Professor of Medicine and of Epidemiology and Biostatistics at the University of California. Dr. Seligman is an expert in food insecurity and its health implications across the life course. She directs the Food Policy, Health, and Hunger Research Program at UCSF’s Center for Vulnerable Populations at Zuckerberg San Francisco General Hospital and the CDC’s Nutrition and Obesity Policy, Research and Evaluation Network. She also serves as Senior Medical Advisor for Feeding America.

Rachel Zack is an epidemiologist at the Greater Boston Food Bank (GBFB). She leads GBFB’s research to measure the relationships between food interventions (at medical and food pantry sites) and health outcomes and healthcare costs. She designs and leads studies to aid GBFB’s understanding of its operational impact on hunger in Eastern Massachusetts. She also works to raise public awareness of the hunger-health connection and of GBFB's research findings.

Data Sources

Center for Neighborhood Technology. 2020. Housing and Transportation Affordability Index, https://htaindex.cnt.org/. Accessed June 24, 2020.

Economic Research Service, U.S. Department of Agriculture. Food Access Research Atlas, https://www.ers.usda.gov/data-products/food-access-research-atlas/. Accessed June 4, 2020.

ESRI, ArcGIS Hub, REST Services U.S. Tract Boundary Lines (ID:8), https://hub.arcgis.com/datasets/fab7849b55d54f0f8f246605f6ee9306?geometr..., Accessed June 5, 2020.

Urban Institute. 2020. Estimated Low Income Jobs Lost to COVID-19. Accessible from https://datacatalog.urban.org/dataset/estimated-low-income-jobs-lost-cov.... Data originally sourced from Census LODES, IPUMS USA, BLS, NY Department of Labor, & WA Employment Security Department, developed at the Urban Institute, and made available under the ODC-BY 1.0 Attribution License. Accessed January 20, 2021.

U.S. Census Bureau. (2012). 2015-2019 American Community Survey 5-year Estimates.  Census Data Application Programming Interface (API) Retrieved from: https://api.census.gov/data/2019/acs/acs5, Accessed January 19, 2021.

 

Prepared by:

Jennifer Pooler, MPP

Vincent Chan, BA

Elizabeth Messenger-Jones, MSPH

Kevin Schott, MS

Nicholas Sumnicht, BA

Last updated: February 16, 2021