1 Introduction

This document describes the constituent measures that make up each domain of the Ohio Opportunity Index (OOI). The following section explains how to read the data presented in this appendix.


2 Helpful information for interpreting this document

The data section of this appendix includes four tabs of information for each domain, which are explained in the following subsections. An additional tab (“all domains”) brings all measures together to examine between-domain correlations and to consolidate the descriptions and sources into a single table.

2.1 Univariate description

The univariate analyses consist of a table with one row per constituent measure. Each row contains:

  1. the variable name written as x_y, where x is a short name for the domain (e.g., “ED” represents Education) and y is the name of the measure
  2. an English description of the variable
  3. a quantitative summary of the distribution
  4. a count of the number of distinct values (a very small count could be problematic)
  5. a histogram of the distribution
  6. a count (and percentage) of tracts with a missing value
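
The quantities in each row can be sketched as follows. This is a minimal illustration, not the code used to produce this appendix: the function name `summarize` and the example values are hypothetical, the sample standard deviation and the "inclusive" quartile method are assumptions, and the CV is taken to be sd divided by mean, which agrees with the tabulated values (for example, 4.187 / 24.14 ≈ 0.173 for TR_avg_commute).

```python
import math
import statistics


def summarize(values):
    """Summarize one measure; None marks a tract with a missing value."""
    valid = [v for v in values if v is not None]
    n_missing = len(values) - len(valid)
    mean = statistics.fmean(valid)
    sd = statistics.stdev(valid)  # sample standard deviation (assumption)
    # Quartiles by the "inclusive" method (assumption; other methods exist).
    q1, median, q3 = statistics.quantiles(valid, n=4, method="inclusive")
    return {
        "mean": mean,
        "sd": sd,
        "min": min(valid),
        "median": median,
        "max": max(valid),
        "iqr": q3 - q1,
        "cv": sd / mean if mean != 0 else math.nan,
        "distinct": len(set(valid)),
        "missing": n_missing,
        "missing_pct": 100 * n_missing / len(values),
    }


# Hypothetical tract values for one measure; None is a missing tract.
example = [0.10, 0.20, 0.20, 0.40, None, 0.60]
summary = summarize(example)
```

Because the CV divides by the mean, it is negative for flipped measures, whose means are negative.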

Note: variables with the “_flip” suffix were reversed so that higher values on all measures are indicative of more deprivation (less opportunity) with respect to the conceptual meaning of the measure.
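
A minimal sketch of the reversal, assuming that "flipping" means multiplying the measure by -1. That assumption is consistent with the non-positive values reported for every “_flip” variable in this appendix, but the exact transformation is not stated here. The helper names and the example values are hypothetical.

```python
def flip(values):
    """Reverse a measure's direction by negating it; None (missing) is kept."""
    return [None if v is None else -v for v in values]


def add_flipped(measures, name):
    """Return a copy of `measures` with `name` replaced by `name + "_flip"`."""
    flipped = dict(measures)
    flipped[name + "_flip"] = flip(flipped.pop(name))
    return flipped


# Hypothetical graduation rates: higher raw values mean more opportunity,
# so the measure is flipped to make higher values mean more deprivation.
tracts = {"ED_grad_rate": [99.8, 89.8, None, 23.5]}
flipped = add_flipped(tracts, "ED_grad_rate")
```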

2.2 Correlation heatmap

The heatmap is like a correlation matrix, except that the sign of each correlation is represented by a color (blue or red) and its magnitude by the saturation of that color instead of a number. Each colored box in the heatmap represents the correlation between the row and column variables it connects in the grid. Increasingly saturated (dark) blues represent increasingly negative correlations, while increasingly saturated reds represent increasingly positive correlations. A zero correlation is indicated by pure white. Measures that hang together are connected by moderate to dark red colors. Measures that do not hang together or are inverses of one another are white or blue, respectively.

The dendrogram around the heatmap represents the results of a hierarchical clustering algorithm, which groups variables according to their similarity. In practice, this means that positively correlated variables are grouped together.
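
The two ideas above, coloring by correlation and grouping by similarity, can be sketched as follows. This is an illustration under stated assumptions, not the procedure used for the figures: the distance 1 - r, the average-linkage rule, the color mapping, and all names and data (`pearson`, `cell_color`, `cluster`, measures A, B, C) are hypothetical choices.

```python
import math


def pearson(x, y):
    """Pearson correlation over tracts where both measures are present."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    return sxy / math.sqrt(sxx * syy)


def cell_color(r):
    """Map r in [-1, 1] to (R, G, B): blue if negative, white at 0, red if positive."""
    fade = round(255 * (1 - abs(r)))
    return (255, fade, fade) if r >= 0 else (fade, fade, 255)


def cluster(names, corr):
    """Agglomerative clustering with distance 1 - r and average linkage.

    Returns the merges in order as (cluster_a, cluster_b, distance).
    """
    def dist(a, b):
        return sum(1 - corr[i][j] for i in a for j in b) / (len(a) * len(b))

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        a, b = min(
            ((p, q) for i, p in enumerate(clusters) for q in clusters[i + 1:]),
            key=lambda pq: dist(*pq),
        )
        merges.append((a, b, dist(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Hypothetical measures: A and B move together, C is the inverse of A.
measures = {"A": [1, 2, 3, 4, 5], "B": [2, 4, 5, 4, 5], "C": [5, 4, 3, 2, 1]}
corr = {a: {b: pearson(measures[a], measures[b]) for b in measures} for a in measures}
merges = cluster(list(measures), corr)
```

In this toy example A and B are merged first because they are positively correlated, while C joins last because it is negatively correlated with both.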

2.3 Choropleth maps

Every measure within each domain is plotted as a choropleth of Ohio to visualize the geographic distribution of the measure. The colors range from a dark blue—indicating a low value for the variable in a tract—to a bright yellow/green—indicating a relatively high value in a tract. All variables are defined such that high values imply more deprivation with respect to the variable in question (bright areas are low-opportunity areas).

The map legend distinguishes four levels of deprivation, from the brightest color to the darkest: very high, high, moderate, and low.

The value associated with areas shaded in bright grey is missing or unknown.

The choropleths are visually enhanced by Winsorizing variables before they are plotted. Winsorizing sets values above the 97.5th percentile to the value of the 97.5th percentile, and values below the 2.5th percentile to the value of the 2.5th percentile. This supports visualization by keeping large outliers from defining the scale of the variable. Without it, a plot can appear to have one bright tract while all other tracts appear to have uniformly low values (dark).
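
A minimal sketch of Winsorizing at the 2.5th and 97.5th percentiles. The linearly interpolated percentile and the function names are assumptions made for illustration, and the example values are hypothetical; the software used for the maps may compute percentiles differently.

```python
import math


def percentile(sorted_values, pct):
    """Linearly interpolated percentile of pre-sorted data (assumed method)."""
    pos = (len(sorted_values) - 1) * pct / 100
    lower = math.floor(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


def winsorize(values, lower_pct=2.5, upper_pct=97.5):
    """Clamp values to the [lower_pct, upper_pct] percentile range.

    Missing values (None) are ignored when computing the percentiles
    and are passed through unchanged.
    """
    valid = sorted(v for v in values if v is not None)
    low = percentile(valid, lower_pct)
    high = percentile(valid, upper_pct)
    return [None if v is None else min(max(v, low), high) for v in values]


# Hypothetical tract values: 1..40, one large outlier, and one missing tract.
tract_values = [float(v) for v in range(1, 41)] + [1000.0, None]
plotted = winsorize(tract_values)
```

Here the outlier of 1000 is pulled down to 40, so the color scale spans the bulk of the tracts instead of being stretched by a single value.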


3 Analysis of each domain


3.1 Transportation

3.1.1 Univariate description

Variable | Type | Mean (sd) | min < med < max | IQR (CV) | Distinct values | Missing
TR_prp_public_transit_access_flip | numeric | -0.019 (0.041) | -0.453 < -0.002 < 0 | 0.019 (-2.15) | 1680 | 17 (0.5%)
TR_avg_commute | numeric | 24.14 (4.187) | 5 < 23.915 < 42.956 | 5.411 (0.173) | 3148 | 17 (0.5%)
TR_prp_no_vehicle | numeric | 0.084 (0.094) | 0 < 0.054 < 0.8 | 0.086 (1.115) | 2983 | 17 (0.5%)
TR_traffic_prox_flip | numeric | -117.023 (170.519) | -1876.687 < -66.777 < -0.015 | 114.896 (-1.457) | 3157 | 11 (0.3%)

3.1.2 Correlation heatmap

3.1.3 Choropleth maps

3.1.3.1 TR_prp_public_transit_access_flip

3.1.3.2 TR_avg_commute

3.1.3.3 TR_prp_no_vehicle

3.1.3.4 TR_traffic_prox_flip

3.1.4 Summary and sources of measures

variable | description | source
prp_public_transit_access_flip | Proportion of the population that has access to public transportation (including taxi) (flipped) | 2020 ACS
avg_commute | Average time spent commuting to work | 2020 ACS
prp_no_vehicle | Proportion of households in a census tract without access to a vehicle | 2020 ACS
traffic_prox_flip | Average annual daily vehicle traffic by distance (flipped) | 2020 ACS

3.2 Health

3.2.1 Univariate description

Variable | Type | Mean (sd) | min < med < max | IQR (CV) | Distinct values | Missing
HL_prp_diabetes_admit | numeric | 0.031 (0.039) | 0 < 0.023 < 1 | 0.034 (1.267) | 938 | 14 (0.4%)
HL_preventable_ED | numeric | 0.466 (0.052) | 0 < 0.467 < 0.796 | 0.061 (0.112) | 3159 | 9 (0.3%)
HL_avg_medical_prov_dist | numeric | 1.244 (1.711) | 0 < 0.655 < 13.307 | 0.895 (1.376) | 3159 | 0 (0.0%)
HL_avg_healthy_food_dist | numeric | 1.601 (1.813) | 0 < 1.02 < 14.465 | 1.376 (1.132) | 3159 | 0 (0.0%)
HL_w_death_rate | numeric | 3.899 (4.754) | 0 < 2.85 < 137.351 | 3.705 (1.219) | 3141 | 13 (0.4%)

3.2.2 Correlation heatmap

3.2.3 Choropleth maps

3.2.3.1 HL_prp_diabetes_admit

3.2.3.2 HL_preventable_ED

3.2.3.3 HL_avg_medical_prov_dist

3.2.3.4 HL_avg_healthy_food_dist

3.2.3.5 HL_w_death_rate

3.2.4 Summary and sources of measures

variable | description | source
prp_diabetes_admit | Proportion of Medicaid inpatient admissions with a primary diagnosis of diabetes among Medicaid beneficiaries | Medicaid Claims
preventable_ED | Proportion of Emergency Department visits for a preventable medical condition, Medicaid beneficiaries | Medicaid Claims
avg_medical_prov_dist | Average distance to nearest healthcare provider from block centroid, weighted by block population | Data Axle
avg_healthy_food_dist | Average distance to nearest healthy food location from block centroid, weighted by block population | Data Axle
w_death_rate | Average age-adjusted mortality | Vital Statistics
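
Several measures in this appendix, such as avg_medical_prov_dist and avg_healthy_food_dist, are block-level distances averaged up to the tract and weighted by block population. A minimal sketch of that kind of weighting follows; the function name, the handling of tracts with zero population, and the example numbers are assumptions for illustration only.

```python
def weighted_average(values, weights):
    """Population-weighted mean of block-level values within one tract.

    Returns None when the total weight is zero (assumed convention),
    since the average is undefined for a tract with no population.
    """
    total = sum(weights)
    if total == 0:
        return None
    return sum(v * w for v, w in zip(values, weights)) / total


# Hypothetical tract with three blocks: distance from each block centroid
# to the nearest provider, and each block's population.
block_distance = [0.5, 2.0, 4.0]
block_population = [100, 50, 0]
tract_distance = weighted_average(block_distance, block_population)
```

The unpopulated block at distance 4.0 does not affect the result, which is the point of weighting by population rather than averaging the blocks equally.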

3.3 Employment

3.3.1 Univariate description

Variable | Type | Mean (sd) | min < med < max | IQR (CV) | Distinct values | Missing
EM_low_wage_job_ratio_flip | numeric | -0.162 (0.173) | -5.71 < -0.135 < -0.017 | 0.051 (-1.065) | 3104 | 13 (0.4%)
EM_avg_workforce_training_dist | numeric | 5.176 (6.943) | 0 < 2.717 < 57.054 | 4.154 (1.341) | 3159 | 0 (0.0%)
EM_prp_unemployed | numeric | 0.059 (0.056) | 0 < 0.042 < 0.842 | 0.051 (0.964) | 3036 | 17 (0.5%)
EM_prp_poverty | numeric | 0.116 (0.124) | 0 < 0.075 < 1 | 0.127 (1.069) | 2896 | 18 (0.6%)

3.3.2 Correlation heatmap

3.3.3 Choropleth maps

3.3.3.1 EM_low_wage_job_ratio_flip

3.3.3.2 EM_avg_workforce_training_dist

3.3.3.3 EM_prp_unemployed

3.3.3.4 EM_prp_poverty

3.3.4 Summary and sources of measures

variable | description | source
low_wage_job_ratio_flip | Ratio of entry-level jobs to non-college educated workforce (flipped) | 2020 ACS
avg_workforce_training_dist | Average distance to nearest workforce training center from block centroid, weighted by block population | Data Axle
prp_unemployed | Unemployment rate | 2020 ACS
prp_poverty | Proportion of families living below the federal poverty line | 2020 ACS

3.4 Education

3.4.1 Univariate description

Variable | Type | Mean (sd) | min < med < max | IQR (CV) | Distinct values | Missing
ED_prp_associates_plus_flip | numeric | -0.372 (0.182) | -0.927 < -0.329 < 0 | 0.245 (-0.489) | 3147 | 13 (0.4%)
ED_performance_flip | numeric | -79.022 (16.394) | -113.825 < -81.876 < -34.339 | 24.872 (-0.207) | 2464 | 0 (0.0%)
ED_free_red | numeric | 0.041 (0.031) | 0 < 0.04 < 0.184 | 0.05 (0.763) | 1411 | 0 (0.0%)
ED_grad_rate_flip | numeric | -84.151 (14.007) | -99.8 < -89.817 < -23.533 | 19.167 (-0.166) | 764 | 0 (0.0%)
ED_tract_prp_w_int_flip | numeric | -0.967 (0.1) | -1 < -1 < 0 | 0.007 (-0.103) | 1350 | 10 (0.3%)

3.4.2 Correlation heatmap

3.4.3 Choropleth maps

3.4.3.1 ED_prp_associates_plus_flip

3.4.3.2 ED_performance_flip

3.4.3.3 ED_free_red

3.4.3.4 ED_grad_rate_flip

3.4.3.5 ED_tract_prp_w_int_flip

3.4.4 Summary and sources of measures

variable | description | source
prp_associates_plus_flip | Proportion of population with an associate’s degree or higher (flipped) | 2020 ACS
performance_flip | Average of the performance index of all schools in the census tract, unless there are fewer than three schools in the census tract, in which case the average also includes the nearest schools up to three schools (flipped) | 2023 Ohio Department of Education
free_red | Average free/reduced lunch rate of all schools in the census tract, unless there are fewer than three schools in the census tract, in which case the average also includes the nearest schools up to three schools | 2022 Ohio Department of Education
grad_rate_flip | Average high school graduation rate of all schools in the census tract, unless there are fewer than three schools in the census tract, in which case the average also includes the nearest schools up to three schools (flipped) | 2023 Ohio Department of Education
tract_prp_w_int_flip | Proportion of internet connections with at least 300 Mbps maximum advertised download speed, weighted by census block population (flipped) | 2022 FCC

3.5 Housing

3.5.1 Univariate description

Variable | Type | Mean (sd) | min < med < max | IQR (CV) | Distinct values | Missing
HS_med_contract_rent_flip | numeric | -794.272 (295.681) | -3501 < -726 < -99 | 302 (-0.372) | 948 | 151 (4.8%)
HS_med_home_value_flip | numeric | -183892.9 (104801.1) | -1130000 < -164250 < -12700 | 113600 (-0.57) | 1982 | 72 (2.3%)
HS_LIHTC_units_per_1000_Households | numeric | 0.511 (1.266) | 0 < 0 < 17.787 | 0.562 (2.478) | 859 | 17 (0.5%)
HS_prp_pre1980 | numeric | 0.673 (0.241) | 0 < 0.716 < 1 | 0.367 (0.359) | 3097 | 17 (0.5%)
HS_prp_overcrowd | numeric | 0.015 (0.022) | 0 < 0.007 < 0.275 | 0.021 (1.497) | 1902 | 17 (0.5%)
HS_prp_moved_last_year | numeric | 0.127 (0.085) | 0 < 0.107 < 0.71 | 0.087 (0.669) | 3130 | 17 (0.5%)

3.5.2 Correlation heatmap

3.5.3 Choropleth maps

3.5.3.1 HS_med_contract_rent_flip

3.5.3.2 HS_med_home_value_flip

3.5.3.3 HS_LIHTC_units_per_1000_Households

3.5.3.4 HS_prp_pre1980

3.5.3.5 HS_prp_overcrowd

3.5.3.6 HS_prp_moved_last_year

3.5.4 Summary and sources of measures

variable | description | source
med_contract_rent_flip | Median rent (flipped) | 2020 ACS
med_home_value_flip | Median home value (flipped) | 2020 ACS
LIHTC_units_per_1000_Households | Concentration of family low-income housing tax credit households per 1000 households |
prp_pre1980 | Proportion of homes built before 1980 | 2020 ACS
prp_overcrowd | Proportion of people living in overcrowded housing (>1 occupant per room) | 2020 ACS
prp_moved_last_year | Proportion of people who moved within the last year | 2020 ACS

3.6 Environment

3.6.1 Univariate description

Variable | Type | Mean (sd) | min < med < max | IQR (CV) | Distinct values | Missing
EN_avg_park_dist | numeric | 1.076 (1.703) | 0 < 0.477 < 15.268 | 0.8 (1.582) | 3157 | 0 (0.0%)
EN_D2_PM25 | numeric | 15.708 (16.419) | 0 < 8.994 < 93.055 | 18.464 (1.045) | 3122 | 6 (0.2%)
EN_tract_wi_flip | numeric | -7.299 (4.726) | -19.5 < -7.274 < 0 | 6.894 (-0.648) | 2494 | 0 (0.0%)
EN_transformed_cancer_risk | numeric | 19.651 (4.23) | 7.325 < 18.774 < 50 | 2.603 (0.215) | 2252 | 8 (0.3%)
EN_transformed_noncancer_risk | numeric | 0.027 (0.013) | 0.012 < 0.025 < 0.317 | 0.009 (0.486) | 3135 | 8 (0.3%)

3.6.2 Correlation heatmap

3.6.3 Choropleth maps

3.6.3.1 EN_avg_park_dist

3.6.3.2 EN_D2_PM25

3.6.3.3 EN_tract_wi_flip

3.6.3.4 EN_transformed_cancer_risk

3.6.3.5 EN_transformed_noncancer_risk

3.6.4 Summary and sources of measures

variable | description | source
avg_park_dist | Average distance to nearest park from block centroid, weighted by block population. | 2018 Trust for Public Land & ArcGIS USA Parks
D2_PM25 | Annual average PM2.5 levels | EJI
tract_wi_flip | Average of walkability index by block, weighted by block population (flipped). | EPA
transformed_cancer_risk | Weighted average of cancer risk per million people. | 2019 EPA AirToxScreen
transformed_noncancer_risk | Average of the respiratory, neurological, liver, developmental, reproductive, kidney, ocular, endocrine, hematological, immunological, skeletal, spleen, thyroid, and whole body hazard quotients. | 2019 EPA AirToxScreen

3.7 Criminal Justice

3.7.1 Univariate description

Variable | Type | Mean (sd) | min < med < max | IQR (CV) | Distinct values | Missing
CR_homic_assault | numeric | 0.027 (0.07) | 0 < 0.011 < 3.333 | 0.024 (2.618) | 3020 | 26 (0.8%)
CR_robbery | numeric | 0.002 (0.008) | 0 < 0 < 0.417 | 0.002 (3.984) | 2044 | 26 (0.8%)
CR_burg_larc_mvtheft | numeric | 0.05 (0.181) | 0 < 0.029 < 9.667 | 0.048 (3.624) | 3052 | 26 (0.8%)
CR_drunk_dui | numeric | 0.004 (0.019) | 0 < 0.002 < 1 | 0.003 (4.94) | 2576 | 26 (0.8%)
CR_drug | numeric | 0.011 (0.044) | 0 < 0.005 < 2.25 | 0.011 (3.886) | 2880 | 26 (0.8%)
CR_sex_offense_any | numeric | 0.002 (0.007) | 0 < 0.001 < 0.333 | 0.003 (2.679) | 2689 | 26 (0.8%)

3.7.2 Correlation heatmap

3.7.3 Choropleth maps

3.7.3.1 CR_homic_assault

3.7.3.2 CR_robbery

3.7.3.3 CR_burg_larc_mvtheft

3.7.3.4 CR_drunk_dui

3.7.3.5 CR_drug

3.7.3.6 CR_sex_offense_any

3.7.4 Summary and sources of measures

variable | description | source
homic_assault | Rate of reports of assault, murder, and manslaughter per person. | 2021 and 2022 Ohio Department of Public Safety
robbery | Rate of reports of pocket-picking, purse-snatching, and robbery per person. | 2021 and 2022 Ohio Department of Public Safety
burg_larc_mvtheft | Rate of reports of larceny, burglary, embezzlement, blackmail, identity theft, vehicle theft, shoplifting, and credit card fraud per person. | 2021 and 2022 Ohio Department of Public Safety
drunk_dui | Rate of reports of driving under the influence and drunkenness per person. | 2021 and 2022 Ohio Department of Public Safety
drug | Rate of reports of drug and narcotic violations per person. | 2021 and 2022 Ohio Department of Public Safety
sex_offense_any | Rate of reports of human trafficking for sex acts, prostitution, rape, fondling, sodomy, incest, peeping tom, and pornographic material per person. | 2021 and 2022 Ohio Department of Public Safety

3.8 All Domains

3.8.1 Correlation heatmap

3.8.2 Summary and sources of measures

variable | description | source
homic_assault | Rate of reports of assault, murder, and manslaughter per person. | 2021 and 2022 Ohio Department of Public Safety
robbery | Rate of reports of pocket-picking, purse-snatching, and robbery per person. | 2021 and 2022 Ohio Department of Public Safety
burg_larc_mvtheft | Rate of reports of larceny, burglary, embezzlement, blackmail, identity theft, vehicle theft, shoplifting, and credit card fraud per person. | 2021 and 2022 Ohio Department of Public Safety
drunk_dui | Rate of reports of driving under the influence and drunkenness per person. | 2021 and 2022 Ohio Department of Public Safety
drug | Rate of reports of drug and narcotic violations per person. | 2021 and 2022 Ohio Department of Public Safety
sex_offense_any | Rate of reports of human trafficking for sex acts, prostitution, rape, fondling, sodomy, incest, peeping tom, and pornographic material per person. | 2021 and 2022 Ohio Department of Public Safety
prp_associates_plus_flip | Proportion of population with an associate’s degree or higher (flipped) | 2020 ACS
performance_flip | Average of the performance index of all schools in the census tract, unless there are fewer than three schools in the census tract, in which case the average also includes the nearest schools up to three schools (flipped) | 2023 Ohio Department of Education
free_red | Average free/reduced lunch rate of all schools in the census tract, unless there are fewer than three schools in the census tract, in which case the average also includes the nearest schools up to three schools | 2022 Ohio Department of Education
grad_rate_flip | Average high school graduation rate of all schools in the census tract, unless there are fewer than three schools in the census tract, in which case the average also includes the nearest schools up to three schools (flipped) | 2023 Ohio Department of Education
tract_prp_w_int_flip | Proportion of internet connections with at least 300 Mbps maximum advertised download speed, weighted by census block population (flipped) | 2022 FCC
low_wage_job_ratio_flip | Ratio of entry-level jobs to non-college educated workforce (flipped) | 2020 ACS
avg_workforce_training_dist | Average distance to nearest workforce training center from block centroid, weighted by block population | Data Axle
prp_unemployed | Unemployment rate | 2020 ACS
prp_poverty | Proportion of families living below the federal poverty line | 2020 ACS
avg_park_dist | Average distance to nearest park from block centroid, weighted by block population. | 2018 Trust for Public Land & ArcGIS USA Parks
D2_PM25 | Annual average PM2.5 levels | EJI
tract_wi_flip | Average of walkability index by block, weighted by block population (flipped). | EPA
transformed_cancer_risk | Weighted average of cancer risk per million people. | 2019 EPA AirToxScreen
transformed_noncancer_risk | Average of the respiratory, neurological, liver, developmental, reproductive, kidney, ocular, endocrine, hematological, immunological, skeletal, spleen, thyroid, and whole body hazard quotients. | 2019 EPA AirToxScreen
prp_diabetes_admit | Proportion of Medicaid inpatient admissions with a primary diagnosis of diabetes among Medicaid beneficiaries | Medicaid Claims
preventable_ED | Proportion of Emergency Department visits for a preventable medical condition, Medicaid beneficiaries | Medicaid Claims
avg_medical_prov_dist | Average distance to nearest healthcare provider from block centroid, weighted by block population | Data Axle
avg_healthy_food_dist | Average distance to nearest healthy food location from block centroid, weighted by block population | Data Axle
w_death_rate | Average age-adjusted mortality | Vital Statistics
med_contract_rent_flip | Median rent (flipped) | 2020 ACS
med_home_value_flip | Median home value (flipped) | 2020 ACS
LIHTC_units_per_1000_Households | Concentration of family low-income housing tax credit households per 1000 households |
prp_pre1980 | Proportion of homes built before 1980 | 2020 ACS
prp_overcrowd | Proportion of people living in overcrowded housing (>1 occupant per room) | 2020 ACS
prp_moved_last_year | Proportion of people who moved within the last year | 2020 ACS
prp_public_transit_access_flip | Proportion of the population that has access to public transportation (including taxi) (flipped) | 2020 ACS
avg_commute | Average time spent commuting to work | 2020 ACS
prp_no_vehicle | Proportion of households in a census tract without access to a vehicle | 2020 ACS
traffic_prox_flip | Average annual daily vehicle traffic by distance (flipped) | 2020 ACS