Appendix C: Ohio Children’s Opportunity Index (OCOI) Construction (2014/2017–Reduced Version)


1 Introduction

This appendix documents the construction of the two-period Ohio Children’s Opportunity Index (OCOI; 2014 and 2017). This version uses a reduced set of the original 53 constituent measures because not all measures were available for both periods. The reduction ensures that the OCOI score for each period is based on comparable data. The code in this document assumes the database of constituent measures, each assigned to a domain, is already prepared. A separate document describes each variable in each domain using a conceptual English definition, univariate descriptive results, correlation analyses, and a choropleth map visualization.

Note: Throughout this document, results are presented separately for the 2014 and 2017 periods of this two-period version of the OCOI.

The steps are as follows:

  • Standardize the constituent measures (transform to z-scores)
  • Average the standardized measures within each domain
  • Define the domain scores as an exponential transformation of the tract-ranked values of each domain-average
  • Define the Ohio Children’s Opportunity Index as an unweighted average of the domain scores
  • Plot each domain score and the final OCOI score as choropleth maps

2 Domain scores

The domain scores are a function of the constituent measures within each domain. We first standardize each constituent measure by transforming it to a z-score (centering it at zero and dividing by its standard deviation) and combine the z-scores within each domain. We save that untransformed value for plotting. We then rank the combined values across tracts, rescale the ranks to [0, 1], and transform them to an exponential distribution that builds certain desirable cancellation properties, discussed in the next paragraph, into the final OCOI.

The following simplified example illustrates the benefit of the transformation. If we used untransformed z-scores or domain rank values, one unit of opportunity contributed by one domain would completely cancel out one unit of deprivation contributed by another domain (i.e., the combination would be zero-sum). The exponential transform changes this so that more than one unit of opportunity is required to cancel out one unit of deprivation.
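The asymmetry can be checked numerically. The sketch below is illustrative only: exp_transform is a hypothetical helper name wrapping the same formula used in the code in Section 2.1, the two tracts and their rank values are made up, and it assumes (as the later reversal of the index suggests) that higher domain scores indicate more deprivation.

```r
# hypothetical helper: the exponential transform applied to ranks scaled to [0, 1]
exp_transform <- function(r) -23 * log(1 - r * (1 - exp(-100/23)))

# made-up scaled ranks for two tracts on two domains
tract_mixed   <- c(0.9, 0.1)  # highly deprived on one domain, barely on the other
tract_uniform <- c(0.5, 0.5)  # middling on both domains

# averaging the raw ranks: the two tracts are indistinguishable (both 0.5)
mean(tract_mixed)
mean(tract_uniform)

# averaging the transformed scores: the high-deprivation domain dominates
mean(exp_transform(tract_mixed))    # about 26.4
mean(exp_transform(tract_uniform))  # about 15.6
```

Under these assumptions, the low rank on one domain only partly offsets the high rank on the other, so the mixed tract ends up with a higher combined score than the uniform tract.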

This choice is based on key principles from research on the construction of deprivation indices in the UK (Noble, Wright, Smith & Dibben, 2006).

Below is the R code used in the transformation. We plot univariate and bivariate information about the resulting set of untransformed and transformed domain scores.

2.1 Code

# load the constituent measure data
load("../../../data/OCOI/ConstituentMeasures.14.17.RData")

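# subset to the tract identifier and the measures listed in D
# (D is used below as a named list: one element per domain, holding that domain's measure names)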
OCOI.14 <- OCOI.14[,c("tract",unlist(D))]
OCOI.17 <- OCOI.17[,c("tract",unlist(D))]

# standardize variables (create z-scores)
for(i in unlist(D)) {
  OCOI.14[,i] <- scale(OCOI.14[,i], center=TRUE, scale=TRUE)
  OCOI.17[,i] <- scale(OCOI.17[,i], center=TRUE, scale=TRUE)
}

# create a new data frame for the transformed domain scores
OD.14 <- OCOI.14[,1,drop=FALSE]
OD.17 <- OCOI.17[,1,drop=FALSE]

# an intermediate data frame for the untransformed domain sums (for visualization)
ODZ.14 <- OCOI.14[,1,drop=FALSE]
ODZ.17 <- OCOI.17[,1,drop=FALSE]

# an intermediate data frame for the scaled domain ranks
ODR.14 <- OCOI.14[,1,drop=FALSE]
ODR.17 <- OCOI.17[,1,drop=FALSE]

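# combine the measures in their respective domains and transform
for(d in names(D)) {
  # sum the z-scores within the domain; with na.rm=TRUE a missing measure
  # contributes 0, which is the measure's mean after standardization
  # (drop=FALSE keeps a data frame even if a domain has a single measure)
  ODZ.14[,d] <- rowSums(OCOI.14[,D[[d]],drop=FALSE], na.rm=TRUE)
  ODZ.17[,d] <- rowSums(OCOI.17[,D[[d]],drop=FALSE], na.rm=TRUE)
  
  # impute the median for any tracts with NA in the domain score
  # (a safeguard: rowSums with na.rm=TRUE returns 0, not NA, when every measure is missing)
  ODZ.14[is.na(ODZ.14[,d]), d] <- median(ODZ.14[,d], na.rm=TRUE)
  ODZ.17[is.na(ODZ.17[,d]), d] <- median(ODZ.17[,d], na.rm=TRUE)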

  # rank
  ODR.14[,d] <- rank(ODZ.14[,d]) - 1
  ODR.17[,d] <- rank(ODZ.17[,d]) - 1
  
  # scale to [0,1]
  ODR.14[,d] <- ODR.14[,d] / max(ODR.14[,d])
  ODR.17[,d] <- ODR.17[,d] / max(ODR.17[,d])
  
  # exponential transform
  OD.14[,d] <- -23 * log(1 - ODR.14[,d] * (1 - exp(-100/23)))
  OD.17[,d] <- -23 * log(1 - ODR.17[,d] * (1 - exp(-100/23)))
}

# name the rows according to tract for easier merging during later mapping
rownames(OD.14) <- OD.14$tract
rownames(OD.17) <- OD.17$tract

2.2 Visualize the result

2.2.1 Non-transformed domain values

Take a look at histograms of the domain sum variables.

2.2.1.1 2014

2.2.1.2 2017

2.2.2 Transformed domain values

Take a look at histograms of the domain averages that have been transformed.

2.2.2.1 2014

2.2.2.2 2017

2.3 Correlations among the domains

2.3.1 2014

2.3.2 2017

2.4 Construct the OCOI as an unweighted mean of the domain scores

We calculate the OCOI for a tract as the mean of its transformed domain scores, and then we reverse the OCOI such that higher values reflect more overall opportunity.
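The code for this step is not shown in this document. A minimal sketch of what it could look like follows, using a made-up stand-in for OD.14 with two hypothetical domain columns. The reversal used here (subtracting from 100, the upper bound of the transformed domain scores) is an assumption; the text only states that the index is reversed.

```r
# made-up stand-in for OD.14: a tract column plus transformed domain scores in [0, 100]
OD.toy <- data.frame(tract = c("t1", "t2", "t3"),
                     dom1  = c(5, 40, 90),
                     dom2  = c(15, 20, 70))

# deprivation index (DI): unweighted mean of the domain scores
domain_cols <- setdiff(names(OD.toy), "tract")
OD.toy$DI <- rowMeans(OD.toy[, domain_cols, drop = FALSE])

# reverse so that higher values reflect more opportunity
# (assumed reversal: 100 - DI; the actual reversal is not documented here)
OD.toy$OCOI <- 100 - OD.toy$DI

OD.toy
```

Any linear reversal of this kind leaves the ordering of tracts exactly inverted, which is consistent with the DI and OCOI histograms being mirror images.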

2.5 Visualize the “learned” latent factor

Below are histograms of the deprivation index (DI, the mean of the transformed domain scores before reversal) and the OCOI, side by side. As expected, they are mirror images.

2.5.1 2014

2.5.2 2017

2.6 Variance of OCOI attributable to domain scores

2.6.1 2014

      totVar  uniqVar
FS     0.592    0.029
HS     0.550    0.035
CR     0.450    0.042
HL0    0.402    0.054
ED     0.385    0.049
HL1    0.068    0.067
EN     0.031    0.065
AC     0.016    0.057

2.6.2 2017

## Warning in summary.lm(ans): essentially perfect fit: summary may be unreliable

      totVar  uniqVar
FS     0.626    0.027
HS     0.547    0.037
CR     0.442    0.043
HL0    0.421    0.051
ED     0.373    0.053
HL1    0.082    0.067
AC     0.016    0.058
EN     0.013    0.064
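
The code that produced totVar and uniqVar is not shown, so their exact definitions are not confirmed here. One common reading, offered as an assumption only, is that totVar is the R-squared from regressing the index on a single domain score, and uniqVar is the drop in R-squared when that domain is removed from the regression on all domains. The sketch below computes both on simulated data with three hypothetical domains (A, B, C); r2 is a hypothetical helper. It also suggests why summary.lm can warn about an essentially perfect fit: an index defined as the mean of the domain scores is an exact linear function of them, so the all-domain regression has no residual variance.

```r
set.seed(1)
n <- 300
# simulated stand-ins for transformed domain scores (hypothetical names)
dom <- data.frame(A = runif(n, 0, 100), B = runif(n, 0, 100), C = runif(n, 0, 100))
idx <- rowMeans(dom)  # index as the unweighted mean of the domain scores

# R-squared of the index regressed on a chosen set of domains;
# suppressWarnings() hides the perfect-fit warning the all-domain model may raise
r2 <- function(preds) {
  dat <- data.frame(idx = idx, dom[, preds, drop = FALSE])
  suppressWarnings(summary(lm(idx ~ ., data = dat))$r.squared)
}

all_doms <- names(dom)
# assumed definition of totVar: variance explained by each domain alone
totVar  <- sapply(all_doms, function(d) r2(d))
# assumed definition of uniqVar: R-squared lost when the domain is left out
uniqVar <- sapply(all_doms, function(d) r2(all_doms) - r2(setdiff(all_doms, d)))
round(cbind(totVar, uniqVar), 3)
```

With independent simulated domains the two columns come out similar. Correlated domains, as in the tables above, would give a large totVar alongside a small uniqVar.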

3 Validation of the OCOI

This section provides evidence of the criterion validity of the 2014 and 2017 OCOI. For each period, it reports (1) the correlation between each of 5 outcome variables and the single OCOI score and (2) the proportion of variance explained (R-squared) in a multiple linear regression of each outcome on all 8 domain scores as predictors. The correlations provide information about the validity of the OCOI, while the multiple regression results provide information about the validity of the collection of domain scores. All results are in the tables below.

Five outcomes were examined:

  • all-cause age-adjusted mortality (death_rate) from OOI
  • asthma (from the HL1 domain)
  • life expectancy (“le”) from CDC
  • child severe mental illness
  • pre-term birth
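
The two quantities reported for each outcome can be computed as in the following sketch. It runs on simulated data: the data frame toy, its two domain columns, the outcome, and the 100 minus mean reversal are all hypothetical stand-ins, not the variables or method used for the results in this appendix.

```r
set.seed(42)
n <- 200
# simulated stand-ins: two hypothetical domain scores and one outcome
toy <- data.frame(dom1 = runif(n, 0, 100), dom2 = runif(n, 0, 100))
toy$OCOI    <- 100 - rowMeans(toy[, c("dom1", "dom2")])  # assumed reversal
toy$outcome <- 70 + 0.1 * toy$OCOI + rnorm(n, sd = 3)

# (1) correlation between the outcome and the single OCOI score, with its p-value
ct <- cor.test(toy$outcome, toy$OCOI)
round(c(r = unname(ct$estimate), p = ct$p.value), 2)

# (2) R-squared from regressing the outcome on all domain scores
fit <- lm(outcome ~ dom1 + dom2, data = toy)
round(summary(fit)$r.squared, 2)
```

If the index is a linear function of the domain scores, as assumed in this sketch, the multiple-regression R-squared cannot fall below the squared correlation with the index, because the regression is free to weight the domains unequally.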

3.1 2014

Outcome                       Correlation with OCOI (p-value)   Multiple regression R-squared
death rate                    -0.15 (0.00)                      0.05
asthma                        -0.52 (0.00)                      0.35
life expectancy                0.65 (0.00)                      0.56
child severe mental illness   -0.26 (0.00)                      0.32
pre-term birth                -0.48 (0.00)                      0.40

3.2 2017

Outcome                       Correlation with OCOI (p-value)   Multiple regression R-squared
death rate                    -0.15 (0.00)                      0.05
asthma                        -0.46 (0.00)                      0.31
life expectancy                0.65 (0.00)                      0.57
child severe mental illness   -0.27 (0.00)                      0.42
pre-term birth                -0.45 (0.00)                      0.41

4 Domain and OCOI Choropleth Maps

This section contains choropleth maps of each of the domain scores (transformed), the overall deprivation index (DI), and the reversed deprivation index (i.e., the Ohio Children’s Opportunity Index or OCOI). These plots provide a means for determining the face validity of each domain score and the overall OCOI. For the OCOI, higher values (brighter areas) correspond with higher levels of opportunity.

4.1 FS

4.1.1 2014

4.1.2 2017

4.2 HL0

4.2.1 2014

4.2.2 2017

4.3 HL1

4.3.1 2014

4.3.2 2017

4.4 HS

4.4.1 2014

4.4.2 2017

4.5 AC

4.5.1 2014

4.5.2 2017

4.6 ED

4.6.1 2014

4.6.2 2017

4.7 EN

4.7.1 2014

4.7.2 2017

4.8 CR

4.8.1 2014

4.8.2 2017

4.9 DI

4.9.1 2014

4.9.2 2017

4.10 OCOI

4.10.1 2014

4.10.2 2017

4.11 Difference

4.11.1 2017 OCOI minus the 2014 OCOI

Red values suggest opportunity increased in the 2017 period.

Blue values suggest it decreased.
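
The difference mapped in this section can be computed by aligning the two periods on tract and subtracting. The sketch below uses made-up scores; the data frame names ocoi.14, ocoi.17, and chg are hypothetical and are not objects from the code in Section 2.1.

```r
# made-up stand-ins for the per-period OCOI scores, keyed by tract
ocoi.14 <- data.frame(tract = c("t1", "t2", "t3"), OCOI = c(40, 55, 70))
ocoi.17 <- data.frame(tract = c("t3", "t1", "t2"), OCOI = c(65, 48, 55))

# merge on tract so rows align even if the two periods are ordered differently
chg <- merge(ocoi.14, ocoi.17, by = "tract", suffixes = c(".14", ".17"))

# positive values: opportunity increased in the 2017 period; negative: decreased
chg$diff <- chg$OCOI.17 - chg$OCOI.14
chg
```

Because the measures are standardized and ranked within each period, a difference of this kind reflects a change in a tract's standing relative to other tracts rather than an absolute change.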