These four cities were selected both because they had relatively large populations of blacks and Hispanics and because they exhibited a pattern of economic growth similar to that in Miami over the late 1970s and early 1980s. (p.249)
Simple average with \(w_i = 1/N\): \[ \hat{Y}_{1t}(0) = \frac{1}{N} \sum_{i = 2}^{N + 1}Y_{it}. \]
Population-weighted average with \(w_i = w_i^{pop}\): \[ \hat{Y}_{1t}(0) = \sum_{i = 2}^{N + 1} w_i^{pop} Y_{it}. \]
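The two weighting schemes above can be compared on a toy example. This is only a minimal sketch: the outcome matrix `Y_control` and the populations `pop` are made-up numbers, and it assumes that \(w_i^{pop}\) denotes each control unit's population share, so that the weights sum to one.

```r
# Toy outcomes for N = 3 control units (rows) over 2 periods (columns).
# All numbers are hypothetical and chosen only for illustration.
Y_control <- matrix(c(1, 2,
                      3, 4,
                      5, 6), nrow = 3, byrow = TRUE)
pop <- c(10, 30, 60)  # hypothetical populations of the control units

# Simple average: w_i = 1 / N for every control unit.
w_simple <- rep(1 / nrow(Y_control), nrow(Y_control))

# Population-weighted average: w_i = population share (assumed definition).
w_pop <- pop / sum(pop)

# Counterfactual outcome of the treated unit in each period.
y0_simple <- as.vector(w_simple %*% Y_control)
y0_pop <- as.vector(w_pop %*% Y_control)

print(y0_simple)
print(y0_pop)
```

Both estimators are weighted averages of the control outcomes; they differ only in how the weights are chosen.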
Then, the p-value for the sharp null hypothesis \(\tau_{it} = 0\) for \(i = 1, \cdots, N + 1\) and \(t = 1, \cdots, T\) is: \[ p = \frac{1}{N + 1} \sum_{i = 1}^{N + 1}1\{r_i \ge r_1\}. \]
Recall that SCM is a formalization of the case study.
Without this formalization, we could not apply the “same procedure” to estimate the placebo statistics.
In this sense, this inference is possible only because SCM formalizes the procedure.
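As a rough illustration of the p-value formula above, the sketch below computes \(p\) from a vector of test statistics. The values in `r` are hypothetical; the first element plays the role of \(r_1\) for the treated unit and the remaining elements are the placebo statistics \(r_2, \cdots, r_{N + 1}\).

```r
# Hypothetical test statistics: r[1] is the treated unit,
# r[2], ..., r[N + 1] come from applying the same procedure to each control.
r <- c(8.2, 1.1, 0.7, 9.5, 2.3)

# Share of units whose statistic is at least as large as the treated unit's.
# The treated unit itself is included, so p is never smaller than 1 / (N + 1).
p <- mean(r >= r[1])

print(p)
```

With \(N + 1\) units the smallest attainable p-value is \(1 / (N + 1)\), so a small donor pool limits how strongly the null can be rejected.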
install.packages("Synth")
data("basque", package = "Synth")
basque %>%
  head() %>%
  kbl() %>%
  kable_styling()
regionno | regionname | year | gdpcap | sec.agriculture | sec.energy | sec.industry | sec.construction | sec.services.venta | sec.services.nonventa | school.illit | school.prim | school.med | school.high | school.post.high | popdens | invest |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Spain (Espana) | 1955 | 2.354542 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
1 | Spain (Espana) | 1956 | 2.480149 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
1 | Spain (Espana) | 1957 | 2.603613 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
1 | Spain (Espana) | 1958 | 2.637104 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
1 | Spain (Espana) | 1959 | 2.669880 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
1 | Spain (Espana) | 1960 | 2.869966 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
# Odd years 1961, 1963, ..., 1969, used for the sectoral shares
period <- seq(1961, 1969, 2)
df <- Synth::dataprep(
  foo = basque,
  predictors = c(
    "school.illit", "school.prim", "school.med",
    "school.high", "school.post.high", "invest"
  ),
  predictors.op = "mean",
  time.predictors.prior = 1964:1969,
  special.predictors = list(
    list("gdpcap", 1960:1969, "mean"),
    list("sec.agriculture", period, "mean"),
    list("sec.energy", period, "mean"),
    list("sec.industry", period, "mean"),
    list("sec.construction", period, "mean"),
    list("sec.services.venta", period, "mean"),
    list("sec.services.nonventa", period, "mean"),
    list("popdens", 1969, "mean")
  ),
  dependent = "gdpcap",
  unit.variable = "regionno",
  unit.names.variable = "regionname",
  time.variable = "year",
  treatment.identifier = 17,          # Basque Country
  controls.identifier = c(2:16, 18),  # all other regions except Spain as a whole (1)
  time.optimize.ssr = 1960:1969,
  time.plot = 1955:1997
)
names(df)
## [1] "X0" "X1" "Z0" "Z1" "Y0plot" "Y1plot"
## [7] "names.and.numbers" "tag"
df$X1 %>% kbl() %>% kable_styling()
Predictor | Region 17 (Basque Country) |
---|---|
school.illit | 39.888465 |
school.prim | 1031.742299 |
school.med | 90.358668 |
school.high | 25.727525 |
school.post.high | 13.479720 |
invest | 24.647383 |
special.gdpcap.1960.1969 | 5.285469 |
special.sec.agriculture.1961.1969 | 6.844000 |
special.sec.energy.1961.1969 | 4.106000 |
special.sec.industry.1961.1969 | 45.082000 |
special.sec.construction.1961.1969 | 6.150000 |
special.sec.services.venta.1961.1969 | 33.754000 |
special.sec.services.nonventa.1961.1969 | 4.072000 |
special.popdens.1969 | 246.889999 |
result <- Synth::synth( data.prep.obj = df, method = "BFGS" )
##
## X1, X0, Z1, Z0 all come directly from dataprep object.
##
##
## ****************
## searching for synthetic control unit
##
##
## ****************
## ****************
## ****************
##
## MSPE (LOSS V): 0.008864606
##
## solution.v:
## 0.02773094 1.194e-07 1.60609e-05 0.0007163836 1.486e-07 0.002423908 0.0587055 0.2651997 0.02851006 0.291276 0.007994382 0.004053188 0.009398579 0.303975
##
## solution.w:
## 2.53e-08 4.63e-08 6.44e-08 2.81e-08 3.37e-08 4.844e-07 4.2e-08 4.69e-08 0.8508145 9.75e-08 3.2e-08 5.54e-08 0.1491843 4.86e-08 9.89e-08 1.162e-07
table <- Synth::synth.tab(dataprep.res = df, synth.res = result)
table$tab.pred %>%
  kbl() %>%
  kable_styling()
Predictor | Treated | Synthetic | Sample Mean |
---|---|---|---|
school.illit | 39.888 | 256.337 | 170.786 |
school.prim | 1031.742 | 2730.104 | 1127.186 |
school.med | 90.359 | 223.340 | 76.260 |
school.high | 25.728 | 63.437 | 24.235 |
school.post.high | 13.480 | 36.153 | 13.478 |
invest | 24.647 | 21.583 | 21.424 |
special.gdpcap.1960.1969 | 5.285 | 5.271 | 3.581 |
special.sec.agriculture.1961.1969 | 6.844 | 6.179 | 21.353 |
special.sec.energy.1961.1969 | 4.106 | 2.760 | 5.310 |
special.sec.industry.1961.1969 | 45.082 | 37.636 | 22.425 |
special.sec.construction.1961.1969 | 6.150 | 6.952 | 7.276 |
special.sec.services.venta.1961.1969 | 33.754 | 41.104 | 36.528 |
special.sec.services.nonventa.1961.1969 | 4.072 | 5.371 | 7.111 |
special.popdens.1969 | 246.890 | 196.283 | 99.414 |
Synth::path.plot(
  synth.res = result,
  dataprep.res = df,
  Ylab = "real per-capita GDP (1986 USD, thousand)",
  Xlab = "year",
  Ylim = c(0, 12),
  Legend = c("Basque country", "Synthetic Basque country"),
  Legend.position = "bottomright"
)
Synth::gaps.plot(
  synth.res = result,
  dataprep.res = df,
  Ylab = "real per-capita GDP (1986 USD, thousand)",
  Xlab = "year",
  Ylim = c(-1.5, 1.5),
  Main = NA
)
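The gap drawn by `gaps.plot` is the treated outcome path minus the weighted average of the control outcome paths. The sketch below reproduces that arithmetic on toy matrices that only mimic the shapes of `df$Y1plot` (periods by 1), `df$Y0plot` (periods by controls), and `result$solution.w` (controls by 1); all numbers are made up. With the objects fitted above, the same expression would be `df$Y1plot - df$Y0plot %*% result$solution.w`.

```r
# Toy stand-ins (hypothetical values) with the same layout as the Synth objects:
# rows are time periods, columns are units.
Y1plot <- matrix(c(5.0, 5.5, 5.2), ncol = 1)        # treated unit
Y0plot <- matrix(c(4.0, 6.0,
                   4.5, 6.5,
                   5.0, 7.0), ncol = 2, byrow = TRUE)  # two control units
w <- matrix(c(0.5, 0.5), ncol = 1)                  # weights summing to one

# Synthetic control path: weighted average of the control outcomes.
synthetic <- Y0plot %*% w

# Gap between the treated unit and its synthetic control in each period.
gaps <- Y1plot - synthetic

print(as.vector(gaps))
```

A gap close to zero before treatment indicates a good pre-treatment fit; the post-treatment gap is the estimated effect.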
devtools::install_github("synth-inference/synthdid")
data("california_prop99", package = "synthdid")
california_prop99 %>%
  head() %>%
  kbl() %>%
  kable_styling()
State | Year | PacksPerCapita | treated |
---|---|---|---|
Alabama | 1970 | 89.8 | 0 |
Arkansas | 1970 | 100.3 | 0 |
Colorado | 1970 | 124.8 | 0 |
Connecticut | 1970 | 120.0 | 0 |
Delaware | 1970 | 155.0 | 0 |
Georgia | 1970 | 109.9 | 0 |
df <- california_prop99 %>% synthdid::panel.matrices(unit = "State", time = "Year", outcome = "PacksPerCapita", treatment = "treated")
names(df)
## [1] "Y" "N0" "T0" "W"
result_did <- synthdid::did_estimate(Y = df$Y, N0 = df$N0, T0 = df$T0)
plot(result_did)
synthdid::synthdid_units_plot(result_did)
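To see what the inputs `Y`, `N0`, and `T0` represent, the sketch below computes a plain difference-in-differences estimate by hand on a toy matrix. It assumes the layout returned by `panel.matrices`: units in rows with the first `N0` rows being controls, and periods in columns with the first `T0` columns being pre-treatment. The numbers are made up, and this is an illustration of the standard DID formula with equal weights, not a reimplementation of `did_estimate`.

```r
# Toy outcome matrix (hypothetical values): 3 units x 4 periods.
# Rows 1-2 are controls (N0 = 2); columns 1-2 are pre-treatment (T0 = 2).
Y <- matrix(c(10, 10,  9,  9,
              12, 12, 11, 11,
              20, 20, 15, 15), nrow = 3, byrow = TRUE)
N0 <- 2
T0 <- 2

control <- 1:N0
treated <- (N0 + 1):nrow(Y)
pre <- 1:T0
post <- (T0 + 1):ncol(Y)

# Change over time for treated units minus change over time for controls.
did <- (mean(Y[treated, post]) - mean(Y[treated, pre])) -
  (mean(Y[control, post]) - mean(Y[control, pre]))

print(did)
```

Synthetic control and synthetic DID replace these equal weights with estimated unit weights (and, for synthetic DID, time weights).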
result_sc <- synthdid::sc_estimate(Y = df$Y, N0 = df$N0, T0 = df$T0)
plot(result_sc)
synthdid::synthdid_units_plot(result_sc)
result_synthdid <- synthdid::synthdid_estimate(Y = df$Y, N0 = df$N0, T0 = df$T0)
plot(result_synthdid)
synthdid::synthdid_units_plot(result_synthdid)