Specification Curve Analysis: Overview and Stata Example
Overview
Research reproducibility has been gaining attention over the past decade. Many studies have reported an inability to replicate published results and a lack of necessary detail in methods descriptions. Some journals are addressing this issue by requiring access to study data and executable code. However, while this may provide some reassurance about the reliability of the results, the choice of analytic methods can still be shaped by many assumptions that are not evident to the reader.
There usually isn’t one correct way to analyse data. Instead, empirical studies often have plenty of flexibility in the way data are collected and cleaned as well as in the final model specification. A data cleaning step may involve excluding units with missing data or converting a continuous variable to a categorical one (or vice versa). There may also be several equally plausible models for the outcome that differ in their sets of covariates or functional forms. Each of these small decisions may snowball into a reported effect that is overly favorable to the researchers’ narrative.
A relatively novel and very promising method that can help mitigate this issue was proposed in Simonsohn, Simmons, and Nelson (2015) and is called Specification Curve Analysis (SCA). The idea behind the method is simple: the researcher is asked to consider multiple plausible ways to analyze the data and show that, jointly, the null hypothesis of no effect can be rejected. It doesn’t mean that all models must result in a statistically significant effect (though that would make the conclusions very convincing!). Even if the effect is detected only when all specifications are tested jointly, this still results in a more objective inference.
Method Details
The method involves the following steps:
(i) identifying the set of theoretically justified, statistically valid, and non-redundant analytic specifications;
(ii) running the analysis for each specification and displaying the results graphically, which allows readers to identify consequential specification decisions;
(iii) conducting statistical tests to determine whether, as a whole, the results are inconsistent with the null hypothesis.
The first two steps above are self-explanatory; the third is novel. The authors (Simonsohn, Simmons, and Nelson, 2020) proposed three test statistics for the SCA:
- the median effect estimated across all specifications;
- the share of specifications that obtain a statistically significant effect in the predicted direction;
- the average of Z-values across all specifications.
For each of them, a sampling distribution can be generated by “resampling under-the-null.” This involves modifying the observed data so that the null hypothesis is known to be true, and then drawing random samples of the modified data. The test statistic of interest is then computed on each of those samples. The resulting distribution is the estimated distribution of the test statistic under the null.
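The resampling step can be sketched in Stata as follows. This is a minimal illustration under my own assumptions, not the authors' code: the four specifications, the use of the average t-statistic (standing in for the average Z-value), the 200 bootstrap draws, and all local macro names are hypothetical. The null is imposed by subtracting each specification's estimated effect of foreign from the outcome, which is one way to modify observational data so that the true effect is zero by construction.

```stata
* Sketch of "resampling under the null" on Stata's built-in auto data.
* Assumed for illustration: 4 specifications, average t as the statistic,
* 200 bootstrap draws.
sysuse auto, clear
set seed 20240101

* Hypothetical specification set: controls added to "price on foreign"
local spec1 ""
local spec2 "mpg"
local spec3 "weight"
local spec4 "mpg weight"
local nspec 4

* Observed statistic, plus a null-imposed outcome for each specification
local tsum 0
forvalues i = 1/`nspec' {
    quietly regress price foreign `spec`i''
    local tsum = `tsum' + _b[foreign] / _se[foreign]
    * remove the estimated effect so that the true effect is zero
    generate double ynull`i' = price - _b[foreign] * foreign
}
local tobs = `tsum' / `nspec'

* Bootstrap the null-imposed data and recompute the statistic on each draw
local reps 200
local extreme 0
forvalues r = 1/`reps' {
    preserve
    bsample
    local tsum 0
    forvalues i = 1/`nspec' {
        quietly regress ynull`i' foreign `spec`i''
        local tsum = `tsum' + _b[foreign] / _se[foreign]
    }
    restore
    * count draws at least as extreme as the observed statistic
    if abs(`tsum' / `nspec') >= abs(`tobs') {
        local ++extreme
    }
}
display "observed average t = " %6.3f (`tobs')
display "bootstrap p-value  = " %6.3f (`extreme' / `reps')
```

The same loop can be reused for the median estimate or the share of significant results by changing what is accumulated inside the inner loop. With a realistic number of specifications the inner loop dominates the run time, so it may be worth testing with a small number of draws first.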
Available Tools
There are several resources available to aid the implementation of the method. I organize them in a table below:
Language | Package Name | Description |
---|---|---|
R | specr | Available on CRAN. Provides functions to set up, run, evaluate and plot the specifications of interest. There is a lot of flexibility in model set-up. However, the package cannot perform step (iii) above (i.e., the joint testing). |
R | rdfanalysis | Available only on GitHub. A more comprehensive collection of functions built around a self-documenting code base, which lets researchers systematically document and explore their researcher degrees of freedom when conducting analyses. Has a shiny front end for exploring the findings interactively. |
Stata | speccurve | One function that can only plot the curve using coefficients stored in the e()-returns. Requires setting up and looping through the models beforehand. |
Stata | specurve | Depends on Stata 16’s Python (v.3.6) integration and several additional Python modules. The function performs regressions as specified in a provided YAML-formatted file and plots the specification curve. Limited to reghdfe models only, but allows for various combinations of fixed effects and clustering. |
Stata | specc | Available on SSC and is open for development on GitHub. The package appears to be very flexible in setting up models and enumerating specifications as well as plotting the curve. However, it lacks a simple example to get started. |
Python | specification_curve | Conducts the analysis and plots specification curves. Flexible in model specification and very well documented. While it also can’t perform the joint test (step (iii) of the specification analysis), the author provides an example of a manual implementation. |
It looks like most major statistical programming languages have some version of the specification curve implemented. However, as far as I can tell, none of them can perform step (iii), which is arguably as important as the curve itself. Therefore, for now, researchers have to implement it themselves or contact RCS (research@hbs.edu) for assistance!
Stata Example
Next, I show an example in Stata that loops through several model specifications and then uses the speccurve function to plot the curve. The loops store each model with eststo (and, later, estadd), which are part of the estout package available from SSC. Before running this code, make sure that speccurve is installed by running the following line:
net install speccurve, from("https://raw.githubusercontent.com/martin-andresen/speccurve/master")
The code uses the classic auto data set and specifies several regression models that predict car price using available characteristics. The effect of interest is the coefficient estimated for the indicator foreign.
clear all
sysuse auto, clear
loc no=0
* enumerating many different specifications using nested loops
* (each control is either omitted, "", or included)
foreach m in "" "mpg" {
    foreach tr in "" "trunk" {
        foreach wt in "" "weight" {
            foreach ln in "" "length" {
                foreach hr in "" "headroom" {
                    qui reg price foreign `m' `tr' `wt' `ln' `hr'
                    * store the model as md0, md1, ...
                    eststo md`no'
                    loc ++no
                }
            }
        }
    }
}
* plotting a SC with foreign as a parameter of interest
speccurve *, param(foreign) controls title(SCA for the effect of foreign)
graph export "speccurve1.svg", replace
(1978 Automobile Data)
(file speccurve1.svg written in SVG format)
The code above produced the following specification curve:
It looks like including the weight variable in the model had a notable effect on the coefficient for foreign. The speccurve function is somewhat limited in that it doesn’t work with models that have factor variables as controls. Next, I show a workaround for that case:
clear all
sysuse auto, clear
* integer-coded copy of headroom, to be used as a factor variable
egen headroom_c = group(headroom)
loc no=0
foreach m in "" "mpg" {
    foreach tr in "" "trunk" {
        foreach wt in "" "weight" {
            foreach ln in "" "length" {
                foreach hr in "" "headroom" "i.headroom_c" {
                    qui reg price foreign `m' `tr' `wt' `ln' `hr'
                    * indicator scalars: 1 if the control is in the model, 0 otherwise
                    qui estadd scalar mpgv = 0, replace
                    qui estadd scalar trunkv = 0, replace
                    qui estadd scalar weightv = 0, replace
                    qui estadd scalar lengthv = 0, replace
                    foreach vr in m tr wt ln {
                        if "``vr''"!="" qui estadd scalar ``vr''v = 1, replace
                    }
                    qui estadd scalar headroomv = 0
                    qui estadd scalar iheadroom_cv = 0
                    * drop the dot so that i.headroom_c maps to iheadroom_cv
                    local vname = subinstr("`hr'", ".", "", .)
                    * skip the specification without any headroom term
                    if "`vname'"!="" qui estadd scalar `vname'v = 1, replace
                    eststo md`no'
                    loc ++no
                }
            }
        }
    }
}
* The code below produces an error:
*speccurve *, param(foreign) controls title(SCA for the effect of foreign)
* Workaround: pass the indicator scalars to panel() instead of using controls
speccurve *, param(foreign) level(95) graphopts(legend(pos(1))) title(SCA for auto dataset) panel(mpgv trunkv weightv lengthv headroomv iheadroom_cv)
graph export "speccurve2.svg", replace
(1978 Automobile Data)
(file speccurve2.svg written in SVG format)
The code implements models that have headroom included as a factor or as a continuous variable. Note that the first (commented-out) call to speccurve would produce an error due to a bug in the function. However, the second call produces the following specification curve:
One can also output a table with numerical results:
matlist r(table)
| specno modelno estimate min95 max95 mpgv trunkv weightv lengthv headroomv iheadro~v
-------------+-------------------------------------------------------------------------------------------------------------------------
md0 | 1 1 312.2587 -1191.708 1816.225 0 0 0 0 0 0
md2 | 2 3 364.925 -1419.362 2149.212 0 0 0 0 0 1
md1 | 3 2 577.8125 -992.5493 2148.174 0 0 0 0 1 0
md14 | 4 15 740.7716 -960.3329 2441.876 0 1 0 0 0 1
md13 | 5 14 1128.818 -393.3118 2650.948 0 1 0 0 1 0
md12 | 6 13 1190.155 -326.8468 2707.157 0 1 0 0 0 0
md26 | 7 27 1327.396 -294.4929 2949.285 1 0 0 0 0 1
md38 | 8 39 1376.011 -230.2271 2982.249 1 1 0 0 0 1
md25 | 9 26 1714.109 292.4855 3135.733 1 0 0 0 1 0
md24 | 10 25 1767.292 371.2169 3163.368 1 0 0 0 0 0
md37 | 11 38 1825.733 408.1118 3243.355 1 1 0 0 1 0
md36 | 12 37 1887.461 468.5866 3306.335 1 1 0 0 0 0
md41 | 13 42 2196.194 517.1768 3875.212 1 1 0 1 0 1
md29 | 14 30 2247.635 591.1235 3904.146 1 0 0 1 0 1
md17 | 15 18 2294.095 616.0623 3972.129 0 1 0 1 0 1
md5 | 16 6 2352.064 696.5941 4007.534 0 0 0 1 0 1
md40 | 17 41 2615.666 1084.272 4147.059 1 1 0 1 1 0
md27 | 18 28 2644.771 1125.227 4164.315 1 0 0 1 0 0
md28 | 19 29 2644.847 1133.077 4156.616 1 0 0 1 1 0
md39 | 20 40 2670.519 1133.691 4207.347 1 1 0 1 0 0
md16 | 21 17 2774.021 1233.682 4314.361 0 1 0 1 1 0
md3 | 22 4 2801.143 1273.549 4328.737 0 0 0 1 0 0
md4 | 23 5 2801.899 1281.258 4322.54 0 0 0 1 1 0
md15 | 24 16 2827.236 1282.39 4372.082 0 1 0 1 0 0
md23 | 25 24 3072.365 1665.236 4479.495 0 1 1 1 0 1
md47 | 26 48 3079.179 1643.422 4514.937 1 1 1 1 0 1
md20 | 27 21 3116.728 1652.392 4581.064 0 1 1 0 0 1
md8 | 28 9 3132.815 1688.75 4576.88 0 0 1 0 0 1
md11 | 29 12 3146.808 1750.304 4543.312 0 0 1 1 0 1
md35 | 30 36 3148.211 1721.962 4574.459 1 0 1 1 0 1
md44 | 31 45 3162.517 1671.851 4653.184 1 1 1 0 0 1
md32 | 32 33 3179.193 1706.637 4651.749 1 0 1 0 0 1
md46 | 33 47 3502.516 2193.975 4811.056 1 1 1 1 1 0
md22 | 34 23 3526.83 2250.662 4802.999 0 1 1 1 1 0
md34 | 35 35 3545.345 2248.433 4842.256 1 0 1 1 1 0
md33 | 36 34 3550.194 2242.594 4857.793 1 0 1 1 0 0
md45 | 37 46 3557.085 2235.3 4878.871 1 1 1 1 0 0
md10 | 38 11 3570.379 2305.781 4834.976 0 0 1 1 1 0
md9 | 39 10 3573.092 2297.992 4848.191 0 0 1 1 0 0
md21 | 40 22 3580.051 2290.845 4869.256 0 1 1 1 0 0
md7 | 41 8 3623.75 2316.374 4931.127 0 0 1 0 1 0
md19 | 42 20 3631.585 2310.07 4953.101 0 1 1 0 1 0
md6 | 43 7 3637.001 2303.885 4970.118 0 0 1 0 0 0
md31 | 44 32 3648.619 2310.079 4987.159 1 0 1 0 1 0
md43 | 45 44 3654.777 2302.875 5006.679 1 1 1 0 1 0
md30 | 46 31 3673.06 2308.909 5037.212 1 0 1 0 0 0
md18 | 47 19 3686.447 2352.692 5020.201 0 1 1 0 0 0
md42 | 48 43 3711.123 2346.938 5075.308 1 1 1 0 0 0
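Estimates like those in the table above are the raw material for the three test statistics listed under Method Details. The sketch below shows one way to compute their observed values in Stata. It is only an illustration under my own assumptions: it re-estimates a reduced set of four specifications instead of reading speccurve's output, uses a two-sided 5% test, treats a positive coefficient as the predicted direction, and uses t-statistics in place of Z-values. The names handle and specres are hypothetical.

```stata
* Sketch: observed values of the three SCA test statistics.
* Assumed for illustration: 4 specifications, 5% two-sided test,
* positive predicted direction.
sysuse auto, clear

tempname handle
tempfile specres
postfile `handle' b t p using `specres'

foreach m in "" "mpg" {
    foreach wt in "" "weight" {
        quietly regress price foreign `m' `wt'
        local t = _b[foreign] / _se[foreign]
        * save the estimate, its t-statistic and the two-sided p-value
        post `handle' (_b[foreign]) (`t') (2 * ttail(e(df_r), abs(`t')))
    }
}
postclose `handle'

* one row per specification
use `specres', clear

quietly summarize b, detail
display "median estimate          = " %9.2f (r(p50))

quietly count if p < 0.05 & b > 0
display "share significant, b > 0 = " %5.2f (r(N) / _N)

quietly summarize t
display "average t-statistic      = " %6.2f (r(mean))
```

These observed values would then be compared with their distributions under the null, obtained by resampling as described in Method Details.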