Farley Grubb recently published an article on colonial Virginia's paper money claiming that it ‘functioned like a zero-coupon bond and traded below face value due to time-discounting, not depreciation’ (Grubb 2018, p. 113). The argument proceeds by introducing two measures. One, MEV, is designed to measure the market value of paper money as derived from exchange rate data, expressed as a percentage of its redemption value; e.g. MEV = 80 would mean that paper money was valued at 80 percent of its redemption value. The second, APV, is designed to measure the average present value of paper money: the discounted value is expressed as a percentage of its value when received by the colonial treasury in payment of taxes. Grubb argues that MEV can be almost entirely accounted for by APV and that fluctuations in APV explain most of the fluctuations in MEV. The heart of his article lies in the application of time series econometrics to establish this association.
Although this comment will focus on Grubb's econometric results, it is worth briefly mentioning some unrelated objections to the theory and history. As explained in Michener (2018), Grubb bases his measure of paper money's discounted value, APV, on its average utility rather than its marginal utility, putting his theory on the wrong side of the marginalist revolution. Computing the discounted value of money in future tax payments from its average utility inflates its value, just as the average value of a glass of water greatly exceeds its marginal value. A value based on its marginal utility would be a trivial fraction of MEV. Moreover, Grubb computes MEV from exchange rate data by adjusting the par of exchange to cover the cost of importing specie from England. This would make sense if exchange rates in colonial America hovered about the specie import point, but it is well known that colonial America was much more likely to export specie to England than to import it, so his adjustment improperly deflates MEV.
Grubb's estimates of Virginia currency in the hands of the public are also questionable. After the colony treasurer died, the Virginia House of Burgesses appointed a committee to examine his records. The committee reported on 9 April 1767 that ‘the Notes now in Circulation amount to 206,727..2..2£va.’ This is 23.4 percent more than Grubb reports circulating in 1767, and about 7 percent more than he reports circulating in 1766. On 7 April 1768, another committee report concluded that the amount in circulation was 170,419..16..1£va. This is 20.2 percent more than Grubb reports circulating in 1768, and 1.7 percent more than he reports for 1767 (Kennedy 1906, pp. 120, 155; Grubb 2018, Table 2). Moreover, because the Journals of the House of Burgesses provide no explicit information on redemptions, Grubb relies on interpolation and guesstimates after 1770 to track retirement of the currency (Grubb 2017, p. 104). As Brock (1992, Table 9) previously noted, however, treasury audits reporting redemptions were published in the newspaper beginning in 1768 (Rind's Virginia Gazette, 30 June 1768, 12 January 1769, 29 June 1769, 10 January 1771, 17 December 1772, 24 June 1773, 30 December 1773, 29 December 1774).
In the late 1760s and early 1770s, the Virginia treasury possessed as much as 15,000 to 20,000£va of gold and silver that Nicholas, Virginia's treasurer, offered to exchange on demand for its paper money (Rind's Virginia Gazette, 20 June 1771, 14 September 1769; Purdie and Dixon's Virginia Gazette, 25 May 1769, 15 June 1769). He found few takers, because paper money was ‘generally preferred to Gold and Silver’ (Purdie and Dixon's Virginia Gazette, 30 September 1773; Bland 1898; Grubb 2017, p. 108). How can one maintain that Virginia's paper money ‘functioned like a zero-coupon bond and traded below face value due to time discounting’ when paper money was convertible on demand at the treasury, and people spurned the offer?
Nevertheless, many consider historical interpretation to be purely subjective, a matter of competing narratives. On this view, the test of competing narratives is how well they explain data, a more objective measure. Although I do not share this opinion, I bow to it in setting aside these objections to focus on the econometric evidence.
Here is a brief summary of the econometric challenges for the benefit of non-specialists. Many problems discussed below relate to unit root tests; the presence of a unit root implies that a time series possesses a stochastic trend. One applies unit root tests to univariate series to characterize the kind of trend they exhibit, if any. One applies unit root tests to regression residuals to determine whether the regression error term is free of unit roots, because error terms possessing a unit root typically arise from spurious regressions. One therefore tests regression residuals for the absence of unit roots – a test of cointegration – to rule out one kind of spurious regression. Although time series econometricians often rely on unit root tests to classify time series according to whether or not they possess a unit root, all such tests have a weakness. Even in large samples, certain stationary series can mimic the behavior of a unit root process, and certain unit root processes can mimic the behavior of a stationary series (Campbell and Perron 1991, pp. 157-8). One ought to consider all such classifications as provisional rather than definitive.
There are many unit root tests, but Grubb's article and this comment make use of Augmented Dickey–Fuller (ADF) tests and their close cousins. What characterizes all these tests is that the test statistic resembles an ordinary t statistic, which, however, has a nonstandard distribution under the null hypothesis that a unit root exists. One peculiarity of these tests is that the distribution of the ‘t statistic’ under the null, and hence the critical value for testing, depends on the right-hand side variables that appear in the regression (Campbell and Perron 1991, p. 149). Furthermore, if one applies the test to regression residuals, the critical values depend on the number of explanatory variables in the original regression (Enders 2010, pp. 373-4). When one performs one of these ersatz t tests on a univariate series containing no more than lagged dependent variables, along with perhaps a constant or deterministic trend, one is said to be performing a Dickey–Fuller test, and one compares computed ‘t statistics’ to critical values that may be found in Enders (1995, p. 419). In scenarios that are more complicated, these ersatz t tests go by different names and have different critical values. When there is a structural break present, one uses a Perron test; when one is testing regression residuals for a unit root, one uses the Engle–Granger test. This comment argues that Grubb (2018) relied on tables of critical values tabulated for the ordinary ADF test when he needed to use critical values for Perron or Engle–Granger tests, and that using the correct critical values modifies and often reverses his results. Moreover, by overlooking some complications of the Perron test, Grubb ran the wrong regressions to compute his test statistics.
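A small simulation can make the point about nonstandard critical values concrete. The sketch below is illustrative only: the sample size, number of replications, seed and the helper name `df_tstat` are assumptions made here, and the data are simulated random walks, not Grubb's series. It runs the simplest Dickey–Fuller regression (a constant and the lagged level, no lagged differences) on series that truly have a unit root and records the ‘t statistic’ on the lagged level.

```python
import numpy as np

def df_tstat(y):
    """t statistic on rho in: diff(y)_t = a + rho * y_{t-1} + e_t (no lagged differences)."""
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])   # constant and lagged level
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    sigma2 = resid @ resid / (len(dy) - X.shape[1])    # OLS estimate of the error variance
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(0)       # fixed seed so the sketch is reproducible
T, reps = 100, 2000                  # illustrative choices, not from the article

# Every simulated series is a pure random walk, so the unit root null is true.
tstats = np.array([df_tstat(np.cumsum(rng.standard_normal(T))) for _ in range(reps)])

q05 = np.quantile(tstats, 0.05)
print(f"simulated 5% quantile of the DF statistic under the null: {q05:.2f}")
print("one-sided 5% critical value of a standard normal: -1.645")
```

The simulated 5 percent quantile lies well below the familiar normal value, which is why a statistic that would look ‘significant’ against ordinary t tables need not reject a unit root. Adding a trend, a break dummy, or applying the test to regression residuals shifts the distribution further, which is the reason separate tables exist for each case.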
Finally, in his test of cointegration Grubb included lagged dependent variables for the express purpose of pre-whitening the residuals, only to then test those residuals for stationarity, an invalid procedure that can mask the presence of a unit root. The other econometric problems are more familiar to the general practitioner – omitted variable bias and a spurious regression. In a critical regression in panel C, Grubb omits variables he himself had found to be important; including those variables undermines the article's principal thesis. In panel D, there is a purely mechanical relation between the dependent variable and a key explanatory variable. The statistical significance of the explanatory variable is spurious.
Preliminaries
Grubb presents his econometric results in his Table 3, which consists of four panels: A, B, C and D. Panel C reports three regressions. In Table 1 this article presents four panels paralleling Grubb's, likewise denoted A, B, C and D, for purposes of comparison. Grubb devotes panels A and B to establishing the univariate time series properties of MEV and APV. In panel C he first demonstrates that MEV and APV are cointegrated, which implies that regressing MEV on APV is a valid exercise. Executing that regression, he concludes that fluctuations in APV induce a response from MEV that is approximately one-to-one. None of this is well executed.
Notes: Standard errors are in parentheses beneath the coefficients. D = 1 for years after 1765, 0 otherwise; DT = 1 if year is 1766 and 0 otherwise. The z_t in the cointegration test are residuals from the first panel C regression. Lags is the number of lags of the dependent variable included in the regression whose coefficients are not displayed. This table is designed to be compared to Grubb (2018, Table 3). Although Grubb's method for choosing the date of the structural break is likely to introduce some pretest bias, no attempt has been made to adjust for that in assessing statistical significance. No R-squared is given for the cointegration test because R-squared becomes an ambiguous statistic when the intercept has been suppressed.
*** Statistically significant at the 0.01 level.
** Statistically significant at the 0.05 level.
* Statistically significant at the 0.1 level.
† See discussion in the text.
# The errors are serially correlated and the OLS standard error estimate is biased and inconsistent.
Even with Grubb's assistance, I could replicate neither his regression in panel A nor his first regression result in panel C. Professor Grubb recognized that a typo in his article (N = 20, not 19) led me to unnecessarily truncate my sample in estimating panel A. Correcting that, however, still left a small discrepancy in both equations. The remaining mistake in panel A now can be traced to a transcription error in his published article. The source of the error in panel C remains mysterious. Neither error creates an important or sizeable shift in any coefficients or test statistics. The replications are in panels A and C of Table 1.
Panels A and B – mistaken critical values and test procedures
Grubb reports using Dickey–Fuller critical values, taken from Enders (1995, p. 419), for his tests. In panel A, however, Grubb includes not only a constant and a deterministic linear trend but also a dummy variable to capture a structural break in the intercept beginning in 1766. In panel B, although there is no deterministic trend, the structural break dummy variable is still present.
Perron (1989, 1990) – amended in Perron and Vogelsang (1992, 1993) – introduced a generalization of the ADF unit root test for time series exhibiting a structural break at a known date, a test whose critical values are larger in absolute value than the ordinary Dickey–Fuller critical values. Using ordinary Dickey–Fuller critical values to perform a unit root test when there is a structural break is never correct. Provided the date of the structural break is known a priori, the test should be performed as recommended by Perron (1994), using the tables of critical values in Perron (1990, Table 4) for the case without a trend, and in Perron (1989, Table IV.B) for the case with a trend. [Footnote 1]
Perron presents two distinct models of a structural break in the intercept: the additive outlier model (AO for short) and the innovational outlier model (IO for short). In the AO model, once the dummy variable for the structural break switches on, the entire shift in the intercept happens immediately. In the IO model, once the dummy variable for the structural break switches on, the effect is analogous to a constant being added to the model's innovations, and the effect propagates through the model's distributed lags. The appropriate unit root test is different for the two models (Perron 1994, pp. 118-20, 133; Harris and Sollis 2003, pp. 57-63). Although Grubb never explicitly discusses the distinction, his equations in Table 3 are consistent with the IO model, not the AO model, so in the subsequent discussion I shall limit myself to Perron's test for the IO model.
Another complication is that unit root tests cannot be performed by running the regressions Grubb estimates in panels A and B. To explain the problem, consider an equation of the kind Grubb estimates in panel A, where the dummy variable D is 1 if YEAR > 1765 and 0 otherwise.
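The displayed equation does not survive in this text. A form consistent with the surrounding discussion, in which β1 multiplies the lagged level of MEV and β3 multiplies D, might be written as follows. The labels β0, β2 and γ_i, and the lag length k, are assumptions introduced here for illustration and are not taken from Grubb's Table 3.

```latex
\Delta MEV_t = \beta_0 + \beta_1\, MEV_{t-1} + \beta_2\, t + \beta_3\, D_t
             + \sum_{i=1}^{k} \gamma_i\, \Delta MEV_{t-i} + \varepsilon_t
```

In this form a unit root corresponds to β1 = 0 and stationarity to β1 < 0.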
The fundamental properties of the model are not the same under the null hypothesis of a unit root, β1 = 0, as under the alternative hypothesis of stationarity, β1 < 0. Under the alternative hypothesis, a non-zero value of β3 corresponds to an intercept shift, whereas under the null hypothesis a non-zero value of β3 corresponds to a slope shift. To perform a unit root test when an intercept shift occurs, one must define a new dummy variable, DT, which is 1 if YEAR = 1766 and 0 otherwise, and then nest the null and alternative within the same model.
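A sketch of such a nesting model, in the spirit of Perron's test for the IO case, is given here. Only β1, β3, D and DT appear in the text; the remaining labels (β0, β2, β4, γ_i) and the lag length k are assumptions made for illustration.

```latex
\Delta MEV_t = \beta_0 + \beta_1\, MEV_{t-1} + \beta_2\, t + \beta_3\, D_t + \beta_4\, DT_t
             + \sum_{i=1}^{k} \gamma_i\, \Delta MEV_{t-i} + \varepsilon_t

% Unit root null:         \beta_1 = 0,\ \beta_3 = 0,\ \beta_4 \neq 0   (a one-time jump in the level)
% Stationary alternative: \beta_1 < 0,\ \beta_3 \neq 0                 (a shift in the intercept)
```

With DT included, an intercept shift can be represented under both hypotheses, so the t ratio on β1 tests the unit root alone rather than a joint hypothesis about the unit root and the type of break.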
Eyeballing the MEV and APV data, it appears that a shift in the intercept is much more plausible than a shift in the slope. Implicitly testing a joint null of a unit root plus a trend shift matters even more in the panel A regression than the mistaken critical value. Grubb's rejection of a unit root in panel A appears to have arisen because the data reject a slope shift in favor of an intercept shift.
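The difference between a slope shift and an intercept shift under the unit root null can be seen with a few lines of arithmetic. Under the null the equation is in first differences, so a regressor's contribution to the level of the series is its running sum. The sketch below uses a short hypothetical window of years chosen only for display; the dummy definitions follow the text.

```python
from itertools import accumulate

years = list(range(1762, 1771))               # hypothetical window, for display only
D = [1 if y > 1765 else 0 for y in years]     # step dummy: 1 in every year after 1765
DT = [1 if y == 1766 else 0 for y in years]   # pulse dummy: 1 in 1766 only

# With a unit root, effects on the differences cumulate into the level.
level_from_D = list(accumulate(D))    # keeps growing: a change in slope
level_from_DT = list(accumulate(DT))  # jumps once and stays: a change in intercept

print("year  D  cum(D)  DT  cum(DT)")
for row in zip(years, D, level_from_D, DT, level_from_DT):
    print(*row)
```

The cumulated step dummy rises by one each year after 1765, a ramp, while the cumulated pulse dummy moves to one in 1766 and remains there, a level shift.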
Redoing the panel A and panel B analyses as recommended by Perron (1994) and Harris and Sollis (2003, pp. 57-63) results in the estimates reported in the second regression in panel A and the sole regression in panel B of Table 1. [Footnote 2] In panel A the observed t statistic is −0.4181/0.2151 = −1.94, and the critical value when testing at the 10 percent level is −3.46. Grubb reported rejecting a unit root at the 1 percent level; in fact, one cannot reject a unit root. In panel B the observed test statistic is −0.3113/0.08847 = −3.52; the critical value at a 5 percent significance level (for N = 50) is −3.45. Grubb reported rejecting a unit root at the 1 percent level; in fact, one barely rejects a unit root at a 5 percent level. Even that conclusion requires using critical values for a sample size of 50, the smallest sample size for which critical values have been published, in a regression with 18 observations.
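The arithmetic behind these two decisions can be checked directly. The short sketch below uses only the coefficients, standard errors and critical values quoted in the paragraph above; the dictionary labels are ad hoc names chosen here.

```python
# Coefficient, standard error and critical value exactly as quoted in the text.
cases = {
    "panel A (10% critical value)": (-0.4181, 0.2151, -3.46),
    "panel B (5% critical value)": (-0.3113, 0.08847, -3.45),
}

results = {}
for name, (coef, se, crit) in cases.items():
    t = coef / se
    # Left-tailed test: reject the unit root only if t lies below the critical value.
    reject = t < crit
    results[name] = (round(t, 2), reject)
    print(f"{name}: t = {t:.2f}, critical value = {crit}, reject unit root: {reject}")
```

The panel A statistic is far from the 10 percent critical value, whereas the panel B statistic clears the 5 percent critical value only narrowly.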
Panel C and cointegration
Grubb's fundamental thesis is that APV and MEV are intimately connected, and to demonstrate that his regression of MEV on APV is valid, he wants to establish that MEV and APV are cointegrated. Because he concluded that MEV is a trend stationary process and that APV is a stationary process, his subsequent claim that these two variables are cointegrated is jarring. Intuitively, cointegration usually means that two variables share a common trend. As Dickey et al. (2007, p. 10) put it, cointegrated variables ‘cannot move “too far” away from each other. In contrast, a lack of cointegration suggests that such variables have no long-run link; in principle, they can wander arbitrarily faraway from each other.’
Econometricians, however, have proposed a less restrictive concept known as stochastic cointegration, distinct from the ordinary cointegration Dickey has in mind. Variables are stochastically cointegrated if a nondegenerate linear combination of those variables exists that is trend stationary (Campbell and Perron 1991, pp. 164–5). If Grubb's conclusions in panels A and B happen to be correct, the only species of cointegration between MEV and APV that is possible is stochastic cointegration. In this case, however, MEV and APV certainly will ‘wander arbitrarily faraway from each other’, an outcome difficult to reconcile with Grubb's theory linking the two variables. If MEV has a unit root and APV does not, as properly performed unit root tests suggest, the two series cannot be cointegrated in any way.
Grubb nonetheless tests for ordinary cointegration using the residuals from the penultimate regression in his panel C. This regression, however, contains lagged values of the dependent variable, lagged values expressly included to eliminate serial correlation in the error term. Pre-whitening the residuals before applying a unit root test to those residuals has the effect of eliminating any unit root that might have been present. It is precisely for this reason that Lee (1996, p. 136) recommends that ‘pre-whitening … should not be used for … stationarity tests, since it involves an intrinsic problem of making stationarity tests inconsistent.’ The correct way to test for ordinary cointegration would be to apply a unit root test to the residuals of the first regression Grubb estimated in his panel C, as is done in Table 1, panel C of this article. The test statistic is t = −0.493686/0.19901 = −2.48. Grubb refers to a table of Dickey–Fuller critical values, but the test is actually an Engle–Granger cointegration test, which has different critical values (Enders 2010, pp. 374, 490). The 10 percent critical value for this test statistic (N = 50) is −3.461, so at a 10 percent significance level the unit root null is not rejected, and not by a narrow margin; there is no evidence of cointegration.
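The masking effect of pre-whitening can be illustrated with simulated data. In the sketch below, which is an illustration under stated assumptions and not a replication of Grubb's regressions, `y` and `x` are independent random walks, so they are not cointegrated by construction. The helper names `ols_resid` and `df_t`, the sample size and the seed are all choices made here. A static regression of `y` on `x` leaves residuals with a unit root; adding lagged `y` as a regressor soaks up the persistence, and the residuals then look stationary.

```python
import numpy as np

def ols_resid(y, X):
    """Residuals from an OLS regression of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def df_t(z):
    """t statistic on rho in: diff(z)_t = rho * z_{t-1} + e_t (no constant, no lags)."""
    dz, lag = np.diff(z), z[:-1]
    rho = (lag @ dz) / (lag @ lag)
    e = dz - rho * lag
    se = np.sqrt((e @ e) / (len(dz) - 1) / (lag @ lag))
    return rho / se

rng = np.random.default_rng(1)
T = 200                                     # illustrative sample size
y = np.cumsum(rng.standard_normal(T))       # random walk
x = np.cumsum(rng.standard_normal(T))       # independent random walk: no cointegration

ones = np.ones(T - 1)
static = ols_resid(y[1:], np.column_stack([ones, x[1:]]))            # y on x only
dynamic = ols_resid(y[1:], np.column_stack([ones, x[1:], y[:-1]]))   # adds lagged y

t_static, t_dynamic = df_t(static), df_t(dynamic)
print(f"unit root statistic, static residuals:       {t_static:.2f}")
print(f"unit root statistic, pre-whitened residuals: {t_dynamic:.2f}")
```

The pre-whitened residuals produce a very large negative statistic even though no cointegrating relation exists, so a test applied to them will tend to ‘find’ cointegration regardless of the truth.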
Even the ill-conceived and inconsistent test using the pre-whitened residuals from the penultimate regression in Grubb's panel C would not reject the unit root null if Grubb had used Engle–Granger critical values. The test statistic is t = −0.8342/0.2464 = −3.39. The 5 percent critical value for this 3-variable Engle–Granger test is −3.915; the 10 percent critical value is −3.578 (Enders 2010, Table C, p. 490). [Footnote 3]
The missing regression in panel C
Grubb's first two regressions in panel C serve another purpose – to establish a statistically significant relationship between MEV and APV that is approximately one-to-one.
If we trust the unit root tests performed here, his first two regressions in panel C are spurious, because MEV and APV are not cointegrated. Nevertheless, because unit root tests sometimes perform poorly, it is possible that Grubb stumbled on the correct conclusion when he declared MEV to be trend stationary with a structural break in 1766. If so, MEV and APV could be stochastically cointegrated. In that case, however, the panel C regressions would require a trend term. There is also reason to believe the panel C regression ought to have a structural break; Grubb's panel D reports that the difference between MEV and APV contains a structural break. Including a trend and structural break changes the outcome dramatically, as one can see in the final panel C regression in Table 1. The trend term and the structural break term, both omitted in Grubb's panel C, are highly statistically significant. APV has a coefficient that is close to zero in magnitude and that is not remotely statistically significant. APV does not appear to have any effect – let alone a one-to-one effect – on MEV.
Spurious regressions and panel D
Grubb argues that the positive and statistically significant coefficient of the per-capita money supply in panel D of Table 3 demonstrates that the transactions premium, TP, the excess of the paper money's value over its present value in tax collections, increases with an increase in the per-capita money supply. ‘More paper money in circulation per capita,’ Grubb (2018, p. 137) concludes, ‘increased its ubiquity and familiarity of usage, which in turn led the public increasingly to treat this money as fiat-like currency.’ It therefore displaced less efficient transactions media, such as barter, book credit, tobacco and specie.
As Pearson, one of the great early statisticians, noted as far back as 1902, spurious correlation often arises ‘due solely to the particular manipulation of the observations’. [Footnote 4] Such is the case here, I believe. The transactions premium, TP (or more precisely, TP − RD), cannot be independently observed; it is inferred from MEV − APV, which is actually the dependent variable in panel D. Grubb (2018, p. 129, equation 4) defines APV, however, as an inverse function of the current money supply. Consequently, the correlation between APV and the per-capita money supply in the Virginia data set is −0.9336! This correlation tells us nothing about the evolution of the colonial economy – both variables are simply transformations of the money supply data. When one subtracts APV from MEV to create the dependent variable in the panel D regression, this ‘particular manipulation of the observations’ injects the current money supply directly into the dependent variable. The statistical significance of the per-capita money supply in the panel D regression is both unremarkable and uninformative.
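The mechanism can be reproduced with artificial numbers. The sketch below is a stylized illustration, not Grubb's data or his equation 4: the variable `money` is a made-up per-capita money supply, `apv` is set to an arbitrary inverse function of it chosen only for illustration, and `mev` is generated so that it has no relation to `money` at all.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20                                    # roughly the length of an annual colonial series

money = rng.uniform(0.5, 2.0, n)          # hypothetical per-capita money supply
apv = 100.0 / (1.0 + money)               # ASSUMED inverse function, for illustration only
mev = 80.0 + rng.standard_normal(n)       # constructed to be unrelated to money

r_apv = np.corrcoef(apv, money)[0, 1]         # strongly negative by construction
r_mev = np.corrcoef(mev, money)[0, 1]         # only sampling noise
r_diff = np.corrcoef(mev - apv, money)[0, 1]  # strongly positive, purely mechanical

print(f"corr(APV, money)       = {r_apv:.3f}")
print(f"corr(MEV, money)       = {r_mev:.3f}")
print(f"corr(MEV - APV, money) = {r_diff:.3f}")
```

Although `mev` carries no information about `money`, subtracting `apv` from it yields a variable that is strongly and positively correlated with `money`, simply because `apv` is itself a transformation of `money`.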
The overwhelming importance of the spurious correlation of APV and the per-capita money supply in Grubb's panel D can be verified by noting (in panel D of Table 1) that when APV is dropped from the dependent variable, the money supply per capita has an insignificant negative effect on MEV. The per capita money supply only exhibits a positive statistically significant association in Grubb's panel D because APV is part of the dependent variable.
These are all the econometric results that Grubb presents. His panel D reports a spurious regression, and performing the hypothesis tests correctly reverses every substantive result in panels A, B and C. The data provide no support for his proposition.