Interest Rate Conundrums in the 21st Century∗

SAMUEL G. HANSON

Harvard Business School

DAVID O. LUCCA

Federal Reserve Bank of New York

JONATHAN H. WRIGHT

Johns Hopkins University

June 13, 2018

Abstract

A large literature argues that long-term nominal interest rates react far more to high-frequency (daily or monthly) movements in short-term rates than is predicted by the standard expectations hypothesis. We find that, since 2000, this high-frequency sensitivity has grown even stronger in U.S. data. By contrast, the association between low-frequency changes (at 6- or 12-month horizons) in short- and long-term rates, which was equally strong before 2000, has weakened substantially. As a result, “conundrums”—defined as 6- or 12-month periods in which short and long rates move in opposite directions—have become increasingly common. We show that this post-2000 combination of high-frequency “excess sensitivity” and low-frequency “decoupling” of short- and long-term rates arises because increases in short rates temporarily raise the term premium on long-term bonds, leading long rates to temporarily overreact to changes in short rates. This post-2000 phenomenon can be understood using a model in which (i) declines in short rates lead to outward shifts in the demand for long-term bonds—e.g., because some investors “reach for yield”—and (ii) the arbitrage response to these demand shifts is slow. We discuss the implications of our findings for the transmission of monetary policy and the validity of the event-study methodology.

∗We thank John Campbell, Gabriel Chodorow-Reich, Richard Crump, Thomas Eisenbach, Robin Greenwood, Eric Swanson, Jeremy Stein, and Adi Sunderam as well as seminar participants at the 2018 AEA meetings, Johns Hopkins University, the Society for Computational Economics 2017 International Conference, University of Georgia, and University of Southern California for helpful comments. Hanson gratefully acknowledges funding from the Division of Research at Harvard Business School. All errors are our sole responsibility. The views expressed here are the authors’ and are not representative of the views of the Federal Reserve Bank of New York or of the Federal Reserve System.

Short-term nominal interest rates are determined by current monetary policy and its near-term expected path. Shocks to monetary policy and the macroeconomy are generally seen as being short-lived, so long-term rates should not be highly sensitive to changes in short rates if the expectations hypothesis holds (Shiller, 1979). However, a large literature demonstrates that long-term nominal rates are far more sensitive to high-frequency changes in short rates than is predicted by this standard view combining fast mean-reversion in short rates with the expectations hypothesis (Shiller et al., 1983; Cochrane and Piazzesi, 2002; Gürkaynak et al., 2005). Despite its importance for monetary policy transmission, the deeper forces underpinning this puzzling degree of high-frequency sensitivity, and the extent to which this sensitivity has evolved over time, remain poorly understood.

In this paper, we document an important and previously unrecognized fact about the term structure of nominal interest rates: the association between high-frequency changes (at daily or 1-month horizons) in short- and long-term rates has strengthened considerably since 2000. In contrast, the relationship between low-frequency changes (at 6- or 12-month horizons) in short- and long-term interest rates, which was also quite strong before 2000, has weakened substantially in recent years. Concretely, between 1971 and 1999, a daily regression of changes in 10-year U.S. Treasury yields on changes in 1-year yields delivers a coefficient of 0.56, and the analogous regression using 12-month changes gives the same coefficient of 0.56. Strikingly, between 2000 and 2017, the coefficient from the daily regression jumps to 0.86, while the coefficient from the corresponding 12-month regression drops to just 0.20. In summary, changes in U.S. short- and long-term rates have become even more tightly linked at high frequencies since 2000, but have largely decoupled at low frequencies.1 And, we find broadly similar patterns for Canada, Germany, and the U.K.

A stark example of such low-frequency decoupling was the period after June 2004 when the Federal Reserve raised its short-term policy rate, but longer-term yields fell. This was famously described by then Federal Reserve Chairman Greenspan as a “conundrum,” and has been discussed in many papers, including Backus and Wright (2007). One way to summarize our key finding is that this 2004 episode was by no means unique: “conundrums” (defined as 6- or 12-month periods in which short- and long-term rates move in opposite directions) have become far more common since 2000. From 1971 to 1999, 1- and 10-year nominal yields moved in the same direction in 82% of all 6-month periods. By contrast, since 2000, the corresponding figure has only been 62%.
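The share of same-direction periods quoted above is a simple counting statistic. The following is a minimal sketch of how such a share could be computed from two yield series; the function name `same_direction_share`, the use of overlapping windows, and the choice to skip windows with an exactly zero change are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def same_direction_share(short, long_, h):
    """Share of overlapping h-period windows in which both yields move the same way.

    Assumption (not from the paper): windows where either change is exactly
    zero are dropped before computing the share.
    """
    short = np.asarray(short, dtype=float)
    long_ = np.asarray(long_, dtype=float)
    d_short = short[h:] - short[:-h]      # h-period change in the short rate
    d_long = long_[h:] - long_[:-h]       # h-period change in the long rate
    moved = (d_short != 0) & (d_long != 0)
    same = np.sign(d_short[moved]) == np.sign(d_long[moved])
    return same.mean()

# Toy end-of-month yields (made-up numbers, in percent).
y1 = [1.0, 1.5, 2.0, 2.5, 3.0]
y10 = [4.0, 4.2, 4.1, 3.9, 4.3]
share = same_direction_share(y1, y10, h=2)
```

A "conundrum" share in the sense of the text would then be one minus this value for h = 6 or h = 12 on monthly data.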

What explains the puzzling post-2000 tendency of short- and long-term rates to move together at high frequencies but not at low frequencies? As a matter of statistical description, we show that this pattern arises because, all else equal, past increases in short rates predict a subsequent flattening of the yield curve in the post-2000 data. Furthermore, this predictable flattening of the curve is associated with a transitory rise in the expected returns on long-term bonds relative to those on short-term bonds: since 2000, term premia on long-term bonds appear to be temporarily elevated following past increases in short rates. Thus, relative to an expectations-hypothesis baseline, long rates temporarily overreact to movements in short rates, exhibiting a form of what Mankiw and Summers (1984) have dubbed “excess sensitivity.” Concretely, in the post-2000 data, we estimate that 10-year yields rise by 64 basis points (bps) in response to a 100 bps monthly increase in 1-year yields. Over the following 6 months, 10-year yields are expected to fall by 33 bps, all else equal, reversing roughly half of the initial response. But, while this predictable reversion of long-term rates is a robust feature of the post-2000 data, this pattern is not present before 2000.

1Because of the importance of communication about the near-term path of monetary policy, we take the short rate to be the 1-year nominal Treasury yield rather than the overnight federal funds rate targeted by the Federal Reserve.

What deeper economic forces have led to the shifting relationship between movements in long- and short-term yields? Gürkaynak et al. (2005) note that the strong sensitivity of long-term nominal rates could be consistent with the expectations hypothesis if one adopts the non-standard view that long-run inflation expectations are unanchored and are continuously being updated in light of incoming news. In other words, Gürkaynak et al. (2005) argue that the strong sensitivity of long-term rates to short rates could work through an expectations-hypothesis channel once one allows for highly persistent shocks to inflation expectations. We argue that the narrative in Gürkaynak et al. (2005) is a good explanation for the high degree of sensitivity observed in the pre-2000 period. Indeed, consistent with the expectations-hypothesis logic of their explanation, in the pre-2000 data, we find no evidence that the reaction of long yields to movements in short rates tends to predictably reverse.

However, as shown by Beechey and Wright (2009), Hanson and Stein (2015), Abrahams et al. (2016), and Nakamura and Steinsson (forthcoming), in the post-2000 period, the strong high-frequency sensitivity of long-term nominal rates primarily reflects the sensitivity of long-term real rates to short-term nominal rates, rather than the sensitivity of long-term break-even inflation.2 To the extent that one shares the widespread view that expected future real rates at distant horizons should not fluctuate meaningfully at high frequencies (see Gürkaynak et al. (2005)), this casts doubt on an expectations hypothesis-based explanation of the strong high-frequency sensitivity of long rates in the post-2000 data. To resolve this puzzle, Hanson and Stein (2015) argue that excess sensitivity works through term premia on long-term bonds: shocks to short rates temporarily move term premia in the same direction. Consistent with Hanson and Stein (2015), in the post-2000 data, we find strong evidence that the reaction of long-term yields to movements in short rates tends to predictably reverse, giving rise to short-lived shifts in the expected returns to holding long-term bonds.

2This exercise cannot be done before 2000 for the U.S. because inflation-indexed bonds were not issued in the U.S. until 1997. However, for the U.K., we show that the rise in the high-frequency sensitivity of long-term nominal yields is fully explained by the rising sensitivity of long-term real yields.

We construct a simple model to help understand the shifting relationship between movements in long- and short-term yields, especially the puzzling post-2000 combination of high-frequency excess sensitivity and low-frequency decoupling. In our model, risk-averse investors can either invest in short- or long-term nominal bonds. While monetary policy pins down the rate on short-term nominal bonds, long-term bonds are available in a net supply that varies randomly over time. (The net supply is the gross supply of long-term bonds net of the amount inelastically demanded by other agents.) Since shocks to the supply and demand for long-term bonds must be absorbed by risk-averse investors, shifts in net supply affect term premia on long-term bonds as in Vayanos and Vila (2009).

In the pre-2000 period, we assume there was a large persistent component of short-term nominal rates, reflecting shocks to trend inflation as in Stock and Watson (2007). As in Gürkaynak et al. (2005), the existence of this highly persistent component, in combination with expectations-hypothesis logic, explains the strong sensitivity of long rates at both high and low frequencies before 2000. In the post-2000 period, the volatility of this persistent component of short rates has dropped sharply. From an expectations-hypothesis perspective, this should have reduced the sensitivity of long rates. In the data, this occurs at low frequencies, but we see even greater sensitivity at high frequencies.

The key contribution of our model is to explain how such frequency-specific sensitivity may arise in the post-2000 period. Our explanation rests on two key ingredients: (i) shifts in the supply and demand for long-term bonds that move term premia in the same direction as short-term rates and (ii) slow-moving capital. The first key ingredient is our assumption that shocks to the net supply of long-term bonds are positively correlated with shocks to short rates. This implies that increases in short rates are associated with increases in the term premium component of long-term rates, generating “excess sensitivity” relative to the expectations hypothesis. The simplest interpretation of our assumption follows Hanson and Stein (2015) and appeals to inelastic shifts in the demand for long-term bonds from “yield-oriented” investors. These investors target a certain level of portfolio yield and “reach for yield,” inelastically demanding more long-term bonds, as short rates decline. However, our assumption can be seen as a reduced form for several distinct supply-and-demand-based amplification mechanisms that have grown in importance since 2000, including shifts in the supply of long-term bonds due to mortgage refinancing waves (Hanson, 2014; Malkhozov et al., 2016) and shifts in the demand for long-term bonds from biased investors who over-extrapolate short-term interest rates (Piazzesi et al., 2015; Giglio and Kelly, 2018).3 The second key ingredient is that capital is slow-moving as in Duffie (2010): these supply and demand shocks encounter a short-run demand curve that is steeper than the long-run demand curve. This slow-moving capital dynamic implies that the shifts in bond term premia triggered by movements in short rates are transitory. As a result, the excess sensitivity of long rates is greatest when measured at high frequencies.

In summary, this combination of reaching-for-yield and slow-moving capital enables our model to match the frequency-specific sensitivity of long rates observed since 2000. And, our model explains the shift in the relationship between long- and short-term rates that occurred around 2000 as stemming from (1) a decline in the volatility of the persistent component of short rates and (2) the growing importance of the kinds of supply-and-demand-based amplification mechanisms noted above.

Our findings have important implications for how one should interpret event-study evidence based on high-frequency changes in long-term bond yields. Macroeconomic news, including news about monetary policy, comes out in a lumpy manner, and the short-run change in long-term yields around news announcements is often used as a convenient and unconfounded measure of the longer-run impact of news shocks. Gertler and Karadi (2015) and Nakamura and Steinsson (forthcoming) are two prominent recent examples of this increasingly popular high-frequency approach to identification in macroeconomics.4 However, if, as we show, some of the impact of a news shock on long-term rates tends to wear off quickly over time, then a shock’s short- and long-run impact will be quite different. And, the event-study approach will necessarily capture only the short-run impact. Indeed, it is common for news announcements to cause large jumps in 10- and even 20-year forward rates, but we show that a large portion of these jumps are due to transient shifts in term premia. As a result, event-study methodologies are likely to provide biased estimates of the longer-run impact of news announcements on long-term yields. In summary, empirical macroeconomists face an important bias-variance trade-off: high-frequency event studies provide precise estimates of the short-run impact of news surprises on long-term yields, but these are likely to be systematically biased estimates of the longer-run impact that is often of greatest interest (Greenwood et al., forthcoming).

3The idea that supply-and-demand effects can have important consequences for long-term rates features prominently in many other recent papers, especially in analyses of Quantitative Easing (Gagnon et al., 2011; Hamilton and Wu, 2012; Krishnamurthy and Vissing-Jorgensen, 2011, 2012).

4Earlier papers examining the high-frequency response of long-term rates to news about monetary policy include Evans and Marshall (1998), Kuttner (2001), and Cochrane and Piazzesi (2002).

Our findings also have implications for the transmission of monetary policy. In the textbook New Keynesian view (Gali, 2008), the central bank adjusts short-term nominal rates. This affects long-term interest rates via the expectations hypothesis, which in turn influences aggregate demand. Stein (2013) points out that the excess sensitivity of long-term yields—whereby shocks to short rates move term premia in the same direction—should strengthen the effects of monetary policy relative to the textbook view. Stein (2013) refers to this as the “recruitment” or “risk-taking” channel of monetary transmission. We find that the behavior of interest rates does not conform to the textbook New Keynesian view in which term premia are constant. Nonetheless, our findings suggest that the recruitment channel may not be as strong as Stein (2013) speculates since a portion of the resulting shifts in term premia are transitory and, thus, likely to have only modest effects on aggregate demand. To be clear, we do not argue that there is no recruitment channel, just that it is smaller than one might conclude based on the high-frequency response of term premia to policy shocks documented in Hanson and Stein (2015), Gertler and Karadi (2015), and Gilchrist et al. (2015).

The plan for the paper is as follows. In Section 1, we document our key stylized facts about the changing high- and low-frequency sensitivity of long-term interest rates. In Section 2, we show that past increases in short rates predict a future flattening of the yield curve in the post-2000 data, reflecting a new form of bond return predictability. Section 3 develops the economic model that we use to interpret our findings. Section 4 discusses the implications of our findings for identification strategies exploiting high-frequency responses of long-term yields, for the transmission of monetary policy, and for affine term-structure models. Section 5 concludes.

1 Main findings

This section presents our main findings. The association between high-frequency changes in short- and long-term interest rates has strengthened since 2000. By contrast, the association between low-frequency changes in short- and long-term interest rates was quite strong before 2000, but has weakened substantially in recent years. We first document these basic facts for the U.S. We then show that broadly similar results hold for Canada, Germany, and the U.K.

1.1 U.S. evidence

We obtain historical data on the nominal and real U.S. Treasury yield curve from Gürkaynak et al. (2007) and Gürkaynak et al. (2010). We focus on continuously compounded 10-year zero-coupon yields and 10-year instantaneous forward rates. We also decompose nominal yields into real yields and inflation compensation, defined as the difference between nominal and real yields derived from Treasury Inflation-Protected Securities (TIPS). Our sample begins in 1971, which is when reliable data on 10-year nominal yields first become available, and ends in 2017. For real yields and inflation compensation, we only study the post-2000 sample, since data on TIPS are not available until 1999. All data are measured as of the end of the relevant period, e.g., the last trading day of each month.

In standard monetary economics models, the central bank sets overnight nominal interest rates, and other interest rates are influenced by the expected path of overnight rates. A large literature argues that central banks in the U.S. and abroad have increasingly relied on communication (implicit or explicit signaling about the future path of overnight rates) as an active policy instrument (Gürkaynak et al., 2005; Lucca and Trebbi, 2009). To capture news about the near-term path of monetary policy that would not impact the current overnight rate, we take the short rate to be the 1-year nominal Treasury rate, which follows approaches in the recent literature (Campbell et al., 2012; Gertler and Karadi, 2015; Gilchrist et al., 2015; Hanson and Stein, 2015).

To illustrate our key stylized fact, we begin by regressing changes in 10-year yields or forward rates on changes in 1-year nominal yields. Specifically, we estimate regressions of the form:

$$
y^{(10)}_{t+h} - y^{(10)}_{t} = \alpha_h + \beta_h \left( y^{(1)}_{t+h} - y^{(1)}_{t} \right) + \varepsilon_{t,t+h} \tag{1.1}
$$

and

$$
f^{(10)}_{t+h} - f^{(10)}_{t} = \alpha_h + \beta_h \left( y^{(1)}_{t+h} - y^{(1)}_{t} \right) + \varepsilon_{t,t+h}, \tag{1.2}
$$

where $y^{(n)}_t$ is the n-year zero-coupon rate in period t and $f^{(n)}_t$ is the n-year-ahead instantaneous forward rate. Panel A in Table 1 reports estimated slope coefficients βh in (1.1) for zero-coupon nominal yields, real yields, and inflation compensation using daily data and using end-of-month data with h = 1, 3, 6, 12; i.e., we report coefficients for daily, monthly, quarterly, semi-annual, and annual changes in yields. (Throughout the paper, bond maturities are in years and time periods are in months, except when we estimate regressions at a daily frequency.) The results are shown for the pre-2000 and post-2000 subsamples separately. We base this sample split on a number of break-date tests that we will discuss shortly. Panel B reports the corresponding slope coefficients in (1.2) using changes in instantaneous forwards as the dependent variable.5
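As an illustration of how a slope coefficient of the kind in equation (1.1) could be estimated, the following minimal sketch forms overlapping h-period changes and runs OLS with an intercept. The function name `horizon_beta` and the simulated series are hypothetical and are not the paper's data or code.

```python
import numpy as np

def horizon_beta(long_yield, short_yield, h):
    """OLS intercept and slope from regressing h-period changes in the long
    yield on h-period changes in the short yield (overlapping changes)."""
    long_yield = np.asarray(long_yield, dtype=float)
    short_yield = np.asarray(short_yield, dtype=float)
    dy_long = long_yield[h:] - long_yield[:-h]
    dy_short = short_yield[h:] - short_yield[:-h]
    X = np.column_stack([np.ones_like(dy_short), dy_short])
    coef, *_ = np.linalg.lstsq(X, dy_long, rcond=None)
    alpha_h, beta_h = coef
    return alpha_h, beta_h

# Simulated monthly series (assumption: purely illustrative, not real yields).
rng = np.random.default_rng(0)
short = 3.0 + 0.1 * np.cumsum(rng.standard_normal(480))
long_ = 2.0 + 0.6 * short + 0.05 * np.cumsum(rng.standard_normal(480))

betas = {h: horizon_beta(long_, short, h)[1] for h in (1, 3, 6, 12)}
```

Running the same function on pre-2000 and post-2000 subsamples separately would mirror the sample split described in the text; inference requires the standard errors discussed in footnote 5.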

At a daily frequency, there has been a large increase in the regression coefficients between the pre-2000 and post-2000 subsamples. The increasing sensitivity of long-term rates at high frequencies is our first key finding. Specifically, the daily coefficient for 10-year yields in Panel A has increased from βh = 0.56 in the pre-2000 subsample to βh = 0.86 in the post-2000 subsample, with the increase being statistically significant (p-val < 0.001). Similarly, from Panel B, the coefficient for daily changes in 10-year forward rates is βh = 0.39 pre-2000 and βh = 0.48 post-2000.

Table 1 shows a second fact that has also not been previously documented: the coefficients at lower frequencies are much smaller after 2000. For example, the coefficient for h = 12-month changes in 10-year yields is β12 = 0.56 before 2000 but only β12 = 0.20 in the post-2000 sample, and this difference is statistically significant (p-val < 0.001). Similarly, for 10-year forward rates, the coefficient at a 12-month horizon is β12 = 0.39 in the pre-2000 sample but β12 = −0.17 after 2000.

More generally, in the post-2000 sample, the coefficient βh is a steeply declining function of the horizon h over which yield changes are calculated. By contrast, βh is a relatively constant function of horizon h in the pre-2000 sample. In terms of a decomposition between real yields and inflation compensation, the majority of the decline in βh as a function of h during the post-2000 sample is accounted for by the real yield component of the 10-year yield.

In summary, Table 1 shows that, prior to 2000, there was a strong tendency for short- and long-term interest rates to rise and fall together at both high and low frequencies. While the high-frequency relationship has grown even stronger since 2000, the low-frequency relationship has weakened significantly. As a result, events such as “Greenspan’s conundrum” (Backus and Wright, 2007), the period after June 2004 when the Federal Reserve raised short-term rates and longer-term yields declined, have grown increasingly common. Indeed, since 2000, 1- and 10-year nominal yields have moved in the same direction in 62% of all 6-month periods. By contrast, from 1971 to 1999, the corresponding figure was 82%, and the difference is statistically significant (p-val < 0.001).

In the Internet Appendix, we show that very similar results obtain when we use long-term private yields as the dependent variable in equation (1.1). Specifically, we examine long-term corporate bond yields with Moody’s ratings of Aaa and Baa, the 10-year swap yield, and the yield on Fannie Mae mortgage-backed securities. For all of these long-term yields, the sensitivity to changes in 1-year Treasury rates was similar irrespective of frequency before 2000. After 2000, the sensitivity at high frequencies increases while the sensitivity at low frequencies declines significantly.

5Since we use overlapping changes in equation (1.1) when h > 1, we report Newey and West (1987) standard errors using a lag truncation parameter of ⌈1.5 × h⌉; when h = 1, we report heteroskedasticity-robust standard errors. To address the tendency for statistical tests based on Newey-West standard errors to over-reject in finite samples, we compute p-values using the asymptotic theory of Kiefer and Vogelsang (2005), which gives more conservative p-values and has better finite-sample properties than traditional Gaussian asymptotic theory.
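The standard-error convention described in footnote 5 can be sketched as follows. This is a minimal textbook Newey-West (Bartlett kernel) sandwich estimator with the lag rule ⌈1.5 × h⌉ for h > 1 and a heteroskedasticity-robust (lag-zero) estimator for h = 1. It applies no small-sample correction and does not implement the Kiefer and Vogelsang (2005) fixed-b p-values; the function names are hypothetical.

```python
import math
import numpy as np

def newey_west_se(X, resid, lags):
    """Newey-West (Bartlett-kernel) standard errors for OLS coefficients.
    With lags = 0 this reduces to White heteroskedasticity-robust (HC0) errors."""
    X = np.asarray(X, dtype=float)
    u = np.asarray(resid, dtype=float)
    xu = X * u[:, None]                   # moment contributions x_t * u_t
    S = xu.T @ xu                         # lag-0 term
    for l in range(1, lags + 1):
        w = 1.0 - l / (lags + 1.0)        # Bartlett weight
        G = xu[l:].T @ xu[:-l]            # lag-l cross products
        S += w * (G + G.T)
    XtX_inv = np.linalg.inv(X.T @ X)
    V = XtX_inv @ S @ XtX_inv             # sandwich covariance matrix
    return np.sqrt(np.diag(V))

def beta_with_hac(long_yield, short_yield, h):
    """Slope on h-period changes, its HAC standard error, and the lag used."""
    long_yield = np.asarray(long_yield, dtype=float)
    short_yield = np.asarray(short_yield, dtype=float)
    dy_long = long_yield[h:] - long_yield[:-h]
    dy_short = short_yield[h:] - short_yield[:-h]
    X = np.column_stack([np.ones_like(dy_short), dy_short])
    coef, *_ = np.linalg.lstsq(X, dy_long, rcond=None)
    resid = dy_long - X @ coef
    lags = math.ceil(1.5 * h) if h > 1 else 0   # lag rule from footnote 5
    se = newey_west_se(X, resid, lags)
    return coef[1], se[1], lags
```

For h = 12 the rule gives 18 lags, and for h = 3 it gives 5. A packaged HAC estimator from a statistics library could be substituted for `newey_west_se`.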

We use two other approaches to document our key stylized fact. The first is to estimate equations (1.1) and (1.2) using 10-year rolling windows. The estimated slope coefficients for h = 12-month changes are shown in Figure 2 for 10-year yields and forward rates. The coefficient declines substantially in more recent windows. The second approach is to test for a structural break in equations (1.1) and (1.2), allowing for a break date that is not known a priori. We use the test of Andrews (1993), which conducts a Chow (1960) test at all possible break dates and then takes the maximum of the Wald test statistics. Figure 3 plots the Wald test statistic for each possible break date in equations (1.1) and (1.2) along with the Cho and Vogelsang (2017) critical values for a null of no structural break. The strongest evidence for a break is in 1999 or 2000 in both equations (1.1) and (1.2), and the break is highly statistically significant.6,7
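The unknown-break-date procedure described above can be illustrated with a simplified sketch: compute a Chow-type Wald statistic at every candidate break date in a trimmed interior of the sample and take the maximum. This version assumes homoskedastic, serially uncorrelated errors and 15% trimming. Both are simplifying assumptions; the paper's overlapping regressions would instead call for HAC-based statistics and the Cho and Vogelsang (2017) critical values, which are not reproduced here. The function names are hypothetical.

```python
import numpy as np

def _ssr(X, y):
    """Sum of squared OLS residuals."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ coef
    return float(r @ r)

def sup_wald_break(x, y, trim=0.15):
    """Sup-Wald statistic for a one-time break in intercept and slope of
    y = a + b*x + e, searching over break dates in the trimmed sample.
    Assumes homoskedastic errors (simplification)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    X = np.column_stack([np.ones(n), x])
    p = X.shape[1]
    ssr_restricted = _ssr(X, y)                     # no-break model
    lo, hi = int(np.floor(trim * n)), int(np.ceil((1 - trim) * n))
    stats = {}
    for k in range(lo, hi):                         # first obs of second regime
        ssr_unrestricted = _ssr(X[:k], y[:k]) + _ssr(X[k:], y[k:])
        stats[k] = (ssr_restricted - ssr_unrestricted) / (ssr_unrestricted / (n - 2 * p))
    k_hat = max(stats, key=stats.get)
    return k_hat, stats[k_hat], stats

# Simulated example (assumption): slope drops from 0.6 to 0.2 at observation 300.
rng = np.random.default_rng(0)
x = rng.standard_normal(600)
slope = np.where(np.arange(600) < 300, 0.6, 0.2)
y = slope * x + 0.1 * rng.standard_normal(600)
k_hat, sup_stat, path = sup_wald_break(x, y)
```

Plotting `path` against the candidate dates gives a profile analogous in spirit to the one described for Figure 3.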

1.2 International evidence

Our focus is on the U.S., but it is useful to consider whether these same patterns are also observed in other large, highly developed economies. In Table 2, we briefly explore evidence for the U.K., Germany, and Canada. Panel A of Table 2 shows estimates of equation (1.1) for the U.K., where data is available beginning in 1985. For the U.K., the estimates are broken out into real yields and inflation compensation. The evidence for the U.K. is remarkably similar to the U.S. evidence in Table 1. Before 2000, the daily coefficient (βh = 0.44) and the yearly coefficient (βh = 0.38) are similar in the U.K. After 2000, the daily sensitivity increases (βh = 0.86), and the yearly sensitivity declines (βh = 0.29). Because we have data on real yields prior to 2000 in the U.K., we can decompose the change in βh into its real and inflation compensation components. As shown in Table 2, the inflation compensation component of βh is stable across sample periods and frequency h. Thus, most of the changes in βh are accounted for by changes in the real component of nominal yields.

6Here, we are comparing pre-2000 and post-2000 data based on the estimated slope coefficients in equations (1.1) and (1.2) for h = 12-month changes. There may be breaks in the regression coefficients for higher frequency changes at other dates. For example, Thornton (forthcoming) argues that there is a break in the relationship between monthly changes in 10-year yields and monthly changes in the federal funds rate somewhat earlier in the sample.

7One might wonder if our dating of this break is driven by distortions stemming from the 2008–2015 period when overnight nominal rates were stuck at the zero lower bound in the U.S. Our use of 1-year rates as the independent variables in equations (1.1) and (1.2) limits any potential distortions since 1-year nominal yields continued to fluctuate from 2008 to 2015 (Swanson and Williams, 2014). Indeed, even if we end our sample period in 2008, we still detect a break around 2000. Specifically, if the post-2000 sample ends in December 2008, we find a daily βh = 0.77 and a yearly βh = 0.20, which are essentially indistinguishable from the numbers in Table 1.

Panel B of Table 2 shows estimates of equation (1.1) for Germany and Canada. For Germany, monthly data are available beginning in 1972 and daily data are available starting in 2000. For Canada, monthly and daily data are available beginning in 1986. Again, we observe similar patterns to those in the U.S. In the pre-2000 sample, βh is stable across frequencies in Germany and Canada. After 2000, we observe greater sensitivity at high frequencies and less sensitivity at lower frequencies.

2 Yield-curve dynamics and bond return predictability

In this section, we first pinpoint the dynamic properties of the term structure that account for the greater high-frequency sensitivity and smaller low-frequency sensitivity of long rates to short rates in the post-2000 data. Specifically, we demonstrate that this puzzling pattern arises because, all else equal, past increases in short rates predict a subsequent flattening of the yield curve in the post-2000 data. Speaking statistically, this means that post-2000 yield curve dynamics are “path-dependent” or non-Markovian: it is not enough to know the current shape of the yield curve. Instead, to form the best forecast of future bond yields and returns, one also needs to know how the yield curve has shifted in recent months. And we show that these non-Markovian dynamics help explain several post-2000 “conundrum” episodes when short- and long-term rates moved in opposite directions.

Second, we show that this predictable flattening of the yield curve is closely linked to a transient rise in the expected returns on long-term bonds relative to those on short-term bonds. Specifically, since 2000, term premia on long-term bonds are temporarily elevated following past increases in short rates. Thus, relative to an expectations-hypothesis baseline, long rates exhibit excess sensitivity at high frequencies and temporarily overreact to changes in short rates.

2.1 Non-Markovian yield-curve dynamics

When examining term structure dynamics, it is useful and customary to study the dynamics of yield-curve factors, especially level and slope factors (Litterman and Scheinkman, 1991). Defining the level as the 1-year yield ($L_t \equiv y^{(1)}_t$) and the slope as the 10-year yield less the 1-year yield ($S_t \equiv y^{(10)}_t - y^{(1)}_t$), the puzzle described in this paper can be restated as the observation that, since 2000, changes in level and slope have become negatively associated at low frequencies, but not at high frequencies.8 Specifically, since $y^{(10)}_t = L_t + S_t$ and h-month changes are sums of monthly changes, the coefficient in equation (1.1) for h-month changes can be rewritten as:

$$
\beta_h = 1 + \frac{\sum_{j=-h+1}^{h-1} (h - |j|)\, \mathrm{Corr}(\Delta L_t, \Delta S_{t+j})}{\sum_{j=-h+1}^{h-1} (h - |j|)\, \mathrm{Corr}(\Delta L_t, \Delta L_{t+j})} \sqrt{\frac{\mathrm{Var}(\Delta S_t)}{\mathrm{Var}(\Delta L_t)}}. \tag{2.1}
$$

To begin, note that one would not expect $\beta_h$ to vary strongly with horizon h as in the post-2000 data. For instance, in a simple expectations-hypothesis world of the sort outlined in Section 3 where $S_t = \alpha + \beta \times L_t - L_t$ for some $\beta \in (0, 1)$, we have $\mathrm{Corr}(\Delta L_t, \Delta S_{t+j}) = -\mathrm{Corr}(\Delta L_t, \Delta L_{t+j})$ for all j, implying that $\beta_h = \beta$ for all h.9

Alternatively, if $\mathrm{Corr}(\Delta L_t, \Delta L_{t+j}) \approx 0$ and $\mathrm{Corr}(\Delta L_t, \Delta S_{t+j}) \approx 0$ for all $j \neq 0$ (a situation that roughly describes the pre-2000 data), then $\beta_h \approx \beta_1$ for any h. Thus, equation (2.1) implies that the decline in $\beta_h$ at low, but not high, frequencies after 2000 must mean that (i) there was a shift in the autocorrelation of $\Delta L_t$, (ii) the cross-correlation between changes in level and future changes in slope declined, or (iii) the cross-correlation between changes in level and past changes in slope declined. As shown in Figure 4, which plots $\mathrm{Corr}(\Delta L_t, \Delta S_{t+j})$ at different leads j, it turns out that both of these cross-correlations have declined, although (ii) plays a more important role in driving the decline in $\beta_h$ at low frequencies. In other words, Figure 4 suggests that past increases in the level of the yield curve predict a flattening of the curve post-2000.

By contrast, Figure 4 shows that the contemporaneous correlation between changes in level and slope is significantly less negative post-2000. Consistent with Table 1, this means that long- and short-term yields are more likely to move in lockstep at a 1-month frequency post-2000.

8The level and slope factors are sometimes defined as the first two principal components of a set of yields. For simplicity, we have defined the level and slope factors using fixed maturities on the yield curve. However, this choice makes little difference: we find similar results if we examine the first two principal components.

9As discussed in Section 3, if there are both persistent and transient shocks to short-term rates, the expectations hypothesis actually suggests that $\beta_h$ should be increasing in h. Thus, the fact that $\beta_h$ is sharply decreasing in h in the post-2000 data is difficult to square with the expectations hypothesis.

2.1.1 Predicting level and slope

The post-2000 decline in the low-frequency relationship between long- and short-term rates could either be the result of a systematic shift in yield curve dynamics, or simply of a sequence of innovations to long-term rates that were negatively related to short rates. To assess whether yield-curve dynamics have in fact shifted, we estimate predictive regressions for the level and slope of the yield curve.

Most term-structure models are Markovian with respect to the filtration given by current yield curve factors, meaning that the conditional mean of future yields depends only on today’s yield-curve factors. However, our key finding (that the relationship between changes in short- and long-term rates has weakened at low frequencies, even though the daily relationship between the two has grown stronger) suggests that it may be useful to include lagged factors when forecasting yields. We therefore consider the following system of predictive monthly regressions:

Lt+1 = δ0L + δ1LLt + δ2LSt + δ3L(Lt − Lt−h) + δ4L(St − St−h) + εL,t+1 (2.2a)

St+1 = δ0S + δ1SLt + δ2SSt + δ3S (Lt − Lt−h) + δ4S (St − St−h) + εS,t+1, (2.2b)

These regressions include level and slope as well as their changes over the prior h months.10

Table 3 reports estimates of equations (2.2a) and (2.2b) for h = 12 and for both the pre-2000 and post-2000 subsamples.11 We include specifications omitting all lagged changes (imposing δ3 = δ4 = 0), omitting lagged changes in slope (imposing δ4 = 0), and including all predictors. Based on the AIC or BIC, the model with one lag of level and slope and lagged changes in level is selected in the post-2000 subsample, while no lagged changes are needed in the pre-2000 subsample. As shown in the bottom panel, in the post-2000 subsample, the lagged change in level is a highly significant negative predictor of the slope; i.e., increases in the level of yields predict subsequent yield curve flattening. For example, as shown in column (5), a 100 basis point cumulative increase in the level over the prior 12 months is associated with a 9 basis point per-month decline in slope in the post-2000 sample (p-val < 0.001). By contrast, the estimate in the pre-2000 sample is economically small and is not statistically significant.
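The predictive system in equations (2.2a) and (2.2b) can be sketched as two OLS regressions on a common set of regressors. The snippet below is a minimal illustration, not the estimation code behind Table 3: the function names `predictive_regression` and `simulate`, the simulated data, and all parameter values (including the slope loading of −0.09 on the lagged level change) are assumptions chosen for illustration only.

```python
import numpy as np

def predictive_regression(level, slope, h=12):
    """OLS estimates of (2.2a) and (2.2b) from monthly level and slope series.

    Returns a dict mapping "L" and "S" to coefficient vectors ordered as
    [const, L_t, S_t, L_t - L_{t-h}, S_t - S_{t-h}].
    """
    level = np.asarray(level, dtype=float)
    slope = np.asarray(slope, dtype=float)
    t = np.arange(h, len(level) - 1)      # dates with h lags and one lead available
    X = np.column_stack([
        np.ones(len(t)),
        level[t],
        slope[t],
        level[t] - level[t - h],
        slope[t] - slope[t - h],
    ])
    coefs = {}
    for name, series in (("L", level), ("S", slope)):
        beta, *_ = np.linalg.lstsq(X, series[t + 1], rcond=None)
        coefs[name] = beta
    return coefs

def simulate(T=5000, h=12, d3S=-0.09, seed=0):
    """Hypothetical data: AR(1) level; slope loads on the lagged level change."""
    rng = np.random.default_rng(seed)
    L = np.zeros(T)
    S = np.zeros(T)
    for t in range(h, T - 1):
        L[t + 1] = 0.98 * L[t] + 0.3 * rng.standard_normal()
        S[t + 1] = 0.95 * S[t] + d3S * (L[t] - L[t - h]) + 0.2 * rng.standard_normal()
    return L, S

if __name__ == "__main__":
    L, S = simulate()
    est = predictive_regression(L, S, h=12)
    print("slope equation coefficients:", np.round(est["S"], 3))
```

In this simulated setting, the fourth coefficient of the slope equation plays the role of δ3S; imposing δ3 = δ4 = 0 corresponds to dropping the last two columns of the regressor matrix.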

The model in equations (2.2a)-(2.2b) can match the key stylized fact we documented above. These equations can be written as a restricted vector autoregression (VAR) in yt = (Lt, St)′ of the form:

yt+1 = µ + A1yt + A2yt−h + εt+1. (2.3)

Let Γij (h) denote the ijth element of the autocovariance of yt at a lag of h months, i.e., the ijth element of Γ(h) = E[(yt − E [yt]) (yt−h − E [yt−h])′]. Given the estimated parameters from equations (2.2a) and (2.2b), we can work out Γij (h) to obtain the VAR-implied values of βh in equation (1.1):

$$\beta_h = \frac{Var(L_t - L_{t-h}) + Cov(S_t - S_{t-h},\, L_t - L_{t-h})}{Var(L_t - L_{t-h})} = 1 + \frac{2\Gamma_{12}(0) - \Gamma_{12}(h) - \Gamma_{12}(-h)}{2\left(\Gamma_{11}(0) - \Gamma_{11}(h)\right)}. \tag{2.4}$$

In the pre-2000 sample, Table 1 reported estimates of β1 = 0.46 and β12 = 0.56. In the post-2000 sample,

10Several other authors have considered the use of lagged variables in term structure models, including Cochrane and Piazzesi (2005), Duffee (2013), and Joslin et al. (2013).

11In unreported results, we also find that past changes in the level factor are associated with declines in the slope factor over the following month when the change in the level is computed over the prior h = 3 or 6 months.

we reported estimates of β1 = 0.64 and β12 = 0.20. The last two rows of Table 3 report the VAR-implied values of β1 and β12 from equation (2.4). In the pre-2000 data, all of the VAR models can roughly match both β1 and β12. In the post-2000 sample, all of the models can match β1, but only the VAR models that include lagged changes in level (i.e., models that allow for non-Markovian dynamics) can match the sharp drop in β12. Specifically, if the post-2000 VAR does not include lagged changes (δ3 = δ4 = 0), the VAR-implied value of β12 is 0.58, nowhere near the estimate of 0.20 that we observe in the data.
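One way to compute VAR-implied values of βh is to stack the restricted VAR in companion form, solve for the unconditional covariance matrix, and read off the autocovariances that enter the ratio of Cov(St − St−h, Lt − Lt−h) to Var(Lt − Lt−h). The sketch below takes that route under stated assumptions: the function name `var_implied_beta` and every parameter value are hypothetical, and the coefficient matrices are illustrative rather than the Table 3 estimates.

```python
import numpy as np

def var_implied_beta(A1, A2, Sigma, h_lag, horizons):
    """beta_h implied by y_{t+1} = mu + A1 y_t + A2 y_{t-h_lag} + eps, with y = (L, S)'.

    In terms of (2.2a)-(2.2b), row L of A1 is (d1L + d3L, d2L + d4L) and
    row L of A2 is (-d3L, -d4L); the S rows are built the same way.
    """
    n = 2 * (h_lag + 1)
    F = np.zeros((n, n))              # companion matrix for z_t = (y_t, ..., y_{t-h_lag})
    F[:2, :2] = A1
    F[:2, -2:] += A2
    F[2:, :-2] = np.eye(n - 2)
    Q = np.zeros((n, n))
    Q[:2, :2] = Sigma
    # Unconditional covariance G of z_t solves G = F G F' + Q.
    G = np.linalg.solve(np.eye(n * n) - np.kron(F, F), Q.reshape(-1)).reshape(n, n)
    G0 = G[:2, :2]
    betas = {}
    for h in horizons:
        Gh = (np.linalg.matrix_power(F, h) @ G)[:2, :2]       # Cov(y_t, y_{t-h})
        cov_dS_dL = 2 * G0[0, 1] - Gh[0, 1] - Gh[1, 0]        # Gamma_12(-h) = Gamma_21(h)
        var_dL = 2 * (G0[0, 0] - Gh[0, 0])
        betas[h] = 1 + cov_dS_dL / var_dL
    return betas

if __name__ == "__main__":
    # Hypothetical non-Markovian case: slope falls after past increases in level.
    A1 = np.array([[0.98, 0.0], [-0.09, 0.95]])
    A2 = np.array([[0.0, 0.0], [0.09, 0.0]])
    Sigma = np.diag([0.09, 0.04])
    print(var_implied_beta(A1, A2, Sigma, h_lag=12, horizons=[1, 12]))
```

Setting A2 to zero gives the Markovian benchmark, so the same function can be used to compare implied β12 with and without lagged changes.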

2.1.2 “Conundrum” episodes

The fact that past changes in the level of the yield curve are increasingly useful for predicting future changes in slope helps explain some important recent “conundrum” episodes (i.e., 12-month periods when short-term and long-term rates moved in opposite directions) that would be hard to reconcile with fully Markovian yield curve dynamics. Figure 1 shows Treasury yield curves around three noteworthy “conundrum” episodes: Greenspan’s original 2004 “conundrum,” the 2008 episode, which was a “conundrum in reverse,” and the 2017 conundrum.

We first consider Greenspan’s original 2004 “conundrum.” We take the unrestricted estimates of equation (2.2b) in the post-2000 sample with h = 12 from Table 3 and the estimates restricting the coefficients on lagged changes to zero (δ3S = δ4S = 0). Starting in May 2004, we simulate the counterfactual path of 10-year yields that would have prevailed if δ3S = δ4S = 0. To do so, we hold the level factor at its actual value and use the residuals from the unrestricted version of (2.2b), but set the parameters to their estimated values in the restricted regression. The top panel of Figure 5 plots the actual 1- and 10-year yields over this 2004 conundrum period along with the 10-year yield under this counterfactual scenario. Had the slope not responded to lagged changes in the level of the yield curve, Figure 5 shows that, instead of falling, 10-year yields would have risen in 2004.

The next two panels of Figure 5 repeat this exercise for the two other “conundrum” episodes shown in Figure 1. If the slope had not responded to past changes in level, in both cases, 10-year yields would have moved in the same direction as 1-year yields.

2.2 Bond return predictability

We can recast our main finding (the fact that, in recent years, βh is so large at high frequencies and then declines rapidly as a function of horizon h) as a result about bond return predictability. Specifically, we show that this result arises because past increases in the level of rates lead to a temporary rise in the expected return on long-term bonds relative to that on short-term bonds. Thus, our findings reflect a new form of bond return predictability.

Results for 10-year bonds: Recall that the k-month log return on n-year zero-coupon bonds from month t to t + k is

$$r^{(n)}_{t \to t+k} \equiv \log\left(P^{(n-k/12)}_{t+k} / P^{(n)}_{t}\right) = n\, y^{(n)}_{t} - (n - k/12)\, y^{(n-k/12)}_{t+k}, \tag{2.5}$$

where $P^{(n)}_{t} = \exp(-n\, y^{(n)}_{t})$ is the price of an n-year zero-coupon bond at time t. The k-month excess return on n-year bonds over the riskless return on k-month bills, $r^{(k/12)}_{t \to t+k} = (k/12)\, y^{(k/12)}_{t}$, is

$$rx^{(n)}_{t \to t+k} \equiv r^{(n)}_{t \to t+k} - r^{(k/12)}_{t \to t+k} = (k/12)\left(y^{(n)}_{t} - y^{(k/12)}_{t}\right) - (n - k/12)\left(y^{(n-k/12)}_{t+k} - y^{(n)}_{t}\right). \tag{2.6}$$

We first focus on k = 1 and 3-month excess returns on n = 10-year zero-coupon bonds.
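As a small worked example of equations (2.5) and (2.6), the sketch below computes a k-month log return and excess return from continuously compounded, annualized zero-coupon yields expressed as decimals. The function names and the numerical yields are illustrative assumptions, not values from the data.

```python
def log_return(y_n_t, y_nk_tk, n, k):
    """k-month log return on an n-year zero-coupon bond, as in equation (2.5).

    y_n_t   : yield on the n-year bond at time t
    y_nk_tk : yield on the (n - k/12)-year bond at time t + k
    """
    return n * y_n_t - (n - k / 12) * y_nk_tk

def excess_return(y_n_t, y_bill_t, y_nk_tk, n, k):
    """k-month excess return over k-month bills, as in equation (2.6)."""
    carry = (k / 12) * (y_n_t - y_bill_t)                 # yield pick-up over bills
    capital_gain = -(n - k / 12) * (y_nk_tk - y_n_t)      # loss if the yield rises
    return carry + capital_gain

if __name__ == "__main__":
    # Hypothetical inputs: 10-year yield of 3%, 3-month bill yield of 1%,
    # and a 9.75-year yield of 3.1% three months later.
    n, k = 10, 3
    rx = excess_return(0.03, 0.01, 0.031, n, k)
    r = log_return(0.03, 0.031, n, k)
    print(rx, r - (k / 12) * 0.01)   # the two expressions agree
```

The decomposition into a carry term and a capital-gain term makes clear why, for small k, excess returns on long-maturity bonds are dominated by changes in yields.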

To draw out the connection to the predictable curve flattening discussed above, we show that our results for 10-year returns are related to predictability of the returns on what we refer to as “level-mimicking” and “slope-mimicking” portfolios. Specifically, we follow Joslin et al. (2014) and construct bond portfolios that locally mimic changes in the level and slope factors. Consider a factor-mimicking portfolio that places weight $w_n$ on zero-coupon bonds with n years to maturity. The excess return on this portfolio is $rx^{P}_{t \to t+k} = (\sum_n w_n \times rx^{(n)}_{t \to t+k}) / |\sum_n w_n|$. The level-mimicking portfolio has a weight −1 on 1-year bonds and no weight on any other bonds. Recalling that $L_t \equiv y^{(1)}_t$ and $S_t \equiv y^{(10)}_t - y^{(1)}_t$, for small k we have $rx^{(10)}_{t \to t+k} \approx -10 \times (\Delta_k L_{t+k} + \Delta_k S_{t+k})$ and $rx^{(1)}_{t \to t+k} \approx -1 \times \Delta_k L_{t+k}$. Thus, the level-mimicking portfolio has a k-month excess return of:

$$rx^{LEVEL}_{t \to t+k} = -1 \times rx^{(1)}_{t \to t+k} \approx \Delta_k L_{t+k}.$$

The slope-mimicking portfolio has a weight 1 on 1-year bonds and −0.1 on 10-year bonds,12 so:

$$rx^{SLOPE}_{t \to t+k} = \left(1 \times rx^{(1)}_{t \to t+k} - 0.1 \times rx^{(10)}_{t \to t+k}\right)/0.9 \approx \Delta_k S_{t+k}/0.9.$$

12Note that the slope-mimicking portfolio is approximately hedged against parallel shifts in the level of the yield curve. Thus, $rx^{SLOPE}_{t \to t+k}$ corresponds to the returns on what fixed-income investors would call a “steepener” trade, i.e., a trade that will profit if the yield curve steepens.

Finally, we note that the excess return on the ten-year bond is just a linear combination of the level- and slope-mimicking excess returns:

$$rx^{(10)}_{t \to t+k} = -9 \times rx^{SLOPE}_{t \to t+k} - 10 \times rx^{LEVEL}_{t \to t+k}. \tag{2.7}$$
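The linear combination in equation (2.7) holds exactly given the portfolio weights stated above, which a few lines of code can confirm. This is only an illustrative check; the function names and the input returns are hypothetical.

```python
def mimicking_returns(rx1, rx10):
    """Excess returns on the level- and slope-mimicking portfolios.

    rx1, rx10 : excess returns on 1-year and 10-year zero-coupon bonds.
    """
    rx_level = -1.0 * rx1                          # weight -1 on 1-year bonds
    rx_slope = (1.0 * rx1 - 0.1 * rx10) / 0.9      # weights +1 and -0.1, scaled by |sum of weights|
    return rx_level, rx_slope

def ten_year_from_factors(rx_level, rx_slope):
    """Recover the 10-year excess return from the two portfolios, as in equation (2.7)."""
    return -9.0 * rx_slope - 10.0 * rx_level

if __name__ == "__main__":
    rx1, rx10 = 0.002, -0.015                      # hypothetical monthly excess returns
    rx_level, rx_slope = mimicking_returns(rx1, rx10)
    print(ten_year_from_factors(rx_level, rx_slope), rx10)
```

Because the mapping is exact, any regression coefficient for the 10-year bond equals −9 times the slope-portfolio coefficient minus 10 times the level-portfolio coefficient when the same regressors are used.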

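We consider the following system of predictive regressions:

$$rx^{LEVEL}_{t \to t+k} = \delta_{0,LEVEL} + \delta_{1,LEVEL} L_t + \delta_{2,LEVEL} S_t + \delta_{3,LEVEL}(L_t - L_{t-h}) + \delta_{4,LEVEL}(S_t - S_{t-h}) + \varepsilon_{LEVEL,t \to t+k}, \tag{2.8a}$$

$$rx^{SLOPE}_{t \to t+k} = \delta_{0,SLOPE} + \delta_{1,SLOPE} L_t + \delta_{2,SLOPE} S_t + \delta_{3,SLOPE}(L_t - L_{t-h}) + \delta_{4,SLOPE}(S_t - S_{t-h}) + \varepsilon_{SLOPE,t \to t+k}, \tag{2.8b}$$

$$rx^{(10)}_{t \to t+k} = \delta_{0,10} + \delta_{1,10} L_t + \delta_{2,10} S_t + \delta_{3,10}(L_t - L_{t-h}) + \delta_{4,10}(S_t - S_{t-h}) + \varepsilon_{10,t \to t+k}. \tag{2.8c}$$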

We report the results from estimating these predictive regressions for k = 1 and 3-month returns and using h = 6 and 12-month past changes in yield-curve factors in Table 4.

The results in Panels A and B of Table 4, where we forecast $rx^{LEVEL}_{t \to t+k}$ and $rx^{SLOPE}_{t \to t+k}$, are entirely consistent with those in Table 3. Specifically, in the post-2000 sample, lagged changes in level are highly significant predictors of excess returns on the slope-mimicking portfolio. Again, our key finding is that, since 2000, an increase in the level of short rates has been followed by yield curve flattening.

Panel C of Table 4 considers the excess returns on 10-year bonds and reports estimates of equation (2.8c). Based on equation (2.7), the coefficients in equation (2.8c) are a linear combination of the coefficients in equations (2.8a) and (2.8b). As shown in Panels A and B, in the post-2000 sample, the excess returns on the slope-mimicking portfolio depend negatively on Lt − Lt−h but the excess returns on the level-mimicking portfolio depend positively on Lt − Lt−h.13 While the two effects partially cancel out when predicting 10-year excess returns, the net effect is positive and statistically significant in the post-2000 data. In other words, past increases in the level of rates lead to an increase in risk premia on long-term bonds.14

Finally, to assess the persistence of the associated shifts in term premia, in untabulated results, we estimate equation (2.8c) holding h fixed and varying the return-forecasting horizon k. We find that the return predictability associated with Lt − Lt−h is confined to small k—i.e., k ≤ 6 months. Thus, the associated rise in bond risk premia is indeed quite short-lived.

In summary, we find that, since 2000, term premia on long-term bonds are temporarily elevated following past increases in short rates. This implies that, relative to an expectations-hypothesis baseline, long-term rates temporarily overreact to movements in short rates, exhibiting what Mankiw and Summers (1984) called “excess sensitivity” at high frequencies.

13The latter fact is consistent with Piazzesi et al. (2015) and Cieslak (forthcoming), who account for it either with expectational errors or time-varying risk premia. Brooks et al. (2017) also show that the federal funds rate displays short-term momentum.

14The return predictability associated with past changes in level (Lt − Lt−h) remains similar if, instead of controlling for level (Lt) and slope (St), we control for the first five forward rates as in Cochrane and Piazzesi (2005).

Trading strategies: As another way of assessing the resulting return predictability, we consider trading strategies in which an investor decides to take either a long or short position in the slope-mimicking portfolio every month. We assume that the investor takes a long (short) position in the slope-mimicking portfolio from month t to month t + 1 if Lt < Lt−h (Lt > Lt−h). Alternatively, we assume that the investor takes a position in the slope-mimicking portfolio from month t to month t + 1 that is proportional to −(Lt − Lt−h). Table 5 computes the annualized Sharpe ratios of these two related trading strategies for different choices of h, in the pre- and post-2000 samples. As shown in Table 5, the implied annualized Sharpe ratios for these strategies range from about 0.5 to 0.7 in the post-2000 sample but were negligible in the pre-2000 sample.
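The two trading rules can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `strategy_sharpe` is hypothetical, monthly returns are annualized by multiplying the mean-to-volatility ratio by the square root of 12, and the inputs are placeholders rather than the series behind Table 5.

```python
import numpy as np

def strategy_sharpe(level, rx_slope_next, h=12, proportional=False):
    """Annualized Sharpe ratio of the sign-based or proportional slope strategy.

    level[t]         : level factor L_t in month t
    rx_slope_next[t] : slope-mimicking excess return from month t to t + 1
    """
    level = np.asarray(level, dtype=float)
    rx = np.asarray(rx_slope_next, dtype=float)
    signal = -(level[h:] - level[:-h])              # -(L_t - L_{t-h}) for t = h, ..., T-1
    position = signal if proportional else np.sign(signal)
    pnl = position * rx[h:]                         # monthly strategy excess returns
    return np.sqrt(12) * pnl.mean() / pnl.std(ddof=1)

if __name__ == "__main__":
    # Placeholder data in which the slope return follows the signal by construction.
    rng = np.random.default_rng(1)
    L = np.cumsum(rng.standard_normal(240))
    h = 6
    rx = np.zeros(240)
    rx[h:] = -(L[h:] - L[:-h]) + rng.standard_normal(240 - h)
    print(strategy_sharpe(L, rx, h=h), strategy_sharpe(L, rx, h=h, proportional=True))
```

The sign-based rule goes long when Lt < Lt−h and short when Lt > Lt−h, while the proportional rule scales the position by the size of the past change in level.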

Results for other bond maturities: Finally, we examine the predictability for other bond maturities. If, as we argue, past increases in short rates temporarily raise the net supply of long-term bonds that investors must hold, thereby raising the compensation investors require for bearing interest-rate risk, this should have a larger impact on the expected returns of long-term bonds than intermediate bonds. This is because the returns on long-term bonds are more sensitive to shifts in yields than those on intermediate bonds (Vayanos and Vila, 2009; Greenwood and Vayanos, 2014).

We explore this prediction in the first plot in Figure 6 where we estimate

$$rx^{(n)}_{t \to t+k} = \delta_{0,n} + \delta_{1,n} L_t + \delta_{2,n} S_t + \delta_{3,n}(L_t - L_{t-h}) + \delta_{4,n}(S_t - S_{t-h}) + \varepsilon^{(n)}_{t \to t+k} \tag{2.9}$$

separately for n = 1, …, 20-year bonds. We then plot the coefficients, δ3,n, from estimating equation (2.9) versus bond maturity, n, for the pre-2000 and post-2000 samples. (For the pre-2000 sample, the longest available maturity is n = 15 years.) We show the results for k = 3-month returns and h = 6-month past changes in level and slope in equation (2.9), but the results are similar for other choices of k and h. Consistent with the idea that past increases in short rates temporarily raise the price of interest-rate risk, the coefficients δ3,n are monotonically increasing in bond maturity n in the post-2000 sample. By contrast, there is no predictability in the pre-2000 sample.

This temporary rise in the compensation for bearing interest rate risk impacts the yield and forward rate curves. As explained in Greenwood and Vayanos (2014), a short-lived rise in the compensation for bearing interest rate risk may have a relatively constant or even a hump-shaped effect on the yield and forward curves as opposed to the monotonically increasing effect shown above for returns. The intuition is that the impact on bond yields equals the effect on a bond’s average expected returns over its lifetime. As a result, a temporary rise in the compensation for bearing interest rate risk can have a greater impact on intermediate-term yields than on long-term yields. Thus, we plot the slope coefficients δ3,n versus maturity n from estimating

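$$f^{(n-k/12)}_{t+k} - f^{(n)}_{t} = \delta_{0,n} + \delta_{1,n} L_t + \delta_{2,n} S_t + \delta_{3,n}(L_t - L_{t-h}) + \delta_{4,n}(S_t - S_{t-h}) + \varepsilon^{(n)}_{t \to t+k} \tag{2.10}$$

(here $f^{(n)}_{t} \equiv n\, y^{(n)}_{t} - (n-1)\, y^{(n-1)}_{t}$ is the 1-year rate (n − 1) years forward) and

$$y^{(n-k/12)}_{t+k} - y^{(n)}_{t} = \delta_{0,n} + \delta_{1,n} L_t + \delta_{2,n} S_t + \delta_{3,n}(L_t - L_{t-h}) + \delta_{4,n}(S_t - S_{t-h}) + \varepsilon^{(n)}_{t \to t+k} \tag{2.11}$$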

for n = 1, 2, …, 20 years for both the pre-2000 and post-2000 samples.15 After 2000, the second plot in Figure 6 shows that a past increase in short-term rates has a slightly hump-shaped effect on the evolution of the forward curve with the peak impact at n = 4 years. Turning to yields, while past increases in the level of short rates forecast a future flattening of the yield curve in the post-2000 data as noted above, the bottom plot in Figure 6 shows that the expected change in long-term yields is relatively constant beyond roughly 5 years.

In summary, the results for different maturities support the view that past increases in short-term rates temporarily raise the compensation that investors earn for bearing interest-rate risk.

3 Model

In this section, we construct a simple model that is useful for explaining our key findings: since 2000, short- and long-term yields no longer move strongly together at low frequencies, but they move together even more strongly at high frequencies than in the past. In our model, time is discrete and infinite. Risk-averse investors can either hold long-term, perpetual nominal bonds or short-term nominal bonds.

15Since $f^{(n-k/12)}_{t+k} - f^{(n)}_{t} = -(rx^{(n)}_{t \to t+k} - rx^{(n-1)}_{t \to t+k})$, there is a tight connection between the coefficients in equations (2.10) and (2.9). Specifically, defining $f^{(1-k/12)}_{t+k} - f^{(1)}_{t} = -rx^{(1)}_{t \to t+k}$, we have $rx^{(n)}_{t \to t+k} = \sum_{m=1}^{n} (f^{(m)}_{t} - f^{(m-k/12)}_{t+k})$. Thus, the coefficients in equation (2.9) for maturity n can be recovered by summing −1 times the coefficients in equation (2.10) for all maturities m ≤ n. Similarly, the coefficients from equation (2.11) for maturity n are approximately the average of the coefficients in equation (2.10) for all maturities m ≤ n.

Short-term nominal bonds are available in perfectly elastic supply and the nominal interest rate on short-term bonds from t to t + 1, denoted it, follows an exogenous stochastic process. Long-term bonds are available in a given net supply that must be absorbed by the investors in our model. The net supply, st, is the gross supply of long-term bonds outstanding net of the amount purchased by other agents (e.g., “preferred habitat investors”) who have inelastic demands for long-term bonds. As in Vayanos and Vila (2009) and Greenwood and Vayanos (2014), shifts in the supply and demand for long-term bonds impact the term premium on long bonds.

The first key assumption in our model is that shocks to the net supply of long-term bonds are positively correlated with shocks to short rates: increases in short rates are either associated with increases in the gross supply of long-term bonds or with reductions in the inelastic demands of preferred habitat investors. While we discuss several amplification mechanisms that give rise to this reduced form, the simplest interpretation is that there is a growing set of investors who “reach for yield” when short rates decline (Hanson and Stein, 2015). This assumption implies that increases in short rates are associated with increases in term premia, generating “excess sensitivity” of long rates beyond the sensitivity implied by the expectations hypothesis.

The second key assumption, following Duffie (2010), is that capital is slow-moving: these supply and demand shocks walk down a short-run demand curve that is steeper than the long-run demand curve. This slow-moving capital dynamic implies that an increase in short-term interest rates gives rise to a short-lived increase in term premia on long-term bonds. As a result, the excess sensitivity of long-term rates is greatest when measured at short horizons. Thus, by combining reaching-for-yield and slow-moving capital, our model helps us understand why long-term rates may temporarily overreact to movements in short-term rates. Formally, our model is a close cousin of the model in Greenwood et al. (forthcoming), who incorporate slow-moving capital effects into a model of the term structure.

As we detail below, our model can match our key findings (βh has fallen for large h and risen for small h post-2000) if (i) shocks to short-term nominal rates have become slightly less persistent and (ii) the kinds of supply-and-demand-based amplification mechanisms that we emphasize have grown in importance. We argue that (i) is justified since there is strong evidence that shocks to the persistent component of nominal inflation have become far less volatile since the mid-1990s (Stock and Watson, 2007). Similarly, we argue that (ii) is justified since many of these amplification mechanisms appear to have become more powerful in recent years.

3.1 Model setting

3.1.1 Long-term nominal bonds

The long-term nominal bond is a perpetuity that pays a coupon of K > 0 each period. Let Pt denote the price of this bond at time t, so the return on long-term bonds from t to t + 1 is:

1 + Rt+1 = (Pt+1 + K) /Pt. (3.1)

To generate a tractable linear model, we use a Campbell and Shiller (1988) log-linear approximation to the return on this perpetuity. Specifically, defining θ ≡ 1/ (1 + K) < 1, the 1-period log return on the perpetuity from t to t + 1 is approximately

$$r_{t+1} \equiv \ln(1 + R_{t+1}) \approx \underbrace{\frac{1}{1-\theta}}_{D}\, y_t - \underbrace{\frac{\theta}{1-\theta}}_{D-1}\, y_{t+1}, \tag{3.2}$$

where yt is the log yield-to-maturity at time t and D = 1/ (1 − θ) = (K + 1) /K is the Macaulay duration when the bond is trading at par.16
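To get a feel for the accuracy of the log-linear approximation in equation (3.2), the sketch below compares it with the exact one-period log return on a perpetuity. It rests on two assumptions that are not spelled out in the text: the perpetuity is priced at P = K/Y, where Y is the per-period yield-to-maturity, and the log yield is y = ln(1 + Y). The function names and numerical inputs are illustrative.

```python
import math

def exact_log_return(Y_t, Y_next, K):
    """Exact one-period log return on a perpetuity with coupon K, assuming P = K / Y."""
    P_t, P_next = K / Y_t, K / Y_next
    return math.log((P_next + K) / P_t)

def approx_log_return(Y_t, Y_next, K):
    """Log-linear approximation: r is about D * y_t - (D - 1) * y_{t+1}, with y = ln(1 + Y)."""
    theta = 1.0 / (1.0 + K)
    D = 1.0 / (1.0 - theta)          # Macaulay duration at par, equal to (K + 1) / K
    return D * math.log1p(Y_t) - (D - 1.0) * math.log1p(Y_next)

if __name__ == "__main__":
    K = 0.05                          # hypothetical coupon; the bond trades at par when Y = K
    for Y_next in (0.050, 0.051, 0.055):
        print(Y_next, exact_log_return(0.05, Y_next, K), approx_log_return(0.05, Y_next, K))
```

The approximation is exact when the bond stays at par and deteriorates gradually as the yield moves away from K, which is the usual behavior of a first-order expansion.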

Let it denote the interest rate on short-term nominal bonds from t to t + 1 and let

$$rx_{t+1} \equiv r_{t+1} - i_t = \frac{1}{1-\theta}\, y_t - \frac{\theta}{1-\theta}\, y_{t+1} - i_t \tag{3.3}$$

denote the excess return on long-term bonds over short-term bonds from t to t + 1. Iterating equation (B.1) forward and taking expectations, the yield on long-term bonds is:

yt = (1 − θ) Σ∞j=0 θj Et [it+j + rxt+j+1] . (3.4)

The long-term yield is the sum of (i) an expectations hypothesis component (1 − θ) Σ∞j=0 θj Et [it+j ] that reflects expected future short rates and (ii) a term premium component (1 − θ) Σ∞j=0 θj Et [rxt+j+1] that reflects expected future excess returns on long-term bonds over short-term bonds.

16This log-linear approximation for default-free coupon-bearing bonds appears in Chapter 10 of Campbell et al. (1996). We review the derivation of equation (B.1) in the Internet Appendix.

3.1.2 Market participants

There are two groups of risk-averse investors in the model, each with identical risk tolerance τ , who differ solely in the frequency with which they can rebalance their bond portfolios.

The first group are “fast-moving investors” who are free to adjust their holdings of long-term and short-term bonds each period. Fast-moving investors are present in mass q and we denote their demand for long-term bonds at time t by bt. Fast-moving investors have mean-variance preferences over 1-period portfolio log returns. Thus, their demand for long-term bonds at time t is

$$b_t = \tau\, \frac{E_t[rx_{t+1}]}{Var_t[rx_{t+1}]}. \tag{3.5}$$

The second group are “slow-moving investors” who can only adjust their holdings of long-term and short-term bonds every k periods. Slow-moving investors are present in mass 1 − q. A fraction 1/k of these slow-moving investors is active each period and can reallocate their portfolios. However, they must then maintain this same portfolio allocation for the next k periods. As in Duffie (2010), this is a reduced-form way to model the forces (whether due to institutional frictions or to limited attention) that may limit the speed of capital flows. Since they only rebalance their portfolios every k periods, slow-moving investors have mean-variance preferences over their k-period cumulative portfolio excess return. Thus, the demand for long-term bonds from the subset of slow-moving investors who are active at time t is given by:

$$d_t = \tau\, \frac{E_t\left[\sum_{j=1}^{k} rx_{t+j}\right]}{Var_t\left[\sum_{j=1}^{k} rx_{t+j}\right]}. \tag{3.6}$$

3.1.3 Risk factors

Investors in long-term bonds face two different types of risk. First, they are exposed to interest rate risk: they will suffer a capital loss on their long-term bond holdings if short-term rates unexpectedly rise. Second, they are exposed to supply risk : there are shocks to the net supply of long-term bonds that impact long-term bond yields, holding fixed the expected future path of short-term interest rates. In other words, these supply shocks impact the term premium on long-term bonds. We make the following concrete assumptions about the evolution of these two risk factors.

Short-term nominal interest rates: Short-term nominal bonds are available in perfectly elastic supply. At time t, investors learn that short-term bonds will earn a riskless log return of it in nominal terms between time t and t + 1. One can think of the short-term nominal rate as being determined outside the model by monetary policy.

Crucially, we assume that the short-term nominal interest rate is the sum of a highly persistent component iP,t and a more transient component iT,t:

it = iP,t + iT,t. (3.7)

We assume that the persistent component iP,t follows an exogenous AR(1) process:

$$i_{P,t+1} = \bar{\imath} + \rho_P\,(i_{P,t} - \bar{\imath}) + \varepsilon_{P,t+1}, \tag{3.8}$$

where 0 < ρP < 1 and $Var_t[\varepsilon_{P,t+1}] = \sigma_P^2$. Similarly, we assume that the transient component iT,t follows an exogenous AR(1) process:

$$i_{T,t+1} = \rho_T\, i_{T,t} + \varepsilon_{T,t+1}, \tag{3.9}$$

where 0 < ρT ≤ ρP < 1 and $Var_t[\varepsilon_{T,t+1}] = \sigma_T^2$.
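Under these AR(1) dynamics, the expectations hypothesis component of the long-term yield in equation (3.4) responds to a unit shock with persistence ρ by (1 − θ) Σ∞j=0 (θρ)j = (1 − θ)/(1 − ρθ), since the expected short rate j periods ahead moves by ρj. The sketch below evaluates this loading both as a truncated sum and in closed form. The function name `eh_loading` and the parameter values are illustrative assumptions, not calibrated values.

```python
def eh_loading(rho, theta, n_terms=None):
    """Response of the expectations-hypothesis component of y_t to a unit AR(1) shock.

    With n_terms=None the closed form (1 - theta) / (1 - rho * theta) is returned;
    otherwise the geometric sum is truncated after n_terms terms.
    """
    if n_terms is None:
        return (1 - theta) / (1 - rho * theta)
    return (1 - theta) * sum((theta * rho) ** j for j in range(n_terms))

if __name__ == "__main__":
    theta = 0.99                      # hypothetical value of 1 / (1 + K)
    rho_P, rho_T = 0.99, 0.80         # hypothetical persistent and transient persistence
    print("persistent shock loading:", eh_loading(rho_P, theta))
    print("transient shock loading: ", eh_loading(rho_T, theta))
```

Because the loading is increasing in ρ, a larger share of persistent shocks raises the sensitivity of long rates to short rates through the expectations channel alone.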

We posit that σP —the volatility of shocks to the persistent component of short rates—has declined since the late 1990s. A natural explanation is that σP has declined because long-run inflation expectations have become far more anchored, leading to a flattening of the Phillips curve (Bernanke, 2007). Indeed, there is strong evidence that shocks to the persistent component of nominal inflation have become far less volatile since the mid-1990s (Stock and Watson, 2007).

As we show below, if σP is large, then long-term nominal rates will be highly sensitive to changes in short-term nominal rates due to the standard expectations hypothesis channel. While this is a good explanation for the high sensitivity observed in the 1970s, 1980s, and early 1990s when long-run inflation expectations were less well-anchored (Gürkaynak et al., 2005), it seems far less plausible in recent years since long-run inflation expectations have been so well-anchored. And, as shown by Beechey and Wright (2009) and Hanson and Stein (2015), in the post-2000 period, the high sensitivity of long-term nominal rates to movements in short-term nominal rates primarily reflects the sensitivity of long-term real rates. This is puzzling from the standpoint of the expectations hypothesis assuming one adheres to the standard view that expected future real rates at distant horizons should not fluctuate meaningfully at high frequencies. Thus, to match the strong high-frequency sensitivity of long-term rates in recent years, it is natural to invoke shocks to long-term bond supply that impact term premia (Hanson and Stein, 2015).

Supply of long-term bonds: We assume that the long-term nominal bond is available in an exogenous, time-varying net supply st that must be held in equilibrium by fast-moving and slow-moving investors. This net supply equals the gross supply of long-term bonds minus the demand for long-term bonds from other agents outside the model who have inelastic demand for these bonds. Formally, we assume that st follows an AR(1) process:

$$s_{t+1} = \bar{s} + \rho_s\,(s_t - \bar{s}) + \varepsilon_{s,t+1} + C\varepsilon_{P,t+1} + C\varepsilon_{T,t+1}, \tag{3.10}$$

where 0 < ρs ≤ ρT < 1, C ≥ 0, and $Var_t[\varepsilon_{s,t+1}] = \sigma_s^2$. When C > 0, shocks to short rates are positively associated with shocks to the net supply of long-term bonds. The εs,t+1 shocks capture other forces that impact the net supply of long-term bonds. The model can be solved for any arbitrary correlation structure among the three underlying shocks εP,t+1, εT,t+1, and εs,t+1. However, for simplicity, we assume that the three shocks are mutually orthogonal.

3.1.4 Shocks to supply and the short rate

Suppose that C > 0 in equation (3.10), so short rates shocks are positively associated with shocks to long-term bond supply. In the Internet Appendix, we show that:

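$$\begin{aligned} s_t ={}& \bar{s} + C\Big[(i_{P,t} - \bar{\imath}) - (\rho_P - \rho_s) \sum_{j=0}^{\infty} \rho_s^{j}\,(i_{P,t-j-1} - \bar{\imath})\Big] \\ &+ C\Big[i_{T,t} - (\rho_T - \rho_s) \sum_{j=0}^{\infty} \rho_s^{j}\, i_{T,t-j-1}\Big] + \Big[\sum_{j=0}^{\infty} \rho_s^{j}\, \varepsilon_{s,t-j}\Big]. \end{aligned} \tag{3.11}$$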
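The moving-average representation in equation (3.11) can be checked numerically against the recursion in equation (3.10). The sketch below simulates all three processes in deviations from their steady states, starting from zero so that the infinite sums truncate exactly at the start of the sample. The function names and parameter values are hypothetical and chosen only to satisfy 0 < ρs ≤ ρT ≤ ρP < 1.

```python
import numpy as np

def simulate_supply(T=300, C=0.5, rho_P=0.97, rho_T=0.8, rho_s=0.6, seed=0):
    """Simulate short-rate components and net supply, all in deviations from steady state."""
    rng = np.random.default_rng(seed)
    eP, eT, eS = rng.standard_normal((3, T))
    iP, iT, s = np.zeros(T), np.zeros(T), np.zeros(T)
    for t in range(1, T):
        iP[t] = rho_P * iP[t - 1] + eP[t]
        iT[t] = rho_T * iT[t - 1] + eT[t]
        s[t] = rho_s * s[t - 1] + eS[t] + C * eP[t] + C * eT[t]   # equation (3.10)
    return iP, iT, s, eS

def supply_from_rates(iP, iT, eS, t, C, rho_P, rho_T, rho_s):
    """Right-hand side of equation (3.11) at date t, in deviations from steady state."""
    w = rho_s ** np.arange(t)          # rho_s^j for j = 0, ..., t-1
    past_P = iP[t - 1::-1]             # iP[t-1], iP[t-2], ..., iP[0]
    past_T = iT[t - 1::-1]
    shocks = eS[t:0:-1]                # eS[t], eS[t-1], ..., eS[1]
    return (C * (iP[t] - (rho_P - rho_s) * (w @ past_P))
            + C * (iT[t] - (rho_T - rho_s) * (w @ past_T))
            + w @ shocks)

if __name__ == "__main__":
    params = dict(C=0.5, rho_P=0.97, rho_T=0.8, rho_s=0.6)
    iP, iT, s, eS = simulate_supply(T=300, seed=0, **params)
    t = 299
    print(s[t], supply_from_rates(iP, iT, eS, t, **params))
```

The check makes the economics of the representation concrete: supply responds to the gap between the current short rate and a geometrically weighted average of its own past values.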

Thus, when ρs < ρT , net bond supply is increasing in the differences between the current level of each component of the short rate and a geometric moving-average of past values of that component. The specification in equation (3.11) is a reduced-form way of capturing several different supply-and-demand mechanisms that help explain why negative shocks to short-term rates are associated with declines in the term premium on long-term bonds. While the precise mix of these mechanisms and their combined strength may vary over time and across countries, there is a growing consensus that these mechanisms play an increasingly important role in fixed-income markets.

The “reaching-for-yield” channel: The simplest interpretation of our assumption that C > 0 in the post-2000 data is that there is a growing set of investors who tend to “reach for yield” when short-term interest rates decline. Specifically, according to the reaching-for-yield channel (Hanson and Stein, 2015), negative shocks to short-term rates boost the demand for long-term bonds from “yield-oriented investors.” The idea is that, for either frictional or behavioral reasons, these yield-oriented investors care about the current yield on their portfolios over and above their expected portfolio returns. Because expected mean reversion in short rates means that the yield curve is steeper when short rates are low, yield-oriented investors’ demand for long-term bonds is greater when short rates are low. This means that the net supply of long-term bonds that must be held by the fast- and slow-moving investors in our model declines when short rates fall. More generally, low short rates may increase investors’ risk appetites through a variety of channels, thereby depressing term premia (Maddaloni and Peydró, 2011; Di Maggio and Kacperczyk, 2017; Drechsler et al., 2014).

Importantly, Lian et al. (2017) provide experimental evidence that there is a non-linear relationship between reaching-for-yield behavior and the level of rates. Specifically, building on Prospect Theory (Kahneman and Tversky, 1979), they argue that reaching for yield becomes more pronounced as rates fall further below some reference level that investors are accustomed to based on past experience, just as in equation (3.11). Relatedly, they argue that, because people tend to think in proportions as opposed to in differences (a psychological phenomenon known as Weber’s law), small return differentials loom larger in investors’ minds when the level of rates is lower (see Bordalo et al. (2013)). For instance, a 1% pick-up in yield from buying long-term bonds instead of short-term bonds seems like “a better deal” when short rates are 1% than when they are 5%, because it doubles the yield rather than increasing it by a factor of 1.2. Thus, their evidence suggests that the reaching-for-yield channel may have grown stronger in recent years as interest rates have reached historically low levels. In the language of our model, this suggests that C has risen.

The “mortgage convexity” channel: According to the mortgage convexity channel (Hanson, 2014; Malkhozov et al., 2016), negative shocks to short-term interest rates induce mortgage refinancing waves that lead to temporary declines in the duration of outstanding fixed-rate mortgages—i.e., a temporary reduction in the gross supply of long-term bonds. As a result, declines in short rates are associated with temporary declines in term premia. And, due to these mortgage refinancing dynamics, the current gross supply of long-term bonds depends on the difference between current interest rates and a moving-average of past rates as in equation (3.11). The mortgage refinancing channel is only relevant in countries like the U.S. where fixed-rate mortgages with an embedded prepayment option—i.e., mortgages that are “negatively convex” —are an important source of financing.

Crucially, Hanson (2014) shows that the strength of this channel grew during the 1990s as mortgage-backed securities became a larger component of the U.S. fixed-income market. Specifically, Hanson (2014) presents evidence that long-term rates are more sensitive to short-term rates when the aggregate negative convexity of the mortgage market is larger relative to the broader U.S. fixed-income market, and that aggregate negative convexity has trended upwards. In our reduced-form framework, this is equivalent to the statement that C has risen over time.

Asset and liability management by insurers and pensions: Domanski et al. (2017) and Shin (2017) point to a related convexity-based amplification mechanism stemming from the desire of insurers and pensions to match the duration of their assets and liabilities. They argue that the convexity of insurers’ and pensions’ liabilities is greater than the convexity of their assets. Thus, as interest rates decline, the duration of their liabilities tends to increase more than the duration of their long-term bond holdings, and insurers and pensions increase their demand for long-term bonds to match asset and liability duration. Holding fixed the gross supply of long-term bonds, this means that the net supply of long-term bonds that must be held by other investors is lower when short rates are low. As a result, term premia on long-term bonds fall when short rates decline. This mechanism is arguably quite important in European fixed-income markets where insurers and pensions play an especially important role. And this dynamic may have strengthened in recent years as regulators have pushed insurers to more prudently manage their interest rate exposures.

A behavioral over-extrapolation mechanism: According to this channel hinted at by Piazzesi et al. (2015) and Giglio and Kelly (2018), there is a set of biased investors who over-estimate the persistence of short-term rates, perhaps because they have “diagnostic expectations” (Bordalo et al., 2017) and form their expectations of future short rates using the representativeness heuristic (Kahneman and Tversky, 1972). As a result, negative shocks to short rates lead these biased investors to demand more long-term bonds relative to rational investors who properly estimate the persistence of short rates. This means that the net supply of long-term bonds that must be held by unbiased investors declines when short rates are low, leading to a decline in the term premium. In the simplest telling, there is little reason to expect that this extrapolative tendency should have increased since 2000. However, in a more complicated telling, the amount of over-extrapolation might have risen if some investors are using an “outdated” model that features a larger fraction of persistent short rate shocks than there has been in recent years. This might be because some investors were slow to learn about the decline in the volatility of trend inflation documented in Stock and Watson (2007). This more complicated version of the over-extrapolation story is again consistent with the idea that C has risen since 2000.

3.2 Equilibrium yields

At time t, there is a mass q of fast-moving investors, each with demand $b_t$, and a mass $(1-q)k^{-1}$ of active slow-moving investors who rebalance their portfolios, each with demand $d_t$. These investors must accommodate the active supply, which is the total net supply $s_t$ of long-term bonds less any supply held off the market by inactive slow-moving investors who do not rebalance their portfolios, $(1-q)k^{-1}\sum_{j=1}^{k-1} d_{t-j}$.

Thus, the market-clearing condition for long-term bonds at time t is

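$$\underbrace{q\,b_t}_{\text{Fast demand}} + \underbrace{(1-q)k^{-1}d_t}_{\text{Active slow demand}} = \underbrace{s_t}_{\text{Total supply}} - \underbrace{(1-q)k^{-1}\sum_{j=1}^{k-1} d_{t-j}}_{\text{Inactive slow holdings}}. \qquad (3.12)$$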
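As a minimal numerical sketch of the market-clearing condition (3.12), the following Python snippet backs out the per-investor holdings of fast-moving investors, $b_t$, from total supply and the current and lagged demands of slow-moving investors. The function name and the example numbers are hypothetical and serve only as an illustration; in the model, $d_t$ is itself an equilibrium object, whereas here it is simply held fixed.

```python
def fast_investor_demand(s_t, d_t, d_lags, q, k):
    """Solve the market-clearing condition (3.12) for b_t.

    s_t    : total net supply of long-term bonds
    d_t    : demand of each active slow-moving investor at time t
    d_lags : [d_{t-1}, ..., d_{t-(k-1)}], holdings of the inactive cohorts
    q      : mass of fast-moving investors
    k      : number of periods between slow-moving investors' rebalancing dates
    """
    if len(d_lags) != k - 1:
        raise ValueError("d_lags must contain exactly k - 1 lagged demands")
    cohort_mass = (1.0 - q) / k            # mass of each slow-moving cohort
    inactive = cohort_mass * sum(d_lags)   # supply held off the market
    active_slow = cohort_mass * d_t        # absorbed by the rebalancing cohort
    return (s_t - inactive - active_slow) / q


# Hypothetical example: q = 0.3, k = 12, every slow-moving cohort holds 1.0.
b_steady = fast_investor_demand(s_t=1.0, d_t=1.0, d_lags=[1.0] * 11, q=0.3, k=12)
# Supply rises by 0.1 while slow-moving demand is held fixed for illustration.
b_shock = fast_investor_demand(s_t=1.1, d_t=1.0, d_lags=[1.0] * 11, q=0.3, k=12)
print(b_steady, b_shock)
```

With slow-moving demand held fixed, the additional 0.1 units of supply are absorbed entirely by the mass q of fast-moving investors, so $b_t$ rises by 0.1/q.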

We conjecture that equilibrium yields yt and the demands of active slow-moving investors dt are linear functions of a state vector, xt, that includes the steady-state deviations of both components of short-term nominal interest rates, the net supply of bonds, and holdings of bonds by inactive slow-moving investors.

Formally, we conjecture that the yield on long-term bonds is

$$y_t = \alpha_0 + \alpha_1' x_t \qquad (3.13)$$

and that slow-moving investors’ demand for long-term bonds is

$$d_t = \delta_0 + \delta_1' x_t, \qquad (3.14)$$

where the (k + 2) × 1 dimensional state vector, $x_t$, is given by

$$x_t = [\,i_{P,t} - \bar{\imath},\; i_{T,t},\; s_t - \bar{s},\; d_{t-1} - \delta_0,\; \cdots,\; d_{t-(k-1)} - \delta_0\,]'. \qquad (3.15)$$

These assumptions imply that the state vector follows a VAR(1) process $x_{t+1} = \Gamma x_t + \epsilon_{t+1}$, where $\Gamma$ depends on the parameters $\delta_1$ governing slow-moving investors’ demand.

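We can show that equilibrium yields take the form:

$$
\begin{aligned}
y_t ={}& \underbrace{\bar{\imath} + \frac{1-\theta}{1-\rho_P\theta}\,(i_{P,t}-\bar{\imath}) + \frac{1-\theta}{1-\rho_T\theta}\, i_{T,t}}_{\text{Expected future short-term nominal rates}} + \underbrace{(q\tau)^{-1}V^{(1)}\left(\bar{s} - (1-q)\,\delta_0\right)}_{\text{Unconditional term premia}} \\
&+ \underbrace{(q\tau)^{-1}V^{(1)}\left[\frac{1-\theta}{1-\rho_s\theta}\,(s_t-\bar{s}) - (1-\theta)(1-q)k^{-1}\sum_{i=0}^{\infty}\theta^{i}E_t\Big[\sum_{j=0}^{k-1}\left(d_{t+i-j}-\delta_0\right)\Big]\right]}_{\text{Conditional term premia}}, \qquad (3.16)
\end{aligned}
$$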

where $V^{(1)} = Var_t[rx_{t+1}]$ is the equilibrium variance of 1-period excess returns on long-term bonds. As detailed in the Internet Appendix, solving the model involves numerically finding a solution to a system of 2k non-linear equations in 2k unknowns. An equilibrium solution only exists if investors are sufficiently risk tolerant (i.e., for τ sufficiently large). When an equilibrium exists, there can be multiple equilibrium solutions. Equilibrium non-existence and multiplicity of this sort arise in overlapping-generations, rational-expectations models such as ours where risk-averse investors with finite investment horizons trade an infinitely-lived asset that is subject to supply shocks.¹⁷ Different equilibria correspond to different self-fulfilling beliefs that investors can hold about the price-impact of supply shocks and, hence, the risks associated with holding long-term bonds. However, we always find a unique equilibrium that is stable in the sense that the equilibrium is robust to a small perturbation in investors’ beliefs regarding the equilibrium that will prevail in the future. Consistent with Samuelson’s “correspondence principle” (Samuelson, 1947), which says that the comparative statics of stable equilibria have certain properties, this unique stable equilibrium has comparative statics that accord with standard economic intuition. We focus on this unique stable equilibrium in our numerical illustrations. See Greenwood et al. (forthcoming) for an extensive discussion of these issues.

¹⁷ For previous treatments of these issues, see Spiegel (1998), Albagli (2015), and Greenwood et al. (forthcoming).

3.3 Matching the main findings

Consider the coefficient from a regression of $y_{t+h} - y_t$ on $i_{t+h} - i_t$ in the model:

$$\beta_h = \frac{Cov[y_{t+h}-y_t,\; i_{t+h}-i_t]}{Var[i_{t+h}-i_t]}. \qquad (3.17)$$

This is the model counterpart of the empirical regression coefficient in equation (1.1). Our main empirical findings are that βh has declined in the post-2000 sample at low frequencies (high h) but has risen at high frequencies (low h). In this section, we argue that our model can match these surprising patterns if two underlying parameters shifted in the post-2000 period:

1. σP has fallen: This means that shocks to the persistent component of short-term nominal rates have become less volatile in the post-2000 period.

2. C has risen: This means that the kinds of supply-and-demand-based amplification mechanisms that we emphasize have grown in importance.
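For concreteness, the sample analog of the regression coefficient βh can be computed from any pair of yield series as sketched below. This is a minimal illustration in plain Python and not the estimation code used in the paper; the function name `estimate_beta_h` and the toy series are hypothetical.

```python
def estimate_beta_h(y, i, h):
    """OLS slope from regressing y[t+h] - y[t] on i[t+h] - i[t] (with a constant).

    y, i : equal-length sequences of long-term and short-term rates
    h    : horizon over which changes are computed (overlapping changes are used)
    """
    if len(y) != len(i):
        raise ValueError("y and i must have the same length")
    dy = [y[t + h] - y[t] for t in range(len(y) - h)]
    di = [i[t + h] - i[t] for t in range(len(i) - h)]
    n = len(dy)
    mean_dy = sum(dy) / n
    mean_di = sum(di) / n
    cov = sum((a - mean_dy) * (b - mean_di) for a, b in zip(dy, di)) / n
    var = sum((b - mean_di) ** 2 for b in di) / n
    return cov / var


# Toy series (hypothetical): the long rate moves exactly half as much as the
# short rate, so the slope is 0.5 at every horizon.
short_rate = [0.0, 1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
long_rate = [1.0 + 0.5 * x for x in short_rate]
beta_1 = estimate_beta_h(long_rate, short_rate, h=1)
beta_2 = estimate_beta_h(long_rate, short_rate, h=2)
print(beta_1, beta_2)
```

In this toy example the sensitivity is the same at both horizons; the patterns discussed in the text correspond to βh differing systematically across h.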

If the expectations hypothesis holds (i.e., if the term premium on long-term bonds is constant over time), then, in the simplest version of the model where the short rate follows a univariate AR(1) process, βh should be a constant function of horizon h. In a more complicated version of the model (one where the short rate is the sum of a persistent and a transient component), the expectations hypothesis actually implies that βh should be an increasing function of h. To explain why βh declines with h, one can relax the expectations hypothesis and allow bond term premia to respond to shifts in the level of short rates. However, if the resulting shifts in term premia are as persistent as the underlying shifts in short-term interest rates, βh will still be a constant or increasing function of h. Thus, to explain why βh is a steeply decreasing function of h, one needs to assume that shifts in short rates give rise to transient movements in term premia.

Special case without slow-moving capital: To more formally develop these intuitions about the behavior of βh in the model, we first consider the special case in which there is no slow-moving capital (i.e., if either q = 1 or k = 1). In this special case, the model can be solved in closed form and the equilibrium yield on long-term bonds is

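$$y_t = \underbrace{\bar{\imath} + \tau^{-1}V^{(1)}\bar{s}}_{\alpha_0} + \underbrace{\frac{1-\theta}{1-\rho_P\theta}}_{\alpha_{1P}}\,(i_{P,t}-\bar{\imath}) + \underbrace{\frac{1-\theta}{1-\rho_T\theta}}_{\alpha_{1T}}\, i_{T,t} + \underbrace{\tau^{-1}V^{(1)}\frac{1-\theta}{1-\rho_s\theta}}_{\alpha_{1s}}\,(s_t-\bar{s}), \qquad (3.18)$$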

where $V^{(1)}$ is the smaller root of a quadratic equation given in the Internet Appendix. In this case, the model-implied regression coefficient is

$$\beta_h = \frac{\alpha_{1P}\,Var[\Delta_h i_{P,t+h}] + \alpha_{1T}\,Var[\Delta_h i_{T,t+h}] + \alpha_{1s}\left(Cov[\Delta_h i_{P,t+h}, \Delta_h s_{t+h}] + Cov[\Delta_h i_{T,t+h}, \Delta_h s_{t+h}]\right)}{Var[\Delta_h i_{P,t+h}] + Var[\Delta_h i_{T,t+h}]}, \qquad (3.19)$$

where for $X \in \{P, T\}$ we have $Var[\Delta_h i_{X,t+h}] = 2\,[(1-\rho_X^h)/(1-\rho_X^2)]\,\sigma_X^2$ and $Cov[\Delta_h i_{X,t+h}, \Delta_h s_{t+h}] = C\,[(2-\rho_s^h-\rho_X^h)/(1-\rho_s\rho_X)]\,\sigma_X^2$. Recall that we have assumed that $C \ge 0$ and $\rho_s \le \rho_T \le \rho_P$. For simplicity, in the following discussion, we will also assume that $\sigma_s^2 = 0$.¹⁸

We first consider the level of βh irrespective of horizon h. Inspecting equation (B.48), it is easy to see that:

• When C = 0, the level of βh is increasing in σP for all h. An increase in σP raises the fraction of total short-rate variation at all horizons that is due to movements in the more persistent component (i.e., raises $Var[\Delta_h i_{P,t+h}] / (Var[\Delta_h i_{P,t+h}] + Var[\Delta_h i_{T,t+h}])$ for all h). Since shocks to the more persistent component of short rates have a larger impact on long-term yields via a straightforward expectations hypothesis channel (i.e., since α1P > α1T ), an increase in σP raises the level of βh at all horizons. Thus, if σP declined between the pre-2000 and post-2000 periods as we argued above, this would lead βh to decline at all horizons h.

We next consider the way βh behaves as a function of horizon h. Again, using equation (B.48), it is easy to show that:

• When C = 0 and ρT = ρP , βh is a constant that is independent of h. These assumptions imply that the expectations hypothesis holds—i.e., there is no excess sensitivity—and that all shocks to short rates have the same persistence. In this benchmark case, βh = α1P = α1T for all h—i.e., the sensitivity of long rates to short rates is the same at all horizons.

• When C = 0 and ρT < ρP , βh is an increasing function of h. These assumptions imply that the expectations hypothesis holds, but there are now transient and persistent shocks to short rates. In this case, βh rises with h because (i) movements in the more persistent component of short rates are associated with larger movements in long-term yields (i.e., α1P > α1T ) and (ii) the persistent component dominates changes in short rates at longer horizons (i.e., $Var[\Delta_h i_{P,t+h}] / (Var[\Delta_h i_{P,t+h}] + Var[\Delta_h i_{T,t+h}])$ rises with h when ρT < ρP ).

• When C > 0 and ρs = ρT = ρP , βh is a constant that is independent of h. In this case, there is excess sensitivity (shifts in short rates lead to shifts in the term premium on long-term bonds), but the excess sensitivity is the same irrespective of horizon h. This is because $\Delta_h s_{t+h} = C\Delta_h i_{P,t+h} + C\Delta_h i_{T,t+h}$ when ρs = ρT = ρP (see equation (3.11)) and $Var[\Delta_h i_{P,t+h}] / (Var[\Delta_h i_{P,t+h}] + Var[\Delta_h i_{T,t+h}]) = \sigma_P^2 / (\sigma_P^2 + \sigma_T^2)$ when ρT = ρP .

• When C > 0 and ρs < ρT = ρP , βh is a decreasing function of h. In this case, long-term interest rates exhibit excess sensitivity to movements in short rates that declines with horizon h. Intuitively, if the supply shocks induced by shocks to short rates are more transient than the underlying shocks to short rates, then term premia will react more in the short run than in the long run. Thus, there will be greater excess sensitivity in the short run.

¹⁸ This is without loss of generality since $\sigma_s^2$ only impacts the level of α1s and does not otherwise affect βh.
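The four horizon patterns listed above can be checked numerically from equation (3.19). The sketch below is only illustrative: the value of `alpha_1s` is a hypothetical placeholder, because $\alpha_{1s}$ depends on $V^{(1)}$, which solves a quadratic equation given in the Internet Appendix and is not reproduced here. The qualitative patterns do not depend on this choice as long as `alpha_1s` is positive.

```python
def var_dh(h, rho, sig2):
    """Var of the h-period change in an AR(1) component with shock variance sig2."""
    return 2.0 * (1.0 - rho**h) / (1.0 - rho**2) * sig2


def cov_dh(h, rho_x, rho_s, sig2_x, C):
    """Cov of h-period changes in short-rate component X and in bond supply."""
    return C * (2.0 - rho_s**h - rho_x**h) / (1.0 - rho_s * rho_x) * sig2_x


def beta_h(h, rho_P, rho_T, rho_s, C, sig2_P=0.15, sig2_T=0.15,
           theta=119 / 120, alpha_1s=0.05):
    """Equation (3.19); alpha_1s = 0.05 is a hypothetical placeholder value."""
    a_P = (1.0 - theta) / (1.0 - rho_P * theta)
    a_T = (1.0 - theta) / (1.0 - rho_T * theta)
    v_P, v_T = var_dh(h, rho_P, sig2_P), var_dh(h, rho_T, sig2_T)
    c_P = cov_dh(h, rho_P, rho_s, sig2_P, C)
    c_T = cov_dh(h, rho_T, rho_s, sig2_T, C)
    return (a_P * v_P + a_T * v_T + alpha_1s * (c_P + c_T)) / (v_P + v_T)


# One illustrative parameterization per case discussed in the text.
cases = {
    "C = 0, rho_T = rho_P": dict(rho_P=0.96, rho_T=0.96, rho_s=0.80, C=0.0),
    "C = 0, rho_T < rho_P": dict(rho_P=0.995, rho_T=0.96, rho_s=0.80, C=0.0),
    "C > 0, rho_s = rho_T = rho_P": dict(rho_P=0.96, rho_T=0.96, rho_s=0.96, C=0.55),
    "C > 0, rho_s < rho_T = rho_P": dict(rho_P=0.96, rho_T=0.96, rho_s=0.80, C=0.55),
}
for label, params in cases.items():
    print(label, [round(beta_h(h, **params), 4) for h in (1, 6, 12)])
```

Printing βh at h = 1, 6, and 12 months for each case shows a flat, rising, flat, and falling profile, respectively.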

Thus, assuming ρs < ρT < ρP , if C was zero in the pre-2000 sample but positive in the post-2000 sample, then we would expect βh to be a mildly increasing function of h in the pre-2000 period and a decreasing function of h in the post-2000 period.

General case with slow-moving capital: We next work out the model-implied regression coefficients βh in the general case with slow-moving capital. Since the state vector $x_t$ follows a VAR(1), if we let $V = Var[x_t]$, we have $vec(V) = (I - \Gamma \otimes \Gamma)^{-1} vec(\Sigma)$. We also have $Cov[x_{t+j}, x_t'] = \Gamma^j V$ and $Cov[x_t, x_{t+j}'] = V(\Gamma')^j$, so $Var[x_{t+h} - x_t] = 2V - \Gamma^h V - V(\Gamma')^h$. Thus, letting e denote the (k + 2) × 1 vector with ones in the first and second positions and zeros elsewhere, we have:

$$\beta_h = \frac{Cov[\alpha_1'(x_{t+h}-x_t),\; (x_{t+h}-x_t)'e]}{Var[(x_{t+h}-x_t)'e]} = \frac{\alpha_1'\left(2V - \Gamma^h V - V(\Gamma')^h\right)e}{e'\left(2V - \Gamma^h V - V(\Gamma')^h\right)e}. \qquad (3.20)$$

As above, the assumption that C > 0 implies excess sensitivity relative to the expectations hypothesis. Once we add slow-moving capital, two features of our model can help match the finding that βh is a declining function of h. First, if ρs < ρT = ρP , so supply shocks are less persistent than short-rate shocks, then, as shown above, βh will be a declining function of h even in the absence of slow-moving capital. For instance, if supply shocks are driven by the mortgage convexity channel, then the resulting supply shocks may be quite transient since Hanson (2014) shows that even persistent declines in rates only induce short-lived mortgage refinancing waves. Second, when there is slow-moving capital (i.e., when k > 1 and q < 1), the short-run demand curve for long-term bonds is steeper than the long-run demand curve. As a result, so long as C > 0, βh will decline with h even if ρs = ρT = ρP . For instance, the simplest versions of the “reaching-for-yield” channel (Hanson and Stein, 2015) would suggest that ρs = ρT = ρP , implying that $s_t - \bar{s} = C \times (i_t - \bar{\imath})$ by equation (3.11). Thus, in order for this channel to explain our results, we would need to assume that these induced supply shocks walk down a short-run demand curve that is far steeper than the long-run demand curve (Duffie, 2010; Greenwood et al., forthcoming).
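The matrix expression in equation (3.20) is straightforward to evaluate once $\alpha_1$, $\Gamma$, and $\Sigma$ are known. The NumPy sketch below illustrates the computation in the three-state case without slow-moving capital, where these objects are available in closed form. It is not the paper's solution code. Two inputs are assumptions made purely for illustration: the value of `alpha_1s` is a hypothetical placeholder, and the shock loadings in `B` assume that supply innovations load on both short-rate shocks with coefficient C, which is what the covariance formula stated with equation (3.19) implies.

```python
import numpy as np


def unconditional_variance(Gamma, Sigma):
    """Solve V = Gamma V Gamma' + Sigma using vec(V) = (I - Gamma ⊗ Gamma)^{-1} vec(Sigma)."""
    n = Gamma.shape[0]
    lhs = np.eye(n * n) - np.kron(Gamma, Gamma)
    return np.linalg.solve(lhs, Sigma.reshape(-1)).reshape(n, n)


def beta_h_var(alpha_1, Gamma, Sigma, e, h):
    """Model-implied regression coefficient from equation (3.20)."""
    V = unconditional_variance(Gamma, Sigma)
    Gh = np.linalg.matrix_power(Gamma, h)
    M = 2.0 * V - Gh @ V - V @ Gh.T  # Var[x_{t+h} - x_t]
    return float(alpha_1 @ M @ e) / float(e @ M @ e)


# Illustrative three-state case: x_t = [i_P - ibar, i_T, s - sbar].
rho_P, rho_T, rho_s = 0.995, 0.96, 0.80
sig2_P, sig2_T, C = 0.012, 0.15, 0.55
theta = 119 / 120
alpha_1s = 0.05  # hypothetical placeholder; depends on V^(1) in the model

Gamma = np.diag([rho_P, rho_T, rho_s])
B = np.array([[1.0, 0.0], [0.0, 1.0], [C, C]])  # assumed shock loadings
Sigma = B @ np.diag([sig2_P, sig2_T]) @ B.T
alpha_1 = np.array([(1 - theta) / (1 - rho_P * theta),
                    (1 - theta) / (1 - rho_T * theta),
                    alpha_1s])
e = np.array([1.0, 1.0, 0.0])

betas = {h: beta_h_var(alpha_1, Gamma, Sigma, e, h) for h in (1, 6, 12)}
print(betas)
```

In this special case the output coincides with the closed-form expression in equation (3.19). In the general case with slow-moving capital, the same two functions apply once $\Gamma$ and $\alpha_1$ have been obtained from the numerical solution of the model.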

In summary, in order for βh to be a steeply declining function of h as in the post-2000 data and to match our associated return forecasting results, we need (i) C > 0 and either (ii.a) ρs < ρT or (ii.b) slow-moving investors. In practice, we believe both ρs < ρT and slow-moving capital likely play some role in explaining why βh is a declining function of h in the recent data. Furthermore, these two mechanisms reinforce one another: it is easiest to match the steep decline in βh as a function of h using calibrations, such as our illustrative calibration below, that feature both.

3.4 Model calibration

We consider an illustrative calibration of the model in which each time period is a month. We assume the following parameters were the same in both the pre-2000 and post-2000 periods:

• Persistence: ρP = 0.995, ρT = 0.96, and ρs = 0.80. This implies that shocks to the persistent component of short rates have a half-life of 11.5 years, shocks to the transient component of short rates have a half-life of 1.4 years, and shocks to the net supply of long-term bonds have a half-life of 3 months. The short half-life of bond supply shocks is consistent with the mortgage convexity channel discussed above.

• Slow-moving capital: q = 30% and k = 12. Thus, 1 − q = 70% of the investors are slow-moving and only rebalance their bond portfolios every 12 months. These assumptions capture the idea that many large institutional investors only rebalance their portfolios annually.

• Volatility of the transient component of short rates: $\sigma_T^2 = 0.15\%$.

• No independent supply shocks: $\sigma_s^2 = 0$. We make this assumption for simplicity only. Thus, the supply shocks induced by shocks to short rates are the only reason term premia vary.

• Other parameters: τ = 0.5 and θ = 119/120, so the duration of the perpetuity is D = 1/ (1 − θ) = 120 months (i.e., 10 years).

Supported by the previous discussion, we assume that two model parameters, C and σP , changed between the pre-2000 and the post-2000 periods. For the pre-2000 period, we assume that:

• Large persistent component of short rates: $\sigma_P^2 = 0.15\%$. Thus, the implied standard deviation of the short rate is 4.12%. This compares with a pre-2000 volatility of 1-year yields of 2.63% in the data.

• No supply shocks induced by short rate shocks: C = 0. Thus, we assume that there is no excess sensitivity in the early period.

By contrast, for the post-2000 period, we assume that:

• Small persistent component of short rates: $\sigma_P^2 = 0.012\%$. Thus, the implied standard deviation of the short rate is 1.77%, which is similar to the post-2000 volatility of 1-year yields of 1.85% in the data.

• Supply shocks induced by short rate shocks: C = 0.55 > 0.
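The half-lives and implied short-rate volatilities quoted in the calibration follow directly from the AR(1) structure of the two short-rate components. The short Python check below is a sketch under two assumptions that are not spelled out in the text: the shock variances are read as monthly variances measured in squared percentage points, and the persistent and transient components are treated as independent. The function names are hypothetical.

```python
import math


def half_life_months(rho):
    """Number of months for an AR(1) shock with persistence rho to decay by half."""
    return math.log(0.5) / math.log(rho)


def short_rate_std(sig2_P, sig2_T, rho_P=0.995, rho_T=0.96):
    """Unconditional std. dev. of i_t = i_P,t + i_T,t with independent AR(1) components."""
    return math.sqrt(sig2_P / (1.0 - rho_P**2) + sig2_T / (1.0 - rho_T**2))


hl_P_years = half_life_months(0.995) / 12.0   # persistent component
hl_T_years = half_life_months(0.96) / 12.0    # transient component
hl_s_months = half_life_months(0.80)          # net supply shocks

std_pre = short_rate_std(sig2_P=0.15, sig2_T=0.15)     # pre-2000 calibration
std_post = short_rate_std(sig2_P=0.012, sig2_T=0.15)   # post-2000 calibration

print(round(hl_P_years, 1), round(hl_T_years, 1), round(hl_s_months, 1))
print(round(std_pre, 2), round(std_post, 2))
```

Under these assumptions the check reproduces the half-lives of roughly 11.5 years, 1.4 years, and 3 months, as well as the implied standard deviations of 4.12% and 1.77% reported above.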

3.4.1 Model-implied regression coefficients

Figure 7 plots the model-implied coefficients βh in equation (3.17) against the horizon (h) in months for the pre- and post-2000 calibrations. In the pre-2000 calibration where σP is large and C = 0, the level of βh is high for all h. Furthermore, βh rises gradually with h in the pre-2000 calibration because ρT < ρP , implying that the more persistent component of short rates dominates changes in short rates at longer horizons. By contrast, in the post-2000 calibration where σP is smaller and C is large, βh declines steeply with h. And, since σP is lower, βh eventually reaches a lower level for large h than in the pre-2000 calibration. As emphasized above, βh declines steeply with h in the post-2000 calibration because short-rate shocks give rise to transient shocks to the supply of long-term bonds (C > 0 and ρs < ρT ) that encounter a short-run demand curve that is far steeper than the long-run demand curve due to slow-moving capital (q < 1 and k > 1).

Figure 8 shows the model-implied impulse response functions in the post-2000 calibration following a +100 bp shock to short rates that lands in month t = 13. (We assume there is a +50 bp shock to both the persistent and transient components of the short rate.) The long-term yield is the sum of an expectations-hypothesis component and a term-premium component: $y_t = eh_t + tp_t$, where $eh_t = \bar{\imath} + [(1-\theta)/(1-\rho_P\theta)]\,(i_{P,t}-\bar{\imath}) + [(1-\theta)/(1-\rho_T\theta)]\,i_{T,t}$. Thus, the term spread is $ts_t = y_t - i_t = tp_t + (eh_t - i_t)$. We show the impulse responses for short-term rates ($i_t$), long-term yields ($y_t$), the term spread ($y_t - i_t$), and the term premium ($tp_t$) in Figure 8.

The initial shock to short rates leads to a rise in term premia. Thus, relative to the expectations hypothesis, long-term rates are excessively sensitive to short rates. Nonetheless, the rise in short rates causes the yield curve to flatten on impact. This is because $(eh_t - i_t)$ falls and this flattening due to the expectations hypothesis outweighs the steepening due to the rise in term premia. However, the rise in term premia wears off quickly, explaining our key finding that βh declines sharply with horizon h. Indeed, the initial rise in short rates predicts additional flattening of the yield curve over the following months. And, consistent with the decomposition of βh in equation (2.1), raising C or increasing the degree of slow-moving capital raises $Corr(\Delta i_t, \Delta ts_t)$ in the model, giving rise to greater high-frequency excess sensitivity. These same forces lower $Corr(\Delta i_t, \Delta ts_{t+j})$ for j > 0, explaining low-frequency decoupling.
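The expectations-hypothesis part of these impulse responses can be traced out directly from the calibrated parameters. The sketch below computes only the path of the short rate and of the component $eh_t$, both in basis-point deviations from steady state. It does not compute the term-premium response, which requires the full numerical solution of the model. The function name is hypothetical.

```python
def eh_impulse_response(shock_P, shock_T, horizon,
                        rho_P=0.995, rho_T=0.96, theta=119 / 120):
    """Path of (months since shock, short rate, eh component, eh - short rate).

    shock_P, shock_T : initial shocks to the persistent and transient
                       components of the short rate (in basis points)
    """
    a_P = (1.0 - theta) / (1.0 - rho_P * theta)  # loading on i_P
    a_T = (1.0 - theta) / (1.0 - rho_T * theta)  # loading on i_T
    path = []
    for j in range(horizon + 1):
        i_P = shock_P * rho_P**j   # persistent component decays slowly
        i_T = shock_T * rho_T**j   # transient component decays faster
        short = i_P + i_T
        eh = a_P * i_P + a_T * i_T
        path.append((j, short, eh, eh - short))
    return path


# +50 bp shock to each component, as in the experiment described above.
path = eh_impulse_response(shock_P=50.0, shock_T=50.0, horizon=12)
for j, short, eh, spread in path[::6]:
    print(j, round(short, 1), round(eh, 1), round(spread, 1))
```

On impact, the expectations-hypothesis component rises by roughly 40 bp against a 100 bp rise in the short rate, so $(eh_t - i_t)$ falls by about 60 bp. This is the flattening force that the rise in term premia only partially offsets.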

3.4.2 Matching related findings

In addition to matching the βh coefficients in the pre-2000 and post-2000 periods, our model is also capable of matching the related facts documented above. First, the model is consistent with our return forecasting evidence: in the post-2000 calibration where C > 0, bond risk premia Et [rxt+1] will be elevated when short-term rates have recently risen. To see this, note that risk premia are

$$E_t[rx_{t+1}] = \tau^{-1}V^{(1)} \times b_t = (\tau q)^{-1}V^{(1)} \times \Big(s_t - (1-q)k^{-1}\sum_{j=0}^{k-1} d_{t-j}\Big). \qquad (3.21)$$

The idea is that, when C > 0, fast-moving investors will be bearing greater interest-rate risk when short rates have recently risen (i.e., $b_t = q^{-1}(s_t - (1-q)k^{-1}\sum_{j=0}^{k-1} d_{t-j})$ will be higher), and they require compensation for bearing this extra risk. Again, when C > 0, there are two reasons why increases in short rates lead to increases in $b_t$ and $E_t[rx_{t+1}]$. First, even if there are no slow-moving investors (i.e., if q = 1, so $b_t = s_t$), when supply shocks are less persistent than short rates (i.e., ρs < ρT ≤ ρP ), equation (3.11) shows that supply $s_t$ is likely to be high when short rates have recently risen. Second, even if supply shocks are as persistent as short-rate shocks (i.e., ρs = ρT = ρP ), when there is slow-moving capital, $b_t$ will be high when short rates have recently risen since some slow-moving investors will not have rebalanced their portfolios in response to the related supply shock.

Second, let $L_t = i_t$ and $S_t = y_t - i_t$ denote the model-implied level and slope factors. If we estimate equation (2.2b) in data simulated from the model, we find that past increases in the level of rates predict a flattening of the yield curve in the post-2000 calibration but not in the pre-2000 calibration. When C > 0 as in the post-2000 calibration, past increases in the level of rates are associated with a higher current risk premium on long-term bonds. Since the risk premium is $E_t[rx_{t+1}] = S_t - \theta(1-\theta)^{-1}(E_t[\Delta S_{t+1}] + E_t[\Delta L_{t+1}])$, all else equal, $E_t[\Delta S_{t+1}]$ is lower when $(L_t - L_{t-h})$ is higher. Thus, the model is capable of generating the non-Markovian dynamics emphasized in Section 2.1.

Finally, we are more likely to have $sign(y_{t+h} - y_t) \neq sign(i_{t+h} - i_t)$ for h = 6 or h = 12 months in data simulated from the post-2000 calibration than the pre-2000 calibration. In other words, the calibrated model is consistent with the increasing prevalence of interest rate “conundrums.”
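The frequency of such “conundrum” episodes can be measured in simulated or actual data with a simple counting rule. The Python sketch below is one possible implementation; the function name, the treatment of zero changes (which are skipped because their sign is undefined), and the toy series are assumptions made for illustration.

```python
def conundrum_share(y, i, h):
    """Share of h-period windows in which long and short rates move in opposite directions."""
    opposite = 0
    total = 0
    for t in range(len(y) - h):
        dy = y[t + h] - y[t]
        di = i[t + h] - i[t]
        if dy == 0 or di == 0:
            continue  # sign undefined; skip this window
        total += 1
        if (dy > 0) != (di > 0):
            opposite += 1
    return opposite / total if total else float("nan")


# Toy series (hypothetical): short rates rise steadily while long rates wobble.
short_rate = [1.0, 1.5, 2.0, 2.5, 3.0]
long_rate = [4.0, 4.2, 4.1, 3.9, 4.3]
share = conundrum_share(long_rate, short_rate, h=2)
print(share)
```

In the toy series, one of the three 2-period windows has long and short rates moving in opposite directions, so the share is one third.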

4 Implications

4.1 High-frequency identification

Our findings have clear implications for identification approaches based on the high-frequency response of long-term rates to macroeconomic news announcements. The short-run change in long-term yields around news announcements is increasingly used as an unconfounded measure of the longer-run impact of news shocks (see, e.g., Gertler and Karadi (2015) and Nakamura and Steinsson (forthcoming)). However, if, as we argue, some of the impact of a news shock on long-term rates reflects transient shifts in term premia that quickly revert, then a shock’s short- and long-run impact on long-term rates will be quite different. As a result, identification based on the high-frequency responses of long-term rates is likely to provide biased estimates of the longer-run impact of announcements. In this way, our results suggest that economists face an important bias-variance trade-off: high-frequency identification allows for precise estimates of the short-run impact of news on long-term yields, but these are likely to be biased estimates of the longer-run impact that is often of greatest interest.

Still, it is conceivable that changes in 1-year yields that are associated with macroeconomic news announcements are different, and do not cause transient changes in term premia, as recently argued by Nakamura and Steinsson (forthcoming) and Hördahl et al. (2015). To get some direct evidence on this question, we form an “economic news index” for month t, $NewsIndex_t$, by cumulating daily changes in 1-year yields within month t on days with important macroeconomic announcements. The announcements we consider are: (1) FOMC decisions, (2) CPI, (3) PPI, (4) durable goods orders, (5) new and existing home sales, (6) housing starts, (7) the employment report, and (8) retail sales. We then estimate the following predictive regression:

$$S_{t+h} - S_t = \alpha + \beta \times NewsIndex_t + \delta_1 L_t + \delta_2 S_t + \gamma (L_t - L_{t-1}) + \epsilon_{t+h}, \qquad (4.1)$$

where $L_t$ and $S_t$ denote the level and slope of the yield curve at the end of month t.
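The construction of the news index can be sketched in a few lines of Python. The data layout (dictionaries keyed by ISO date strings), the function name, and all dates and yield changes below are hypothetical and serve only to illustrate the cumulation step; they are not actual announcement dates or data.

```python
from collections import defaultdict


def news_index(daily_changes, announcement_days):
    """Cumulate daily 1-year yield changes on announcement days within each month.

    daily_changes     : dict mapping 'YYYY-MM-DD' to the daily change in the 1-year yield
    announcement_days : set of 'YYYY-MM-DD' strings with important macro announcements
    Returns a dict mapping 'YYYY-MM' to that month's news index.
    """
    index = defaultdict(float)
    for day, change in daily_changes.items():
        month = day[:7]
        # Non-announcement days contribute zero but still register the month.
        index[month] += change if day in announcement_days else 0.0
    return dict(index)


# Hypothetical daily yield changes (percentage points) and announcement dates.
daily_changes = {
    "2005-03-01": 0.05, "2005-03-02": -0.02, "2005-03-15": 0.03,
    "2005-04-04": -0.04, "2005-04-05": 0.01,
}
announcement_days = {"2005-03-01", "2005-03-15", "2005-04-05"}
monthly_index = news_index(daily_changes, announcement_days)
print(monthly_index)
```

The resulting monthly series would then enter equation (4.1) as the regressor $NewsIndex_t$, alongside the level and slope controls.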

Table 6 shows the results for both pre- and post-2000 samples and for h = 3-, 6-, and 12-month future changes in the slope of the yield curve. In Panel A, the news index is the only independent variable. In the pre-2000 sample, there is no relationship between $NewsIndex_t$ and subsequent changes in slope. In the post-2000 sample, a positive news index predicts a subsequent flattening of the curve and the effect is highly statistically significant (p-val < 0.001). Panel B adds the level and slope of the yield curve as controls, and gives similar results for the relationship between $NewsIndex_t$ and future changes in slope. In Panel C, we also include the change in the level of short-term rates in month t, $L_t - L_{t-1}$, as an independent variable. The goal is to see if shifts in short-term rates on announcement and non-announcement days have different implications for the expected future change in slope. In Panel C, we find that the estimated coefficient on $NewsIndex_t$ is negative, but generally not quite statistically significant, indicating that shifts in short-term rates on announcement and non-announcement days have similar effects on subsequent changes in slope. If anything, changes in short-term rates on news announcement days are more likely to be followed by subsequent yield curve flattening than changes on non-announcement days. Meanwhile, Panel C shows that the effect of changes in the level of rates on future changes in slope is not significant in the pre-2000 sample, but is significantly negative in the post-2000 sample.

In summary, we conclude that since 2000, the high-frequency response of long-term rates to economic news appears to dissipate at lower frequencies, posing challenges to interpreting high-frequency yield curve responses in more recent data.

4.2 Monetary policy transmission

Our results also have important implications for the transmission of monetary policy. Central banks conduct conventional monetary policy by adjusting short-term nominal rates. According to the standard New Keynesian view (Gali, 2008), changes in short-term nominal rates affect short-term real rates because of nominal rigidities.
And, the resulting shifts in short-term real rates affect long-term real rates via the expectations hypothesis, which in turn influence household spending and firm investment. Stein (2013) points out that the excess sensitivity of long-term yields (whereby shocks to short rates move term premia on long-term bonds in the same direction) should strengthen the effects of monetary policy relative to the canonical view. Stein (2013) refers to this as the “recruitment” channel of monetary transmission.

In our theoretical framework, the strength of this recruitment channel at medium-run or business-cycle frequencies (e.g., over a 1- to 3-year horizon) depends on (i) the relative strength of the relevant supply-and-demand-based amplification mechanisms (i.e., the size of C relative to investor risk tolerance τ ) and (ii) the persistence of the associated supply and demand shocks. Specifically, when ρs is well below ρT as under the mortgage-convexity interpretation of C, the associated shifts in term premia would be quite transient and would likely have only modest effects on investment and spending. By contrast, when ρs ≈ ρT as under the reaching-for-yield interpretation of C, the shifts in term premia would be more persistent and likely to have larger effects on aggregate demand.

Our empirical results do not allow us to speak directly to the strength of this non-standard channel at business cycle frequencies (i.e., to assess the extent to which monetary policy influences term and other risk premia over the course of a monetary policy cycle). Instead, what we can confidently say is that some of the influence of short-term rates on term premia is quite transitory.¹⁹ However, regardless of the medium-run strength of the recruitment channel, our model shows that, when capital is slow-moving, the short-run effect of shifts in short rates on term premia will exceed the medium-run effect. Thus, our findings suggest that the recruitment channel is smaller than one would conclude based on a simplistic extrapolation of the high-frequency response of term premia to policy shocks documented by Hanson and Stein (2015), Gertler and Karadi (2015), and Gilchrist et al. (2015).

More generally, our findings suggest that, in the presence of slow-moving capital, central banks should care about the way that monetary policy impacts financial conditions at business-cycle frequencies, but should focus less on the immediate market response to their announcements since much of the latter may be quite transitory. In this way, our findings lend support to the arguments in Stein and Sunderam (forthcoming) who argue that the Federal Reserve has become too focused on high-frequency movements in asset prices.²⁰

¹⁹ That said, the estimated persistence of the transitory shifts in term premia that we find is similar to the estimated persistence of shocks to the federal funds rate in the New Keynesian literature (Bernanke and Gertler (1995), Christiano, Eichenbaum, and Evans (1999)) which typically vanish within 6 to 9 months. And, these transient shocks to short-term nominal rates are generally estimated to have meaningful effects on real activity.

²⁰ These ideas are also related to the argument that the large declines in bond yields on the days of large-scale asset purchase announcements by central banks may have been somewhat transitory (Wright, 2012; Greenlaw et al., 2018).

4.3 Affine term-structure models

In this last subsection, we explore the implications of our results for affine term-structure models, which are widely used, reduced-form tools for understanding the term structure of bond yields (Duffee, 2002; Duffie and Kan, 1996). In these models, the n-year zero coupon yield, $y_t^{(n)}$, takes the affine form: $y_t^{(n)} = \alpha_0^{(n)} + \alpha_1^{(n)\prime} x_t$, where $x_t$ is a vector of state variables and the $\alpha_0^{(n)}$ and $\alpha_1^{(n)}$ satisfy a set of recursive equations. In the Internet Appendix, we apply the estimation methodology of Adrian et al. (2013) and fit affine term-structure models using the first K principal components of 1- to 10-year yields as the state variables $x_t$. We show that standard affine models (models that are Markovian with respect to the filtration given by these current yield-curve factors) cannot fit our key finding that the sensitivity of long rates to short rates (βh) declines so strongly with horizon (h) in the post-2000 data. Furthermore, we show that this remains so even if we estimate models that include many (e.g., K = 5) current yield-curve factors as state variables.

However, we show that our key finding is consistent with non-Markovian term-structure models in which past lags of the yield-curve factors are treated as “unspanned state variables.” In standard affine models, if the true model is known, one can recover the full set of state variables $x_t$ by inverting an appropriate set of yields (i.e., the state variables are “spanned” by current yields). An unspanned state variable is a variable that is useful for forecasting future bond yields and returns but that has no impact on the current yield curve. This non-Markovian model allows us to parsimoniously capture our result that past changes in the level of rates are useful for forecasting future bond yields and returns. And, similar models have been considered in Joslin et al. (2013).²¹

Finally, we use a bootstrap procedure to test the hypothesis that each affine model is correctly specified, using the ratio of yearly to monthly coefficients β12/β1 from equation (1.1) as the test statistic. The test rejects if the observed value of β12/β1 is too high or low to have been generated by that model. This test is in the spirit of Giglio and Kelly (2018), who test the hypothesis that an affine model is correctly specified by checking whether the comovement of yields at different points on the curve is consistent with the estimated model. Using this bootstrap procedure, we conclude that, in the post-2000 sample, the Markovian models are decisively rejected: if these standard models were correctly specified it would be highly unlikely to observe a value of β12/β1 as small as we do in the data. However, the non-Markovian models are not rejected in the post-2000 sample.

To summarize, our conclusion is that affine term-structure models need to include lagged yield-curve factors to match the fact that the sensitivity of long rates declines so sharply with horizon in the post-2000 data. These findings are consistent with the lead-lag relationships between changes in level and slope and the attendant return predictability documented in Section 2.

²¹ To be clear, we would not argue that the past increase in the level of rates is literally unspanned. Instead, we think this variable is close to being unspanned. Specifically, like any factor that has a short-lived impact on bond risk premium, past increases in the level of rates should have only a small effect on the current yield curve. Thus, in practice, it may be quite difficult to recover information about this variable from current yields, e.g., because yields are measured with a tiny amount of error or because the true data-generating model evolves over time. As a result, conditioning on lagged yields will add information beyond that readily revealed by current yields. Indeed, the model in Section 3 does not feature unspanned variables. Specifically, we can recast the model, which only has one class of perpetual long-term bonds, to have a set of zero-coupon bonds with different maturities. Using the resulting affine model, if one knew the true process generating yields, one could recover the full (k + 2)-dimensional state vector $x_t$ from any set of (k + 2) yields. However, many of these state variables would be close to unspanned (they would have only minimal effects on yields) and, in practice, it would be difficult to extract them from yields.

5 Conclusion

The strong sensitivity of long-term interest rates to changes in short rates is a long-standing puzzle. In this paper, we have shown that since 2000 this sensitivity has become even stronger at high frequencies. By contrast, this sensitivity has fallen significantly when looking at low-frequency changes. As a result, low-frequency decoupling between long and short rates (the phenomenon that former Federal Reserve Chairman Greenspan called a conundrum) has become increasingly common. From an expectations hypothesis perspective, the puzzle is not the weak relationship between long- and short-term rates observed recently at low frequencies. Instead, the puzzle is why this relationship was previously so strong at all frequencies and why it has become stronger at high frequencies.

We have proposed a simple model that can explain these puzzling facts. Before 2000 we assume there was a sizeable persistent component of short rates due to uncertainty about trend inflation, explaining the strong sensitivity of long rates to movements in short-term rates at all horizons. Since 2000, this persistent component has become much smaller, leading to a decline in sensitivity at low frequencies.
In our model, the rising excess sensitivity of long rates observed at high frequencies since 2000 is explained by the combination of (i) a growing set of investors who tend to reach for yield when short rates decline and (ii) a gradual arbitrage response to these demand shifts.

Our findings have important implications for the transmission of monetary policy and for event studies. The excess sensitivity of long-term yields reinforces the effects of monetary policy (Stein, 2013), but in recent years this channel of policy has been far more short-lived than one might conclude based on a simplistic reading of high-frequency evidence. More broadly, part of the high-frequency response of long rates to shocks to short rates represents term premium movements that tend to wear off quickly. Consequently, it is important to remember that event-study approaches only measure high-frequency responses to macroeconomic news and that the impact may be more muted at the lower frequencies that are often of greatest interest to macroeconomists and policymakers.

References

Abrahams, M., T. Adrian, R. K. Crump, E. Moench, and R. Yu (2016): “Decomposing real and nominal yield curves,” Journal of Monetary Economics, 84, 182–200.
Adrian, T., R. K. Crump, and E. Moench (2013): “Pricing the term structure with linear regressions,” Journal of Financial Economics, 110, 110–138.
Albagli, E. (2015): “Investment horizons and asset prices under asymmetric information,” Journal of Economic Theory, 158, 787–837.
Andrews, D. W. (1993): “Tests for parameter instability and structural change with unknown change point,” Econometrica, 61, 821–856.
Backus, D. and J. H. Wright (2007): “Cracking the conundrum,” Brookings Papers on Economic Activity, 1, 293–316.
Beechey, M. J. and J. H. Wright (2009): “The high-frequency impact of news on long-term yields and forward rates: Is it real?” Journal of Monetary Economics, 56, 535–544.
Bernanke, B. S. (2007): “Inflation expectations and inflation forecasting,” Speech at the Monetary Economics Workshop of the National Bureau of Economic Research Summer Institute, Cambridge, Massachusetts.
Bordalo, P., N. Gennaioli, and A. Shleifer (2013): “Salience and consumer choice,” Journal of Political Economy, 121, 803–843.
Bordalo, P., N. Gennaioli, and A. Shleifer (2017): “Diagnostic expectations and credit cycles,” Journal of Finance, 73, 199–227.
Brooks, J., M. Katz, and H. Lustig (2017): “Post-FOMC announcement drift in U.S. bond markets.”
Campbell, J. R., C. L. Evans, J. D. Fisher, and A. Justiniano (2012): “Macroeconomic effects of Federal Reserve forward guidance,” Brookings Papers on Economic Activity, 2012, 1–80.
Campbell, J. Y., A. W. Lo, and A. C. MacKinlay (1996): The Econometrics of Financial Markets, Princeton, New Jersey: Princeton University Press.
Campbell, J. Y. and R. J. Shiller (1988): “Stock prices, earnings, and expected dividends,” Journal of Finance, 43, 661–676.
Cho, C.-K. and T. J. Vogelsang (2017): “Fixed-b inference for testing structural change in a time series regression,” Econometrics, 5, 2.
Chow, G. C. (1960): “Tests of equality between sets of coefficients in two linear regressions,” Econometrica, 28, 591–605.
Cieslak, A. (forthcoming): “Short-rate expectations and unexpected returns in Treasury bonds,” Review of Financial Studies.
Cochrane, J. H. and M. Piazzesi (2002): “The Fed and interest rates - a high-frequency identification,” American Economic Review: Papers and Proceedings, 92, 90–95.
Cochrane, J. H. and M. Piazzesi (2005): “Bond risk premia,” American Economic Review, 95, 138–160.
Di Maggio, M. and M. T. Kacperczyk (2017): “The unintended consequences of the zero lower bound policy,” Journal of Financial Economics, 123, 59–80.
Domanski, D., H. S. Shin, and V. Sushko (2017): “The hunt for duration: Not waving but drowning?” IMF Economic Review, 65, 113–153.
Drechsler, I., A. Savov, and P. Schnabl (2014): “A model of monetary policy and risk premia,” Tech. rep., National Bureau of Economic Research.
Duffee, G. (2002): “Term premia and interest rate forecasts in affine models,” Journal of Finance, 57, 405–443.
Duffee, G. R. (2013): “Forecasting interest rates,” in Handbook of Economic Forecasting, Volume 2, ed. by G. Elliott and A. Timmermann, Elsevier.
Duffie, D. (2010): “Asset price dynamics with slow-moving capital,” Journal of Finance, 65, 1238–1268.
Duffie, D. and R. Kan (1996): “Yield factor models of interest rates,” Mathematical Finance, 64, 379–406.
Evans, C. L. and D. A. Marshall (1998): “Monetary policy and the term structure of nominal interest rates: Evidence and theory,” Carnegie-Rochester Conference Series on Public Policy, 49, 53–111.
Gagnon, J., M. Raskin, J. Remache, and B. Sack (2011): “Large-scale asset purchases by the Federal Reserve: Did they work?” Federal Reserve Bank of New York Economic Policy Review, May, 41–59.
Gali, J. (2008): Monetary Policy, Inflation, and the Business Cycle, Princeton, New Jersey: Princeton University Press.
Gertler, M. and P. Karadi (2015): “Monetary policy surprises, credit costs, and economic activity,” American Economic Journal: Macroeconomics, 7, 44–76.
Giglio, S. and B. Kelly (2018): “Excess volatility: Beyond discount rates,” Quarterly Journal of Economics, 133, 71–127.
Gilchrist, S., D. López-Salido, and E. Zakrajšek (2015): “Monetary policy and real borrowing costs at the zero lower bound,” American Economic Journal: Macroeconomics, 7, 77–109.
Greenlaw, D., J. D. Hamilton, E. S. Harris, and K. D. West (2018): “A skeptical view of the impact of the Fed’s balance sheet,” Monetary Policy Forum.
Greenwood, R., S. G. Hanson, and G. Y. Liao (forthcoming): “Asset price dynamics in partially segmented markets,” Review of Financial Studies.
Greenwood, R. and D. Vayanos (2014): “Bond supply and excess bond returns,” Review of Financial Studies, 27, 663–713.
Gürkaynak, R. S., B. Sack, and E. T. Swanson (2005): “The sensitivity of long-term interest rates to economic news: Evidence and implications for macroeconomic models,” American Economic Review, 95, 425–436.
Gürkaynak, R. S., B. Sack, and J. H. Wright (2007): “The U.S. Treasury yield curve: 1961 to the present,” Journal of Monetary Economics, 54, 2291–2304.
Gürkaynak, R. S., B. Sack, and J. H. Wright (2010): “The TIPS yield curve and inflation compensation,” American Economic Journal: Macroeconomics, 2, 70–92.
Gürkaynak, R. S., B. P. Sack, and E. T. Swanson (2005): “Do actions speak louder than words? The response of asset prices to monetary policy actions and statements,” International Journal of Central Banking, 1, 55–93.
Hamilton, J. and C. Wu (2012): “The effectiveness of alternative monetary policy tools in a zero lower bound environment,” Journal of Money, Credit, and Banking, 44, 3–46.
Hanson, S. G. (2014): “Mortgage convexity,” Journal of Financial Economics, 113, 270–299.
Hanson, S. G. and J. C. Stein (2015): “Monetary policy and long-term real rates,” Journal of Financial Economics, 115, 429–448.
Hördahl, P., E. M. Remolona, and G. Valente (2015): “Expectations and risk premia at 8:30AM: Macroeconomic announcements and the yield curve,” BIS working paper 527.
Joslin, S., A. Le, and K. J. Singleton (2013): “Gaussian macro-finance term structure models with lags,” Journal of Financial Econometrics, 11, 589–609.
Joslin, S., M. Priebsch, and K. J. Singleton (2014): “Risk premiums in dynamic term structure models with unspanned macro risks,” Journal of Finance, 69, 1197–1233.
Kahneman, D. and A. Tversky (1972): “Subjective probability: A judgment of representativeness,” Cognitive Psychology, 3, 430–454.
Kahneman, D. and A. Tversky (1979): “Prospect Theory: An analysis of decision under risk,” Econometrica, 47, 263–292.
Kiefer, N. M. and T. J. Vogelsang (2005): “A new asymptotic theory for heteroskedasticity-autocorrelation robust tests,” Econometric Theory, 21, 1130–1164.
Krishnamurthy, A. and A. Vissing-Jorgensen (2011): “The effects of quantitative easing on interest rates: Channels and implications for policy,” Brookings Papers on Economic Activity, Fall, 215–265.
Krishnamurthy, A. and A. Vissing-Jorgensen (2012): “The aggregate demand for Treasury debt,” Journal of Political Economy, 120, 233–267.
Kuttner, K. N. (2001): “Monetary policy surprises and interest rates: evidence from the Fed Funds futures market,” Journal of Monetary Economics, 47, 523–544.
Lian, C., Y. Ma, and C. Wang (2017): “Low interest rates and risk taking: Evidence from individual investment decisions.”
Litterman, R. and J. Scheinkman (1991): “Common factors affecting bond returns,” Journal of Fixed Income, 1, 54–61.
Lucca, D. O. and F. Trebbi (2009): “Measuring central bank communication: an automated approach with application to FOMC statements,” Tech. rep., National Bureau of Economic Research.
Maddaloni, A. and J.-L. Peydró (2011): “Bank risk-taking, securitization, supervision, and low interest rates: Evidence from the euro-area and the US lending standards,” Review of Financial Studies, 24, 2121–2165.
Malkhozov, A., P. Mueller, A. Vedolin, and G. Venter (2016): “Mortgage risk and the yield curve,” Review of Financial Studies, 29, 1220–1253.
Mankiw, N. G. and L. H. Summers (1984): “Do long term interest rates overreact to short-term interest rates?” Brookings Papers on Economic Activity, 1, 223–242.
Nakamura, E. and J. Steinsson (forthcoming): “High-frequency identification of monetary non-neutrality: The information effect,” Quarterly Journal of Economics.
Newey, W. K. and K. D. West (1987): “A simple, positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix,” Econometrica, 55, 703–708.
Piazzesi, M., J. Salomao, and M. Schneider (2015): “Trend and cycle in bond premia,” Working Paper, Stanford University.
Samuelson, P. A. (1947): Foundations of Economic Analysis, Cambridge, MA: Harvard University Press.
Shiller, R. J. (1979): “The volatility of long-term interest rates and expectations models of the term structure,” Journal of Political Economy, 87, 1190–1219.
Shiller, R. J., J. Y. Campbell, and K. L. Schoenholtz (1983): “Forward rates and future policy: Interpreting the term structure of interest rates,” Brookings Papers on Economic Activity, 1, 173–217.
Shin, H. S. (2017): “How much should we read into shifts into long-dated yields?” U.S. Monetary Policy Forum, speech.
Spiegel, M. (1998): “Stock price volatility in a multiple security overlapping generations model,” Review of Financial Studies, 11, 419–447.
Stein, J. C. (2013): “Yield-oriented investors and the monetary transmission mechanism.”
Stein, J. C. and A. Sunderam (forthcoming): “The Fed, the bond market, and gradualism in monetary policy,” Journal of Finance.
Stock, J. H. and M. W. Watson (2007): “Why has U.S. inflation become harder to forecast?” Journal of Money, Credit and Banking, 39, 3–33.
Swanson, E. T. and J. C. Williams (2014): “Measuring the effect of the zero lower bound on medium- and longer-term interest rates,” American Economic Review, 104, 3154–3185.
Thornton, D. L. (forthcoming): “Greenspan’s conundrum and the Fed’s ability to affect long-term yields,” Journal of Money, Credit and Banking.
Vayanos, D. and J.-L. Vila (2009): “A preferred-habitat model of the term structure of interest rates,” Tech. rep., National Bureau of Economic Research.
Wright, J. H. (2012): “What does monetary policy do to long-term interest rates at the zero lower bound?” Economic Journal, 122, F447–F466.

Table 1: Regression of changes in long-term rates on short-term rates. This table reports the estimated regression coefficients from equations (1.1) and (1.2) for each reported sample. The dependent variable is the change in the 10-year yield or forward rate, either nominal, real or their difference (IC, or inflation compensation). The independent variable is the change in the 1-year nominal yield in all cases.
Changes are considered with daily data, and with monthly data using monthly (h = 1), quarterly (h = 3), semi-annual (h = 6) and annual (h = 12) horizons. In the 1971-1999 monthly sample, time t runs from 1971m8 to 1999m12 and the number of monthly observations is 341 irrespective of h. In the 2000-2017 monthly sample, t runs from 2000m1 to 2017m12, so the number of monthly observations runs from 215 for h = 1 to 204 for h = 12. For h > 1, we report Newey-West (1987) standard errors in brackets, using a lag truncation parameter of ⌈1.5 × h⌉; for h = 1, we report heteroskedasticity-robust standard errors. Significance: ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01. Significance is computed using the asymptotic theory of Kiefer and Vogelsang (2005), which has better finite-sample properties than traditional asymptotic theory.

Panel A: 10-year zero coupon yields and IC

               (1)        (2)        (3)        (4)
               Nominal    Nominal    Real       IC
Daily          0.56∗∗∗    0.86∗∗∗    0.55∗∗∗    0.31∗∗∗
               [0.02]     [0.03]     [0.03]     [0.02]
Monthly        0.46∗∗∗    0.64∗∗∗    0.37∗∗∗    0.26∗∗∗
               [0.04]     [0.11]     [0.09]     [0.10]
Quarterly      0.48∗∗∗    0.42∗∗∗    0.21∗      0.22
               [0.04]     [0.07]     [0.11]     [0.13]
Semi-annual    0.50∗∗∗    0.31∗∗∗    0.20∗∗     0.12
               [0.04]     [0.07]     [0.08]     [0.10]
Yearly         0.56∗∗∗    0.20∗∗∗    0.13∗      0.07
               [0.05]     [0.04]     [0.06]     [0.05]
Sample         1971-1999  2000-2017  2000-2017  2000-2017

Panel B: 10-year instantaneous forward yields and IC

               (1)        (2)        (3)        (4)
               Nominal    Nominal    Real       IC
Daily          0.39∗∗∗    0.47∗∗∗    0.31∗∗∗    0.17∗∗∗
               [0.03]     [0.04]     [0.03]     [0.03]
Monthly        0.29∗∗∗    0.22       0.17∗∗     0.06
               [0.04]     [0.14]     [0.08]     [0.09]
Quarterly      0.31∗∗∗    0.03       0.08       -0.04
               [0.05]     [0.09]     [0.05]     [0.05]
Semi-annual    0.33∗∗∗    -0.06      0.03       -0.09∗∗
               [0.06]     [0.07]     [0.04]     [0.04]
Yearly         0.39∗∗∗    -0.17∗∗∗   -0.03      -0.14∗∗∗
               [0.07]     [0.05]     [0.05]     [0.03]
Sample         1971-1999  2000-2017  2000-2017  2000-2017

Table 2: Regression of changes in long-term international rates on short-term rates. This table reports the estimated regression coefficients from equation (1.1) for the United Kingdom (UK), Germany (DE), and Canada (CAN) on each reported sample. We obtain data on each country’s zero-coupon yield curve from each country’s central bank website. The dependent variable is the change in the 10-year zero-coupon yield, either nominal, real, or their difference, i.e., inflation compensation (IC). The independent variable is the change in the 1-year nominal yield in all cases. Changes are considered with daily data, and with monthly data using monthly (h = 1), quarterly (h = 3), semi-annual (h = 6) and annual (h = 12) horizons.
For h > 1, we report Newey-West (1987) standard errors in brackets, using a lag truncation parameter of ⌈1.5 × h⌉; for h = 1, we report heteroskedasticity-robust standard errors. Significance: ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01. Significance is computed using the asymptotic theory of Kiefer and Vogelsang (2005), which has better finite-sample properties than traditional asymptotic theory.

Panel A: UK 10-year zero-coupon yields

             (1)        (2)        (3)        (4)        (5)        (6)
             Nominal    Nominal    Real       Real       IC         IC
Daily        0.44∗∗∗    0.86∗∗∗    0.14∗∗∗    0.63∗∗∗    0.29∗∗∗    0.23∗∗∗
             [0.04]     [0.03]     [0.01]     [0.03]     [0.04]     [0.02]
Monthly      0.47∗∗∗    0.55∗∗∗    0.19∗∗∗    0.12       0.28∗∗∗    0.43∗∗∗
             [0.06]     [0.13]     [0.04]     [0.23]     [0.08]     [0.14]
Quarterly    0.49∗∗∗    0.43∗∗∗    0.23∗∗∗    0.04       0.26∗∗     0.39∗∗∗
             [0.08]     [0.10]     [0.04]     [0.17]     [0.10]     [0.09]
Semi-annual  0.45∗∗∗    0.39∗∗∗    0.22∗∗∗    0.07       0.23∗∗     0.32∗∗∗
             [0.09]     [0.08]     [0.05]     [0.11]     [0.11]     [0.06]
Yearly       0.38∗∗∗    0.29∗∗∗    0.16∗∗     0.05       0.22∗∗     0.24∗∗∗
             [0.06]     [0.06]     [0.06]     [0.08]     [0.08]     [0.03]
Sample       1985-1999  2000-2017  1985-1999  2000-2017  1985-1999  2000-2017

Panel B: German and Canadian 10-year nominal zero-coupon yields

             (1)        (2)        (3)        (4)
             DE         DE         CAN        CAN
Daily                   0.65∗∗∗    0.42∗∗∗    0.71∗∗∗
                        [0.03]     [0.03]     [0.03]
Monthly      0.34∗∗∗    0.50∗∗∗    0.46∗∗∗    0.51∗∗∗
             [0.05]     [0.10]     [0.05]     [0.08]
Quarterly    0.41∗∗∗    0.44∗∗∗    0.51∗∗∗    0.38∗∗∗
             [0.04]     [0.07]     [0.05]     [0.05]
Semi-annual  0.41∗∗∗    0.41∗∗∗    0.50∗∗∗    0.26∗∗∗
             [0.04]     [0.08]     [0.07]     [0.05]
Yearly       0.43∗∗∗    0.33∗∗∗    0.43∗∗∗    0.12∗
             [0.04]     [0.10]     [0.08]     [0.06]
Sample       1972-1999  2000-2017  1986-1999  2000-2017
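The horizon regressions summarized in Tables 1 and 2 can be illustrated with a short sketch. It is not the authors' code and rests on stated assumptions: equation (1.1) is taken to be an OLS regression (with a constant) of the h-month change in the long yield on the h-month change in the short yield; the Newey-West variance uses Bartlett weights with ceil(1.5 × h) lags, which is one reading of the truncation rule in the caption; with h = 1 it uses zero lags, which reduces to heteroskedasticity-robust standard errors. The Kiefer and Vogelsang (2005) critical values are not implemented. The function names `horizon_changes`, `ols_newey_west`, and `beta_h` are hypothetical.

```python
import math
import numpy as np


def horizon_changes(x, h):
    """Return the h-period changes x[t] - x[t-h] of a series."""
    x = np.asarray(x, dtype=float)
    return x[h:] - x[:-h]


def ols_newey_west(y, x, lags):
    """OLS of y on a constant and x.

    Returns (slope, standard error of the slope), where the variance is the
    Newey-West (Bartlett kernel) estimator with the given number of lags.
    With lags = 0 this is the White heteroskedasticity-robust estimator.
    """
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(x, dtype=float)])
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    Z = X * resid[:, None]            # moment contributions x_t * u_t
    S = Z.T @ Z                       # lag-0 term
    for j in range(1, lags + 1):
        weight = 1.0 - j / (lags + 1.0)   # Bartlett weight
        gamma_j = Z[j:].T @ Z[:-j]
        S += weight * (gamma_j + gamma_j.T)
    V = xtx_inv @ S @ xtx_inv
    return beta[1], math.sqrt(V[1, 1])


def beta_h(long_yield, short_yield, h):
    """Slope of h-month changes in the long yield on h-month changes in the short yield."""
    dy = horizon_changes(long_yield, h)
    dx = horizon_changes(short_yield, h)
    lags = math.ceil(1.5 * h) if h > 1 else 0   # assumed truncation rule
    return ols_newey_west(dy, dx, lags)
```

For monthly series stored in arrays `y10` and `y1` (hypothetical names), `beta_h(y10, y1, 12)` would return the yearly coefficient and its standard error, and `beta_h(y10, y1, 1)` the monthly counterpart.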

Table 3: Estimates of predictive equations for level and slope. This table reports the estimated regression coefficients from monthly predictive equations (2.2a) and (2.2b) with h = 12 for the 1971m8–1999m12 and 2000m1–2017m12 subsamples. Dependent variables are the level ($L_t \equiv y_t^{(1)}$) and slope ($S_t \equiv y_t^{(10)} - y_t^{(1)}$) of the U.S. Treasury zero-coupon yield curve. Heteroskedasticity-robust standard errors are in brackets. Significance: ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01. The table also shows AIC and BIC values (to be minimized) for each possible specification of the system of two equations. Lastly, the implied coefficients β1 and β12 in equation (1.1) corresponding to each possible specification of the system are reported.

              Pre-2000                      Post-2000
              (1)       (2)       (3)       (4)       (5)       (6)

Dependent Variable: Level
Lt            0.98∗∗∗   0.97∗∗∗   0.96∗∗∗   0.97∗∗∗   0.98∗∗∗   0.98∗∗∗
              [0.02]    [0.02]    [0.02]    [0.01]    [0.01]    [0.01]
St            0.00      0.02      -0.01     -0.03∗    0.02      0.02
              [0.04]    [0.04]    [0.04]    [0.02]    [0.02]    [0.02]
Lt − Lt−12              0.02      0.06∗∗              0.07∗∗∗   0.07∗∗∗
                        [0.02]    [0.03]              [0.02]    [0.02]
St − St−12                        0.11∗∗                        -0.00
                                  [0.05]                        [0.02]

Dependent Variable: Slope
Lt            0.01      0.01      0.02      -0.00     -0.02∗∗   -0.02
              [0.01]    [0.01]    [0.01]    [0.01]    [0.01]    [0.01]
St            0.96∗∗∗   0.94∗∗∗   0.95∗∗∗   0.97∗∗∗   0.90∗∗∗   0.91∗∗∗
              [0.03]    [0.03]    [0.03]    [0.02]    [0.02]    [0.02]
Lt − Lt−12              -0.02     -0.04∗∗             -0.09∗∗∗  -0.11∗∗∗
                        [0.01]    [0.02]              [0.02]    [0.02]
St − St−12                        -0.05                         -0.03
                                  [0.03]                        [0.03]

N             341       329       329       215       215       215
Implied β1    0.46      0.46      0.46      0.65      0.68      0.68
Implied β12   0.52      0.48      0.54      0.58      0.28      0.25
AIC           -5720.3   -5504.6   -5506.5   -4048.3   -4098.8   -4096.4
BIC           -5697.3   -5474.2   -5468.6   -4028.1   -4071.8   -4062.7
Sample        1971-1999 1972-1999 1972-1999 2000-2017 2000-2017 2000-2017
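To make the idea of an "implied" coefficient concrete, the following sketch computes βh analytically for the simplest case only. It assumes, for illustration, that level and slope follow a stationary monthly VAR(1), x_{t+1} = A x_t + ε_{t+1} with x_t = (L_t, S_t)' and Var(ε) = Σ, and that the long yield equals L_t + S_t. This corresponds at most to the specifications without lagged changes (columns (1) and (4)); the specifications with lagged changes would require an enlarged state vector. The function name `implied_beta` is hypothetical, and the shock covariance Σ is not reported in the table, so it has to be supplied by the user.

```python
import numpy as np


def implied_beta(A, Sigma, h):
    """Implied slope of h-period changes in (L + S) on h-period changes in L.

    Assumes x_{t+1} = A x_t + eps, x = (L, S)', Var(eps) = Sigma, with A stationary.
    """
    A = np.asarray(A, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    if np.max(np.abs(np.linalg.eigvals(A))) >= 1.0:
        raise ValueError("the VAR(1) must be stationary")
    k = A.shape[0]
    # Stationary covariance solves Gamma0 = A Gamma0 A' + Sigma,
    # i.e. vec(Gamma0) = (I - A kron A)^{-1} vec(Sigma).
    gamma0 = np.linalg.solve(
        np.eye(k * k) - np.kron(A, A), Sigma.ravel()
    ).reshape(k, k)
    A_h = np.linalg.matrix_power(A, h)
    # Var(x_{t+h} - x_t) = 2 Gamma0 - A^h Gamma0 - Gamma0 (A^h)'.
    V = 2.0 * gamma0 - A_h @ gamma0 - gamma0 @ A_h.T
    # Cov(change in L + S, change in L) / Var(change in L).
    return (V[0, 0] + V[1, 0]) / V[0, 0]
```

Evaluating `implied_beta(A, Sigma, 1)` and `implied_beta(A, Sigma, 12)` for the same inputs shows how the persistence in A shapes the gap between the monthly and yearly coefficients under these assumptions.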

Table 4: Estimates of predictive equations for bond excess returns. This table reports the estimated regression coefficients in equations (2.8a), (2.8b), and (2.8c) using monthly data from the 1971m8–1999m12 and 2000m1–2017m12 subsamples. We report results for various combinations of the return forecast horizon (k) and the past change in yield curve factors (h). Significance: ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01. For k = 1-month returns, we report heteroskedasticity-robust standard errors in brackets. For k = 3-month returns, we report Newey and West (1987) standard errors in brackets, using a lag truncation parameter of 5 months. In this case, p-values are computed using the asymptotic theory of Kiefer and Vogelsang (2005), which has better finite-sample properties than traditional asymptotic theory.

[Table 4 body not recoverable from the source. Column groups: Pre-2000 and Post-2000; dependent variables are indexed t→t+k.]

Table 5: Sharpe ratios for slope-mimicking portfolios. This table reports the annualized Sharpe ratios since 2000 of the strategy of going long (short) the slope-mimicking portfolio if the level fell (rose) over the previous h months and also the strategy of taking a position in the slope-mimicking portfolio that is proportional to −(Lt − Lt−h), and holding the position from t to t + 1. The position is rebalanced each month. Annualized Sharpe ratios are computed as the sample average monthly excess returns multiplied by √12 and divided by the standard deviation of those monthly excess returns.

Strategy with h:            1       3       6       12

Pre-2000
2 × I(Lt − Lt−h < 0) − 1    0.22    0.03    -0.09   -0.00
−(Lt − Lt−h)                0.08    -0.01   0.12    0.09

Post-2000
−(Lt − Lt−h)                0.38    0.62    0.47    0.45
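The two trading rules and the annualization described in the caption of Table 5 can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: `slope_port_ret[t]` is taken to be the excess return on the slope-mimicking portfolio from month t to t + 1, the standard deviation is the sample standard deviation (ddof = 1), and the function names `annualized_sharpe` and `strategy_returns` are hypothetical. Construction of the slope-mimicking portfolio itself is not shown.

```python
import numpy as np


def annualized_sharpe(monthly_excess_returns):
    """Mean monthly excess return times sqrt(12), divided by the monthly standard deviation."""
    r = np.asarray(monthly_excess_returns, dtype=float)
    return np.sqrt(12.0) * r.mean() / r.std(ddof=1)


def strategy_returns(level, slope_port_ret, h, proportional=False):
    """Monthly returns of a strategy conditioned on the past h-month change in level.

    level[t]          : level factor L_t
    slope_port_ret[t] : excess return on the slope-mimicking portfolio from t to t+1
    proportional=False: position is 2 * I(L_t - L_{t-h} < 0) - 1 (long if level fell)
    proportional=True : position is -(L_t - L_{t-h})
    """
    level = np.asarray(level, dtype=float)
    ret = np.asarray(slope_port_ret, dtype=float)
    past_change = level[h:] - level[:-h]          # L_t - L_{t-h} for t = h, ..., T-1
    if proportional:
        position = -past_change
    else:
        position = 2.0 * (past_change < 0) - 1.0
    return position * ret[h:]                     # position set at t earns the t -> t+1 return
```

A Sharpe ratio for a given h would then be obtained as `annualized_sharpe(strategy_returns(level, slope_port_ret, h))`, with the position rebalanced every month as in the caption.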

Table 6: Economic News Index and subsequent changes in slope. This table reports the regression coefficients in equation (4.1) using monthly data from the 1971m8–1999m12 and 2000m1–2017m12 subsamples. Newey-West (1987) standard errors are in brackets, using a lag truncation parameter of ⌈1.5 × h⌉. Significance: ∗p < 0.1, ∗∗p < 0.05, ∗∗∗p < 0.01. Significance is computed using the asymptotic theory of Kiefer and Vogelsang (2005), which has better finite-sample properties than traditional asymptotic theory.

              Pre-2000                      Post-2000
              (1)       (2)       (3)       (4)       (5)       (6)
Dep. var.: St+h − St, with h =
              3         6         12        3         6         12

Panel A
NewsIndext    -0.42     0.11      -0.12     -1.31∗∗∗  -1.85∗∗∗  -2.85∗∗∗
              [0.36]    [0.43]    [0.46]    [0.23]    [0.53]    [0.78]
Adj. R2       0.01      -0.00     -0.00     0.11      0.09      0.08

Panel B
Lt            0.03      0.06∗     0.13∗∗∗   -0.01     0.02      0.10
              [0.02]    [0.03]    [0.05]    [0.04]    [0.08]    [0.13]
St            -0.13∗    -0.19∗    -0.30∗    -0.09∗    -0.19∗    -0.45∗∗
              [0.07]    [0.10]    [0.13]    [0.05]    [0.09]    [0.19]
NewsIndext    -0.61∗∗   -0.20     -0.67     -1.28∗∗∗  -1.69∗∗∗  -2.33∗∗∗
              [0.31]    [0.41]    [0.44]    [0.21]    [0.50]    [0.88]
Adj. R2       0.10      0.15      0.28      0.15      0.20      0.40

Panel C
Lt            0.03      0.07∗     0.13∗∗∗   -0.00     0.02      0.10
              [0.03]    [0.03]    [0.04]    [0.03]    [0.07]    [0.13]
St            -0.12∗    -0.19∗    -0.30∗∗   -0.09∗∗   -0.20∗∗   -0.45∗∗
              [0.06]    [0.09]    [0.13]    [0.04]    [0.09]    [0.19]
NewsIndext    -0.79∗∗∗  -0.38     -0.72∗    -0.52     -0.29     -0.07
              [0.28]    [0.32]    [0.36]    [0.34]    [0.54]    [0.51]
Lt − Lt−1     0.08      0.08      0.02      -0.63∗∗   -1.17∗∗   -1.89∗∗∗
              [0.12]    [0.11]    [0.14]    [0.27]    [0.49]    [0.56]
Adj. R2       0.10      0.15      0.28      0.18      0.25      0.45

N             341       341       341       214       211       205
Sample        1971-1999 1971-1999 1971-1999 2000-2017 2000-2017 2000-2017

Figure 1: Treasury yield curves around selected episodes. This figure plots the Treasury yield curve (dotted line) in the original 2004 “conundrum” episode, the 2008 “conundrum in reverse” episode and the “2017 conundrum”. Diamonds display the target for the federal funds rate in each episode.

[Panels: 2004 Conundrum (Jun 29, 2004 and Feb 2, 2005); 2008 Conundrum in Reverse (Jan 29, 2008 and Oct 28, 2008); 2017 Conundrum (Dec 12, 2016 and Dec 13, 2017). Horizontal axis: Maturity (years).]

Figure 2: Rolling Regression Estimates of Equations (1.1) and (1.2). This figure plots rolling estimates of the slope coefficients in equations (1.1) and (1.2) with h = 12-month changes using 10-year rolling windows for estimation. Results are plotted against the midpoint of the 10-year rolling window. 95% confidence intervals are included (shaded areas), formed using Newey-West standard errors with a lag truncation parameter of 18 and 95% critical values from the asymptotic theory of Kiefer and Vogelsang (2005). Specifically, the 95% confidence interval is ±2.41 times the estimated standard errors as opposed to ±1.96 under traditional asymptotic theory.

[Panels: 10-year yield; 10-year forward rate. Horizontal axis: window midpoint, Jan80 to Jan10.]

Figure 3: Break Tests for Equations (1.1) and (1.2). This figure plots the Wald test statistic for each possible break date in equations (1.1) and (1.2) with h = 12-month changes, for break dates from 15% to 85% of the way through the sample. The horizontal red dashed lines denote 10%, 5% and 1% critical values for the maximum of these Wald statistics as in Andrews (1993). Our Wald tests use a Newey and West (1987) variance matrix with a lag truncation parameter of 18. To address the tendency for tests based on the Newey-West variance estimator to over-reject in finite samples, we use the Cho and Vogelsang (2017) critical values for a null of no structural break. The Cho and Vogelsang (2017) critical values are based on the asymptotic theory of Kiefer and Vogelsang (2005) and are slightly larger than the traditional critical values from Andrews (1993).

[Panels: 10-year yield; 10-year forward rate. Horizontal axis: break date, Jan80 to Jan10.]

Figure 4: Cross-Correlation of Changes in Level and Slope. Level is the 1-year zero coupon Treasury yield; slope is the 10-year less the 1-year yield. This figure plots the cross correlation Corr(∆Lt, ∆St+j ) as a function of the lead j. Thus, correlations at positive leads (on the right of the figure) denote correlations between monthly changes in level and future changes in slope. The cross-correlation at lead j = 0 is the contemporaneous correlation of daily changes in level and slope.

[Series: Pre-2000 and Post-2000. Horizontal axis: Leads of level (changes, months), from -12 to 12.]
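The cross-correlations plotted in Figure 4 can be computed along the following lines. This is a minimal sketch assuming monthly series of level and slope held in NumPy-compatible arrays; the function name `cross_correlation` is hypothetical, and first differences are used for the monthly changes.

```python
import numpy as np


def cross_correlation(level, slope, max_lead=12):
    """Return {j: Corr(dL_t, dS_{t+j})} for j = -max_lead, ..., max_lead.

    Positive j pairs changes in level with future changes in slope.
    """
    d_level = np.diff(np.asarray(level, dtype=float))
    d_slope = np.diff(np.asarray(slope, dtype=float))
    n = len(d_level)
    result = {}
    for j in range(-max_lead, max_lead + 1):
        if j >= 0:
            a, b = d_level[: n - j], d_slope[j:]      # (dL_t, dS_{t+j})
        else:
            a, b = d_level[-j:], d_slope[: n + j]     # (dL_t, dS_{t-|j|})
        result[j] = np.corrcoef(a, b)[0, 1]
    return result
```

Running the function separately on the pre-2000 and post-2000 subsamples would give the two series shown in the figure, up to any differences in data construction.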

Figure 5: Counterfactual paths of ten-year yields in selected “conundrum” episodes. This figure plots 1- and 10-year yields in the original 2004 “conundrum” episode, the 2008 “conundrum in reverse” episode and the “2017 conundrum.” As described in the text, we also plot counterfactual 10-year yields generated from restricting the slope to depend on lags of level and slope, but not also on lagged changes in level and slope.

[Panels: 2004 Conundrum (Feb04 to May05); 2008 Conundrum in reverse (Sep07 to Sep08); 2017 Conundrum (Aug16 to Nov17). Series in each panel: Actual 1y, Actual 10y, Alt 10y.]

Figure 6: Predicting returns, the changes in forwards, and the change in yields for various bond maturities n. This figure plots the coefficients δ3,n on the past change in the level factor versus bond maturity n from estimating equation (2.9) for returns, equation (2.10) for the change in forward rates, and equation (2.11) for the change in yields. The results are shown for k = 3-month returns or future changes using h = 6-month past changes in level. Due to the use of overlapping data, we plot 95% confidence intervals using Newey-West (1987) standard errors with a lag truncation parameter of 5. The critical values are computed using the asymptotic theory of Kiefer and Vogelsang (2005), which has better finite-sample properties than traditional asymptotic theory. For instance, in the post-2000 sample, the 95% confidence interval is ±2.03 times the estimated standard errors as opposed to ±1.96 under traditional asymptotic theory.

[Panels: Future returns; Future changes in forwards; Future changes in yields. Series in each panel: Pre-2000 and Post-2000. Horizontal axis: Bond maturity in years (n), from 0 to 20.]

Figure 7: Model-implied coefficients βh in equation (3.17) against horizon (h) in months. We show this separately for the pre-2000 and post-2000 calibrations discussed in the text.

[Single panel. Horizontal axis: horizon h in months, from 0 to 12. Vertical axis: βh, from 0.25 to 0.55.]
