## 1 Introduction

Since the seminal contributions of Klein (1969, 1971), economic forecasts have been built upon the presumption that the relationships between economic variables remain stable over time. However, the last decades have witnessed many socio-economic episodes and technological advancements that have led economists to reconsider the assumption of model stability. The compelling empirical evidence documented in, among others, Perron (1989) and Stock and Watson (1996) [see also the recent survey by Ng and Wright (2013)] has motivated the development of econometric methods that detect such instabilities—most work directed toward structural changes—and estimate the actual dates at which economic relationships change. Yet, the issue of parameter instability is not limited to model estimation. In the forecasting literature, there has been widespread consensus that the major issue preventing good forecasts of economic variables is parameter instability—and structural changes as a special case—[cf. Banerjee, Marcellino, and Masten (2008), Clements and Hendry (1998, 2006), Elliott and Timmermann (2016), Giacomini (2015), Giacomini and Rossi (2015), Inoue and Rossi (2011), Clark and McCracken (2005), Pesaran, Pettenuzzo, and Timmermann (2006) and Rossi (2013a)].

This paper develops a statistical setting under infill asymptotics
to address the issue of testing whether the predictive ability of
a given forecast model remains stable over time. Ng and Wright (2013)
and Stock and Watson (2003) document abundant evidence that a predictor
which has performed well over a certain time period may not perform
as well during subsequent periods.
For example, Gilchrist and Zakrajšek (2012) proposed a new credit
spread index and showed that a residual component labeled as the excess
bond premium—the credit spread adjusted for expected default
risk—has considerable predictive content for future economic
activity. They documented that this forecasting ability is stronger
over the subsample 1985-2010 than over the full sample starting
in 1973.^{1} They reported that structural change tests provide some statistical
evidence for a break in a coefficient associated with financial indicators—more
specifically the coefficient on the federal funds rate. Given the
latter evidence and the well-documented change in the conduct of monetary
policy in the late 1970s and the early 1980s, it seems plausible to
split the sample in 1985 (see p. 1709 and footnote 11 in their paper). The latter finding can be attributed to a more developed bond market
in the 1985-2010 subsample. Relatedly, Giacomini and Rossi (2010)
and Ng and Wright (2013) further examined this finding and found
that indeed the predictive ability of commonly used term and credit
that indeed the predictive ability of commonly used term and credit spreads is unstable and somewhat episodic. The latter authors suggested
that credit spreads may be more useful predictors of economic activity
in a more highly leveraged economy and that recent developments in
financial markets translate into credit spreads containing more information
than they had previously. We refer to such temporal instability for
a given forecasting method as forecast instability or more
specifically, as forecast failure. These terminologies are
not new to professional forecasters as they were informally introduced
by Clements and Hendry (1998) and generalized in econometric terms
by Giacomini and Rossi (2009) who interpreted forecast breakdown
(or forecast failure) as a situation in which the out-of-sample
performance of a forecast model significantly deteriorates relative
to its in-sample performance. Our approach is to formally define forecast
instability from the economic forecaster’s perspective.^{2} We use the terminology “instability” because not only the deterioration
but also the improvement of the performance of a given forecast model
over time can provide useful information to the forecaster. We emphasize that a forecast failure may well result from a short
period of instability within the out-of-sample and not necessarily
require that the instability be systematic in the sense of persisting
throughout the whole out-of-sample period. That is, consistency of
a forecast model’s performance with expected performance given the
past should hold not only throughout the out-of-sample but also in
any sub-sample of the latter. Indeed, many documented episodes of
forecast failure seem to arise from parameter nonconstancy in the
data-generating process over time periods that are relatively short
compared to the total sample size. Hence, the desire to focus on
statistical tests able to detect short-lasting instabilities is intuitive:
if a test for forecast failure requires the deterioration of forecasting
ability to last for, say, at least half of the total sample in order
to have sufficiently high power to reject the null hypothesis, then
this test would not perform well in practice because instability
can be short-lived. Furthermore, the occurrence of recurrent structural
instabilities or multiple breaks that compensate each other in the
out-of-sample might lead a forecast model to perform, on average,
in a similar fashion as in the in-sample period. However, should a
forecaster learn about those recurrent changes, she would conceivably
revise her forecast model to adapt to the unstable environment. Hence,
we introduce the following definition.

###### Definition 1.1.

(Forecast Instability)

Forecast Instability refers to a situation of either sustained deterioration
or improvement of the predictive ability of a given forecast model
relative to its historical performance that would have led a forecaster
to revise or reconsider her forecast model had she known of the occurrence
of such instability. The time lengths of these two distinct periods
need not bear any relationship.^{3} Forecast Failure constitutes a special case of the definition—namely,
a sustained deterioration of predictive ability.

The definition places the economic forecaster at its center; consequently,
it is not merely a statistical definition; rather, it is based on
an equilibrium concept. Since forecasting constitutes a decision-theoretic
problem, it should be from the forecaster’s perspective that a given
forecast model is deemed to have failed. The definition implicitly
distinguishes between the forecasting method and the forecast model. Two forecasters
may share the same forecast model—the relationship between
the variable of interest and the predictor—but use different
methods (e.g., recursive scheme versus rolling scheme). Thus, instability
refers to a given method-model pair. The object of the definition
is predictive ability. Since the latter can be measured differently
by different loss functions, then the definition applies to a given
choice of the loss function. A notable aspect of the definition is
the reference to the time span of the historical performance and of
the putative period of instability. They need not be related. Consider
a given forecasting strategy which has performed well during, say,
the Great Moderation (i.e., from the mid-1980s up to the beginning
of the Great Recession in 2007). Assume that during the years 2007-2012
this method endures a time of poor performance and returns to perform
well thereafter. According to our definition, this episode constitutes
an example of forecast instability. However, if one designs the forecasting
exercise in such a way that half of the sample is used for estimation
and the remaining half for prediction, then this relatively short
period of instability gets “averaged-out” from tests which simply
compare the in-sample and out-of-sample averages. Conceivably, such
tests would not reject the null hypothesis of no forecast failure,
even though a forecaster would plausibly have revised her strategy
during the crisis had she known about the ongoing and impending
under-performance. Finally, detection of
forecast instability does not necessarily mean that a forecast model
should be abandoned. In fact, its performance may have improved over
time. Yet, even if forecast instability is induced by performance
deterioration, a forecaster might not end up switching to a new predictor.
For example, entering a state of high variability might lead to poor
performance even if the forecast model is still correct. Hence, our
definition uses the term reconsider. Continuing with the
above example, a forecaster may reconsider the choice of
the forecasting window since a longer window may now produce better
forecasts while keeping the same forecast model. In other
words, knowledge of forecast instability is important because it indicates
that care must be exercised in assessing the source of the changes.^{4} Economists have documented episodes of forecast failure in many areas
of macroeconomics. In the empirical literature on exchange rates a
prominent forecast failure is associated with the Meese and Rogoff’s
puzzle [cf. Meese and Rogoff (1983), Cheung, Chinn, and Garcia Pascual (2005),
and Rossi (2013b) for an up-to-date account]. In the context
of inflation forecasting, forecast failures have been reported by
Atkeson and Ohanian (2001) and Stock and Watson (2009). For forecast
instability concerning other macroeconomic variables see the surveys
of Stock and Watson (2003) and Ng and Wright (2013).

The theoretical implication is that our tests for forecast instability are based on the local behavior of the sequence of realized forecast losses. This contrasts with existing tests for forecast instability—and classical structural change tests more generally—which instead rely on a global and retrospective methodology that merely compares the average of in-sample losses with the average of out-of-sample losses. While maintaining approximately correct nominal size, our class of test statistics achieves substantial gains in statistical power relative to previous methods. Furthermore, as the initial timing of the instability moves away from the middle of the sample toward the tail of the out-of-sample period, the gains in power become considerable.

In this paper, we set out a continuous record asymptotic framework for a forecasting environment in which observations at equidistant time intervals are made over a fixed time span. These observations are realizations from a continuous-time model for the variable to be forecast and for the predictors. From these discretely observed realizations we compute a sequence of forecasts using either a fixed, recursive or rolling scheme. To this sequence of forecasts there corresponds a continuous-time process which satisfies mild regularity conditions and which, under the null hypothesis, possesses a continuous sample path. We exploit this pathwise property to base a hypothesis testing problem on the relative performance of a given forecast model over time. Under the null hypothesis we expect the sequence of losses to display a smooth and stable path. Any discontinuous or jump behavior, followed by a (possibly short) period of substantial discrepancy from the path observed over the in-sample period, provides evidence against the null. Our asymptotic theory involves a continuous record of observations: we let the sample size grow to infinity by shrinking the sampling interval to zero while the time span is kept fixed, thereby approaching the continuous-time limit.

Our underlying probabilistic model is specified in terms of continuous
Itô semimartingales, which are standard building blocks for the analysis
of macro and financial high-frequency data [cf. Andersen, Bollerslev,
Diebold, and Labys (2001),
Andersen, Fusari, and Todorov (2016), Bandi and Renò (2016) and Barndorff-Nielsen and Shephard (2004)];
the theoretical methodology is thus related to that of Casini and Perron (2017a),
Li, Todorov, and Tauchen (2017), Li and Xiu (2016) and Mykland and Zhang (2009).^{5} Recent work by Li and Patton (2017) extends standard methods for
testing predictive accuracy of forecasts to a high-frequency financial
setting. The framework is not only useful for high-frequency data; in particular,
recent work of Casini and Perron (2017a,
2017b) has adopted this continuous-time
approach for modeling time series regression models with structural
changes fitted to low-frequency data (e.g., macroeconomic data that
are sampled at weekly, monthly, quarterly, annual frequency, etc.).
They showed that this continuous-time approach delivers a
better approximation to the finite-sample distributions of estimators
in structural change models and that inference is more reliable than
under previous methods based on classical long-span asymptotics.

The classical approach to economic forecasting for macroeconomic variables is to formulate models in discrete time and then base inference on long-span asymptotics, where the sample size increases without bound and the sampling interval remains fixed [cf. Diebold and Mariano (1995), Giacomini and White (2006) and West (1996)]. There are crucial distinctions between this classical approach and the setting introduced in this paper. Under long-span asymptotics, identification of parameters hinges on assumptions on the distributions or moments of the studied processes [cf. the specification of the null hypothesis in Giacomini and Rossi (2009)], whereas within a continuous-time framework, unknown structural parameters are identified from the sample paths of the studied processes. Hence, we only need to assume rather mild pathwise regularity conditions for the underlying continuous-time model and can avoid any ergodic or weak-dependence assumption. As in Casini and Perron (2017a), our framework encompasses any time series regression model, allowing for general forms of non-stationarity such as heteroskedasticity and serial correlation.

Given a null hypothesis stated in terms of the path properties of
the sequence of losses, we propose a test statistic which compares
the local behavior of the sequence of surprise losses defined as the
difference between the out-of-sample and in-sample losses. More specifically,
our maximum-type statistic examines the smoothness of the sequence
of surprise losses as the continuous-time limit is approached. Under
the null hypothesis, the continuous-time analogue of the sequence of losses
follows a continuous motion, and any deviation from such a smooth path
is interpreted as evidence against the null. The null distribution
of the test statistic is non-standard and follows an extreme value
distribution. Therefore, our limit theory exploits results from extreme
value theory as elaborated by Bickel and Rosenblatt (1973) and Galambos (1987).^{6} In nonparametric change-point testing, related works are Wu and Zhao (2007)
and Bibinger, Jirak, and Vetter (2017).
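To fix ideas before the formal development, the maximum-type construction can be sketched numerically: average the surprise losses over small blocks and take the maximal standardized deviation. The sketch below is illustrative only; the block length, normalization and comparison are placeholder choices of ours, not the paper's exact statistic or critical values.

```python
import numpy as np

def max_block_statistic(surprise_losses, block_len):
    """Maximal standardized deviation of non-overlapping block averages
    of surprise losses. Illustrative sketch: the paper's statistic uses a
    continuous-record normalization and extreme-value critical values."""
    sl = np.asarray(surprise_losses, dtype=float)
    n_blocks = len(sl) // block_len
    blocks = sl[: n_blocks * block_len].reshape(n_blocks, block_len)
    block_means = blocks.mean(axis=1)
    # Placeholder self-normalization by the dispersion of a block average.
    scale = sl.std(ddof=1) / np.sqrt(block_len)
    return np.max(np.abs(block_means)) / scale

rng = np.random.default_rng(0)
stable = rng.normal(0.0, 1.0, 400)       # no instability: mean-zero surprise losses
unstable = stable.copy()
unstable[300:340] += 3.0                 # short-lived deterioration
assert max_block_statistic(unstable, 40) > max_block_statistic(stable, 40)
```

A short burst of elevated losses dominates one block average and therefore the maximum, even though it would barely move a full-sample average.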

We propose two versions of the test statistic: one that is self-normalized and one that uses an appropriate estimator of the asymptotic variance. The test statistic is defined as the maximal deviation between the average surprise losses over asymptotically vanishing time blocks. Further, we consider extensions of each of these statistics which use overlapping rather than non-overlapping blocks. Although the two are asymptotically equivalent, the statistics based on overlapping blocks are more powerful in finite samples. In a framework that allows for model misspecification, non-stationarity such as heteroskedasticity and serial correlation in the forecast losses should be taken seriously. Given the block-based form of our test statistics, we derive an alternative estimator of the long-run variance of the forecast losses. This estimator differs from the popular estimators of Andrews (1991) and Newey and West (1987) [see Müller (2007) for a review] and is of independent interest. Finally, we extend our results to settings that allow for stochastic volatility, and we conduct a local power analysis that highlights a few differences between our testing framework and the structural change test of Andrews (1993). Related aspects, such as estimating the timing of the instability and covering high-frequency settings with jumps, are considered in a companion paper.

The rest of the paper is organized as follows. Section 2 introduces the statistical setting, the hypotheses of interest and the test statistics. Section 3 derives the asymptotic null distribution under a continuous record. We discuss the estimation of the asymptotic variance in Section 4. Some extensions and a local power analysis are presented in Section 5. Additional elements that are covered in our companion paper are briefly described in Section 6. A simulation study is contained in Section 7. Section 8 concludes. The supplemental material contains all mathematical proofs and additional simulation experiments.

## 2 The Statistical Environment

Section 2.1 introduces the statistical setting with a description of the forecasting problem and the sampling scheme considered throughout. The underlying continuous-time model and its assumptions are introduced in Section 2.2. In Section 2.3 we set out the testing problem and state the relevant null and alternative hypotheses. The test statistics are presented in Section 2.4. Throughout we adopt the following notational conventions. All limits are taken as the sample size grows to infinity or, equivalently, as the sampling interval shrinks to zero. All vectors are column vectors, and vector inequalities are understood to hold component-wise. Boundedness and convergence statements for sequences of matrices are understood element-wise. The norm of a non-stochastic vector is the Euclidean norm, whereas for a stochastic vector the same notation denotes the $L_2$ norm. We use the floor function for the greatest integer smaller than or equal to its argument, and indicator functions of sets are written in the usual way. A sequence is i.i.d. if its elements are independent and identically distributed. Standard symbols denote convergence in probability and weak convergence, respectively. We also make use of the space of positive definite real-valued matrices whose elements are càdlàg. The symbol $\triangleq$ denotes definitional equivalence.

### 2.1 The Forecasting Problem

The continuous-time stochastic process is defined on a filtered probability space and takes values in a Euclidean space, with one component being the variable to be forecast and the remaining components being the predictor variables. The process is indexed by continuous time over a fixed horizon whose length is referred to as the time span. That is, the unobserved process evolves within the fixed time horizon and the econometrician records its realizations, with a given sampling interval, at equidistant discrete time points. A continuous record asymptotic framework involves letting the sample size grow to infinity while shrinking the sampling interval to zero at the same rate, so that the time span remains fixed. An integer index is used for the observation (or tick) times.

The objective is to generate a series of multi-step-ahead forecasts. We shall adopt an out-of-sample
procedure whereby the time span is split into
an in-sample window and an out-of-sample window.^{7} Strictly speaking, this corresponds to the in-sample
window only for the fixed forecasting scheme to be introduced later—e.g.,
the rolling scheme only uses the most recent span of data of fixed length.
A minor and straightforward modification to this
notation applies when the recursive and rolling schemes
are considered. However, for all methods the split point indicates
the artificial separation marking the beginning
of the out-of-sample period. The latter two time horizons are supposed to be fixed, and therefore
a sample of fixed size is observed within the in-sample (or estimation)
window, while the out-of-sample (or prediction) window contains the
remaining observations. We consider a general
framework that allows for the three traditional forecasting schemes:
(1) a fixed forecasting scheme, in which the estimation sample always
consists of the initial in-sample observations;
(2) a recursive forecasting scheme, in which at each forecast origin the
estimation sample includes all observations from the beginning of the sample up to that origin;
(3) a rolling forecasting scheme, in which the time span of the
rolling window is fixed and of the same length as the in-sample window
(i.e., at each forecast origin the estimation sample includes only the most
recent observations spanning that length).^{8} Equivalently, the observation times within the rolling window
shift forward with the forecast origin.
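The three schemes differ only in which observations enter the estimation sample at each forecast origin. The sketch below enumerates those index sets; the 0-based index conventions are ours, chosen for the sketch, and differ from the paper's formal notation.

```python
def estimation_window(scheme, origin, in_sample_size):
    """Indices of the observations used to estimate the forecast model
    for a forecast made at time `origin` (0-based), under each scheme.
    Illustrative index conventions only."""
    if scheme == "fixed":
        # Always the initial in-sample window.
        return range(0, in_sample_size)
    if scheme == "recursive":
        # Expanding window: everything up to the forecast origin.
        return range(0, origin)
    if scheme == "rolling":
        # Most recent window of fixed length.
        return range(origin - in_sample_size, origin)
    raise ValueError(scheme)

assert list(estimation_window("fixed", 120, 100)) == list(range(0, 100))
assert list(estimation_window("recursive", 120, 100)) == list(range(0, 120))
assert list(estimation_window("rolling", 120, 100)) == list(range(20, 120))
```

Note that under the rolling scheme the window length never changes; only its location shifts with the forecast origin.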

The forecasts may be based on a parametric model whose parameter estimates at each forecast origin are collected into a random vector. If no parametric assumption is made, the same notation represents whatever semiparametric or nonparametric estimator is used for generating the forecasts. The forecast at a given origin is obtained by applying a measurable function to these estimates, where the notation indicates that the forecast is generated from the information contained in the relevant estimation sample.^{9} The size of this sample varies with the forecasting scheme; e.g., it is fixed under the rolling scheme while it grows with the forecast origin under the recursive scheme.

Next, we introduce a loss function which serves
to evaluate the performance of a given forecast model. More specifically,
each out-of-sample loss constitutes a statistical measure of accuracy
of the multi-step forecast made at a given origin. However, given the
objective of detecting potential instability of a certain forecasting
method over time, we additionally need to introduce the in-sample losses,
where each in-sample loss is computed from an in-sample fitted value,
with the fitted values varying over the specific in-sample window.
That is, to each forecast origin there corresponds a sequence of
in-sample fitted values.^{10} These are computed over the initial window
for the fixed scheme, over the expanding window for the recursive scheme,
and over the most recent window of fixed length for the rolling scheme.
The testing problem then turns into the detection of any “systematic
difference” between the sequences of out-of-sample and in-sample
losses; the formal measure of such a difference in our context is
provided below.
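The pairing of each out-of-sample loss with a sequence of in-sample losses can be sketched for a simple autoregression estimated recursively under quadratic loss. The model, scheme and loss here are illustrative choices of ours, not prescribed by the text.

```python
import numpy as np

def losses_at_origin(y, origin):
    """Fit y_t = a + b * y_{t-1} by OLS on observations before `origin`
    (recursive scheme), then return the out-of-sample quadratic loss of
    the 1-step forecast and the array of in-sample losses of the fitted
    values. Illustrative only."""
    ys = y[:origin]
    Xmat = np.column_stack([np.ones(origin - 1), ys[:-1]])
    coef, *_ = np.linalg.lstsq(Xmat, ys[1:], rcond=None)
    fitted = Xmat @ coef
    in_losses = (ys[1:] - fitted) ** 2           # in-sample losses
    forecast = coef[0] + coef[1] * y[origin - 1]
    out_loss = (y[origin] - forecast) ** 2       # out-of-sample loss
    return out_loss, in_losses

rng = np.random.default_rng(1)
y = rng.normal(size=200).cumsum() * 0.1 + rng.normal(size=200)
out_loss, in_losses = losses_at_origin(y, 150)
surprise_loss = out_loss - in_losses.mean()      # deviation used in the tests
```

The last line anticipates the "surprise loss" used below: the out-of-sample loss at an origin minus the average in-sample loss at that origin.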

### 2.2 The Underlying Continuous-Time Model

The process is a vector-valued semimartingale on the underlying filtered
probability space, and we further assume that all processes considered in this paper
are càdlàg, adapted and possess a $\mathbb{P}$-a.s. continuous path
on the fixed time horizon.^{11} For accessible treatments of the probabilistic elements used in this
section we refer to Aït-Sahalia and Jacod (2014), Jacod and Shiryaev (2003),
Jacod and Protter (2012), Karatzas and Shreve (1996) and Protter (2005). The continuity property represents a key assumption in our setting
and implies that the predictor process $X$ is a continuous Itô semimartingale. Its integral
form is given by

$$X_t = X_0 + \int_0^t \mu_{X,s}\,ds + \int_0^t \sigma_{X,s}\,dW_{X,s}, \tag{2.1}$$

where $W_X$ is a Wiener process, $\mu_X$ and $\sigma_X$ are the drift and spot covariance process, respectively, and $X_0$ is $\mathcal{F}_0$-measurable. We incorporate model misspecification into our framework by allowing for a large non-zero drift which adds to the residual process:

$$Y_t = Y_0 + \int_0^t \beta_s\,dX_s + h^{-\kappa}\int_0^t \mu_{e,s}\,ds + \int_0^t \sigma_{e,s}\,dW_{e,s}, \tag{2.2}$$

where $W_e$ is a standard Wiener process, $\sigma_e$ is its associated volatility, $\beta$ is the regression parameter process, $h$ is the sampling interval, and $Y_0$ is $\mathcal{F}_0$-measurable. In (2.2), the last two terms on the right-hand side account for the residual part of $Y$ which is not explained by $X$. We assume $\kappa > 0$ so that the factor $h^{-\kappa}$ inflates the infinitesimal mean of the residual component, thereby approximating a setting with arbitrary misspecification.
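A continuous record from a model of this form can be approximated on a discrete grid by an Euler scheme. The sketch below simulates one such path; the drift-inflation exponent and all parameter values are our own illustrative choices, not values used in the paper.

```python
import numpy as np

def simulate_pair(n, span, kappa=0.25, beta=1.0, mu_e=0.5, seed=0):
    """Euler discretization of a predictor X (Brownian motion with drift)
    and a target Y whose increments load on dX plus an inflated residual
    drift and residual noise. Illustrative parameterization; h**(-kappa)
    inflates the residual mean as in the misspecification device above."""
    rng = np.random.default_rng(seed)
    h = span / n                               # sampling interval
    dWx = rng.normal(0.0, np.sqrt(h), n)       # Wiener increments for X
    dWe = rng.normal(0.0, np.sqrt(h), n)       # Wiener increments for residual
    dX = 0.1 * h + 0.5 * dWx                   # drift + diffusion increments
    dY = beta * dX + h ** (-kappa) * mu_e * h + 0.3 * dWe
    return dX.cumsum(), dY.cumsum()

X, Y = simulate_pair(n=1000, span=10.0)
```

Note that the inflated residual drift term contributes `h**(1 - kappa)` per increment, so it vanishes more slowly than an ordinary drift as the grid is refined, which is the point of the device.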

###### Remark 2.1.

In (2.2), misspecification manifests itself in the
form of a (time-varying) non-zero conditional mean of the residual process,
and in giving rise to serial dependence in the disturbances, which
in turn leads to dependence in the sequence of forecast losses.^{12} Asymptotically, these features can be dealt with using basic arguments
from the high-frequency financial statistics literature; however, when
the sampling interval is not too small one needs methods that are robust
in finite samples to such misspecification-induced properties. More precisely, we
propose an appropriate estimator of the long-run variance of the sequence
of forecast losses in Section 4.
Hence, this specification is similar in spirit to the near-diffusion
assumption of Foster and Nelson (1996), who studied the impact of
misspecification in ARCH models. On the other hand, Casini and Perron (2017a)
introduced a “large-drift” asymptotics to deal
with non-identification of the drift in their context. Technically,
the latter specification implies that as the sampling interval becomes small the drift
features larger oscillations that add to the local Gaussianity of
the stochastic part. Casini and Perron (2017a) referred
to this specification as a small-dispersion assumption. Finally,
note that the presence of the inflated drift can also be related to
the signal-plus-small-Gaussian-noise model of Ibragimov and Has’minskiǐ (1981)
upon a suitable choice of the noise scale in their model in Section
VII.2.

###### Assumption 2.1.

We have the following assumptions: (i) the
drift and volatility processes
have $\mathbb{P}$-a.s. continuous sample paths; (ii) these processes
are locally bounded;
(iii) the drift processes satisfy, $\mathbb{P}$-a.s., a boundedness
condition uniformly over the time horizon; (iv)
the spot covariance, defined as the instantaneous conditional covariance
of the increments, has elements that are bounded away from zero and
infinity, uniformly in time and across components;
(v) the disturbance process is orthogonal (in the martingale
sense) to the predictor process, identically
over the time horizon.^{13} The angle brackets notation
is used for the predictable quadratic variation process.

Part (i) rules out jump processes from our setting. We relax this restriction in our companion paper; see Section 6. Part (ii) restricts those processes to be locally bounded. These should be viewed as regularity conditions rather than assumptions and are standard in the financial econometrics literature [see Barndorff-Nielsen and Shephard (2004), Li and Xiu (2016) and Li, Todorov, and Tauchen (2017)]; recently, they have been used by Casini and Perron (2017a) in the context of structural change models.

The continuous-time model in (2.1)-(2.2) is not observable. The econometrician only has access to realizations of $Y$ and $X$ sampled at intervals of length $h$ over the fixed horizon; each discretized process is a random step function that jumps only at the sampling times. The discretized processes are assumed to be adapted to the increasing and right-continuous filtration. A seminal result known as the Doob-Meyer decomposition [cf. the original sources Doob (1953) and Meyer (1967); see also Section III.3 in Protter (2005)] allows us to decompose the semimartingale process into a predictable part and a local martingale part. Hence, the increment of $X$ over each sampling interval can be written as the sum of a measurable drift term and the increment of a continuous local martingale with finite conditional covariance matrix $\mathbb{P}$-a.s. Turning to equation (2.2), the discretized error process is then a continuous local martingale difference sequence with finite conditional variance $\mathbb{P}$-a.s. Therefore, we express the discretized analogue of (2.2) as

$$\Delta_k Y = \beta_{(k-1)h}\,\Delta_k X + h^{1-\kappa}\,\mu_{e,(k-1)h} + \Delta_k e, \tag{2.3}$$

where $\Delta_k$ denotes the increment over the $k$-th sampling interval.

###### Remark 2.2.

As explained above, we accommodate possible model misspecification by adding the inflated residual drift component. In the forecasting literature, one often directly imposes restrictions on the sequence of losses evaluated at the forecast errors. There are two main differences from our approach. First, in order to facilitate illustrating our novel framework to the reader, we have chosen, without loss of generality, to express directly the relationship between the target and the predictors while, at the same time, allowing for misspecification through the residual drift. A second distinction from the classical approach is that the latter imposes restrictions on the sequence of losses such as mixing and ergodicity conditions, covariance stationarity and so on. In contrast, our infill asymptotics does not require us to impose any ergodic or mixing condition [cf. Casini and Perron (2017a)].

Finally, we impose an additional assumption on the path of the volatility process. This turns out to be important because it partly affects the local behavior of the forecast losses.

###### Assumption 2.2.

For small $\delta > 0$, define the modulus of continuity of the volatility path $\sigma$ on the time horizon by $w_{\sigma}(\delta) \triangleq \sup_{|t-s| \le \delta} \left\| \sigma_{t} - \sigma_{s} \right\|$. We assume that, up to a localizing sequence of stopping times, $w_{\sigma}(\delta) \le K\delta$ for some $\mathbb{P}$-a.s. finite random variable $K$.

The assumption essentially states that $\sigma$ is locally bounded and Lipschitz continuous. Lipschitz volatility is a more than reasonable specification for the macroeconomic and financial data to which our analysis is primarily directed. Indeed, the basic case of constant variance is easily accommodated by the assumption, and time-varying volatility is also covered provided $\sigma$ is sufficiently smooth. However, the assumption rules out some standard stochastic volatility models often used in finance. We relax this assumption in Section 5, so that we can extend our results to, for example, stochastic volatility models driven by a Wiener process.
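Assumption 2.2 can be illustrated numerically: for a Lipschitz volatility path, the modulus of continuity shrinks linearly with the mesh. The helper below is a discrete analogue of the modulus, applied to a smooth sinusoidal volatility path chosen purely for illustration.

```python
import numpy as np

def modulus_of_continuity(path, times, delta):
    """Discrete analogue of w(delta) = sup |sigma_t - sigma_s| over |t - s| <= delta."""
    w = 0.0
    for i, ti in enumerate(times):
        mask = np.abs(times - ti) <= delta
        w = max(w, np.max(np.abs(path[mask] - path[i])))
    return w

times = np.linspace(0.0, 1.0, 2001)
sigma_lip = 1.0 + 0.5 * np.sin(2 * np.pi * times)   # smooth (Lipschitz) volatility
w_small = modulus_of_continuity(sigma_lip, times, 0.01)
w_large = modulus_of_continuity(sigma_lip, times, 0.10)
# Here sup |sigma'| = pi, so w(delta) <= pi * delta, i.e. K = pi works.
assert w_small <= np.pi * 0.01 + 1e-9
assert w_large <= np.pi * 0.10 + 1e-9
```

A rough path, such as one driven by a Brownian motion, would violate the linear bound for small `delta`, which is why stochastic volatility is deferred to Section 5.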

### 2.3 The Hypotheses of Interest

As time evolves, a forecast model can suffer instability for multiple
reasons. However, incorporating model misspecification into our framework
necessarily implies that the exact form of the instability is unknown
and thus one has to leave it unspecified. This differs from the classical
setting for estimation of structural change models [cf. Bai and Perron (1998)
and Casini and Perron (2017a)] where (i) the break
date is well-defined as it is part of the definition of the econometric
problem, and (ii) the form of the instability is explicitly specified
through a discrete shift in a regression parameter. In contrast, under
our context we remain agnostic regarding both (i) and (ii). There
may be multiple dates at which the forecast model suffers instability
and they might be interrelated in a complicated way. Forecast instability
may manifest itself in several forms, including gradual, smooth or
recurrent changes in the predictive relationship between the target
and the predictors; certainly, there could also be discrete shifts
in this relationship—arguably
the most common case in practice—but this is only a possibility
in our setting and not an assumption as in structural change models.
A forecast failure then reflects the forecaster’s failure to recognize
the shift in the predictive power of the predictors for the target variable.
On the other hand, even if one can rule out shifts in the predictive relationship,
a forecast instability may be induced by an increase/decrease in the
uncertainty in the data which might result, for example, from changes
in the unconditional variance of the target variable. In this case,
the predictive ability of the predictors for the target variable,
as described for instance by a regression parameter, remains stable,
while due to an increase in the unconditional variance of the target
it might become weak, and in turn the forecasting power might break down.
Tests for forecast failure such as those proposed in this paper and
the ones proposed in Giacomini and Rossi (2009) are designed to have
power against both of the above hypotheses.^{14} Recently, Perron and Yamamoto (2018) proposed applying modified versions
of classical structural break tests to the forecast failure setting.
However, their testing framework and hence their null hypotheses are
different from ours because they do not fix a model-method pair but
only fix the forecast model under the null.

#### 2.3.1 The Null and Alternative Hypotheses on Forecast Instability

Define at each out-of-sample time a surprise loss $SL_k$, given by the deviation between the time-$k$ out-of-sample loss and the average in-sample loss, where the average in-sample loss is computed according to the specific forecasting scheme. One can then define the average of the out-of-sample surprise losses

$$\overline{SL} \triangleq \frac{h}{N_{\mathrm{out}}} \sum_{k\,\in\,\mathrm{out\text{-}of\text{-}sample}} SL_k, \tag{2.4}$$

where $h$ is the sampling interval and $N_{\mathrm{out}}$ denotes the time span
of the out-of-sample window.^{15} By definition $N_{\mathrm{out}}$ is fixed and should not be confused with
the number of observations in the out-of-sample window.
In the classical discrete-time setting, under the hypothesis of no
forecast instability one would naturally test whether
has zero mean, where is the pseudo-true value of .
If the forecasting perfomance remains stable throughout the whole
sample then there should be no systematic surprise losses in the out-of-sample
window and thus
This reasoning motivated the forecast breakdown test of Giacomini and Rossi (2009).
Therefore, under the classical asymptotic setting one exploits time
series properties of the process
such as ergodicity and mixing together with the representation of
the hypotheses by a global moment restriction.^{16}^{16}16Global refers to the property that the zero-mean restriction involves
the entire sequence of forecast losses. By letting the span , this method underlies
the classical approach to statistical inference but does not directly
extend to an infill asymptotic setting. Under continuous-time asymptotics,
identification of parameters is achieved by properties of the paths
of the involved processes and not by moment conditions. This constitutes
the key difference and requires one to recast the above hypotheses
into an infill setting thereby making use of assumptions on an underlying
continuous-time data-generating mechanism which is assumed to govern
the observed data.
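As an illustrative numerical sketch, the surprise losses and their out-of-sample average in (2.4) can be computed as follows. This is not part of the formal development: all function names are ours, and a fixed estimation scheme is assumed for simplicity.

```python
import numpy as np

def surprise_losses(out_losses, in_losses):
    """Surprise loss at each out-of-sample date: the out-of-sample loss
    minus the average in-sample loss. A fixed scheme is assumed for
    simplicity; rolling or recursive schemes would recompute the
    in-sample average at each forecast origin."""
    return np.asarray(out_losses, dtype=float) - np.mean(in_losses)

def average_surprise_loss(out_losses, in_losses):
    """Average of the out-of-sample surprise losses, as in (2.4)."""
    return surprise_losses(out_losses, in_losses).mean()
```

Under stable predictive ability the average surprise loss should be close to zero; a systematically positive average signals deteriorating out-of-sample performance relative to the in-sample fit.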

We begin with observing that the sequence of losses
can be viewed as realizations from an underlying continuous-time process
, with .
That is, consists of temporally integrated forecast
losses where is the loss at time and is defined by some
transformation of the target variable and of the predictor
.^{17} ^{17}The definition of uses that, so long as the forecast
step is small and finite, one can approximate
by for sufficiently small . In order to provide a general theory, we focus on families of loss
functions that depend only on the forecast error.^{18} ^{18}The most popular loss functions used in economic forecasting are within
this category [see Elliott and Timmermann (2016) for a recent incisive
account of the literature]. Extension to ad hoc loss functions
requires specific treatment that might vary from case to case. We denote this class by and we say that the
loss function
if
for all , where .
The class comprises the vast majority of loss
functions employed in empirical work, including among others the popular
Quadratic loss, Absolute error loss and Linex loss. The following
examples illustrate how these loss functions are constructed under
our setting. For the rest of this section, assume for simplicity
and that is one-dimensional in (2.2).

###### Example.

(: Quadratic Loss)

The Mean Squared Error or Quadratic loss function
is symmetric and is by far the most commonly used by practitioners.
Given (2.2), we have .
Then or
with .
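The Quadratic loss can be sketched numerically as follows (an illustration of ours, not part of the formal setting):

```python
import numpy as np

def quadratic_loss(y, y_hat):
    """Quadratic (squared-error) loss L(e) = e^2 with forecast error
    e = y - y_hat; symmetric in the sign of the error."""
    e = np.asarray(y, dtype=float) - np.asarray(y_hat, dtype=float)
    return e ** 2
```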

###### Example.

(: Linex Loss)

The Linear-exponential or Linex loss was introduced
by Varian (1975) and is an example of an asymmetric loss function.
By the same reasoning as in the Quadratic loss case, we have
or
with .
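The Linex loss admits the standard closed form L(e) = exp(ae) - ae - 1 [cf. Varian (1975)]. The sketch below (a hypothetical helper of ours) illustrates its asymmetry:

```python
import numpy as np

def linex_loss(e, a=1.0):
    """Linex loss L(e) = exp(a*e) - a*e - 1 (Varian, 1975).
    For a > 0, errors of one sign are penalized roughly exponentially
    while errors of the opposite sign are penalized roughly linearly."""
    e = np.asarray(e, dtype=float)
    return np.exp(a * e) - a * e - 1.0
```

For a = 1, an error of +1 costs about 0.718 while an error of -1 costs about 0.368, in contrast with the symmetric Quadratic loss.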

Below we make very mild pathwise assumptions on the process which
imply restrictions on .
We derive asymptotic results under Lipschitz continuity (in )
of the coefficients of the system of stochastic differential equations
driving the data . We apply the
techniques of stochastic calculus to formulate our testing problem.
To avoid clutter, we introduce the notation
and its shorthand .^{19} ^{19}The notation implicitly assumes that the same loss function is used
for estimation and prediction, which in turn implies that the subscript
in can be omitted since
it can be understood from that of the argument . By Itô's Lemma [cf. Section II.7 in Protter (2005)], under
smoothness of ,

Let denote the expectation conditional on the path . The instantaneous mean of is . Note that the latter is a symbolic abbreviation for

Since the coefficients of the original system of stochastic equations are Lipschitz continuous in , one can verify that is also Lipschitz under regularity conditions on and time- information.

We denote by the class of Lipschitz continuous functions on . Let denote a continuous-time stochastic process that is -a.s. locally bounded and adapted.

###### Definition 2.1.

The process belongs to if for some sequence of stopping times and some -a.s. finite random variable .

We are in a position to formulate the testing problem in terms of
the pathwise properties of . This
implies that the hypotheses are specified in terms of random events,
which differs from classical hypothesis testing but is typical
under continuous-time asymptotics; see Aït-Sahalia and Jacod (2012) (for
many references), Li, Todorov, and Tauchen (2016) and Reiß, Todorov, and Tauchen (2015).
We consider the following hypotheses: for any ,^{20} ^{20}Precise assumptions will be stated below.

(2.5)

which means that we wish to discriminate between the following two events that divide

The dependence of the hypotheses on is appropriate because each event generates a certain path of on , where . The hypothesis requires a Lipschitz condition to hold on , where is the usual artificial separation date after which the first forecast is made. is taken as given here because the testing problem applies to a specific method-model pair and is part of the chosen forecasting method. From a practical standpoint, it would be helpful if this separation date were such that the forecast model is stable on [see Casini and Perron (2017c) for more details]. The latter property is, however, unknown a priori to the practitioner. We cover this case in Section 6.

###### Example.

(; cont’d)

For the Quadratic loss, Itô's Lemma yields
.
If is Lipschitz continuous, then the hypothesis
holds.

###### Example.

(; cont’d)

From Itô's Lemma, .
Consequently, by the Itô Isometry [cf. Section 3.3.2 in Karatzas and Shreve (1996)
or Lemma 3.1.5 in Øksendal (2000)],
the hypothesis is seen to hold under Lipschitz continuity
of .^{21} ^{21}Recall that a composition of Lipschitz functions is Lipschitz and that,
under our context,
is Lipschitz because (i) is locally bounded and
Lipschitz, and (ii) and remain fixed.

We have reduced the forecast instability problem to an examination of
the local properties of the path of . However, we still have
to face the question of how to use the data to test in practice.
Even if we could observe , it would not be clear
how to formulate a testing problem on the stability of by
using path properties of . The reason is that
is always absolutely continuous by definition,
and thus it would provide little information on the large deviations
of the forecast error . In order to study the local behavior
of one needs to consider the small increments of
close to time . Leaving the definition of
aside for a moment, observe that -a.s. continuity of
is equivalent to having the relationship between
and holding over any infinitesimal interval of time. Consider the
basic parametric linear model: . Then,
the forecast loss is , which is difficult to
interpret in rigorous probabilistic terms. However, we can consider
its discrete-time analogue. We normalize the forecast error by the
factor and redefine .^{22} ^{22}Alternatively, .
Then, for all , the mean of —conditional
on —depends on the parameters of the model
and its local behavior can be used as a proxy for the local behavior
of the infinitesimal mean of .
If the corresponding structural parameters of the continuous-time
data-generating process satisfy a Lipschitz continuity in , then—knowing
—also
should be Lipschitz in the continuous-time limit. Under the hypothesis
there should be no break in
and an appropriately defined right local average of

should not differ too much from its left local average. That is, one can test for forecast instability by using a two-sample t-test over asymptotically vanishing time blocks.
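The idea of comparing left and right local averages can be sketched as follows. This is a stylized version with non-overlapping blocks and a plain two-sample t-statistic, not the exact self-normalized statistics defined in Section 2.4; all names are ours.

```python
import numpy as np

def block_t_stats(sl, m):
    """Two-sample t-statistics comparing adjacent non-overlapping blocks
    of surprise losses. sl: array of surprise losses; m: observations per
    block (in the paper's asymptotics m grows, but more slowly than the
    out-of-sample size). Returns one statistic per adjacent block pair."""
    sl = np.asarray(sl, dtype=float)
    n_blocks = len(sl) // m
    blocks = sl[: n_blocks * m].reshape(n_blocks, m)
    stats = []
    for k in range(n_blocks - 1):
        left, right = blocks[k], blocks[k + 1]
        num = right.mean() - left.mean()
        den = np.sqrt(left.var(ddof=1) / m + right.var(ddof=1) / m)
        stats.append(num / den)
    return np.array(stats)
```

A large statistic for some block pair indicates a shift in the local mean of the surprise losses near that block boundary, which is the kind of discontinuity the hypotheses in (2.5) rule out.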

###### Example.

(; cont’d)

Conditional on , .
Thus, .
If is Lipschitz continuous, then the hypothesis
holds.

###### Example.

(; cont’d)

Similar to the Quadratic loss case, we have .
Again, the hypothesis is satisfied if is
Lipschitz.

Both examples demonstrate that pathwise assumptions on the data-generating process imply restrictions on the properties of the sequence of loss functions. For the QL example, if there is a structural break at the observation , then this would result in the mean of shifting to a new level after time . Given that the same reasoning extends to the sequence of surprise losses, one may consider constructing a test statistic on the basis of the local behavior of the surprise losses over time. If there is no instability in the predictive ability of a certain model, then the sequence of out-of-sample surprise losses should display a certain stability. Under the framework of Giacomini and Rossi (2009), this stability is interpreted in a retrospective and global sense as a zero-mean restriction on the sequence over the entire out-of-sample window. In contrast, under our continuous-time setting, this stability manifests itself as a continuity property of the path of the continuous-time counterpart of the sequence.

### 2.4 The Test Statistics

By inspection of the null hypothesis in (2.5), it is evident that a considerable number of forms of instability are allowed. These may result from discrete shifts in a model's structural parameters and/or in structural properties of the processes considered, such as conditional and unconditional moments. This first set of non-stationarities relates to the popular case of structural changes, which the structural break tests of, among others, Andrews (1993), Andrews and Ploberger (1994), Bai and Perron (1998) and Elliott and Müller (2006) in univariate settings, and Qu and Perron (2007) in multivariate settings, are designed to detect with high probability. However, forecast instability may be generated by many other forms of non-stationarity for which such classical structural break tests are not designed and against which they might consequently have little power. For example, consider the case of smooth changes in model parameters, or in the unconditional variance of . Even more serious would be the presence of recurrent smooth changes in the marginal distribution of the predictor , since in this case the above-mentioned tests are likely to falsely reject too often [cf. Hansen (2000)]. Thus, the null hypothesis of no forecast instability calls for a new statistical hypothesis testing framework. Ideally, in this context one needs a test statistic that retains power against any discontinuity, jump and recurrent switch at any point in the out-of-sample window and for any magnitude of the shift. We propose a test statistic which aims asymptotically at distinguishing any discontinuity from a regular Lipschitz continuous motion. We introduce a sequence of two-sample t-tests over asymptotically vanishing adjacent time blocks. This should lead to significant gains in power whenever, on fixed time intervals, the out-of-sample losses exhibit instabilities of any form such as breaks, jumps and relatively large deviations.
Such gains are likely to occur especially when instabilities take place within a small portion of the sample relative to the whole time span—a common case in practice that has characterized many episodes of forecast failure in economics.

Interestingly, for the Quadratic loss function we can exploit properties of the local quadratic variation and propose a self-normalized test statistic. Thus, we separate the discussion of the Quadratic loss from that of general loss functions. Let , . Next, we partition the out-of-sample window into blocks, each containing observations. Let and for .

#### 2.4.1 Test Statistics under Quadratic Loss

We propose the following statistic

The quantity is a local average of the surprise losses within the block . We have partitioned the out-of-sample window into blocks of asymptotically vanishing length . We consider an asymptotic experiment in which the number of blocks increases at a controlled rate to infinity while the per-block sample size grows without bound at a slower rate than the out-of-sample size . The appeal of the statistic is that a large deviation suggests the existence of either a discontinuity or a non-smooth shift in the surprise losses close to time and thus provides evidence against . We comment on the nature of the normalization in the denominator of below, after we introduce a version of the statistic which uses all admissible overlapping blocks of length :

where . Under the alternative hypothesis, the exact location of the change-point (or possibly the locations of multiple change-points) within the block might affect the power of the -based test in small samples. Indeed, we find in our simulation study that the test statistic which uses overlapping blocks is more powerful, especially when the instability arises in forms other than a simple one-time structural change. Thus, the power of the test is slightly sensitive to the actual location of the change-point within the block, with higher power achieved when the change-point is close to either the beginning or the end of the block. In contrast, the statistical power of is uniform over the location of the change-point in the sample. The latter property is not shared by the existing test of Giacomini and Rossi (2009), whose power tends to be substantially lower if the instability is not located around the middle of the sample.
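The overlapping-block idea can be sketched as follows, again with a plain t-statistic in place of the paper's self-normalization and with names of ours: a window of length 2m slides one observation at a time, its two halves are compared, and the maximum absolute statistic is taken over all window positions.

```python
import numpy as np

def overlapping_max_stat(sl, m):
    """Maximum two-sample t-statistic over all admissible overlapping
    windows of length 2*m: at each start j, compare the mean of
    sl[j:j+m] with the mean of sl[j+m:j+2*m]. Taking the maximum over j
    makes power less sensitive to where the change-point falls relative
    to a fixed block grid."""
    sl = np.asarray(sl, dtype=float)
    stats = []
    for j in range(len(sl) - 2 * m + 1):
        left = sl[j : j + m]
        right = sl[j + m : j + 2 * m]
        den = np.sqrt(left.var(ddof=1) / m + right.var(ddof=1) / m)
        stats.append((right.mean() - left.mean()) / den)
    return np.max(np.abs(np.array(stats)))
```

Because some window boundary eventually aligns with any break date, a mid-block break on the fixed grid is still straddled exactly by one of the overlapping windows.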

An important characteristic of both and is that they are self-normalized; no asymptotic variance appears in their definition. The reason why appears in the denominator of, for example, is that even though constitutes a more natural self-normalizing term, it might be close to zero in some cases. This would occur under Quadratic loss if, for example, for all . This is not true for the factor .

In addition, observe that allowing for misspecification naturally leads one to deal carefully with artificial serial dependence in the forecast losses in small samples. Thus, we consider versions of the statistics and that are normalized by their asymptotic variance: and, similarly,

The quantity standardizes the test statistic so that under the null hypothesis we obtain a distribution-free limit. This can be useful because, given the fully non-stationary setting together with the possible consequences of misspecification in finite samples, standardization by the square root of the asymptotic variance might lead to a more precise empirical size in small samples. We relegate theoretical details on , as well as on its estimation, to Section 4, where we also discuss its relation to the choice of the number of blocks.

#### 2.4.2 Test Statistics under General Loss Function

For general loss , we propose the following statistic,

where are defined as in the quadratic case and

with . The interpretation of is essentially the same as that of , the only difference arising from the denominator, which estimates the within-block variance. A version that uses all overlapping blocks is

where , with
