On the Construction of a Leading Indicator Based on News Headlines for Predicting Greek Deposit Outflows

Employing monthly data over a sample period spanning 2002 to 2018, this study has a twofold purpose. First, we construct a novel leading indicator based on news headlines drawn from Bloomberg; second, we examine whether this leading indicator, which is able to capture agents’ sentiment, affects the trajectory of Greek bank deposit flows. Employing alternative econometric methodologies, we find that this index proxies for depositors’ crisis sentiment: the higher the index becomes, the stronger depositors’ negative sentiment becomes, leading them to withdraw their bank deposits. Overall, we show that the last decade’s advances in internet technology, which permit direct access to a vast amount of information such as news headlines, offer the possibility of forecasting critical measures in the economy’s banking system, such as the level of bank deposits, which are of crucial importance. Monetary policy authorities or macroprudential regulators could adapt our model to assess the resilience of a bank or the whole banking sector.

The level of private sector bank deposits in a country is directly related to macroeconomic conditions that drive the allocation of individuals' money between consumption and savings. However, a fast-growing literature argues that private sector bank deposits are also affected by agents' economic sentiment, with some researchers arguing that sentiment exerts a greater impact on bank deposits than macro fundamentals. Our work is part of a growing strand of the literature that explores how economic sentiment affects various facets of economic activity (see, among others, Anastasiou, Ballis, & Drakos, 2021a, 2021c; Anastasiou, Bragoudakis, & Giannoulakis, 2021b; Schumaker, Zhang, Huang, & Chen, 2012; Smales, 2014), and especially of the literature investigating the impact of sentiment on bank deposits (Anastasiou & Katsafados, 2020; Fecht, Thum, & Weber, 2019).
Specifically, Fecht et al. (2019) showed that Google searches for 'deposit insurance' and related strings reflect depositors' fears and help to predict deposit shifts in the German banking sector from private banks to fully guaranteed public banks. After the introduction of blanket state guarantees for all deposits in the German banking system, this fear-driven reallocation of deposits stopped. Their findings highlight that heterogeneous insurance of deposits can lead to a sudden, fear-induced reallocation of deposits that endangers the banking sector's stability even in the absence of redenomination risks. He (2019) found that banks experience deposit inflows from investors seeking a safe haven when sentiment declines. The regression results in that study further indicate a U-shaped relation between deposit growth and investor sentiment. This accords with the earlier suggestion that arbitrageurs transfer funds into deposit accounts in booms. It is noteworthy that the coefficients on the sentiment index are more significant, both statistically and economically, than those of the paper-bill spread, which shows that investor sentiment has stronger explanatory power for the variations in bank liquidity, even compared with important economic fundamentals identified in the existing literature. Anastasiou and Katsafados (2020) investigated whether so-called textual sentiment has any impact on European depositors' decisions to withdraw their deposits. After manually collecting the monthly speeches of the president of the European Central Bank (ECB), they applied textual analysis techniques and constructed two alternative sentiment measures able to capture perceived uncertainty. They found that a high frequency of uncertainty and weak modal words in the monthly speeches of the president of the ECB leads both households and non-financial corporations to withdraw their bank deposits. They also found that these textual sentiments have a more significant impact on non-financial corporations than on households.
Konstantakis, Paraskeuopoulou, Michaelides, and Tsionas (2020) showed that extending standard autoregressive models with Google searches for the keyword "Grexit" improves the forecasting of bank deposits in Greece, compared to models that contain only standard explanatory variables. Their findings are checked for robustness with respect to the data samples, the activation function used, and the prior distributions employed, and are found to be econometrically robust and economically sound, confirming the crucial role of Google searches in predicting bank deposits in the Greek economy.
Anastasiou and Drakos (2021a), employing monthly search volume data on crisis-related queries from the Google Trends database, introduced two modified Google search-based crisis sentiment indicators to measure depositors' fear. These indices capture depositors' crisis sentiment, and they are found to be significant drivers of bank deposit flows across European Union countries. Their findings indicate that countries with high search intensity for crisis-related keywords tend to witness bank deposit outflows. By employing a Panel Vector Autoregressive approach, they documented the robustness of their findings. Anastasiou and Drakos (2021b) investigated whether Greek depositors' uncertainty about the future currency contains information about the observed acute depletion of deposits in the Greek banking system. They conducted a nowcasting exercise using the Google search intensity for the term «Drachma» and documented that higher search intensity leads to higher total deposit outflows, primarily driven by outflows in time deposits. They also found that the Google search intensity for the term «Drachma» exerts an asymmetric impact across one-day deposits and time deposits. In addition, the asymmetry is also present between firms' deposits and households' deposits. These findings support the view that a 'flight-to-safety' behavior caused by uncertainty about the future currency accounts for the erosion of deposits in Greece.
Despite the research cited above, there is a gap in the literature on bank deposit flows regarding how news headlines that capture depositors' sentiment may affect their decisions to withdraw their bank deposits. Our study builds on this prior work and presents a methodological framework for constructing a novel leading indicator for Greek bank deposit outflows, based on news headlines drawn from Bloomberg. We find that this leading indicator exerts a significant impact on bank deposit flows. Thus, it can be a valuable tool for macroprudential policymakers to forecast bank deposit outflows and hence predict potential bank runs.
The rest of the paper is structured as follows. Section 2 describes the data and variables we employed, along with the methodology followed for the construction of our leading indicator. Section 3 describes the econometric methodology we used and provides a discussion of our findings. Finally, Section 4 contains the conclusions.

Data, Variables, and Methodology
The data for Greek deposits refer to monthly changes in the outstanding amounts of deposits held by Greek banks (in millions €), as published on the website of the Bank of Greece. Our data are at a monthly frequency, and the sample period spans from 2002 to 2018. The news data refer to the number of times each month that specific keywords appear in economic press headlines from over one hundred authoritative global sources, and were obtained from Bloomberg.
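To make the construction of the monthly keyword series concrete, the following is a minimal sketch, not the procedure actually used to query Bloomberg. It assumes that headlines are available as (month, text) pairs, that a headline counts for a keyword when it contains both that keyword and the word GREECE as whole words, and it uses a handful of invented headlines. The function name `monthly_keyword_counts` and the sample data are hypothetical.

```python
import re
from collections import defaultdict

# The eight crisis keywords employed in this study.
KEYWORDS = ["DEBT", "BAILOUT", "CRISIS", "UNCERTAINTY",
            "RECESSION", "GREXIT", "AUSTERITY", "DEFAULT"]


def monthly_keyword_counts(headlines):
    """Count, per month, the headlines containing a keyword jointly with GREECE.

    `headlines` is an iterable of (month, text) pairs, e.g. ("2015-06", "...").
    Whole-word, case-insensitive matching is an assumption of this sketch.
    """
    counts = defaultdict(lambda: dict.fromkeys(KEYWORDS, 0))
    for month, text in headlines:
        words = set(re.findall(r"[A-Z]+", text.upper()))
        if "GREECE" not in words:
            continue  # keywords are only counted jointly with GREECE
        for keyword in KEYWORDS:
            if keyword in words:
                counts[month][keyword] += 1
    return {month: dict(row) for month, row in counts.items()}


# Invented headlines, for illustration only.
sample = [
    ("2015-06", "Greece debt talks stall as default fears grow"),
    ("2015-06", "Grexit risk rises, Greece faces bailout deadline"),
    ("2015-06", "Italy debt yields climb"),
    ("2015-07", "Greece imposes capital controls amid crisis"),
]
counts = monthly_keyword_counts(sample)
```

Each month then yields a vector of eight counts, which is the kind of input a principal component analysis can be applied to.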
In order to construct the leading indicator, we obtained from Bloomberg the number of times specific keywords, which we believe capture the so-called "negative sentiment," appeared in news headlines over the period under examination. By applying a Principal Component Analysis¹ (PCA hereafter) to these keywords, we isolated the common component(s). The keywords we employed were DEBT, BAILOUT, CRISIS, UNCERTAINTY, RECESSION, GREXIT, AUSTERITY, and DEFAULT. All keywords were searched jointly with the word GREECE. The selection of these eight keywords was based on our identification assumption. Our prior belief is that, since these eight news-based keywords denote the perceived crisis environment, the isolated common component (that is, the leading indicator) will be negatively (positively) associated with Greek bank deposits (deposit outflows). Table 1 shows the corresponding eigenvalue, proportion, and cumulative proportion for each of the eight components. As can be seen, the first two components account for the largest share of the cumulative proportion in creating the common factor. After computing the principal components, we wish to formally determine how many components to keep in order to construct the final indicator. In factor analysis, the question of the "true" number of factors is a complicated issue. With PCA, it is a little more straightforward. We may set a percentage of variance we wish to account for, say, >85%, and retain just enough components to account for at least that much of the variance. The relative magnitudes of the eigenvalues indicate the amount of variance they account for. According to Anastasiou, Bragoudakis, and Malandrakis (2020), an essential benefit of PCA is that it produces the weights for each variable automatically, implying that the novel leading indicator that we constructed explains as much of the variance in the set of news headline variables as possible.
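The variance-threshold rule described above can be sketched as follows. This is an illustration only: the eigenvalues are hypothetical (they are not the values reported in Table 1) and were chosen to sum to eight, as the eigenvalues of a correlation matrix of eight variables would. The function name `retain_components` is likewise an assumption.

```python
def retain_components(eigenvalues, threshold=0.85):
    """Return how many components are needed to reach `threshold` of variance.

    Also returns a table of (component, eigenvalue, proportion, cumulative),
    mirroring the columns usually reported for a PCA.
    """
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    table = []
    cumulative = 0.0
    for k, eigenvalue in enumerate(ordered, start=1):
        proportion = eigenvalue / total  # share of total variance
        cumulative += proportion
        table.append((k, eigenvalue, proportion, cumulative))
    # Smallest k whose cumulative proportion reaches the threshold.
    n_keep = next(k for k, _, _, cum in table if cum >= threshold - 1e-12)
    return n_keep, table


# Hypothetical eigenvalues for eight standardized keyword series.
hypothetical_eigenvalues = [5.0, 2.0, 0.5, 0.25, 0.125, 0.0625, 0.03125, 0.03125]
n_keep, table = retain_components(hypothetical_eigenvalues, threshold=0.85)
```

With these made-up values, the first two components already account for 87.5% of the variance, so two components would be retained under an 85% threshold.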
A valuable tool for visualizing the eigenvalues relative to one another, so that we can decide the number of components to retain, is the scree plot proposed by Cattell (1966). As we can observe from Figure 1, the first two components account for the highest proportion in creating the common factor. These first two components, which were found to be the main drivers of the PCA, correspond to the keywords AUSTERITY and UNCERTAINTY, respectively, and were used to construct the leading indicator (PC variable) based on the weights presented in Table 1.
¹ PCA originated with the work of Pearson (1901) and Hotelling (1933). PCA is a statistical technique used for data reduction. The leading eigenvectors from the eigen decomposition of the correlation or covariance matrix of the variables describe a series of uncorrelated linear combinations of the variables that contain most of the variance. In addition to data reduction, the eigenvectors from a PCA are often inspected to learn more about the underlying structure of the data.
International Journal of Business Management and Finance Research, Vol. 4, No. 1, pp. 1-11, 2021. DOI: 10.53935/2641
After constructing our leading indicator, we examine its forecasting ability on bank deposit flows (DEPOSITS). In particular, we first estimate a robust linear regression model (Model 1) in which we include the one-month lag of the PC variable, controlling for both autocorrelation and seasonality patterns via autoregressive (AR) and moving average (MA) terms, and for some additional control variables that we believe also have an impact on bank deposit flows. Following, among others, Anastasiou and Drakos (2021b), the variables that we also consider as controls are defined as follows:
• INTEREST_RATE, which is the first lag of the real interest rate.
• ELECTIONS, which denotes a dummy variable that takes the value 1 in the years in which elections were held and 0 otherwise.
• CAPITAL_CONTROLS, which stands for a dummy variable that takes the value 0 (1) before (after) the imposition of capital controls in the Greek banking system.
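The timing convention in Model 1, in which this month's deposit flow is related to last month's value of PC, can be illustrated with a deliberately simplified sketch. It is not Model 1 itself: the AR and MA terms, the control variables, and the robust standard errors are all omitted, and the data are invented so that DEPOSITS equals 50 minus 200 times the lagged PC. The helper names `lag` and `ols_slope_intercept` are hypothetical.

```python
def lag(series, k=1):
    """Shift a series by k periods; the first k entries have no lagged value."""
    return [None] * k + list(series[:-k])


def ols_slope_intercept(x, y):
    """Closed-form ordinary least squares for a single regressor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented monthly series (illustration only).
pc = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0]
deposits = [100.0, 50.0, -150.0, -350.0, -550.0, -350.0]

# Pair DEPOSITS in month t with PC in month t-1, dropping the first month.
pc_lagged = lag(pc, 1)
pairs = [(x, y) for x, y in zip(pc_lagged, deposits) if x is not None]
slope, intercept = ols_slope_intercept([p[0] for p in pairs],
                                       [p[1] for p in pairs])
```

A negative slope in such a regression corresponds to the prior stated earlier: a higher value of the indicator in one month is associated with lower deposit flows in the next.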

Econometric Methodology and Discussion of Results
Before we embark on a detailed description of our econometric methodology and discuss our empirical findings, we deem it appropriate, as a preliminary analysis, to observe the association between PC and DEPOSITS through a Locally Weighted Scatterplot Smoothing (lowess) graph (Figure 2). On the vertical and horizontal axes, we also include the kernel density of each variable. A lowess graph is a popular tool in regression analysis that draws a smooth line through a scatter plot, helping us see the relationship between variables and foresee trends. As can be observed, there is a clear negative association between PC and the Greek banking system's DEPOSITS. This preliminary evidence confirms our prior beliefs, thus supporting our choice to construct this specific leading indicator.
Let us now move on to the empirical findings of this study. The results of the first model (Table 2) are compatible with both economic theory and our priors. In particular, we found that PC is statistically significant at the 1% level and has the expected negative sign, denoting that as PC increases, more deposit outflows occur in the following month. In other words, the higher PC becomes, the stronger depositors' negative sentiment becomes, which leads them to withdraw their deposits from the banks.
As far as the control variables are concerned, we found that the one-period lag of INTEREST_RATE and the CAPITAL_CONTROLS dummy variable have a positive impact on bank deposit flows. In contrast, the ELECTIONS dummy variable was found to have a significant negative impact on Greek depositors' behavior. Finally, we also found evidence that monthly outflow changes are persistent, with strong autocorrelation.
Notes: (a) *, **, *** denote statistical significance at the 10, 5, and 1 percent levels, respectively; (b) numbers in parentheses denote robust standard errors; (c) the variables INTEREST_RATE and PC enter the model with a one-period lag.
The Root Mean Square Error (RMSE) of the one-step-ahead forecasts under Model 1 was equal to 2079.09, and the Mean Absolute Error (MAE) was equal to 1594.9. As depicted in Figure 3, the actual values lie almost on the fitted values, following a common trajectory. This suggests that we have constructed a well-specified model able to explain a large proportion of Greek private sector deposit flows. In more detail, we observe that fitted values move more closely with actual values of bank deposits after 2010. This suggests that changes in private sector bank deposit flows are better explained after the outbreak of the 2010 Greek sovereign debt crisis than before. This finding further supports our argument that our model is capable of explaining Greek bank deposit flows, especially in periods of economic stress.
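For readers less familiar with the two accuracy measures reported above, the following short sketch shows how they are computed from actual values and one-step-ahead fitted values. The numbers are invented for illustration and are unrelated to the results of Model 1.

```python
import math


def rmse(actual, fitted):
    """Root mean square error: square root of the mean squared forecast error."""
    errors = [a - f for a, f in zip(actual, fitted)]
    return math.sqrt(sum(e ** 2 for e in errors) / len(errors))


def mae(actual, fitted):
    """Mean absolute error: mean of the absolute forecast errors."""
    errors = [a - f for a, f in zip(actual, fitted)]
    return sum(abs(e) for e in errors) / len(errors)


# Invented monthly deposit flows and fitted values (millions of euros).
actual = [100.0, -200.0, 50.0, -400.0]
fitted = [130.0, -160.0, 50.0, -400.0]

example_rmse = rmse(actual, fitted)  # errors are -30, -40, 0, 0
example_mae = mae(actual, fitted)
```

Because squaring penalizes large errors more heavily, the RMSE is never smaller than the MAE, which is consistent with the two figures reported for Model 1.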
After estimating the first model, in which we observe the association between deposit flows and PC, we deem it appropriate to examine how our leading indicator influences the likelihood of entering a "high-outflows" regime. To do that, following Petropoulos, Vlachogiannakis, and Mylonas (2018), we first estimate a Markov Regime Switching Regression model on DEPOSITS with two regimes.
Many economic time series occasionally exhibit dramatic breaks in their behavior, associated with events such as financial crises (Hamilton, 2005) or abrupt changes in government policy (Davig, 2004; Sims & Zha, 2006). Of particular interest to economists is the apparent tendency of many economic variables to behave quite differently during economic downturns, when underutilization of factors of production rather than their long-run tendency to grow governs economic dynamics (Chauvet & Hamilton, 2006). Markov Regime Switching Regression models are a well-established method for effectively modeling non-linear financial time series, which entail non-homogeneous observed data and whose temporal dynamics may change over time. In addition, owing to their convenient statistical properties in non-linear optimization, Markov Regime Switching Regression models have recently been applied widely, especially in business cycle studies, to estimate state or latent variable(s). The use of latent variables in business cycle models starts with Burns and Mitchell (1946); later, Hamilton (1988, 1989) introduced his seminal work on Markov Regime Switching Regression models, using the state or latent variable(s) to identify US business cycles and comparing them with the NBER cycle estimates (Guha & Banerji, 1999). Later still, the related literature includes the Markov Regime Switching Regression work of Goodwin (1993); Durland and McCurdy (1994); Ghysels (1994); Filardo (1994); Garcia and Perron (1996); Chauvet (1998); Krolzig (2000); Lam (2004); Smith and Summers (2005) and other seminal studies. In all these studies, Markov Regime Switching Regression models serve as an alternative to linear models by allowing parameters to change according to a stochastic process. In other words, the Markov switching regression model extends the simple exogenous probability framework by specifying a first-order Markov process for the regime probabilities.
Finally, it has to be noted that due to its structure, a Markov Regime Switching Regression model offers a flexible and general-purpose modeling framework for univariate and multivariate analysis, specifically for discrete-time series and classification data (Petropoulos et al., 2018).
Table 3 presents two states/regimes of the economy. In the first regime (good state) there are excess inflows (896.96 million € per month), while the second regime (bad state) represents a state of high outflows (-3472.76 million € per month). Panel A of Table 4 displays the Markov transition probabilities, according to which the probability of moving from state 1 to state 2 is 0.014, while the probability of moving from state 2 to state 1 is 0.065. We also find that the probability of remaining in the same regime is 0.985 and 0.934 for the first and the second regime, respectively. Furthermore, from panel B, we observe that the expected duration of the first state of the economy is 70.72 months, whereas the expected duration of the bad state (increased outflows) equals 15.32 months. As depicted in Figure 4, the periods of excessive outflows (state 2) span 2010-2012 and from the end of 2014 to mid-2015, when the capital controls act was issued.
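The link between the transition probabilities and the expected durations reported above can be made explicit. In a first-order Markov chain, the time spent in a regime follows a geometric distribution, so the expected duration of regime i equals 1 / (1 - p_ii), where p_ii is the probability of remaining in that regime. The sketch below applies this standard relation to the rounded figures quoted in the text; the function names are assumptions, and the small differences from the reported durations presumably reflect rounding of the published probabilities.

```python
def expected_duration(p_stay):
    """Expected number of periods spent in a regime with stay probability p_stay."""
    return 1.0 / (1.0 - p_stay)


def implied_stay_probability(duration):
    """Stay probability implied by a reported expected duration."""
    return 1.0 - 1.0 / duration


def long_run_shares(p12, p21):
    """Long-run share of time in state 1 and state 2 of a two-state chain."""
    total = p12 + p21
    return p21 / total, p12 / total


# Figures quoted in the text (rounded to three decimals).
duration_good = expected_duration(0.985)       # roughly 66.7 months
duration_bad = expected_duration(0.934)        # roughly 15.2 months
p11_implied = implied_stay_probability(70.72)  # roughly 0.9859
p22_implied = implied_stay_probability(15.32)  # roughly 0.9347
share_good, share_bad = long_run_shares(0.014, 0.065)
```

Under these rounded probabilities, the chain would spend roughly 82% of the time in the inflow regime in the long run, in line with outflow episodes being the exception rather than the rule.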
Based on the output of the Markov Switching model, we create a dummy variable (OUTFLOWS STATE) that takes the value 0 if we are in an inflow regime (state 1) and 1 if we are in an outflow regime (state 2). Then, we estimate a Binary Logit model (Model 2), employing the same control variables as before, to investigate how the leading indicator constructed above is related to the probability of observing a regime with excessive outflows. The results from the estimation of the logit model are presented in Table 5. We observe that the one-month lag of PC was once again found to be a significant predictor (p-value<0.0001) of observing a state of excessive outflows. Specifically, we found that as the number of news headlines increases, perceived fear becomes greater and the probability of observing deposit outflows in the next month increases. The real interest rate and the capital controls were again found to affect the probability of outflows. Although it had the proper sign, the ELECTIONS variable was not significant in our second model. A deposit outflow prediction model can be obtained either from the first model, in which we tried to predict the expected deposit outflows directly, or from the second model, which we employed to predict the likelihood of entering a high-outflow regime. These models can be expanded with additional covariates and improved with more advanced (i.e., non-linear) statistical methods while incorporating variance predictions. In this framework, the number of specific keywords appearing in news headlines seems to hold vital information for future deposit movements and can potentially serve as a leading indicator.
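The two steps described above, building the regime dummy and mapping the lagged indicator into an outflow probability through a logit link, can be sketched as follows. This is an illustration under stated assumptions: the 0.5 cutoff applied to the state-2 probabilities is an assumed classification rule (the text does not specify one), the coefficients are hypothetical rather than the estimates in Table 5, and the control variables are left out.

```python
import math


def regime_dummy(prob_state2, cutoff=0.5):
    """OUTFLOWS STATE dummy: 1 when the state-2 probability exceeds the cutoff.

    The cutoff of 0.5 is an assumption made for this sketch.
    """
    return [1 if p > cutoff else 0 for p in prob_state2]


def logit_probability(intercept, coefficients, regressors):
    """Probability of the outflow regime implied by a fitted logit model."""
    index = intercept + sum(b * x for b, x in zip(coefficients, regressors))
    return 1.0 / (1.0 + math.exp(-index))


# Hypothetical state-2 probabilities from a regime-switching model.
outflows_state = regime_dummy([0.1, 0.7, 0.5, 0.95])

# Hypothetical logit coefficients: intercept and slope on the lagged PC.
low_pc = logit_probability(-2.0, [1.5], [0.0])
high_pc = logit_probability(-2.0, [1.5], [3.0])
```

With a positive coefficient on the lagged indicator, a higher value of PC in one month translates into a higher predicted probability of being in the outflow regime in the next, which is the direction of the effect reported above.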
As a crisis deepens day by day and the level of risk increases dramatically, more bank deposit holders withdraw their money, and an increasing number of bank account holders mimic this behavior and react in the same way. Given that deposits have always been a significant channel of banks' ability to "transfer" loans to the economy, this mechanism also harms the real economy. Clearly, in a case where an unfavorable economic environment exists, we could safely argue that an increased number of negative Bloomberg news headlines is linked to increasing bank deposit outflows, that is, a "bank run."

Conclusions
The exploration of the determinants of bank deposits is an issue of substantial importance for regulatory authorities concerned about financial stability. In this context, the use of internet media-based information for predicting critical developments in the financial sector has so far attracted limited attention in the relevant literature. In this work, we show that the last decade's advances in internet technology, which permit direct access to a vast amount of information such as news headlines, offer the possibility of forecasting critical measures in the economy's banking system, such as the level of bank deposits, which are of crucial importance.
The purpose of this study is twofold. First, we construct a leading indicator based on news headlines, and second, we examine whether this novel indicator affects the trajectory of Greek bank deposit flows. Employing alternative econometric methodologies, we find that this index proxies for depositors' crisis sentiment: the higher the index becomes, the stronger depositors' negative sentiment becomes, leading them to withdraw their bank deposits. Monetary policy authorities or macroprudential regulators could adapt our model to assess the resilience of a bank or the whole banking sector.
In terms of the direction of future research, a possible extension of this empirical work would be to explore the potential differentiation in the impact of our leading indicator based on news headlines among banks located in the same country. For example, it can be the case that banks with higher capital adequacy ratios, increased profitability, and higher liquidity buffers (i.e., strong banks) suffer fewer deposit outflows