ISSN(Print): 2374-2011
ISSN(Online): 2374-2038

Open Access Peer-reviewed

Jonathan Iworiso

Received March 22, 2023; Revised April 25, 2023; Accepted May 07, 2023

Forecasting the stock market out-of-sample is a major concern to researchers in finance and emerging markets. This research focuses on the application of regularised Regression Training (RT) techniques to forecast the monthly equity premium out-of-sample recursively with an expanding window method. A broad category of sophisticated regularised RT models involving model complexity was employed. The regularised RT models, which include Ridge, Forward-Backward (FOBA) Ridge, Least Absolute Shrinkage and Selection Operator (LASSO), Relaxed LASSO, Elastic Net and Least Angle Regression, were trained and used to forecast the equity premium out-of-sample. The empirical investigation of the regularised RT models demonstrates significant evidence of equity premium predictability, both statistically and economically, relative to the benchmark historical average, delivering significant utility gains. Overall, the Ridge gives the best statistical performance evaluation results, while the LASSO appeared to be the most economically meaningful. The models provide meaningful economic information on mean-variance portfolio investment for investors who are timing the market to earn future gains at minimal risk. Thus, the forecasting models appeared to benefit an investor in a market setting who optimally reallocates a monthly portfolio between equities and risk-free treasury bills using equity premium forecasts at minimal risk.

The out-of-sample predictability of the stock market is a long-standing research problem in empirical finance. The question of what the stock market delivers to mean-variance investors above the treasury bill rate leads to an estimate of the equity premium. The historical average model is an old-fashioned efficient-market approach to forecasting the equity premium. Existing literature claims that financial variables used as potential predictors can forecast the equity premium in-sample but are unable to deliver significantly superior out-of-sample forecasts relative to the benchmark historical average. This led to the research question: can anything consistently beat the historical average out-of-sample? [4]. The historical average is used as a benchmark for comparing the performance of any model whose forecasts are estimated out-of-sample via an expanding or rolling window [10, 11]. Thus, any model whose statistical measures outperform those of the benchmark historical average is said to beat the historical average.

This research proposes an application of regularised Regression Training (RT) techniques to forecast the monthly equity premium out-of-sample recursively with an expanding window.

In finance, statistical predictability does not necessarily guarantee an investor's profit from the trading strategy. Thus, statistical predictability and economic significance are considered side by side in the performance evaluation metrics in this paper.

The equity premium or excess stock return is the difference between the expected return on the market portfolio (S&P 500) and the risk-free treasury bill rate. It is the return that investors can expect from holding the market portfolio in excess of the risk-free rate.

Mathematically, it is defined as:

$$y_{t+1} = \ln\!\left(\frac{P_{t+1}}{P_t}\right) - r_{f,t} \qquad (1)$$

where $P_t$ is the price of the stock index at period $t$; $r_{f,t}$ is the risk-free interest rate at time $t$.

The Regression Training implementation comes from the "caret" package, developed by [16].

The remainder of the paper is laid out as follows: Section 2 describes the research methodology; Section 3 presents the variables, the empirical results and discussion; Section 4 concludes the paper.

Given a univariate time series $\{y_t\}_{t=1}^{T}$, with $y_t$ denoting the monthly equity premium, the historical average (HA) model is defined as follows:

$$y_{t+1} = \alpha + \varepsilon_{t+1} \qquad (2)$$

where $\alpha$ is a parameter representing the intercept and $\varepsilon_{t+1}$ is a zero-mean disturbance term [4, 17]. The least squares estimator (LSE) of the historical average is as follows:

$$\hat{\alpha}_t = \frac{1}{t}\sum_{s=1}^{t} y_s$$

which implies that the forecast for $y_{t+1}$ is given by:

$$\hat{y}_{t+1} = \hat{\alpha}_t$$

where $\hat{\alpha}_t$ is the parametric estimator of $\alpha$.
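The recursive expanding-window HA forecast described above can be sketched in Python (the paper itself works in R with the caret package; the data and function name here are hypothetical):

```python
# Recursive out-of-sample historical-average (HA) forecasts with an
# expanding window: the forecast for period t+1 is the sample mean of
# all observations up to and including period t.
def ha_forecasts(y, start):
    """Return one-step-ahead HA forecasts for periods start..len(y)-1."""
    forecasts = []
    for t in range(start, len(y)):
        window = y[:t]                      # expanding window: all data before t
        forecasts.append(sum(window) / len(window))
    return forecasts

# Hypothetical monthly equity premia, not the paper's S&P 500 series.
premium = [0.02, -0.01, 0.03, 0.01, -0.02, 0.04]
print(ha_forecasts(premium, start=3))
```

Each new month, the window grows by one observation and the mean is re-estimated, mirroring the recursive scheme used throughout the paper.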

Given a training dataset of $T$ statistical units, a kitchen sink predictive linear model takes the form:

$$y_{t+1} = \alpha + \sum_{j=1}^{p}\beta_j x_{j,t} + \varepsilon_{t+1} \qquad (3)$$

where $y_{t+1}$ is the equity premium at $t+1$; $x_{1,t},\dots,x_{p,t}$ are the predictor variables available at the end of $t$ used to predict $y_{t+1}$; $\alpha$ is a constant term representing the intercept; $\beta_1,\dots,\beta_p$ are the model coefficients; $\varepsilon_{t+1}$ is a zero-mean disturbance term [11, 22].

The above model can be represented in matrix form, as follows:

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon} \qquad (4)$$

where $\mathbf{y}$ is a vector of observed values; $\mathbf{X}$ is the matrix of predictor variables; $\boldsymbol{\beta}$ is the parameter vector; $\boldsymbol{\varepsilon}$ is a zero-mean vector of disturbances.

If the parameters are estimated by OLS, then the linear model (LM) forecasts can be obtained from the resulting kitchen sink predictive model:

$$\hat{y}_{t+1} = \mathbf{x}_t'\,\hat{\boldsymbol{\beta}}_{OLS} \qquad (5)$$

where $\hat{\boldsymbol{\beta}}_{OLS} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}$ is the OLS estimate of $\boldsymbol{\beta}$.

Using the training set $\{(\mathbf{x}_t, y_{t+1})\}_{t=1}^{T}$, and by imposition of the ridge constraints, the model parameter estimates are obtained by minimizing the objective function

$$\hat{\boldsymbol{\beta}}_{ridge} = \arg\min_{\boldsymbol{\beta}}\left\{\sum_{t=1}^{T}\Big(y_{t+1} - \alpha - \sum_{j=1}^{p}\beta_j x_{j,t}\Big)^{2} + \lambda\sum_{j=1}^{p}\beta_j^{2}\right\} \qquad (6)$$

which is a convex optimization problem, hence the solution has a closed form [12, 15].

The ridge model parameter estimates will be:

$$\hat{\boldsymbol{\beta}}_{ridge} = \big(\mathbf{X}'\mathbf{X} + \lambda\mathbf{I}_{p}\big)^{-1}\mathbf{X}'\mathbf{y}$$

where $\mathbf{X}'\mathbf{X} + \lambda\mathbf{I}_{p}$ is always invertible, and hence non-singular [1]; $\mathbf{X}$ is the matrix of covariates; $\lambda\sum_{j}\beta_j^{2} = \lambda\|\boldsymbol{\beta}\|_{2}^{2}$ is the shrinkage penalty, with $\|\boldsymbol{\beta}\|_{2}$ the $\ell_2$ norm of the vector $\boldsymbol{\beta}$; $\lambda \ge 0$ is the ridge tuning parameter; $\alpha$ is the intercept; $\beta_1,\dots,\beta_p$ are the ridge coefficients; $\mathbf{I}_{p}$ is a $p \times p$ identity matrix; $p$ is the number of parameters to be estimated; $T$ is the sample size.
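A minimal sketch of the closed-form ridge solution, assuming demeaned data so the intercept can be ignored; the `ridge` helper and its toy inputs are illustrative assumptions, not the paper's caret implementation:

```python
# Closed-form ridge estimate (X'X + lam*I)^(-1) X'y in pure Python.
# Assumes the intercept has been absorbed by demeaning the data.
def ridge(X, y, lam):
    p = len(X[0])
    # Build A = X'X + lam*I and b = X'y.
    A = [[sum(X[t][i] * X[t][j] for t in range(len(X)))
          + (lam if i == j else 0.0) for j in range(p)] for i in range(p)]
    b = [sum(X[t][i] * y[t] for t in range(len(X))) for i in range(p)]
    # Solve A beta = b by Gaussian elimination with partial pivoting.
    for col in range(p):
        piv = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, p):
            f = A[r][col] / A[col][col]
            for c in range(col, p):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    beta = [0.0] * p
    for r in range(p - 1, -1, -1):
        beta[r] = (b[r] - sum(A[r][c] * beta[c]
                              for c in range(r + 1, p))) / A[r][r]
    return beta
```

With `lam=0` this reproduces OLS; increasing `lam` shrinks every coefficient toward zero, which is the behaviour the text describes.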

Thus, the ridge forecasts are obtained from the resulting forecasting model:

$$\hat{y}_{t+1} = \mathbf{x}_t'\,\hat{\boldsymbol{\beta}}_{ridge} \qquad (7)$$

where $\hat{\boldsymbol{\beta}}_{ridge}$ is the vector of estimated parameters, including the intercept, and $T$ is the sample size.

The ridge forecasts converge to the sample mean for large values of the tuning parameter $\lambda$: as $\lambda \to \infty$, the penalised coefficients shrink to zero and $\hat{y}_{t+1} \to \hat{\alpha} = \bar{y}$.

The forward-backward (FOBA) ridge is an extension of the ridge model. It implements the forward and backward sparse learning algorithms for the ridge regression model. A control parameter governs how likely the steps are to be taken, determining either the addition or the deletion of a variable in the ridge model. The FOBA method takes a backward step when the ridge penalised risk increase is less than this parameter times the ridge penalised risk reduction in the corresponding forward step, and vice versa.

The LASSO model parameter estimates are obtained by minimizing the objective function:

$$\hat{\boldsymbol{\beta}}_{LASSO} = \arg\min_{\boldsymbol{\beta}}\left\{\sum_{t=1}^{T}\Big(y_{t+1} - \alpha - \sum_{j=1}^{p}\beta_j x_{j,t}\Big)^{2} + \lambda\sum_{j=1}^{p}|\beta_j|\right\} \qquad (8)$$

where $\lambda \ge 0$ is the LASSO tuning parameter; $\alpha$ is the intercept; $\beta_1,\dots,\beta_p$ are the LASSO coefficients; $\|\boldsymbol{\beta}\|_1 = \sum_{j=1}^{p}|\beta_j|$ is the $\ell_1$ norm of the vector $\boldsymbol{\beta}$ [23, 25].

Unlike the ridge, the LASSO has no closed-form solution in general; the shrinkage penalty $\lambda\|\boldsymbol{\beta}\|_1$ sets some coefficient estimates exactly to zero, and $\hat{\beta}_j \to 0$ as $\lambda \to \infty$.

Thus, the LASSO forecasts are obtained from the resulting LASSO forecasting model:

$$\hat{y}_{t+1} = \mathbf{x}_t'\,\hat{\boldsymbol{\beta}}_{LASSO} \qquad (9)$$

where $\mathbf{X}$ is the matrix of covariates; $\hat{\boldsymbol{\beta}}_{LASSO}$ is the vector of estimated parameters, including the intercept; $T$ is the sample size; $\lambda$ controls the amount of shrinkage [9, 19].
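Since the LASSO has no closed form, it is typically solved iteratively. A minimal coordinate-descent sketch, for the objective $\tfrac{1}{2}\|\mathbf{y}-\mathbf{X}\boldsymbol{\beta}\|_2^2 + \lambda\|\boldsymbol{\beta}\|_1$ with no intercept and hypothetical data (the paper tunes the LASSO through caret, not this code):

```python
# Coordinate descent for the LASSO: each coefficient is updated by
# soft-thresholding the correlation of its predictor with the partial
# residual that excludes that predictor.
def soft_threshold(z, lam):
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_cd(X, y, lam, n_iter=200):
    T, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual excluding predictor j.
            r = [y[t] - sum(X[t][k] * beta[k] for k in range(p) if k != j)
                 for t in range(T)]
            rho = sum(X[t][j] * r[t] for t in range(T))
            norm = sum(X[t][j] ** 2 for t in range(T))
            beta[j] = soft_threshold(rho, lam) / norm
    return beta
```

The soft-threshold step is what sets small coefficients exactly to zero, giving the LASSO its variable-selection property.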

The relaxed least absolute shrinkage and selection operator (RELAXO) is a generalisation of the LASSO for linear regression.

Let $\lambda \in [0, \infty)$ and $\phi \in (0, 1]$ be two separate parameters for controlling model selection and shrinkage estimation. The RELAXO estimator can be defined for $\lambda$ and $\phi$ as follows:

$$\hat{\boldsymbol{\beta}}^{\lambda,\phi} = \arg\min_{\boldsymbol{\beta}}\left\{\frac{1}{T}\sum_{t=1}^{T}\Big(y_{t+1} - \mathbf{x}_t'\{\boldsymbol{\beta}\cdot\mathbf{1}_{\mathcal{M}_\lambda}\}\Big)^{2} + \phi\lambda\|\boldsymbol{\beta}\|_1\right\}$$

where $\mathcal{M}_\lambda$ is the set of predictor variables selected by the LASSO estimator; $\mathbf{1}_{\mathcal{M}_\lambda}$ is the indicator function on the set of predictor variables; $\phi\lambda\|\boldsymbol{\beta}\|_1$ is the shrinkage penalty for the RELAXO [20].

More generally, let $\mathcal{L}(\boldsymbol{\beta})$ be the negative log-likelihood under the parameter $\boldsymbol{\beta}$; then the generalized RELAXO estimator takes the form [20]:

$$\hat{\boldsymbol{\beta}}^{\lambda,\phi} = \arg\min_{\boldsymbol{\beta}}\left\{\frac{1}{T}\,\mathcal{L}\big(\boldsymbol{\beta}\cdot\mathbf{1}_{\mathcal{M}_\lambda}\big) + \phi\lambda\|\boldsymbol{\beta}\|_1\right\} \qquad (10)$$

Thus, the RELAXO forecasts are obtained from the resulting RELAXO forecasting model:

$$\hat{y}_{t+1} = \mathbf{x}_t'\,\hat{\boldsymbol{\beta}}^{\lambda,\phi} \qquad (11)$$

where

$$\hat{\boldsymbol{\beta}}^{\lambda,\phi} = \arg\min_{\boldsymbol{\beta}}\left\{\frac{1}{T}\sum_{t=1}^{T}\Big(y_{t+1} - \mathbf{x}_t'\{\boldsymbol{\beta}\cdot\mathbf{1}_{\mathcal{M}_\lambda}\}\Big)^{2} + \phi\lambda\|\boldsymbol{\beta}\|_1\right\} \qquad (12)$$

where $\hat{\boldsymbol{\beta}}^{\lambda,\phi}$ is the vector of estimated parameters, including the intercept; $\mathcal{M}_\lambda$ is the set of predictor variables selected by the LASSO estimator; $\mathbf{1}_{\mathcal{M}_\lambda}$ is the indicator function on the set of predictor variables.
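The two-stage logic of the relaxed LASSO can be illustrated in the orthonormal-design case, where the LASSO solution reduces to a soft-threshold of the OLS estimate: the first stage selects variables with penalty $\lambda$, and the second re-estimates only the selected variables with the milder penalty $\phi\lambda$. The helper `relaxed_lasso` and its inputs are illustrative assumptions, not the paper's implementation:

```python
# Relaxed-LASSO sketch under an orthonormal design: selection with
# penalty lam, then re-estimation of the selected set with phi*lam,
# so surviving coefficients are shrunk less (phi in (0, 1]).
def soft_threshold(z, lam):
    return (abs(z) - lam) * (1 if z > 0 else -1) if abs(z) > lam else 0.0

def relaxed_lasso(beta_ols, lam, phi):
    selected = [j for j, b in enumerate(beta_ols)
                if soft_threshold(b, lam) != 0.0]      # stage 1: selection
    return [soft_threshold(b, phi * lam) if j in selected else 0.0
            for j, b in enumerate(beta_ols)]            # stage 2: relaxed fit
```

With $\phi = 1$ this reproduces the ordinary LASSO; smaller $\phi$ keeps the same sparsity pattern but reduces the shrinkage bias on the retained coefficients.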

The elastic net, as proposed by [28], combines both the $\ell_1$ and $\ell_2$ penalty vector norms, and tends to eliminate extreme solutions. Thus the elastic net model parameter estimates are obtained by minimizing the objective function that includes both the ridge and LASSO shrinkage penalties, as follows:

$$\hat{\boldsymbol{\beta}}_{EN} = \arg\min_{\boldsymbol{\beta}}\left\{\sum_{t=1}^{T}\Big(y_{t+1} - \alpha - \sum_{j=1}^{p}\beta_j x_{j,t}\Big)^{2} + \lambda_2\|\boldsymbol{\beta}\|_2^{2} + \lambda_1\|\boldsymbol{\beta}\|_1\right\}$$

where $\lambda_2$ is the ridge tuning parameter and $\lambda_1$ is the LASSO tuning parameter [27].

It is worth noting that the Elastic Net reduces to the Ridge if $\lambda_1 = 0$; it reduces to the LASSO if $\lambda_2 = 0$; and it is strictly convex if $\lambda_2 > 0$ [3].
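These special cases can be checked with the per-coordinate elastic-net update in the orthonormal-design case; the objective normalisation and the helper name are assumptions for illustration, not the paper's code:

```python
# Elastic-net coordinate solution under an orthonormal design, i.e. the
# minimiser of (1 + lam2)*beta^2 - 2*b_ols*beta + lam1*|beta| per
# coordinate: a soft-threshold (LASSO part) followed by proportional
# shrinkage (ridge part).
def enet_orthonormal(b_ols, lam1, lam2):
    s = max(abs(b_ols) - lam1 / 2.0, 0.0)          # soft-threshold step
    return (s if b_ols >= 0 else -s) / (1.0 + lam2)  # ridge shrinkage step

print(enet_orthonormal(2.0, 0.0, 1.0))   # ridge case: 2/(1+1) = 1.0
print(enet_orthonormal(2.0, 1.0, 0.0))   # LASSO case: soft-threshold to 1.5
```

Setting `lam1 = 0` leaves pure proportional (ridge) shrinkage, while `lam2 = 0` leaves the pure LASSO soft-threshold, matching the special cases in the text.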

Therefore, the elastic net forecasts are obtained from the elastic net forecasting model:

$$\hat{y}_{t+1} = \mathbf{x}_t'\,\hat{\boldsymbol{\beta}}_{EN} \qquad (12)$$

where $\hat{\boldsymbol{\beta}}_{EN}$ is the vector of estimated parameters, including the intercept, and $\mathbf{X}$ is the matrix of covariates.

The least angle regression (LARS), introduced by [8], is a machine learning model selection algorithm for fitting linear regression models to high-dimensional data. In the LARS algorithm, the parameter estimates are increased in a direction equiangular to the correlations between the active predictors and the model residuals.

The LARS algorithm employed in this paper is adapted from [2] and [8].

The LARS2 is a special improved case of the LARS that uses *step* as the tuning parameter instead of *fraction*.

The mean squared forecast error (*MSFE*) is computed as follows:

$$MSFE = \frac{1}{T_{OOS}}\sum_{t=1}^{T_{OOS}}\big(y_t - \hat{y}_t\big)^{2}$$

where $T_{OOS}$ is the out-of-sample forecasting period; $y_t$ is the actual value at time $t$; $\hat{y}_t$ is the forecast value at time $t$.

The out-of-sample statistical goodness of fit, $R^2_{OOS}$, suggested by [5], measures the overall performance of a competing model's forecasts in terms of proportional error reduction relative to the benchmark historical average forecast. It is defined as follows:

$$R^2_{OOS} = 1 - \frac{\sum_{t=1}^{T_{OOS}}\big(y_t - \hat{y}_t\big)^{2}}{\sum_{t=1}^{T_{OOS}}\big(y_t - \bar{y}_t\big)^{2}}$$

where $R^2_{OOS} > 0$ implies that the MSE of the forecasting model is less than the MSE of the benchmark forecasts based on the historical average; $\hat{y}_t$ represents an equity premium forecast based on a specific competing model and $\bar{y}_t$ the historical average forecast.
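Both evaluation metrics are straightforward to compute; a sketch with hypothetical forecast series (helper names are illustrative):

```python
# Out-of-sample MSFE and Campbell-Thompson-style R^2_OOS relative to
# the historical-average benchmark forecasts.
def msfe(actual, forecast):
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)

def r2_oos(actual, model_fc, ha_fc):
    sse_model = sum((a - f) ** 2 for a, f in zip(actual, model_fc))
    sse_ha = sum((a - f) ** 2 for a, f in zip(actual, ha_fc))
    return 1.0 - sse_model / sse_ha   # > 0 means the model beats the HA
```

A positive `r2_oos` value is exactly the outperformance criterion applied in the empirical results below.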

The assumptions of the Diebold-Mariano test rely on the forecast error loss differential function [6, 7]. Let $e_{1t}$ and $e_{2t}$ denote the forecast errors associated with the loss functions $L(e_{1t})$ and $L(e_{2t})$ for forecasts 1 and 2 respectively. The time-$t$ loss differential between forecasts 1 and 2 is defined as follows:

$$d_t = L(e_{1t}) - L(e_{2t})$$

The *DM* hypothesis of equal forecast accuracy, also known as equal expected loss, corresponds to the zero-mean assumption on $d_t$, i.e., $E(d_t) = 0$, where $E(\cdot)$ denotes the expected value. Thus, the null hypothesis of equal forecast accuracy against the alternative hypothesis of unequal forecast accuracy between forecasts 1 and 2, based on monthly forecast horizon $h$, can be tested using the *DM* test statistic as follows [6]:

$$DM = \frac{\bar{d}}{\hat{\sigma}_{\bar{d}}}$$

where $\bar{d}$ is the sample mean loss differential and $\hat{\sigma}_{\bar{d}}$ is a consistent estimate of the standard deviation of $\bar{d}$. The *DM* test statistic has the asymptotic standard normal distribution under the null hypothesis of equal forecast accuracy. In this study, the forecast errors of each RT model are compared with the forecast errors from the benchmark historical average.
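For one-step-ahead forecasts under squared-error loss, the DM statistic can be sketched as follows (no autocorrelation correction is applied here, which is a simplifying assumption appropriate at horizon h = 1; inputs are hypothetical error series):

```python
import math

# Diebold-Mariano statistic under squared-error loss: the mean loss
# differential divided by its estimated standard error.
def dm_statistic(e1, e2):
    d = [a * a - b * b for a, b in zip(e1, e2)]   # loss differential d_t
    T = len(d)
    dbar = sum(d) / T
    var_d = sum((x - dbar) ** 2 for x in d) / (T - 1)
    return dbar / math.sqrt(var_d / T)            # asymptotically N(0, 1)
```

Values of the statistic beyond roughly ±1.96 reject equal forecast accuracy at the 5% level, the criterion used in the empirical results below.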

[24] employed the Sharpe Ratio (SR) as a measure of excess return per unit of risk in an investment asset or trading strategy. In this study, the SR standardizes the realized returns with the risk of the portfolio, and it is computed as follows:

$$SR = \frac{\bar{R}_p - \bar{R}_f}{\sigma_p}$$

where $\bar{R}_p$ is the average realized return of the portfolio over the out-of-sample period; $\bar{R}_f$ is the average risk-free treasury bill rate; $\sigma_p$ is the standard deviation of the portfolio return over the out-of-sample period.

The cumulative return (CR) of the portfolio is computed as follows:

$$CR = \prod_{t=1}^{T_{OOS}}\big(1 + R_t\big) - 1 \qquad (13)$$

where $R_t$ is the portfolio return in month $t$ and $T_{OOS}$ is the number of months in the out-of-sample period.
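The SR and CR computations can be sketched as follows (hypothetical monthly returns; the paper's values come from the actual portfolio series):

```python
import math

# Sharpe ratio: average excess return per unit of return volatility.
def sharpe_ratio(returns, rf):
    mean_r = sum(returns) / len(returns)
    var_r = sum((r - mean_r) ** 2 for r in returns) / len(returns)
    return (mean_r - rf) / math.sqrt(var_r)

# Cumulative return: compounded growth over the out-of-sample months.
def cumulative_return(returns):
    cr = 1.0
    for r in returns:
        cr *= 1.0 + r
    return cr - 1.0
```

Both measures are computed over the out-of-sample period only, so they evaluate the realized trading performance of each forecasting model rather than its in-sample fit.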

A mean-variance investor who forecasts the monthly equity premium using the HA will decide at the end of time $t$ to allocate a risky weight $w_t$ as the share of her portfolio in equities in time $t+1$, in the form:

$$w_t = \frac{1}{\gamma}\,\frac{\hat{y}^{HA}_{t+1}}{\hat{\sigma}^{2}_{t+1}}$$

where the portfolio risky weights are constrained to lie between 0% and 150% (i.e., $w_t = 0$ if $w_t < 0$ and $w_t = 1.5$ if $w_t > 1.5$); $\gamma$ is the risk aversion parameter; $\hat{y}^{HA}_{t+1}$ is the equity premium forecast based on the HA; $\hat{\sigma}^{2}_{t+1}$ is the forecast variance of stock returns [5, 11, 12].

The investor realizes an average utility from the HA, given by:

$$\hat{U}_{HA} = \hat{\mu}_{HA} - \frac{\gamma}{2}\,\hat{\sigma}^{2}_{HA}$$

where $\hat{\mu}_{HA}$ is the sample mean of the HA-based portfolio return over the out-of-sample period and $\hat{\sigma}^{2}_{HA}$ is the corresponding sample variance.

For an individual RT model, the risky equity share is chosen analogously:

$$w_t = \frac{1}{\gamma}\,\frac{\hat{y}_{t+1}}{\hat{\sigma}^{2}_{t+1}}$$

Then the investor realizes an average utility from the RT model, defined by:

$$\hat{U}_{RT} = \hat{\mu}_{RT} - \frac{\gamma}{2}\,\hat{\sigma}^{2}_{RT}$$

where $\hat{\mu}_{RT}$ is the sample mean of the model-based portfolio return over the out-of-sample period and $\hat{\sigma}^{2}_{RT}$ is the corresponding sample variance.

Thus, the utility gain (UG) can be computed as follows:

$$UG = \hat{U}_{RT} - \hat{U}_{HA}$$

for each of the RT out-of-sample forecasting models.
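The full utility-gain calculation can be sketched as follows (hypothetical returns and forecasts; the clipping bounds follow the 0%-150% constraint above, and the helper names are illustrative):

```python
# Mean-variance portfolio exercise: equity weights from equity-premium
# forecasts, realized average utilities, and the utility gain of a
# forecasting model over the historical average.
def equity_weight(forecast, var_forecast, gamma):
    w = forecast / (gamma * var_forecast)
    return min(max(w, 0.0), 1.5)          # constrain to [0%, 150%]

def avg_utility(port_returns, gamma):
    mu = sum(port_returns) / len(port_returns)
    var = sum((r - mu) ** 2 for r in port_returns) / len(port_returns)
    return mu - 0.5 * gamma * var         # mean-variance utility

def utility_gain(model_returns, ha_returns, gamma):
    return avg_utility(model_returns, gamma) - avg_utility(ha_returns, gamma)
```

A positive utility gain can be read as the management fee the investor would pay to switch from the historical-average forecasts to the model's forecasts.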

The dataset of financial variables used in this paper is obtained from [26], Amit Goyal's website and Robert Shiller's website, covering monthly observations from January 1960 to December 2019. The stock indices are obtained from the CRSP's month-end values of the *S*&*P*500 monthly index, and the stock returns are the continuously compounded returns on the *S*&*P*500 index. All out-of-sample forecasts are obtained by an expanding window, and the out-of-sample period is from January 1994 to December 2019. The parameters of the forecasting models are estimated recursively, using all data points from the start date to the current time to obtain a one-month-ahead forecast. The forecast horizon is one month ahead, and the procedure is repeated until the last forecast is obtained.

In this paper, the empirical results for the RT models are summarised in two panels, displayed in Table 1 and Table 2 respectively. Following the benchmark statistical performance evaluation metrics in [4, 5, 10], any model which gives a positive out-of-sample $R^2_{OOS}$ (i.e., $R^2_{OOS} > 0$) using an expanding or rolling window is said to have outperformed, or consistently beaten, the historical average. In the Kitchen Sink RT Model panel, the Linear Model (*LM*) gives a negative $R^2_{OOS}$ (i.e., $R^2_{OOS} < 0$), which indicates underperformance relative to the benchmark historical average. This corroborates previous findings in the empirical literature that ordinary linear regression cannot consistently beat the benchmark historical average out-of-sample. Thus, the introduction of model training with recursive fine-tuning of parameters does not improve the statistical predictive performance of the *LM* in this direction.

In the regularised RT Models panel, each of the models produced a positive $R^2_{OOS}$ (i.e., $R^2_{OOS} > 0$), which indicates statistical evidence of outperformance over the benchmark historical average. In this paper, the Diebold-Mariano (*DM*) test is introduced as an additional statistical performance evaluation measure to compare the forecast accuracy of each RT model with that of the historical average. Interestingly, the regularised RT models demonstrate statistically significant evidence of producing better forecasts than the historical average at the 5% significance level, except for the Elastic Net. Also, the *LM* in the Kitchen Sink Model panel could not give any statistically significant evidence of unequal forecast accuracy relative to the historical average, as judged by the *DM* test. In the regularised RT Models panel, the Ridge gives the highest $R^2_{OOS}$ with the corresponding minimum *MSFE* among all the RT models tested, in terms of statistical predictive power. Thus, the presence of the $\ell_2$-vector norm in the Ridge model seems to improve its statistical predictive performance. The *FOBA* underperformed the Ridge model, while the relaxed LASSO outperformed the LASSO. The combination of both $\ell_1$ and $\ell_2$ vector norms in the Elastic Net does not improve its statistical predictive performance compared to their individual forms, as in the Ridge and LASSO. The *step* tuning parameter in the *LARS2* algorithm seems to improve the predictive performance of the model, compared to the *LARS* algorithm which uses *fraction* as a tuning parameter. Thus, the bias-variance trade-off exploited by the sophisticated regularised RT models is a useful approach for forecasting the U.S. monthly equity premium out-of-sample with significant predictive power, relative to the benchmark historical average.

Turning to the economic performance evaluation measures (Table 3), it is important to note that the statistical predictive power of a model relative to the benchmark historical average does not necessarily guarantee economic significance in a real market setting. The $R^2_{OOS}$ and *MSFE* alone cannot explicitly account for an investor's risk over the out-of-sample period. In this paper, the economic performance evaluation metrics employed are the Cumulative Return (*CR*), Sharpe Ratio (*SR*) and Utility Gain (*UG*) over the out-of-sample periods. The study seeks to reconcile the statistical and economic evidence in an attempt to guarantee the future expectation of a mean-variance portfolio investor. The utility computations use the average risk-free treasury bill rate over the out-of-sample period and a fixed risk aversion parameter $\gamma$. A mean-variance investor can increase her monthly portfolio return by a proportional factor that depends on the Sharpe ratio *SR*. In [21] and [22], the *UG* is expressed in the form of average annualised percentage returns, also known as certainty equivalent returns. The *UG* is important in a real market setting in that it provides useful economic information on the portfolio management fee that an investor would be willing to pay in order to have access to the additional information in the forecasting model, relative to the sole information in the historical equity premium. For a mean-variance portfolio investor, a model that produces a higher *UG* over the out-of-sample periods than the average risk-free treasury bill is preferable to its counterpart. Whereas, if risk is equal, it is more profitable to invest in the treasury bills than in the portfolio based on the forecasting model.

[5] argued that even very low positive $R^2_{OOS}$ values for monthly data can produce meaningful economic evidence of equity premium predictability in terms of increased annual portfolio returns for a mean-variance investor. In agreement with [5], the *LM* in the Kitchen Sink Model panel gives economically meaningful evidence, preferable to the average risk-free treasury bill, as judged by the *UG* and *SR*. In spite of the weak statistical predictive power of the *LM*, it seems to provide useful economic information to a mean-variance portfolio investor.

In the regularised RT Models panel, all the models provide strong significant evidence of economic predictability and outperformance over the treasury bill. It is worth noting that the superiority of a forecasting model in terms of statistical predictability does not correspondingly guarantee superiority in economic significance. In the statistical performance evaluation metrics the Ridge gives the best results, whereas in the economic performance evaluation metrics the LASSO produced the best results. Thus, the $\ell_1$-vector norm in the LASSO forecasting model seems to be more economically powerful than the $\ell_2$-vector norm in the Ridge forecasting model. As in [13], where penalised binary probit models were used as classifiers for sign or directional forecasting, and in the application of deep learning in [14], the training and fine-tuning approach of the regularised regression models in this paper also provides statistically significant evidence of equity premium predictability with significant economic gains. Figure 1 and Figure 2 depict the graphical analysis of the out-of-sample RT forecasting models: Figure 1 is a stacked bar chart while Figure 2 is a bar chart, showing the cumulative returns (*CRs*), Sharpe ratios (*SRs*) and utility gains (*UGs*). The time series plots of actual versus forecast values for the various RT forecasting models are depicted in Figure 3, Figure 4, Figure 5, Figure 6 and Figure 7 respectively. As in [18], the regularised RT forecasting models in this paper provide significant evidence of equity premium predictability over the benchmark historical average with useful economic gains, suggesting better alternatives for mean-variance investors.

The empirical analysis in this paper revealed that the sophisticated regularised RT forecasting models consistently beat the benchmark historical average out-of-sample, both statistically and economically. Thus, the regularised RT forecasting models used in this study appeared to benefit a mean-variance portfolio investor in a real-time market setting who optimally reallocates a monthly portfolio between equities and risk-free treasury bills using equity premium forecasts at minimal risk.

This paper has answered the research question in ^{ 4, 5}, demonstrating the superiority of regularised RT forecasting models over the benchmark historical average out-of-sample with significant economic gains. Interestingly, all the regularised RT forecasting models consistently beat the benchmark historical average out-of-sample, both statistically and economically.

Overall, the Ridge gives the best statistical performance evaluation results, while the LASSO appeared to be the most economically meaningful. The regularised RT forecasting models provide useful economic information on mean-variance portfolio investment for investors who are timing the market to earn future gains at minimal risk. Thus, the regularised RT forecasting models appeared to benefit a mean-variance investor in a real-time setting who optimally reallocates a monthly portfolio between equities and risk-free treasury bills using equity premium forecasts at minimal risk.

[1] Ahn, J. J., Byun, H. W., Oh, K. J., and Kim, T. Y. (2012). Using ridge regression with genetic algorithm to enhance real estate appraisal forecasting. Expert Systems with Applications, 39(9): 8369-8379.

[2] Alfons, A., Croux, C., and Gelper, S. (2016). Robust groupwise least angle regression. Computational Statistics & Data Analysis, 93: 421-435.

[3] Bai, J. and Ng, S. (2008). Forecasting economic time series using targeted predictors. Journal of Econometrics, 146(2): 304-317.

[4] Campbell, J. Y. and Thompson, S. B. (2005). Predicting the equity premium out of sample: Can anything beat the historical average? Technical report, National Bureau of Economic Research.

[5] Campbell, J. Y. and Thompson, S. B. (2007). Predicting excess stock returns out of sample: Can anything beat the historical average? The Review of Financial Studies, 21(4): 1509-1531.

[6] Diebold, F. X. (2015). Comparing predictive accuracy, twenty years later: A personal perspective on the use and abuse of Diebold–Mariano tests. Journal of Business & Economic Statistics, 33(1): 1-1.

[7] Diebold, F. X. and Mariano, R. S. (2002). Comparing predictive accuracy. Journal of Business & Economic Statistics, 20(1): 134-144.

[8] Efron, B., Hastie, T., Johnstone, I., and Tibshirani, R. (2004). Least angle regression. The Annals of Statistics, 32(2): 407-499.

[9] Elliott, G., Gargano, A., and Timmermann, A. (2013). Complete subset regressions. Journal of Econometrics, 177(2): 357-373.

[10] Goyal, A. and Welch, I. (2003). Predicting the equity premium with dividend ratios. Management Science, 49(5): 639-654.

[11] Goyal, A. and Welch, I. (2007). A comprehensive look at the empirical performance of equity premium prediction. The Review of Financial Studies, 21(4): 1455-1508.

[12] Hoerl, A. E. and Kennard, R. W. (1970). Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12(1): 55-67.

[13] Iworiso, J. and Vrontos, S. (2020). On the directional predictability of equity premium using machine learning techniques. Journal of Forecasting, 39(3): 449-469.

[14] Iworiso, J. and Vrontos, S. (2021). On the predictability of the equity premium using deep learning techniques. The Journal of Financial Data Science, 3(1): 74-92.

[15] James, G., Witten, D., Hastie, T., and Tibshirani, R. (2013). An Introduction to Statistical Learning. Springer.

[16] Kuhn, M. et al. (2008). Caret package. Journal of Statistical Software, 28(5): 1-26.

[17] Lee, T.-H., Tu, Y., and Ullah, A. (2015). Forecasting equity premium: Global historical average versus local historical average and constraints. Journal of Business & Economic Statistics, 33(3): 393-402.

[18] Li, J. and Tsiakas, I. (2017). Equity premium prediction: The role of economic and statistical constraints. Journal of Financial Markets, 36: 56-75.

[19] Martínez-Martínez, J. M., Escandell-Montero, P., Soria-Olivas, E., Martín-Guerrero, J. D., Magdalena-Benedito, R., and Gómez-Sanchis, J. (2011). Regularized extreme learning machine for regression problems. Neurocomputing, 74(17): 3716-3721.

[20] Meinshausen, N. (2007). Relaxed lasso. Computational Statistics & Data Analysis, 52(1): 374-393.

[21] Rapach, D. E., Strauss, J. K., and Zhou, G. (2007). Out-of-sample equity premium prediction: Consistently beating the historical average. Review of Financial Studies.

[22] Rapach, D. E., Strauss, J. K., and Zhou, G. (2010). Out-of-sample equity premium prediction: Combination forecasts and links to the real economy. The Review of Financial Studies, 23(2): 821-862.

[23] Sagaert, Y. R., Aghezzaf, E.-H., Kourentzes, N., and Desmet, B. (2018). Tactical sales forecasting using a very large set of macroeconomic indicators. European Journal of Operational Research, 264(2): 558-569.

[24] Sharpe, W. F. (1994). The Sharpe ratio. Journal of Portfolio Management, 21(1): 49-58.

[25] Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1): 267-288.

[26] Welch, I. and Goyal, A. (2008). A comprehensive look at the empirical performance of equity premium prediction. The Review of Financial Studies, 21(4): 1455-1508.

[27] Wu, L. and Yang, Y. (2014). Nonnegative elastic net and application in index tracking. Applied Mathematics and Computation, 227: 541-552.

[28] Zou, H. and Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(2): 301-320.

Published with license by Science and Education Publishing, Copyright © 2023 Jonathan Iworiso

This work is licensed under a Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

Jonathan Iworiso. Forecasting Stock Market Out-of-Sample with Regularised Regression Training Techniques. *International Journal of Econometrics and Financial Management*. Vol. 11, No. 1, 2023, pp 1-12. http://pubs.sciepub.com/ijefm/11/1/1

