Modelling the Effects of Trading Volume on Stock Return Volatility Using Conditional Heteroskedastic Models

In this study, we analyze the effects of trading volume, used as a proxy for information arrival, on stock return volatility, and assess whether volatility persistence disappears once trading volume is included in the conditional variance equation. We use two asymmetric generalized autoregressive conditional heteroscedasticity models, EGARCH and TGARCH. The analysis is based on the daily Nairobi Securities Exchange (NSE) 20-share index and trading volume from 02/01/2009 to 02/06/2017, a total of 2108 observations. The results of the AR (2)-EGARCH (1,1) and AR (2)-TGARCH (1,1) models show that the relationship between trading volume and stock return volatility is positive but not statistically significant, implying that trading volume as a proxy for information flow is generally a poor source of volatility in stock returns. Moreover, the results do not support the hypothesis that persistence in volatility disappears with the inclusion of trading volume in the conditional variance equation, and this holds under both the Student's t-distribution and the Generalized Error Distribution assumptions. We suggest that the AR (2)-EGARCH (1,1) model without trading volume and with Student's t-distributed errors is the more suitable model for capturing the main features of the stock returns, such as volatility clustering, return volatility and the leverage effect.


Introduction
The study of volatility in financial markets is of great importance to investors managing risk because it measures the degree of uncertainty attached to their investments. Reference [1] articulates that financial analysts and investors are concerned with the unpredictability of asset returns, which is attributed to unstable business performance and varying market prices. The common risk measures in financial markets are the Value at Risk (VaR) and the Expected Shortfall (ES), where the former is the more established statistic while the latter is attracting increasing research interest.
In many instances, time series, especially in the natural sciences, cannot be modeled by a linear process and are better modeled by nonlinear processes, which include ARCH, GARCH, TGARCH, EGARCH, PGARCH and many others. Financial return series often display volatility clustering. Reference [2] outlines the most essential features of financial time series: they tend to have a leptokurtic distribution, a leverage effect, skewness and volatility clustering. The standard ARCH/GARCH model can capture the leptokurtosis, skewness and volatility clustering. However, [3] shows that the standard model is unable to capture the dynamics of an important feature of financial time series known as the leverage effect, i.e. it cannot model the asymmetric behaviour of stock returns.
A stock return is what an investor gains or loses by investing in a particular stock or portfolio, and it depends on the inherent risk in the market in which the stock is listed. Reference [4] articulates that variations in investment returns depend mostly on the willingness of the investor to take risk: the greater the willingness to take risk, the greater the potential returns to the investor, and conversely. In [5], volatility is defined as a measure of variability or dispersion about a measure of central tendency. In financial markets, the major concern is often the spread of asset returns. Reference [6] articulates that for any stock market, volatility and returns are the two important factors around which the entire market revolves. Volatility is associated with the uncertainty of the price; however, it is not the same as risk. Risk is linked to an undesirable outcome, whereas volatility is strictly a measure of uncertainty that can be attributed to either a positive or a negative outcome [5]. Nevertheless, higher volatility generally implies higher risk in the market.
The trading volume measures the number of shares of a particular security that change ownership. In financial markets, many researchers and traders are of the view that trading volume strongly influences price movements. Reference [7] argues that trading volume contains a lot of information, as it is a good proxy for the level of information investors hold about stocks at any given time and hence affects their reactions through the selling and buying of stocks. Reference [8] argues that this is consistent with the studies by [7,9,10], which concluded that volume and absolute price changes are positively correlated. Reference [11] argues that trading volume is one factor that many have considered in the prediction of stock prices. In addition, a security's daily volume can vary with the amount of new information available about a company on a given day, the expiry of option contracts, whether the trading day is a full or half day, and other possible factors. Among the wide range of factors influencing trading volume, the arrival of new information is the one that corresponds most closely to a security's fundamental valuation.
Reference [12] articulates that the arrival of information plays a critical part in stock markets as it primarily drives price movements. This information can be a public statement, a profit declaration, a court ruling relating to operations or a change in company regulatory policies. Additionally, news arrival as an essential driver of market movements is a unique property of vector stochastic models. This implies that news has a more significant impact on the investment decisions of most investors than information regarding the business activities of listed companies, such as financial statements.
Researchers have become more interested in the analysis of trading volume and the corresponding price changes around information releases because of the inferences that can be made from abnormal trading volume. Traders, for their part, focus on trading volume for several reasons. In theory, low volume in the market is associated with large price fluctuations, whereas high volume is associated with low price variability, which in turn reduces the price impact of large trades. In general, broker revenue increases with volume, and high turnover gives market makers a greater chance of profit.
Although a fair amount of empirical evidence exists on the effects of trading volume on stock return volatility for emerging stock markets in developing countries, very few empirical studies in the current literature consider asymmetric GARCH models. This study intends to fill this gap by analyzing the effects of trading volume as a proxy for the arrival of information (hereafter, information arrival) on stock return volatility, and by assessing whether volatility persistence disappears when trading volume is included in the conditional variance equation of the EGARCH and TGARCH models. The remainder of this paper is organized as follows. We discuss the methodology in Section 2. The results and discussion are contained in Section 3 and we conclude the paper in Section 4.

Methodology
The analysis was done on the daily NSE 20-share price index and trading volume from 02/01/2009 to 02/06/2017, a total of 2108 observations, and was carried out in the R software environment [13].
The stock return is defined as
$$R_t = \ln\left(\frac{P_t}{P_{t-1}}\right) = \ln P_t - \ln P_{t-1},$$
where $P_t$ and $P_{t-1}$ are the values of the stock index at the close of the current day and the previous day respectively, and $R_t$ is the logarithmic stock return.
The trading volume is defined as
$$V_t = \ln\left(\frac{Vol_t}{Vol_{t-1}}\right),$$
where $Vol_t$ and $Vol_{t-1}$ are the volumes of shares traded at the close of the current day and the previous day of trade respectively, and $V_t$ is the logarithmic trading volume.
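As a minimal sketch of the two transformations, the snippet below applies the same log-difference to a price series and a volume series. The function name `log_changes` and the sample numbers are illustrative assumptions and are not NSE data; the study itself was carried out in R.

```python
import math

def log_changes(levels):
    """Return ln(x_t / x_{t-1}) for consecutive observations.

    All levels must be strictly positive, otherwise the logarithm is undefined.
    """
    if any(x <= 0 for x in levels):
        raise ValueError("levels must be strictly positive")
    return [math.log(curr / prev) for prev, curr in zip(levels, levels[1:])]

# Hypothetical closing index values and traded volumes (not real data).
prices = [100.0, 101.0, 99.5, 100.2]
volumes = [5000.0, 5500.0, 4800.0, 5200.0]

returns = log_changes(prices)          # R_t
volume_changes = log_changes(volumes)  # V_t
```

One convenient property of log returns is that they add up over time: the sum of the daily values equals the log change over the whole period.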
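We let $\varepsilon_t$ be the shock at time t and $F_t$ be the information available through time t. The modeling involves the estimation of the mean and conditional variance equations. We define the model as
$$R_t = \mu_t + \varepsilon_t,$$
where $\mu_t$ is the conditional mean of $R_t$ given information through time $t-1$ and $R_t$ is the return at time index t. The shock $\varepsilon_t$ is a non-constant term with respect to time and is defined as
$$\varepsilon_t = \sigma_t a_t,$$
where $\sigma_t^2$ is the conditional variance of $R_t$ given $F_{t-1}$ and $a_t$ is an independent and identically distributed error with zero mean and unit variance.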

Conditional Mean Equation
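In modeling the conditional mean equation of $R_t$, we employ the general Box-Jenkins ARMA (p, q) model defined as
$$R_t = \mu + \sum_{i=1}^{p}\phi_i R_{t-i} + \sum_{j=1}^{q}\theta_j \varepsilon_{t-j} + \varepsilon_t,$$
where $\mu$ is a constant, $\phi_i$ and $\theta_j$ are parameters of the ARMA (p, q) model and $\varepsilon_t$ is the disturbance term.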
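For the pure autoregressive case with p = 2 and q = 0, the disturbance term is simply the return minus its fitted conditional mean. The sketch below illustrates that calculation for given parameter values; the function name, the parameter values and the sample returns are hypothetical, and no estimation is performed.

```python
def ar2_residuals(returns, mu, phi1, phi2):
    """Residuals R_t - (mu + phi1*R_{t-1} + phi2*R_{t-2}) for t >= 2.

    The first two observations are used only as lagged values.
    """
    residuals = []
    for t in range(2, len(returns)):
        fitted = mu + phi1 * returns[t - 1] + phi2 * returns[t - 2]
        residuals.append(returns[t] - fitted)
    return residuals

# Hypothetical daily log returns and AR(2) coefficients.
r = [0.010, -0.004, 0.006, 0.001, -0.003]
eps = ar2_residuals(r, mu=0.0, phi1=0.3, phi2=0.1)
```

These residuals are the shocks that enter the conditional variance equation.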

Conditional Variance Equation
To model the daily stock returns volatility, we used asymmetric models due to the fact that shocks of equal magnitude which may either be positive or negative are considered to have different effects on the volatility in future. Reference [14] articulates that asymmetric models are extensively motivated by the need to distinguish between negative and positive shocks and their impact on volatility in financial markets. In this paper, we used the standard EGARCH (r, s) and TGARCH (r, s) models that we discuss below:

Exponential GARCH (EGARCH) Models
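In this model, the asymmetric response of the time-varying variance to shocks is captured. The model ensures a positive variance and uses the standardized value of the shock, $\varepsilon_{t-i}/\sigma_{t-i}$. The EGARCH (r, s) specification is given by
$$\ln(\sigma_t^2) = \omega + \sum_{i=1}^{r}\alpha_i\left|\frac{\varepsilon_{t-i}}{\sigma_{t-i}}\right| + \sum_{i=1}^{r}\gamma_i\frac{\varepsilon_{t-i}}{\sigma_{t-i}} + \sum_{j=1}^{s}\beta_j\ln(\sigma_{t-j}^2) \qquad (6)$$
where the asymmetric or leverage parameter is $\gamma_i$. In most empirical cases, the leverage parameter is expected to be negative so that a negative shock increases future volatility or uncertainty while a positive shock eases the impact on future uncertainty. When $\varepsilon_{t-i}$ is positive (i.e. good news), its contribution to the log volatility is $(\alpha_i + \gamma_i)\left|\varepsilon_{t-i}/\sigma_{t-i}\right|$, whereas when $\varepsilon_{t-i}$ is negative (i.e. bad news), its contribution is $(\alpha_i - \gamma_i)\left|\varepsilon_{t-i}/\sigma_{t-i}\right|$. Hence, if
i. $\gamma_i = 0$, then there is symmetry, i.e. no asymmetric volatility.
ii. $\gamma_i < 0$, then negative shocks (bad news) will increase the volatility more than positive shocks (good news).
iii. $\gamma_i > 0$, then positive shocks (good news) will increase the volatility more than negative shocks (bad news).
The persistence P of the model is given by
$$P = \sum_{j=1}^{s}\beta_j.$$
In order to analyze the effects of the trading volume $V_t$ on stock return volatility, the following modification of the conditional variance equation (6) is used:
$$\ln(\sigma_t^2) = \omega + \sum_{i=1}^{r}\alpha_i\left|\frac{\varepsilon_{t-i}}{\sigma_{t-i}}\right| + \sum_{i=1}^{r}\gamma_i\frac{\varepsilon_{t-i}}{\sigma_{t-i}} + \sum_{j=1}^{s}\beta_j\ln(\sigma_{t-j}^2) + \delta V_t,$$
where $V_t$ is the logarithmic trading volume, which is used as a proxy for information arrival, while the rest of the parameters are as defined in equation (6). Reference [15] argues that if the proxy for information flow in the market is serially correlated with the variance, then the parameter $\delta > 0$ and the persistence would be significantly smaller than when $V_t$ is not included. If the parameter $\delta$ is positive and statistically significant, then the proxy for information flow is serially correlated with the variance and has explanatory power.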
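The recursion behind an EGARCH (1,1) variance with an optional volume term can be sketched as follows. This is only an illustration under an assumed parameterisation in which the log variance depends on the absolute standardized shock (coefficient `alpha`), the signed standardized shock (coefficient `gamma`), the lagged log variance (coefficient `beta`) and the volume term (coefficient `delta`). The function name, the starting value and all parameter values are hypothetical.

```python
import math

def egarch11_variance(eps, omega, alpha, gamma, beta,
                      delta=0.0, volume=None, sigma2_0=None):
    """Conditional variances from the assumed EGARCH(1,1) recursion

    ln s2_t = omega + alpha*|z_{t-1}| + gamma*z_{t-1}
              + beta*ln s2_{t-1} + delta*V_t,   z_t = eps_t / sqrt(s2_t).
    """
    n = len(eps)
    if volume is None:
        volume = [0.0] * n  # no volume effect
    if sigma2_0 is None:
        # Starting value: sample second moment of the shocks (an assumption).
        sigma2_0 = sum(e * e for e in eps) / n
    sigma2 = [sigma2_0]
    for t in range(1, n):
        z = eps[t - 1] / math.sqrt(sigma2[t - 1])
        log_s2 = (omega + alpha * abs(z) + gamma * z
                  + beta * math.log(sigma2[t - 1]) + delta * volume[t])
        sigma2.append(math.exp(log_s2))  # exp() keeps the variance positive
    return sigma2

# Same-sized shocks of opposite sign, with a negative gamma (hypothetical values).
params = dict(omega=-0.5, alpha=0.1, gamma=-0.1, beta=0.95, sigma2_0=1e-4)
after_good_news = egarch11_variance([0.02, 0.0], **params)[1]
after_bad_news = egarch11_variance([-0.02, 0.0], **params)[1]
```

With a negative `gamma`, the variance following the negative shock is larger than the variance following the positive shock of equal size, which is the asymmetry the model is designed to capture. Under this parameterisation the persistence of the (1,1) model is simply `beta`.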

Threshold GARCH (TGARCH) Models
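The TGARCH (r, s) conditional variance specification is given by
$$\sigma_t^2 = \omega + \sum_{i=1}^{r}\left(\alpha_i + \gamma_i I_{t-i}\right)\varepsilon_{t-i}^2 + \sum_{j=1}^{s}\beta_j\sigma_{t-j}^2 \qquad (8)$$
where $I_{t-i} = 1$ if $\varepsilon_{t-i} < 0$ and $I_{t-i} = 0$ otherwise, $\gamma_i$ is the asymmetric response parameter or leverage parameter, and $\alpha_i$ and $\beta_j$ are non-negative parameters satisfying conditions similar to those of GARCH models. If $\gamma_i = 0$, the model collapses to the classical GARCH (r, s) process. In this model, positive shocks (good news) and negative shocks (bad news) have different effects on the conditional variance $\sigma_t^2$: a positive shock contributes $\alpha_i\varepsilon_{t-i}^2$, whereas a negative shock contributes $(\alpha_i + \gamma_i)\varepsilon_{t-i}^2$. The persistence P of the model is given by
$$P = \sum_{i=1}^{r}\alpha_i + \sum_{j=1}^{s}\beta_j + \sum_{i=1}^{r}\gamma_i k,$$
where k is the expected value of the standardized residuals $a_t$ below zero (effectively the probability of being below zero),
$$k = E\left[I_{t-i}a_{t-i}^2\right] = \int_{-\infty}^{0} f(a, 0, 1, \ldots)\,da,$$
where f is the standardized conditional density with any additional skew and shape parameters $(\ldots)$. For instance, the value of k is 0.5 in the case of symmetric distributions.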
In order to analyze the effects of the trading volume $V_t$ on stock return volatility, the following modification of the conditional variance equation (8) is used:
$$\sigma_t^2 = \omega + \sum_{i=1}^{r}\left(\alpha_i + \gamma_i I_{t-i}\right)\varepsilon_{t-i}^2 + \sum_{j=1}^{s}\beta_j\sigma_{t-j}^2 + \delta V_t,$$
where $V_t$ is the logarithmic trading volume, which is used as a proxy for information arrival to the market, while the rest of the parameters are as defined in equation (8). According to [15], the value of $\delta$ should be positive and there should be negligible volatility persistence.
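A short sketch of the threshold recursion and its persistence may help make the roles of the parameters concrete. It assumes the (1,1) case in which an indicator switches on the extra coefficient `gamma` whenever the previous shock is negative; the function names, the starting value and all numbers are hypothetical.

```python
def tgarch11_variance(eps, omega, alpha, gamma, beta,
                      delta=0.0, volume=None, sigma2_0=None):
    """Conditional variances from the assumed TGARCH(1,1) recursion

    s2_t = omega + (alpha + gamma*I_{t-1})*eps_{t-1}^2
           + beta*s2_{t-1} + delta*V_t,   I_{t-1} = 1 if eps_{t-1} < 0.
    """
    n = len(eps)
    if volume is None:
        volume = [0.0] * n  # no volume effect
    if sigma2_0 is None:
        # Starting value: sample second moment of the shocks (an assumption).
        sigma2_0 = sum(e * e for e in eps) / n
    sigma2 = [sigma2_0]
    for t in range(1, n):
        indicator = 1.0 if eps[t - 1] < 0 else 0.0
        sigma2.append(omega
                      + (alpha + gamma * indicator) * eps[t - 1] ** 2
                      + beta * sigma2[t - 1]
                      + delta * volume[t])
    return sigma2

def tgarch11_persistence(alpha, gamma, beta, k=0.5):
    """Persistence alpha + beta + gamma*k; k = 0.5 for symmetric densities."""
    return alpha + beta + gamma * k

# Hypothetical parameter values.
params = dict(omega=1e-6, alpha=0.05, gamma=0.10, beta=0.85, sigma2_0=1e-4)
after_good_news = tgarch11_variance([0.02, 0.0], **params)[1]
after_bad_news = tgarch11_variance([-0.02, 0.0], **params)[1]
persistence = tgarch11_persistence(0.05, 0.10, 0.85)
```

Unlike the exponential form, this recursion does not guarantee a positive variance by construction, which is why the coefficients are usually constrained.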

Distribution Assumptions of the Error ($a_t$) in the GARCH-type Model
Reference [8] argues that financial time series often exhibit non-normality patterns such as excess kurtosis and skewness. The residuals of conditional heteroscedasticity models may generally show excess kurtosis, heavy tails and skewness. In order to account for the skewness, excess kurtosis and heavy tails of return distributions, this study employed the Student's t-distribution and the Generalized Error Distribution (GED). Reference [17] proposed that, in fitting the GARCH model, the Student's t-distribution can be used for the standardized error of the return series in order to better capture the observed fat tails. The probability density function for a random variable x that has a Student's t-distribution with v degrees of freedom is given by
$$f(x) = \frac{\Gamma\left(\frac{v+1}{2}\right)}{\sqrt{v\pi}\,\Gamma\left(\frac{v}{2}\right)}\left(1 + \frac{x^2}{v}\right)^{-\frac{v+1}{2}}.$$

Student's t-Distribution
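The density of the standardized Student's t-distribution with v > 2 degrees of freedom is given by
$$f(a_t) = \frac{\Gamma\left(\frac{v+1}{2}\right)}{\Gamma\left(\frac{v}{2}\right)\sqrt{\pi(v-2)}}\left(1 + \frac{a_t^2}{v-2}\right)^{-\frac{v+1}{2}},$$
where $\Gamma(\cdot)$ is the gamma function and v is the parameter that measures the thickness of the tail. For T observations, the log likelihood function is given by equation (14):
$$\ell = \sum_{t=1}^{T}\left[\ln\Gamma\left(\frac{v+1}{2}\right) - \ln\Gamma\left(\frac{v}{2}\right) - \frac{1}{2}\ln\left(\pi(v-2)\right) - \frac{1}{2}\ln\sigma_t^2 - \frac{v+1}{2}\ln\left(1 + \frac{\varepsilon_t^2}{\sigma_t^2(v-2)}\right)\right] \qquad (14)$$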
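The log likelihood under the standardized Student's t density can be evaluated directly once the shocks and conditional variances are available. The sketch below is illustrative only: the function name and the inputs are hypothetical, and it evaluates the likelihood for given values rather than maximising it.

```python
import math

def student_t_loglik(eps, sigma2, v):
    """Log likelihood of shocks eps with conditional variances sigma2
    under a standardized Student's t density with v > 2 degrees of freedom."""
    if v <= 2:
        raise ValueError("v must exceed 2 for the variance to exist")
    # Terms that do not depend on t.
    const = (math.lgamma((v + 1) / 2) - math.lgamma(v / 2)
             - 0.5 * math.log(math.pi * (v - 2)))
    total = 0.0
    for e, s2 in zip(eps, sigma2):
        total += (const - 0.5 * math.log(s2)
                  - 0.5 * (v + 1) * math.log(1 + e * e / (s2 * (v - 2))))
    return total

# Hypothetical shocks and conditional variances.
loglik = student_t_loglik([0.5, -1.0], [1.0, 2.0], v=8.0)
```

As v grows, the density approaches the normal density, so the value converges to the Gaussian log likelihood; small values of v correspond to fatter tails.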

Generalized Error Distribution
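The probability density function of the Generalized Error Distribution (GED) is given by
$$f(a_t) = \frac{v\exp\left(-\frac{1}{2}\left|\frac{a_t}{\lambda}\right|^{v}\right)}{\lambda\,2^{\left(1+\frac{1}{v}\right)}\,\Gamma\left(\frac{1}{v}\right)}, \qquad \lambda = \left[\frac{2^{-2/v}\,\Gamma\left(\frac{1}{v}\right)}{\Gamma\left(\frac{3}{v}\right)}\right]^{1/2},$$
where v > 0 is the shape parameter. The log likelihood function is given by
$$\ell = \sum_{t=1}^{T}\left[\ln\left(\frac{v}{\lambda}\right) - \frac{1}{2}\left|\frac{\varepsilon_t}{\sigma_t\lambda}\right|^{v} - \left(1 + \frac{1}{v}\right)\ln 2 - \ln\Gamma\left(\frac{1}{v}\right) - \frac{1}{2}\ln\sigma_t^2\right].$$
In order to maximize the log likelihood function with respect to the unknown parameters, the quasi-maximum likelihood estimator is used. This is the preferred methodology because it provides asymptotic standard errors that are valid under non-normality, is generally consistent and has a normal limiting distribution [18].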
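The GED log likelihood can be evaluated in the same way as the Student's t case. The following sketch uses hypothetical inputs and a hypothetical function name, and evaluates the likelihood at given values rather than maximising it.

```python
import math

def ged_loglik(eps, sigma2, v):
    """Log likelihood of shocks eps with conditional variances sigma2
    under a standardized Generalized Error Distribution with shape v > 0."""
    if v <= 0:
        raise ValueError("shape parameter v must be positive")
    # Scale constant that gives the density unit variance.
    lam = math.sqrt(2.0 ** (-2.0 / v) * math.gamma(1.0 / v) / math.gamma(3.0 / v))
    const = (math.log(v / lam) - (1.0 + 1.0 / v) * math.log(2.0)
             - math.lgamma(1.0 / v))
    total = 0.0
    for e, s2 in zip(eps, sigma2):
        z = e / math.sqrt(s2)  # standardized shock
        total += const - 0.5 * abs(z / lam) ** v - 0.5 * math.log(s2)
    return total

# Hypothetical shocks and conditional variances.
loglik = ged_loglik([0.5, -1.0], [1.0, 2.0], v=1.5)
```

Setting v = 2 recovers the normal distribution, which gives a convenient check on the implementation; values of v below 2 give fatter tails than the normal.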

Results and Discussion
The descriptive statistics of the variables considered in this study are presented in Table 1. We observe that the stock return series has a negative daily mean, suggesting that the index decreases slightly over time, while the mean daily trading volume is positive, implying that trading volume increases slightly with time. Both the stock returns and the trading volume are right skewed, implying that they have asymmetric distributions, as can be seen from the coefficients of skewness. The values of the skewness and the kurtosis differ from zero and three respectively, suggesting leptokurtic (fat-tailed) distributions. The series are therefore not normally distributed, which is confirmed by the Jarque-Bera (JB) tests at the 5% level of significance.
To test for stationarity of the time series, the Augmented Dickey-Fuller (ADF) test was considered. From Table 2, we observe that the computed test statistic for the daily price series is greater than the critical values. The null hypothesis of a unit root is therefore not rejected at the 5% level of significance and we conclude that the price series is not stationary, while $V_t$ and $R_t$ are stationary, as can also be seen from visual inspection of Figure (1b) and Figure (2b).
Table 3 shows the results of the Ljung-Box test for the stock return and trading volume series at lags 2, 4, 6 and 8. The null hypothesis of no autocorrelation is rejected at the 5% level of significance, indicating that there is serial correlation in the stock return and trading volume series. The detected autocorrelation can be removed from the data by fitting the simplest plausible ARMA (p, q) model.
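For readers who want to see what the normality and autocorrelation tests compute, the sketch below implements the textbook forms of the Jarque-Bera statistic and the Ljung-Box Q statistic. The function names and the toy series are illustrative; the statistics reported in the tables come from the authors' R analysis, not from this code.

```python
def jarque_bera(x):
    """JB = n/6 * (S^2 + (K - 3)^2 / 4), with S the sample skewness
    and K the sample kurtosis (both based on central moments)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)

def ljung_box_q(x, max_lag):
    """Q = n(n+2) * sum_{k=1}^{h} rho_k^2 / (n - k), where rho_k is the
    sample autocorrelation at lag k."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    denom = sum(v * v for v in d)
    acc = 0.0
    for k in range(1, max_lag + 1):
        rho_k = sum(d[t] * d[t - k] for t in range(k, n)) / denom
        acc += rho_k ** 2 / (n - k)
    return n * (n + 2) * acc

# Toy series with strong negative first-order autocorrelation (not real data).
toy = [1.0, -1.0] * 5
q1 = ljung_box_q(toy, max_lag=1)
jb = jarque_bera(toy)
```

Under the respective null hypotheses, JB is compared with a chi-square distribution with 2 degrees of freedom and Q with a chi-square distribution whose degrees of freedom equal the number of lags; large values lead to rejection.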

Estimated Mean Equation
In this study, an ARMA (p, q) model was used to fit the mean returns because it provides flexible and parsimonious approximations to the conditional mean dynamics. According to [19], the order of an ARMA (p, q) model may be deduced from the Autocorrelation Function (ACF) and the Partial Autocorrelation Function (PACF). On this basis, we suggest that the stock returns can be modeled by an AR (2) process. This is consistent with the results in Table 4, where the best fitting model is chosen as the one with the minimum AIC and BIC and the largest log-likelihood. Therefore, the ARMA (2, 0) model is selected as the mean equation. The test for ARCH effects on the residuals of the AR (2) model resulted in the rejection of the null hypothesis at the 5% level of significance, and the results of the tests considered are given in Table 5. The lack of fit can also be observed from the plot of the ACF and PACF in Figure 3. Therefore, the use of GARCH-type models is valid for modeling the stock return volatility.
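The selection rule described here can be illustrated with the standard definitions of the two information criteria. The log-likelihood values and parameter counts below are purely hypothetical placeholders and are not the values in Table 4; note also that some packages report these criteria divided by the sample size, which does not change the ranking.

```python
import math

def aic(loglik, n_params):
    """Akaike information criterion: -2*LL + 2*k."""
    return -2.0 * loglik + 2.0 * n_params

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: -2*LL + k*ln(n)."""
    return -2.0 * loglik + n_params * math.log(n_obs)

# Hypothetical candidates: name -> (log-likelihood, number of parameters).
candidates = {
    "AR(1)": (7000.0, 2),
    "AR(2)": (7010.0, 3),
    "ARMA(2,1)": (7010.5, 4),
}
n_obs = 2108

best_by_aic = min(candidates, key=lambda m: aic(*candidates[m]))
best_by_bic = min(candidates, key=lambda m: bic(*candidates[m], n_obs))
```

Both criteria reward a higher log-likelihood and penalise extra parameters, with BIC applying the heavier penalty for samples of this size.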

Estimated Volatility Models
The parameter estimates for the AR (2)-EGARCH (1,1) and AR (2)-TGARCH (1,1) models without and with trading volume, under the Student's t-distribution and the Generalized Error Distribution assumptions for the error term, are presented in Table 6 and Table 7 respectively. The p-values are given in parentheses.

AR (2)-EGARCH (1, 1) and AR (2)-TGARCH (1,1) Models without trading volume
In Table 6, the estimates of the AR (2) parameters, i.e. $\phi_1$ and $\phi_2$, are significant, which supports modeling the NSE stock returns with an AR (2) model. We observe that the GARCH term ($\beta_1$) is statistically significant for the AR (2)-EGARCH (1,1) with GED, whereas the ARCH term ($\alpha_1$) and the mean parameter ($\mu$) are not statistically significant at the 5% level of significance under either the Student's t-distribution or the GED. The parameter $\gamma_1$ is positive and significant for the AR (2)-EGARCH (1,1), suggesting the presence of the leverage effect under the GED and the Student's t-distribution. This implies that good news (positive shocks) increases volatility more than bad news (negative shocks) of the same magnitude. This finding agrees with earlier studies on the NSE by [14] and [20], who modeled daily and weekly returns respectively using GARCH-type models. The GARCH term ($\beta_1$) and the ARCH term ($\alpha_1$) are statistically significant for the AR (2)-TGARCH (1,1) at the 5% level of significance under the GED and the Student's t-distribution. The parameter $\gamma_1$ in the AR (2)-TGARCH (1,1) model is negative and not statistically significant at the 5% level of significance, suggesting that there is no asymmetry under either error term distribution assumption.

AR (2)-EGARCH (1, 1) and AR (2)-TGARCH (1,1) Models with trading volume
Table 7 gives the parameter estimates for the AR (2)-EGARCH (1,1) and AR (2)-TGARCH (1,1) with trading volume under the GED and the Student's t-distribution. The mean parameter ($\mu$) is not statistically significant at the 5% level of significance under any of the distribution assumptions. The estimates of the AR (2) parameters are statistically significant for the AR (2)-EGARCH (1,1) and AR (2)-TGARCH (1,1) under all distributional assumptions, whereas the ARCH term ($\alpha_1$) for the AR (2)-EGARCH (1,1) is not statistically significant at the 5% level of significance. In the AR (2)-EGARCH (1,1), the parameter $\gamma_1$ is positive and statistically significant at the 5% level of significance, implying the presence of asymmetry under the GED and the Student's t-distribution. This suggests that positive shocks (good news) result in a higher conditional variance than negative shocks (bad news) of similar magnitude, indicating that bad news has a lesser impact on volatility than good news. In the AR (2)-TGARCH (1,1), the parameter $\gamma_1$ is negative and not statistically significant at the 5% level of significance. The model therefore collapses to a GARCH (r, s) process in which good and bad news have the same impact on the stock return series, implying that there is no asymmetry under either error term distribution assumption.
We observe that in both the AR (2)-EGARCH (1,1) and AR (2)-TGARCH (1,1) models, the value of the parameter $\gamma_1$ decreases slightly with the inclusion of trading volume, implying that trading volume leads to less asymmetric volatility in the market. The parameter $\delta$ is positive but not statistically significant; hence, we deduce that volatility in the NSE cannot be explained by trading volume. Trading volume, the proxy for information flow considered in this study, therefore appears to be a poor source of heteroskedasticity in the NSE stock returns. This differs from the findings of [15], as volatility persistence remains high, but is consistent with the results of [21] and [22]. The positive sign of $\delta$ indicates a positive relationship between stock return volatility and trading volume. This result is consistent with the findings of [23], who examined the relationship between daily stock returns and trading volume in the NSE using regression analysis. The degree of persistence in the conditional variance equations increased slightly with the inclusion of trading volume in all the models considered, regardless of the error term distribution assumption. The persistence implies that today's volatility shocks have an impact on expected future volatility. The presence of the leverage effect in the NSE stock returns is also confirmed by the AR (2)-EGARCH (1,1) process for innovations with the Student's t-distribution and the GED.
Model adequacy was evaluated using the Ljung-Box Q (9) and $Q^2$ (9) statistics for the residuals and squared residuals respectively, and the ARCH (7) test, for all the models under all the error term distribution assumptions. The null hypotheses of no significant autocorrelation and no ARCH effects are not rejected in any of the cases, implying that the fitted models are adequate. Finally, based on the values of the AIC, BIC and log-likelihood (LL), the more suitable model for capturing the main features of the stock returns, such as volatility clustering, return volatility and the leverage effect, is the AR (2)-EGARCH (1,1) model without trading volume.

Conclusion
In this study, we modeled the effects of trading volume on stock return volatility using the AR (2)-EGARCH (1,1) and AR (2)-TGARCH (1,1) models under the Student's t-distribution and the GED. We observe that including trading volume in the AR (2)-EGARCH (1,1) and AR (2)-TGARCH (1,1) models slightly decreases the value of the parameter $\gamma_1$, implying that trading volume leads to less asymmetric volatility in the market. The parameter $\delta$ is positive but not statistically significant; hence, we conclude that trading volume does not explain volatility in the NSE. The degree of persistence increased slightly with the inclusion of trading volume in the conditional variance equations of all the models considered, regardless of the error term distribution assumption. The persistence implies that volatility shocks today will influence the expectation of volatility many periods into the future. Therefore, trading volume as a proxy for information flow can generally be considered a poor source of volatility in the stock returns. This result agrees with the findings in [24] and [25]. The AR (2)-EGARCH (1,1) model without trading volume is suggested as the more suitable model for capturing the main features of the NSE returns, such as volatility clustering, return volatility and the leverage effect.