Stochastic DEA with a Perfect Object and Its Application to Analysis of Environmental Efficiency

As follows from formula (6), given the joint distribution of relative inputs and outputs, the probability distribution of the efficiency score can be found explicitly. The underlying theory and corresponding formulas can be found in ^[16]. From formula (6), it also follows that the efficiency score of each DMU_k, k = 1,…,N, is determined by the product of the maximum normalized input and the maximum normalized output. This observation allows for the introduction of stochasticity into DEA PO that is not related to inaccuracy in the measurement of inputs or outputs. Instead, stochastic DEA PO considers the normalized actual DMUs as occurrences of identically distributed random DMUs. By doing so, it converts the group of DMUs under consideration into a statistical sample for which the probability distribution of the efficiency score can be estimated. The mathematical expectation of the group's efficiency score is referred to below as the common efficiency E_c:

$$E_c = M(E) \tag{7}$$

where E is the random efficiency score of the group and the symbol M stands for the mathematical expectation. The common efficiency is the efficiency of the whole group, and it serves as a benchmark for evaluating the partial efficiencies of the group's individual members.

Common efficiency shows how well the group is performing overall. Its use in the group-oriented approach to efficiency estimation is practical in some applications, such as a comparative study of national and regional environmental efficiency. In this case it is useful to evaluate how well a group of economies meets environmental standards, as well as efficiency variation among its members. For every particular DMU_k, the difference between individual efficiency and common efficiency is referred to as partial efficiency E_pk:

$$E_{pk} = E_k - E_c \tag{8}$$

As follows from formula (8), partial efficiency E_pk lies in the interval [-1,1]. It shows how well an individual DMU_k performs relative to the group as a whole. A negative value of partial efficiency E_pk means that the corresponding DMU performs worse than the group at large and requires special attention from the regulatory body.

It should be noted that while conventional DEA efficiency scores are bounded to the interval [0,1], publications ^[1] and ^[13] introduced “super efficiency”, which allows a score to be greater than 1. Similarly, the negative values of partial efficiency introduced in this paper extend the interval of feasible efficiency scores to values below zero.

Introduction of common and partial efficiency leads to additive decomposition of the total efficiency of the DMU_k as follows:

$$E_k = E_c + E_{pk} \tag{9}$$

where E_k, E_c, and E_pk stand for the total efficiency score and its common and partial components for DMU_k, respectively.
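The decomposition can be checked with a few lines of arithmetic. The sketch below is an illustration in Python (the paper's own program is written in R); it takes the total efficiency scores and the common efficiency value quoted in the application section of this paper and applies formulas (8) and (9). The function and variable names are assumptions made for this example.

```python
# Total efficiency scores (percent) quoted in the application section.
total_efficiency = {
    "United States": 71.50,
    "Canada": 82.17,
    "OECD Europe": 56.11,
    "Australia/New Zealand": 58.57,
}
COMMON_EFFICIENCY = 46.43  # E_c in percent, as reported in the paper


def partial_efficiency(total, common):
    """Formula (8): partial efficiency is total minus common efficiency."""
    return total - common


partial = {
    name: round(partial_efficiency(e_k, COMMON_EFFICIENCY), 2)
    for name, e_k in total_efficiency.items()
}

for name, e_pk in partial.items():
    # Formula (9): common and partial components add back up to the total.
    assert abs(COMMON_EFFICIENCY + e_pk - total_efficiency[name]) < 1e-9
    print(f"{name}: E_k = {total_efficiency[name]:.2f}%, E_pk = {e_pk:+.2f}%")
```

A positive E_pk marks a DMU that is expected to do better than the group as a whole; a negative one marks a DMU that falls below the group benchmark.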

With inputs and outputs normalized to the interval [0,1] and considered as occurrences of identically distributed random objects, the Beta distribution is a convenient choice for fitting their distributions, see ^[8]. The support of the Beta distribution is the interval [0,1], the same as that of the DEA PO normalized inputs and outputs. It has two parameters, α and β, that control the shape of the probability density function (pdf). Depending on the values of α and β, the Beta pdf may be increasing, decreasing, bell-shaped, U-shaped, or horizontal. The statistical computing language R, which is available on the internet for free download ^[22], provides a procedure for estimating the parameters of the Beta distribution.

Recall that the pdf of the Beta distribution is:

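$$f(x;\alpha,\beta) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{\mathrm{B}(\alpha,\beta)}, \quad 0 \le x \le 1 \tag{10}$$

where Β(α,β) is the Beta function

$$\mathrm{B}(\alpha,\beta) = \int_0^1 t^{\alpha-1}(1-t)^{\beta-1}\,dt \tag{11}$$

The mathematical expectation and variance of the Beta distribution are as follows:

$$\mu = \frac{\alpha}{\alpha+\beta}, \qquad \sigma^2 = \frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)} \tag{12}$$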
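As a quick numerical illustration of these standard Beta-distribution facts, the following Python sketch evaluates the density, computes the Beta function through the identity Β(α,β) = Γ(α)Γ(β)/Γ(α+β), and compares the closed-form mean and variance with values obtained by numerical integration. The shape parameters are arbitrary illustrative values, not those of Table 4, and all function names are assumptions of this example.

```python
import math


def beta_function(a, b):
    """Beta function via the identity B(a, b) = Gamma(a)Gamma(b)/Gamma(a+b)."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))


def beta_pdf(x, a, b):
    """Beta(a, b) density; taken as zero outside the open interval (0, 1)."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return x ** (a - 1) * (1 - x) ** (b - 1) / beta_function(a, b)


def beta_mean(a, b):
    return a / (a + b)


def beta_variance(a, b):
    return a * b / ((a + b) ** 2 * (a + b + 1))


def integrate(func, lo=0.0, hi=1.0, n=20000):
    """Midpoint rule; it never evaluates the integrand at the endpoints."""
    h = (hi - lo) / n
    return h * sum(func(lo + (i + 0.5) * h) for i in range(n))


a, b = 2.0, 5.0  # illustrative shape parameters, not taken from Table 4
mean_num = integrate(lambda x: x * beta_pdf(x, a, b))
var_num = integrate(lambda x: (x - mean_num) ** 2 * beta_pdf(x, a, b))

print("mean:    ", mean_num, "closed form:", beta_mean(a, b))
print("variance:", var_num, "closed form:", beta_variance(a, b))
```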

A cumulative distribution function (cdf) of the maximum normalized inputs and outputs that appear in the formula (6) can be found using the formulas for the probability distribution of order statistics, see ^[16]. Let ζ = <ζ₁,ζ₂,…,ζ_p> be a random vector with cdf

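$$F_\zeta(x_1, x_2, \ldots, x_p) = P(\zeta_1 \le x_1, \zeta_2 \le x_2, \ldots, \zeta_p \le x_p) \tag{13}$$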

and η = max(ζ_q,q=1,…,p). Then the cdf of the random variable η is:

$$F_\eta(x) = P(\zeta_1 \le x, \ldots, \zeta_p \le x) = F_\zeta(x, x, \ldots, x) \tag{14}$$

The pdf of the random variable η, denoted as f_η(x), can be found by partial differentiation of the cdf F_ζ with respect to x₁,x₂,…,x_p and subsequent substitution of x for every x_q, q = 1,…,p:

$$f_\eta(x) = \sum_{q=1}^{p} F_{q\zeta}(x, x, \ldots, x) \tag{15}$$

where F_qζ(x,x,…,x) stands for the partial derivative of F_ζ(x₁,x₂,…,x_p) with respect to its q-th argument, evaluated at x₁ = x₂ = … = x_p = x, q = 1,…,p.

In case of independent random variables ζ₁,ζ₂,…,ζ_p,

$$F_\zeta(x_1, x_2, \ldots, x_p) = \prod_{q=1}^{p} F_{\zeta_q}(x_q) \tag{16}$$

so that:

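$$f_\eta(x) = \sum_{q=1}^{p} f_{\zeta_q}(x) \prod_{r \ne q} F_{\zeta_r}(x) \tag{17}$$

where f_ζq(x) and F_ζq(x) are the pdf and the cdf of the random variable ζ_q. In the case of two inputs and two outputs considered in the next section, this formula becomes: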

$$f_\eta(x) = f_{\zeta_1}(x) F_{\zeta_2}(x) + f_{\zeta_2}(x) F_{\zeta_1}(x) \tag{18}$$

For three inputs or outputs, the formula is:

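$$f_\eta(x) = f_{\zeta_1}(x) F_{\zeta_2}(x) F_{\zeta_3}(x) + f_{\zeta_2}(x) F_{\zeta_1}(x) F_{\zeta_3}(x) + f_{\zeta_3}(x) F_{\zeta_1}(x) F_{\zeta_2}(x) \tag{19}$$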

It may be noted that the number of additive terms in formulas (16) through (19) grows linearly with the number of inputs or outputs, and their computation requires just minor changes in the computer program.
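The structure of the pdf of the maximum for independent components (one additive term per component, each being that component's pdf multiplied by the cdfs of all the others) can be sketched in Python as follows. This is an illustration under stated assumptions, not the program from the Appendix: the Beta cdf is obtained by simple midpoint integration, the shape parameters are arbitrary, and all names are hypothetical.

```python
import math


def beta_pdf(x, a, b):
    """Beta(a, b) density; taken as zero outside the open interval (0, 1)."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    log_norm = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return math.exp((a - 1) * math.log(x) + (b - 1) * math.log(1 - x) - log_norm)


def beta_cdf(x, a, b, n=400):
    """Beta(a, b) cdf by midpoint-rule integration of the density."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    h = x / n
    return h * sum(beta_pdf((i + 0.5) * h, a, b) for i in range(n))


def max_pdf(x, params):
    """Density of the maximum of independent Beta components.

    params is a list of (alpha, beta) pairs, one per component. Each
    component adds one term: its own pdf times the cdfs of the others.
    """
    pdfs = [beta_pdf(x, a, b) for a, b in params]
    cdfs = [beta_cdf(x, a, b) for a, b in params]
    total = 0.0
    for q, f_q in enumerate(pdfs):
        others = math.prod(F for r, F in enumerate(cdfs) if r != q)
        total += f_q * others
    return total


def integrate(func, n=200):
    """Midpoint rule on [0, 1]."""
    h = 1.0 / n
    return h * sum(func((i + 0.5) * h) for i in range(n))


two = [(2.0, 3.0), (4.0, 2.0)]   # illustrative parameters only
three = two + [(3.0, 3.0)]       # a third component adds a third term

# A valid density should integrate to (approximately) one.
print(integrate(lambda x: max_pdf(x, two)))
print(integrate(lambda x: max_pdf(x, three)))
```

Moving from two to three components changes only the parameter list, which mirrors the remark that extra inputs or outputs require only minor changes to the program.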

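In addition to the pdf of the maximum value, formula (6) also requires computation of the pdf of the product of two random variables having support [0,1]. Designating the pdf of the joint distribution of the factors as g(x,y) and the pdf of their product as h(z), we get ^[16]

$$h(z) = \int_z^1 g\left(x, \frac{z}{x}\right) \frac{1}{x}\,dx, \quad 0 < z \le 1 \tag{20}$$

that simplifies to

$$h(z) = \int_z^1 g_1(x)\, g_2\left(\frac{z}{x}\right) \frac{1}{x}\,dx \tag{21}$$

for two independent random variables with pdfs g₁ and g₂.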
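For independent factors on [0,1], the pdf of the product is the standard integral of g₁(x)·g₂(z/x)/x over x from z to 1, where g₁ and g₂ denote the pdfs of the two factors. The Python sketch below evaluates it with the midpoint rule and checks it in two ways: for two uniform factors the result should match the known density −ln z, and for two Beta factors the mean of the product should match the product of the means. The parameters and names are illustrative assumptions only.

```python
import math


def beta_pdf(x, a, b):
    """Beta(a, b) density; taken as zero outside the open interval (0, 1)."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    log_norm = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return math.exp((a - 1) * math.log(x) + (b - 1) * math.log(1 - x) - log_norm)


def uniform_pdf(x):
    """Density of the uniform distribution on [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def product_pdf(z, g1, g2, n=400):
    """pdf at z of the product of two independent variables on [0, 1].

    Midpoint-rule evaluation of the integral of g1(x) * g2(z / x) / x
    over x in [z, 1].
    """
    if z <= 0.0 or z >= 1.0:
        return 0.0
    h = (1.0 - z) / n
    total = 0.0
    for i in range(n):
        x = z + (i + 0.5) * h
        total += g1(x) * g2(z / x) / x
    return total * h


# Check 1: the product of two uniform variables has density -ln(z).
z = 0.25
print(product_pdf(z, uniform_pdf, uniform_pdf), -math.log(z))


# Check 2: for independent factors, the mean of the product equals the
# product of the means (here 2/5 and 4/6 for Beta(2, 3) and Beta(4, 2)).
def g1(x):
    return beta_pdf(x, 2.0, 3.0)


def g2(x):
    return beta_pdf(x, 4.0, 2.0)


n_outer = 200
mean_product = sum(
    zz * product_pdf(zz, g1, g2)
    for zz in ((i + 0.5) / n_outer for i in range(n_outer))
) / n_outer
print(mean_product, (2.0 / 5.0) * (4.0 / 6.0))
```

The nested numerical integration in the second check hints at why the direct route through the product pdf can become computationally awkward in practice.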

Publication ^[9] emphasizes computational problems that may arise when formulas (20) or (21) are used in practice. However, these problems may be avoided if inputs and outputs are independent. In this case, the mathematical expectation of the product of the maxima of normalized inputs and outputs equals the product of their mathematical expectations:

$$E_c = E_{output} \cdot E_{input} \tag{22}$$

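where

$$E_{output} = \int_0^1 x\, f_{output}(x)\,dx, \qquad E_{input} = \int_0^1 x\, f_{input}(x)\,dx \tag{23}$$

and the pdfs f_output and f_input of the maximum normalized output and input are calculated using formula (18) for two elements, formula (19) for three, and formula (17) for a greater number of elements. Formula (22) holds for any number of inputs and outputs.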

Formulas (17), (18), (19), (22) and (23) shed light on how SDEA PO operates and on its ability to fix a problem of conventional DEA revealed in the literature. As noted in the literature (see ^[1], for example), DEA may assign an efficiency score of 1 (fully efficient) to an object with a single very large output or a very small input. Similarly, DEA PO may assign a high efficiency score to an object having just one very large output and one very small input. These observations may undermine the ability of DEA to assign ranks objectively. In contrast to the deterministic versions, stochastic DEA PO blends all of the indicators of all objects when producing the common efficiency score E_c. By doing so, it removes the opportunity for any single input or output to have a decisive impact.

The steps performed by SDEA PO on the way to finding the common efficiency E_c are as follows. First, it blends inputs and outputs into two random vectors, with each component corresponding to one input or output, respectively. Next, it fits the Beta distribution to the sample and uses the distributions of inputs and outputs to obtain the pdfs of the maximum values of relative inputs and outputs, f_input and f_output, as shown by formulas (17), (18) and (19). Then the mathematical expectations E_output and E_input, as given by formulas (23), are produced. Finally, the common efficiency score E_c is obtained as the product of the two mathematical expectations, see formula (22), which is a stochastic equivalent of formula (6). As a result, all inputs and outputs, rather than the maxima only, are included in the common efficiency score E_c.
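The steps above can be sketched end to end in Python. This is an illustration only, not the author's R program. The Beta parameters are fitted here by the method of moments, a deliberately simple stand-in for the fitting procedure used in the paper. The samples are hypothetical numbers invented for the example, and all function names are assumptions. The expectation of the maximum is computed on a grid as the integral of x times the pdf of the maximum, and the common efficiency is the product of the two expectations.

```python
import math


def fit_beta_moments(sample):
    """Method-of-moments Beta fit (a stand-in for the paper's fitting step)."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)
    common = mean * (1 - mean) / var - 1
    return mean * common, (1 - mean) * common


def beta_pdf(x, a, b):
    """Beta(a, b) density for x strictly inside (0, 1)."""
    log_norm = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return math.exp((a - 1) * math.log(x) + (b - 1) * math.log(1 - x) - log_norm)


def expected_max(params, n=4000):
    """Expectation of the maximum of independent Beta variables.

    Integrates x * f_max(x) over a midpoint grid, where f_max is the sum
    over components of (own pdf) * (product of the other components' cdfs).
    """
    h = 1.0 / n
    cdfs = [0.0] * len(params)  # probability mass accumulated so far
    expectation = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        pdfs = [beta_pdf(x, a, b) for a, b in params]
        # cdf at the cell midpoint: earlier cells plus half of this cell
        mids = [F + 0.5 * f * h for F, f in zip(cdfs, pdfs)]
        f_max = sum(
            f * math.prod(m for r, m in enumerate(mids) if r != q)
            for q, f in enumerate(pdfs)
        )
        expectation += x * f_max * h
        cdfs = [F + f * h for F, f in zip(cdfs, pdfs)]
    return expectation


def common_efficiency(output_params, input_params):
    """Product of the expected maximum output and expected maximum input."""
    return expected_max(output_params) * expected_max(input_params)


# Hypothetical normalized samples (two outputs, two inputs), NOT Table 2 data.
out1 = [0.95, 0.40, 0.22, 0.61, 0.35, 0.18, 0.52, 0.30]
out2 = [0.17, 0.75, 0.33, 0.22, 0.48, 0.36, 0.64, 0.29]
in1 = [0.65, 0.42, 0.88, 0.55, 0.73, 0.37, 0.60, 0.81]
in2 = [0.72, 0.58, 0.91, 0.45, 0.66, 0.79, 0.50, 0.84]

output_params = [fit_beta_moments(s) for s in (out1, out2)]
input_params = [fit_beta_moments(s) for s in (in1, in2)]
e_c = common_efficiency(output_params, input_params)
print(f"common efficiency for the hypothetical sample: {e_c:.4f}")
```

Because the fitting method and the data differ from those of the paper, the number printed here is not comparable with the common efficiency reported in the application section.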

It may be noted that computationally SDEA PO is robust with regard to the numbers of inputs and outputs. Each additional input or output just adds one more additive term in the structure of formula (17) and one multiplicative term to each addend.

3. Example of Applications

This section presents an example of application of the SDEA PO to the prospective comparative analysis of environmental efficiency of major national and regional economies. While conventional DEA is widely used in environmental performance research, see ^[20], for example, SDEA PO extends the opportunities in this arena. In this section, we analyze how well a group of major national and regional economies is expected to perform from the environmental point of view, and which of them need improvement. In calculations, we use prospective data for 2030, available on the website of the Energy Information Administration of the United States ^[21]. The website provides prospective data for GDP, population, total primary energy consumption, and carbon dioxide emissions. The data on the area were collected from the website of the United Nations Statistics Division ^[23].

Efficiency scores obtained by using DEA PO or SDEA PO are independent of the units of measurement. For convenience, we transformed the quantitative indicators into percentages of the world total, as shown in Table 1. The selected group of objects includes major national and regional economies that cover 80% to 94% of the total for each quantitative indicator. At the next step, the data in Table 1 were further transformed into the ratios presented in Table 2, columns (2) to (5), similar to those suggested in ^[14].

In general, given five quantitative indicators: Gross Domestic Product (G), Population (P), Area (A), Primary Energy Consumption (R), and Equivalent CO₂ Emissions (C), there are 20 ratios available for inclusion in the DEA model, as shown in Table 3. However, most of these ratios are functionally dependent. Thus, the ratios that are symmetrical about the main diagonal of Table 3 are inverses of each other. For example, R/C = 1/(C/R). Also, any three ratios that form an L-shape in Table 3 are interconnected as well. For instance, (P/R)(A/P) = (A/R). It may be shown that with five quantitative indicators, there are only four functionally independent ratios.

In this paper, we used functionally independent ratios with clear economic interpretations, referred to below as environmental ratios. Among them are the energy intensity of GDP (R/G) and the emissions intensity of energy (C/R), ratios typical for environmental studies. They were used as DEA inputs. For outputs, we used GDP per capita (G/P), a commonly used economic indicator, and the ratio of the area to CO₂-equivalent emissions (A/C). The last ratio can be interpreted as the atmospheric clearness of a country or region. By choosing it, we stress the responsibility of each country or region for the atmospheric quality within its borders.

Functional independence of the selected ratios can be proven by using the Jacobian matrix ^[12]. To prove it, we consider these ratios as functions of the quantitative indicators C, A, R, P, and G, that is, h₁ = G/P, h₂ = A/C, h₃ = R/G, and h₄ = C/R. With straightforward calculations, it can be shown that the Jacobian matrix has rank 4, so that the functions h_i, i = 1,…,4, and therefore the corresponding environmental ratios, are functionally independent.
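The rank claim can be checked numerically. The Python sketch below writes out the partial derivatives of h₁ to h₄ with respect to (G, P, A, R, C) at one arbitrarily chosen positive point and computes the rank with exact rational arithmetic. A check at a single point is an illustration rather than a proof, and the test point and helper names are assumptions of this example. Appending a fifth ratio, G/C, which equals 1/((R/G)(C/R)), leaves the rank at 4.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank by Gaussian elimination; exact when entries are Fractions."""
    m = [list(r) for r in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rank, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for i in range(rank + 1, len(m)):
            factor = m[i][col] / m[rank][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[rank])]
        rank += 1
        if rank == len(m):
            break
    return rank


# Arbitrary positive test point (an assumption, not data from the paper).
G, P, A, R, C = (Fraction(v) for v in (2, 3, 5, 7, 11))
zero = Fraction(0)

# Columns are the partial derivatives with respect to G, P, A, R, C.
jacobian = [
    [1 / P, -G / P**2, zero, zero, zero],   # h1 = G/P
    [zero, zero, 1 / C, zero, -A / C**2],   # h2 = A/C
    [-R / G**2, zero, zero, 1 / G, zero],   # h3 = R/G
    [zero, zero, zero, -C / R**2, 1 / R],   # h4 = C/R
]
print("rank of the four selected ratios:", matrix_rank(jacobian))

# G/C is a function of h3 and h4, so adding it cannot raise the rank.
extra_ratio = [1 / C, zero, zero, zero, -G / C**2]  # G/C
print("rank with G/C appended:", matrix_rank(jacobian + [extra_ratio]))
```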

Columns (6) to (9) in Table 2 present normalized environmental ratios obtained as the ratios of the i-th output to the maximum output, and of the minimum input to the j-th input, respectively. They were used further as DEA PO inputs and outputs. For example, for the United States, the normalized values of inputs and outputs are as follows: output₁ = 3.6723/3.6723 = 1.0000, output₂ = 0.4141/5.8420 = 0.0709, input₁ = 0.6643/1.0174 = 0.6529, and input₂ = 0.6634/0.9278 = 0.7150. Maxima of the two normalized outputs and inputs are given in columns (10) and (11), while column (12) presents their product expressed in percent. For instance, for the U.S., max(1.0000, 0.0709) = 1.0000 and max(0.6529, 0.7150) = 0.7150. The product is 1.0000 × 0.7150 = 0.7150, or 71.50%, which is the total efficiency score calculated by formula (6) and shown in column (12) of Table 2 in the row corresponding to the U.S.
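The U.S. calculation can be reproduced with a short Python sketch. It uses only the numbers quoted in the paragraph above; the function name and the argument layout are assumptions made for this illustration.

```python
def total_efficiency(raw_outputs, max_outputs, raw_inputs, min_inputs):
    """Normalize ratios and return (outputs, inputs, efficiency score).

    Outputs are divided by the group maximum; inputs are normalized as
    group minimum divided by the DMU's own value. The score is the product
    of the largest normalized output and the largest normalized input.
    """
    norm_out = [value / top for value, top in zip(raw_outputs, max_outputs)]
    norm_in = [low / value for value, low in zip(raw_inputs, min_inputs)]
    return norm_out, norm_in, max(norm_out) * max(norm_in)


# Values for the United States as quoted in the text.
norm_out, norm_in, score = total_efficiency(
    raw_outputs=[3.6723, 0.4141],
    max_outputs=[3.6723, 5.8420],
    raw_inputs=[1.0174, 0.9278],
    min_inputs=[0.6643, 0.6634],
)

print("normalized outputs:", [round(v, 4) for v in norm_out])
print("normalized inputs: ", [round(v, 4) for v in norm_in])
print(f"total efficiency:   {100 * score:.2f}%")
```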

Table 1. Projections for 2030, % of the World Total

Table 2. Prospected ratios and efficiency

Table 3. Ratio matrix

For the analysis, we used SDEA PO to separate total efficiency into its additive components, common and partial efficiency, as given by formula (9). The common component E_c was calculated with SDEA PO using a program written by the author in the R language (version 2.13.0) and provided in the Appendix. Methodologically, E_c corresponds to the worldwide environmental efficiency level obtained by weighing CO₂-equivalent emissions against economic development, technological progress, energy consumption, area, and population. It was calculated by fitting the Beta distribution to the normalized environmental ratios, then using formula (18) to find the probability distributions of the corresponding maxima, and finally using formulas (23) and (22) to calculate the mathematical expectations and the common efficiency, respectively.

While the R procedure “fitdist” with default parameters (maximum likelihood method) gave the best results in simulations, it failed to fit the actual data. Because of that, we used the program parameters providing maximum goodness-of-fit estimation (MGE) with the Cramér–von Mises (CvM) distance. Table 4 presents the resulting values of the parameters α and β of the Beta distribution. These values were used in the pdf functions g₁ and g₂ given by formula (10). By doing so, we obtained the common efficiency score E_c = 46.43%. This value was used in column (13) of Table 2 to calculate partial efficiency scores as the difference between total and common efficiency, see formula (8).

Table 4. Beta-distribution parameters, 2030

Figure 2 shows total and partial efficiency graphically. As follows from Table 2 and Figure 2, the U.S. and Canada are expected to have high values of total efficiency (71.50% and 82.17%, respectively) and positive values of partial efficiency (25.07% and 35.74%, respectively). For the OECD countries of Europe and the Australia/New Zealand region, the expected total efficiency is considerably lower, 56.11% and 58.57%, respectively, though partial efficiency is still positive: 9.68% and 12.14%, respectively. Among the BRIC countries (Brazil, Russia, India, and China), which are conventionally considered the leading force of future economic development, only Brazil is expected to have positive partial efficiency, 9.05%. The other three have negative values: Russia -4.04%, China -29.81%, and India -35.61%. The Middle East is another region with a large negative partial efficiency score, -27.57%.

The results obtained may be used to develop recommendations on economic restructuring that would lead to better environmental performance worldwide.

Figure 2. Total and partial efficiency. Common efficiency = 46.43%
