**American Journal of Applied Mathematics and Statistics**

## Application Bayesian Approach in Tasks Decision-Making Support

Department of Information Economic and Technology, Azerbaijan State Economic University, Baku, Azerbaijan


### Abstract

This work is devoted to the Bayesian approach, a probabilistic method that uses prior information about the process under study together with the original statistical data. The advantages of the Bayesian approach over classical methods in the accuracy of statistical inference are especially significant for relatively small samples, which are typical of simulation. A theorem on minimizing the classification error probability is proved. The choice of a priori distribution parameters from the coordinated (conjugate) class of distributions is considered.

**Keywords:** a priori distribution, a posteriori distribution, a coordinated distribution, Bayesian approach

**Copyright**© 2015 Science and Education Publishing. All Rights Reserved.

### Cite this article:

- Aliyeva Tarana Abulfaz. Application Bayesian Approach in Tasks Decision-Making Support.
*American Journal of Applied Mathematics and Statistics*. Vol. 3, No. 6, 2015, pp 257-262. http://pubs.sciepub.com/ajams/3/6/7



### 1. Introduction

Many statistical problems, regardless of the methods used to solve them, share a common property: before a specific data set is obtained, several probabilistic models are considered as potentially acceptable descriptions of the situation under study. Once the data are received, knowledge about the relative acceptability of these models can be expressed in some form. One way to revise the relative acceptability of probabilistic models is the Bayesian approach, which is based on Bayes' theorem.

The Bayesian approach has a number of advantages that make it attractive for widespread application. Its main difference from other approaches is that, before the data are received, the decision-maker or statistician assesses the degree of confidence in each possible model and expresses it in the form of probabilities.

Once the data are received, Bayes' theorem allows us to calculate a new set of probabilities, which represent the revised degrees of confidence in the possible models, taking into account the new information provided by the data.

As is known, statistical data are often absent in actual decision-making support tasks, which makes the use of many traditional frequentist approaches unjustified. The available information may contain only subjective assessments in the form of expert evaluations and judgments. Moreover, the situation in which the decision is made may be entirely new and never previously analyzed. These characteristics complicate the decision-making support process and may call into question any conclusions and opinions. In such a situation, the Bayesian approach ^{[1]} is very effective.

### 2. Experimental Methods

The Bayesian approach ^{[2]} is based on the statistical nature of the observations. Statistical decision-making uses two types of information simultaneously as input: a priori information (available before the statistical data are produced) and the information contained in the initial statistical data. The a priori information is given in the form of an a priori probability distribution of the analyzed unknown parameter $\theta$, which describes the degree of confidence that the parameter will take a particular value before the initial statistics are collected. To estimate the unknown parameter of the system means to find its a posteriori distribution. The idea of the Bayesian approach lies precisely in the transition from the a priori distribution to the a posteriori distribution.

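Let $A_1, A_2, \dots, A_n$ be a complete group of mutually exclusive events: $\bigcup_{i=1}^{n} A_i = \Omega$ and $A_i \cap A_j = \emptyset$ when $i \neq j$. Then the posterior probability is:

$$P(A_i \mid B) = \frac{P(A_i)\,P(B \mid A_i)}{\sum_{j=1}^{n} P(A_j)\,P(B \mid A_j)} \qquad (1)$$

where $P(A_i)$ is the a priori probability of the event $A_i$, $P(B \mid A_i)$ is the conditional probability of the event $B$ given that the event $A_i$ occurred, and the event $B$ has a non-zero probability ($P(B) > 0$).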
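Bayes' formula for a finite group of mutually exclusive events can be sketched in a few lines of Python. The function name `bayes_posterior` and the numerical values are illustrative assumptions, not taken from the article.

```python
def bayes_posterior(priors, likelihoods):
    """Posterior probabilities P(A_i | B) from priors P(A_i) and likelihoods P(B | A_i)."""
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    # Joint probabilities P(A_i) * P(B | A_i)
    joint = [p * l for p, l in zip(priors, likelihoods)]
    # Total probability P(B); it must be non-zero for the formula to apply
    evidence = sum(joint)
    if evidence == 0:
        raise ValueError("the event B has zero probability")
    return [j / evidence for j in joint]


# Hypothetical example: three hypotheses with assumed priors and likelihoods
example_priors = [0.5, 0.3, 0.2]
example_likelihoods = [0.1, 0.4, 0.7]
example_posterior = bayes_posterior(example_priors, example_likelihoods)
print(example_posterior)
```

The posterior probabilities always sum to one, because the denominator is the total probability of the observed event.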

Initially, Bayesian classification was used to formalize expert knowledge in expert systems; it is now also used as a Data Mining method.

The Bayesian approach consists of the following steps:

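1. Determination of the a priori distribution $p(\theta)$ of the desired multi-dimensional parameter $\theta$;

2. Obtaining the initial statistical data $x_1, \dots, x_n$, whose distribution law $f(x \mid \theta)$ is taken at a fixed value of the parameter;

3. Calculation of the likelihood function, defined by the relation

$$L(x_1, \dots, x_n \mid \theta) = \prod_{i=1}^{n} f(x_i \mid \theta) \qquad (2)$$

4. Calculation of the a posteriori distribution $\tilde{p}(\theta \mid x_1, \dots, x_n)$, determined by the relation

$$\tilde{p}(\theta \mid x_1, \dots, x_n) = \frac{p(\theta)\,L(x_1, \dots, x_n \mid \theta)}{\int p(\theta)\,L(x_1, \dots, x_n \mid \theta)\,d\theta} \qquad (3)$$

5. Construction of Bayesian point and interval estimates of the parameters using the mean or modal value of the distribution found from formula (3); for the mean value,

$$\hat{\theta} = \mathbf{E}(\theta \mid x_1, \dots, x_n) = \int \theta\,\tilde{p}(\theta \mid x_1, \dots, x_n)\,d\theta \qquad (4)$$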
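The five steps above can be illustrated numerically on a grid of parameter values. This is a minimal sketch, not the article's procedure: the exponential density, the flat prior, the sample, and the names `grid_posterior`, `exponential_density`, and `flat_prior` are all assumptions chosen for illustration.

```python
import math


def grid_posterior(data, prior_density, density, grid):
    """Approximate the posterior density and its mean on an equally spaced grid."""
    step = grid[1] - grid[0]
    # Step 3: log-likelihood, the sum of log f(x_i | theta) over the sample
    log_lik = [sum(math.log(density(x, theta)) for x in data) for theta in grid]
    shift = max(log_lik)  # subtracting the maximum avoids numerical underflow
    # Step 4: prior times likelihood, then normalization by a Riemann sum
    unnormalized = [prior_density(t) * math.exp(ll - shift)
                    for t, ll in zip(grid, log_lik)]
    norm = sum(unnormalized) * step
    posterior = [u / norm for u in unnormalized]
    # Step 5: Bayesian point estimate as the posterior mean
    mean = sum(t * p for t, p in zip(grid, posterior)) * step
    return posterior, mean


def exponential_density(x, theta):
    """Assumed model: exponential density with rate theta."""
    return theta * math.exp(-theta * x)


def flat_prior(theta):
    """Step 1 (assumed): constant prior density on the grid."""
    return 1.0


grid = [0.001 * k for k in range(1, 20001)]   # theta in (0, 20]
sample = [0.8, 1.1, 0.4, 2.0, 0.7]            # Step 2: hypothetical observations
posterior_density, posterior_mean = grid_posterior(
    sample, flat_prior, exponential_density, grid)
print(posterior_mean)
```

With this flat prior and five observations summing to 5, the exact posterior is a gamma distribution with mean 6/5, so the grid estimate should be close to 1.2.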

**2.1. Experimental Set Up**

In many decision-making tasks, the a priori probabilities of the states of nature can change after new expert assessments are obtained, or after observing relevant events that are related to the states and confirm or refute the a priori information.

The dependence of the posterior probabilities on the a priori ones shows how much information about the values of the unknown parameters is contained in the statistics. If the posterior probabilities depend strongly on the a priori ones, the data probably contain little information. If the posterior probabilities depend only weakly on the choice of the a priori distribution, the data are informative.

Thus, in the Bayesian approach, in addition to the probability distribution of the random variable under consideration, some a priori distribution $p(\theta)$ of the parameters $\theta$ of that distribution is assumed. Relying on statistical data, the a priori distribution of the parameters is modified by multiplying it by the likelihood function and normalizing. The result of the modification is the posterior distribution of the parameters. In other words, the parameters of the random variable are themselves random variables with some distribution.

The most important and at the same time most challenging question is the choice of the a priori distribution of the parameters. One consideration is that, when the statistical data are informative, even a "poor" a priori distribution will not significantly affect the posterior. Another important factor is computational complexity, especially if the posterior distribution is recalculated sequentially as statistical information becomes available. Therefore, the a priori distribution is usually chosen from the so-called class of coordinated (conjugate) distributions, that is, distributions for which the a priori and a posteriori distributions have the same form but different parameters.

A priori distributions that model the absence of a priori information are called uninformative. The Bayes-Laplace postulate ^{[3]} states that when nothing is known about the parameter in advance, the a priori distribution should be uniform, i.e., all possible outcomes of the random variable have equal probability. The main problem with using the uniform distribution as a non-informative a priori distribution is that it is not invariant with respect to functions of the parameter. If we know nothing about the parameter $\theta$, then we also know nothing about, for example, a function $g(\theta)$ of it. However, if $\theta$ has a uniform distribution, a nonlinear function $g(\theta)$ in general no longer has a uniform distribution, although according to the Bayes-Laplace postulate it must. Moreover, the uniform distribution cannot be used as a prior if the set of parameter values is infinite.
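The lack of invariance can be seen in a short worked case. The choice of the interval $(0, 1)$ and of the function $\varphi = \theta^2$ is an assumption made only for illustration.

```latex
% Assume \theta is uniform on (0, 1) and consider \varphi = \theta^2.
\begin{align*}
  P(\varphi < u) &= P(\theta < \sqrt{u}) = \sqrt{u}, \qquad 0 < u < 1, \\
  p_{\varphi}(u) &= \frac{d}{du}\sqrt{u} = \frac{1}{2\sqrt{u}} .
\end{align*}
% The density of \varphi depends on u, so \varphi is not uniform on (0, 1),
% even though "knowing nothing" about \theta also means knowing nothing about \varphi.
```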

The literature contains a considerable number of approaches to selecting a non-informative a priori distribution, each with its own advantages and disadvantages. The most interesting, however, is an approach that differs completely from the traditional ones. Its essence is as follows. Instead of a single prior distribution, a whole class $\mathcal{M}$ of distributions is defined, for which the lower and upper probabilities of an event $A$ can be found as

$$\underline{P}(A) = \inf_{P \in \mathcal{M}} P(A), \qquad \overline{P}(A) = \sup_{P \in \mathcal{M}} P(A)$$

Under certain conditions, the set $\mathcal{M}$ is completely determined by the lower and upper probability distribution functions. It should be noted that the class $\mathcal{M}$ should be seen not as a class of suitable prior distributions, but as a suitable class of prior distributions. This means that no individual distribution in the class is the best or even a "good" a priori distribution, since no single distribution can satisfactorily model the absence of information. But the class as a whole, which defines the upper and lower probability distributions, is an appropriate model for the lack of information.

**Definition.** The family of a priori distributions $G = \{p(\theta; \varphi)\}$ is called conjugate with respect to the likelihood function $L(x_1, \dots, x_n \mid \theta)$ if the posterior distribution calculated by formula (3) again belongs to the same family.

If the likelihood function can be represented in the form

$$L(x_1, \dots, x_n \mid \theta) = v(T_1, \dots, T_m; \theta)\,w(x_1, \dots, x_n) \qquad (5)$$

where $T_j = T_j(x_1, \dots, x_n)$ and $w$ are functions of the observations $x_1, \dots, x_n$ that do not depend on the parameters $\theta$, then there exists a family of a priori distributions conjugate to $L$. The functions $T_1, \dots, T_m$ are also called sufficient statistics.

On the other hand, before any data or observations are obtained, the parameters are often unknown, i.e. the researcher has no useful a priori information about the values of the estimated parameter. In such cases the following rules ^{[4]} for selecting the a priori distribution should be considered:

- if the estimated scalar parameter $\theta$ can take values in a finite interval $[a, b]$ or in the infinite interval from $-\infty$ to $+\infty$, then the a priori density $p(\theta)$ should be taken constant on that interval;

- if the meaning of the estimated parameter implies that it can take only positive values, then the density of the logarithm of the parameter should be taken constant on the real line, i.e., $p(\ln\theta) = \text{const}$ when $\theta > 0$.

A priori distributions defined in this way on the infinite line or half-line violate the usual normalization rule for a probability density (because $\int p(\theta)\,d\theta = \infty$, where the integration is over all possible values), but this does not cause "technical inconveniences". The uniform distribution is used as the a priori distribution because its density minimizes the entropy measure of information, even for arbitrarily large interval bounds.

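Given the normalization of the density function (3), for a constant a priori density we obtain:

$$\tilde{p}(\theta \mid x_1, \dots, x_n) = \frac{L(x_1, \dots, x_n \mid \theta)}{\int L(x_1, \dots, x_n \mid \theta)\,d\theta} \qquad (6)$$

Let us determine the form of the a priori density $p(\theta)$ for the case when $\theta > 0$.

Let $F(\theta)$ be the distribution function of the parameter and $G(u)$ the distribution function of its logarithm. Then

$$F(\theta) = G(\ln\theta)$$

Accordingly, the density function of the parameter is:

$$p(\theta) = F'(\theta) = G'(\ln\theta)\,\frac{1}{\theta} \propto \frac{1}{\theta}$$

since by the condition $G'(\ln\theta) = \text{const}$. So for $\theta > 0$ we have:

$$p(\theta) \propto \frac{1}{\theta} \qquad (7)$$

From this we obtain the formula for calculating the posterior distribution

$$\tilde{p}(\theta \mid x_1, \dots, x_n) = \frac{\theta^{-1}\,L(x_1, \dots, x_n \mid \theta)}{\int_0^{\infty} \theta^{-1}\,L(x_1, \dots, x_n \mid \theta)\,d\theta} \qquad (8)$$

where the likelihood function has the form (5).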

**Remark.** Using as a priori laws the probability distributions conjugate to the distribution of the observed population gives only a family of a priori distributions. However, to implement the Bayesian approach one must work with a specific a priori distribution, which requires knowledge of the numerical values of the parameters on which the a priori distribution depends. In a broad sense, the parameters of the a priori distribution can be determined by the method of moments from known mean values of the estimated parameters and their standard deviations.

Let $G = \{p(\theta; \varphi)\}$ be the family of a priori distributions conjugate with the likelihood function $L(x_1, \dots, x_n \mid \theta)$ of the observations available to us, and let $\varphi_0$ be the parameter values set for the analyzed case. Then, using a series of identical transformations, the right side of the relation

$$\tilde{p}(\theta \mid x_1, \dots, x_n) \propto L(x_1, \dots, x_n \mid \theta)\,p(\theta; \varphi_0) \qquad (9)$$

is reduced, up to factors independent of $\theta$, to the form $p(\theta; \varphi')$, where the last function belongs to the family $G$, and each component of the parameter vector $\varphi'$ is a function of $\varphi_0$ and $x_1, \dots, x_n$.

Whether distributions are coordinated is determined not only by the form of the a priori distribution but also by the form of the likelihood function, i.e., the type of distribution must be preserved when the a priori distribution is multiplied by the likelihood function and normalized. The Dirichlet distribution also belongs to the coordinated distributions.

To describe the Dirichlet distribution, consider the standard polynomial (multinomial) model.

Let $\Omega = \{\omega_1, \dots, \omega_m\}$ be the set of possible outcomes, and let there be a set of $N$ observations independently selected from $\Omega$ with the same probability $\pi_i$ of each outcome $\omega_i$ for all $i = 1, \dots, m$, where $\pi_i \geq 0$ and $\sum_{i=1}^{m} \pi_i = 1$. The probability that the outcome $\omega_i$ occurs $n_i$ times in the $N$ observations is determined by the well-known formula of the polynomial distribution with parameters $\pi = (\pi_1, \dots, \pi_m)$. However, the parameters $\pi$ themselves may be random variables with a probability density. One of the most interesting distributions of the parameters is the Dirichlet distribution, which is coordinated with the polynomial distribution in the sense that both the a priori and the a posteriori distributions are Dirichlet distributions. Its density with parameters $(s, t)$ is

$$p(\pi) = \Gamma(s) \prod_{i=1}^{m} \frac{\pi_i^{s t_i - 1}}{\Gamma(s t_i)}$$

Here the parameter $t_i$ is the mean value (expectation) of the probability $\pi_i$; the parameter $s > 0$ determines the influence of the a priori distribution on the posterior probabilities; the vector $t = (t_1, \dots, t_m)$ belongs to the interior of the unit simplex of dimension $m$, which we denote $S(1, m)$; $\Gamma(\cdot)$ is the gamma function, satisfying the conditions $\Gamma(x + 1) = x\,\Gamma(x)$ and $\Gamma(1) = 1$. Note that the variables of the Dirichlet distribution are the probabilities $\pi_i$, which satisfy the condition $\sum_{i=1}^{m} \pi_i = 1$. This means that not only random events are considered, but also their probabilities.

After receiving the observation vector $n = (n_1, \dots, n_m)$, where $n_i$ is the number of observations of the outcome $\omega_i$, and multiplying the a priori density by the likelihood function $L(n \mid \pi) \propto \prod_{i=1}^{m} \pi_i^{n_i}$, we obtain the posterior density:

$$p(\pi \mid n) \propto \prod_{i=1}^{m} \pi_i^{n_i + s t_i - 1}$$

which can be considered as the density of the Dirichlet distribution with parameters $(N + s, t^*)$, where

$$t_i^* = \frac{n_i + s t_i}{N + s}, \qquad i = 1, \dots, m$$

In other words, the Dirichlet distribution belongs to the coordinated class of distributions, and under recalculation the a priori parameters $(s, t)$ are converted into $(N + s, t^*)$. This is a very important property, thanks to which the Dirichlet distribution is widely used in Bayesian analysis.
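The recalculation of Dirichlet parameters can be sketched as follows. The sketch assumes the common parametrization in which the prior is described by a strength parameter `s` and a vector of prior means `t`, so that the posterior mean of each probability is $(n_i + s t_i)/(N + s)$; the function name and the numbers are illustrative assumptions.

```python
def dirichlet_update(s, t, counts):
    """Return posterior (s, t) for a Dirichlet prior with strength s and means t.

    counts[i] is the number of observations of outcome i.
    """
    if len(t) != len(counts):
        raise ValueError("t and counts must have the same length")
    n_total = sum(counts)
    s_post = s + n_total
    # Posterior mean of each probability: (n_i + s * t_i) / (N + s)
    t_post = [(n_i + s * t_i) / s_post for n_i, t_i in zip(counts, t)]
    return s_post, t_post


# Hypothetical prior and data
prior_strength = 2.0
prior_means = [0.25, 0.25, 0.5]
observed_counts = [3, 1, 0]
post_strength, post_means = dirichlet_update(prior_strength, prior_means, observed_counts)
print(post_strength, post_means)
```

Because the posterior has the same form as the prior, the update can be applied repeatedly as new observations arrive, feeding each posterior back in as the next prior.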

### 3. Results and Discussion

**3.1. Results**

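*Example 1.*

a) A random variable $\xi$ with unknown value of the parameter $\theta$ has an exponential distribution with density $f(x \mid \theta) = \theta e^{-\theta x}$, $x \geq 0$. Check the conditions for the existence of a conjugate a priori distribution, and determine the class of distributions to which it belongs and the specific values of its parameters.

From equation (2) we have:

$$L(x_1, \dots, x_n \mid \theta) = \prod_{i=1}^{n} \theta e^{-\theta x_i} = \theta^{n} \exp\Big(-\theta \sum_{i=1}^{n} x_i\Big) \qquad (10)$$

Therefore, in the representation (5),

$$T = \sum_{i=1}^{n} x_i, \qquad w(x_1, \dots, x_n) \equiv 1$$

Here $\sum_{i=1}^{n} x_i$ is a sufficient statistic, which confirms the existence of an a priori distribution of the parameter conjugate with the observed distribution.

Now let us determine the class of distributions to which this distribution belongs. Taking into account (7), from (8) we obtain:

$$\tilde{p}(\theta \mid x_1, \dots, x_n) \propto \theta^{\,n-1} \exp\Big(-\theta \sum_{i=1}^{n} x_i\Big) \qquad (11)$$

The right side of relation (11) shows that the family of conjugate a priori distributions of the parameter $\theta$ of an exponentially distributed population belongs to the class of gamma distributions with the density

$$p(\theta; \alpha, \beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\,\theta^{\alpha - 1} e^{-\beta\theta}, \qquad \theta > 0$$

and parameters $\alpha = n$ and $\beta = \sum_{i=1}^{n} x_i$, respectively.

It is known that the mean value ($\mathbf{E}\theta$) and variance ($\mathbf{D}\theta$) of the gamma distribution are expressed in terms of the parameters $\alpha$ and $\beta$ by the formulas:

$$\mathbf{E}\theta = \frac{\alpha}{\beta}, \qquad \mathbf{D}\theta = \frac{\alpha}{\beta^{2}}$$

Substituting into these relations the given values $\theta_0$ and $\Delta_0^2$ in place of $\mathbf{E}\theta$ and $\mathbf{D}\theta$, respectively, we obtain as the solution of the system of two equations with respect to $\alpha$ and $\beta$:

$$\alpha = \frac{\theta_0^{2}}{\Delta_0^{2}}, \qquad \beta = \frac{\theta_0}{\Delta_0^{2}}$$

For this example we define the a posteriori distribution, taking into account (10) and the density of the gamma distribution:

$$\tilde{p}(\theta \mid x_1, \dots, x_n) \propto \theta^{n} \exp\Big(-\theta \sum_{i=1}^{n} x_i\Big)\,\theta^{\alpha - 1} e^{-\beta\theta} = \theta^{\alpha + n - 1} \exp\Big(-\theta\Big(\beta + \sum_{i=1}^{n} x_i\Big)\Big)$$

From here it is seen that the a posteriori distribution of the parameter $\theta$ again obeys the gamma distribution law, but with the following parameters:

$$\alpha' = \alpha + n, \qquad \beta' = \beta + \sum_{i=1}^{n} x_i$$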
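The two computations of part a), fitting the gamma prior by the method of moments and updating it with exponential data, can be sketched as below. The sketch assumes the rate parametrization of both the exponential and the gamma distribution (mean `alpha / beta`, variance `alpha / beta**2`); the function names and the numerical values are illustrative assumptions.

```python
def gamma_prior_from_moments(mean, variance):
    """Method of moments: solve mean = alpha/beta, variance = alpha/beta**2."""
    if mean <= 0 or variance <= 0:
        raise ValueError("mean and variance must be positive")
    alpha = mean ** 2 / variance
    beta = mean / variance
    return alpha, beta


def gamma_posterior(alpha, beta, data):
    """Conjugate update for exponential data with rate theta."""
    # Shape grows by the sample size, rate grows by the sum of observations
    return alpha + len(data), beta + sum(data)


# Hypothetical prior mean 2.0 and prior variance 1.0
alpha0, beta0 = gamma_prior_from_moments(2.0, 1.0)
# Hypothetical sample of three observations
alpha1, beta1 = gamma_posterior(alpha0, beta0, [0.5, 0.25, 0.25])
posterior_mean_rate = alpha1 / beta1
print(alpha0, beta0, alpha1, beta1, posterior_mean_rate)
```

Only the sample size and the sum of the observations enter the update, which reflects the role of the sum as a sufficient statistic.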

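b) The random variable $\xi$ (the number of successes in a series of $m$ Bernoulli trials) has a binomial distribution, where $\theta$ is the unknown probability of success in a single trial and $m$ is the total number of trials in the series. Check the conditions for the existence of a conjugate prior distribution; determine the class of distributions to which it belongs and the specific parameters of the distribution.

It is known that

$$P(\xi = x \mid \theta) = \binom{m}{x}\,\theta^{x} (1 - \theta)^{m - x}, \qquad x = 0, 1, \dots, m$$

Suppose $n$ such series are observed. According to formula (2) we have:

$$L(x_1, \dots, x_n \mid \theta) = \prod_{i=1}^{n} \binom{m}{x_i}\,\theta^{x_i} (1 - \theta)^{m - x_i} = \Big[\prod_{i=1}^{n} \binom{m}{x_i}\Big]\,\theta^{\sum x_i} (1 - \theta)^{nm - \sum x_i} \qquad (12)$$

where $x_i$ is the number of successes in the $i$-th series. Here $\sum_{i=1}^{n} x_i$ is a sufficient statistic, which confirms the existence of an a priori distribution of the parameter $\theta$ conjugate with the observed distribution. Now we determine the class of distributions to which it belongs.

Define $p(\theta) = 1$ for $\theta \in [0, 1]$; then

$$\tilde{p}(\theta \mid x_1, \dots, x_n) \propto \theta^{\sum x_i} (1 - \theta)^{nm - \sum x_i} \qquad (13)$$

The right side of relation (13) is (up to a normalizing factor independent of $\theta$) the density of the beta distribution

$$p(\theta; a, b) = \frac{\Gamma(a + b)}{\Gamma(a)\,\Gamma(b)}\,\theta^{a - 1} (1 - \theta)^{b - 1}, \qquad 0 \leq \theta \leq 1 \qquad (14)$$

with parameters $a = \sum_{i=1}^{n} x_i + 1$ and $b = nm - \sum_{i=1}^{n} x_i + 1$, respectively, where $\Gamma(\cdot)$ is the Euler gamma function (see Example 1 a)).

Consequently, the family of conjugate prior distributions of the parameter $\theta$ (the probability of "success") of a binomially distributed population belongs to the class of beta distributions (14).

It is known that the mean value ($\mathbf{E}\theta$) and variance ($\mathbf{D}\theta$) of the beta distribution are expressed in terms of its parameters $a$ and $b$ by the formulas:

$$\mathbf{E}\theta = \frac{a}{a + b}, \qquad \mathbf{D}\theta = \frac{ab}{(a + b)^{2} (a + b + 1)}$$

Using these expressions with the given values $\theta_0$ and $\Delta_0^2$ in place of $\mathbf{E}\theta$ and $\mathbf{D}\theta$, we obtain as the solution of the system of two equations with respect to $a$ and $b$:

$$a = \theta_0 \Big[\frac{\theta_0 (1 - \theta_0)}{\Delta_0^{2}} - 1\Big], \qquad b = (1 - \theta_0) \Big[\frac{\theta_0 (1 - \theta_0)}{\Delta_0^{2}} - 1\Big]$$

We define the posterior distribution, taking into account (12) and the density of the beta distribution (14):

$$\tilde{p}(\theta \mid x_1, \dots, x_n) \propto \theta^{a + \sum x_i - 1} (1 - \theta)^{b + nm - \sum x_i - 1}$$

From here it is seen that the posterior distribution of the parameter $\theta$ again obeys the beta distribution, but with the following parameters:

$$a' = a + \sum_{i=1}^{n} x_i, \qquad b' = b + nm - \sum_{i=1}^{n} x_i$$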
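Part b) can be sketched in the same way for the beta-binomial pair. The function names, the prior moments, and the observed series below are illustrative assumptions; the formulas are the standard method-of-moments solution and conjugate update for a beta prior.

```python
def beta_prior_from_moments(mean, variance):
    """Method of moments for a beta prior with the given mean and variance."""
    common = mean * (1.0 - mean) / variance - 1.0
    if not (0.0 < mean < 1.0) or common <= 0.0:
        raise ValueError("moments are not attainable by a beta distribution")
    return mean * common, (1.0 - mean) * common


def beta_posterior(a, b, successes, trials_per_series):
    """Conjugate update after several series of Bernoulli trials.

    successes[i] is the number of successes in the i-th series.
    """
    total_successes = sum(successes)
    total_trials = trials_per_series * len(successes)
    return a + total_successes, b + total_trials - total_successes


# Hypothetical prior mean 0.5 and prior variance 0.05
a0, b0 = beta_prior_from_moments(0.5, 0.05)
# Hypothetical data: three series of five trials each
a1, b1 = beta_posterior(a0, b0, [3, 4, 2], 5)
posterior_mean_success = a1 / (a1 + b1)
print(a0, b0, a1, b1, posterior_mean_success)
```

The guard on `common` rejects moment pairs that no beta distribution can have, for example a variance larger than `mean * (1 - mean)`.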

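c) The random variable $\xi$ with unknown value of the parameter $\theta$ has a uniform distribution on the segment $[0, \theta]$. Check the conditions for the existence of a conjugate a priori distribution, and determine the class of distributions to which it belongs and the specific values of its parameters.

It is known that

$$f(x \mid \theta) = \begin{cases} 1/\theta, & 0 \leq x \leq \theta, \\ 0, & \text{otherwise.} \end{cases}$$

From equation (2) we have:

$$L(x_1, \dots, x_n \mid \theta) = \begin{cases} \theta^{-n}, & \theta \geq x_{\max}, \\ 0, & \theta < x_{\max}, \end{cases} \qquad x_{\max} = \max_{1 \leq i \leq n} x_i \qquad (15)$$

Therefore, within the general form (5), $x_{\max}$ is a sufficient statistic, which confirms the existence of an a priori distribution of the parameter conjugate with the observed distribution. Now let us determine the class of distributions to which it belongs.

Taking into account (7), from (8) we obtain:

$$\tilde{p}(\theta \mid x_1, \dots, x_n) \propto \theta^{-(n + 1)}, \qquad \theta \geq x_{\max} \qquad (16)$$

The right side of relation (16) is, up to a normalizing factor, the density of the Pareto distribution of the form

$$p(\theta; \alpha, c) = \frac{\alpha\,c^{\alpha}}{\theta^{\alpha + 1}}, \qquad \theta \geq c \qquad (17)$$

with parameter $\alpha = n$ and shift parameter $c = x_{\max}$.

Consequently, the family of conjugate prior distributions of the parameter $\theta$ of a random variable uniformly distributed on the segment $[0, \theta]$ belongs to the Pareto distributions of the form (17).

We define the a posteriori distribution, taking into account (15) and the Pareto density (17) as the a priori distribution

$$p(\theta; \alpha, c) = \frac{\alpha\,c^{\alpha}}{\theta^{\alpha + 1}}, \qquad \theta \geq c$$

we have:

$$\tilde{p}(\theta \mid x_1, \dots, x_n) \propto \theta^{-n}\,\theta^{-(\alpha + 1)} = \theta^{-(\alpha + n + 1)}$$

for $\theta \geq \max(c, x_{\max})$.

It follows that the a posteriori distribution of the parameter $\theta$ is described, like the a priori one, by the Pareto law (17), but with the following parameters:

$$\alpha' = \alpha + n, \qquad c' = \max(c, x_{\max})$$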
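For the uniform model on $[0, \theta]$ with a Pareto prior, the update reduces to two numbers. The sketch below assumes the Pareto density $\alpha c^{\alpha} / \theta^{\alpha + 1}$ for $\theta \geq c$; the names `pareto_posterior` and `pareto_mean` and the data are illustrative assumptions.

```python
def pareto_posterior(alpha, c, data):
    """Conjugate update of a Pareto(alpha, c) prior for uniform data on [0, theta]."""
    if not data:
        return alpha, c
    # Shape grows by the sample size; the lower bound cannot fall below the largest observation
    return alpha + len(data), max(c, max(data))


def pareto_mean(alpha, c):
    """Mean of the Pareto distribution; it is finite only for alpha > 1."""
    if alpha <= 1:
        raise ValueError("the mean is infinite for alpha <= 1")
    return alpha * c / (alpha - 1)


# Hypothetical prior (alpha = 2, c = 1.0) and three observations
alpha_post, c_post = pareto_posterior(2, 1.0, [0.4, 1.7, 0.9])
theta_estimate = pareto_mean(alpha_post, c_post)
print(alpha_post, c_post, theta_estimate)
```

The largest observation is the only feature of the sample, besides its size, that affects the posterior, in line with its role as a sufficient statistic.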

*Example 2.* An inspection of 10 firms found that 6 firms have average reliability, 3 firms have high reliability, and 1 firm has low reliability. The a priori probabilities of these reliability levels are given. Find the a posteriori expectations of the reliability probabilities of the firms.

Thus, after the new statistical information is received, the reliability probabilities of the firms remain random, but with the new mathematical expectations obtained above. In this example the information confirmed the a priori idea of the probability of high reliability and changed the idea of the probabilities of low and medium reliability. On the one hand, if we take $s = 0$, the a posteriori information is fully determined by the additional information alone, that is, by the analysis of the 10 firms; the recalculated values in this case do not depend on the choice of the a priori probabilities. On the other hand, if the parameter $s$ takes large values, the resulting statistics practically cease to affect the posterior probabilities, and the search for additional information loses its meaning.

Bayesian models have several advantages over frequentist models. One such advantage is that Bayesian models can in principle give some results even when no sample data are available. This is due to the use of the a priori probability distribution, which does not change in the absence of statistical data, so that the a posteriori distribution calculated by Bayes' theorem coincides with the a priori one.
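The effect of the parameter $s$ in Example 2 can be sketched numerically. The counts (6, 3, 1) come from the example; the a priori probabilities (0.5, 0.3, 0.2) and the values of `s` are hypothetical, chosen only so that the prior for high reliability agrees with the observed share, as the discussion describes. The posterior expectation is computed as $(n_i + s t_i)/(N + s)$.

```python
def posterior_expectations(counts, prior_means, s):
    """Posterior expectations (n_i + s * t_i) / (N + s) for a Dirichlet prior."""
    n_total = sum(counts)
    return [(n_i + s * t_i) / (n_total + s) for n_i, t_i in zip(counts, prior_means)]


# Order of categories: medium, high, low reliability
firm_counts = [6, 3, 1]              # from the inspection of 10 firms
assumed_prior = [0.5, 0.3, 0.2]      # hypothetical a priori probabilities

post_moderate = posterior_expectations(firm_counts, assumed_prior, 5.0)
post_data_only = posterior_expectations(firm_counts, assumed_prior, 0.0)
post_prior_dominated = posterior_expectations(firm_counts, assumed_prior, 1000.0)

for label, values in [("s = 5", post_moderate),
                      ("s = 0", post_data_only),
                      ("s = 1000", post_prior_dominated)]:
    print(label, [round(v, 4) for v in values])
```

With `s = 0` the expectations equal the observed frequencies, and with a very large `s` they stay close to the assumed prior, which mirrors the two limiting cases discussed above.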

### 4. Conclusions

In decision-making, Bayesian models allow additional experiments to be carried out to clarify the states of nature. In practice, this refinement is achieved mainly by collecting additional information and by performing such experiments. For example, additional information about a firm's profit and its timely settlement with the budget allows us to refine the a priori probabilities of the states of nature (its reliability).

At the same time, when designing an experiment it is necessary to determine whether it is expedient from the viewpoint of its cost. To do this, one compares the expected additional income or benefit obtained by using the information acquired as a result of the experiment with the expected cost of the experiment. For example, when analyzing the reliability of a firm, obtaining information about its profit may require an audit, on which certain funds must be spent. The number of alternative courses of action in such problems increases, since it becomes possible not only to select one of the available alternatives but also to decide whether it is worthwhile to carry out experiments or other activities to obtain additional information.

### References

[1] Tulupyev, A.L., Nikolenko, S.I., Sirotkin, A.V., Bayesian Networks: A Probabilistic Logic Approach (in Russian), St. Petersburg: Nauka, 2006. 607 p.

[2] Morris, U.T., The Science of Management: A Bayesian Approach. Moscow: Mir, 1971. 304 p.

[3] Schreiber, F., "The Extended Bayes-Postulate, Its Potential Effect on Statistical Methods and Some Historical Aspects," in Probability and Bayesian Statistics, Plenum Press, New York, 1987, pp. 423-430.

[4] Jeffreys, H., Scientific Inference. 2nd ed. Cambridge University Press, 1957.

[5] De Groot, M.H., Optimal Statistical Decisions. Wiley, 2004.