A Learning Automata Based Spectrum Prediction Technique for Cognitive Radio Networks

Veiw figure View Figure

The simplicity of the structure and ability of the correct predicting of the channel status are two important features to evaluate the predictive methods. There are several prediction techniques to discover spectrum holes in the cognitive networks.

Hidden Markov Model (HMM) is a prediction technique which is designed based on the Markov chains ^{[3, 4]}. In HMM, two types of states are introduced as hidden states and observation states and two sets of probabilities, transition probability and emission probability are defined. Transition probabilities show the probability of the moving from one hidden state to other hidden state and emission probabilities show the probability of emitting each observation state from each hidden state. Hidden states present the status of the PUs in channel utilization which can have two cases (idle or free). Observation states are composed of different pattern of the PUs in the spectrum utilization. The probability of the observation state occurring is updated using the statistical PUs behaviour through the forward- backward algorithm. The prediction of the PUs behaviour is performed by comparing the emission probabilities. Consequently, an observation state which is more probable to be emitted from a hidden state (for example idle state) shows the PUs activity in the future time slot. The forward-backward algorithm imposes many computations to calculate emission probabilities and defines several states in HMM which increases the complexity of this model.

Another machine learning method which is the most common prediction technique is artificial neural network ^{[5, 6, 7]}. Artificial neural networks are abstracted model of the human brain which consists of input and output layers and middle layers composed of artificial neurons. Neurons are connected by weighted links passing signal from one neuron to another (other layers of the network) ^[7]. In order to predict the behaviour of the PUs, the statistical pattern is utilized to update the weight of neurons through a training stage. Number of intermediate layers in the neural network determines the complexity of the network. More number of intermediate layers with a complete training data set can improve accurate prediction of the PUs behaviour. However, it is important to note that by increasing the number of neurons in the neural network, the complexity of the model increases which may make it inefficient in the practice. Furthermore, to achieve an acceptable prediction rate, artificial neural network requires large amount of training data with high resolution. Therefore, in the case of prediction of the PUs behaviour, large amount of the statistical information of the PUs spectrum should be available which make the training phase of the neural network so long. This is a serious problem for real-time applications where the systems are expected to adopt themselves with the environment quickly.

As mentioned in the last two paragraphs, both HMM and artificial neural networks suffer from high computational and time complexity which are important for real-time applications. Therefore, a proper predictive technique should be able to update itself with variation of the environment quickly. In this paper, we introduce a simple structure Learning Automata model to predict the PUs behaviour. The training stage of the proposed Learning Automata is simple and fast enough for real-time applications. Furthermore, the proposed technique is able to provide acceptable performance in term of correct prediction of the PU behaviour.

The rest of this paper is organized as follows. In section 2, the background and principal of the machine learning technique is presented. Section 3 introduces the proposed Learning Automata. Section 4 evaluates the proposed method and compares its performance with artificial neural network. Last section concludes the paper.

2. Background

In this section, we explain the concepts of machine learning techniques and Learning Automata in the system modeling.

In order to analyse the system with stochastic environment and variables, mathematical modelling is at the basis of different machine learning techniques. Machine learning techniques by defining different structure for system and utilizing parameters (such as probability vector) showing the interaction between the system variables and the model try to model the behaviour of a system.

The Learning Automata is one of the basic models of the machine learning techniques with finite states that interacts with a stochastic environment trying to learn the optimal action offered by the environment, via a learning process. One form of presenting a Learning Automata is defined via Markov chains that can be determined by set of states and vector of the transition probabilities. The states present different status of the system or variables in the system and, the transition probability shows the likelihood of transition of each state to other state.

A general application of the machine learning techniques such as Learning Automata is decision making for resource allocation based on the behavior of the system. The decision making of an optimal action using the Learning Automata is performed based on the transition probability vector that shows the priority of the optimal actions. In other word, the Automata model determines the present status of the system as one of its states. After that, according to the statistical behavior of the system and transition probability, the most probable status of the system is predicted for a certain time in the future. The optimal decision making for resource allocation is performed based on the prediction of the model.

In order to use machine learning techniques for system modeling two stages, training and testing phase are very important. In the training stage, the behavior of the system is modeled based on the statistical behavior of the system. During the training stage the model updates its parameters. For example, in the Learning Automata, the transition probability vector is updated in the training stage for suitable decision making. The training methods must be simple with low computational complexity and fast. After training stage, the test phase must be performed to evaluate the ability of the model in correct prediction of the system behavior. This aim is achieved by the proper training procedure and enough information about the behavior of the system.

In next section, the structure of the proposed Learning Automata, training and testing stages of the mod model for cognitive radio applications are introduced.

3. Proposed Learning Automata for the CR network

In this section, we introduce an Automata model to predict the activity of the PUs in the related frequency bands. Therefore, the pattern of the bandwidth utilization of PUs must be modelled. The Structure of the proposed Learning Automata and modelling the behaviour of the PUs is explained in three steps: A) generating the pattern of the PUs activity, B) modelling the training and updating and C) testing the ability of the model in correct prediction of the system behaviour.

A) PU activity pattern: As mentioned, the activity pattern of the PU is the system variable which must be modelled. Therefore, the first step of the system modelling is to provide a method to present the activity of the PUs. In the practice, the activity pattern of the PUs is available by accurate spectrum sensing and collecting the statistical behaviour of the PUs for a certain period of the time. However, to test the ability of the model to correct prediction of the PU status, the activity pattern of the PU is generated based on the features of the PUs activity. The pattern of spectrum usage of PUs is function of time and place. Each PU according to its geographical position and the range of the demands has different activity pattern that varies by the time. In order to present the activity of the PUs, we use a binary sequence, where each 0 shows the absence of the PU and each 1 presents the activity of the PU. The number of the bits in each sequence presents the accuracy of the spectrum sensing for PU activity in each day. Ratio of the number of ones in each sequence to total the number of bits in each sequence is defined as the activity factor of the PU which is one of the most important parameters in the system modelling. To model the PU bandwidth utilization pattern, it is assumed that each PU occupies related bandwidth for more than half of the total time duration in each day (activity factor > 50%). To challenge the ability of model to predict the status of PUs a random behaviour is added to each sequence. Figure 2 shows an example of adding random behaviour to each binary sequence. As Figure 2 shows in each sequence some bits are randomly set to zeros and ones around time duration when PU is constantly active.

Figure 2. Binary sequence showing the random behaviour of the PUs

Download as

Veiw figure View Figure

B) Training the Automata Model: Figure 3 shows an example of Markov chain used for modelling PU behaviour. As can be seen, the number of bits in the states determines the total number of the states and structure complexity in the Markov model. Each bit in a state shows the activity of the PU. For example state 001 shows that PU was inactive for two consecutive time slot and at last time slot became active. In the Markov model the following sets are defined:

A={a1, a2,…, ar} is set of actions on the Markov chains. r is the number of actions (2 < r < ∞) and the action selected at time t is a(t) which can be 0 or 1. Each action shows a transition over the Markov states.

E={e1, e2,…, er} is defined as set of possible response of environment. ei=1 presents PUs activity and ei=0 shows the absence of PU in the related spectrum.

B={b1(t),b2(t),…,br(t)} is set of transition probability between states (bi(t)∈[0,1]^{[, 1]}) and .

Figure 3. An example of Markov chain

Download as

Veiw figure View Figure

Before training the model, the initial transition probabilities (Binit) are considered to be 0.5. This assumption shows that at the beginning of the training stage no action has priority than other actions. Equations (1) and (2) present probability updating scheme for training the Learning Automata model. Note that summation of outgoing probability in each state is 1.

(1)

(2)

where k is an integer value. Selecting different values for the parameter k affects the threshold value for decision making. We consider the parameter k equal to number of bits in each state. For more explanation about updating scheme consider Figure 4.

Figure 4. Example of updating scheme in time slot t.

Download as

Veiw figure View Figure

Assume in Figure 4, s(t) is the current state of the model. Based on binary sequence of PU activity, next environmental response (PU activity) is known. If PU is active for next time slot (e(t+1)=1) then transition probability, b(t+1), is updated using (1). be0 shows transition probability when PU is inactive (e(t+1)=0) and be1 is used when the response of environment is 1 (e(t+1)=1). If PU is inactive for next time slot, transition probability is updated using (2). After computing b(t+1), current state changes to s(t+1) and updating scheme will continue for the next time slot. This procedure continues for all set of binary sequence of the PU activity to complete the training stage.

As mentioned, one of the most important parameters in training stage is the complexity of training method. Utilizing a simple method with low computational complexity makes a predictive model more applicable for online applications. Updating methods in the most predictive techniques are a function of the number of the actions O(r). However, the proposed updating scheme, in (1) and (2), is independent of the number of the actions in the updating stage ^[9].

C) Testing the Automata model: After training the predictive model, in order to assess the ability of the model to correct predicting of the PU activity, a testing binary sequence is generated using the same method explained for generating the PU activity pattern at the beginning of this section. A random time in the test sequence determines the current state of the model. Next status of PU activity is predicting by comparing the transition probability in each state. After that the result of prediction is compared with actual PU activity in the test sequence. This process conducts for many iterations and using (3) the prediction rate or the ability of Learning Automata model to correct prediction of PU activity is computed.

(3)

Next section evaluates the proposed Learning Automata model.

4. Modelling and Results

To evaluate the performance of the Learning Automata model, we modelled a cognitive radio network with one PU and N SUs. The bandwidth usage pattern of the PU is determined by the set of binary sequences showing the statistical behaviour of the PU for certain time duration. In the simulated model, we considered a binary sequence with 48 bits presenting the activity of PU for a day. A set of 100 binary sequences generated by the method introduced in the previous section is utilized to training the Learning Automata model. After the training stage, in order to evaluate the ability of predicting model, a testing sequence (St(t)) is generated. A random number (t0) between 1 and 47 is selected showing the starting time in the testing binary sequence d St(t0) presents the status of PU activity. The duty of the predicting model is to determine the status of the PU activity in (t0+1)th time slot. By comparing the result of predicting technique with (t0+1)th bit in the test sequence (St(t0+1)) the prediction rate of Learning Automata is computed using (3). This process repeats for 200 independent iterations and the average prediction rate for different scenarios is presented as follows.

Figure 5 shows prediction rate of the Learning Automata against the number of bit integrating each state in the model. It is obvious that by increasing the number of bits generating the states in the model, total number of states increases and the model becomes more complex. However, by increasing the number of states in the model detection of the states and transition behaviour of the PUs is modelled with higher resolution and accuracy which improves the system modelling and prediction ability of the model.

Figure 5. Prediction rate versus variable number of bits in each state

Download as

Veiw figure View Figure

Another important parameter to determining the ability of model to correct prediction is the activity factor. Results of modelling demonstrate that the higher activity factor lead to the higher prediction rate In other word, if the PU occupies the related frequency bands for longer time duration then the number of spectrum holes in time domain decreases and prediction rate of the model increases. Figure 6 presents the effect of the activity factor on the prediction rate. In this case, the number of bits in each state is assumed 4 bits.

Figure 6. Prediction rate versus activity factor

Download as

Veiw figure View Figure

Finally, Figure 7 demonstrates the ability of the proposed model versus the number of states and activity factor. As can be seen, higher activity factor and increasing the number of the state increases the prediction rate of the model.

Therefore, an acceptable prediction rate can be achieved by selecting more numbers of the bits in each state. However, this assumption results to increasing the complexity of the model. Hence, there is a trade-off between the complexity of the model (number of bits in each state) and desirable prediction rate.

Figure 7. The effects of activity factor and the number of states in prediction rate of model.

Download as

Veiw figure View Figure