A Learning Automata Based Spectrum Prediction Technique for Cognitive Radio Networks

Mehdi Golestanian, Shahrzad Iranmanesh, Reza Ghazizadeh, Mohammadreza Azimi

  Open Access OPEN ACCESS  Peer Reviewed PEER-REVIEWED

A Learning Automata Based Spectrum Prediction Technique for Cognitive Radio Networks

Mehdi Golestanian1,, Shahrzad Iranmanesh1, Reza Ghazizadeh1, Mohammadreza Azimi1

1Faculty of Electrical and Computer Engineering, University of Birjand, Birjand, Iran


This paper introduces an application of artificial intelligence in the cognitive radio networks. The Cognitive Radio Network (CRN) provides a suitable environment for Secondary Users (SUs) to share the spectrum with Primary Users (PUs) in a non-interfering manner. In order to determine the availability of PUs bandwidth, SU can sense the spectrum in the channel. But, accurate and constant spectrum sensing consumes the energy of the SUs significantly. In these conditions, to discover the spectrum holes in the absence of PUs, predictive techniques can be one of the solutions which can reduce the consuming energy of the SUs. The simplicity and reliability of predictive techniques play an important role in the practice. In this paper, we utilize a Learning Automata technique to predict the spectrum hole in the cognitive network based on the statistical behaviour of the PUs. Simple structure and acceptable prediction rate are two important features of the proposed technique. In order to compare the performance of the proposed method with similar predictive techniques in CRNs, we design a predictor model using multilayer perceptron artificial neural networks and test the performance of these two methods on the same conditions. The results of modelling confirm that the Learning Automata with simple structure is more reliable than neural network.

At a glance: Figures

Cite this article:

  • Golestanian, Mehdi, et al. "A Learning Automata Based Spectrum Prediction Technique for Cognitive Radio Networks." International Transaction of Electrical and Computer Engineers System 2.3 (2014): 93-97.
  • Golestanian, M. , Iranmanesh, S. , Ghazizadeh, R. , & Azimi, M. (2014). A Learning Automata Based Spectrum Prediction Technique for Cognitive Radio Networks. International Transaction of Electrical and Computer Engineers System, 2(3), 93-97.
  • Golestanian, Mehdi, Shahrzad Iranmanesh, Reza Ghazizadeh, and Mohammadreza Azimi. "A Learning Automata Based Spectrum Prediction Technique for Cognitive Radio Networks." International Transaction of Electrical and Computer Engineers System 2, no. 3 (2014): 93-97.

Import into BibTeX Import into EndNote Import into RefMan Import into RefWorks

1. Introduction

Spectrum sharing is one of the most important topics in resource allocation in the wireless networks. Cognitive Radio (CR) provides a proper solution to the scarcity of spectrum for today growing communication systems by introducing some fundamental activities known as Cognition Cycles [1, 2]. Cognition Cycles are the activities performed by the CR including observing the wireless environment, adopting itself with the environment conditions by the suitable decision making and learning from the past behaviour of the system [2].

Users in a cognitive radio networks are classified to two groups: Primary Users (PUs) and Secondary Users (SUs). PUs have higher priority to access to spectrum than the SUs. The cognitive radio provides a suitable environment for the SUs to share the spectrum with the PUs in a non-interfering manner. In this way, one of the most important duties of the SUs is to sense the spectrum of the PUs to use the spectrum holes in the absence of the PUs. Figure 1 shows an example of spectrum holes in the absence of the PUs in time and frequency domains.

As mentioned, to use the spectrum holes if a SU perform an accurate spectrum sensing mechanism, it results to high energy consumption. However, the cognition cycle by utilizing some predicting techniques enables the SUs to discover spectrum holes based on statistical behaviour of PUs without consuming more energy. By using such techniques, the spectrum sensing by SUs is limited to channels that are predicted as idle channels and as a result the power consumption of the spectrum sensing decreases.

Figure 1. Spectrum holes in the absence of the Pus

The simplicity of the structure and ability of the correct predicting of the channel status are two important features to evaluate the predictive methods. There are several prediction techniques to discover spectrum holes in the cognitive networks.

Hidden Markov Model (HMM) is a prediction technique which is designed based on the Markov chains [3, 4]. In HMM, two types of states are introduced as hidden states and observation states and two sets of probabilities, transition probability and emission probability are defined. Transition probabilities show the probability of the moving from one hidden state to other hidden state and emission probabilities show the probability of emitting each observation state from each hidden state. Hidden states present the status of the PUs in channel utilization which can have two cases (idle or free). Observation states are composed of different pattern of the PUs in the spectrum utilization. The probability of the observation state occurring is updated using the statistical PUs behaviour through the forward- backward algorithm. The prediction of the PUs behaviour is performed by comparing the emission probabilities. Consequently, an observation state which is more probable to be emitted from a hidden state (for example idle state) shows the PUs activity in the future time slot. The forward-backward algorithm imposes many computations to calculate emission probabilities and defines several states in HMM which increases the complexity of this model.

Another machine learning method which is the most common prediction technique is artificial neural network [5, 6, 7]. Artificial neural networks are abstracted model of the human brain which consists of input and output layers and middle layers composed of artificial neurons. Neurons are connected by weighted links passing signal from one neuron to another (other layers of the network) [7]. In order to predict the behaviour of the PUs, the statistical pattern is utilized to update the weight of neurons through a training stage. Number of intermediate layers in the neural network determines the complexity of the network. More number of intermediate layers with a complete training data set can improve accurate prediction of the PUs behaviour. However, it is important to note that by increasing the number of neurons in the neural network, the complexity of the model increases which may make it inefficient in the practice. Furthermore, to achieve an acceptable prediction rate, artificial neural network requires large amount of training data with high resolution. Therefore, in the case of prediction of the PUs behaviour, large amount of the statistical information of the PUs spectrum should be available which make the training phase of the neural network so long. This is a serious problem for real-time applications where the systems are expected to adopt themselves with the environment quickly.

As mentioned in the last two paragraphs, both HMM and artificial neural networks suffer from high computational and time complexity which are important for real-time applications. Therefore, a proper predictive technique should be able to update itself with variation of the environment quickly. In this paper, we introduce a simple structure Learning Automata model to predict the PUs behaviour. The training stage of the proposed Learning Automata is simple and fast enough for real-time applications. Furthermore, the proposed technique is able to provide acceptable performance in term of correct prediction of the PU behaviour.

The rest of this paper is organized as follows. In section 2, the background and principal of the machine learning technique is presented. Section 3 introduces the proposed Learning Automata. Section 4 evaluates the proposed method and compares its performance with artificial neural network. Last section concludes the paper.

2. Background

In this section, we explain the concepts of machine learning techniques and Learning Automata in the system modeling.

In order to analyse the system with stochastic environment and variables, mathematical modelling is at the basis of different machine learning techniques. Machine learning techniques by defining different structure for system and utilizing parameters (such as probability vector) showing the interaction between the system variables and the model try to model the behaviour of a system.

The Learning Automata is one of the basic models of the machine learning techniques with finite states that interacts with a stochastic environment trying to learn the optimal action offered by the environment, via a learning process. One form of presenting a Learning Automata is defined via Markov chains that can be determined by set of states and vector of the transition probabilities. The states present different status of the system or variables in the system and, the transition probability shows the likelihood of transition of each state to other state.

A general application of the machine learning techniques such as Learning Automata is decision making for resource allocation based on the behavior of the system. The decision making of an optimal action using the Learning Automata is performed based on the transition probability vector that shows the priority of the optimal actions. In other word, the Automata model determines the present status of the system as one of its states. After that, according to the statistical behavior of the system and transition probability, the most probable status of the system is predicted for a certain time in the future. The optimal decision making for resource allocation is performed based on the prediction of the model.

In order to use machine learning techniques for system modeling two stages, training and testing phase are very important. In the training stage, the behavior of the system is modeled based on the statistical behavior of the system. During the training stage the model updates its parameters. For example, in the Learning Automata, the transition probability vector is updated in the training stage for suitable decision making. The training methods must be simple with low computational complexity and fast. After training stage, the test phase must be performed to evaluate the ability of the model in correct prediction of the system behavior. This aim is achieved by the proper training procedure and enough information about the behavior of the system.

In next section, the structure of the proposed Learning Automata, training and testing stages of the mod model for cognitive radio applications are introduced.

3. Proposed Learning Automata for the CR network

In this section, we introduce an Automata model to predict the activity of the PUs in the related frequency bands. Therefore, the pattern of the bandwidth utilization of PUs must be modelled. The Structure of the proposed Learning Automata and modelling the behaviour of the PUs is explained in three steps: A) generating the pattern of the PUs activity, B) modelling the training and updating and C) testing the ability of the model in correct prediction of the system behaviour.

A) PU activity pattern: As mentioned, the activity pattern of the PU is the system variable which must be modelled. Therefore, the first step of the system modelling is to provide a method to present the activity of the PUs. In the practice, the activity pattern of the PUs is available by accurate spectrum sensing and collecting the statistical behaviour of the PUs for a certain period of the time. However, to test the ability of the model to correct prediction of the PU status, the activity pattern of the PU is generated based on the features of the PUs activity. The pattern of spectrum usage of PUs is function of time and place. Each PU according to its geographical position and the range of the demands has different activity pattern that varies by the time. In order to present the activity of the PUs, we use a binary sequence, where each 0 shows the absence of the PU and each 1 presents the activity of the PU. The number of the bits in each sequence presents the accuracy of the spectrum sensing for PU activity in each day. Ratio of the number of ones in each sequence to total the number of bits in each sequence is defined as the activity factor of the PU which is one of the most important parameters in the system modelling. To model the PU bandwidth utilization pattern, it is assumed that each PU occupies related bandwidth for more than half of the total time duration in each day (activity factor > 50%). To challenge the ability of model to predict the status of PUs a random behaviour is added to each sequence. Figure 2 shows an example of adding random behaviour to each binary sequence. As Figure 2 shows in each sequence some bits are randomly set to zeros and ones around time duration when PU is constantly active.

Figure 2. Binary sequence showing the random behaviour of the PUs

B) Training the Automata Model: Figure 3 shows an example of Markov chain used for modelling PU behaviour. As can be seen, the number of bits in the states determines the total number of the states and structure complexity in the Markov model. Each bit in a state shows the activity of the PU. For example state 001 shows that PU was inactive for two consecutive time slot and at last time slot became active. In the Markov model the following sets are defined:

A={a1, a2,…, ar} is set of actions on the Markov chains. r is the number of actions (2 < r < ∞) and the action selected at time t is a(t) which can be 0 or 1. Each action shows a transition over the Markov states.

E={e1, e2,…, er} is defined as set of possible response of environment. ei=1 presents PUs activity and ei=0 shows the absence of PU in the related spectrum.

B={b1(t),b2(t),…,br(t)} is set of transition probability between states (bi(t)∈[0,1][, 1]) and .

Before training the model, the initial transition probabilities (Binit) are considered to be 0.5. This assumption shows that at the beginning of the training stage no action has priority than other actions. Equations (1) and (2) present probability updating scheme for training the Learning Automata model. Note that summation of outgoing probability in each state is 1.


where k is an integer value. Selecting different values for the parameter k affects the threshold value for decision making. We consider the parameter k equal to number of bits in each state. For more explanation about updating scheme consider Figure 4.

Assume in Figure 4, s(t) is the current state of the model. Based on binary sequence of PU activity, next environmental response (PU activity) is known. If PU is active for next time slot (e(t+1)=1) then transition probability, b(t+1), is updated using (1). be0 shows transition probability when PU is inactive (e(t+1)=0) and be1 is used when the response of environment is 1 (e(t+1)=1). If PU is inactive for next time slot, transition probability is updated using (2). After computing b(t+1), current state changes to s(t+1) and updating scheme will continue for the next time slot. This procedure continues for all set of binary sequence of the PU activity to complete the training stage.

As mentioned, one of the most important parameters in training stage is the complexity of training method. Utilizing a simple method with low computational complexity makes a predictive model more applicable for online applications. Updating methods in the most predictive techniques are a function of the number of the actions O(r). However, the proposed updating scheme, in (1) and (2), is independent of the number of the actions in the updating stage [9].

C) Testing the Automata model: After training the predictive model, in order to assess the ability of the model to correct predicting of the PU activity, a testing binary sequence is generated using the same method explained for generating the PU activity pattern at the beginning of this section. A random time in the test sequence determines the current state of the model. Next status of PU activity is predicting by comparing the transition probability in each state. After that the result of prediction is compared with actual PU activity in the test sequence. This process conducts for many iterations and using (3) the prediction rate or the ability of Learning Automata model to correct prediction of PU activity is computed.


Next section evaluates the proposed Learning Automata model.

4. Modelling and Results

To evaluate the performance of the Learning Automata model, we modelled a cognitive radio network with one PU and N SUs. The bandwidth usage pattern of the PU is determined by the set of binary sequences showing the statistical behaviour of the PU for certain time duration. In the simulated model, we considered a binary sequence with 48 bits presenting the activity of PU for a day. A set of 100 binary sequences generated by the method introduced in the previous section is utilized to training the Learning Automata model. After the training stage, in order to evaluate the ability of predicting model, a testing sequence (St(t)) is generated. A random number (t0) between 1 and 47 is selected showing the starting time in the testing binary sequence d St(t0) presents the status of PU activity. The duty of the predicting model is to determine the status of the PU activity in (t0+1)th time slot. By comparing the result of predicting technique with (t0+1)th bit in the test sequence (St(t0+1)) the prediction rate of Learning Automata is computed using (3). This process repeats for 200 independent iterations and the average prediction rate for different scenarios is presented as follows.

Figure 5 shows prediction rate of the Learning Automata against the number of bit integrating each state in the model. It is obvious that by increasing the number of bits generating the states in the model, total number of states increases and the model becomes more complex. However, by increasing the number of states in the model detection of the states and transition behaviour of the PUs is modelled with higher resolution and accuracy which improves the system modelling and prediction ability of the model.

Figure 5. Prediction rate versus variable number of bits in each state

Another important parameter to determining the ability of model to correct prediction is the activity factor. Results of modelling demonstrate that the higher activity factor lead to the higher prediction rate In other word, if the PU occupies the related frequency bands for longer time duration then the number of spectrum holes in time domain decreases and prediction rate of the model increases. Figure 6 presents the effect of the activity factor on the prediction rate. In this case, the number of bits in each state is assumed 4 bits.

Finally, Figure 7 demonstrates the ability of the proposed model versus the number of states and activity factor. As can be seen, higher activity factor and increasing the number of the state increases the prediction rate of the model.

Therefore, an acceptable prediction rate can be achieved by selecting more numbers of the bits in each state. However, this assumption results to increasing the complexity of the model. Hence, there is a trade-off between the complexity of the model (number of bits in each state) and desirable prediction rate.

Figure 7. The effects of activity factor and the number of states in prediction rate of model.

Now, to evaluate the performance of Learning Automata model with other predictive technique in cognitive radio applications, we designed a multi-layer feed-forward neural networks (MFFN). The network consists of an input layer, 3 hidden layers and one output layer. MFFNs are used for supervised training manner where the set of input and output streams are introduced to the model. During training iteration, MFFN adjusts the weight of hidden layers using the back propagation algorithm [7]. The training process continues while the error of the modelling falls below a threshold value. To training the artificial neural network, we utilized the same data set used for training the Learning Automata. For implementation of the neural network, maximum iterations are assumed 1000 and minimum gradient value is considered 10-6.

The ability of two predictive model, the proposed Learning Automata and artificial neural network, are tested on the same testing sequence. The average prediction rate of the proposed Automata model for 200 iterations is 93%. The activity factor of PU is assumed 0.7 and the number of bits in each state is 3. Average prediction rate of the designed artificial neural network is 86%. Furthermore, average training time duration for the proposed training scheme is 0.04 sec, while the average training time for using artificial neural network is 47 sec. This result shows the fast training in the Learning Automata which is important for cognitive radio networks where the PUs might have a dynamic and variable behaviour.

Designing a predictive model requires tuning some parameters which control the performance of the model. However, the simplicity of the models and acceptable performance are the important factors to utilize the model. As the results shows, the proposed Learning Automata provides the better performance in term of the prediction rate with simple structure and less training time.

5. Conclusion

In this paper, we introduced a new prediction technique for PU behaviour using a Learning Automata model. The most important feature of the proposed technique is low complexity of the model in structure and computations. Fast updating and training is a critical target for online application in cognitive radio. Acceptable performance in term of prediction rate shows the reliability of the model for cognitive radio applications. As the simulation results demonstrate, the ability of the proposed model to predict the behaviour of the PU is reliable with simple structure and fast updating procedure.


[1]  T. Yucek, H. Arsalan, “A survey of spectrum sensing algorithms for cognitive radio applications,” IEEE Communications and Tutorial, vol. 11, no. 1, pp. 116-130, Oct. 2009.
In article      CrossRef
[2]  K. Tsagkaris, A. Katidiotis, P. Demestichas “Neural network-based learning schemes for cognitive radio systems,” Computer Communication, vol. 31, no. 14, pp. 3394-3404, 2008.
In article      CrossRef
[3]  I. A. Akbar, W. H. Tranter, “Dynamic Spectrum Allocation in Cognitive Radio Using Hidden Markov Models: Poisson Distributed Case,” in Proceeding of IEEE Southeast Conference, pp. 196-201, March. 2007.
In article      
[4]  C, Ghosh, C. Cordeiro, D. P. Agrawal, M. B. Rao, “Markov Chain Existence and Hidden Markov Models in Spectrum Sensing,” IEEE International Conference on Pervasive Computing and Communications, pp. 1-6, March. 2009.
In article      
[5]  N. Baldo, M.Zorzi, “Learning and adaptation in cognitive radios using neural networks,” IEEE Consumer Communications and Networking Conference, pp. 998-1003, Jan. 2008.
In article      
[6]  V. K. Tumuluru, P. Wang and D. Niyato, “A Neural Network Based Spectrum Prediction Scheme for Cognitive Radio,” in Proceeding of IEEE International Conference on Communication, pp. 1-5, May. 2010.
In article      
[7]  Nicola Baldo, Michele Zorzi, “Learning and Adaptation in Cognitive Radios using Neural Networks,” 5th IEEE consumer Communication and Networking Conference, pp. 998-1003, 2008.
In article      
[8]  J. A. Torkestani, M. R. Meybodi, “A Learning Automata-Based Cognitive Radio for Clustered wireless Ad-Hoc Networks,” Journal of Network and Systems Management, vol. 19, no. 2, pp. 278-297, 2011.
In article      CrossRef
[9]  M.S. Obaidat a, G.I. Papadimitriou, A.S. Pomportsis, “Efficient Fast Learning Automata,” Information Sciences—Informatics and Computer Science, vol. 157, no. 2, pp. 121-133, 2003.
In article      
  • CiteULikeCiteULike
  • MendeleyMendeley
  • StumbleUponStumbleUpon
  • Add to DeliciousDelicious
  • FacebookFacebook
  • TwitterTwitter
  • LinkedInLinkedIn