To investigate the relationship between antimalarial activity and molecular structure, a QSAR study was applied to a set of 19 dihydrothiophenone compounds. The study was performed using the multiple linear regression (MLR) method. Calculations at the HF/6-31G(d,p) level of theory were performed to obtain structural information. The molecular descriptors used are the carbonyl group vibrational frequency (ν(C=O)), the nitrogen-hydrogen vibrational frequency (ν(NH)), the entropy of formation (ΔfS) and the energy of the lowest unoccupied molecular orbital (ELUMO). The model obtained gives statistically significant results and shows good predictive ability: R² = 0.925, S = 0.230 and F = 22.257. The internal and external validation parameters (Q²LOO = 0.934 and Q²ext = 0.748) indicate that the model performs well in predicting the antimalarial activity of the series of molecules investigated. The vibrational frequency of the carbonyl group (ν(C=O)) is the priority descriptor in predicting this activity. The acceptance criteria of Eriksson et al. used for the test set are satisfied.
Malaria remains a public health priority worldwide, and particularly in sub-Saharan Africa. According to the World Health Organization (WHO), there are between 300 and 500 million clinical cases and 1.5 to 2.7 million deaths, 90% of which occur in sub-Saharan Africa. Malaria kills more than one million children each year, i.e. more than 3,000 per day (a child dies of malaria every 30 seconds in Africa) [1]. The lack of treatments for thousands of rare and less rare diseases makes the search for new drugs a major challenge for the pharmaceutical industry. Malaria parasite infections are widespread in the world and are difficult to eradicate completely because of the dormant forms of the genus Plasmodium [2]. The discovery of new molecules with specific therapeutic properties and minimal undesirable side effects is therefore a major challenge in the fight against malaria. This is why numerous studies have led the pharmaceutical industry to develop new products with better therapeutic properties and without side effects, if possible with reduced production time and cost.
Plasmodium is highly adaptable to its environment and develops numerous resistances, making some of the currently available molecules obsolete in many endemic territories. Although most of these compounds have been known for a long time, their modes of action are not completely elucidated. It is therefore urgent to find new molecules with new mechanisms of action to meet these needs. The pharmaceutical industry is moving towards new research methods that consist in predicting the activities of molecules even before they are synthesized. Resistance is also a challenge for the artemisinin-based combination therapy strategy [3]. It threatens the great progress made in malaria control and can create a parasite pool that is increasingly difficult to treat and eliminate. In such a context, the development of new, more effective antimalarial molecules is essential [3].
It is in this context that Xu et al. [4] prepared a series of dihydrothiophenone derivatives and demonstrated the in vitro inhibitory capacity of these compounds against the enzyme [5] as well as against chloroquine-sensitive (Pf3D7) and chloroquine-resistant (PfDd2) strains. We were therefore interested in a series of molecules derived from dihydrothiophenone, with the general objective of proposing new molecules with improved antimalarial activity. Specifically, the aim is to develop QSAR models based on existing molecules.
The theory of frontier orbitals was developed by K. Fukui (Nobel Prize in Chemistry, 1981) and collaborators in the 1950s [6].
(1) |
The thermodynamic quantities of the molecules are determined as follows. Geometry optimization and frequency calculations are performed for each molecule by the DFT method at the B3LYP/6-31G(d,p) level. From the output files, the following thermodynamic parameters are taken: enthalpy, free enthalpy (Gibbs free energy) and entropy, as listed in the "Thermochemistry" section of the Gaussian output file.
These quantities are then used to calculate the entropy, enthalpy and free enthalpy of formation of the molecules [7]:
(2) |
(3) |
With:
(4) |
: Atomization energy;
: Total energy of the molecule;
: Zero-point energy of the molecule;
: Enthalpy corrections for atomic elements. These values are taken from the JANAF tables [8];
: Enthalpy correction of the molecule
: Thermal correction enthalpy.
(5) |
: Number of atoms of X in the molecule
(6) |
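For orientation, the definitions above match the atomization-energy scheme commonly used with Gaussian thermochemistry output. Under that assumption (the symbols below are chosen here for illustration and are not taken from the text), the relations between these quantities take the following form, where x is the number of atoms of element X in the molecule M:

```latex
\begin{align*}
\sum D_0(M) &= \sum_{X} x\,\varepsilon_0(X) - \varepsilon_0(M) - \varepsilon_{ZPE}(M) \\
\Delta_f H^\circ(M, 0\,\mathrm{K}) &= \sum_{X} x\,\Delta_f H^\circ(X, 0\,\mathrm{K}) - \sum D_0(M) \\
\Delta_f H^\circ(M, 298\,\mathrm{K}) &= \Delta_f H^\circ(M, 0\,\mathrm{K})
  + \left[H^\circ_M(298\,\mathrm{K}) - H^\circ_M(0\,\mathrm{K})\right]
  - \sum_{X} x \left[H^\circ_X(298\,\mathrm{K}) - H^\circ_X(0\,\mathrm{K})\right] \\
H^\circ_M(298\,\mathrm{K}) - H^\circ_M(0\,\mathrm{K}) &= H_{corr} - \varepsilon_{ZPE}(M) \\
\Delta_f S^\circ(M, 298\,\mathrm{K}) &= S^\circ(M, 298\,\mathrm{K}) - \sum_{X} x\,S^\circ(X, 298\,\mathrm{K}) \\
\Delta_f G^\circ(M, 298\,\mathrm{K}) &= \Delta_f H^\circ(M, 298\,\mathrm{K}) - T\,\Delta_f S^\circ(M, 298\,\mathrm{K})
\end{align*}
```

Here ΣD0(M) is the atomization energy, ε0 the total energy, εZPE the zero-point energy and Hcorr the thermal correction to the enthalpy. The last two lines are the standard thermodynamic definitions of the entropy and free enthalpy of formation and are included as an assumption about how ΔfS is obtained.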
These descriptors make it possible to represent the topology of the molecule without considering its exact spatial geometry [9]. Mainly topological, they are obtained from the planar structure of the molecule. They do not necessarily have an obvious chemical meaning, but they carry information on the overall size of the system, its overall shape and its branching [10]. They are easy to calculate and their values are generally accurate.
In this study, these descriptors are:
• Vibrational frequency of the carbonyl group (ν(C=O)),
• Vibrational frequency of the nitrogen-hydrogen bond (ν(NH)).
3.4. Statistical Analysis
Linear regression is one of the most widely used statistical methods. A distinction is usually made between simple linear regression (a single explanatory variable) and multiple linear regression (several explanatory variables), although the conceptual framework and calculation methods are identical.
The principle of linear regression is to model a quantitative dependent variable Y through a linear combination of p quantitative explanatory variables $X_1, X_2, \dots, X_p$.
The deterministic model [11] is written:
$$Y = \beta_0 + \sum_{j=1}^{p} \beta_j X_j + \varepsilon \qquad (7)$$
where the $\beta_j$ are the coefficients of the regression and $\varepsilon$ is the model error.
The statistical framework and accompanying assumptions are not necessary to fit this model. Moreover, the least squares minimization provides an exact analytical solution. Nevertheless, if we want to test hypotheses and measure the explanatory power of the different explanatory variables in the model, a statistical framework is necessary.
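The analytical least-squares solution mentioned above can be sketched in plain Python by solving the normal equations. This is a minimal illustration, not the software used in the study; the function name `fit_mlr` and the sample data are hypothetical.

```python
def fit_mlr(X, y):
    """Ordinary least squares for y = b0 + b1*x1 + ... + bp*xp.

    X is a list of rows (one row of p descriptor values per molecule),
    y is the list of observed activities. Returns [b0, b1, ..., bp].
    """
    # Design matrix with a leading column of ones for the intercept b0.
    rows = [[1.0] + [float(v) for v in r] for r in X]
    m = len(rows[0])
    # Normal equations (A^T A) b = A^T y, stored as an augmented matrix.
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(m)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(m)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("descriptors are collinear; system is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    beta = [0.0] * m
    for i in range(m - 1, -1, -1):
        tail = sum(aug[i][j] * beta[j] for j in range(i + 1, m))
        beta[i] = (aug[i][m] - tail) / aug[i][i]
    return beta


# Hypothetical data generated from y = 1 + 2*x1 + 3*x2 (no noise).
X_demo = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
y_demo = [1 + 2 * a + 3 * b for a, b in X_demo]
coefficients = fit_mlr(X_demo, y_demo)
```

For real descriptor sets a numerical library is preferable, since the normal equations become ill-conditioned when descriptors are strongly correlated.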
The quality of a model is assessed on the basis of several statistical criteria, including the coefficient of determination R², the standard deviation S, the cross-validation correlation coefficient Q²cv and the Fisher F statistic. R², S and F relate to the fit between the calculated and experimental values. They describe the predictive capacity within the limits of the model and make it possible to estimate the accuracy of the calculated values on the test set [12, 13]. The cross-validation coefficient Q²cv provides information on the predictive power of the model. This predictive power is called "internal" because it is calculated from the structures used to build the model. The coefficient of determination R² measures the dispersion of the theoretical values around the experimental values. The closer the points are to the fitting line, the better the quality of the model [14]. The fit of the points to this line can be evaluated by the coefficient of determination:
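$$R^2 = 1 - \frac{\sum_i \left(y_{i,\mathrm{exp}} - \hat{y}_{i,\mathrm{theo}}\right)^2}{\sum_i \left(y_{i,\mathrm{exp}} - \bar{y}_{\mathrm{exp}}\right)^2} \qquad (8)$$
Where:
$y_{i,\mathrm{exp}}$: Experimental value of antimalarial activity
$\hat{y}_{i,\mathrm{theo}}$: Theoretical value of antimalarial activity
$\bar{y}_{\mathrm{exp}}$: Mean value of the experimental values of antimalarial activity.
The closer the value of R² is to 1, the better the theoretical and experimental values are correlated. Moreover, the variance is determined by the relation [9]:
$$\sigma^2 = s^2 = \frac{\sum_i \left(y_{i,\mathrm{exp}} - \hat{y}_{i,\mathrm{theo}}\right)^2}{n - k - 1} \qquad (9)$$
Where k is the number of independent variables (descriptors), n is the number of molecules in the test or training set and n − k − 1 is the number of degrees of freedom.
The standard deviation S is another statistical indicator used. It makes it possible to evaluate the reliability and the precision of a model:
$$S = \sqrt{\frac{\sum_i \left(y_{i,\mathrm{exp}} - \hat{y}_{i,\mathrm{theo}}\right)^2}{n - k - 1}} \qquad (10)$$
The Fisher F test is also used to measure the level of statistical significance of the model, i.e. the quality of the choice of descriptors making up the model.
$$F = \frac{\sum_i \left(\hat{y}_{i,\mathrm{theo}} - \bar{y}_{\mathrm{exp}}\right)^2}{\sum_i \left(y_{i,\mathrm{exp}} - \hat{y}_{i,\mathrm{theo}}\right)^2} \times \frac{n - k - 1}{k} \qquad (11)$$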
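The three fit statistics can be computed together from the experimental and calculated activities. The sketch below assumes the usual definitions, R² = 1 − SSres/SStot, S = sqrt(SSres/(n − k − 1)) and F = (SSreg/SSres)·(n − k − 1)/k; the function name and the sample values are hypothetical and are not data from this study.

```python
import math


def fit_statistics(y_exp, y_theo, k):
    """Return (R2, S, F) for a model with k descriptors.

    y_exp: experimental activities, y_theo: activities calculated by the model.
    """
    n = len(y_exp)
    mean_exp = sum(y_exp) / n
    # Residual, total and regression sums of squares.
    ss_res = sum((e - t) ** 2 for e, t in zip(y_exp, y_theo))
    ss_tot = sum((e - mean_exp) ** 2 for e in y_exp)
    ss_reg = sum((t - mean_exp) ** 2 for t in y_theo)
    dof = n - k - 1  # degrees of freedom
    r2 = 1.0 - ss_res / ss_tot
    s = math.sqrt(ss_res / dof)
    f = (ss_reg / ss_res) * dof / k
    return r2, s, f


# Hypothetical one-descriptor example with six molecules.
y_exp_demo = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_theo_demo = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]
r2_demo, s_demo, f_demo = fit_statistics(y_exp_demo, y_theo_demo, k=1)
```

A large F combined with a small S and an R² close to 1 is what the text reads as a statistically significant, well-fitted model.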
The cross-validation coefficient of determination Q²cv makes it possible to evaluate the accuracy of the prediction on the test set. This coefficient is calculated using the following relationship:
(12) |
According to Eriksson et al. [15, 16], the performance of a mathematical model is satisfactory when Q²cv > 0.5 and excellent when Q²cv > 0.9. According to them, given a test set, a model will perform well if the acceptance criterion R² − Q²cv < 0.3 is respected.
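A leave-one-out coefficient can be sketched as follows. The example assumes the conventional definition Q² = 1 − PRESS/SStot, which may differ from the relation used by the authors, and it uses a single-descriptor line fit to stay short; the function names and data are hypothetical.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for one descriptor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def q2_loo(x, y):
    """Leave-one-out Q2: refit without molecule i, then predict molecule i."""
    n = len(y)
    mean_y = sum(y) / n
    press = 0.0  # predictive residual sum of squares
    for i in range(n):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        intercept, slope = fit_line(x_train, y_train)
        press += (y[i] - (intercept + slope * x[i])) ** 2
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - press / ss_tot


# Hypothetical descriptor values and activities.
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0]
q2_exact = q2_loo(x_demo, [2.0, 4.0, 6.0, 8.0, 10.0])  # noise-free line
q2_noisy = q2_loo(x_demo, [1.0, 2.2, 2.9, 4.1, 5.0])   # slightly noisy data
```

Each molecule is predicted by a model that never saw it, which is why this coefficient is read as a measure of internal predictive power rather than of fit.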
According to Tropsha et al. [17, 18, 19], the predictive power of a model for the external validation set can be assessed from five criteria. These criteria are:
1)
2)
3)
4) and ,
5) and
The regression equation for the model is based on the following four descriptors:
pIC50 = 0.04750 × ν(C=O) − 0.02288 × ν(NH) − 0.00154 × ΔfS − 0.39468 × ELUMO
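The regression equation can be applied directly to a new set of descriptor values. The sketch below transcribes the four coefficients as printed; no intercept appears in the printed equation, so none is used. The function name, the argument names and the input values are hypothetical, and the descriptors are assumed to be expressed in the same units as those used to build the model.

```python
def predict_pic50(nu_co, nu_nh, delta_f_s, e_lumo):
    """Predicted pIC50 from the four descriptors of the printed MLR equation."""
    return (
        0.04750 * nu_co        # carbonyl vibrational frequency, nu(C=O)
        - 0.02288 * nu_nh      # nitrogen-hydrogen vibrational frequency, nu(NH)
        - 0.00154 * delta_f_s  # entropy of formation
        - 0.39468 * e_lumo     # LUMO energy
    )


# Hypothetical descriptor values, chosen only to exercise the function.
pic50_demo = predict_pic50(nu_co=1800.0, nu_nh=3500.0, delta_f_s=-100.0, e_lumo=0.1)
```

Because the two frequency terms are large and of opposite sign, small changes in either frequency shift the predicted activity noticeably, which is consistent with their dominant contributions in the model.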
4.2. Analysis of the Contribution
In order to evaluate the effect of the descriptors on the predictive power of the model, the contribution of each descriptor was analyzed. It is given by the following equation [20, 21]:
(13) |
According to the analysis of the contribution values, the importance of the descriptors in the model is, in descending order: ν(C=O) > ν(NH) > ΔfS > ELUMO.
The contribution calculations show that the vibrational frequency of the carbonyl group (ν(C=O)) contributes 38.859% to the prediction of antimalarial activity, while the nitrogen-hydrogen vibrational frequency (ν(NH)), the entropy of formation (ΔfS) and the energy of the lowest unoccupied molecular orbital (ELUMO) contribute 35.956%, 13.551% and 11.634% respectively. The vibrational frequency of the carbonyl group is therefore the main predictive descriptor of antimalarial activity. This sequence is shown in Figure 1 for the normalized coefficients of the descriptors.
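A percentage contribution of this kind can be sketched as the share of each normalized coefficient in the sum of absolute normalized coefficients. This formula is an assumption about the form of the contribution equation, and the function name and coefficient values below are hypothetical.

```python
def descriptor_contributions(normalized_coefficients):
    """Percentage contribution of each descriptor.

    normalized_coefficients maps a descriptor name to its normalized
    (standardized) regression coefficient. Assumed formula:
    C_j = 100 * |b_j| / sum_k |b_k|.
    """
    total = sum(abs(b) for b in normalized_coefficients.values())
    return {
        name: 100.0 * abs(b) / total
        for name, b in normalized_coefficients.items()
    }


# Hypothetical normalized coefficients for two descriptors.
contrib_demo = descriptor_contributions({"x1": 3.0, "x2": -1.0})
```

Taking absolute values means that the sign of a coefficient, which gives the direction of the effect, does not affect the ranking of the descriptors.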
4.3. Statistical Parameters for Validation and Prediction
The results in the table show that the model obtained has a coefficient of determination R² = 0.925, a Fisher statistic F = 22.257 and a standard deviation s = 0.230. The correlation coefficient is R = 0.962. This result indicates that antimalarial activity is highly correlated with the model descriptors: 92.50% of the experimental variance of the activity is explained by them. The Fisher value is very high compared with the limit value (Flim = 3.06), which means that the QSAR model obtained is significant and contains at least one descriptor relevant to explaining antimalarial activity. The low value of the standard deviation (s = 0.230) reflects the good fit and the high reliability of the predicted activity.
After presenting the statistical indicators of the model, we proceed to its internal validation using the Leave-One-Out (LOO) technique, which consists of omitting one molecule at a time from the training set. The parameters obtained are recorded in the table below:
The values in the table indicate that the cross-validation coefficient Q²LOO is 0.934. This value is much higher than 0.5 [22], showing that the QSAR model obtained is reliable for the prediction of the antimalarial activity of the series of molecules studied. In other words, 93.40% of the variance in the activity of the training set is reproduced under cross-validation.
For the internal validation of the model by the Y-randomization method, we carried out ten (10) iterations. The R² values obtained at the end of these iterations are recorded in the table below.
The Todeschini criterion, R²p = 0.711 > 0.5, shows that the model is genuine and not due to chance [23]. From the results of the internal validation, we conclude that this model is stable and has explanatory power with respect to the molecules of the training set.
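The Y-randomization parameter can be computed once the randomized models have been fitted. The sketch below assumes the corrected form described in the QSAR literature, cR²p = R × sqrt(R² − mean(R²r)), where the R²r values come from models refitted on shuffled activities; the function name and the numbers are hypothetical and are not the ten iterations of this study.

```python
import math


def c_rp2(r2_model, r2_randomized):
    """Y-randomization parameter from the model R2 and the randomized R2 values."""
    mean_rr2 = sum(r2_randomized) / len(r2_randomized)
    # If the randomized models fit as well as the real one, return 0.
    gap = max(r2_model - mean_rr2, 0.0)
    return math.sqrt(r2_model) * math.sqrt(gap)


# Hypothetical values: one real model and three randomized refits.
rp2_demo = c_rp2(0.9, [0.1, 0.2, 0.3])
passes_threshold = rp2_demo > 0.5
```

The parameter is large only when the real model fits well and the shuffled models fit poorly, which is why it is used as evidence against a chance correlation.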
4.4. Statistical Parameters of Prediction
In order to perform the external validation of the model, we used the five (5) Tropsha criteria. The different values obtained are presented below:
Verification of Tropsha criteria
Criterion 1:
Criterion 2: Q²ext = 0.748 > 0.60
Criterion 3: < 0.1 and k = 0.961, with 0.85 < k < 1.15
Criterion 4: < 0.1 and k’ = 1.038, with 0.85 < k’ < 1.15
Criterion 5: = 0.027 < 0.30
The values obtained show that the five (5) Tropsha criteria are respected. These results indicate that the model obtained is robust and has good predictive power.
The figure shows the correlation between the theoretical and experimental values. On the graph, the points lie close to the regression line, which indicates a strong linear correlation between the theoretical and experimental values. From the point of view of statistical performance, the model has a significant coefficient of determination (R² = 0.925). This coefficient indicates that the model can be successfully applied to predict the antimalarial activity of the series of molecules.
In order to determine the region in which the model can predict reliably, we determined its domain of applicability. The figure below shows the domain of applicability of the model.
The Williams plot shows that the standardized residuals of the compounds lie between −3σ and +3σ [24]. Moreover, the leverage values of the molecules are lower than the threshold leverage h* (h* = 1.07). Therefore, all the molecules are within the applicability domain of the model.
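The applicability-domain test behind a Williams plot can be sketched with two small functions. The threshold formula h* = 3(k + 1)/n is the one commonly used in QSAR work and is assumed here; the training-set size of 14 is also an assumption, chosen because it reproduces the quoted h* with four descriptors. Function names and sample points are hypothetical.

```python
def leverage_threshold(n_train, k):
    """Warning leverage h* = 3 * (k + 1) / n for k descriptors and n training molecules."""
    return 3.0 * (k + 1) / n_train


def in_applicability_domain(std_residual, leverage, h_star, residual_limit=3.0):
    """True if a compound lies inside the Williams-plot rectangle."""
    return abs(std_residual) <= residual_limit and leverage <= h_star


# Assumed sizes: 4 descriptors, 14 training molecules.
h_star_demo = leverage_threshold(n_train=14, k=4)

# Hypothetical (standardized residual, leverage) pairs.
inside = in_applicability_domain(2.1, 0.40, h_star_demo)
response_outlier = in_applicability_domain(-3.5, 0.40, h_star_demo)
structural_outlier = in_applicability_domain(0.5, 1.20, h_star_demo)
```

A compound outside the residual band is poorly predicted, while a compound beyond h* is structurally far from the training set, so the two conditions flag different kinds of outlier.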
The methods of quantum chemistry and molecular modeling were used in this work on nineteen (19) dihydrothiophenone molecules in order to study their Quantitative Structure-Activity Relationship (QSAR). This theoretical study was carried out using the DFT method at the HF/6-31G(d,p) level. The study revealed that the carbonyl group vibrational frequency (ν(C=O)), the nitrogen-hydrogen vibrational frequency (ν(NH)), the entropy of formation (ΔfS) and the energy of the lowest unoccupied molecular orbital (ELUMO) are the priority descriptors in the prediction of antimalarial activity. The robustness study of the constructed model shows good stability and predictive power (R² = 0.925, S = 0.230, F = 22.257). This model can thus be used to predict the activity of new molecules and to identify the descriptors that improve antimalarial activity, thereby guiding the design of new, more active molecules.
[1] OMS (WHO), “Malaria in the world”.
[2] World Health Organization, “World malaria report,” Geneva, 2018.
[3] S. B. Sirima, S. Cousens and P. Druilhe, “Protection against Malaria by MSP3 Candidate Vaccine,” N. Engl. J. Med., vol. 365, no. 11, pp. 1062-1064, Sept. 2011.
[4] M. Xu, “Novel selective and potent inhibitors of malaria parasite dihydroorotate dehydrogenase: Discovery and optimization of dihydrothiophenone derivatives,” J. Med. Chem., vol. 56, no. 20, pp. 7911-7924, 2013.
[5] D. Xu et al., “Discovery and structure activity relationships of ent Kaurene diterpenoids as potent and selective 11b-HSD1 inhibitors: Potential impact in diabetes,” European Journal of Medicinal Chemistry, pp. 403-414, 2013.
[6] V. Labet, “Theoretical study of some Aspects of the Reactivity of DNA Bases-Definition of new theoretical tools for the study of chemical reactivity,” Chemical Sciences, 2009.
[7] M. Frisch, G. Trucks, H. Schlegel and G. Scuseria, “Revision A.02,” in Gaussian 09, Wallingford CT, Gaussian, Inc., 2009.
[8] M. W. Chase, C. A. Davies, J. R. Downey, D. J. Frurip, R. A. McDonald and A. N. Syverud, “JANAF Thermochemical Tables,” J. Phys. Ref, vol. 14, no. 11, 1985.
[9] H. P. Schultz, “Topological organic chemistry: Graph theory and topological indices of alkanes,” Journal of Chemical Information and Modeling, vol. 29, pp. 227-228, 1989.
[10] M. Karelson, “Molecular descriptors in QSAR/QSPR,” Wiley.
[11] S. Chatterjee and A. S. Hadi, fourth edition, John Wiley and Sons, Hoboken, 2006.
[12] G. W. Snedecor and W. G. Cochran, Statistical Methods, Oxford and IBH: New Delhi, India, 1967, p. 381.
[13] N. J.-B. Kangah, M. G.-R. Koné, C. G. Kodjo, B. R. N’guessan, S. A. Kablan, Yéo and N. Ziao, “Antibacterial Activity of Schiff Bases Derived from Ortho Diaminocyclohexane, Meta-Phenylenediamine and 1,6-Diaminohexane: Qsar Study with Quantum Descriptors,” International Journal of Pharmaceutical Science Invention, vol. 6, no. 13, pp. 38-43, 2017.
[14] E. X. Esposito, A. J. Hopfinger and J. D. Madura, “Methods for Applying the Quantitative Structure-Activity Relationship Paradigm,” Methods in Molecular Biology, vol. 275, pp. 131-213, 2004.
[15] L. Eriksson, J. Jaworska, A. Worth, M. D. Cronin, R. M. Mc Dowell and P. Gramatica, “Methods for Reliability and Uncertainty Assessment and for Applicability Evaluations of Classification- and Regression-Based QSARs,” Environmental Health Perspectives, vol. 111, no. 10, pp. 1361-1375, 2003.
[16] J. N’dri, M.-G. Koné, C. Kodjo, S. Affi, A. Kablan, O. Ouattara and D. Soro, “Quantitative Activity Structure Relationship (QSAR) of a Series of Azetidinones Derived from Dapsone by the Method of Density Functional Theory (DFT),” IRA International Journal of Applied Sciences (ISSN 2455-4499), vol. 8, no. 2, pp. 55-62, 2017.
[17] A. Golbraikh and A. Tropsha, “Beware of qsar,” J. Mol. Graph. Model., vol. 20, p. 269, 2002.
[18] A. Tropsha, P. Gramatica and V. Gombar, “The importance of being earnest, validation is the absolute essential for successful application and interpretation of QSPR models,” QSAR Comb. Sci., vol. 22, p. 69, 2003.
[19] O. Ouattara, T. S. Affi, M. G.-R. Koné, K. Bamba and N. Ziao, “Can Empirical Descriptors Reliably Predict Molecular Lipophilicity? A QSPR Study Investigation,” Int. Journal of Engineering Research and Application, vol. 7, no. 15, pp. 50-56, 2017.
[20] R. Guha, D. T. Stanton and P. C. Jurs, “Interpreting Computational Neural Network Quantitative Structure-Activity Relationship Models: A Details Interpretation of the weights and Biases,” Journal of Chemical Information and Modeling, vol. 45, no. 4, pp. 1109-1121, 2005.
[21] D. Cherquaoui, M. Essefer, D. Villemin, J. M. Cence, M. Chastrette and D. Zakarya, “Etude de la relation Quantitative structure-activité inhibitrice des enzymes hydrolytiques: cas de alpha-glucosidase,” New J. Chem., vol. 22, pp. 843-869, 1998.
[22] A. Vessereau, Statistical methods in biology and agronomy, vol. 538, Paris: Lavoisier (Tec & Doc), 1988.
[23] K. Roy, “A Primer on QSAR/QSPR Modeling, Chapter 2 Statistical Methods in QSAR/QSPR,” Springer Briefs in Molecular Science, pp. 37-59, 2015.
[24] S. Chtita, M. Larif, M. Ghamali, M. Bouachrine and T. Lakhlifi, “Quantitative structure–activity relationship studies of dibenzo[a,d]cycloalkenimine derivatives for non-competitive antagonists of N-methyl-D-aspartate based on density functional theory with electronic and topological descriptors,” Journal of Taibah University for Science, vol. 9, pp. 143-154, 2015.
Published with license by Science and Education Publishing, Copyright © 2021 Fandia Konate, Fatogoma Diarrrassouba, Georges Stéphane Dembele, Mamadou Guy-Richard Koné, Bibata Konaté, Nanou Tiéba Tuo and Nahossé Ziao
This work is licensed under a Creative Commons Attribution 4.0 International License. To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/