An In Silico Study: Can the Modulation of miRNA Expression through a Diet that Promotes the Production of Butyrate and Consumption of Genistein and Quercetin, Impact Cancer?

Diet plays a major role in regulating cancer. Bioactives such as polyphenols and isoflavones found naturally in our food are increasingly being recognised as regulators of interest. These compounds can regulate cancer pathways through microRNAs (miRNAs), which are critical in modulating the expression of various genes. We carried out a literature review in which we assessed the impact of three dietary compounds, namely butyrate, genistein and quercetin, on miRNA expression, followed by an in silico study utilising the DIANA-miRPath v3.0 software. Our literature search found that miR-34a, miR-200a-3p and miR-200b-3p were modulated by all three compounds, while miR-221, miR-222, miR-29a, miR-3935 and miR-574-3p were modulated by both genistein and butyrate, and let-7b, miR-194, miR-96-5p and miR-424 were modulated by butyrate and quercetin. The in silico analysis identified key pathways of interest such as “bladder cancer” which had significant interactions with the miRNAs modulated by the dietary


Introduction
Despite decades of research, cancer continues to be one of the most prevalent diseases worldwide. While there have been numerous breakthroughs in the treatment of certain cancers, there still remains great scope for further development. The mechanisms of action of a healthy diet in preventing, and in some cases halting progression of lifestyle related cancers, have garnered much interest. Certain regional diets, such as the Mediterranean style diet, have been shown to decrease DNA damage in men with prostate cancer [1,2]. The DNA damage response is relevant as although many cancer pathways act synergistically and result in cancer development and/or progression, certain key pathways can act as important targets for prevention as well as treatment. Furthermore, such diets often regulate these pathways through modulation of miRNA expression [3].
MiRNAs are small non-coding RNAs which regulate gene expression post-transcriptionally. They have been shown to regulate cell growth, differentiation and apoptosis [4]. In cancer, miRNAs influence the function of either tumour suppressor genes or oncogenes, and chromosomal rearrangements, genomic amplifications, deletions and mutations have been attributed to the over-activation or inactivation of their function [5]. Importantly, miRNAs serve as unique modulators of the heterogeneous and complex nature of cancer by simultaneously targeting multiple genes [4,5,6]. This insight is important because current treatments look to target specific genes or pathways, which vastly restricts their therapeutic potential. Therefore, there is a need to develop approaches which modulate miRNAs and in turn allow us to regulate the expression of multiple genes.
One such modulatory method is through the use of dietary compounds. As mentioned above, diet has been shown to alter significant molecular pathways with success (see Motti et al. (2018) and Banikazemi et al. (2017) for reviews on miRNAs, cancer and diet [7,8]). Further research into these diets has allowed for the identification of bioactives which impact DNA repair, carcinogenesis, hormonal regulation, cell differentiation, apoptosis and the cell cycle [9]. Treatment with bioactives such as curcumin, resveratrol and polyunsaturated fatty acids, derived from turmeric, grapes and fish respectively, has been shown to regulate miRNAs important for the development or inhibition of several types of cancers [10,11,12]. While there is extensive literature on the influence these compounds have on miRNA expression in cancer, there is a need to consider the less studied compounds. The following were selected: butyrate, a short chain fatty acid derived from fermentation of dietary fibre; genistein, an isoflavone found in soy-based products; and quercetin, derived from citrus and berry fruits, green leafy vegetables and grains, amongst others. These three bioactives are readily available through a healthy diet, although genistein is more commonly consumed as part of Asian, vegan or vegetarian diets rather than a typical Western style diet [13]. Similarly, quercetin and butyrate are abundantly consumed or produced (respectively) from plant-based diets. This in silico study therefore focuses on three bioactives that are likely to be in abundance in people consuming plant-based and Asian diets. In order to determine the impact of these compounds on miRNAs, a literature search was performed to find evidence of the effects of treatment with these bioactives on various cancers in in vivo and in vitro models. Thereafter, an in silico analysis was conducted using the DIANA-miRPath v3.0 software [14], which took the miRNA lists as input and used them to predict miRNA interactions with various molecular pathways.

Materials and Methods
A literature search was undertaken by inputting the key search terms "butyrate", "miRNA" and "cancer" into Google Scholar. "Butyrate" was subsequently replaced with the search term "genistein" or "quercetin". Our inclusion criteria for the literature search were as follows: firstly, we only used peer-reviewed published articles that presented evidence of miRNA modulation in human cells, tissues or participants, and secondly, we only included articles written in English. Thereafter we input the lists of miRNAs modulated by each bioactive into the DIANA-miRPath v3.0 software [14]. Essentially, the software allows the generation of miRNA supersets and analyses the additive effects of different miRNAs in various molecular pathways. Using the DIANA software also allowed us to demonstrate a viable tool for supporting future literature and laboratory studies. We sourced miRNA interactions with molecular pathways from DIANA-TarBase v7.0, as this database-software combination produces experimentally supported interactions which may be relevant for the scientist in the lab. The software also offers two choices of pathway database, namely the KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway option or the GO (Gene Ontology) option. The KEGG option was used because it includes molecular interactions at the human disease and organismal level, which the GO option lacks.
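The text above does not describe the statistical test behind the pathway p-values. As a rough sketch only, one common way to score the overlap between a set of miRNA target genes and a pathway is a one-sided hypergeometric (Fisher-type) test. The function name and all counts below are hypothetical and do not come from DIANA-miRPath or TarBase.

```python
from math import comb


def hypergeom_enrichment_p(universe: int, pathway_genes: int,
                           targets: int, overlap: int) -> float:
    """Probability of seeing at least `overlap` pathway genes when
    `targets` genes are drawn without replacement from `universe` genes,
    of which `pathway_genes` belong to the pathway."""
    max_overlap = min(pathway_genes, targets)
    total = comb(universe, targets)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(
        comb(pathway_genes, k) * comb(universe - pathway_genes, targets - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts, for illustration only:
# 20000 genes in total, 40 in the pathway, 300 miRNA targets, 5 shared.
p_value = hypergeom_enrichment_p(20000, 40, 300, 5)
print(f"enrichment p-value: {p_value:.3g}")
```

A small p-value here would indicate that the target genes fall in the pathway more often than expected by chance; whether the software uses exactly this test should be checked against its documentation.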
The settings for the in silico analyses were as follows: firstly, the p-value threshold was set at 0.05 so that only statistically significant miRNA interactions were presented. Secondly, the "pathways union" option was utilised so that heat maps could be generated as a graphical representation of the data. The false discovery rate (FDR) correction (advanced statistics option) was also applied, which adjusts for the false positives that accumulate when multiple comparisons are made. While some regard FDR as overly conservative, it helps guard against making false claims.
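To make the FDR step concrete, the following is a minimal sketch of the Benjamini-Hochberg procedure, a widely used FDR method. The text does not state which FDR procedure the software applies, so this choice is an assumption, and the pathway names and raw p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four pathways (not taken from the tables).
pathways = ["pathway_A", "pathway_B", "pathway_C", "pathway_D"]
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)

# Keep only pathways that pass the 0.05 threshold after correction.
significant = [name for name, q in zip(pathways, adjusted) if q < 0.05]
print(significant)
```

The adjusted value for each pathway is never smaller than its raw p-value, which is why fewer pathways may pass the 0.05 threshold after correction.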

MiRNAs Modulated by Butyrate, Genistein and Quercetin
The results of the literature search are presented in Table 1 and Figure 1. Table 1 shows the miRNAs modulated by each dietary compound, how each miRNA was modulated, the cells/tissues treated and the dose/duration of treatment. MiRNAs were placed in numerical order (as far as possible) for ease of reference. The data presented in Table 1 were used to create the Venn diagram shown in Figure 1. Based on the quantity of literature found, data on butyrate and genistein were abundant while studies using quercetin were few. Most miRNAs were modulated by a single bioactive, while others were modulated by more than one. Nine miRNAs were modulated by two bioactives and three were modulated by all three (the overlapping, darker blue regions of the Venn diagram in Figure 1, with the three shared by all bioactives at its centre). Interestingly, there were no miRNAs modulated only by genistein and quercetin.
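The overlap counts above can be reproduced with simple set operations. The sketch below is illustrative only: it uses just the shared miRNAs named in the abstract, because the full per-compound lists from Table 1 are not reproduced in this text, and the function and variable names are hypothetical.

```python
# Shared miRNAs as reported in the abstract.
all_three = {"miR-34a", "miR-200a-3p", "miR-200b-3p"}
genistein_and_butyrate = {"miR-221", "miR-222", "miR-29a",
                          "miR-3935", "miR-574-3p"}
butyrate_and_quercetin = {"let-7b", "miR-194", "miR-96-5p", "miR-424"}

# Per-compound sets, restricted to the shared miRNAs (an assumption:
# miRNAs unique to one compound are omitted here).
butyrate = all_three | genistein_and_butyrate | butyrate_and_quercetin
genistein = all_three | genistein_and_butyrate
quercetin = all_three | butyrate_and_quercetin


def venn_overlaps(a, b, c):
    """Return the four overlapping regions of a three-set Venn diagram."""
    return {
        "a_b_c": a & b & c,
        "a_b_only": (a & b) - c,
        "a_c_only": (a & c) - b,
        "b_c_only": (b & c) - a,
    }


regions = venn_overlaps(butyrate, genistein, quercetin)
for name, members in regions.items():
    print(name, len(members), sorted(members))
```

With butyrate as `a`, genistein as `b` and quercetin as `c`, the `b_c_only` region is empty, matching the observation that no miRNA was modulated only by genistein and quercetin.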

In silico Analysis of the miRNAs Modulated by Each Dietary Bioactive and the Statistically Significant Pathway Interactions
We used data from Table 1 to form lists of miRNAs modulated by each bioactive compound to upload to the DIANA-miRPath v3.0 software. The resulting tables (Table 2, Table 3 and Table 4) were in turn used to generate the heat maps seen in Figure 2, Figure 3 and Figure 4. Tables 2-4 present the statistically significant interactions of the miRNAs modulated by each bioactive with the KEGG pathways. Importantly, we found that pathways relevant to cancer development such as "miRNAs in cancer", "proteoglycans in cancer" and "hippo signalling pathway" consistently ranked amongst the most statistically significant interactions. More specific pathways relevant to individual cancers, such as prostate cancer, colorectal cancer and glioma, were also found in each table. These pathways may be of greater interest to those who are considering the influence of miRNAs in specific cancers.

Transcriptional misregulation in cancer    0.006467    35    12
TGF-beta signaling pathway    0.007188    9    9
Thyroid cancer    0.011571    12    11
Acute myeloid leukemia    0.016374    18    13
*p-value = 0 reflects a value < 1.0E-15.

Discussion
Dietary bioactives are available in varying amounts from different foods, and it is important to consider the intake of these bioactives relative to the concentrations at which miRNA expression is influenced. The concentrations listed in Table 1 were obtained from in vitro studies, and although it is clear that they influence the expression of specific miRNAs, these figures cannot be directly related to the human situation, as safety and pharmacokinetic parameters would first need to be established. However, from the literature, safe concentrations in humans have been established for butyrate [42], genistein [43] and quercetin [44], although specific studies would need to be designed to establish the optimal faecal/blood concentrations needed to impact miRNA expression.
Butyrate is derived from the fermentation of dietary fibre, and hence is usually generated in the colon rather than consumed as butyrate. In a study carried out by McOrist et al. (2011), participants' faecal butyrate excretion varied from less than 0.5 to approximately 20 mmol per 48 h [42]. Clearly diet (in the form of fibre-containing foods) plays a role, and the microbiota present in the gut will also partially determine the amount of butyrate available. Genistein and quercetin availability can also vary between people and across seasons. The daily intake of genistein in the Japanese population is approximately 1.5-4.1 mg/person, while its content in soy-based products ranges from 4.6 to 18.2 µg/g [45]. The daily intake amongst Western populations is significantly lower. When treating patients with leukaemia, Uckun et al. (1999) found the half-life of genistein to be 20 ± 5 h following an infusion of between 0.1 and 0.32 mg/kg [43]. The peak drug levels varied from 48 ng/ml to 6497 ng/ml during daily treatment with the abovementioned infusion [43].
The quercetin content of fresh fruit and vegetables was found to vary from as little as 0.5 mg/100 g (fresh weight) in broccoli to as high as 41.9 mg/100 g in onions [46]. The estimated quercetin consumption varies depending on the food frequency questionnaire used [47]. Intake also varies with region, gender and season, and is reported as ranging from 4.37 mg/day to 454 mg/day [44,46,47]. The half-life of quercetin is reported as varying from 11 to 28 h, and is influenced by coadministration with other dietary compounds such as fat, fibre and other flavonoids [44].
The aim of this in silico study was to evaluate the strength of evidence for miRNA modulation by dietary bioactives. There is no doubt from the plethora of evidence that miRNAs play a crucial role in cancer development. However, despite this knowledge, we are still only scratching the surface with respect to understanding how miRNAs are regulated, how they act individually and synergistically to modulate gene expression, and how to find novel therapeutic treatments that are effective in regulating miRNAs.
Dietary bioactives are easily accessible, safe and cost-effective and hence a natural contender for regulating miRNAs and, subsequently, aberrant gene expression in cancer. While there was a focus on the treatment of cancer cells, such epigenetic modifications by dietary bioactives in healthy individuals have implications for the prevention of disease and the maintenance of health [3,48].
Reduced expression of miR-34a in cancer results in abnormalities in the p53-apoptotic pathway. This leads to uncontrolled proliferation of cells [49,50], which is a hallmark of cancer. The p53 tumour suppressor protein has long been recognised as a critical regulator of genes related to cell-cycle arrest (a significant pathway of interaction, as seen in Tables 2 and 4), apoptosis, increased DNA repair and/or inhibition of angiogenesis [51,52]. Therefore, there is merit in regulating the miR-34a-p53 relationship, as p53 inactivation via gene expression changes is one of the most frequent alterations seen in human cancers [49]. Furthermore, genistein treatment led to miR-34a re-expression in AsPC-1 pancreatic cancer cells, which contributed to a 30% inhibition of cell proliferation and increased apoptosis [32]. This study shows that genistein treatment can be used to modulate miR-34a expression and demonstrates the direct effects this modulation has on cancer cell growth. MiR-34a also presents a unique opportunity and need for further investigation because its regulation is not specific to one bioactive. Other bioactives such as curcumin [53], resveratrol [11] and polyunsaturated fatty acids [54] were also shown to modulate the expression of miR-34a. As miR-34a regulates key pathways in cancer, the fact that several bioactives can modulate it makes it a readily available target for cancer treatment. Genistein, butyrate and quercetin also modulated the expression of miR-200a-3p and miR-200b-3p. These two miRNAs are part of the miR-200 family and have been shown to modulate cancer invasion by regulating epithelial to mesenchymal transition (i.e. metastasis) [55]. MiR-574-3p is reportedly a tumour suppressor miRNA, and it has been inversely associated with post-operative tumour relapse in patients treated for esophageal squamous cell carcinoma [56].
MiR-221 and miR-222, modulated by genistein and butyrate, are paralogues and are often investigated as a pair. They regulate the tumour suppressor protein p27Kip1, which acts as a cell cycle inhibitor [57]. Importantly, they have oncogenic roles via the down-regulation of this protein, promoting the progression of cancer [58]. Furthermore, their clinical relevance in prostate cancer has led to the proposal that they be used as molecular markers for characterising cancer progression [59].
Similar to miR-221 and miR-222, miR-29a influences cell proliferation and the cell cycle by down-regulating p42.3; however, it acts as a tumour suppressor rather than an oncogene [60]. Interestingly, circulating levels of miR-29a have also been proposed as a biomarker for cancer progression in colorectal cancer [61]. Unfortunately, little is known about miR-3935, although it has been found to be down-regulated in some neurodegenerative disorders [62], and its up-regulation reduced proliferation and migration in A549 cells [23]. MiR-194 has been shown to inhibit epithelial to mesenchymal transition (i.e. metastasis) and possesses anti-proliferative activity, also through the p53 pathway [63]. Additionally, it has been shown to regulate the p53-mediated THBS1 gene (an inhibitor of angiogenesis) post-transcriptionally, thereby playing a part in angiogenesis [64]. Other miRNAs such as miR-424 are more specific and are important for monocytic differentiation, leukaemia and regulation of the cell cycle by causing arrest at the G1 stage [65]. Finally, let-7b is often grouped under the let-7 family. Let-7b is known to act as a tumour suppressor by suppressing oncogenes influencing cell growth and motility [66]. MiR-96-5p expression is upregulated in several cancers, and in breast cancer cells this overexpression inhibited autophagy and promoted cell proliferation, migration and invasion [67]. Collectively, these miRNAs encompass a wide array of cellular functions which, when aberrantly modified, lead to cancer development.
The results from the DIANA software revealed that our miRNA lists had statistically significant interactions with pathways such as "fatty acid biosynthesis" and "fatty acid metabolism". Interestingly, such pathways, which may seem irrelevant, have been shown in several studies to be over-activated in cancers, causing increased energy uptake and metabolism whilst promoting clinically aggressive behaviour in tumours, tumour cell growth and survival [68,69,70,71,72]. Other pathways such as "proteoglycans in cancer" and "miRNAs in cancer" consistently ranked among the most statistically significant pathways (Table 2-Table 4). Such results support both the literature evidence for diet-modulated miRNAs and the accuracy of the predictions. However, the software also brings to light the need for further exploration of these pathways through the lens of miRNA. The software also predicted other more specific pathways of interest such as "prostate cancer", "colorectal cancer", "bladder cancer" and many others. The "details" option in the DIANA software allows the user to see which miRNAs were predicted to interact with each pathway at a statistically significant level. When we consider the implications of this software, one of its strengths is its vast database of experimentally supported interactions. This reduces the time needed to find miRNAs of interest. Usually, large panels of miRNAs are assayed in order to see which have been altered; the software narrows this search for the researcher and thus reduces the time and resources needed for such assays. Furthermore, a "reverse search" option allows the user to input a pathway of interest, e.g. prostate cancer, and generates a list of statistically significant miRNA interactions from the Tarbase v7.0 database. Both these options allow the user to tailor and specify the miRNAs for investigation when looking into different pathways. While we focussed on cancer, other physiological pathways of interest could follow a similar model.
Importantly, when the DIANA software is compared to other software such as TargetScanHuman v7.1 [66], the Tarbase-based DIANA software helps overcome the high false positive rates seen in implementations which rely solely on in silico predictions, because it sources experimentally supported interactions. TargetScanHuman is also more difficult to navigate and has a smaller database of miRNAs. It predicts biological targets of miRNAs by searching for the presence of conserved sites (i.e. sites present across multiple species) that match the seed region of each miRNA [67], and predicts the efficacy of targeting using cumulative weighted context++ scores of the sites [66], as opposed to using p-values < 0.05 to rank the interactions of each miRNA with targeted pathways as in the DIANA software. One of the main logistical advantages of the DIANA software is the ability to upload large lists of miRNAs so that concurrent complex interactions can be calculated algorithmically, something which TargetScanHuman does not allow.
Although there are many strengths associated with the programme, there were several factors which limited its applicability. Firstly, there is a limit of 100 miRNAs that can be input at any one time. This restricts the possibility of large miRNA studies, as they have to be broken up into smaller sets. The strength of this programme is in the prediction and visualisation of networks between various miRNAs and relevant pathways. When lists have to be broken up, they are treated as separate entities, and this restricts the ability to form large networks of miRNA relationships and interactions. Secondly, the software is limited in the information that it can provide. Its algorithms can show the degree of significance of the interaction between a miRNA and a pathway, but nothing beyond this. The effects of down-regulation or up-regulation of certain miRNAs, as seen in our literature review, are not taken into consideration by the algorithm and thus cannot be presented. This is important as our review found that different bioactives had antagonistic effects on miR-34a. Hu et al. (2011) found that butyrate down-regulated the expression of miR-34a [17], while Hirata et al. (2013) found that genistein had the opposite effect, up-regulating its expression [25]. Interestingly, Sun et al. (2009) found that quercetin both up-regulated and down-regulated miR-34a [26]. The modulation of miR-34a by quercetin demonstrates an important point, namely that bioactives may in reality modulate miRNAs differently in different tissues. Ultimately, the DIANA software is reliant on the strength of the data within the database on which it draws. It is not clear how the DIANA software handles contradictory information. If a miRNA behaves differently in different tissues, we cannot assume that the software's treatment of this miRNA truly captures the nature of its activity.
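Because the direction of modulation is not carried through the pathway analysis, it can be tracked separately alongside it. The sketch below is a hypothetical helper, not a feature of the DIANA software, that flags miRNAs reported as both up- and down-regulated, using only the miR-34a observations described in the paragraph above.

```python
from collections import defaultdict

# (miRNA, bioactive, direction) records as described in the text for miR-34a.
observations = [
    ("miR-34a", "butyrate", "down"),
    ("miR-34a", "genistein", "up"),
    ("miR-34a", "quercetin", "up"),
    ("miR-34a", "quercetin", "down"),
]


def conflicting_mirnas(records):
    """Return miRNAs reported with more than one direction of modulation."""
    directions = defaultdict(set)
    for mirna, _bioactive, direction in records:
        directions[mirna].add(direction)
    return sorted(m for m, seen in directions.items() if len(seen) > 1)


print(conflicting_mirnas(observations))
```

Flagged miRNAs are candidates for closer inspection of the underlying studies, for example to check whether the conflicting directions come from different tissues or cell lines.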
Taking these factors into consideration, we still believe that the DIANA software is a crucial and novel tool for the lab. If it is used with its limitations in mind, we can vastly refine the scope of our search and investigate further those miRNAs predicted to be relevant.

Conclusion
To conclude, dietary bioactives modulate miRNAs, which in turn have effects on cancer development and progression. We aimed to evaluate the strength of evidence from the available literature regarding the usefulness of dietary bioactives whilst simultaneously demonstrating the features of the DIANA software and its ability to strengthen data from the literature. Because this evidence is sourced from experimentally supported interactions in Tarbase v7.0, we can be relatively confident, keeping the limitations in mind, that the information provided can be applied in the laboratory to investigate diet-miRNA interactions in the context of cancers. Rather than seeking a needle in a haystack, we liken the use of the DIANA software together with Tarbase to dramatically reducing the size of the miRNA pool to be investigated to something more manageable from a logistical and financial point of view.

Author Contributions
KSB conceived the study. SMJ performed the review of the literature and the analysis using the DIANA software, and prepared the first draft of the manuscript. KSB and SMJ jointly developed the structure and arguments for the manuscript. KSB and SMJ made critical revisions and approved the final version.

Funding
SMJ received financial support from the University of Auckland, School of Medicine Foundation.