Integration of a multi-omics stem cell differentiation dataset using a dynamical model

JournalFeeds

Authors:
Patrick R. van den Berg, et al.

Introduction

Much of the medical potential of pluripotent stem cells is due to their ability to differentiate into all cell types of the adult body [1]. While tremendous progress has been made in guiding cells through successive lineage decisions, the regulatory mechanisms underlying these decisions often remain unknown, especially at the post-transcriptional level. This gap in knowledge hampers the streamlining and acceleration of differentiation protocols.

A common first step towards finding novel gene regulatory relationships is the comprehensive, ideally genome-wide, measurement of gene expression dynamics. A large body of work has focused on charting transcriptome changes during differentiation, most recently down to the single-cell level [26]. While highly informative, such studies usually make the implicit assumption that mRNA levels are a good proxy for protein levels, despite widespread discordance observed in several mammalian systems [79]. The relationship between mRNA and protein abundance has been studied in various systems at different resolutions and time scales. Originally, mRNA-protein correlation was assessed across the genome (“across-gene correlation”[10]) in cell lines growing in steady state, where population-average abundances were measured with bulk omics methods. In this context, initial estimates claimed that only 40% of protein variability across the genome is explained by mRNA abundance in steady state [11]. Models of the protein to mRNA ratio explained up to two-thirds of the variability, when sequence features—such as the length of the coding sequence or amino acid frequencies—were taken into account [12]. Importantly, discordance between mRNA and protein abundance does not immediately imply specific regulation, as technical noise tends to reduce the observed correlation and conventional correction schemes typically ignore the effect of systematic, correlated errors [13]. In a comparison of relative protein levels across human tissues, about 50% of the variance in protein abundance was ascribed to post-transcriptional regulation [14]. All in all, a significant amount of protein abundance variability across the genome seems to remain unexplained, even if the effect of technical noise is considered.

Bulk measurements in unperturbed, steady state conditions can only reveal across-gene correlations. Single-cell methods or bulk measurements of dynamic systems, on the other hand, reveal fluctuations across cells or time, respectively, which allows us to study mRNA-protein correlation of individual genes (“within-gene correlation”[10]). The variability of mRNA-protein ratios in single-cells was found to be influenced by various factors such as protein half-life [15], as well as phenotypic state or microenvironment [16]. Within-gene mRNA-protein correlation can also be studied by measuring mRNA and protein abundances at the population level across time in a highly dynamic system, such as differentiating stem cells.

For this study, we collected a bulk multi-omics data set of retinoic acid (RA)-driven differentiation of mouse embryonic stem cells (mESCs). Samples taken over a period of 96 h were subjected to: mass spectrometry, bulk RNA-sequencing of nuclear and cytoplasmic fractions, as well as small RNA sequencing to quantify microRNA (miR) abundance. To describe protein dynamics, we refined a birth-death model [1722] by considering explicitly the cytoplasmic fraction of the mRNA and the influence of certain technical artefacts related to mass spectrometry. In contrast to steady-state models, dynamical models aim to infer kinetic rates for protein synthesis and degradation rather than explain absolute protein levels. Here, we show how such models can in principle be used to nominate candidate regulators of gene expression during stem cell differentiation. By assuming a specific effect of miRs on protein synthesis, we attempted to identify miRs with a potential regulatory function. Finally, we used mimics and inhibitors of several candidate miRs to follow up on the model’s predictions.

Results

Pervasive discordance between mRNA and protein in retinoic acid-driven mESC differentiation

We used RA differentiation of mESCs as a generic model for in vitro differentiation. Previously, we characterized this differentiation assay in detail at the transcriptional level by single-cell RNA-seq [2] and showed that RA exposure induces a bifurcation into extraembryonic endoderm-like and ectoderm-like cells. Here, we collected RNA and protein samples during an RA differentiation time course (Fig 1A). For each time point, we quantified poly(A) RNA by RNA-seq and protein expression by tandem mass tag (TMT) labeling followed by tandem mass spectrometry (MS/MS). In total, we obtained RNA and protein abundance estimates for 6271 genes (S1A–S1E Fig) at 8 time points in duplicate. After correction for batch effects due to separate sequencing runs (S1F Fig), we achieved highly similar results for the two biological replicates. To investigate in how far protein expression can be predicted from RNA expression, we started with the simplest conceivable model (termed naive here), which assumes that protein abundance is connected to RNA expression by a constant, gene-dependent scaling factor. This model is justified, if protein synthesis and degradation rates are constant and RNA expression changes slowly on the time scale of protein turnover, thus resulting in a quasi-steady state. Consequently, the protein-to-RNA ratio would be approximately constant over time. To test this model, we scaled both protein and RNA to their respective means, which should result in a constant protein-to-RNA ratio of 1, if the naive model is valid. We observed that for a large fraction of genes, the naive model is inaccurate, resulting in low coefficients of determination (R2) and low correlation (Figs 1B, 1C and S2A). In many cases, R2 assumes negative values, which means that the naive model performs worse than a model predicting a scaled protein level of 1 for all time points. (Note that only for linear regression models R2 is guaranteed to be positive and interpretable as the fraction of variance explained by the model.) For some genes, we even observed significant anti-correlation between RNA and protein (Fig 1B). While technical noise certainly contributes to this result, the high quality of our data sets (S1 Fig) suggests that a substantial part of the observed discordance is of biological origin. The assumptions of the naive model are therefore likely wrong for the majority of genes and a more sophisticated model is necessary to explain the relationship between RNA and protein.

thumbnail

Fig 1. Birth-death models outperform the naive model in predicting dynamics.

(A) Schematic overview of RA differentiation time course and subsequent omics measurements. (B) Example fit of the naive model. The naive model is a smoothing spline fit of RNA scaled to match the mean protein expression. (C) R2 distribution of the naive model. (D) Example fit of the total RNA (totRNA) model. (E) R2 distributions of the naïve and total RNA totRNA models. (F) Example fit of the total RNA and cytoplasmic RNA model, replicate 1. (G) R2 distributions of the total RNA and cytoplasmic RNA model. (H) Example fit of the cytoplasmic RNA and ci model, replicate 1. The height of the grey bar indicates the fitted ci parameter. (I) R2 distributions of the cytoplasmic RNA and ci model. Only genes that are improved by the ci model are shown. The distribution of all genes is shown in S2E Fig. (C,E,G,I) Some genes with extremely low R2 values are set to the minimum value of the plot for clarity. Corresponding Pearson’s r distributions are plotted in S2A–S2D Fig.


https://doi.org/10.1371/journal.pgen.1010744.g001

A simple birth-death model explains mRNA-protein discordance for most genes

To relax the assumption that expression is in steady state, we next considered a kinetic model that implements a birth-death process for protein dynamics (Eq 1).

(1)

In this total RNA model, protein production of gene g depends on total mRNA abundance R and the protein synthesis rate ks, while protein degradation depends on protein abundance P and the degradation rate kd. Synthesis and degradation rate are taken to be constant in time but gene-specific (as indicated by the index g). All processes related to protein production (initiation, elongation, etc.) are lumped into ks, while kd represents all processes leading to a reduction in protein levels (dilution due to cell division, active degradation, etc.). Similar models have been used previously to describe protein dynamics during the stress response in yeast [18], as well as embryonic development of Xenopus [19] and Drosophila [22]. We do not consider simpler, degenerate models (without ks and/or kd [19]), because these models are not relevant in our biological system: Synthesis and degradation always occur to some degree during stem cell differentiation. To reduce the influence of uninformative small fluctuations, we applied a smoothing spline to the abundance estimates prior to inferring model parameters by non-linear least-squares fitting. Compared to the naive model, R2 and correlation improved markedly for the total RNA model (Figs 1D, 1E and S2B), which is to be expected given the increase in model flexibility. To correct for a difference in the number of fit parameters and thus compare model performance fairly, we used the Bayesian information criterion (BIC, see Methods). According to the BIC, 3551 out of 4580 proteins were better fit by the kinetic model. These proteins are thus likely out of steady state, for the duration of the experiment, as a result of the differentiation cue. In summary, these results showed that a simple birth-death model outperforms the naive model of protein dynamics.

Despite the overall improvement observed with the total RNA model, R2 and correlation were still low for many proteins. We hypothesized that the remaining discrepancies could be explained by kinetic rates changing over time. To evaluate this hypothesis, we first sought to exclude technical limitations of our measurements as possible alternative explanations. We first considered the subcellular localization of mRNA. In our first experiment, we measured total poly(A) RNA, whereas only cytoplasmic mRNA is available for translation. Nuclear retention of mRNA is known to reduce transcriptional noise [23] and has been shown to contribute to translational regulation for specific genes [2426]. To measure the cytoplasmic mRNA fraction of each gene, we repeated the differentiation experiment in triplicate and separated cell lysates into a nuclear and a cytoplasmic fraction before performing RNA-seq. To obtain a global scaling factor between cytoplasmic and nuclear expression, we regressed total RNA (totRNA) reads, measured previously, on nuclear RNA (nuRNA) reads and cytoplasmic RNA (cyRNA) reads across all genes (see Methods). Then, the cytoplasmic fraction C was calculated for each gene and each time point. To our surprise, C did not vary substantially between genes (mean = 0.82, std = 0.02, calculated for a subset of 3,563 genes without any missing values) (S1G Fig). In addition, C also did not fluctuate much in time for individual genes (S1H Fig). Despite the low variability of C, we incorporated this parameter, leading to the cytoplasmic RNA model (Eq 2).

(2)

As to be expected, adding C brought overall only a subtle improvement (Fig 1G), although for individual cases, the improvement was significant (Fig 1F). We opted to fit further models including the cytoplasmic fraction, due to the overall slightly better performance.

Another potential confounder is inherent to the proteomics method we employed. TMT-based proteomics suffers from co-isolation interference (ci), a process in which two peptides are co-isolated in the second MS step. The contaminating peptide can interfere with the quantification of the peptide of interest and cause a constant offset [27,28]. To model a possible offset, we extended the model by an additional parameter (ci), quantifying the degree of fold-change compression for a protein. Thus, we assume ci to be constant for all TMT tags, i.e. time points, resulting in the ci model (Eq 3).

(3)

Effectively, including the parameter ci allows protein expression to have a bigger dynamic range, which can improve the fit for certain genes significantly (Figs 1H, 1I, S2D and S2E). Judged by the BIC, 557 genes were fit better including ci. All in all, this result reinforces the importance of considering co-isolation interference.

Including miRs improves model performance and identifies miR-mRNA interactions

Having incorporated important confounding factors, we sought to further extend the model by relieving the assumption of constant kinetic rates. Such a model would be able to capture protein synthesis and degradation varying across time, which likely happens during a process as dynamic as stem cell differentiation. A model in which ks and kd are time-dependent and completely arbitrary cannot be sufficiently constrained by our mRNA and protein measurements. Hence, we decided to focus on a specific regulatory mechanism, for which an additional, complementary data set could be obtained. Specifically, we explored the influence of miRs, which are known to be involved in gene regulation during differentiation [29]. In order to study the role of miRs in our system, we repeated the RA differentiation assay and measured the miRnome by small RNA-seq in quadruplicate. We quantified around 1000 mature miRs per time point (S1A, S1C and S1E Fig). For further analysis, we retained miRs with high reproducibility across replicates and high variance across time (S1I Fig). To identify possible effects of miRs, we focused on computationally predicted miR target genes. We further reduced the number of miR-mRNA interactions by filtering based on the “context score” provided by TargetScanMouse [30] (S1D Fig). This score combines information from mutiple sequence features and can be used to rank hits. In the end we retained 4527 genes, 560 unique mature miRs and 45,882 potential interactions between them (S1D and S1E Fig).

If multiple miRs with similar temporal profiles targeted the same gene, we considered them to be indistinguishable. Therefore, we grouped all miRs into six clusters by their temporal expression profiles (Fig 2A) and averaged over miRs from the same cluster targeting the same mRNA (Fig 2B). As miR binding is known to trigger translational repression [31], we modified the term describing protein synthesis in our model. In the interest of parsimony, we assumed a linear dependence of protein synthesis on miR abundance for this miR model (Eq 4).

(4)

Here, Mgm is the geometric mean abundance of miRs in cluster m targeting gene g and αgm parametrizes the interaction between those miRs and the target gene g. For each gene, we fit a model with one of the six miR clusters at a time and identified the improvement in model performance. Including miRs greatly improved the fits for some proteins, especially when there was a transient discordance between RNA and protein expression (Fig 2B). Typically, the “effective” mRNA abundance (cytoplasmic mRNA corrected for miR effects) was more dynamic than nominal mRNA abundance. For many of the genes that benefited from the addition of miRs, their influence is typically large: For these genes, translation was reduced up to 50% by the miR term in the model at peak miR expression (Fig 2C). Overall, the addition of a miR-dependent term significantly improved the coefficient of determination for a quarter of the proteins as determined by the BIC (Fig 2D). At this point we would like to stress that we made particular assumptions about the influence of miRs (miR binding only affects ks and the effect is linear in the abundance etc.). A more general model would have been under-constrained by the available data. Other conceivable interaction terms might result in improvements for different sets of genes. A better performance of the miR model, compared to simpler models, therefore does not prove the assumed regulatory mechanism.

thumbnail

Fig 2. The addition of miRs further improves the dynamical model for a subset of genes and suggests potential miR-mRNA interactions.

(A) Expression profiles of 560 miRs in six clusters. (B) Example fit of miR model for the gene Rab8a, replicate 1. First panel: expression of the assigned miRs of a single cluster. Colored lines are individual smoothing spline fits. Second panel: Cytoplasmic RNA expression and the effective RNA concentration available for translation (see Methods). Solid lines represent smoothing splines. Third/fourth panel: cytoplasmic RNA and miR model fits. (C) Distribution of inferred α for genes that benefit from miR model. (D) R2 distribution of the miR model and the next best model (either naive, total RNA, cytoplasmic RNA or ci). Only genes that benefit from the miR model are shown. Some genes with extremely low R2 values are set to the minimum value of the plot for clarity. The corresponding Pearson’s r distribution is shown in S2F Fig.


https://doi.org/10.1371/journal.pgen.1010744.g002

The best dynamical model explains 45% of total protein variance

While each model refinement introduced above improved model performance overall, each model achieved the lowest BIC only for a subset of genes (Fig 3A). In about 16% of cases the naive model was optimal, meaning that for these proteins none of the other models improved prediction by a significant amount (as judged by the BIC). 25% and 26% of genes were best predicted by the model without or with considering mRNA localization, respectively. Hence, for 51% of genes, protein abundance seemed to be out of steady state, but explainable by a simple model with fixed synthesis and degradation rates. For 8% of proteins, the model including co-isolation interference was optimal. The increased relative dynamic range due to subtracting a constant increased the fit for these genes significantly. Finally, 25% of proteins were fit optimally with a model including one of the miR clusters, indicating that there are likely additional regulatory mechanisms at play, potentially mediated by miRs. If the optimal model for each protein is chosen, only very few cases of negative R2 values remain (Fig 3B). Ignoring those, we found that the miR model explained 45% of the variance, when it is the optimal model (Fig 3C). In summary, the dynamic responses of 84% of the quantified proteins were signfificantly different from their mRNA counterparts. Therefore, it does not seem warranted to consider mRNA abundance a good proxy for protein levels in a highly dynamic setting.

thumbnail

Fig 3. Selecting the optimal model on a gene-by-gene basis increases the total explained variance of protein expression from 30% to 50%.

(A) Assignment of the optimal model for each gene based on the BIC. The number next to the miR bar indicates the miR cluster giving the best fit. (B) R2 distribution of the optimal fits from (A) and their naive model counterpart. Some genes with extremely low R2 values are set to the minimum value of the plot for clarity. (C) Median percentage of protein variance explained by each model. For each model, only those genes were included for which that model was the best. Fits with negative R2 were ignored.


https://doi.org/10.1371/journal.pgen.1010744.g003

Follow-up on the model predictions using miR inhibitors and mimics

Given that the model including miRs was optimal for 25% of proteins, we wanted to explore in how far our model is able to identify candidates for novel miR-mRNA interactions. To that end, we ranked all proteins by model performance (R2) and performance improvement (i.e. change in R2) compared to the simpler models without miRs (Fig 4A and S1 Table). Among this list of candidate genes, we selected seven genes (Rab8a, Cdk7, Pccb, Acad8, Mfge8, Eif4h and Srgap2) and their thirteen putative regulating miRs (Figs 2B, 4B and S3), for further investigation. To test the functional significance of those miRs, we set up a transfection assay with miR mimics and inhibitors in mESCs. To optimize mimic and inhibitor transfection, we first created two fluorescent reporter cell lines, based on a published approach [32,33] (S4A Fig). In these cell lines, a bi-directional promoter drives the expression of two fluorescent proteins, functioning as miR ‘reporter’ and ‘normalizer’, respectively. The bi-directional promoter guarantees highly correlated levels of transcription of both transcripts. Due to miR binding sites in the ‘reporter’ transcript, its expression is reduced relative to the ‘normalizer’, if the respective miR is present. We created reporter cell lines for a miR that is undetected in our system (mir-590-3p) and one that is highly expressed (miR292a-5p), in order to evaluate a mimic and an inhibitor, respectively. Flow cytometry measurements of the mir-590-3p reporter line transfected with a mir-590-3p mimic revealed a high percentage of transfected and regulated cells after 24 h (S4B Fig). Although the effect increased slightly over time, we picked 24 h as the ideal time point for evaluation in order to limit the extent of secondary effects (S4C Fig). The miR292a-5p inhibitor was slightly less effective in modulating miR292a-5p reporter signal, even at higher doses (S4D Fig). For the miR inhibitor, we selected 48 h transfection with 2X the suggested concentration (S4E Fig).

thumbnail

Fig 4. RNA-seq of mESCs transfected with mimics or inhibitors of miRs identified by the model.

(A) Scatter plot of the R2 values for the best miR model and the best non-miR model, see Fig 2D. Colored dots are defined by the cutoffs indicated in red and represent a subset of genes with a miR-mRNA interaction of higher confidence. Some genes with extremely low R2 values are set to the minimum value of the plot for clarity. (B) miR model fits of Acad8 and Eif4H, which belong to the subset highlighted in (A). (C) Expression levels (regularized counts scaled to scrambled control) of Acad8 and Eif4H after miR mimic (top) and miR inhibitor (bottom) transfection in 3 biological replicates. P-value shown is for an uncorrected one-sided test (see Methods). Differential expression of six more target genes is shown in S5 Fig. (D) Expression fold changes relative to scrambled control after mimic and inhibitor transfections for three miRs that target Acad8 and 2 miRs that target Eif4h. Distributions of the six more targets are shown in S6 Fig. The boxed genes are our proposed targets, additionally some known targets are shown. Red color indicates significantly differentially expressed genes (Padj < = 0.01).


https://doi.org/10.1371/journal.pgen.1010744.g004

Having set up optimal concentrations and timings for our transfection assays, we next set out to validate the predicted miR targets. Although our kinetic model was developed to predict regulation at the level of translation, we used mRNA abundance as a first, convenient readout, reasoning that mRNA degradation usually follows translational repression [31]. Cells were transfected with miR mimics and inhibitors for the thirteen aforementioned miRs and one additional miR as a control, for which we expected no effect (miR-216a-5p, which is predicted to target Leo1). mRNA samples from these experiments were subjected to RNA-seq. Differential expression analysis revealed significant downregulation in four out of eight targeted genes by at least one of the miR mimics (Figs 4C and S5A). Acad8 was downregulated by miR-433-3p (P = 1.7e–3), Cdk7 was downregulated by miR-99a-5p (P = 0.014), Eif4h was downregulated by both miR-152-3p (P = 0.015) and miR-467e-3p (P = 1.21e–19) and Srgap2 by miR-135b-3p. The observed downregulation supported the hypothesis that miR-mRNA interaction takes place in these five particular cases. The miR inhibitors on the other hand did not result in significant differential expression in any of the predicted targets (Figs 4C and S5B Fig). The negative control Leo1 was not differentially expressed, as expected (S5 Fig). Overall, the mimics of the thirteen miRs downregulated the predicted targets significantly in a combined analysis (Pfisher = 2.4e-3). On the contrary, the inhibitors did not have a significant opposite effect (Pfisher = 0.6), possibly due to a smaller change in miR abundance, compared to the mimics, and redundancy created by multiple miRs targeting the same gene.

As additional validation of our assay we looked at the measured differential expression for known targets of the perturbed miRs (Figs 4D and S6). Specifically, we focused on the miRs predicted to regulate Acad8 and Eif4h, which gave the largest effect sizes at the mRNA level. miR-23b-5p, which we predicted to target Acad8, is also known to target Hmgb2 [34] and thereby play a role in cardiac hypertrophy. Another predicted regulator of Acad8 was miR-433-3p, which regulates genes involved in development, like the transcription factor CREB [35] and the Egfr binding adaptor protein GRB2 [36]. The third predicted regulator of Acad8, miR-5615-3p, does not have any confirmed targets to the best of our knowledge. Eif4h was predicted to be regulated by miR-152-3p and miR-467e-3p the latter of which has no independently validated targets. Some of miR-152-3p’s known targets are the two cell cycle genes CDKN1B [37] and CDK8 [38], the pluripotency inducing KLF4 [39], and the DNA methyltransferase Dnmt1 [40]. With the exception of Hmgb2, all known targets were downregulated in the respective mimic assays, indicating that our experiments can validate existing interactions.

Finally, we wanted to test, whether some of the identified miR-mRNA interactions also have an effect on protein abundance. We repeated the transfections with miR inhibitors and mimics for miR433-3p (which targets Acad8), miR467e-5p (which targets Eif4h) and miR-99a-5p (which targets Cdk7) in mESCs and measured protein abundance by immunofluorescence and flow cytometry. For these experiments we used a 5X reagent concentration to increase the expected effect size. In the case of ACAD8 and EIF4h, both miR mimic and inhibitor lead to effects in the same direction, relative to the respective scrambled control (S7 Fig). For ACAD8, both the mimic and the inhibitor caused an increase in protein abundance, while for EIF4H both treatments caused a decrease in abundance. This observation can be explained by other regulatory mechanisms that (over)compensate for the perturbations or secondary effects of the selected miRs. For CDK7, miR-99a-5p inhibition resulted in a strong upregulation of protein abundance, while the miR mimic slightly reduced abundance (Fig 5). This observation is consistent with miR-99a-5p specifically targetting CDK7. As shown above, miR-99a-5p inhibition did not significantly increase Cdk7 mRNA levels (S5B Fig), which means that regulation occured at the level of translation.

thumbnail

Fig 5. CDK7 protein abundance is regulated by miR-99a-5p in mESCs.

Flow cytometry of CDK7 immunostaining in 4 biological replicates of mESCs treated with miR-99a-5p mimic, inhibitor or the respective scrambled controls.


https://doi.org/10.1371/journal.pgen.1010744.g005

All in all, the validation experiments showed that our model has the power to nominate potential regulators of protein abundance.

Discussion

In this study we set out to integrate a multi-omics data set on stem cell differentiation. A range of tools for the integration of multiple omics modalities, both at the bulk and single-cell cell level [41,42], already exist. The most recent approaches have started to aim for biologically meaningful integration by incorporating prior knowledge in various ways. For example, biological knowledge can inform priors of Bayesian models or the topology of networks that represent interactions connecting different modalities [43,44]. A third way to exploit biological relationships between data sets, which we adopted in this study, is the use of dynamical models [1722]. These models incorporate biophysical relationships between different types of molecules in a quantitative way and can also be used to infer kinetic parameters that have obvious biological interpretations17–22. We speculate that combinations of the mentioned approaches, exemplified by physics or systems-biology informed neural networks [45,46], will become powerful tools for data integration in the future.

In our time-resolved multi-omics data set we found overall low correlation between mRNA and protein abundance across time. Such low correlation has been observed in several systems, in particular: Xenopus development [19], C. elegans development [47], macrophage differentiation [48], mouse ESC differentiation [49] and the intestinal epithelium [50]. While the lack of strong correlation is typically interpreted as a sign of (post-)translational regulation [47,49], theoretical work showed that a simple delay between mRNA and protein production can lead to a reduction in gene-wise correlation [51,52]. A minimal model with constant kinetic rates explained the protein dynamics of a third of all genes during the stress response in yeast [18] and 75% of genes in Xenopus development [19]. In our system, 3552 out of 4580 genes were explained better by this model, compared to a naive model which assumes a constant protein-to-RNA ratio.

To explain the remaining discordance we explored the cytoplasmic fraction of the mRNA, but did not find a strong effect. A possible explanation could be that the nuclear fractionation method we used was not very effective and a substantial amount of cytoplasmic RNA remained in the nuclear fraction. While the mean cytoplasmic fraction measured here (0.82) was comparable to values reported in another study [23] of pancreatic beta cells (0.79) and liver cells (0.87), we observed much lower variability between genes compared to that previous study. This might either indicate that nuclear retention does not play a role in differentiation or that the nuclear fractionation method was not optimal. Intriguingly, Halpern et al. have shown, using single-cell methods and a dynamical model of mRNA turnover and nuclear export, that nuclear retention can reduce the variability of cytoplasmic mRNA abundance [23]. Reduced cytoplasmic noise likely leads to an increase in mRNA-protein correlation. It would be very insightful to employ single-cell methods and test whether similar mechanims are at play in stem cells and during differentiation. Whereas incorporating the cytoplasmic fraction only resulted in a small improvement, a model that includes co-isolation interference was optimal for almost 400 proteins. This result is in agreement with the known, confounding influence of co-isolation interference [27,28].

Overall, our analysis showed that for 16% of the proteins, the naive model is sufficient. From the mRNA abundances alone it is, however, impossible to predict, which genes would fall in this category. These results reinforce the notion that mRNA abundance should not be used without caution as a proxy for protein abundance [53]. Future work might reveal predictors (such as mRNA sequence composition, known binding motifs etc.) that might help to identify the regulatory mode of a particular gene. It will be interesting to see how much of the variance that remains unexplained by our most detailed model (55%) is due to technical noise or unexplored biological factors (regulation of protein degradation, control of protein turnover by RNA binding proteins etc.)

Having defined a simple birth-death model that includes the confounding effect of co-isolation interference, we next explored whether considering miRs could further improve model performance. miRs have been identified as key regulators of stem cell pluripotency and differentiation [29,54]. For example, members of the let-7 and miR-290 families have been implied as drivers of differentiation as well as the maintenance of pluripotency in ESCs [5457]. To find putative targets of miRs, various computational methods, typically based on sequence complementarity and conservation, have been developed [30,58,59]. These methods predict hundreds of thousands of interactions, among which are likely many false positives. The gold standard for validation, the luciferase assay, is time-consuming, which means that the majority of potential interactions have not been verified. More comprehensive, experimental methods to identify miR-mRNA interactions, such as HITS-CLIP or PAR-CLIP employ crosslinking of the RNA-induced silencing complex (RISC) with associated miRs and their targets, but they can suffer from low sensitivity and require the use of reagents that might perturb cell physiology [60]. Our modeling-based approach might complement these methods in several ways: No special reagents are necessary and miR-mRNA interactions are identified by effect size, which might help to find functionally relevant miRs. Our approach identified several candidate regulators and we were able to validate the repression of CDK7 protein expression by miR-99a-5p. Future studies might reveal the functional relevance of this interaction for the maintenace of pluripotency, differentiation or other aspects of stem cell biology. Once more we would like to stress that the miR effect incorporated in our model is an assumption. Our model is consistent with the finding that miRs tend to repress translation, next to promoting mRNA degradation [61]. Yet, due to the under-constrained nature of the problem, a better fit of a model incorporating the assumed miR effect is not to be taken as evidence that this effect is (exclusively) at play. Our follow-up experiments suggested the presence of additional effects, not considered in our model, that significantly modulate the effect of miR inhibition or overexpression. To demonstrate the postulated regulatory mechanism conclusively, protein translation would have to be measured directly, for example by ribosome profiling or pulse-chase labeling and mass spectrometry [62].

In conclusion, our study demonstrates that biological relationships between datasets can be leveraged for the integration of multi-omics experiments. Developing optimal integration algorithms will be a continuing challenge as more and more types of molecular profiling data become available. We hope that our time-resolved multi-omic dataset will be a rich resource for the discovery of gene regulatory mechanism in stem cell differentiation.

Materials and methods

Experimental methods

Cell culture.

E14 mouse embryonic stem cells were cultured as previously described [2]. Briefly, cells were grown in modified 2i medium [63]: DMEM/F12 (Life technologies) supplemented with 0.5x N2 supplement, 0.5x B27 supplement, 4mM L- glutamine (Gibco), 20 μg/ml human insulin (Sigma-Aldrich), 1x 100U/ml penicillin/streptomycin (Gibco), 1x MEM Non-Essential Amino Acids (Gibco), 7 μl 2-Mercaptoethanol (Sigma-Aldrich), 1 μM MEK inhibitor (PD0325901, Stemgent), 3 μM GSK3 inhibitor (CHIR99021, Stemgent), 1000 U/ml mouse LIF (ESGRO). Cells were passaged every other day with Accutase (Life technologies) and replated on gelatin coated tissue culture plates (Cellstar, Greiner bio-one). During transfections, cells were temporarily cultured in serum+LIF medium (10% ES certified FBS, 1X non-essential amino acids, 0.1mM β-mercaptoethanol, 1X pen/strep, 2mM L-glutamine, 10,000U/ml mLIF, mLIF from Merck, rest from Thermo Fisher Scientific). miR reporter cell line clone selection took place on homegrown mouse embryonic fibroblast feeders.

Cloning.

The miReporter backbone (AddGene, Plasmid #82478) was transformed into DH5a competent cells (Cat. 18265017, Thermo Fisher Scientific) as per manufacturer’s instructions. Then, transformed cells were expanded and plasmids harvested by miniprep (Qiaprep, Qiagen). A set of two oligos was synthesized for each of the two reporter cell lines: miR-590-3p-fwd: 5’-GATCG TAATTTTATGTATAAGCTAGT AAGCTTC-3’, miR-590-3p-rev: 5’-CTAGGAAGCTT ACTAGCTTATACATAAAATTA C-3’, miR-292a-5p-fwd: 5’-GATCG ACTCAAACTGGGGGCTCTTTTG AAGCTTC-3’, miR-292a-5p-rev: 5’-CTAGGAAGCTT CAAAAGAGCCCCCAGTTTGAGT C-3’ (Integrated DNA Technologies, see S4A Fig). Pairs of oligos were annealed and phosphorylated in a thermocycler: 30m at 37°C, 5m at 95°C, for 12 cycles (1μM fwd oligo, 1μM rev oligo, 1X T4 buffer, 1U/μl T4 Polynucleotide Kinase; buffer and enzyme from New England Biolabs). Next, backbone digestion and ligation was performed in one step in a thermocycler, which was facilitated by the ligated inserts destroying the restriction sites for the enzymes (S4A Fig): 5m at 37°C, 5m at 23°C, for 12 cycles (1:2500 dilution of phosphorylated oligo duplex, 2.5ng/μl backbone, 5% v/v DTT, 0.15U/μl BamHI, 0.5U/μl NheI, 1U/μl T4 ligase, 1X restriction buffer; T4 from New England Biolabs, rest from Thermo Fisher Scientific). Plasmids were then amplified in the same manner as the backbone: Plasmids were transformed into DH5a cells, which were then expanded and used for midiprep extraction (Plasmid Midi, Qiagen).

miR mimic and inhibitor transfection.

Pre-miR miRNA Precursors (Thermo Fisher Scientific) were used as miR mimics. These double stranded RNAs are designed to be processed by the cell to result in mature miRs. miRCURY LNA miRNA Power Inhibitor (Qiagen) was used for miR inhibition. This reagent blocks miRs by complementary binding to the mature miR with high affinity (due to the presence of LNA bases). mESCs and both miReporter cell lines were transfected with miR mimic and inhibitors in identical fashion. See S2 Table for a list of all miR reagents. Cells were seeded in 2i medium 48 h prior to transfection in 12-well plates. Half an hour before transfection, the culture medium was replaced by 500 μl of 2i medium supplemented with 10% FBS to allow the cells to flatten. Lipid complexes (Lipofectamine RNAiMax, Thermo Fisher Scientific) were prepared at the ratios recommended by the manufacturer but siRNA was replaced with either miR mimic or miR inhibitor. miR inhibitor, pre-miR or negative control stock (10 mM), were mixed with 3 μl of Lipofectamine RNAiMax solution and KO DMEM medium to a total volume of 75 μL. We considered 100 nM of mimic/inhibitor in this mixture a 1X concentration. The obtained transfection mix was incubated for 5 min at RT before being added to the cell medium. After 24h (or 48h for the 2X inhibitor experiments on the miReporter cells), the transfected cells were collected and fixed in 4% PFA for 15 min at 4°C. See also section ‘flow cytometry’.

Mass spectrometry.

Pelleted cells were lysed in 400 μl RIPA buffer, except for the sorted cells. Volumes of cell lysate corresponding to 100 μg protein per sample were digested with trypsin using a modified FASP protocol [64]. Subsequently each sample was labeled with TMT 10-plex, 6-plex or 11-plex reagent (Thermo Fisher)) according to the manufacturer’s protocol. All labeled samples were combined into a set-sample. Which labels were assigned to each sample is specified in the specification table. The labeled set-sample was fractionated by electrostatic repulsion-hydrophilic interaction chromatography (ERLIC) run on an HPLC 1200 Agilent system using PolyWAX LP column (200×2.1 mm, 5 μM, 30nm, PolyLC Inc, Columbia, MD) and a fraction collector (Agilent Technologies, Santa Clara, CA). Set-samples were fractionated into a total of 40 ERLIC fractions. Each ERLIC fraction was subsequently further separated by online nano-LC and submitted for tandem mass spectrometry analysis to both LTQ OrbitrapElite or Q exactive high field (HF). One third of each fraction was injected from an auto-sampler into the trapping column (75 um column ID, 5 cm length packed with 5 um beads with 20 nm pores, from Michrom Bioresources, Inc.) and washed for 15 min; the sample was eluted to analytic column with a gradient from 2 to 32% of buffer B (0.1% formic acid in ACN) over 180 min gradient and fed into LTQ OrbitrapElite or Q exactive HF. The instruments were set to run in TOP 20 MS/MS mode method with dynamic exclusion. After MS1 scan in Orbitrap with 60K resolving power, each ion was submitted to an HCD MS/MS with 60K resolving power and to CID MS/MS scan subsequently. All quantification data were derived from HCD spectra.

Computational methods

Flow cytometry data analysis.

Live cell gating and all other analysis of the flow cytometry data was achieved using custom R scripts (FlowCore v1.52.1 [65]). To determine relative down- or upregulation of Citrine expression in the miReporter lines, we calculated the ratio Citrine/mCherry for each cell in the mimic/inhibitor assays and scrambled controls. For the mimic experiments, miReporter downregulation was considered succesfully if the Citrine/mCherry ratio was lower than the 1st percentile of of the signal in miReporter cells transfected with the scrambled control mimic. For the inhibition experiments, miReporter upregulation was considered succesfully if the Citrine/mCherry ratio was higher than the 99th percentile of of the signal in miReporter cells transfected with the scrambled control mimic.

Rate model fitting.

Several rate models were fit for every gene for which data from each measurment modality was available (S1E Fig). RNA and protein expression were scaled by dividing each by the average expression across time. Next a smoothing spline (smooth.spline function, base R) was applied to the total RNA data for each replicate with 7 degrees of freedom (DF). DF was fixed because the automatic inferrence of DF by smooth.spline sometimes sometimes lead to the oversimplification of RNA dynamics and thereby bad fits at the protein level if the protein had more dynamics than the resulting spline. Therefore, a fit was deemed more conservative, if high dynamics were forced at the RNA level at the cost of introducing some noise. smooth.spline was used determine the DF for all other smoothing spline fits using leave-one-out-cross-validation. Smoothing splines fit to the C-fraction were multiplied with the smooth total RNA to get smooth cytoplasmic RNA. miRs that were assigned to each gene were first averaged over replicates and then divided by the miRs maximum value. Smoothing splines were fit to each miR and the smooth miR profiles for each miR cluster were averaged. The differential equations were solved using deSolve (v1.28), given a rate model, parameters, total cytoplasmic RNA and a miR cluster. Instead of using ks and kd directly as parameters, log2(kprod) and log2(kdiv), with kprod = ks · kd and kdiv = ks/kd, were used as parameters to increase the robustness of the optimization. An additional fit parameter not mentioned in the main text was, P0, the protein concentration at t = 0 h, which was used to allow for measurement error on the protein abundance. The parameter ci was constrained to be between 0 and the minimal observed protein expression, as co-isolation interference cannot exceed the measured values. α was constrained to be between 0 and 1. Since both α and scaled miR expression cannot exceed 1, translation can only be completely repressed at peak miR cluster expression. Optimal parameters were found using the optim function (base R). Sum of squared residuals (SSR) were minimized using the “L-BFGS-B” method of the optim function. For the models described by Eqs 2 and 3, there was an initial optimization step minimizing SSR + 10 · log2(kdiv) · 2. Due to the scaling of RNA and protein, log2(kdiv) is expected to be close to 0 so we penalized any divergence from the expected value to get a better estimate for log2(kprod) first,. Without the initial regularization, the fits sometimes resulted in extreme parameter values. The resulting parameters were used as initial values for the final unpenalized fits. The Bayesian Information Criterion (BIC) was used to compare models with different numbers of parameters: BIC = k · ln(n)– 2 · SLL, where k is the number of parameters, n is the number of samples, and SLL the sum of log-likelihood. k is 0 for the naive model, 3 for Eqs 1 and 4 for Eqs 2 and 3. n = 8, the number of time points. The error of the fits was assumed to be normally distributed in order to calculate the SLL. When comparing models, the model with the lower BIC was considered superior.

Supporting information

S2 Fig. Model performance comparison with Pearson’s r.

(A-D) Pearson’s r distribution of various kinetic models. Corresponding R2 distributions are shown in Fig 1. (E) R2 distributions of the cytoplasmic RNA and ci model for all genes. The R2 distribution of the subset of genes that are best fit by the ci model is shown in Fig 1I. (F) Pearson’s r distribution of the miR model and the next best model (either naive, total RNA, cytoplasmic RNA or ci). Only genes that are best fit by the miR model are shown. Corresponding R2 distributions are plotted in Fig 2D.

https://doi.org/10.1371/journal.pgen.1010744.s002

(TIF)

S4 Fig. Dose and timing for miR mimic and inhibitor transfection experiments can be obtained using fluorescent reporters of miR activity.

(A) miReporter plasmid, inserts and digestion sites (BamHI and NheI). The insert overhangs are compatible with BamHI and NheI, but block redigestion. See Methods for full cloning strategy. (B) Inhibition of the miR-590-3p reporter transcript by the miR-590-3p mimic for seven time points as measured by flow cytometry. The asterisk indicates the optimal transfection timing shown in D (24h). (C) Fluorescence signal of miR-590-3p reporter for miR-590-3p mimic or scrambled control at optimal transfection conditions. Blue line indicates 1st percentile of reporter/normalizer ratio of the scrambled control. (D) Reduction of inhibition of the miR-292a-5p reporter transcript by the miR-292a-5p inhibitor for three time points at three transfection concentrations as measured by flow cytometry. The asterisk indicates the optimal transfection timing shown in E (2days, 2X). (E) Fluorescence signal of miR-292a-5p reporter for miR-292a-5p inhibitor or scrambled control at optimal transfection conditions. Blue line indicates 99th percentile of reporter/normalizer ratio of the scrambled control.

https://doi.org/10.1371/journal.pgen.1010744.s004

(TIF)

S5 Fig. Differential expression of predicted targets after miR mimic and miR inhibitor transfection.

(AB) Expression levels (regularized counts scaled to scrambled control) of Cdk7, Leo1, Mfge8, Pccb, Rab8a, and Srgap2 after miR mimic (A) and miR inhibitor (B) transfection. P-value shown is for an uncorrected one-sided test (see Methods). Differential expression of two more targets is shown in Fig 4C. Note that Leo1 was not predicted to be regulated by miR-216a-5p and is included as a negative control.

https://doi.org/10.1371/journal.pgen.1010744.s005

(TIF)

References

  1. 1.
    Soldner F, Jaenisch R. iPSC Disease Modeling. Science. 2012;338: 1155–1156. pmid:23197518
  2. 2.
    Semrau S, Goldmann JE, Soumillon M, Mikkelsen TS, Jaenisch R, Oudenaarden A van. Dynamics of lineage commitment revealed by single-cell transcriptomics of differentiating embryonic stem cells. Nat Commun. 2016;8: 1096. pmid:29061959
  3. 3.
    Loh KM, Chen A, Koh PW, Deng TZ, Sinha R, Tsai JM, et al. Mapping the Pairwise Choices Leading from Pluripotency to Human Bone, Heart, and Other Mesoderm Cell Types. Cell. 2016;166: 451–467. pmid:27419872
  4. 4.
    Klein AM, Mazutis L, Akartuna I, Tallapragada N, Veres A, Li V, et al. Droplet Barcoding for Single-Cell Transcriptomics Applied to Embryonic Stem Cells. Cell. 2015;161: 1187–1201. pmid:26000487
  5. 5.
    Cuomo ASE, Seaton DD, McCarthy DJ, Martinez I, Bonder MJ, Garcia-Bernardo J, et al. Single-cell RNA-sequencing of differentiating iPS cells reveals dynamic genetic effects on gene expression. Nat Commun. 2020;11: 810. pmid:32041960
  6. 6.
    Kumar P, Tan Y, Cahan P. Understanding development and stem cells using single cell-based analyses of gene expression. Development. 2017;144: 17–32. pmid:28049689
  7. 7.
    Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, et al. Global quantification of mammalian gene expression control. Nature. 2011;473: 337–342. pmid:21593866
  8. 8.
    Wilhelm M, Schlegl J, Hahne H, Gholami AM, Lieberenz M, Savitski MM, et al. Mass-spectrometry-based draft of the human proteome. Nature. 2014;509: 582–587. pmid:24870543
  9. 9.
    Edfors F, Danielsson F, Hallström BM, Käll L, Lundberg E, Pontén F, et al. Gene-specific correlation of RNA and protein levels in human cells and tissues. Mol Syst Biol. 2016;12: 883. pmid:27951527
  10. 10.
    Buccitelli C, Selbach M. mRNAs, proteins and the emerging principles of gene expression control. Nat Rev Genet. 2020;21: 630–644. pmid:32709985
  11. 11.
    Vogel C, Marcotte EM. Insights into the regulation of protein abundance from proteomic and transcriptomic analyses. Nat Rev Genet. 2012;13: 227–232. pmid:22411467
  12. 12.
    Vogel C, Abreu R de S, Ko D, Le S, Shapiro BA, Burns SC, et al. Sequence signatures and mRNA concentration can explain two-thirds of protein abundance variation in a human cell line. Mol Syst Biol. 2010;6: 400–400. pmid:20739923
  13. 13.
    Csárdi G, Franks A, Choi DS, Airoldi EM, Drummond DA. Accounting for Experimental Noise Reveals That mRNA Levels, Amplified by Post-Transcriptional Processes, Largely Determine Steady-State Protein Levels in Yeast. Snyder M, editor. Plos Genet. 2015;11: e1005206. pmid:25950722
  14. 14.
    Franks A, Airoldi E, Slavov N. Post-transcriptional regulation across human tissues. Plos Comput Biol. 2017;13: e1005535. pmid:28481885
  15. 15.
    Raj A, Peskin CS, Tranchina D, Vargas DY, Tyagi S. Stochastic mRNA Synthesis in Mammalian Cells. Plos Biol. 2006;4: e309. pmid:17048983
  16. 16.
    Popovic D, Koch B, Kueblbeck M, Ellenberg J, Pelkmans L. Multivariate Control of Transcript to Protein Variability in Single Mammalian Cells. Cell Syst. 2018;7: 398–411.e6. pmid:30342881
  17. 17.
    Teo G, Vogel C, Ghosh D, Kim S, Choi H. PECA: A Novel Statistical Tool for Deconvoluting Time-Dependent Gene Expression Regulation. J Proteome Res. 2014;13: 29–37. pmid:24229407
  18. 18.
    Tchourine K, Poultney CS, Wang L, Silva GM, Manohar S, Mueller CL, et al. One third of dynamic protein expression profiles can be predicted by a simple rate equation. Mol Biosyst. 2014;10: 2850–2862. pmid:25111754
  19. 19.
    Peshkin L, Wühr M, Pearl E, Haas W, Freeman RM, Gerhart JC, et al. On the Relationship of Protein and mRNA Dynamics in Vertebrate Embryonic Development. Dev Cell. 2015;35: 383–394. pmid:26555057
  20. 20.
    Jovanovic M, Rooney MS, Mertins P, Przybylski D, Chevrier N, Satija R, et al. Dynamic profiling of the protein life cycle in response to pathogens. Science. 2015;347: 1259038. pmid:25745177
  21. 21.
    Teo G, Zhang YB, Vogel C, Choi H. PECAplus: statistical analysis of time-dependent regulatory changes in dynamic single-omics and dual-omics experiments. Npj Syst Biology Appl. 2017;4: 3. pmid:29263799
  22. 22.
    Becker K, Bluhm A, Casas-Vila N, Dinges N, Dejung M, Sayols S, et al. Quantifying post-transcriptional regulation in the development of Drosophila melanogaster. Nat Commun. 2018;9: 4970. pmid:30478415
  23. 23.
    Bahar Halpern K, Caspi I, Lemze D, Levy M, Landen S, Elinav E, et al. Nuclear Retention of mRNA in Mammalian Tissues. Cell Reports. 2015;13: 2653–2662. pmid:26711333
  24. 24.
    Prasanth KV, Prasanth SG, Xuan Z, Hearn S, Freier SM, Bennett CF, et al. Regulating Gene Expression through RNA Nuclear Retention. Cell. 2005;123: 249–263. pmid:16239143
  25. 25.
    Iampietro C, Bergalet J, Wang X, Cody NAL, Chin A, Lefebvre FA, et al. Developmentally Regulated Elimination of Damaged Nuclei Involves a Chk2-Dependent Mechanism of mRNA Nuclear Retention. Dev Cell. 2014;29: 468–481. pmid:24835465
  26. 26.
    Graindorge A, Carré C, Gebauer F. Sex-lethal promotes nuclear retention of msl2 mRNA via interactions with the STAR protein HOW. Gene Dev. 2013;27: 1421–1433. pmid:23788626
  27. 27.
    Savitski MM, Mathieson T, Zinn N, Sweetman G, Doce C, Becher I, et al. Measuring and Managing Ratio Compression for Accurate iTRAQ/TMT Quantification. J Proteome Res. 2013;12: 3586–3598. pmid:23768245
  28. 28.
    Sandberg A, Branca RMM, Lehtiö J, Forshed J. Quantitative accuracy in mass spectrometry based proteomics of complex samples: The impact of labeling and precursor interference. J Proteomics. 2014;96: 133–144. pmid:24211767
  29. 29.
    Ivey KN, Srivastava D. MicroRNAs as Regulators of Differentiation and Cell Fate Decisions. Cell Stem Cell. 2010;7: 36–41. pmid:20621048
  30. 30.
    Agarwal V, Bell GW, Nam J-W, Bartel DP. Predicting effective microRNA target sites in mammalian mRNAs. Elife. 2015;4: e05005. pmid:26267216
  31. 31.
    Wilczynska A, Bushell M. The complexity of miRNA-mediated repression. Cell Death Differ. 2015;22: 22–33. pmid:25190144
  32. 32.
    Sladitschek HL, Neveu PA. Bidirectional Promoter Engineering for Single Cell MicroRNA Sensors in Embryonic Stem Cells. Mari B, editor. Plos One. 2016;11: e0155177. pmid:27152616
  33. 33.
    Sladitschek HL, Neveu PA. The bimodally expressed microRNA miR-142 gates exit from pluripotency. Mol Syst Biol. 2015;11: 850. pmid:26690966
  34. 34.
    Oumarou DB, Ji H, Xu J, Li S, Ruan W, Xiao F, et al. Involvement of microRNA-23b-5p in the promotion of cardiac hypertrophy and dysfunction via the HMGB2 signaling pathway. Biomed Pharmacother. 2019;116: 108977. pmid:31103821
  35. 35.
    Sun S, Wang X, Xu X, Di H, Du J, Xu B, et al. MiR-433-3p suppresses cell growth and enhances chemosensitivity by targeting CREB in human glioma. Oncotarget. 2016;8: 5057–5068. pmid:27926502
  36. 36.
    Shi Q, Wang Y, Mu Y, Wang X, Fan Q. MiR-433-3p Inhibits Proliferation and Invasion of Esophageal Squamous Cell Carcinoma by Targeting GRB2. Cell Physiol Biochem. 2018;46: 2187–2196. pmid:29730656
  37. 37.
    Wang L, Wang Y, Lin J. MiR-152-3p promotes the development of chronic myeloid leukemia by inhibiting p27. Eur Rev Med Pharmaco. 2018;22: 8789–8796. pmid:30575920
  38. 38.
    Yin T, Liu M-M, Jin R-T, Kong J, Wang S-H, Sun W-B. miR-152-3p Modulates hepatic carcinogenesis by targeting cyclin-dependent kinase 8. Pathology—Res Pract. 2019;215: 152406. pmid:30967300
  39. 39.
    Feng F, Liu H, Chen A, Xia Q, Zhao Y, Jin X, et al. miR-148-3p and miR-152-3p synergistically regulate prostate cancer progression via repressing KLF4. J Cell Biochem. 2019;120: 17228–17239. pmid:31104329
  40. 40.
    Sun J, Tian X, Zhang J, Huang Y, Lin X, Chen L, et al. Regulation of human glioma cell apoptosis and invasion by miR-152-3p through targeting DNMT1 and regulating NF2. J Exp Clin Canc Res. 2017;36: 100. pmid:28764788
  41. 41.
    Cantini L, Zakeri P, Hernandez C, Naldi A, Thieffry D, Remy E, et al. Benchmarking joint multi-omics dimensionality reduction approaches for the study of cancer. Nat Commun. 2021;12: 124. pmid:33402734
  42. 42.
    Jackson CA, Vogel C. New horizons in the stormy sea of multimodal single-cell data integration. Mol Cell. 2022;82: 248–259. pmid:35063095
  43. 43.
    Vitrinel B, Koh HWL, Kar FM, Maity S, Rendleman J, Choi H, et al. Exploiting Interdata Relationships in Next-generation Proteomics Analysis*. Mol Cell Proteomics. 2019;18: S5–S14. pmid:31126983
  44. 44.
    Cao Z-J, Gao G. Multi-omics single-cell data integration and regulatory inference with graph-linked embedding. Nat Biotechnol. 2022;40: 1458–1466. pmid:35501393
  45. 45.
    Karniadakis GE, Kevrekidis IG, Lu L, Perdikaris P, Wang S, Yang L. Physics-informed machine learning. Nat Rev Phys. 2021;3: 422–440.
  46. 46.
    Yazdani A, Lu L, Raissi M, Karniadakis GE. Systems biology informed deep learning for inferring parameters and hidden dynamics. Plos Comput Biol. 2020;16: e1007575. pmid:33206658
  47. 47.
    Grün D, Kirchner M, Thierfelder N, Stoeckius M, Selbach M, Rajewsky N. Conservation of mRNA and Protein Expression during Development of C. elegans. Cell Reports. 2014;6: 565–577. pmid:24462290
  48. 48.
    Kristensen AR, Gsponer J, Foster LJ. Protein synthesis rate is the predominant regulator of protein expression during differentiation. Mol Syst Biol. 2013;9: 689–689. pmid:24045637
  49. 49.
    Lu R, Markowetz F, Unwin RD, Leek JT, Airoldi EM, MacArthur BD, et al. Systems-level dynamic analyses of fate change in murine embryonic stem cells. Nature. 2009;462: 358–362. pmid:19924215
  50. 50.
    Harnik Y, Buchauer L, Ben-Moshe S, Averbukh I, Levin Y, Savidor A, et al. Spatial discordances between mRNAs and proteins in the intestinal epithelium. Nat Metabolism. 2021;3: 1680–1693. pmid:34931081
  51. 51.
    Gedeon T, Bokes P. Delayed Protein Synthesis Reduces the Correlation between mRNA and Protein Fluctuations. Biophysical Journal. 2012;103: 377–385. Available: http://www.sciencedirect.com/science/article/pii/S0006349512006820 pmid:22947853
  52. 52.
    Munsky B, Neuert G. From analog to digital models of gene regulation. Phys Biol. 2015;12: 045004. pmid:26086470
  53. 53.
    Liu Y, Beyer A, Aebersold R. On the Dependency of Cellular Protein Levels on mRNA Abundance. Cell. 2016;165: 535–550. pmid:27104977
  54. 54.
    Rahkonen N, Stubb A, Malonzo M, Edelman S, Emani MR, Närvä E, et al. Mature Let-7 miRNAs fine tune expression of LIN28B in pluripotent human embryonic stem cells. Stem Cell Res. 2016;17: 498–503. pmid:27776272
  55. 55.
    Li MA, He L. microRNAs as novel regulators of stem cell pluripotency and somatic cell reprogramming. Bioessays. 2012;34: 670–680. pmid:22674461
  56. 56.
    Kumar RM, Cahan P, Shalek AK, Satija R, DaleyKeyser AJ, Li H, et al. Deconstructing transcriptional heterogeneity in pluripotent stem cells. Nature. 2014;516: 56–61. pmid:25471879
  57. 57.
    Lichner Z, Páll E, Kerekes A, Pállinger É, Maraghechi P, Bősze Z, et al. The miR-290-295 cluster promotes pluripotency maintenance by regulating cell cycle phase distribution in mouse embryonic stem cells. Differentiation. 2011;81: 11–24. pmid:20864249
  58. 58.
    Dweep H, Sticht C, Pandey P, Gretz N. miRWalk–Database: Prediction of possible miRNA binding sites by “walking” the genes of three genomes. J Biomed Inform. 2011;44: 839–847. pmid:21605702
  59. 59.
    Griffiths-Jones S, Saini HK, Dongen S van, Enright AJ. miRBase: tools for microRNA genomics. Nucleic Acids Res. 2007;36: D154–D158. pmid:17991681
  60. 60.
    Hafner M, Katsantoni M, Köster T, Marks J, Mukherjee J, Staiger D, et al. CLIP and complementary methods. Nat Rev Methods Primers. 2021;1: 20.
  61. 61.
    Huntzinger E, Izaurralde E. Gene silencing by microRNAs: contributions of translational repression and mRNA decay. Nat Rev Genet. 2011;12: 99–110. pmid:21245828
  62. 62.
    Iwasaki S, Ingolia NT. The Growing Toolbox for Protein Synthesis Studies. Trends Biochem Sci. 2017;42: 612–624. pmid:28566214
  63. 63.
    Ying Q-L, Wray J, Nichols J, Batlle-Morera L, Doble B, Woodgett J, et al. The ground state of embryonic stem cell self-renewal. Nature. 2008;453: 519–523. pmid:18497825
  64. 64.
    Wiśniewski JR, Zougman A, Nagaraj N, Mann M. Universal sample preparation method for proteome analysis. Nat Methods. 2009;6: 359–362. pmid:19377485
  65. 65.
    Hahne F, LeMeur N, Brinkman RR, Ellis B, Haaland P, Sarkar D, et al. flowCore: a Bioconductor package for high throughput flow cytometry. Bmc Bioinformatics. 2019;10: 106. pmid:19358741

Source link