Glossary
Analyte:
-
Object to be identified and measured.
Binomial branching process:
-
Stochastic process used to model a population as a sequence of discrete random variables $X_i$ over time units $i$, where the transition between consecutive states is governed by a binomial distribution.
Burn-in:
-
An initial period of iterations of an MCMC method, before the covariance matrix is updated and the transition kernel adopts a new shape.
Expression matrix:
-
Each row of the expression matrix represents a gene and each column represents a cell (NB: sometimes the transpose is used). Each entry represents the expression level of a particular gene in a given cell. The units in which expression is measured depend on the protocol and the normalization strategy used for the data. Many analyses of scRNA-seq data take an expression matrix as their starting point.
Fat hairy caterpillar:
-
The shape of the trace plot generated during inference, indicating that the Markov chain is well-mixed.
Gene dropout:
-
A gene being observed at a moderate expression level in one cell but not being detected in another cell.
Heteroplasmy:
-
The presence of more than one type of organellar genome (mitochondrial DNA or plastid DNA) within a cell or individual. It can be expressed as a ratio $m/(w+m)$, where $m$ is the number of mutant DNA molecules present and $w$ is the number of wildtype molecules.
Jeffreys prior:
-
A non-informative prior distribution proportional to the square root of the determinant of the Fisher information matrix.
Manhattan skyline:
-
The shape of the trace plot generated during inference, indicating that the Markov chain is 'hot'.
Prior:
-
Probability distribution of a variable, incorporating prior knowledge into the model before inference is performed.
Posterior:
-
Probability distribution of a variable given the observed data; it is the target of inference on experimental data.
Sequencing depth:
-
A measure related to the number of unique reads that include a given nucleotide in the reconstructed sequence. Deep sequencing refers to the general concept of aiming for a high number of unique reads of each region of a sequence.
Sequencing read:
-
The sequence of a section of a unique fragment of RNA transcript.
Wildtype:
-
Non-mutant DNA molecule.
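Two of the entries above, the binomial branching process and the heteroplasmy ratio, can be illustrated with a short sketch. It assumes one common form of the transition, in which each existing molecule is copied independently with probability `r` per time unit, so that $X_{i+1} = X_i + \mathrm{Binomial}(X_i, r)$; the glossary itself does not fix this form. The function names `simulate_branching` and `heteroplasmy` are hypothetical and chosen for illustration only.

```python
import random


def simulate_branching(x0, r, n_steps, rng):
    """Simulate X_0, ..., X_n under the assumed rule X_{i+1} = X_i + Binomial(X_i, r)."""
    counts = [x0]
    for _step in range(n_steps):
        current = counts[-1]
        # Assumption: each molecule is duplicated independently with probability r,
        # so the number of new copies is a Binomial(current, r) draw.
        new_copies = sum(1 for _molecule in range(current) if rng.random() < r)
        counts.append(current + new_copies)
    return counts


def heteroplasmy(m, w):
    """Return the heteroplasmy ratio m / (w + m) for m mutant and w wildtype molecules."""
    if m + w == 0:
        raise ValueError("heteroplasmy is undefined when no molecules are present")
    return m / (w + m)


rng = random.Random(0)  # fixed seed so the run is reproducible
trajectory = simulate_branching(x0=10, r=0.9, n_steps=5, rng=rng)
print(trajectory)                  # one simulated population trajectory
print(heteroplasmy(m=30, w=70))    # 0.3
```

Under this rule the population can at most double in a single step, and it never shrinks. The explicit per-molecule loop keeps the sketch dependency-free but is only practical for small populations.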
Abbreviations and Notation
$\alpha$:
-
The parameter describing the proportionality of the fluorescence intensity emitted at a given cycle to the number of molecules at that point.
AIC:
-
Akaike information criterion.
AM:
-
Adaptive Metropolis (algorithm).
cDNA:
-
complementary DNA.
ddPCR:
-
Droplet digital PCR.
HMM:
-
Hidden Markov Model.
MCMC:
-
Markov chain Monte Carlo.
MH:
-
Metropolis-Hastings (algorithm).
mtDNA:
-
mitochondrial DNA.
PCR:
-
polymerase chain reaction.
qPCR:
-
quantitative PCR.
$r$:
-
The efficiency of the qPCR experiment.
RNAseq:
-
RNA sequencing.
$\sigma$:
-
the variance contributing to the noise term in the fluorescence expression.
scRNAseq:
-
single-cell RNA sequencing.
$X_0$:
-
The initial number of molecules present in the single cell, the primary target analyte.
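As a reading aid, the symbols $\alpha$, $r$, $\sigma$ and $X_0$ can be tied together in one hedged sketch. It assumes a binomial branching model of amplification, in which each molecule is copied with probability $r$ per cycle, and an additive noise term on the fluorescence; the exact noise form used in the main text may differ. The symbols $F_i$ (fluorescence at cycle $i$), $B_i$ (new copies made at cycle $i$) and $\varepsilon_i$ (a noise term whose spread is governed by $\sigma$) are introduced here for illustration only.

$$
X_{i+1} = X_i + B_i, \qquad B_i \sim \mathrm{Binomial}(X_i, r), \qquad F_i = \alpha X_i + \varepsilon_i
$$

Taking expectations of the first relation gives $\mathbb{E}[X_{i+1} \mid X_i] = (1 + r) X_i$, and therefore

$$
\mathbb{E}[X_n \mid X_0] = X_0 (1 + r)^n .
$$

For example, with $X_0 = 100$, $r = 0.9$ and $n = 10$, the expected count is $100 \times 1.9^{10} \approx 6.1 \times 10^{4}$ molecules.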