Introduction
SMCSMC, short for Sequential Monte Carlo for the Sequentially Markovian Coalescent, is a method for estimating ancestral population size and migration history from sequence data. It has several advantages over comparable methods, especially when you are interested in analysing complex demographic models.
The method uses a particle filter to sample from the posterior distribution of trees along the sequence, and Variational Bayes to infer epoch-specific demographic parameters over a given number of iterations.
smcsmc takes as input optionally phased sequencing data formatted as segments, and provides utilities for analysing and visualising the inferred ancestral rates.
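To give a feel for the particle filter idea mentioned above, here is a minimal toy sketch of a bootstrap particle filter. It is not SMCSMC's implementation: SMCSMC's particles are genealogical trees updated along the sequence, whereas this example assumes a scalar hidden state following a Gaussian random walk with noisy observations. The function name, parameters, and model are all illustrative assumptions.

```python
import math
import random

def bootstrap_particle_filter(observations, n_particles=500,
                              step_sd=1.0, obs_sd=1.0, seed=0):
    """Toy bootstrap particle filter (assumed model, not SMCSMC's).

    Hidden state: x_t = x_{t-1} + N(0, step_sd^2)
    Observation:  y_t = x_t + N(0, obs_sd^2)
    Returns the estimated posterior mean of x_t after each observation.
    """
    rng = random.Random(seed)
    # Start from a broad prior centred on zero.
    particles = [rng.gauss(0.0, 5.0) for _ in range(n_particles)]
    means = []
    for y in observations:
        # 1. Propagate every particle through the transition model.
        particles = [x + rng.gauss(0.0, step_sd) for x in particles]
        # 2. Weight each particle by the likelihood of the observation
        #    (Gaussian density up to a constant factor).
        weights = [math.exp(-0.5 * ((y - x) / obs_sd) ** 2) for x in particles]
        total = sum(weights)
        if total == 0.0:
            # All weights underflowed; fall back to uniform weights.
            weights = [1.0] * n_particles
            total = float(n_particles)
        # 3. Record the weighted mean as the posterior estimate.
        means.append(sum(w * x for w, x in zip(weights, particles)) / total)
        # 4. Resample particles in proportion to their weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return means

estimates = bootstrap_particle_filter([2.0] * 20)
print(f"final estimate: {estimates[-1]:.2f}")
```

The propagate, weight, resample loop is the part that carries over conceptually; in SMCSMC the "state" being propagated is a tree rather than a number.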
Installation
We highly recommend installing SMCSMC from conda, as it comes packaged with all necessary dependencies. A separate guide for manual compilation may be found in the developer reference. See here for a helpful guide to installing and using conda to manage programs.
First add both conda-forge and terhorst to your channel lists (if they are not there already), then install smcsmc.
conda config --add channels conda-forge
conda config --add channels terhorst
conda install smcsmc
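One quick way to confirm that the installation succeeded is to check that the two binaries it provides, smcsmc and scrm, are discoverable on your PATH. The sketch below uses only the Python standard library; the helper name find_binaries is hypothetical and is not part of the smcsmc package.

```python
import shutil

def find_binaries(names=("smcsmc", "scrm")):
    """Map each binary name to its full path, or None if it is not on PATH.

    Hypothetical helper for illustration; not provided by smcsmc itself.
    """
    return {name: shutil.which(name) for name in names}

for name, path in find_binaries().items():
    print(f"{name}: {path or 'not found on PATH'}")
```

If either binary is reported as not found, the conda environment in which smcsmc was installed is probably not the active one.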
Basic Usage
To use SMCSMC, start a Python session and import the smcsmc module. As part of the installation above, two binaries are installed into conda-bin: smcsmc (inference) and scrm (simulation). The front end, smcsmc, is a wrapper around these binaries that provides convenient functions for data manipulation, conversion, and plotting, along with utilities for the workflow of analysing sequences with SMCSMC. With a test seg file such as this one, the following will run a default session of SMCSMC.
import smcsmc

# Minimal arguments: the input seg file and the number of samples.
test_args = {
    'seg': 'test_seg.seg',
    'nsam': 4
}
smcsmc.utils.run_smcsmc(test_args)
Follow the Getting Started guide to become familiar with the basic structure and function of SMCSMC commands, then look at one of the tutorials for analysing simulated or real data. For a more complete guide to arguments, see Input Arguments. Alternatively, the command-line interface can be used with identical results:
smc2 -nsam 4 -seg test_seg.seg
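The argument dictionary and the command line above carry the same information. As a rough illustration of that correspondence, the sketch below flattens a dictionary into command-line tokens. The helper args_to_cli is hypothetical, and it assumes that every key maps to a single-dash flag of the same name followed by one value, as in the two examples in this section; it does not describe how smcsmc translates arguments internally.

```python
def args_to_cli(args, program="smc2"):
    """Flatten an argument dict into a list of command-line tokens.

    Hypothetical helper: assumes each key becomes "-key value".
    """
    tokens = [program]
    for key, value in args.items():
        tokens.append(f"-{key}")
        tokens.append(str(value))
    return tokens

# Keys are emitted in insertion order, so nsam is listed first here
# to mirror the command shown in this section.
test_args = {"nsam": 4, "seg": "test_seg.seg"}
print(" ".join(args_to_cli(test_args)))
```

Returning a token list rather than a single string means the result can be passed directly to subprocess.run without shell quoting concerns.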
Other Methods
SMCSMC is part of the PopSim consortium, and we are actively involved in building a framework to standardize population genetic analyses. Part of this involves making it easy to run the same analysis with many different methods. We have built smcsmc with this goal in mind. For the latest information about comparisons between different population genetic software, including smc++, stairwayplot, msmc, and dadi/fastcoal, check out the PopSim analysis repository.
Citation
If you use smcsmc in your work, please cite the following articles:
- Henderson, D., Zhu, S. (Joe), & Lunter, G. (2018). Demographic inference using particle filters for continuous Markov jump processes. BioRxiv, 382218. https://doi.org/10.1101/382218
- Staab, P. R., Zhu, S., Metzler, D., & Lunter, G. (2015). scrm: efficiently simulating long sequences using the approximated coalescent with recombination. Bioinformatics, 31(10), 1680–1682. https://doi.org/10.1093/bioinformatics/btu861