Workflow Management Using Snakemake¶
In this webinar you’ll learn about Snakemake, a workflow management system consisting of a text-based workflow specification language and a scalable execution environment. A workflow management system (WMS) is software that sets up, performs, and monitors a defined sequence of computational tasks (i.e. “a workflow”). Snakemake was developed in the bioinformatics community, and as such it has features that make it particularly well suited to creating reproducible and scalable data analyses.
Topics covered:¶
- Snakemake syntax and basics
- Visualization and workflow management
- Example RNAseq workflow of MRSA
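As a first taste of the syntax, the sketch below writes a two-rule Snakefile into a scratch directory and, if `snakemake` happens to be installed, performs a dry run followed by a real run. The file names (`data/input.txt`, `results/upper.txt`) and the rule name `to_upper` are made up for illustration and are not part of the MRSA workflow used later.

```shell
# Work in a scratch directory so nothing else is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical input file for the toy workflow.
mkdir -p data
echo "hello snakemake" > data/input.txt

# Write a minimal Snakefile. The quoted 'EOF' stops the shell from
# expanding {input} and {output}; Snakemake fills those in itself.
cat > Snakefile <<'EOF'
rule all:
    input:
        "results/upper.txt"

rule to_upper:
    input:
        "data/input.txt"
    output:
        "results/upper.txt"
    shell:
        "tr '[:lower:]' '[:upper:]' < {input} > {output}"
EOF

# Only call Snakemake when it is actually on the PATH.
if command -v snakemake >/dev/null 2>&1; then
    snakemake -n          # dry run: list the jobs without executing them
    snakemake --cores 1   # execute the workflow on a single core
    cat results/upper.txt # the upper-cased copy of the input
else
    echo "snakemake not found; the Snakefile was written to $workdir"
fi
```

The `all` rule only names the final target. Snakemake works backwards from that file to the rule whose output matches it, so `to_upper` is the only rule that has to describe how a file is produced.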
Prerequisites¶
Downloads, access, and services¶
To complete this tutorial you will need access to the following services/software:
Prerequisite | Preparation/Notes | Link/Download |
---|---|---|
CyVerse account | You will need a CyVerse account to complete this exercise | Register |
- Familiarity with the terminal
- UNIX intro
- Python and R knowledge would be beneficial
Platform(s)¶
We will use the following CyVerse platform(s):
Platform | Interface | Link | Platform Documentation | Learning Center Docs |
---|---|---|---|---|
Discovery Environment | Web/Point-and-click | Discovery Environment | DE Manual | Guide |
Quick Launch Snakemake-VICE Jupyter lab app¶
OR search within Discovery Environment¶
1. Log in to the Discovery Environment.
2. Click on the “Apps” tab in the Discovery Environment and search for “snakemake”.
3. Under “Analysis Name” leave the defaults or make any desired notes.
Note
The app comes pre-loaded with the software required for performing RNAseq analysis.
4. Under “Resource Requirements” request resources as needed or leave the defaults.
5. Click Launch Analysis. You will receive a notification in your notification tab that the job has been submitted and is running.
Note
You will be notified when the analysis has finished successfully.
6. Click on the “Analyses” button to display the dashboard of your analyses. Click on your analysis name to navigate to that analysis folder in your data store.
- Click here for the tutorial.
RNA-seq analysis of MRSA Workflow¶
- Clone RNAseq Snakemake tutorial repository
git clone https://github.com/NBISweden/workshop-reproducible-research.git
cd workshop-reproducible-research/docker/
git checkout devel
ls
- Generate rulegraph
snakemake --rulegraph | dot -Tpng > rulegraph_mrsa.png
- Dry-Run RNAseq Snakefile
snakemake -n
- Run RNAseq Snakefile
snakemake --cores 8
Note
Here we used the package snakemake-minimal, a slimmed-down version that lacks some features, in particular those relating to cloud computing and to interacting with remote providers such as Google Drive or Dropbox.
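The three Snakemake commands above can be wrapped in a small script that first checks that the required tools are available. This is only a sketch: `check_tools` is a hypothetical helper and not part of Snakemake, and the script assumes it is started from the directory that contains the tutorial's Snakefile.

```shell
# Hypothetical helper: report whether each named tool is on the PATH.
# Returns 0 if all tools were found, 1 if any is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Run the steps only when snakemake, dot (Graphviz) and a Snakefile exist.
if check_tools snakemake dot && [ -f Snakefile ]; then
    snakemake --version                                     # which Snakemake is installed
    snakemake --rulegraph | dot -Tpng > rulegraph_mrsa.png  # draw the rule graph
    snakemake -n                                            # dry run: show planned jobs
    snakemake --cores 8                                     # run with up to 8 cores
else
    echo "Skipping: snakemake, dot, or a Snakefile is missing here."
fi
```

Checking for `dot` up front avoids a half-written `rulegraph_mrsa.png` when Graphviz is not installed, since the shell creates the output file before the pipeline runs.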
Other Workflow Managers¶
- CCTools offers Makeflow, a workflow management system similar to Snakemake, and WorkQueue, which scales workflows up through distributed computing for customized and efficient use of resources. Read more here.
- Nextflow
Additional information, help¶
- Snakemake Read The Docs
- Snakemake Tutorial
- Contact CyVerse support by clicking the intercom button on the page.