Workflow Management Using Snakemake

In this webinar you’ll learn about Snakemake, a workflow management system consisting of a text-based workflow specification language and a scalable execution environment. A workflow management system (WMS) is software that sets up, performs, and monitors a defined sequence of computational tasks (i.e., a “workflow”). Snakemake was developed in the bioinformatics community, and as such it has features that make it particularly well suited for creating reproducible and scalable data analyses.
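
To make the idea of a text-based workflow specification concrete, here is a minimal sketch of a Snakefile with three rules. It is not part of the webinar material: the rule names, file paths, and the `demo_dir` variable are hypothetical and chosen only for illustration. Each rule declares its input and output files, and Snakemake uses those declarations to work out the order in which the rules must run.

```shell
# Write a small demonstration Snakefile into a scratch directory.
# demo_dir and all rule and file names below are illustrative assumptions.
demo_dir=$(mktemp -d)

# The quoted 'EOF' keeps the shell from expanding anything inside the file.
cat > "$demo_dir/Snakefile" <<'EOF'
# The first rule is the default target; it only lists the final output.
rule all:
    input:
        "results/hello_upper.txt"

# Creates a text file from nothing.
rule make_greeting:
    output:
        "results/hello.txt"
    shell:
        "echo 'hello snakemake' > {output}"

# Depends on make_greeting because its input is that rule's output.
rule to_upper:
    input:
        "results/hello.txt"
    output:
        "results/hello_upper.txt"
    shell:
        "tr '[:lower:]' '[:upper:]' < {input} > {output}"
EOF

# Plan the workflow without executing it, if Snakemake is installed.
if command -v snakemake >/dev/null 2>&1; then
    (cd "$demo_dir" && snakemake -n)
else
    echo "snakemake not found; Snakefile written to $demo_dir/Snakefile"
fi
```

Because `to_upper` consumes the file that `make_greeting` produces, Snakemake should schedule `make_greeting` first without being told the order explicitly.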

Topics covered:

  • Snakemake syntax and basics
  • Visualization and workflow management
  • Example RNAseq workflow of MRSA

Note

Special thanks to the instructors at NBIS (ELIXIR-SE) for the tutorial used here. This tutorial is one part of a 6-part course on tools for reproducible research. Read more here.

  • See the Snakemake slides here and as a PDF.

Prerequisites

Downloads, access, and services

To complete this tutorial you will need access to the following services and software:

Prerequisite | Preparation/Notes | Link/Download
CyVerse account | You will need a CyVerse account to complete this exercise | Register
    • familiarity with the terminal
    • UNIX intro
    • Python and R knowledge would be beneficial

Platform(s)

We will use the following CyVerse platform(s):

Platform | Interface | Link | Platform Documentation | Learning Center Docs
Discovery Environment | Web/Point-and-click | Discovery Environment | DE Manual | Guide


Quick Launch Snakemake-VICE Jupyter lab app

  • Right-click the button below and log in to the CyVerse Discovery Environment to quick-launch the Snakemake-VICE Jupyter lab app.

    smake-vice

OR search within Discovery Environment

  1. Log in to the Discovery Environment.

  2. Click on the “Apps” tab in the Discovery Environment and search for “snakemake”.

  3. Under “Analysis Name” leave the defaults or make any desired notes.

    Note

    The app comes pre-loaded with the software required for performing RNAseq analysis.

  4. Under “Resource Requirements” request resources as needed or leave the defaults.

  5. Click Launch Analysis. You will receive a notification in your notification tab that the job has been submitted and is running.

Note

You will be notified when the analysis has finished successfully.

  6. Click on the “Analyses” button to display the dashboard of your analyses. Click on your analysis name to navigate to that analysis folder in your data store.

  7. Click here for the tutorial.

RNA-seq analysis of MRSA Workflow

  • Clone RNAseq Snakemake tutorial repository
git clone https://github.com/NBISweden/workshop-reproducible-research.git

cd workshop-reproducible-research/docker/

git checkout devel

ls
  • Generate rulegraph
snakemake --rulegraph | dot -Tpng > rulegraph_mrsa.png
  • Dry-run the RNAseq Snakefile (lists the jobs that would run without executing them)
snakemake -n
  • Run RNAseq Snakefile
snakemake --cores 8
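
The three commands above can also be chained so that the real run starts only if the dry run succeeds. The following is a minimal sketch under stated assumptions: `run_workflow` is a hypothetical helper name, not part of Snakemake or the tutorial repository, and it assumes it is called from a directory containing a Snakefile.

```shell
# Hypothetical helper: plan, draw, then execute a Snakemake workflow.
run_workflow() {
    # Number of cores; defaults to 8 to match the command shown above.
    local cores="${1:-8}"

    # Stop early if Snakemake is not on PATH.
    if ! command -v snakemake >/dev/null 2>&1; then
        echo "snakemake not found on PATH" >&2
        return 1
    fi

    # Dry run first: abort if the workflow cannot be planned.
    snakemake -n || return 1

    # Draw the rule graph only when Graphviz's dot is available.
    if command -v dot >/dev/null 2>&1; then
        snakemake --rulegraph | dot -Tpng > rulegraph_mrsa.png
    else
        echo "dot not found; skipping rule graph" >&2
    fi

    # Execute the workflow with the requested number of cores.
    snakemake --cores "$cores"
}
```

From the directory that holds the Snakefile, `run_workflow 4` would run the workflow on four cores, while `run_workflow` alone would fall back to eight.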

Note

Here we used the package snakemake-minimal. This is a slimmed-down version that lacks some features, in particular those relating to cloud computing and interacting with remote providers such as Google Drive or Dropbox.

Other Workflow Managers

  • CCTools offers Makeflow, a workflow management system similar to Snakemake, and Work Queue, which scales workflows up through distributed computing for customized and efficient use of resources. Read more here.
  • Nextflow

Additional information, help

