Bioinformatics Workflows with Snakemake

Time:
9am - 5pm each day
Venue:
online
Places:
24 for each workshop
You will be contacted by our finance team for full payment. Once payment is made, your place will be confirmed and full details sent by our training team.
Registration fee:
University of Edinburgh Staff/Students - £340
Other University Staff/Students - £360
Industry staff - £390
Information:
Overview
Researchers who need to implement data analysis workflows face a number of common challenges: organising their tasks, making effective use of compute resources, handling unexpected errors in processing, and documenting and sharing their methods. The Snakemake workflow system provides effective solutions to these problems. By the end of the course, you will be confident in using Snakemake to tackle complex workflow problems in your day-to-day research.
About Snakemake
Snakemake is a popular open-source tool for creating reproducible and scalable data analyses. Workflows are described in a human-readable language that defines each step of the workflow as a rule. Snakemake uses these rules to construct and execute a work plan that yields the desired output. Re-calculation of existing results is avoided where possible, so you can add or update input data, then efficiently generate an updated result. Workflows can be scaled to server, cluster, grid and cloud environments without modifying the workflow definition.
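To make the idea of a work plan concrete, here is a minimal Python sketch of the principle described above. It is not Snakemake code and not how Snakemake is implemented; the function name `plan`, the file names and the timestamps are all hypothetical, and modification times are passed in as plain numbers rather than read from disk. It only illustrates how working backwards from a target, and comparing the ages of inputs and outputs, lets a workflow tool skip results that are already up to date.

```python
def plan(target, rules, mtimes):
    """Return the outputs that must be (re)built, in dependency order.

    rules  : dict mapping an output name to the list of inputs it is made from
    mtimes : dict mapping each existing file name to its modification time
    """
    steps = []       # outputs to build, in the order they must run
    rebuilt = set()  # outputs already scheduled for rebuilding

    def visit(name):
        if name not in rules:
            # A source file: it must already exist, nothing to build.
            if name not in mtimes:
                raise FileNotFoundError(name)
            return
        inputs = rules[name]
        for dep in inputs:
            visit(dep)
        # An output is stale if it is missing, if any input is about to be
        # rebuilt, or if any input is newer than the output itself.
        stale = (
            name not in mtimes
            or any(dep in rebuilt for dep in inputs)
            or any(mtimes[dep] > mtimes[name] for dep in inputs if dep in mtimes)
        )
        if stale and name not in rebuilt:
            rebuilt.add(name)
            steps.append(name)

    visit(target)
    return steps


# Hypothetical two-sample workflow: align each sample, then count.
rules = {
    "counts.tsv": ["sample_a.bam", "sample_b.bam"],
    "sample_a.bam": ["sample_a.fastq"],
    "sample_b.bam": ["sample_b.fastq"],
}

# sample_b.fastq (time 5) was updated after sample_b.bam (time 3) was made.
mtimes = {
    "sample_a.fastq": 1,
    "sample_a.bam": 2,
    "sample_b.bam": 3,
    "counts.tsv": 4,
    "sample_b.fastq": 5,
}

print(plan("counts.tsv", rules, mtimes))  # ['sample_b.bam', 'counts.tsv']
```

Only the alignment for the updated sample and the final count are scheduled; the untouched sample is left alone. Snakemake itself takes more into account than file ages when deciding what to re-run, so treat this purely as an illustration of the principle.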
Who this course is for
This course is intended for researchers who need to automate data analysis tasks for biological research involving next-generation sequence data, for example RNA-seq analysis, variant calling, ChIP-seq, bacterial genome assembly, etc. Attendees must have a working knowledge of the Linux Bash command line - our 1-day "Linux for bioinformatics" course is a suitable background.
The language used to write Snakemake workflows is Python-based, but no prior knowledge of Python is required or assumed.
Instructors
Tim Booth - Bioinformatician and Software Developer, Edinburgh Genomics
Frances Turner - Bioinformatician and Software Developer, Edinburgh Genomics
Workshop format
The workshop consists of presentations and hands-on tutorials.
Outcomes
By the end of the course students should be familiar with:
- Standard input, standard output, standard error
- Shell variables and environment variables
- Interpolation
- Loops and conditionals in shell scripts
- Parent and child processes
- Process return codes
- Signals
- Use of Conda environments
- Designing, implementing and testing a new workflow
- Defining rules with complex inputs and outputs
- Dynamic processing based on configuration
- Choosing effective file naming schemes
- Using regular expressions in rule definitions
- Using Python and shell syntax within workflows
- Snakemake invocation options
- Temporary and protected files