Snakemake
Registration
Register HERE
Dates
24-27 March 2025
9.00am – 5.00pm each day
Venue
Online
Places
24 for each workshop
You will be contacted by our finance team for full payment. Once payment is made, your place will be confirmed and full details sent by our training team.
Registration fee
£371 – University of Edinburgh staff/students
£390 – Other university or registered charity staff/students
£408 – Industrial researchers
Information
This course is for researchers who need to automate data analysis tasks for biological research involving next-generation sequencing data, for example RNA-seq analysis, variant calling, ChIP-seq or bacterial genome assembly.
Snakemake is a popular open-source tool to create reproducible and scalable data analyses. Workflows are described via a Python-based language that defines steps in the workflow as rules, and these are then used by Snakemake to construct and execute a work plan to yield the desired output. Re-calculation of existing results is avoided where possible, so you can add or update input data, then efficiently generate an updated result. Workflows can be seamlessly scaled to server, cluster, grid and cloud environments without the need to modify the workflow definition.
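To make the planning idea above concrete, here is a toy sketch in plain Python. It is not Snakemake syntax and not how Snakemake is implemented: the rule names (`trim`, `count`), the file paths and the single `{sample}` wildcard are all hypothetical, and the only staleness check is a comparison of file modification times. Snakemake's real planner is considerably more sophisticated.

```python
import os
import tempfile

# Hypothetical toy rules: each one says which input file produces which output file.
# "{sample}" is a wildcard that is filled in when a rule is matched against a target.
RULES = [
    {"name": "count", "output": "counts/{sample}.txt", "input": "trimmed/{sample}.fq"},
    {"name": "trim", "output": "trimmed/{sample}.fq", "input": "reads/{sample}.fq"},
]


def match(pattern, path):
    """Return the wildcard value if path fits pattern, otherwise None."""
    prefix, suffix = pattern.split("{sample}")
    fits = path.startswith(prefix) and path.endswith(suffix)
    if fits and len(path) > len(prefix) + len(suffix):
        return path[len(prefix):len(path) - len(suffix)]
    return None


def plan(target, root):
    """Work backwards from target and list the jobs that need to run, in order."""
    for rule in RULES:
        sample = match(rule["output"], target)
        if sample is None:
            continue
        source = rule["input"].format(sample=sample)
        jobs = plan(source, root)  # plan whatever the input itself needs first
        out_path = os.path.join(root, target)
        in_path = os.path.join(root, source)
        stale = (
            bool(jobs)  # an upstream job will re-create the input
            or not os.path.exists(out_path)
            or os.path.getmtime(out_path) < os.path.getmtime(in_path)
        )
        if stale:
            jobs.append((rule["name"], target))
        return jobs
    # No rule produces this file, so it must already exist as raw input.
    if not os.path.exists(os.path.join(root, target)):
        raise FileNotFoundError(target)
    return []


def touch(root, rel_path, mtime):
    """Create a file if needed and give it a fixed modification time."""
    path = os.path.join(root, rel_path)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "a"):
        pass
    os.utime(path, (mtime, mtime))


with tempfile.TemporaryDirectory() as root:
    touch(root, "reads/a.fq", 100)
    first = plan("counts/a.txt", root)   # nothing built yet
    touch(root, "trimmed/a.fq", 200)     # pretend both jobs have now run
    touch(root, "counts/a.txt", 300)
    second = plan("counts/a.txt", root)  # outputs are newer than inputs
    touch(root, "reads/a.fq", 400)       # the raw input is updated
    third = plan("counts/a.txt", root)

print(first)
print(second)
print(third)
```

In this sketch the first plan contains both jobs, the second is empty because every output is newer than its input, and the third contains both jobs again because the raw input changed. In Snakemake itself, rules are written in a Snakefile and the tool does this bookkeeping for you.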
A key appeal of workflow systems like Snakemake is that our workflows can be re-used, published and re-mixed as open-source code. We look at WorkflowHub.eu and other online resources for workflow sharing, and at how best to prepare our own workflows so that others can re-use them.
Instructors
- Tim Booth (Bioinformatics Developer, EdGe)
- Frances Turner (Bioinformatics Analyst, EdGe)
Workshop format
The workshop consists of episodes where we interactively present new topics, try them out together, and set short exercises. Towards the end there is a longer practical challenge.
You will be provided with a cloud-based Linux virtual machine environment to run all the workflows and tools.
Who should attend
Attendees must have a working knowledge of how to run commands and navigate directory structures on the Linux BASH command line – our 1-day “Linux for bioinformatics” course is a suitable background.
No knowledge of Python is required.
Covered topics
Day 1
- Welcome and set-up
- Running commands with Snakemake
- Placeholders and wildcards
- Chaining rules
- How Snakemake plans what jobs to run
Day 2
- Processing lists of inputs
- Handling awkward programs
- Configuring workflows
Day 3
- Optimising workflow performance
- Conda integration
- Constructing a whole new workflow
- Cleaning up
Day 4
- Re-using and sharing your workflows
- Where to find and share workflows online
- Best practices to make your code re-usable
- Choosing a test dataset
- Source code control and versioning