Essential Bioinformatics



Date:

TBD

Venue:

Online

Places:

20 (first come, first served)

Registration fee:

University of Edinburgh Staff/Students - £TBD

Non-University of Edinburgh Staff/Students - £TBD

Information:

Contact our training team


This three-day workshop introduces you to two commonly used programming environments that will unlock your potential to handle big data and carry out bioinformatic analysis.

Firstly, the Linux command line is where most bioinformatic analyses start. This powerful environment is found on virtually all high-performance computing systems capable of the computationally intensive work of manipulating raw genomic data. Here, we introduce the basic commands that allow you to navigate file systems, handle data, and run programmes. We then advance to writing your first shell scripts and working with bioinformatics tools.

Days two and three focus on the statistical programming language ‘R’. You will learn the basics of R, including manipulating data frames, performing iterative tasks, and writing simple functions, before applying these skills to genomic data structures and learning to visualise your data to generate publication-ready plots.


 

Instructors

Nathan Medd

Tim Booth

Workshop format

The workshop consists of guided tutorials and hands-on exercises. The first day will be spent gaining hands-on experience of the Linux command line. Days two and three will mostly be spent working in R, with some short lectures on the concepts behind the language.
 

Who should attend

This workshop is aimed at researchers and technical workers with a background in biology who want to learn to use the Linux and R environments in order to analyse genomic data.
 

Requirements

Students should have enough biological background to appreciate the examples and exercise problems, and have at least some interest in working with DNA sequence data. No previous computer skills are necessary, as we will introduce both coding languages starting with the very basics. 


 

Topics covered

  • The shell and commands
  • Getting help
  • Files and directories
  • Navigating the file system
  • File management
  • Permissions
  • Accessing files
  • Downloading remote files
  • Zipping and unzipping files
  • Pipes and redirects
  • Filtering / manipulating file content
  • Shell scripts
  • Process management
  • Command-line tools for genomics (seqtk, bioawk, samtools, bedtools, tabix)
  • Introduction to R
  • R fundamentals
  • Using functions
  • Iterating functions over data structures
  • Handling genomic data (Bioconductor, AnnotationHub, GenomicRanges)
  • Data visualisation (ggplot2)
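
As a flavour of the shell topics listed above (files and directories, pipes and redirects, filtering file content), here is a minimal sketch of the kind of command-line work involved. It is not taken from the course materials: the file names example.fa and names.txt and the three short sequences are made up for illustration, and only standard tools (printf, grep, awk, sed) are assumed.

```shell
# Work in a throwaway directory so nothing on your system is overwritten.
workdir=$(mktemp -d)
cd "$workdir"

# Create a tiny FASTA file with three made-up sequences (hypothetical data).
# The ">" redirect writes the output of printf into example.fa.
printf '>seq1\nACGTACGT\n>seq2\nGGCC\n>seq3\nATATATATAT\n' > example.fa

# Count the records: every FASTA header line starts with ">".
n_records=$(grep -c '^>' example.fa)

# Pipe two commands together: drop the header lines, then sum the
# length of every remaining sequence line.
n_bases=$(grep -v '^>' example.fa | awk '{ total += length($0) } END { print total }')

# Extract the sequence names (headers without the leading ">")
# and redirect them into a new file.
grep '^>' example.fa | sed 's/^>//' > names.txt

echo "records: $n_records, bases: $n_bases"
```

Each step uses one small tool and passes its output on to the next, which is the pattern that dedicated genomics tools such as seqtk and samtools are also designed to fit into.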