#BGA23/sessions #Annotation #Ensembl #Pipeline #Workshop

This session may now be out of date, so beware!

This session is part of BGA23

Session Leader(s)

Leanne Haggerty
Ensembl Genome Annotation Project Leader
EMBL-EBI

Jose Maria Gonzalez Perez-Silva
Bioinformatician
EMBL-EBI

Description

GitHub (can be opened in Gitpod): https://github.com/BGAcademy23/ensembl-annotation

Part 1: General concepts + Case study

By the end of this part you will have:

  1. Obtained an overview of state-of-the-art methods and tools for genome annotation
  2. Understood their advantages/limitations in different use-cases
  3. Gained a picture of the quality measures of existing annotations
  4. Discussed the future of genome annotation methodologies

Slides for this session can be found here.

A list of useful tools for annotating genomes can be found here (download this file to access hyperlinks to tools).

Additional post-session notes can be found here.

Part 2: Hands-on - From RNA-Seq reads to gene models

By the end of this part you will be able to:

  1. Perform quality control and pre-processing of RNA-Seq reads: This includes trimming of adapter sequences and quality control to ensure that the reads are of sufficient quality to be used for analysis.
  2. Align RNA-Seq reads to the genome: The reads are aligned to the genome assembly, allowing for the identification of expressed genes and transcripts.
  3. Assemble transcripts: The aligned reads are used to reconstruct the transcripts, or messenger RNAs (mRNAs), that are expressed in the sample.
  4. Annotate genes: The identified genes can be annotated using a combination of homology-based and ab initio gene prediction methods. The homology-based methods use existing gene information from closely related species, while the ab initio methods use the transcript assembly information to predict new genes.
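
To make steps 1 to 3 above more concrete, here is a minimal Python sketch that only builds the command lines for one possible toolchain. The choice of tools (fastp for trimming, HISAT2 for alignment, samtools for sorting, StringTie for transcript assembly) is an assumption for illustration and is not necessarily what is used in the session; the function name `plan_rnaseq_pipeline` and all file names are hypothetical. Step 4 (gene annotation) is not covered here.

```python
from pathlib import Path


def plan_rnaseq_pipeline(genome_index, reads_1, reads_2, out_dir, threads=4):
    """Return the commands (as argument lists) for steps 1-3.

    Hypothetical helper: nothing is executed here. Once the tools are
    installed, each list could be passed to subprocess.run(cmd, check=True).
    """
    out = Path(out_dir)
    trimmed_1 = out / "trimmed_1.fastq.gz"
    trimmed_2 = out / "trimmed_2.fastq.gz"
    sam = out / "aligned.sam"
    bam = out / "aligned.sorted.bam"
    gtf = out / "transcripts.gtf"

    return [
        # Step 1: adapter trimming and quality filtering of paired-end reads
        # (assumed tool: fastp).
        ["fastp", "-i", str(reads_1), "-I", str(reads_2),
         "-o", str(trimmed_1), "-O", str(trimmed_2)],
        # Step 2: spliced alignment to the genome (assumed tool: HISAT2).
        # --dta reports alignments tailored for transcript assemblers.
        ["hisat2", "--dta", "-p", str(threads), "-x", str(genome_index),
         "-1", str(trimmed_1), "-2", str(trimmed_2), "-S", str(sam)],
        # Coordinate-sort the alignments, as transcript assemblers expect.
        ["samtools", "sort", "-@", str(threads), "-o", str(bam), str(sam)],
        # Step 3: reconstruct transcripts from the sorted alignments
        # (assumed tool: StringTie).
        ["stringtie", str(bam), "-p", str(threads), "-o", str(gtf)],
    ]


if __name__ == "__main__":
    # Dry run: print the planned commands without running them.
    for cmd in plan_rnaseq_pipeline("genome_idx", "r1.fastq.gz", "r2.fastq.gz", "out"):
        print(" ".join(cmd))
```

Keeping the plan as data rather than running it directly makes it easy to inspect each command, swap in a different aligner or assembler, and check that each step's output feeds the next step's input.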

Prerequisites

  1. Understand the terms genome assembly, reads, contigs
  2. Understand what a gene model is, what introns and exons are
  3. Understand what transcriptomic and protein data are, including understanding the different types of RNA
  4. Understand the concept of sequence alignment

!!! warning "Please make sure you MEET THE PREREQUISITES and READ THE DESCRIPTION above"

You will get the most out of this session if you meet the prerequisites above.

Please also read the description carefully to see if this session is relevant to you.

If you don't meet the prerequisites, change your mind based on the description, or are no longer available at the session time, please email tol-training at sanger.ac.uk to cancel your slot so that someone else on the waitlist can attend.