Page tree

Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

The following protocol describes processing data from Illumina HiSeq pipeline. Whether you should perform exactly these steps with your data sets will depend on your data - its age, whether you've done single-end or paired-end sequencing, the read lengths, your reference genome, and so on. For example, when you run TopHat, you may need to adjust parameters to accommodate smaller read lengths if you are running data from pre-HiSeq instruments.

Check sequence quality.

Your first step upon downloading your reads will be to check the quality of your sequence. One of the easiest to use tools we (in the Loraine lab) have found for quality checking is a terrific program called FastQC, from the Babraham Institute. You can run it interactively or as a command line tool, and it's written in Java, which means you can run it on a Mac, a Linux machine, or a Windows computer. Like IGB, any computer that supports Java can run FastQC.

...

In our experience, low yields and low quality generally happen because something went wrong with sequence. Over-represented sequences and so-called PCR duplicates (many copies of the same sequence) are usually due to problems in library construction.

Align your sequence.

Let's assume that (happily) you have good-quality sequence. Your next step should be to align your sequences onto a reference genome or refernece transcriptome. Indeed, even if you haven't got high-quality sequence, you should still try to align it, because the alignments can tell you a lot about what went wrong.

...

For details on how to do this, see the section titled Not done yet -- using tabix to create a sorted, random-access bedgraph file in Creating a new genome release for IGB QuickLoad - an example from Zea mays.

...

View data in IGB

Once you've created the files, you should then be able to open them in IGB. However, be sure to keep the index files (.bai from samtools and .tbi from tabix) in the same folder with their corresponding alignments or bedgraph files.

...