Introduction

FindJunctions is a Java program that uses spliced alignments to identify and quantify exon-exon junctions in RNA-Seq data. When given a BAM file, it produces a BED file that summarizes every spliced alignment found in the BAM file. If also given a reference genomic sequence file (in .2bit format), it attempts to identify the strand of origin for each junction by looking for canonical intron splice-site sequences.
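
The strand inference described above can be illustrated with a minimal sketch. This is not FindJunctions' own code: the function name infer_strand and the motif table are hypothetical, and the sketch assumes that "canonical" means the common GT...AG intron boundary (seen as CT...AC when the transcript comes from the minus strand). The motifs FindJunctions actually checks may differ.

```python
# Hypothetical illustration, not FindJunctions' actual implementation.
# Keys are (first two intron bases, last two intron bases) as read on the
# forward reference strand; values are the inferred strand of origin.
# Assumption: only the GT...AG motif is treated as canonical here.
CANONICAL_SITES = {
    ("GT", "AG"): "+",  # donor/acceptor as seen for a plus-strand transcript
    ("CT", "AC"): "-",  # reverse complement of GT...AG (minus-strand transcript)
}


def infer_strand(genome: str, intron_start: int, intron_end: int) -> str:
    """Return '+', '-', or '.' for the intron at [intron_start, intron_end).

    Coordinates are 0-based and half-open, as in BED files.
    """
    if intron_end - intron_start < 4:
        return "."  # too short to hold both dinucleotides
    first_two = genome[intron_start:intron_start + 2].upper()
    last_two = genome[intron_end - 2:intron_end].upper()
    return CANONICAL_SITES.get((first_two, last_two), ".")


# Toy sequence: 4 exon bases, a 10-base intron (GT...AG), 4 exon bases.
toy_plus = "ACGT" + "GTAAACCCAG" + "TTGA"
toy_minus = "ACGT" + "CTAAACCCAC" + "TTGA"
strand_plus = infer_strand(toy_plus, 4, 14)
strand_minus = infer_strand(toy_minus, 4, 14)
```

In this sketch a junction whose boundaries match neither motif gets ".", meaning the strand could not be determined.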

How to use FindJunctions

To view FindJunctions results:

  1. Zoom in on the gene or region of interest.
  2. Click Load Data.
  3. Right-click the track and select Track Operations > FindJunctions.
  4. Click OK.

FindJunctions will appear as a new track. Brackets represent splice junctions across exons. The number of reads supporting each splice junction is shown above each bracket.

The default behavior for FindJunctions is to identify split reads with a minimum of five bases aligned on either side of the putative intron.

To change the default FindJunctions behavior:

  1. Right-click the track and select Track Operations > FindJunctions.
  2. Enter a new number in the threshold box (default is 5).
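
To make the threshold concrete, here is a small sketch of how a flank threshold could be applied to one spliced alignment. It is an illustration only: the function junctions_from_cigar is hypothetical, and it assumes the flank is counted as the aligned (M, = or X) bases between an intron and the neighboring intron or read end. FindJunctions may count flanking bases differently.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM spec).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def junctions_from_cigar(pos: int, cigar: str, min_flank: int = 5):
    """Return introns as (start, end) pairs, 0-based and half-open.

    pos is the 0-based reference position where the alignment starts.
    An intron (CIGAR 'N') is kept only if at least min_flank aligned
    bases lie in the segment on each side of it (assumed rule).
    """
    segments = [0]  # aligned bases in each segment between introns
    introns = []
    ref = pos       # current reference coordinate
    for length_text, op in CIGAR_OP.findall(cigar):
        length = int(length_text)
        if op == "N":
            introns.append((ref, ref + length))
            segments.append(0)
            ref += length
        elif op in "M=X":
            segments[-1] += length
            ref += length
        elif op == "D":
            ref += length
        # I, S, H and P do not consume reference bases
    return [
        intron
        for k, intron in enumerate(introns)
        if segments[k] >= min_flank and segments[k + 1] >= min_flank
    ]


kept = junctions_from_cigar(100, "20M500N30M")    # both flanks >= 5
dropped = junctions_from_cigar(100, "3M500N47M")  # left flank is only 3
```

With the default of 5, the second read contributes no junction; lowering min_flank to 3 would let it through.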

How to get FindJunctions without IGB

To obtain a copy of FindJunctions, visit https://bitbucket.org/lorainelab/findjunctions

Using the jar file to run FindJunctions

To run the program from the command line, use a command like:

java -Xmx1g -jar FindJunction_exe.jar -u -n 5 -b Genome.2bit -o FJ.bed sample1.bam,sample2.bam

In this example, the -Xmx1g option lets Java use up to 1 GB of memory (RAM), and the -jar option names the jar file to run (FindJunction_exe.jar). The -u (unique) option indicates that only uniquely mapping spliced reads, those with an NH tag equal to 1, will be used to construct junctions. The -n option gives the number of bases that must map to each side of a putative intron for a spliced read to be used to create or support a junction feature. The -b option gives the path to the .2bit genomic sequence file that will be used to identify junction strand. The -o (output) option gives the name of the BED file that will be written (FJ.bed in this example). The final argument is a comma-separated list of the BAM files containing spliced alignments.
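
As an illustration of what the -u filter means, the sketch below checks the NH tag of a read given as a SAM-format text line (the text form of a BAM record). The function name is_uniquely_mapped and the example records are hypothetical, and treating reads that lack an NH tag as non-unique is an assumption; the source does not say how FindJunctions handles that case.

```python
def is_uniquely_mapped(sam_line: str) -> bool:
    """Return True if the SAM record carries the optional field NH:i:1.

    SAM records have 11 mandatory tab-separated fields; optional
    TAG:TYPE:VALUE fields follow from the 12th column onward.
    """
    fields = sam_line.rstrip("\n").split("\t")
    for optional in fields[11:]:
        if optional.startswith("NH:i:"):
            return int(optional[5:]) == 1
    # Assumption: a read without an NH tag is not counted as unique.
    return False


# Made-up records for illustration only.
mandatory = ["read1", "0", "chr1", "101", "50", "20M500N30M",
             "*", "0", "0", "A" * 50, "I" * 50]
unique_read = "\t".join(mandatory + ["NH:i:1"])
multi_read = "\t".join(mandatory + ["NM:i:0", "NH:i:3"])
untagged_read = "\t".join(mandatory)

unique_flags = [is_uniquely_mapped(r)
                for r in (unique_read, multi_read, untagged_read)]
```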

The output file (FJ.bed) is in BED12 format. The name field contains a name constructed from the location of the junction, and the score field contains the number of spliced alignments used to create the junction feature.
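
A short sketch of reading such output follows. It relies only on the standard BED12 column order (chrom, start, end, name, score, strand, then six further columns). The Junction type, the function parse_junction_line and the example line, including its name "J1", are made up for illustration and are not real FindJunctions output.

```python
from typing import NamedTuple


class Junction(NamedTuple):
    chrom: str
    start: int
    end: int
    name: str
    support: int  # BED score column: number of supporting spliced alignments
    strand: str


def parse_junction_line(line: str) -> Junction:
    """Parse one tab-separated BED12 line into a Junction."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 12:
        raise ValueError(f"expected 12 BED columns, found {len(fields)}")
    return Junction(fields[0], int(fields[1]), int(fields[2]),
                    fields[3], int(fields[4]), fields[5])


# Hypothetical line; every value here is invented for the example.
example_line = "chr1\t115\t625\tJ1\t12\t+\t115\t625\t0\t2\t5,5\t0,505"
junction = parse_junction_line(example_line)

# Example use: keep junctions supported by at least 10 spliced alignments.
well_supported = [j for j in [junction] if j.support >= 10]
```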
