Introduction
IGB QuickLoad (QL) is a simple file-based system for users to access annotation, alignment, or sequence data.
...
To view an example, see the IGB QuickLoad site.
How to Set Up a QuickLoad Site
Step One: Create a QuickLoad root directory.
Create a folder on your local computer or on a Web server. This folder will contain your genome directories and a "meta-data" file called contents.txt.
Tip |
---|
If you are hosting a QuickLoad site using the Apache Web server, you can configure the site to require a user name and password from anyone who accesses the data by adding an .htaccess file to the top-level directory. |
Step Two: Create one or more genome directories.
Next, create a genome directory for each genome version you want to make available via QuickLoad.
...
Step Three: Create contents.txt file.
Create a simple, plain text file called contents.txt and add it to the top-level directory.
...
Note: You can include other directories and other files in your QL site folder; IGB will simply ignore anything that isn't listed in your contents.txt file. Also, any changes in the name of the genome directory must be updated in the contents.txt file or IGB will not recognize it.
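A quick consistency check can catch a renamed genome directory before your users do. The Python sketch below assumes that each non-empty line of contents.txt begins with the genome directory name, followed by a tab and an optional description. That layout and the function name `missing_genome_dirs` are assumptions made for illustration, so adjust the parsing if your file is laid out differently.

```python
from pathlib import Path


def missing_genome_dirs(root):
    """Return names listed in contents.txt that have no matching directory.

    Assumption (not confirmed by this page): the first tab-separated field
    on each line is the genome directory name.
    """
    root = Path(root)
    missing = []
    for line in (root / "contents.txt").read_text().splitlines():
        if not line.strip():
            continue  # skip blank lines
        genome = line.split("\t")[0].strip()
        if not (root / genome).is_dir():
            missing.append(genome)
    return missing
```

Calling `missing_genome_dirs("/path/to/quickload")` returns an empty list when every listed genome has a directory of the same name.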
Optional Step: Create a synonyms.txt file.
This is a list of synonyms for genomes. It allows you to match genome names across different QuickLoad sites and other data sources. For example, if a DAS1 or DAS2 data source (such as the ones hosted at Ensembl or UCSC) uses a different name to refer to the same genome version, you can specify the alternative names here. Each line contains any number of synonyms for one genome, separated by tabs. Details about this file can be found on the Personal Synonyms page.
For an example, see the attached synonyms.txt file.
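To illustrate the one-group-per-line layout described above, here is a minimal Python sketch that reads tab-separated synonym lines into a lookup table. The function names `load_synonyms` and `same_genome` are hypothetical, and this is not how IGB itself reads the file; it only shows how the format can be interpreted.

```python
def load_synonyms(lines):
    """Map every name to the set of names that share a line with it."""
    lookup = {}
    for line in lines:
        names = [n.strip() for n in line.rstrip("\n").split("\t") if n.strip()]
        if len(names) < 2:
            continue  # a single name has no synonyms to record
        group = set(names)
        for name in names:
            lookup.setdefault(name, set()).update(group)
    return lookup


def same_genome(lookup, a, b):
    """True if the two names are identical or appear together on a line."""
    return a == b or b in lookup.get(a, set())
```

Note that this sketch does not merge groups transitively: two names count as synonyms only if they appear together on at least one line.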
Step Four: Create genome.txt files.
For each genome directory, create a genome.txt (formerly, mod_chromInfo.txt) file that lists all the chromosomes in your genome, together with their sizes.
...
Note: You can create your genome.txt file from a sequence 2bit file (see below) using twoBitInfo, available from http://hgdownload.cse.ucsc.edu/admin/exe/. See Step Seven: Add sequence data.
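Before publishing a genome.txt file, it can help to check it for malformed lines. The following Python sketch assumes one chromosome per line, with the name and the size separated by a tab (the same two-column layout that twoBitInfo writes). The function name `read_genome_txt` is hypothetical.

```python
def read_genome_txt(lines):
    """Return a list of (chromosome_name, size) pairs.

    Assumes NAME<tab>SIZE on each line; raises ValueError on
    malformed lines, non-positive sizes, or duplicate names.
    """
    chroms = []
    seen = set()
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            raise ValueError(f"line {number}: expected NAME<tab>SIZE")
        name, size = fields[0], int(fields[1])  # int() rejects non-numeric sizes
        if size <= 0:
            raise ValueError(f"line {number}: size must be positive")
        if name in seen:
            raise ValueError(f"line {number}: duplicate name {name!r}")
        seen.add(name)
        chroms.append((name, size))
    return chroms
```

For example, `read_genome_txt(open("genome.txt"))` returns the chromosomes in file order, which you can then compare against the names used in your sequence files.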
Optional: Create liftall.lft file.
This describes how contigs are assembled into chromosomes.
Each line contains: CONTIG_START tab CONTIG_NAME tab CONTIG_LENGTH tab CHROMOSOME_NAME tab CHROMOSOME_LENGTH
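The five-column layout above can be parsed as in the following Python sketch. The names `LiftEntry` and `parse_lift_line` are hypothetical. The bounds check additionally assumes that CONTIG_START is a zero-based offset into the chromosome, which this page does not state; drop or adjust that check if your coordinates are one-based.

```python
from collections import namedtuple

LiftEntry = namedtuple(
    "LiftEntry",
    "contig_start contig_name contig_length chrom_name chrom_length",
)


def parse_lift_line(line):
    """Parse one tab-separated liftall.lft line into a LiftEntry."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 5:
        raise ValueError("expected 5 tab-separated fields")
    start, name, length, chrom, chrom_length = fields
    entry = LiftEntry(int(start), name, int(length), chrom, int(chrom_length))
    # Assumption: contig_start is a zero-based offset on the chromosome.
    if entry.contig_start + entry.contig_length > entry.chrom_length:
        raise ValueError("contig extends past the end of its chromosome")
    return entry
```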
Step Five: Create annots.xml files.
If you have any annotations for your genomes, you can make it possible for IGB to display them in the Data Access Panel by listing them in the annots.xml file.
...
When loaded into IGB, each file will create one or more tracks. (Some file types, such as BED and GFF3, can specify multiple tracks.)
Annots.xml options - specify track color, annotation style, and more
An annots.xml file contains one or more file tags, enclosed in a files element. The file tag has attributes that will dictate how the data will look when displayed in IGB. Most of these options correspond to options users configure using the Tracks tab in the IGB Preferences window. (See File > Preferences > Tracks.)
Attribute | Required? | Description |
---|---|---|
name | Required | The name of the file on your file system. |
title | Optional | User-friendly text IGB will show in the Data Access Panel as the title of the data set. If you don't provide a title, IGB will display the name of the file instead. |
description | Optional | IGB will display this text as a tooltip when users hover the mouse over the data set title in the Data Access tab. |
url | Optional | Use this attribute to specify the location of a Web page describing the data set. If provided, IGB will display an "info" icon next to the data set title. When users click the icon, their Web browser will open the URL you provide. |
load_hint | Optional | If used, its value should be "Whole Sequence". This attribute forces IGB to load the entire file when users select the genome version, which is usually appropriate only for reference gene model annotations. Use it only if your site is the sole provider of a particular genome's reference gene annotations. |
label_field | Optional | Use this field to indicate the annotation property (e.g., "score" or "id") that should be used in IGB to label individual annotations. For gene models, "id" is best. |
background | Optional | Use this field to define track background color (e.g. background="000000") |
foreground | Optional | Use this field to define annotation color (e.g. foreground="00FFFF") |
max_depth | Optional | Use this field to define the default max depth value, the number of annotations that can be shown individually (max_depth="10" or max_depth="0" for unlimited) |
name_size | Optional | Use this field to define the default track label name font size (e.g. name_size="12") |
connected | Optional | Use this field to define the default boolean value for the connected field (e.g. connected="true" or connected="false") |
show2tracks | Optional | Use this field to define the default boolean value for the show2Tracks field (e.g. show2tracks="true" or show2tracks="false") |
direction_type | Optional | Use this field to dictate whether annotations or alignments will be shown using arrows and/or color to indicate direction. (e.g. direction_type="arrow", direction_type="color", direction_type="both", direction_type="none") |
positive_strand_color | Optional | Use this field to define the default positive strand color (e.g. positive_strand_color="CCFFFF") |
negative_strand_color | Optional | Use this field to define the default negative strand color (e.g. negative_strand_color="33FFFF") |
view_mode | Optional | Use this field to define the default view mode (e.g. view_mode="default", view_mode="depth") |
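
As an illustration of the structure described above (file tags enclosed in a files element), the following Python sketch builds an annots.xml document from a list of attribute dictionaries. The function name `build_annots_xml`, the file name `genes.bed.gz`, and the attribute values are hypothetical examples; only attributes from the table above are used.

```python
import xml.etree.ElementTree as ET


def build_annots_xml(entries):
    """Return annots.xml text: a <files> element with one <file> per entry."""
    root = ET.Element("files")
    for attrs in entries:
        if "name" not in attrs:
            raise ValueError("every file tag needs a name attribute")
        # Attribute values must be strings in the XML output.
        ET.SubElement(root, "file", {key: str(value) for key, value in attrs.items()})
    return ET.tostring(root, encoding="unicode")


# Hypothetical data set: a gene model file labeled by annotation id.
xml_text = build_annots_xml([
    {
        "name": "genes.bed.gz",
        "title": "Gene models",
        "label_field": "id",
        "foreground": "00FFFF",
        "max_depth": 10,
    },
])
```

Writing `xml_text` to a file named annots.xml in the genome directory gives a single-entry file that you can extend with further dictionaries.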
Step Six: Add your annotation files.
Place your annotations into the appropriate genome sub-directories. They can be "gzipped" or not depending on your preference. You should be able to use any of the many formats IGB supports. For annotation files, BED format is the most commonly used.
Starting with IGB 6.6, IGB QuickLoad supports BED, bedgraph, and GFF files that have been sorted, compressed, and indexed using the bgzip and tabix utilities. Doing this helps speed up data loading in IGB. For an example of how we used tabix to distribute coverage and junction files from a maize RNA-Seq data set, see Creating a new genome release for IGB QuickLoad in the IGB Developer's Guide.
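The sort, compress, and index sequence can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure from the Developer's Guide: it assumes a BED file with no track or browser header lines, and it assumes bgzip and tabix are installed and on your PATH. The function names are hypothetical.

```python
import subprocess


def sort_bed_lines(lines):
    """Sort BED records by sequence name, then by numeric start position."""
    # Assumption: every non-blank line is a data record (no header lines).
    records = [line.rstrip("\n").split("\t") for line in lines if line.strip()]
    records.sort(key=lambda fields: (fields[0], int(fields[1])))
    return ["\t".join(fields) + "\n" for fields in records]


def compress_and_index(bed_path):
    """Run bgzip, then tabix, on an already sorted BED file."""
    # bgzip replaces bed_path with bed_path + ".gz"; -f overwrites an old copy.
    subprocess.run(["bgzip", "-f", bed_path], check=True)
    # tabix writes the index next to the compressed file.
    subprocess.run(["tabix", "-p", "bed", bed_path + ".gz"], check=True)
    return bed_path + ".gz"
```

A typical use would be to write the output of `sort_bed_lines` to a new file and pass that file's path to `compress_and_index`. For GFF files, the sort key would be the start position in the fourth column and the tabix preset would be `gff` instead of `bed`.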
Step Seven: Add sequence data.
You may not need to do this if you are working with a genome that other QuickLoad or DAS2 sites support. That is, IGB may be able to retrieve sequence data for your genome from other sites if it can recognize that your annotations belong to the same genome version as the other data providers it gets data from. However, if you are working with a newly sequenced genome or don't want to use other groups' servers, you can support IGB's sequence visualization functionality by setting up your own sequences to distribute.
...
IGB can also use sequence files in the legacy IGB BNIB format. For more information on the BNIB format, see Converting FASTA to BNIB. Be sure that you name your files using the same names you listed in your genome.txt file.
Step Eight: Add your new QuickLoad site to IGB.
Tell IGB to use your new QuickLoad site.
...