How IGB recognizes species and genome versions

IGB is distributed with two tab-delimited files, "species.txt" and "synonyms.txt", that allow it to match genome names between QuickLoad sites, Galaxy, and other sites. Whenever IGB accesses a QuickLoad site, it also requests these two files from the site's top-level (root) directory.

If you are setting up your own QuickLoad site, you can create and distribute these same files as part of your site. Doing so ensures that species names are displayed correctly and that you can use IGB in conjunction with Galaxy sites that contain the same genomes.

Species.txt

The species.txt file is a tab-delimited file that lists, in column order:

  • binomial (Latin) name for the species
  • common name for the species
  • IGB-friendly genome version name prefix (e.g., H_sapiens or A_gambiae)
  • Zero or more genome version name prefixes used by Galaxy, UCSC Genome Bioinformatics, or other data providers
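The column layout above can be illustrated with a short Python sketch that splits one line into its fields. This is not IGB's own parsing code. The names SpeciesEntry and parse_species_line are hypothetical, and the "hg" prefix in the sample line is only an assumed example of a Galaxy/UCSC-style prefix.

```python
from typing import NamedTuple, Tuple


class SpeciesEntry(NamedTuple):
    """One line of species.txt (hypothetical helper type, not part of IGB)."""
    latin_name: str
    common_name: str
    igb_prefix: str
    other_prefixes: Tuple[str, ...]


def parse_species_line(line: str) -> SpeciesEntry:
    """Split one tab-delimited species.txt line into its columns."""
    fields = [field.strip() for field in line.rstrip("\n").split("\t")]
    if len(fields) < 3:
        raise ValueError(
            f"expected at least 3 tab-separated fields, got {len(fields)}"
        )
    latin_name, common_name, igb_prefix, *others = fields
    # Columns 4 and up are optional; drop any empty trailing fields.
    return SpeciesEntry(
        latin_name, common_name, igb_prefix, tuple(o for o in others if o)
    )


# Sample line; the fourth column ("hg") is an assumed example value.
entry = parse_species_line("Homo sapiens\thuman\tH_sapiens\thg\n")
print(entry.igb_prefix, entry.other_prefixes)
```

A line with only the first three columns parses the same way and yields an empty other_prefixes tuple.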

IGB uses these data to populate the Species menu in the Current Genome tab and to associate species with their genome assemblies, which is why column 3 of species.txt lists the IGB-friendly genome version prefix. Because many data providers use their own common prefixes to indicate genome versions for a species, the file also includes those whenever possible.
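One way to picture how a prefix ties a genome version to a species is a simple "starts with" lookup, sketched below in Python. This page does not describe IGB's actual matching logic, so the prefix test is an assumption. The function name, the version name H_sapiens_Dec_2013, and the "hg" prefix are all hypothetical examples.

```python
from typing import Dict, Optional


def species_for_genome_version(
    version: str, prefix_to_species: Dict[str, str]
) -> Optional[str]:
    """Return the species whose prefix starts the genome version name.

    Assumption: a genome version belongs to a species when its name
    begins with one of that species' prefixes.
    """
    # Check longer prefixes first so that a more specific prefix
    # (e.g., Z_mays_B73) wins over a shorter one that also matches.
    for prefix in sorted(prefix_to_species, key=len, reverse=True):
        if version.startswith(prefix):
            return prefix_to_species[prefix]
    return None


# Prefix table as it might be built from species.txt columns 3 and up.
prefixes = {
    "H_sapiens": "Homo sapiens",
    "hg": "Homo sapiens",  # assumed example of a provider prefix
    "A_gambiae": "Anopheles gambiae",
}

print(species_for_genome_version("H_sapiens_Dec_2013", prefixes))
print(species_for_genome_version("hg19", prefixes))
```

Very short prefixes can match unrelated names under this scheme, so a real implementation would likely need a stricter rule.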

When IGB accesses a QuickLoad site, it will also attempt to retrieve a species.txt file from the QuickLoad root directory. If available, it will use those data to add new species to the Species menu.

The species.txt file that is distributed with IGB is version-controlled. It resides in the igb/resource directory in the project repository. To see the latest version of this file, go to https://bitbucket.org/lorainelab/integrated-genome-browser, select the source button, and navigate to the resources folder as shown below:

Also, here is a version that was current as of December, 2014.

To create a species.txt file for a QuickLoad site:

  • Open an editor (Microsoft Word is fine, but you'll need to save the file as a plain text (.txt) file).
  • Type the Latin name for the species, including a subspecies (e.g., Zea mays B73)
  • Type TAB
  • Type the common name for the species (e.g., maize)
  • Type TAB
  • Type the IGB genome version prefix you'll use for every genome assembly from this species (e.g., Z_mays_B73)
  • Type TAB
  • Type the Galaxy/UCSC genome version prefix, if available. You only need to do this if the genome version is supported by Galaxy and UCSC and you would like to open files and data from Galaxy in IGB.
  • Repeat the previous steps on a new line for each new species you'd like to include in your QuickLoad site.
  • Save the file as plain text.
  • Place the file in the root directory of your QuickLoad site.
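If you would rather generate the file with a script than type it by hand, the steps above can be sketched in Python as follows. The function names are hypothetical and the output path is whatever location you later copy into your QuickLoad root. The maize values are the examples used in the steps above.

```python
from typing import Iterable, Sequence


def format_species_line(
    latin_name: str,
    common_name: str,
    igb_prefix: str,
    other_prefixes: Sequence[str] = (),
) -> str:
    """Join the columns of one species.txt line with TAB characters."""
    fields = [latin_name, common_name, igb_prefix, *other_prefixes]
    for field in fields:
        # A tab or newline inside a field would corrupt the column layout.
        if "\t" in field or "\n" in field:
            raise ValueError(f"field contains a tab or newline: {field!r}")
    return "\t".join(fields)


def write_species_file(path: str, entries: Iterable[Sequence]) -> None:
    """Write one line per species as plain text."""
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        for entry in entries:
            handle.write(format_species_line(*entry) + "\n")


# Example entry taken from the steps above; no Galaxy/UCSC prefix given.
maize_line = format_species_line("Zea mays B73", "maize", "Z_mays_B73")
print(repr(maize_line))
```

Calling write_species_file("species.txt", [("Zea mays B73", "maize", "Z_mays_B73")]) would then produce a one-line file ready to place in the QuickLoad root directory.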

 

 

 
