...

A genome version refers to a group of chromosome sequences that you or another group assembled and made available.

For example, NCBI releases 35 and 36 of the human genome are considered to be two separate genomes. Each one contains multiple chromosome sequences, including the expected chromosomes 1 to 22, X, and Y. Other sequences, such as "chr22_random", are also considered distinct chromosomes for the purposes of display in IGB.

...

If you are building a genome for display in IGB, we recommend that you give it an IGB-friendly name that combines the genus and species with the month and year of release, following the pattern: G_species_mon_yyyy, where G is the first letter of the genus, species is the species name, mon is the three-letter English abbreviation for the month the genome was released, and yyyy is the four-digit year of the release.

For example:

  • A_thaliana_Jun_2009
  • A_mellifera_Jan_2005
  • H_sapiens_Feb_2009
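
The naming pattern above can be sketched in a few lines of Python. This is only an illustration, not part of IGB: the function names igb_genome_name and is_igb_friendly are hypothetical, and the check assumes that the species part consists of lowercase letters only.

```python
import re

# Three-letter English month abbreviations, as used in the pattern G_species_mon_yyyy.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

# Assumption: the species part is lowercase letters only.
NAME_RE = re.compile(r"^[A-Z]_[a-z]+_(?:%s)_\d{4}$" % "|".join(MONTHS))


def igb_genome_name(genus, species, month, year):
    """Build a name such as A_thaliana_Jun_2009 (hypothetical helper)."""
    if not genus or not species:
        raise ValueError("genus and species must be non-empty")
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    return "%s_%s_%s_%04d" % (genus[0].upper(), species.lower(),
                              MONTHS[month - 1], year)


def is_igb_friendly(name):
    """Return True if the name follows the G_species_mon_yyyy pattern."""
    return NAME_RE.match(name) is not None
```

For instance, igb_genome_name("Arabidopsis", "thaliana", 6, 2009) builds the first name in the list above, and is_igb_friendly("hg17") is False because hg17 does not follow the pattern.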

...

When users operate the pulldown menu to choose a species to view in IGB, a short message indicating the common name of the species appears. If you are adding a new species, contact the IGB developers and ask to have your common name added to the species.txt file under version control at sourceforge.net. This is a tab-delimited file that lists all the species that IGB supports, including common names for many of them.
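
As a rough sketch of how a tab-delimited species list could be read, the Python below loads common names into a dictionary. The column layout is an assumption made for illustration (species name in the first column, common name in the second); check the actual species.txt file before relying on it. The function name load_common_names and the sample rows are hypothetical.

```python
import csv
import io


def load_common_names(handle):
    """Map species name -> common name from a tab-delimited file.

    Assumed layout (not confirmed by the IGB documentation):
    column 1 is the species name, column 2 is the common name.
    Rows without a common name are skipped.
    """
    names = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2 or not row[0].strip():
            continue
        common = row[1].strip()
        if common:
            names[row[0].strip()] = common
    return names


# Hypothetical sample data standing in for the real file.
SAMPLE = (
    "Arabidopsis thaliana\tthale cress\n"
    "Apis mellifera\thoney bee\n"
    "Homo sapiens\thuman\n"
    "Novel species\t\n"
)
COMMON_NAMES = load_common_names(io.StringIO(SAMPLE))
```

With a real file, the same function could be called on an open file handle instead of the in-memory sample.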

...

Unfortunately, different groups tend to refer to the same genome or chromosome by different names.  For example, NCBI human genome build 35 is also known as hg17 and ensembl1834, as well as H_sapiens_May_2004.  When IGB is able to recognize that two names refer to the same genome or chromosome, it will merge the data.  Otherwise it will keep the two data sets distinct.  Currently, IGB uses a simple table of synonyms to store these associations.  You can create your own set of synonyms that will extend this set if needed. 
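
The idea of a synonym table can be illustrated with a small Python sketch. It is not IGB's implementation or file format: the helper names are hypothetical, each group simply lists names that refer to the same genome, and matching is assumed to be exact and case-sensitive.

```python
def build_synonym_index(groups):
    """Map every name in a synonym group to the group's first name."""
    index = {}
    for group in groups:
        canonical = group[0]
        for name in group:
            index[name] = canonical
    return index


def same_genome(index, name_a, name_b):
    """Return True if both names resolve to the same genome.

    Names that are not in the table only match themselves.
    """
    return index.get(name_a, name_a) == index.get(name_b, name_b)


# The synonyms mentioned above for NCBI human genome build 35.
SYNONYMS = build_synonym_index([
    ["H_sapiens_May_2004", "hg17", "ensembl1834"],
])
```

Under this sketch, data labeled hg17 and data labeled H_sapiens_May_2004 would be merged, while data labeled with a name missing from the table would be kept as a distinct data set.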

Annotations, Sequences, Graphs, and Alignments

IGB can work with four distinct types of data: annotations, alignments (typically from Illumina sequencing experiments), graphs, and genomic sequences. Some features of the program are type-specific and work only with the corresponding type of data.

Annotations indicate the known or suspected locations of genomic landmark features such as genes, exons, promoter regions, pseudogenes, and so forth. Alignments of EST sequences, GeneChip probe sequences, and other sequences onto chromosomes are also sometimes referred to as annotations, particularly when they don't include the sequence of the aligned entity. Annotation data can be loaded from files, QuickLoad, and DAS servers.

...

Alignments represent how empirical sequences (such as short reads from an RNA-Seq experiment) align onto the reference genomic sequence. At low zoom, they look like regular annotations, but with marks representing mismatches whenever these data are available. At high zoom, they show the sequence of the aligned read, and sometimes indicate scores and the degree of agreement with the reference sequence. These are typically loaded from BAM (binary alignment) files.