How IGB loads data from a DAS server.
This is the current procedure for loading data from DAS servers. It was written
for two DAS servers, but currently works only with UCSC (it no longer works for Ensembl):
UCSC = http://genome.cse.ucsc.edu/cgi-bin/das/dsn
Ensembl = http://www.ensembl.org/das/dsn
- initialize server:
    for each DSN element {
        for each "SOURCE" element {
            sourceid = get "id" attribute from SOURCE element
        }
        master = first "MAPMASTER" element of DSN element
        masterURL = get text from master element
        version key = end (after last /) of masterURL
        one source object (=version) per version key
        add sourceid to source object.
    }
    this creates a collection of source objects (versions)
- select species/version
    issue a types query = server URL without the "/dsn" suffix + "/" + <versionid> + "/types"
        e.g. http://genome.cse.ucsc.edu/cgi-bin/das/<id>/types
    for each TYPE element {
        typeid = get "id" attribute from TYPE element
        featureURL = server URL without the "/dsn" suffix + "/" + <versionid> + "/features?type=" + <typeid>
        add features map entry, key = typeid, value = featureURL
    }
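
The two steps above can be sketched in Python roughly as follows. This is an
illustration only, not IGB's actual code: the function names are hypothetical,
the sample XML is made up for the example, and the DASDSN/DASTYPES wrapper
elements are assumed from the DAS 1.x response layout. Unlike the pseudocode,
the sketch records every SOURCE id found in a DSN element, not just one.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Made-up sample responses, shaped like DAS 1.x dsn and types documents.
SAMPLE_DSN = """<DASDSN>
  <DSN>
    <SOURCE id="hg18">hg18</SOURCE>
    <MAPMASTER>http://genome.cse.ucsc.edu:80/cgi-bin/das/hg18</MAPMASTER>
  </DSN>
  <DSN>
    <SOURCE id="hg19">hg19</SOURCE>
    <MAPMASTER>http://genome.cse.ucsc.edu:80/cgi-bin/das/hg19</MAPMASTER>
  </DSN>
</DASDSN>"""

SAMPLE_TYPES = """<DASTYPES>
  <GFF version="1.0" href="http://genome.cse.ucsc.edu/cgi-bin/das/hg18/types">
    <SEGMENT>
      <TYPE id="refGene"/>
      <TYPE id="knownGene"/>
    </SEGMENT>
  </GFF>
</DASTYPES>"""


def group_sources_by_version(dsn_xml):
    """Map each version key (last path part of MAPMASTER) to its source ids."""
    versions = defaultdict(list)
    root = ET.fromstring(dsn_xml)
    for dsn in root.iter("DSN"):
        master = dsn.find("MAPMASTER")  # first MAPMASTER child, if any
        master_url = (master.text or "").strip() if master is not None else ""
        if not master_url:
            continue  # no usable MAPMASTER, so no version key
        version_key = master_url.rstrip("/").rsplit("/", 1)[-1]
        for source in dsn.findall("SOURCE"):
            source_id = source.get("id")
            if source_id:
                versions[version_key].append(source_id)
    return dict(versions)


def server_base(dsn_url):
    """Return the server URL without its trailing "/dsn" suffix."""
    if dsn_url.endswith("/dsn"):
        return dsn_url[: -len("/dsn")]
    return dsn_url.rstrip("/")


def types_url(dsn_url, version_id):
    """Build the types query URL for one version."""
    return f"{server_base(dsn_url)}/{version_id}/types"


def feature_urls(dsn_url, version_id, types_xml):
    """Map each TYPE id to its features query URL (plain concatenation)."""
    base = server_base(dsn_url)
    root = ET.fromstring(types_xml)
    return {
        t.get("id"): f"{base}/{version_id}/features?type={t.get('id')}"
        for t in root.iter("TYPE")
        if t.get("id")
    }
```

In a real client the two XML strings would come from HTTP requests to the dsn
and types URLs; they are inlined here so the sketch runs without a network.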
Proposal for using the DAS registry to find data.
It is possible to use the DAS registry to find data on DAS servers.
The problem is that every DAS server can be set up differently: servers
follow different rules or different DAS versions, some don't follow
all the protocol rules, and some are not maintained.
DAS is a protocol, and it has different versions. The spec for the
latest version (1.6) is at:
http://www.biodas.org/documents/spec-1.6.html
The registry itself is at:
http://www.dasregistry.org
So, for example, if we want to find servers for H_sapiens_Feb_2009
(organism 9606 is the NCBI taxonomy ID for human), we can use the query:
http://www.dasregistry.org/das/sources?organism=9606&capability=features&version=37
and parse the results. The result is an XML document with SOURCES as the root element
and several SOURCE elements. Each SOURCE element has one VERSION element, which contains
some CAPABILITY elements. There must be at least one CAPABILITY element with a type attribute
of "das1:features" (the restriction in the query). There may also be a "das1:entry_points"
CAPABILITY element (these will be the available chromosomes), a "das1:types" CAPABILITY
element (each type is a feature), and a "das1:stylesheet" CAPABILITY element (this explains
how to display the element: colors, etc.). If there is no types capability, the SOURCE
element can be handled as one single feature.
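
A rough Python sketch of the registry lookup described above. It is only an
illustration of the proposal, not existing IGB code: the function names are
hypothetical, the sample document and its das.example.org URLs are made up,
and the "title" and "query_uri" attribute names are assumed from the DAS 1.6
sources format rather than taken from these notes.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

REGISTRY_SOURCES = "http://www.dasregistry.org/das/sources"

# Made-up sample; attribute names other than "type" are assumptions.
SAMPLE_SOURCES = """<SOURCES>
  <SOURCE uri="example_genes" title="Example genes">
    <VERSION uri="example_genes" created="2010-01-01">
      <CAPABILITY type="das1:features"
                  query_uri="http://das.example.org/das/example_genes/features"/>
      <CAPABILITY type="das1:types"
                  query_uri="http://das.example.org/das/example_genes/types"/>
    </VERSION>
  </SOURCE>
  <SOURCE uri="example_track" title="Example track">
    <VERSION uri="example_track" created="2010-01-01">
      <CAPABILITY type="das1:features"
                  query_uri="http://das.example.org/das/example_track/features"/>
    </VERSION>
  </SOURCE>
</SOURCES>"""


def registry_query_url(organism, version, capability="features"):
    """Build the registry query for one organism and coordinate version."""
    params = {"organism": organism, "capability": capability, "version": version}
    return f"{REGISTRY_SOURCES}?{urlencode(params)}"


def summarize_sources(sources_xml):
    """List each SOURCE with its capabilities and how it should be handled."""
    summaries = []
    root = ET.fromstring(sources_xml)
    for source in root.iter("SOURCE"):
        version = source.find("VERSION")  # one VERSION per SOURCE is expected
        if version is None:
            continue
        capabilities = {
            cap.get("type"): cap.get("query_uri")
            for cap in version.findall("CAPABILITY")
        }
        if "das1:features" not in capabilities:
            continue  # defensive: the query should already guarantee this
        summaries.append({
            "title": source.get("title"),
            "capabilities": capabilities,
            # Without a types capability the whole SOURCE is one feature.
            "single_feature": "das1:types" not in capabilities,
        })
    return summaries
```

For example, registry_query_url(9606, 37) rebuilds the query shown above, and
summarize_sources(SAMPLE_SOURCES) marks the second sample source as a single
feature because it has no "das1:types" capability.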