How to create and add a new default data provider
- Create a new module for the data provider and add the following classes:
- A factory class that implements DataProviderFactory. This interface declares the methods that create the actual data provider class from different combinations of parameters.
- A data provider class that extends BaseDataProvider. It must implement the methods that initialize the provider parameters, return all supported genome versions, and return all available data sets for a single genome version.
- The data provider class should also implement two interfaces, AssemblyProvider and ReferenceSequenceProvider, whose methods are used to get the chromosome and sequence information.
- Methods that need to be implemented:
- initialize() - initializes the data provider; usually an object that stores the available genomes is created and loaded with data here.
- getSupportedGenomeVersionNames() - other classes use this method to get the available genomes for that provider.
- getAvailableDataSets() - takes a data container parameter that holds the selected genome version, and returns the data sets available for that genome version.
- getAssemblyInfo() - this method is used to get the chromosome information. It should return a Map<String, Integer> object, where the key is the chromosome name and the value is the chromosome length.
- getSequence() - this is used to get the sequence of the genome version for the requested chromosome and region.
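The classes and methods listed above can be sketched roughly as follows. This is a minimal illustration, not the actual IGB API: the four types DataProviderFactory, BaseDataProvider, AssemblyProvider and ReferenceSequenceProvider are redeclared here as simplified, hypothetical stand-ins so the example compiles on its own, and every signature, parameter type, the createDataProvider name, the Example* classes and the toy genome data are assumptions. A real provider should follow the signatures declared in the IGB source.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified stand-ins for the IGB types named above.
// The real types declare more methods and use IGB-specific parameter types.
interface AssemblyProvider {
    Map<String, Integer> getAssemblyInfo(String genomeVersion);
}

interface ReferenceSequenceProvider {
    String getSequence(String genomeVersion, String chromosome, int start, int end);
}

abstract class BaseDataProvider {
    protected final String url;
    protected final String name;
    protected final int loadPriority;

    protected BaseDataProvider(String url, String name, int loadPriority) {
        this.url = url;
        this.name = name;
        this.loadPriority = loadPriority;
    }

    abstract void initialize();
    abstract Set<String> getSupportedGenomeVersionNames();
    abstract Set<String> getAvailableDataSets(String genomeVersion);
}

interface DataProviderFactory {
    BaseDataProvider createDataProvider(String url, String name, int loadPriority);
}

class ExampleDataProvider extends BaseDataProvider
        implements AssemblyProvider, ReferenceSequenceProvider {

    // genome version name -> (chromosome name -> sequence), filled by initialize()
    private final Map<String, Map<String, String>> genomes = new LinkedHashMap<>();

    ExampleDataProvider(String url, String name, int loadPriority) {
        super(url, name, loadPriority);
    }

    @Override
    void initialize() {
        // A real provider would load this from its url; toy data is hard-coded here.
        Map<String, String> chromosomes = new LinkedHashMap<>();
        chromosomes.put("chr1", "ACGTACGTAC");
        chromosomes.put("chr2", "GGCCTT");
        genomes.put("Example_genome_v1", chromosomes);
    }

    private Map<String, String> chromosomesOf(String genomeVersion) {
        Map<String, String> found = genomes.get(genomeVersion);
        return found != null ? found : Collections.<String, String>emptyMap();
    }

    @Override
    Set<String> getSupportedGenomeVersionNames() {
        return genomes.keySet();
    }

    @Override
    Set<String> getAvailableDataSets(String genomeVersion) {
        // Toy rule: one data set per chromosome of the selected genome version.
        return chromosomesOf(genomeVersion).keySet();
    }

    @Override
    public Map<String, Integer> getAssemblyInfo(String genomeVersion) {
        Map<String, Integer> info = new LinkedHashMap<>();
        // One Integer per chromosome; this sketch stores the sequence length (an assumption).
        chromosomesOf(genomeVersion).forEach((chr, seq) -> info.put(chr, seq.length()));
        return info;
    }

    @Override
    public String getSequence(String genomeVersion, String chromosome, int start, int end) {
        // Assumes a known chromosome and 0-based, half-open coordinates [start, end).
        return chromosomesOf(genomeVersion).get(chromosome).substring(start, end);
    }
}

class ExampleDataProviderFactory implements DataProviderFactory {
    @Override
    public BaseDataProvider createDataProvider(String url, String name, int loadPriority) {
        return new ExampleDataProvider(url, name, loadPriority);
    }
}
```

In this toy setup, a provider is created through ExampleDataProviderFactory and initialize() is called before any of the other methods, since the genome map is empty until then.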
- Once all the code is implemented, add the provider to igbDefaultPrefs.json using the format below:
{
    "factoryName": "factory name", // factory name for the provider
    "name": "provider name", // provider name
    "url": "provider_url",
    "loadPriority": "load_priority_int", // this is used to determine the order in which data providers are initialized
    "defaultDataProviderId": "IgbDefaultDataProviderId:16" // this is like a key in a map, so it should be unique from the rest of the providers
}
- All the default data providers should be present in igbDefaultPrefs.json, because this file is used to initialize the data providers. If the code for a provider is fully implemented but the provider is not declared here, it won’t be initialized.
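Because each defaultDataProviderId has to be unique, it can help to check a new entry against the existing ones before committing the file. The sketch below is a hypothetical helper, not part of IGB: it assumes the id values have already been read out of igbDefaultPrefs.json by whatever JSON parser is at hand, and ProviderIdCheck and findDuplicateIds are illustrative names.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: reports defaultDataProviderId values that occur more than once.
class ProviderIdCheck {
    static Set<String> findDuplicateIds(List<String> ids) {
        Set<String> seen = new HashSet<>();
        Set<String> duplicates = new TreeSet<>();
        for (String id : ids) {
            // Set.add returns false when the id was already present.
            if (!seen.add(id)) {
                duplicates.add(id);
            }
        }
        return duplicates;
    }
}
```

An empty result means every id in the list is distinct; any id that is returned needs a new number in its entry.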