MetaCerberus is a massively parallel, fast, low-memory, scalable annotation tool for inferring gene function, from single genomes to metacommunities. It annotates genes against major public databases, including KEGG (KO), COGs, CAZy, and FOAM, as well as virus-specific databases such as VOGs and PHROGs. More information about the tool, along with its source code, can be found on its GitHub page: https://github.com/raw-lab/MetaCerberus. Below, you'll find the process for visualizing MetaCerberus output in IGB.

Install MetaCerberus

MetaCerberus is compatible with Python 3, works on both Mac OS X and Linux, and can be installed using bioconda:

Linux/OSX-64

1) Install mamba using conda

conda install mamba

NOTE: Install mamba in your base conda environment, unless you have a Mac with ARM architecture (M1/M2). In that case, follow the OSX-ARM instructions below instead.

2) Install MetaCerberus with mamba

mamba create -n metacerberus -c conda-forge -c bioconda metacerberus
conda activate metacerberus
metacerberus.py --setup
metacerberus.py --download

OSX-ARM (M1/M2)

1) Set up conda environment

conda create -y -n metacerberus
conda activate metacerberus
conda config --env --set subdir osx-64

2) Install mamba, python, and pydantic inside the environment

conda install -y -c conda-forge mamba python=3.10 "pydantic<2"

3) Install MetaCerberus with mamba

mamba install -y -c conda-forge -c bioconda metacerberus
metacerberus.py --setup
metacerberus.py --download

NOTE: Mamba is the fastest installer; Anaconda or Miniconda can be slow. Install mamba with conda, not with pip: the mamba package from pip does not work for this installation.
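After either set of installation steps, a quick check along the following lines may help confirm that the environment is active and the executable is reachable. This is only a sketch: the function name check_metacerberus_install is hypothetical, and it assumes the environment is named metacerberus as in the commands above. CONDA_DEFAULT_ENV is the variable conda sets when an environment is activated.

```shell
# Hypothetical helper (not part of MetaCerberus): report whether the
# "metacerberus" conda environment is active and metacerberus.py is on PATH.
check_metacerberus_install() {
  # conda sets CONDA_DEFAULT_ENV to the name of the active environment
  if [ "${CONDA_DEFAULT_ENV:-}" != "metacerberus" ]; then
    echo "warning: active conda environment is '${CONDA_DEFAULT_ENV:-none}', expected 'metacerberus'" >&2
  fi
  # command -v prints the resolved path if the executable can be found
  if command -v metacerberus.py >/dev/null 2>&1; then
    echo "found: $(command -v metacerberus.py)"
  else
    echo "metacerberus.py not found on PATH" >&2
    return 1
  fi
}

# Example usage:
# check_metacerberus_install
```

If the function reports that metacerberus.py was not found, running conda activate metacerberus again is a reasonable first thing to try.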

Open MetaCerberus output in IGB

Run MetaCerberus

Run metacerberus.py with the options required for your project. See MetaCerberus' GitHub page for usage details: https://github.com/raw-lab/MetaCerberus?tab=readme-ov-file#metacerberus-options

Add custom genome

Before you can visualize the MetaCerberus output, the genome used to run MetaCerberus will need to be added as a custom genome in IGB.

IGB supports many species and genome versions, not just the species shown on the start screen. To check whether a genome is available in IGB, click the Current Sequence tab and use the Species and Genome Version menus to look for your genome of interest.

However, if your genome is not available, you can still use IGB. Here's how:

How to open a custom genome

1) Select File > Open Genome from File... (or click the DNA icon in the Toolbar).
2) Select a sequence file to use as the reference genome (FASTA or 2bit format).
3) Enter the optional details:
   1. Enter the Genus name.
   2. Enter the Species name.
   3. Enter the Variety as appropriate (strain/cultivar/accession).
   4. Choose the Month of the genome release date.
   5. Enter the Year in YYYY format.
4) Click OK and wait for the genome to load.
5) Open data files as usual. To view the sequence, zoom in and click Load Sequence.
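Before opening annotation files against a custom genome, it can be worth confirming that the sequence names in the .gff file match the headers of the FASTA file you loaded. The sketch below assumes that annotations are placed on sequences by name, and that the first whitespace-delimited word of each FASTA header is the sequence name. The function name gff_ids_missing_from_fasta and the file names in the usage line are hypothetical.

```shell
# Hypothetical helper: print each sequence ID used in a GFF file (column 1)
# that has no matching header in the reference FASTA file.
gff_ids_missing_from_fasta() {
  # $1 = reference FASTA file, $2 = GFF file
  awk '
    # First file (FASTA): remember the first word of every header line
    NR == FNR { if (/^>/) { sub(/^>/, "", $1); seen[$1] = 1 } next }
    # Second file (GFF): stop at an embedded ##FASTA section
    /^##FASTA/ { exit }
    # Skip comments, directives, and blank lines
    /^#/ || NF == 0 { next }
    # Report each unknown sequence ID once
    !($1 in seen) && !reported[$1]++ { print $1 }
  ' "$1" "$2"
}

# Example usage (file names are placeholders):
# gff_ids_missing_from_fasta genome.fna annotation.gff
```

No output means every sequence ID in the .gff file was found in the FASTA file.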

Visualize output

Output files with a .gff extension can be viewed in IGB. See File Formats for a list of all currently supported file formats.

To open files from your computer or from a URL:

Select File > Open File... or File > Open URL...
Enter file name or URL

Alternatively, drag and drop local files from your file chooser into IGB.

Some MetaCerberus .gff output files have a ##FASTA section at the bottom. Although this section is a valid part of a .gff file, IGB cannot currently parse it and reports an error when it opens a file that contains one. We are working on adding support. In the meantime, deleting everything in your .gff file from the ##FASTA line to the end of the file fixes the issue.
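One way to remove the ##FASTA section without editing the file by hand is a short awk command. The following is a minimal sketch, not an official MetaCerberus or IGB utility: the function name strip_gff_fasta and the file names in the usage line are hypothetical. It writes a new file and leaves the original untouched.

```shell
# Hypothetical helper: copy a GFF file up to, but not including, its ##FASTA line.
strip_gff_fasta() {
  # $1 = input .gff file, $2 = output .gff file
  # awk stops reading at the first line that starts with ##FASTA;
  # every line before it is printed unchanged.
  awk '/^##FASTA/ { exit } { print }' "$1" > "$2"
}

# Example usage (file names are placeholders):
# strip_gff_fasta annotation.gff annotation.nofasta.gff
```

The resulting file can then be opened in IGB as described above. Files without a ##FASTA section are copied through unchanged.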
