Analysis Protocols

Initial nemabiome studies were done using a bespoke database, which was highly-curated but contained relatively few species. An analysis pipeline was developed that used the mothur bioinformatics software and worked well for the small database. Recent developments in amplicon sequence analyis have allowed single-nucleotide resolution of amplicon sequences and reduced error. We have developed a new analysis pipeline that uses the DADA2 R-package. Additionally we have also built a more comprehensive ITS2 database that caputures a much larger amount of nematode diversity.

The current recommened workflow is to use DADA2 along with the new database for taxonomy assignment. Although more comprehensive the analysis is also more complex an requires a basic understanding of the R-programming language. To help facilitate analysis for users that are new to bioinformatics we have outlined a detailed tutorial on how to process the sequences with the DADA2 software. A higher skill level is required but we stongly encourage users to take the time to learn how to do the analysis themselves or obtain assistance from a bioinformatician. Individual projects are unique and there is rarely a one-size-fits-all solution for these types of analysis and learning how to do it yourself gives you the confidence and the tools to do the right analysis for your data.

The DADA2 tutorial can be found here and the original mothur workflow is documented here.

Below are example data files used in the two workflows.

Example dataset

This dataset contains 26 pooled strongyle worm samples collected from Brazilian cattle farms. Each sample contains 1000 L3 larvae cultured from fecal samples. The Nemabiome assay was used to investigate species proportions within these pooled samples. The raw results from the assay are usually downloaded directly from the Illumina website (Basespace). We have provided the raw results from these samples below. Begin by downloading and unzipping all three files.

Miseq data files