Bulk downloads of all sequence and analysis files are available under the download. subdomain.

This tutorial uses the MealyBugBase download site.

1. Visit download.mealybug.org:

2. Data from each release is kept in a top-level directory. At the time of writing, the most recent MealyBugBase release is “v1”. Click on the “v1” directory row in the list of file names and sizes to see the available data:

3. Data for each assembly is kept in a separate subdirectory. These subdirectories share a common structure, so click on “Planococcus_citri_Pcitri.v1” as an example:

4. Results from BLAST and InterProScan analyses are in the “blast” and “interproscan” directories, respectively. A standardised GFF3 format file of gene annotations is in the “gff3” directory. The “embl” directory will contain an EMBL format file if a locus tag and BioProject have been specified in the assembly metadata. The “fasta” directory contains several subdirectories for different sequence types, named according to the file type conventions used by EMBL-EBI curated Ensembl sites:

5. Files can be downloaded by clicking on a filename:

6. Alternatively, files may be downloaded with a command-line tool by requesting the file URL directly, e.g.:

wget https://download.mealybug.org/v1/Planococcus_citri_Pcitri.v1/interproscan/Planococcus_citri_Pcitri.v1.proteins.fa.interproscan.tsv.gz
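When several files are needed, it can help to assemble the URL from its parts rather than copying it each time. The following is a minimal sketch that assumes every file follows the release/assembly/directory/filename layout seen in the example URL above. The `build_url` helper and the variable names are illustrative and not part of the download site; the download commands are left commented out so the script can be checked without network access.

```shell
#!/bin/sh
# Hypothetical helper: join release, assembly, directory and filename
# into a download URL. The layout is inferred from the example URL above.
build_url() {
    base="https://download.mealybug.org"
    printf '%s/%s/%s/%s/%s\n' "$base" "$1" "$2" "$3" "$4"
}

release="v1"
assembly="Planococcus_citri_Pcitri.v1"
file="${assembly}.proteins.fa.interproscan.tsv.gz"

url=$(build_url "$release" "$assembly" "interproscan" "$file")
echo "$url"

# Uncomment to fetch the file; -c resumes a partially downloaded file.
# wget -c "$url"
# Optionally check that the gzip archive is intact after download.
# gzip -t "$file" && echo "archive OK"
```

Changing the `release`, `assembly` or directory arguments should yield URLs for other files, provided they follow the same layout; it is worth confirming the exact path in the browser listing first.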