Our first blog post, describing the launch of the 10k Salmonella Genomes Project.

Jay Hinton (University of Liverpool) and Neil Hall (Earlham Institute) have won funding from the RCUK Global Challenges Research Fund. This grant pays for the sequencing and basic bioinformatic analysis of the genomes of 10,000 Salmonella isolates from a range of developing countries in Africa and Latin America, including Colombia, Costa Rica, The Gambia, Malawi, Mexico, South Africa and Uganda.

The 10,000 Salmonella genomes project is designed to generate information relevant to the epidemiology, drug resistance and virulence factors of Salmonellae using a whole-genome sequencing approach. We are interested in identifying representative strains using a combination of core genome-based phylogenetics, identification of antibiotic resistance genes and comparisons of the accessory genome (particularly plasmids and phages). Ultimately, we will select individual strains that represent the diversity and clinical repertoire of invasive non-typhoidal Salmonella (iNTS), and we will use a functional genomic approach to identify novel iNTS virulence factors (funded by a Wellcome Trust Senior Investigator award to Jay Hinton).

We are taking a collaborative, open-access approach to maximise the value of the sequence data generated by this genome project for the worldwide Salmonella research community. Assembled genome sequences will become publicly available about 1 year after sequencing. Our collaborators' metadata will remain private until publication, but please contact Jay Hinton if you are interested in collaborating on the bioinformatic analysis of particular Salmonella genomes.

DNA sequencing is carried out with the innovative low-input sequencing pipeline at the Earlham Institute. This automated approach allows genome sequences to be generated relatively quickly. The Illumina short-read data will be used to produce genome assemblies at the Earlham Institute. With assistance from our colleagues at Enterobase, collaborators will be provided with the serotype and sequence type of each Salmonella isolate, as well as basic phylogenetic information. The subsequent detailed phylogenomic and accessory genome analysis will be labour-intensive, requiring a collaborative effort that is likely to take several years.

To be eligible for this study, Salmonella isolates must be sourced from developing countries on the DAC list of Official Development Assistance (ODA) recipients. We ask collaborators to provide Salmonella isolates with metadata such as geographical origin, body site (stool, blood culture, etc.) and the age and sex of the patient. For comparative purposes, we also accept a small number of additional Salmonella isolates from the environment, animals or food in developing countries.