MIBiG 3.0: a community-driven effort to annotate experimentally validated biosynthetic gene clusters

Abstract

With an ever-increasing amount of (meta)genomic data being deposited in sequence databases, (meta)genome mining for natural product biosynthetic pathways occupies a critical role in the discovery of novel pharmaceutical drugs, crop protection agents and biomaterials. The genes that encode these pathways are often organised into biosynthetic gene clusters (BGCs). In 2015, we defined the Minimum Information about a Biosynthetic Gene cluster (MIBiG): a standardised data format that describes the minimally required information to uniquely characterise a BGC. We simultaneously constructed an accompanying online database of BGCs, which has since been widely used by the community as a reference dataset for BGCs and was expanded to 2021 entries in 2019 (MIBiG 2.0). Here, we describe MIBiG 3.0, a database update comprising large-scale validation and re-annotation of existing entries and 661 new entries. Particular attention was paid to the annotation of compound structures and biological activities, as well as protein domain selectivities. Together, these new features keep the database up-to-date, and will provide new opportunities for the scientific community to use its freely available data, e.g. for the training of new machine learning models to predict sequence-structure-function relationships for diverse natural products. MIBiG 3.0 is accessible online at https://mibig.secondarymetabolites.org/.

Document Details

Document Type
Pub Defense Publication
Publication Date
Nov 18, 2022
Source ID
10.1093/nar/gkac1049

Entities

People

  • Afif Pranaya Jati
  • Aditya M Kunjapur
  • Adriana Rego
  • Aruna Vigneshwari
  • Athina Gavriilidou
  • Barbara R Terlouw
  • Benjamin Philmus
  • Bita Pourmohsenin
  • Catarina Loureiro
  • Chao Du
  • César Aguilar
  • Damien Gayrard
  • Daniel W Udwary
  • Darren J Scobie
  • David Meijer
  • Dong Yang
  • Edward Kalkreuter
  • Eric J N Helfrich
  • Eve Tallulah Roxborough
  • Fernanda O Chagas
  • Francisco Barona-Gómez
  • Friederike Biermann
  • Gajender Aleti
  • Geng-min Lin
  • George Lund
  • Hannah E. Augustijn
  • Huali Xie
  • J Abraham Avelar-Rivas
  • Jaclyn M. Winter
  • Jeffrey A van Santen
  • Jingwei Yu
  • Jonathan Parra
  • Jordan Bernaldo Agüero
  • Jorge Navarro
  • Joris J. R. Louwen
  • Justin J J van der Hooft
  • Jérome Collémare
  • Kai Blin
  • Karina Gutiérrez-García
  • Katherine R. Duncan
  • Kristiina Vind
  • Kristina Haslinger
  • Kumar Saurabh Singh
  • Kyo Bin Kang
  • Liana Zaroubi
  • Lotte J. U. Pronk
  • Luis A Avitia-Domínguez
  • Luis Rodrigo Rosas-Becerra
  • Marc G Chevrette
  • Marnix H Medema
  • Michael J J Recchia
  • Michelle Schorn
  • Mitja Zdouc
  • Mohammad Alanjary
  • Nelly Selem-Mojica
  • Nico L L Louwen
  • Nicole E. Avalon
  • Nika Sokolova
  • Nikolaos Kalyvas
  • Pablo Cruz-Morales
  • Raquel Castelo-Branco
  • Rex D A B
  • Roger G Linington
  • Sam E Williams
  • Sang Hyeon Lee
  • Satria A Kautsar
  • Serina L Robinson
  • Sophie P. J. M. Vromans
  • Suhad A A Al-Salihi
  • Susan Egbert
  • Thomas E. Witte
  • Thomas J Booth
  • Thomas Tørring
  • Tilmann Weber
  • Valentin Waschulin
  • Vincent A Bielinski
  • Víctor J. Carrión
  • Wonyong Kim
  • Xiaoyu Tang
  • Yong-Xin Li
  • Zachary L Reitz
  • Zheng Zhong

Organizations

  • Aarhus University
  • Biotechnology and Biological Sciences Research Council
  • Carnegie Institution for Science
  • Chinese Academy of Agricultural Sciences
  • Consejo Nacional de Humanidades, Ciencias y Tecnologías
  • Danish National Research Foundation
  • Federal University of Rio de Janeiro
  • Fundação Carlos Chagas Filho de Amparo à Pesquisa do Estado do Rio de Janeiro
  • Fundação para a Ciência e Tecnologia
  • German Research Foundation
  • Goethe University Frankfurt
  • J. Craig Venter Institute
  • John Innes Centre
  • Joint Genome Institute
  • Leiden University
  • Massachusetts Institute of Technology
  • National Autonomous University of Mexico
  • National Institutes of Health
  • National Research Foundation of Korea
  • National Science Foundation
  • Netherlands Institute of Ecology
  • Netherlands eScience Center
  • Novo Nordisk Fonden
  • Oregon State University
  • Purdue University
  • Rothamsted Research
  • Simon Fraser University
  • Sookmyung Women's University
  • Southern University of Science and Technology
  • Tennessee State University
  • United States Department of Energy
  • University of Bristol
  • University of California, San Diego
  • University of Delaware
  • University of Florida
  • University of Groningen
  • University of Hong Kong
  • University of Johannesburg
  • University of Manitoba
  • University of Nottingham
  • University of Ottawa
  • University of Porto
  • University of Strathclyde
  • University of Szeged
  • University of Tübingen
  • University of Utah
  • University of Warwick
  • Wageningen University & Research
  • Westerdijk Institute

Tags

Fields of Study

  • Biology

Readers

  • Database Systems and Applications
  • Molecular Genetics
  • Theoretical Analysis

Technology Areas

  • AI & ML
  • AI & ML - Information Retrieval
  • AI & ML - Neural Networks