06 April 2023, 12.30 - 13.30
Impulse, Speakers’ Corner
Please sign up here
Feel free to bring your own lunch and discuss with fellow WUR researchers!
From Twitter post to community-driven annotation marathon: The creation of MIBiG 3.0
Biosynthetic gene clusters (BGCs) are genetic units which encode the production of natural products: specialised metabolites with a wide array of societally relevant functions, including antibiotic, antimalarial, antifungal, and herbicidal properties. With genome sequencing costs at an all-time low, BGC detection from sequence data has become a standard step in natural product discovery pipelines. To make it possible to easily cross-reference a genome with experimentally characterised BGCs, the Minimum Information about a Biosynthetic Gene cluster (MIBiG) standard was defined in 2018, with an accompanying database that facilitates comparison of novel and previously validated BGCs.
In 2022, we published the third issue of this database, which up to that moment had been maintained biennially by a core team of 10 researchers. Maintenance mostly involves going through the literature to find newly published biosynthetic gene clusters, comparing them to existing database entries, and recording them in a format that can easily be converted to an HTML page. For MIBiG 3.0, we took to Twitter to gauge interest in joining our annotation effort. Instead of the small handful of people we expected to sign up, 86 annotators expressed interest, many of whom were experts in the field. With an 8-fold increase in manpower compared to previous years, we created the largest database update yet, with 661 new entries and 4871 separate data points added to the database.
In this talk, Barbara Terlouw will discuss how the MIBiG team mobilised and coordinated 86 researchers from four different continents in this mammoth annotation effort, and how the team ensured consistent annotation quality throughout the process.