Saccharomyces cerevisiae Reference Assembly Panel Database (ScRAPdb)


Introduction


As a unicellular eukaryote, the budding yeast Saccharomyces cerevisiae strikes a unique balance between biological complexity and experimental tractability, and has long served as a classic model for both basic and applied studies. Recently, S. cerevisiae has further emerged as a leading system for studying the natural diversity of genome evolution and its functional implications at the population scale. High-quality comparative and functional genomics data are critical for such efforts. Here we exhaustively expanded the Telomere-to-Telomere (T2T) S. cerevisiae reference assembly panel (ScRAP), which we previously constructed for 142 strains, to cover high-quality genome assemblies and annotations for 264 S. cerevisiae strains from diverse geographical and ecological niches, as well as 33 outgroup strains from the other described species of the Saccharomyces genus. We created a dedicated online database, ScRAPdb (https://www.evomicslab.org/db/ScRAPdb/), to host this expanded pangenome collection. On top of the pangenome, ScRAPdb also integrates a population-scale pan-omics atlas (pantranscriptome, panproteome, and panphenome) and rich data exploration toolkits for intuitive genomic analyses. All curated data and downstream analysis results can be easily downloaded from the database. We expect ScRAPdb to become a highly valuable platform for the yeast community and beyond, leading to a pan-omics understanding of global genetic and phenotypic diversity.


Summary of the dataset


A. The geographic origin of the ScRAPdb strain collection.



B. The ecological origin of the ScRAPdb strain collection.




C. Statistics of the sequencing technologies used for the ScRAPdb assemblies.



D. The BioProject, strain, and assembly counts of the ScRAPdb assemblies.



E. Distribution of the strain ploidies and the fraction of strains with phased assemblies (left) and mitochondrial assemblies (right).



F. The strain intersections of the pangenome, pantranscriptome, panproteome, and panphenome datasets in ScRAPdb.


Note:
Genome: ScRAPdb strain collection
Pantranscriptome_Dataset: Caudal et al. (2024) Nature Genetics
Panproteome_Dataset1: Teyssonnière et al. (2024) PNAS
Panproteome_Dataset2: Muenzner et al. (2024) Nature
Panphenome_Dataset1: Peter et al. (2018) Nature
Panphenome_Dataset2: De Chiara et al. (2022) Nature Ecology & Evolution
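
For readers who want to reproduce the kind of strain intersection summarized in panel F from their own strain lists, the following is a minimal Python sketch. The strain IDs (S1 to S4) and the function name `exclusive_intersections` are hypothetical placeholders and do not come from ScRAPdb; the dataset names only mirror the categories listed in the note above, and this is not the procedure used to build the panel.

```python
def exclusive_intersections(datasets):
    """Group strains by the exact set of datasets they appear in.

    `datasets` maps a dataset name to a set of strain IDs. The result maps
    a frozenset of dataset names to the strains found in exactly those
    datasets and in no others.
    """
    all_strains = set().union(*datasets.values())
    groups = {}
    for strain in all_strains:
        # Names of every dataset that contains this strain
        members = frozenset(
            name for name, strains in datasets.items() if strain in strains
        )
        groups.setdefault(members, set()).add(strain)
    return groups


# Hypothetical strain IDs, for illustration only
datasets = {
    "Genome": {"S1", "S2", "S3", "S4"},
    "Pantranscriptome": {"S1", "S2"},
    "Panproteome": {"S2", "S3"},
    "Panphenome": {"S2", "S4"},
}

groups = exclusive_intersections(datasets)
for members, strains in sorted(groups.items(), key=lambda kv: sorted(kv[0])):
    print(" & ".join(sorted(members)), "->", len(strains), "strain(s)")
```

Grouping by the exact combination of datasets, rather than computing every pairwise overlap, gives each strain a single category, so the group sizes sum to the total number of strains.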