10 stages, 40 embryos, ~120000 cells, more to come...
Zebrahub's inaugural release is a single-cell RNA sequencing timecourse dataset that offers a detailed account of zebrafish development at the resolution of individual embryos. The dataset covers 10 developmental stages, from end-of-gastrulation embryos to 10-day-old larvae (bud stage; 5-, 10-, 15-, 20-, and 30-somite stages; and 2, 3, 5, and 10 days post-fertilization), with four embryos sequenced per time point and a total of 120,440 cells analyzed. Our goal is to provide a user-friendly, high-quality, single-embryo-resolved timecourse dataset that leverages the latest single-cell technologies to give a comprehensive view of development.
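The sampling design described above can be tallied in a few lines. This is a minimal sketch: the stage labels and the `embryo_ids` naming scheme are illustrative assumptions, not identifiers taken from the Zebrahub data files.

```python
# Stage labels are illustrative; the released files may name stages differently.
STAGES = [
    "bud", "5-somite", "10-somite", "15-somite", "20-somite", "30-somite",
    "2 dpf", "3 dpf", "5 dpf", "10 dpf",
]
EMBRYOS_PER_STAGE = 4
TOTAL_CELLS = 120_440  # total number of cells reported for this release

# Hypothetical embryo identifiers, one per (stage, replicate) pair.
embryo_ids = [
    f"{stage}_rep{rep}"
    for stage in STAGES
    for rep in range(1, EMBRYOS_PER_STAGE + 1)
]

n_embryos = len(embryo_ids)
mean_cells_per_embryo = TOTAL_CELLS / n_embryos

print(f"{len(STAGES)} stages, {n_embryos} embryos")
print(f"mean cells per embryo: {mean_cells_per_embryo:.0f}")
```

The mean is only a rough guide; the number of cells recovered from each embryo may well differ between stages and replicates.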
More details about the experimental protocol can be found here.
The full dataset integrates all developmental stages to obtain temporally coherent cell-type annotations. Annotations drew on the previously annotated single-timepoint UMAPs and were cross-validated with a literature search based on the genes enriched in each cluster group. We used the Zebrafish Anatomy Ontology (ZFA) so that the cell-type annotations follow the ZFA's controlled vocabulary. You can download the data here.
10 stages, 40 embryos
We generated a UMAP for each developmental stage by combining the data from all individually sequenced embryos (four embryo replicates per stage). For each time point, we computed Leiden clusters and annotated them based on the expression of enriched genes, followed by a literature search and ZFIN database queries, and on existing published and annotated scRNA-seq data (Farrell et al. 2018; Wagner et al. 2018; Farnsworth et al. 2020; Raj et al. 2020).
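For readers who want to try a comparable per-stage analysis, the following is a minimal sketch using the Scanpy library. It assumes Scanpy (with its Leiden dependencies) is installed, that `adata` is an AnnData object holding raw counts, and that a column named `stage` in `adata.obs` records each cell's stage. The column name and all parameter values are placeholders, not the settings used to produce the Zebrahub annotations.

```python
def cluster_stage(adata, stage, stage_key="stage", resolution=1.0):
    """Pool all embryos of one stage, then embed and cluster their cells.

    `stage_key` is a hypothetical column name in `adata.obs`.
    """
    # Imported here so that defining the function does not require Scanpy.
    import scanpy as sc

    # Keep every cell from this stage, regardless of which embryo it came from.
    sub = adata[adata.obs[stage_key] == stage].copy()

    # Common preprocessing steps; all parameter values are placeholders.
    sc.pp.normalize_total(sub, target_sum=1e4)
    sc.pp.log1p(sub)
    sc.pp.highly_variable_genes(sub, n_top_genes=2000)
    sc.pp.pca(sub, n_comps=50)
    sc.pp.neighbors(sub)

    # Two-dimensional embedding and graph-based clustering.
    sc.tl.umap(sub)
    sc.tl.leiden(sub, resolution=resolution, key_added="leiden")

    # Rank genes enriched in each cluster as a starting point for annotation.
    sc.tl.rank_genes_groups(sub, groupby="leiden", method="wilcoxon")
    return sub
```

The ranked genes only suggest candidate labels; assigning cell types still requires the literature and database checks described above.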
hpf = hours post-fertilization
dpf = days post-fertilization
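Both units describe time since fertilization, so converting between them is a matter of multiplying or dividing by 24. The helper names below are hypothetical and not part of any Zebrahub tooling.

```python
HOURS_PER_DAY = 24

def dpf_to_hpf(dpf: float) -> float:
    """Convert days post-fertilization to hours post-fertilization."""
    return dpf * HOURS_PER_DAY

def hpf_to_dpf(hpf: float) -> float:
    """Convert hours post-fertilization to days post-fertilization."""
    return hpf / HOURS_PER_DAY

# The larval stages in this dataset, expressed in hours.
for dpf in (2, 3, 5, 10):
    print(f"{dpf} dpf = {dpf_to_hpf(dpf)} hpf")
```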
The full and timepoint-resolved datasets above are annotated at a relatively coarse level. To improve annotation resolution, we start with six high-level lineages that together cover all cells and time points. For each lineage, we then reproject, recluster, and annotate the resulting higher-resolution clusters with finer-grained labels. This two-level approach refines our annotations while retaining a hierarchical organization that facilitates exploration of the data. The data can be downloaded here.
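One way to picture the two-level organization is as a mapping from each fine-grained label to its parent lineage. The sketch below uses placeholder names throughout (`lineage_A`, `fine_type_a1`, and so on); they are not the actual Zebrahub lineages or cell types, and the helper functions are hypothetical.

```python
from collections import defaultdict

# Placeholder labels: fine-grained annotation -> high-level lineage.
FINE_TO_LINEAGE = {
    "fine_type_a1": "lineage_A",
    "fine_type_a2": "lineage_A",
    "fine_type_b1": "lineage_B",
}

def group_by_lineage(fine_to_lineage):
    """Invert the mapping: lineage -> sorted list of its fine-grained labels."""
    tree = defaultdict(list)
    for fine, lineage in fine_to_lineage.items():
        tree[lineage].append(fine)
    return {lineage: sorted(fines) for lineage, fines in tree.items()}

def coarsen(fine_labels, fine_to_lineage):
    """Roll per-cell fine-grained labels up to their parent lineage."""
    return [fine_to_lineage[label] for label in fine_labels]

tree = group_by_lineage(FINE_TO_LINEAGE)

# Three example cells with fine-grained labels, rolled up to lineages.
cells = ["fine_type_a2", "fine_type_b1", "fine_type_a1"]
lineages = coarsen(cells, FINE_TO_LINEAGE)
```

Keeping the fine-to-coarse mapping explicit means a cell can always be viewed at either level without re-annotating it.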
Zebrahub is an ongoing project, and we are working to improve the resolution of our annotations over time. We welcome collaboration to improve the quality of the cell-type annotations and other metadata variables. If you detect any ambiguity in the current data objects, or want to help us with a specific region or cluster, please contact us.
RNA velocity data can be downloaded here.