Year 3
The Data Center Management group (DCM) has been very productive during the past two quarters. The number of labs producing genomics data has continued to grow. During this period we integrated files and metadata for eight new datasets and added new analysis files or primary data for eight additional datasets. We released analysis pages for eight datasets and created summary pages for fifteen datasets, so that every dataset submitted to date now has a summary. We developed a new public-facing portal for the released data, including a landing page suitable for a less technical audience, and are well into the development of a new interactive viewer for single cell data. The DCM collaborated with the Data Curation Group (DCG) on the production of the Minimum Information about a Stem Cell Experiment (MISCE) v1.1 reference, and began collaborating with the CIP4 group to integrate their work into the dataset summary and analysis pages.
The DCG made significant progress over the past six months. A second version of the MISCE metadata standard (ver 1.2) has been released and incorporated into the annotation pipeline of SCHub. Working with the DCM, we put in place a new protocol for annotating newly contributed samples: the SCHub contacts work directly with the labs, and the ontology designers (at JCVI) work with SCHub rather than also directly with the labs. This has greatly streamlined the interaction and avoids the confusion of collaborating labs having to interface with two different informatics groups, so that all decisions can be made in a centralized and controlled fashion. Development of the MISCE standard will continue through the next reporting period, and we will publish the descriptions in the coming months.
We have also formed two new Analysis Working Groups (AWGs) to organize collaborative work on thematic projects: the Brain of Cells (BoC) and the Cardiac AWGs. We are collecting a large set of single cell RNA-Seq data for the BoC project to focus the analysis on identifying new cell types in this fetal brain collection. The CIP4 methods will be tested and evaluated on the BoC. A “report card” summarizing how a newly contributed sample relates to samples previously seen in the BoC will be added to the SCHub data summaries in the next few months. Details on each milestone are enumerated below.