D1.440 - Single cell pipeline to improve immune cell annotation through comparison of multiple bioinformatic methods
Background
Accurate cell annotation is vital when using single cell sequencing to improve our understanding of allergy among individual immune cell types. Since the gene expression captured by single cell RNA-sequencing is influenced by both technical and dataset- or sample-specific biases, it may not align with classical markers from the literature, protein expression, or surface markers. Additionally, technical artifacts and biases of the bioinformatic tools used in analysis and clustering may affect the downstream annotations. While no single method, reference dataset, or database is optimal for annotating any individual dataset, comparison among multiple methods and resources may improve annotation. We therefore aimed to improve immune cell annotation by implementing a range of methods in a bioinformatic pipeline.
Method
A selection of cell annotation methods was incorporated into the pipeline. Two automatic reference-based annotation methods were implemented, with options to use custom reference datasets from the literature. Marker genes from five cell marker databases and the literature were used as input for two gene set/module scoring methods. An iterative process of re-clustering and plotting compared the effects of three feature selection methods, three dimensionality reduction methods, and a range of parameters on the separation of cell types, cell states, and batch effects among clusters, and enabled the most relevant, dataset-specific markers to be identified.
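The idea of combining marker-based scoring with reference-based calls can be illustrated with a minimal sketch. This is not the pipeline's implementation: the abstract does not name the scoring or reference-based methods used, so the simplified score below (mean marker expression minus the cell's mean expression), the majority vote, the function names, the gene names, and the toy expression values are all assumptions chosen for illustration only.

```python
from collections import Counter
from statistics import mean


def module_score(cell_expr, marker_genes):
    """Simplified gene set score (an assumption, not a published method):
    mean expression of the marker genes found in the cell, minus the
    mean expression of all genes in that cell."""
    present = [g for g in marker_genes if g in cell_expr]
    if not present:
        return 0.0
    return mean(cell_expr[g] for g in present) - mean(cell_expr.values())


def consensus_label(labels):
    """Majority vote across methods. Returns (label, agreement fraction);
    a tie for first place is reported as 'ambiguous'."""
    counts = Counter(labels).most_common()
    top_fraction = counts[0][1] / len(labels)
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return "ambiguous", top_fraction
    return counts[0][0], top_fraction


# Hypothetical toy data: normalised expression per cell (illustrative values).
cells = {
    "cell_1": {"CD68": 5.0, "C1QA": 4.0, "S100A8": 0.0, "FCN1": 1.0, "ACTB": 3.0},
    "cell_2": {"CD68": 1.0, "C1QA": 0.0, "S100A8": 6.0, "FCN1": 5.0, "ACTB": 3.0},
}

# Hypothetical marker sets standing in for database/literature markers.
marker_sets = {"Macrophage": ["CD68", "C1QA"], "Monocyte": ["S100A8", "FCN1"]}

# Hypothetical labels from two reference-based annotation methods.
reference_calls = {
    "cell_1": ["Macrophage", "Macrophage"],
    "cell_2": ["Monocyte", "Macrophage"],
}


def score_based_label(cell_expr):
    """Assign the cell type whose marker set scores highest."""
    scores = {ct: module_score(cell_expr, genes) for ct, genes in marker_sets.items()}
    return max(scores, key=scores.get)


for cell_id, expr in cells.items():
    votes = reference_calls[cell_id] + [score_based_label(expr)]
    label, agreement = consensus_label(votes)
    print(cell_id, label, round(agreement, 2))
```

In this toy setting, cells where the methods disagree receive a lower agreement fraction, which is one simple way to flag clusters that merit the iterative re-clustering and marker inspection described above.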
Results
Use of the pipeline enabled comparison of multiple methods and parameters. The combined approach and iterative process identified the most relevant genes for macrophage annotation in scRNA sequencing of pericardial fluid. Cell signature scoring and visualisation of the resulting clusters identified clusters characterised by altered cell states rather than cell type, and the optimal parameters for separating macrophages from monocytes with the compared methods were found: five clusters within 273 macrophages were annotated.
Conclusion
Macrophage annotation was improved by using the combined references, databases and parameters implemented in the pipeline, compared with using any individual method.
