Reconstruction of complex single-cell trajectories using CellRouter

Abstract

A better understanding of the cell-fate transitions that occur in complex cellular ecosystems in normal development and disease could inform cell engineering efforts and lead to improved therapies.

However, a major challenge is to simultaneously identify new cell states and the transitions between them in order to elucidate the gene expression dynamics governing cell-type diversification. Here, we present CellRouter, a multifaceted single-cell analysis platform that identifies complex cell-state transition trajectories by using flow networks to explore the subpopulation structure of multi-dimensional, single-cell omics data.

We demonstrate its versatility by applying CellRouter to single-cell RNA sequencing data sets to reconstruct cell-state transition trajectories during hematopoietic stem and progenitor cell (HSPC) differentiation to the erythroid, myeloid and lymphoid lineages, as well as during re-specification of cell identity by cellular reprogramming of monocytes and B-cells to HSPCs. CellRouter opens previously undescribed paths for in-depth characterization of complex cellular ecosystems and establishment of enhanced cell engineering approaches.

Introduction

Gene expression profiling has been widely applied to understand regulation of cellular processes in development and disease1. However, micro-environmental influences, asynchronous cell behaviors, and molecular stochasticity often lead to pronounced heterogeneity in cell populations, obscuring the dynamic biological principles governing cell-state transitions. Single-cell, high-throughput technologies present an opportunity to characterize these states and their transitions by simultaneously quantifying a large number of parameters at single-cell resolution. However, as cells are destroyed during measurement, data-driven approaches are required to illuminate the dynamics of cellular programs governing fate transitions.

To study gene expression dynamics, several algorithms have been developed to organize single cells in pseudo-temporal order based on transcriptomic or proteomic divergence2,3,4,5,6. While current algorithms best identify trajectories between the most phenotypically distant cell states, which are molecularly very distinct, they are less robust in reconstructing trajectories from early states towards intermediate or transitory cell states. Limitations include reconstructing only linear trajectories (Waterfall, Monocle 1), identifying only a single branch point (Wishbone), or requiring a priori knowledge of the number of branches (Diffusion Pseudotime, DPT).

Monocle 2 addresses many of these challenges but is not designed to reconstruct trajectories between any two chosen cell states, which might include transitions from or to rare cell types. Moreover, as they are designed to identify branching trajectories, Wishbone, DPT, and Monocle 2 are less suited to detect convergent differentiation paths, such as during plasmacytoid dendritic cell development from distinct precursor cells7.

To overcome these challenges, we developed CellRouter (Supplementary Software 1–4, https://github.com/edroaldo/cellrouter), a general single-cell trajectory detection algorithm capable of exploring the subpopulation structure of single-cell omics data to reconstruct trajectories of complex transitions between cell states. CellRouter requires no a priori knowledge of trajectory structure, such as number of cell fates or branches. CellRouter is a transition-centered trajectory reconstruction algorithm, distinct from the bifurcation-centered algorithms such as Wishbone, DPT, and Monocle 2.

While bifurcations occur during lineage diversification, transitions also converge to specific lineages or occur between cell states within branches. CellRouter relaxes the requirement of identifying branching points during cell-fate transitions and implements a flow network algorithm to flexibly reconstruct multi-state transition trajectories. Moreover, CellRouter is independent of dimensionality reduction techniques and can be used, for example, with principal component analysis (PCA), t-distributed stochastic neighbor embedding (t-SNE) or diffusion maps.

CellRouter is a flexible single-cell analysis platform designed to reconstruct single-cell trajectories of complex cell-state transitions. We apply CellRouter to several single-cell RNA-sequencing data sets to provide insight into multi-lineage differentiation from hematopoietic stem and progenitor cells (HSPCs) in snapshot data sets and also during a time-course of mesoderm diversification towards the blood lineage, revealing sequential waves of gene expression changes along differentiation trajectories. Moreover, we provide insight to guide cellular reprogramming by exploring stem cell differentiation data sets as a blueprint to identify reprogramming trajectories and develop new cell engineering strategies. CellRouter integrates subpopulation identification, multi-state trajectories, and gene regulatory networks (GRNs) to provide new insights into cell-state transitions during lineage diversification, convergence, or cell reprogramming.

Reconstructing complex single-cell trajectories

To identify multi-state transition trajectories, CellRouter builds a k-nearest neighbor (kNN) graph from cell-cell relationships in a space of reduced dimensionality (Fig. 1). CellRouter then transforms the kNN graph to represent cell-cell similarities by assigning weights to each edge based on network similarity metrics (e.g., the Jaccard similarity). This approach weakens connections between unrelated cell types and strengthens connections between cells within the same subpopulation, better representing phenotypic relatedness8. Next, using community-detection algorithms (e.g., the Louvain method), subpopulations are defined by identifying communities of densely inter-connected cells8, 9.
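The graph construction and edge weighting described above can be illustrated with a minimal Python sketch. This is not CellRouter's own code: the function names `knn_graph` and `jaccard_weights`, the toy coordinates, the Euclidean distance, and the choice to compare closed neighborhoods (each cell together with its neighbors) are all assumptions made for illustration. Community detection (e.g., the Louvain method) would then run on the weighted graph and is left out here.

```python
import math


def knn_graph(points, k):
    """Map each cell index to the set of its k nearest neighbors (Euclidean)."""
    neighbors = {}
    for i, p in enumerate(points):
        ranked = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbors[i] = {j for _, j in ranked[:k]}
    return neighbors


def jaccard_weights(neighbors):
    """Weight every kNN edge by the Jaccard similarity of the two neighborhoods."""
    weights = {}
    for i, nbrs in neighbors.items():
        a = nbrs | {i}  # closed neighborhood: the cell plus its neighbors
        for j in nbrs:
            b = neighbors[j] | {j}
            # Store each undirected edge once, keyed by (smaller, larger) index.
            weights[(min(i, j), max(i, j))] = len(a & b) / len(a | b)
    return weights


# Hypothetical coordinates in a 2-D reduced space: two tight groups and a stray cell.
cells = [(0, 0), (0, 1), (1, 0), (1, 1),          # group A: cells 0-3
         (10, 10), (10, 11), (11, 10), (11, 11),  # group B: cells 4-7
         (5, 5)]                                  # cell 8, lying between the groups
neighbors = knn_graph(cells, k=3)
weights = jaccard_weights(neighbors)
```

In this toy example, edges within a group receive a weight of 1.0 because the two cells share their entire neighborhood, whereas the edges linking the stray cell 8 to group A receive 0.6. This is the sense in which shared-neighbor weighting strengthens within-subpopulation connections relative to connections between less related cells.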

Then, CellRouter uses a graph theory approach to solve the minimum cost flow problem and precisely define trajectories between any two subpopulations (t1, t2, ..., t6)10, 11, including transitions to intermediate states (t1, t2) or rare or under-represented cell types or states (tr) (Fig. 1, Supplementary Note 1, Supplementary Method). Importantly, CellRouter identifies a subset of representative transitioning cells, better accounting for stochastic or regulated cell-to-cell variation. Finally, to account for dropout events in single-cell RNA-seq data, CellRouter explores the local topology of the kNN graph to smooth the kinetic trends along each trajectory.
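To make the minimum cost flow formulation concrete, the following Python sketch solves a small instance with the textbook successive-shortest-path method. It is an illustration under stated assumptions, not CellRouter's implementation: the class `MinCostFlow`, the node labels, the unit capacities, and the integer edge costs (standing in for a dissimilarity that could be derived from the edge weights) are all hypothetical. A source node is attached to the cells of the starting subpopulation and a sink node to the cells of the target subpopulation.

```python
from collections import deque


class MinCostFlow:
    """Successive shortest paths on a residual graph (Bellman-Ford style search)."""

    def __init__(self, n):
        self.n = n
        self.adj = [[] for _ in range(n)]  # edge indices leaving each node
        self.to, self.cap, self.cost = [], [], []

    def add_edge(self, u, v, cap, cost):
        # Forward edge gets an even index; its residual twin is index ^ 1.
        for a, b, c, w in ((u, v, cap, cost), (v, u, 0, -cost)):
            self.adj[a].append(len(self.to))
            self.to.append(b)
            self.cap.append(c)
            self.cost.append(w)

    def solve(self, s, t):
        flow = total = 0
        while True:
            dist = [float("inf")] * self.n
            prev = [-1] * self.n
            queued = [False] * self.n
            dist[s] = 0
            queue = deque([s])
            while queue:  # cheapest residual path from s to every node
                u = queue.popleft()
                queued[u] = False
                for e in self.adj[u]:
                    v = self.to[e]
                    if self.cap[e] > 0 and dist[u] + self.cost[e] < dist[v]:
                        dist[v], prev[v] = dist[u] + self.cost[e], e
                        if not queued[v]:
                            queued[v] = True
                            queue.append(v)
            if prev[t] == -1:  # sink unreachable: flow is maximal
                return flow, total
            push, v = float("inf"), t
            while v != s:  # bottleneck capacity along the path
                push = min(push, self.cap[prev[v]])
                v = self.to[prev[v] ^ 1]
            v = t
            while v != s:  # augment along the path
                self.cap[prev[v]] -= push
                self.cap[prev[v] ^ 1] += push
                v = self.to[prev[v] ^ 1]
            flow += push
            total += push * dist[t]

    def flow_edges(self):
        """Forward edges (u, v) that carry flow after solve()."""
        return [(self.to[e ^ 1], self.to[e])
                for e in range(0, len(self.to), 2) if self.cap[e ^ 1] > 0]


# Hypothetical toy network: source, two start cells, two intermediate cells,
# two target cells, sink. All capacities are 1; costs are made-up integers.
S, A1, A2, M1, M2, B1, B2, T = range(8)
mcf = MinCostFlow(8)
for u, v, w in [(S, A1, 0), (S, A2, 0), (A1, M1, 1), (A2, M1, 1), (A2, M2, 3),
                (M1, B1, 1), (M1, B2, 2), (M2, B2, 1), (B1, T, 0), (B2, T, 0)]:
    mcf.add_edge(u, v, 1, w)
flow, cost = mcf.solve(S, T)
used = set(mcf.flow_edges())
```

Here two units of flow reach the sink at a total cost of 5, and both pass through the intermediate cell M1, because routing one unit through M2 would raise the total cost to 6. The cells on flow-carrying edges play the role of the representative transitioning cells between the two subpopulations.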

Fig. 1
