19.11.2025

Ocelli: A Mathematical Eye That Peers Inside Cells and Uncovers Their Future

Understanding life requires observing it at single-cell resolution. Thanks to a new tool developed by ICTER scientists, it’s now possible to track gene expression changes and how gene regulation mechanisms interact over time. Ocelli is software that transforms the analysis and visualization of multidimensional single-cell data.

Ocelli was developed by PhD student Piotr Rutkowski and Dr. Marcin Tabaka from the International Centre for Translational Eye Research (ICTER). Described in NAR Genomics and Bioinformatics, it meets the growing need for effective tools to analyze multimodal data obtained from single cells.

Cell division. Photo: Depositphotos

Modern sequencing technologies now enable simultaneous measurement of the transcriptome, chromatin accessibility, protein levels, and histone modifications in single cells. Each of these ‘modalities’ represents a different stage of gene regulation, from epigenetic control mechanisms to final protein products. However, understanding these dynamic processes requires tools that can integrate the modalities without losing biological and temporal resolution. Ocelli provides exactly that.

Biological Time Map

Unlike earlier methods mainly based on neural networks or probabilistic models, Ocelli combines topic modeling (TRM) with multimodal diffusion maps (MDM). MDM views transitions between cell states as diffusion processes within the space of multidimensional cell features, forming a complex network of dependencies that captures both the overall structure of differentiation and the subtle transitions between cell states.
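As a rough illustration of the diffusion-map idea for a single modality, the sketch below builds a Gaussian affinity between cells, row-normalizes it into a Markov transition matrix, and uses the leading non-trivial eigenvectors as coordinates. It is a generic textbook diffusion map written with NumPy, not Ocelli’s implementation; the function name diffusion_map, the bandwidth sigma, and the toy data are assumptions made for this example.

```python
import numpy as np

def diffusion_map(X, sigma=1.0, n_components=2, t=1):
    """Generic single-modality diffusion map (illustrative, not Ocelli's code)."""
    # Pairwise squared Euclidean distances between cells (rows of X).
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    # Gaussian kernel: nearby cells get an affinity close to 1.
    K = np.exp(-d2 / (2.0 * sigma**2))
    deg = K.sum(axis=1)
    # Symmetric conjugate of the Markov matrix P = D^-1 K (same eigenvalues as P).
    S = K / np.sqrt(np.outer(deg, deg))
    evals, evecs = np.linalg.eigh(S)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]
    # Right eigenvectors of P; the first one is constant and is skipped.
    psi = evecs / np.sqrt(deg)[:, None]
    coords = psi[:, 1:n_components + 1] * evals[1:n_components + 1] ** t
    P = K / deg[:, None]  # row-stochastic transition matrix between cells
    return P, evals, coords

rng = np.random.default_rng(0)
# Toy "trajectory": 50 cells along a noisy curve in 5 feature dimensions.
s = np.linspace(0.0, 1.0, 50)
X = np.stack([s, s**2, np.sin(3 * s), np.cos(3 * s), s**3], axis=1)
X += 0.01 * rng.normal(size=X.shape)

P, evals, coords = diffusion_map(X, sigma=0.5)
```

Each row of P is a probability distribution over the cells a given cell can "diffuse" to in one step, and coords gives every cell a position in the resulting low-dimensional diffusion space.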

This approach maps the trajectory of cell differentiation continuously, preserving the nonlinearity inherent in biological processes. Additionally, Ocelli assigns weights to each cell, indicating which modality (RNA, chromatin, or protein) most accurately reflects that cell’s current developmental stage. This is especially important because the significance of different mechanisms regulating gene expression varies at different stages of development. For example, epigenetic signals dominate in progenitor cells, while protein levels are more influential in determining the state of differentiated cells.
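One simple way to picture per-cell modality weights is as mixing coefficients for modality-specific transition matrices. The sketch below assumes that each modality already has its own row-stochastic transition matrix and that each cell’s weights sum to 1. How Ocelli actually estimates the weights and builds its multimodal Markov matrix is described in the paper and may differ from this simplified convex combination; the function name combine_modalities and all numbers are hypothetical.

```python
import numpy as np

def combine_modalities(transitions, weights):
    """Mix per-modality transition matrices using per-cell weights.

    transitions: list of (n_cells, n_cells) row-stochastic matrices, one per modality.
    weights: (n_cells, n_modalities) array whose rows sum to 1.
    """
    P = np.stack(transitions, axis=0)        # shape (n_modalities, n, n)
    W = np.asarray(weights, dtype=float).T   # shape (n_modalities, n)
    # Row i of the result is sum_m weights[i, m] * transitions[m][i, :].
    return np.einsum("mi,mij->ij", W, P)

# Toy example with three cells and two modalities (values are made up).
P_rna = np.array([[0.8, 0.2, 0.0],
                  [0.1, 0.8, 0.1],
                  [0.0, 0.2, 0.8]])
P_atac = np.array([[0.5, 0.5, 0.0],
                   [0.3, 0.4, 0.3],
                   [0.0, 0.5, 0.5]])
# Columns: [RNA weight, chromatin weight]. Cell 0 is "progenitor-like" and leans
# on chromatin; cell 2 is "differentiated-like" and leans on RNA.
weights = np.array([[0.1, 0.9],
                    [0.5, 0.5],
                    [0.9, 0.1]])

P_multi = combine_modalities([P_rna, P_atac], weights)
```

Because every row of the result is a convex combination of probability distributions, the combined matrix is again row-stochastic, so it can be used wherever a single-modality transition matrix would be.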

Reconstruction of developmental pathways in the hair follicle SHARE-seq dataset, comprising chromatin accessibility and transcriptome modalities. (A) Schematic depicting the differentiation pathways in the regenerative part of the hair follicle. (B) Ocelli’s visualization of the single-cell SHARE-seq data from the regenerative part of the hair follicle, using scVelo [26] RNA velocities. (C) Inferred chromatin accessibility weights are higher for progenitor TACs, whereas transcriptome weights are elevated for differentiated cells. (D) Distribution of the 10% of cells with the highest weights for each modality. (E) Gene signature activity, computed as mean z-scores of gene expression, for HS: medulla (HS-Me) and cortex (HS-Co); and IRS: Huxley’s layer (IRS-Hu) and Henle’s layer (IRS-He). (F) CellRank analysis detects four terminal differentiation states. Mean absorption probabilities vary within the TAC population: IRS cells are more likely to develop from TAC-1 and HS cells from TAC-2.

“In developmental biology, it’s not just what cell characteristics we measure that matters, but how they change over time. Ocelli enables us to capture these changes in a continuous, nonlinear, and biologically meaningful way, integrating various layers of information into a single, coherent representation,” explains Piotr Rutkowski, M.A., from ICTER, co-author of the Ocelli tool.

The authors conducted several analyses on both simulated and real-world data, including SHARE-seq data from mouse hair follicles and data from human bone marrow (ASAP-seq, NTT-seq, SHARE-seq) and blood (DOGMA-seq). Using synthetic data, Ocelli successfully reconstructed simulated developmental trees with multiple cell lineages, as well as more complex structures with sparse cell transitions. In both cases, it outperformed other methods in reconstructing developmental paths, maintaining trajectory continuity, and identifying bifurcation points.

In real-world data, Ocelli demonstrated exceptional accuracy in mapping cell differentiation in the regenerative part of the hair follicle. By analyzing chromatin accessibility and gene expression data simultaneously, it accurately traced the division of progenitor cells (TACs) into lineages that differentiate into the various hair follicle layers: the hair shaft medulla (HS-Me), the hair shaft cortex (HS-Co), and inner root sheath (IRS) cells, including Henle’s (IRS-He) and Huxley’s (IRS-Hu) layers. Moreover, CellRank analysis confirmed that the differentiation states identified by Ocelli align with the most probable trajectories inferred by RNA velocity algorithms.

A Breakthrough in Single-Cell Data Analysis

Two of the main challenges in single-cell data analysis are high dimensionality and noise. Data from single cells are inherently sparse – in RNA-seq, only 10-45% of transcripts are detected, and in ATAC-seq, detection drops to 1-10% of chromatin sites. Ocelli offers an efficient, mathematically grounded method for filling these gaps, based on the eigendecomposition of a multimodal Markov matrix, while preserving the structure of the data. This approach not only fills in missing data but also maintains biologically relevant correlations between genes and regulatory elements.
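To make the role of the eigendecomposition concrete, the following sketch smooths a sparse cell-by-gene matrix with a power of a Markov matrix, computing that power from eigenvectors instead of repeated matrix multiplication. It uses generic diffusion smoothing on a single toy affinity matrix as a stand-in; the construction of Ocelli’s multimodal Markov matrix is not reproduced here, and the function name smooth_with_eigh, the diffusion time t, and the toy values are assumptions.

```python
import numpy as np

def smooth_with_eigh(K, X, t=3, n_eig=None):
    """Apply P^t to X, where P = D^-1 K, via a symmetric eigendecomposition.

    K: symmetric (n_cells, n_cells) affinity matrix with positive row sums.
    X: (n_cells, n_genes) sparse measurements to smooth.
    n_eig: optionally keep only the n_eig eigenpairs of largest magnitude,
           which gives a low-rank approximation of P^t.
    """
    deg = K.sum(axis=1)
    # S = D^-1/2 K D^-1/2 is symmetric and shares its eigenvalues with P.
    S = K / np.sqrt(np.outer(deg, deg))
    evals, U = np.linalg.eigh(S)
    order = np.argsort(np.abs(evals))[::-1]
    if n_eig is not None:
        order = order[:n_eig]
    evals, U = evals[order], U[:, order]
    # P^t = D^-1/2 U diag(evals^t) U^T D^1/2
    left = U / np.sqrt(deg)[:, None]
    right = U.T * np.sqrt(deg)[None, :]
    return left @ ((evals**t)[:, None] * (right @ X))

# Toy data: six cells on a chain, each connected to itself and its neighbours.
K = np.eye(6) + np.eye(6, k=1) + np.eye(6, k=-1)
# Two genes with alternating dropouts (zeros).
X = np.array([[5, 0], [0, 1], [4, 0], [0, 2], [3, 0], [0, 3]], dtype=float)

imputed = smooth_with_eigh(K, X, t=3)

# Cross-check against the direct matrix power.
P = K / K.sum(axis=1, keepdims=True)
reference = np.linalg.matrix_power(P, 3) @ X
```

With all eigenpairs kept, the result matches the direct matrix power; passing a smaller n_eig trades exactness for speed and additional denoising, which is the usual motivation for working in the eigenbasis.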

Cell selection strategy for refined analysis of trajectories. (A) The multimodal visualization of ASAP-seq human bone marrow data with coarse annotation of hematopoietic cell lineages. (B) Violin plots show the distribution of multimodal weights in each cluster from panel (A). (C) Distribution of the 15% of cells with the highest weights. (D) Cells from the HSCs cluster were assigned 1s (t = 0), and the diffusion was performed for different diffusion times t = 1, 10, 20. (E) Cells selected at t = 20 were reanalyzed with Ocelli. The FLE plots show refined hematopoietic trajectories with cells colored by pseudotime or epitope level of lineage-specific protein markers. HSCs, hematopoietic stem cells; E, erythroid progenitors; pDCs, plasmacytoid dendritic cells; Ba, basophilic/mast cell progenitors; B, B cell progenitors; cDCs, classical dendritic cells; M, myeloid progenitors.

In terms of speed, Ocelli surpasses competitors – analyzing 100,000 cells in less than three minutes – making it one of the fastest tools in its class. It also allows exploration of developmental subtrajectories, such as the differentiation of bone marrow stem cells into B lymphocytes, monocytes, or dendritic cells, without manual cell grouping or cluster selection.
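The cell selection strategy described in the figure caption above (start cells assigned 1s at t = 0, then diffusion for t = 1, 10, 20) can be pictured with a small sketch. It assumes a row-stochastic transition matrix P and pushes the starting mass forward along it; the direction of the diffusion, the selection rule (any non-zero mass), and the toy chain of cells are assumptions for illustration, not details confirmed by the source.

```python
import numpy as np

def spread_from_start(P, start_mask, t):
    """Diffuse an indicator vector (1 for start cells, 0 elsewhere) for t steps."""
    v = np.asarray(start_mask, dtype=float)
    for _ in range(t):
        v = v @ P  # one diffusion step: mass flows along the transitions in P
    return v

# Toy data: 30 cells on a chain; each cell transitions to itself or a neighbour.
n = 30
K = np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
P = K / K.sum(axis=1, keepdims=True)

# The first two cells play the role of the starting (stem cell) cluster.
start = np.zeros(n, dtype=bool)
start[:2] = True

# Number of cells reached by the diffusion at each diffusion time.
reached = {t: int(np.count_nonzero(spread_from_start(P, start, t)))
           for t in (1, 10, 20)}
```

On this toy chain the selection grows from 3 cells at t = 1 to 12 at t = 10 and 22 at t = 20. On real data a threshold on the diffused values would typically replace the non-zero test, since almost every cell receives some small amount of mass.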

“We wanted to create a tool that would accurately reproduce biological processes hidden in multidimensional data from multimodal single-cell profiling. Ocelli can identify and display complex cell development pathways, including rare, transient, or subtle ones buried in data noise. This is a new step forward for multimodal data analysis,” says Dr. Marcin Tabaka, Group Leader of the CGG at ICTER and co-author of the Ocelli tool.

Ocelli opens new possibilities for fields like developmental biology, immunology, oncology, and regenerative medicine. By precisely mapping cellular states over time, it can be used to study stem cell differentiation, cancer development, immune responses, and the effects of experimental treatments. Unlike methods based on neural networks, Ocelli visualizes data as a dynamic, continuous, and directional process.

The authors highlight that key advantages of Ocelli are its transparency, flexibility, and accessibility – the source code is publicly available, along with documentation and examples. This enables research teams to adapt the algorithm to their specific needs, regardless of tissue type.


Source: Piotr Rutkowski, Marcin Tabaka (2025). Ocelli: an open-source tool for the analysis and visualization of developmental multimodal single-cell data. NAR Genomics and Bioinformatics.

DOI: https://doi.org/10.1093/nargab/lqaf040

Author: Scientific Editor Marcin Powęska