How to get started with single-cell data analysis

Once you receive your single-cell sequencing data, you can get started with your single-cell data analysis. It can be overwhelming when you first download the data. Here, we’ll give you some tips for the first steps of single-cell sequencing data analysis.

Choose your method

The first and most important point is that there is no single, general way to analyze single-cell data. There are many different methods and software solutions. We advise you to test several of them in an exploratory way to find the one that works best for your data and biological question.

You have three options for your analysis:

Learn R or Python and run one of the many single-cell analysis packages
Download or buy software into which you can load your data for downstream analysis and visualization
Outsource your analysis to a collaborator or company

Here, we’ll give you some advice for the first option: performing your own analysis. We assume that you have count tables of mapped and demultiplexed data. This means that your data is in one large table with genes as rows and cells as columns. This is the typical starting point for single-cell data analysis.
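As a minimal sketch of that starting point, the snippet below parses a tiny, made-up count table with genes as rows and cells as columns. The gene names, cell IDs, and values are hypothetical, and real count tables are usually far larger and often stored in sparse formats. Note that some tools (for example Scanpy's AnnData) expect cells as rows, so a transpose may be needed.

```python
import csv
import io

import numpy as np

# Hypothetical count table: first column holds gene names,
# the remaining columns hold counts for one cell each.
raw = """gene,cell_1,cell_2,cell_3
GeneA,5,0,2
GeneB,0,3,7
"""

reader = csv.reader(io.StringIO(raw))
header = next(reader)
cell_ids = header[1:]          # column labels are the cells

gene_names = []
rows = []
for row in reader:
    gene_names.append(row[0])
    rows.append([int(value) for value in row[1:]])

counts = np.array(rows)        # shape: (genes, cells)
cells_by_genes = counts.T      # shape: (cells, genes), for tools that expect it

print("genes x cells:", counts.shape)
print("cells x genes:", cells_by_genes.shape)
```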

Step 1: Quality Control

Start with quality control (QC) of your data right away. The QC will give you an instant idea about how well your experiment worked at a technical level. If the quality is not sufficient, you might need to repeat the experiment.

So, you can avoid a lot of wasted time by doing the quality control before moving on to downstream analysis. A few helpful questions: Do you detect the number of cells and read depth you were looking for? Do most cells look good (see filtering below)? Are all samples equally represented in the data?

You can do this by loading your data in R or Python and looking at a few basic metrics. First, take a look at the number of successfully mapped reads. Second, check the number of cells in your dataset. And third, take a look at the number of genes per cell.
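The cell and gene metrics can be computed directly from the count table. The sketch below uses a small, hypothetical genes-by-cells matrix; the number of mapped reads is not part of the count table and would normally come from the report of your mapping pipeline.

```python
import numpy as np

# Toy count table (hypothetical values): rows are genes, columns are cells.
counts = np.array([
    [5, 0, 2, 0],
    [0, 0, 7, 1],
    [3, 1, 0, 0],
    [1, 4, 1, 0],
])

n_genes, n_cells = counts.shape
counts_per_cell = counts.sum(axis=0)         # total counts in each cell (column)
genes_per_cell = (counts > 0).sum(axis=0)    # genes with at least one count

print("number of cells:", n_cells)
print("total counts per cell:", counts_per_cell.tolist())
print("genes detected per cell:", genes_per_cell.tolist())
```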

Step 2: Exploring filtering parameters

After the basic metrics, you can continue the quality control with three standard histograms.

A great first plot to make is a histogram of the total UMIs/cell. This will give you an overview of how UMI counts are distributed across cells and insight into what filtering cutoffs to use. It also gives you a rough estimate of how many cells yield good data.
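A minimal sketch of such a histogram, using NumPy only and printing a text version instead of a plot. The UMI totals, the log-spaced bins, and the cutoff of 1,000 UMIs are all hypothetical choices for illustration, not recommended values.

```python
import numpy as np

# Hypothetical total UMI counts for ten cells.
umis_per_cell = np.array([120, 150, 900, 1100, 1500, 2300, 2500, 3100, 4800, 5200])

# Log-spaced bin edges from 100 to 10,000, since totals often span
# orders of magnitude.
bin_edges = np.logspace(2, 4, num=5)
hist, _ = np.histogram(umis_per_cell, bins=bin_edges)

# Text histogram: one '#' per cell in each bin.
for low, high, n in zip(bin_edges[:-1], bin_edges[1:], hist):
    print(f"{low:7.0f} - {high:7.0f} | {'#' * int(n)}")

# Count how many cells would pass a candidate cutoff (assumed value).
min_umis = 1000
n_pass = int((umis_per_cell >= min_umis).sum())
print(f"cells with at least {min_umis} UMIs: {n_pass} of {umis_per_cell.size}")
```

In practice you would draw the same histogram with a plotting library and read the cutoff off the shape of the distribution.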

You can make a similar histogram for genes/cell, which tells you how complex your dataset is.

Third, a histogram of the percentage of mitochondrial reads per cell indicates how many stressed or low-quality cells there are*.
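The mitochondrial percentage can be sketched as follows. This assumes mitochondrial genes are recognizable by an "MT-" name prefix, which is the human gene symbol convention (mouse symbols typically use "mt-"); the gene list and counts are hypothetical.

```python
import numpy as np

# Hypothetical genes-by-cells count table with two mitochondrial genes.
gene_names = ["ACTB", "GAPDH", "MT-CO1", "MT-ND1"]
counts = np.array([
    [50, 10, 40],
    [30,  5, 40],
    [10, 25, 15],
    [10, 10,  5],
])

# Flag mitochondrial genes by name prefix (assumption: human gene symbols).
is_mito = np.array([name.startswith("MT-") for name in gene_names])

mito_counts = counts[is_mito].sum(axis=0)    # mitochondrial counts per cell
total_counts = counts.sum(axis=0)            # all counts per cell
pct_mito = 100 * mito_counts / total_counts

print("percent mitochondrial per cell:", pct_mito.tolist())
```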

If you do this for all your samples, you will have an idea about the general quality of your dataset and if you have enough cells from each sample for your downstream analysis. If all is well, it is time to start answering your biological questions with clustering and differential gene expression analysis.

*High mitochondrial content does not equal bad quality cells for all cell types. This depends on your sample.
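To check whether enough cells per sample remain, the three metrics from this step can be combined into one filter. The sketch below is only an illustration: the sample labels, metric values, and all three cutoffs are hypothetical, and the mitochondrial cutoff in particular should be adapted to your cell type.

```python
from collections import Counter

import numpy as np

# Hypothetical per-cell QC metrics for six cells from two samples.
sample = ["s1", "s1", "s1", "s2", "s2", "s2"]
umis_per_cell = np.array([2500, 300, 4000, 1800, 2200, 150])
genes_per_cell = np.array([900, 150, 1500, 700, 800, 90])
pct_mito = np.array([5.0, 8.0, 30.0, 4.0, 6.0, 12.0])

# Assumed cutoffs, chosen only for this toy example.
min_umis = 1000
min_genes = 500
max_pct_mito = 20.0

keep = (
    (umis_per_cell >= min_umis)
    & (genes_per_cell >= min_genes)
    & (pct_mito <= max_pct_mito)
)

# Count the cells that pass the filter in each sample.
cells_per_sample = Counter(s for s, k in zip(sample, keep) if k)

print("cells kept:", int(keep.sum()), "of", keep.size)
for name in sorted(set(sample)):
    print(name, "->", cells_per_sample[name], "cells pass")
```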

Step 3: Downstream analysis with dedicated analysis packages

After gaining some intuition about the QC metrics, it’s time to cluster your data. A helpful way to get started is to look into packages specifically made for single-cell analysis, like Seurat or Scanpy.

These packages are written for R and Python, respectively, and offer a wide set of tools. For example, they can normalize your data, correct batch effects, filter low-quality cells, and most importantly, perform clustering and differential gene expression analysis.
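To give a feel for what normalization does, here is a sketch of one common scheme: scale each cell to a fixed total of 10,000 counts, then apply a log(1 + x) transform. It is written in plain NumPy on a hypothetical table and is not a substitute for the packages' own functions, whose defaults and options may differ.

```python
import numpy as np

# Hypothetical genes-by-cells count table: the second cell was sequenced
# much more shallowly than the first.
counts = np.array([
    [10.0,  0.0],
    [30.0,  5.0],
    [60.0, 15.0],
])

target_sum = 1e4                           # assumed scale factor
totals = counts.sum(axis=0)                # total counts per cell
scaled = counts / totals * target_sum      # every cell now sums to target_sum
lognorm = np.log1p(scaled)                 # log(1 + x) keeps zeros at zero

print("totals before scaling:", totals.tolist())
print("totals after scaling:", scaled.sum(axis=0).tolist())
print("log-normalized values:")
print(np.round(lognorm, 2))
```

After scaling, differences in sequencing depth between cells no longer dominate the comparison, which is why this step usually precedes clustering.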

Before trying these packages, it might be good to first familiarize yourself with the R or Python basics. Great starting points are this concise introduction to R or this Python for beginners starting page.

Next, you can try one of the example vignettes that both of these packages provide, for example this Seurat vignette. Once you feel comfortable with the analysis and understand the plots it produces, you are ready to cluster and analyze your data!

Sounds too complex? If you generated your data with us, you could outsource your analysis to our experienced bioinformaticians. Contact us to discuss the options.
