This article explains what you can expect when we process your SORT-seq samples.
In short, our requirements to start processing are:
- Samples are submitted online
- A printed version of your sample submission is added to the package
- You have signed a quotation
Before processing, we check that we have received your sample submission form online and that a printed copy is included in the package. The information on the sample submission form is required for correct processing in our lab. We also check that you have signed a quotation.
We do not process samples that lack a completely filled-out submission form, a signed quotation, or both.
Plates on hold
In the sample submission form, we ask the question: ‘process plate?’. If you choose yes, we will process your plate as soon as possible. If you choose no, we will store your plate in our freezer and wait to process it until we receive an email from you asking us to release the plate. We can keep your plates on hold for a maximum of 6 months.
Putting plates on hold is often chosen when clients want to wait for the first QC and sequencing results. When the first results look good, the other plates are processed.
We maintain a throughput time of 4 to 6 weeks, counted from the moment you release your plates. In these weeks we process your samples in our lab, send them out for sequencing, perform our preliminary data analysis, and send the results back to you.
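As a rough illustration of the 4-to-6-week window, the sketch below estimates when results can be expected from a given release date. The function name `expected_return_window` and the example date are hypothetical and only serve to show the arithmetic; this is not a tool we provide.

```python
from datetime import date, timedelta


def expected_return_window(release_date: date) -> tuple[date, date]:
    """Return the earliest and latest expected result dates.

    Assumes the 4-to-6-week throughput time starts on the day
    the plates are released for processing.
    """
    earliest = release_date + timedelta(weeks=4)
    latest = release_date + timedelta(weeks=6)
    return earliest, latest


# Hypothetical release date, for illustration only.
earliest, latest = expected_return_window(date(2024, 3, 1))
print(f"Results expected between {earliest} and {latest}")
```

For plates on hold, the release date is the day you email us to release the plate, not the day the plate arrived.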
When SORT-seq plates are processed, we perform the following reactions:
- Reverse transcription and second-strand synthesis reaction, to generate (unamplified) barcoded DNA in each individual well of each cell capture plate.
- The material from all wells of one plate is subsequently pooled into a single Eppendorf tube.
- An in vitro transcription (IVT) reaction is performed: a linear amplification step that results in amplified RNA (aRNA).
- The aRNA is fragmented, and we run it on a Bioanalyzer to check the RNA yield.
- Another reverse transcription reaction is performed, as well as a PCR reaction. These steps result in a cDNA library that contains the right adapters for sequencing. We run another Bioanalyzer to check the quality and concentration of the final cDNA libraries.
You will receive two updates by email. The first one informs you that we have started processing your SORT-seq plates. After the first update, it will take a maximum of two weeks before you receive the second update, which covers the QC results. In this second email, we explain how we interpreted the QC data and what the results are. The email will also contain two PDF files with RNA and DNA QC data (Bioanalyzer plots), so you can have a look for yourself as well.
In the second email, we will also tell you whether we think it is possible to sequence your sample. If your sample is good enough to be sent for sequencing, we will send it out without consulting you first. If the QC data of your sample is questionable, we will first consult you about whether you want to proceed with these samples.
During the processing of SORT-seq plates, we check the amount of aRNA and cDNA and the quality of the samples. The Bioanalyzer plot generated at each check is a QC metric for the concentration and size distribution of library fragments and is the first indication of the quality of the sample.
The QC results depend on, among other things, the cell type that was sorted (some cells have more RNA than others), the quality of the sort (for example, how many wells contain a cell), and the quality of the material.
Unfortunately, the QC data doesn’t provide information on the number of cells in your plate or on the cells’ viability. This can only be determined by sequencing.
After your plates arrive at our lab, we perform a reverse transcription and second-strand synthesis reaction. Subsequently, the material from all wells of one plate is pooled into a single Eppendorf tube to perform in vitro transcription (IVT): a linear amplification step that results in amplified RNA (aRNA).
We fragment this aRNA and then run it on an Agilent Bioanalyzer. Unfortunately, we cannot check the quality of the RNA before amplification, as this would result in the loss of unique transcripts.
After IVT, we perform another reverse transcription reaction as well as a PCR reaction. During the PCR reaction, the material is amplified again and the right adapters for sequencing are added to the sequences. This results in a DNA library (cDNA) that can be sent for sequencing.
We run the cleaned cDNA library on an Agilent Bioanalyzer.
In every aRNA plot, you first see a peak at 25 nucleotides (nt). This is a marker peak. What should follow is a distinct RNA “bump”.
Depending on the cell type, the quality of the cells, the number of cells, and the quality of the reaction, the amount of RNA can vary significantly. A low RNA yield is very normal in some situations (think: low RNA expressing cells, RNA from nuclei, etc.), while in other situations it may indicate a problem with, for example, the sort or the quality of the cells.
A small “extra bump” may precede the RNA curve. This is a primer-dimer peak and does not interfere with the quality of your sample.
Also important to mention is the RIN (RNA integrity number): this number usually says something about the quality of your RNA and is calculated based on the ribosomal peaks of your total RNA. Since we do not measure total RNA and we fragment the aRNA before running it on the Bioanalyzer, the RIN has no meaning here and can be ignored.
Based on the aRNA plots, we classify the RNA yield as low, mid, or high. This yield may or may not be as expected, for example based on the cell type. In our experience, a mid- or high-range RNA yield usually results in (some) useful data. However, with a high aRNA yield, it is still possible that only a handful of cells have worked, but that each of these cells has a (very) high RNA content.
A very low yield of aRNA (“flatline” and/or RNA amounts <100 pg/µl) is very unlikely to result in useful data and might indicate that no or only very few cells in the plate have worked. This can be due to, for example, low quality of the sorted cells or cells that did not land in the wells correctly. The data may still be useful if the RNA is isolated from nuclei or from cells with a very low complexity.
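The interpretation of a very low aRNA yield can be sketched as a small rule. The only threshold used is the 100 pg/µl value mentioned above; the function name, the `low_rna_input` flag, and the returned labels are hypothetical. The boundary between mid and high yield is judged from the plot and is deliberately not encoded here.

```python
# Threshold taken from the text: RNA amounts below 100 pg/µl count as "flatline".
FLATLINE_THRESHOLD_PG_PER_UL = 100.0


def flag_arna_yield(concentration_pg_per_ul: float, low_rna_input: bool = False) -> str:
    """Give a coarse, illustrative label for an aRNA concentration.

    low_rna_input marks samples such as nuclei or very low complexity
    cells, for which a very low yield may still produce useful data.
    """
    if concentration_pg_per_ul >= FLATLINE_THRESHOLD_PG_PER_UL:
        return "not flatline"
    if low_rna_input:
        return "flatline, may still be useful"
    return "flatline, useful data unlikely"


print(flag_arna_yield(850.0))
print(flag_arna_yield(40.0))
print(flag_arna_yield(40.0, low_rna_input=True))
```

A "not flatline" label says nothing about how many cells worked; as noted above, a high yield can still come from a handful of cells with a high RNA content.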
In every cDNA plot, there are two marker peaks: one at 35 bp and one at 10,380 bp. Between these markers there should be a clear and spiky curve that rises at around 200 bp and returns to baseline at around 2,000 bp.
The height of the curve depends mostly on the number of PCR cycles but can be limited by a low RNA yield: a very low to absent RNA yield will usually result in a low(er) DNA curve, even at the maximal cycle number.
What can we conclude based on the cDNA plots?
Option A: The DNA library looks good. There is likely (some) useful data in this library, which can be sent for sequencing. It is good to keep in mind, however, that a good DNA library may still produce poor data, but this is something we can only determine by sequencing.
Option B: The DNA library doesn’t look good and we ran the maximum number of PCR cycles. Such a library is always accompanied by a “flat” RNA plot, and very likely indicates that little to no data can be obtained from this library (possible exceptions: nuclei and very low complexity cells).
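The two options can be written down as a simple decision rule. This is only a sketch of the reasoning in this article: the function name and its boolean inputs are hypothetical, and whether a library "looks good" remains a visual judgment of the Bioanalyzer plot. The case of a poor library below the maximum cycle number is not covered by the options above, so the sketch leaves it open.

```python
def interpret_cdna_library(looks_good: bool, ran_max_pcr_cycles: bool) -> str:
    """Map a visual cDNA QC judgment to option A or B from the text."""
    if looks_good:
        # Option A: likely (some) useful data; can be sent for sequencing.
        return "A"
    if ran_max_pcr_cycles:
        # Option B: poor library despite maximum cycles; little to no data
        # expected (possible exceptions: nuclei, very low complexity cells).
        return "B"
    # Not described in the article: poor library, maximum cycles not reached.
    return "undetermined"


print(interpret_cdna_library(looks_good=True, ran_max_pcr_cycles=False))
print(interpret_cdna_library(looks_good=False, ran_max_pcr_cycles=True))
```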
With poor-looking QC results, it can be useful to sequence one or more of the libraries, to try and find out together with you why the results are suboptimal. Perhaps the sequencing results can help to determine whether the library/libraries indeed provide(s) low-quality data and what can be improved for a potential next sort.
The plate diagnostic plots of your SORT-seq plates are analyzed separately to review the quality of the sequencing data. They tell you how well each cell worked (endogenous reads, UMIs, and genes per cell) and how well the technical handling of the plate worked (ERCC spike-in reads).
In the top three plots, your 384-well plate is visualized, with each circle representing a well of the plate. Red wells have a high number of reads and blue wells a low number. This way, the endogenous reads and the ERCC spike-in reads can be compared across the entire plate. Spike-ins are synthetic control RNAs that are added during sample preparation to detect technical artifacts. The ratio between the endogenous and ERCC spike-in reads shows the wells where the reaction worked. Wells with few or no endogenous reads are depicted in red in this plot.
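To make the per-well comparison concrete, the sketch below computes a log2 ratio of endogenous to ERCC spike-in reads for every well of a 384-well plate (16 rows by 24 columns). The function names, the dictionary input format, the direction of the ratio, and the pseudocount of 1 are all assumptions for illustration; the plots you receive may be computed differently.

```python
import math

ROWS = "ABCDEFGHIJKLMNOP"  # 16 rows
N_COLUMNS = 24             # 16 x 24 = 384 wells


def well_names() -> list[str]:
    """List all well names of a 384-well plate, from A1 to P24."""
    return [f"{row}{col}" for row in ROWS for col in range(1, N_COLUMNS + 1)]


def log2_endogenous_to_ercc(endogenous: dict, ercc: dict, pseudocount: float = 1.0) -> dict:
    """Return the log2(endogenous / ERCC) ratio per well.

    Wells missing from an input are treated as having zero reads.
    The pseudocount avoids division by zero in empty wells.
    """
    return {
        well: math.log2(
            (endogenous.get(well, 0) + pseudocount) / (ercc.get(well, 0) + pseudocount)
        )
        for well in well_names()
    }


# Hypothetical read counts for two wells.
ratios = log2_endogenous_to_ercc(
    endogenous={"A1": 4095, "A2": 0},
    ercc={"A1": 255, "A2": 255},
)
print(ratios["A1"], ratios["A2"])
```

In this sketch, a strongly negative value (well A2) marks a well where the spike-ins were detected but hardly any endogenous reads were, which suggests the chemistry worked but the well did not contain a usable cell.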
The “total unique reads w/o Spike-In reads” graph shows the number of transcripts detected per cell on the X-axis using a log10 scale. The “cumulative dist genes” graph depicts the number of genes detected per cell as a cumulative distribution. Next, the “oversequencing_molecules” graph shows the number of molecules that were sequenced more than once, based on the UMIs. A large bar at 0.0 means that most molecules were sequenced only once and were not oversequenced.
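One way to read the oversequencing graph is as a histogram of the logarithm of reads per unique molecule, so that a molecule seen exactly once lands at 0.0. The sketch below follows that reading. The log base (2), the bin width, and the function name are assumptions; the exact calculation behind the plot is not described in this article.

```python
import math
from collections import Counter


def oversequencing_histogram(reads_per_molecule, bin_width: float = 0.5) -> dict:
    """Count molecules per bin of log2(reads per unique molecule).

    Each input value is the number of reads supporting one UMI-defined
    molecule. A molecule sequenced once falls in the 0.0 bin.
    """
    histogram = Counter()
    for n_reads in reads_per_molecule:
        if n_reads < 1:
            raise ValueError("each molecule must be supported by at least one read")
        bin_start = math.floor(math.log2(n_reads) / bin_width) * bin_width
        histogram[bin_start] += 1
    return dict(sorted(histogram.items()))


# Hypothetical read counts for six molecules.
print(oversequencing_histogram([1, 1, 1, 2, 4, 1]))
```

When most molecules fall in the 0.0 bin, sequencing deeper would likely still uncover new molecules; a histogram shifted to the right suggests the library was sequenced close to saturation.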
On the bottom left, the graph with the top expressed genes shows which genes have the highest expression across the cells in the plate. On the bottom right, the graph with the top noisy genes shows the genes whose expression varies the most between the cells in the plate.
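The difference between "top expressed" and "top noisy" can be illustrated with a small count table. In this sketch, expression is ranked by total counts across cells and noise by the variance across cells; both choices, the function names, and the gene names are assumptions for illustration and may differ from how the plots are actually computed.

```python
from statistics import pvariance


def top_expressed_genes(counts: dict, n: int = 10) -> list:
    """Rank genes by their total count over all cells."""
    totals = {gene: sum(per_cell) for gene, per_cell in counts.items()}
    return sorted(totals, key=totals.get, reverse=True)[:n]


def top_noisy_genes(counts: dict, n: int = 10) -> list:
    """Rank genes by the variance of their counts between cells."""
    variances = {gene: pvariance(per_cell) for gene, per_cell in counts.items()}
    return sorted(variances, key=variances.get, reverse=True)[:n]


# Hypothetical counts: one list of per-cell counts per gene.
counts = {
    "GeneA": [10, 10, 10, 10],  # high and stable
    "GeneB": [0, 20, 0, 4],     # variable between cells
    "GeneC": [1, 0, 1, 0],      # low
}
print(top_expressed_genes(counts, n=2))
print(top_noisy_genes(counts, n=2))
```

A gene can be highly expressed without being noisy (GeneA here), which is why the two graphs often show different genes.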