Partitioning RNAs by length improves transcriptome reconstruction from short-read RNA-seq data

Abstract

The accuracy of methods for assembling transcripts from short-read RNA sequencing data is limited by the lack of long-range information. Here we introduce Ladder-seq, an approach that separates transcripts according to their lengths before sequencing and uses the additional information to improve the quantification and assembly of transcripts. Using simulated data, we show that a kallisto algorithm extended to process Ladder-seq data quantifies transcripts of complex genes with substantially higher accuracy than conventional kallisto. For reference-based assembly, a tailored scheme based on the StringTie2 algorithm reconstructs a single transcript with 30.8% higher precision than its conventional counterpart and is more than 30% more sensitive for complex genes. For de novo assembly, a similar scheme based on the Trinity algorithm correctly assembles 78% more transcripts than conventional Trinity while improving precision by 78%. In experimental data, Ladder-seq reveals 40% more genes harboring isoform switches compared to conventional RNA sequencing and unveils widespread changes in isoform usage upon m⁶A depletion by Mettl14 knockout.

This is a preview of subscription content

Access options

Subscribe to Journal

Get full journal access for 1 year

92,52 €

only 7,71 € per issue

All prices are NET prices.
VAT will be added later in the checkout.
Tax calculation will be finalised during checkout.

Buy article

Get time limited or full article access on ReadCube.

$32.00

All prices are NET prices.

Data availability

Ladder-seq raw sequencing data from WT and Mettl14 KO mouse NPCs, conventional RNA-seq data from WT mouse NPCs and ONT long-read sequencing from WT and Mettl14 KO mouse NPCs are available in the Gene Expression Omnibus (GSE158985). m⁶A peaks from two replicates of m⁶A sequencing in mouse NPCs were downloaded from the Gene Expression Omnibus (GSE104686).

Code availability

The kallisto-ls, StringTie-ls and Trinity-ls programs and workflows are available at:

•https://github.com/canzarlab/ladderseq_quant,

•https://github.com/canzarlab/ladderseq_assembly and

•https://github.com/canzarlab/ladderseq_denovo,

respectively. The results of our benchmark studies can be reproduced via a Snakefile³³, available at https://github.com/canzarlab/ladderseq_benchmark.

References

1.
Zhang, C., Zhang, B., Lin, L.-L. & Zhao, S. Evaluation and comparison of computational tools for RNA-seq isoform quantification. BMC Genomics 18, 583 (2017).
PubMed
PubMed Central

Google Scholar
2.
Teng, M. et al. A benchmark for RNA-seq quantification pipelines. Genome Biol. 17, 74 (2016).
PubMed
PubMed Central

Google Scholar
3.
Aguiar, D. et al. Bayesian nonparametric discovery of isoforms and individual specific quantification. Nat. Commun. 9, 1681 (2018).
PubMed
PubMed Central

Google Scholar
4.
Song, L., Sabunciyan, S., Yang, G. & Florea, L. A multi-sample approach increases the accuracy of transcript assembly. Nat. Commun. 10, 5000 (2019).
PubMed
PubMed Central

Google Scholar
5.
Li, W. V. et al. AIDE: annotation-assisted isoform discovery with high precision. Genome Res. 29, 2056–2072 (2019).
6.
Desrosiers, R. C., Friderici, K. H. & Rottman, F. M. Characterization of novikoff hepatoma mRNA methylation and heterogeneity in the methylated 5′ terminus. Biochemistry 14, 4367–4374 (1975).
CAS
PubMed

Google Scholar
7.
Barbosa-Morais, N. L. et al. The evolutionary landscape of alternative splicing in vertebrate species. Science 338, 1587–1593 (2012).
CAS
PubMed

Google Scholar
8.
Jelen, N., Ule, J., Živin, M. & Darnell, R. B. Evolution of nova-dependent splicing regulation in the brain. PLoS Genetics 3, e173 (2007).
PubMed Central

Google Scholar
9.
Merkin, J., Russell, C., Chen, P. & Burge, C. B. Evolutionary dynamics of gene and isoform regulation in mammalian tissues. Science 338, 1593–1599 (2012).
CAS
PubMed
PubMed Central

Google Scholar
10.
Tardaguila, M. et al. SQANTI: extensive characterization of long-read transcript sequences for quality control in full-length transcriptome identification and quantification. Genome Res. 28, 396–411 (2018).
CAS
PubMed Central

Google Scholar
11.
Chen, K. et al. Genome-wide binding and mechanistic analyses of Smchd1-mediated epigenetic regulation. Proc. Natl Acad. Sci. USA 112, E3535–E3544 (2015).
CAS
PubMed
PubMed Central

Google Scholar
12.
Soneson, C. et al. A comprehensive examination of Nanopore native RNA sequencing for characterization of complex transcriptomes. Nat. Commun. 10, 3359 (2019).
PubMed
PubMed Central

Google Scholar
13.
Hurowitz, E. H. & Brown, P. O. Genome-wide analysis of mRNA lengths in Saccharomyces cerevisiae. Genome Biol. 5, R2 (2003).
PubMed
PubMed Central

Google Scholar
14.
Bray, N. L., Pimentel, H., Melsted, P. & Pachter, L. Near-optimal probabilistic RNA-seq quantification. Nat. Biotechnol. 34, 525 (2016).
CAS
PubMed

Google Scholar
15.
Heber, S., Alekseyev, M., Sze, S.-H., Tang, H. & Pevzner, P. A. Splicing graphs and EST assembly problem. Bioinformatics 18, S181–S188 (2002).
PubMed

Google Scholar
16.
Pachter, L. Models for transcript quantification from RNA-seq. Preprint at https://arxiv.org/abs/1104.3889 (2011).
17.
Kovaka, S. et al. Transcriptome assembly from long-read RNA-seq alignments with Stringtie2. Genome Biol. 20, 278 (2019).
CAS
PubMed
PubMed Central

Google Scholar
18.
Ebrahim Sahraeian, S. M. et al. Gaining comprehensive biological insight into the transcriptome by performing a broad-spectrum RNA-seq analysis. Nat. Commun. 8, 59 (2017).
19.
Glinos, D. A. et al. Transcriptome variation in human tissues revealed by long-read sequencing. Preprint at https://www.biorxiv.org/content/10.1101/2021.01.22.427687v1 (2021).
20.
Grabherr, M. G. et al. Full-length transcriptome assembly from RNA-seq data without a reference genome. Nat. Biotechnol. 29, 644–652 (2011).
CAS
PubMed
PubMed Central

Google Scholar
21.
Chang, Z., Wang, Z. & Li, G. The impacts of read length and transcriptome complexity for de novo assembly: a simulation study. PLoS ONE 9, 1–8 (2014).

Google Scholar
22.
Dong, X. et al. The long and the short of it: unlocking nanopore long-read RNA sequencing data with short-read differential expression analysis tools. NAR Genom. Bioinform. 3, lqab028 (2021).
23.
Wang, Y. et al. N⁶-methyladenosine RNA modification regulates embryonic neural stem cell self-renewal through histone modifications. Nat. Neurosci. 21, 195–206 (2018).
PubMed
PubMed Central

Google Scholar
24.
Canzar, S., Andreotti, S., Weese, D., Reinert, K. & Klau, G. W. CIDANE: comprehensive isoform discovery and abundance estimation. Genome Biol. 17, 16 (2016).
PubMed
PubMed Central

Google Scholar
25.
Alqassem, I., Sonthalia, Y., Klitzke-Feser, E., Shim, H. & Canzar, S. McSplicer: a probabilistic model for estimating splice site usage from RNA-seq data. Bioinformatics 37, 2004–2011 (2021).
CAS
PubMed Central

Google Scholar
26.
Batista, P. J. et al. m⁶a RNA modification controls cell fate transition in mammalian embryonic stem cells. Cell Stem Cell 15, 707–719 (2014).
CAS
PubMed
PubMed Central

Google Scholar
27.
Ke, S. et al. A majority of m⁶a residues are in the last exons, allowing the potential for 3′ UTR regulation. Genes Dev. 29, 2037–2053 (2015).
CAS
PubMed
PubMed Central

Google Scholar
28.
Yamauchi, T., Nishiyama, M., Moroishi, T., Kawamura, A. & Nakayama, K. I. FBXL5 inactivation in mouse brain induces aberrant proliferation of neural stem progenitor cells. Mol. Cell. Biol. 37, e00470-16 (2017).
29.
Kuboyama, K., Fujikawa, A., Suzuki, R. & Noda, M. Inactivation of protein tyrosine phosphatase receptor type Z by pleiotrophin promotes remyelination through activation of differentiation of oligodendrocyte precursor cells. J. Neurosci. 35, 12162–12171 (2015).
CAS
PubMed
PubMed Central

Google Scholar
30.
Kurosaki, T., Popp, M. W. & Maquat, L. E. Quality and quantity control of gene expression by nonsense-mediated mRNA decay. Nat. Rev. Mol. Cell Biol. 20, 406–420 (2019).
CAS
PubMed
PubMed Central

Google Scholar
31.
Lianoglou, S., Garg, V., Yang, J. L., Leslie, C. S. & Mayr, C. Ubiquitously transcribed genes use alternative polyadenylation to achieve tissue-specific expression. Genes Dev. 27, 2380–2396 (2013).
CAS
PubMed
PubMed Central

Google Scholar
32.
Tang, A. D. et al. Full-length transcript characterization of SF3B1 mutation in chronic lymphocytic leukemia reveals downregulation of retained introns. Nat. Commun. 11, 1438 (2020).
CAS
PubMed
PubMed Central

Google Scholar
33.
Köster, J. & Rahmann, S. Snakemake—a scalable bioinformatics workflow engine. Bioinformatics 28, 2520–2522 (2012).
PubMed

Google Scholar
34.
DeAngelis, M. M., Wang, D. G. & Hawkins, T. L. Solid-phase reversible immobilization for the isolation of PCR products. Nucleic Acids Res. 23, 4742–4743 (1995).
CAS
PubMed
PubMed Central

Google Scholar
35.
Sobczak, K. & Krzyzosiak, W. J. RNA structure analysis assisted by capillary electrophoresis. Nucleic Acids Res. 30, e124 (2002).
PubMed
PubMed Central

Google Scholar
36.
Azarani, A. & Hecker, K. H. RNA analysis by ion-pair reversed-phase high performance liquid chromatography. Nucleic Acids Res. 29, e7 (2001).
CAS
PubMed
PubMed Central

Google Scholar
37.
Wang, Y. et al. High-resolution profile of transcriptomes reveals a role of alternative splicing for modulating response to nitrogen in maize. BMC Genomics 21, 353 (2020).
CAS
PubMed
PubMed Central

Google Scholar
38.
Li, R. et al. Direct full-length RNA sequencing reveals unexpected transcriptome complexity during Caenorhabditis elegans development. Genome Res. 30, 287–298 (2020).
39.
Haussmann, I. U. et al. m⁶a potentiates Sxl alternative pre-mRNA splicing for robust Drosophila sex determination. Nature 540, 301–304 (2016).
CAS
PubMed

Google Scholar
40.
Bartosovic, M. et al. N⁶-methyladenosine demethylase FTO targets pre-mRNAs and regulates alternative splicing and 3′-end processing. Nucleic Acids Res. 45, 11356–11370 (2017).
CAS
PubMed
PubMed Central

Google Scholar
41.
Xiao, W. et al. Nuclear m⁶a reader YTHDC1 regulates mRNA splicing. Mol. Cell 61, 507–519 (2016).
CAS
PubMed

Google Scholar
42.
Zhou, K. I. et al. Regulation of co-transcriptional pre-mRNA splicing by m⁶a through the low-complexity protein hnRNPG. Mol. Cell 76, 70–81 (2019).
CAS
PubMed
PubMed Central

Google Scholar
43.
Jacob, A. G. & Smith, C. W. J. Intron retention as a component of regulated gene expression programs. Hum. Genet. 136, 1043–1057 (2017).
CAS
PubMed
PubMed Central

Google Scholar
44.
Braunschweig, U. et al. Widespread intron retention in mammals functionally tunes transcriptomes. Genome Res. 24, 1774–1786 (2014).
CAS
PubMed
PubMed Central

Google Scholar
45.
Yoon, K.-J. et al. Temporal control of mammalian cortical neurogenesis by m⁶a methylation. Cell 171, 877–889 (2017).
CAS
PubMed
PubMed Central

Google Scholar
46.
Eckmann, C. R., Rammelt, C. & Wahle, E. Control of poly(A) tail length. Wiley Interdiscip. Rev. RNA 2, 348–361 (2011).
CAS
PubMed

Google Scholar
47.
Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
CAS
PubMed

Google Scholar
48.
Conforti, L. et al. Kif1Bβ isoform is enriched in motor neurons but does not change in a mouse model of amyotrophic lateral sclerosis. J. Neurosci. Res. 71, 732–739 (2003).
CAS
PubMed

Google Scholar
49.
Jiang, L. et al. Synthetic spike-in standards for RNA-seq experiments. Genome Res. 21, 1543–1551 (2011).
CAS
PubMed
PubMed Central

Google Scholar
50.
Li, B. & Dewey, C. N. RSEM: accurate transcript quantification from RNA-seq data with or without a reference genome. BMC Bioinformatics 12, 323 (2011).
CAS
PubMed
PubMed Central

Google Scholar
51.
Lappalainen, T. et al. Transcriptome and genome sequencing uncovers functional variation in humans. Nature 501, 506–511 (2013).
PubMed Central

Google Scholar
52.
Chang, Z. et al. Bridger: a new framework for de novo transcriptome assembly using RNA-seq data. Genome Biol. 16, 30 (2015).
PubMed
PubMed Central

Google Scholar
53.
Pertea, M. et al. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat. Biotechnol. 33, 290–295 (2015).
CAS
PubMed
PubMed Central

Google Scholar
54.
Shao, M. & Kingsford, C. Accurate assembly of transcripts through phase-preserving graph decomposition. Nat. Biotechnol. 35, 1167–1169 (2017).
CAS
PubMed
PubMed Central

Google Scholar
55.
Pertea, M., Kim, D., Pertea, G. M., Leek, J. T. & Salzberg, S. L. Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. Nat. Protocols 11, 1650 (2016).
CAS
PubMed

Google Scholar
56.
Kent, W. J. BLAT—the BLAST-like alignment tool. Genome Res. 12, 656–664 (2002).
CAS
PubMed
PubMed Central

Google Scholar
57.
Xie, Y. et al. SOAPdenovo-Trans: de novo transcriptome assembly with short RNA-Seq reads. Bioinformatics 30, 1660–1666 (2014).
CAS

Google Scholar
58.
Liu, J., Yu, T., Mu, Z. & Li, G. TransLiG: a de novo transcriptome assembler that uses line graph iteration. Genome Biol. 20, 81 (2019).
PubMed
PubMed Central

Google Scholar
59.
Roberts, A. & Pachter, L. Streaming fragment assignment for real-time analysis of sequencing experiments. Nat. Methods 10, 71–73 (2013).
CAS
PubMed

Google Scholar
60.
Li, B., Ruotti, V., Stewart, R. M., Thomson, J. A. & Dewey, C. N. RNA-Seq gene expression estimation with read mapping uncertainty. Bioinformatics 26, 493–500 (2009).
PubMed
PubMed Central

Google Scholar
61.
Vitting-Seerup, K. & Sandelin, A. The landscape of isoform switches in human cancers. Mol. Cancer Res. 15, 1206–1220 (2017).
CAS
PubMed

Google Scholar
62.
Anders, S., Reyes, A. & Huber, W. Detecting differential usage of exons from RNA-seq data. Genome Res. 22, 2008–2017 (2012).
CAS
PubMed
PubMed Central

Google Scholar
63.
Park, H. J. et al. CPAT: Coding-Potential Assessment Tool using an alignment-free logistic regression model. Nucleic Acids Res. 41, e74 (2013).
PubMed
PubMed Central

Google Scholar
64.
Finn, R. D. et al. Pfam: the protein families database. Nucleic Acids Res. 42, D222–D230 (2014).
65.
Zhang, Y. et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9, 1–9 (2008).

Google Scholar
66.
Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
CAS
PubMed
PubMed Central

Google Scholar
67.
Heinz, S. et al. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol. Cell 38, 576–589 (2010).
CAS
PubMed
PubMed Central

Google Scholar
68.
Alexa, A., Rahnenführer, J. & Lengauer, T. Improved scoring of functional groups from gene expression data by decorrelating GO graph structure. Bioinformatics 22, 1600–1607 (2006).
CAS
PubMed

Google Scholar

Download references

Acknowledgements

We would like to thank A. A. Kha for helpful discussions and the members of the Canzar laboratory for comments and suggestions. We thank P. J. Shaw and M. Halic for constructive feedback on mRNA length separation. We thank S. Esser for excellent technical assistance. S. Chakraborty was supported by a Deutsche Forschungsgemeinschaft (DFG) fellowship through the Graduate School of Quantitative Biosciences Munich. F.R.R. was supported, in part, by the Bavarian Gender Equality Grant. This work was funded by the TTIC Research fund (to S. Canzar) and by grants from the National Institutes of Health (R35NS097370 to G.M. and R35NS116843 to H.S), the Simons Foundation (575050 to H.S.), the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) SFB-TRR 338/1 2021-452881907 (to S. Canzar), DFG Sachbeihilfe (417912216 to M.L.S.), Deutsche Gesellschaft für Muskelkranke e.V. Forschungsförderung (awarded in 2019 to M.L.S.), the Institute for Basic Science (IBS-R008-D1 to A.H. and H.C.), the National Research Foundation of Korea (2019R1C1C1006600 and 2020M3A9E4039670 to K.-J.Y. and 2021R1C1C1010758 to A.H. and H.C.), funded by the Ministry of Science, and by ICT of Korea, Young Investigator Grant, from the Suh Kyungbae Foundation (to K.-J.Y.).

Author information

Author notes

These authors contributed equally: Francisca Rojas Ringeling, Shounak Chakraborty.

Affiliations

Gene Center, Ludwig-Maximilians-Universität München, Munich, Germany
Francisca Rojas Ringeling, Shounak Chakraborty, Akshay M. Patel, Tim Reska & Stefan Canzar
Department of Biochemistry & Biophysics, University of California, San Francisco, San Francisco, CA, USA
Caroline Vissers
Department of Biomedical Engineering, University of Illinois at Chicago, Chicago, IL, USA
Derek Reiman
Department of Biological Sciences, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea
Ki-Heon Lee, Chan-Woo Park & Ki-Jun Yoon
Center for RNA Research, Institute for Basic Science (IBS), Seoul, Republic of Korea
Ari Hong & Hyeshik Chang
Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
Ari Hong & Hyeshik Chang
Department of Informatics, Technical University of Munich, Garching, Germany
Julien Gagneur
Institute of Human Genetics, Technical University of Munich, Munich, Germany
Julien Gagneur
Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany
Julien Gagneur
School of Biological Sciences, Seoul National University, Seoul, Republic of Korea
Hyeshik Chang
Biomedical Center, Department of Physiological Chemistry, Ludwig-Maximilians-Universität München, Martinsried-Planegg, Germany
Maria L. Spletter
Department of Neuroscience and Mahoney Institute for Neurosciences, University of Pennsylvania, Philadelphia, PA, USA
Guo-li Ming & Hongjun Song

Contributions

F.R.R. and S. Canzar conceived the original idea. S. Chakraborty and S. Canzar designed the kallisto-ls, StringTie-ls and Trintity-ls methods, with input from F.R.R. S. Chakraborty implemented the kallisto-ls, StringTie-ls and Trinity-ls methods and conducted all experiments and evaluations on simulated data. F.R.R. and S. Canzar contributed to the evaluation on simulated data. D.R. performed the initial computational study on simulated data (not shown). F.R.R. performed differential splicing analysis in NPCs, with input from S. Canzar. F.R.R. developed and performed Ladder-seq experiments, with input from H.S., G.M., M.L.S. and J.G. C.V. conducted RT–qPCR experiments. K.-J.Y. bred mice and harvested NPCs. K.-J.Y., H.C., K.-H.L., A.H. and C.-W.P. performed ONT long-read sequencing experiments. A.M.P., S. Chakraborty, F.R.R. and T.R. contributed to analysis of ONT long-read datasets. F.R.R. and S. Canzar wrote the manuscript, with input from all authors. S. Canzar supervised the entire project. All authors read and approved the final manuscript.

Corresponding author

Correspondence to
Stefan Canzar.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Reduced (effective) gene complexity in Ladder-seq.

We estimate transcript expression in Mettl14 KO sample 1 using kallisto on Ladder-seq reads pooled across bands and show the histogram of gene complexity measured as the number of expressed transcripts. In Ladder-seq, we partition the set of expressed transcripts into 7 bands and count the number of transcripts contained in each band according to their annotated length (plus 200 nt average poly(A) tail size⁴⁶), assuming cuts at 1000 bp, 1500 bp, 2000 bp, 3000 bp, 4000 bp and 6000 bp. The resulting histogram of effective gene complexity shows an increased fraction of gene bands with low complexity.

Extended Data Fig. 2 In silico gel for WT and KO NPC samples.

For every annotated transcript the intensity of a point with y- coordinate equal to its annotated length (plus 200 nt average poly(A) tail size) shows the fraction of reads obtained from each band (x-axis) that can be assigned unambiguously to it. As expected, each band contains predominantly reads from transcripts of a distinct length range.

Extended Data Fig. 3 Overview of the benchmark strategy.

1. The ground truth transcriptome including abundances and error profile is calculated by RSEM from GEUVADIS sample NA12716_7. 2. Reads are simulated from the ground truth transcriptome by RSEM to obtain RNA-seq samples of different sequencing depths. 3. A matching Ladder-seq sample is obtained by separating reads in silico according to probability mass functions estimated from our NPC Ladder-seq sample (and variants thereof). 4. Transcripts are quantified and assembled by our Ladder-seq tailored transcript analysis methods kallisto-ls, StringTie-ls, and Trinity-ls from the Ladder-seq sample, while their conventional counterparts are run on the corresponding RNA-seq sample. 5. The results are compared to the ground truth to evaluate and compare their accuracy.

Extended Data Fig. 4 Quantification accuracy of kallisto-ls on 75 million simulated reads.

Mean values across 20 simulations are reported. Pearson correlation of estimated and ground truth abundance in log2 transformed transcripts per million (TPM) and mean absolute relative difference (MARD) are shown as a function of gene complexity, that is the number of transcripts expressed by a gene. For ease of visualization, we omit genes expressing a single transcript, many of which are estimated to be lowly expressed in GEUVADIS sample NA12716_7 by RSEM.

Extended Data Fig. 5 Accuracy of transcript assembly from 75 million simulated reads.

RNA-seq and Ladder-seq reads were aligned identically to the reference genome (GRCh38) using STAR. Sensitivity and precision of StringTie-ls and its conventional counterpart StringTie2 are shown as a function of gene complexity measured as the number of expressed transcripts. Sensitivity and precision are calculated with respect to the same set of ground truth transcripts as in the smaller 30 million read pairs data set.

Extended Data Fig. 6 Accuracy of de novo transcript assembly from 30 million (top row) and 75 million (bottom row) simulated reads.

(a) Sensitivity of Trinity-ls and its conventional counterpart Trinity at 90% transcript length cut-off is shown as a function of gene complexity measured as the number of expressed transcripts. (b) Total number of correctly assembled transcripts at different transcript length cut-offs. (c) Precision at different transcript length cut-offs.

Extended Data Fig. 7 Ladder-seq improves differential analysis of reconstructed transcriptomes.

(a) Computational pipeline for differential isoform usage analysis with conventional RNA-seq and Ladder-seq. Reads were aligned using STAR aligner prior to transcript assembly for both pipelines. (b) Venn diagram showing overlap between switching genes identified by Ladder-seq and conventional RNA-seq. (c and e) Isoform switches identified only by Ladder- seq in genes Exo1 and Tram1l1 (between n=4 WT and n=4 KO samples). Red arrows show location of m⁶A methylation. TCONS_00000541 and TCONS_00000542 are novel isoforms of Exo1 detected only by Ladder-seq. TCONS_00006855 is a novel isoform of Tram1l1 that was assembled by both methods, but conventional RNA-seq failed to identify the isoform switch. Without length information, conventional RNA-seq reads in KO bands 2 and 3 were predominantly assigned to the annotated transcript in band 4. Barplots represent mean ± SEM; FDR corrected pExo1 and Tram1l1 showing separation of reads from transcripts of different lengths. (g) Jensen Shannon divergence for Ladder-seq and conventional RNA-seq for all identified transcripts grouped by relative difference in abundance estimation by both methods (n=18761 for

Extended Data Fig. 8 Mettl14 KO leads to isoform switches in m⁶A methylated genes.

(a) Gene Ontology for m⁶A methylated genes containing isoform switches. (b) Isoform switch in Ptprz1 (between n=4 WT and n=4 KO samples). Red arrow shows location of m⁶A methylation. Barplots represent mean ± SEM; FDR corrected p

Extended Data Fig. 9 Comparison of Ladder-seq and ONT-cDNA sequencing on mouse NPCs.

(a) Orange bars show validation by StringTie2 (left panel) or by an independent ONT dataset (Dong et al.) (right panel) of transcripts found by both Ladder-seq and ONT-cDNA while light blue bars show validation values for transcripts reported only by ONT-cDNA. In comparison, 32.5% of transcripts uniquely identified by Ladder-seq (average TPM ≥ 1) were also identified in the dataset by Dong et al. (b) Boxplots showing expression levels (TPM) for transcripts identified both by long- reads and Ladder-seq (green boxes) and for transcripts identified only by Ladder-seq (grey boxes). Left panel shows values for all Ladder-seq transcripts with TPM higher than 1 (n=6169 identified only by Ladder-seq, n=15099 by both). Right panel shows values for Ladder-seq switching transcripts with TPM higher than 1 (n=905 identified only by Ladder-seq, n=2012 by both). Boxplot definition: Bottom and top of the box correspond to lower and upper quartiles of the data, bar is the median and whiskers are median ± 1.5x interquartile range.

Extended Data Fig. 10 Cumulative percentage of Ladder-seq transcripts identified by long-read sequencing.

Bars show percentage of Ladder-seq transcripts identified by FLAIR (green), plus those additionally identified by StringTie2 (blue), plus transcripts additionally found in a recently published long-read mouse NPC transcriptome (light blue).

Supplementary information

About this article

Cite this article

Ringeling, F.R., Chakraborty, S., Vissers, C. et al. Partitioning RNAs by length improves transcriptome reconstruction from short-read RNA-seq data.
Nat Biotechnol (2022). https://doi.org/10.1038/s41587-021-01136-7

Download citation

Received: 09 September 2020
Accepted: 26 October 2021
Published: 10 January 2022
DOI: https://doi.org/10.1038/s41587-021-01136-7

Note: This article have been indexed to our site. We do not claim legitimacy, ownership or copyright of any of the content above. To see the article at original source Click Here

All-new Amazon Echo Show 15 | A stunning HD 15.6" smart kitchen TV for home organization with Alexa

(349)

$299.99 (as of January 22, 2025 19:24 GMT +00:00 - )

The Frontiersmen: A Narrative

(1920)

$22.19 (as of January 22, 2025 19:24 GMT +00:00 - )

Under Cabinet Lights,40 LED Rechargeable Battery Operated Motion Sensor Light Indoor, 2 Pack Magnetic Dimmable…

(25297)

$23.80 (as of January 21, 2025 19:55 GMT +00:00 - )

SAMSUNG Galaxy Watch Ultra 47mm LTE AI Smartwatch w/Energy Score, Wellness Tips, Heart Rate Tracking, Sleep…

(14)

$341.99 (as of January 22, 2025 19:19 GMT +00:00 - )

Seagate Exos ST24000NM000C 24TB 7.2K RPM SATA 6Gb/s 512e 3.5in Enterprise Hard Drive (Renewed)

(14)

$299.99 (as of January 22, 2025 19:19 GMT +00:00 - )

Index Of News Author

Science and Medical

A BBC presenter criticised the UK’s boat policy. He was suspended

Key PointsGary Lineker has been told to stay off-air after he criticised the UK's asylum seeker policy.A bill introduced to parliament would deport asylum seekers to their homeland or to so-called safe third countries.The Ex-England captain, who has hosted refugees in his home, said the policy was "beyond awful".Gary Lineker was told by the BBC

March 10, 2023

Science and Medical

2-in-1 shot for flu and COVID shows promise in advanced trial

The ongoing trial includes adults ages 50 and older. (Image credit: mixetto via Getty Images) An experimental two-in-one vaccine triggered strong immune responses against both the flu and SARS-CoV-2, the virus behind COVID-19, in an advanced clinical trial that included adults ages 50 and older. The ongoing clinical trial pits this new combination shot, called

June 10, 2024

Science and Medical

COVID Long Haulers Are Calling Attention to Chronic Illnesses

When the first wave of coronavirus infections hit the U.S. in March 2020, what kept me up at night was not only the tragedy of the acute crisis but also the idea that we might soon be facing a second crisis—a pandemic of chronic illness triggered by the virus. I had just finished reporting and…

February 16, 2022

Science and Medical

Constraining the Multiverse: Stephen Hawking’s Final Theory About the Big Bang

Professor Stephen Hawking’s final theory on the origin of the universe, which he worked on in collaboration with Professor Thomas Hertog from KU Leuven, was published in 2018 in the Journal of High Energy Physics. The theory, which was submitted for publication before Hawking’s death earlier in 2018, is based on string theory and predicts…

January 9, 2022

Science and Medical

Nurses Struggle through a New COVID Wave with Rage and Compassion

To health care workers in the COVID era, holidays mean death, and we knew Omicron was coming before it had a name. The wave caused by this variant has barely begun, rapidly gathering steam, and we are exhausted, attempting to pull from reserves badly drained by earlier surges. Back in August, the beds of my…

January 11, 2022

Science and Medical

Chandra and Webb combine to solve the mystery of the “Green Monster”

Astronomers have, for the first time, combined data from NASA’s Chandra X-ray Observatory and the James Webb Space Telescope to study the well-known supernova remnant Cassiopeia A (Cas A). This collaborative effort has shed light on an unusual structure in the debris known as the “Green Monster,” first discovered in Webb data in April 2023.

January 9, 2024