EVscope: A Comprehensive Bioinformatics Pipeline for Accurate and Robust Analysis of Total RNA Sequencing from Extracellular Vesicles
Output Details
Description
**Results:** Here, we present EVscope, an open-source bioinformatics pipeline designed specifically for processing EV RNA-seq datasets. EVscope employs an optimized genome-wide expectation-maximization (EM) algorithm that significantly improves multi-mapping read assignment at single-base resolution by effectively leveraging alignment scores (AS) and local read coverage, specifically tailored for fragmented and low-abundance EV RNAs. Notably, EVscope uniquely generates EM-based BigWig files for downstream analysis, a capability currently unavailable in existing EM-based BigWig quantification tools. The pipeline systematically integrates 27 major steps, including quality control, analysis of library structure, contamination assessment, read alignment, read strandedness detection, UMI-based deduplication, RNA quantification, genomic DNA (gDNA) contamination correction, cellular and tissue source inference, and visualization with a comprehensive HTML report. EVscope incorporates a comprehensive, updated annotation covering 19 distinct RNA biotypes, encompassing protein-coding genes, lncRNAs, miRNAs, piRNAs, retrotransposons (LINEs, SINEs, ERVs), and additional non-coding RNAs (tRNAs, rRNAs, snoRNAs). Furthermore, it leverages two highly balanced circRNA detection algorithms for robust circular RNA identification. Notably, a downstream module enables the inference of the tissue/cellular origins of EV RNAs using bulk and single-cell RNA-seq reference datasets. EVscope is implemented as a convenient, single-command Bash pipeline leveraging Conda-managed standard software packages and custom scripts, ensuring reproducibility and straightforward deployment.
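The general idea of EM-based multi-mapping read assignment can be illustrated with a toy sketch. This is not EVscope's implementation: the function name `em_assign`, the exponential weighting of alignment scores, the `as_scale` parameter, and the use of coarse locus labels instead of single-base positions are all assumptions made for illustration. Each read is split across its candidate loci in proportion to an alignment-score weight times the current coverage estimate (E-step), and coverage is then recomputed from those fractional assignments (M-step).

```python
import math

def em_assign(reads, n_iter=50, as_scale=1.0):
    """Toy EM for multi-mapping reads (illustrative, not EVscope's code).

    reads: list of reads, each a list of (locus, alignment_score) candidates.
    Returns (coverage per locus, per-read assignment probabilities).
    """
    loci = sorted({locus for hits in reads for locus, _ in hits})
    coverage = {locus: 1.0 for locus in loci}  # uniform starting estimate
    resp = []
    for _ in range(n_iter):
        # E-step: weight each candidate by its alignment score and the
        # current coverage estimate, then normalise within the read.
        resp = []
        for hits in reads:
            best = max(score for _, score in hits)
            weights = [
                math.exp((score - best) / as_scale) * coverage[locus]
                for locus, score in hits
            ]
            total = sum(weights)
            resp.append([w / total for w in weights])
        # M-step: coverage is the sum of fractional assignments per locus.
        new_coverage = {locus: 0.0 for locus in loci}
        for hits, probs in zip(reads, resp):
            for (locus, _), p in zip(hits, probs):
                new_coverage[locus] += p
        coverage = new_coverage
    return coverage, resp

# Two reads map uniquely to locus "A"; a third maps equally well to "A" and "B".
reads = [[("A", 100)], [("A", 100)], [("A", 100), ("B", 100)]]
coverage, resp = em_assign(reads)
```

In this example the ambiguous read has identical alignment scores at both loci, so the uniquely mapped reads at "A" are what pull its assignment toward "A" over successive iterations.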
Identifier (DOI)
10.1101/2025.06.24.660984