Dragen QC report template for multiple samples from fastqc metrics

By Synnøve Yndestad in R sequencing RNAseq

February 14, 2022

For RNAseq performed on the Illumina NovaSeq 6000, a single run may contain 70 different samples. Aggregating and plotting quality control (QC) metrics for the whole batch makes it easy to spot samples with low sequencing quality within a run.
While MultiQC is an excellent tool for aggregating and visualizing QC metrics, my RNAseq project is run using the Dragen pipeline. Since no Dragen module had been implemented when I was processing the samples, I wrote my own version in the form of an Rmarkdown report template. It takes a folder of *.fastqc_metrics.csv files generated by Dragen and produces an HTML report with plots made interactive by plotly.

An example report can be viewed here.
The Rmarkdown template and the example report can be found on my GitHub here.

Instructions for use:
1- Add a folder containing the *.fastqc_metrics.csv files to the working directory.
The folder name will be assigned as RunID.
2- Change any run-specific details in the Description section to document the run for future reference, e.g. what kind of samples were sequenced, which study they came from, which prep protocol generated the library and which Dragen version was used in the processing.
3- Knit the report.
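
To illustrate what step 1 implies, here is a minimal sketch of how a run folder could be turned into a RunID and a list of per-sample files. It is written in Python for illustration and is not the R code used in the template. The function name `collect_metric_files`, the glob pattern, and the rule that the sample name is everything before the first dot in the file name are all assumptions.

```python
from pathlib import Path


def collect_metric_files(run_dir, pattern="*fastq*_metrics.csv"):
    """Return (run_id, {sample_name: path}) for one run folder.

    Hypothetical helper: the pattern and the sample-name rule are
    assumptions, adjust them to match your own file names.
    """
    run_dir = Path(run_dir).resolve()
    run_id = run_dir.name  # the folder name doubles as the RunID
    files = {}
    for path in sorted(run_dir.glob(pattern)):
        # Assumption: the sample name is everything before the first dot.
        sample = path.name.split(".", 1)[0]
        files[sample] = path
    return run_id, files
```

Calling `collect_metric_files("MyRun")` on a folder named `MyRun` would return `"MyRun"` as the RunID together with one entry per matching metrics file; other Dragen outputs in the same folder are ignored by the pattern.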

The knitted report is an HTML file containing the plots described below.

Plots produced by the report:

1- Read Mean Quality; Per-Sequence Quality Scores

Total number of reads at each mean quality value. The average Phred-scale quality value of each read is rounded to the nearest integer.
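
To make this metric concrete, here is a small Python sketch (again an illustration, not the template's R code) that reads the Read Mean Quality rows and summarises them. It assumes each row has four comma-separated fields (section, mate, metric, value) and that the metric label looks like `Q38 Reads`. The example rows and the function names are made up, so check the layout against your own files.

```python
import csv
import io
import re
from collections import defaultdict

# Made-up example rows in the assumed layout: section, mate, metric, value.
EXAMPLE = """\
READ MEAN QUALITY,Read1,Q20 Reads,100
READ MEAN QUALITY,Read1,Q30 Reads,300
READ MEAN QUALITY,Read1,Q36 Reads,600
READ MEAN QUALITY,Read2,Q30 Reads,500
POSITIONAL BASE MEAN QUALITY,Read1,ReadPos 1 A Average Quality,35.10
"""


def read_mean_quality(handle):
    """Return {mate: {quality: read_count}} from READ MEAN QUALITY rows."""
    counts = defaultdict(dict)
    for row in csv.reader(handle):
        # Skip rows from other sections or with an unexpected shape.
        if len(row) != 4 or row[0].strip().upper() != "READ MEAN QUALITY":
            continue
        match = re.fullmatch(r"Q(\d+) Reads", row[2].strip())
        if match:
            counts[row[1].strip()][int(match.group(1))] = int(row[3].strip())
    return dict(counts)


def fraction_at_least(hist, threshold=30):
    """Fraction of reads whose rounded mean quality is >= threshold."""
    total = sum(hist.values())
    passing = sum(n for q, n in hist.items() if q >= threshold)
    return passing / total if total else 0.0


counts = read_mean_quality(io.StringIO(EXAMPLE))
print(counts["Read1"])                     # {20: 100, 30: 300, 36: 600}
print(fraction_at_least(counts["Read1"]))  # 0.9
```

Plotting the read count against the quality value for each sample gives the per-sequence quality curve; a summary such as the fraction of reads at or above Q30 is one way to rank samples within a run.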

2- Positional Base Mean Quality; Per-Base Quality Scores

Average Phred-scale quality value of bases with a specific nucleotide at a given location in the read. Locations are listed first and can be either specific positions or ranges. The nucleotide is listed second and can be A, C, G, or T. N (ambiguous) bases are assumed to have the system default value, usually QV2.
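
The "location first, nucleotide second" layout can be parsed with a short regular expression. The Python sketch below is an illustration only and not the template's R code. It assumes metric labels of the form `ReadPos 1 A Average Quality` for single positions and `ReadPos 101-105 T Average Quality` for ranges; the exact label text, the example values and the function names are assumptions.

```python
import re

# Assumed label layout: "ReadPos <position or start-end> <A|C|G|T> ...".
LABEL = re.compile(r"ReadPos (\d+)(?:-(\d+))? ([ACGT])\b")


def parse_position_label(label):
    """Split a metric label into (start, end, nucleotide)."""
    match = LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognised label: {label!r}")
    start = int(match.group(1))
    # A single position is treated as a range of length one.
    end = int(match.group(2)) if match.group(2) else start
    return start, end, match.group(3)


def mean_quality_by_position(rows):
    """Average the per-nucleotide means at each location.

    rows is an iterable of (label, value) pairs. The average is
    unweighted, so it only approximates the true per-base mean,
    which would require the base counts per nucleotide.
    """
    sums = {}
    for label, value in rows:
        start, end, _ = parse_position_label(label)
        total, n = sums.get((start, end), (0.0, 0))
        sums[(start, end)] = (total + float(value), n + 1)
    return {pos: total / n for pos, (total, n) in sorted(sums.items())}


# Made-up example values.
rows = [
    ("ReadPos 1 A Average Quality", "34.0"),
    ("ReadPos 1 C Average Quality", "36.0"),
    ("ReadPos 101-105 T Average Quality", "30.5"),
]
print(parse_position_label(rows[2][0]))  # (101, 105, 'T')
print(mean_quality_by_position(rows))    # {(1, 1): 35.0, (101, 105): 30.5}
```

Keeping the start and end of each location as integers means the positions sort numerically along the read, which is what a per-base quality plot needs on its x axis.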