

Genome Browser Track Visualization Workflow

A Snakemake 8 workflow for easy visualization of genome browser tracks of aligned/mapped BAM files (e.g., RNA-seq, ATAC-seq, scRNA-seq, …) powered by the wrapper gtracks for the package pyGenomeTracks and IGV-reports.

[!NOTE]
This workflow adheres to the module specifications of MR.PARETO, an effort to augment research by modularizing (biomedical) data science. For more details, instructions, and modules check out the project’s repository.

⭐️ Star and share modules you find valuable 📤 - help others discover them, and guide our future work!

[!IMPORTANT]
If you use this workflow in a publication, please don’t forget to give credit to the authors by citing it using this DOI 10.5281/zenodo.10849097.

Workflow Rulegraph

🖋️ Authors

💿 Software

This project wouldn’t be possible without the following software and their dependencies:

| Software | Reference (DOI) |
| --- | --- |
| deeptools | https://doi.org/10.1093/nar/gkw257 |
| gtracks | https://gitlab.com/salk-tm/gtracks |
| igv-reports | https://github.com/igvteam/igv-reports |
| pygenometracks | https://doi.org/10.1093/bioinformatics/btaa692 |
| samtools | https://doi.org/10.1093/bioinformatics/btp352 |
| sinto | https://github.com/timoast/sinto |

🔬 Methods

This is a template for the Methods section of a scientific publication and is intended to serve as a starting point. Only retain paragraphs relevant to your analysis. References [ref] to the respective publications are curated in the software table above. Versions (ver) have to be read from the respective conda environment specifications (workflow/envs/*.yaml) or, post-execution, from the result directory (genome_tracks/envs/*.yaml). Parameters that have to be adapted depending on the data or workflow configuration are denoted in square brackets, e.g., [X].

(optional) Single-cell preprocessing. Each single-cell BAM file was split into [group]-wise BAM files according to its cell barcode metadata using filterbarcodes from the command line tool sinto (ver) [ref].
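As an illustration of the splitting step, the sketch below prepares the barcode-to-group file that sinto's filterbarcodes reads and assembles the matching command line. It is a minimal sketch, not the workflow's actual rule: the helper names `write_cells_file` and `filterbarcodes_command` are hypothetical, and it assumes the cells file is a headerless, tab-separated table with the barcode in the first column and the group in the second.

```python
import csv
from pathlib import Path


def write_cells_file(barcode_to_group, path):
    """Write a headerless two-column TSV: cell barcode, group label.

    Assumption: this is the layout expected by `sinto filterbarcodes -c`.
    """
    path = Path(path)
    with path.open("w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        # Sort by barcode so the output is reproducible.
        for barcode, group in sorted(barcode_to_group.items()):
            writer.writerow([barcode, group])
    return path


def filterbarcodes_command(bam, cells_file, threads=1):
    """Build (but do not run) a sinto filterbarcodes call as an argument list."""
    return [
        "sinto", "filterbarcodes",
        "-b", str(bam),         # input single-cell BAM file
        "-c", str(cells_file),  # barcode-to-group table
        "-p", str(threads),     # number of processes
    ]
```

The returned list can be passed to `subprocess.run(..., check=True)` once sinto is installed; sinto then writes one BAM file per group label.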

Processing. Aligned (filtered and indexed) BAM files were merged by [group] using samtools (ver) [ref]. The coverage of each merged BAM file was determined for downstream analysis and visualization using bamCoverage from the command line tool deepTools (ver) [ref] and saved in the bigWig format. Finally, for all relevant genes/genomic regions [gene_list], we extracted the coordinates (with start and end extended by [base_buffer] bases) and the number of isoforms from the 12-column BED file [genome_bed] of the [genome] genome annotation.
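The region extraction at the end of this step can be sketched as follows. This is an illustrative approximation rather than the workflow's own code: the function name `gene_regions` is hypothetical, and it assumes that all isoforms of a gene share the same value in the BED name column (column 4), so that the number of matching rows equals the number of isoforms.

```python
from collections import defaultdict


def gene_regions(bed_lines, gene_list, base_buffer=0):
    """Return {gene: (chrom, start, end, n_isoforms)} from 12-column BED lines.

    Assumption: isoforms of one gene share the name in column 4.
    """
    wanted = set(gene_list)
    rows_by_gene = defaultdict(list)
    for line in bed_lines:
        # Skip blank lines and BED header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\n").split("\t")
        name = fields[3]
        if name in wanted:
            rows_by_gene[name].append((fields[0], int(fields[1]), int(fields[2])))

    regions = {}
    for gene, rows in rows_by_gene.items():
        chroms = {chrom for chrom, _, _ in rows}
        if len(chroms) != 1:
            raise ValueError(f"{gene} is annotated on several chromosomes: {sorted(chroms)}")
        # Span all isoforms, then extend both ends; never go below position 0.
        start = max(0, min(s for _, s, _ in rows) - base_buffer)
        end = max(e for _, _, e in rows) + base_buffer
        regions[gene] = (chroms.pop(), start, end, len(rows))
    return regions
```

Genes from the list that do not occur in the annotation are simply absent from the result, which makes missing entries easy to detect before plotting.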

Visualization. Visualizations for each relevant gene/genomic region and [category] were generated from the bigWig coverage files by vertically stacking genome browser tracks, with their annotation at the [x_axis] and each track scaled by [y_max] reads. The plotting was performed using the Python wrapper gtracks (ver) [ref] for the package pyGenomeTracks (ver) [ref]. Additionally, an interactive self-contained IGV-report containing all merged samples and genes/genomic regions of interest was generated using igv-reports (ver) [ref]. Finally, a UCSC genome browser track hub was created for online sharing and inspection using the UCSC Genome Browser. Both the plotted tracks and the UCSC genome browser tracks were color coded according to [group].

The processing and visualizations described here were performed using a publicly available Snakemake (ver) [ref] workflow [10.5281/zenodo.10849097].

🚀 Features

The workflow performs the following steps to produce the outlined results (genome_tracks/).

🛠️ Usage

Here are some tips for the usage of this workflow:

This workflow is written with Snakemake and its usage is described in the Snakemake Workflow Catalog.

⚙️ Configuration

Detailed specifications can be found in ./config/README.md.

📖 Examples

— COMING SOON —

Runtime examples for different data modalities:

🧬 Genome Browser Tracks

The bigWigs directory contains the read coverage per sample/group in bigWig format ({group}.bw) for visual inspection of individual samples (e.g., during QC) or groups (e.g., to compare conditions). Below are instructions for two different approaches (online/local).

UCSC Genome Browser Track Hub (online)

  1. Requirement: web server.
  2. Copy (or symlink) the bigWigs directory to an externally accessible location on your web server (=web_server_location).
  3. Create a UCSC Genome Browser hyperlink
  4. Share the link with the world e.g., collaborators or upon publication of your data.
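For step 3, a hyperlink to a track hub can be assembled from the genome assembly and the public location of the hub file. The sketch below is one way to do this under stated assumptions: the function name `ucsc_hub_link` is hypothetical, the hub description file is assumed to be called `hub.txt` and to sit inside the copied bigWigs directory, and the link relies on the `db` and `hubUrl` query parameters of the UCSC `hgTracks` page.

```python
from urllib.parse import urlencode


def ucsc_hub_link(genome, web_server_location, hub_file="hub.txt"):
    """Build a UCSC Genome Browser link that loads a track hub.

    Assumption: the hub file lives at <web_server_location>/bigWigs/<hub_file>.
    """
    hub_url = f"{web_server_location.rstrip('/')}/bigWigs/{hub_file}"
    # urlencode percent-encodes the hub URL so it survives as a query value.
    query = urlencode({"db": genome, "hubUrl": hub_url})
    return f"https://genome.ucsc.edu/cgi-bin/hgTracks?{query}"
```

For example, `ucsc_hub_link("hg38", "https://example.org/data")` yields a link that opens the hg38 assembly with the hub attached, provided the web server location is reachable from the outside.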

A new UCSC Genome Browser feature (2024-08-30) allows users to download all visible data in the current region directly from the tracks display. This facilitates reproducibility when writing reports or publications, as data can update and change over time. The feature can be found in the blue bar menu under Downloads > Download Current Track Data. In the resulting pop-up dialogue box, users can select which of the visible tracks to download, as well as the file name and the output format (JSON, csv, tsv).

IGV: Integrative Genomics Viewer (local/offline)

  1. Requirement: IGV Desktop application.
  2. Open IGV.
  3. Select genome.
  4. Drag and drop all/selected bigWig files from the bigWigs directory directly into the IGV application.
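As a scripted alternative to drag and drop, the same steps can be written as an IGV batch script. The sketch below only generates the script text; the function name `igv_batch_script` is hypothetical, and the genome identifier, file paths and locus are placeholders to be replaced with your own values. It uses the IGV batch commands `new`, `genome`, `load` and `goto`.

```python
def igv_batch_script(genome, bigwig_paths, locus=None):
    """Return the text of an IGV batch script that loads bigWig tracks."""
    lines = ["new", f"genome {genome}"]           # fresh session, then select genome
    lines += [f"load {path}" for path in bigwig_paths]  # one track per bigWig file
    if locus:
        lines.append(f"goto {locus}")             # optional: jump to a gene or region
    return "\n".join(lines) + "\n"
```

Saving the returned text to a file and running it from within IGV (Tools > Run Batch Script) should reproduce the manual steps above.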

🔗 Links

📚 Resources

📑 Publications

The following publications successfully used this module for their analyses.
