OmniEdge Scientific
Beginner

Next-Generation Sequencing (NGS) Analysis

Master the complete Next-Generation Sequencing (NGS) workflow. This course provides hands-on experience in a Linux environment running under Windows Subsystem for Linux (WSL), covering everything from raw-data quality control to genome assembly and sequence alignment.

  • Environment: Linux on Windows (WSL) setup and command-line mastery.
  • Core Workflows: QC, Trimming, Assembly (SPAdes), and Alignment (BWA/SAMtools).
  • Data Analysis: Working with FASTA, FASTQ, BAM, and VCF formats.

Course Overview

This course introduces NGS data analysis, covering essential computational skills, biological data handling, and practical workflows. Participants gain hands-on experience with Linux-based environments and widely used bioinformatics tools, working on real genomic data.


What You Will Learn

Modules 0 & 1: Linux Environment & NGS Fundamentals

  • Linux on Windows: Setting up WSL and understanding why Linux is vital for bioinformatics.
  • Command Line: Practical navigation, file system structure, and essential shell commands.
  • Workflow Design: Overview of end-to-end NGS analysis and setting up mini-projects.

Modules 2 & 3: Data Retrieval & Similarity Search

  • NCBI Databases: Accessing real genomic data and reference genomes (e.g., E. coli).
  • File Formats: Deep dive into FASTA, FASTQ, BAM, and VCF structures.
  • BLAST Analysis: Running BLAST locally and through the web interface, comparing the two approaches, and interpreting the results.
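
To make the FASTQ format mentioned above concrete, here is a minimal Python sketch of a FASTQ reader. It is not course material: the function name `read_fastq` and the example reads are made up for illustration, and the sketch assumes the common layout of exactly four lines per record (header starting with `@`, sequence, separator starting with `+`, quality string).

```python
import io


def read_fastq(handle):
    """Yield (read_id, sequence, quality) tuples from a FASTQ file handle.

    Assumption: every record occupies exactly four lines (no wrapped sequences).
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break  # end of input
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"Malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"Sequence/quality length mismatch in {header!r}")
        # Keep only the identifier: drop the leading '@' and any description.
        read_id = header[1:].split(" ")[0]
        yield read_id, sequence, quality


# Made-up two-record input, for illustration only.
EXAMPLE_FASTQ = "@read1 sample\nACGTACGT\n+\nIIIIIIII\n@read2\nGGCC\n+\n!!II\n"

records = list(read_fastq(io.StringIO(EXAMPLE_FASTQ)))
for read_id, sequence, quality in records:
    print(read_id, len(sequence))
```

The same generator works on a real file opened with `open(path)`; for gzip-compressed reads, `gzip.open(path, "rt")` from the standard library would be the usual substitute.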

Module 4: Quality Control (QC)

  • Raw Data Assessment: Understanding Phred quality scores and FastQC reports.
  • Data Cleaning: Hands-on read trimming and filtering techniques to improve data quality.
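
As a rough illustration of the Phred scores mentioned above: a score Q encodes the estimated base-call error probability P as Q = -10 * log10(P), and in the widely used Phred+33 encoding each quality character stores Q as its ASCII code minus 33. The Python sketch below decodes a quality string and applies a simple mean-quality filter. The function names and the threshold of 20 are illustrative assumptions, and averaging scores is a simplification compared with the sliding-window trimming that dedicated tools typically perform.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into integer Phred scores (Phred+33 by default)."""
    return [ord(ch) - offset for ch in quality_string]


def error_probability(q):
    """Convert a Phred score into the estimated probability of a wrong base call."""
    return 10 ** (-q / 10)


def mean_quality(quality_string, offset=33):
    """Arithmetic mean of the Phred scores; 0.0 for an empty string."""
    scores = phred_scores(quality_string, offset)
    if not scores:
        return 0.0
    return sum(scores) / len(scores)


def passes_filter(quality_string, min_mean_q=20):
    """Keep a read only if its mean Phred score reaches the (assumed) threshold."""
    return mean_quality(quality_string) >= min_mean_q


# 'I' is ASCII 73 (Q40), '5' is ASCII 53 (Q20), '!' is ASCII 33 (Q0).
print(phred_scores("I5!"))
print(error_probability(20))
print(passes_filter("IIII"), passes_filter("!!!!"))
```

A score of Q20 corresponds to roughly a 1 in 100 chance of an incorrect base, and Q30 to roughly 1 in 1000, which is why these values often appear as cut-offs in QC reports.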

Modules 5 & 6: Genome Assembly & Alignment

  • Genome Assembly: Assembling draft genomes (contigs and scaffolds) from reads using SPAdes and validating the results.
  • Sequence Alignment: Aligning reads to reference genomes using BWA and SAMtools.
  • Metrics: Evaluating alignment quality and understanding SAM/BAM formats.
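
One piece of the SAM format that helps when evaluating alignments is the FLAG column, a bitwise integer defined in the SAM specification. The Python sketch below decodes a flag into readable labels and computes a simple mapping rate over primary alignments. The label strings and the function names `decode_flag` and `mapping_rate` are illustrative choices, not SAMtools API; in practice `samtools flagstat` reports comparable summaries.

```python
# Bit values follow the SAM specification; the label strings are informal.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the labels of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


def mapping_rate(flags):
    """Fraction of primary records (not secondary or supplementary) that are mapped."""
    primary = [f for f in flags if not f & (0x100 | 0x800)]
    if not primary:
        return 0.0
    mapped = sum(1 for f in primary if not f & 0x4)
    return mapped / len(primary)


print(decode_flag(99))
print(mapping_rate([99, 147, 4, 256]))
```

Flag 99 (1 + 2 + 32 + 64) describes the first read of a properly paired fragment whose mate aligns to the reverse strand; in the second call, the record with flag 256 is a secondary alignment and is left out of the denominator.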

Module 7: Variant Calling (Conceptual)

  • Genetic Variation: Principles of identifying variants from sequencing data.
  • Pipelines: Understanding the logic of industry-standard variant calling workflows and VCF interpretation.

Learning Outcomes

  • Work efficiently in a Linux-based bioinformatics environment.
  • Handle real-world NGS datasets independently.
  • Perform QC, Assembly, and Alignment using professional tools.
  • Translate raw sequencing data into meaningful biological insights.