
01. GATK Human Germline Variant Best Practices SnakeMake Pipeline: Workflow Overview

 生信探索, published 2023-05-26 from Yunnan

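This is the first GATK variant-calling pipeline I have learned: short-variant discovery (SNPs and INDELs) in human germline data. I wrote a SnakeMake workflow that runs from the FASTQ files through to the final VEP-annotated VCF file. For an introduction to VCF, see the previous post, "Gene Sequence Variation Information: VCF (Variant Call Format)".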

The pipeline code is available at https:///BioQuest/smkhgs or https://github.com/BioQuestX/smkhgs

README

GATK best practices workflow

Pipeline summary

SnakeMake workflow for Human Germline short variants (SNP+INDEL)

Reference

  1. Reference genome related files and GATK bundle files (GATK)
  2. VEP variation annotation files (VEP)

Prepare

  1. Adapter trimming (Fastp)
  2. Aligner (BWA mem2)
  3. Mark duplicates (samblaster)
  4. Generates recalibration table for Base Quality Score Recalibration (BaseRecalibrator)
  5. Apply base quality score recalibration (ApplyBQSR)
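Steps 4 and 5 form a two-pass procedure: BaseRecalibrator first builds a recalibration table from the duplicate-marked BAM and a set of known variant sites, then ApplyBQSR uses that table to write a new BAM with adjusted base qualities. The following is a minimal sketch of the two command lines as Python argument lists, not the rules from the repository. The function name and all file paths are hypothetical; the GATK flags are the standard ones for these two tools.

```python
from typing import List


def bqsr_commands(bam: str, ref: str, known_sites: List[str],
                  table: str, out_bam: str) -> List[List[str]]:
    """Build the BaseRecalibrator and ApplyBQSR command lines (hypothetical helper)."""
    # Pass 1: model systematic base-quality errors against known variant sites.
    recal = ["gatk", "BaseRecalibrator", "-R", ref, "-I", bam]
    for vcf in known_sites:
        recal += ["--known-sites", vcf]
    recal += ["-O", table]

    # Pass 2: rewrite the BAM using the recalibration table from pass 1.
    apply_bqsr = ["gatk", "ApplyBQSR", "-R", ref, "-I", bam,
                  "--bqsr-recal-file", table, "-O", out_bam]
    return [recal, apply_bqsr]


# Example with made-up paths; prints one command per line.
for cmd in bqsr_commands("sample.markdup.bam", "ref.fa",
                         ["dbsnp.vcf.gz", "known_indels.vcf.gz"],
                         "sample.recal.table", "sample.recal.bam"):
    print(" ".join(cmd))
```

Keeping the commands as argument lists makes them easy to pass to `subprocess.run` or to join into the `shell:` string of a SnakeMake rule.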

Quality control report

  1. Fastp report (MultiQC)
  2. Alignment report (MultiQC)
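Both reports come from MultiQC, which scans a directory for tool logs and aggregates them into one HTML page plus a `*_data` directory. A rough sketch of how the two invocations could be assembled is shown here. The input directories are assumptions (the README does not say where the fastp and alignment statistics are written); the report names match the files listed under `report/` in the output tree.

```python
from typing import List


def multiqc_command(input_dir: str, report_name: str,
                    outdir: str = "report") -> List[str]:
    """Build a MultiQC command line (hypothetical helper)."""
    return [
        "multiqc", input_dir,
        "--outdir", outdir,          # directory that receives the report
        "--filename", report_name,   # e.g. fastp_multiqc.html
        "--force",                   # overwrite a report from a previous run
    ]


# Assumed input locations; adjust to wherever the tool logs actually live.
fastp_report = multiqc_command("results/trimmed", "fastp_multiqc.html")
align_report = multiqc_command("results/prepared", "prepare_multiqc.html")
print(" ".join(fastp_report))
print(" ".join(align_report))
```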

Call

  1. Call germline SNPs and indels via local re-assembly of haplotypes (HaplotypeCaller)
  2. Import VCFs to GenomicsDB (GenomicsDBImport)
  3. Perform joint genotyping on one or more samples pre-called with HaplotypeCaller (GenotypeGVCFs)
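The three calling steps chain together: one HaplotypeCaller run per sample, one GenomicsDBImport that consolidates the per-sample calls, and one GenotypeGVCFs run over the resulting workspace. A sketch of that chain as Python argument lists follows. It assumes HaplotypeCaller is run in GVCF mode (`-ERC GVCF`), which is the usual setup for joint genotyping but is not stated in the README, and it uses hypothetical output names. GenomicsDBImport requires intervals; `config/captured_regions.bed` from the output tree is used here as a plausible choice.

```python
from typing import Dict, List


def call_commands(bams: Dict[str, str], ref: str, intervals: str,
                  workspace: str, out_vcf: str) -> List[List[str]]:
    """Build per-sample calling, import and joint-genotyping commands (hypothetical helper)."""
    cmds: List[List[str]] = []
    gvcfs: List[str] = []

    # 1. One GVCF per sample (assumed GVCF mode).
    for sample, bam in bams.items():
        gvcf = f"{sample}.g.vcf.gz"
        cmds.append(["gatk", "HaplotypeCaller", "-R", ref, "-I", bam,
                     "-O", gvcf, "-ERC", "GVCF"])
        gvcfs.append(gvcf)

    # 2. Consolidate all GVCFs into a single GenomicsDB workspace.
    import_cmd = ["gatk", "GenomicsDBImport",
                  "--genomicsdb-workspace-path", workspace, "-L", intervals]
    for gvcf in gvcfs:
        import_cmd += ["-V", gvcf]
    cmds.append(import_cmd)

    # 3. Joint genotyping reads the workspace through the gendb:// scheme.
    cmds.append(["gatk", "GenotypeGVCFs", "-R", ref,
                 "-V", f"gendb://{workspace}", "-O", out_vcf])
    return cmds


samples = {"SRR24443168": "SRR24443168.bam", "SRR24443169": "SRR24443169.bam"}
for cmd in call_commands(samples, "ref.fa", "config/captured_regions.bed",
                         "genomicsdb", "joint.vcf.gz"):
    print(" ".join(cmd))
```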

Filter

  1. Select SNP or INDEL variants from a VCF file (SelectVariants)
  2. Build a recalibration model to score variant quality for filtering purposes (VariantRecalibrator)
  3. Apply a score cutoff to filter variants based on a recalibration table (ApplyVQSR)
  4. Merge all the VCF files (Picard)
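The filter stage handles SNPs and INDELs separately: each variant type is selected, modelled, and filtered on its own, and the two filtered files are merged at the end. The sketch below shows one way to assemble those commands. It is not the repository's implementation: the helper names, file naming scheme, resource specifications, annotation list, and the 99.0 sensitivity level are all illustrative assumptions. The merge uses `gatk MergeVcfs`, the Picard tool as bundled with GATK4; the original workflow may call Picard directly.

```python
from typing import Dict, List


def vqsr_commands(vcf: str, ref: str, mode: str, resources: Dict[str, str],
                  annotations: List[str], prefix: str,
                  sensitivity: float = 99.0) -> List[List[str]]:
    """Select one variant type, build a VQSR model, apply it (hypothetical helper)."""
    tag = mode.lower()
    selected = f"{prefix}.{tag}.vcf.gz"
    recal = f"{prefix}.{tag}.recal"
    tranches = f"{prefix}.{tag}.tranches"
    filtered = f"{prefix}.{tag}.vqsr.vcf.gz"

    select = ["gatk", "SelectVariants", "-R", ref, "-V", vcf,
              "--select-type-to-include", mode, "-O", selected]

    model = ["gatk", "VariantRecalibrator", "-R", ref, "-V", selected,
             "-mode", mode, "-O", recal, "--tranches-file", tranches]
    # Each resource is "name,known=...,training=...,truth=...,prior=..." -> file.
    for spec, path in resources.items():
        model += [f"--resource:{spec}", path]
    for annotation in annotations:
        model += ["-an", annotation]

    apply_vqsr = ["gatk", "ApplyVQSR", "-R", ref, "-V", selected, "-mode", mode,
                  "--recal-file", recal, "--tranches-file", tranches,
                  "--truth-sensitivity-filter-level", str(sensitivity),
                  "-O", filtered]
    return [select, model, apply_vqsr]


def merge_command(vcfs: List[str], out_vcf: str) -> List[str]:
    """Merge the filtered SNP and INDEL files with MergeVcfs."""
    cmd = ["gatk", "MergeVcfs"]
    for vcf in vcfs:
        cmd += ["-I", vcf]
    return cmd + ["-O", out_vcf]


# Illustrative resource and annotation choices only.
snp_resources = {"hapmap,known=false,training=true,truth=true,prior=15.0": "hapmap.vcf.gz"}
for cmd in vqsr_commands("joint.vcf.gz", "ref.fa", "SNP", snp_resources,
                         ["QD", "FS", "MQ"], "cohort"):
    print(" ".join(cmd))
print(" ".join(merge_command(["cohort.snp.vqsr.vcf.gz", "cohort.indel.vqsr.vcf.gz"],
                             "cohort.filtered.vcf.gz")))
```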

Annotation

Annotate variant calls with VEP (VEP)
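As an illustration of the annotation step, here is a sketch of a VEP command line that writes a bgzip-compressed VCF and an HTML summary, matching the `results/vep_annotated.vcf.gz` and `report/vep_report.html` files in the output tree. The input path, cache directory, assembly, and fork count are assumptions, as is the choice to run VEP offline from a local cache.

```python
from typing import List


def vep_command(in_vcf: str, out_vcf: str, cache_dir: str, stats_html: str,
                assembly: str = "GRCh38", forks: int = 4) -> List[str]:
    """Build a VEP command line for offline, cache-based annotation (hypothetical helper)."""
    return [
        "vep",
        "--input_file", in_vcf,
        "--output_file", out_vcf,
        "--vcf",                        # write VCF rather than VEP's default format
        "--compress_output", "bgzip",   # produce a .vcf.gz
        "--cache", "--offline",         # annotate from a local cache only
        "--dir_cache", cache_dir,
        "--assembly", assembly,         # assumed genome build
        "--stats_file", stats_html,     # HTML summary report
        "--fork", str(forks),
        "--force_overwrite",
    ]


print(" ".join(vep_command("results/filtered/all.vcf.gz",   # hypothetical input
                           "results/vep_annotated.vcf.gz",
                           "resources/vep_cache",            # hypothetical cache dir
                           "report/vep_report.html")))
```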

SnakeMake Report

Outputs

.
├── config
│   ├── captured_regions.bed
│   ├── config.yaml
│   └── samples.tsv
├── dag.svg
├── logs
│   ├── annotate
│   ├── call
│   ├── filter
│   ├── prepare
│   ├── qc
│   ├── ref
│   └── trim
├── raw
│   ├── SRR24443168.fastq.gz
│   └── SRR24443169.fastq.gz
├── README.md
├── report
│   ├── fastp_multiqc_data
│   ├── fastp_multiqc.html
│   ├── prepare_multiqc_data
│   ├── prepare_multiqc.html
│   └── vep_report.html
├── results
│   ├── called
│   ├── filtered
│   ├── prepared
│   ├── trimmed
│   └── vep_annotated.vcf.gz
├── workflow
│   ├── envs
│   ├── report
│   ├── rules
│   ├── schemas
│   ├── scripts
│   └── Snakefile
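Every per-sample file in this layout is keyed by a sample ID, which in the example data equals the FASTQ file name without its extension (`SRR24443168`, `SRR24443169`). The helper below is a small sketch of how such IDs could be derived from the files under `raw/`. It is an assumption for illustration: the workflow itself has a `config/samples.tsv`, whose columns are not shown here, and most likely reads sample names from that file instead.

```python
from pathlib import PurePosixPath
from typing import Iterable, List

FASTQ_SUFFIXES = (".fastq.gz", ".fq.gz")


def sample_ids(paths: Iterable[str]) -> List[str]:
    """Return sorted sample IDs for paths that look like gzipped FASTQ files."""
    ids = []
    for path in paths:
        name = PurePosixPath(path).name
        for suffix in FASTQ_SUFFIXES:
            if name.endswith(suffix):
                ids.append(name[: -len(suffix)])  # strip the FASTQ extension
                break                             # non-FASTQ files are skipped
    return sorted(ids)


print(sample_ids(["raw/SRR24443168.fastq.gz", "raw/SRR24443169.fastq.gz", "README.md"]))
```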

Directed Acyclic Graph

Reference

GATK best practices workflow: https://gatk./hc/en-us/sections/360007226651-Best-Practices-Workflows
GATK: https://software./gatk/
VEP: https://www./info/docs/tools/vep/index.html
fastp: https://github.com/OpenGene/fastp
BWA mem2: http://bio-bwa./
samblaster: https://github.com/GregoryFaust/samblaster
BaseRecalibrator: https://gatk./hc/en-us/articles/13832708374939-BaseRecalibrator
ApplyBQSR: https://github.com/GregoryFaust/samblaster
HaplotypeCaller: https://gatk./hc/en-us/articles/13832687299739-HaplotypeCaller
GenomicsDBImport: https://gatk./hc/en-us/articles/13832686645787-GenomicsDBImport
GenotypeGVCFs: https://gatk./hc/en-us/articles/13832766863259-GenotypeGVCFs
SelectVariants: https://gatk./hc/en-us/articles/13832694334235-SelectVariants
VariantRecalibrator: https://gatk./hc/en-us/articles/13832694334235-VariantRecalibrator
ApplyVQSR: https://gatk./hc/en-us/articles/13832694334235-ApplyVQSR
Picard: https://broadinstitute./picard
MultiQC: https://
