Piper - A pipeline engine built on top of GATK Queue
What is Piper?
Piper is a workflow engine for handling data from next-generation sequencing experiments. It’s built on top of the GATK Queue framework and was originally developed at the SNP&SEQ Technology Platform. It has been used at large scale to analyze human whole genome sequencing data generated at the National Genomics Infrastructure.
Piper builds on the concept of standardized workflows for different next-generation sequencing applications. At the moment Piper supports the following workflows:
- Whole genome variant calling: For human whole genome sequencing data. This goes through alignment, alignment quality control, data processing, variant calling, and variant filtration according to the best practices recommended by the Broad Institute, primarily using the GATK.
- Exome variant calling: For TruSeq and SureSelect human exome sequencing data. This uses essentially the same pipeline as the whole genome workflow, with the modifications suggested in the best practice document for exome studies.
- Haloplex variant calling: For Haloplex targeted sequencing data. This includes alignment, data processing, and variant calling.
- RNA read counting: Produces FPKMs for transcripts of an existing reference annotation, using TopHat for mapping and Cufflinks to compute the FPKMs.
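For readers unfamiliar with the unit reported by the RNA read counting workflow, FPKM stands for fragments per kilobase of transcript per million mapped fragments. The sketch below shows only the plain arithmetic definition. It is not how Piper or Cufflinks computes the value (Cufflinks estimates abundances with a statistical model), and the function and parameter names are hypothetical, chosen for illustration.

```python
def fpkm(fragments: int, transcript_length_bp: int, total_mapped_fragments: int) -> float:
    """Naive FPKM: fragments per kilobase of transcript per million mapped fragments.

    All names here are illustrative; this is the textbook normalization only,
    not the estimation procedure used by Cufflinks.
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("transcript length and total mapped fragments must be positive")
    # Dividing by (length / 1e3) and by (total / 1e6) is the same as multiplying by 1e9.
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


if __name__ == "__main__":
    # 500 fragments on a 2,000 bp transcript in a library of 10 million mapped fragments.
    print(fpkm(500, 2000, 10_000_000))
```

Normalizing by both transcript length and library size is what makes the values roughly comparable across transcripts of different lengths and across samples sequenced to different depths.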
Documentation
For full documentation and instructions on how to use Piper, see the README on GitHub.
Contributors
Licence
MIT
See the code for Piper here: https://github.com/NationalGenomicsInfrastructure/piper/