Cluster Flow - Automate and standardise bioinformatics analyses on cluster environments
What is Cluster Flow?
Cluster Flow is a command-line program that uses common cluster managers to run analysis pipelines. It currently supports GRIDEngine (SGE), LSF and SLURM, and can also run locally on any Unix system.
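To illustrate what targeting several cluster managers involves, here is a minimal Python sketch. It is not Cluster Flow's own code. The function name `build_submit_command`, the script name and the job IDs are hypothetical. It only shows how one pipeline step, held until earlier jobs finish, might be expressed for each scheduler using their standard submission commands (`qsub -hold_jid`, `sbatch --dependency`, `bsub -w`).

```python
from typing import List, Optional


def build_submit_command(
    manager: str, script: str, depends_on: Optional[List[str]] = None
) -> List[str]:
    """Build the argument list that submits `script` to a scheduler.

    Hypothetical helper for illustration only. `depends_on` holds the
    IDs of jobs that must finish before this one may start.
    """
    deps = depends_on or []

    if manager == "sge":
        # SGE: -hold_jid takes a comma-separated list of job IDs.
        cmd = ["qsub"]
        if deps:
            cmd += ["-hold_jid", ",".join(deps)]
        return cmd + [script]

    if manager == "slurm":
        # SLURM: afterok waits for the listed jobs to exit successfully.
        cmd = ["sbatch"]
        if deps:
            cmd.append("--dependency=afterok:" + ":".join(deps))
        return cmd + [script]

    if manager == "lsf":
        # LSF: -w takes a dependency expression such as done(123).
        cmd = ["bsub"]
        if deps:
            cmd += ["-w", " && ".join(f"done({d})" for d in deps)]
        return cmd + ["bash", script]

    if manager == "local":
        # No scheduler: the caller runs steps one after another.
        return ["bash", script]

    raise ValueError(f"Unsupported cluster manager: {manager}")


if __name__ == "__main__":
    for name in ("sge", "slurm", "lsf", "local"):
        print(name, build_submit_command(name, "align.sh", ["101", "102"]))
```

The sketch returns argument lists rather than running anything, so the same pipeline definition can be inspected or tested without a cluster being available.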
Benefits of using Cluster Flow:
- Routine analyses are very quick to run
- Pipelines use identical parameters, standardising analysis and making results more reproducible
- Integrated parallelisation tools help prevent your cluster becoming overloaded
- All commands and output are logged to files for future reference
- Intuitive commands and a comprehensive manual make Cluster Flow easy to use
- Very easy to get up and running (in theory at least!)
How Cluster Flow differs from other pipeline tools:
- Very lightweight and flexible
- Pipelines and configurations can easily be generated on a project-specific basis if required
- New modules and pipelines are super easy to write (see video tutorial)
Installation
Cluster Flow can be downloaded from http://clusterflow.io. The source code for Cluster Flow is hosted on GitHub: https://github.com/ewels/clusterflow/
Full installation instructions can be found in the documentation.
Documentation
You can read the full documentation at http://clusterflow.io
There are also three introductory videos:
- Introduction
- Usage
- Installation
Contributors
Written whilst working at the Babraham Institute, maintained at SciLifeLab
Licence
GPL v3