We have developed many tools for the compression, processing, and analysis of bioinformatics data.
Some of them were designed by our group in close collaboration with external researchers.
Below you can find a list of the tools with links to GitHub repositories.
You can also visit our organization page on GitHub.
Compression of sequencing data
- CoLoRd — compressor of 3rd gen (ONT, PacBio) sequencing data.
- DSRC — very fast compressor of 2nd gen (Illumina) sequencing data.
- FaStore — compressor of 2nd gen (Illumina) sequencing data that balances compression ratio and speed.
- FQSqueezer — compressor of 2nd gen (Illumina) sequencing data focused on the best compression ratio.
- ORCOM — experimental compressor of bases in 2nd gen (Illumina) sequencing data.
Compression of genome collections
- AGC — compressor of collections of complete genomes (sets of contigs) of the same species.
- GDC — compressor of collections of complete genomes (sets of chromosomes) of the same species.
Compression of genotype collections
- GTC — compressed data structure of collections of genotypes; supports various types of queries.
- GTShark — compressor of collections of genotypes.
- MuGI — index to a collection of genomes of the same species.
- TGC — experimental compressor of collections of genomes (sets of chromosomes) of the same species.
- VCFShark — compressor of VCF files.
Multiple sequence alignment of proteins
- CoMSA — compressor of collections of multiple sequence alignments (MSA) of proteins.
- FAMSA — very fast aligner of huge (1M+) protein families.
- QuickProbs — high-quality aligner of moderate-size (approx. 1k) protein families.
- CoMeta — classification of metagenomes in sequencing data.
- KMC — very fast k-mer counter.
- KMC tools — tools to operate on sets of k-mers.
- Kmer-db — compact data structure representing collections of k-mers in genomes.
- PHIST — tool to predict prokaryotic hosts for phage (meta)genomic sequences.
- RECKONER — corrector of errors in 2nd gen (Illumina) sequencing data.
- Whisper — robust mapper of 2nd gen (Illumina) sequencing data.
External projects to which we contributed