Discover our collection of 25 research tools and applications for bioinformatics.
A tool that finds regions of similarity between biological sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance.
COSG is a cosine similarity-based method for more accurate and scalable marker gene identification.
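The core idea of scoring marker genes by cosine similarity can be sketched as follows. This is a simplified illustration under stated assumptions, not COSG's actual implementation or API: each gene's expression vector across cells is compared with an idealized indicator vector that is 1 in the target group and 0 elsewhere. The function names and the toy data are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an all-zero vector has no direction
    return dot / (norm_u * norm_v)


def marker_scores(expression, labels, target):
    """Score each gene against an idealized marker for `target`.

    expression: dict mapping gene name -> list of per-cell expression values
    labels:     list of per-cell group labels, same order as the values
    """
    # Idealized marker: expressed only in cells of the target group.
    indicator = [1.0 if label == target else 0.0 for label in labels]
    return {gene: cosine_similarity(values, indicator)
            for gene, values in expression.items()}


# Hypothetical toy data: four cells, two groups.
toy_expression = {
    "geneA": [5, 4, 0, 0],  # specific to group "T"
    "geneB": [0, 0, 3, 2],  # specific to group "B"
    "geneC": [1, 1, 1, 1],  # expressed everywhere
}
toy_labels = ["T", "T", "B", "B"]
scores = marker_scores(toy_expression, toy_labels, "T")
best_gene = max(scores, key=scores.get)
```

A gene expressed uniformly in every cell scores lower than a group-specific gene, because its vector points away from the indicator vector.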
ABACAS is intended to rapidly contiguate (align, order, orientate), visualize and design primers to close gaps on shotgun-assembled contigs based on a reference sequence. It uses MUMmer to find alignment positions and identify syntenies of assembly contigs against the reference. The output is then processed to generate a pseudomolecule, taking overlapping contigs and gaps into account. MUMmer's alignment-generating programs, Nucmer and Promer, are used, followed by the 'delta-filter' utility function. Users can also run tblastx on contigs that are not used to generate the pseudomolecule.
Abismal is a mapper of FASTQ bisulfite-converted short reads (between 50 and 1000 bases) to a FASTA reference genome.
abPOA: an SIMD-based C library for fast partial order alignment using adaptive band. abPOA can perform multiple sequence alignment (MSA) on a set of input sequences and generate a consensus sequence by applying the heaviest bundling algorithm to the final alignment graph.
ACTC (Align subreads to CCS reads) is developed by Pacific Biosciences and provides a one-click solution for aligning individual subreads to the corresponding circular consensus (CCS) reads — useful in workflows involving HiFi/CCS read analysis from PacBio sequencing.
AdapterRemoval searches for and removes adapter sequences from High-Throughput Sequencing (HTS) data and (optionally) trims low-quality bases from the 3' end of reads following adapter removal. AdapterRemoval can analyze both single-end and paired-end data, and can be used to merge overlapping paired-end reads into (longer) consensus sequences. Additionally, AdapterRemoval can construct a consensus adapter sequence for paired-end reads if this information is not available.
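To illustrate what 3' adapter removal means in practice, here is a minimal sketch. It is not AdapterRemoval's algorithm: it assumes exact matching only, whereas real trimmers tolerate mismatches and use quality scores. The function name, the `min_overlap` parameter and the example sequences are hypothetical.

```python
def trim_adapter(read, adapter, min_overlap=3):
    """Remove an adapter from the 3' end of a read (exact matches only)."""
    # Case 1: the whole adapter occurs inside the read; drop it and
    # everything after it.
    index = read.find(adapter)
    if index != -1:
        return read[:index]
    # Case 2: the read ends with a prefix of the adapter; try the longest
    # possible overlap first and stop at min_overlap.
    longest = min(len(adapter) - 1, len(read))
    for overlap in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:overlap]):
            return read[:len(read) - overlap]
    # No adapter found: return the read unchanged.
    return read


# Illustrative adapter and reads, chosen only for this example.
example_adapter = "AGATCGGAAGAGC"
full_hit = trim_adapter("ACGTAGATCGGAAGAGCTTT", example_adapter)
partial_hit = trim_adapter("ACGTACGTAGATCGGAAG", example_adapter)
no_hit = trim_adapter("ACGTACGT", example_adapter)
```

The `min_overlap` threshold guards against trimming reads whose last one or two bases match the start of the adapter by chance.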
Automatic Filtering, Trimming, Error Removing and Quality Control for FASTQ data. AfterQC can simply go through all FASTQ files in a folder and then output three folders (good, bad and QC), which contain the good reads, bad reads and QC results of each FASTQ file/pair.
Another Gff Analysis Toolkit (AGAT): a suite of tools to handle gene annotations in any GTF/GFF format.
BAM Statistics, Feature Counting and Annotation
Alien_hunter is an application for the prediction of putative Horizontal Gene Transfer (HGT) events with the implementation of Interpolated Variable Order Motifs (IVOMs).
AlignStats produces various alignment, whole genome coverage, and capture coverage metrics for sequence alignment files in SAM, BAM, and CRAM format. This program is designed to serve reporting and quality control purposes in sequencing analysis pipelines at the Baylor College of Medicine Human Genome Sequencing Center (BCM-HGSC).
AMPtk is a series of scripts to process NGS amplicon data using USEARCH and VSEARCH. It can be used to process any NGS amplicon data and includes databases set up for analysis of fungal ITS, fungal LSU, bacterial 16S, and insect COI amplicons. It can handle Ion Torrent, MiSeq, and 454 data.
AnchorWave (Anchored Wavefront Alignment) identifies collinear regions via conserved anchors (full-length CDS and full-length exons are currently implemented) and breaks collinear regions into shorter fragments, i.e., anchor and inter-anchor intervals. By performing sensitive sequence alignment for each shorter interval via a 2-piece affine gap cost strategy and merging the results, AnchorWave generates a whole-genome alignment for each collinear block. AnchorWave implements commands to guide collinear block identification with or without chromosomal rearrangements and provides options to use known polyploidy levels or whole-genome duplications to inform alignment.
ANNOgesic is the Swiss army knife for RNA-Seq based annotation of bacterial/archaeal genomes. It is a modular, command-line tool that can integrate different types of RNA-Seq data based on dRNA-Seq (differential RNA-Seq) or RNA-Seq protocols that include transcript fragmentation to generate high-quality genome annotations. It can detect genes, CDSs/tRNAs/rRNAs, transcription start sites (TSS) and processing sites, transcripts, terminators, untranslated regions (UTR) as well as small RNAs (sRNA), small open reading frames (sORF), circular RNAs, CRISPR-related RNAs, riboswitches and RNA-thermometers. It can also predict RNA-RNA and protein-protein interactions.
Antimicrobial Resistance Identification By Assembly
Somatic copy number analysis using paired-end whole-genome sequencing (WGS)
ASGAL (Alternative Splicing Graph ALigner) is a tool for detecting the alternative splicing events expressed in an RNA-Seq sample with respect to a gene annotation. The main idea behind ASGAL is that alternative splicing events can be detected by aligning the RNA-Seq reads against the splicing graph of the gene.
Get assembly statistics from FASTA and FASTQ files.
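A statistic commonly reported for assemblies is N50: the length of the contig at which the running total of contig lengths, sorted from longest to shortest, first reaches half of the total assembly size. The sketch below shows one way to compute it from FASTA text. It is an illustration only and makes no claim about which statistics this tool reports or how it computes them; the function names are hypothetical.

```python
def read_fasta_lengths(lines):
    """Return the sequence length of each record in FASTA-formatted lines."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Length of the contig at which the cumulative sum reaches half the total."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0  # empty input


# Toy FASTA records and a toy list of contig lengths.
toy_lengths = read_fasta_lengths([">contig1", "ACGT", "AC", ">contig2", "GG"])
toy_n50 = n50([80, 70, 50, 40, 30, 20])
```

In the toy list the total is 290, so the half-way point of 145 is first reached by the second-longest contig (80 + 70 = 150).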
Atropos is a tool for specific, sensitive, and speedy trimming of NGS reads.
bcl2fastq is a Linux-based command-line tool from Illumina that converts raw base call (BCL) files from Illumina sequencers into FASTQ format, while simultaneously demultiplexing data based on sample indexes. It is crucial for analyzing sequencing data, requiring a sample sheet and producing FASTQ files, statistics, and reports.
Bioawk is an extension to Brian Kernighan's awk, adding support for several common biological data formats, including optionally gzip'ed BED, GFF, SAM, VCF, FASTA/Q and TAB-delimited formats with column names. It also adds a few built-in functions and a command-line option to use TAB as the input/output delimiter. When the new functionality is not used, bioawk is intended to behave exactly the same as the original BWK awk.
Tools for early stage NGS alignment file processing including fast sorting and duplicate marking.
Parse multiple Antimicrobial Resistance Analysis Reports into a common data structure
Sinto is a toolkit for processing aligned single-cell data.