Databases & Data Resources
Curated reference databases for bioinformatics research - genome assemblies, protein structures, pathway maps, cancer genomics, and population variation.
Genomics & Sequences
Sequence & Genome Databases
NCBI repository for nucleotide sequences, annotations, and literature links.
Genome-scale annotations, comparative genomics, variants, and regulatory tracks.
Interactive genome viewer - genes, regulation, conservation, and variant tracks.
Curated non-redundant sequences - the reference standard for genome annotation.
DNA Data Bank of Japan - global nucleotide data sharing via INSDC.
Broad reference files: GRCh38/hg19, known SNPs, BQSR and VQSR data.
Human SNP reference for population genetics and variant calling pipelines.
Protein & Structure
Protein & Structural Databases
🧬
Sequence & Function
Comprehensive protein sequence and functional data - curated Swiss-Prot combined with TrEMBL.
🏗️
3D Structure
3D structural data for proteins and nucleic acids - X-ray, NMR, and cryo-EM.
🔍
Domain Classification
Protein family and domain classification integrating Pfam, PRINTS, PROSITE, and more.
📚
Protein Families
Protein families via curated multiple sequence alignments and profile hidden Markov models.
🕸️
Interactions
Protein–protein interaction networks from experimental, co-expression, and text-mining evidence.
Functional & Pathway
Pathway & Functional Databases
01 Kyoto Encyclopedia of Genes and Genomes - curated pathway maps for metabolic, regulatory, and signalling processes.
02 Expert-curated pathways: signal transduction, metabolism, immune system, and disease mechanisms.
03 Organism-specific pathway and genome databases maintained by expert curators across thousands of species.
04 Open, community-curated pathway knowledge - freely downloadable in multiple analysis-ready formats.
Oncology & Cancer Genomics
Cancer Genomics Databases
Somatic Mutations
COSMIC
Catalogue of Somatic Mutations in Cancer - mutations, gene fusions, copy-number changes, and clinical correlations.
cancer.sanger.ac.uk →
Harmonised Data
GDC
NCI Genomic Data Commons - harmonised cancer genomic and clinical data from TCGA, TARGET, and major programmes.
gdc.cancer.gov →
International Consortium
ICGC
International Cancer Genome Consortium - genomic and clinical data spanning dozens of tumour types worldwide.
dcc.icgc.org →
Multi-Platform Atlas
TCGA
The Cancer Genome Atlas - landmark molecular profiling of tumours from more than 11,000 patients across 33 cancer types.
cancer.gov/tcga →
Microbial & Metagenomics
Microbial & Metagenomic Resources
Reference Databases
Analysis and annotation of metagenomic sequences from environmental and host-associated samples.
16S rRNA database for microbial taxonomy and phylogenetic classification.
Comprehensive ribosomal RNA database spanning all three domains of life.
DOE JGI platform for comparative microbial genomic and metagenomic analysis.
Analysis Tools
End-to-end microbiome analysis - denoising, diversity, classification, and visualisation.
Ultrafast k-mer taxonomic classification - low memory, high accuracy.
Marker-gene community profiling from metagenomic shotgun data.
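The k-mer classification idea mentioned among the tools above can be illustrated with a toy sketch. It is a deliberate simplification and not any listed tool's actual algorithm: there is no taxonomy tree, no lowest-common-ancestor step, and no memory-efficient index. The reference sequences, taxon labels, and function names are all hypothetical.

```python
from collections import Counter

def kmers(seq, k):
    """Return every overlapping substring of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def build_index(references, k):
    """Map each k-mer to the set of taxa whose reference contains it."""
    index = {}
    for taxon, seq in references.items():
        for kmer in set(kmers(seq, k)):
            index.setdefault(kmer, set()).add(taxon)
    return index

def classify(read, index, k):
    """Assign a read to the taxon with the most taxon-specific k-mer hits."""
    votes = Counter()
    for kmer in kmers(read, k):
        taxa = index.get(kmer, ())
        if len(taxa) == 1:  # ignore k-mers shared by several taxa
            votes[next(iter(taxa))] += 1
    return votes.most_common(1)[0][0] if votes else None

# Hypothetical toy references, far shorter than real genomes.
REFERENCES = {
    "taxon_A": "ACGTACGTGGCCTTAA",
    "taxon_B": "TTGACCATGGAACCTT",
}
index = build_index(REFERENCES, k=5)
print(classify("ACGTGGCC", index, k=5))
print(classify("GGGGGGGG", index, k=5))
```

A read with no k-mer in the index is left unclassified (`None`), which mirrors the general behaviour of exact-match classifiers on novel sequence.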
Epigenomics & Regulatory
Epigenomic & Regulatory Databases
Functional Genomics
Comprehensive functional genomics data for understanding cis-regulatory elements and gene regulation across human cell types.
Epigenetic Landscape
Epigenetic landscapes across 111 human cell types - histone modifications, methylation, and chromatin accessibility.
TF Binding Profiles
Open-access curated transcription factor binding profiles (PFMs) for motif scanning and enrichment analysis.
ChIP-seq Peaks
Unified ChIP-seq peaks for transcription factor binding sites, processed through a consistent, reproducible pipeline.
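As a rough illustration of how a position frequency matrix (PFM) is used for motif scanning, the sketch below converts counts to log-odds scores and slides the resulting matrix along a sequence. The matrix is made up for the example and is not taken from any database; the uniform background, the pseudocount value, and the function names are assumptions.

```python
import math

# Hypothetical 4-position count matrix (one row per base); illustrative only.
PFM = {
    "A": [8, 0, 0, 1],
    "C": [0, 9, 0, 1],
    "G": [1, 0, 10, 0],
    "T": [1, 1, 0, 8],
}

def pfm_to_pwm(pfm, pseudocount=0.5, background=0.25):
    """Convert counts to log2-odds scores against a uniform background."""
    width = len(next(iter(pfm.values())))
    pwm = {base: [] for base in pfm}
    for i in range(width):
        total = sum(pfm[b][i] for b in pfm) + pseudocount * len(pfm)
        for b in pfm:
            freq = (pfm[b][i] + pseudocount) / total
            pwm[b].append(math.log2(freq / background))
    return pwm

def scan(sequence, pwm):
    """Score every forward-strand window; sequence must contain only A/C/G/T."""
    width = len(next(iter(pwm.values())))
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[base][i] for i, base in enumerate(window))
        hits.append((start, score))
    return hits

pwm = pfm_to_pwm(PFM)
hits = scan("TTACGTAA", pwm)
best_offset, best_score = max(hits, key=lambda hit: hit[1])
print(best_offset, round(best_score, 2))
```

A real scan would normally also score the reverse complement and apply a score threshold; both are omitted here to keep the sketch short.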
Population Genetics
Variation & Population Genetics
Global Variation
Genetic variation catalog from thousands of individuals across 26 global populations - foundational allele frequency reference.
Exome & Genome
Aggregated genome and exome data from diverse populations - the standard for variant frequency lookup in clinical genetics.
Haplotype Blocks
Human haplotype blocks and SNP linkage disequilibrium patterns - key resource for GWAS study design.
Exome Aggregation
Exome Aggregation Consortium - global population exome data, predecessor to gnomAD, still widely cited.
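Linkage disequilibrium between two SNPs, as referred to in the haplotype entry above, is commonly summarised by D and r². The sketch below computes both from phased haplotypes. The input encoding (one 0/1 allele per site) and the function name are assumptions made for illustration, not a format used by any of the listed resources.

```python
def ld_stats(haplotypes):
    """Return (D, r^2) for two biallelic sites from phased haplotypes.

    Each haplotype is a pair of 0/1 alleles, one per site.
    """
    n = len(haplotypes)
    p_a = sum(h[0] for h in haplotypes) / n  # frequency of allele 1 at site A
    p_b = sum(h[1] for h in haplotypes) / n  # frequency of allele 1 at site B
    p_ab = sum(1 for h in haplotypes if h[0] == 1 and h[1] == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    return d, d * d / denom

# Alleles always travel together: complete LD.
linked = [(1, 1), (1, 1), (0, 0), (0, 0)]
# Every combination equally common: no LD.
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]

print(ld_stats(linked))
print(ld_stats(unlinked))
```

Population-scale tools estimate these quantities from genotype data and handle unphased samples, which this sketch does not attempt.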
General Resources
Miscellaneous Databases
Standardised vocabulary for functional annotation of genes and gene products across all species.
EMBL-EBI archive for functional genomics data from microarray and high-throughput sequencing experiments.
Metadata repository for biological samples referenced in EMBL-EBI databases.
Protein expression across human tissues, organs, brain regions, and cell types.