Free online books, handbooks, and guides for bioinformatics, R, Python, statistics, and reproducible data science - beginner to advanced.
One resource to bookmark above all others, plus the two foundational texts used in courses worldwide.
The largest curated collection of free R programming books anywhere - over 300 titles covering base R, tidyverse, machine learning, spatial analysis, bioinformatics, Shiny, and statistics.
The definitive introduction to data science with R - importing, tidying, transforming, visualising, and modelling using the tidyverse. Updated edition covers Quarto and Arrow. Widely used as a course text.
r4ds.hadley.nz →
Community-developed handbook for reproducible, ethical, and collaborative data science - version control, testing, CI/CD, project design, and open science practices. Essential for anyone doing research computing.
the-turing-way.netlify.app →
Free online books for R users at every level.
A fun, irreverent introduction to R - data structures, statistics, plotting, and tidyverse with a running pirate narrative.
Written by a neuroscience HDR (higher degree by research) student - installation, keyboard shortcuts, dplyr, and ggplot2. Practical from page one.
Practical recipes for ggplot2 and base R - basic charts through publication-ready figures. A reference you return to constantly.
Practical R for researchers - data manipulation, visualisation, and statistical analysis. No programming background needed.
Functional programming, metaprogramming, performance, and the internals of R. For users who want to truly understand the language.
The complete guide to Shiny - reactive programming, UI/server architecture, design patterns, and deployment.
Statistical data analysis with R for biologists - EDA, hypothesis testing, high-throughput sequencing, and machine learning via Bioconductor.
Free online books for learning Python for data science, biology, and research software engineering.
Comprehensive free book covering the core Python data science stack - IPython, NumPy, Pandas, Matplotlib, and scikit-learn from scratch through to machine learning pipelines. The standard reference for researchers moving into Python data analysis.
Practical Python for biologists - sequence analysis, file parsing, and data manipulation using biological examples throughout.
Building robust research software - testing, packaging, documentation, version control, and CI/CD. Written for researchers, not software engineers.
Opinionated best-practice guide - virtual environments, project structure, testing, packaging, and style. Essential for researchers writing Python beyond simple scripts.
Foundational texts for statistical thinking, modelling, and machine learning in a research context.
Neural networks and deep learning from first principles with worked Python examples - intuition-first, maths-respectful. The standard first text before TensorFlow or PyTorch.
Statistics in R - data visualisation, wrangling, linear regression, sampling, estimation, confidence intervals, and hypothesis testing. Tidyverse-native, freely available online.
R-based introduction with strong statistical fundamentals - probability, inference, regression, visualisation, and machine learning. Used in university courses worldwide.
Dependency management, containerisation, licensing, testing, and CI/CD pipelines for research software. Practical and opinionated - the kind of content that helps prevent replication failures.
Graduate-level econometrics covering probability, statistics, regression, time series, and forecasting - rigorous yet readable, with real-world applications throughout. Freely available direct from the author.
Free books and workbooks covering the full bioinformatics toolkit - command-line to RNA-seq and computational genomics.
Extensive workbook - command-line basics, HPC, BLAST, RNA-seq, genome assembly, variant discovery, metagenomics, ATAC-seq, and data visualisation.
Computational genomics in R - sequence analysis, ChIP-seq, BS-seq, single-cell RNA-seq, and machine learning for genomics via Bioconductor.
Sampling, distributions, p-values, t-tests, ANOVA, nonparametric tests, survival analysis, and RNA-seq with edgeR/limma.
The most approachable intro to Git and GitHub for R users - installation, RStudio integration, branching, pull requests, and common research workflows.
A practical guide to analysing CRISPR screens using edgeR in R - from count data processing through differential abundance testing and visualisation.