Before you run DESeq2, you need to understand what those numbers in your count matrix actually represent. This post explains what RNA-seq count data is, why raw counts are misleading, and exactly what format DESeq2 expects.
We’ve covered downloading data, normalization, and visualization. Now, we put it all together. This capstone post walks through a complete end-to-end analysis of a public breast cancer dataset (GSE183947) — from raw GEO download to identifying differentially expressed genes and creating a publication-ready volcano plot.
If R is a smartphone, Bioconductor is the App Store for biology. This post explains why bioinformatics has its own package repository, why it’s better than CRAN for scientists, and how to keep your biological analysis reproducible.
If you test 20,000 genes at once with a p < 0.05 cutoff, you can expect about 1,000 "significant" results by chance alone, even if no gene is truly changed. This post explains the Multiple Testing Problem and how to use the p.adjust() function in R to compute FDR-adjusted p-values (False Discovery Rate) and protect your results.
A table of log2 CPM values is hard to interpret on its own. A heatmap fixes that, but the defaults are ugly. Here is how to build a publication-ready gene expression heatmap in R with pheatmap, including clustering, color palettes, and sample annotations.