- Published on
How to make a gene expression heatmap in R (that actually looks good)
- Authors

- Name
- BioTech Bench
This is Arc 2, Part 9 of the R for Biologists series.
The table you can't read
Post 8 ended with a log2 CPM matrix — normalized, log-transformed, ready. You could see that DUSP1 jumped from ~15 to ~17 in dexamethasone, and FKBP5 from ~13 to ~16.6 in dexamethasone. But reading a table of numbers and extracting a pattern across 30 genes and 6 samples is not something human visual systems are good at.
A heatmap is the fix. Color replaces number, and the eye picks up patterns instantly — which genes go up together, which go down, which samples cluster with which. Every RNA-seq paper has at least one. This post shows you how to build one that actually looks publication-ready, not like a default output you're embarrassed to show your PI.
What you'll learn
- What a heatmap encodes: color = expression level, rows = genes, columns = samples
- What the dendrogram is actually showing: hierarchical clustering, distance, linkage
- How to build a minimal heatmap with
pheatmap - How to upgrade it: RdBu color palette, z-score row scaling, sample group annotations
- How to control clustering: distance metric, linkage method,
cutree_rowsfor visual gaps
Setup
Install pheatmap and RColorBrewer if you haven't already:
install.packages("pheatmap") # skip if already installed
install.packages("RColorBrewer") # skip if already installed
Load them:
library(pheatmap)
library(RColorBrewer)
The data: 30 dexamethasone-regulated genes
We're continuing with the dexamethasone experiment from Posts 7 and 8: human airway smooth muscle cells treated with dexamethasone (a glucocorticoid) versus untreated controls. In Post 8 we normalized 10 genes to log2 CPM. Here we expand to 30 — 15 known dex-inducible genes and 15 dex-repressed genes — to get a heatmap that looks like a real result.
The values are log2 CPM, consistent with what Post 8 produced for the genes we share (DUSP1, FKBP5, CRISPLD2, IL6):
mat <- matrix(c(
14.76, 14.83, 14.64, 17.22, 17.28, 17.33, # DUSP1
12.81, 12.83, 12.82, 16.57, 16.66, 16.69, # FKBP5
11.87, 11.90, 11.75, 15.29, 15.31, 15.43, # CRISPLD2
11.23, 11.45, 11.08, 15.87, 16.12, 15.94, # TSC22D3
10.54, 10.72, 10.41, 15.23, 15.08, 15.37, # ZBTB16
11.02, 11.34, 10.89, 16.14, 16.41, 16.22, # PER1
12.14, 12.38, 12.05, 16.83, 17.09, 16.91, # KLF9
11.56, 11.23, 11.74, 15.67, 15.34, 15.82, # DDIT4
10.87, 11.13, 10.69, 15.04, 15.31, 15.12, # CDKN1C
11.34, 11.08, 11.52, 15.96, 15.68, 16.11, # ANGPTL4
10.92, 11.17, 10.78, 15.71, 15.96, 15.82, # PDK4
11.73, 11.45, 11.89, 16.38, 16.09, 16.54, # RGS2
12.37, 12.64, 12.22, 16.94, 17.21, 17.08, # SGK1
11.12, 10.89, 11.28, 15.78, 15.52, 15.93, # GADD45B
10.65, 10.41, 10.82, 15.34, 15.07, 15.49, # TNFRSF9
13.37, 13.39, 13.28, 11.01, 10.93, 11.44, # IL6
14.12, 14.35, 13.98, 9.23, 8.96, 9.37, # IL1B
13.56, 13.29, 13.72, 8.67, 8.41, 8.83, # CXCL8
14.34, 14.58, 14.19, 9.45, 9.19, 9.61, # CCL2
13.89, 13.63, 14.05, 8.90, 8.64, 9.06, # TNF
13.23, 13.47, 13.09, 8.34, 8.58, 8.45, # PTGS2
14.01, 13.75, 14.17, 9.12, 8.86, 9.28, # MMP3
13.67, 13.91, 13.53, 8.78, 9.02, 8.89, # CCL11
13.90, 13.64, 14.06, 9.01, 8.75, 9.17, # CXCL5
13.45, 13.69, 13.31, 8.56, 8.80, 8.67, # ICAM1
13.78, 13.52, 13.94, 8.89, 8.63, 9.05, # VCAM1
13.34, 13.58, 13.20, 8.45, 8.69, 8.56, # CCL7
14.23, 13.97, 14.39, 9.34, 9.08, 9.50, # CXCL1
13.89, 14.13, 13.75, 8.67, 8.91, 8.78, # CXCL2
13.45, 13.21, 13.60, 8.23, 7.98, 8.37 # MKI67
), nrow = 30, byrow = TRUE)
rownames(mat) <- c(
"DUSP1", "FKBP5", "CRISPLD2", "TSC22D3", "ZBTB16", "PER1",
"KLF9", "DDIT4", "CDKN1C", "ANGPTL4", "PDK4", "RGS2",
"SGK1", "GADD45B", "TNFRSF9",
"IL6", "IL1B", "CXCL8", "CCL2", "TNF", "PTGS2",
"MMP3", "CCL11", "CXCL5", "ICAM1", "VCAM1",
"CCL7", "CXCL1", "CXCL2", "MKI67"
)
colnames(mat) <- c("Ctrl_1", "Ctrl_2", "Ctrl_3", "Dex_1", "Dex_2", "Dex_3")
Check the dimensions:
dim(mat)
[1] 30 6
Take a look at the first few rows:
head(mat, 3)
Ctrl_1 Ctrl_2 Ctrl_3 Dex_1 Dex_2 Dex_3
DUSP1 14.76 14.83 14.64 17.22 17.28 17.33
FKBP5 12.81 12.83 12.82 16.57 16.66 16.69
CRISPLD2 11.87 11.90 11.75 15.29 15.31 15.43
The top 15 rows are dex-induced genes; the bottom 15 are dex-repressed. You can't really see the pattern in the table — the numbers overlap too much. That's exactly what the heatmap will fix.
Stage 1 — The minimal heatmap
One function, one line:
pheatmap(mat)
# Plot appears in the Plots pane
What you get: a heatmap with dendrograms on both axes and gene names on the right. The clustering is already working — dex samples are grouping together and ctrl samples are grouping together.
But three things are wrong:
- The color scale maps to absolute log2 CPM values. Genes with high absolute expression dominate the colors, hiding the treatment effect between conditions.
- No group labels. You can see Ctrl and Dex in the column names, but there's no visual separator.
- The default color scale doesn't communicate biology cleanly. Red for up and blue for down is the convention in expression data — but only after row-scaling, which we haven't done yet.
What the dendrogram means
Before fixing the aesthetics, it's worth understanding what that tree on the left is actually showing.
Hierarchical clustering treats each gene's expression profile as a point in multi-dimensional space — in our case, six-dimensional (one dimension per sample). It computes how far apart each pair of genes is, merges the closest two into a cluster, then treats that cluster as a new point and repeats until everything is merged into one tree. The branching structure it builds is the dendrogram. Genes that join at a low branch are similar to each other; genes that only merge near the top are very different.
The distance measure controls what "far apart" means. The default in pheatmap is Euclidean distance — the straight-line distance in 6D space. This is usually adequate, but it means genes with different absolute levels can look "similar" if their profiles happen to be numerically close. We'll switch to correlation-based distance in Stage 3, which measures similarity of shape (does the gene go up and down at the same times?) rather than closeness in absolute value.
The linkage method controls how distances between clusters are computed after the first merge. The default is complete linkage: the distance between two clusters is the maximum pairwise distance between any member of one and any member of the other. Ward.D2 (which we'll use in Stage 3) minimizes total within-cluster variance at each merge step — for expression data, it usually gives tighter, more interpretable groups with cleaner visual separation.
Stage 2 — The publication upgrade
Three changes: colors, scaling, and annotation.
annotation_col <- data.frame(
Group = c("Ctrl", "Ctrl", "Ctrl", "Dex", "Dex", "Dex")
)
rownames(annotation_col) <- colnames(mat)
pheatmap(mat,
color = colorRampPalette(rev(brewer.pal(9, "RdBu")))(100),
scale = "row",
annotation_col = annotation_col,
border_color = NA,
fontsize_row = 8,
fontsize_col = 10
)
# Plot appears in the Plots pane
The heatmap now shows two clear bands: the top 15 genes (dex-induced) are blue in control samples and red in dex samples; the bottom 15 (dex-repressed) are red in controls and blue in dex. A teal/salmon annotation bar above the columns labels the two conditions. The dendrograms are separating gene groups and sample conditions cleanly.
What each argument does:
color = colorRampPalette(rev(brewer.pal(9, "RdBu")))(100)— creates a 100-step palette running from blue (low) through white (zero) to red (high).rev()flips the RdBu palette so blue = low and red = high, matching the convention in most expression papers.scale = "row"— z-scores each row: subtracts the gene's mean across all samples and divides by its SD. After scaling, every gene has mean 0 and SD 1. The color now reflects whether expression is above or below that gene's own average — which is what you actually want to see.annotation_col— a data frame with one row per sample (rownames must match the matrix column names) and one or more columns. Each column becomes a colored annotation bar above the heatmap.border_color = NA— removes the grid lines between cells. Optional, but cleaner.fontsize_row = 8,fontsize_col = 10— adjust font sizes so gene names don't overlap at 30 rows.
Stage 3 — Clustering tuning
Two arguments change the clustering behavior; one adds a visual gap line:
pheatmap(mat,
color = colorRampPalette(rev(brewer.pal(9, "RdBu")))(100),
scale = "row",
annotation_col = annotation_col,
border_color = NA,
fontsize_row = 8,
fontsize_col = 10,
clustering_distance_rows = "correlation",
clustering_method = "ward.D2",
cutree_rows = 2
)
# Plot appears in the Plots pane
The heatmap now shows a visible gap line separating the 15 dex-induced genes (top cluster, blue in Ctrl / red in Dex) from the 15 dex-repressed genes (bottom cluster, red in Ctrl / blue in Dex). The clusters are tighter and the dendrogram branches more cleanly than in Stage 2.
What changed:
clustering_distance_rows = "correlation"— switches to Pearson correlation-based distance for the gene dendrogram. Two genes with the same directional response in dex will cluster together even if their absolute levels differ. This is almost always a better choice than Euclidean for expression data.clustering_method = "ward.D2"— switches to Ward.D2 linkage, which minimizes within-cluster variance at each merge step. It produces more compact, visually distinct clusters than the default complete linkage.cutree_rows = 2— draws a gap line through the row dendrogram, cutting it into 2 clusters. The number matches the biology: one up-regulated cluster, one down-regulated cluster.
We leave clustering_distance_cols at its default (Euclidean) — with only 6 samples, the choice of distance method matters much less for the column dendrogram.
The full script
Everything together, copy-paste ready:
library(pheatmap)
library(RColorBrewer)
mat <- matrix(c(
14.76, 14.83, 14.64, 17.22, 17.28, 17.33, # DUSP1
12.81, 12.83, 12.82, 16.57, 16.66, 16.69, # FKBP5
11.87, 11.90, 11.75, 15.29, 15.31, 15.43, # CRISPLD2
11.23, 11.45, 11.08, 15.87, 16.12, 15.94, # TSC22D3
10.54, 10.72, 10.41, 15.23, 15.08, 15.37, # ZBTB16
11.02, 11.34, 10.89, 16.14, 16.41, 16.22, # PER1
12.14, 12.38, 12.05, 16.83, 17.09, 16.91, # KLF9
11.56, 11.23, 11.74, 15.67, 15.34, 15.82, # DDIT4
10.87, 11.13, 10.69, 15.04, 15.31, 15.12, # CDKN1C
11.34, 11.08, 11.52, 15.96, 15.68, 16.11, # ANGPTL4
10.92, 11.17, 10.78, 15.71, 15.96, 15.82, # PDK4
11.73, 11.45, 11.89, 16.38, 16.09, 16.54, # RGS2
12.37, 12.64, 12.22, 16.94, 17.21, 17.08, # SGK1
11.12, 10.89, 11.28, 15.78, 15.52, 15.93, # GADD45B
10.65, 10.41, 10.82, 15.34, 15.07, 15.49, # TNFRSF9
13.37, 13.39, 13.28, 11.01, 10.93, 11.44, # IL6
14.12, 14.35, 13.98, 9.23, 8.96, 9.37, # IL1B
13.56, 13.29, 13.72, 8.67, 8.41, 8.83, # CXCL8
14.34, 14.58, 14.19, 9.45, 9.19, 9.61, # CCL2
13.89, 13.63, 14.05, 8.90, 8.64, 9.06, # TNF
13.23, 13.47, 13.09, 8.34, 8.58, 8.45, # PTGS2
14.01, 13.75, 14.17, 9.12, 8.86, 9.28, # MMP3
13.67, 13.91, 13.53, 8.78, 9.02, 8.89, # CCL11
13.90, 13.64, 14.06, 9.01, 8.75, 9.17, # CXCL5
13.45, 13.69, 13.31, 8.56, 8.80, 8.67, # ICAM1
13.78, 13.52, 13.94, 8.89, 8.63, 9.05, # VCAM1
13.34, 13.58, 13.20, 8.45, 8.69, 8.56, # CCL7
14.23, 13.97, 14.39, 9.34, 9.08, 9.50, # CXCL1
13.89, 14.13, 13.75, 8.67, 8.91, 8.78, # CXCL2
13.45, 13.21, 13.60, 8.23, 7.98, 8.37 # MKI67
), nrow = 30, byrow = TRUE)
rownames(mat) <- c(
"DUSP1", "FKBP5", "CRISPLD2", "TSC22D3", "ZBTB16", "PER1",
"KLF9", "DDIT4", "CDKN1C", "ANGPTL4", "PDK4", "RGS2",
"SGK1", "GADD45B", "TNFRSF9",
"IL6", "IL1B", "CXCL8", "CCL2", "TNF", "PTGS2",
"MMP3", "CCL11", "CXCL5", "ICAM1", "VCAM1",
"CCL7", "CXCL1", "CXCL2", "MKI67"
)
colnames(mat) <- c("Ctrl_1", "Ctrl_2", "Ctrl_3", "Dex_1", "Dex_2", "Dex_3")
annotation_col <- data.frame(
Group = c("Ctrl", "Ctrl", "Ctrl", "Dex", "Dex", "Dex")
)
rownames(annotation_col) <- colnames(mat)
pheatmap(mat,
color = colorRampPalette(rev(brewer.pal(9, "RdBu")))(100),
scale = "row",
annotation_col = annotation_col,
border_color = NA,
fontsize_row = 8,
fontsize_col = 10,
clustering_distance_rows = "correlation",
clustering_method = "ward.D2",
cutree_rows = 2
)
# Plot appears in the Plots pane
What's next
In the next post, we'll revisit p-values — specifically what happens when you test 30 genes at once. Running 30 t-tests and reporting the raw p-values is not valid: at p < 0.05, you'd expect 1–2 false positives by chance alone. Post 10 covers multiple testing correction: what p.adjust() is actually doing, when to use FDR versus Bonferroni, and how to apply both in R.
What dataset did you use your first heatmap on? Drop a comment below.
Resources
| Resource | Link | Notes |
|---|---|---|
| pheatmap package | cran.r-project.org/package=pheatmap | Kolde R (2019). pheatmap: Pretty Heatmaps. R package version 1.0.12 |
| RColorBrewer | cran.r-project.org/package=RColorBrewer | Neuwirth E (2022). Color palettes including RdBu |
| pheatmap GitHub | github.com/raivokolde/pheatmap | Source code and full argument reference |