Conda for biologists: managing bioinformatics environments without losing your mind
Authors: BioTech Bench
Dependency hell is real
Let me guess. You tried to install a bioinformatics tool — maybe samtools, maybe fastqc, maybe a Python package for single-cell analysis. The installation instructions said "requires Python 3.9." You had Python 3.12. You installed 3.9. Now your other tool that needed 3.12 is broken. You created a virtual environment, but then a C library was missing. You installed the C library, but it conflicted with your system's version. You gave up and asked the bioinformatics core to run it for you.
This is called dependency hell, and it is the single biggest barrier between bench biologists and computational tools. It is not your fault. It is a genuinely hard problem — different tools need different versions of Python, R, C libraries, and system dependencies, and installing one can break another.
Conda is the solution. It is a package manager and environment manager that creates isolated "bubbles" — each with its own Python version, its own R version, its own libraries — so that installing one tool does not affect anything else on your system. Think of it as a set of sandboxed mini-computers inside your computer, each perfectly configured for one task.
What you'll learn
- What Conda is and how it works
- How to install Conda (Miniconda)
- How to create and manage isolated environments
- How to install bioinformatics tools from the bioconda channel
- How to share your environment with collaborators (reproducibility)
- How to avoid the most common Conda pitfalls
Step 1: Install Conda (Miniconda)
You don't need the full Anaconda distribution — it is 3 GB and comes with hundreds of packages you will never use. Instead, install Miniconda, which is just Conda itself plus Python. It is about 400 MB.
On Linux:
# Download the installer
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
# Run it
bash Miniconda3-latest-Linux-x86_64.sh
# Follow the prompts. When it asks to initialize Conda, say yes.
On macOS (Apple Silicon):
# macOS does not ship with wget, so download the installer with curl
curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-arm64.sh
bash Miniconda3-latest-MacOSX-arm64.sh
After installation, restart your terminal. Verify it worked:
conda --version
conda 25.11.1
If you see a version number, you're good. If you see "command not found," Conda is not in your PATH. Run source ~/.bashrc (or source ~/.zshrc on macOS) and try again.
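If `conda` is still not found after that, it can help to check whether the installer actually wrote its setup block into your shell startup file. The sketch below assumes the marker comment `>>> conda initialize >>>` that `conda init` places around that block; the function name `has_conda_init` is made up for this example.

```shell
# Hypothetical helper: succeed if the given shell startup file contains
# the block that `conda init` writes (marked ">>> conda initialize >>>").
has_conda_init() {
  grep -q '>>> conda initialize >>>' "$1" 2>/dev/null
}

# Check the two startup files mentioned above, if they exist.
for rc in "$HOME/.bashrc" "$HOME/.zshrc"; do
  if has_conda_init "$rc"; then
    echo "conda init block found in $rc"
  fi
done
```

If neither file is reported, the "initialize Conda" step was probably skipped during installation. Running `~/miniconda3/bin/conda init` (assuming the default install location) and restarting the terminal usually fixes it.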
Step 2: Set up your channels
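Conda installs packages from channels, which are repositories of pre-built software. For bioinformatics you need three channels (conda-forge, bioconda, and the built-in defaults), and the order matters. Each --add puts its channel at the top of the list, so add the highest-priority channel last:
conda config --add channels bioconda
conda config --add channels conda-forge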
Verify your configuration:
conda config --show channels
channels:
- conda-forge
- bioconda
- defaults
Why this order matters: Conda resolves dependencies across channels in the order listed. conda-forge should be first because it has the most up-to-date general packages. bioconda should be second; it contains bioinformatics-specific tools. defaults (the Anaconda repository) is the fallback. If you put bioconda first, Conda might pull a broken dependency from an older bioconda build instead of getting the maintained version from conda-forge.
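Because the order is easy to get backwards, a quick automated check can be reassuring. This is a minimal sketch that reads the output of `conda config --show channels` and tests whether conda-forge is listed above bioconda; `channels_in_recommended_order` is a hypothetical name, not a Conda command.

```shell
# Hypothetical helper: reads `conda config --show channels` output on
# stdin and succeeds only if conda-forge is listed above bioconda.
channels_in_recommended_order() {
  awk '
    $1 == "-" && $2 == "conda-forge" { forge = NR }
    $1 == "-" && $2 == "bioconda"    { bio = NR }
    END { exit !(forge && bio && forge < bio) }
  '
}

# Only query Conda if it is actually installed.
if command -v conda >/dev/null 2>&1; then
  if conda config --show channels | channels_in_recommended_order; then
    echo "channel order looks right"
  else
    echo "conda-forge is not above bioconda; check your channel order"
  fi
fi
```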
Step 3: Create your first environment
Let's create an environment for a hypothetical RNA-seq project. We'll call it rna-seq and specify Python 3.11:
conda create -n rna-seq python=3.11 -y
Executing transaction: done
#
# To activate this environment, use
#
# $ conda activate rna-seq
#
# To deactivate an active environment, use
#
# $ conda deactivate
The -n flag names the environment. The -y flag says "yes to everything" (so you don't have to confirm). Conda downloaded Python 3.11 and a minimal set of dependencies into a new isolated directory.
Activate and verify
conda activate rna-seq
python --version
Python 3.11.15
When the environment is active, its name appears in your terminal prompt. Any package you install now goes into this environment only — it does not touch your system Python or any other environment.
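If you want more proof than the prompt, you can check where `python` actually resolves. The sketch below assumes the `CONDA_PREFIX` variable, which `conda activate` sets to the active environment's directory; the function name `python_from_active_env` is invented for this example.

```shell
# Hypothetical helper: succeed if the `python` found on PATH lives
# inside the currently active Conda environment.
python_from_active_env() {
  # No active environment means CONDA_PREFIX is unset or empty.
  [ -n "${CONDA_PREFIX:-}" ] || return 1
  case "$(command -v python)" in
    "$CONDA_PREFIX"/*) return 0 ;;
    *) return 1 ;;
  esac
}

if python_from_active_env; then
  echo "python comes from $CONDA_PREFIX"
else
  echo "python is not coming from an active Conda environment"
fi
```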
To see all your environments:
conda env list
# conda environments:
#
# * -> active
# + -> frozen
btest /home/redhat/.conda/envs/btest
/home/redhat/miniconda3
/home/redhat/miniconda3/envs/CRISPRitz
/home/redhat/miniconda3/envs/crispor
/home/redhat/miniconda3/envs/crispresso2_env
/home/redhat/miniconda3/envs/fastqc
/home/redhat/miniconda3/envs/paperqa
base /usr
The * shows which environment is currently active.
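In a setup script it is handy to know whether an environment already exists before trying to create it. A minimal sketch, assuming the environment name is the first column of `conda env list` (as in the named rows above); `env_listed` is a hypothetical helper name.

```shell
# Hypothetical helper: reads `conda env list` output on stdin and
# succeeds if an environment with the given name is listed.
env_listed() {
  awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Only query Conda if it is actually installed.
if command -v conda >/dev/null 2>&1; then
  if conda env list | env_listed rna-seq; then
    echo "rna-seq already exists"
  else
    echo "rna-seq not found; create it with: conda create -n rna-seq python=3.11 -y"
  fi
fi
```

Rows that show only a path (environments without a name) are ignored by this check, because their first column is the path itself.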
Step 4: Install bioinformatics tools
This is where Conda shines. The bioconda channel has thousands of pre-compiled bioinformatics tools — samtools, bedtools, fastqc, bwa, bowtie2, STAR, DESeq2, cutadapt, trimmomatic, and hundreds more. You don't need to compile anything. Conda downloads the pre-built binary and installs it.
Install samtools
conda install -n rna-seq -c bioconda samtools -y
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
That's it. Samtools is installed — no compiling, no missing headers, no ./configure && make && make install. Verify it:
samtools --version
samtools 1.3.1
Using htslib 1.3.1
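Copyright (C) 2016 Genome Research Ltd.
Note that the output above shows samtools 1.3.1, a release from 2016. If Conda resolves to an old version like this, request a newer one explicitly, as described under "Search for available packages" below.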
Install FastQC
conda install -n rna-seq -c bioconda fastqc -y
FastQC is a Java application that normally requires you to install Java separately, download the FastQC zip, unzip it, and make the script executable. With Conda, Java is installed as a dependency automatically:
fastqc --version
FastQC v0.12.1
Search for available packages
Not sure if a tool is on bioconda? Search for it:
conda search -c bioconda samtools
# Name Version Build Channel
samtools 0.1.12 0 bioconda
samtools 0.1.12 1 bioconda
samtools 0.1.13 0 bioconda
samtools 0.1.14 0 bioconda
samtools 0.1.18 h20b1175_12 bioconda
samtools 1.3.1 h8ea3c3a_10 bioconda
samtools 1.17 h00cdaf9_0 bioconda
samtools 1.19 h5041a36_0 bioconda
samtools 1.20 h6e868fa_0 bioconda
samtools 1.21 f5299c06_0 bioconda
This shows every available version. To install a specific version:
conda install -n rna-seq -c bioconda samtools=1.21 -y
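Long search listings are easier to use if you pull out just the newest version. The sketch below filters `conda search` output for one package and sorts the version column; it assumes a `sort` that supports `-V` (version sort, as in GNU coreutils), and `latest_version` is a made-up helper name. Conda's own version ordering can differ from `sort -V` in unusual cases, so treat the result as a hint.

```shell
# Hypothetical helper: reads `conda search` output on stdin and prints
# the highest version listed for the named package.
latest_version() {
  awk -v pkg="$1" '$1 == pkg { print $2 }' | sort -V | tail -n 1
}

# Demo on a few rows copied from the listing above.
latest_version samtools <<'EOF'
# Name Version Build Channel
samtools 0.1.18 h20b1175_12 bioconda
samtools 1.3.1 h8ea3c3a_10 bioconda
samtools 1.17 h00cdaf9_0 bioconda
samtools 1.21 f5299c06_0 bioconda
EOF
```

With a live connection you would pipe the real listing in instead: `conda search -c bioconda samtools | latest_version samtools`.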
The one-liner environment
You can create an environment and install everything at once:
conda create -n rna-seq -c conda-forge -c bioconda \
  python=3.11 samtools fastqc cutadapt trimmomatic -y
This creates the environment, installs Python 3.11, and adds four bioinformatics tools — all in one command, all mutually compatible.
Step 5: Managing environments
List installed packages
conda list -n rna-seq
# packages in environment at /home/redhat/.conda/envs/rna-seq:
#
# Name Version Build Channel
_openmp_mutex 4.5 20_gnu conda-forge
bzip2 1.0.8 hda65f42_9 conda-forge
ca-certificates 2026.6.17 hbd8a1cb_0 conda-forge
htslib 1.21 h9753388_0 bioconda
libgcc 15.2.0 he0feb66_19 conda-forge
ncurses 6.5 h5b29e6c_0 conda-forge
python 3.11.15 hf115687_0_cpython conda-forge
samtools 1.21 f5299c06_0 bioconda
Each row shows the package name, version, build hash, and which channel it came from. This is your complete software manifest — useful for troubleshooting and reproducibility.
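When you write a methods section, you usually only want the bioinformatics tools, not every support library. This sketch filters `conda list` output by the channel column and prints `name=version` pairs; it assumes the four-column layout shown above, and `packages_from_channel` is a hypothetical name.

```shell
# Hypothetical helper: reads `conda list` output on stdin and prints
# name=version for every package that came from the given channel.
packages_from_channel() {
  awk -v chan="$1" '$1 !~ /^#/ && $4 == chan { print $1 "=" $2 }'
}

# Demo on rows copied from the listing above.
packages_from_channel bioconda <<'EOF'
# Name Version Build Channel
bzip2 1.0.8 hda65f42_9 conda-forge
htslib 1.21 h9753388_0 bioconda
python 3.11.15 hf115687_0_cpython conda-forge
samtools 1.21 f5299c06_0 bioconda
EOF
```

Against a real environment the call would be `conda list -n rna-seq | packages_from_channel bioconda`.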
Remove a package
conda remove -n rna-seq samtools -y
Delete an entire environment
conda env remove -n rna-seq -y
Deactivate
conda deactivate
You are now back in the base environment. Your rna-seq environment still exists — it is just not active.
Step 6: Reproducibility with environment.yml
Here is the scenario: you ran an analysis six months ago. Your paper is in review. The reviewer asks for your code. You send your scripts. They try to run them — and everything breaks, because they have different package versions.
Conda solves this with the environment.yml file. Export your environment:
conda env export -n rna-seq > environment.yml
The file looks like this:
name: rna-seq
channels:
- conda-forge
- bioconda
- defaults
dependencies:
- _openmp_mutex=4.5=20_gnu
- bzip2=1.0.8=hda65f42_9
- ca-certificates=2026.6.17=hbd8a1cb_0
- htslib=1.21=h9753388_0
- libgcc=15.2.0=he0feb66_19
- ncurses=6.5=h5b29e6c_0
- python=3.11.15=hf115687_0_cpython
- samtools=1.21=f5299c06_0
Now anyone — your collaborator, a reviewer, your future self — can recreate your exact environment with one command:
conda env create -f environment.yml
This is the gold standard for reproducible bioinformatics. Include your environment.yml alongside your code in a GitHub repository, and anyone can reproduce your computational environment exactly.
Tip: For a cleaner file that is less likely to break across platforms (Linux vs. macOS), use --no-builds:
conda env export -n rna-seq --no-builds > environment.yml
This removes the build hashes, which are platform-specific. The resulting file is more portable, at a small risk of getting a slightly different build.
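To see concretely what changes, here is a rough imitation of the effect on an existing export: it drops the third `=build` field from each dependency line and leaves everything else alone. This is an illustrative sketch, not a replacement for the flag, and `strip_builds` is a made-up name.

```shell
# Hypothetical helper: reads an environment.yml on stdin and removes
# the build string from lines of the form "- name=version=build".
strip_builds() {
  sed -E 's/^([[:space:]]*-[[:space:]]+[^=[:space:]]+=[^=[:space:]]+)=[^=[:space:]]+$/\1/'
}

# Demo on lines copied from the export above.
strip_builds <<'EOF'
name: rna-seq
dependencies:
  - python=3.11.15=hf115687_0_cpython
  - samtools=1.21=f5299c06_0
EOF
```

If you only want the packages you explicitly asked for, rather than every dependency, `conda env export --from-history` is another option worth trying.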
Step 7: A real workflow — CRISPR analysis environment
Let's put it all together. Say you want to set up a complete CRISPR off-target analysis environment. You need Python, samtools (for working with alignment files), and CRISPResso2 (for quantifying editing efficiency). Here's how:
# Create the environment
conda create -n crispr-analysis -c conda-forge -c bioconda \
  python=3.11 samtools=1.21 crispresso2 -y
# Activate it
conda activate crispr-analysis
# Verify everything is installed
python --version
samtools --version | head -1
CRISPResso --version
# Export for reproducibility
conda env export -n crispr-analysis > environment.yml
One environment, three tools, zero conflicts. When you are done with CRISPR analysis, deactivate it and your system is clean.
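Alongside environment.yml, some people like to keep a short, human-readable record of the tool versions they actually ran. A minimal sketch, assuming each tool answers `--version` and that the first line of its output is the interesting one; `log_versions` and the file name `tool_versions.tsv` are hypothetical.

```shell
# Hypothetical helper: for each tool name given, print the tool and the
# first line of its --version output, separated by a tab.
log_versions() {
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      printf '%s\t%s\n' "$tool" "$("$tool" --version 2>&1 | head -n 1)"
    else
      printf '%s\tnot found\n' "$tool"
    fi
  done
}

# With the environment active, you might save the record like this:
# log_versions python samtools CRISPResso > tool_versions.tsv
log_versions python samtools CRISPResso
```

Tools that are missing are reported as "not found" instead of stopping the script, which makes it easy to spot an environment that was not activated.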
Common tools on bioconda
Here are some of the most popular bioinformatics tools available on the bioconda channel:
| Tool | What it does | Install command |
|---|---|---|
| samtools | BAM/SAM file manipulation | conda install -c bioconda samtools |
| bedtools | Genome interval operations | conda install -c bioconda bedtools |
| bwa | Read alignment to reference genome | conda install -c bioconda bwa |
| bowtie2 | Read alignment (RNA-seq, ChIP-seq) | conda install -c bioconda bowtie2 |
| star | Spliced alignment for RNA-seq | conda install -c bioconda star |
| fastqc | Quality control for sequencing reads | conda install -c bioconda fastqc |
| cutadapt | Adapter trimming | conda install -c bioconda cutadapt |
| trimmomatic | Read trimming and filtering | conda install -c bioconda trimmomatic |
| bcftools | Variant calling and VCF manipulation | conda install -c bioconda bcftools |
| vcftools | VCF analysis and filtering | conda install -c bioconda vcftools |
| multiqc | Aggregate QC reports | conda install -c bioconda multiqc |
| seqkit | FASTA/FASTQ manipulation | conda install -c bioconda seqkit |
| crispresso2 | CRISPR editing analysis | conda install -c bioconda crispresso2 |
| blast | Sequence similarity search | conda install -c bioconda blast |
| prodigal | Gene prediction in prokaryotes | conda install -c bioconda prodigal |
| prokka | Genome annotation | conda install -c bioconda prokka |
| kallisto | Pseudoalignment for RNA-seq quantification | conda install -c bioconda kallisto |
| salmon | Transcript quantification | conda install -c bioconda salmon |
Browse the full catalog at bioconda.github.io.
The real talk
Conda can be slow. The dependency solver, especially the classic solver used by older Conda releases, can take minutes to resolve complex environments. If you are on an older release and find yourself waiting, install the libmamba solver:
conda install -n base conda-libmamba-solver
conda config --set solver libmamba
libmamba is dramatically faster: a solve that used to take minutes can finish in seconds. Conda 23.10 and later use it by default, so on a recent install you already have it.
Environments take disk space. Each environment carries its own Python plus all its packages, and a typical one is 1-5 GB. If you create 20 environments, that's 20-100 GB. Run conda clean --all periodically to remove cached packages and free space:
conda clean --all -y
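Before cleaning up, it helps to see which environments are actually taking the space. This sketch sums the size of each subdirectory of an environments folder using `du` and sorts the result; it assumes environments live under `~/miniconda3/envs` (as in the listing earlier) and a `sort` that supports `-h`. `env_sizes` is a hypothetical name.

```shell
# Hypothetical helper: print the disk usage of each environment
# directory under the given folder, smallest first.
env_sizes() {
  envs_dir=$1
  if [ ! -d "$envs_dir" ]; then
    echo "no such directory: $envs_dir" >&2
    return 1
  fi
  du -sh "$envs_dir"/*/ 2>/dev/null | sort -h
}

# Assumed default location; adjust if your install lives elsewhere.
if [ -d "$HOME/miniconda3/envs" ]; then
  env_sizes "$HOME/miniconda3/envs"
fi
```

Conda may share files between environments through its package cache, so the per-environment numbers can add up to more than the space actually used on disk.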
Don't install everything in base. The base environment is Conda itself. Installing tools there can break Conda. Always create a separate environment:
# GOOD
conda create -n my-project samtools
conda activate my-project
# BAD - with base active, this installs into base and can break conda itself
conda install samtools
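If you want a safety net for this rule, a small wrapper can refuse to install when base (or no environment at all) is active. The sketch relies on `CONDA_DEFAULT_ENV`, which `conda activate` sets to the name of the active environment; `install_outside_base` is an invented name, not part of Conda.

```shell
# Hypothetical wrapper: run `conda install` only when a non-base
# environment is active; otherwise print a warning and stop.
install_outside_base() {
  # An unset variable is treated like base, since nothing is activated.
  if [ "${CONDA_DEFAULT_ENV:-base}" = "base" ]; then
    echo "refusing to install into base; activate a project environment first" >&2
    return 1
  fi
  conda install -y "$@"
}

# Example, after `conda activate my-project`:
# install_outside_base samtools
```

Anything you pass to the wrapper is handed straight to `conda install`, so channel flags such as `-c bioconda` work as usual.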
Bioconda builds can lag behind. The latest version of a tool on GitHub might not be on bioconda for weeks or months. If you need the bleeding edge, you may have to build from source. But for 95% of use cases, the bioconda version is fine.
Channel priority conflicts. If you see a message about "package not found" or "conflicting dependencies," it is often because your channel priority is wrong. conda-forge should always be prioritized over defaults. If you are still stuck, try installing with --strict-channel-priority:
conda install -n my-env -c conda-forge -c bioconda --strict-channel-priority samtools -y
Mamba: the faster alternative
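If Conda is too slow for you, Mamba is a drop-in replacement written in C++. It uses the same channels and the same environment format, but it is significantly faster at dependency resolution. You can install it into your existing base environment, although the Mamba project itself recommends starting from a fresh Miniforge install instead: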
conda install -n base -c conda-forge mamba
Then just replace conda with mamba in any command:
mamba create -n rna-seq -c bioconda samtools fastqc -y
mamba install -n rna-seq -c bioconda cutadapt -y
Same environments, same packages, much faster. Many bioinformaticians have switched to Mamba entirely.
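In scripts you share with others, you may not know whether Mamba is installed on their machine. One hedged way to handle that is to pick whichever command is available and use it through a variable; `pick_installer` is a hypothetical helper for illustration.

```shell
# Hypothetical helper: print "mamba" if it is on PATH, otherwise "conda".
pick_installer() {
  if command -v mamba >/dev/null 2>&1; then
    echo mamba
  else
    echo conda
  fi
}

installer=$(pick_installer)
echo "using $installer"

# The chosen command then takes the same arguments either way, e.g.:
# "$installer" create -n rna-seq -c conda-forge -c bioconda samtools fastqc -y
```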
The cheat sheet
| Command | What it does |
|---|---|
| conda create -n env_name python=3.11 | Create a new environment |
| conda activate env_name | Activate an environment |
| conda deactivate | Leave the current environment |
| conda env list | List all environments |
| conda install -c bioconda tool_name | Install a package from bioconda |
| conda search -c bioconda tool_name | Search for available versions |
| conda list -n env_name | List packages in an environment |
| conda remove -n env_name package_name | Remove a package |
| conda env remove -n env_name | Delete an entire environment |
| conda env export -n env_name > env.yml | Export environment to a file |
| conda env create -f env.yml | Recreate environment from a file |
| conda clean --all | Remove cached packages to free disk space |
| mamba create -n env_name ... | Same as conda, but faster (if mamba installed) |
What's next?
You now have the three foundational skills for computational biology: navigating the command line, managing your software with Conda, and analyzing data with R. With these tools, you can install virtually any bioinformatics tool, run it on real data, and reproduce your results.
The next time someone sends you a GitHub repo with a pipeline, you won't close the tab. You'll create a Conda environment, install the dependencies, and run it.
Already using Conda in your research? What is your most-used environment? Drop a comment below.
Additional Resources
| Resource | Link | What it is |
|---|---|---|
| Miniconda | docs.conda.io/miniconda | Official installer |
| Bioconda | bioconda.github.io | The bioinformatics package channel |
| Conda cheat sheet | docs.conda.io/cheatsheet | Official quick reference |
| Mamba | github.com/mamba-org/mamba | Faster Conda alternative |
| Anaconda Cloud | anaconda.org | Search packages across all channels |
| Conda-forge | conda-forge.org | Community-maintained general packages |
Just getting started with the command line? Check out our Bash for Biologists survival guide first — it covers the terminal basics you need before Conda.