Prerequisites
What you need before getting started
This guide is written for macOS (both Apple Silicon M1/M2/M3/M4 and Intel Macs). The same packages work on Linux and Windows, but the installation commands will differ slightly.
Before we begin, make sure you have:
- macOS 12 (Monterey) or newer
- At least 10 GB of free disk space
- A stable internet connection (some packages are large)
- Terminal access (built-in Terminal app or VS Code terminal)
Apple Silicon users: R 4.5+ has native ARM64 support, so everything runs natively on M-series chips; no Rosetta needed.
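The checklist above can also be checked from the terminal. The following is a minimal sketch, assuming bash or zsh; `macos_major_ok` and `free_gb` are hypothetical helper names, and the thresholds (macOS 12, 10 GB) are the ones from the list.

```shell
#!/usr/bin/env bash
# Preflight sketch: checks the prerequisites listed above.
# Helper names are hypothetical; thresholds come from the checklist.

# Succeeds if the major version of a string like "14.4.1" is >= 12.
macos_major_ok() {
  local major="${1%%.*}"
  [ "$major" -ge 12 ]
}

# Prints whole gigabytes free on the volume holding the given path.
free_gb() {
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# sw_vers exists only on macOS, so the version check is skipped elsewhere.
if command -v sw_vers >/dev/null 2>&1; then
  if macos_major_ok "$(sw_vers -productVersion)"; then
    echo "macOS version: OK"
  else
    echo "macOS version: older than 12 (Monterey)"
  fi
fi

# arm64 means Apple Silicon, x86_64 means Intel.
echo "Architecture: $(uname -m)"

if [ "$(free_gb "${HOME:-/}")" -ge 10 ]; then
  echo "Disk space: OK"
else
  echo "Disk space: less than 10 GB free"
fi
```

Save it as a script or paste it into Terminal; it only reads system information and changes nothing.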
Removing Old R & RStudio Installations
Start with a clean slate to avoid version conflicts
If you have previous versions of R or RStudio installed, it's best to remove them completely before installing fresh. This avoids package conflicts and ensures a clean environment.
Remove RStudio
# Remove RStudio application
sudo rm -rf /Applications/RStudio.app
Remove R
# Remove R framework and binaries
sudo rm -rf /Library/Frameworks/R.framework
sudo rm -f /usr/local/bin/R /usr/local/bin/Rscript
Remove Configuration & Cache Files
# Remove R packages, config, and history.
# This is permanent: back up ~/.Rprofile and ~/.Renviron first if you customized them.
rm -rf ~/Library/R
rm -rf ~/.R ~/.RData ~/.Rhistory ~/.Rprofile ~/.Renviron
rm -rf ~/Library/Application\ Support/RStudio
rm -rf ~/Library/Caches/org.R-project.R
Verify Clean Removal
# Each command should report that nothing was found (the exact message depends on your shell)
which R
which Rscript
ls /Applications/RStudio.app
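Because the wording of a "not found" message varies between shells, a check based on exit codes is less ambiguous. This is a small sketch of that idea; `command_absent` and `path_absent` are hypothetical helper names, not part of any tool.

```shell
#!/usr/bin/env bash
# Removal check sketch using exit codes instead of message text.

# Returns 0 when a command is no longer on PATH.
command_absent() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "still on PATH: $1"
    return 1
  fi
  echo "not found: $1"
}

# Returns 0 when a file or directory no longer exists.
path_absent() {
  if [ -e "$1" ]; then
    echo "still present: $1"
    return 1
  fi
  echo "not found: $1"
}

command_absent R
command_absent Rscript
path_absent /Applications/RStudio.app
```

Any line starting with "still" points at something the removal steps missed.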
Installing the Latest R
R 4.5.3, "Reassured Reassurer" (March 2026)
Download & Install R
Use the official CRAN installer for your Mac architecture. For Apple Silicon (M1/M2/M3/M4):
# Download R 4.5.3 for ARM64
curl -L -O https://cran.r-project.org/bin/macosx/big-sur-arm64/base/R-4.5.3-arm64.pkg
# Install
sudo installer -pkg R-4.5.3-arm64.pkg -target /
For Intel Macs, replace arm64 with x86_64 in the URL.
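If you would rather not edit the URL by hand, the substitution can be scripted. The sketch below assumes the URL pattern shown above holds for both architectures; `r_pkg_url` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: build the CRAN installer URL for the current architecture.

R_VERSION="4.5.3"

# Prints the installer URL for "arm64" or "x86_64"; fails otherwise.
r_pkg_url() {
  case "$1" in
    arm64|x86_64)
      echo "https://cran.r-project.org/bin/macosx/big-sur-$1/base/R-${R_VERSION}-$1.pkg"
      ;;
    *)
      echo "unsupported architecture: $1" >&2
      return 1
      ;;
  esac
}

# uname -m prints arm64 on Apple Silicon and x86_64 on Intel.
arch="$(uname -m)"
if url="$(r_pkg_url "$arch")"; then
  echo "Download with: curl -L -O $url"
fi
```

Note that a terminal running under Rosetta reports x86_64 even on Apple Silicon, so check the printed URL before downloading. The installer file name changes with the architecture, so adjust the `installer -pkg` command to match.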
Verify Installation
R --version | head -1
# Expected: R version 4.5.3 (2026-03-11) -- "Reassured Reassurer"
Installing the Latest RStudio
RStudio 2026.01.1, "Apple Blossom"
Download RStudio
# Download RStudio Desktop
curl -L -o RStudio.dmg "https://download1.rstudio.org/electron/macos/RStudio-2026.01.1-403.dmg"
# Open the DMG
open RStudio.dmg
Install & Launch
Drag RStudio into the Applications folder in the window that opens. Then launch it:
# Open RStudio
open /Applications/RStudio.app
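# Clean up installer files
# (paths assume you ran the downloads from your home directory;
# on Intel Macs the R installer is named R-4.5.3-x86_64.pkg)
rm ~/R-4.5.3-arm64.pkg
rm ~/RStudio.dmg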
Pro tip: You can also download RStudio from posit.co/downloads if you prefer to use your browser instead of curl.
Installing Bioconductor Packages
Essential packages for ChIP-seq and genomics analysis
Bioconductor is the primary repository for bioinformatics packages in R. First, install BiocManager, then use it to install the packages you need.
Run the following in the RStudio Console:
# Install BiocManager (the package manager for Bioconductor)
install.packages("BiocManager")
# Install core Bioconductor packages for bioinformatics
BiocManager::install(c(
"DESeq2", # Differential expression analysis
"DiffBind", # Differential binding analysis
"ChIPseeker", # ChIP-seq peak annotation
"ChIPQC", # ChIP-seq quality control
"clusterProfiler", # GO & pathway enrichment
"GenomicRanges", # Genomic interval operations
"GenomicFeatures", # Gene model manipulation
"rtracklayer", # Import/export genomic files
"org.Hs.eg.db", # Human gene annotation
"TxDb.Hsapiens.UCSC.hg38.knownGene" # hg38 transcript database
))
This will take 15 to 30 minutes depending on your internet speed. When prompted with "Update all/some/none?", type a for all. When asked "Install from sources?", type no for faster binary installs.
What Each Package Does
| Package | Source | Purpose |
|---|---|---|
| DESeq2 | Bioconductor | Differential expression/binding analysis using negative binomial models |
| DiffBind | Bioconductor | Differential binding analysis for ChIP-seq peak data |
| ChIPseeker | Bioconductor | Annotate peaks to nearest genes and genomic features |
| ChIPQC | Bioconductor | Quality metrics and reporting for ChIP-seq experiments |
| clusterProfiler | Bioconductor | Gene Ontology (GO) and KEGG pathway enrichment analysis |
| GenomicRanges | Bioconductor | Represent and manipulate genomic intervals in R |
| GenomicFeatures | Bioconductor | Work with gene models and transcript annotations |
| rtracklayer | Bioconductor | Import/export BED, BigWig, GFF, and other genomic formats |
| org.Hs.eg.db | Bioconductor | Human gene annotation database (Entrez IDs, symbols, GO terms) |
| TxDb.Hsapiens.UCSC.hg38.knownGene | Bioconductor | Pre-built transcript database for the hg38 human genome |
Installing CRAN Packages for Data Analysis
Essential tools for data wrangling and visualization
# Data analysis & visualization packages from CRAN
install.packages(c(
"tidyverse", # Data wrangling (dplyr, ggplot2, tidyr, etc.)
"ggplot2", # Publication-quality plots
"pheatmap", # Beautiful heatmaps
"RColorBrewer", # Color palettes for plots
"ggrepel", # Non-overlapping text labels in plots
"VennDiagram", # Venn diagram visualizations
"openxlsx" # Read/write Excel files
))
| Package | Source | Purpose |
|---|---|---|
| tidyverse | CRAN | Collection of packages for data science (includes ggplot2, dplyr, tidyr) |
| pheatmap | CRAN | Create clustered heatmaps with dendrograms |
| RColorBrewer | CRAN | ColorBrewer palettes for scientific visualization |
| ggrepel | CRAN | Smart label placement in ggplot2 (avoids overlapping text) |
| VennDiagram | CRAN | Create publication-quality Venn diagrams |
| openxlsx | CRAN | Read and write Excel files without Java dependency |
Verifying Your Setup
Make sure everything is working correctly
Run this verification script in RStudio to confirm all packages load successfully:
# Verification script: load all key packages
library(DESeq2)
library(DiffBind)
library(ChIPseeker)
library(clusterProfiler)
library(GenomicRanges)
library(rtracklayer)
library(tidyverse)
library(pheatmap)
# Print versions
cat("✅ R version:", R.version.string, "\n")
cat("✅ Bioconductor:", as.character(BiocManager::version()), "\n")
cat("✅ All packages loaded successfully!\n")
Seeing "masked" warnings? That's completely normal; it just means some packages share function names. R will use the most recently loaded version by default. You can always specify the package explicitly, e.g., dplyr::filter() vs stats::filter().
Missing dependency errors? If a package fails to load due to a missing dependency (e.g., mvtnorm for DiffBind), simply install the missing package with install.packages("mvtnorm") and try again.
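A `library()` call stops at the first package that fails, so it can help to list everything that is missing in one pass. The following sketch does that from Terminal by calling `Rscript`; `check_r_packages` is a hypothetical helper name, and the package list simply mirrors the verification script.

```shell
#!/usr/bin/env bash
# Sketch: report which of the guide's key packages cannot be loaded.

check_r_packages() {
  if ! command -v Rscript >/dev/null 2>&1; then
    echo "Rscript not found on PATH; install R first"
    return 0
  fi
  # requireNamespace() tries to load a package without attaching it
  # and returns FALSE instead of stopping when that fails.
  Rscript -e '
    pkgs <- c("DESeq2", "DiffBind", "ChIPseeker", "clusterProfiler",
              "GenomicRanges", "rtracklayer", "tidyverse", "pheatmap")
    ok <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
    if (all(ok)) {
      cat("All packages found\n")
    } else {
      cat("Missing:", paste(pkgs[!ok], collapse = ", "), "\n")
    }
  '
}

check_r_packages
```

Anything listed as missing can then be installed with BiocManager::install() or install.packages(), as described in the earlier sections.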
Recommended Bioinformatics Workflow
How it all fits together for ChIP-seq analysis
A typical ChIP-seq analysis spans both command-line tools and R. Here's how the pieces connect:
| Stage | Where | Tools |
|---|---|---|
| Quality Control | HPC / Terminal | FastQC, MultiQC |
| Read Trimming | HPC / Terminal | fastp, Trimmomatic |
| Alignment | HPC / Terminal | Bowtie2 |
| BAM Processing | HPC / Terminal | samtools, Picard |
| Peak Calling | HPC / Terminal | MACS3 |
| Visualization | HPC / Terminal | deepTools |
| Peak Annotation | RStudio | ChIPseeker |
| Differential Binding | RStudio | DiffBind, DESeq2 |
| Pathway Analysis | RStudio | clusterProfiler |
| Publication Figures | RStudio | ggplot2, pheatmap |
Best practice: Run computationally heavy steps (alignment, peak calling) on your HPC cluster. Download the results (peak files, count matrices) to your local machine and perform downstream analysis in RStudio.
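One common way to bring results back from a cluster is rsync over SSH. This is only a sketch: the host name, user name, and both directories are placeholders to replace with your own, and `pull_results` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: copy HPC results to the local machine with rsync.
# The host, user, and both paths are placeholders, not real locations.

REMOTE="user@hpc.example.edu:/scratch/user/chipseq/results/"
LOCAL="$HOME/chipseq/results/"

# With no argument, prints the command; with "run", executes it.
pull_results() {
  if [ "${1:-preview}" = "run" ]; then
    mkdir -p "$LOCAL" && rsync -avz "$REMOTE" "$LOCAL"
  else
    echo "rsync -avz $REMOTE $LOCAL"
  fi
}

# Preview only; nothing is transferred.
pull_results
```

Once the printed command looks right, `pull_results run` performs the transfer. rsync skips files that are already up to date, so it is safe to repeat after each new batch of results.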
That's it: your R environment is now fully configured for bioinformatics analysis.
Happy analysing! 🧬