Genome-wide Association Studies

Author

Yuetiva Robles

Published

July 20, 2025

Summary

This document includes instructions for running a simple genome-wide association study (GWAS) using the command-line program PLINK (version 1.9) and visualizing the results using R and other tools. Additional information and the PLINK software can be found on the PLINK website.

Visualize results

With the .adjusted file, you can quickly preview the results on the command line. PLINK sorts the .adjusted file by significance, so its first few rows show the SNPs with the smallest p-values and let you determine whether any association reached genome-wide significance (p < 5×10⁻⁸). We will also check whether the results for rs429358 and rs769449 are similar to those from the previous tutorial. At the shell prompt, type:

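# PLINK association result files (the .adjusted suffix is appended below)
HVGWAS="ADNI1_HV_GWAS_p1_maf02.assoc.linear"
ADGWAS="ADNI1_ADstatus_GWAS_p1_maf02.assoc.logistic"
cd ~/my-analysis

# preview the SNPs with the smallest p-values
head "$HVGWAS.adjusted"
head "$ADGWAS.adjusted"

# look up the two SNPs of interest in the full result files
grep -Ew 'rs429358|rs769449' "$HVGWAS"
grep -Ew 'rs429358|rs769449' "$ADGWAS"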
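
To check for genome-wide significant associations without reading the rows by eye, the threshold can be applied directly to the unadjusted p-values. The following is a minimal sketch rather than part of the original workflow: the function name gws_hits is hypothetical, and it assumes the .adjusted file has a header row containing a column named UNADJ, as PLINK 1.9 writes it.

```shell
# Hypothetical helper: print the header plus every row of a PLINK .adjusted
# file whose unadjusted p-value is below a threshold (default 5e-8).
# Usage: gws_hits FILE [THRESHOLD]
gws_hits() {
  awk -v thr="${2:-5e-8}" '
    # header row: find the UNADJ column by name, then print the header
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "UNADJ") col = i; print; next }
    # data rows: skip missing values, compare numerically against the threshold
    col && $col != "NA" && ($col + 0) < thr
  ' "$1"
}

# Assumed file name, taken from the commands above; skipped if it is absent
ADJ="ADNI1_HV_GWAS_p1_maf02.assoc.linear.adjusted"
if [ -f "$ADJ" ]; then
  gws_hits "$ADJ"
fi
```

If only the header line is printed, no SNP passed the threshold. A second argument sets a different cutoff, for example gws_hits "$ADJ" 1e-5 for suggestive associations.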

Manhattan and QQ plots

Several R packages, such as ggplot2 and qqman, can create Manhattan plots and QQ plots. Another option is an online tool such as FUMA GWAS (Functional Mapping and Annotation of Genome-Wide Association Studies), which provides a "one-stop shop" for visualizing, annotating, and prioritizing SNPs for the downstream post-GWAS analyses covered in a later tutorial.

One of the easiest R packages to use and customize is qqman, which we demonstrate here. First, import the GWAS results and prepare the data. Be aware that these files are very large and can take a while to load and manipulate. In a Manhattan plot, the millions of SNPs with P > 0.05 add very little: removing them and starting the y-axis at 1.3 (since -log10(0.05) ≈ 1.3) instead of 0 gives essentially the same picture. The reduction in processing load and file size is noticeable, which is why the preparation below filters the GWAS results by p-value.

## load libraries
library(tidyverse)
library(qqman)

## create variables for files/directories
## path.expand() resolves "~" to the home directory
outdir <- path.expand(file.path("~", "my-analysis"))
gwasfile <- file.path(outdir, "ADNI1_HV_GWAS_p1_maf02.assoc.linear")

## import GWAS results file
gwas <- read_table(gwasfile)
glimpse(gwas)
# 6813933 SNPs

## create and save QQ plot before filtering SNPs
## (the QQ plot needs the full p-value distribution)
## use png for a smaller file; pdf for publication quality
## IMPORTANT: the QQ plot takes several minutes to generate.
png(filename = file.path(outdir, "ADNI1_HV_GWAS-qqplot.png"))
qq(gwas$P)
dev.off()

## filter by p-value to reduce number of observations
gwas <- gwas %>%
  filter(P <= 0.05)
glimpse(gwas)
# 327182 SNPs remain

## create and save Manhattan plot
## use png for a smaller file; pdf for publication quality
png(filename = file.path(outdir, "ADNI1_HV_GWAS-manhattan.png"))
manhattan(gwas)
dev.off()
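
The p-value filter can also be applied at the shell prompt, so that a much smaller file is loaded into R for the Manhattan plot. This is only a sketch under stated assumptions: the function name filter_by_p and the output file name are hypothetical, and the result file is assumed to have a header row with a column named P, as PLINK 1.9 .assoc.linear and .assoc.logistic files do. The QQ plot should still be drawn from the unfiltered file.

```shell
# Hypothetical helper: keep the header plus rows with P <= cutoff (default 0.05).
# Usage: filter_by_p FILE [CUTOFF]
filter_by_p() {
  awk -v cut="${2:-0.05}" '
    # header row: find the P column by name, then print the header
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "P") col = i; print; next }
    # data rows: drop missing p-values, keep those at or below the cutoff
    col && $col != "NA" && ($col + 0) <= cut
  ' "$1"
}

# -log10(0.05): the lowest y-axis value any retained SNP can have
awk 'BEGIN { printf "%.2f\n", -log(0.05) / log(10) }'   # prints 1.30

# Assumed input name from the tutorial; the output name is made up for illustration
GWAS="ADNI1_HV_GWAS_p1_maf02.assoc.linear"
if [ -f "$GWAS" ]; then
  filter_by_p "$GWAS" > "${GWAS%.assoc.linear}.p05.assoc.linear"
fi
```

Matching the column by name rather than by position lets the same function work on both the linear and the logistic result files.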

Organized by Alzheimer’s Association, ISTAART Neuroimaging PIA. Working group Brain Imaging Genetics.

Special thanks to ADNI for providing the datasets.

© 2025 AAIC Workshop Basics of Genetics • Maintained by @GeneticNeuroStats