PLINK Tutorial

Overview of PLINK

PLINK is a free, open-source whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner. The focus of PLINK is purely on the analysis of genotype/phenotype data, and it does not support steps prior to this, such as study design and planning, or generating genotype or CNV calls from raw data. However, through integration with gPLINK and Haploview, there is some support for the subsequent visualization, annotation, and storage of results.

PLINK was developed by Shaun Purcell at the Center for Human Genetic Research (CHGR), Massachusetts General Hospital (MGH), and the Broad Institute of Harvard & MIT. It is designed to handle large data sets comprising hundreds of thousands of markers genotyped for thousands of individuals and can manipulate and analyze these data sets in their entirety.

The tool set supports five main domains of function:

  1. Data management
  2. Summary statistics
  3. Population stratification
  4. Association analysis
  5. Identity-by-descent estimation

PLINK's capabilities include the estimation and use of identity-by-state and identity-by-descent information in the context of population-based whole-genome studies. This information can be used to detect and correct for population stratification and to identify extended chromosomal segments that are shared identical by descent between very distantly related individuals. Analysis of the patterns of segmental sharing has the potential to map disease loci that contain multiple rare variants in a population-based linkage analysis.

Installation

To install PLINK, download the stable release from the official PLINK website. Precompiled binaries are available for Windows, Linux, and macOS; on Unix-like systems you can also compile from source if no binary suits your platform. Installation usually amounts to unpacking the download and placing the plink executable in a directory on your PATH. Detailed instructions are provided on the PLINK website.
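
A quick way to confirm the installation is to check that the executable is visible on your PATH. The sketch below assumes the binary is named plink (some installations use another name, such as plink1.9) and that it accepts --version, which PLINK 1.9 does; other releases may behave differently. Note also that on some Linux systems the name plink belongs to an unrelated SSH tool shipped with PuTTY, so check the version banner.

```shell
# Assumption: the PLINK binary is called "plink"; set PLINK_BIN if yours differs.
PLINK_BIN="${PLINK_BIN:-plink}"

if command -v "$PLINK_BIN" >/dev/null 2>&1; then
    PLINK_STATUS="found"
    echo "Found $PLINK_BIN at: $(command -v "$PLINK_BIN")"
    # --version is supported by PLINK 1.9; other builds may print usage instead.
    "$PLINK_BIN" --version || true
else
    PLINK_STATUS="missing"
    echo "$PLINK_BIN is not on PATH; add its directory to PATH or call it by full path."
fi
```

If the status is "missing", the usual fix is to move the executable into a directory already on PATH or to extend PATH in your shell profile.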

Quick Start

Once PLINK is installed, run the executable from the command line, passing flags that name the input files and the analyses to perform. Unless you set a prefix with --out, results are written to the current directory in files whose names begin with plink (for example, plink.log). The PLINK website provides a comprehensive list of commands and options, as well as a tutorial to help new users get started.
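
If you have no dataset at hand, a tiny hand-written one is enough to try the command line. The sketch below writes a toy PED/MAP pair (four individuals, two SNPs) into a scratch directory and checks that the two files agree. The file name toy, the individual and SNP IDs, and the genotypes are all invented for illustration, and --freq (allele frequencies) is used only as a harmless first analysis; it is not one of the commands discussed in this tutorial.

```shell
# Work in a scratch directory so nothing in the current directory is overwritten.
demo_dir="$(mktemp -d)"

# toy.ped: one row per individual (all values are made up).
# Columns: family ID, individual ID, father, mother, sex (1=male, 2=female),
# phenotype (1=control, 2=case), then two alleles per SNP.
cat > "$demo_dir/toy.ped" <<'EOF'
FAM1 IND1 0 0 1 2 A A G G
FAM2 IND2 0 0 2 2 A G G T
FAM3 IND3 0 0 1 1 G G T T
FAM4 IND4 0 0 2 1 A G G T
EOF

# toy.map: one row per SNP.
# Columns: chromosome, SNP ID, genetic distance, base-pair position.
cat > "$demo_dir/toy.map" <<'EOF'
1 snp1 0 1000
1 snp2 0 2000
EOF

# Sanity check: each PED row needs 6 leading columns plus 2 alleles per SNP.
n_snps="$(wc -l < "$demo_dir/toy.map" | tr -d ' ')"
expected_cols=$((6 + 2 * n_snps))
bad_rows="$(awk -v n="$expected_cols" 'NF != n' "$demo_dir/toy.ped" | wc -l | tr -d ' ')"
echo "SNPs: $n_snps, expected PED columns: $expected_cols, malformed rows: $bad_rows"

# Run PLINK on the toy data only if it is installed.
if command -v plink >/dev/null 2>&1; then
    plink --file "$demo_dir/toy" --freq --out "$demo_dir/toy" \
        || echo "plink reported an error; see $demo_dir/toy.log"
fi
```

The --file argument is a prefix, not a file name: PLINK appends .ped and .map itself, which is why the two files must share the same stem.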

Code Examples Of Popular Commands

Here are five popular PLINK commands with explanations:

  1. Basic Association Test:

    plink --file data --assoc
    

    This command performs a basic allelic association test for each SNP. The --file data option tells PLINK to read data.ped and data.map; with no --out prefix given, the results are written to plink.assoc.

  2. Case-Control Association with Permutation:

    plink --file data --assoc --perm
    

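    This command repeats the case-control association test with permutation added. By default --perm uses adaptive permutation, and the resulting empirical p-values do not rely on the asymptotic distribution of the test statistic.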

  3. Calculating Linkage Disequilibrium (LD):

    plink --file data --ld SNP1 SNP2
    

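    This command reports the linkage disequilibrium between the two named SNPs, including r-squared and D'. SNP1 and SNP2 are placeholders for SNP identifiers that appear in the dataset's MAP file.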

  4. Quality Control Filtering:

    plink --file data --geno 0.1 --make-bed --out data_filtered
    

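    This command removes SNPs whose genotypes are missing in more than 10% of individuals, that is, SNPs with a call rate below 90% (--geno 0.1). It then writes the remaining data in binary format (--make-bed) under the given prefix (--out), producing data_filtered.bed, data_filtered.bim, and data_filtered.fam.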

  5. Principal Component Analysis (PCA):

    plink --file data --pca
    

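    This command computes the top principal components of the genotype data and writes them to plink.eigenvec, with the corresponding eigenvalues in plink.eigenval. The components help reveal population stratification; to correct for it, include them as covariates in a subsequent association model. Note that --pca is available in PLINK 1.9 and later, not in the original 1.07 release.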
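
The commands above are often chained, with the binary fileset written by the quality control step feeding the later analyses. A minimal sketch of that pattern follows. The helper names run_step and run_workflow and the DRY_RUN switch are hypothetical conveniences, not part of PLINK, and --bfile (read a binary fileset by prefix) is a PLINK flag that does not appear in the list above. By default the script only prints the commands, so it is safe to run without data.

```shell
# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo a command, and run it only when DRY_RUN is 0.
run_step() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

run_workflow() {
    # Step 1: drop SNPs with more than 10% missing genotypes, write binary files.
    run_step plink --file data --geno 0.1 --make-bed --out data_filtered || return 1
    # Step 2: run the allelic association test on the filtered binary fileset.
    run_step plink --bfile data_filtered --assoc --out data_filtered
}

run_workflow
```

Reusing the same --out prefix for both steps keeps related outputs together: the second step would write data_filtered.assoc next to the .bed, .bim, and .fam files from the first.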

These commands cover only a small fraction of what PLINK can do. The toolset can be adapted to a wide range of genomic analyses, which is why it is widely used in bioinformatics and genetic epidemiology.
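
Whichever analysis you run, PLINK writes its results as whitespace-delimited text, so standard Unix tools are enough for a first look. The sketch below pulls out SNPs below a p-value threshold from an association results file. The numbers are made-up placeholders that only mimic the column layout of a case/control .assoc file, and the 0.05 threshold is an arbitrary choice for illustration; substitute your own file and cutoff.

```shell
# Made-up placeholder results that mimic the column layout of plink.assoc.
assoc_file="$(mktemp)"
cat > "$assoc_file" <<'EOF'
 CHR  SNP        BP  A1    F_A    F_U  A2  CHISQ       P     OR
   1 snp1      1000   A   0.40   0.20   G   4.50   0.034  2.667
   1 snp2      2000   T   0.30   0.29   G   0.01    0.92  1.049
   1 snp3      3000   C   0.55   0.25   T   9.80  0.0017  3.667
EOF

# Locate the SNP and P columns by header name, keep rows with P below the
# threshold, and sort the survivors by ascending p-value.
# sort -g is used because real p-values may be in scientific notation.
threshold=0.05
hits="$(awk -v t="$threshold" '
    NR == 1 { for (i = 1; i <= NF; i++) { if ($i == "SNP") s = i; if ($i == "P") p = i }; next }
    $p != "NA" && $p + 0 < t + 0 { print $s, $p }
' "$assoc_file" | sort -g -k2,2)"

echo "$hits"
```

Looking columns up by header name rather than by position keeps the filter usable on other PLINK result files that have SNP and P columns in different places.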

For a more detailed guide on using PLINK, including advanced options and troubleshooting, users are encouraged to consult the online documentation and tutorials available on the PLINK website.