05 Pedigree File Format




Pedigree File Format (PED):

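PED files are tab- or whitespace-delimited text files used by the program PLINK. Each line describes one individual (family, parentage, sex and phenotype) followed by that individual's SNP genotypes. A PED file is normally accompanied by a MAP file, which describes the SNPs themselves.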

The first six columns are mandatory:

  • Family ID
  • Individual ID
  • Paternal ID
  • Maternal ID
  • Sex (1=male; 2=female; other=unknown)
  • Phenotype

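A phenotype value of “-9” means “missing” (by default PLINK also treats a phenotype of 0 as missing). A paternal or maternal ID of “0” means that the parent is not in the dataset.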
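The column layout above can be sketched in Python. This is a minimal illustration, not part of PLINK: the names `PedRecord` and `parse_ped_line` are hypothetical, the line is assumed to be whitespace-delimited, only “-9” is treated as a missing phenotype, and every column after the sixth is assumed to be one allele of a two-allele genotype.

```python
from typing import List, NamedTuple, Optional, Tuple

# Sex codes as listed above; any other value is reported as "unknown".
SEX_LABELS = {"1": "male", "2": "female"}


class PedRecord(NamedTuple):
    """One line of a PED file (hypothetical helper type, not a PLINK API)."""
    family_id: str
    individual_id: str
    paternal_id: str
    maternal_id: str
    sex: str
    phenotype: Optional[str]          # None when the phenotype is "-9"
    genotypes: List[Tuple[str, str]]  # one (allele1, allele2) pair per SNP


def parse_ped_line(line: str) -> PedRecord:
    """Split one PED line into the six mandatory columns plus genotypes."""
    fields = line.split()  # handles tabs and runs of spaces alike
    if len(fields) < 6:
        raise ValueError("a PED line needs at least six columns")

    alleles = fields[6:]
    if len(alleles) % 2 != 0:
        raise ValueError("allele columns must come in pairs")

    return PedRecord(
        family_id=fields[0],
        individual_id=fields[1],
        paternal_id=fields[2],
        maternal_id=fields[3],
        sex=SEX_LABELS.get(fields[4], "unknown"),
        phenotype=None if fields[5] == "-9" else fields[5],
        genotypes=list(zip(alleles[0::2], alleles[1::2])),
    )


if __name__ == "__main__":
    record = parse_ped_line("FAM1 IND1 0 0 1 -9 C C A G")
    print(record)
```

Splitting on any whitespace keeps the sketch indifferent to whether a file uses tabs, spaces, or a mixture of both.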

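Here is an example of ten male Abkhasians (family ID “Abhkasians” in the file), each genotyped at 5 SNPs. From the seventh column onward, each SNP occupies two columns, one per allele: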

Abhkasians abh9   0 0 1 -9  C C  A G  T C  C C  A A
Abhkasians abh24  0 0 1 -9  C C  G G  C C  C C  C A
Abhkasians abh27  0 0 1 -9  T C  A G  C C  C C  A A
Abhkasians abh41  0 0 1 -9  C C  A A  T T  C C  C A
Abhkasians abh45  0 0 1 -9  C C  A G  T C  C C  A A
Abhkasians abh53  0 0 1 -9  C C  A G  T C  C C  A A
Abhkasians abh60  0 0 1 -9  C C  A G  T C  C C  C A
Abhkasians abh71  0 0 1 -9  C C  G G  C C  C C  A A
Abhkasians abh74  0 0 1 -9  C C  G G  C C  T C  C C
Abhkasians abh85  0 0 1 -9  C C  A G  T C  C C  A A

In the above example, the columns of the original file are tab-separated, except for the two alleles of each genotype, which are space-separated. A plain-text PED file takes up a lot of disk space, so PLINK normally works with a binary version of the data (“BED”), which is much more compact. You can convert a PED into a BED as follows, assuming that your files are “mydata.ped” and “mydata.map” (the command writes mydata.bed, mydata.bim and mydata.fam):

plink --file mydata --make-bed --out mydata

Likewise, you can convert a BED back into a PED + MAP as follows:

plink --bfile mydata --recode --tab --out mydata
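To give a rough sense of why the BED form is more compact than the PED form, the sketch below compares the two sizes. It relies on the standard variant-major BED layout (3 header bytes, then 2 bits per genotype, with each SNP's block padded to a whole byte). The function names are hypothetical, and the PED figure is only an estimate that assumes single-character alleles and one delimiter character after each allele, ignoring the six leading columns.

```python
import math

BED_HEADER_BYTES = 3    # magic number plus mode byte at the start of a .bed file
GENOTYPES_PER_BYTE = 4  # 2 bits per genotype


def bed_size_bytes(n_samples: int, n_snps: int) -> int:
    """Size of a variant-major .bed file for the given dimensions."""
    bytes_per_snp = math.ceil(n_samples / GENOTYPES_PER_BYTE)
    return BED_HEADER_BYTES + n_snps * bytes_per_snp


def ped_genotype_chars(n_samples: int, n_snps: int) -> int:
    """Estimated characters used by the genotype columns of a .ped file.

    Assumption: each genotype is two one-letter alleles, and each allele
    is followed by exactly one delimiter (space, tab or newline).
    """
    chars_per_genotype = 2 * (1 + 1)
    return n_samples * n_snps * chars_per_genotype


if __name__ == "__main__":
    for samples, snps in [(10, 5), (1000, 500_000)]:
        print(samples, snps,
              bed_size_bytes(samples, snps),
              ped_genotype_chars(samples, snps))
```

For the ten-individual, five-SNP example this gives 18 bytes of BED against roughly 200 characters of genotype text. The sample and SNP information that the PED and MAP files carry is stored separately, in the .fam and .bim files.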

