Given a dataframe of phenotypes associated with PLANT_IDs and output from a PCA to control for population structure, this function returns the lambda_GC values for the GWAS when different numbers of PCs are included, and can optionally save them as a .csv file. This allows the user to choose a number of PCs that gives a lambda_GC close to 1, and thus to check that population structure has been adequately corrected for.
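
For readers unfamiliar with lambda_GC, the sketch below shows the standard definition of the genomic inflation factor: the median of the observed 1-df chi-square test statistics divided by the median expected under the null. It is illustrative only and is not necessarily how this function computes the value internally; the helper name lambda_gc and the simulated p-values are assumptions made for the example.

```r
# Genomic inflation factor from a vector of GWAS p-values (1-df tests).
# lambda_gc is a hypothetical helper, not part of this package.
lambda_gc <- function(pvalues) {
  pvalues <- pvalues[!is.na(pvalues)]
  # Convert p-values back to 1-df chi-square statistics
  chisq <- stats::qchisq(pvalues, df = 1, lower.tail = FALSE)
  # Observed median divided by the expected median under the null
  stats::median(chisq) / stats::qchisq(0.5, df = 1)
}

set.seed(1)
null_p <- stats::runif(1e5)  # simulated, well-calibrated p-values
lambda_gc(null_p)            # should be close to 1
```

Values well above 1 suggest inflated test statistics, which is often a sign of uncorrected population structure; values well below 1 can indicate over-correction.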

pvdiv_lambda_GC(
  df,
  type = c("linear", "logistic"),
  snp,
  covar = NA,
  ncores = 1,
  npcs = c(0:10),
  saveoutput = FALSE
)

Arguments

df

Dataframe of phenotypes where the first column is PLANT_ID and each PLANT_ID occurs only once in the dataframe.

type

Character string. Type of univariate regression to run for GWAS. Options are "linear" or "logistic".

snp

Genomic information to include for Panicum virgatum. SNP data is available at doi:10.18738/T8/ET9UAU

covar

Covariates to include in the regression to control for population structure, given as the SVD output of bigsnpr::snp_autoSVD().

ncores

Number of cores to use. Default is one.

npcs

Integer vector of the numbers of principal components to test. Defaults to c(0:10).

saveoutput

Logical. Should output be saved as a .csv file to the working directory? Defaults to FALSE.

Value

A dataframe containing the lambda_GC values for each number of PCs specified. If saveoutput = TRUE, this is also saved as a .csv file in the working directory.
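
As a sketch of how the returned dataframe might be used, the example below picks the number of PCs whose lambda_GC is closest to 1. The column names (NumPCs, height_lambdaGC) and all values are made up for illustration; check the names in your own output before adapting it. The commented call uses the argument names from the usage above, with phenotypes, snp and svd as hypothetical objects.

```r
# A call might look like this (phenotypes, snp and svd are hypothetical objects):
# lambdagc_df <- pvdiv_lambda_GC(df = phenotypes, type = "linear", snp = snp,
#                                covar = svd, npcs = c(0:5))

# Stand-in for the returned dataframe: one row per number of PCs.
# Column names and values are illustrative assumptions, not real results.
lambdagc_df <- data.frame(
  NumPCs = 0:5,
  height_lambdaGC = c(1.82, 1.41, 1.17, 1.04, 0.98, 0.93)
)

# Choose the number of PCs whose lambda_GC is closest to 1
best_row <- which.min(abs(lambdagc_df$height_lambdaGC - 1))
best_npcs <- lambdagc_df$NumPCs[best_row]
best_npcs
```

Choosing the value closest to 1 is one simple rule; with several phenotypes, the best number of PCs may differ between them.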