This function converts bigsnpr GWAS output, saved as RDS files in the specified path, into the four data frames used by the R package mashr. It can clump SNPs based on LD and the maximum -log10(p-value) across all included GWAS. It can also draw the random-effect data frames from a subsample of SNPs clumped by MAF and LD.

pvdiv_bigsnp2mashr(
  path = ".",
  snp = NULL,
  gwas_rds = NA,
  phenotypes = NA,
  clump = TRUE,
  scaled = TRUE,
  numSNPs = 1000,
  model = c("linear", "logistic"),
  saveoutput = FALSE,
  suffix = ""
)

Arguments

path

File path to the rds files saved from bigsnpr, a character string. Defaults to the working directory.

snp

The "bigSNP" object used to run the GWAS; required if clump is TRUE. Load with bigsnpr::snp_attach().

gwas_rds

A character vector of file names of saved GWAS RDS objects from bigsnpr. If NA, all *.rds files in the path will be used.

phenotypes

A character vector of phenotype names for the GWAS RDS objects. Must be the same length as gwas_rds, or NA. If NA, these will be the rds file names.

clump

Logical. Should SNPs be clumped by LD & p-value to standardize signal strength across different LD blocks? Default is TRUE.
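
As a rough illustration of the ranking statistic described above, the sketch below computes, for each SNP, the maximum -log10(p-value) across several GWAS. The p-value matrix and its row and column names are made up, and this is not the function's internal code; it only shows the kind of per-SNP summary an LD-clumping step could use to decide which SNP to keep within a block.

```r
# Toy p-values: 5 SNPs (rows) by 3 GWAS (columns). All values are invented.
pvals <- matrix(
  c(1e-8, 0.20, 0.50,
    0.03, 1e-5, 0.40,
    0.70, 0.60, 0.90,
    1e-3, 1e-3, 1e-12,
    0.05, 0.01, 0.30),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("snp", 1:5), c("gwas1", "gwas2", "gwas3"))
)

# Per-SNP summary: the largest -log10(p) observed in any GWAS.
log10p <- -log10(pvals)
max_log10p <- apply(log10p, 1, max)

# SNPs ordered from strongest to weakest signal across all GWAS.
ranked_snps <- names(sort(max_log10p, decreasing = TRUE))
```

A SNP that is highly significant in only one condition still ranks highly under this summary, which is why it suits a multi-condition analysis.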

scaled

Logical. Should marker effects in each condition be scaled to fall between -1 and 1? Default is TRUE.
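
One way to scale effects into the range -1 to 1 is to divide each condition by its largest absolute effect. The sketch below assumes that rule; the exact scaling used inside the function is not stated here, and the matrices bhat and shat are invented for illustration.

```r
# Toy marker effects and standard errors: 4 SNPs by 2 conditions (invented).
bhat <- matrix(c(0.5, -2.0, 1.0, 0.25,
                 10, -5, 2.5, 0),
               nrow = 4,
               dimnames = list(paste0("snp", 1:4), c("cond1", "cond2")))
shat <- matrix(c(0.1, 0.4, 0.2, 0.05,
                 2, 1, 0.5, 0.5),
               nrow = 4, dimnames = dimnames(bhat))

# Assumed rule: divide each condition (column) by its largest absolute effect.
scale_factor <- apply(abs(bhat), 2, max)
bhat_scaled <- sweep(bhat, 2, scale_factor, "/")

# Divide the standard errors by the same factor so that the
# effect / standard error ratios are unchanged by scaling.
shat_scaled <- sweep(shat, 2, scale_factor, "/")
```

Scaling of this kind puts conditions measured in different units on a comparable footing while preserving the sign and relative size of effects within each condition.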

numSNPs

The number of most significant SNPs selected from each GWAS. Ideally this will give 1 million or fewer total cells in the resultant mash dataframes. Defaults to 1000.
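
The 1 million cell guideline can be turned into a choice of numSNPs with a little arithmetic. With P phenotypes, each GWAS contributes up to numSNPs SNPs, so the strong set has at most P * numSNPs rows and P columns, or P^2 * numSNPs cells. The sketch below works this through for a hypothetical count of 12 phenotypes; the variable names are illustrative only.

```r
n_phenotypes <- 12   # hypothetical number of GWAS / phenotypes
cell_budget <- 1e6   # target upper bound on cells in a mash data frame

# Largest whole number of SNPs per GWAS that stays within the budget.
numSNPs <- floor(cell_budget / n_phenotypes^2)

# Worst case: no SNP is shared between the top sets of different GWAS.
max_rows <- numSNPs * n_phenotypes
max_cells <- max_rows * n_phenotypes
```

Because top SNPs often overlap between GWAS, the actual number of rows is usually smaller than this worst case.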

model

Regression used in bigstatsr. One of "logistic" or "linear". Default is "linear".

saveoutput

Logical. Should the function's output also be saved to RDS files? Default is FALSE.

suffix

Character. Optional. If the function's output is saved to RDS files, what unique suffix should be used?

Value

A list containing five data frames: the selected SNPs, and the B_hat and S_hat matrices for both the strong SNP set and a random SNP set twice its size.

Note

To create a vector of phenotype names, use the pvdiv_results_in_folder function.

Examples

if (FALSE) pvdiv_bigsnp2mashr(path = system.file("inst/extdata"), numSNPs = 20,
    model = "linear")
if (FALSE) pvdiv_bigsnp2mashr(numSNPs = 10000, model = "logistic")
if (FALSE) pvdiv_bigsnp2mashr(numSNPs = 20000, model = "linear", saveoutput = TRUE)
if (FALSE) {
    phenotype_vector <- pvdiv_results_in_folder(path = system.file(
        "inst/extdata"))
    numSNPs <- 1000000 / length(phenotype_vector)^2
    pvdiv_bigsnp2mashr(phenotypes = phenotype_vector, numSNPs = numSNPs,
        model = "linear", saveoutput = TRUE)
}