Enrichment of rare variants stratified by outlier significance threshold

enrichment_by_significance(outlier.calls, rare.variants,
  outlier.thresholds = c(0.05, 0.01, 0.001, 1e-04, 1e-05, 1e-07),
  limit.to.genes.w.outliers = T, base.significance.cutoff = 0.05,
  draw.plot = T, verbose = T)

Arguments

outlier.calls

A data frame with columns GeneID, SampleName, and outlier.score. The outlier.score is the result of any expression-based test that designates a gene as an outlier at some threshold.

rare.variants

A data frame that lists all rare variants found near individual-gene pairs. Must contain columns titled SampleName, GeneID, chr, start, and end.
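To make the expected input shapes concrete, here is a minimal sketch of the two data frames with the documented column names. All values are invented for illustration, and treating `outlier.score` as a p-value-like quantity (smaller means more extreme) is an assumption based on the default thresholds.

```r
# Toy inputs with the documented column names; every value here is made up.
# Assumption: outlier.score behaves like a p-value (smaller = more extreme).
outlier.calls <- data.frame(
  GeneID        = c("geneA", "geneA", "geneB", "geneB"),
  SampleName    = c("s1", "s2", "s1", "s2"),
  outlier.score = c(0.001, 0.60, 0.30, 0.04),
  stringsAsFactors = FALSE
)

# One row per rare variant found near a gene in a given individual.
rare.variants <- data.frame(
  SampleName = c("s1", "s2"),
  GeneID     = c("geneA", "geneB"),
  chr        = c("chr1", "chr2"),
  start      = c(10500L, 88200L),
  end        = c(10501L, 88201L),
  stringsAsFactors = FALSE
)

# Check that the required columns are present before passing the inputs on.
required.outlier.cols <- c("GeneID", "SampleName", "outlier.score")
required.variant.cols <- c("SampleName", "GeneID", "chr", "start", "end")
stopifnot(all(required.outlier.cols %in% names(outlier.calls)))
stopifnot(all(required.variant.cols %in% names(rare.variants)))
```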

outlier.thresholds

Defaults to `c(0.05, 1e-2, 1e-3, 1e-4, 1e-5, 1e-7)`. Enrichment is calculated at every threshold in this vector.

limit.to.genes.w.outliers

Defaults to `TRUE`. Should I remove genes that are never outliers in any individual?

base.significance.cutoff

Defaults to `0.05`. Only needed if `limit.to.genes.w.outliers` is `TRUE`. This threshold decides whether a gene counts as ever being an outlier; genes that never reach it in any individual are excluded.
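The gene filter controlled by `limit.to.genes.w.outliers` and `base.significance.cutoff` could look roughly like the sketch below. This is not the function's actual code. It assumes a gene is kept when at least one individual has an `outlier.score` strictly below the cutoff; whether the real comparison is strict is not stated here. The toy data and the names `min.score`, `genes.w.outliers`, and `filtered.calls` are hypothetical.

```r
# Toy outlier calls; values are invented for illustration.
outlier.calls <- data.frame(
  GeneID        = c("geneA", "geneA", "geneB", "geneB", "geneC", "geneC"),
  SampleName    = c("s1", "s2", "s1", "s2", "s1", "s2"),
  outlier.score = c(0.001, 0.60, 0.30, 0.20, 0.04, 0.90),
  stringsAsFactors = FALSE
)
base.significance.cutoff <- 0.05

# Smallest (most extreme) score observed for each gene across individuals.
min.score <- tapply(outlier.calls$outlier.score, outlier.calls$GeneID, min)

# Assumption: a gene is "ever an outlier" if its best score is below the cutoff.
genes.w.outliers <- names(min.score)[min.score < base.significance.cutoff]

# Keep every row belonging to a gene that is an outlier in at least one individual.
filtered.calls <- outlier.calls[outlier.calls$GeneID %in% genes.w.outliers, ]
```

In this toy example geneB is dropped because neither individual scores below 0.05, while all rows for geneA and geneC are retained, including the non-outlier individuals.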

verbose

Defaults to `TRUE`. Should I print annoying but helpful messages?

Value

A data frame with enrichment scores at each significance level.

This function takes a data frame of outlier calls and a data frame of rare variant genotype data and returns a data frame of enrichment values. Each value is the relative risk of having a rare variant given outlier status at one significance level.
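As a rough illustration of the relative risk described above, the sketch below computes the proportion of outlier individual-gene pairs that carry a rare variant, divided by the same proportion among non-outlier pairs. It is a simplified stand-in, not the package's implementation. The helper name `relative_risk_sketch`, the toy data, and the strict `<` comparison against the threshold are all assumptions.

```r
# Hypothetical helper: relative risk of carrying a rare variant, outliers vs non-outliers.
relative_risk_sketch <- function(outlier.calls, rare.variants, threshold) {
  # Build one key per individual-gene pair so the two tables can be matched.
  pair.key    <- paste(outlier.calls$SampleName, outlier.calls$GeneID, sep = ":")
  variant.key <- unique(paste(rare.variants$SampleName, rare.variants$GeneID, sep = ":"))

  has.variant <- pair.key %in% variant.key
  # Assumption: a pair is an outlier when its score is below the threshold.
  is.outlier  <- outlier.calls$outlier.score < threshold

  risk.outlier     <- mean(has.variant[is.outlier])   # P(rare variant | outlier)
  risk.non.outlier <- mean(has.variant[!is.outlier])  # P(rare variant | non-outlier)
  risk.outlier / risk.non.outlier
}

# Toy data; values are invented for illustration.
outlier.calls <- data.frame(
  GeneID        = rep(c("g1", "g2", "g3", "g4"), each = 2),
  SampleName    = rep(c("s1", "s2"), times = 4),
  outlier.score = c(0.001, 0.5, 0.01, 0.7, 0.02, 0.9, 0.04, 0.3),
  stringsAsFactors = FALSE
)
rare.variants <- data.frame(
  SampleName = c("s1", "s1", "s2"),
  GeneID     = c("g1", "g2", "g3"),
  chr        = c("chr1", "chr1", "chr2"),
  start      = c(100L, 200L, 300L),
  end        = c(101L, 201L, 301L),
  stringsAsFactors = FALSE
)

# One enrichment value per threshold, mirroring the shape of the returned data frame.
thresholds <- c(0.05, 0.015)
enrichment <- data.frame(
  threshold  = thresholds,
  enrichment = sapply(thresholds, function(th) {
    relative_risk_sketch(outlier.calls, rare.variants, th)
  })
)
```

At the 0.05 threshold in this toy data, two of the four outlier pairs and one of the four non-outlier pairs carry a rare variant, so the relative risk is 0.5 / 0.25 = 2. A value above 1 indicates that rare variants are more common among outliers than among non-outliers.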