Title: | Similarity Regression |
---|---|
Description: | Similarity regression, evaluating the probability of association between sets of ontological terms and a binary response vector. A no-association model is compared with one in which the log odds of a true response is linked to the semantic similarity between terms and a latent characteristic ontological profile - 'Phenotype Similarity Regression for Identifying the Genetic Determinants of Rare Diseases', Greene et al 2016 <doi:10.1016/j.ajhg.2016.01.008>. |
Authors: | Daniel Greene |
Maintainer: | Daniel Greene <[email protected]> |
License: | GPL (>= 2) |
Version: | 3.4 |
Built: | 2025-02-16 04:16:28 UTC |
Source: | https://github.com/cran/SimReg |
Functions for performing Bayesian similarity regression, and evaluating the probability of association between sets of ontological terms and a binary response vector. A random model is compared with one in which the log odds of a true response is linked to the semantic similarity between terms and a latent characteristic ontological profile.
Key functions include sim_reg, for similarity regression of a binary response variable against an ontologically encoded predictor. An example application would be inferring the probability of association between the presence of a rare genetic variant and an ontologically encoded phenotype.
D. Greene, NIHR BioResource, S. Richardson, E. Turro, ‘Phenotype similarity regression for identifying the genetic determinants of rare diseases’, The American Journal of Human Genetics 98, 1-10, March 3, 2016.
Calculate marginal probability of terms inclusion in phi from sim_reg_out object
get_term_marginals(sim_reg_out)
sim_reg_out | Object of class |
Numeric vector of probabilities, named by term ID.
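The idea of a marginal inclusion probability can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: it assumes the fitted object holds a set of sampled phi values, each with a posterior weight, and the names `phi_samples`, `log_weights` and the term IDs are hypothetical placeholders.

```r
# Hypothetical data: three sampled values of phi (sets of term IDs) with
# unnormalised log posterior weights. The term IDs are placeholders.
phi_samples <- list(
  c("T:1", "T:2"),
  c("T:1"),
  c("T:3")
)
log_weights <- log(c(0.5, 0.3, 0.2))

# Normalise the weights, subtracting the maximum first to avoid underflow.
w <- exp(log_weights - max(log_weights))
w <- w / sum(w)

# Marginal inclusion probability of a term: the total weight of the
# sampled phi values that contain it.
terms <- sort(unique(unlist(phi_samples)))
marginals <- vapply(
  terms,
  function(term) {
    contains <- vapply(phi_samples, function(phi) term %in% phi, logical(1))
    sum(w[contains])
  },
  numeric(1)
)
```

The result is a numeric vector named by term ID, matching the shape of the return value described above.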
Get full set of terms to use in inference procedure based on similarity function arguments
get_terms(args)
args | Named list of named arguments which gets passed to ontological similarity function by |
Character vector of term IDs.
Calculate log Bayes factor for the similarity model, gamma=1, and baseline model, gamma=0.
log_BF(x, ...)

## Default S3 method:
log_BF(x, ...)

## S3 method for class 'sim_reg_output'
log_BF(x, ...)
x | |
... | If x is a |
Numeric value.
Create ontological plot of marginal probabilities of terms
plot_term_marginals(
  ontology,
  term_marginals,
  max_terms = 10,
  min_probability = 0.01,
  ...
)
ontology | |
term_marginals | Numeric vector of marginal probabilities of inclusion in |
max_terms | Maximum number of terms to include in plot. Note that additional terms may be included when terms have the same marginal probability, and common ancestor terms are included. |
min_probability | Threshold probability of inclusion in |
... | Additional arguments to pass to |
Plot summary of sim_reg_output object
## S3 method for class 'sim_reg_summary'
plot(x, ...)

## S3 method for class 'sim_reg_output'
plot(x, ...)
x | Object of class |
... | Additional arguments to pass to |
Predicted probability of y given x conditional on association and given data.
posterior_prediction(
  ontology,
  x,
  y,
  sim_reg_out,
  x_new = x,
  information_content = get_term_info_content(ontology, x),
  sim_params = list(ontology = ontology, information_content = information_content),
  two_way = TRUE,
  prediction_fn = NULL,
  min_ratio = 0.001,
  ...
)
ontology | |
x | |
y | |
sim_reg_out | Object of class |
x_new | New |
information_content | Numeric vector of information contents of terms named by term ID. Defaults to information content based on frequencies of annotation in |
sim_params | List of arguments to pass to |
two_way | Boolean value determining whether to calculate semantic similarity ‘in both directions’ (i.e. compute |
prediction_fn | Function for computing predicted probabilities for |
min_ratio | Threshold for fraction of posterior probability which sampled phi must hold in order to be included in sum. |
... | Additional arguments to pass to |
Vector of predicted probabilities corresponding to term sets in x_new.
Print sim_reg_output object
## S3 method for class 'sim_reg_output'
print(x, ...)
x | Object of class |
... | Unused arguments. |
Print sim_reg_summary object
## S3 method for class 'sim_reg_summary'
print(x, ...)
x | Object of class |
... | Unused arguments. |
Calculate probability of association between y and x
prob_association(..., prior = 0.05)
... | Arguments to pass to |
prior | Numeric value determining prior probability that |
Numeric value.
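As a rough illustration of how a log Bayes factor and a prior combine into a posterior probability of association, the base R sketch below applies Bayes' rule on the odds scale. It assumes the standard relation (posterior odds = prior odds × Bayes factor); the helper name `posterior_from_log_bf` is hypothetical and the package's own computation may differ.

```r
# Hypothetical helper, not part of SimReg: converts a log Bayes factor for
# the association model versus the baseline model into a posterior
# probability of association, given a prior probability.
posterior_from_log_bf <- function(log_bf, prior = 0.05) {
  # log posterior odds = log prior odds + log Bayes factor
  log_post_odds <- log(prior) - log1p(-prior) + log_bf
  # plogis() maps log odds to a probability without overflow
  plogis(log_post_odds)
}

# A log Bayes factor of 0 leaves the prior unchanged.
p_null <- posterior_from_log_bf(0, prior = 0.05)

# With prior 0.05 the prior odds are 1/19, so a Bayes factor of 19
# gives posterior odds of 1.
p_even <- posterior_from_log_bf(log(19), prior = 0.05)
```

Working on the log odds scale keeps the calculation stable when the log Bayes factor is very large or very small.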
Performs Bayesian ‘similarity regression’ on a given logical response vector y against a list of ontological term sets x. It returns an object of class sim_reg_output. Of particular interest are the probability of an association, which can be calculated with prob_association, and the characteristic ontological profile phi, which can be visualised using the functions plot_term_marginals and term_marginals. The results can be summarised with summary.
sim_reg(
  ontology,
  x,
  y,
  information_content = get_term_info_content(ontology, x),
  sim_params = list(ontology = ontology, information_content = information_content),
  using_terms = get_terms(sim_params),
  term_weights = rep(0, length(using_terms)),
  prior = discrete_gamma(using_terms),
  min_BF = -Inf,
  max_select = 2000L,
  max_phi_count = 200L,
  two_way = TRUE,
  selection_fn = fg_step_tab(N = length(y)),
  lik_method = NULL,
  lik_method_args = list(),
  gamma0_ml = bg_rate,
  min_ratio = 1e-04,
  ...
)
ontology | |
x | |
y | |
information_content | Numeric vector of information contents of terms named by term ID. Defaults to information content based on frequencies of annotation in |
sim_params | List of arguments to pass to |
using_terms | Character vector of term IDs giving the complete set of terms to include in the |
term_weights | Numeric vector of prior weights for individual terms. |
prior | Function for computing the unweighted prior probability of a |
min_BF | Bayes factor threshold below which to terminate computation, enabling faster execution time at the expense of accuracy and precision. |
max_select | Upper bound for number of |
max_phi_count | Upper bound for number of |
two_way | Boolean value determining whether to calculate semantic similarity ‘in both directions’ (i.e. compute |
selection_fn | Function for selecting values of |
lik_method | Function for calculating marginal likelihood conditional on values of |
lik_method_args | List of additional arguments to pass to |
gamma0_ml | Function for computing marginal likelihood of data under baseline model |
min_ratio | Lower bound on ratio below which to discard |
... | Additional arguments to pass to |
Calculate sum of log probabilities on log scale without over/under-flow
sum_log_probs(log_probs)
log_probs | Numeric vector of probabilities on log scale. |
Numeric value on log scale.
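Summing probabilities held on the log scale is commonly done with the log-sum-exp trick. The base R sketch below shows that technique under the assumption that this is the kind of calculation intended; `log_sum_exp` is a hypothetical stand-in, not the package's source.

```r
# Hypothetical helper illustrating the log-sum-exp trick.
log_sum_exp <- function(log_probs) {
  m <- max(log_probs)
  # If every entry is -Inf the probabilities sum to 0, so return log(0).
  if (is.infinite(m) && m < 0) return(-Inf)
  # Factor out the largest term so that exp() cannot overflow and the
  # largest contribution is never lost to underflow.
  m + log(sum(exp(log_probs - m)))
}

log_probs <- c(-1000, -1001, -1002)

# Naive evaluation: exp(-1000) underflows to 0, so the result is -Inf.
naive <- log(sum(exp(log_probs)))

# Stable evaluation keeps the answer finite.
stable <- log_sum_exp(log_probs)
```

The stable version returns a finite value close to the largest input, whereas the naive version loses all information to underflow.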
Get summary of sim_reg_output object
## S3 method for class 'sim_reg_output'
summary(object, prior = 0.05, ...)
object | Object of class |
prior | Prior probability of association. |
... | Unused arguments. |
Calculate marginal probability of terms inclusion in phi
term_marginals(...)
... | Arguments to pass to |
Numeric vector of probabilities, named by term ID.