Introduce families of distributions and a DistributionRestraint #1021

Open
sethaxen opened this issue Sep 7, 2019 · 0 comments

sethaxen commented Sep 7, 2019

I've lost count of how many implementations we have of log-normal restraints. I propose a module with a series of classes that represent probability distributions. I started something like this a ways back by generalizing IMP::isd::FNormal and the like to an IMP::isd::Distributions base class, but we could make this more useful with the following features:

  • Computation of log density and CDF
  • Gradients of the log density and CDF
  • Functionality to fit distributions
  • Functionality to draw exact samples from distributions
  • Some check to ensure implied dependence assumptions are sensible (i.e. a parameter drawn from one distribution cannot also be drawn from another; rather, it can be drawn from their joint, which would be its own distribution. This distinction is important for PPCs; see below.)
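As a rough illustration of the first four features, here is a minimal sketch in plain Python. The class name `Normal` and the method names `log_density`, `log_density_grad`, `cdf`, `sample`, and `fit` are hypothetical choices for this example, not existing IMP API; a real implementation would presumably live in C++ behind the distribution base class.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Normal:
    """Hypothetical distribution class for illustration; not an IMP type."""
    mu: float
    sigma: float

    def log_density(self, x):
        z = (x - self.mu) / self.sigma
        return -0.5 * z * z - math.log(self.sigma) - 0.5 * math.log(2.0 * math.pi)

    def log_density_grad(self, x):
        """Return the partial derivatives of the log density wrt (x, mu, sigma)."""
        z = (x - self.mu) / self.sigma
        return (-z / self.sigma, z / self.sigma, (z * z - 1.0) / self.sigma)

    def cdf(self, x):
        return 0.5 * (1.0 + math.erf((x - self.mu) / (self.sigma * math.sqrt(2.0))))

    def sample(self, rng):
        """Draw one exact sample using the supplied random.Random instance."""
        return rng.gauss(self.mu, self.sigma)

    @classmethod
    def fit(cls, data):
        """Maximum-likelihood fit (uses the biased 1/n variance estimate)."""
        n = len(data)
        mu = sum(data) / n
        var = sum((x - mu) ** 2 for x in data) / n
        return cls(mu, math.sqrt(var))


# Usage: fit to data, then evaluate and sample from the fitted distribution.
fitted = Normal.fit([1.0, 2.0, 3.0])
rng = random.Random(0)
draw = fitted.sample(rng)
```

Other families (log-normal, gamma, and so on) would expose the same methods, so code that consumes a distribution would not need to know which family it holds.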

A single DistributionRestraint would then wrap a distribution, along with some interface for mixing and matching FloatIndexes with constants. To restrain the output of some function with a DistributionRestraint, the function would add the quantity to the Model attributes with a ScoreState upon model update and pull back the adjoints (derivatives of the scoring function with respect to those quantities) to the function inputs, which could be other model attributes.
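The update-then-pullback flow described above might look roughly like the following toy sketch. Everything here is an assumption made for illustration: the model is a plain dict, `DistanceState` stands in for a ScoreState, the restrained function is a 1-D distance, and none of the class or method names correspond to the actual IMP API. The distribution parameters are constants here; letting them be model attributes would be the "mixing and matching" part.

```python
import math


class Normal:
    """Hypothetical stand-in distribution (mean mu, standard deviation sigma)."""
    def __init__(self, mu, sigma):
        self.mu, self.sigma = mu, sigma

    def log_density(self, x):
        z = (x - self.mu) / self.sigma
        return -0.5 * z * z - math.log(self.sigma) - 0.5 * math.log(2.0 * math.pi)

    def dlog_density_dx(self, x):
        return -(x - self.mu) / self.sigma ** 2


class DistanceState:
    """Plays the role of a ScoreState: writes a derived quantity into the model."""
    def __init__(self, key_a, key_b, key_out):
        self.key_a, self.key_b, self.key_out = key_a, key_b, key_out

    def update(self, values):
        values[self.key_out] = abs(values[self.key_a] - values[self.key_b])

    def pullback(self, values, adjoints):
        # Chain rule: d|a-b|/da = sign(a-b), d|a-b|/db = -sign(a-b).
        s = math.copysign(1.0, values[self.key_a] - values[self.key_b])
        g = adjoints.get(self.key_out, 0.0)
        adjoints[self.key_a] = adjoints.get(self.key_a, 0.0) + g * s
        adjoints[self.key_b] = adjoints.get(self.key_b, 0.0) - g * s


class DistributionRestraint:
    """Scores one model attribute by the negative log density of a distribution."""
    def __init__(self, dist, key):
        self.dist, self.key = dist, key

    def evaluate(self, values, adjoints):
        x = values[self.key]
        # Accumulate the adjoint d(score)/dx for the restrained quantity.
        adjoints[self.key] = adjoints.get(self.key, 0.0) - self.dist.dlog_density_dx(x)
        return -self.dist.log_density(x)


def evaluate_model(values, state, restraint):
    """Update derived quantities, score them, then pull adjoints back to inputs."""
    state.update(values)
    adjoints = {}
    score = restraint.evaluate(values, adjoints)
    state.pullback(values, adjoints)
    return score, adjoints


# Usage: restrain the distance between attributes "a" and "b".
values = {"a": 3.0, "b": 1.0}
state = DistanceState("a", "b", "d")
restraint = DistributionRestraint(Normal(1.5, 0.5), "d")
score, adjoints = evaluate_model(values, state, restraint)
```

Swapping `Normal` for another distribution with the same two methods leaves `DistanceState` and the pullback untouched, so the forward model and the statistical model stay decoupled.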

This would prevent unnecessary code duplication, which is nice, but it would also enable rapid iteration on the statistical model, including unlocking multi-level models. Once a user has a forward model with pullback implemented, they can test a variety of different probability distributions with no additional effort. Developer focus then shifts away from generic code to the particulars of their data/representation.

Additionally, this is an essential first step toward prior- and posterior-predictive checks. It is known how to draw exact samples from most generic distributions. Such a DistributionRestraint could then be inverted, enabling us to draw model parameters and data from the distributions. This enables us to sanity check the implicit assumptions in our priors (prior-predictive) and to visualize the posterior in data-space (posterior-predictive).
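To make the prior-predictive idea concrete, here is a small sketch using only the Python standard library. The prior (log-normal on a distance `d`), the likelihood (normal noise around `d`), and all numeric values are made-up assumptions for illustration; in the proposal, these draws would come from inverting the DistributionRestraints rather than from hand-written calls.

```python
import random


def prior_predictive(n_draws, n_obs, rng):
    """Draw (parameter, simulated data) pairs from a prior and a likelihood."""
    draws = []
    for _ in range(n_draws):
        # Hypothetical prior: a distance d with a log-normal prior.
        d = rng.lognormvariate(1.0, 0.5)
        # Hypothetical likelihood: observations are normal around d with sd 0.2.
        data = [rng.gauss(d, 0.2) for _ in range(n_obs)]
        draws.append((d, data))
    return draws


# Usage: simulate data sets and summarize them in data space.
rng = random.Random(42)
draws = prior_predictive(n_draws=200, n_obs=5, rng=rng)
all_obs = [y for _, data in draws for y in data]
frac_negative = sum(y < 0.0 for y in all_obs) / len(all_obs)
```

If a summary such as `frac_negative` is implausible for the experiment (for example, negative distances), the prior or likelihood needs rethinking. A posterior-predictive check would follow the same pattern, with `d` taken from posterior samples instead of the prior.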
