import tensorflow as tf
import tensorflow_probability as tfp
import elicit as el
tfd = tfp.distributions
Specify the generative model#
The generative model must be a class that is callable with a specific input-output structure.
Input structure:
Necessary inputs: prior_samples and, if el.utils.softmax_gumbel_trick is used, **kwargs.
Optional inputs: any arguments that are required by the internal computation of the generative model (e.g., design matrix, N). The optional arguments have to be specified as keyword arguments in el.model.
Note: el.utils.softmax_gumbel_trick is required when a discrete likelihood is used (see Example 2).
Output structure:
Necessary output format: a dictionary.
Keys: names of the target quantities (as defined in the targets section of el.Elicit).
Values: the corresponding tf.Tensor objects as computed by the generative model.
# Pseudo Code
class GenerativeModel:
    def __call__(self, prior_samples, **kwargs):
        # specify your generative process here: it takes the samples
        # from the prior as input and computes the target quantities
        # (e.g., observations from the outcome variable)
        target = ...
        # return the computed target quantities which
        # should be queried from the domain expert
        return dict(target=target)
Example 1: Continuous likelihood#
Probabilistic model notation#
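As a sketch inferred from the ExampleModel1 class below (with \(\mu_0, \sigma_0, \mu_1, \sigma_1\) denoting the hyperparameters to be elicited), the generative model can be written as:
\(\beta_0 \sim \text{Normal}(\mu_0, \sigma_0)\)
\(\beta_1 \sim \text{Normal}(\mu_1, \sigma_1)\)
\(\mu = \boldsymbol{\beta}\textbf{X}^\top\)
\(y_{pred} \sim \text{Normal}(\mu, 1.)\)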
Target quantities#
We assume we want to query the domain expert regarding the outcome variable. Specifically, we ask the expert about the outcome at three values of the predictor variable:
\(y_{pred} \mid X_{1}\)
\(y_{pred} \mid X_{2}\)
\(y_{pred} \mid X_{3}\)
Let the corresponding design matrix be
\( \textbf{X}=\begin{bmatrix} 1. & -1 \\ 1. & 0. \\ 1. & 1. \end{bmatrix} \)
Computational model implementation#
Create a Python class#
class ExampleModel1:
    def __call__(self, prior_samples, X):
        # linear predictor; shape=(B,num_samples,num_obs)
        mu = tf.matmul(prior_samples, X, transpose_b=True)
        # samples from the outcome variable; shape=(B,num_samples,num_obs)
        ypred = tfd.Normal(mu, 1.).sample()
        return dict(y_X1=ypred[:, :, 0],  # shape=(B,num_samples)
                    y_X2=ypred[:, :, 1],  # shape=(B,num_samples)
                    y_X3=ypred[:, :, 2])  # shape=(B,num_samples)
Look behind the scenes#
Let us use the ExampleModel1 class and generate some predictions based on artificial prior distributions:
Draw prior samples from the following “ground truth”: \(\beta_0 \sim \text{Normal}(1., 0.8)\) and \(\beta_1 \sim \text{Normal}(2., 1.5)\)
Define the design matrix \(\textbf{X}\)
Instantiate the generative model
Simulate from the generative model
# define number of batches and draws from prior distributions
B, num_samples = (1,10)
# sample from priors
prior_samples = tfd.Normal([1,2], [0.8, 1.5]).sample((B,num_samples))
print("(Step 1) prior samples:\n", prior_samples)
# define the design matrix
X = tf.constant([[1.,-1.],[1.,0], [1.,1.]])
print("\n(Step 2) design matrix:\n", X)
# create an instance of the generative model class
model_instance = ExampleModel1()
print("\n(Step 3) instantiated model:\n", model_instance)
# simulate from the generative model
ypred = model_instance(prior_samples, X)
print("\n(Step 4) samples from outcome variable:\n", ypred)
(Step 1) prior samples:
tf.Tensor(
[[[1.1125073 2.076223 ]
[0.80801105 1.0152982 ]
[0.6078063 2.4104307 ]
[1.039326 1.6489855 ]
[0.65120023 4.5983553 ]
[2.189219 0.14012146]
[1.7928746 2.4565156 ]
[0.6398772 3.0275126 ]
[1.3033034 2.8191433 ]
[0.72945213 2.9439828 ]]], shape=(1, 10, 2), dtype=float32)
(Step 2) design matrix:
tf.Tensor(
[[ 1. -1.]
[ 1. 0.]
[ 1. 1.]], shape=(3, 2), dtype=float32)
(Step 3) instantiated model:
<__main__.ExampleModel1 object at 0x000002C263C3E7D0>
(Step 4) samples from outcome variable:
{'y_X1': <tf.Tensor: shape=(1, 10), dtype=float32, numpy=
array([[-1.1919286, -1.800191 , -2.1663888, -1.4374082, -1.8860066,
2.3839483, -2.2876978, -0.7230171, 0.2864045, -3.782659 ]],
dtype=float32)>, 'y_X2': <tf.Tensor: shape=(1, 10), dtype=float32, numpy=
array([[-0.1615212 , 2.1068852 , -0.06249136, -0.03253329, 2.992262 ,
2.6980178 , 0.5170374 , 0.67750573, 0.2714877 , 1.3019307 ]],
dtype=float32)>, 'y_X3': <tf.Tensor: shape=(1, 10), dtype=float32, numpy=
array([[2.800105 , 0.6437601, 2.398678 , 3.0435407, 5.7429676, 4.4191847,
1.0006564, 2.4841185, 3.3039265, 3.9367332]], dtype=float32)>}
Implementation in the eliobj#
The corresponding implementation in the eliobj would then look as follows:
eliobj = el.Elicit(
    model=el.model(
        obj=ExampleModel1,  # model class
        X=X                 # additional input argument for model class
    ),
    parameters=[
        el.parameter(
            name=f"beta{i}",
            family=tfd.Normal,
            hyperparams=dict(
                loc=el.hyper(f"mu{i}"),
                scale=el.hyper(f"sigma{i}", lower=0)
            )
        ) for i in range(2)
    ],
    targets=[
        el.target(
            name=f"y_X{i}",
            query=el.queries.quantiles((.05, .25, .50, .75, .95)),
            loss=el.losses.MMD2(kernel="energy"),
            weight=1.0
        ) for i in range(1, 4)
    ],
    ...
)
Example 2: Discrete likelihood#
Important changes compared to the continuous case#
There are two specific changes in the class definition that you need to consider when implementing a discrete likelihood:
(1) The **kwargs argument must be included in the __call__ method of your model class and in the el.utils.softmax_gumbel_trick function.
(2) The likelihood requires a specific batch shape: batch_shape=(B,num_samples,num_obs,1). It is sufficient to add this extra dimension to one of the input arguments of your likelihood, as this will automatically propagate to the batch shape of the likelihood.
The reason for (1) is that the el.utils.softmax_gumbel_trick function involves sampling. To ensure reproducibility, the seed information must be passed to this function. By providing the **kwargs argument in both the __call__ method of your class and the el.utils.softmax_gumbel_trick function, the seed (as specified in the trainer section of el.Elicit) can be passed internally.
Requirement (2) is related to the internal computations performed within the el.utils.softmax_gumbel_trick function.
Probabilistic model notation#
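As a sketch inferred from the ExampleModel2 class below (with \(N\) the total count of the Binomial likelihood and \(\mu_0, \sigma_0, \mu_1, \sigma_1\) the hyperparameters to be elicited), the generative model can be written as:
\(\beta_0 \sim \text{Normal}(\mu_0, \sigma_0)\)
\(\beta_1 \sim \text{Normal}(\mu_1, \sigma_1)\)
\(\theta = \text{sigmoid}(\boldsymbol{\beta}\textbf{X}^\top)\)
\(z_{pred} \sim \text{Binomial}(N, \theta)\)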
Target quantities#
We assume we want to query the domain expert regarding the outcome variable. Specifically, we ask the expert about the outcome at three values of the predictor variable:
\(z_{pred} \mid x_{1}\)
\(z_{pred} \mid x_{2}\)
\(z_{pred} \mid x_{3}\)
Let the corresponding design matrix be
\( \textbf{X}=\begin{bmatrix} 1. & -1 \\ 1. & 0. \\ 1. & 1. \end{bmatrix} \)
Computational model implementation#
Create a Python class#
class ExampleModel2:
    # Note: pass the **kwargs argument
    def __call__(self, prior_samples, X, N, **kwargs):
        # shape=(B,num_samples,num_obs)
        theta = tf.math.sigmoid(tf.matmul(prior_samples, X, transpose_b=True))
        # likelihood; shape=(B,num_samples,num_obs,1)
        # Note: add. dim in likelihood required by softmax_gumbel_trick
        likelihood = tfd.Binomial(total_count=N,
                                  probs=tf.expand_dims(theta, axis=-1))
        # approximate samples from the Binomial distribution
        # shape=(B,num_samples,num_obs)
        # Note: pass the **kwargs argument on to softmax_gumbel_trick
        zpred = el.utils.softmax_gumbel_trick(epred=theta,
                                              likelihood=likelihood,
                                              upper_thres=N,
                                              **kwargs)
        return dict(z_X1=zpred[:, :, 0],  # shape=(B,num_samples)
                    z_X2=zpred[:, :, 1],  # shape=(B,num_samples)
                    z_X3=zpred[:, :, 2])  # shape=(B,num_samples)
Look behind the scenes#
Let us use the ExampleModel2 class and generate some predictions based on artificial prior distributions:
Draw prior samples from the following “ground truth”: \(\beta_0 \sim \text{Normal}(0.1, 0.2)\) and \(\beta_1 \sim \text{Normal}(0.2, 0.3)\)
Define the design matrix \(\textbf{X}\)
Instantiate the generative model
Simulate from the generative model
# define number of batches and draws from prior distributions
# total_count for Binomial model
B, num_samples, N = (1,10,30)
# sample from priors
prior_samples = tfd.Normal([0.1, 0.2], [0.2, 0.3]).sample((B,num_samples))
print("(Step 1) prior samples:\n", prior_samples)
# define the design matrix
X = tf.constant([[1.,-1.],[1.,0], [1.,1.]])
print("\n(Step 2) design matrix:\n", X)
# create an instance of the generative model class
model_instance = ExampleModel2()
print("\n(Step 3) instantiated model:\n", model_instance)
# simulate from the generative model (specify seed information)
zpred = model_instance(prior_samples, X, N, seed=1)
print("\n(Step 4) samples from outcome variable:\n", zpred)
(Step 1) prior samples:
tf.Tensor(
[[[ 0.18061757 -0.12640624]
[ 0.08738093 0.60096705]
[ 0.24235204 0.05321406]
[-0.05284426 -0.1111746 ]
[-0.15038678 0.20636728]
[-0.01027516 -0.32295096]
[ 0.03292781 -0.11280026]
[ 0.30182764 0.5708762 ]
[-0.03673781 0.40208268]
[ 0.01587544 -0.11239086]]], shape=(1, 10, 2), dtype=float32)
(Step 2) design matrix:
tf.Tensor(
[[ 1. -1.]
[ 1. 0.]
[ 1. 1.]], shape=(3, 2), dtype=float32)
(Step 3) instantiated model:
<__main__.ExampleModel2 object at 0x000002C266920550>
(Step 4) samples from outcome variable:
{'z_X1': <tf.Tensor: shape=(1, 10), dtype=float32, numpy=
array([[16.420263, 11.55657 , 17.039097, 14.793844, 12.669424, 16.667715,
17.019955, 13.616165, 11.663627, 16.112122]], dtype=float32)>, 'z_X2': <tf.Tensor: shape=(1, 10), dtype=float32, numpy=
array([[16.055965, 17.370132, 16.093798, 15.073592, 13.7894 , 14.599481,
14.547873, 17.074022, 15.857426, 15.451229]], dtype=float32)>, 'z_X3': <tf.Tensor: shape=(1, 10), dtype=float32, numpy=
array([[14.718641, 20.798233, 17.821178, 13.658306, 15.893131, 12.64514 ,
14.8872 , 20.550808, 17.580793, 14.436962]], dtype=float32)>}
Implementation in the eliobj#
The corresponding implementation in the eliobj would then look as follows:
eliobj = el.Elicit(
    model=el.model(
        obj=ExampleModel2,  # model class
        X=X,                # additional input
        N=N                 # additional input
    ),
    parameters=[
        el.parameter(
            name=f"beta{i}",
            family=tfd.Normal,
            hyperparams=dict(
                loc=el.hyper(f"mu{i}"),
                scale=el.hyper(f"sigma{i}", lower=0)
            )
        ) for i in range(2)
    ],
    targets=[
        el.target(
            name=f"z_X{i}",
            query=el.queries.quantiles((.05, .25, .50, .75, .95)),
            loss=el.losses.MMD2(kernel="energy"),
            weight=1.0
        ) for i in range(1, 4)
    ],
    ...
)
Notes about tensor operations & shapes#
prior_samples has shape (B, num_samples, num_param).
In the notation above, B refers to the batch size, num_samples to the number of prior samples, and num_param to the number of model parameters.
It is important that you take care of the correct tensor shapes when implementing the generative model.
Example for matrix multiplication#
We assume the following design matrix:
\(
X=\begin{bmatrix} 1. & -1 \\ 1. & 0. \\ 1. & 1. \end{bmatrix}
\)
Then we would get the linear predictor, \(\mu\), as follows: \( \mu = \boldsymbol{\beta}\textbf{X}^\top \)
# number of batches
B=2
# number of samples drawn from the prior
num_samples=4
# prior samples for two model parameters; shape=(B,num_samples,num_param)
prior_samples = tfd.Normal([1,2], [0.8, 1.5]).sample((B,num_samples))
print("prior samples:\n", prior_samples)
# design matrix; shape=(num_obs, num_param)
X = tf.constant([[1.,-1.],[1.,0], [1.,1.]])
print("X:\n",X)
# compute linear predictor mu;
# shapes: [B,num_samples,num_param] x [num_param,num_obs] = [B,num_samples,num_obs]
mu = tf.matmul(prior_samples, X, transpose_b=True)
print("mu:\n", mu)
prior samples:
tf.Tensor(
[[[ 0.21421784 2.1894784 ]
[ 2.0228667 3.8906546 ]
[ 0.14594889 3.722395 ]
[ 1.4024372 1.6433661 ]]
[[ 0.8237189 4.390056 ]
[ 1.5101621 2.3289535 ]
[-0.6737598 -0.17162848]
[ 2.4565587 -0.43988562]]], shape=(2, 4, 2), dtype=float32)
X:
tf.Tensor(
[[ 1. -1.]
[ 1. 0.]
[ 1. 1.]], shape=(3, 2), dtype=float32)
mu:
tf.Tensor(
[[[-1.9752605 0.21421784 2.4036963 ]
[-1.8677878 2.0228667 5.9135213 ]
[-3.576446 0.14594889 3.8683438 ]
[-0.24092889 1.4024372 3.0458033 ]]
[[-3.566337 0.8237189 5.213775 ]
[-0.8187914 1.5101621 3.8391156 ]
[-0.50213134 -0.6737598 -0.8453883 ]
[ 2.8964443 2.4565587 2.016673 ]]], shape=(2, 4, 3), dtype=float32)
Example for vector multiplication#
Consider that we want to compute the linear predictor instead as follows: \( \mu = \beta_0 + \beta_1x \)
# extract the two model parameters
b0=prior_samples[:,:,0] # shape=(B,num_samples)
b1=prior_samples[:,:,1] # shape=(B,num_samples)
# compute linear predictor mu;
# shapes: [B,num_samples,1] + [B,num_samples,1] * [1,num_obs] = [B,num_samples,num_obs]
# Note: With 'None' we can easily expand a tensor with a new dimension
mu = b0[:,:,None]+b1[:,:,None]*X[:,1][None,:]
print("mu:\n", mu)
<tf.Tensor: shape=(2, 4, 3), dtype=float32, numpy=
array([[[-1.9752605 , 0.21421784, 2.4036963 ],
[-1.8677878 , 2.0228667 , 5.9135213 ],
[-3.576446 , 0.14594889, 3.8683438 ],
[-0.24092889, 1.4024372 , 3.0458033 ]],
[[-3.566337 , 0.8237189 , 5.213775 ],
[-0.8187914 , 1.5101621 , 3.8391156 ],
[-0.50213134, -0.6737598 , -0.8453883 ],
[ 2.8964443 , 2.4565587 , 2.016673 ]]], dtype=float32)>
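As expected, the vector formulation yields exactly the same values for \(\mu\) (shape (2, 4, 3)) as the matrix multiplication above.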