import tensorflow as tf
import tensorflow_probability as tfp
import elicit as el

tfd = tfp.distributions

Specify the generative model#

The generative model must be a callable class (i.e., a class implementing a __call__ method) with a specific input-output structure.

Input structure:

  • necessary inputs:

    • prior_samples

    • (if el.utils.softmax_gumbel_trick is used) **kwargs

  • optional inputs

    • any arguments that are required by the internal computation of the generative model (e.g., design matrix, N)

    • the optional arguments have to be specified as keyword arguments in el.model

Note: el.utils.softmax_gumbel_trick is required when a discrete likelihood is used (see Example 2).

Output structure:

  • necessary output format: dictionary

    • keys: refer to the names of the target quantities (as defined in the targets section of el.Elicit)

    • values: corresponding tf.Tensor as computed by the generative model

# Pseudo Code

class GenerativeModel:
    def __call__(self, prior_samples, **kwargs):
        
        # specify your generative process here: it takes the samples
        # from the prior as input and computes the target quantities
        # (e.g., observations of the outcome variable)
        target = ...

        # return the computed target quantities that
        # should be queried from the domain expert
        return dict(target=target)

Example 1: Continuous likelihood#

Probabilistic model notation#

\[\begin{align*} \beta_i &\sim \text{Normal}(\mu_i, \sigma_i) \quad \text{ for } i=0,1 \\ \boldsymbol{\mu} &= \boldsymbol{\beta}\textbf{X}^\top \\ \textbf{y}_{pred} &\sim \text{Normal}(\boldsymbol{\mu}, \textbf{1.}) \end{align*}\]

Target quantities#

We assume we want to query the domain expert about the outcome variable. Specifically, we ask the expert about the outcome at three values of the predictor variable:

  • \(y_{pred} \mid X_{1}\)

  • \(y_{pred} \mid X_{2}\)

  • \(y_{pred} \mid X_{3}\)

Let the corresponding design matrix be

\( \textbf{X}=\begin{bmatrix} 1. & -1 \\ 1. & 0. \\ 1. & 1. \end{bmatrix} \)

Computational model implementation#

Create a Python class#

class ExampleModel1:
    def __call__(self, prior_samples, X):

        # shape=(B,num_samples,num_obs)
        mu=tf.matmul(prior_samples, X, transpose_b=True) 

        # shape=(B,num_samples,num_obs)
        ypred=tfd.Normal(mu, 1.).sample() 

        return dict(y_X1=ypred[:,:,0], # shape=(B,num_samples)
                    y_X2=ypred[:,:,1], # shape=(B,num_samples)
                    y_X3=ypred[:,:,2]) # shape=(B,num_samples)

Look behind the scenes#

Let us use ExampleModel1 to generate some predictions based on artificial prior distributions.

  1. Draw prior samples from the following “ground truth”: \(\beta_0 \sim \text{Normal}(1., 0.8)\) and \(\beta_1 \sim \text{Normal}(2., 1.5)\)

  2. Define the design matrix \(\textbf{X}\)

  3. Instantiate the generative model

  4. Simulate from the generative model

# define number of batches and draws from prior distributions
B, num_samples = (1,10)

# sample from priors
prior_samples = tfd.Normal([1,2], [0.8, 1.5]).sample((B,num_samples))
print("(Step 1) prior samples:\n", prior_samples)

# define the design matrix
X = tf.constant([[1.,-1.],[1.,0], [1.,1.]])
print("\n(Step 2) design matrix:\n", X)

# create an instance of the generative model class
model_instance = ExampleModel1()
print("\n(Step 3) instantiated model:\n", model_instance)

# simulate from the generative model
ypred = model_instance(prior_samples, X)
print("\n(Step 4) samples from outcome variable:\n", ypred)
(Step 1) prior samples:
 tf.Tensor(
[[[1.1125073  2.076223  ]
  [0.80801105 1.0152982 ]
  [0.6078063  2.4104307 ]
  [1.039326   1.6489855 ]
  [0.65120023 4.5983553 ]
  [2.189219   0.14012146]
  [1.7928746  2.4565156 ]
  [0.6398772  3.0275126 ]
  [1.3033034  2.8191433 ]
  [0.72945213 2.9439828 ]]], shape=(1, 10, 2), dtype=float32)

(Step 2) design matrix:
 tf.Tensor(
[[ 1. -1.]
 [ 1.  0.]
 [ 1.  1.]], shape=(3, 2), dtype=float32)

(Step 3) instantiated model:
 <__main__.ExampleModel1 object at 0x000002C263C3E7D0>

(Step 4) samples from outcome variable:
 {'y_X1': <tf.Tensor: shape=(1, 10), dtype=float32, numpy=
array([[-1.1919286, -1.800191 , -2.1663888, -1.4374082, -1.8860066,
         2.3839483, -2.2876978, -0.7230171,  0.2864045, -3.782659 ]],
      dtype=float32)>, 'y_X2': <tf.Tensor: shape=(1, 10), dtype=float32, numpy=
array([[-0.1615212 ,  2.1068852 , -0.06249136, -0.03253329,  2.992262  ,
         2.6980178 ,  0.5170374 ,  0.67750573,  0.2714877 ,  1.3019307 ]],
      dtype=float32)>, 'y_X3': <tf.Tensor: shape=(1, 10), dtype=float32, numpy=
array([[2.800105 , 0.6437601, 2.398678 , 3.0435407, 5.7429676, 4.4191847,
        1.0006564, 2.4841185, 3.3039265, 3.9367332]], dtype=float32)>}

Implementation in the eliobj#

The corresponding implementation in the eliobj would then look as follows:

eliobj = el.Elicit(
    model=el.model(
        obj=ExampleModel1,    # model class 
        X=X                   # additional input argument for model class
        ),
    parameters=[
        el.parameter(
            name=f"beta{i}",
            family=tfd.Normal,
            hyperparams=dict(
                loc=el.hyper(f"mu{i}"),
                scale=el.hyper(f"sigma{i}", lower=0)
                )
        ) for i in range(2)
    ],
    targets=[
        el.target(
            name=f"y_X{i}",
            query=el.queries.quantiles((.05, .25, .50, .75, .95)),
            loss=el.losses.MMD2(kernel="energy"),
            weight=1.0
        ) for i in range(1,4)
    ],
    ...
)
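
To get a feeling for what the quantile query in the targets section summarizes, the following sketch (not part of the eliobj workflow; tfp.stats.percentile is used here purely for illustration) computes the same five quantiles directly from the simulated samples in ypred:

# sketch only: compute the quantiles that
# el.queries.quantiles((.05, .25, .50, .75, .95)) would summarize,
# directly from the simulated target quantities in ypred
quantile_levels = [5., 25., 50., 75., 95.]

for name, samples in ypred.items():
    # samples has shape (B, num_samples); quantiles are computed over the sample axis
    q = tfp.stats.percentile(samples, quantile_levels, axis=-1)
    print(name, q.numpy().round(2))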

Example 2: Discrete Likelihood#

Important changes to the continuous case#

There are two specific changes in the class definition that you need to consider when implementing a discrete likelihood:

  1. The **kwargs argument must be included in

    • the __call__ method of your model class

    • the el.utils.softmax_gumbel_trick function

  2. The likelihood requires a specific batch shape: batch_shape=(B,num_samples,num_obs,1)

    • It is sufficient to add this extra dimension to one of the input arguments of your likelihood, as this will automatically propagate to the batch shape of the likelihood.

The reason for (1) is that the el.utils.softmax_gumbel_trick function involves sampling. To ensure reproducibility, the seed information must be passed to this function. By providing the **kwargs argument in both the __call__ method of your class and the el.utils.softmax_gumbel_trick function, the seed (as specified in the trainer section of el.Elicit) can be passed on internally. Requirement (2), in turn, relates to the internal computations performed within the el.utils.softmax_gumbel_trick function.
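
As a minimal illustration of requirement (2), the following standalone sketch (with arbitrarily chosen shapes) shows how adding a trailing dimension to the probs argument propagates to the batch shape of a Binomial likelihood:

# sketch only: adding a trailing dimension to `probs` propagates
# to the batch shape of the likelihood
B, num_samples, num_obs, N = 1, 10, 3, 30
theta = tf.random.uniform((B, num_samples, num_obs))

likelihood = tfd.Binomial(total_count=N, probs=tf.expand_dims(theta, axis=-1))
print(likelihood.batch_shape)  # (B, num_samples, num_obs, 1) = (1, 10, 3, 1)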

Probabilistic model notation#

\[\begin{align*} \beta_i &\sim \text{Normal}(\mu_i, \sigma_i) \quad \text{ for } i=0,1 \\ \boldsymbol{\theta} &= \text{sigmoid}(\boldsymbol{\beta}\textbf{X}^\top) \\ \textbf{z}_{pred} &\sim \text{Binomial}(N,\boldsymbol{\theta}) \end{align*}\]

Target quantities#

We assume we want to query the domain expert about the outcome variable. Specifically, we ask the expert about the outcome at three values of the predictor variable:

  • \(z_{pred} \mid X_{1}\)

  • \(z_{pred} \mid X_{2}\)

  • \(z_{pred} \mid X_{3}\)

Let the corresponding design matrix be

\( \textbf{X}=\begin{bmatrix} 1. & -1 \\ 1. & 0. \\ 1. & 1. \end{bmatrix} \)

Computational model implementation#

Create a Python class#

class ExampleModel2:
    # Note: pass the **kwargs argument
    def __call__(self, prior_samples, X, N, **kwargs):

        # shape=(B,num_samples,num_obs)
        theta=tf.math.sigmoid(tf.matmul(prior_samples, X, transpose_b=True))

        # likelihood; shape=(B,num_samples,num_obs,1)
        # Note: add. dim in likelihood required by softmax_gumbel_trick
        likelihood=tfd.Binomial(total_count=N, 
                                probs=tf.expand_dims(theta, axis=-1)
                               )
        
        # approximate samples from the Binomial distribution
        # shape=(B,num_samples,num_obs)
        # Note: add the **kwargs argument
        zpred=el.utils.softmax_gumbel_trick(
            epred=theta, likelihood=likelihood, 
            upper_thres=N, 
            **kwargs)

        return dict(z_X1=zpred[:,:,0], # shape=(B,num_samples)
                    z_X2=zpred[:,:,1], # shape=(B,num_samples)
                    z_X3=zpred[:,:,2]) # shape=(B,num_samples)

Look behind the scenes#

Let us use ExampleModel2 to generate some predictions based on artificial prior distributions.

  1. Draw prior samples from the following “ground truth”: \(\beta_0 \sim \text{Normal}(0.1, 0.2)\) and \(\beta_1 \sim \text{Normal}(0.2, 0.3)\)

  2. Define the design matrix \(\textbf{X}\)

  3. Instantiate the generative model

  4. Simulate from the generative model

# define number of batches and draws from prior distributions
# total_count for Binomial model
B, num_samples, N = (1,10,30)

# sample from priors
prior_samples = tfd.Normal([0.1, 0.2], [0.2, 0.3]).sample((B,num_samples))
print("(Step 1) prior samples:\n", prior_samples)

# define the design matrix
X = tf.constant([[1.,-1.],[1.,0], [1.,1.]])
print("\n(Step 2) design matrix:\n", X)

# create an instance of the generative model class
model_instance = ExampleModel2()
print("\n(Step 3) instantiated model:\n", model_instance)

# simulate from the generative model (specify seed information)
zpred = model_instance(prior_samples, X, N, seed=1)
print("\n(Step 4) samples from outcome variable:\n", zpred)
(Step 1) prior samples:
 tf.Tensor(
[[[ 0.18061757 -0.12640624]
  [ 0.08738093  0.60096705]
  [ 0.24235204  0.05321406]
  [-0.05284426 -0.1111746 ]
  [-0.15038678  0.20636728]
  [-0.01027516 -0.32295096]
  [ 0.03292781 -0.11280026]
  [ 0.30182764  0.5708762 ]
  [-0.03673781  0.40208268]
  [ 0.01587544 -0.11239086]]], shape=(1, 10, 2), dtype=float32)

(Step 2) design matrix:
 tf.Tensor(
[[ 1. -1.]
 [ 1.  0.]
 [ 1.  1.]], shape=(3, 2), dtype=float32)

(Step 3) instantiated model:
 <__main__.ExampleModel2 object at 0x000002C266920550>

(Step 4) samples from outcome variable:
 {'z_X1': <tf.Tensor: shape=(1, 10), dtype=float32, numpy=
array([[16.420263, 11.55657 , 17.039097, 14.793844, 12.669424, 16.667715,
        17.019955, 13.616165, 11.663627, 16.112122]], dtype=float32)>, 'z_X2': <tf.Tensor: shape=(1, 10), dtype=float32, numpy=
array([[16.055965, 17.370132, 16.093798, 15.073592, 13.7894  , 14.599481,
        14.547873, 17.074022, 15.857426, 15.451229]], dtype=float32)>, 'z_X3': <tf.Tensor: shape=(1, 10), dtype=float32, numpy=
array([[14.718641, 20.798233, 17.821178, 13.658306, 15.893131, 12.64514 ,
        14.8872  , 20.550808, 17.580793, 14.436962]], dtype=float32)>}
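
As a quick sanity check (a sketch, not required by elicit), we can verify that the approximated samples lie within the support [0, N] of the Binomial likelihood:

# sketch only: approximated Binomial samples should lie within [0, N]
for name, samples in zpred.items():
    print(name,
          "min:", float(tf.reduce_min(samples)),
          "max:", float(tf.reduce_max(samples)))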

Implementation in the eliobj#

The corresponding implementation in the eliobj would then look as follows:

eliobj = el.Elicit(
    model=el.model(
        obj=ExampleModel2,  # model class 
        X=X,                # additional input
        N=N                 # additional input 
        ),
    parameters=[
        el.parameter(
            name=f"beta{i}",
            family=tfd.Normal,
            hyperparams=dict(
                loc=el.hyper(f"mu{i}"),
                scale=el.hyper(f"sigma{i}", lower=0)
                )
        ) for i in range(2)
    ],
    targets=[
        el.target(
            name=f"z_X{i}",
            query=el.queries.quantiles((.05, .25, .50, .75, .95)),
            loss=el.losses.MMD2(kernel="energy"),
            weight=1.0
        ) for i in range(1,4)
    ],
    ...
)

Notes about tensor operations & shapes#

  • prior_samples has shape (B, num_samples, num_param)

  • in the notation above B refers to the batch size, num_samples to the number of prior samples, and num_param to the number of model parameters

  • it is important to pay attention to the correct tensor shapes when implementing the generative model

Example for matrix multiplication#

We assume the following design matrix:
\( X=\begin{bmatrix} 1. & -1 \\ 1. & 0. \\ 1. & 1. \end{bmatrix} \)

Then we would get the linear predictor, \(\mu\), as follows: \( \mu = \boldsymbol{\beta}\textbf{X}^\top \)

# number of batches
B=2
# number of samples drawn from the prior
num_samples=4

# prior samples for two model parameters; shape=(B,num_samples,num_param)
prior_samples = tfd.Normal([1,2], [0.8, 1.5]).sample((B,num_samples))
print("prior samples:\n", prior_samples)

# design matrix;  shape=(num_obs, num_param)
X = tf.constant([[1.,-1.],[1.,0], [1.,1.]])
print("X:\n",X)

# compute linear predictor mu; 
# shapes: [B,num_samples,num_param] x [num_param,num_obs] = [B,num_samples,num_obs]
mu = tf.matmul(prior_samples, X, transpose_b=True)
print("mu:\n", mu)
prior samples:
 tf.Tensor(
[[[ 0.21421784  2.1894784 ]
  [ 2.0228667   3.8906546 ]
  [ 0.14594889  3.722395  ]
  [ 1.4024372   1.6433661 ]]

 [[ 0.8237189   4.390056  ]
  [ 1.5101621   2.3289535 ]
  [-0.6737598  -0.17162848]
  [ 2.4565587  -0.43988562]]], shape=(2, 4, 2), dtype=float32)
X:
 tf.Tensor(
[[ 1. -1.]
 [ 1.  0.]
 [ 1.  1.]], shape=(3, 2), dtype=float32)
mu:
 tf.Tensor(
[[[-1.9752605   0.21421784  2.4036963 ]
  [-1.8677878   2.0228667   5.9135213 ]
  [-3.576446    0.14594889  3.8683438 ]
  [-0.24092889  1.4024372   3.0458033 ]]

 [[-3.566337    0.8237189   5.213775  ]
  [-0.8187914   1.5101621   3.8391156 ]
  [-0.50213134 -0.6737598  -0.8453883 ]
  [ 2.8964443   2.4565587   2.016673  ]]], shape=(2, 4, 3), dtype=float32)

Example for vector multiplication#

  • suppose we instead want to compute the linear predictor as follows: \( \mu = \beta_0 + \beta_1x \)

# extract the two model parameters
b0=prior_samples[:,:,0] # shape=(B,num_samples)
b1=prior_samples[:,:,1] # shape=(B,num_samples)

# compute linear predictor mu;
# shapes: [B,num_samples,1] + [B,num_samples,1] * [1,num_obs] = [B,num_samples,num_obs]
# Note: With 'None' we can easily expand a tensor with a new dimension 
mu = b0[:,:,None]+b1[:,:,None]*X[:,1][None,:] 
print("mu:\n", mu)
<tf.Tensor: shape=(2, 4, 3), dtype=float32, numpy=
array([[[-1.9752605 ,  0.21421784,  2.4036963 ],
        [-1.8677878 ,  2.0228667 ,  5.9135213 ],
        [-3.576446  ,  0.14594889,  3.8683438 ],
        [-0.24092889,  1.4024372 ,  3.0458033 ]],

       [[-3.566337  ,  0.8237189 ,  5.213775  ],
        [-0.8187914 ,  1.5101621 ,  3.8391156 ],
        [-0.50213134, -0.6737598 , -0.8453883 ],
        [ 2.8964443 ,  2.4565587 ,  2.016673  ]]], dtype=float32)>
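
Both formulations compute the same linear predictor. As a final sketch, we can verify that the matrix version and the broadcasting version agree up to floating-point precision:

# sketch only: both versions of the linear predictor should agree
mu_matmul = tf.matmul(prior_samples, X, transpose_b=True)
mu_broadcast = b0[:, :, None] + b1[:, :, None] * X[:, 1][None, :]

# maximum absolute difference should be (numerically) zero
print(tf.reduce_max(tf.abs(mu_matmul - mu_broadcast)))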