elicit.elicit module#

class elicit.elicit.Dtype(vtype: str, dim: int)[source]#

Bases: object

Creates a TensorFlow scalar or array depending on the vtype attribute.

Returns:
tf.Tensor

Tensor of correct shape depending on vtype and dim.

Attributes:
vtype : str, ("real", "array")

Type of input parameter x.

dim : int

Dimensionality of input parameter x. For a scalar: dim=1; for a vector: dim>1.

__init__(vtype: str, dim: int)[source]#
__call__(x: Tensor) Tensor[source]#

Call self as a function.
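
Examples

A minimal usage sketch (assuming, as documented above, that __call__ casts its input to a tensor whose shape is determined by vtype and dim; tf denotes tensorflow):

>>> as_real = Dtype(vtype="real", dim=1)
>>> theta = as_real(tf.constant(0.5))           # scalar tf.Tensor
>>> as_array = Dtype(vtype="array", dim=2)
>>> thetas = as_array(tf.constant([0.5, 1.0]))  # tf.Tensor of shape (2,)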

elicit.elicit.hyper(name: str, lower: float = -inf, upper: float = inf, vtype: str = 'real', dim: int = 1, shared: bool = False) Hyper[source]#

Specification of prior hyperparameters.

Parameters:
name : str

Custom name of the hyperparameter.

lower : float

Lower bound of the hyperparameter. The default is unbounded: float("-inf").

upper : float

Upper bound of the hyperparameter. The default is unbounded: float("inf").

vtype : str, ("real", "array")

Hyperparameter type. The default is "real".

dim : int

Dimensionality of the variable. Only required if vtype="array". The default is 1.

shared : bool

Whether the hyperparameter is shared between model parameters. The default is False.

Returns:
hyppar_dict : dict

Dictionary including all hyperparameter settings.

Raises:
ValueError

lower and upper only take values that are float, including "-inf"/"inf".

lower must not exceed upper.

vtype can only be either "real" or "array".

dim cannot be 1 if vtype="array".

Examples

>>> # sigma hyperparameter of a parametric distribution
>>> el.hyper(name="sigma0", lower=0)
>>> # shared hyperparameter
>>> el.hyper(name="sigma", lower=0, shared=True)
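>>> # array-valued hyperparameter (a sketch; the name "mus" is
>>> # illustrative, and dim must match the array dimensionality)
>>> el.hyper(name="mus", vtype="array", dim=2)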
elicit.elicit.parameter(name: str, family: Distribution | None = None, hyperparams: Dict[str, Hyper] | None = None, lower: float = -inf, upper: float = inf) Parameter[source]#

Specification of model parameters.

Parameters:
name : str

Custom name of the parameter.

family : tfp.distributions.Distribution, optional

Prior distribution family for the model parameter. Only required for the parametric_prior method. Must be a tfp.distributions object.

hyperparams : dict, optional

Hyperparameters of the distribution specified in family. Only required for the parametric_prior method. Structure of the dictionary: keys must match the arguments of the tfp.distributions object, and values have to be specified using the hyper() method. Further details are provided in How-To specify prior hyperparameters (TODO). The default is None.

lower : float

Only used if method="deep_prior". Lower bound of the parameter. The default is float("-inf").

upper : float

Only used if method="deep_prior". Upper bound of the parameter. The default is float("inf").

Returns:
param_dict : dict

Dictionary including all model (hyper)parameter settings.

Raises:
ValueError

family must be a tfp.distributions object.

hyperparams must be a dict whose keys correspond to arguments of the tfp.distributions object in family. An error is raised if a key does not correspond to any argument of the distribution.

Examples

>>> el.parameter(name="beta0",
>>>              family=tfd.Normal,
>>>              hyperparams=dict(loc=el.hyper("mu0"),
>>>                               scale=el.hyper("sigma0", lower=0)
>>>                               )
>>>              )
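>>> # parameter for the deep_prior method (a sketch; no family or
>>> # hyperparams are required, only optional bounds)
>>> el.parameter(name="sigma", lower=0)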
elicit.elicit.model(obj: Callable, **kwargs) Dict[str, Any][source]#

Specification of the generative model.

Parameters:
obj : class

Class that implements the generative model. See How-To specify the generative_model for details (TODO).

**kwargs : keyword arguments

Additional keyword arguments expected by obj.

Returns:
generator_dict : dict

Dictionary including all generative model settings.

Raises:
ValueError

The generative model in obj requires the input argument prior_samples, but the argument has not been found.

Optional argument(s) of the generative model in obj have not been specified.

Examples

>>> # specify the generative model class
>>> class ToyModel:
>>>     def __call__(self, prior_samples, design_matrix, **kwargs):
>>>         # linear predictor
>>>         epred = tf.matmul(prior_samples, design_matrix,
>>>                           transpose_b=True)
>>>         # data-generating model
>>>         likelihood = tfd.Normal(
>>>             loc=epred, scale=tf.expand_dims(prior_samples[:, :, -1], -1)
>>>             )
>>>         # prior predictive distribution
>>>         ypred = likelihood.sample()
>>>
>>>         return dict(
>>>             likelihood=likelihood,
>>>             ypred=ypred, epred=epred,
>>>             prior_samples=prior_samples
>>>             )
>>> # specify the model category in the elicit object
>>> el.model(obj=ToyModel,
>>>          design_matrix=design_matrix
>>>          )
class elicit.elicit.Queries[source]#

Bases: object

quantiles(quantiles: Tuple[float, ...]) QueriesDict[source]#

Implements a quantile-based elicitation technique.

Parameters:
quantiles : tuple

Tuple with the respective quantiles, ranging between 0 and 1.

Returns:
elicit_dict : dict

Dictionary including the quantile settings.

Raises:
ValueError

Quantiles have to be specified as probabilities ranging between 0 and 1.
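
Examples

>>> # five quantiles of the target quantity (as also used in target() below)
>>> el.queries.quantiles((.05, .25, .50, .75, .95))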

identity() QueriesDict[source]#

Implements an identity function. Should be used if no further transformation of the target quantity is required.

Returns:
elicit_dict : dict

Dictionary including the identity settings.
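
Examples

>>> # no arguments are required
>>> el.queries.identity()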

correlation() QueriesDict[source]#

Implements a method to calculate the Pearson correlation between model parameters.

Returns:
elicit_dict : dict

Dictionary including the correlation settings.
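
Examples

>>> # as also used in target() below
>>> el.queries.correlation()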

custom(func: Callable) QueriesDict[source]#

Implements a placeholder for custom target methods. The custom method can be passed as an argument.

Parameters:
func : callable

Custom target method.

Returns:
elicit_dict : dict

Dictionary including the custom settings.
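
Examples

A minimal sketch, assuming a hypothetical user-defined callable that maps samples of the target quantity to elicited statistics:

>>> def mean_query(x):  # hypothetical custom query
>>>     return tf.reduce_mean(x, axis=-1)
>>> el.queries.custom(func=mean_query)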

elicit.elicit.target(name: str, loss: Callable, query: QueriesDict, target_method: Callable | None = None, weight: float = 1.0) Target[source]#

Specification of target quantity and corresponding elicitation technique.

Parameters:
name : str

Name of the target quantity. Two approaches are possible: (1) The target quantity is identical to an output of the generative model: the name must match the output variable name. (2) A custom target quantity is computed using the target_method argument.

loss : callable

Loss function for computing the discrepancy between expert data and model simulations. Implemented classes can be found in elicit.losses. The default is the maximum mean discrepancy with an energy kernel: elicit.losses.MMD2().

query : dict

Specification of the elicitation technique, using one of the methods implemented in Queries(). See How-To specify custom elicitation techniques (TODO).

target_method : callable, optional

Custom method for computing a target quantity. Note: This method has not been implemented yet and will raise a NotImplementedError. See the corresponding GitHub issue #34 for further information. The default is None.

weight : float

Weight of the corresponding elicited quantity in the total loss. The default is 1.0.

Returns:
target_dict : dict

Dictionary including all settings regarding the target quantity and the corresponding elicitation technique.

Examples

>>> el.target(name="y_X0",
>>>           query=el.queries.quantiles((.05, .25, .50, .75, .95)),
>>>           loss=el.losses.MMD2(kernel="energy"),
>>>           weight=1.0
>>>           )
>>> el.target(name="correlation",
>>>           query=el.queries.correlation(),
>>>           loss=el.losses.L2,
>>>           weight=1.0
>>>           )
class elicit.elicit.Expert[source]#

Bases: object

data(dat: Dict[str, list]) ExpertDict[source]#

Provide elicited-expert data for learning prior distributions.

Parameters:
dat : dict

Elicited data from an expert, provided as a dictionary. Data must be provided in a standardized format. Use elicit.utils.get_expert_datformat() to get the correct data format for your method specification.

Returns:
expert_data : dict

Expert-elicited information used for learning the prior distributions.

Examples

>>> expert_dat = {
>>>     "quantiles_y_X0": [-12.55, -0.57, 3.29, 7.14, 19.15],
>>>     "quantiles_y_X1": [-11.18, 1.45, 5.06, 8.83, 20.42],
>>>     "quantiles_y_X2": [-9.28, 3.09, 6.83, 10.55, 23.29]
>>> }
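>>> # pass the elicited data to the Elicit object (see Elicit below)
>>> el.expert.data(dat=expert_dat)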
simulator(ground_truth: dict, num_samples: int = 10000) ExpertDict[source]#

Simulate data from an oracle by defining a ground truth (true prior distribution(s)). See Explanation: Simulating from an oracle (TODO) for further details.

Parameters:
ground_truth : dict

True prior distribution(s). Keys refer to parameter names, and values to prior distributions implemented as tfp.distributions objects with predetermined hyperparameter values. You can specify a prior distribution for each model parameter, a joint prior for all model parameters at once, or any approach in between. The only requirement is that the dimensionality of all priors in ground_truth matches the number of model parameters. The order of priors in ground_truth must match the order of the parameters argument of elicit.elicit.Elicit().

num_samples : int

Number of draws from the prior distribution. It is recommended to use a high value to minimize sampling variation. The default is 10_000.

Returns:
expert_data : dict

Settings of the oracle for simulating from the ground truth. The true elicited statistics are used as expert data in the loss function.

Examples

>>> el.expert.simulator(
>>>     ground_truth = {
>>>         "beta0": tfd.Normal(loc=5, scale=1),
>>>         "beta1": tfd.Normal(loc=2, scale=1),
>>>         "sigma": tfd.HalfNormal(scale=10.0),
>>>     },
>>>     num_samples = 10_000
>>> )
>>> el.expert.simulator(
>>>     ground_truth = {
>>>         "betas": tfd.MultivariateNormalDiag([5.,2.], [1.,1.]),
>>>         "sigma": tfd.HalfNormal(scale=10.0),
>>>     },
>>>     num_samples = 10_000
>>> )
>>> el.expert.simulator(
>>>     ground_truth = {
>>>         "thetas": tfd.MultivariateNormalDiag([5.,2.,1.],
>>>                                              [1.,1.,1.]),
>>>     },
>>>     num_samples = 10_000
>>> )
elicit.elicit.optimizer(optimizer: tf.keras.optimizers.Optimizer = tf.keras.optimizers.Adam, **kwargs) Dict[str, Any][source]#

Specification of optimizer and its settings for SGD.

Parameters:
optimizer : callable, tf.keras.optimizers object

Optimizer used for SGD. Must be an optimizer implemented in tf.keras.optimizers. The default is tf.keras.optimizers.Adam.

**kwargs : keyword arguments

Additional keyword arguments expected by optimizer.

Returns:
optimizer_dict : dict

Dictionary specifying the SGD optimizer and its additional settings.

Raises:
TypeError

optimizer is not a tf.keras.optimizers object.

ValueError

optimizer could not be found in tf.keras.optimizers.

Examples

>>> optimizer=el.optimizer(
>>>     optimizer=tf.keras.optimizers.Adam,
>>>     learning_rate=0.1,
>>>     clipnorm=1.0
>>> )
elicit.elicit.initializer(method: str | None = None, distribution: Uniform | None = None, loss_quantile: float | None = None, iterations: int | None = None, hyperparams: dict | None = None) Initializer[source]#

Only necessary for method parametric_prior: Initialization of hyperparameter values. Two approaches are currently possible:

  1. Specify explicit initial values for each hyperparameter.

  2. Use one of the implemented sampling approaches to draw initial values from one of the provided initialization distributions.

In (2), initial values for each hyperparameter are drawn from a uniform distribution ranging from mean - radius to mean + radius. Further details on the implemented initialization method can be found in Explanation: Initialization method.

Parameters:
method : str, optional

Name of the initialization method. Currently supported are "random", "lhs", and "sobol".

distribution : dict, optional

Specification of the initialization distribution. Currently implemented: elicit.initialization.uniform().

loss_quantile : float, optional

Quantile indicating which loss value should be used for selecting the initial hyperparameters. Specified as a probability value between 0 and 1.

iterations : int, optional

Number of samples drawn from the initialization distribution.

hyperparams : dict, optional

Dictionary with specific initial values per hyperparameter. Note: Initial values are considered to be on the unconstrained scale. Use the forward method of elicit.utils.LowerBound(), elicit.utils.UpperBound(), and elicit.utils.DoubleBound() to transform a constrained hyperparameter into an unconstrained one. In the hyperparams dictionary, keys refer to hyperparameter names (as specified in hyper()) and values to the respective initial values.

Returns:
init_dict : dict

Dictionary specifying the initialization method.

Raises:
ValueError

method can only take the values "random", "sobol", or "lhs".

loss_quantile must be a probability ranging between 0 and 1.

Either method or hyperparams has to be specified.

Examples

>>> el.initializer(
>>>     method="lhs",
>>>     loss_quantile=0,
>>>     iterations=32,
>>>     distribution=el.initialization.uniform(
>>>         radius=1,
>>>         mean=0
>>>         )
>>>     )
>>> el.initializer(
>>>     hyperparams = dict(
>>>         mu0=0., sigma0=el.utils.LowerBound(lower=0.).forward(0.3),
>>>         mu1=1., sigma1=el.utils.LowerBound(lower=0.).forward(0.5),
>>>         sigma2=el.utils.LowerBound(lower=0.).forward(0.4)
>>>         )
>>>     )
elicit.elicit.trainer(method: str, seed: int, epochs: int, B: int = 128, num_samples: int = 200) Trainer[source]#

Specification of training settings for learning the prior distribution(s).

Parameters:
method : str

Method for learning the prior distribution. Available are parametric_prior, for learning independent parametric priors, and deep_prior, for learning a joint non-parametric prior.

seed : int

Seed used for learning.

epochs : int

Number of iterations until training is stopped.

B : int

Batch size. The default is 128.

num_samples : int

Number of samples from the prior(s). The default is 200.

Returns:
train_dict : dict

Dictionary specifying the training settings for learning the prior distribution(s).

Raises:
ValueError

method can only take the values "parametric_prior" or "deep_prior".

epochs can only take positive integers. The minimum number of epochs is 1.

Examples

>>> el.trainer(
>>>     method="parametric_prior",
>>>     seed=0,
>>>     epochs=400,
>>>     B=128,
>>>     num_samples=200
>>> )
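>>> # deep_prior variant (a sketch; the epoch count is illustrative)
>>> el.trainer(
>>>     method="deep_prior",
>>>     seed=0,
>>>     epochs=600
>>> )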
class elicit.elicit.Elicit(model: Dict[str, Any], parameters: List[Parameter], targets: List[Target], expert: ExpertDict, trainer: Trainer, optimizer: Dict[str, Any], network: NFDict | None = None, initializer: Initializer | None = None)[source]#

Bases: object

Parameters:
model : dict

Specification of the generative model using model().

parameters : list

List of model parameters specified with parameter().

targets : list

List of target quantities specified with target().

expert : dict

Input data from an expert, or data simulated from an oracle, specified with either the data or the simulator method of the elicit.elicit.Expert module.

trainer : dict

Specification of training settings and meta-information for the workflow using trainer().

optimizer : dict

Specification of the SGD optimizer and its settings using optimizer().

network : dict, optional

Specification of the neural network using a method implemented in elicit.networks. Only required for the deep_prior method. For parametric_prior, use None.

initializer : dict, optional

Specification of initialization settings using initializer(). Only required for the parametric_prior method. Otherwise the argument should be None.

Returns:
eliobj : class instance

Specification of all settings to run the elicitation workflow and fit the eliobj.

Raises:
AssertionError

Expert data are not in the required format. The correct specification of keys can be checked using el.utils.get_expert_datformat().

The dimensionality of ground_truth for simulating expert data must match the number of model parameters.

ValueError

If method="deep_prior", network cannot be None and initializer should be None.

If method="deep_prior", num_params as specified in the network_specs argument (section: network) does not match the number of parameters specified in the parameters section.

If method="parametric_prior", network should be None and initializer cannot be None.

If method="parametric_prior" and multiple hyperparameters have the same name but are not shared by setting shared=True.

If hyperparams is specified in the initializer section and a hyperparameter name (a key in the hyperparams dict) does not match any hyperparameter name specified in hyper().

NotImplementedError

[network] Currently only the standard normal distribution is implemented as base distribution. See GitHub issue #35.
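
Examples

A minimal sketch of assembling an Elicit object for the parametric_prior method (the names ToyModel, design_matrix, beta0, beta1, sigma, target_y_X0, and expert_dat are placeholders for objects created with the helpers above):

>>> eliobj = el.Elicit(
>>>     model=el.model(obj=ToyModel, design_matrix=design_matrix),
>>>     parameters=[beta0, beta1, sigma],
>>>     targets=[target_y_X0],
>>>     expert=el.expert.data(dat=expert_dat),
>>>     optimizer=el.optimizer(optimizer=tf.keras.optimizers.Adam,
>>>                            learning_rate=0.1),
>>>     trainer=el.trainer(method="parametric_prior", seed=0, epochs=400),
>>>     initializer=el.initializer(
>>>         method="sobol",
>>>         loss_quantile=0,
>>>         iterations=32,
>>>         distribution=el.initialization.uniform(radius=1, mean=0)
>>>     )
>>> )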

__init__(model: Dict[str, Any], parameters: List[Parameter], targets: List[Target], expert: ExpertDict, trainer: Trainer, optimizer: Dict[str, Any], network: NFDict | None = None, initializer: Initializer | None = None)[source]#

fit(save_history: SaveHist = {'hyperparameter': True, 'hyperparameter_gradient': True, 'loss': True, 'loss_component': True, 'time': True}, save_results: SaveResults = {'elicited_statistics': True, 'expert_elicited_statistics': True, 'expert_prior_samples': True, 'init_loss_list': True, 'init_matrix': True, 'init_prior': True, 'loss_tensor_expert': True, 'loss_tensor_model': True, 'model_samples': True, 'prior_samples': True, 'target_quantities': True}, overwrite: bool = False, parallel: Parallel | None = None) None[source]#

Method for fitting the eliobj and learning the prior distributions.

Parameters:
save_history : dict, elicit.utils.save_history()

Exclude or include sub-results in the final result file. The history object contains all results that are saved across epochs. For usage information see How-To: Save and load the eliobj.

save_results : dict, elicit.utils.save_results()

Exclude or include sub-results in the final result file. The results object contains all results that are saved for the last epoch only. For usage information see How-To: Save and load the eliobj.

overwrite : bool

If the eliobj has already been fitted and the user wants to refit it, the user is asked whether they want to overwrite the previous fitting results. Setting overwrite=True forces overwriting without prompting the user. The default is False.

parallel : dict, elicit.utils.parallel(), optional

Parallelization settings if multiple trainings should run in parallel.

Examples

>>> eliobj.fit()
>>> eliobj.fit(overwrite=True,
>>>            save_history=el.utils.save_history(
>>>                loss_component=False
>>>                )
>>>            )
>>> eliobj.fit(parallel=el.utils.parallel(runs=4))
save(name: str | None = None, file: str | None = None, overwrite: bool = False)[source]#

Method for saving the eliobj to disk.

Parameters:
name : str, optional

File name used to store the eliobj. Saving is done according to the following rule: ./{method}/{name}_{seed}.pkl, with method and seed being arguments of elicit.elicit.trainer().

file : str, optional

User-specific path for saving the eliobj. If file is specified, name must be None. The default is None.

overwrite : bool

If a fitted object already exists at the same path, the user is asked whether it should be overwritten. Setting overwrite=True disables this prompt; in that case the results are overwritten automatically without prompting the user. The default is False.

Raises:
AssertionError

name and file cannot be specified simultaneously.

Examples

>>> eliobj.save(name="toymodel")
>>> eliobj.save(file="res/toymodel", overwrite=True)
update(**kwargs)[source]#

Method for updating attributes of the Elicit class. Updating an eliobj leads to an automatic reset of results.

Parameters:
**kwargs

Keyword arguments used for updating attributes of the Elicit class. Each key must correspond to an attribute of the class, and the value is the updated value.

Raises:
ValueError

The key of a provided keyword argument is not an eliobj attribute. Please check dir(eliobj).

Examples

>>> eliobj.update(parameters=updated_parameters)
workflow(seed: int) Tuple[dict, dict][source]#

Helper method that runs the main workflow of the prior elicitation method: get expert data, initialize the method, and run the optimization. Results are returned for further post-processing.

Parameters:
seed : int

Seed used for reproducing results.

Returns:
(results, history) : Tuple[dict, dict]

Results and history of the optimization process.
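
Examples

A minimal sketch (assuming an Elicit instance eliobj as assembled above):

>>> results, history = eliobj.workflow(seed=0)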