Loss Functions

The loss module provides the objective functions minimized during neural network training. Each loss wraps a BellmanPeriod and encodes a different optimality criterion: static reward maximization, estimated discounted lifetime reward, Bellman equation residuals, or Euler equation residuals.

class skagent.loss.BellmanEquationLoss(bellman_period, value_function, parameters=None, agent=None, foc_weight=0.0)

Creates a Bellman equation loss function for the Maliar method.

The Bellman equation is: V(s) = max_c { u(s,c,ε) + β E_ε’[V(s’)] } where s’ = f(s,c,ε) is the next state given current state s, control c, and shock ε, and the expectation E_ε’ is taken over future shock realizations ε’.
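As an illustration of what a Bellman residual measures, the sketch below evaluates V(s) minus the right-hand side of the Bellman equation for a deterministic cake-eating problem with log utility, where the value function and the optimal policy are known in closed form. The model, the function names, the discount factor, and the choice of a mean squared residual are all assumptions made for this example; none of them is part of skagent, and the toy problem has no shock, so the expectation is trivial.

```python
import math

BETA = 0.95  # assumed discount factor, for illustration only


def true_value(w, beta=BETA):
    """Closed-form V(w) = A + B*log(w) for cake eating with log utility."""
    b = 1.0 / (1.0 - beta)
    a = (math.log(1.0 - beta) + beta / (1.0 - beta) * math.log(beta)) / (1.0 - beta)
    return a + b * math.log(w)


def bellman_residual(value_fn, w, beta=BETA):
    """V(w) - [u(c) + beta * V(w')] under the known optimal policy c = (1 - beta) * w."""
    c = (1.0 - beta) * w   # optimal consumption for this toy model
    w_next = w - c         # transition: remaining cake
    return value_fn(w) - (math.log(c) + beta * value_fn(w_next))


def bellman_loss(value_fn, grid, beta=BETA):
    """Mean squared Bellman residual over a grid of states (hypothetical helper)."""
    return sum(bellman_residual(value_fn, w, beta) ** 2 for w in grid) / len(grid)


grid = [0.5, 1.0, 2.0, 4.0]
exact_loss = bellman_loss(true_value, grid)             # essentially zero
wrong_loss = bellman_loss(lambda w: math.log(w), grid)  # a deliberately wrong guess
```

The true value function drives the residual to zero up to floating-point error, while an incorrect guess leaves a clearly positive loss, which is the signal a training procedure would minimize.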

This function expects the input grid to contain two independent shock realizations:

- {shock_sym}_0: shocks for period t (used for immediate reward and transitions)
- {shock_sym}_1: shocks for period t+1 (used for continuation value evaluation)
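A minimal sketch of a grid following this naming convention, assuming a model with one state `m` and one shock whose symbol is `theta`. Both names, the plain-dict layout, and the lognormal distribution are hypothetical; only the `{shock_sym}_0` / `{shock_sym}_1` key pattern comes from the description above, and the grid object skagent actually expects may be constructed differently.

```python
import random

rng = random.Random(0)  # seeded for reproducibility
n = 5                   # number of grid points (illustrative)

grid = {
    # hypothetical state variable
    "m": [1.0 + 0.5 * i for i in range(n)],
    # period-t shocks: used for immediate reward and transitions
    "theta_0": [rng.lognormvariate(0.0, 0.1) for _ in range(n)],
    # period-(t+1) shocks: drawn independently, used for the continuation value
    "theta_1": [rng.lognormvariate(0.0, 0.1) for _ in range(n)],
}
```

Drawing the two shock columns separately is what makes the realizations independent; reusing one column for both keys would not satisfy the requirement stated above.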

Parameters:

bellman_period (BellmanPeriod) – The model block containing dynamics, rewards, and shocks
value_function (Union[dict[str, Callable], Callable]) – A value function that takes state variables and returns value estimates
parameters (dict[str, Any] | None) – Model parameters for calibration
agent (str | None) – Agent identifier for rewards
foc_weight (float)

class skagent.loss.CustomLoss(loss_function, block, parameters=None, other_dr=None)

A custom loss function that computes the negative reward for a block, assuming it is executed just once (a non-dynamic model)

TODO: leaving this ambiguous about Blocks and BellmanPeriods for now.

class skagent.loss.EstimatedDiscountedLifetimeRewardLoss(bellman_period, big_t, parameters)

A loss function for a Block that computes the discounted lifetime reward for T time periods.

Parameters:

bellman_period
big_t (int) – The number of time steps to compute reward for
parameters
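The quantity being estimated can be sketched as a discounted sum of per-period rewards over big_t periods. The helper below is hypothetical and is not the library's implementation; starting the discounting at t = 0 and negating the sum so that it can be minimized are both assumptions of this example.

```python
def discounted_lifetime_reward(rewards, beta):
    """Sum of beta**t * r_t for t = 0, ..., T-1 (hypothetical helper)."""
    total = 0.0
    for t, r in enumerate(rewards):
        total += (beta ** t) * r
    return total


rewards = [1.0, 1.0, 1.0]   # illustrative per-period rewards, so big_t = 3
big_t = len(rewards)
lifetime = discounted_lifetime_reward(rewards, beta=0.5)  # 1 + 0.5 + 0.25
loss = -lifetime            # assumed sign convention: minimize the negative reward
```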

class skagent.loss.EulerEquationLoss(bellman_period, parameters=None, agent=None, weight=1.0, constrained=False)

Creates an Euler equation loss function for the Maliar method.

The Euler equation is the first-order condition from the Bellman equation, relating marginal rewards across periods. For a DSOP with control \(x_t\), arrival states \(s_t\), and pre-decision states \(m_t\), this loss function computes the Euler equation residual:

\[f = u'(x_t) + \beta \cdot u'(x_{t+1}) \cdot \sum_s \left[ \frac{\partial s_{t+1}}{\partial x_t} \cdot \frac{\partial m'}{\partial s_{t+1}} \right]\]

where \(f\) is the residual that equals zero at optimality, \(s_{t+1}\) is the next-period arrival state, and \(m'\) is the next-period pre-decision state.

The discount factor \(\beta\) is obtained from the BellmanPeriod via bellman_period.discount_variable, so it adapts to the model’s calibration.
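To see how the residual formula above reduces to a familiar Euler equation, consider a consumption-saving model with \(s_{t+1} = m_t - x_t\) and \(m' = R \cdot s_{t+1} + y\), so that \(\partial s_{t+1} / \partial x_t = -1\) and \(\partial m' / \partial s_{t+1} = R\), giving \(f = u'(x_t) - \beta R \, u'(x_{t+1})\). The sketch below evaluates this residual with CRRA utility. The model, the utility function, and all function names are assumptions for illustration and are not taken from skagent; the values of R and beta match the calibration used in the Examples section.

```python
def crra_marginal_utility(c, gamma):
    """u'(c) = c**(-gamma) for CRRA utility (assumed functional form)."""
    return c ** (-gamma)


def euler_residual(c_t, c_next, beta, R, gamma):
    """f = u'(x_t) + beta * u'(x_{t+1}) * (ds_{t+1}/dx_t) * (dm'/ds_{t+1})."""
    ds_dx = -1.0  # s_{t+1} = m_t - x_t
    dm_ds = R     # m' = R * s_{t+1} + y
    return (
        crra_marginal_utility(c_t, gamma)
        + beta * crra_marginal_utility(c_next, gamma) * ds_dx * dm_ds
    )


beta, R, gamma = 0.95, 1.04, 2.0
c_t = 1.0

# Consumption growth that satisfies the Euler equation exactly.
c_opt = c_t * (beta * R) ** (1.0 / gamma)
residual_at_optimum = euler_residual(c_t, c_opt, beta, R, gamma)

# Flat consumption leaves a nonzero residual of 1 - beta * R.
residual_flat = euler_residual(c_t, c_t, beta, R, gamma)
```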

Multi-control support:

For models with \(J\) control variables, a separate Euler residual is computed per control. The loss sums over all controls: \(L = \sum_j w \cdot f_j^2\).
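The aggregation over controls can be sketched as follows. The helper name, the control names, and the scalar residual values are hypothetical; how the library reduces over a batch of grid points is not specified here and is left out of the sketch.

```python
def euler_loss(residuals_by_control, weight=1.0):
    """L = sum_j weight * f_j**2 over all controls (hypothetical helper)."""
    return sum(weight * f ** 2 for f in residuals_by_control.values())


# Two illustrative controls with made-up residuals.
loss = euler_loss({"c": 0.5, "alpha": -2.0}, weight=2.0)  # 2 * (0.25 + 4.0)
```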

Handling Inequality Constraints (Fischer-Burmeister):

When constrained=True and a control has an upper_bound defined on its Control object, the complementarity conditions

\[f \geq 0, \quad s \geq 0, \quad f \cdot s = 0\]

(where \(s\) is the constraint slack) are replaced by the smooth Fischer-Burmeister equation (Maliar et al. 2021, equation 25):

\[\text{FB}(f, s) = f + s - \sqrt{f^2 + s^2} = 0\]

For controls without an explicit upper_bound, the loss falls back to the one-sided \(\text{relu}(-f)^2\) formulation.
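The two formulations above can be sketched in plain Python. The function names are chosen for this example and are not the library's; the point is that the Fischer-Burmeister function is zero exactly when both arguments are non-negative and at least one of them is zero, while the one-sided penalty only punishes negative residuals.

```python
import math


def fischer_burmeister(f, s):
    """FB(f, s) = f + s - sqrt(f**2 + s**2); zero iff f >= 0, s >= 0, f * s = 0."""
    return f + s - math.sqrt(f * f + s * s)


def one_sided_penalty(f):
    """relu(-f)**2: penalizes only negative residuals."""
    return max(-f, 0.0) ** 2


fb_interior = fischer_burmeister(0.0, 2.0)   # slack constraint, zero residual
fb_binding = fischer_burmeister(3.0, 0.0)    # binding constraint, positive residual
fb_violation = fischer_burmeister(3.0, 4.0)  # both positive: complementarity fails
fb_negative = fischer_burmeister(-1.0, 2.0)  # negative residual: infeasible
```

Squaring the Fischer-Burmeister value would give a smooth penalty that vanishes only when the complementarity conditions hold; whether the library squares it in the same way as the unconstrained residual is not stated here.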

Scope: upper bounds only. The constrained mode currently models the upper-bound side of the complementarity condition: \(f \geq 0\), \(s = ub - c \geq 0\), \(f \cdot s = 0\). Although Control accepts both lower_bound and upper_bound, lower-bound constraints are not yet handled here; bilateral support requires also flipping the residual sign for lower-binding cases (FB(-f, c - lb)) and is left as a follow-up.

Parameters:

bellman_period (BellmanPeriod) – The model block containing dynamics, rewards, and shocks.
parameters (dict[str, Any] | None) – Model parameters for calibration.
agent (str | None) – Agent identifier for rewards.
weight (float) – Exogenous weight for combining multiple optimality conditions (default: 1.0). This corresponds to the vector \(v\) in equation (12) of Maliar et al. (2021).
constrained (bool) – If True, use Fischer-Burmeister or one-sided loss for upper-bound constrained controls (default: False).

Examples

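>>> from skagent.loss import EulerEquationLoss
>>> # BellmanPeriod and block are assumed to be imported and defined elsewhere
>>> bp = BellmanPeriod(block, "beta", calibration={"R": 1.04, "beta": 0.95})
>>> loss_fn = EulerEquationLoss(bp, parameters={"R": 1.04, "beta": 0.95})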

class skagent.loss.StaticRewardLoss(bellman_period, parameters, other_dr=None)

A loss function that computes the negative reward for a block, assuming it is executed just once (a non-dynamic model).

skagent.loss.static_reward(bellman_period, dr, states, shocks=None, parameters=None, agent=None)

Returns the reward for an agent for a block, given a decision rule, states, shocks, and calibration.

Parameters:

bellman_period (BellmanPeriod) – The Bellman period object containing the model.
dr (dict or callable) – Decision rules (dict of functions), or a decision function.
states (dict) – Initial states, symbols to values.
shocks (dict, optional) – Shock variable values.
parameters (dict, optional) – Calibration parameters.
agent (str or None, optional) – Name of reference agent for rewards.