Loss Functions
The loss module provides the objective functions minimized during neural network
training. Each loss wraps a BellmanPeriod and encodes a different optimality
criterion: static reward maximization, estimated discounted lifetime reward,
Bellman equation residuals, or Euler equation residuals.
- class skagent.loss.BellmanEquationLoss(bellman_period, value_function, parameters=None, agent=None, foc_weight=0.0)
Creates a Bellman equation loss function for the Maliar method.
The Bellman equation is: V(s) = max_c { u(s,c,ε) + β E_ε’[V(s’)] } where s’ = f(s,c,ε) is the next state given current state s, control c, and shock ε, and the expectation E_ε’ is taken over future shock realizations ε’.
This function expects the input grid to contain two independent shock realizations:
- {shock_sym}_0: shocks for period t (used for immediate reward and transitions)
- {shock_sym}_1: shocks for period t+1 (used for continuation value evaluation)
- Parameters:
bellman_period (BellmanPeriod) – The model block containing dynamics, rewards, and shocks
value_function (Union[dict[str, Callable], Callable]) – A value function that takes state variables and returns value estimates
parameters (dict[str, Any] | None) – Model parameters for calibration
foc_weight (float)
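The role of the two shock draws can be illustrated with a minimal pure-Python sketch. It does not use the skagent API: `bellman_residual`, `bellman_loss`, and every callable passed to them are hypothetical names introduced here, the control is taken from a given policy rather than maximized over, and the expectation is approximated by a single draw of the period t+1 shock. Passing the t+1 shock directly to the value function is also an assumption of this sketch.

```python
def bellman_residual(value_fn, policy_fn, reward_fn, transition_fn, beta, s, eps_0, eps_1):
    """Single-draw Bellman residual at one grid point (all callables are hypothetical)."""
    c = policy_fn(s, eps_0)                 # control chosen in period t
    reward = reward_fn(s, c, eps_0)         # u(s, c, eps_0)
    s_next = transition_fn(s, c, eps_0)     # s' = f(s, c, eps_0)
    continuation = value_fn(s_next, eps_1)  # V at s', using the independent t+1 draw
    return value_fn(s, eps_0) - (reward + beta * continuation)


def bellman_loss(grid, **model):
    """Mean squared residual over grid rows of the form (s, eps_0, eps_1)."""
    residuals = [
        bellman_residual(s=s, eps_0=e0, eps_1=e1, **model) for s, e0, e1 in grid
    ]
    return sum(r * r for r in residuals) / len(residuals)


# Toy check: a constant reward of 1 with beta = 0.5 has V = 1 / (1 - beta) = 2.
model = dict(
    value_fn=lambda s, eps: 2.0,
    policy_fn=lambda s, eps: 0.0,
    reward_fn=lambda s, c, eps: 1.0,
    transition_fn=lambda s, c, eps: s + eps,
    beta=0.5,
)
grid = [(0.0, 0.1, -0.2), (1.0, -0.3, 0.4)]
loss_at_true_value = bellman_loss(grid, **model)
```

In this toy model the residual vanishes at the true value function, so the loss is zero; any other constant value function yields a strictly positive loss.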
- class skagent.loss.CustomLoss(loss_function, block, parameters=None, other_dr=None)
A custom loss function that computes the negative reward for a block, assuming it is executed just once (a non-dynamic model).
TODO: leaving this ambiguous about Blocks and BellmanPeriods for now.
- class skagent.loss.EstimatedDiscountedLifetimeRewardLoss(bellman_period, big_t, parameters)
A loss function for a Block that computes the discounted lifetime reward for T time periods.
- Parameters:
bellman_period
big_t (int) – The number of time steps to compute reward for
parameters
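A discounted lifetime reward over T periods is the sum of per-period rewards weighted by powers of the discount factor. The sketch below shows that accumulation in plain Python, with the loss taken as the negative of the total so that minimizing it maximizes reward. The function names and the policy, reward, and transition callables are hypothetical and are not part of skagent.

```python
def discounted_lifetime_reward(policy_fn, reward_fn, transition_fn, beta, s0, shocks):
    """Sum of beta**t * reward_t along one simulated path of len(shocks) periods."""
    total = 0.0
    discount = 1.0  # beta**0
    s = s0
    for eps in shocks:
        c = policy_fn(s, eps)
        total += discount * reward_fn(s, c, eps)
        s = transition_fn(s, c, eps)
        discount *= beta
    return total


def lifetime_reward_loss(**kwargs):
    """Negative lifetime reward, so that a minimizer maximizes reward."""
    return -discounted_lifetime_reward(**kwargs)


# Toy path: reward of 1 in each of big_t = 3 periods, beta = 0.5.
toy = dict(
    policy_fn=lambda s, eps: 0.0,
    reward_fn=lambda s, c, eps: 1.0,
    transition_fn=lambda s, c, eps: s,
    beta=0.5,
    s0=0.0,
    shocks=[0.0, 0.0, 0.0],
)
total_reward = discounted_lifetime_reward(**toy)  # 1 + 0.5 + 0.25
```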
- class skagent.loss.EulerEquationLoss(bellman_period, parameters=None, agent=None, weight=1.0, constrained=False)
Creates an Euler equation loss function for the Maliar method.
The Euler equation is the first-order condition from the Bellman equation, relating marginal rewards across periods. For a DSOP with control \(x_t\), arrival states \(s_t\), and pre-decision states \(m_t\), this loss function computes the Euler equation residual:
\[f = u'(x_t) + \beta \cdot u'(x_{t+1}) \cdot \sum_s \left[ \frac{\partial s_{t+1}}{\partial x_t} \cdot \frac{\partial m'}{\partial s_{t+1}} \right]\]
where \(f\) is the residual that equals zero at optimality, \(s_{t+1}\) is the next-period arrival state, and \(m'\) is the next-period pre-decision state.
The discount factor \(\beta\) is obtained from the BellmanPeriod via bellman_period.discount_variable, so it adapts to the model’s calibration.
Multi-control support:
For models with \(J\) control variables, a separate Euler residual is computed per control. The loss sums over all controls: \(L = \sum_j w \cdot f_j^2\).
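As a worked illustration of the residual and the summed loss, consider a consumption-saving model that is assumed here and does not come from the source: the arrival state is \(a = m - c\) and the next pre-decision state is \(m' = R a + y\), so \(\partial a / \partial c = -1\) and \(\partial m' / \partial a = R\). With log utility, \(u'(c) = 1/c\) and the residual reduces to \(1/c_t - \beta R / c_{t+1}\). The helper names below are hypothetical.

```python
def euler_residual(marginal_u_t, marginal_u_next, beta, chain_terms):
    """f = u'(x_t) + beta * u'(x_{t+1}) * sum_s (ds_{t+1}/dx_t * dm'/ds_{t+1}).

    chain_terms holds one (ds_next_dx, dm_next_ds) pair per arrival state.
    """
    chain = sum(ds_dx * dm_ds for ds_dx, dm_ds in chain_terms)
    return marginal_u_t + beta * marginal_u_next * chain


def euler_loss(residuals, weight=1.0):
    """L = sum_j w * f_j**2 over the per-control residuals."""
    return sum(weight * f * f for f in residuals)


# Assumed model: a = m - c and m' = R * a + y, with log utility u'(c) = 1 / c.
beta, R = 0.5, 2.0
c_t, c_next = 1.0, 1.0  # satisfies 1 / c_t = beta * R / c_next
f = euler_residual(1.0 / c_t, 1.0 / c_next, beta, [(-1.0, R)])
loss = euler_loss([f])
```

The plus sign in the residual is consistent with the familiar form \(u'(c_t) = \beta R u'(c_{t+1})\) because the derivative of the arrival state with respect to consumption is negative.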
Handling Inequality Constraints (Fischer-Burmeister):
When constrained=True and a control has an upper_bound defined on its Control object, the complementarity conditions
\[f \geq 0, \quad s \geq 0, \quad f \cdot s = 0\]
(where \(s\) is the constraint slack) are replaced by the smooth Fischer-Burmeister equation (Maliar et al. 2021, equation 25):
\[\text{FB}(f, s) = f + s - \sqrt{f^2 + s^2} = 0\]
For controls without an explicit upper_bound, the loss falls back to the one-sided \(\text{relu}(-f)^2\) formulation.
Scope: upper bounds only. The constrained mode currently models the upper-bound side of the complementarity condition: \(f \geq 0\), \(s = ub - c \geq 0\), \(f \cdot s = 0\). Although Control accepts both lower_bound and upper_bound, lower-bound constraints are not yet handled here; bilateral support requires also flipping the residual sign for lower-binding cases (FB(-f, c - lb)) and is left as a follow-up.
- Parameters:
bellman_period (BellmanPeriod) – The model block containing dynamics, rewards, and shocks.
parameters (dict[str, Any] | None) – Model parameters for calibration.
weight (float) – Exogenous weight for combining multiple optimality conditions (default: 1.0). This corresponds to the vector \(v\) in equation (12) of the paper.
constrained (bool) – If True, use Fischer-Burmeister or one-sided loss for upper-bound constrained controls (default: False).
Examples
>>> bp = BellmanPeriod(block, "beta", calibration={"R": 1.04, "beta": 0.95})
>>> loss_fn = EulerEquationLoss(bp, parameters={"R": 1.04, "beta": 0.95})
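The Fischer-Burmeister function used in constrained mode is zero exactly when \(f \geq 0\), \(s \geq 0\), and \(f \cdot s = 0\). The following standalone sketch evaluates it alongside the one-sided fallback. It is not the library's implementation: `constrained_euler_loss` is a hypothetical helper, and squaring the Fischer-Burmeister value before weighting is an assumption made here for illustration.

```python
import math


def fischer_burmeister(f, s):
    """FB(f, s) = f + s - sqrt(f**2 + s**2)."""
    return f + s - math.sqrt(f * f + s * s)


def one_sided(f):
    """relu(-f)**2: penalizes only negative residuals."""
    return max(-f, 0.0) ** 2


def constrained_euler_loss(residuals, slacks, weight=1.0):
    """Sum over controls; a slack of None means the control has no upper bound.

    Assumption: the FB value is squared, mirroring the squared residuals
    of the unconstrained loss.
    """
    total = 0.0
    for f, s in zip(residuals, slacks):
        term = one_sided(f) if s is None else fischer_burmeister(f, s) ** 2
        total += weight * term
    return total


fb_interior = fischer_burmeister(0.0, 3.0)  # slack constraint, zero residual
fb_binding = fischer_burmeister(2.0, 0.0)   # binding constraint, positive residual
fb_violated = fischer_burmeister(3.0, 4.0)  # f * s != 0
loss = constrained_euler_loss([3.0, -2.0], [4.0, None])
```

Both complementary cases (interior solution and binding constraint) give a Fischer-Burmeister value of zero, while a point with positive residual and positive slack does not.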
- class skagent.loss.StaticRewardLoss(bellman_period, parameters, other_dr=None)
A loss function that computes the negative reward for a block, assuming it is executed just once (a non-dynamic model).
- skagent.loss.static_reward(bellman_period, dr, states, shocks=None, parameters=None, agent=None)
Returns the reward for an agent for a block, given a decision rule, states, shocks, and calibration.
- Parameters:
bellman_period (BellmanPeriod) – The Bellman period object containing the model.
dr (dict or callable) – Decision rules (dict of functions), or a decision function.
states (dict) – Initial states, symbols to values.
shocks (dict, optional) – Shock variable values.
parameters (dict, optional) – Calibration parameters.
agent (str or None, optional) – Name of reference agent for rewards.
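The relation between a one-shot reward and the corresponding loss can be sketched without the library. The snippet below is a hypothetical stand-in rather than skagent code: it assumes that decision rules receive a dict of state and shock values, that a dict-valued dr is keyed by control symbol, and that the reward function reads everything from one merged dict.

```python
def static_reward_sketch(reward_fn, dr, states, shocks=None, parameters=None):
    """Evaluate the decision rule(s) once and return the resulting reward."""
    shocks = shocks or {}
    parameters = parameters or {}
    inputs = {**states, **shocks}
    if callable(dr):
        controls = dr(inputs)  # assumed to return {control symbol: value}
    else:
        controls = {sym: rule(inputs) for sym, rule in dr.items()}
    return reward_fn({**parameters, **inputs, **controls})


def static_reward_loss_sketch(*args, **kwargs):
    """Negative reward, so that minimizing the loss maximizes the reward."""
    return -static_reward_sketch(*args, **kwargs)


# Toy model: consume half of resources m; the reward is scale * c.
reward_fn = lambda v: v["scale"] * v["c"]
dr = {"c": lambda v: 0.5 * v["m"]}
reward = static_reward_sketch(reward_fn, dr, {"m": 2.0}, parameters={"scale": 3.0})
loss = static_reward_loss_sketch(reward_fn, dr, {"m": 2.0}, parameters={"scale": 3.0})
```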