muppet.explainers.fit
FIT (Feature Importance in Time) explainer for time series models.
This module implements the Instance-wise Feature Importance in Time (FIT) method for explaining time series classification models. FIT evaluates the importance of observations by quantifying the shift in the predictive distribution over time using KL divergence. Unlike traditional XAI methods, FIT specifically addresses the temporal nature of time series data and controls for time-dependent distribution shifts.
MUPPET Component Integration
- Explorer: `RepeatedTimestepExplorer` – generates timestep-wise masks for temporal perturbation
- Perturbator: `ConditionalSamplingGeneratorPertubator` – applies conditional-sampling-based perturbations using trained generators
- Attributor: `ProbaShiftAttributor` – calculates distributional shifts using probability differences
- Aggregator: `MonteCarloKLAggregator` – aggregates KL divergences from Monte Carlo sampling
Classes:

- `FITExplainer` – Implementation of the FIT method for time series explanation.
Technical Details

Workflow:

1. For a given input example and a set of features to explain \(S\), FIT calculates a saliency map showing the importance of \(S\) at every time step. It does so by perturbing the remaining features, those in \(\overline{S}\) (the complement of \(S\)), with values sampled from a distribution conditioned on \(S\) and fitted on the historical data (up to the currently explained time step \(t\)).
The score of $S$ at time step $t$ is calculated by a difference measure between:
- **Temporal shift of $X$ at time $t$ against itself at $t-1$**: $P(y \mid X_{0:t}) \,\|\, P(y \mid X_{0:t-1})$,
- **Unexplained distribution shift between $X$ and $X'$ at time $t$**: $P(y \mid X_{0:t}) \,\|\, P(y \mid X'_{0:t})$,
where $X'_{0:t} = (X_{0:t-1}, x'_t)$ and $x'_t$ keeps the observed values $x_{S,t}$ of the features in $S$, while the values of the features in $\overline{S}$ at time $t$ are perturbed and imputed by the generator.
2. Supports univariate and multivariate time series.
3. Implements only KL divergence as the difference measure; more measures may be added later.
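The two KL terms above, and how they combine into a per-timestep score, can be sketched numerically. This is an illustration only, not MUPPET's implementation: the helper names `kl_divergence` and `fit_score` are hypothetical, and the score is formed as the temporal shift minus the unexplained shift, following the FIT paper's construction.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL divergence between two categorical distributions p and q."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def fit_score(p_t, p_prev, p_perturbed):
    """FIT-style importance of the explained feature set S at time t.

    p_t         : model output given the full input X_{0:t}
    p_prev      : model output given X_{0:t-1}
    p_perturbed : model output given X'_{0:t} (features outside S imputed)
    """
    temporal_shift = kl_divergence(p_t, p_prev)          # total shift at time t
    unexplained_shift = kl_divergence(p_t, p_perturbed)  # shift not attributable to S
    return temporal_shift - unexplained_shift
```

If perturbing the features outside \(S\) leaves the prediction unchanged (`p_perturbed == p_t`), the whole temporal shift is attributed to \(S\); if the perturbed prediction falls back to the previous one (`p_perturbed == p_prev`), the score is zero.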
References
Tonekaboni, Sana, et al. "Instance-wise feature importance in time." Advances in Neural Information Processing Systems 33 (2020): 21757-21767. https://papers.nips.cc/paper/2020/hash/08fa43588c2571ade19bc0fa5936e028-Abstract.html
Classes
FITExplainer
FITExplainer(
model,
train_loader,
num_sampling=100,
generator=None,
padding=None,
hidden_size=100,
latent_size=50,
mid_layer_size=50,
prediction_size=1,
num_samples=1,
cov_noise_level=0.0001,
max_noise_correction=20,
learning_rate=0.001,
num_epochs=100,
timesteps_divide_num=1,
seed=None,
)
Bases: MuppetExplainer
FIT (Feature Importance in Time) explainer implementation.
Implements the FIT method that evaluates the importance of observations by quantifying the shift in the predictive distribution over time using KL divergence. The method specifically addresses the temporal nature of time series data and controls for time-dependent distribution shifts.
The FIT method quantifies feature importance by comparing:

1. The temporal shift between predictions at consecutive time steps
2. The output distributional shift between original and perturbed inputs
This approach provides instance-specific explanations that highlight the most important time points and features throughout the entire time series sequence.
The method works by training conditional generators to create realistic perturbations and evaluating feature importance through distributional shift analysis using KL divergence.
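To make "conditional sampling" concrete: the sketch below draws the unobserved features from a conditional multivariate Gaussian given the observed ones, assuming the features are jointly Gaussian. `sample_conditional_gaussian` is a hypothetical helper illustrating what a generator like `GaussianFeatureGenerator` might do internally, not MUPPET's actual API.

```python
import numpy as np

def sample_conditional_gaussian(mu, cov, obs_idx, obs_val,
                                num_sampling=100, rng=None):
    """Sample the unobserved features conditioned on the observed ones,
    assuming all features are jointly Gaussian with mean mu and covariance cov."""
    rng = rng or np.random.default_rng(0)
    d = len(mu)
    free_idx = [i for i in range(d) if i not in obs_idx]
    mu_a, mu_b = mu[obs_idx], mu[free_idx]
    S_aa = cov[np.ix_(obs_idx, obs_idx)]
    S_ba = cov[np.ix_(free_idx, obs_idx)]
    S_bb = cov[np.ix_(free_idx, free_idx)]
    # Conditional mean: mu_b + S_ba S_aa^{-1} (x_a - mu_a)
    mu_cond = mu_b + S_ba @ np.linalg.solve(S_aa, obs_val - mu_a)
    # Conditional covariance: S_bb - S_ba S_aa^{-1} S_ab
    cov_cond = S_bb - S_ba @ np.linalg.solve(S_aa, S_ba.T)
    return rng.multivariate_normal(mu_cond, cov_cond, size=num_sampling)
```

Each of the `num_sampling` Monte Carlo draws yields one perturbed input; the KL divergences computed from those draws are then aggregated (in MUPPET, by `MonteCarloKLAggregator`).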
Initialize the FIT explainer for time series explanation.
Parameters:

- `model` (`Module`) – The black-box model to explain. It must output the probability distribution over the set of classes.
- `train_loader` (`DataLoader`) – The training data loader.
- `num_sampling` (`int`, default: `100`) – Number of Monte Carlo samplings of the perturbed values.
- `generator` (`GaussianFeatureGenerator`, default: `None`) – The generator used to impute the perturbed values; it must implement the `self.generate()` method. Defaults to `None`, meaning a `GaussianFeatureGenerator` is created and trained on the provided train and test datasets.
- `padding` (`str`, default: `None`) – Either `"left"` or `"right"`, choosing how to apply padding when the black-box model only accepts full-length input. `None` means no padding is applied, assuming the model doesn't require it.
- `hidden_size` (`int`, default: `100`) – Hidden layer size for the generator network.
- `latent_size` (`int`, default: `50`) – Latent space dimension for the generator.
- `mid_layer_size` (`int`, default: `50`) – Middle layer size for the generator network.
- `prediction_size` (`int`, default: `1`) – Prediction output size.
- `num_samples` (`int`, default: `1`) – Number of samples to generate.
- `cov_noise_level` (`float`, default: `0.0001`) – The noise added to the covariance to make it positive definite (PD).
- `max_noise_correction` (`int`, default: `20`) – Maximum number of covariance PD correction iterations. After exceeding this number, the identity matrix is used as the covariance.
- `learning_rate` (`float`, default: `0.001`) – Learning rate used with the Adam optimizer during training.
- `num_epochs` (`int`, default: `100`) – Number of training epochs.
- `timesteps_divide_num` (`int`, default: `1`) – Used to divide the time series. E.g., when set to `1`, predict only at time \(t=T\) using \(X_{0:T-1}\).
- `seed` (`int`, default: `None`) – The seed used for deterministic sampling with the generator. If a custom generator is given, it is expected to handle reproducibility itself if needed.
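As a minimal sketch of how `cov_noise_level` and `max_noise_correction` might interact, the helper below repeatedly adds diagonal noise until the covariance passes a Cholesky (positive-definiteness) check, falling back to the identity after the iteration budget is spent. The behavior is inferred from the parameter descriptions above; the function name and structure are illustrative, not MUPPET's code.

```python
import numpy as np

def make_positive_definite(cov, cov_noise_level=1e-4, max_noise_correction=20):
    """Add diagonal noise until cov is positive definite (PD);
    after max_noise_correction failed attempts, fall back to the identity."""
    cov = np.asarray(cov, dtype=float)
    for _ in range(max_noise_correction):
        try:
            np.linalg.cholesky(cov)  # succeeds iff cov is PD
            return cov
        except np.linalg.LinAlgError:
            cov = cov + cov_noise_level * np.eye(cov.shape[0])
    return np.eye(cov.shape[0])  # give up: use the identity matrix
```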
Source code in muppet/explainers/fit.py
Functions
Re-set the generator's seed. Used to control reproducibility when perturbing.

Parameters:

- `seed` (`float`) – The seed to set.