lime
muppet.explainers.lime
LIME (Local Interpretable Model-Agnostic Explanations) explainer implementations.
This module implements Local Interpretable Model-Agnostic Explanations (LIME) for both image and tabular data. LIME explains predictions by learning locally faithful surrogate models around individual instances. The method generates random perturbations of the input, evaluates the model on these perturbations, and fits an interpretable model (typically linear regression) to approximate the black-box model's behavior locally.
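The loop described above can be sketched in a few self-contained lines (a toy black-box model and illustrative names such as `lime_explain`, not part of this module):

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

def black_box(X):
    # Stand-in black-box model: linear in features 0 and 1, ignores feature 2.
    return 3.0 * X[:, 0] + 0.5 * X[:, 1]

def lime_explain(x, model, n_masks=500, kernel_width=0.75):
    d = x.shape[0]
    masks = rng.integers(0, 2, size=(n_masks, d)).astype(float)  # on/off masks
    preds = model(masks * x)          # "off" features are set to zero
    # Similarity weighting: exponential kernel on the distance between each
    # mask and the all-ones mask (i.e. the unperturbed instance).
    dist = np.sqrt(((1.0 - masks) ** 2).sum(axis=1)) / np.sqrt(d)
    weights = np.exp(-dist ** 2 / kernel_width ** 2)
    # Fit the weighted linear surrogate; its coefficients are the attributions.
    surrogate = Ridge(alpha=1.0, fit_intercept=True)
    surrogate.fit(masks, preds, sample_weight=weights)
    return surrogate.coef_

x = np.array([1.0, 1.0, 1.0])
attributions = lime_explain(x, black_box)
```

For the toy model above, the surrogate recovers a large attribution for feature 0, a small one for feature 1, and roughly zero for the unused feature 2.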
MUPPET Component Integration
For Images (LIMEImageExplainer):
- Explorer: SegmentedBinaryRandomMasksExplorer - generates random masks over image segments/superpixels
- Perturbator: SetToZeroPerturbator - masks out image regions by setting them to zero
- Attributor: SimilarityAttributor - calculates similarity-weighted attributions based on distance from original
- Aggregator: SegmentedImageModelAggregator - fits surrogate model and maps segment importance back to pixels
For Tabular Data (LIMETabularExplainer):
- Explorer: RandomNormalExplorer - generates random feature masks drawn from a normal distribution
- Perturbator: ScaleFeaturePerturbator - scales features using generator-based perturbations
- Attributor: SimilarityAttributor - weights perturbations by similarity to original instance
- Aggregator: ModelAggregator - fits Ridge regression to learn feature importance
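The four stages listed above compose into a single pipeline. The following sketch shows that dataflow with illustrative stand-in functions (the stage names come from the lists above; the bodies are not the muppet implementations):

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)

def explorer(n_masks, n_features):
    # Explorer: propose which features to keep in each perturbation.
    return rng.integers(0, 2, size=(n_masks, n_features)).astype(float)

def perturbator(x, masks):
    # Perturbator: realise the masks as perturbed inputs (set-to-zero here).
    return masks * x

def attributor(masks):
    # Attributor: similarity weight of each perturbation to the original.
    frac_masked = 1.0 - masks.mean(axis=1)
    return np.exp(-frac_masked ** 2 / 0.25)

def aggregator(masks, preds, weights):
    # Aggregator: fit a weighted linear surrogate; coefficients = importances.
    return Ridge(alpha=1.0).fit(masks, preds, sample_weight=weights).coef_

def explain(x, model, n_masks=500):
    masks = explorer(n_masks, x.shape[0])
    preds = model(perturbator(x, masks))
    return aggregator(masks, preds, attributor(masks))

# Toy model that simply sums its inputs: only feature 0 is nonzero in x,
# so it should receive essentially all of the importance.
importances = explain(np.array([2.0, 0.0]), lambda X: X.sum(axis=1))
```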
Classes:
- LIMEImageExplainer – LIME implementation for image classification models.
- LIMETabularExplainer – LIME implementation for tabular data.
References
Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin. "Why Should I Trust You?: Explaining the Predictions of Any Classifier." Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016. https://arxiv.org/pdf/1602.04938v3.pdf
Classes
LIMEImageExplainer
LIMEImageExplainer(
model,
surrogate_model=Ridge(alpha=1, fit_intercept=True),
nmasks=500,
masked_proba=0.5,
n_segments=100,
)
Bases: MuppetExplainer
LIME implementation for image classification models.
Implements the Local Interpretable Model-Agnostic Explanations (LIME) method for image classification. LIME explains predictions through superpixel-based perturbations and segmented surrogate model fitting.
LIME's key principle is local fidelity - the explanation should accurately represent the model's behavior in the neighborhood of the specific instance being explained. This allows LIME to work with any type of model (model-agnostic) while providing human-interpretable explanations through simple surrogate models.
The method generates random masks to perturb the input, then fits an interpretable model on the perturbed dataset to obtain a surrogate that is locally faithful to the explained model around that input. For images, LIME produces a heatmap highlighting the pixel regions that drove the model's prediction.
Initialize the LIME Image explainer.
Parameters:
- model (Module) – The black-box model whose predictions are to be explained.
- surrogate_model – The regressor used to fit the local surrogate model in the aggregator.
- nmasks (int, default: 500) – Number of random masks to generate.
- masked_proba (float, default: 0.5) – Probability of masking each superpixel of the image.
- n_segments (int, default: 100) – Number of segments (superpixels) to divide the image into.
Source code in muppet/explainers/lime.py
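As a rough illustration of the superpixel masking and heatmap construction described above, the sketch below uses a plain grid in place of a real superpixel segmentation, and invented helper names rather than the muppet API:

```python
import numpy as np

rng = np.random.default_rng(0)

def grid_segments(h, w, n_per_side):
    # Assign each pixel a segment id on a coarse grid (stand-in for a
    # real superpixel algorithm such as SLIC).
    rows = np.minimum(np.arange(h) * n_per_side // h, n_per_side - 1)
    cols = np.minimum(np.arange(w) * n_per_side // w, n_per_side - 1)
    return rows[:, None] * n_per_side + cols[None, :]

def mask_image(image, segments, keep):
    # Set masked-out segments to zero (the SetToZeroPerturbator behaviour).
    return image * keep[segments]

def segment_heatmap(coefs, segments):
    # Map per-segment surrogate coefficients back to a pixel heatmap.
    return coefs[segments]

image = rng.random((8, 8))
segments = grid_segments(8, 8, n_per_side=2)   # 4 quadrant segments
keep = np.array([1, 0, 1, 1])                  # mask out segment 1 only
masked = mask_image(image, segments, keep)
heatmap = segment_heatmap(np.array([0.2, -0.1, 0.5, 0.0]), segments)
```

In the real explainer, one such mask is drawn per sample (with each superpixel masked with probability `masked_proba`), and the fitted segment coefficients become the final heatmap.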
LIMETabularExplainer
LIMETabularExplainer(
model,
train_loader,
generator=StandardGaussianTabularGenerator,
surrogate_model=Ridge(alpha=1, fit_intercept=True),
nmasks=500,
sample_around_instance=True,
seed=1,
similarity_fun=lime_similarity,
)
Bases: MuppetExplainer
LIME implementation for tabular data.
Implements the Local Interpretable Model-Agnostic Explanations (LIME) method for tabular data. LIME explains predictions by perturbing individual features and learning linear surrogate models.
LIME's key principle is local fidelity - the explanation should accurately represent the model's behavior in the neighborhood of the specific instance being explained. This allows LIME to work with any type of model (model-agnostic) while providing human-interpretable explanations through simple surrogate models.
The method generates random masks to perturb the input, then fits an interpretable model on the perturbed dataset to obtain a surrogate that is locally faithful to the explained model around that input. The final explanation is a vector giving each feature's contribution to the model's prediction.
Initialize the LIME Tabular explainer.
Parameters:
- model (Module) – The black-box model whose predictions are to be explained.
- train_loader (DataLoader) – DataLoader with training data used to fit the generator.
- generator – Generator for creating perturbed tabular data samples.
- surrogate_model – The regressor used to fit the local surrogate model.
- nmasks (int, default: 500) – Number of random masks to generate.
- sample_around_instance (bool, default: True) – Whether to sample perturbations around the instance being explained.
- seed (int, default: 1) – Random seed for reproducibility.
- similarity_fun – Similarity (kernel) function used to weight perturbed samples by their proximity to the original instance.
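The `sample_around_instance` behaviour can be illustrated with a small sketch: draw standard Gaussian noise and scale it by per-feature training statistics, centring the samples on the instance (or on the training mean when the flag is off). The function name and statistics arguments here are illustrative, not the muppet API:

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_tabular(x, train_mean, train_std, n_masks=500,
                   sample_around_instance=True):
    # Standard Gaussian noise, scaled to the training data's spread.
    noise = rng.standard_normal((n_masks, x.shape[0]))
    centre = x if sample_around_instance else train_mean
    return centre + noise * train_std

x = np.array([10.0, -2.0])
train_mean = np.array([0.0, 0.0])
train_std = np.array([1.0, 5.0])
samples = sample_tabular(x, train_mean, train_std)
```

With `sample_around_instance=True`, the perturbations cluster around `x` itself, which keeps the surrogate focused on the local neighbourhood the similarity kernel then weights.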