Phase 113: Interpretability and Mechanistic Understanding

Part of the AI Encyclopedia · Phase 113 of 130 · Topics 2241–2260

This phase covers interpretability and mechanistic understanding. The 20 concepts listed below are grouped under this phase; each is a future article in the Insightful AI World encyclopedia.

2241 Explainable AI

2242 Interpretability

2243 Feature Attribution

2244 Saliency Maps

2245 SHAP

2246 LIME

2247 Concept Activation Vectors

2248 Representation Probing

2249 Circuit Analysis

2250 Mechanistic Interpretability

2251 Neuron Interpretability

2252 Attention Head Analysis

2253 Induction Heads

2254 Sparse Autoencoders for Interpretability

2255 Feature Visualization

2256 Causal Tracing

2257 Model Editing

2258 Interpretability Evaluation

2259 Limits of Interpretability

2260 Interpretability for Alignment
