Phase 113: Interpretability and Mechanistic Understanding
Phase 113 of the AI Encyclopedia covers Interpretability and Mechanistic Understanding, topics 2241–2260.
The 20 concepts grouped under this phase are listed below; each is a future article in the Insightful AI World encyclopedia.
2241 Explainable AI
2242 Interpretability
2243 Feature Attribution
2244 Saliency Maps
2245 SHAP
2246 LIME
2247 Concept Activation Vectors
2248 Representation Probing
2249 Circuit Analysis
2250 Mechanistic Interpretability
2251 Neuron Interpretability
2252 Attention Head Analysis
2253 Induction Heads
2254 Sparse Autoencoders for Interpretability
2255 Feature Visualization
2256 Causal Tracing
2257 Model Editing
2258 Interpretability Evaluation
2259 Limits of Interpretability
2260 Interpretability for Alignment