LLM

QKV Decomposition for Transformer XAI Paper · 2026

Diagnose transformer prediction failures from the weights alone, then correct them by retraining a single layer. On GPT-2, capital-city accuracy rises from 2/8 to 8/8 with zero side effects, and the fix works through any of three routes: attention, FFN, or V-only (590K params).

LLM
Interpretability
Model editing
Transformers
Insights on intelligence

Zenodo (DOI) ↗
HuggingFace dashboard ↗

Dissecting BERT Layers: FFN Dual Role, Separability-Guided Layer Skip, and Interpretable Classification via Charge-Flow Learning Paper · 2026

Layer-level analysis of BERT on five GLUE tasks, applying a forward-primary learning framework. Three findings: (1) separability-guided layer skip with a compensation classifier gives lossless compression on 3 of the 5 tasks; (2) the FFN's role decomposes as 92% structural (norm normalization) and 8% classification, which explains why removing the FFN hurts even when individual layers look harmful to classification; (3) 60–93% of misclassifications are high-confidence errors, indicating that the BERT CLS vector itself is the fundamental limitation.

LLM
Interpretability
Model compression
Transformers
Model editing
Insights on intelligence

Zenodo (DOI) ↗

Works

Notes

Deep learning is editable