Model Interpretability Papers

April 28, 2025 · 1 min · 51 characters · Yangless

To start, here are a few pieces on model interpretability that are worth rereading:

Reflections on Qualitative Research

Feature Visualization

Tracing the thoughts of a large language model - Anthropic