Poster | Functional Indirection Neural Estimator for Better Out-of-distribution Generalization | Kha Pham · Thai Hung Le · Man Ngo · Truyen Tran
Workshop | Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small | Kevin Wang · Alexandre Variengien · Arthur Conmy · Buck Shlegeris · Jacob Steinhardt