2023 MIT Mechanistic Interpretability Conference


  • Massachusetts Institute of Technology, 54 Vassar Street, Cambridge, MA 02139, United States

Schedule

Saturday, May 6

0800: Breakfast
0900: Overview & LLM MI

  • Max Tegmark (MIT): Welcome (video)

  • Dylan Hadfield-Menell (MIT): Interpretability overview (video)

  • Chris Olah (Anthropic): The SOTA of LLM Mechanistic Interpretability (video)

Coffee break

  • Mor Geva (Google DeepMind): Interpreting LLMs in embedding space (video)

  • David Bau (Northeastern): How LLMs remember facts (video)

Panel with Chris Olah, David Bau & Mor Geva, moderated by Neel Nanda: “What do & don’t we understand about LLMs?”
Lightning intros
1200: Lunch
1300: More LLM MI

  • Jacob Andreas & Evan Hernandez (MIT): How LLMs model people’s beliefs (video)

  • Ekin Akyürek (MIT): How LLMs can do linear regression at runtime (video)

  • Eric Michaud (MIT): Understanding LLM scaling in terms of computational quanta (video)

  • János Kramár (Google DeepMind): Compiling any algorithm into a transformer (video)

1420: Group photo
1430: Poster Session
1530: MI beyond LLMs

  • Tony Wang (MIT): How a human beat AlphaGo (video)

  • Ellie Pavlick (Brown): Neural network subroutines (video)

  • Ziming Liu (MIT): MI of knowledge representations, symmetry & modularity (video)

  • Sharon Li (Wisconsin): How unique are knowledge representations? (video)

  • Buck Shlegeris (Redwood): Formalism for thinking about MI (video)

  • Martin Wattenberg (Harvard): Learned world models and what they’re good for (video)

Panel with Ila Fiete (MIT), Tommy Poggio (MIT), Gabriel Kreiman (Harvard): MI inspiration from neuroscience, physics & math (video)
1800-2100: Dinner Cruise, scintillating conversation

Sunday, May 7

0800: Breakfast
0900: Morning session: MI for AI safety

  • Panel with Victoria Krakovna (Google DeepMind/FLI), Connor Leahy (Conjecture), Sharon Li (Wisconsin), Anthony Aguirre (FLI): AGI safety (video)

  • Neel Nanda (Google DeepMind): How MI can help AI safety (video)

  • Connor Leahy (Conjecture): MI for AGI safety (video)

Coffee break

  • Steve Omohundro: Provably safe AGI (video)

  • Silviu Marian Udrescu (MIT): Symbolic regression (video)

  • Marin Soljačić (MIT): Symbolic regression & applications

1200: Lunch
1300: Lightning talks (video)
1400-1800: Project incubation

  • Neel Nanda (Google DeepMind), part II: Whirlwind tour of MI open problems (video)

  • Panel with Neel Nanda (Google DeepMind), Steve Omohundro & Martin Wattenberg (Harvard), moderated by Chris Olah (Anthropic): Promising MI research directions (video)

  • All group leaders looking for collaborators stand up & introduce themselves, lightning style

Coffee break
1515: Project incubator unconference, block I: Breakouts across tables in the atrium, one MI research direction per table. In parallel, Wes Gurnee (MIT) & Neel Nanda run an MI tutorial hackathon in Singleton Auditorium for whoever wants to get their feet wet.
1615: Project incubator unconference, block II
1715: Report-back from breakouts, closing remarks (video)
1800: Conference dinner, mingling, scintillating conversation
