Stewy Slocum

I'm a third-year PhD student at MIT CSAIL, advised by Dylan Hadfield-Menell. I develop tools to ensure that increasingly powerful LLMs and AI agents are safe and aligned with human values. My work spans two main areas:

Adversarial defenses and evaluations
Preference learning and alignment

I like building methods that apply clear theoretical insights to important problems. My goal is to develop these insights into empirically effective, practical, and scalable tools for alignment and safety.

Previously, I earned my bachelor's degree from Johns Hopkins, where I worked with Professor Rene Vidal on the theory of deep learning. As an undergraduate, I also worked with Dyno Therapeutics and NASA on ML-guided protein design.

Outside of work, I enjoy playing piano, backpacking, salsa dancing, and philosophy.

Please reach out at stew@csail.mit.edu if you'd like to talk!

Google Scholar | Resume | Goodreads