Redwood Research AI Safety Fellowship, Cambridge MARS
Nov 2025 - Mar 2026
- Developed stronger red-team attack policies to support more rigorous AI control evaluations.
- Applied DSPy's GEPA optimizer to systematically refine red-team prompts.
- Analyzed how late-starting and early-stopping attack policies affect safety across a range of thresholds.