Publications
Explore a selection of our recent research on some of the most complex and interesting challenges in AI.
240 publications
-
- 7 May 2024
- Teach LLMs to Phish: Stealing Private Information from Language Models
-
- 7 May 2024
- Correlated Noise Provably Beats Independent Noise for Differentially Private Learning
-
- 7 May 2024
- Adaptive Hashing: Faster Hash Functions Perhaps with Fewer Collisions
-
- 7 May 2024
- π2vec: Policy Representations with Successor Features
-
- 6 May 2024
- Position: Leverage Foundational Models for Black-Box Optimization
-
- 6 May 2024
- ExeDec: Execution Decomposition for Compositional Generalization in Neural Program Synthesis
-
- 6 May 2024
- Advancing Biomedical Understanding with Multimodal Gemini
-
- 6 May 2024
- Pose Priors from Language Models
-
- 25 April 2024
- Improving Dictionary Learning with Gated Sparse Autoencoders
-
- 22 April 2024
- Holistic Safety and Responsibility Evaluations of Advanced AI Models
-
- 17 April 2024
- Many-Shot In-Context Learning
-
- 10 April 2024
- Learning Agile Soccer Skills for a Bipedal Robot with Deep Reinforcement Learning
-
- 5 April 2024
- Biomolecular dynamics with machine-learned quantum-mechanical force fields trained on diverse chemical fragments
-
- 29 March 2024
- Gecko: Versatile Text Embeddings Distilled from Large Language Models
-
- 28 March 2024
- Learning from One Continuous Video Stream
-
- 27 March 2024
- Long-form factuality in large language models
-
- 27 March 2024
- Few-Shot Recalibration of Language Models
-
- 21 March 2024
- Evaluating Frontier Models for Dangerous Capabilities
-
- 19 March 2024
- DiPaCo: Distributed Path Composition
-
- 11 March 2024
- Demonstration-Regularized RL
-
- 11 March 2024
- Model-free Posterior Sampling via Learning Rate Randomization
-
- 11 March 2024
- Prosody for Intuitive Robotic Interface Design: It's Not What You Said, It's How You Said It
-
- 11 March 2024
- Robust Exploration via Clustering-based Density Estimation
-
- 11 March 2024
- Understanding Learning from Human Preferences
-
- 4 March 2024
- AtP*: Efficient and scalable methods for localizing LLM behaviour to components
-
- 2 March 2024
- How aligned are different alignment metrics?
-
- 1 March 2024
- Approximating the Core of Cooperative Games
-
- 1 March 2024
- Towards Practical Reinforcement Learning for Tokamak Magnetic Control
-
- 29 February 2024
- Self-supervised video pretraining yields strong image representations
-
- 29 February 2024
- Bad Students Make Great Teachers: Active Learning Accelerates Large Scale Visual Understanding