147 results for "reinforcement learning"

Publication Position: Leverage Foundational Models for Black-Box Optimization
Publication Understanding Learning from Human Preferences
Publication MONA: Myopic Optimization with Non-myopic Approval Can Mitigate Multi-step Reward Hacking
Publication Diffusion model predictive control
Publication Online RL in Linearly $q^\pi$-Realizable MDPs Is as Easy as in Linear MDPs If You Learn What to Ignore
Publication Distributional Bellman Operators over Mean-embeddings
Publication Exposing Limitations of Language Model Agents in Sequential-Task Compositions on the Web
Publication Self-Predictive Universal AI
Publication Robust Exploration via Clustering-based Density Estimation
Publication An Introduction to Universal Artificial Intelligence