147 results for "reinforcement learning"
- Publication: Position: Leverage Foundational Models for Black-Box Optimization
- Publication: Understanding Learning from Human Preferences
- Publication: MONA: Myopic Optimization with Non-myopic Approval Can Mitigate Multi-step Reward Hacking
- Publication: Diffusion model predictive control
- Publication: Online RL in Linearly $q^\pi$-Realizable MDPs Is as Easy as in Linear MDPs If You Learn What to Ignore
- Publication: Distributional Bellman Operators over Mean-embeddings
- Publication: Exposing Limitations of Language Model Agents in Sequential-Task Compositions on the Web
- Publication: Self-Predictive Universal AI
- Publication: Robust Exploration via Clustering-based Density Estimation
- Publication: An Introduction to Universal Artificial Intelligence