Publications
2024
- Neural differential equations for temperature control in buildings under demand response programs. Vincent Taboga, Clement Gehring, Mathieu Le Cam, Hanane Dagdougui, Pierre-Luc Bacon. Applied Energy, Volume 368, 2024.
- Do Transformer World Models Give Better Policy Gradients? Michel Ma, Tianwei Ni, Clement Gehring, Pierluca D'Oro, Pierre-Luc Bacon. ICML 2024.
- Maximum entropy GFlowNets with soft Q-learning. Sobhan Mohammadpour, Emmanuel Bengio, Emma Frejinger, Pierre-Luc Bacon. AISTATS 2024.
- Decoupling regularization from the action space. Sobhan Mohammadpour, Pierre-Luc Bacon, Emma Frejinger. ICLR 2024.
- Bridging State and History Representations: Understanding Self-Predictive RL. Tianwei Ni, Benjamin Eysenbach, Erfan Seyedsalehi, Michel Ma, Clement Gehring, Aditya Mahajan, Pierre-Luc Bacon. ICLR 2024.
- Course Correcting Koopman Representations. Mahan Fathi, Clement Gehring, Jonathan Pilault, David Kanaa, Pierre-Luc Bacon, Ross Goroshin. ICLR 2024.
- Motif: Intrinsic Motivation from Artificial Intelligence Feedback. Martin Klissarov, Pierluca D'Oro, Shagun Sodhani, Roberta Raileanu, Pierre-Luc Bacon, Pascal Vincent, Amy Zhang, Mikael Henaff. ICLR 2024.
2023
- When Do Transformers Shine in RL? Decoupling Memory from Credit Assignment. Tianwei Ni, Michel Ma, Benjamin Eysenbach, Pierre-Luc Bacon. NeurIPS 2023 oral
- Block-State Transformers. Jonathan Pilault, Mahan Fathi, Orhan Firat, Christopher Pal, Pierre-Luc Bacon, Ross Goroshin. NeurIPS 2023 poster
- Policy Optimization in a Noisy Neighborhood: On Return Landscapes in Continuous Control. Nathan Rahn, Pierluca D'Oro, Harley Wiltzer, Pierre-Luc Bacon, Marc G Bellemare. NeurIPS 2023 poster
- Double Gumbel Q-Learning. David Yu-Tung Hui, Aaron Courville, Pierre-Luc Bacon. NeurIPS 2023 spotlight
- Sample-Efficient Reinforcement Learning by Breaking the Replay Ratio Barrier. Pierluca D'Oro, Max Schwarzer, Evgenii Nikishin, Pierre-Luc Bacon, Marc G Bellemare, Aaron Courville. ICLR 2023 notable top 5%
2022
- Myriad: a real-world testbed to bridge trajectory optimization and deep learning. Nikolaus H. R. Howe, Simon Dufort-Labbé, Nitarshan Rajkumar, Pierre-Luc Bacon. NeurIPS 2022 Datasets and Benchmarks
- The Primacy Bias in Deep Reinforcement Learning. Evgenii Nikishin*, Max Schwarzer*, Pierluca D'Oro*, Pierre-Luc Bacon, Aaron Courville. ICML 2022 and RLDM 2022
- Direct Behavior Specification via Constrained Reinforcement Learning. Julien Roy, Roger Girgis, Joshua Romoff, Pierre-Luc Bacon, Christopher Pal. ICML 2022
- Continuous-Time Meta-Learning with Forward Mode Differentiation. Tristan Deleu, David Kanaa, Leo Feng, Giancarlo Kerg, Yoshua Bengio, Guillaume Lajoie, Pierre-Luc Bacon. ICLR, 2022.
2021
- Pierluca D'Oro, Pierre-Luc Bacon. Meta Dynamic Programming. NeurIPS workshop "Metacognition in the Age of AI: Challenges and Opportunities", 2021.
- Michel Ma, Pierluca D'Oro, Pierre-Luc Bacon. Long-Term Credit Assignment via Model-based Temporal Shortcuts. NeurIPS Deep Reinforcement Learning Workshop, 2021.
- Andreea Deac, Petar Veličković, Ognjen Milinković, Pierre-Luc Bacon, Jian Tang, Mladen Nikolić. Neural Algorithmic Reasoners are Implicit Planners. NeurIPS, 2021.
- Evgenii Nikishin, Romina Abachi, Rishabh Agarwal, Pierre-Luc Bacon. "Control-Oriented Model-Based Reinforcement Learning with Implicit Differentiation". AAAI, 2022. (arXiv)
2020
- Michel Ma, Pierre-Luc Bacon. Counterfactual Policy Evaluation and the Conditional Monte Carlo Method. NeurIPS workshop on Offline Reinforcement Learning, 2020.
- Yao Liu, Pierre-Luc Bacon, Emma Brunskill. "Understanding the Curse of Horizon in Off-Policy Evaluation via Conditional Importance Sampling". Thirty-seventh International Conference on Machine Learning (ICML), 2020. (arXiv)
- Jean Harb, Tom Schaul, Doina Precup, Pierre-Luc Bacon. "Policy Evaluation Networks". In submission. (arXiv)
- Joshua Romoff, Peter Henderson, David Kanaa, Emmanuel Bengio, Ahmed Touati, Pierre-Luc Bacon, Joelle Pineau. "TDprop: Does Jacobi Preconditioning Help Temporal Difference Learning?". Theoretical Foundations of Reinforcement Learning workshop at ICML 2020. (arXiv)
- Khimya Khetarpal, Martin Klissarov, Maxime Chevalier-Boisvert, Pierre-Luc Bacon, Doina Precup. Options of Interest: Temporal Abstraction with Interest Functions. Thirty-fourth AAAI Conference On Artificial Intelligence (AAAI), 2020.