Pierre-Luc Bacon

I'm an assistant professor at University of Montreal's DIRO, a member of Mila and the Institute for Data Valorization (IVADO), and I hold a Facebook CIFAR AI chair. I completed my PhD thesis under the supervision of Doina Precup and worked with Emma Brunskill for my postdoc. My research generally pertains to the challenge posed by the curse of horizon when learning and planning over long time spans. I'm interested in tackling this problem from a representation learning perspective grounded in optimization methods.

2020

Yao Liu, Pierre-Luc Bacon, Emma Brunskill. "Understanding the Curse of Horizon in Off-Policy Evaluation via Conditional Importance Sampling". Thirty-seventh International Conference on Machine Learning (ICML), 2020. (arXiv)
Jean Harb, Tom Schaul, Doina Precup, Pierre-Luc Bacon. "Policy Evaluation Networks". In submission. (arXiv)
Joshua Romoff, Peter Henderson, David Kanaa, Emmanuel Bengio, Ahmed Touati, Pierre-Luc Bacon, Joelle Pineau. "TDprop: Does Jacobi Preconditioning Help Temporal Difference Learning?". Theoretical Foundations of Reinforcement Learning workshop at ICML 2020. (arXiv)
Khimya Khetarpal, Martin Klissarov, Maxime Chevalier-Boisvert, Pierre-Luc Bacon, Doina Precup. "Options of Interest: Temporal Abstraction with Interest Functions". Thirty-fourth AAAI Conference On Artificial Intelligence (AAAI), 2020.

2019

Pierre-Luc Bacon, Florian T. Schaefer, Clement Gehring, Animashree Anandkumar, Emma Brunskill. "A Lagrangian Method for Inverse Problems in Reinforcement Learning". NeurIPS 2019 Optimization Foundations for Reinforcement Learning Workshop.
Benjamin Petit, Loren Amdahl-Culleton, Yao Liu, Jimmy Smith, Pierre-Luc Bacon. "All-Action Policy Gradient Methods: A Numerical Integration Approach". NeurIPS 2019 Optimization Foundations for Reinforcement Learning Workshop.
Pierre-Luc Bacon, Dilip Arumugam, Emma Brunskill. "Goal-Directed Learning as a Bi-level Optimization Problem". 4th Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM), 2019.

2018

Pierre-Luc Bacon. "Temporal Representation Learning". PhD Thesis. McGill University, Montreal, June 2018.
Pierre-Luc Bacon and Doina Precup. "Constructing Temporal Abstractions Autonomously in Reinforcement Learning". Association for the Advancement of Artificial Intelligence (AAAI). p. 39. 2018.
Ahmed Touati, Pierre-Luc Bacon, Doina Precup, Pascal Vincent. "Convergent Tree-Backup and Retrace with Function Approximation". In proceedings of the 35th International Conference on Machine Learning (ICML), 2018. (camera ready, arXiv)
Anna Harutyunyan, Peter Vrancx, Pierre-Luc Bacon, Doina Precup, Ann Nowe. "Learning with Options that Terminate Off-Policy". Thirty-second AAAI Conference On Artificial Intelligence (AAAI), 2018. (camera ready, arXiv)
Jean Harb*, Pierre-Luc Bacon*, Martin Klissarov, Doina Precup. "When Waiting is not an Option: Learning Options with a Deliberation Cost". Thirty-second AAAI Conference On Artificial Intelligence (AAAI), 2018. (camera ready, arXiv)
Peter Henderson, Wei-Di Chang, Pierre-Luc Bacon, David Meger, Joelle Pineau, Doina Precup. "OptionGAN: Learning Joint Reward-Policy Options using Generative Adversarial Inverse Reinforcement Learning". Thirty-second AAAI Conference On Artificial Intelligence (AAAI), 2018. (camera ready, arXiv)
Daniel J. Mankowitz, Timothy Mann, Pierre-Luc Bacon, Shie Mannor, Doina Precup. "Learning Robust Options". Thirty-second AAAI Conference On Artificial Intelligence (AAAI), 2018. (camera ready)

2017

Pierre-Luc Bacon, Doina Precup. "Unifying Multi-Step Methods through Matrix Splitting". 3rd Multidisciplinary Conference on Reinforcement Learning and Decision Making, 2017.
Martin Klissarov, Pierre-Luc Bacon, Jean Harb, Doina Precup. "Learnings Options End-to-End for Continuous Action Tasks". Hierarchical Reinforcement Learning Workshop (NIPS), 2017.
Pierre-Luc Bacon, Doina Precup. "A Unified View on Multi-Steps Methods using Matrix Splittings". Data Learning and Inference (DALI), 2017.
Pierre-Luc Bacon, Jean Harb, Doina Precup. "The Option-Critic Architecture". Thirty-first AAAI Conference On Artificial Intelligence (AAAI), 2017. (camera ready, arXiv, slides)

2016

Pierre-Luc Bacon, Doina Precup. "A Matrix Splitting Perspective on Planning with Options". Continual Learning and Deep Networks Workshop, NIPS 2016. (poster)
Doina Precup, Pierre-Luc Bacon. "Advances in Option Construction: The option-critic architecture". Abstraction in RL Workshop, ICML 2016. (video, slides)
Pierre-Luc Bacon and Doina Precup. "The good, the bad and the discovery: the specification problem of options discovery". 10th Barbados Workshop on Reinforcement Learning, 2016.
Emmanuel Bengio, Pierre-Luc Bacon, Joelle Pineau, Doina Precup. "Conditional Computation in Neural Networks for faster models". CoRR abs/1511.06297, 2016.

2015

Pierre-Luc Bacon and Doina Precup. "Learning with options: Just deliberate and relax". Bounded Optimality and Rational Metareasoning Workshop, NIPS 2015. (poster)
Pierre-Luc Bacon and Doina Precup. "The option-critic architecture". Deep Reinforcement Learning Workshop, NIPS 2015. (poster)
Pierre-Luc Bacon, Borja Balle and Doina Precup. "Learning and Planning with Timing Information in Markov Decision Processes". 31st Conference on Uncertainty in Artificial Intelligence (UAI), 2015. (poster)
Joelle Pineau, Pierre-Luc Bacon. "Analyzing Open Data from the City of Montreal". 2nd ICML Workshop on Mining Urban Data (MUD), 2015.
Pierre-Luc Bacon, Doina Precup. "Learning Recognizers". 9th Barbados Workshop on Reinforcement Learning, 2015. (slides)
Pierre-Luc Bacon, Emmanuel Bengio, Doina Precup, Joelle Pineau. "Conditional computation in neural networks using a decision-theoretic approach". 2nd Multidisciplinary Conference on Reinforcement Learning and Decision Making, 2015.
Pierre-Luc Bacon, Borja Balle and Doina Precup. "Learning and Planning with Timing Information in Markov Decision Processes". 2nd Multidisciplinary Conference on Reinforcement Learning and Decision Making, 2015.

2014

Pierre-Luc Bacon, Borja Balle and Doina Precup. "Predictive Timing Models". 2014 NIPS Workshop "From Bad Models to Good Policies". (poster, slides, video)

2013

Pierre-Luc Bacon and Doina Precup. "Using Label Propagation for Learning Temporally Abstract Actions in Reinforcement Learning". AAMAS Workshop on "Multiagent Interaction Networks", 2013.
Pierre-Luc Bacon. "On the Bottleneck Concept for Options Discovery: Theoretical Underpinnings and Extension in Continuous State Spaces". Master's thesis, McGill University, 2013.