Research
My research is on sequential decision-making problems under uncertainty and potentially limited feedback. In particular, I work on multi-armed bandit, reinforcement learning, and online learning problems. The problems I work on are often motivated by issues arising in applications such as education and healthcare, but my main focus is on their theoretical aspects.
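To give a flavour of the multi-armed bandit setting, here is a minimal illustrative sketch of the classical UCB1 algorithm on Bernoulli arms. This is a generic textbook example for readers unfamiliar with the area, not an implementation from any of the papers listed below; the arm means and horizon are made up for illustration.

```python
import math
import random

def ucb1(means, horizon, seed=0):
    """Run UCB1 on Bernoulli arms with the given (unknown to the
    learner) true means. Illustrative sketch only."""
    rng = random.Random(seed)
    n_arms = len(means)
    counts = [0] * n_arms      # number of pulls per arm
    totals = [0.0] * n_arms    # cumulative reward per arm
    reward = 0.0
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1  # play each arm once to initialise
        else:
            # choose the arm maximising empirical mean + confidence bonus
            arm = max(range(n_arms),
                      key=lambda a: totals[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        r = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        totals[arm] += r
        reward += r
    return counts, reward
```

Over a long horizon, the exploration bonus shrinks for well-sampled arms, so the algorithm concentrates its pulls on the arm with the highest mean while still occasionally re-checking the others.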
Publications
* indicates alphabetical ordering of authors, † indicates co-first authors.
B. Howson, S. Filippi, C. Pike-Burke, QuACK: A Multipurpose Queuing Algorithm for Cooperative k-Armed Bandits, in Artificial Intelligence and Statistics (AISTATS), 2025.
G. Drappo†, A. Robert†, M. Restelli, A. A. Faisal, A. M. Metelli, C. Pike-Burke, Efficient Exploitation of Hierarchical Structure in Sparse Reward Reinforcement Learning, in Artificial Intelligence and Statistics (AISTATS), 2025.
J. Lazarro, C. Pike-Burke, Fixed-Budget Change Point Identification in Piecewise Constant Bandits, in Artificial Intelligence and Statistics (AISTATS), 2025.
M. Abeille, D. Janz, C. Pike-Burke*, When and why randomised exploration works (in linear bandits), in Algorithmic Learning Theory (ALT), 2025.
R. Zhu, C. Pike-Burke, and F. Mintert, Active Learning for Quantum Mechanical Measurements, Physical Review A, 2024.
E. Johnson, C. Pike-Burke, and P. Rebeschini, Sample-Efficiency in Multi-Batch Reinforcement Learning: The Need for Dimension-Dependent Adaptivity, in International Conference on Learning Representations (ICLR), 2024.
G. Lugosi, C. Pike-Burke, and P. A. Savalle*, Bandit Problems with Fidelity Rewards, Journal of Machine Learning Research (JMLR), 2023.
E. Johnson, C. Pike-Burke, and P. Rebeschini, Optimal Convergence Rate for Exact Policy Mirror Descent in Discounted Markov Decision Processes, in Neural Information Processing Systems (NeurIPS), 2023.
A. Robert, C. Pike-Burke, A. Faisal, Sample Complexity of Goal-Conditioned Hierarchical Reinforcement Learning, 2023.
S. Vakili, D. Ahmed, A. Bernacchia, C. Pike-Burke, Delayed Feedback in Kernel Bandits, in International Conference on Machine Learning (ICML), 2023.
D. van der Hoeven†, C. Pike-Burke†, H. Qiu and N. Cesa-Bianchi, Trading Off Payments and Accuracy in Online Classification with Paid Stochastic Experts, in International Conference on Machine Learning (ICML), 2023.
B. Howson, C. Pike-Burke and S. Filippi, Delayed Feedback in Generalised Linear Bandits Revisited, in Artificial Intelligence and Statistics (AISTATS), 2023.
B. Howson, C. Pike-Burke and S. Filippi, Delayed Feedback in Episodic Reinforcement Learning, in Artificial Intelligence and Statistics (AISTATS), 2023.
M. Monaci, C. Pike-Burke and A. Santini, Exact Algorithms for the 0-1 Time-Bomb Knapsack Problem, Computers &amp; Operations Research, 2022.
N. Bhatia, C. Pike-Burke, E. Normando and O. Matar, Reinforcement learning with digital human models of varying visual characteristics, in International Digital Human Modeling Symposium, 2022.
E. Garcelon, V. Perchet, C. Pike-Burke and M. Pirotta, Local Differentially Private Regret Minimization in Reinforcement Learning, in Neural Information Processing Systems (NeurIPS), 2021.
G. Neu and C. Pike-Burke*, A Unifying View of Optimism in Episodic Reinforcement Learning, in Neural Information Processing Systems (NeurIPS), 2020.
C. Pike-Burke and S. Grünewälder, Recovering Bandits, in Neural Information Processing Systems (NeurIPS), 2019.
C. Pike-Burke, S. Agrawal, C. Szepesvári and S. Grünewälder, Bandits with Delayed, Aggregated Anonymous Feedback, in International Conference on Machine Learning (ICML), 2018.
C. Pike-Burke and S. Grünewälder, Optimistic Planning for the Stochastic Knapsack Problem, in Artificial Intelligence and Statistics (AISTATS), 2017.
Workshop Papers without Longer Versions
E. Garcelon, V. Perchet, C. Pike-Burke and M. Pirotta, Bridging The Gap between Local and Joint Differential Privacy in RL, Workshop on Reinforcement Learning Theory, ICML, 2021.
C. Pike-Burke and S. Grünewälder, Optimistic Planning for Question Selection, in NeurIPS workshop on Machine Learning for Education, 2016.
Thesis
My PhD focused on sequential decision-making problems arising in education software, in particular the problem of selecting which questions to give to students. I studied several variants of the multi-armed bandit problem motivated by these issues. My thesis was written in collaboration with Sparx and can be accessed here.