Collaborate

Central to this research is the ability for humans and robots to collaborate safely and effectively in shared physical endeavours. In this theme we focus on complex human-robot interactions aimed at completing joint tasks. By design, our Flagships require us to address open issues in human-robot collaboration, including physical interaction, and how synergy between robot and human capabilities can be created and exploited towards a common goal. Drawing on the capabilities developed in the other themes, the focus here is how to integrate uncertain or learned models of human states – including affect and trust – and robot actions into planning approaches capable of actively shaping interactions and providing formal guarantees over joint behaviour.
 

Rigter M, Lacerda B & Hawes N

Conference on Neural Information Processing Systems (NeurIPS 2021)

In this work, we consider the problem of risk-averse decision-making in an unknown environment. We pose this problem as optimising the conditional value at risk (CVaR) of the total return in Bayes-adaptive Markov decision processes (MDPs). We show that a policy optimising CVaR in this setting is risk-averse to both the epistemic uncertainty due to the prior distribution over MDPs, and the aleatoric uncertainty due to the inherent stochasticity of MDPs.
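As a minimal illustration of the risk measure being optimised (not the paper's algorithm), the CVaR of the return at level α is the expected return conditioned on falling in the α worst-case tail. An empirical estimate over sampled returns can be sketched as:

```python
import numpy as np

def empirical_cvar(returns, alpha=0.1):
    """Empirical CVaR_alpha: mean of the worst alpha-fraction of returns.

    With returns where higher is better, CVaR at level alpha is the
    expected return conditioned on being in the alpha worst-case tail.
    """
    returns = np.sort(np.asarray(returns, dtype=float))  # ascending: worst first
    k = max(1, int(np.ceil(alpha * len(returns))))       # size of the tail
    return returns[:k].mean()

# Illustrative sampled returns: mostly good outcomes, one rare disaster.
samples = [10.0, 9.0, 8.0, -50.0, 11.0, 10.0, 9.5, 8.5, 10.5, 9.0]
risk_neutral = np.mean(samples)            # expectation ignores the rare bad outcome
risk_averse = empirical_cvar(samples, 0.1) # CVaR is dominated by it
```

A risk-neutral policy optimises the expectation; a CVaR-optimising policy is penalised heavily by the rare bad outcome, which is what makes it averse to both epistemic and aleatoric uncertainty over returns.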


We reformulate the problem as a two-player stochastic game and propose an approximate algorithm based on Monte Carlo tree search and Bayesian optimisation. Our experiments demonstrate that our approach significantly outperforms baseline approaches for this problem.
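The full algorithm combines tree search with Bayesian optimisation over the adversary's choices; as a generic sketch of just the MCTS ingredient, the standard UCT rule selects which child node to descend into by trading off estimated value against exploration:

```python
import math

def uct_select(children, c=1.4):
    """UCT child selection: maximise mean value plus an exploration bonus.

    children: list of dicts with 'visits' and 'total_value' statistics.
    This is the textbook MCTS selection step, shown for illustration;
    it is not the paper's full two-player algorithm.
    """
    n_parent = sum(ch["visits"] for ch in children)

    def uct(ch):
        if ch["visits"] == 0:
            return float("inf")  # always try unvisited children first
        mean = ch["total_value"] / ch["visits"]
        return mean + c * math.sqrt(math.log(n_parent) / ch["visits"])

    return max(children, key=uct)
```

Repeated rounds of selection, expansion, simulation, and backup concentrate visits on promising branches while the bonus term keeps rarely-tried actions from being starved.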

Costen C, Rigter M, Lacerda B & Hawes N
We consider shared autonomy systems where multiple operators (AI and human) can interact with the environment, e.g. by controlling a robot. The decision problem for the shared autonomy system is to select which operator takes control at each timestep, such that a reward specifying the intended system behaviour is maximised. The performance of the human operator is influenced by unobserved factors, such as fatigue or skill level. Therefore, the system must reason over stochastic models of operator performance.
We present a framework for stochastic operators in shared autonomy systems (SO-SAS), where we represent operators using rich, partially observable models. We formalise SO-SAS as a mixed-observability Markov decision process, where environment states are fully observable and internal operator states are hidden. We test SO-SAS on a simulated domain and a computer game, showing empirically that it outperforms traditional formulations of shared autonomy systems.
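Because the operator state is hidden, such a system maintains a belief over it and updates that belief from observed outcomes. A minimal Bayes-filter sketch of this kind of update, with hypothetical hidden states and matrices chosen purely for illustration:

```python
import numpy as np

def belief_update(belief, T, O, obs):
    """One Bayes-filter step over a hidden operator state.

    belief: prior distribution over hidden states (e.g. rested/fatigued)
    T[s, s']: hidden-state transition probabilities (hypothetical dynamics)
    O[s', o]: likelihood of observation o given new hidden state s'
              (e.g. success/failure of the operator's last action)
    obs: index of the observation just received
    """
    predicted = belief @ T             # predict the hidden-state transition
    posterior = predicted * O[:, obs]  # weight by the observation likelihood
    return posterior / posterior.sum()  # renormalise to a distribution

# Hypothetical example: a rested operator succeeds 90% of the time,
# a fatigued one 40%; fatigue sets in with probability 0.1 per step.
belief = np.array([1.0, 0.0])                 # start certain: rested
T = np.array([[0.9, 0.1], [0.0, 1.0]])        # rested may become fatigued
O = np.array([[0.9, 0.1], [0.4, 0.6]])        # columns: success, failure
belief = belief_update(belief, T, O, 1)       # observe a failure
```

After observing a failure, probability mass shifts towards the fatigued state; the system can then weigh handing control to the AI operator against the reward model.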