At the Conference on Robot Learning (CoRL) 2023, researchers from UC Berkeley, Google DeepMind, Stanford University, and Simon Fraser University present their work on training anthropomorphic robot hands to play the piano using deep reinforcement learning (RL). They release a simulated benchmark and dataset intended to advance high-dimensional control and robotic dexterity. An interactive demo shows a piano-playing agent trained with RL, with the MuJoCo simulation running directly in the browser. The team represents musical pieces using the MIDI standard and evaluates the agent's proficiency with precision, recall, and F1 scores. By incorporating human priors in the form of fingering labels, together with several system design choices, the agent outperforms a strong derivative-free model predictive control baseline. The researchers also discuss remaining challenges, such as finger stretching and forearm thickness, and suggest potential improvements. In addition, they release a debug dataset for sanity-checking agent performance.
https://kzakka.com/robopianist/#demo
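To make the evaluation concrete, here is a minimal sketch of how precision, recall, and F1 could be computed for key presses on a piece stored as MIDI-style notes. It is not the benchmark's actual code. The `Note` tuple format, the helper names `active_keys`, `step_scores`, and `episode_scores`, the fixed timestep `dt`, and the handling of empty timesteps are all assumptions made for illustration.

```python
from typing import List, Set, Tuple

# Hypothetical minimal note format: (MIDI pitch, onset in seconds, offset in seconds).
Note = Tuple[int, float, float]


def active_keys(notes: List[Note], num_steps: int, dt: float) -> List[Set[int]]:
    """For each timestep, return the set of MIDI pitches that should be held."""
    frames: List[Set[int]] = [set() for _ in range(num_steps)]
    for pitch, onset, offset in notes:
        for step in range(num_steps):
            t = step * dt
            if onset <= t < offset:
                frames[step].add(pitch)
    return frames


def step_scores(target: Set[int], pressed: Set[int]) -> Tuple[float, float, float]:
    """Precision, recall, and F1 for a single timestep.

    Assumed convention: precision is 1.0 when no key is pressed, and
    recall is 1.0 when no key is required.
    """
    tp = len(target & pressed)   # keys correctly pressed
    fp = len(pressed - target)   # keys pressed that should not be
    fn = len(target - pressed)   # required keys that were missed
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def episode_scores(
    targets: List[Set[int]], presses: List[Set[int]]
) -> Tuple[float, float, float]:
    """Average the per-timestep scores over an episode."""
    if len(targets) != len(presses):
        raise ValueError("targets and presses must have the same length")
    if not targets:
        raise ValueError("episode must contain at least one timestep")
    scores = [step_scores(t, p) for t, p in zip(targets, presses)]
    n = len(scores)
    return (
        sum(s[0] for s in scores) / n,
        sum(s[1] for s in scores) / n,
        sum(s[2] for s in scores) / n,
    )


if __name__ == "__main__":
    # Two notes held for the first second, then one note for the next second.
    piece = [(60, 0.0, 1.0), (64, 0.0, 1.0), (67, 1.0, 2.0)]
    targets = active_keys(piece, num_steps=4, dt=0.5)
    # Hypothetical agent output: one missed key, one extra key, one silent step.
    presses = [{60, 64}, {60}, {67, 69}, set()]
    print(episode_scores(targets, presses))
```

Averaging per timestep is only one possible design. Pooling true positives, false positives, and false negatives over the whole episode before computing the scores would weight dense passages more heavily.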