PhD in Collaborative Multi-agent Deep Reinforcement Learning: Enabling Communication and Relational Learning

University of Warwick - WMG

Qualification Type: PhD
Location: Warwick
Funding for: UK Students, EU Students
Funding amount: £15,285
Hours: Full Time
Placed On: 26th July 2021
Closes: 4th October 2021

One of the main challenges in AI today is that of autonomous sequential decision-making: how can we give algorithms the ability to decide what actions to take whilst interacting with an uncertain environment in order to achieve a goal? Remarkable developments in this direction over the last few years have relied on deep reinforcement learning, which is based on the mathematical formalism of Markov decision processes, using artificial neural networks as flexible function approximators.

Many real-world applications are characterised by the interplay of multiple decision-makers that operate in the same shared-resources environment and need to accomplish goals cooperatively. Some of the most advanced industrial multi-agent systems in the world today are assembly lines and warehouse management systems. Whether the agents are robots, autonomous vehicles or clinical decision-makers, there is a strong desire for and increasing commercial interest in these systems: they are attractive because they can operate on their own in the world, alongside humans, under realistic constraints.

Multi-agent reinforcement learning has been studied since the 1990s; however, the last five years have seen a remarkable boost in academic and commercial activity, fuelled by ground-breaking advances in deep neural networks along with the increasing power and decreasing cost of computing. The fast-developing area of multi-agent deep reinforcement learning (MADRL) has emerged to extend deep reinforcement learning (DRL) to teams of autonomous agents. However, apart from a handful of highly specialised systems, the number of real-world applications powered by MADRL remains limited.

In this PhD project, which is part of a UKRI Turing AI Acceleration Fellowship, you will contribute to the emerging area of MADRL with a view to unleashing its full potential. You will consider the cooperative MADRL problem, in which a system of several learning agents must jointly optimise a single reward signal, the team reward, accumulated over time. Each agent has local autonomy: it can access its local observations and choose actions from its own action space. One of the most significant challenges in this context is how to foster collaborative behaviour within the system. The fundamental enabler of cooperative multi-agent skills is the ability to develop adequate communication. In previous work, we have demonstrated how explicit communication patterns emerge in systems equipped with a differential memory learned end-to-end through policy gradient methods. Even when every agent has access to every other agent’s observations, communication mechanisms still need to be learned for the task at hand to improve coordination, because the information that agents possess at a given time may be noisy or not relevant to other agents’ decisions.
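
The cooperative setting described above can be illustrated with a minimal sketch. It is not the project's framework: the toy task, the function names `run_episode` and `team_return`, and the discount factor `gamma` are all assumptions made for illustration. Each agent sees only its own observation and picks an action from its own action space, and the team shares one reward that is accumulated over time.

```python
import random
from typing import Callable, List, Sequence

# Hypothetical types for illustration: observations and actions are ints in {0, 1}.
Policy = Callable[[int], int]


def team_return(rewards: Sequence[float], gamma: float = 0.99) -> float:
    """Discounted sum of the shared team reward over one episode.

    The discount factor gamma is an assumption; the advert only says the
    reward is "accumulated over time".
    """
    total = 0.0
    for t, r in enumerate(rewards):
        total += (gamma ** t) * r
    return total


def run_episode(policies: List[Policy], steps: int, seed: int = 0) -> List[float]:
    """Toy cooperative task (invented for this sketch).

    At every step each agent receives a private binary observation and
    chooses an action from that observation alone. The team earns 1.0
    only if every agent's action matches its own observation, so a single
    reward signal is shared by all agents.
    """
    rng = random.Random(seed)
    rewards = []
    for _ in range(steps):
        observations = [rng.randint(0, 1) for _ in policies]
        # Local autonomy: agent i acts on observation i only.
        actions = [pi(obs) for pi, obs in zip(policies, observations)]
        success = all(a == o for a, o in zip(actions, observations))
        rewards.append(1.0 if success else 0.0)
    return rewards


def copy_policy(obs: int) -> int:
    """A hand-written policy that solves the toy task."""
    return obs


rewards = run_episode([copy_policy, copy_policy, copy_policy], steps=5)
print(team_return(rewards, gamma=0.9))
```

In this toy task no communication is needed, because each agent's observation is sufficient for its own action. Tasks in which the right action depends on what other agents observe are where learned communication becomes relevant.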

In this PhD project, you will develop a general graph-based framework to facilitate efficient multi-agent communication, enable learning using sparse rewards and build a relational representation of the environment. You will be joining a larger research team based at WMG at the University of Warwick working on various deep reinforcement learning problems and will support the development of an open-source library of multi-agent tasks with strong connections to industry.

Candidates should have an MSc in Statistics, Computer Science, Engineering or a similar quantitative discipline, and very strong, demonstrable programming skills, especially in Python.

For informal enquiries, please contact Professor Giovanni Montana.

Funding - WMG

Funding Duration - 3.5 years

Stipend - Standard PhD at UKRI rates: £15,285
