Qualification Type: | PhD |
---|---|
Location: | Birmingham |
Funding for: | UK Students, EU Students |
Funding amount: | £17,668 pa and tuition fees of £4,620 pa |
Hours: | Full Time |
Placed On: | 31st January 2023 |
Closes: | 28th February 2023 |
Reinforcement Learning (RL) has achieved exceptional success in recent years, especially for sequential decision-making and tasks that require continuous control. Examples include the game of Go, video games – especially strategy games such as StarCraft – and also robotics.
A prominent recent area of research is the extension of RL to multi-agent reinforcement learning (MARL). In this project, we focus on decentralised MARL, i.e., where agents do not intrinsically know the state of the other agents but can interact with one another. The main advantages of this approach include faster learning (e.g., through parallel computation), robustness to individual failures, transfer learning from more experienced agents, and a growing number of applications in both cooperative and adversarial settings.
However, the extension of RL to multi-agent settings brings several challenges. One of the main challenges is the coordination of these agents. Another fundamental issue is scalability and, therefore, the communication between large numbers of agents. Recent literature has tackled the scalability issue by formulating optimality guarantees in the limit of an infinite number of agents through a mean-field approach [J01]. The advantage of this approach is that it can quantify upper bounds on the performance of a large number of agents performing collective decision-making [J01].
However, when translating these results back to the original problem with a finite population, the interactions among the agents play a crucial role and similar approaches do not scale down well. Therefore, working with multi-agent systems is necessary to ensure optimality.
The aim of this project is to study multi-agent reinforcement learning by embedding elements of game theory to tackle collective decision-making. Specifically, three objectives are considered:
Eligibility:
First or Upper Second Class Honours undergraduate degree and/or postgraduate degree with Distinction (or an international equivalent). We also consider applicants from diverse backgrounds that have provided them with equally rich relevant experience and knowledge. Full-time and part-time study modes are available. UoB studentships are open to all and we particularly welcome applications from under-represented groups, including, but not limited to, BAME, disabled and neuro-diverse candidates.
Funding
The position is offered for three and a half years of full-time study. The value of the award is a stipend of £17,668 pa (subject to review) and tuition fees of £4,620 pa. Awards are usually incremented on 1 October each following year.