SEIHAI: The hierarchical AI that won the NeurIPS-2020 MineRL competition

Overview of the researchers’ hierarchical architecture. Credit: Mao et al

In recent years, computational tools based on reinforcement learning have achieved remarkable results in numerous tasks, including image classification and robotic object manipulation. Computer scientists have also been training reinforcement learning models to play specific games, including video games.

To challenge research teams working on reinforcement learning techniques, the Neural Information Processing Systems (NeurIPS) annual conference introduced the MineRL competition, a contest in which different algorithms are tested on the same task in Minecraft, the renowned computer game developed by Mojang Studios. More specifically, contestants are asked to create algorithms that can obtain a diamond in the game from raw pixels.

The algorithms can be trained for at most four days on a single GPU machine, using no more than 8,000,000 samples created by the MineRL simulator. In addition to these simulator samples, participants are also provided with a large collection of human demonstrations (i.e., video frames in which the task is solved by human players).

A team of researchers at Huawei Noah’s Ark Lab, Tianjin University and Tsinghua University won the NeurIPS-2020 MineRL competition. Using a sample-efficient hierarchical artificial intelligence (AI) tool called SEIHAI, the researchers were able to outperform all other algorithms participating in the contest.

“We present SEIHAI, a sample-efficient hierarchical AI that fully takes advantage of the human demonstrations and the task structure,” Hangyu Mao and his colleagues wrote in a paper outlining their AI, which was pre-published on arXiv. “Specifically, we split the task into several sequentially dependent subtasks and train a suitable agent for each subtask using reinforcement learning and imitation learning.”

To obtain a diamond in Minecraft, players need to follow a series of steps. Sequentially, they need to chop trees to collect logs, then use the logs to craft a wooden pickaxe, which they will then use to mine cobblestone. The cobblestone is in turn used to craft a stone pickaxe and a furnace, which are needed to mine iron ore and smelt it into the iron ingots required for an iron pickaxe. Only an iron pickaxe can finally be used to mine diamond. Diamond is rare in the game, which further complicates the task for MineRL participants.

To tackle the task most effectively, Mao and his colleagues divided it into a series of subtasks, each of which required different skills and capabilities. They then trained different agents to tackle each of the subtasks individually, using reinforcement learning or imitation learning, depending on which one best suited the problem they were trying to solve.

To decide which agent should act at any given point, the researchers used a scheduler, a component that selects an agent based on the subtask that currently needs to be completed. The hierarchical model created by the researchers significantly outperformed all the other algorithms and models participating in the MineRL 2020 contest.
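The idea of a scheduler that hands control to one sub-agent at a time can be illustrated with a short Python sketch. This is not the researchers' implementation: the class names, the inventory-based selection rule, the three example subtasks and the stub policies are all assumptions made for illustration, and a real system would plug in agents trained with reinforcement learning or imitation learning and cover every stage of the task.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Inventory = Dict[str, int]
Policy = Callable[[Inventory], str]  # maps an observation to an action name


@dataclass
class Subtask:
    name: str
    goal_item: str  # hypothetical rule: the subtask is done once this item is held
    policy: Policy  # stand-in for an RL- or imitation-trained agent


def make_stub_policy(action: str) -> Policy:
    """Return a placeholder policy that always emits the same action."""
    return lambda inventory: action


class Scheduler:
    """Selects which sub-agent should act, given the current inventory."""

    def __init__(self, subtasks: List[Subtask]):
        self.subtasks = subtasks  # ordered from first to last milestone

    def select(self, inventory: Inventory) -> Subtask:
        # Find the furthest milestone reached so far. Earlier items may
        # have been consumed by crafting, so the whole list is scanned.
        next_index = 0
        for i, subtask in enumerate(self.subtasks):
            if inventory.get(subtask.goal_item, 0) > 0:
                next_index = i + 1
        # Stay on the last subtask once every milestone has been reached.
        return self.subtasks[min(next_index, len(self.subtasks) - 1)]

    def act(self, inventory: Inventory) -> str:
        return self.select(inventory).policy(inventory)


# Illustrative subtasks only; the full task has more stages than these.
scheduler = Scheduler([
    Subtask("chop_tree", "log", make_stub_policy("attack")),
    Subtask("craft_wooden_pickaxe", "wooden_pickaxe", make_stub_policy("craft")),
    Subtask("dig_cobblestone", "cobblestone", make_stub_policy("dig")),
])

print(scheduler.select({}).name)
print(scheduler.select({"log": 3}).name)
print(scheduler.act({"wooden_pickaxe": 1}))
```

In this sketch the scheduler is a fixed rule, and each policy could be swapped for a different learning method without touching the rest of the pipeline, which is the practical appeal of splitting a long task into sequentially dependent subtasks.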

“We won first place in the preliminary and final of the NeurIPS-2020 MineRL competition, which demonstrates the efficiency of our hierarchical method, SEIHAI,” the researchers wrote in their paper. “We believe that developing methods that properly combine human priors and sample-efficient learning-based techniques is a competitive way to solve complex tasks with limited demonstrations, sparse rewards but an explicit task structure.”


More information:
Hangyu Mao et al, SEIHAI: A sample-efficient hierarchical AI for the MineRL competition. arXiv:2111.08857v1 [cs.LG], arxiv.org/abs/2111.08857

© 2021 Science X Network

Citation:
SEIHAI: The hierarchical AI that won the NeurIPS-2020 MineRL competition (2021, December 6)
retrieved 6 December 2021
from https://techxplore.com/news/2021-12-seihai-hierarchical-ai-won-neurips-.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no
part may be reproduced without the written permission. The content is provided for information purposes only.
