OpenAI describes its mission as advancing artificial intelligence (AI) and machine learning in ways that benefit humanity. Recently, the company trained a bot to play Minecraft using roughly 70,000 hours of gameplay videos. The achievement is more than a bot playing a game: it marks a significant step forward for machine learning through observation and imitation. OpenAI’s bot is a clear example of imitation learning, a form of supervised learning, in action. Unlike reinforcement learning, where an agent learns by trial and error and is rewarded for reaching a goal, imitation learning trains neural networks to perform specific tasks by watching humans complete them. In this case, OpenAI used publicly available gameplay videos and tutorials to teach its bot to execute complex in-game sequences, such as crafting diamond tools, that would take a typical player approximately 24,000 individual actions to complete.
Imitation learning requires labeled video: each step must record which action the player took, alongside what appeared on screen. Producing those labels by hand is highly labor intensive, so labeled datasets are scarce, and that scarcity limits how much an agent can learn from observation. Rather than muscling through an extensive manual tagging exercise, OpenAI’s research team used an approach known as Video Pre-Training (VPT) to greatly expand the number of labeled videos available. Researchers first collected 2,000 hours of Minecraft gameplay annotated with the players’ actions and used it to train a model (which OpenAI calls an inverse dynamics model) to infer which action was taken from the on-screen frames before and after it. That model then automatically generated labels for 70,000 hours of previously unlabeled Minecraft footage found online, giving the Minecraft bot a much larger dataset to imitate. The exercise demonstrates the potential value of existing video repositories, such as YouTube, as an AI training resource. Machine learning scientists could use available, properly labeled videos to train AI to carry out specific tasks, ranging from simple web navigation to helping users with real-life physical needs.
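
The labeling pipeline described above can be sketched in a few lines of Python. This is a toy illustration only, not OpenAI’s implementation: the real system uses neural networks that operate on video frames, whereas here each "frame" is just an integer position on a line and the models are simple lookup tables. All function names (`train_inverse_dynamics`, `pseudo_label`, `behavioral_cloning`) and the sample data are hypothetical.

```python
from collections import Counter, defaultdict


def train_inverse_dynamics(labeled_clips):
    """Learn which action most often explains each observed frame-to-frame change."""
    votes = defaultdict(Counter)
    for frames, actions in labeled_clips:
        for before, after, action in zip(frames, frames[1:], actions):
            votes[after - before][action] += 1
    return {change: counts.most_common(1)[0][0] for change, counts in votes.items()}


def pseudo_label(idm, frames):
    """Infer an action for every transition in an unlabeled clip."""
    return [idm.get(after - before, "noop") for before, after in zip(frames, frames[1:])]


def behavioral_cloning(clips):
    """Fit a lookup-table policy: the most common action taken in each state."""
    votes = defaultdict(Counter)
    for frames, actions in clips:
        for state, action in zip(frames, actions):
            votes[state][action] += 1
    return {state: counts.most_common(1)[0][0] for state, counts in votes.items()}


# Step 1: a small, hand-labeled dataset (toy stand-in for the annotated gameplay).
labeled = [([0, 1, 2, 1], ["right", "right", "left"])]
idm = train_inverse_dynamics(labeled)

# Step 2: a larger pool of unlabeled clips (toy stand-in for online videos),
# labeled automatically by the inverse dynamics model.
unlabeled = [[0, 1, 2, 3, 4], [2, 3, 4, 5]]
pseudo_labeled = [(frames, pseudo_label(idm, frames)) for frames in unlabeled]

# Step 3: train the imitation policy on the automatically labeled clips.
policy = behavioral_cloning(pseudo_labeled)
```

The point of the design is that inferring an action from the frames before and after it is a much easier problem than deciding what to do next, so a small labeled set can be enough to label a far larger one.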