Meta Introduces Self-Taught Evaluator in AI Models
Meta has unveiled new AI models, including a Self-Taught Evaluator that reduces human involvement in AI development.
AT A GLANCE
New AI Models Released: Meta Platforms Inc. unveiled new AI models, including the Self-Taught Evaluator (STE) designed to reduce human involvement in AI development.
Self-Taught Evaluator's Design: The STE utilizes a "chain of thought" approach, improving accuracy in complex problem-solving areas such as science and coding.
AI-Generated Training Data: The STE was trained exclusively on AI-generated data, minimizing human input during the training phase.
Vision for Autonomous Learning: Researchers suggest the STE could lead to autonomous agents capable of self-correction and learning from their mistakes.
New AI Models Released
Meta Platforms Inc. on Friday announced the release of new artificial intelligence models from its research division, including a "Self-Taught Evaluator" (STE) designed to minimize human involvement in the AI development process. The tool was first introduced in an August paper, which explained how the STE uses a "chain of thought" technique, similar to the one employed by OpenAI's recent models, to make AI evaluations more reliable.
Training Without Human Input
The "chain of thought" method involves breaking down complex problems into smaller, logical steps, which has shown to improve accuracy in challenging areas such as science, coding, and mathematics. Notably, the STE was trained using entirely AI-generated data, eliminating human input during this phase.
Autonomous Learning Capabilities
The researchers behind the STE said that AI capable of reliably evaluating other AI could pave the way for autonomous agents that learn from their own errors. This vision aligns with the industry's growing interest in intelligent digital assistants that can perform a wide range of tasks without direct human oversight.
Streamlining AI Development
Traditionally, AI development has relied heavily on Reinforcement Learning from Human Feedback (RLHF), a method that depends on skilled human annotators to label data accurately and to validate answers to complex queries. Meta's approach with the STE aims to streamline this process by removing human annotation from the evaluation step, which could make development both faster and more cost-effective.
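In outline, that kind of self-taught loop might look like the sketch below. Every helper here (`generate_responses`, `judge`, `finetune`) is a hypothetical stub, and the structure is a simplified reading of the approach rather than Meta's actual code; the point is that the evaluator's own chain-of-thought judgments replace the preference labels human annotators would otherwise supply.

```python
# Stripped-down sketch of one iteration of a self-taught evaluation
# loop. All helpers are hypothetical stubs, not Meta's code; the key
# idea is that AI-generated judgments stand in for the human preference
# labels that RLHF would require.

from dataclasses import dataclass

@dataclass
class JudgedPair:
    prompt: str
    chosen: str      # response the evaluator preferred
    rejected: str    # response the evaluator rejected
    reasoning: str   # the evaluator's step-by-step judgment

def generate_responses(prompt: str) -> tuple[str, str]:
    """Stub: sample two candidate responses from a base model."""
    raise NotImplementedError

def judge(evaluator, prompt: str, a: str, b: str) -> tuple[str, str]:
    """Stub: evaluator reasons step by step, returns ('a' or 'b', reasoning)."""
    raise NotImplementedError

def finetune(evaluator, data: list[JudgedPair]):
    """Stub: fine-tune the evaluator on its own collected judgments."""
    raise NotImplementedError

def self_taught_iteration(evaluator, prompts: list[str]):
    training_data = []
    for prompt in prompts:
        resp_a, resp_b = generate_responses(prompt)
        verdict, reasoning = judge(evaluator, prompt, resp_a, resp_b)
        chosen, rejected = (resp_a, resp_b) if verdict == "a" else (resp_b, resp_a)
        # The evaluator's own judgment becomes a training example; no
        # human annotator labels anything inside this loop.
        training_data.append(JudgedPair(prompt, chosen, rejected, reasoning))
    # Train the evaluator on its judgments, then repeat with the
    # improved evaluator.
    return finetune(evaluator, training_data)
```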
"We hope, as AI becomes more and more super-human, that it will get better and better at checking its work, so that it will actually be better than the average human," said Jason Weston, one of the project's researchers. He emphasized that the concept of being self-taught and capable of self-evaluation is essential for achieving a super-human level of AI.
Public Availability and Other Tools
Meta's research is not the only initiative exploring this concept; other companies, such as Google and Anthropic, have also investigated Reinforcement Learning from AI Feedback (RLAIF). However, unlike Meta, these firms have been less inclined to release their models for public use.
In addition to the Self-Taught Evaluator, Meta has updated its image-identification Segment Anything model and introduced datasets intended to facilitate the discovery of new inorganic materials. By making these models publicly available, Meta aims to foster further research into AI self-evaluation methods.