2024-08-09 17:13:27 · AIbase
Is ChatGPT's Mysterious Power Holding Back LLMs? Karpathy and LeCun Jointly Critique RLHF Technology
Andrej Karpathy argued that Reinforcement Learning from Human Feedback (RLHF) may not be the ultimate route to human-level problem-solving in AI. Using AlphaGo as an example, he pointed out that true reinforcement learning optimizes a neural network through self-play, which ultimately let it surpass human players without human intervention. RLHF, in contrast, resembles imitating human preferences rather than actually solving the problem. True reinforcement learning works well in closed environments with clearly defined reward mechanisms, such as Go, but such rewards are hard to define for open-ended tasks like article summarization and code rewriting.
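
The contrast between a rule-defined reward and a preference-based one can be made concrete with a toy sketch. Everything in it is an assumption made for illustration: the `Candidate` class, the `game_reward` and `best_by` helpers, and all of the scores are hypothetical and do not come from Karpathy's remarks or from any real RLHF system. It only shows that optimizing a learned preference score can select a different output than the true objective would.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    true_quality: float  # hypothetical ground truth, usually unobservable in open-ended tasks
    proxy_score: float   # hypothetical score from a learned human-preference model


def game_reward(won: bool) -> float:
    """Closed environment (e.g. a board game): the rules define the reward exactly."""
    return 1.0 if won else -1.0


def best_by(candidates, key):
    """Return the candidate that maximizes the given scoring function."""
    return max(candidates, key=key)


# Made-up numbers: the preference model slightly favors the fluent but wrong summary.
candidates = [
    Candidate("concise_and_correct", true_quality=0.9, proxy_score=0.6),
    Candidate("fluent_but_wrong", true_quality=0.2, proxy_score=0.8),
]

chosen_by_proxy = best_by(candidates, key=lambda c: c.proxy_score)
chosen_by_truth = best_by(candidates, key=lambda c: c.true_quality)

print("proxy picks:", chosen_by_proxy.name)
print("truth picks:", chosen_by_truth.name)
```

In the game setting, `game_reward` can be computed directly from the outcome, so optimization pressure stays pointed at winning. In the summarization setting only `proxy_score` is available during training, so the optimizer is rewarded for whatever the preference model happens to like.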