RL4VLM
An open-source project that fine-tunes large vision-language models via reinforcement learning to act as decision-making agents.
Common Product, Programming, Reinforcement Learning, Vision-Language Models
RL4VLM is an open-source project for fine-tuning large vision-language models (VLMs) with reinforcement learning so that they can act as decision-making agents. It was developed by researchers including Yuexiang Zhai, Hao Bai, Zipeng Lin, Jiayi Pan, Shengbang Tong, Alane Suhr, Saining Xie, Yann LeCun, Yi Ma, and Sergey Levine. The project builds on the LLaVA model and uses the PPO (Proximal Policy Optimization) algorithm for reinforcement learning fine-tuning. The repository documents the codebase structure, installation steps, license, and how to cite the research.
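For readers unfamiliar with PPO, the sketch below shows the clipped surrogate objective at the core of the algorithm, written in plain Python. It is an illustration only and is not taken from the RL4VLM codebase: the function name `ppo_clip_loss`, its parameters, and the default `clip_eps=0.2` are assumptions chosen for this example.

```python
import math


def ppo_clip_loss(new_logps, old_logps, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate loss (to be minimized) over a batch.

    new_logps / old_logps: log-probabilities of the taken actions under the
    current policy and the policy that collected the data.
    advantages: advantage estimates for those actions.
    All names here are hypothetical, not the RL4VLM API.
    """
    if not (len(new_logps) == len(old_logps) == len(advantages)):
        raise ValueError("all inputs must have the same length")
    if not advantages:
        raise ValueError("inputs must be non-empty")

    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio between the current and the data-collecting policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic choice between the unclipped and clipped objectives.
        total += min(ratio * adv, clipped * adv)

    # Negate because optimizers minimize, while PPO maximizes the surrogate.
    return -total / len(advantages)
```

In a VLM setting the log-probabilities would typically come from the text the model generates for each action, but how RL4VLM computes them is not described here.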
RL4VLM Visit Over Time
Monthly Visits: 515,580,771
Bounce Rate: 37.20%
Pages per Visit: 5.8
Visit Duration: 00:06:42