RL4VLM

An open-source project that fine-tunes large vision-language models via reinforcement learning to act as decision-making agents.

RL4VLM builds on the LLaVA model and uses the Proximal Policy Optimization (PPO) algorithm for reinforcement learning fine-tuning, so that the resulting vision-language model can act as a decision-making agent. It was developed collaboratively by researchers including Yuexiang Zhai, Hao Bai, Zipeng Lin, Jiayi Pan, Shengbang Tong, Alane Suhr, Saining Xie, Yann LeCun, Yi Ma, and Sergey Levine. The repository documents the codebase structure, installation steps, license, and how to cite the research.
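
For readers unfamiliar with PPO, the sketch below shows the standard clipped surrogate loss that the algorithm is generally built around. It is a minimal illustration in plain Python, not code from the RL4VLM repository: the function name `ppo_clipped_loss`, its parameters, and the default `clip_eps=0.2` are assumptions chosen for this example, and the project's actual training loop may differ.

```python
import math


def ppo_clipped_loss(new_logps, old_logps, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate loss, returned as a value to minimize.

    Hypothetical helper for illustration only (not taken from RL4VLM).
    new_logps / old_logps: log-probabilities of the taken actions under the
    current policy and the policy that collected the data.
    advantages: advantage estimates for those actions.
    """
    if not (len(new_logps) == len(old_logps) == len(advantages)):
        raise ValueError("inputs must have the same length")
    if not new_logps:
        raise ValueError("inputs must be non-empty")

    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio between the current and the data-collecting policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic choice: the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)

    # PPO maximizes the surrogate, so the loss is its negated mean.
    return -total / len(new_logps)
```

Taking the minimum of the clipped and unclipped terms removes the incentive to move the policy far from the one that generated the data in a single update, which is the usual reason PPO is chosen for fine-tuning large models.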
RL4VLM Visit Over Time

Monthly Visits: 499904316
Bounce Rate: 37.31%
Pages per Visit: 5.8
Visit Duration: 00:06:52
