An open-source project that fine-tunes large vision-language models via reinforcement learning to act as decision-making agents.

Common Product, Programming, Reinforcement Learning, Vision-Language Models
RL4VLM is an open-source project for fine-tuning large vision-language models with reinforcement learning so that they can act as decision-making agents. Developed by researchers including Yuexiang Zhai, Hao Bai, Zipeng Lin, Jiayi Pan, Shengbang Tong, Alane Suhr, Saining Xie, Yann LeCun, Yi Ma, and Sergey Levine, it builds on the LLaVA model and uses the PPO (Proximal Policy Optimization) algorithm for the reinforcement learning fine-tuning. The project documents its codebase structure, installation steps, license, and how to cite the research.
