The Beijing Academy of Artificial Intelligence (BAAI) has open-sourced JudgeLM, a judge model that evaluates various large models efficiently and accurately. JudgeLM reaches over 90% agreement with GPT-4's judgments at roughly 1/120 of the cost. It covers a wide range of evaluation scenarios, including plain text and multimodal content, and outputs scores, verdicts, and explanations for its decisions. Thanks to several novel training techniques, JudgeLM's agreement with reference answers also exceeds 90%, approaching human-level performance. BAAI has additionally open-sourced a dataset of training and validation samples to support in-depth research on judging large language models. Going forward, the JudgeLM team plans to refine the model further, aiming to deliver more accurate, efficient, and comprehensive evaluation of large language models across more scenarios.
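To make the judge workflow concrete, the sketch below parses a judge model's raw text response into structured scores, a verdict, and an explanation. The `SCORES:` line format and the `parse_judge_output` helper are hypothetical illustrations, not JudgeLM's actual output schema.

```python
import re

def parse_judge_output(text: str) -> dict:
    """Parse a judge response of the assumed form:
    'SCORES: <a> <b>' followed by a free-text explanation.
    (Hypothetical format for illustration; not JudgeLM's real schema.)
    """
    match = re.search(r"SCORES:\s*(\d+(?:\.\d+)?)\s+(\d+(?:\.\d+)?)", text)
    if match is None:
        raise ValueError("no score line found in judge output")
    score_a, score_b = float(match.group(1)), float(match.group(2))
    # The verdict is whichever answer scored higher; a tie if equal.
    if score_a > score_b:
        verdict = "A"
    elif score_b > score_a:
        verdict = "B"
    else:
        verdict = "tie"
    # Everything after the score line is treated as the explanation.
    explanation = text[match.end():].strip()
    return {"scores": (score_a, score_b), "verdict": verdict, "explanation": explanation}

example = "SCORES: 8 6\nAnswer A is more accurate and complete."
result = parse_judge_output(example)
print(result["verdict"], result["scores"])  # → A (8.0, 6.0)
```

Separating the numeric scores from the free-text rationale in this way is what lets a judge model be used both for automatic leaderboard scoring and for human-readable error analysis.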