promptbench
Unified Language Model Evaluation Framework
PromptBench is a PyTorch-based Python package for evaluating Large Language Models (LLMs). It offers a user-friendly API for researchers to assess LLMs. Key features include rapid model performance evaluation, prompt engineering, adversarial prompt assessment, and dynamic evaluation. Its advantages are ease of use, allowing quick assessment of existing datasets and models as well as straightforward customization with your own datasets and models. It positions itself as a unified open-source library for LLM evaluation.
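The core workflow such a framework automates (load a dataset, render a prompt, query a model, score the outputs) can be sketched in plain Python. The names below (`evaluate`, `fake_model`, the toy dataset) are illustrative stand-ins, not PromptBench's actual API:

```python
# Illustrative sketch of the evaluate-a-model-on-a-dataset loop that
# frameworks like PromptBench automate. `fake_model` stands in for a real LLM.
def evaluate(model, dataset, prompt_template):
    """Return the model's accuracy on the dataset for a given prompt template."""
    correct = 0
    for example in dataset:
        prompt = prompt_template.format(content=example["content"])
        prediction = model(prompt)
        if prediction.strip().lower() == example["label"]:
            correct += 1
    return correct / len(dataset)

# Toy stand-ins so the sketch runs without a real model or dataset.
dataset = [
    {"content": "A wonderful film.", "label": "positive"},
    {"content": "A tedious mess.", "label": "negative"},
]

def fake_model(prompt):
    return "positive" if "wonderful" in prompt else "negative"

accuracy = evaluate(fake_model, dataset,
                    "Classify as positive or negative: {content}")
print(accuracy)  # 1.0
```

In a real framework the template, dataset loader, and model client are pluggable, which is what makes swapping prompts or models for comparison cheap.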
promptbench Visit Over Time
- Monthly Visits: 494,758,773
- Bounce Rate: 37.69%
- Pages per Visit: 5.7
- Visit Duration: 00:06:29