promptbench

Unified Language Model Evaluation Framework

Common Product, Programming, Benchmark, Evaluation
PromptBench is a PyTorch-based Python package for evaluating large language models (LLMs). It offers a user-friendly API aimed at researchers. Key features include rapid model performance evaluation, prompt engineering, adversarial prompt assessment, and dynamic evaluation. Its main advantages are ease of use, which allows quick assessment of existing datasets and models, and easy customization with your own datasets and models. It positions itself as a unified open-source library for LLM evaluation.
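To make "rapid model performance evaluation" and "prompt engineering" concrete, the sketch below scores two prompt templates against a tiny labeled dataset. It does not use the PromptBench API: `toy_model`, `evaluate`, `DATASET`, and `TEMPLATES` are hypothetical names invented for illustration, and the "model" is a keyword rule standing in for a real LLM call so the example runs offline.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy dataset of (text, label) pairs; not a PromptBench dataset.
DATASET: List[Tuple[str, str]] = [
    ("a wonderful, moving film", "positive"),
    ("dull and far too long", "negative"),
    ("great cast and a sharp script", "positive"),
    ("a terrible waste of time", "negative"),
]

# Candidate prompt templates to compare (also hypothetical).
TEMPLATES: List[str] = [
    "Classify the sentiment of this review as positive or negative: {content}",
    "Review: {content}\nSentiment (positive or negative):",
]


def toy_model(prompt: str) -> str:
    """Stand-in for an LLM call: a keyword rule, so the sketch runs offline."""
    text = prompt.lower()
    return "positive" if any(w in text for w in ("wonderful", "great")) else "negative"


def evaluate(
    model: Callable[[str], str],
    templates: List[str],
    dataset: List[Tuple[str, str]],
) -> Dict[str, float]:
    """Return the classification accuracy of `model` for each prompt template."""
    scores: Dict[str, float] = {}
    for template in templates:
        correct = 0
        for text, label in dataset:
            # Fill the template, query the model, and normalize its answer.
            prediction = model(template.format(content=text)).strip().lower()
            correct += prediction == label
        scores[template] = correct / len(dataset)
    return scores


scores = evaluate(toy_model, TEMPLATES, DATASET)
for template, accuracy in scores.items():
    print(f"{accuracy:.2f}  {template!r}")
```

Swapping `toy_model` for a function that calls a real model, or adding perturbed variants of each template, would turn the same loop into a rough prompt-comparison or robustness check; PromptBench packages this kind of workflow behind its own API.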

promptbench Visit Over Time

Monthly Visits: 488,643,166
Bounce Rate: 37.28%
Pages per Visit: 5.7
Visit Duration: 00:06:37

promptbench Alternatives