PowerInfer-2

An efficient large language model inference framework designed specifically for smartphones

PowerInfer-2 is an inference framework optimized for mobile devices. It supports Mixture-of-Experts (MoE) models with up to 47B parameters and reports an inference speed of 11.68 tokens per second, 22 times faster than other frameworks. It combines heterogeneous computing with an I/O-compute pipeline to significantly reduce memory usage and improve inference speed. The framework suits scenarios that require deploying large models directly on mobile devices, where on-device inference keeps user data local and improves both privacy and performance.
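
The general idea behind an I/O-compute pipeline can be illustrated with a minimal sketch: a background thread loads the next layer's weights while the current layer is being computed, and a bounded queue limits how many prefetched layers wait in memory. This is not PowerInfer-2's actual implementation. The names load_weights, compute, and pipelined_forward, the per-layer granularity, and the prefetch_depth parameter are all assumptions made for illustration.

```python
import queue
import threading


def load_weights(layer_id):
    # Hypothetical stand-in for reading one layer's weights from flash storage.
    return [float(layer_id + 1)] * 4


def compute(activations, weights):
    # Hypothetical stand-in for one layer's computation (element-wise product).
    return [a * w for a, w in zip(activations, weights)]


def pipelined_forward(activations, num_layers, prefetch_depth=2):
    # A bounded queue limits how many prefetched layers wait in memory.
    ready = queue.Queue(maxsize=prefetch_depth)

    def loader():
        for layer_id in range(num_layers):
            # put() blocks while the queue is full, so I/O cannot run
            # arbitrarily far ahead of compute.
            ready.put(load_weights(layer_id))

    threading.Thread(target=loader, daemon=True).start()

    for _ in range(num_layers):
        # get() waits only when I/O has fallen behind compute.
        weights = ready.get()
        activations = compute(activations, weights)
    return activations


result = pipelined_forward([1.0, 1.0, 1.0, 1.0], num_layers=3)
```

In this sketch a smaller prefetch_depth lowers peak memory, at the cost of more stalls whenever storage reads are slower than compute.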

PowerInfer-2 Visit Over Time

Monthly Visits: 828
Bounce Rate: 50.90%
Pages per Visit: 1.7
Visit Duration: 00:00:33
