Moonshot AI's Kimi Open Platform has announced a 50% cut in the price of context cache storage: the cache storage fee drops from 10 yuan per 1M tokens per minute to 5 yuan per 1M tokens per minute.
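As an illustrative calculation based on the announced rates: keeping a 1M-token cache alive for one hour now costs 5 yuan × 60 minutes = 300 yuan, half of the previous 600 yuan.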


Earlier, on July 1st, the Kimi Open Platform had opened the public beta of Context Caching.

Context Caching is a data management technique that lets the system pre-store large amounts of content that are likely to be requested frequently, such as a long shared prompt or document.

When the same content is requested again, the system can serve it directly from the cache rather than recomputing it or fetching it from the original source, saving time and resources.
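The pattern in plain code, as a minimal sketch of the idea rather than the platform's actual implementation (all names here are illustrative):

```python
import hashlib

# Maps a context fingerprint to its preprocessed form, so the expensive
# step runs once per context rather than once per request.
_context_cache: dict[str, list[str]] = {}

def preprocess(context: str) -> list[str]:
    """Stand-in for the expensive step (for an LLM, encoding a long prompt prefix)."""
    return context.split()

def ask(context: str, question: str) -> str:
    # Key the cache on the exact context so identical contexts hit the cache.
    key = hashlib.sha256(context.encode()).hexdigest()
    if key not in _context_cache:          # first request: pay the full cost once
        _context_cache[key] = preprocess(context)
    tokens = _context_cache[key]           # later requests: reuse the cached work
    return f"answered {question!r} using {len(tokens)} cached tokens"
```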

Context Caching is particularly suited to scenarios with frequent requests that repeatedly reference the same large initial context; there it can significantly reduce the cost of using long-context models while improving efficiency.
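A usage sketch of the typical create-then-reuse flow follows. The endpoint path, field names, model names, and the `cache` message role are assumptions modeled on common context-caching APIs, not details taken from this announcement; consult the Kimi Open Platform documentation for the real interface.

```python
import os
import requests

BASE = "https://api.moonshot.cn/v1"  # assumed Kimi Open Platform base URL
HEADERS = {"Authorization": f"Bearer {os.environ['MOONSHOT_API_KEY']}"}

# Step 1 (hypothetical endpoint): store the large shared context once.
# Storage is billed per 1M tokens per minute, the fee this announcement halves.
resp = requests.post(f"{BASE}/caching", headers=HEADERS, json={
    "model": "moonshot-v1",
    "messages": [{"role": "system", "content": "<very long shared document>"}],
    "ttl": 3600,  # keep the cache alive for one hour
})
cache_id = resp.json()["id"]

# Step 2: later requests reference the cache instead of resending the context,
# so the long prefix is neither re-uploaded nor re-processed.
resp = requests.post(f"{BASE}/chat/completions", headers=HEADERS, json={
    "model": "moonshot-v1-128k",
    "messages": [
        {"role": "cache", "content": f"cache_id={cache_id};reset_ttl=3600"},
        {"role": "user", "content": "Summarize section 3 of the document."},
    ],
})
print(resp.json()["choices"][0]["message"]["content"])
```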