In an era where personalization reigns supreme, how can AI become more attuned to you? Imagine typing "I passed, I'm so happy!" in a chat app. An AI that understands your sentiment not only recognizes your excitement but also recalls your fondness for smiling cat emojis, and then creates a set of smiling cat emojis tailored just for you.
In the field of personalized AI content generation, Huawei and Tsinghua University have joined forces to develop a new technology called PMG (Personalized Multimodal Generation). This technology can generate multimodal content such as emoji packs, T-shirt designs, and movie posters that align with users' personalized needs based on their historical behaviors and preferences.
How does PMG work? It analyzes users' viewing and conversation histories and uses the reasoning capabilities of large language models to extract their preferences. This process produces both explicit keywords and implicit user preference vectors, which together provide a rich informational foundation for multimodal content creation.
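The explicit keyword step can be illustrated with a minimal Python sketch. The function names, the prompt wording, and the comma-separated reply format are assumptions made for illustration; they are not taken from the PMG code base, and the call to the large language model itself is left out.

```python
from typing import List


def build_keyword_prompt(viewed_items: List[str],
                         conversations: List[str],
                         max_keywords: int = 5) -> str:
    """Assemble a prompt asking an LLM to summarize user preferences as keywords.

    The prompt wording is illustrative only, not PMG's actual template.
    """
    history = "\n".join(f"- {item}" for item in viewed_items)
    dialogue = "\n".join(f"- {turn}" for turn in conversations)
    return (
        "The user has viewed the following items:\n"
        f"{history}\n"
        "The user has said the following in conversation:\n"
        f"{dialogue}\n"
        f"Summarize the user's preferences as at most {max_keywords} short keywords, "
        "separated by commas."
    )


def parse_keywords(llm_reply: str) -> List[str]:
    """Split a comma-separated LLM reply into clean, de-duplicated keywords."""
    seen = set()
    keywords = []
    for raw in llm_reply.split(","):
        kw = raw.strip().lower()
        if kw and kw not in seen:
            seen.add(kw)
            keywords.append(kw)
    return keywords
```

In this sketch, the prompt would be sent to whichever large model is available, and the reply passed to parse_keywords to obtain the explicit preference keywords.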
In practice, PMG performs three main functions:
Keyword Generation: Constructs prompts to guide large models in extracting user preferences as keywords.
Implicit Vector Generation: Combines user preference keywords with target item keywords and uses a bias-corrected large model, fine-tuned with P-Tuning V2, to learn implicit preference vectors for multimodal generation.
Balance of User Preferences and Target Items: Calculates personalization and accuracy scores to quantify generation quality, then uses them to optimize the content produced.
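The balancing idea in the last item can be sketched in a few lines of Python. This is a hypothetical illustration, not the project's implementation: it assumes each candidate image, the user preference, and the target item have already been mapped to embedding vectors by some encoder, measures personalization and accuracy as cosine similarities, and combines them with an assumed weight alpha.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def balance_score(candidate: Sequence[float],
                  preference: Sequence[float],
                  target: Sequence[float],
                  alpha: float = 0.5) -> float:
    """Weighted sum of personalization and accuracy.

    alpha is an assumed trade-off weight: 1.0 favors the user's preferences,
    0.0 favors fidelity to the target item.
    """
    personalization = cosine(candidate, preference)
    accuracy = cosine(candidate, target)
    return alpha * personalization + (1.0 - alpha) * accuracy


def pick_best(candidates: List[Sequence[float]],
              preference: Sequence[float],
              target: Sequence[float],
              alpha: float = 0.5) -> int:
    """Return the index of the candidate with the highest balance score."""
    scores = [balance_score(c, preference, target, alpha) for c in candidates]
    return max(range(len(scores)), key=scores.__getitem__)
```

With alpha near 1 the selection favors candidates close to the user's preferences, and with alpha near 0 it favors candidates faithful to the target item. Intermediate values trade one off against the other.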
The research team validated the effectiveness of the PMG technology in three application scenarios: clothing image generation for e-commerce, movie poster generation, and emoji creation. The experimental results show that PMG generates personalized content that reflects user preferences and performs well on image similarity metrics such as LPIPS and SSIM.
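To give a sense of what an image similarity metric measures, here is a simplified Python sketch of SSIM. It is an illustration under stated assumptions rather than the evaluation code used by the research team: it treats each grayscale image as a flat list of pixel values and computes a single global score, whereas standard SSIM averages the same formula over local windows. LPIPS depends on a pretrained neural network and is not reproduced here.

```python
from typing import Sequence


def global_ssim(x: Sequence[float], y: Sequence[float], data_range: float = 255.0) -> float:
    """Single-window SSIM over two equal-size grayscale images given as flat pixel lists.

    Uses the usual stabilizing constants C1 = (0.01 * L)^2 and C2 = (0.03 * L)^2,
    where L is the pixel value range, with population (divide-by-N) statistics.
    """
    if len(x) != len(y) or not x:
        raise ValueError("images must be non-empty and the same size")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov_xy + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

Identical images score 1.0, and the score drops as brightness, contrast, or structure diverge.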
This technology is not only theoretically innovative but also shows significant potential and commercial value in practical applications. As demand for personalization grows, PMG is well positioned for wider adoption, offering users richer and more personalized experiences.
Project Link: https://github.com/mindspore-lab/models/tree/master/research/huawei-noah/PMG