Adobe Research, in collaboration with Northwestern University, has developed an artificial intelligence system called Sketch2Sound, a tool expected to transform the way sound designers work. Sketch2Sound lets users create professional sound effects and ambient sounds through humming, vocal imitation, and simple text descriptions.
The system analyzes three key features of the user's vocal input: loudness, timbre (roughly, how bright or dark a sound is), and pitch. It then combines these signals with the user's text prompt to generate the desired sound. For example, when a user types "forest ambiance" and makes a few short vocal chirps, the system renders those sounds as birdsong without needing explicit instructions.
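To make the three control signals concrete, here is a minimal sketch (not Adobe's implementation) of how loudness, brightness, and pitch curves could be extracted from a short vocal recording using librosa. The file name, hop size, and pitch range are illustrative assumptions.

```python
import librosa
import numpy as np

HOP = 512  # analysis hop size in samples (assumed)

# Load a short vocal imitation (hypothetical file name).
y, sr = librosa.load("vocal_imitation.wav", sr=None, mono=True)

# Loudness proxy: frame-wise RMS energy, converted to decibels.
rms = librosa.feature.rms(y=y, hop_length=HOP)[0]
loudness_db = librosa.amplitude_to_db(rms, ref=np.max)

# Brightness proxy: spectral centroid (a higher centroid sounds "brighter").
brightness_hz = librosa.feature.spectral_centroid(y=y, sr=sr, hop_length=HOP)[0]

# Pitch: fundamental-frequency track via pYIN (NaN where the voice is unvoiced).
f0, voiced_flag, _ = librosa.pyin(
    y,
    fmin=librosa.note_to_hz("C2"),
    fmax=librosa.note_to_hz("C6"),
    sr=sr,
    hop_length=HOP,
)

# Per-frame curves like these could then condition a text-prompted audio generator.
print(loudness_db.shape, brightness_hz.shape, f0.shape)
```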
Another highlight of Sketch2Sound is its grasp of context. When creating music, a user can type "bass drum, snare drum" and hum a rhythm; the system places the bass drum on the low-pitched parts of the hummed pattern and the snare drum on the higher-pitched parts. This contextual interpretation greatly simplifies the sound design process.
To meet the needs of professionals, the research team also integrated a filtering technique that lets users adjust how precisely the generated sound follows their input. Sound designers can opt for very tight control or a looser, more approximate interpretation. This flexibility may make Sketch2Sound particularly appealing to Foley artists, who create sound effects for film and television: with this tool, they can quickly produce effects from voice and text descriptions instead of manipulating physical props.
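The article does not specify Adobe's exact filtering method, but the adjustable-precision idea can be illustrated with a simple median filter: a small window keeps the control curve close to the vocal input, while a large window keeps only its broad gesture. The window sizes and curve below are made-up example values.

```python
import numpy as np
from scipy.signal import medfilt

rng = np.random.default_rng(0)
# Fake per-frame loudness curve from a hummed input (arbitrary example data).
loudness = np.clip(np.cumsum(rng.normal(0, 0.1, 300)), -3.0, 3.0)

precise = medfilt(loudness, kernel_size=3)    # follow the input very closely
relaxed = medfilt(loudness, kernel_size=51)   # keep only the broad loudness gesture

# A sound designer would pick the smoothing that matches how literally the
# generated effect should track their vocal sketch.
```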
The researchers note that spatial characteristics captured in recorded inputs, such as room reverberation, can sometimes degrade the generated sounds, an issue they are working to address. Adobe has not yet announced whether Sketch2Sound will be released as a commercial product, nor a launch date.
Project link: https://hugofloresgarcia.art/sketch2sound/
Key points:
🎵 Sketch2Sound is a newly developed AI tool that creates sound effects through humming and text descriptions.
🔊 The system analyzes volume, timbre, and pitch, combining the user's vocal input with text to generate target sound effects.
🎬 Especially suited to Foley artists, it can quickly generate film and TV sound effects, improving workflow efficiency.