ElevenLabs, a startup focused on AI voice cloning and text-to-speech APIs, recently announced a new feature that lets users build complete conversational AI agents.
On ElevenLabs' developer platform, users can now customize parameters of their conversational agents, such as voice tone and response length.
Previously, ElevenLabs mainly offered voice and text-to-speech services. Sam Sklar, the company's growth lead, told TechCrunch in an interview that many clients were already using the platform to create conversational AI agents, but that the biggest challenges were integrating knowledge bases and handling customer interruptions. ElevenLabs therefore decided to build a complete conversational bot pipeline to simplify the process.
To start building a conversational agent, users log into their ElevenLabs account and either select a template or create a new project. They can then choose the agent's main language, its first message, and the system prompt that defines its personality.
Developers also need to select a large language model (such as Gemini, GPT, or Claude), a response temperature (which determines how creative the responses are), and a token usage limit.
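The settings described above can be pictured as a single configuration object. The following is a minimal sketch in Python, not the ElevenLabs SDK: the class name `AgentConfig`, its field names, the default values, the `"example-llm"` label, and the 0.0 to 1.0 temperature range are all assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class AgentConfig:
    """Hypothetical container for the agent settings listed in the article."""

    language: str              # main language of the agent, e.g. "en"
    first_message: str         # the first thing the agent says
    system_prompt: str         # defines the agent's personality
    llm: str                   # placeholder label for the chosen model
    temperature: float = 0.5   # higher values give more varied responses
    max_tokens: int = 512      # token usage limit (assumed to be per response)

    def validate(self) -> None:
        # The 0.0-1.0 range is an assumption for this sketch.
        if not 0.0 <= self.temperature <= 1.0:
            raise ValueError("temperature must be between 0.0 and 1.0")
        if self.max_tokens <= 0:
            raise ValueError("max_tokens must be positive")
        if not self.first_message.strip():
            raise ValueError("first_message must not be empty")


config = AgentConfig(
    language="en",
    first_message="Hi! How can I help you today?",
    system_prompt="You are a friendly, concise support agent.",
    llm="example-llm",
)
config.validate()

# Serialize the settings, e.g. to store them or send them to a backend.
config_json = json.dumps(asdict(config), indent=2)
print(config_json)
```

Keeping validation next to the settings means an out-of-range temperature or an empty first message is caught before the agent is ever deployed.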
Users can also add knowledge bases, in the form of files, URLs, or text blocks, to enhance the capabilities of the conversational bot, and they can integrate their own custom large language model with it. ElevenLabs' SDK is compatible with Python, JavaScript, React, and Swift, and the company also offers a WebSocket API for further customization.
The company also lets users define what data to collect, such as the names and email addresses of customers interacting with the agent, and use natural language to define the criteria for judging whether a conversation was successful.
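One way to picture data collection rules and natural-language success criteria is as plain descriptions attached to an agent. The sketch below is illustrative only and does not reflect the ElevenLabs API: `DataField`, `SuccessCriterion`, `ConversationAnalysis`, and every field name are hypothetical. It only stores the definitions; how each criterion is actually evaluated against a conversation is left out.

```python
from dataclasses import dataclass, field


@dataclass
class DataField:
    """A piece of information to extract from a conversation (hypothetical)."""

    name: str          # identifier, e.g. "customer_email"
    description: str   # natural-language instruction for what to extract


@dataclass
class SuccessCriterion:
    """A natural-language definition of a successful conversation (hypothetical)."""

    name: str
    prompt: str


@dataclass
class ConversationAnalysis:
    data_fields: list[DataField] = field(default_factory=list)
    criteria: list[SuccessCriterion] = field(default_factory=list)

    def summary(self) -> str:
        # Compact overview of what is collected and what is evaluated.
        collected = ", ".join(f.name for f in self.data_fields)
        evaluated = ", ".join(c.name for c in self.criteria)
        return f"collect: {collected} | evaluate: {evaluated}"


analysis = ConversationAnalysis(
    data_fields=[
        DataField("customer_name", "The full name the customer gives."),
        DataField("customer_email", "The email address the customer gives."),
    ],
    criteria=[
        SuccessCriterion(
            "issue_resolved",
            "The customer confirms that their question was answered.",
        ),
    ],
)
print(analysis.summary())
```

Because both the extraction instructions and the success criteria are ordinary text, they can be edited without touching any code.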
ElevenLabs is leveraging its existing text-to-speech pipeline while also developing speech-to-text capabilities for its new conversational AI products. Currently, the company does not offer a standalone speech-to-text API but may launch one in the future to compete with speech-to-text APIs from companies like Google, Microsoft, and Amazon, as well as specialized APIs like OpenAI's Whisper, AssemblyAI, Deepgram, Speechmatics, and Gladia.
The company plans to raise a new round of funding at a valuation of over $3 billion and is competing with other voice AI startups, such as Vapi and Retell, that are also building conversational agents. More importantly, ElevenLabs will be competing with OpenAI's real-time conversational API. However, ElevenLabs believes its customization capabilities and its flexibility in switching between models will give it a competitive edge.
Highlights:
💬 ElevenLabs has launched a new feature for building conversational AI agents, allowing users to customize various parameters.
📚 Users can add knowledge bases to enhance agent capabilities and integrate custom large language models with them.
🚀 ElevenLabs plans to raise funds at a valuation of over $3 billion and compete with rivals like OpenAI.