Audiobox is Meta's next-generation audio generation research model. It can generate voices and sound effects from a combination of voice input and natural-language text prompts, making it easy to create custom audio for a wide range of use cases. The Audiobox family also includes the specialist models Audiobox Speech and Audiobox Sound, and all Audiobox models are built on the shared self-supervised model Audiobox SSL.