Cooking up voices

What began at Soundcircus a year ago as a playful experiment with the capabilities of AI has evolved into an advanced AI tool that revolutionizes synthesized speech. How did the Soundcircus Intelligent Speech System come into being? What can it do now, and where is it headed? Michiel, an AI specialist at Soundcircus, and “Circus Director” Kees share insights into the next frontier of voice design: “For three decades, our mission at Soundcircus has been to craft unique voices for every client. Now, we can create those completely bespoke voices ourselves.”

By Sanne Houwing, photo by Martijn van de Griendt (and AI)


INTRODUCING ALLIEGO DINGEL  At Soundcircus, we’re known for finding iconic voices for brands. Voices that resonate through commercials and campaigns and lend extra character to their message. However, our collaboration with Joe Public marked a significant shift – rather than searching for a voice, we crafted one. Michiel elaborates: “It’s about designing a voice and bringing it to life through a voice actor. Our system overlays a custom-designed voice onto the voice of the actor.” In the case of the new ALDI voice, that meant composing one out of the voices of ALDI employees themselves. Ten voices from employees across ALDI’s stores, distribution centers, and offices were the building blocks for this AI-generated voice. Kees adds: “From the many voices that our Soundcircus Intelligent Speech System then generated, we chose the one that best suited the ALDI brand. This is how Alliego Dingel was born.”

THE GENESIS  Our journey into synthesized speech began with the arrival of text-to-speech software from America. Kees reminisces: “We wanted to explore its capabilities – its range, its ability to convey emotion. Initially, it fell short. Sure, it could raise its volume with capital letters and pitch with commas, but it lacked the depth required to replace a voice actor. That remained the status quo for years, until the recent explosion of AI and the ease of cloning voices. This prompted us to question: could AI be a tool for us, rather than a force to replace us? That’s when we delved into the possibilities at Soundcircus.”

AI AS A CREATIVE TOOL  Michiel took the plunge into AI technology: “At first, it was just tinkering. But soon, we realized the potential of AI as a creative tool. It became instrumental in refining audio – salvaging poorly recorded sounds became a breeze with AI.” Kees adds: “Previously, noise reduction and filtering could only do so much. If the noise was too overwhelming, redubbing was the only option. With AI, noise disappears, leaving only the voice. It became a powerful tool for us. But we wanted more – could we train AI to produce Dutch, capturing the nuances of human speech?”

CRAFTING THE VOICE  While text-to-speech prioritizes functionality and efficiency, Michiel sought something more human: “Most speech software aims for speed, churning out consistent results. But at Soundcircus, we cherish the serendipitous moments in voice recordings – the imperfections that make it human. Speech-to-speech allows us to capture that essence.” Kees adds: “For thirty years, we’ve strived to give each client a distinct voice. Previously, we scouted theaters for talent, but now we have complete control. With the Soundcircus Intelligent Speech System, we can craft the perfect voice for any brand.”

REAL-TIME AI VOICES  “The Soundcircus Intelligent Speech System facilitates a synergy between physical and artificial voices,” Michiel explains. “For musicians, it’s akin to a vocoder. The system takes in a voice, with all its nuances, and transforms it into a custom-designed voice. What remains is the character – warmth, fragility, humanity – while the voice itself can be tailored to any desired tone. And we can do it all in real time.”


Sanne: “Wait a minute. You made a sheep talk?” Kees: “Yes, by adding the sound of a sheep to the mix. Then you get more of that bleating sound.” Michiel: “I even made a cello talk!”

VOICE ARTISTRY  How crucial is the human touch in this AI-driven world? Do we still need voice actors? Kees reflects: “A beautiful voice is more than just sound – it’s the interpretation that breathes life into it. Our AI-generated voice is a canvas awaiting the artist’s touch – the voice actor shapes its range and tone.” Michiel agrees: “Sometimes, our AI struggles with certain sounds or emotions, like a hearty laugh or a whisper. We constantly refine our system to accommodate these nuances.”

A VOICE FOR EVERY OCCASION  Can you purchase a voice for eternity? Kees explains: “Our Soundcircus Intelligent Speech System operates off grid, and anyone lending their voice to our system signs an agreement allowing perpetual use within our system. They’re also compensated for any use in synthesized voices. We’ve created a model that values collaboration between AI and human talent. While the AI-generated voice belongs to Soundcircus, it can serve as a timeless asset for any brand.” Michiel adds: “Moreover, this voice isn’t bound by language – a French, American, or Chinese voice actor can command the same voice. It’s a universal brand voice. Isn’t that remarkable?”

For more information, mail us.