Until very recently, when someone mentioned an automated voice assistant, they were talking about chatbots.
Back then, the tech teams behind them had the thankless job of collecting every possible customer question, designing endless logic trees, and preparing scripted answers.
Speech recognition, known as Speech-to-Text, or STT, was unreliable, often mishearing words and producing strange requests.
And unless the user stayed strictly within the prepared topics, chatbots quickly hit dead ends. They could not handle context, nuance, or follow-up questions.
That’s what “automation” looked like from the late 20th century until ChatGPT arrived in 2022. It wasn’t great, and that’s why chatbots still have such a bad reputation.
Everything changed thanks to one big improvement: AI Assistants learned to handle questions they were never programmed for.
They learned language fluency (grammar, tone, even natural human rhythm), and they learned inference (logic, meaning, and a broader range of topics).
Old-school chatbots are gone. Long live the AI Assistant!
These assistants now come in three main forms, each suited to different needs.
And chances are, your brand could benefit from one of them.
We like to call the first form Text-First AI Assistants.
You can find them across websites, often living in the small chat bubble in the bottom-right corner.
Unlike the old bots that relied on keywords, these assistants handle full sentences and natural language. They offer a conversational alternative to the search bar.
For casual users, it’s not always easy to tell old bots apart from new ones, but the difference is obvious once the discussion goes beyond a single question.
Modern AI chatbots can follow the thread of a conversation and refine their answers as it develops.
On top of text, they can respond with images, maps, or even product visuals and prices.
Usually they rely on what they learned during training for instant answers, though that can mean missing out on breaking news or live pricing. When they fetch web information in real time, they typically say so, which explains the additional delay.
Every major AI assistant (ChatGPT, Gemini, Perplexity, Grok, Claude) still defaults to a text-first experience, which remains where most users spend their time.
Private Text Agents are ideal for brands addressing large, diverse audiences with precise in-house information. They are cost-effective, reliable, and simple to deploy.
To see an example built for a wine brand, you can try this link.
The self-service configuration for the Text AI Agent starts at €99 per month. Direct access is here.
Voice-First AI Agents are the unexpected stars of the LLM revolution.
Thanks to incredible advances in transcription (STT), they now understand speech instantly and accurately, even in noisy environments.
The basic ask/reply pattern hasn’t changed, but replacing typing with natural speech makes interactions fluid and human.
When both transcription and response times drop below a second, genuine real-time conversations become possible.
Voice is transformative. It unlocks new ways to engage customers, but it also demands top-notch performance.
When voice transcription runs directly on-device (as in WhatsApp), it’s lightning-fast but may lack extras like subtitles, autocorrect, or translation.
Interestingly, voice modes in most LLMs still lack features common in text, like images, maps, or product cards.
The reason is that these models were built for desktop first, not mobile. But that’s likely to change.
Public acceptance is another factor.
Speaking to your phone in public still feels odd in quiet settings like museums or classrooms. It’s fine in offices, stores, and cars.
Listening, however, is now universally accepted thanks to wireless earbuds, and new devices like smart glasses or bone-conduction headsets will make voice assistants even more discreet.
Voice Agents are not mainstream yet, but they hold the highest potential to deliver on the promise of AI-powered commerce.
You can try this link for a variety of products. Press the microphone button to ask a question, then press it again to hear the answer.
A variant used at the Château de Versailles can be tested here. It relies on the Live Chat technique, i.e. there is no button to press to ask a question or to get the answer.
The self-service configuration for the Voice AI Agent starts at €300 per month. Direct access is here.
James Cameron isn’t behind this technology (despite what his films might suggest), but AI Avatars bring cinematic realism to brand communication.
Avatars combine animated faces or full-body personas with speech to create strikingly lifelike digital presenters.
Among the major AI models, Grok is the only one that features avatars today.
Companies like Synthesia, HeyGen, and Avatar SDK are pushing the limits of realism to make live avatars a reality.
Will brands use them? Absolutely.
Many already rely on iconic characters: think Mr. Clean, Michelin’s Bibendum, Ronald McDonald, the Laughing Cow, Mario, the Green Giant, the Duracell Bunny, the moustachioed Pringles mascot, or the M&M’s crew. Avatars can reimagine these characters for the AI era, giving them interactive, intelligent personalities.
Some avatars will represent fictional brand mascots; others will be neutral experts, offering professional advice with a human touch. And internally, avatars can even simulate customers (angry, curious, demanding) to train teams through lifelike role-play scenarios.
It’s self-service training, elevated to a new level of realism, emotion, and feedback.
For a demo, please contact us using this link.
Most brands will soon design, train, and manage their own private AI Agent to handle customer interactions.
Together, text, voice, and avatar agents represent a new stage in digital communication:
AI that doesn’t just speak for your brand, but truly embodies it.