Making conversation: Charlie Cadbury on bringing the smart speaker revolution to CTV ads

This article was originally published in Performance Marketing World on 17th April 2023.

Voice assistants are becoming a mainstay of UK living rooms – but what’s next for the audio ad industry in the age of connected TV and generative AI?

Audio adtech firm Say It Now has pioneered Actionable Audio Ads, with the likes of Tesco, Pizza Hut and Specsavers making use of its voice activated campaign tools.

As the company matures into a software as a service offering, PMW sat down with CEO Charlie Cadbury to learn more about soaring voice-through rates, AI ‘hallucinations’ and why CTV and Alexa are a match made in heaven. 

Q. Say It Now launched back in 2019. How has the audio ads market evolved since then?  

“We first ran actionable audio ad campaigns back in 2020, working with charities and not-for-profits to test out whether anyone would possibly talk back to any radio ads given the right kind of incentive to do so – which they did! Following that, over the last couple of years, we’ve been getting better at understanding what we need to say in the creative to get people to come back to these ads, then how you get the ‘hook’ as tasty as it could be in order to get people engaging.

“This is driving what we define as a ‘virtuous circle’. Amazon, who are often quite reticent to share any of their information, surprisingly put out a statement in Q4 2022 saying that Alexa engagement is rising 30% year over year. So people are spending 30% more time talking to their Alexa in 2022 than they were in 2021, which is great as a high growth channel. 

“That then helps us bring more advertisers on because of people coming into this channel and also because we’ve now got our own tracking capabilities. We’ve now got a fully mature tracking suite, so we can tell exactly when an advert is played across which publishers, whether that’s on Spotify, Heart or Xfm. We can show how many people are engaging at what time of day, allowing us to optimise these campaigns in real time.”

Q. Can you give an example of how this type of audio ad works in practice? 

“So, if you were a pizza company running an Actionable Audio campaign and you saw that many more people were engaging with your ads on a Thursday evening than a Tuesday morning, you could see that result come across in the first one or two weeks and then shift your advertising mix.

“You stop spending advertising on Tuesday morning and put that budget onto a Thursday evening. Thus the whole performance of that campaign goes up. That also allows us to shoot towards what those campaign outcomes should be. 

“This builds a business case for the brand to come back to us and rebook. And that’s what we’ve seen happen over the course of last year – when people try this once they go, ‘Oh, this is great. We can now actually aim for something and massively reduce that feedback loop!’ Historically you’d run all your campaigns, then do a post-campaign analysis and reoptimize for the next one. That would take a cycle or two – often around three months or more. We can now do all of that mostly in real time.”
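The budget shift described in this answer can be sketched in a few lines of Python. This is only an illustrative sketch, not Say It Now's actual tooling: the function names, the daypart labels and all the figures are assumptions invented for the example, and splitting budget in proportion to voice-through rate is just one simple rule an advertiser might choose.

```python
from collections import defaultdict

def voice_through_rates(events):
    """Return engagements divided by ad plays for each daypart.

    `events` is an iterable of (daypart, plays, engagements) tuples.
    The tuple layout is a hypothetical format for this sketch.
    """
    totals = defaultdict(lambda: [0, 0])
    for daypart, plays, engagements in events:
        totals[daypart][0] += plays
        totals[daypart][1] += engagements
    return {d: (e / p if p else 0.0) for d, (p, e) in totals.items()}

def reallocate_budget(total_budget, rates):
    """Split the budget across dayparts in proportion to their rates."""
    total_rate = sum(rates.values())
    if total_rate == 0:
        # No engagement anywhere yet: fall back to an even split.
        share = total_budget / len(rates)
        return {d: share for d in rates}
    return {d: total_budget * r / total_rate for d, r in rates.items()}

# Made-up numbers: Thursday evening engages four times as often.
events = [
    ("tuesday_morning", 10_000, 20),
    ("thursday_evening", 10_000, 80),
]
rates = voice_through_rates(events)
budget = reallocate_budget(1000.0, rates)
```

With these invented figures, most of the budget moves to Thursday evening. A real campaign would probably also want a minimum spend per daypart so that under-performing slots keep producing data.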

Q. How are Actionable Audio Ads being used by brands in conjunction with wider marketing campaigns? 

“What we’re seeing with our internal metrics is that the response to the media has doubled over the last year. Our voice-through rate is managing to become on par with other channels’ click-through rates. Alongside performance, it’s got great branding opportunities. Brands are seeing all the benefits of why they have been using audio in the past, but there’s now a whole load more insights. It’s easy to build a business against that. That’s the virtuous circle I was talking about.  

“The more of these campaigns we run, the more people hear these types of ads on the radio. You try one and get a hefty pizza discount and think, ‘Oh, that was an enjoyable user experience’, then you’re more likely to engage with the next offer you hear on the radio.

“It’s also become an education piece. We’re speaking at more events. We were speaking at the Campaign Radio and Audio Advertising Summit and nearly all the sessions touched upon Actionable Audio Ads or they gave a wink to us at Say It Now in the corner. It’s a great testament to the practice to see that evangelism being played back to us.”

Q. Two of Say it Now’s most recent client campaigns – for Tesco and Pizza Hut – have the ability for listeners to ‘set reminders’ on their smart speakers after hearing the ad. What are the most useful ‘calls to action’ in Actionable Audio? 

“There are two sides to that answer. We always start with the consumer first. What’s the benefit to them? What’s the barrier that we’re trying to overcome? How do you get them to fire up their vocal cords and say, ‘Alexa, open Pizza Hut delivery’?

“So you play to human behaviours, desires and needs. We’re dopamine junkies and want this instant gratification. We found the campaigns that work best are where you get some kind of reward immediately. So that can be: ‘I’m gonna get a discount for my pizza that I can order right now.’ 

“We had a great response from MSC Cruises that was aimed at a particular audience who were likely to want to go on a boat cruise holiday. It was just to request a callback or a brochure. We achieved a really good response rate on that, so much so that we won an award with [WPP agency partner] Xaxis.

“What the judges really liked was that we were driving people to put their details in – getting that first party data – but also once people entered this voice experience they had longer engagement rates. We’re getting somewhere on average between 20 and 40% completion rate of that user action.  

“The minute you’ve said ‘Alexa and MSC Cruises’, for example, you are then in a ‘back and forward’ conversation and that carries its own conversational cadence. So people tend to carry on and complete that action, whereas on a landing page they’re more likely to lose interest and drift off. On a normal landing page for a campaign like this, the comparable conversion rate is between 3 and 5%. On what we are doing – voice landing pages – we are seeing much higher conversions.”

Q. Say it Now is setting its sights on another medium: Actionable TV Ads. Can you explain how that works? 

“This has been on our roadmap since day one. In 2019 we registered three trademarks for Actionable Audio Ads, Actionable TV Ads, and Actionable Outdoor Ads. We are a couple of years away from people talking to their assistants in cars.  

“The first couple of years were spent laying the foundation for Actionable Audio Ads. Over the last six months, we’ve been building a load of partners who are going to do in connected TV (CTV) what we have done in audio. We had a six month exclusive contract with one partner. We’re about to launch our first campaigns in the USA with that partner.

“It works like this: you are watching CTV. A commercial comes on and says ‘if you’d like to know more about a campaign or get a discount, then just turn to the smart speaker that could be in the corner of your room (or could be built into the TV) and say these words’. It’s very hard to buy a TV today which doesn’t have a voice assistant embedded. Either way, you then engage back with that advertising message.”

Q. CTV ads already have more measurable elements than their linear TV counterparts. How would you convince a brand to bring in an additional smart speaker command into their TV strategy? 

“The big value proposition here is those types of ads find it really hard to deliver attribution and engagement. What we’re doing is delivering maybe another 30 or 60 seconds of brand engagement beyond the ad and we can then deliver that all important attribution – so we know exactly when people are engaging with that.  

“The way that’s currently being done on CTV is through QR codes. People are putting QR codes on these ads and asking you to scan. However, that is at odds with the behaviour that you see from the consumer in the living room. If you imagine yourself in the living room watching TV, you are in a lean back mode, and you want a frictionless way to engage with the messages that come to you in a 30 second ad spot. You don’t want to have to rush and fumble and find your phone and get out the camera app and point it at the TV to get your QR code. What you really want to do is continue sitting in exactly the same position, doing exactly what you’re doing, with no break in your behaviour.

“With mobiles, we’ve got this idea of a second screen in the room, but with voice, the way that Google and Amazon talk about this is this idea of ‘ambient computing’. It’s another layer of engagement away from the screen. That doesn’t break any of your kind of behaviour so you can still be scrolling Instagram. It just allows another conversation and you can continue watching TV. So you continue watching your show whilst engaging on another channel with that advertising message. It’s completely seamless in the background and that’s why people are so excited.” 

Q. Is there a concern that you are alienating the proportion of the TV watching audience that won’t have a smart speaker in the room, but will have a smartphone? 

“We can be very clever about how we target these ads. So we can build audience segments where it’s likely that the viewer owns a smart speaker – such as if someone’s an Amazon Prime member. Even if you just run a proportion of your campaign using an Actionable TV Ad, you can still take in all of the insights in real time and apply them to the rest of the campaign that’s running concurrently.

“So you’ve got two audiences. For one, you’ve got high confidence that they have a smart speaker, so you target them with the Actionable TV Ad. For the other, you don’t place the voice call to action. You can still optimise group two with the insight you get from group one: because it’s a live dashboard, you can make that change immediately, making every dollar go further.”

Q. Looking to the future, there’s a lot of talk about generative AI chatbots. How will that affect audio interfaces as conversations with AI become more sophisticated?

“We are hopefully always ahead of this! We’ve been using ChatGPT for a while for certain use cases within our business. For example, what it has been really good at is expanding utterances. So, when asking for a cup of tea, there’s probably hundreds of ways I could ask you for a cup of tea:

  • “Can I have a cup of tea?”
  • “Could you get me a cup of tea?”
  • “I’d love a cup of tea!”

“When we’re building these conversation experiences, we need to cater for as many different ways of asking the same question as possible, and that builds better resilience within the language models. So we’ve been using ChatGPT to map and optimise for the multiple branches and nuances of natural language.”
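The idea of catering for many phrasings of one request can be illustrated with a small Python sketch. It is a deliberate simplification and not how Alexa or Say It Now resolves intents: the intent name `request_tea`, the helper functions and the exact-match lookup are all assumptions made for the example, and production voice platforms rely on trained language models instead of a lookup table. The sample utterances are the three quoted above.

```python
import string

# Hypothetical intent name mapped to the sample utterances from the article.
SAMPLE_UTTERANCES = {
    "request_tea": [
        "Can I have a cup of tea?",
        "Could you get me a cup of tea?",
        "I'd love a cup of tea!",
    ],
}

def normalise(text):
    """Lower-case, unify apostrophes, drop other punctuation, tidy spaces."""
    text = text.lower().replace("\u2019", "'")
    text = "".join(
        ch for ch in text if ch == "'" or ch not in string.punctuation
    )
    return " ".join(text.split())

def build_index(samples):
    """Map every normalised sample utterance to its intent name."""
    return {
        normalise(utterance): intent
        for intent, utterances in samples.items()
        for utterance in utterances
    }

def match_intent(text, index):
    """Return the intent for a known phrasing, or None if unrecognised."""
    return index.get(normalise(text))

index = build_index(SAMPLE_UTTERANCES)
```

In this sketch, adding more sample utterances (for instance, ones suggested by a generative model and then reviewed by a person) widens the set of phrasings the lookup recognises, which is the resilience the answer describes.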

Q. How reliable is ChatGPT as a research tool for marketers? 

“There are other tasks within the organisation where we will use generative AI as a service provider, rather than as the rock and foundation for everything that we’re doing. It’s important to have an understanding of the difference between these kinds of chatbots in their current state and what you get from Alexa.

“It’s not to be underestimated the amount of time, energy, thought and compliance that Alexa goes through in order for it to be the assistant that it is. If you ask Alexa a question, she will tell you the answer that she knows to be a fact… unless she doesn’t know the answer.  

“So if you ask for the capital of Sweden, she will tell you the answer. If you ask her a question where she doesn’t know the answer, she’ll say:

  • “I don’t know the answer to that question. Here are some answers I found on the web.”

“Now the inference here is that the web is full of rubbish – it’s hard to discern fact from fiction on the web. But if Alexa does know something for sure, she’ll tell you that for sure. That’s because she needs to maintain her professional integrity so she builds trust with you, unlike the web that maybe hasn’t built that trust with you over time.

“On the other hand, what ChatGPT is really good at doing is answering every single question with something that sounds right. As more people are learning and seeing what’s coming out of this generative AI, you are getting more compelling answers. But this is leading to what some people are referring to as ‘hallucinations’. You ask a question and it kicks you back an answer that sounds about right. It must be true! And then sometimes if you dig, you’ll find that answer is not necessarily true. 

“I think there’s a bit of a way to go to find a sweet spot between those two environments. But there’s loads of useful things you can do [with generative AI] as long as you can put a layer of compliance over the top. It seems like magic and is really compelling, but always check your sources!”

Q. So what’s next for Say it Now?

“We’ve been running campaigns across Europe with lots more activity from clients in Germany and Spain. The multilingual aspect has come back as a challenge, which is good to deliver against. Throughout the course of this year, we’ll be moving some of our more engaged customers onto a self-service model.

“The big difference between this time last year and today is that we used to require a little bit of development time to build out these campaigns, but now our platform has got to the point where we’re able to do these with non-developers producing many of the campaigns.

“Very soon our clients will be able to produce these themselves. That’s where we’re going as a business – building out as a self-service platform, to offer any bespoke services on top of these managed services.”  

Q. You are planning another tour of the USA this summer. Is the USA your next big target market after Europe? 

“We’ve already started in North America with an acquisition of a federally incorporated Canadian entity. We’ve had boots on the ground in the USA since that acquisition. We’ve got a number of marquee clients, and we are already running campaigns with NBC. That’s given us lots of calling cards!”

For more insights on the audio advertising industry, listen to Charlie Cadbury’s Attention Seekers podcast episode below.
