The RTS learns how artificial intelligence is reframing the world of content creation and production – and why it doesn’t do comedy
Daily, we are bombarded by headlines announcing the wonders – and risks – that generative artificial intelligence is bringing to our lives. AI has been used to help identify the hostages taken by Hamas from southern Israel on 7 October. More mundanely, apparently it can also help stem the alarming rise in shoplifting. On the other hand, it could put many of us out of work, lead to rampant breaches of copyright and, ultimately, make it nigh on impossible to tell what on our screens is fake and what is real.
Last month, a stimulating and ambitious RTS session, ‘Lights, camera, AI: The art of the possible from script to screen’, wrestled with how AI is affecting the production of TV and film.
Ironically, given the hi-tech nature of much of the discussion, the event was prone to glitches, with miscued videos and an unexplained bang as the panellists settled down for the evening’s sometimes mind-stretching debate.
At one point, attendees were told about something called “procedural content generation”, in which AI creates “infinite podcasts and infinite movies” that will react to us as individuals as we listen to or watch them. Apparently, it gauges our reactions to assess whether or not we’re enjoying a show and changes it if we appear not to be! And it’s not every RTS event that includes a rap about it created by ChatGPT.
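For the technically minded, the adaptive loop the panel described might look something like the sketch below: an engagement score decides whether the next stretch of a show carries on as planned or switches tack. Everything here is illustrative; the scoring, the generation step and the threshold are stand-ins rather than any real system.

```python
import random


def engagement_score(reaction: dict) -> float:
    """Hypothetical: reduce watch-time signals to a 0-1 engagement score."""
    watched = reaction["seconds_watched"] / reaction["segment_length"]
    return min(1.0, watched)


def generate_segment(theme: str, variation: int) -> str:
    """Stand-in for a generative model producing the next stretch of a show."""
    return f"Segment on '{theme}' (variation {variation})"


def adaptive_show(theme: str, segments: int = 5, threshold: float = 0.6) -> list:
    """Serve segments one at a time; if the viewer seems bored, change tack."""
    playlist, variation = [], 0
    for _ in range(segments):
        playlist.append(generate_segment(theme, variation))
        # Simulated reaction; a real system would read play-head or sensor data.
        reaction = {"seconds_watched": random.randint(10, 60), "segment_length": 60}
        if engagement_score(reaction) < threshold:
            variation += 1  # the viewer appears not to be enjoying it
    return playlist


if __name__ == "__main__":
    for segment in adaptive_show("wildlife"):
        print(segment)
```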
Host Lara Lewington, tech presenter, journalist and AI commentator, began by making the point that even “the godfathers of AI can’t agree on anything, not even a definition of AI.”
So how would panel member Danijela Horak, Head of Applied Research, AI at BBC R&D, define it? “In the old days of machine learning, we had to train the model for a specific task and train many models to execute specific tasks. Today, with the likes of ChatGPT, we no longer need to do that: you send a single query to ChatGPT and get the answer immediately.”
Matthew Griffin, futurist, author and CEO of the 311 Institute, said he had been talking about generative artificial intelligence for about eight years. He believed we were roughly where he expected we would be in terms of the development of the technology and the rate of investment. “From now until 2035, the technology will accelerate quicker than before,” he predicted. “We’ve got technology that can do amazing stuff, but people may not want to use it. It may not be cheap enough or easy enough to use.”
Victoria Weller, Chief of Staff at ElevenLabs, which specialises in AI-generated voiceovers, said her company’s objective was to make content universally accessible in any voice and in any language and so transcend language barriers. “We can go from written text to natural-sounding human audio virtually in real time,” she said. ElevenLabs can also generate voices that never existed before.
But was instant voice cloning safe, asked Lewington, suggesting that it could make it even easier to create deepfakes, such as a recent one that purported to show Labour leader Keir Starmer verbally abusing his staff, published on the eve of the party conference.
As the debate opened out, the RTS was shown examples of the BBC using AI for more innocuous purposes. These included a vintage black-and-white clip of a youthful David Attenborough, the second Controller of BBC Two, extolling the virtues of colour TV. Horak said AI could be used to help restore and digitise archive content, principally videos and photographs. It was also being deployed to convert text to speech and introduce more regional accents to the BBC’s output.
It could also make media production workflows more efficient. This was demonstrated with a clip from Autumnwatch. Previously, a producer had to go through countless hours of footage to identify a specific bird species, but machine learning could do the same task far faster.
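As an illustration of that kind of footage-sifting, the sketch below runs an off-the-shelf image classifier over sampled frames and notes the timecodes where a target species is the top prediction. The model (torchvision’s pretrained ResNet) and the label are assumptions chosen for the example; the BBC’s actual tooling has not been described in that detail.

```python
import cv2  # opencv-python, used to sample frames from the footage
import torch
from torchvision.models import resnet50, ResNet50_Weights

# Pretrained ImageNet classifier as a stand-in for whatever model the BBC used.
weights = ResNet50_Weights.DEFAULT
model = resnet50(weights=weights).eval()
preprocess = weights.transforms()
labels = weights.meta["categories"]


def find_species(video_path: str, target: str = "goldfinch", every_n: int = 25) -> list:
    """Return timestamps (in seconds) of sampled frames where `target` is the top label."""
    capture = cv2.VideoCapture(video_path)
    fps = capture.get(cv2.CAP_PROP_FPS) or 25.0
    hits, frame_index = [], 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        if frame_index % every_n == 0:
            rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            tensor = preprocess(torch.from_numpy(rgb).permute(2, 0, 1))
            with torch.no_grad():
                top = model(tensor.unsqueeze(0)).softmax(dim=1).argmax().item()
            if labels[top] == target:
                hits.append(frame_index / fps)
        frame_index += 1
    capture.release()
    return hits
```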
AI also offers some intriguing possibilities for sports producers. Weller shared footage of motor racing as the race commentary was successively dubbed by AI from the original English into Hindi, Polish, German and French, all in almost real time.
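The broad shape of that kind of pipeline is transcribe, translate, then re-synthesise in a cloned voice, chunk by chunk, so latency stays low. The sketch below shows only the plumbing: the three model calls are hypothetical stand-ins, not ElevenLabs’ actual API.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass
class CommentaryChunk:
    """A few seconds of race commentary, kept short so dubbing stays near real time."""
    start: float   # seconds from the start of the broadcast
    audio: bytes   # raw English audio for this window


def transcribe(audio: bytes) -> str:
    """Hypothetical speech-to-text call (a Whisper-class model, say)."""
    return "And he takes the lead into the final corner!"


def translate(text: str, target_language: str) -> str:
    """Hypothetical machine-translation call."""
    return f"[{target_language}] {text}"


def synthesise(text: str, voice_id: str) -> bytes:
    """Hypothetical text-to-speech call in the commentator's cloned voice."""
    return text.encode("utf-8")


def dub_stream(chunks: List[CommentaryChunk], language: str, voice_id: str) -> Iterator[Tuple[float, bytes]]:
    """Yield (start_time, dubbed_audio) pairs one small chunk at a time."""
    for chunk in chunks:
        english = transcribe(chunk.audio)
        localised = translate(english, language)
        yield chunk.start, synthesise(localised, voice_id)


if __name__ == "__main__":
    feed = [CommentaryChunk(start=12.0, audio=b"...")]
    for start, audio in dub_stream(feed, language="Hindi", voice_id="commentator-01"):
        print(start, audio)
```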
There is now the potential to cover sports events without a camera operator or a director. The idea, Lewington said, was not to abolish these positions and make people redundant, but to provide broadcasters and streamers with opportunities to show “things that might otherwise not be filmed or have a commentary”. She noted that the BBC conducted experiments along these lines at this summer’s Wimbledon, but “found them quite glitchy”.
Griffin pointed out that the pioneering AI company Metaphysic had, for some time, been creating “hyperreal content”, famously crafting a Tom Cruise deepfake during the pandemic and a Simon Cowell avatar for America’s Got Talent.
He added: “Producers can now take a foreign-language movie they have made and automatically dub it into English.” As for creating avatars, he said: “Most technology developments are linear. We fixed wonky mouth syndrome, then the eyes and shoulders [which need to relate to hand gestures]. We’ve now got hand actions because the AI understands the context.”
Hollywood had reached a point where the studios were trying to automate different parts of pre- and post-production, from scriptwriting to colourisation. This had contributed to the writers’ strike, which has since been resolved, but such disruptive pressures on working practices can only increase.
The producers of the gloriously irreverent South Park have, not surprisingly, said Griffin, “been messing about with deepfakes for the past four years. On YouTube they’ve created a deepfake channel that takes pops at Donald Trump. They’re doing what a lot of companies should be doing, they’re experimenting.”
He added: “In the conversations I have with Disney, the company says, ‘Ultimately, we want an artificial intelligence where you push a magic red button and it spits out a blockbuster Marvel movie.’ We then get into how AI knows what a good movie is. When we talk about AI being able to create good content, it looks at the ratings and the reviews.”
Rather alarmingly, Griffin suggested that, in the future, books would all be written by AI. “Now that artificial intelligence has mastered human language, it’s able to write books, blogs, journalism, scripts,” he said. “Ultimately, AI is going to start eating into libraries. ‘This is the human section over there,’” he quipped.
Lewington forecast that books authored by people, rather than machines, would carry a premium.
On the vexed question of how to help audiences tell real images from those created by AI, the jury still appears to be out. “We haven’t found the answer yet,” said Weller. “One thing we’ve found for AI audio content is that, when people are confronted with a piece of content, they need to have a way to figure out: ‘Is this real or is it AI-generated?’”
Here, it seems, the machines may have the upper hand. Griffin said: “We can apply watermarks, but they can be stripped off quite easily. Generative AI can be used to strip the watermark off sound and image content.”
We may not be far off the day when AI is creating the next Disney animated blockbuster, but it seems that anyone hoping to be the next Victoria Wood or Sharon Horgan can relax – generative AI doesn’t do humour.
“About a year ago we found that AI was rubbish at comedy,” said Griffin. “How do we get AI to be funny? It depends on the commands it is given. If you tell it to write jokes in the style of a famous comedian, you get a better result. By changing the prompt, you can affect ChatGPT’s style. If you wanted AI to create a comedy podcast, it would be bad – it doesn’t get comedy.”
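In practice, the steer Griffin describes is just a different prompt. The fragment below, using OpenAI’s Python client, shows the same request with and without a stylistic instruction; the model name and prompts are assumptions for illustration, not what was demonstrated on the night.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def joke(style_instruction: str, topic: str) -> str:
    """Ask the model for a joke; the system prompt carries the stylistic steer."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model name; substitute whatever is current
        messages=[
            {"role": "system", "content": style_instruction},
            {"role": "user", "content": f"Write a short joke about {topic}."},
        ],
    )
    return response.choices[0].message.content


# The only difference between these two calls is the prompt.
print(joke("You are a helpful assistant.", "television schedules"))
print(joke("Answer in the style of a dry observational stand-up comedian.", "television schedules"))
```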
‘Lights, camera, AI: The art of the possible from script to screen’ was an RTS national event held at London’s Cavendish Conference Centre on 9 October. The producers were Phil Barnes and Kim Chua.