Bards Lost in The Metaverse Episode 30 - AI Narration
This week, Sharn and Andy explored the exciting and ever-evolving world of AI Narration. Whether you're a writer, content creator, or just a fan of innovative technology, you won't want to miss this deep dive into the latest advancements and tools available. But that's not all – we also discussed the important considerations you need to keep in mind before diving into this cutting-edge field. And as an added bonus, we used some of the AI voice tools to talk about themselves and demonstrate the technology for us. But first, the news.
It’s Monday 17th April 2023, and this is news with Sharn:
In Web3 and tech news this week:
The Ethereum blockchain's Shanghai upgrade, also known as "Shapella," went live on April 12th, marking the completion of the network's transition from proof-of-work to proof-of-stake. However, stakers may not receive their rewards immediately, depending on how they staked their ether (ETH). Validators who directly stake their ETH will have their rewards automatically unlocked, while stakers who used a staking service or pool will have to wait for the service provider or pool to determine when to release the rewards. This has caused uncertainty among stakers, especially those who are eagerly awaiting their rewards.
The US Securities and Exchange Commission (SEC) is increasing its regulation of digital assets with the addition of general attorneys to its crypto enforcement division in New York, Washington DC, and San Francisco. The hiring comes after the agency stated in March that it planned to expand its Crypto Asset and Cyber Unit (CACU). The SEC has been cracking down on the crypto industry under Chairman Gary Gensler, and the new attorneys are expected to investigate "crypto asset securities," develop litigation plans, and draft legal documents.
Moving now to the world of Publishing:
Book lovers worldwide are expressing sadness as online book retailer Book Depository prepares to shut down by the end of the month. The UK-based company, which offers more than 20 million books and free delivery to over 120 countries, was acquired by Amazon in 2011. The closure follows Amazon's announcement of cost-cutting measures and job losses, including changes to its book-selling operations, such as ending the sale of magazine and newspaper subscriptions on its Kindle e-book device. As Book Depository prepares to close its virtual doors, customers reflect on the end of an era in online book retail.
In a move that pits Twitter against newsletter platform Substack, Twitter has made it almost impossible to interact with tweets containing a Substack link. Substack has been making moves to be more than just a newsletter distributor, with its latest innovation being Notes, a feature that allows creators to share short snippets of thought with their followers. This move has made Substack more like Twitter, prompting the response from the social media giant. The situation highlights the need for writers and journalists to be agile in regard to the platforms they use, as even established platforms can disappear or lose value overnight.
And that, my friends, was news with Sharn.
And now into the topic for today. AI Narration:
What is AI narration?
AI voices are synthetic voices that mimic human speech. They are produced through deep learning, where artificial intelligence is used to convert text into speech, a process often referred to as TTS or text-to-speech.
AI voice narration, also known as text-to-speech (TTS), is a technology that enables computers to convert written text into spoken words. It uses artificial intelligence (AI) algorithms to generate human-like voices, allowing users to listen to written content rather than reading it.
Here's how it works:
First, the text is processed and analyzed for things like grammar, syntax, and punctuation.
Next, the TTS system selects a voice from its library of available options. These voices can be pre-recorded by actors or generated by AI algorithms.
The TTS system then uses these voices to convert the text into speech. This involves breaking the text into individual phonemes (the basic units of sound in a language) and combining them to form words, sentences, and paragraphs.
The system can adjust things like intonation, pitch, and emphasis to create a more natural-sounding voice. (I have seen these referred to as neural voices.)
Finally, the synthesized speech is output as an audio file, which can be played back on any device with a speaker.
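To make the steps above a little more concrete, here is a toy sketch in Python. It is not how any real TTS product works: the two-word lexicon, the voice names and the beep-per-phoneme "synthesis" are all made up for illustration, where a real system would use a trained voice model. It only shows the shape of the pipeline: analyse the text, pick a voice, split the words into phonemes, vary the pitch, and write an audio file.

```python
import math
import re
import struct
import wave

# Hypothetical, tiny pronunciation dictionary (word -> phonemes). Real systems
# use large lexicons or learned models; these entries are for illustration only.
LEXICON = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}

# Hypothetical "voice library": each voice is just a base pitch in Hz.
VOICES = {"low": 110.0, "high": 220.0}

PUNCTUATION = set(".,!?")
SAMPLE_RATE = 16000          # samples per second
SAMPLES_PER_PHONEME = 1920   # 0.12 seconds at 16 kHz


def normalize(text):
    """Step 1: analyse the text by lowercasing it and splitting out words and punctuation."""
    return re.findall(r"[a-z']+|[.,!?]", text.lower())


def to_phonemes(tokens):
    """Step 3: break words into phonemes; punctuation becomes a pause marker."""
    phonemes = []
    for token in tokens:
        if token in PUNCTUATION:
            phonemes.append("PAUSE")
        else:
            # Unknown words fall back to spelling out their letters.
            phonemes.extend(LEXICON.get(token, list(token.upper())))
    return phonemes


def synthesize(phonemes, base_pitch):
    """Steps 3 and 4: render each phoneme as a short tone with a varying pitch.

    A sine beep stands in for a real voice model, so the result is not speech.
    """
    samples = []
    for index, phoneme in enumerate(phonemes):
        if phoneme == "PAUSE":
            samples.extend([0] * SAMPLES_PER_PHONEME)
            continue
        # Crude "intonation": pitch depends on the phoneme and drifts downward.
        pitch = base_pitch + (sum(map(ord, phoneme)) % 80) - index
        for n in range(SAMPLES_PER_PHONEME):
            value = math.sin(2 * math.pi * pitch * n / SAMPLE_RATE)
            samples.append(int(value * 12000))
    return samples


def write_wav(samples, path):
    """Step 5: output the audio as a 16-bit mono WAV file."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(SAMPLE_RATE)
        wav_file.writeframes(struct.pack("<%dh" % len(samples), *samples))


def narrate(text, path, voice="low"):
    """Run the whole toy pipeline and return the phoneme list that was rendered."""
    base_pitch = VOICES[voice]  # Step 2: select a voice from the library.
    phonemes = to_phonemes(normalize(text))
    write_wav(synthesize(phonemes, base_pitch), path)
    return phonemes
```

Calling narrate("Hello, world!", "hello.wav") writes a short file of beeps and pauses, one per phoneme or punctuation mark. Swapping the beep generator for a trained model is where the real work (and the "AI") comes in.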
AI voice narration has a variety of applications, from accessibility tools for the visually impaired to automated customer service and even creating audiobooks or podcasts. It can also be used to create synthetic voices for virtual assistants, chatbots, or other AI-powered applications.
AI Voice Narration Vs AI Voice Cloning
AI voice narration and AI voice cloning are two related but distinct technologies.
AI voice narration, as explained earlier, uses artificial intelligence algorithms to convert written text into spoken words. The synthesized speech can sound like a real person, but it's not an actual recording of someone's voice. Instead, it's generated using an AI algorithm that has been trained on a particular voice or set of voices, or even synthesized voices that do not belong to any particular person.
On the other hand, AI voice cloning involves training an AI algorithm to mimic a particular person's voice. This technology uses a database of audio recordings of that person's voice to train the algorithm. The algorithm then learns to mimic the person's voice, including their unique speech patterns, accent, and intonation.
The main difference between the two technologies is that AI voice narration can produce synthesized voices that don't belong to any specific person, while AI voice cloning can create a synthetic version of a specific person's voice.
The applications for AI voice cloning are numerous, including creating synthetic voiceovers for movies or television shows featuring deceased actors or celebrities, generating voice commands for virtual assistants like Siri or Alexa, or even creating personalized audiobooks narrated by the author in their own voice.
Fun fact: AI technology was used to clone the voice of James Earl Jones, and that cloned voice now voices the Darth Vader character.
Overall, while AI voice narration and AI voice cloning are both based on similar technologies, they have distinct differences in their purpose and application.
What can you do with AI narration?
There are many different use cases for AI voices, and they are becoming increasingly popular in a variety of industries. Here are some real-world examples:
Audiobooks and Podcasts: AI voice narration can be used to generate synthesized voices for audiobooks and podcasts. This allows publishers to produce audio content at a lower cost and faster speed.
Virtual Assistants and Chatbots: AI voice cloning can be used to create personalized virtual assistants and chatbots that sound like a particular person. This can make interactions more engaging and natural for users.
Accessibility Tools: AI voice narration can be used to create audio versions of text content for people who are visually impaired or have difficulty reading. This can make digital content more accessible for a wider range of people.
Voiceovers for Movies and TV Shows: (In a world) AI voice cloning can be used to generate synthetic voiceovers for movies and TV shows. For example, in the 2020 film "I Am Patrick Swayze," an AI voice was used to narrate the actor's life story.
Customer Service: AI voice cloning can be used to create personalized customer service experiences, where customers can interact with a chatbot or virtual assistant that sounds like a specific person, such as a company CEO or spokesperson.
Education and Training: AI voice narration can be used to create instructional materials for online courses and training programs. This can make learning more engaging and accessible for students.
Entertainment: AI voice cloning can be used to create synthetic versions of famous actors or musicians for entertainment purposes. For example, a hologram of the late rapper Tupac performed on stage during a music festival, and AI voice cloning could be used to give similar performances a voice.
Overall, AI voice technology has a wide range of applications, and its use is only expected to grow in the future.
What are some AI Narration Tools?
Amazon Polly: Amazon Polly is a cloud-based TTS service that uses advanced deep learning technologies to generate natural-sounding speech. It offers a wide range of voices in multiple languages and dialects, and it can be integrated into a variety of applications.
Google Cloud Text-to-Speech: Google Cloud Text-to-Speech is another cloud-based TTS service that offers high-quality synthetic voices in multiple languages and dialects. It uses Google's neural network technology to create natural-sounding speech.
IBM Watson Text-to-Speech: IBM Watson Text-to-Speech is a cloud-based TTS service that uses deep learning technologies to generate natural-sounding speech. It offers a variety of voices in multiple languages and can be integrated with various applications.
Microsoft Azure Text-to-Speech: Microsoft Azure Text-to-Speech is a cloud-based TTS service that offers a variety of high-quality synthetic voices in multiple languages and dialects. It uses neural network technology to create natural-sounding speech.
NaturalReader: NaturalReader is a desktop application that converts written text into spoken words. It offers a variety of voices in multiple languages and allows users to adjust the speed and tone of the speech.
ReadSpeaker: ReadSpeaker is a cloud-based TTS service that offers a variety of natural-sounding voices in multiple languages and dialects. It can be integrated into various applications, including websites, mobile apps, and e-learning platforms.
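Several of the cloud services above (Amazon Polly, Google Cloud Text-to-Speech, Microsoft Azure and IBM Watson, as far as we know) accept SSML, a standard XML markup for telling a TTS voice how to read the text. The sketch below only builds an SSML string using Python's standard library; it does not call any of those services. The build_ssml helper and its default values are our own invention for illustration, and support for individual tags and attribute values varies by service and by voice, so check the documentation of whichever tool you use.

```python
import xml.etree.ElementTree as ET


def build_ssml(sentences, rate="medium", pitch="medium", pause="500ms"):
    """Wrap sentences in SSML, setting speaking rate and pitch, with pauses between them.

    Hypothetical helper: the element and attribute names come from the SSML
    standard, but the defaults here are arbitrary illustration values.
    """
    speak = ET.Element("speak")
    for index, sentence in enumerate(sentences):
        prosody = ET.SubElement(speak, "prosody", {"rate": rate, "pitch": pitch})
        prosody.text = sentence
        # Add a pause after every sentence except the last one.
        if index < len(sentences) - 1:
            ET.SubElement(speak, "break", {"time": pause})
    # ElementTree escapes characters such as & and < in the text for us.
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml(["Welcome to the show.", "But first, the news."], rate="slow"))
```

The resulting string is what you would paste into, or send to, a tool that accepts SSML input instead of plain text.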
Legal concerns associated with AI Narration
As with any emerging technology, the use of AI voices can raise legal and ethical issues. Here are some of the key legal issues that arise with using AI voices:
Intellectual Property: AI voice cloning raises questions about ownership of the original voice and the synthesized version. There is a risk that the synthesized voice could infringe on the rights of the original voice owner or create confusion as to who is the true speaker. All of my services are paid and most are hidden behind a paywall.
Privacy: The use of AI voice cloning for commercial purposes, such as generating synthetic voiceovers or virtual assistants, may raise privacy concerns. There is a risk that a person's voice could be used without their consent, or that the synthesized voice could be used to create misleading or fraudulent content.
Deceptive Practices: AI voices could potentially be used to deceive consumers or the public. For example, a synthetic voice could be used to impersonate a public figure or make false claims about a product or service. There have been reports of an increase in crime using AI voices.
Defamation: The use of AI voices in a way that damages a person's reputation or portrays them in a false light could give rise to defamation claims. Murf, an AI tool I was looking at using, wouldn't allow any swearing at all, to protect its voice actors.
Consumer Protection: The use of AI voices in marketing and advertising could raise consumer protection concerns, such as false advertising or unfair and deceptive trade practices.
Data Protection: The use of AI voices may raise data protection issues if personal data, such as voice samples, is collected without proper consent or protection.
Overall, the legal issues surrounding the use of AI voices are complex and evolving. It's important for companies and individuals to consider the legal implications of using AI voices and to take steps to ensure that their use of the technology is lawful and ethical.
Do audiobook distributors allow AI voice narration?
Two of the main publishing distributors/aggregators for audiobooks are ACX and Findaway Voices. They distribute to Amazon, Audible, Apple Books, and many others. However, neither platform accepts AI-narrated audiobooks at the moment.
Kobo accepts auto-narrated audiobooks as read by AI programs.
Audible (owned by Amazon) doesn’t accept AI voice narration, even though Amazon is producing AI narration tools.
Apple Books has been rolling out some AI narration, and offers a selection of fiction and romance titles with "digital narration," alongside e-books and other audiobooks for purchase.
Google Play Books is very AI friendly and even has AI narration tools available to transform your text into audio files.
What are your thoughts on it?
Well, obviously I love it. I’m currently using it to make an audio version of Star Ranger, and having a blast doing it.
I’m really interested in it. Of course, with any tool out there, you need to understand it and know how to use it properly, as well as understand the risks out there. The idea of cloning my own voice is interesting. Maybe I could just clone my voice and then use it for this podcast. I can just go do what I like while Andy records with my voice clone :P
As an AI language model, I am impartial to the use of AI voice narration. However, I can provide some insights on the benefits and drawbacks of using AI voice narration.
One of the main benefits of AI voice narration is its efficiency and cost-effectiveness. AI voice technology allows for the creation of synthesized voices that can be used to generate audio content quickly and at a lower cost compared to hiring human voice actors. Another benefit of AI voice narration is the ability to generate voices in multiple languages and dialects, which can help to reach a wider audience.
However, there are also some potential drawbacks to using AI voice narration. One concern is the quality and naturalness of the synthesized voices. While AI voice technology has improved in recent years, there is still a risk that the voices may sound robotic or unnatural, which can detract from the overall user experience. Another concern is the ethical implications of using AI voice technology. For example, using synthesized voices to impersonate someone without their consent or to deceive consumers could raise legal and ethical issues.
Overall, AI voice narration has the potential to be a valuable tool for generating audio content, but it's important to consider the quality, ethical implications, and user experience when deciding whether to use it.
It’s now time to enter Andy’s Imagin-asium
Welcome to the Imagin-asium! Workout of the week number 4
This week's workout is going to push you a little harder than last time to help you make those creativity gains!
The track for this week is: Icarus - Main Theme by Michael McCann
Your workout is as follows:
Get the music ready on your device.
Get outside and go for a walk, preferably somewhere nice and safe. But if you can't, get comfy under your blanket of choice.
Listen to the music and follow the below workout:
This week's workout is a little less direct than the last one as I really want you to use your imagination. This time your prompt is simple. You exit hyperspace into an alien solar system teeming with life. What lies before you?
And finally, hit me (@AndyMacCreative) or Bards (@Invokecrations) up on Twitter and let me know what you came up with 🎧
And that's your Imagin-asium workout of the week! Remember to stay hydrated and stretch throughout the day ;)
As always, we have a lot to do and a lot more to learn. Hope you all have fun following along as we improve our understanding and knowledge!
You can find this podcast episode (and all our other episodes) here: https://anchor.fm/invokecreations , or directly on your favourite streaming services.
NOTE: Everything discussed during the podcast is simply our own interpretation of information we come across as we research topics, or is commentary based on our own personal experiences. We highly encourage everyone to conduct their own research into topics of interest as information, especially in the technical space, changes regularly.
Music track featured this week was titled Black Sleep and can be viewed/listened to here: https://www.youtube.com/watch?v=pYRz6SBMFLE
To check out our latest art and music adventures, make sure you like and subscribe to both our Instagram account https://www.instagram.com/invoke_art/ and Youtube channel https://www.youtube.com/channel/UCPH5KySvBnWgWPXHi_OtTng
Or, if you want to read some musings from Sharn and Andy, or maybe see some random pictures of the two of them where Sharn doesn’t know what to do with her hands, then follow us on Facebook https://www.facebook.com/invokecreations
Or if everything is just too confusing and you can’t remember where all our stuff is, head to our website www.invokecreations.com and it will point you in the right direction.
As always, we’re off to put our bums on seats and do some work, so until next time stay dangerous!