Curated by THEOUTPOST
On Thu, 3 Oct, 12:04 AM UTC
5 Sources
[1]
Now, Anyone Can Create Podcasts With Google's NotebookLM
Apart from NotebookLM, there are several other products like Descript, Podcastle, Wondercraft, and Lica that can create podcasts using AI. Google recently introduced a new feature for NotebookLM called Audio Overview, which is going viral. With this feature, users can input a link, article, or document, and the AI assistant generates a podcast featuring two AI commentators engaged in a lively 'deep dive' discussion on the topic. They summarise the material, make connections between subjects, and banter back and forth. "It's possible that NotebookLM podcast episode generation is touching on a whole new territory of highly compelling LLM product formats. Feels reminiscent of ChatGPT. Maybe I'm overreacting," quipped OpenAI co-founder Andrej Karpathy, highly impressed by the product. He couldn't stop praising it. "Deep Dive is now my favourite podcast. The more I listen, the more I feel like I'm becoming friends with the hosts, and I think this is the first time I've actually viscerally liked an AI. Two AIs! They are fun, engaging, thoughtful, open-minded, and curious. Okay, I'll stop now." "Just had my third 'wow' moment in AI... this time through AI Overview by NotebookLM," exclaimed Google's Logan Kilpatrick, who is currently building new Gemini models. Building on its immense success, Google introduced new features that allow users to directly incorporate public YouTube URLs and audio files into their notebooks, alongside PDFs, Google Docs, Slides, websites, and more. AIM also experimented with NotebookLM, converting its in-depth articles about AI into engaging podcasts. The most impressive feature of Google's NotebookLM is its ability to create two-person podcasts that not only convey emotions but also capture intricate details. Instead of just generating a script, these podcasts flow naturally and effectively understand the context of the uploaded article or document.
Over time, it felt incredibly natural and human-like and we enjoyed the playful banter as well. There were moments when we completely forgot we were listening to an AI. Similar experiences have been shared by other users online. "What is really interesting to me about NotebookLM is that it doesn't matter what kind of content I provide, it tries its best to generate the most compelling and engaging audio overview," said Elvis Saravia, the founder of DAIR.AI. "For instance, I gave it my newsletter (in listicle format), and it produced something I actually listened to for 15 minutes. It injected its own understanding. Then I provided some papers for additional context and asked it to pull insights based on the newsletter and the connections it made in the papers. The results are amazing!" he added. "NotebookLM is stupid good at analysing basketball games. I fed it the box score, play by play, advanced stats, and a transcript of my favourite pod. It easily wove the lines between the commentary and stats. This could be a game changer for quick analysis and storylines," posted a user on X. The possibilities with NotebookLM are endless, as people are converting their research papers, blogs, business documents, and lecture notes into podcasts. A user converted his daughter's sixth-grade social science book into a series of 10 podcasts and uploaded them on YouTube. "It is a bit of a re-imagination of the UI/UX of working with LLMs organised around a collection of sources you upload and then refer to with queries, seeing results alongside and with citations," Karpathy said. He explained that while LLMs are rapidly improving in their technical capabilities -- like intelligence, memory (context length), and multimodal functions (handling multiple types of input, such as text and images) -- the user interface and user experience (UI/UX) for turning these capabilities into practical products are lagging behind.
"Think Code Interpreter, Claude Artifacts, Cursor/Replit, NotebookLM, etc. I expect (and look forward to) a lot more and different paradigms of interaction than just chat."
Today many AI startups are shifting away from the traditional chat interface. For instance, OpenAI recently launched canvas, a tool that allows users to modify generated texts within an editor and re-prompt specific sections, making it particularly useful for content creation and coding tasks. Similarly, Anthropic earlier introduced Claude Artifacts, which allows users to visualise whatever they are generating using Claude. Users can view, edit, and iterate on the Artifact content in real-time. "It shows how we're evolving in our interaction with AI, moving away from conventional chat interfaces towards more classical IDE/editor experiences," shared a user on X.
Voice is emerging as a natural interface for AI, with people increasingly wanting to listen to AI voices that are more human-like and less robotic. OpenAI recently launched 'Realtime APIs' at its DevDay 2024 event. This feature enables direct speech-to-speech interactions without a text intermediary, resulting in low-latency and nuanced conversational output. "AI needs UI, and OpenAI's impressive new voice APIs open up a lot of possibilities. Congrats to the OpenAI team -- we'll soon see a whole new generation of speech applications!" said Andrew Ng, chief of Deeplearning.ai. OpenAI also launched its Advanced Voice Mode on ChatGPT, and since then, people have been experimenting with it. Deedy Das from Menlo Ventures used it for the dramatic reenactment of a scene in Hindi from the Bollywood movie Dangal.
Apart from NotebookLM, there are several other products like Descript, Podcastle, Wondercraft, and Lica that can create podcasts using AI. The future of human podcasters is under threat, as AI enables anyone to create their own content with ease.
With the podcasting market expected to grow at a compound annual growth rate (CAGR) of 27.6% from 2023 to 2030 and reach $130.63 billion, it remains to be seen how much of this growth will be driven by AI-generated content.
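To put the growth figure in context, the quoted CAGR and 2030 target can be inverted to see what 2023 market size they imply. The sketch below is only illustrative arithmetic: it assumes seven annual compounding periods (2023 to 2030) and treats 2023 as the base year, neither of which the article states explicitly, and the function name is made up for this example.

```python
def implied_base(final_value: float, cagr: float, periods: int) -> float:
    """Back out the starting value implied by a final value and a compound annual growth rate."""
    return final_value / (1.0 + cagr) ** periods


# Figures quoted in the article.
FINAL_2030_BN = 130.63   # projected market size in 2030, in billions of USD
CAGR = 0.276             # 27.6% compound annual growth rate

# Assumption: growth compounds once per year from 2023 to 2030 (7 periods).
PERIODS = 2030 - 2023

base_2023_bn = implied_base(FINAL_2030_BN, CAGR, PERIODS)

if __name__ == "__main__":
    print(f"Implied 2023 market size: ${base_2023_bn:.2f} billion")
```

Under those assumptions the implied starting point is a little under $24 billion, which means the forecast has the market growing roughly five-and-a-half-fold over the period.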
[2]
People are using Google study software to make AI podcasts -- and they're weird and amazing
The tool generates a podcast called Deep Dive, which features a male and a female voice discussing whatever you uploaded. The voices are breathtakingly realistic -- the episodes are laced with little human-sounding phrases like "Man" and "Wow" and "Oh right" and "Hold on, let me get this right." The "hosts" even interrupt each other. To test it out, I copied every story from MIT Technology Review's 125th-anniversary issue into NotebookLM and made the system generate a 10-minute podcast with the results. The system picked a couple of stories to focus on, and the AI hosts did a great job at conveying the general, high-level gist of what the issue was about. Have a listen. The AI system is designed to create "magic in exchange for a little bit of content," Raiza Martin, the product lead for NotebookLM, said on X. The voice model is meant to create emotive and engaging audio, which is conveyed in an "upbeat hyper-interested tone," Martin said. NotebookLM, which was originally marketed as a study tool, has taken on a life of its own among users. The company is now working on adding more customization options, such as changing the length, format, voices, and languages, Martin said. Currently it's supposed to generate podcasts only in English, but some users on Reddit managed to get the tool to create audio in French and Hungarian. Yes, it's cool -- bordering on delightful, even -- but it is also not immune from the problems that plague generative AI, such as hallucinations and bias. Here are some of the main ways people are using NotebookLM so far. Andrej Karpathy, a member of OpenAI's founding team and previously the director of AI at Tesla, said on X that Deep Dive is now his favorite podcast. Karpathy created his own AI podcast series called Histories of Mysteries, which aims to "uncover history's most intriguing mysteries."
He says he researched topics using ChatGPT, Claude, and Google, and used a Wikipedia link from each topic as the source material in NotebookLM to generate audio. He then used NotebookLM to generate the episode descriptions. The whole podcast series took him two hours to create, he says.
[3]
Forget chat. AI that can hear, see and click is already here
Exhibit A: Google's NotebookLM. NotebookLM is a research tool the company launched with little fanfare a year ago. A few weeks ago, Google added an AI podcasting tool called Audio Overview to NotebookLM, which allows users to create podcasts about anything. Add a link to, for example, your LinkedIn profile, and the AI podcast hosts will boost your ego for nine minutes. The feature has become a surprise viral hit. I wrote about all the weird and amazing ways people are using it here. To give you a taste, I created a podcast of our 125th-anniversary magazine issue. The AI does a great job of picking some highlights from the magazine and giving you the gist of what they are about. Have a listen below. Multimodal generative content has also become markedly better in a very short time. In September 2022, I covered Meta's first text-to-video model, Make-A-Video. Next to today's technology, those videos look clunky and silly. Meta just announced its competitor to OpenAI's Sora, called Movie Gen. The tool allows users to use text prompts to create custom videos and sounds, edit existing videos, and make images into videos. The way we interact with AI systems is also changing, becoming less reliant on text. OpenAI's new Canvas interface allows users to collaborate on projects with ChatGPT. Instead of relying on a traditional chat window, which requires users to do several rounds of prompting and regenerating text to get the desired result, Canvas allows people to select bits of text or code to edit. Even search is getting a multimodal upgrade. In addition to inserting ads into AI overviews, Google has rolled out a new feature where users can upload a video and use their voice to search for things. In a demo at Google I/O, the company showed how you can open the Google Lens app, take a video of fish swimming in an aquarium, and ask a question about them. Google's Gemini model will then search the web and offer you an answer in the form of Google's AI summary. 
What unites these features is a more interactive, customizable interface and the ability to apply AI tools to lots of different types of source material. NotebookLM was the first AI product in a while that brought me wonder and delight, partly because of how different, realistic, and unexpected the AI voices were. But the fact that NotebookLM's Audio Overviews became a hit despite being a side feature hidden inside a bigger product just goes to show that AI developers don't really know what they are doing. Hard to believe now, but ChatGPT itself was an unexpected hit for OpenAI. We are a couple of years into the multibillion-dollar generative AI boom. The huge investment in AI has contributed to rapid improvement in the quality of the resulting content. But we've yet to see a killer app, and these new multimodal applications are a result of the immense pressure AI companies are under to make money and deliver. Tech companies are throwing different AI tools at people and seeing what sticks.
[4]
How to Generate an AI Podcast Using Google's NotebookLM
Two podcast hosts banter back and forth during the final episode of their series, audibly anxious to share some distressing news with listeners. "We were, uh, informed by the show's producers that we're not human," a male-sounding voice stammers out, mid-existential crisis. The conversation between the bot and his female-sounding cohost only gets more uncomfortable after that -- an engaging, albeit misleading, example of Google's NotebookLM tool, and its experimental AI podcasts. Audio of the conversation went viral on Reddit over the weekend. The original poster admits in the comments section that they fed the NotebookLM software directions for the AI voices to roleplay this pseudo-freakout. So, no sentience; the AI bots have not become self-aware. Still, many users in the tech press, on TikTok, and elsewhere are praising the convincing AI podcasts, generated through uploaded documents with the Audio Overviews feature. "The magic of the tool is that people get to listen to something that they ordinarily would not be able to just find on YouTube or an existing podcast," says Raiza Martin, who leads the NotebookLM team inside of Google Labs. Martin mentions recently inputting a 100-slide deck on commercialization into the tool and listening to the 8-minute podcast summary as she multitasked. First introduced last year, NotebookLM is an online research assistant with features common for AI software tools, like document summarization. But it's the Audio Overviews option, released in September, that's capturing the Internet's imagination. Users online are sharing snippets of their generative AI podcasts made from Goldman Sachs data dumps, and testing the tool's limitations through stunts, like just repeatedly uploading the words "poop" and "fart." Still confused? Here's what you need to know. Audio Overviews are a fun AI feature to try out, because they don't cost the user anything -- all you need is a Google login.
Start by signing into your personal account and visiting the NotebookLM website. Click on the plus arrow that reads New Notebook to start uploading your source material. Each Notebook can work with up to 50 source documents, and these don't have to be files saved to your computer. Google Docs and Slides are simple to import. You can also upload websites and YouTube videos, keeping some caveats in mind. Only the text from websites will be analyzed, not the images or layout, and the story can't be paywalled. For YouTube, Notebook will just use the text transcript and the linked videos must be public. After you've dropped in all of your links and documents, you'll want to open up the Notebook guide available in the bottom right corner of the screen. Find the Audio Overview section and click the Generate button. Next, you'll need to exercise some patience, because it may take a few minutes to load, depending on how much source material you're using. After the tool generates the AI podcast, you can create a sharable link to the audio or simply download the file. Additionally, you have the option to adjust its playback speed, in case you need the podcast to be quicker or more slowed down. The internet has gotten creative with NotebookLM's audio feature, using it to create audio-based "deep dives" into complex technical topics, generate files that neatly summarize dense research papers, and produce "podcasts" about their personal health and fitness routines. Which poses an important question: Should you use NotebookLM to crank through your most personal files? The summaries generated from NotebookLM are, according to Google spokesperson Justin Burr, "completely grounded in the source material that a user uploads. Meaning, your personal data is not used to train NotebookLM, so any private or sensitive information you have in your sources will stay private, unless you choose to share your sources with collaborators." 
For now this seems to be one of the upsides of Google slapping an "experimental" label on NotebookLM; to hear Google's framing of it, the company is just gathering feedback on the product right now, being agile and responsive, tinkering away in a lab, and NotebookLM is detached from its multi-billion dollar ad business. For now! For now.
[5]
Google generated a 'podcast' from one of my articles and it's the most convincing and eerie human-mimicking AI I've heard yet
A podcast generated in minutes by a bot -- and yet I've heard worse. A podcast at the click of a button -- Google's new AI tool can do it. It's called NotebookLM and it's essentially a summary bot. Input a document, hit generate, and out pops a briefing doc, FAQ, or study guide. What's more, it can generate a podcast covering the document's contents, hosted by fleeting ephemeral beings with chirpy American accents. Take, for example, an article I wrote back in 2023 called "Cache is king when it comes to designing the gaming CPUs of the next 20 years." In which I spoke to a handful of silicon engineering experts about what's next for chip design. I fed the article into NotebookLM, waited around four minutes, and out pops a 10-minute long podcast. You'll have to excuse the expletive but that was my honest reaction to hearing it for the first time. A well-summarised document is one thing, but it's the natural cadence of the conversation and the lifelike emotion that has sent me for a loop after listening. They (I'm already acting like these are real people) even introduce the podcast... as a podcast. I have to remind myself when listening that these aren't real people -- they're the product of me feeding a hyperlink into a box on a website. A computer feigning two humans sharing thoughts and feelings. I don't know why but this feels deeply strange to me. I'm not the only one who feels this way about the new AI tool. We played a version generated on our RTX 4090 review to the rest of the PC Gamer team and received such responses as: "the interruptions and responses from the co-host are freaking me out." My point is, NotebookLM is spectacularly impressive and terrifying in equal measure. What's more, the analogies and references made throughout the recording aren't drawn like-for-like from the subject matter -- in this instance, a PC Gamer article. They're mostly made up by the AI. 
For example, the reference to how 3D V-Cache is like building a skyscraper instead of a bigger warehouse. I wish I'd come up with that, but that's all AI generated. And that's just another reason why the whole thing is frightfully good. Though, a podcast is as much about the hosts as it is the content. And so far we've not had any other 'hosts' (AI vocaloids) lending their voice to anything we've uploaded to NotebookLM. That's sure to wear a bit thin with time. Not to mention there are unlikely to be any hilarious gaffes with two Google-programmed bots behind the mic. I suppose what I'm saying is this doesn't feel like an actual, credible threat to successful podcasts, nor a replacement for them. The PC Gamer Chat Log is safe another day. Though as we've seen with other forms of AI generation, it may still change the dynamic of what's deemed to be worth the effort. For example, AI-generated art didn't immediately wipe out all human-made art, of course not, but then you probably wouldn't paint 300 stunning images just to run a single D&D campaign for your friends. You might do that with AI, if you're not totally opposed to its use, which would also be completely fair. The same goes here. I wouldn't record a podcast for every article I ever wrote, but if I could do that with a couple button presses? Something not at all worth the effort only months ago is now next to no effort at all. Don't worry, I'll spare you the hundreds of articles on long-since-released graphics card specifications. But you get the idea. Things are possible now that weren't remotely viable only months ago. There's an elephant in the room, though, and it is pretending to be a human being with thoughts, emotions, and vocal cords. This sort of eerily impressive natural language tool being so easy and accessible is a dangerous tool in the wrong hands. How easy would it be to catfish someone if you can voice any text, any document, with a sleek, conversational human tone? Now that's terrifying.
But as proof of concept for what AI can do, I've found nothing that's evoked a response out of me quite like NotebookLM. NotebookLM is a free tool available to use over at NotebookLM.google. It has an incredibly boring sounding name, though it's functionally pretty exciting. It's built around Google's Gemini AI model -- the same one being rolled out to new Android phones and being used to generate AI snippets in web searches that I've suggested may break the business of the internet. NotebookLM, however, is intended to be a study buddy -- an AI capable of summarising documents, listening to audio, and saving you time taking notes. This could have totally changed how I revised for exams in school, but I was born 20 years too early -- missed it by a hair. It's been available in the US through 2024, though recent improvements during the summer and a global rollout have seen it land into the hands of more users as of late. The podcast feature, called "Audio Overview", is also a more recent addition, added into the software in September. For now, the software is only capable of speaking in English, and a note on the Google blog post about its rollout says it will "sometimes introduce inaccuracies". That's a given, as all AI models, even the best, are prone to making stuff up, sometimes. It's often cited as "hallucinating" but it's really just a fancy-sounding term for when the AI is a bit pants (bad). One feature that appears to be headed to Audio Overview is the ability to interrupt the speakers and, assumedly, change the direction of the conversation or issue on-the-fly corrections. It's not for certain yet, but Google notes in its blog post that "you can't interrupt them yet", which is a bit of a weird thing to say if that wasn't an intended feature at some point.
Let's think on that for a second: The ability to interrupt a podcast host mid-conversation and tell them what you'd like them to talk about -- it's giving strange, highly-personalised live show with passive-aggressive overtures... Google isn't the only firm to be playing around with AI-powered bots that sound like humans. OpenAI is also in the market with its own voice-assistant to match the one in the movie Her. That was human-to-bot contact, but no less odd for it.
Google's NotebookLM, an AI-powered study tool, has gained viral attention for its Audio Overview feature, which creates engaging AI-generated podcasts from various content sources.
Google's NotebookLM, originally introduced as a study tool, has recently captured the internet's attention with its innovative Audio Overview feature. This AI-powered tool can transform various content sources into engaging podcasts, marking a significant advancement in content creation and consumption [1].
Users can input a wide range of content into NotebookLM, including links, articles, documents, and even YouTube videos. The AI then generates a podcast featuring two AI commentators who engage in a lively discussion about the topic. These AI hosts summarize the material, make connections between subjects, and even banter with each other, creating a surprisingly natural and engaging listening experience [2].
The most striking aspect of NotebookLM's Audio Overview is the realism of its AI-generated voices. Users have reported that the podcasts feature breathtakingly realistic male and female voices, complete with human-like interjections and interruptions. This level of authenticity has led many listeners to forget they're listening to AI-generated content [1].
NotebookLM's versatility has sparked creativity among users. People are using the tool to create podcasts on a wide range of topics, from technical subjects and research papers to personal health routines and historical mysteries. For instance, Andrej Karpathy, a member of OpenAI's founding team, created an entire podcast series called "Histories of Mysteries" using NotebookLM [2].
The ease of use and quality of output from NotebookLM's Audio Overview feature are raising questions about the future of content creation. While it may not replace human-created podcasts entirely, it opens up new possibilities for quick content generation and summarization. This tool could potentially change how people consume information, making it easier to digest complex topics or large amounts of data in an audio format [3].
Despite its impressive capabilities, NotebookLM is not without limitations. The AI can sometimes introduce inaccuracies or "hallucinate" information. Additionally, the tool currently only generates podcasts in English, though some users have reportedly managed to create content in other languages [4].
Google is actively working on improving NotebookLM, with plans to add more customization options such as changing the length, format, voices, and languages of the generated podcasts. The company is also considering adding interactive features, such as the ability to interrupt the AI speakers during playback [5].
As AI technology continues to advance, tools like NotebookLM's Audio Overview feature represent a significant shift in how we interact with and consume information, blurring the lines between human-generated and AI-generated content.
© 2024 TheOutpost.AI All rights reserved