Google continues to enrich its artificial intelligence ecosystem by announcing a major new feature for its NotebookLM tool. The Audio Overviews feature is now available in French and more than 50 other languages. This service allows you to transform any document source, whether videos, PDFs, texts, or others, into a podcast-style audio conversation.
Audio Overviews: Google's impressive tool is available in French
Already adopted in English-speaking countries, the Audio Overviews feature stands out in NotebookLM for its ability to create audio summaries from documents. Thanks to Gemini's native audio integration, the tool can simulate a conversation between two artificial intelligences discussing the subject of your documents. Generating a summary takes only a few minutes, and the result is impressive.
To generate a podcast with two AIs, users simply go to notebooklm.google, add sources (PDFs, notes, YouTube videos, etc.), and then choose Audio Overview. By default, audio summaries are generated in the preferred language of the user's Google account, but the output language can be changed in the new dedicated option (Settings > Output Language).
Google's ambition? The end of language barriers
The April 29 announcement marks an important milestone with the expansion to more than 50 languages, including French. The Mountain View firm explains that it wants to eliminate language barriers and make information more accessible. Google had already extended access to NotebookLM to more than 200 countries last year, and it offers an example of how the feature might be used: a teacher could share documents in different languages, and students could each generate an audio summary in their own language.
But what is NotebookLM anyway?
Launched almost a year ago, NotebookLM is an AI service developed by Google to help compile documents and data. Less well-known than other services, it is particularly useful for researchers, journalists, writers, students, and academics. The tool was designed to understand the documents the user is working on, such as a research project, in order to interact meaningfully. It helps users find ideas, understand their own materials, and gain an overview of them, and it can synthesize research, support analyses, cross-reference elements, and answer questions, regardless of the source format.
NotebookLM is also surprising because its creator, Steven Johnson, doesn't really have the usual profile. He is the author of fourteen books and a technology enthusiast whom Google hired to develop this service.
This technology, which generates a conversation between a male voice and a female voice imitating human intonations, represents a colossal challenge for Google in its ambition to eliminate language barriers. It opens up new possibilities for multilingual learning and could potentially change the way we approach complicated subjects. For now, Google invites users to test the feature and share their feedback, notably via Discord.
While audio summaries remain a NotebookLM exclusive for now, their potential is immense. They could particularly appeal to podcast lovers looking for a new way to absorb information, transforming any question or document into a virtual conversation that can be listened to anywhere.
Why did it take Google so long to offer this feature in French?
The arrival of French required specific work, as Steven Johnson explained to our colleagues at Numerama and Tech&Co. French proved difficult to handle for this feature, given the firm's desire to obtain a credible audio rendering. The AI model was trained on more than 200 hours of studio recordings of two people talking, in order to capture the intonations, reactions, and interruptions typical of a natural conversation.
“Every language interrupts differently,” explains Steven Johnson, emphasizing the importance of adapting the model for each language in order to achieve the “magic of a fluid and natural conversation.” For the moment, the French version differs from the English version in one main respect: it is not yet possible to interrupt the podcast to ask the AIs questions and move the discussion forward. This ability to interact in real time is, however, a popular feature of the English version.