Curated by THEOUTPOST
On Tue, 15 Apr, 12:01 AM UTC
17 Sources
[1]
Google created a new AI model for talking to dolphins
Dolphins are generally regarded as some of the smartest creatures on the planet. Research has shown they can cooperate, teach each other new skills, and even recognize themselves in a mirror. For decades, scientists have attempted to make sense of the complex collection of whistles and clicks dolphins use to communicate. Researchers might make a little headway on that front soon with the help of Google's open AI model and some Pixel phones. Google has been finding ways to work generative AI into everything else it does, so why not its collaboration with the Wild Dolphin Project (WDP)? This group has been studying dolphins since 1985 using a non-invasive approach to track a specific community of Atlantic spotted dolphins. The WDP creates video and audio recordings of dolphins, along with correlating notes on their behaviors. One of the WDP's main goals is to analyze the way dolphins vocalize and how that can affect their social interactions. With decades of underwater recordings, researchers have managed to connect some basic activities to specific sounds. For example, Atlantic spotted dolphins have signature whistles that appear to be used like names, allowing two specific individuals to find each other. They also consistently produce "squawk" sound patterns during fights. WDP researchers believe that understanding the structure and patterns of dolphin vocalizations is necessary to determine if their communication rises to the level of a language. "We do not know if animals have words," says WDP's Denise Herzing. The ultimate goal is to speak dolphin, if indeed there is such a language. The pursuit of this goal has led WDP to create a massive, meticulously labeled data set, which Google says is perfect for analysis with generative AI. The large language models (LLMs) that have become unavoidable in consumer tech are essentially predicting patterns. You provide them with an input, and the models predict the next token over and over until they have an output. When a model has been trained effectively, that output can sound like it was created by a person. Google and WDP hope it's possible to do something similar with DolphinGemma for marine mammals. DolphinGemma is based on Google's Gemma open AI models, which are themselves built on the same foundation as the company's commercial Gemini models. The dolphin communication model uses a Google-developed audio technology called SoundStream to tokenize dolphin vocalizations, allowing the sounds to be fed into the model as they're recorded.
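To make that prediction loop concrete, here is a minimal sketch of the audio-in, audio-out pattern the article describes: sounds are converted to discrete tokens, a model predicts the next token repeatedly, and the result is decoded back into audio. SoundStream is a real Google neural audio codec, but the tokenizer and model interfaces below are illustrative stand-ins, not Google's actual API.

```python
# Minimal sketch of the audio-in, audio-out loop described above.
# SoundStream is Google's real neural audio codec, but this tokenizer
# and predictor are hypothetical stand-ins for illustration.
from dataclasses import dataclass
from typing import List

@dataclass
class DolphinTokenizer:
    """Stand-in for a SoundStream-style codec: audio <-> discrete tokens."""
    def encode(self, waveform: List[float]) -> List[int]:
        # A real codec quantizes learned embeddings of audio frames;
        # here we just bucket samples coarsely to keep the sketch runnable.
        return [min(int((s + 1.0) * 128), 255) for s in waveform]

    def decode(self, tokens: List[int]) -> List[float]:
        return [t / 128.0 - 1.0 for t in tokens]

def predict_next_token(context: List[int]) -> int:
    """Stand-in for the model's next-token prediction step."""
    # A trained model would return the most likely next acoustic token;
    # this placeholder simply repeats the last token heard.
    return context[-1] if context else 0

def generate(tokenizer: DolphinTokenizer, seed_audio: List[float], n_steps: int = 16) -> List[float]:
    tokens = tokenizer.encode(seed_audio)
    for _ in range(n_steps):             # autoregressive loop: one token at a time
        tokens.append(predict_next_token(tokens))
    return tokenizer.decode(tokens)      # back to audio samples

continuation = generate(DolphinTokenizer(), seed_audio=[0.0, 0.1, -0.2])
```

The point of the sketch is the shape of the pipeline, not the placeholder logic: swap in a real codec and a trained model, and the same loop produces the "dolphin-like" continuations described above.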
[2]
Google's newest AI model is designed to help study dolphin 'speech' | TechCrunch
Google's AI research lab, Google DeepMind, says that it has created an AI model that can help decipher dolphin vocalizations, supporting research efforts to better understand how dolphins communicate. The model, called DolphinGemma, was trained using data from the Wild Dolphin Project (WDP), a nonprofit that studies Atlantic spotted dolphins and their behaviors. Built on Google's open Gemma series of models, DolphinGemma, which can generate "dolphin-like" sound sequences, is efficient enough to run on phones, Google says. This summer, WDP plans to use Google's Pixel 9 smartphone to power a platform that can create synthetic dolphin vocalizations and listen to dolphin sounds for a matching "reply." WDP was previously using the Pixel 6 to conduct this work, and Google says upgrading to the Pixel 9 will enable researchers at the organization to run AI models and template-matching algorithms at the same time.
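For a sense of what "template-matching algorithms" might look like alongside the model, here is a hedged sketch using normalized cross-correlation: a stored whistle template is slid across incoming audio and windows that correlate strongly are flagged. The sample rate, toy whistle, and threshold are assumptions for illustration, not WDP's actual pipeline.

```python
# Hedged sketch of template matching: slide a stored whistle template
# across incoming audio and flag windows that correlate strongly.
# The signals and threshold are illustrative; WDP's real pipeline is not public.
import numpy as np

def match_template(audio: np.ndarray, template: np.ndarray, threshold: float = 0.8):
    """Return sample offsets where the template matches the audio stream."""
    t = (template - template.mean()) / (template.std() + 1e-9)
    hits = []
    for start in range(len(audio) - len(template) + 1):
        w = audio[start:start + len(template)]
        w = (w - w.mean()) / (w.std() + 1e-9)
        score = float(np.dot(w, t)) / len(t)   # normalized cross-correlation
        if score > threshold:
            hits.append(start)
    return hits

fs = 48_000                                    # sample rate in Hz (assumed)
t = np.linspace(0, 0.05, int(fs * 0.05))
template = np.sin(2 * np.pi * 12_000 * t)      # toy 12 kHz whistle template
stream = np.concatenate([np.random.randn(fs // 10) * 0.1,
                         template,
                         np.random.randn(fs // 10) * 0.1])
print(match_template(stream, template))        # offsets where the whistle appears
```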
[3]
Can AI Let Us Chat with Dolphins?
Google is developing an LLM that could help us communicate with dolphins. Dolphins are renowned for their intelligence and social skills. These large-brained marine mammals communicate using individualized signature clicks and whistles and even seem to recognize their own "names." Big advances in artificial intelligence have renewed some scientists' dream of eventually understanding what dolphins are saying. But what if we humans could also respond? Now a few researchers think a form of two-way communication with dolphins could be on the horizon. In collaboration with the Georgia Institute of Technology and the nonprofit Wild Dolphin Project (WDP), Google has announced progress on what the team describes as the first large language model (LLM) for dolphin vocalizations, called DolphinGemma. WDP, whose scientists have studied Atlantic spotted dolphins (Stenella frontalis) for 40 years, provided acoustic data from the species to train the LLM. Teams at Georgia Tech and Google then asked the model to generate "dolphinlike" sequences of sounds. "Half of it was background noise stuff that you expect from the ocean," says computer scientist Thad Starner, of Georgia Tech and Google DeepMind. But the rest had authentic-sounding clicks, whistles and burst pulses -- rapid sequences of clicks that dolphins often utter during fighting and other close-proximity behavior. "When I first heard it played back..., I was dancing around the room," Starner says. He hadn't yet been able to reproduce burst pulses using his conventional computer programs. Generating sound programmatically requires a human to write code, but LLMs create instructions for making new sounds independently, based on what they've learned from data. The team now wants to see how AI completes vocalization sequences -- like "when I'm typing into Google and it's finishing my sentence," says WDP founder Denise Herzing. Manually, "it would take some 150 years to go through all the data and try to pull out those patterns." Using AI analysis isn't just quicker. It "could give us the opportunity to see patterns that, from a human perspective, we may not look at," says Thea Taylor, managing director of Sussex Dolphin Project, who is not involved in the new project. If the LLM consistently returns the same answers, it might reveal a pattern. Reviewing WDP's video data could then show researchers what the dolphins are doing when they make a specific sound: Are they, for example, playing with friends or fighting with a rival? The team also wants to explore how dolphins react when researchers present them with novel vocalizations -- dolphinlike "words" made up by AI -- to refer to items such as seagrass or a toy. To do this, the team plans to use technology called CHAT (cetacean hearing augmented telemetry). Developed by Starner's team at Georgia Tech, CHAT involves something that looks a little like a Ghostbusters costume: a pack worn on a harness on a diver's chest recognizes audio while a unit strapped to the forearm plays sounds. Two researchers play one of the LLM's made-up dolphinlike sounds while holding or passing an object (essentially, naming it in "dolphin"), and then they observe what the dolphins do in apparent response. For example, might the animals mimic the noise to ask for the object?
This is an interesting approach, but researchers must take care that they aren't unintentionally training the dolphins, Taylor says. If the dolphins repeat the sound, "we have to think whether that's actually an understanding of language -- or whether it's the same as teaching a dog to sit because they get a reward." And it's still unclear whether dolphins technically even have language. Arik Kershenbaum, a zoologist who studies animal communication at Girton College in England and isn't involved in the project, thinks they do not. "Language is infinitely complex," he says. "If you have a separate word for every object in your environment, that's not a language." There are limits, Kershenbaum says, to what dolphins can convey through their vocalizations. Dolphin whistles have variations in them, and we don't know if these mean different things. "It's not immediately clear that dolphins have words," he notes. This is a potential concern; for LLMs to conduct their analysis, they rely on a wordlike sequence of symbols. This "big question mark" over whether two similar -- but not identical -- symbols (whistles) mean the same thing could make it difficult for AI to interpret the sequence, he says. It's also worth remembering that this is just one population within one species, Taylor says. "Individual networks can have their own vocalization differences," she adds. Kershenbaum compares the project to the film Star Trek IV: The Voyage Home: when the crew of the starship Enterprise tries to communicate with humpback whales, they can simulate whale song but not its meaning. "This is not about translation," he says. So we aren't going to have what we usually call a "conversation" with dolphins anytime soon. If the animals do appear to recognize complex AI sequences as having meaning, however, "that still gives us a window into their cognitive abilities," Kershenbaum says. Other groups, such as the Earth Species Project and Project CETI (the Cetacean Translation Initiative), are pursuing similar efforts to decipher the vocalizations of crows and sperm whales. Humans often see language as the thing that sets us apart from animals. Herzing wonders if we'd have more empathy for dolphins if we could confirm that they use language. "Maybe [understanding them] would make us connect differently -- and realize that these species have the right to a healthy existence," she says.
[4]
Google is talking to dolphins using Pixel phones and AI - and the video is delightful
How Pixel phones are helping researchers decipher dolphin communication. Dolphins are among the smartest creatures on the planet, and a new AI model from Google combined with Pixel phones is helping researchers better understand their language -- and even, hopefully, communicate with them. Dolphin sounds fall into a few specific categories, such as whistles, squawks, and clicking buzzes, each linked to a different context and behavior. By analyzing these sounds, researchers can detect patterns and structure, much as in human language. Researchers at the Wild Dolphin Project have been collecting data on this language for nearly 40 years. A collaboration with Google to use a new Google AI model called DolphinGemma lets them take that research a step further and actually predict what sound is coming next. Here's where Pixel phones come into play. Google says it designed DolphinGemma to take advantage of specific Google audio technology. This means using Pixel phones dramatically lessens the need for custom hardware, improves the ability to keep the system running, lowers power usage, and cuts back on costs -- all important elements for people working in the field. Because the AI model was essentially built specifically for Pixel devices, researchers in the field have been using a Pixel 6 to record and analyze data underwater in real time. This summer, they'll upgrade to a Pixel 9 that can run both deep learning and template matching at the same time. The goal, a YouTube video explains, is to understand how dolphin language works, construct a possible sound a dolphin might understand, and have a dolphin understand the context and replicate that sound to accomplish a task. Understanding the importance of the research, Google says it plans to share DolphinGemma as an open model this summer. It was trained on Atlantic spotted dolphin sounds, but still has utility for other species like bottlenose or spinner dolphins. By opening its tools to everyone, Google hopes to let researchers worldwide gain a better understanding of these super-intelligent animals.
[5]
Google made an AI model to talk to dolphins
A new large language model AI system may soon allow humans to converse with dolphins. In tests scheduled to begin in the coming months, researchers will see whether DolphinGemma and its companion Cetacean Hearing Augmentation Telemetry (CHAT) system can translate and mimic some of the mammals' own complex vocalizations. If successful, the breakthrough may represent the culmination of over four decades' worth of work, documentation, and conservation efforts. Dolphins are some of the Earth's smartest and most communicative animals. Their social interactions are so complex that researchers at the Wild Dolphin Project (WDP) have spent the last 40 years attempting to decipher them. In the process, WDP has amassed decades' worth of underwater audio and video documenting a single community of Atlantic spotted dolphins in the Bahamas. Along the way, they have been able to correlate sounds with behavioral aspects like courtships, unique names, and dolphin squabbles. Experts have long theorized it may be possible for humans to actually communicate with the cetaceans, but lacked technology advanced enough to parse and mimic the species' underwater whistles, clicks, and burst pulses. With the rise of large language models (LLMs), researchers recently wondered if the same principles underlying LLMs could be applied to dolphin interactions. To test this possibility, WDP recently partnered with Google and the Georgia Institute of Technology, supplying engineers with a massive, labeled dataset of dolphin whistles, clicks, and burst pulses for use in LLM training. The result is DolphinGemma, an AI model built using the same technology that runs Google's Gemini systems. DolphinGemma, with roughly 400 million parameters, is designed to function in essentially the same way as predictive LLMs like ChatGPT -- but for dolphins. DolphinGemma first receives and interprets audio inputs, then predicts likely subsequent sounds for recreation. It is next partnered with the CHAT system installed on modified Google Pixel smartphones. CHAT isn't designed to fully translate a dolphin's natural language, but to help humans convey and establish a more simplified, shared vocabulary. The plan is to teach members of the WDP's Atlantic spotted dolphin community a series of synthetic whistles associated with their favorite objects, such as seagrass, sargassum, and even researchers' scarves. Over time, experts hope that the dolphins will learn to request desired items when they want to play. There's still a lot of work to be done before humans and dolphins bridge the interspecies communication gap. But with this creative use of LLMs, those underwater conversations are another step closer.
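As a rough sanity check on the repeated claim that a ~400-million-parameter model can run on a phone (an assumption-laden estimate, since Google hasn't published deployment details such as quantization):

```python
# Back-of-the-envelope estimate of DolphinGemma's weight footprint.
# Assumes 2 bytes per parameter (fp16/bf16); real deployments often
# quantize further, and activations/runtime add overhead.
params = 400_000_000
bytes_per_param = 2
print(f"~{params * bytes_per_param / 1e9:.1f} GB of weights")  # ~0.8 GB
```

Under that assumption the weights alone come to roughly 0.8 GB, comfortably within the memory of a modern Pixel, which is consistent with the articles' description of the model as "optimally sized" for on-device use.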
[6]
How Google plans to use AI and Pixel phones to communicate with dolphins
Why it matters: In what could be the best use of AI to date, Google is using a new AI model and Pixel phones to try to understand dolphins and eventually communicate with the mammals. The aptly named DolphinGemma large language model could interpret whistles, clicks, squawks, and other noises made by everyone's favorite sea creature. Google has announced a collaboration with researchers at Georgia Tech and the field research of the Wild Dolphin Project (WDP), the latter of which has been collecting data on the noises dolphins make and their relationship to dolphin behavior for over 40 years. Researchers previously correlated sound types with behavioral contexts. Dolphins have signature whistles that are used like names, allowing mothers to find their calves. They often use click "buzzes" during courtship or chasing sharks, while burst-pulse "squawks" are often heard during fights. DolphinGemma, based on Google's Gemma AI models, has been trained on WDP's vast dataset of dolphin sounds and the accompanying notes. It's not just Google's AI that's being used by the project. Field researchers are using Pixel phones to record the sounds of the dolphins. The Pixels run the sound through Google's SoundStream tokenizer, allowing the noises to be fed into the model as they're being recorded. Google notes that the 400 million-parameter DolphinGemma model is optimally sized to run directly on the Pixel phones WDP uses in the field. Using Pixel phones can also significantly reduce the need for expensive hardware and lower costs, writes Google. DolphinGemma works much in the same way as other large language models. It processes sequences of dolphin sounds to identify patterns and structure, and ultimately predicts the likely subsequent sounds in a sequence. WDP has also partnered with the Georgia Institute of Technology on CHAT (Cetacean Hearing Augmentation Telemetry), which is based on the Pixel 6. It creates synthetic whistles and attempts to associate them with specific objects the dolphins enjoy, like sargassum, seagrass, or scarves the researchers use. It's hoped the dolphins will mimic the sounds they hear to request these items. The ultimate goal of CHAT is to eventually enable basic two-way interaction with the dolphins. Google says a new Pixel 9-based CHAT will be ready for the 2025 summer research season. CHAT works alongside DolphinGemma. The LLM uses its "predictive power [to] help CHAT anticipate and identify potential mimics earlier in the vocalization sequence, increasing the speed at which researchers can react to the dolphins and [make] interactions more fluid and reinforcing."
[7]
Forget Live Translate: Google is now using Pixel phones to talk to dolphins
The model is trained to analyze and generate dolphin vocalizations, all running through the Pixel 6. Google's Pixel phones already have you covered to understand different languages with Live Translate. But now, the company is taking things a big splash further. It turns out Pixel phones can not only help people talk to people but also help people talk to dolphins! Yes, you read that right. The same phone that helps you translate on the fly is now helping scientists decode dolphin-speak. In a new announcement timed with National Dolphin Day, Google unveiled DolphinGemma. It's an AI model Google created in collaboration with researchers at Georgia Tech and marine scientists at the Wild Dolphin Project (WDP). The model is trained to analyze and generate dolphin vocalizations, and the best part is that it can run on Pixel phones. The WDP has been studying these dolphins for nearly 40 years and has collected an ocean of data on who says what, when, and to whom. That's the data bank that DolphinGemma taps into. The AI model takes in real dolphin sounds and identifies patterns in the vocal sequences. The tech behind it borrows from Google's own Gemma models (little siblings of the Gemini models), tailored specifically for audio processing. DolphinGemma works on Pixel phones used in the field, eliminating the need for bulky underwater computers and other equipment. Thanks to Google's audio wizardry and the mighty SoundStream tokenizer (which, no joke, sounds like something from a sci-fi movie), researchers can not only listen to dolphin chatter in real time but also respond to the mammals. WDP and Georgia Tech are even testing a system called CHAT (Cetacean Hearing Augmentation Telemetry) to speak to dolphins. CHAT uses synthetic whistles to represent specific objects dolphins love. The idea is that a dolphin will hear the synthetic whistle, realize it means "toy," mimic it, and communicate with researchers. Pixel 6 phones with DolphinGemma listen through the noise, recognize a dolphin's mimic, and tell human researchers what the dolphin said. That information gets piped straight into bone-conducting headphones, and the researchers can respond in real time. Google says the Pixel 9 series will further speed up this process by running AI models locally.
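Pulling that loop together, a rough sketch of the control flow might look like the following. Every name and the stub classifier here are hypothetical, invented to illustrate the listen-recognize-notify cycle; this is not Google's or Georgia Tech's actual code.

```python
# Hypothetical sketch of the CHAT listen-recognize-notify cycle described
# above. The labels, mapping, and classifier are illustrative stand-ins.
from typing import Optional

# Synthetic whistles the researchers play, mapped to the objects they name.
WHISTLE_TO_OBJECT = {
    "whistle_sargassum": "sargassum",
    "whistle_seagrass": "seagrass",
    "whistle_scarf": "scarf",
}

def classify_whistle(audio_chunk: bytes) -> Optional[str]:
    """Stand-in for on-phone recognition (DolphinGemma + template matching)."""
    # A real implementation would tokenize the audio and score it against
    # known whistles; this stub recognizes nothing.
    return None

def chat_step(audio_chunk: bytes, notify) -> None:
    """One pass of the loop: listen, check for a mimic, alert the diver."""
    label = classify_whistle(audio_chunk)
    if label in WHISTLE_TO_OBJECT:
        # A recognized mimic: the phone tells the diver, e.g. over
        # bone-conduction headphones, which object to hand over.
        notify(f"Possible mimic heard: offer the {WHISTLE_TO_OBJECT[label]}")

chat_step(b"\x00\x00", notify=print)  # no-op with the stub classifier
```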
[8]
Google's New AI Is Trying to Talk to Dolphins - Seriously
A new AI model produced by computer scientists in collaboration with dolphin researchers could open the door to two-way animal communication. In a collaboration that sounds straight out of sci-fi but is very much grounded in decades of ocean science, Google has teamed up with marine biologists and AI researchers to build a large language model designed not to chat with humans, but with dolphins. The model is DolphinGemma, a cutting-edge LLM trained to recognize, predict, and eventually generate dolphin vocalizations, in an effort to crack the code not only on how the cetaceans communicate with each other, but also on how we might be able to communicate with them ourselves. Developed in partnership with the Wild Dolphin Project (WDP) and researchers at Georgia Tech, the model represents the latest milestone in a quest that's been swimming along for more than 40 years. Since 1985, WDP has run the world's longest underwater study of dolphins. The project investigates a group of wild Atlantic spotted dolphins (S. frontalis) in the Bahamas. Over the decades, the team has non-invasively collected underwater audio and video data that is associated with individual dolphins in the pod, detailing aspects of the animals' relationships and life histories. The project has yielded an extraordinary dataset -- one packed with 41 years of sound-behavior pairings like courtship buzzes, aggressive squawks used in cetacean altercations, and "signature whistles" that act as dolphin name tags. This trove of labeled vocalizations gave Google researchers what they needed to train an AI model designed to do for dolphin sounds what ChatGPT does for words. Thus, DolphinGemma was born: a roughly 400-million-parameter model built on the same research that powers Google's Gemini models. DolphinGemma is audio-in, audio-out -- the model "listens" to dolphin vocalizations and predicts what sound comes next -- essentially learning the structure of dolphin communication. Artificial intelligence models are changing the rate at which experts can decipher animal communication. Everything under the sun -- from dog barks to bird whistles -- can be fed into large language models, which then use pattern recognition and any relevant context to sift through the noise and posit what the animals are "saying." Last year, researchers at the University of Michigan and Mexico's National Institute of Astrophysics, Optics and Electronics used an AI speech model to identify dog emotions, gender, and identity from a dataset of barks. Cetaceans, a group that includes dolphins and whales, are an especially good target for AI-powered interpretation because of their lifestyles and the way they communicate. For one, whales and dolphins are sophisticated, social creatures, which means that their communication is packed with nuance. But the clicks and shrill whistles the animals use to communicate are also easy to record and feed into a model that can unpack the "grammar" of the animals' sounds. Last May, for example, the nonprofit Project CETI used software tools and machine learning on a library of 8,000 sperm whale codas, and found patterns of rhythm and tempo that enabled the researchers to create the whales' phonetic alphabet. The DolphinGemma model can generate new, dolphin-like sounds in the correct acoustic patterns, potentially helping humans engage in real-time, simplified back-and-forths with dolphins.
This two-way communication relies on what a Google blog referred to as Cetacean Hearing Augmentation Telemetry, or CHAT -- an underwater computer that generates dolphin sounds the system associates with objects the dolphins like and regularly interact with, including seagrass and researchers' scarves. "By demonstrating the system between humans, researchers hope the naturally curious dolphins will learn to mimic the whistles to request these items," the Google Keyword blog stated. "Eventually, as more of the dolphins' natural sounds are understood, they can also be added to the system." CHAT is installed on modified smartphones, and the researchers' idea is to use it to create a basic shared vocabulary between dolphins and humans. If a dolphin mimics a synthetic whistle associated with a toy, a researcher can respond by handing it over -- kind of like dolphin charades, with the novel tech acting as the intermediary. Future iterations of CHAT will pack in more processing power and smarter algorithms, enabling faster responses and clearer interactions between the dolphins and their human counterparts. Of course, that's easily said for controlled environments -- but it raises some serious ethical considerations about how to interface with dolphins in the wild should the communication methods become more sophisticated. Google plans to release DolphinGemma as an open model this summer, allowing researchers studying other species, including bottlenose or spinner dolphins, to apply it more broadly. DolphinGemma could be a significant step toward scientists better understanding one of the ocean's most familiar mammalian faces. We're not quite ready for a dolphin TED Talk, but the possibility of two-way communication is a tantalizing indicator of what AI models could make possible.
[9]
How do you build an AI translator to understand dolphinspeak?
Now this is the sort of application of AI that really intrigues me. Researchers have developed DolphinGemma, the first large language model (LLM) for understanding dolphin language. It could help us translate what these incredible creatures are saying, potentially much faster than we ever could with the manual approaches used over several decades. "The goal would be to one day speak Dolphin," says Dr. Denise Herzing. Her research organization, The Wild Dolphin Project (WDP), exclusively studies a specific pod of free-ranging Atlantic spotted dolphins who reside off the coast of the Bahamas. She's been collecting and organizing dolphin sounds for the last 40 years, and has been working with Dr. Thad Starner, a research scientist from Google DeepMind, an AI subsidiary of the tech giant. With their powers combined, they've trained an AI model on a vast library of dolphin sounds; it can also expand to accommodate more data, and be fine-tuned to more accurately represent what those sounds might mean. "... feeding dolphin sounds into an AI model like DolphinGemma will give us a really good look at if there are pattern subtleties that humans can't pick out," Herzing noted. Dolphins typically communicate with each other using a range of whistles (some of which are names), echolocating clicks to help them hunt, and burst-pulse sounds in social contexts. Since the 1980s, researchers have recorded these sounds using hydrophones (underwater microphones), analyzed them to find similar patterns with spectrograms (visual representations of sound that show how the frequency content of a signal changes over time), and then experimented with dolphins by playing those sounds back to observe their behavior. That's a whole lot of manual work to develop a catalog of dolphin sounds. The team's LLM, which is built upon the foundational tech that powers Google's Gemini models and its popular chatbot, uses sophisticated audio tech to represent dolphin sounds as tokens. It's trained extensively on WDP's acoustic database of wild Atlantic spotted dolphins, and processes sequences of the sounds they've produced to identify patterns and structure. Google notes that it can practically predict the likely subsequent sounds in a sequence, similar to how you get suggestions when you're searching for something online or composing an email and need help completing a sentence. What if we wanted to go beyond simply observing dolphins talk to each other, and see if we could communicate with them in a shared language? That's what WDP has been exploring in parallel for the last several years with Starner's help. He's not just an AI whiz, but also one of the pioneers behind the pathbreaking Google Glass wearable. Previous rudimentary systems to communicate with these animals involved a large dolphin-nose-sized keyboard mounted to the side of the boat back in the 1990s. The idea was that the researchers would interact with the dolphins while passing around attractive toys, and play artificial dolphin whistles linked to each large key with a symbol on it. They imagined the dolphins might point at those keys to request toys. Herzing's 2013 TED Talk shows what that looked like, starting at 6:43. Since that wasn't as interactive as the researchers hoped, they switched to an underwater keyboard they could swim around with. While it didn't exactly result in an actual back-and-forth, it did show that the dolphins were attentive and focused enough to engage in learning to converse.
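As a concrete example of the spectrogram analysis described above, the snippet below turns a synthetic stand-in for a hydrophone recording into a time-frequency representation with SciPy. The sample rate and the toy whistle sweep are assumptions for illustration, not WDP's recording parameters.

```python
# Minimal example of spectrogram analysis: turn a (synthetic) hydrophone
# recording into a time-frequency image of the kind researchers inspect.
import numpy as np
from scipy.signal import spectrogram

fs = 96_000                                   # hydrophone sample rate in Hz (assumed)
t = np.linspace(0, 1.0, fs, endpoint=False)
# Toy stand-in for a whistle: an upward frequency sweep from 8 to 16 kHz.
chirp = np.sin(2 * np.pi * (8_000 + 4_000 * t) * t)

freqs, times, power = spectrogram(chirp, fs=fs, nperseg=1024)
print(power.shape)   # (frequency bins, time frames)
```

Plotting `power` against `freqs` and `times` would show the rising trace of the sweep, the same kind of visual signature researchers have been matching by eye for decades.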
So in 2010, WDP began collaborating with the Georgia Institute of Technology - where Starner is a professor - to develop new tech for two-way communication. Together with Herzing, Starner created CHAT (Cetacean Hearing and Telemetry) devices to enable two-way communication with dolphins. These large wearables, built around an underwater computer, featured hydrophones to detect and record dolphin vocalizations, speakers to play artificial whistles in the water, and a specialized interface that divers could use underwater. The system worked by detecting specific dolphin whistles and associating them with objects or concepts. Researchers could also trigger artificial whistles using the interface, essentially "speaking" to the dolphins using sounds they could potentially learn to associate with specific objects. Since those early days, the team has been updating the hardware and incorporating AI into it. Using Google Pixel phones in their wearables, the researchers have been carrying out a similar routine to the one used with the earlier CHAT system. With a prediction-capable AI model at its heart, this system is designed to help the researchers react to the dolphins faster and more naturally. The use of off-the-shelf smartphones means CHAT now uses a lot less power, can be maintained more easily, and is smaller than the previous version. Herzing's organization will deploy DolphinGemma this field season, and this new model should accelerate the team's efforts to study and document the behavior of Atlantic spotted dolphins. Google says it will make DolphinGemma an open model around the middle of the year, meaning that it will be more widely available to researchers elsewhere in the world. The company says it should be possible to adapt it for use with other cetacean species, like bottlenose or spinner dolphins, with a bit of fine-tuning. "If dolphins have language, then they probably also have culture," Starner noted. "You're going to understand what priorities they have, what do they talk about?" That could give us a whole new perspective on how intelligent species in the animal kingdom communicate and how their societies operate.
[10]
Scientists are using Pixel phones and Google AI to try to 'speak dolphin'
Summary: Google's DolphinGemma AI model aims to decipher and predict patterns in dolphin sounds for a better understanding of their communication. The Wild Dolphin Project's CHAT system utilizes Pixel phones for computing, paving the way for future advancements in dolphin research. Dr. Denise Herzing believes that DolphinGemma could reveal hidden structures in dolphins' communication, potentially leading to human-dolphin interaction. We're all used to the now-mundane things AI can do, from generating images to whipping up spreadsheets and presentations. The more exciting AI frontier today is in things that wouldn't be possible or wouldn't be feasible without the use of AI. For example: Google's AI and hardware are being used in research by Georgia Tech and the Wild Dolphin Project that aims to further our understanding of dolphin communication and social interaction -- and even to potentially allow humans to communicate with dolphins. Dolphins make various sounds that communicate information to each other, including particular whistles that function as unique identifiers for individual dolphins, similar to the way humans refer to each other by name. The Wild Dolphin Project has been studying this communication in wild Atlantic spotted dolphins for 40 years. In a blog post for National Dolphin Day, Google shared how it's lending its expertise to the project. Google has created an AI model called DolphinGemma, trained on the Wild Dolphin Project's data to identify and eventually predict patterns in the sounds dolphins make. DolphinGemma is also being used to develop synthetic dolphin-like sounds that researchers hope they can teach dolphins to associate with "specific objects the dolphins enjoy," like seagrass and scarves. To accomplish this, the Wild Dolphin Project developed the Cetacean Hearing Augmentation Telemetry system (CHAT). New models of CHAT use Pixel phones for computing; the current iteration runs on a Pixel 6, and the Project is set to roll out a Pixel 9-based model later this year. That upcoming build will use the Pixel 9's speakers and microphones, as well as on-device AI processing to run DolphinGemma. Using AI to decipher and generate dolphin 'speech': Google says that use of its DolphinGemma AI model could "help researchers uncover hidden structures and potential meanings within the dolphins' natural communication -- a task previously requiring immense human effort." Dr. Denise Herzing of the Wild Dolphin Project says the AI tool will "give us a really good look at if there are pattern subtleties that humans can't pick out," adding with a laugh that "the goal would be to someday speak dolphin." The Wild Dolphin Project is beginning to deploy DolphinGemma in the field this year. When we'll see Dolphin as an option in Google Translate is anyone's guess.
[11]
DolphinGemma: How Google AI is helping decode dolphin communication
For decades, understanding the clicks, whistles and burst pulses of dolphins has been a scientific frontier. What if we could not only listen to dolphins, but also understand the patterns of their complex communication well enough to generate realistic responses? Today, on National Dolphin Day, Google, in collaboration with researchers at Georgia Tech and the field research of the Wild Dolphin Project (WDP), is announcing progress on DolphinGemma: a foundational AI model trained to learn the structure of dolphin vocalizations and generate novel dolphin-like sound sequences. This approach in the quest for interspecies communication pushes the boundaries of AI and our potential connection with the marine world.
[12]
Google Launches AI to Decode Dolphin Talk, Runs on Pixel Phones - Decrypt
Why try to understand Gen Z slang when it might be easier to communicate with animals? Today, Google unveiled DolphinGemma, an open-source AI model designed to decode dolphin communication by analyzing their clicks, whistles, and burst pulses. The announcement coincided with National Dolphin Day. The model, created in partnership with Georgia Tech and the Wild Dolphin Project (WDP), learns the structure of dolphins' vocalizations and can generate dolphin-like sound sequences. The breakthrough could help determine whether dolphin communication rises to the level of language or not. Trained on data from the world's longest-running underwater dolphin research project, DolphinGemma leverages decades of meticulously labeled audio and video data collected by WDP since 1985. The project has studied Atlantic spotted dolphins in the Bahamas across generations using a non-invasive approach they call "In Their World, on Their Terms." "By identifying recurring sound patterns, clusters and reliable sequences, the model can help researchers uncover hidden structures and potential meanings within the dolphins' natural communication -- a task previously requiring immense human effort," Google said in its announcement. The AI model, which contains roughly 400 million parameters, is small enough to run on the Pixel phones that researchers use in the field. It processes dolphin sounds using Google's SoundStream tokenizer and predicts subsequent sounds in a sequence, much like how human language models predict the next word in a sentence. DolphinGemma doesn't operate in isolation. It works alongside the CHAT (Cetacean Hearing Augmentation Telemetry) system, which associates synthetic whistles with specific objects dolphins enjoy, such as sargassum, seagrass, or scarves, potentially establishing a shared vocabulary for interaction. "Eventually, these patterns, augmented with synthetic sounds created by the researchers to refer to objects with which the dolphins like to play, may establish a shared vocabulary with the dolphins for interactive communication," according to Google. Field researchers currently use Pixel 6 phones for real-time analysis of dolphin sounds. The team plans to upgrade to Pixel 9 devices for the summer 2025 research season, which will integrate speaker and microphone functions while running both deep learning models and template-matching algorithms simultaneously. The shift to smartphone technology dramatically reduces the need for custom hardware, a crucial advantage for marine fieldwork. DolphinGemma's predictive capabilities can help researchers anticipate and identify potential mimics earlier in vocalization sequences, making interactions more fluid. DolphinGemma joins several other AI initiatives aimed at cracking the code of animal communication. The Earth Species Project (ESP), a nonprofit organization, recently developed NatureLM, an audio language model capable of identifying animal species, approximate age, and whether sounds indicate distress or play -- not language exactly, but still a way of establishing some primitive communication. The model, trained on a mix of human language, environmental sounds, and animal vocalizations, has shown promising results even with species it hasn't encountered before. Project CETI represents another significant effort in this space. Led by researchers including Michael Bronstein from Imperial College London, it focuses specifically on sperm whale communication, analyzing their complex patterns of clicks used over long distances.
The team has identified 143 click combinations that might form a kind of phonetic alphabet, which they're now studying using deep neural networks and natural language processing techniques. While these projects focus on decoding animal sounds, researchers at New York University have taken inspiration from baby development for AI learning. Their Child's View for Contrastive Learning model (CVCL) learned language by viewing the world through a baby's perspective, using footage from a head-mounted camera worn by an infant from 6 months to 2 years old. The NYU team found that their AI could learn efficiently from naturalistic data similar to how human infants do, contrasting sharply with traditional AI models that require trillions of words for training. Google plans to share an updated version of DolphinGemma this summer, potentially extending its utility beyond Atlantic spotted dolphins. Still, the model may require fine-tuning for different species' vocalizations. WDP has focused extensively on correlating dolphin sounds with specific behaviors, including signature whistles used by mothers and calves to reunite, burst-pulse "squawks" during conflicts, and click "buzzes" used during courtship or when chasing sharks. "We're not just listening anymore," Google noted. "We're beginning to understand the patterns within the sounds, paving the way for a future where the gap between human and dolphin communication might just get a little smaller."
[13]
Google's New AI Model Helps Humans Communicate with Dolphins | AIM Media House
Google, in collaboration with Georgia Tech and the Wild Dolphin Project (WDP), has launched DolphinGemma, an AI model developed to analyse and generate dolphin vocalisations. The announcement, made on National Dolphin Day, marks a new step in the effort to understand and potentially engage in two-way communication with dolphins. DolphinGemma is trained on decades of underwater video and audio data from WDP's long-term study of Atlantic spotted dolphins (Stenella frontalis) in the Bahamas. The model identifies patterns in dolphin vocal sequences and generates realistic dolphin-like sounds. "By identifying recurring sound patterns, clusters and reliable sequences, the model can help researchers uncover hidden structures and potential meanings," Google said. The AI system is based on Google's lightweight Gemma models and leverages SoundStream for audio representation. At roughly 400 million parameters, the model is small enough to run on Pixel phones used in field research. It operates as an audio-in, audio-out system, designed to predict subsequent dolphin sounds much like a language model predicts the next word in human text. "Understanding any species requires deep context," the WDP team said. Their work since 1985 has documented individual dolphins' life histories and linked vocalisations to specific behaviours. Signature whistles are used to reunite mother and calf pairs, while burst-pulse squawks and click buzzes are associated with fighting and courtship. In parallel, the WDP and Georgia Tech have also developed the CHAT (Cetacean Hearing Augmentation Telemetry) system. CHAT uses synthetic whistles linked to objects like sargassum or scarves. If dolphins mimic these whistles, the system can recognise them and inform the researcher via bone-conducting headphones, enabling real-time interaction. "A Google Pixel 6 handled the high-fidelity analysis of dolphin sounds in real time," researchers explained. The next iteration, built around a Pixel 9, will integrate deep learning and template matching directly into the device, reducing the need for specialised hardware. DolphinGemma will be released as an open model this summer. Although currently trained on Atlantic spotted dolphins, it is expected to benefit research into other cetaceans. Fine-tuning will be needed for species with different vocal patterns. "We're not just listening anymore. We're beginning to understand the patterns within the sounds," Google stated. The goal of the research is to bridge the gap between humans and dolphins through data-driven insights and shared interaction systems.
[14]
Google develops AI model to help researchers decode dolphin communication - SiliconANGLE
Today, on National Dolphin Day, researchers from Google LLC, Georgia Tech and the Wild Dolphin Project announced DolphinGemma, an AI foundation model trained on the structure of dolphin vocalizations. Dolphins and the myriad sounds they produce have long interested researchers as a form of communication. For decades, researchers have studied the clicks, whistles and burst sound pulses of the animals in a scientific pursuit to understand what they're saying to each other. Since 1985, the Wild Dolphin Project has conducted the longest-running study of wild Atlantic spotted dolphins in the world, observing the animals on their own terms. This decades-long approach has yielded one of the largest datasets of dolphin communications in the world, including specific individuals, sounds and behaviors. A primary focus of the WDP is to observe and analyze dolphins in their natural habitat, including their social interactions. By working underwater with them, researchers can directly link the sounds they make with specific behaviors. This allows them to directly understand how a sound - such as a pattern of clicks and whistles - can communicate unique names, activities, warnings or other signals. To assist with this, Google developed DolphinGemma, an AI model that uses audio technologies to tokenize dolphin sounds. As the model is small for an LLM, at around 400 million parameters, it is optimally sized to fit on mobile devices and thus can be loaded onto waterproofed Pixel phones worn by WDP field researchers while underwater. "Trained extensively on WDP's acoustic database of wild Atlantic spotted dolphins, DolphinGemma functions as an audio-in, audio-out model, processes sequences of natural dolphin sounds to identify patterns, structure and ultimately predict the likely subsequent sounds in a sequence, much like how large language models for human language predict the next word or token in a sentence," the researchers said in the announcement. Using DolphinGemma, Google is hoping to assist researchers in discovering vocabulary and understanding the unique "syntax" of patterns that may exist between the sounds dolphins generate. This might provide insights into how they identify objects and one another during social interactions and play. WDP is hoping to use the model to expand the research with CHAT, or Cetacean Hearing Augmentation Telemetry, an underwater computer that is designed to help establish a shared human-dolphin vocabulary. It helps facilitate communication between humans and dolphins by producing synthetic clicks, whistles and pulses - distinct from natural dolphin sounds. Researchers use these sounds when interacting with things the dolphins might be interested in, such as toys, scarves or sargassum seaweed. This way the dolphins learn to associate the synthetic sounds with the toy or seaweed and can request the object using the sounds. CHAT can also identify which object was requested by listening for the appropriate tones. CHAT uses a Google Pixel 6 for analysis of dolphin sounds to reduce the need for custom hardware. The Pixel phone is also smaller than a custom device, making it easier to carry underwater, and because it's off-the-shelf, it is easier to maintain and replace. An upcoming generation planned around the Google Pixel 9 will advance the research effort with enhanced onboard deep learning processing capabilities.
Using DolphinGemma, it's hoped that the AI's predictive power will help researchers analyze dolphin sounds and mimicry faster. By fine-tuning this process with different communities, or pods, of dolphins, it might help create a simple translatable vocabulary of objects by detecting vocalizations earlier in communication sequences, making interactions more fluid. "By providing tools like DolphinGemma, we hope to give researchers worldwide the tools to mine their acoustic datasets, accelerate the search for patterns and collectively deepen our understanding of these intelligent marine mammals," said Google. The road to better understanding how dolphins communicate is very likely an extremely long one, but the advent of CHAT and DolphinGemma puts researchers one step closer to two-way communication than just listening in. By having a better grasp of the patterns behind the sounds and the ability to establish a shared vocabulary with a pod of dolphins, researchers may in the future be able to get the dolphins themselves to teach them how to communicate. The DolphinGemma model is trained primarily on Atlantic spotted dolphin vocalizations, but it could in the future be expanded to include other species and communities, such as bottlenose or spinner dolphins. With increases in fishing, trawling and shipping, the noise and waste being put into the ocean continue to shrink the total habitat that dolphins can live in. Marine noise pollution is particularly problematic for dolphins because, like humans, they use sound for communication, but they also use it to navigate and find food. Google and the WDP hope that using AI will help close the gap between humans and dolphins by finally "cracking the code" of dolphin communication. This breakthrough would not only aid communication between humans and dolphins but also educate us about one of the most intelligent animal species inhabiting our oceans, which cover a significant portion of the Earth's surface.
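One way to read "detecting vocalizations earlier in communication sequences" is as prefix scoring: compare the tokens heard so far against each known synthetic whistle and commit once one candidate clearly leads. The sketch below illustrates that idea with a made-up per-token score; in a real system the probabilities would come from DolphinGemma, and every name and value here is hypothetical.

```python
# Hedged sketch of early mimic identification via prefix scoring. The
# candidate token sequences and per-token score are invented; a real
# system would use the model's actual next-token probabilities.
import math

CANDIDATES = {
    "whistle_scarf": [3, 7, 7, 2, 9],
    "whistle_seagrass": [3, 1, 4, 4, 5],
}

def token_prob(expected: int, observed: int) -> float:
    """Stand-in for a model's per-token probability; crude match score."""
    return 0.9 if expected == observed else 0.1

def prefix_log_likelihood(observed, expected) -> float:
    n = min(len(observed), len(expected))
    return sum(math.log(token_prob(expected[i], observed[i])) for i in range(n))

def best_candidate(observed, margin=2.0):
    scores = {name: prefix_log_likelihood(observed, seq)
              for name, seq in CANDIDATES.items()}
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    # Commit early only when the leader beats the runner-up by a clear margin.
    if len(ranked) > 1 and ranked[0][1] - ranked[1][1] >= margin:
        return ranked[0][0]
    return None

print(best_candidate([3, 7, 7]))   # three tokens already separate the candidates
```

With only three of five tokens observed, the scarf whistle already dominates, which is the sense in which prediction lets researchers react before a vocalization finishes.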
[15]
Google's new AI model tries to understand dolphins
DolphinGemma will be shared as an open model this summer, Google announced. 14 April is recognised as National Dolphin Day in the US - not to be confused with World Dolphin Day, which is on 12 September. Makes sense that we have two separate days to celebrate these fascinating cetaceans. With a brain-to-body ratio second only to humans, these animals have specialised brain cells associated with advanced abilities such as recognising, remembering, reasoning, communicating and even problem solving. Not only that, scientists have even learnt that dolphins can process emotions, and that part of their brain seems to be more complex than our own! And so, Google struck up a collaboration with researchers at the Georgia Institute of Technology and the Wild Dolphin Project (WDP) to build an artificial intelligence model which can learn the patterns of dolphin sounds and generate them back. Since 1985, the WDP has conducted the world's longest-running underwater dolphin research project - specifically studying wild Atlantic spotted dolphins. The team has collected several decades of underwater video and audio from the cetaceans, paired with individual dolphin identities, histories and observed behaviours. Google used this trove of data to create DolphinGemma, a 400m-parameter AI model for dolphin sounds, which runs directly on the phones used by the WDP team in the field. The model builds on Google's Gemma, a collection of its lightweight open models, the company said in its progress announcement yesterday, which coincided with the US National Dolphin Day. DolphinGemma, an audio-only model, is built entirely on WDP's collection of dolphin sounds. It processes sequences of natural dolphin sounds to identify patterns and predicts the likely subsequent sound in the sequence, in a process similar to how large language models for humans work. In addition, WDP is also working on a potential two-way communication channel using the model, leading to the development of the Cetacean Hearing Augmentation Telemetry (Chat) system, in partnership with the Georgia Institute of Technology. With Chat, researchers are hoping to build a simpler, shared vocabulary with the dolphins by associating unique whistle sounds with specific objects the animals enjoy. They hope that the curious dolphins will learn to mimic the whistling sounds, which can then be used to reinforce a positive connection with treat-like objects. WDP is beginning to deploy DolphinGemma this field season, Google said. In addition, the tech giant is also releasing it as an open model this summer. Google hopes that the model can potentially be utilised by researchers trying to study other cetacean species, like bottlenose or spinner dolphins.
[16]
Google's New A.I. Model Aims to Decipher Dolphin Language
Google's DolphinGemma LLM could help researchers finally communicate with dolphins. As the world's longest-running underwater dolphin study, the Wild Dolphin Project has since 1985 analyzed generations of wild Atlantic spotted dolphins in the Bahamas. Now, aided by an A.I. model from Google (GOOGL), researchers at the nonprofit science organization are one step closer to finally deciphering the whistles, clicks and chirps that these aquatic mammals use to communicate with each other. DolphinGemma, built on the same technology behind Google's Gemini models, will tap into the Wild Dolphin Project's (WDP) extensive library of acoustic dolphin data to uncover potential patterns and rules in dolphin sounds that remain indecipherable to humans. "I've been waiting for this for 40 years," said Denise Herzing, WDP's founder and head of research, in a video posted by Google. WDP has long attempted to hammer out evidence of a dolphin language by closely observing dolphin behavior. Over the years, researchers have managed to link various sounds with particular behavioral contexts. Mothers and calves, for example, use distinct signature whistles -- unique names -- to call out to each other. Dolphins have also been observed squawking during fights and emitting buzzes while courting or chasing sharks through the ocean. But uncovering the myriad meanings of dolphin communication remains a behemoth task for humans. WDP researchers are hoping that DolphinGemma's model will pick up on previously unseen structures and sequences as it parses through a reserve of dolphin audio. The ultimate goal is for the A.I. system to predict subsequent sounds in a sequence of dolphin communication -- similar to how other large language models (LLMs) predict the next word in a string of human text. Optimized to run directly on researchers' Google Pixel phones, DolphinGemma will hit the seas soon. It's expected to be deployed by the nonprofit during this year's field season. Contributing new sounds to dolphin communication: Beyond identifying hidden patterns in dolphin audio, Google's A.I. model could also help researchers generate entirely new sounds to communicate with the animals. That's the goal of the Cetacean Hearing Augmentation Telemetry (CHAT) system -- a collaborative project between WDP and the Georgia Institute of Technology. CHAT is an underwater computer capable of producing novel whistles linked to objects dolphins enjoy playing with, such as seagrass or scarves. By chirping out whistles while researchers pass these objects to each other, CHAT aims to teach dolphins to mimic them in order to request the sought-after items -- thereby adding new sounds to their roster of audio. DolphinGemma could potentially aid CHAT in identifying these mimics at a faster rate, according to Google. Although DolphinGemma was trained on audio of Atlantic spotted dolphins, the tech company believes its new model could have useful applications for other species like bottlenose or spinner dolphins. To that effect, Google will share the LLM with other researchers as an open model this summer.
Despite having spent four decades studying dolphins, Herzing noted in Google's video that language remains "the last barrier." "The goal would be to someday speak dolphin, and we're really trying to crack the code."
[17]
Google to launch AI focused on communicating with dolphins
The system, called DolphinGemma, has already managed to generate dolphin-like acoustic sequences and is based on data collected over four decades in the wild. Google, in collaboration with the Georgia Institute of Technology and the Wild Dolphin Project -- a nonprofit researching dolphins in the Bahamas -- has announced significant progress in developing an artificial intelligence designed to decode dolphin communication and potentially even "speak" it. Amid growing discussions about AI's impact on human life, this unique new model is extending the boundaries of language into the ocean. The new model, DolphinGemma, was trained on tens of thousands of hours of acoustic recordings of Stenella frontalis dolphins, known for communicating through clicks, whistles, and rapid sequences of sounds called burst pulses. Dolphins use these complex sound sequences during play, courtship, and sometimes confrontations. "When I first heard the sounds generated by the model, I danced around the room," said Dr. Thad Starner, an AI researcher with Google and Georgia Tech. "For years, I tried to produce burst pulses with regular software and couldn't do it. But the model created them on its own -- no lines of code -- just from what it learned from the data." Unlike earlier approaches that attempted to translate dolphin sounds into human words, the new model doesn't attempt to "translate" at all. Instead, it identifies and constructs sound patterns that mimic dolphin language. The approach aims for the system to learn to associate specific sound patterns with specific contexts -- like playing with a friend, encountering a new object, or facing off in a conflict. The team is also developing technology called CHAT: a wearable audio system for divers that generates real-time AI-produced "dolphin" sounds. Researchers play a new sound while pointing to an object -- such as a toy or sea plant -- and observe whether dolphins repeat or consistently react to the sound. "The goal is to see if dolphins can learn or recognize a new word," explained Dr. Denise Herzing, founder of the project. "Instead of spending 150 years manually analyzing all the recordings, the model allows us to detect patterns in days." Do dolphins even speak in words like humans? However, some experts caution that dolphins may not use actual "words" at all. Human language, they argue, involves grammar and limitless structures. Dolphins may use signals, but that doesn't necessarily constitute language as we understand it. Still, even if the model doesn't translate or hold a "conversation," the fact that dolphins may recognize patterns or anticipate sound-based responses opens a valuable window into their cognitive abilities. Researchers believe that consistent responses to new sounds could demonstrate memory, learning, attention, and even basic communication. Other global projects also aim to communicate with animals using AI, including the Earth Species Project, which focuses on crows, and CETI, which is working to decode sperm whale communication. As for whether we'll ever have a true "conversation" with a dolphin, the researchers answer cautiously: probably not. But the mere attempt to understand could transform the way we view our place in the animal kingdom.
Google, in collaboration with the Wild Dolphin Project and Georgia Tech, has developed DolphinGemma, an AI model designed to analyze and potentially communicate with dolphins using advanced language processing techniques.
In a groundbreaking development at the intersection of artificial intelligence and marine biology, Google has unveiled DolphinGemma, an AI model designed to analyze and potentially communicate with dolphins. This innovative project, a collaboration between Google, the Wild Dolphin Project (WDP), and the Georgia Institute of Technology, aims to decipher the complex vocalizations of dolphins and potentially establish a form of two-way communication [1][2][3].
DolphinGemma is built on Google's open Gemma series of models, which share foundations with the company's commercial Gemini models. The AI utilizes Google's SoundStream audio technology to tokenize dolphin vocalizations, allowing real-time processing of recorded sounds [1]. The model has been trained on a vast dataset of Atlantic spotted dolphin vocalizations collected by the WDP over four decades [3][5].
Google has optimized DolphinGemma to run efficiently on Pixel smartphones, significantly reducing the need for custom hardware in field research. Currently, researchers are using Pixel 6 devices for underwater recording and real-time analysis. This summer, they plan to upgrade to Pixel 9 phones, which will enable simultaneous deep learning and template matching [2][4].
Complementing DolphinGemma is the Cetacean Hearing Augmentation Telemetry (CHAT) system, developed by Georgia Tech. This wearable technology detects dolphin audio and lets divers play sounds underwater. Researchers aim to use CHAT to present novel, AI-generated vocalizations to dolphins while observing their responses [3][5].
While the project shows promise, experts caution against overstating its potential. The question of whether dolphins possess language in the human sense remains debated. Dr. Arik Kershenbaum of Girton College notes that dolphin communication may not have the infinite complexity characteristic of human language [3].
As research progresses, ethical considerations come to the fore. Thea Taylor of the Sussex Dolphin Project emphasizes the need to ensure that any observed dolphin responses represent genuine understanding rather than conditioned behavior [3]. Google plans to release DolphinGemma as an open model this summer, potentially accelerating research across various dolphin species [4].
This project could have far-reaching implications for our understanding of animal cognition and our relationship with other intelligent species. Denise Herzing of the WDP suggests that confirming dolphins' use of language could foster greater empathy towards these marine mammals [3]. Similar projects are underway to study the vocalizations of other species, including crows and sperm whales [3].
As DolphinGemma and related technologies continue to evolve, they promise to open new frontiers in interspecies communication, potentially reshaping our understanding of intelligence in the animal kingdom.