Microsoft has filed a patent on the WIPO IP Portal titled "ARTIFICIAL INTELLIGENCE MODELS FOR COMPOSING AUDIO SCORES." The filing describes an AI-driven audio composition technology for creating sounds, music, and other audio elements for all sorts of media, including movies, TV shows, games, and even live recordings. The patent mentions dynamic moments in games, suggesting that it could create scores that change to fit the player's actions. According to the patent's abstract, parameters can be set using visual, audio, and textual features and prompts (collectively termed the 'Dataset') to instruct a multitude of AI models to construct audio scores.

The recent rise of AI has been revolutionary and has intersected with multiple disciplines of art and media. Although more than a few AI tools for audio generation have already been released, Microsoft's latest patent suggests that its proprietary ecosystem of AI models could be the most comprehensive and advanced system for machine-assisted audio creation to date.


AI plays an integral role in video games. From enemy behavior and combat encounters to procedural level generation and interactions with NPCs and the environment, AI is indispensable at every level of game development. In terms of sound design, adaptive or dynamic soundtracks feature in many video games, such as the modern Doom games, Metal Gear Rising: Revengeance, and Devil May Cry 5. In Devil May Cry 5, for example, the songs only begin to carry their energetic vocals once the style ranking reaches higher levels.
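To make the idea of an adaptive soundtrack concrete, here is a minimal Python sketch of rank-based music layering of the kind described above. It is only an illustration under stated assumptions: the layer names, the thresholds, and the `active_layers` function are hypothetical and are not taken from any game's code or from Microsoft's patent.

```python
# Hypothetical sketch of rank-based music layering.
# The layer names and thresholds below are illustrative assumptions only.

# Style ranks ordered from lowest to highest.
STYLE_RANKS = ["D", "C", "B", "A", "S", "SS", "SSS"]

# Minimum rank at which each music layer (stem) becomes audible.
LAYER_THRESHOLDS = {
    "drums": "D",
    "guitar": "B",
    "vocals": "S",
}


def active_layers(rank: str) -> list[str]:
    """Return the stems that should be playing at the given style rank."""
    level = STYLE_RANKS.index(rank)
    return [
        layer
        for layer, min_rank in LAYER_THRESHOLDS.items()
        if level >= STYLE_RANKS.index(min_rank)
    ]


if __name__ == "__main__":
    for rank in ("D", "A", "SSS"):
        print(rank, active_layers(rank))
```

A game loop could call a function like this whenever the player's rank changes and fade the corresponding stems in or out. This kind of hand-authored rule is what conventional adaptive music relies on.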

The image shows how certain AI engines function in Microsoft's "ARTIFICIAL INTELLIGENCE MODELS FOR COMPOSING AUDIO SCORES" patent.

But Microsoft's new AI for audio could go far beyond the conventional use of dynamic or adaptive music in games. Player actions could be scored dynamically with appropriate audio cues and music, all in real time, so the audio experience would differ from person to person. Games that place special emphasis on sound and music stand to benefit the most from this technology.

The patent description goes into detail about the many AI engines tasked with composing audio scores in accordance with the provided datasets. They can analyze human expressions and sentiments, collect location data, analyze the tone of a situation, and much more. The AI can learn from pictures, videos, films, and live events, and produce a set of audio files that layer the visuals with appropriate sound effects and music. This technology could open up many exciting avenues for media creation. Creators could produce films, games, and other media with a huge, ever-growing library of audio scores. Designing an epic orchestral piece for the hero's entrance, composing a melancholic tune for the passing of a pet, developing sound effects for gunfire and explosions: all of these could be entrusted to the AI. As a side effect, composers and sound designers might face some competition.

The technology will be powered by cloud computing. It remains to be seen when the system will actually become operational. With such an ever-expanding database, the AI system will require substantially powerful infrastructure. Still, the future of audio design looks very promising, and Microsoft could be leading a revolution in this regard.
