Mikey Shulman: Suno and the Sound of AI Music

💫 Short Summary

Mikey Shulman, CEO of Suno, describes his path from playing music to building AI-generated music, highlighting the joy of music creation. Kensho's acquisition by S&P Global led to audio AI projects and underscored the value of audio content. The founders left finance to focus on audio, developing the Bark text-to-speech model. Generative AI enables interactive music creation, which affects music format and consumption. The future involves personalized music experiences and dynamic music creation. A collaboration with Microsoft expands creative outlets, and the product design is user-centric. Ethical considerations and AI's potential to push creative boundaries in music are discussed. AI's role in music production and artist collaborations is evolving, with a focus on unique sounds and legal considerations. OpenAI's licensing deals and potential collaborations with artists like Taylor Swift for generative music creation are explored. The innovation hub in Cambridge, Massachusetts fosters talent in software, machine learning, and music, with an emphasis on passion for music and building companies.

✨ Highlights
Mikey Shulman discusses his background in music and his transition to working on AI-generated music.
00:41
Shulman talks about playing piano and bass in bands and the challenges of recording in a studio.
Despite studying physics at Harvard, he ended up at Kensho through a series of fortunate events.
Shulman emphasizes the satisfaction of creating music that brings joy to others.
Introduction to Kensho and its acquisition by S&P Global.
04:17
The speaker was introduced to Kensho's co-founders during grad school, leading to a job offer at the company, which specialized in machine learning for financial services.
Kensho was acquired by S&P Global in 2018, providing access to valuable training data and inspiring the team to explore audio AI projects.
The team found audio data to be more beautiful and complex than images and text, motivating them to focus on advancing audio technology.
The increasing importance of audio content consumption influenced their decision to pursue audio AI development.
Importance of audio content in comparison to video content.
07:59
Models for audio tasks are similar to text models, limited by available labeled data.
Training large self-supervised models on unlabeled text and audio data is becoming more common as a way around the shortage of labeled training data.
Working with audio data is more challenging than text due to difficulties in inspecting and organizing the data.
Audio transcription was initially secondary to text at Kensho.
Transition from financial services to audio projects.
11:07
The founders developed a text-to-speech model called Bark.
The community showed more interest in music than in text-to-speech.
Shifted focus to music projects due to community support.
Driven by community enthusiasm, founders pursued music-related endeavors.
Development of the Bark model.
16:03
Bark was not designed for music creation.
Transformer technology inspired by academic work was utilized for audio modeling.
Bark incorporated elements from GPT for text-to-speech conversion.
Generative music has progressed since Bark's release, opening up new opportunities.
Generative AI enables active music creation without advanced skills.
17:20
Users can easily create personalized music by inputting prompts.
The technology opens up new social dynamics and collaborative opportunities in music-making, akin to gaming experiences.
The focus is on enjoying the creative process and final product, with potential for generative music to shape the future of music creation at scale.
Evolution of Music Format with AI
21:53
AI-generated music is leading to shorter songs to capture attention quickly.
Artists are using AI tools to optimize music for streaming platforms like Spotify.
Changes in music format are influenced by platforms like TikTok and stream payouts.
Future of music format is uncertain, but expected to keep evolving with AI's influence.
Impact of Generative AI on Personalized Music Content
24:21
Emphasis on smaller niche music tailored to individual tastes, facilitated by AI in real-time.
Evolving creative process influenced by faster models for music creation.
Envisioning a future where music is dynamically generated to match personal preferences.
Openness to diverse uses of created music, from sharing on YouTube to intimate group settings, highlighting the changing landscape of music consumption and creation.
Future of music delivery and consumption.
28:30
Current unidirectional nature of streaming music is discussed, emphasizing the lack of social connections between artists and listeners.
Sharing dynamics, social interactions, and remixing are seen as key factors in shaping the future distribution of music.
The Microsoft partnership, which integrates a free tier into Copilot for creative experiences, is highlighted as a significant development.
Importance of aesthetics in product design for music creation.
31:18
Listening to the output of models rather than relying solely on quantitative metrics is emphasized.
Creating intuitive workflows for users with diverse music backgrounds is a challenge highlighted.
Catering to users with specific music tastes who may not be experienced in music production is the goal.
Caution against relying solely on academic reasoning for product design, stressing the importance of considering user experience.
Importance of exploring various ways to enable people to create music beyond text-to-music models.
33:51
The need to think from a user-centric perspective and focus on what inspires individuals to make music is emphasized.
The concept of 'soundtracking your life' is introduced to encourage drawing inspiration from everyday experiences.
Ethical considerations regarding existing artists' rights are highlighted, along with a commitment to creating an artist-friendly and legally compliant platform.
Envisioning this approach as the future of music creation.
Limitations of AI models in creative tasks such as music composition.
38:35
AI models like GPT are compared to artists who push boundaries and create innovative work.
Potential for AI models to exceed existing limits and inspire new forms of creativity is explored.
Examples of using technology like autotune and effects pedals to enhance artistic expression are provided.
Importance of drawing inspiration from various sources and the role of AI in pushing creative boundaries are emphasized.
Potential of AI in Music Production
40:05
AI can enhance music by focusing on unique sounds over chord changes.
Advancements in AI technology will affect music production processes and relationships with rights holders.
Geographic variations could pose challenges for AI models and raise concerns about hacking risks.
The future of AI in the music industry is uncertain, with legal and ethical issues taking precedence for professionals in the field.
AI licensing deals with media publishers for generative music creation.
43:56
Collaboration opportunities between AI models and artists like Taylor Swift are emerging.
Google's Lyria project already involves big artists in AI music creation.
Structured licensing for AI-generated music may allow users to prompt models with specific songs for a fee.
Despite concerns, there is optimism about AI uncovering new music possibilities with advancements in audio fidelity and song quality.
Innovation hub in Cambridge, Massachusetts for building companies and attracting top talent in various fields.
46:37
Passion for building things and music highlighted as a good fit for individuals.
Gratitude and enthusiasm expressed for the conversation.
Encouragement for listeners to rate, review, and subscribe to the podcast for more content.