Asia Silicon Valley

S³: A Universal Music Platform for Edutainment

Principal Investigator: Jyh-Shing Roger Jang, Ph.D., National Taiwan University
Abstract

Our team has been focusing on music- and audio-related analytics, retrieval, and recognition for almost 20 years. Our goal is to leverage this experience and these technologies to construct the world’s best platform for music-related services and applications. We believe our technologies and development can revolutionize the use of music for edutainment (including music-based games and learning tools) by creating new user experiences and engagement. Such a platform can be achieved through our core technologies for audio, speech, and music processing, including active noise cancellation (on servers and client devices), monaural source separation (for music and speech audio), synchronization of vocals and accompaniment, hardware implementation of pitch shifting and vocal removal, singing voice beautification, and so on. With these technologies at hand, we can build a variety of systems for audio/speech/music services and applications, such as reaction video systems (with enhancement of the user’s recordings), online karaoke services (with singing assessment and content from YouTube), microphone-type karaoke (with key transposition for both vocals and accompaniment, and with music content from YouTube), music-based rhythm games (with automatically generated game content and music content from YouTube), music-learning tools (with automatic score following, page turning, and assessment), music retrieval systems (queried by singing, humming, speech, or noisy clips of the original music), speech enhancement for noisy environments (in automobiles, for instance), and many others.

Team Introduction

Our PI is Prof. Jyh-Shing Roger Jang, who received his Ph.D. from the EECS Department at the University of California, Berkeley, in 1992. He has since cultivated a keen interest in creating industrial software for pattern recognition and computational intelligence. He is a professor in the CSIE Department of National Taiwan University, where he leads a team working on cutting-edge technologies for audio and music. He was the general chair of ISMIR (International Society for Music Information Retrieval) in Taipei in 2014, and a general co-chair of ISMIR in Suzhou in 2017. His research interests include machine learning and pattern recognition, with applications to speech recognition and assessment, music analysis and retrieval, image identification and retrieval, and semiconductor manufacturing intelligence. From 2004 to 2006, he was the CTO of CWEB Technology, which created the first online karaoke product in the world. Drawing on his experience in both research and development, he will lead our team in creating a sustainable and profitable worldwide music platform for edutainment.

Besides Prof. Jang, several key members will join forces with us once the company is established:

    • Peter Chiu is an engineer with more than 10 years of experience in developing karaoke machines. He was also a PM for “Sing and Share”, a popular karaoke app for mobile phones. He has extensive experience in dealing with content providers and knows the market sectors well.
    • Darren Yang was previously with PC Home and has hands-on experience in developing back-end solutions for large-scale e-commerce applications.
    • Ken Yeh is currently a Ph.D. candidate with extensive experience in music analysis and machine learning.
Goals and Plan

With our comprehensive set of technologies at hand, we would like to revolutionize the use of music for education and entertainment. Our first goal is to build a music rhythm game that can use any music for gameplay, including music from YouTube and the user’s personal collection. Moreover, to strengthen engagement, the user can even use any locally available object for percussion, such as a pen or a chopstick. Our second goal is to create a vocal suppression library so that the user can download a piece of music from YouTube and then use it for karaoke on PCs, mobile phones, or Bluetooth microphones. A singing assessment library will be used to evaluate the user’s performance based on his or her pitch, vibrato, enthusiasm, pronunciation, facial expression, etc. Such an online karaoke app will also leverage our connections with content providers to offer the most recent and beautiful music. Moreover, with the use of noise cancellation, the user does not need earphones to obtain such detailed singing scores.

Entry Barrier

Most of our technologies pose entry barriers for our competitors, including:

    • Music onset analysis
    • Real-time percussion identification
    • Vocal/music separation from polyphonic music
    • Vocal pitch tracking from polyphonic music
    • Query by singing/humming for music retrieval
    • Audio fingerprinting for music retrieval
    • Facial expression identification
    • Score following

Moreover, we have competitive edges over others due to the following facts:

    • We have been working on related technologies for almost 20 years.
    • We have a strong research team at National Taiwan University to make sure our technologies are state of the art.
    • We have good relationships with music content providers in Taiwan and the US.
Market Scope

Potential market sectors are listed here:

    • Music rhythm games: the well-known studio Rayark has earned 7.3 million USD in global revenue since 2012.
    • Online/mobile karaoke: Smule has had over 100 million downloads on Android and iOS devices, and its estimated valuation exceeded 600 million USD in 2017 following funding from Tencent. Additionally, the Chinese app 唱吧 (Changba) had an estimated valuation of 4.3 billion RMB in 2015, which may grow to 6 billion RMB in 2017.
    • Bluetooth microphones/speakers for karaoke: the wireless microphone market was worth 2 billion USD in 2016 and is estimated to reach 3.5 billion USD in 2025. For Bluetooth karaoke microphones in particular, the US market size is 132 million USD in 2017.
    • Music learning tools: the global market for music learning was over 4.5 billion USD in 2012 and is projected to reach 6 billion USD in 2017. No dominant tool or app for music learning exists at present.