Interactive music generation has been defined by Joel Chadabe as “a method for using performable, real-time computer music systems in composing and performing music”. Over the past years, many works have explored the idea of relying on a computer algorithm to generate music. One part of this field, called human-computer co-improvisation, aims to rely on a computer to produce a musical accompaniment to a musician’s improvisation.

Machine learning, meanwhile, aims to give computers the ability to perform complex tasks that are innate to humans. The main idea is to develop algorithms able to learn by observing and modeling a set of examples. Thus, machine learning algorithms use computational methods to learn directly from the data by adapting the parameters of a pre-defined family of functions. The overarching goal of machine learning is to produce a model that can generalize its understanding of a given (training) set to unseen data.
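As a minimal illustration of this idea (the data, the function family, and all parameter choices below are hypothetical, and not tied to any system presented at the workshop), "adapting the parameters of a pre-defined family of functions" and then checking generalization on unseen data can be sketched as:

```python
# A toy machine-learning loop: the "family of functions" is f(x) = w*x + b,
# learning adjusts (w, b) on training examples by gradient descent, and
# generalization is measured as the error on held-out (unseen) data.

def fit(train, lr=0.01, steps=2000):
    """Adjust the parameters (w, b) to reduce squared error on the examples."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        for x, y in train:
            err = (w * x + b) - y        # prediction error on one example
            w -= lr * err * x            # gradient step for w
            b -= lr * err                # gradient step for b
    return w, b

def mse(params, data):
    """Mean squared error of the fitted model on a data set."""
    w, b = params
    return sum(((w * x + b) - y) ** 2 for x, y in data) / len(data)

# Training examples drawn from the underlying rule y = 2x + 1.
train = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
test = [(4.0, 9.0), (5.0, 11.0)]         # unseen data

params = fit(train)
print(round(mse(params, test), 4))       # small error => the model generalizes
```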

In this workshop we would like to bring the musical composition process together with the challenges of software development. The aim is, on the one hand, to understand how to develop new machine learning models that offer better user experiences to composers and, on the other hand, to provide a brief understanding of machine learning models in order to give composers some clues on how to use them when composing musical pieces.

Date and time: June 18, 2019, from 13:00 to 18:00

Place: Tokyo University of the Arts, Department of Musical Creativity and the Environment, Senju Campus, The 1st conference room

〒120-0034, 1-25-1 Senju, Adachi-ku, Tokyo


Time table

13:00 Welcome coffee

13:15 Philippe Esling / Modeling musical creativity with variational inference and probabilistic generative models:
The research project carried out by the ACIDS team at IRCAM seeks to model musical creativity by extending variational learning approaches towards the use of multivariate and multimodal time series. Our major object of study lies in the properties and perception of musical orchestration. Orchestration is the subtle art of writing musical pieces for orchestra, combining the spectral properties of each instrument to achieve a particular sonic goal. In this context, the multivariate analysis of temporal processes is required given the inherent multidimensional nature of instrumental mixtures. Furthermore, time series need to be scrutinized at variable time scales (termed here granularities), as a wealth of time scales co-exist in music (from the identity of single notes up to the structure of entire pieces). Finally, orchestration lies at the exact intersection between the symbol (musical writing) and signal (audio recording) representations.
After introducing the general framework and several state-of-the-art creative applications from the past years, we will focus on various applications of the variational learning framework to disentangling factors of audio variation. We will detail several recent papers produced by our team, which regularize the topology of the latent space based on perceptual criteria, work with both audio waveforms and spectral transforms, and perform timbre style transfer between instruments. Finally, we discuss the development of these approaches as creative tools for increasing musical creativity in contemporary music and show case studies of recent pieces played at renowned venues.
We will open the discussion to the question of creative intelligence through the analysis of orchestration and how this could give rise to a whole new category of generic creative learning systems.

14:00 Kazuyoshi Yoshii / Statistical Music Analysis, Composition, and Arrangement Based on Music Language Models: Associate professor, Speech and Audio Processing Group (SAP), Department of Intelligence Science and Technology (IST), Graduate School of Informatics, Kyoto University.

14:30 Break

15:00 Suguru Goto / Introduction to AI and Composition: An introduction to AI and, in particular, musical composition with AI. “Duali II”, which was composed with AI, will also be discussed.
後藤英 / 人工知能と作曲のイントロダクション : 人工知能と、特に音楽における作曲のイントロダクションについて。人工知能によって作曲された「Duali II」の解説も行われる。

15:15 Haolun Gu / The Process of Composing “Selenograph II”: An introduction to the background and conception of “Selenograph II”, the second work in the “Selenograph” series, and a discussion of how OMax was used as an AI aid in its composition, focusing on combining extended techniques with AI assistance.
顾昊伦 / タイトル:『月面II』の創作過程について : 「月面」シリーズの第二作として創作背景および構想についてを紹介して、OMaxを用いて作曲にどのような補助を施したのかを、主に特殊奏法が人工知能の補助と合わせることを説明する。

15:30 Man Jie / Introduction to my music work “Enchanted” on the research theme “Artificial Intelligence and Composition”: An introduction to the various ways software such as OpenMusic, Max/MSP, OMax, and AudioSculpt was used in composing this work, such as combining Mongolian ethnic music recordings and materials recorded in Japan with AI technology to create different sounds. The talk explains the structure of sound in the initial stage of the composition, the automatic generation system related to the structure of the work, and the automatic generation of music in the development of the work.
満潔 / 研究テーマ「人工知能と作曲」に関する音楽作品[Enchanted]の紹介: Open Music、Max/MSP、OMax、Audio Sculptといったソフトウェアを使用し、モンゴルの民族音楽の録音材料や日本で録音された音楽材料をAI技術と結びつけ、作品の中で多様な響きを生み出すさまざまな方法について紹介する。作曲の初期段階における音の構造、作品の構造に関する自動生成システム、そして作品の開発における音楽の自動生成システムについて説明を行う。

15:45 Shinae Kang / A discussion of a work created with the aid of AI: A discussion of a composition technique that uses OMax, a Max/MSP framework developed at IRCAM. A work created with the aid of OMax will also be discussed.
姜信愛 / 人工知能を用いた作曲作品について: IRCAMで開発されたMax/MSP frameworkのOMaxを用いた作曲テクニックを紹介。

16:00 Jinwoong Kim / An introduction to a composition framework: An introduction to an ongoing composition framework built on the concepts of physicality and signal processing, followed by a discussion of the possibility of using AI techniques.
キムジヌン / 作曲フレームワーク紹介: 現在開発中の作曲フレームワークを紹介する。フレームワークの基本概念である身体性、信号処理を話した後、人工知能の利用可能性について話す。

16:15 Agustín Spinetto / Introduction to AudioStellar, a sampler-like software instrument: An introduction to AudioStellar and how it uses AI to generate a 2D intelligent audio map that organizes audio samples by their spectral characteristics.
アグスティン·スピネット / AudioStellar(サンプラーのようなソフトウェア)の紹介: AudioStellarの紹介、および、スペクトル特性によってオーディオサンプルを配置するために、どのようにAIを使用して2Dインテリジェントオーディオマップを生成するかについて。

16:30 Break

17:00 Jérôme Nika (by Skype) / DYCI2, Creative Dynamics of Improvised Interaction: He will present a synthesis of his work on the design of agents for human-machine musical co-creativity. This research, carried out in collaboration with IRCAM, offers an alternative to the “replaceist” or “technological” visions of creative agents. The objective is to merge the usually exclusive “free”, “reactive”, and “scenario-based” paradigms in interactive music generation so as to adapt to a wide range of musical contexts involving hybrid temporality and multimodal interactions. The presentation will address the main issues of the project, such as the dialectic between reactivity and planning in human-machine musical interaction, the learning of a musical memory in real time, and the detection and inference of underlying structure in an audio stream. The generative models, learning schemes, and scheduling architecture models from this research have been implemented in the DYCI2 library, and have given rise to numerous collaborations and musical productions, particularly in jazz (Steve Lehman, Bernard Lubat, Rémi Fox) and contemporary music (Pascal Dusapin, Marta Gentilucci).

17:30 Tristan Carsault / Automatic chord extraction and structure prediction through machine learning, application to human-computer improvisation: The aim of this project is to develop software that interacts in real time with a musician by inferring expected structures (e.g. a chord progression). In order to achieve this goal, we divide the project into two main tasks: a listening module and a symbolic generation module. The listening module extracts the musical structure played by the musician, whereas the generative module predicts musical sequences based on the extracted features.
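As an illustration only, the two-module split described in this abstract can be sketched in a toy symbolic setting. The chord templates and the first-order Markov predictor below are hypothetical stand-ins chosen for brevity, not the project's actual models:

```python
# Toy version of the two-module architecture: a "listening module" that
# reduces what the musician plays to chord labels, and a "generative module"
# that predicts the continuation from the extracted sequence.

from collections import Counter, defaultdict

def listen(events):
    """Listening module: map played pitch-class sets to chord labels."""
    templates = {frozenset({0, 4, 7}): "C", frozenset({5, 9, 0}): "F",
                 frozenset({7, 11, 2}): "G", frozenset({9, 0, 4}): "Am"}
    return [templates.get(frozenset(e), "N") for e in events]

def train_predictor(chord_sequence):
    """Generative module: learn first-order chord-transition counts."""
    transitions = defaultdict(Counter)
    for prev, nxt in zip(chord_sequence, chord_sequence[1:]):
        transitions[prev][nxt] += 1
    return transitions

def predict(transitions, chord):
    """Predict the most likely next chord given the current one."""
    return transitions[chord].most_common(1)[0][0] if transitions[chord] else "N"

# The musician plays C -> F -> G -> C twice, as pitch-class sets.
played = [{0, 4, 7}, {5, 9, 0}, {7, 11, 2}, {0, 4, 7}] * 2
labels = listen(played)                    # ['C', 'F', 'G', 'C', ...]
model = train_predictor(labels)
print(predict(model, "G"))                 # → C  (G resolved to C in the data)
```

In the real system the listening module works on an audio stream rather than symbolic pitch sets, and the predictor is a learned sequence model; the split itself is the point of the sketch.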

Written on May 20, 2019