Name: LYDIAを構成する技術の紹介 / Introduction of the technologies that make up Project LYDIA
Start: 2026-06-02T11:20:00+0900
End: 2026-06-02T12:10:00+0900

LYDIAを構成する技術の紹介 / Introduction of the technologies that make up Project LYDIA

Tuesday June 2, 2026 11:20am - 12:10pm JST

Next 2

音楽AIの分野では、より大規模な音楽生成モデルが注目を集めることが多い一方で、それとは正反対の方向、つまりリアルタイムで動作し、ライブオーディオの厳しいレイテンシー予算内で処理できるほど高速なアーキテクチャへと小型化していく動きも見られます。本講演では、コンパクトで効率的なアーキテクチャを、全く新しいクリエイティブツールとして活用する方法を探ります。初期モデルから変分オートエンコーダー（VAE）に至るまでの道のりを概説し、特に音の非常に圧縮された表現を学習する能力に焦点を当てます。これらの音色潜在空間は、滑らかな補間、操作、サンプリング、あるいは音色変換などのクリエイティブな効果に活用できるパレットとなります。

これらのモデルの仕組みと理論について概説した後、ストリーミング可能な低レイテンシー推論のための最適化、オーディオをフレームごとに処理できるようにネットワークを再構築すること、そして学習済みモデルをC++に移植することなど、これらのモデルを実際に使用できるようにするためのエンジニアリング上の課題について触れます。次に、物理的な側面へと移り、市販のハードウェアであるRaspberry Pi 5とPico 2を組み合わせて、オーディオ入出力、コーデック、MIDIを処理する方法を解説します。これにより、これらのモデルを実際に手に取って演奏できる、自己完結型のユニットとして展開することが可能になります。

内容は予告なく変更される場合があります。

---

While headlines in musical AI are often dominated by ever-larger music generation models, a parallel trend runs in the opposite direction: shrinking architectures until they are fast enough to run in real time, inside the tight latency budgets of live audio. This talk explores how compact, efficient architectures can be used as entirely new creative tools. We'll briefly trace the path from early models to variational autoencoders (VAEs), focusing on their ability to learn highly compressed representations of sound. These timbral latent spaces then become a palette that you can smoothly interpolate within, manipulate, and sample from, or exploit for creative effects such as timbre transfer.
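
As a rough illustration of interpolating in a latent space, here is a minimal sketch. It assumes latent codes are plain vectors and that a straight linear blend is used; the abstract does not say which interpolation scheme the talk covers. The function name `lerp_latents` and the 4-dimensional example codes are hypothetical.

```python
def lerp_latents(z_a, z_b, t):
    """Blend two latent vectors: t=0 returns z_a, t=1 returns z_b."""
    if len(z_a) != len(z_b):
        raise ValueError("latent vectors must have the same length")
    return [(1.0 - t) * a + t * b for a, b in zip(z_a, z_b)]


# Hypothetical latent codes for two sounds (made-up values, not from a real model).
z_flute = [0.0, 1.0, -2.0, 0.5]
z_cello = [1.0, -1.0, 2.0, 0.5]

# Five evenly spaced points along the path between the two timbres.
path = [lerp_latents(z_flute, z_cello, i / 4) for i in range(5)]
```

In a real system, each point on `path` would be passed through the model's decoder to produce audio; that step is omitted here.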

After a primer on the mechanics and theory behind these models, we will touch on the engineering reality of making them playable: optimizing for streamable, low-latency inference, restructuring networks so they can process audio frame by frame, and porting trained models into C++. We'll then turn to the physical side, looking at how to handle audio I/O, codecs, and MIDI on a combination of widely available consumer hardware (a Raspberry Pi 5 and a Pico 2), so these models can be deployed in a self-contained unit you can actually pick up and play.
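
One common way to make a convolutional layer process audio frame by frame is to cache the tail of the previous input, so that each new frame sees the same context it would have had in a single offline pass. The sketch below shows that idea for one causal 1-D convolution in plain Python. It is an assumption about the general technique, not the speakers' implementation, and the class name `StreamingCausalConv` is hypothetical.

```python
class StreamingCausalConv:
    """1-D causal convolution that keeps the last (k - 1) input samples between calls."""

    def __init__(self, kernel):
        self.kernel = list(kernel)
        # Zero history is equivalent to zero left-padding in an offline pass.
        self.history = [0.0] * (len(self.kernel) - 1)

    def process(self, frame):
        k = len(self.kernel)
        padded = self.history + list(frame)
        # y[n] = sum_j kernel[j] * x[n - j], where x[n] sits at padded[n + k - 1].
        out = [
            sum(self.kernel[j] * padded[n + k - 1 - j] for j in range(k))
            for n in range(len(frame))
        ]
        # Cache the tail so the next frame sees the correct past samples.
        self.history = padded[len(padded) - (k - 1):]
        return out


kernel = [0.5, 0.25, 0.25]
signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]

# Offline: the whole signal in one call.
offline = StreamingCausalConv(kernel).process(signal)

# Streaming: the same signal in frames of two samples.
conv = StreamingCausalConv(kernel)
streamed = []
for start in range(0, len(signal), 2):
    streamed.extend(conv.process(signal[start:start + 2]))
```

Because the cached history supplies the missing context, `streamed` matches `offline` sample for sample, which is the property a streamable network needs from each of its layers.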

The content is subject to change without prior notice.

Speakers