**2. Materials and methods**

Artificial-intelligence-assisted musical devices come in a wide variety of forms and potentially have a very wide spectrum of uses. To create a framework that covers most of these possibilities, we start by introducing a taxonomy of the different uses of these devices. A device may fall into several categories. As an example, most musical instruments could also be considered educational aids, some of them being used predominantly for this purpose. The monochord was used throughout the Middle Ages for educational and scientific purposes [4], and similarly, we can design intelligent instruments that, although they can be used for performing, are designed with an educational intent.

#### **2.1 Taxonomy**

We propose a classification for AI-assisted musical devices (AIMEs). It is clear that this is not the only possible taxonomy, but it is complete, easy to apply, and useful. The classification is shown in **Table 1**.

At the first level, we divide our AIMEs into:


A real device may be included in several categories. As an example, a device could generate a set of music scores and then recommend some of them to a student. In this way, this device could be considered as a generator, a recommender, and an educational system.

Each main AIME category can then be divided into subcategories. As an example, a music generator can be instrumental, vocal, or combined. An instrumental music generator usually produces music in a symbolic format. The most common symbolic format is the Musical Instrument Digital Interface (MIDI), which contains information that indicates the pitch, start time, stop time, and other properties of each individual note, rather than the resulting sound. Combined and vocal generators have to use a raw audio format and are much more difficult to implement, although their quality has improved significantly in the present decade [5].
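To make the difference between a symbolic and a raw audio format concrete, the following minimal sketch stores the note properties named above (pitch, start time, stop time) in a small Python structure. The `NoteEvent` class and its field names are illustrative assumptions and are not part of any MIDI library; the pitch-to-frequency conversion assumes standard twelve-tone equal temperament with A4 (MIDI note 69) tuned to 440 Hz.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NoteEvent:
    """One symbolic note (hypothetical structure, for illustration only)."""
    pitch: int          # MIDI note number, 0-127 (60 = middle C)
    start: float        # onset time in seconds
    stop: float         # release time in seconds
    velocity: int = 64  # how hard the note is struck, 0-127

    @property
    def duration(self) -> float:
        return self.stop - self.start


def pitch_to_hz(pitch: int, a4_hz: float = 440.0) -> float:
    """Equal-temperament frequency of a MIDI note number."""
    return a4_hz * 2.0 ** ((pitch - 69) / 12)


# A C major arpeggio described symbolically: no audio samples are stored.
arpeggio = [
    NoteEvent(pitch=60, start=0.0, stop=0.5),
    NoteEvent(pitch=64, start=0.5, stop=1.0),
    NoteEvent(pitch=67, start=1.0, stop=2.0),
]
```

A synthesizer still has to turn these events into sound, which is why a symbolic generator can stay small, whereas a raw audio generator must produce every sample itself.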


#### **Table 1.**

*AI-assisted musical device (AIME) taxonomy.*

As a further example, recommendation devices can recommend music as a function of the environment or as a function of the user's state. Environment-based recommendation is mostly used in social scenarios, e.g., when the system selects music for a shopping mall or an elevator. Personal music recommendation devices are used mostly when recommending for a single user. As an example, we could estimate the user's emotional state from the data of a wearable device [6] and select the music accordingly. It is also possible to use the data acquired by a personal AIME recommender to try to modify some aspects of user behavior. An interesting possibility would be to train the user, through music, to reduce his or her stress level. In this way, the device could also be considered part of the Internet of Behavior (IOB) [7].

#### **2.2 Intelligent instrument scenarios**

The area of intelligent musical instruments [8] includes an important subset of musical devices and has a wide range of applications that we will present in four example scenarios.

#### *2.2.1 Able instrument scenario*

Mike had an accident that left him unable to play with his right hand. However, he would like to continue playing the bass in a small blues band. Mike thought he would not be able to play again as a bass player, as most instruments require significant ability with both hands. There are several alternatives for adapting the instrument to his physical capabilities [9], but he finally settled on a small robotic mechanism that can detect which string he is fretting with his left hand and pluck it. This device can hear what the other members of the band are playing and dynamically adapt to the tempo and genre of the song by varying the rhythms and patterns it plays.

Although the results do not match his earlier performances, Mike is still able to play well enough and have fun with his friends' band.

#### *2.2.2 Drum stroke scenario*

Toby recently had a stroke that left him with reduced mobility in his right hand. His rehabilitation clinic proposed that he follow a complementary music-supported therapy (MST) in which he controls a set of MIDI drums through hand gestures [10], which are detected through electromyography (EMG) signals. The drums can play almost autonomously at the beginning of therapy and allow control of an increasing number of variables as Toby progresses in his recovery.

The rehabilitation device keeps track of Toby's progress and periodically sends reports to his therapist. When Toby goes to the clinic for an in-person session, the therapist will discuss his progress and adapt the MST accordingly.

#### *2.2.3 Teach and play scenario*

Mary wants to start playing the concertina and is following a well-known book and taking some lessons online. However, she does not like the sound that she currently produces with the instrument and refuses to play it anywhere. A friend tells her about Inteltina, an intelligent didactic concertina that augments Mary's abilities and helps her produce a nice sound. The instrument's assistance dynamically decreases as Mary's playing capabilities improve.

Although Mary plays reasonably well with Inteltina, her online teacher warns her that this type of instrument sometimes backfires as the student becomes lazy and her abilities stagnate [8].
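The fading assistance described in this scenario can be sketched as follows. This is a minimal illustration under stated assumptions, not Inteltina's actual method: the skill estimate is assumed to be an exponential moving average of a per-exercise accuracy score in [0, 1], and the function names, the smoothing `rate`, and the `floor` parameter are all hypothetical.

```python
def update_skill(skill: float, accuracy: float, rate: float = 0.2) -> float:
    """Exponential moving average of per-exercise accuracy in [0, 1]."""
    return (1 - rate) * skill + rate * accuracy


def assistance_level(skill: float, floor: float = 0.0) -> float:
    """Assistance falls linearly from 1 (novice) to `floor` (expert)."""
    skill = min(max(skill, 0.0), 1.0)  # clamp to the valid range
    return floor + (1.0 - floor) * (1.0 - skill)


# Simulated practice sessions with improving accuracy (made-up values).
skill = 0.0
history = []
for accuracy in [0.3, 0.5, 0.7, 0.9, 0.9]:
    skill = update_skill(skill, accuracy)
    history.append(assistance_level(skill))
```

With `floor=0.0` the help is eventually withdrawn completely, which is one way a designer might address the teacher's concern that permanent assistance lets the student's abilities stagnate.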

#### *2.2.4 TherAImin*

Sara is a computer scientist who plays piano as a hobby. Recently, she discovered the Theremin [11] and became fascinated by it. **Figure 1** shows an early implementation of the Theremin. Being an AI specialist, she believes that the design can be clearly improved with the help of AI. Thus, she decides to become a "digital luthier" and to create a new instrument that is faithful to the original Theremin concept. The TherAImin keeps the pitch and volume antennas of the original instrument but includes an AI-based gesture recognizer that changes the timbre of the instrument [12] according to hand gestures.

This scenario reflects the creation of new digital AI-supported musical instruments. Several interesting reflections on this topic can be found in [3].

This type of instrument is fun to build and play, but it can be difficult to create a community of users around it.

#### **2.3 Audio processing scenarios**

This area includes instrument processors, voice processors, and generic audio processors.

#### *2.3.1 Boogie boogie scenario*

Saul is a professional guitar player. He would love to have a Mesa Boogie Mark V amplifier, but the price is too high for him. Saul knows that there are emulations for this amp for several Digital Audio Workstations (DAWs) including Cubase, which he regularly uses. However, Saul would like to have the emulation as a pedal he can easily carry. He has several friends who work in a small start-up company that designs embedded deep learning devices and learns from them that the boogie can be emulated by an AI system [13] that can be implemented using a Coral Edge TPU accelerator [14].

*FAIME: A Framework for AI-Assisted Musical Devices DOI: http://dx.doi.org/10.5772/intechopen.108898*

#### **Figure 1.**

*Alexandra Stepanoff playing the theremin, 1930.*

In a few months, Saul has tested the device and the company is starting to sell the BoogieBoogie Pedal.

#### *2.3.2 DeepTuner scenario*

Sara is a singer who regularly uses a pitch-correction voice processor for her performances. Currently, she uses an AI-enhanced version of Antares Auto-Tune [15] on an Avid Carbon device. She is satisfied with the natural feeling and virtually unnoticeable delay that this hardware/software implementation brings to her performances. Nevertheless, she would love a similar pitch-correction implementation in a smaller and cheaper device [16].

#### *2.3.3 DeepAFx scenario*

Kyra is a production engineer. Since she discovered the deep-learning-based LV2 DeepAFx plug-in framework [17], she has regularly used it to control her DAW and to introduce several effects. Although she always fine-tunes the work manually, the use of the framework has clearly improved her schedule. Kyra would love to have a device with an embedded version of these plug-ins for live performances.

#### **2.4 Music generator scenarios**

In this subsection, we present two scenarios that rely on the use of different AI-based music generators.

#### *2.4.1 On hold scenario*

Peter has a small online retail business with a telephone customer service line. He wants some copyright-free music to keep customers on hold until an agent can handle their call. He wants the music to change according to the expected waiting time, the time of day, and other circumstances.

Peter has heard about AI-based music generation technology [5] and, after searching online, decides to select some compositions made using AIVA and Computoser [18]. Peter asks his guitar-player friend Saul to help him decide which parameters would be best for the different music fragments that he wants for the customer service line. An automated controller dynamically changes the generator parameters to create the desired result.

Peter would like to be able to estimate the emotional state of the client [6] and change the music accordingly; however, this is not possible in a standard phone call. When clients use the customer service app, the music changes according to their comments [19]. All the generators in this scenario produce symbolic music in MIDI format. This format is suitable for instrumental music and produces results of a quality that can be adequate for the proposed scenario.
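As a rough illustration of the automated controller in this scenario, the sketch below maps the expected waiting time and the time of day to a set of generator parameters. Everything here is a hypothetical assumption made for illustration: `GeneratorParams`, `on_hold_params`, the thresholds, and the tempo values do not correspond to the actual interfaces of AIVA or Computoser.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneratorParams:
    """Settings handed to a symbolic music generator (hypothetical)."""
    tempo_bpm: int
    length_s: int  # length of the fragment to request, in seconds


def on_hold_params(expected_wait_s: int, hour: int) -> GeneratorParams:
    """Map call context to generator settings (illustrative rules only)."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    # Calmer music for long waits, livelier music for short ones.
    if expected_wait_s < 60:
        tempo = 110
    elif expected_wait_s < 300:
        tempo = 90
    else:
        tempo = 70
    # Slow down a little outside office hours (assumed 9:00 to 18:00).
    if hour < 9 or hour >= 18:
        tempo -= 10
    # Request at least 30 s so that very short fragments do not loop audibly.
    return GeneratorParams(tempo_bpm=tempo, length_s=max(30, expected_wait_s))
```

In practice, the rules would be replaced by the parameter choices that Peter and Saul agree on, or by a learned policy; the point is only that the controller sits between the call context and the generator.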

#### *2.4.2 Singing elevator scenario*

Mia is a design engineer for a large elevator company. In their latest models, the elevators are fitted with a screen that mainly provides news and weather information. Mia wants to have copyright-free background songs while the elevator is in use.

After studying several alternatives, Mia decides to generate the songs dynamically based on the characteristics of the building (residential, commercial, neighborhood, etc.). To generate the songs, she uses the OpenAI Jukebox generator [20] and updates the songs on a regular basis. The entire selection of songs according to the different situations is performed by the elevator media controller, which can also be considered a musical thing.

This scenario uses a nonsymbolic, direct audio music generator. This type of generator is much less common than the symbolic alternatives, but its results have become acceptable to end users in recent years.

#### **2.5 Music recommendation device scenarios**

#### *2.5.1 Emotiwatch scenario*

Sam is a sports and music fan. Every morning he runs for an hour. While running, Sam likes to listen to music. His musical choices clearly depend on his mood. For years, Sam has selected his songs himself, but he would prefer, at least sometimes, that his smartwatch make the selection for him. It is well known [21, 22] that emotional states and stress can be predicted using AI technology from physiological indicators. These are mainly electrodermal activity (EDA), heart rate variability (HRV), and, to a smaller extent, peripheral oxygen saturation (SpO2). Several wearable devices, including smartwatches such as the Fitbit Charge 2 or Sense [22] or the research-oriented Empatica E4 wristband, are capable of measuring at least a subset of these parameters.

Sam finds an app for his watch [23] that selects music based on his mood. The watch, which was already a musical thing, becomes an AI-assisted musical device and lets Sam keep his mind on running.
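The selection step of such an app could look like the minimal sketch below. It assumes that an upstream model, not shown here, has already turned the physiological signals into a mood estimate expressed as a (valence, arousal) pair in the range -1 to 1, and that each track carries an annotation in the same space. This two-dimensional representation, the `Track` structure, and the placeholder library are assumptions made for illustration and are not taken from the app in [23].

```python
import math
from typing import NamedTuple, Sequence


class Track(NamedTuple):
    title: str
    valence: float  # -1 (negative) .. 1 (positive)
    arousal: float  # -1 (calm) .. 1 (energetic)


def pick_track(valence: float, arousal: float, library: Sequence[Track]) -> Track:
    """Return the track closest to the estimated mood in valence-arousal space."""
    if not library:
        raise ValueError("library is empty")
    return min(
        library,
        key=lambda t: math.dist((valence, arousal), (t.valence, t.arousal)),
    )


# Placeholder library with made-up annotations.
library = [
    Track("placeholder_calm", 0.4, -0.6),
    Track("placeholder_upbeat", 0.7, 0.8),
    Track("placeholder_intense", -0.3, 0.9),
]
```

Matching the current mood is only one possible policy; a recommender that aims to change the user's state would instead pick a track slightly closer to the target mood than to the measured one.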

#### *2.5.2 iClock scenario*

Jane, like a large part of the population in many countries, has long suffered from lack of sleep. The relationships between sleep disorders and anxiety, depression, overweight, and diabetes are well known to the medical community [24]. As part of her treatment, her psychologist tells Jane that some new devices could help restore her sleep quality. Among these devices, Jane finds iClock, a new device that monitors her sleep using her smartwatch and modifies her wake-up routines, taking into account her schedule, the sleep monitoring data, and an estimate of her emotional state. Among the aspects that iClock controls is the selection and modification of melodies according to the selected wake-up routine. Thus, iClock is, among other things, an AI-assisted musical device.

Following her therapist's recommendations, including the use of iClock, Jane's sleep patterns improve, which in turn is clearly reflected in an improvement in her quality of life.

#### **2.6 Feedback device scenarios**

#### *2.6.1 RumbleRumble scenario*

Gina has a moderate hearing problem. She likes to go to concerts with friends. However, she feels that she is missing an important part of the information. Recently, she learned about the Subpac backpack [25], which uses haptics, interoception, and bone conduction to deliver bass sensation even to profoundly deaf users. Although the current version of the device requires an external computer to run the software, Gina is using an experimental version that runs on an embedded controller, thus making the Subpac a personal feedback AIME.

#### *2.6.2 MagicShoes scenario*

Peter has a problem with his weight. He has tried several solutions, but none seem to work well for him. He has even tried game-based approaches [26] with little success. Peter is very fond of music, and he hears from a musician friend about a wearable device that uses sounds to promote sports activity and to change the wearer's body perception. He starts using MagicShoes [27] and finally finds a way to reduce weight in a fun way that fits his tastes and habits. A future update will include machine learning capabilities so that the device selects music based on the user's preferences.

#### *2.6.3 Let there be light scenario*

Nico really likes to go to rock music performances. He especially loves it when people start following the music with their lighters. In some recent concerts, this has even improved thanks to new musical device technology. When Nico went to his last concert, he was given a PixMob LED wristband. These devices have a set of preprogrammed effects that are usually triggered by a human operator. Nevertheless, an AI-based controller that decides which effect to apply according to both the concert and the wearer's circumstances is perfectly feasible today. In this way, the wristband would become an AIME (**Figure 2**).
