文章基本信息

标题：The potential of a music-inspired algorithm for music document grouping.
作者：Geem, Zong Woo ; Choi, Jeong-Yoon
期刊名称：Fontes Artis Musicae
印刷版ISSN：0015-6191
出版年度：2014
期号：July
出版社：International Association of Music Libraries, Archives and Documentation Centres
摘要：1. Introduction

Due to the Internet, the volume of the off-line music record market has decreased while the on-line music market has increased. According to the data of the International Federation of the Phonographic Industry, (2) this trend is expected to continue in the future.

Along with the expansion of the on-line music market, many different kinds of music sound files (MP3, MIDI, etc) and music-related documents (lyrics, artist discographies and biographies, reviews, discussions, research articles, etc.) can also be found on the Internet. Therefore, there is a need to efficiently and accurately analyze, summarize, index, and classify those music files and documents that are scattered online.

Peng et al. classified 300 popular songs of 41 artists and groups, such as Elton John, Jackson Browne, and Led Zeppelin, based on timbre features extracted by sound wave mathematics. (3) When the music files were classified using K-means technique, the accuracy of matching between song and composer group became around 50%.

Cilibrasi et al. downloaded 118 MIDI files, stripped off non-musical data such as work's title and composer's name, and classified them using Kolmogorov complexity. (4) When they performed genre classification among classical (e.g., Bach, Chopin, and Debussy), jazz (e.g., Miles Davis and John Coltrane), and rock (e.g., The Beatles and The Police) works, they obtained good but not perfect results. For example, the jazz group was classified with 10 of the 12 jazz pieces while erroneously including Chopin's Prelude No. 15 and a Bach Prelude. (5) When they also performed classification within classical piano works (4 pieces from Debussy's Suite bergamasque; 4 pieces from Bach's Wohltemperierte Klavier II; and 4 Preludes from Chopin's Opus 28), the results were better but not perfect because one of the Chopin pieces (no. 15) was more closely located near Bach's group.

The potential of a music-inspired algorithm for music document grouping.

Geem, Zong Woo ; Choi, Jeong-Yoon

The potential of a music-inspired algorithm for music document grouping.

1. Introduction

Due to the Internet, the volume of the off-line music record market has decreased while the on-line music market has increased. According to the data of the International Federation of the Phonographic Industry, (2) this trend is expected to continue in the future.

Along with the expansion of the on-line music market, many different kinds of music sound files (MP3, MIDI, etc) and music-related documents (lyrics, artist discographies and biographies, reviews, discussions, research articles, etc.) can also be found on the Internet. Therefore, there is a need to efficiently and accurately analyze, summarize, index, and classify those music files and documents that are scattered online.

Peng et al. classified 300 popular songs of 41 artists and groups, such as Elton John, Jackson Browne, and Led Zeppelin, based on timbre features extracted by sound wave mathematics. (3) When the music files were classified using K-means technique, the accuracy of matching between song and composer group became around 50%.

Cilibrasi et al. downloaded 118 MIDI files, stripped off non-musical data such as work's title and composer's name, and classified them using Kolmogorov complexity. (4) When they performed genre classification among classical (e.g., Bach, Chopin, and Debussy), jazz (e.g., Miles Davis and John Coltrane), and rock (e.g., The Beatles and The Police) works, they obtained good but not perfect results. For example, the jazz group was classified with 10 of the 12 jazz pieces while erroneously including Chopin's Prelude No. 15 and a Bach Prelude. (5) When they also performed classification within classical piano works (4 pieces from Debussy's Suite bergamasque; 4 pieces from Bach's Wohltemperierte Klavier II; and 4 Preludes from Chopin's Opus 28), the results were better but not perfect because one of the Chopin pieces (no. 15) was more closely located near Bach's group.

Traditionally music files and music-related documents have been well classified with the indexes of genre, composer, period, etc. However, purely musically speaking, there may an interest to seek a co-relationship among the musical works or documents themselves. And, with a music file or document in hand, one may also want to find other pieces efficiently that are highly-related to it. In this situation, a classifying (or more exactly, clustering) technique is needed.

More generally, the clustering technique can be applied to the following various music-related situations:

* Grouping huge music files or documents into n clusters in terms of similarity.

* Grouping music library users into n clusters, and providing customized services (concert information, book recommendation, etc.).

* Performing genealogy analysis from ancestor music piece into offspring one.

This paper has two goals. First, it introduces an algorithm, named the Harmony Search (HS), which is inspired by the improvisation phenomenon of musicians. Then, it suggests how to apply this algorithm to classifying music files or music-related documents on the Internet or in general. Once they are optimally classified, they can be more easily and efficiently retrieved.

2. The Harmony Search Algorithm

In our daily lives, we are confronted with various decision-making situations. In order to make the best decision, we may utilize optimization algorithms that select the best solution out of numerous candidate solutions.

Traditionally social scientists or engineers have depended on mathematics-based optimization techniques. However, these mathematical approaches have some drawbacks. One of which is the fact that they may not reach the global optimum when there are more than one local optima in their decision-making problems.

To cope with this problem, various meta-heuristic algorithms based on natural or artificial phenomena have been recently proposed. Harmony Search (HS) is a relatively new meta-heuristic algorithm which is based on the improvisation phenomenon of music player. (6)

Improvisation is a type of music where a group of music players plays musical instruments without using a score and without planning ahead, and where their performance is governed by the immediate environment or their own or the ambient feeling. As the experience of improvisation is accumulated, a group can generate better harmonies, and this is very similar to the process of optimization where a group of variables tries to find better solution vectors, as shown in Figure 1.

There is an analogy between improvisation and optimization as follows:

* Each music player in an improvisation corresponds to each variable in optimization.

* The playable range of each musical instrument corresponds to the value range of each variable.

* A harmony that consists of music notes corresponds to a vector that consists of numerical values.

* While a "good" harmony can be determined by audience's aesthetics, a good vector can be determined by an objective function.

* Just as musicians update their memories with good harmonies practice by practice, the HS algorithm updates its memories (expressed as mathematical matrix) with good solution vectors iteration by iteration.

For improvising a harmony, the HS algorithm uses three operators: 1) memory consideration, 2) random selection, and 3) pitch adjustment, as shown in Figure 2. Memory consideration operation is to select a value from the existing good values, as shown in the left expression of Figure 2, with the probability of Harmony Memory Considering Rate (HMCR) that has a value between 0 and 1. Random selection operation is to randomly select a value from the total value range, as shown in the middle expression of Figure 2, with the probability of (1- HMCR). Pitch adjustment operation is to adjust the value which is originally selected in memory consideration operation, as shown in the right expression of Figure 2, with the probability of Pitch Adjusting Rate (PAR) which has a value between 0 and 1. These operations can also be mathematically expressed as follows:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (1)

So far the HS algorithm has been successfully applied to diverse optimization problems, such as Sudoku puzzles, (7) chant-based organum compositions, (8) and structural designs. (9)

The success of these applications can be due to the fact that HS has a unique theoretical background named stochastic derivative (10) as follows:

[partial derivative]f/[partial derivative][x.sub.i] = [1/[K.sub.i]][P.sub.Random] + [n([x.sub.i](k))/HMS] [P.sub.Memory] + [n([x.sub.i](k [- or +] m))/MHS][P.sub.Pitch] (2)

The above function utilizes a human's (a musician's) experience as a novel derivative while the traditional optimization technique utilizes differential calculus as a gradient derivative. The novel derivative, which has three terms representing random selection, memory consideration, and pitch adjustment mimicked from musician's behavior, can help HS to efficiently reach the global optimum or near-global optima.

3. Application to Music Document Grouping

This study intends to suggest how this HS algorithm can be applied to grouping or clustering music files or documents on the Internet or in general.

In music projects, clustering can be a problem of partitioning a group of music files or documents into a specified number of sub-groups, where music files or documents in the same sub-group should be similar as much as possible, while those in different subgroups should be dissimilar as much as possible. In terms of optimization, the objective of this problem is to maximize the intra-cluster similarity and to minimize the inter-cluster similarity.

Figure 3 shows the clustering problem graphically. If each dot in the figure represents one music file or document, and different types of music files or documents are expressed in different shapes (in this case, black diamond-shape and grey circle-shape), files or documents can then be classified into sub-groups by using any efficient technique. The efficient clustering technique should calculate the optimal center points (centroids) of the sub-groups and radiuses of grouping circles.

The above-introduced HS algorithm can be a good technique for this clustering problem. Technically speaking, the HS algorithm minimizes the following objective function:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (3)

Equation (3) is called Average Distance of Documents to the cluster Centroid (ADDC) when dividing all files or documents into K sub-groups.

With the equation, the HS algorithm was already successfully applied to web text clustering problems with politics and news documents. (11) However, it has not yet been applied to music file or document clustering. Thus, this paper briefly introduces the HS algorithm and suggests researchers to use this technique when they need to cluster their music files or documents in the future.

More specifically, below is an illustration on how to convert music documents into dots on a two-dimensional graph (normally, there are more than two dimensions which cannot be graphically described). Each dot can be a vector which mathematically consists of n keywords and each keyword has a numeric value that was quantified based on the frequency it appears in the documents. Of course, function words such as "a", "the", "in" and pronouns such as "I", "he", "it" are stripped out and cannot become keywords even though they may have high frequencies. Also, in order to prevent skewing when a keyword appears frequently in a lengthy document, the frequency can be normalized.

Sometimes, the quantifying process takes into consideration not only the frequency but also semantically related words. Although various keywords may be used, related words such as "piano" and "keyboard" can be considered as similar words.

When representing music files as dots, a vector of the piece can be made by extracting specific numerical features such as pitch, rhythm, and harmony using mathematical techniques such as Fourier transforms or wavelet transforms. (12)

4. Conclusions

This paper introduces a music-inspired optimization algorithm called Harmony Search and briefly explains how it can be applied to the music file or document clustering problem.

The HS algorithm mimics the improvisation behavior in jazz performance to find better solutions for various problems. For music file or document clustering, it can efficiently classify files or documents into proper number of sub-groups based on numerical features or keywords by choosing optimal centroids and sizes. This technique can be useful in searching co-related music files or documents. Thus, highly-related files or documents can appear on the front page with higher relevancy ranking if they are searched.

At this stage, only the idea of music file or document clustering using HS is proposed. But the author hopes to see this idea being adopted in managing music files and documents by other researchers in the future.

Zong Woo Geem and Jeong-Yoon Choi (1)

(1). Zong Woo Geem is Assistant Professor at College of Information Technology, Gachon University, Seongnam, South Korea (zwgeem@gmail.com). Jeong-Yoon Choi is Instructor at the School of Music, Inje University, Gimhae, South Korea.

(2). IFPI. "IFPI Digital Music Report 2011." IFPI, 2011. Web. <www.ifpi.org>.

(3). Wei Peng, Tao Li, Mitsunori Ogihara, "Music Clustering with Constraints, " Proceedings of the Eight International Conference on Music Information Retrieval, Vienna, 23-27 July 2007, <ismir2007.ismir.net/proceedings /ISMIR2007_p027_peng.pdf>.

(4). Rudi Cilibrasi, Paul Vitanyi, and Ronald de Wolf, "Algorithmic Clustering of Music Based on String Compression, " Computer Music Journal 28.4 (2004): 49-67, <eprints.pascal-network.org/archive/00001913/01 /music.pdf>.

(5). Cilibrasi, p. 58.

(6). Music-Inspired Harmony Search Algorithms: Theory and Applications, edited by Zong Woo Geem, Berlin: Springer, 2009.

(7). Zong Woo Geem, "Harmony Search Algorithm for Solving Sudoku, " Knowledge-Based Intelligent Information and Engineering Systems, Ed. Bruno Apolloni, Berlin: Springer, 2007, pp 371-78, <link.springer.com /chapter/10.1007%2F978-3-540-74819-9_46>.

(8). Zong Woo Geem and Jeong-Yoon Choi, "Music Composition Using Harmony Search Algorithm, " Applications of Evolutionary Computing, Ed. Mario Giacobini, et al. Berlin: Springer, 2007, 593-600.

(9). Oguzhan Hasancebi, Ferhat Erdal, and Mehmet Polat Saka. "Adaptive Harmony Search Method for Structural Optimization." Journal of Structural Engineering 136 (2010): 419-431.

(10). Zong Woo Geem, "Novel Derivative of Harmony Search Algorithm for Discrete Design Variables." Applied Mathematics and Computation 199.1 (2008): 223-230. http: //www.academia.edu/462853/Novel_ derivative_of_harmony_search_algorithm_for_discrete_design_variables.

(11). Mehrdad Mahdavi, et al., "Novel Meta-Heuristic Algorithms for Clustering Web Documents, " Applied Mathematics and Computation 201.1-2 (2008): 441-451.

(12). Cilibrasi, 49-67.

COPYRIGHT 2014 International Association of Music Libraries, Archives and Documentation Centres
No portion of this article can be reproduced without the express written permission from the copyright holder.