## Ismir2015.uma.es

COMPARATIVE MUSIC SIMILARITY MODELLING USING TRANSFER
LEARNING ACROSS USER GROUPS
Daniel Wolff, Andrew MacFarlane and Tillman Weyde
Music Informatics Research Group – Department of Computer Science
City University London
© Daniel Wolff, Andrew MacFarlane and Tillman Weyde. Licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0). Attribution: Daniel Wolff, Andrew MacFarlane and Tillman Weyde. "Comparative Music Similarity Modelling using Transfer Learning Across User Groups", 16th International Society for Music Information Retrieval Conference, 2015.

Proceedings of the 16th ISMIR Conference, Málaga, Spain, October 26-30, 2015

ABSTRACT

We introduce a new application of transfer learning for training and comparing music similarity models based on relative user data: The proposed Relative Information-Theoretic Metric Learning (RITML) algorithm adapts a Mahalanobis distance using an iterative application of the ITML algorithm, thereby extending it to relative similarity data. RITML supports transfer learning by training models with respect to a given template model that can provide prior information for regularisation. With this feature we use information from larger datasets to build better models for more specific datasets, such as user groups from different cultures or of different ages. We then evaluate what model parameters, in this case acoustic features, are relevant for the specific models when compared to the general [...]. To this end, we introduce the new CASimIR dataset, the first openly available relative similarity dataset with user attributes. With two age-related subsets, we show that transfer learning with RITML leads to better age-specific models. RITML here improves learning on small datasets. Using the larger MagnaTagATune dataset, we show that RITML performs as well as state-of-the-art algorithms in terms of general similarity estimation.

1. INTRODUCTION

Music similarity models are a central part of many applications in music research, particularly Music Information Retrieval (MIR). When training similarity models, it turns out that learnt models vary considerably for different data sets and application scenarios. Recently, context-sensitive models have been introduced, e.g. for the task of music recommendation (Stober [9] provides an overview).

The main problem with context-sensitive similarity models is currently to obtain enough data to train the models for each context. Transfer learning promises to enable effective training of models for specific contexts by including information from related datasets. We here present an approach of transfer learning in music similarity that improves results of specialised models, using our extension of Information-Theoretic Metric Learning (ITML). The template-based optimisation in $W_0$-RITML allows for a comparison of the general and specialised models (it derives the latter from the former), which we suggest as a tool for comparative analysis of similarity data by (e.g. [...]).

We are particularly interested in modelling relative similarity ratings collected from participants during Games With a Purpose (GWAPs). Using similarity data from user groups promises to provide tailored model performance and the opportunity to compare such groups via the trained similarity models. The new CASimIR dataset presented in Section 3 contains such similarity ratings and information about the contributing subjects. We use this extra data to group users and here exemplarily train age-specific music similarity models based on age-bounded subsets. However, the relatively small size of the CASimIR dataset requires a different approach to training the group-specific models as existing algorithms are not sufficiently effective for this purpose.

We contribute a solution to this problem with a novel generic algorithm for transfer learning with similarity models: The RITML algorithm (see Section 5.2) extends ITML to allow for learning a Mahalanobis metric from relative similarity data such as that in CASimIR. With $W_0$-RITML, information learnt from remaining data can be successfully transferred to an age-bounded dataset via a Mahalanobis matrix. This transfer learning increases performance on small datasets and provides interpretable values in the Mahalanobis matrix. The Mahalanobis matrix provides a compact representation of similarity information in a dataset. This is useful in scenarios where the music data is difficult to access due to its data volume or copyright restrictions. The CASimIR dataset and code used in this paper are available [...].

2. RESEARCH BACKGROUND

Transfer learning relates to many areas and approaches in machine learning. A general overview of transfer learning is given in Pan and Yang [6]. In their categorisation, our task is an inductive knowledge transfer from one similarity modelling task to another via model parameters. Note that in our example the tasks differ only in the dataset, but our
method can also be used for more divergent tasks.

In MIR, transfer learning is a relatively new method. In 2013, Hamel et al. [2] described multi-task learning using a shared latent representation for auto-tagging, genre classification and genre-based music similarity. This representation includes both the features and the labels for the different tasks. In experiments on several datasets they showed improvement of classification accuracy and of modelling similarity according to genre.

We here work with relative similarity ratings from humans in our new CASimIR dataset for group-specific modelling. Furthermore, we use the MagnaTagATune dataset [3] for comparison on non-specific similarity learning. Here, the Support Vector Machine (SVM) approach developed by Schultz and Joachims [7] and applied in [10, 11] is used as state-of-the-art baseline.

Another state-of-the-art algorithm for learning from relative similarity data is Metric Learning To Rank (MLR). McFee et al. [4] introduce MLR for parametrising a linear combination of content-based features using collaborative filtering data. Their post-training analysis of feature weights revealed that tags relating to genre or radio stations were assigned greater weights than those related to music theoretical terms.

3. A DATASET FOR USER-AWARE SIMILARITY

In order to perform a related analysis and comparisons of models between different user groups, we have collected the CASimIR datasets using Spot the Odd Song Out [13], an online multi-player Game With a Purpose (GWAP). The similarity module of the Spot the Odd Song Out game collects relative similarity data using an odd-one-out survey: From a set of three music clips, participants are asked to choose the clip most dissimilar to the remaining clips, i.e. the odd song out. The game motivates players by rewarding blind agreement. For various reasons, including personal data protection, little music annotation data is publicly available with information about the provider of the data and their context.

Although the game can collect anonymised personal information including gender, nationality, spoken languages and musical experience, the amount and type of information available varies between participants, as data provision is voluntary. Our overarching goal is to study the relation between similarity and culture and we thus link annotations to cultural profiles rather than indexing specific participants. With this paper we publish the first set of similarity data with anonymised profiles.

3.1 Constraints from Relative Similarity Ratings

The MagnaTagATune and CASimIR datasets both contain relative similarity ratings. A participant's rating of $C_k$ as the odd one out (of the triplet $C_i$, $C_j$, $C_k$) results in 2 relative similarity constraints: clips $C_i$ and $C_j$ are more similar than $C_i$ and $C_k$, and clips $C_j$ and $C_i$ are more similar than $C_j$ and $C_k$. These constraints are denoted as $(i, j, k)$ and $(j, i, k)$, respectively, and are contained in the constraint set $\hat{Q}$.

Human ratings regularly produce inconsistent constraints. We use the graph representation of the similarity data as suggested by [5] to analyse and filter inconsistencies: Each constraint $(i, j, k)$ is represented by an edge connecting two vertices $(i, j) \rightarrow (i, k)$ corresponding to two clip pairs, with the edge weight $\alpha_{ijk} = 1$. When combining all constraints in a graph, the weights $\alpha_{ijk}$ are accumulated. Inconsistencies then appear as cycles in the graph, which in their most common form are of length 2:

$$(i, j) \xrightarrow{\;\alpha_{ijk}\;} (i, k) \xrightarrow{\;\alpha_{ikj}\;} (i, j).$$

We remedy such cycles by removing the edge with the smaller weight and assigning the weight $\alpha_{ijk} - \alpha_{ikj}$ to the remaining edge. For both the MagnaTagATune and CASimIR datasets this already creates a cycle-free graph $Q$ as no larger cycles remain. The cycle-free sets $Q$ are used in this study for training and evaluation.

Compared to the MagnaTagATune dataset, the CASimIR dataset features more frequent recurrences of clips between the triplets presented to the users. Recurring clips relate the corresponding similarity data, and result in large connected components in the CASimIR similarity graph: While the maximal number of clips directly or transitively related to each other through similarity data in the MagnaTagATune dataset was 3 (see [11]), most clips in the CASimIR similarity data are related to at least 5 other clips. The repetition of clips across triplets results in fewer unique referenced clips: the current CASimIR similarity dataset contains only 180 clips referenced by 2102 ratings, while in MagnaTagATune 2000 ratings reference about 500 clips, and the dataset has 1019 clips with 7650 ratings in total.

3.2 Analysis of Age-bounded Similarity Ratings

The additional participant attributes allow us to select subsets of similarity data according to specific profiles of the participants. This enables the training of more specific models that support better similarity predictions for the relevant group of users, and allows for comparison of different [...].

As an example of group-based similarity modelling we choose age as a separating criterion on the CASimIR similarity data from over 256 participants: We divide the complete set of similarity ratings $R$ into two age-bounded subsets, $R_{\le 25}$ of data provided by participants not older than 25 years and $R_{>25}$ containing data of older participants. The boundary of 25 years was chosen as the best approximation to equal sizes of the subsets (data input is only in 5 year bands). As shown in Table 1, the number of ratings is higher for the $R_{\le 25}$ dataset.
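The constraint derivation and the length-2 cycle filtering described in Section 3.1 can be sketched as follows. This is a minimal illustration, not the code released with the paper: the function names `votes_to_constraints` and `remove_two_cycles`, the vote format, and the handling of ties (both edges dropped when the weights are equal) are assumptions made for this example.

```python
from collections import Counter


def votes_to_constraints(votes):
    """Turn odd-one-out votes into weighted relative constraints.

    Each vote is (triplet, odd): a tuple of three clip ids and the id chosen
    as the odd one out. A constraint (i, j, k) reads "clip i is more similar
    to clip j than to clip k"; its weight alpha_ijk counts supporting votes.
    """
    weights = Counter()
    for triplet, odd in votes:
        i, j = [clip for clip in triplet if clip != odd]
        weights[(i, j, odd)] += 1  # (i, j) closer than (i, odd)
        weights[(j, i, odd)] += 1  # (j, i) closer than (j, odd)
    return weights


def remove_two_cycles(weights):
    """Resolve length-2 cycles between (i, j, k) and (i, k, j).

    The edge with the smaller weight is removed and the remaining edge gets
    the weight difference. Assumption: on a tie both edges are dropped.
    """
    filtered = {}
    for (i, j, k), w in weights.items():
        opposite = weights.get((i, k, j), 0)
        if w > opposite:
            filtered[(i, j, k)] = w - opposite
    return filtered


if __name__ == "__main__":
    votes = [
        (("a", "b", "c"), "c"),
        (("a", "b", "c"), "c"),
        (("a", "b", "c"), "b"),  # contradicts the two votes above
    ]
    print(remove_two_cycles(votes_to_constraints(votes)))
```

In the usage example, two votes name clip `c` as the odd one out and one vote names `b`, so the constraints `("a", "b", "c")` and `("a", "c", "b")` form a cycle of length 2 that is reduced to a single edge of weight 1.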

Table 1. Number of votes, unique constraints and referenced clips, after filtering inconsistencies, per dataset.

539 similarity ratings are not associated with a valid age and are stored separately in $R_{\emptyset}$. For the two age-bounded datasets, we furthermore define complementary datasets $R_{\complement(\le 25)}$ and $R_{\complement(>25)}$ combining the remaining similarity data, e.g. $R_{\complement(\le 25)} = R_{>25} \cup R_{\emptyset}$. These complementary sets will be used for training of template models for transfer learning.

After splitting, the above (sub)sets of ratings are converted into constraints (see Section 3.1) and separately filtered for inconsistencies. We now use the corresponding sets of unique constraints $Q$, $Q_{\le 25}$, $Q_{>25}$, $Q_{\complement(\le 25)}$ and $Q_{\complement(>25)}$ for training and testing of models. The numbers of constraints are also noted in Table 1, together with the total number of clips referenced by the constraint sets. Due to multiple ratings referring to the same constraint and due to filtering, the constraint count is lower than the number of ratings.

4. SIMILARITY MODELLING

The computational representations of music through features, related to physical, musical, and cultural attributes, determine the basis of similarity models. Both the MagnaTagATune and CASimIR datasets contain pre-computed features created by The Echo Nest API. For our experiments with CASimIR we derive acoustic features from this data which are aggregated to the clip level. The 41-dimensional features contain 12 chroma and 12 timbre features, both aggregated via averaging, 2 weight vectors and further features after [8, 11]:

Table 2. Features used in our experiments.

For experiments with the MagnaTagATune dataset we will use the similar features provided in [12] which contain pre-processed tag information in addition to the acoustic features described above. For the CASimIR dataset, using unprocessed tags from Last.fm did not increase performance in earlier experiments due to very sparse tag assignments. Therefore, our experiments on CASimIR use acoustic features only. For a clip $C_i$, we refer to its feature vector as $x_i$.

4.1 Mahalanobis Distances

We use the inverse of the distance of two feature vectors as the similarity of the two corresponding clips. The mathematical form of the Mahalanobis distance is used to specify a parametrised distance measure. Given two feature vectors $x_i, x_j \in \mathbb{R}^N$, the distance can be expressed as

$$d_W(x_i, x_j) = \sqrt{(x_i - x_j)^{\top} W (x_i - x_j)},$$

where $W \in \mathbb{R}^{N \times N}$ is a square matrix parametrising the distance function: the Mahalanobis matrix. $d_W$ qualifies as a metric if $W$ is positive definite and symmetric.

5. MODEL TRAINING WITH RITML

We now discuss our algorithm which can adapt Mahalanobis distances in order to fit relative similarity data. It is based on the ITML algorithm as described below, which cannot be used directly with relative similarity data. Instead, ITML requires upper or lower bounds on the similarity of two clips, e.g. $d_W(x_i, x_j) < m_{i,j}$ for similar clips. In Section 5.2 we will iteratively derive such constraints during the RITML optimisation process.

5.1 Information-Theoretic Metric Learning

Davis et al. [1] describe Information-Theoretic Metric Learning (ITML) for learning a Mahalanobis distance from absolute distance constraints (e.g. requiring $d_W(x_i, x_j) < 0.5$). A particularly interesting feature of ITML is that a template Mahalanobis matrix $W_0 \in \mathbb{R}^{N \times N}$ can be provided for regularisation. This $W_0$ can be from a metric that is predefined or learnt on a different dataset. If $W_0$ is not specified, the identity transform is used. The regularisation of ITML exploits an interpretation of Mahalanobis matrices as multivariate Gaussian distributions: The distance between two Mahalanobis distance functions parametrised by $W$ and $W_0$ is measured by the relative entropy of the corresponding distributions, which in [1] uses the LogDet divergence

$$D_{ld}(W, W_0) = \operatorname{tr}(W W_0^{-1}) - \log\det(W W_0^{-1}) - N = 2 \cdot \mathrm{KL}\left(P(x_i; W_0) \,\|\, P(x_i; W)\right).$$

KL refers to the Kullback-Leibler divergence. For details of the transformation see [1]. Given the constraints in form of similar ($R_s$) and dissimilar ($R_d$) clip indices as well as upper and lower bounds $u_{ij}$, $l_{ij}$, the optimisation problem is then posed as follows:

$$\begin{aligned}
\mathrm{ITML}(W, \xi, c, R_s, R_d) = {} & \operatorname{argmin}\; D_{ld}(W, W_0) + c \cdot D_{ld}(\operatorname{diag}(\xi), \operatorname{diag}(\xi_0)) \\
\text{s.t.}\quad & \operatorname{tr}\left(W \delta_{i,j} \delta_{i,j}^{\top}\right) \le \xi_{ij} \quad \forall (i, j) \in R_s, \\
& \operatorname{tr}\left(W \delta_{i,j} \delta_{i,j}^{\top}\right) \ge \xi_{ij} \quad \forall (i, j) \in R_d,
\end{aligned}$$

with $\delta_{i,j} = x_i - x_j$.
Here, $\xi_{ij}$ are slack variables enabling and controlling the violation of individual constraints. The $\xi_{ij}$ are initialised to the given upper bounds $u_{ij}$ if $(i, j) \in R_s$, or to the lower bounds $l_{ij}$ if $(i, j) \in R_d$. During optimisation, they are regularised by comparison to the template slack $\xi_0$ using the diagonal matrices $\operatorname{diag}(\xi)$ and $\operatorname{diag}(\xi_0)$.

5.2 Relative Learning with RITML

In order to allow for training with relative similarity constraints, we present Relative Information-Theoretic Metric Learning (RITML) based on ITML. Motivated by [14], we embed ITML into an iterative adaptation of the upper and lower bounds.

We start with a training set of relative constraints $(i, j, k) \in Q_t$. We require standard ITML parameters such as $c$, as well as the relative learning parameters including shrinkage factor $\eta$, margin $\tau$ and number of cycles $k$ at the beginning. We use the identity matrix for the template $W_0$.

During iteration $m$, the active training set of violated constraints $Q^m$ is calculated as

$$Q^m = \{(i, j, k) \in Q_t \mid d_{W^m}(x_i, x_j) > d_{W^m}(x_i, x_k)\}.$$

$Q^m$ is then further divided into the sets of similar and dissimilar constraints $R^m_s$ and $R^m_d$:

$$R^m_s = \{(i, j) \mid (i, j, k) \in Q^m\}, \qquad R^m_d = \{(i, k) \mid (i, j, k) \in Q^m\}.$$

Afterwards, absolute distance constraints $\xi^m$ for the following ITML instance are acquired by adding a margin $\tau$ to the average distance values

$$\mu = \frac{d_{W^m}(x_i, x_j) + d_{W^m}(x_i, x_k)}{2}$$

of the clip pairs. Now, with $\xi^m$ containing the upper and lower bounds, $\Delta W$ can be calculated using

$$\Delta W = \mathrm{ITML}(W^m, \xi^m, R^m_s, R^m_d) \tag{1}$$

and the final Mahalanobis matrix is accumulated over iterations using the model update function [...].

In order for the algorithm to converge, the cardinality of the active training set $Q^m$ needs to decrease. In our experiments, $k = 200$ training iterations are usually sufficient. Otherwise an early stopping of the algorithm takes place if $|Q^m|$ does not decrease for 50 iterations. In this case the $W^m$ for the smallest $Q^m$ within the last 50 iterations is returned. RITML does not guarantee $d_W$ to be a metric.

Algorithm 1: Relative Training with RITML
Data: Constraints $Q_t$, features $x_i$, template matrix $W_0$, regularisation factor $c$, shrinkage factor $\eta$, margin $\tau$, number of cycles $k$
while $m \le k \wedge Q^{*} \neq \emptyset$ do
  Update training sets $Q^m$, $R^m_s$ and $R^m_d$;
  Update absolute constraints $\xi^m$;
  Calculate parameter change $\Delta W$;
  Calculate $W^{m+1}$;
return Mahalanobis matrix $W^k$

5.3 Transfer Learning with $W_0$-RITML

The property that motivates our usage of RITML is that it enables transfer learning: If a specific starting value $W_0$ other than the identity matrix is provided, the optimisation tends to produce results close to the provided $W_0$. In order to sustain this effect for large numbers of iterations we modify Equation (1) such that regularisation is fixed towards $W_0$ instead of the Euclidean distance:

$$\Delta W = \mathrm{ITML}(W_0, \xi^m, R^m_s, R^m_d).$$

This constitutes the $W_0$-RITML algorithm for transfer learning with Mahalanobis matrices.

For all our experiments we use 10-fold cross-validation with inductive sampling as described in [11]: Instead of dividing the similarity constraints themselves into test/training sets, the data are divided on the basis of connected clusters in the similarity data. This approach prevents the recurrence of clips from a training set in the corresponding test set. It also leads to a greater variance in test-set sizes for CASimIR, where the clusters of connected similarity data are larger.

We evaluate the algorithms' performance based on the percentage of training and test constraints fulfilled by the trained model. Our main focus is on the test-set results as we are interested in how well the learnt models generalise to unseen data. As a baseline we use the Euclidean distance on the features. We have tested results for statistical significance using the Wilcoxon signed rank test on the cross-validation folds' results with a threshold of $p < 5\%$.

Both SVM as implemented in svmlight [7] and RITML have hyper-parameters affecting the performance on different datasets. The results reported here were selected on the basis of best test-set performances after a grid-search over a range of value combinations identified as reasonable in preliminary experiments: The regularisation trade-off $c$ is a parameter common to SVM, RITML and $W_0$-RITML with a similar effective range: we explored $c \in [0.001, 10]$ using an approximately logarithmic scale. For RITML and $W_0$-RITML we additionally used $\eta \in \{0.1, 0.15, \dots, 0.95\}$.
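The first two steps of each RITML iteration described in Section 5.2 (finding the active set of violated constraints and deriving absolute bounds from it) can be sketched as below. This is a hedged illustration only: the ITML solver and the model update function are not reproduced, the function names are hypothetical, the distance is taken as the square root of the quadratic form, and the sign convention of the margin (upper bound $\mu - \tau$ for the similar pair, lower bound $\mu + \tau$ for the dissimilar pair) is an assumption of this example rather than something stated in the text.

```python
import math


def mahalanobis(W, x, y):
    """d_W(x, y) = sqrt((x - y)^T W (x - y)), with W given as a list of rows."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    quad = sum(d[r] * W[r][c] * d[c] for r in range(n) for c in range(n))
    return math.sqrt(max(quad, 0.0))  # guard against tiny negative values


def active_set(W, X, constraints):
    """Constraints (i, j, k) violated by W: d(x_i, x_j) > d(x_i, x_k)."""
    return [
        (i, j, k)
        for (i, j, k) in constraints
        if mahalanobis(W, X[i], X[j]) > mahalanobis(W, X[i], X[k])
    ]


def absolute_bounds(W, X, active, tau):
    """Derive absolute bounds from the active relative constraints.

    Assumed convention: the similar pair (i, j) gets the upper bound mu - tau
    and the dissimilar pair (i, k) the lower bound mu + tau, where mu is the
    average of the two current distances. A pair that occurs in several
    constraints keeps the bound of the last one processed.
    """
    upper, lower = {}, {}
    for i, j, k in active:
        mu = 0.5 * (mahalanobis(W, X[i], X[j]) + mahalanobis(W, X[i], X[k]))
        upper[(i, j)] = mu - tau
        lower[(i, k)] = mu + tau
    return upper, lower


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    X = {0: [0, 0], 1: [3, 4], 2: [1, 0]}
    active = active_set(identity, X, [(0, 1, 2), (0, 2, 1)])
    print(active, absolute_bounds(identity, X, active, tau=0.5))
```

The resulting `upper` and `lower` dictionaries play the role of the bounds $\xi^m$ for the pairs in $R^m_s$ and $R^m_d$ that would be handed to an ITML solver in the next step.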

6.1 Comparing the Performance of RITML

For a comparable evaluation of RITML we chose the MagnaTagATune-based dataset and constraint sampling published in [12]. Their evaluation compares various algorithms for learning a Mahalanobis metric using two different samplings. The inductive sampling used here corresponds to the sampling B in their text. Table 3 shows the results on MagnaTagATune and on the complete CASimIR dataset ($Q$).

Table 3. Comparison of Test / Training set performance on the MagnaTagATune and CASimIR datasets for baseline, RITML and SVM. Reported is the number of constraints fulfilled by the learnt distance measures.

For MagnaTagATune, RITML achieves generalisation results similar to SVM (with parameters SVM: $c = 0.7$ and RITML: $c = 1$, $\eta = 0.85$, $\tau = 0.5$), while MLR overfits to the training data. For both the MagnaTagATune and CASimIR datasets all methods perform significantly better than the baseline. The RITML results are therefore comparable to the state of the art. The training results on MagnaTagATune with SVM and MLR are far better than the test results, indicating overfitting, which does not occur for RITML. Interestingly, on the CASimIR dataset, the situation between RITML and SVM is reversed. Results published by [11] for acoustic-only features show a performance of 66% on MagnaTagATune, but the lower performance on CASimIR can be explained by the smaller number of training examples.

6.2 Transfer Learning

A core motivation for transfer learning is the training on highly specialised but small datasets. To evaluate the RITML method for transfer learning, we firstly compared the SVM and RITML algorithms with the baseline on the age-bounded datasets $Q_{>25}$ and $Q_{\le 25}$ in Table 4. The rightmost column shows the average performance across both age-bounded datasets. As expected, on these smaller datasets generalisation results for RITML as well as the reference SVM and MLR are lower than on the whole CASimIR dataset. Only RITML shows a notable increase of 4.37% over the baseline, on the slightly larger $Q_{>25}$, which improves the average score for RITML.

Table 4. Comparison of Test / Training set performance on the age-bounded datasets. Training on single datasets (top 3 rows) and transfer learning with $W_0$-RITML and [...].

Figure 1. Flow diagram for transfer learning, exemplified for the $Q_{>25}$ dataset.

We now apply transfer learning to improve generalisation results on the age-bounded sets. The overall process is depicted in Figure 1. First, a similarity modelling experiment is performed on both of the complementary subsets $Q_{\complement(>25)}$ and $Q_{\complement(\le 25)}$ using cross-validation with training and test data from only these sets. Comparing the individual results for validation folds we choose the Mahalanobis matrices with the greatest test-set performance as template matrix $W_0$. The template matrix $W_0$ learnt on $Q_{\complement(\le 25)}$ is then used for transfer learning on $Q_{\le 25}$, using $W_0$-RITML. For comparison of the effectiveness of the fine-tuning with $W_0$-RITML, we report the performance achieved with the unmodified $W_0$ on $Q_{\le 25}$ as $W_0$-Direct. This process is repeated analogously for $Q_{>25}$ by applying the template matrix $W_0$ from $Q_{\complement(>25)}$ on $Q_{>25}$.

The highlighted lower rows of Table 4 show the results for transfer learning: Row $W_0$-Direct reports the direct performances of the template Mahalanobis matrices $W_0$. The results of fine-tuning these models with $W_0$-RITML are reported in the last row. We here find that using the matrices trained on the larger datasets, and thus transfer learning, generally improves results. Only the results for $W_0$-RITML provide gains > 6.21% that are statistically significant when compared to the baseline. As the average result of $W_0$-RITML also significantly outperforms the average SVM performance, $W_0$-RITML works best for adapting models to specialised datasets.

A drawback of RITML is that it is computationally demanding: For the $Q$ dataset, RITML uses 50 seconds where SVM converges in 5 seconds. On the other hand, SVM learns a diagonal $W$, which reduces the number of parameters and model flexibility.

6.3 Model Comparison

In order to identify specificities of the $Q_{>25}$ dataset in comparison to the remaining $Q_{\complement(>25)}$, we now analyse changes made to the template matrix $W_0$ in the fine-tuning process. Instead of starting from the Euclidean metric, models learnt with the $W_0$-RITML method have a model already adapted to similarity data as basis.
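The template selection and the $W_0$-Direct evaluation described in Section 6.2 can be sketched as follows, using the fraction of fulfilled constraints as the performance measure. The names `fraction_fulfilled` and `select_template` and the data layout are hypothetical, the fold matrices are assumed to be already trained, and counting a constraint as fulfilled only under strict inequality is an assumption of this example.

```python
import math


def mahalanobis(W, x, y):
    """d_W(x, y) = sqrt((x - y)^T W (x - y)), with W given as a list of rows."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    quad = sum(d[r] * W[r][c] * d[c] for r in range(n) for c in range(n))
    return math.sqrt(max(quad, 0.0))


def fraction_fulfilled(W, X, constraints):
    """Share of constraints (i, j, k) with d(x_i, x_j) < d(x_i, x_k)."""
    if not constraints:
        raise ValueError("constraint set must not be empty")
    fulfilled = sum(
        1
        for i, j, k in constraints
        if mahalanobis(W, X[i], X[j]) < mahalanobis(W, X[i], X[k])
    )
    return fulfilled / len(constraints)


def select_template(fold_models, X, fold_test_sets):
    """Pick the fold matrix with the best test-set performance as template W0."""
    scores = [
        fraction_fulfilled(W, X, test)
        for W, test in zip(fold_models, fold_test_sets)
    ]
    best = max(range(len(scores)), key=scores.__getitem__)
    return fold_models[best], scores[best]


if __name__ == "__main__":
    X = {0: [0.0, 0.0], 1: [1.0, 0.0], 2: [0.0, 2.0]}
    W_a = [[1.0, 0.0], [0.0, 1.0]]
    W_b = [[4.0, 0.0], [0.0, 0.01]]
    # Matrices learnt on folds of the complementary set, with their test sets.
    W0, score = select_template([W_b, W_a], X, [[(0, 1, 2)], [(0, 1, 2)]])
    # "W0-Direct": apply the unmodified template to the target constraints.
    target = [(0, 1, 2), (0, 2, 1)]
    print(score, fraction_fulfilled(W0, X, target))
```

Selecting by test-set performance follows the description in the text; the fine-tuning of the selected template with $W_0$-RITML on the age-bounded set is a separate step that is not shown here.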

Figure 2 shows the relative difference $\hat{W}$ of the Mahalanobis matrix before ($W_0$) and after ($W$) fine tuning; subtraction and division are applied to $W$ in a point-wise manner. As the fine tuning process rescales the similarity measure and thereby $W$, the matrices have been normalised to the [...].

The axes of the figure correspond to feature types, which for better overview have been grouped into chroma, timbre and ranges of the features in Table 2. The template matrix $W_0$ in Figure 3a has large values only in the diagonal and homogeneous small values off the diagonal. In comparison to this, Figure 2 shows that specific combinations of timbre features (in the bottom centre) with (B)eat and tempo statistics were raised in importance by $W_0$-RITML, resulting in the final matrix $W$ as shown in Figure 3b. Also, the centre of the matrix shows increased values for combinations of different timbre coefficients. The strongest increases (20-24%) in weights are reported for the off-diagonal fields of $C_{11}C_1$, $T_6T_5$, $B_4T_4$ and $B_4T_5$, where $C$, $T$ relate to chroma and timbre coefficients and $B_4$ refers to the tatumConfidence feature. Weights are increased mainly at the cost of diagonal elements, and suggest a specialisation of the model to the specificities of the $Q_{>25}$ similarity subset. For this data collected from users aged over 25, the $W_0$-RITML model with stronger influence of the timbre and beat-statistics features performs best in our experiments.

Figure 2. Learnt model difference for $W_0$-RITML on $Q_{>25}$. Axis labels represent ranges of feature types: (C)hroma, (T)imbre, as well as (S)egment, (L)oudness and (B)eat+Tempo statistics. Dark red / blue colours correspond to strong weight increase / decrease.

Figure 3. (a) Template matrix $W_0$ before and (b) final matrix $W$ after fine-tuning with $W_0$-RITML on $Q_{>25}$. The latter shows higher variance in off-diagonal entries for the specialised model. Axis labels represent ranges of feature types: (C)hroma, (T)imbre, as well as (S)egment, (L)oudness and (B)eat+Tempo statistics. Dark red colours correspond to strong weight increase, light yellow to decrease.

7. CONCLUSION & FUTURE WORK

We presented a method for analysing music similarity data of different user groups via models trained with transfer learning. To this end, the new RITML algorithm was developed, extending ITML to relative similarity data. A key feature of RITML is that it enables transfer learning with template Mahalanobis matrices via $W_0$-RITML. Our evaluation of the algorithm was performed on two datasets: The evaluation on the commonly used MagnaTagATune dataset showed that RITML performs comparably to state-of-the-art algorithms for metric learning.

For evaluation of transfer learning with $W_0$-RITML we provide the CASimIR similarity dataset, the first open dataset containing user attributes associated with relative similarity data. Tests on the whole CASimIR dataset corroborated our finding that RITML competes with current similarity learning methods. Our analysis of $W_0$-RITML was performed on age-bounded subsets of the dataset. Results showed that transfer learning with $W_0$-RITML outperforms the standard SVM algorithm on small datasets.

Our comparison of models allowed us to point out specific features and combinations that determine similarity in user data. For this first evaluation we chose age to group users. We hope this will motivate further research in comparison of similarity models and adaptation to data with regard to cultural and user context.

For future work we are interested in collecting larger similarity datasets, and applying the methods introduced here for improved validation of results and the analysis of more specific user groups. The set-up used for our experiments motivates transfer learning across the MagnaTagATune and CASimIR datasets with $W_0$-RITML for further analysis of the transferability of similarity information via Mahalanobis matrices.

[1] Jason V. Davis, B. Kulis, Prateek Jain, Suvrit Sra, and Inderjit S. Dhillon. Information-theoretic metric learning. In Proc. of ICML '07, pages 209–216, New York, NY, USA, 2007. ACM.

[2] Philippe Hamel, Matthew E. P. Davies, Kazuyoshi Yoshii, and Masataka Goto. Transfer learning in MIR: Sharing learned latent representations for music audio classification and similarity. In Alceu de Souza Britto Jr., Fabien Gouyon, and Simon Dixon, editors, ISMIR, pages 9–14, 2013.

[3] Edith Law and Luis Von Ahn. Input-agreement: A new mechanism for collecting data using human computation games. In Proc. of CHI. ACM Press, 2009.

[4] B. McFee, L. Barrington, and G. Lanckriet. Learning similarity from collaborative filters. In Proc. of ISMIR 2010, pages 345–350, 2010.

[5] Brian McFee and Gert R. G. Lanckriet. Partial order embedding with multiple kernels. In Proc. of the 26th International Conference on Machine Learning (ICML'09), pages 721–728, June 2009.

[6] Sinno Jialin Pan and Qiang Yang. A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, 22(10):1345–1359, October 2010.

[7] M. Schultz and T. Joachims. Learning a distance metric from relative comparisons. In Advances in Neural Information Processing Systems (NIPS). MIT Press, 2003.

[8] Malcolm Slaney, Kilian Q. Weinberger, and William White. Learning a metric for music similarity. In Juan Pablo Bello, Elaine Chew, and Douglas Turnbull, editors, Proc. of ISMIR 2008, pages 313–318, 2008.

[9] Sebastian Stober. Adaptive Methods for User-Centered Organization of Music Collections. PhD thesis, Otto-von-Guericke-University, Magdeburg, Germany, Nov 2011. Published by Dr. Hut Verlag, ISBN 978-3-8439-0229-8.

[10] Sebastian Stober and Andreas Nürnberger. Similarity adaptation in an exploratory retrieval scenario. In Proc. of AMR 2010, Linz, Austria, Aug 2010.

[11] Daniel Wolff and Tillman Weyde. Learning music similarity from relative user ratings. Information Retrieval, pages 1–28, 2013.

[12] Daniel Wolff, Sebastian Stober, Andreas Nürnberger, and Tillman Weyde. A systematic comparison of music similarity adaptation approaches. In Proc. of ISMIR 2012, pages 103–108, 2012.

[13] Daniel Wolff, Guillaume Bellec, Anders Friberg, Andrew MacFarlane, and Tillman Weyde. Creating audio based experiments as social web games with the CASimIR framework. In Proc. of AES 53rd International Conference: Semantic Audio, Jan 2014.

[14] Zhaohui Zheng, Keke Chen, Gordon Sun, and Hongyuan Zha. A regression framework for learning ranking functions using relative relevance judgments. In Proc. of SIGIR '07, pages 287–294, New York, NY, USA, 2007. ACM.

Source: http://ismir2015.uma.es/articles/151_Paper.pdf
