Visual interestingness prediction: A benchmark framework and literature review

MG Constantin, LD Ştefan, B Ionescu… - International Journal of …, 2021 - Springer
In this paper, we report on the creation of a publicly available, common evaluation
framework for image and video visual interestingness prediction. We propose a robust data …

Statistical significance, power, and sample sizes: A systematic review of SIGIR and TOIS, 2006-2015

T Sakai - Proceedings of the 39th International ACM SIGIR …, 2016 - dl.acm.org
We conducted a systematic review of 840 SIGIR full papers and 215 TOIS papers published
between 2006 and 2015. The original objective of the study was to identify IR effectiveness …

Statistical significance testing in information retrieval: an empirical analysis of type I, type II and type III errors

J Urbano, H Lima, A Hanjalic - … of the 42nd International ACM SIGIR …, 2019 - dl.acm.org
Statistical significance testing is widely accepted as a means to assess how well a difference
in effectiveness reflects an actual difference between systems, as opposed to random noise …

Toward Cranfield-inspired reusability assessment in interactive information retrieval evaluation

J Liu - Information Processing & Management, 2022 - Elsevier
Re-using research resources is essential for advancing knowledge and developing
repeatable, empirically solid experiments in scientific fields, including interactive information …

A framework for evaluating automatic indexing or classification in the context of retrieval

K Golub, D Soergel, G Buchanan… - Journal of the …, 2016 - Wiley Online Library
Tools for automatic subject assignment help deal with scale and sustainability in creating
and enriching metadata, establishing more connections across and between resources and …

Affect in multimedia: Benchmarking violent scenes detection

MG Constantin, LD Ştefan, B Ionescu… - IEEE Transactions …, 2020 - ieeexplore.ieee.org
In this article, we report on the creation of a publicly available, common evaluation
framework for Violent Scenes Detection (VSD) in Hollywood and YouTube videos. We …

A systematic evaluation of transfer learning and pseudo-labeling with BERT-based ranking models

I Mokrii, L Boytsov, P Braslavski - … of the 44th International ACM SIGIR …, 2021 - dl.acm.org
Due to high annotation costs, making the best use of existing human-created training data is
an important research direction. We, therefore, carry out a systematic evaluation of …

Evaluation in music information retrieval

J Urbano, M Schedl, X Serra - Journal of Intelligent Information Systems, 2013 - Springer
The field of Music Information Retrieval has always acknowledged the need for
rigorous scientific evaluations, and several efforts have set out to develop and provide the …

Evaluation and combination of pitch estimation methods for melody extraction in symphonic classical music

JJ Bosch, R Marxer, E Gómez - Journal of New Music Research, 2016 - Taylor & Francis
The extraction of pitch information is arguably one of the most important tasks in automatic
music description systems. However, previous research and evaluation datasets dealing …

Topic set size design

T Sakai - Information Retrieval Journal, 2016 - Springer
Traditional pooling-based information retrieval (IR) test collections typically have n =
50–100 topics, but it is difficult for an IR researcher to say why the topic set size should really …