Visual interestingness prediction: A benchmark framework and literature review
In this paper, we report on the creation of a publicly available, common evaluation
framework for image and video visual interestingness prediction. We propose a robust data …
Statistical significance, power, and sample sizes: A systematic review of SIGIR and TOIS, 2006-2015
T Sakai - Proceedings of the 39th International ACM SIGIR …, 2016 - dl.acm.org
We conducted a systematic review of 840 SIGIR full papers and 215 TOIS papers published
between 2006 and 2015. The original objective of the study was to identify IR effectiveness …
Statistical significance testing in information retrieval: an empirical analysis of type I, type II and type III errors
Statistical significance testing is widely accepted as a means to assess how well a difference
in effectiveness reflects an actual difference between systems, as opposed to random noise …
Toward Cranfield-inspired reusability assessment in interactive information retrieval evaluation
J Liu - Information Processing & Management, 2022 - Elsevier
Re-using research resources is essential for advancing knowledge and developing
repeatable, empirically solid experiments in scientific fields, including interactive information …
A framework for evaluating automatic indexing or classification in the context of retrieval
Tools for automatic subject assignment help deal with scale and sustainability in creating
and enriching metadata, establishing more connections across and between resources and …
Affect in multimedia: Benchmarking violent scenes detection
In this article, we report on the creation of a publicly available, common evaluation
framework for Violent Scenes Detection (VSD) in Hollywood and YouTube videos. We …
A systematic evaluation of transfer learning and pseudo-labeling with BERT-based ranking models
Due to high annotation costs, making the best use of existing human-created training data is
an important research direction. We, therefore, carry out a systematic evaluation of …
Evaluation in music information retrieval
The field of Music Information Retrieval has always acknowledged the need for
rigorous scientific evaluations, and several efforts have set out to develop and provide the …
Evaluation and combination of pitch estimation methods for melody extraction in symphonic classical music
The extraction of pitch information is arguably one of the most important tasks in automatic
music description systems. However, previous research and evaluation datasets dealing …
Topic set size design
T Sakai - Information Retrieval Journal, 2016 - Springer
Traditional pooling-based information retrieval (IR) test collections typically have n =
50–100 topics, but it is difficult for an IR researcher to say why the topic set size should really …