A theoretical analysis of NDCG type ranking measures
Ranking has been extensively studied in information retrieval, machine learning and
statistics. A central problem in ranking is to design a ranking measure for evaluation of …
A comparison of statistical significance tests for information retrieval evaluation
Information retrieval (IR) researchers commonly use three tests of statistical significance: the
Student's paired t-test, the Wilcoxon signed rank test, and the sign test. Other researchers …
Test collection based evaluation of information retrieval systems
M Sanderson - Foundations and Trends® in Information …, 2010 - nowpublishers.com
Use of test collections and evaluation measures to assess the effectiveness of information
retrieval systems has its origins in work dating back to the early 1950s. Across the nearly 60 …
Statistical significance, power, and sample sizes: A systematic review of SIGIR and TOIS, 2006-2015
T Sakai - Proceedings of the 39th International ACM SIGIR …, 2016 - dl.acm.org
We conducted a systematic review of 840 SIGIR full papers and 215 TOIS papers published
between 2006 and 2015. The original objective of the study was to identify IR effectiveness …
Search result diversification
Ranking in information retrieval has been traditionally approached as a pursuit of relevant
information, under the assumption that the users' information needs are unambiguously …
Assessing ranking metrics in top-N recommendation
The evaluation of recommender systems is an area with unsolved questions at several
levels. Choosing the appropriate evaluation metric is one of such important issues. Ranking …
Time-based calibration of effectiveness measures
Many current effectiveness measures incorporate simplifying assumptions about user
behavior. These assumptions prevent the measures from reflecting aspects of the search …
University of Wolverhampton at the TREC 2011 Microblog Track.
In this report we discuss the experiments we conducted at the University of Wolverhampton
for the Microblog Track at TREC-2011. As this was the first time we participated in TREC and …
Estimating the uncertainty of average F1 scores
In multi-class text classification, the performance (effectiveness) of a classifier is usually
measured by micro-averaged and macro-averaged F1 scores. However, the scores …
Statistical significance testing in information retrieval: an empirical analysis of type I, type II and type III errors
Statistical significance testing is widely accepted as a means to assess how well a difference
in effectiveness reflects an actual difference between systems, as opposed to random noise …