Handling the impact of low frequency events on co-occurrence based measures of word similarity: a case study of pointwise mutual information
F Role, M Nadif - … on Knowledge Discovery and Information Retrieval, 2011 - scitepress.org
Statistical measures of word similarity are widely used in many areas of information retrieval
and text mining. Among popular word co-occurrence based measures is Pointwise Mutual …
Beneath (or beyond) the surface: Discovering voice-leading patterns with skip-grams
Recurrent voice-leading patterns like the Mi-Re-Do compound cadence (MRDCC) rarely
appear on the musical surface in complex polyphonic textures, so finding these patterns …
Term extraction from sparse, ungrammatical domain-specific documents
A Ittoo, G Bouma - Expert Systems with Applications, 2013 - Elsevier
Existing term extraction systems have predominantly targeted large and well-written
document collections, which provide reliable statistical and linguistic evidence to support …
KR-WordRank: An unsupervised Korean word extraction method based on WordRank
A Word is the smallest unit for text analysis, and the premise behind most text-mining
algorithms is that the words in given documents can be perfectly recognized. However, the …
Finding multiwords of more than two words
The prospects for automatically identifying two-word multiwords in corpora have been
explored in depth, and there are now well-established methods in widespread use. (We use …
CLAD: A corpus-derived Chinese lexical association database
SY Lin, HC Chen, TH Chang, WE Lee… - Behavior Research …, 2019 - Springer
The application of word associations has become increasingly widespread. However, the
association norms produced by traditional free association tests tend not to exceed 10,000 …
Multi-word terms selection for information retrieval
Purpose: A number of approaches and algorithms have been proposed over the years as a
basis for automatic indexing. Many of these approaches suffer from precision inefficiency at …
Measuring coselectional constraint in learner corpora: A graph-based approach
AV Shadrova - 2020 - edoc.hu-berlin.de
The thesis located in corpus linguistics analyzes the acquisition of coselectional constraint in
learners of German as a second language in a quasi-longitudinal design based on the …
TermeX: A Tool for Collocation Extraction
Collocations–word combinations occurring together more often than by chance–have a wide
range of NLP applications. Many approaches for automating collocation extraction based on …
Improving product quality and reliability with customer experience data
Advance technology development and wide use of the World Wide Web have made it
possible for new product development organizations to access multi‐sources of data‐related …