Learning to identify definitions using syntactic features
This paper describes an approach to learning concept definitions which operates on fully
parsed text. A subcorpus of the Dutch version of Wikipedia was searched for sentences …
DCC&U: An extended digital curation lifecycle model
The proliferation of Web, database and social networking technologies has enabled us to
produce, publish and exchange digital assets at an enormous rate. This vast amount of …
Situational Data Integration in Question Answering systems: a survey over two decades
MH Franciscatto, LC Erpen de Bona, C Trois… - … and Information Systems, 2024 - Springer
Question Answering (QA) systems provide accurate answers to questions; however, they
lack the ability to consolidate data from multiple sources, making it difficult to manage …
Mining the Web to Create Specialized Glossaries.
A first step in establishing a Web community's knowledge domain is to collect a glossary of
domain-relevant terms that constitute the linguistic surface manifestation of domain …
Coping with highly imbalanced datasets: A case study with definition extraction in a multilingual setting
This paper addresses the task of automatic extraction of definitions by thoroughly exploring
an approach that solely relies on machine learning techniques, and by focusing on the issue …
Defining file format obsolescence: A risky journey
D Pearson, C Webb - 2008 - core.ac.uk
File format obsolescence is a major risk factor threatening the ongoing usefulness of digital
information collections. While the preservation community has become increasingly …
Creating Glossaries Using Pattern-Based and Machine Learning Techniques.
E Westerhout, P Monachesi - LREC, 2008 - cs.brandeis.edu
One of the aims of the Language Technology for eLearning project is to show that Natural
Language Processing techniques can be employed to enhance the learning process. To this …
Semantically driven snippet selection for supporting focused web searches
Millions of people access the plentiful web content to locate information that is of interest to
them. Searching is the primary web access method for many users. During search, the users …
GlossExtractor: A web application to automatically create a domain glossary
We describe a web application, GlossExtractor, that receives in input the output of a
terminology extraction web application, TermExtractor, or a user-provided terminology, and …
Answering definition questions via temporally-anchored text snippets
M Pasca - Proceedings of the Third International Joint Conference …, 2008 - aclanthology.org
A lightweight extraction method derives text snippets associated to dates from the Web. The
snippets are organized dynamically into answers to definition questions. Experiments on …