SemEval 2015 task 18: Broad-coverage semantic dependency parsing

S Oepen, M Kuhlmann, Y Miyao, D Zeman… - Proceedings of the …, 2015 - aclanthology.org
Task 18 at SemEval 2015 defines Broad-Coverage Semantic Dependency Parsing
(SDP) as the problem of recovering sentence-internal predicate–argument relationships for …

Automatic hate speech detection using killer natural language processing optimizing ensemble deep learning approach

Z Al-Makhadmeh, A Tolba - Computing, 2020 - Springer
Over the last decade, the increased use of social media has led to an increase in hateful
activities in social networks. Hate speech is one of the most dangerous of these activities, so …

Elephant: Sequence labeling for word and sentence segmentation

K Evang, V Basile, G Chrupała, J Bos - EMNLP 2013, 2013 - hal.science
Tokenization is widely regarded as a solved problem due to the high accuracy that rule-
based tokenizers achieve. But rule-based tokenizers are hard to maintain and their rules …

Statistical learning for OCR error correction

J Mei, A Islam, A Moh'd, Y Wu, E Milios - Information Processing & …, 2018 - Elsevier
Modern OCR engines incorporate some form of error correction, typically based on
dictionaries. However, there are still residual errors that decrease performance of natural …

Graph based text representation for document clustering

AK Abdulsahib, SS Kamaruddin - 2015 - jatit.org
Advances in digital technology and the World Wide Web have led to the increase of digital
documents that are used for various purposes such as publishing and digital library. This …

iSentenizer‐μ: Multilingual Sentence Boundary Detection Model

DF Wong, LS Chao, X Zeng - The Scientific World Journal, 2014 - Wiley Online Library
Sentence boundary detection (SBD) system is normally quite sensitive to genres of data that
the system is trained on. The genres of data are often referred to the shifts of text topics and …

Automated item matching and pricing (IMP) for wood building elements to support BIM-based wood construction cost estimation

T Akanbi, J Zhang, YC Lee - ASCE International Conference on …, 2019 - ascelibrary.org
A major gap in the automation of construction cost estimation is the need of
manual inputs to complete cost estimation processes. To address this gap, the authors …

Multilingual word segmentation: Training many language-specific tokenizers smoothly thanks to the universal dependencies corpus

E Moreau, C Vogel - … of the Eleventh International Conference on …, 2018 - hal.science
This paper describes how a tokenizer can be trained from any dataset in the Universal
Dependencies 2.1 corpus (UD2) (Nivre et al., 2017). A software tool, which relies on …

Ubertagging: Joint segmentation and supertagging for English

R Dridan - Proceedings of the 2013 Conference on Empirical …, 2013 - aclanthology.org
A precise syntacto-semantic analysis of English requires a large detailed lexicon with the
possibility of treating multiple tokens as a single meaning-bearing unit, a word-with-spaces …

Document parsing: Towards realistic syntactic analysis

R Dridan, S Oepen - … of The 13th International Conference on …, 2013 - aclanthology.org
In this work we take a view of syntactic analysis as processing 'raw', running text instead of
idealised, pre-segmented inputs—a task we dub document parsing. We observe the state of …