A maximum entropy approach to natural language processing

A Berger, SA Della Pietra… - Computational …, 1996 - aclanthology.org
The concept of maximum entropy can be traced back along multiple threads to Biblical
times. Only recently, however, have computers become powerful enough to permit the …

Committee-based sampling for training probabilistic classifiers

I Dagan, SP Engelson - Machine Learning Proceedings 1995, 1995 - Elsevier
In many real-world learning tasks, it is expensive to acquire a sufficient number of labeled
examples for training. This paper proposes a general method for efficiently training …

A practical part-of-speech tagger

D Cutting, J Kupiec, J Pedersen… - Third conference on …, 1992 - aclanthology.org
We present an implementation of a part-of-speech tagger based on a hidden Markov model.
The methodology enables robust and accurate tagging with few resource requirements …

Three new probabilistic models for dependency parsing: An exploration

J Eisner - arXiv preprint cmp-lg/9706003, 1997 - arxiv.org
After presenting a novel O(n^3) parsing algorithm for dependency grammar, we develop
three contrasting ways to stochasticize it. We propose (a) a lexical affinity model where …

Robust part-of-speech tagging using a hidden Markov model

J Kupiec - Computer speech & language, 1992 - Elsevier
A system for part-of-speech tagging is described. It is based on a hidden Markov model
which can be trained using a corpus of untagged text. Several techniques are introduced to …

Some advances in transformation-based part of speech tagging

E Brill - arXiv preprint cmp-lg/9406010, 1994 - arxiv.org
Most recent research in trainable part of speech taggers has explored stochastic tagging.
While these taggers obtain high accuracy, linguistic information is captured indirectly …

Tagging English text with a probabilistic model

B Merialdo - Computational linguistics, 1994 - aclanthology.org
Experiments show that the best training is obtained by using as much tagged text as
possible. They also show that Maximum Likelihood training, the procedure that is routinely …

Introduction to the special issue on computational linguistics using large corpora

K Church, RL Mercer - Computational linguistics, 1993 - aclanthology.org
The 1990s have witnessed a resurgence of interest in 1950s-style empirical and statistical
methods of language analysis. Empiricism was at its peak in the 1950s, dominating a broad …

Decision lists for lexical ambiguity resolution: Application to accent restoration in Spanish and French

D Yarowsky - arXiv preprint cmp-lg/9406034, 1994 - arxiv.org
This paper presents a statistical decision procedure for lexical ambiguity resolution. The
algorithm exploits both local syntactic patterns and more distant collocational evidence …

Word-sense disambiguation using statistical methods

PF Brown, SA Della Pietra, VJ Della Pietra… - 29th Annual meeting …, 1991 - aclanthology.org
We describe a statistical technique for assigning senses to words. An instance of a word is
assigned a sense by asking a question about the context in which the word appears. The …