I am co-organizing a shared task on syntactic parsing and you should
consider participating:
Shared Task on Parsing the Web.
Video from a talk on
"Fast, Accurate and Robust Multilingual Syntactic Analysis"
from early 2012.
@InProceedings{rush-petrov:2012:NAACL,
author = {Alexander M. Rush and Slav Petrov},
title = {Vine Pruning for Efficient Multi-Pass Dependency Parsing},
booktitle = {NAACL},
month = {June},
year = {2012},
}
Coarse-to-fine inference has been shown to be a robust approximate method for
improving the efficiency of structured prediction models while preserving their
accuracy. We propose a multi-pass coarse-to-fine architecture for dependency
parsing using linear-time vine pruning and structured prediction cascades.
Our first-, second-, and third-order models achieve accuracies comparable to
those of their unpruned counterparts, while exploring only a fraction of the
search space. We observe speed-ups of up to two orders of magnitude compared
to exhaustive search. Our pruned third-order model is twice as fast as an
unpruned first-order model and also compares favorably to a state-of-the-art
transition-based parser for multiple languages.
@InProceedings{petrov-das-mcdonald:2012:LREC,
author = {Petrov, Slav and Das, Dipanjan and McDonald, Ryan},
title = {A Universal Part-of-Speech Tagset},
booktitle = {LREC},
month = {May},
year = {2012},
}
To facilitate future research in unsupervised induction of syntactic
structure and to standardize best practices, we propose a tagset that
consists of twelve universal part-of-speech categories. In addition
to the tagset, we develop a mapping from 25 different treebank
tagsets to this universal set. As a result, when combined with the
original treebank data, this universal tagset and mapping produce a
dataset consisting of common parts-of-speech for 22 different
languages. We highlight the use of this resource via two experiments,
including one that reports competitive accuracies for unsupervised
grammar induction without gold standard part-of-speech tags.
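As a minimal sketch of how such a mapping can be applied, the snippet below collapses a handful of Penn Treebank tags into the twelve universal categories. The dictionary is a small illustrative subset, not the released mapping files, and the helper name `to_universal` and the fallback to `X` for unlisted tags are assumptions made for this example.

```python
# The twelve universal part-of-speech categories.
UNIVERSAL_TAGS = {"NOUN", "VERB", "ADJ", "ADV", "PRON", "DET",
                  "ADP", "NUM", "CONJ", "PRT", ".", "X"}

# Illustrative subset of a Penn Treebank -> universal mapping (assumed here
# for demonstration; a full mapping covers every tag of each treebank).
PTB_TO_UNIVERSAL = {
    "NN": "NOUN", "NNS": "NOUN", "NNP": "NOUN",
    "VB": "VERB", "VBD": "VERB", "VBZ": "VERB",
    "JJ": "ADJ", "RB": "ADV", "PRP": "PRON", "DT": "DET",
    "IN": "ADP", "CD": "NUM", "CC": "CONJ", "RP": "PRT",
    ".": ".", ",": ".",
}

def to_universal(tagged_sentence, mapping=PTB_TO_UNIVERSAL):
    """Replace each fine-grained tag by its universal category.

    Tags missing from the mapping fall back to the catch-all category X
    (a choice made for this sketch).
    """
    return [(word, mapping.get(tag, "X")) for word, tag in tagged_sentence]

sentence = [("The", "DT"), ("parser", "NN"), ("runs", "VBZ"),
            ("quickly", "RB"), (".", ".")]
print(to_universal(sentence))
```

Keeping the mapping as plain data, separate from the code that applies it, means one such table per treebank is enough to bring all of them into the same tag space.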
@InProceedings{hall-EtAl:2011:NIPS-WKSHP,
author = {Keith Hall and Ryan McDonald and Slav Petrov},
title = {Training Structured Prediction Models with Extrinsic Loss Functions},
booktitle = {Domain Adaptation Workshop at NIPS},
month = {October},
year = {2011},
}
We present an online learning algorithm for training structured
prediction models with extrinsic loss functions. This allows us
to extend a standard supervised learning objective with additional
loss functions, based on either intrinsic or task-specific
extrinsic measures of quality. We present experiments with
sequence models on part-of-speech tagging and named entity
recognition tasks, and with syntactic parsers on dependency
parsing and machine translation reordering tasks.
@InProceedings{yi-EtAl:2011:IWPT,
author = {Yi, Youngmin and Lai, Chao-Yue and Petrov, Slav and Keutzer, Kurt},
title = {Efficient Parallel CKY Parsing on GPUs},
booktitle = {Proceedings of the 2011 Conference on Parsing Technologies},
month = {October},
year = {2011},
address = {Dublin, Ireland},
}
Low-latency solutions for syntactic parsing are needed if parsing is to
become an integral part of user-facing natural language applications.
Unfortunately, most state-of-the-art constituency parsers employ large
probabilistic context-free grammars for disambiguation, which renders
them impractical for real-time use. Meanwhile, Graphics Processor Units
(GPUs) have become widely available, offering the opportunity to alleviate
this bottleneck by exploiting the fine-grained data parallelism found in
the CKY algorithm. In this paper, we explore the design space of
parallelizing the dynamic programming computations carried out by the CKY
algorithm. We use the Compute Unified Device Architecture (CUDA)
programming model to reimplement a state-of-the-art parser, and compare
its performance on two recent GPUs with different architectural features.
Our best results show a 26-fold speedup compared to a sequential C
implementation.
@InProceedings{mcdonald-petrov-hall:2011:EMNLP,
author = {McDonald, Ryan and Petrov, Slav and Hall, Keith},
title = {Multi-Source Transfer of Delexicalized Dependency Parsers},
booktitle = {Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing},
month = {July},
year = {2011},
address = {Edinburgh, Scotland, UK.},
publisher = {Association for Computational Linguistics},
pages = {62--72},
url = {http://www.aclweb.org/anthology/D11-1006}
}
We present a simple method for transferring dependency parsers from
source languages with labeled training data to target languages without
labeled training data. We first demonstrate that delexicalized parsers
can be directly transferred between languages, producing significantly
higher accuracies than unsupervised parsers. We then use a constraint-driven
learning algorithm where constraints are drawn from parallel
corpora to project the final parser. Unlike previous work on projecting
syntactic resources, we show that simple methods for introducing multiple
source languages can significantly improve the overall quality of the
resulting parsers. The projected parsers from our system result in
state-of-the-art performance when compared to previously studied
unsupervised and projected parsing systems across eight different
languages.
@InProceedings{katzbrown-EtAl:2011:EMNLP,
author = {Katz-Brown, Jason and Petrov, Slav and McDonald, Ryan and Och, Franz and Talbot, David and Ichikawa, Hiroshi and Seno, Masakazu and Kazawa, Hideto},
title = {Training a Parser for Machine Translation Reordering},
booktitle = {Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing},
month = {July},
year = {2011},
address = {Edinburgh, Scotland, UK.},
publisher = {Association for Computational Linguistics},
pages = {183--192},
url = {http://www.aclweb.org/anthology/D11-1017}
}
We propose a simple training regime that can improve the extrinsic
performance of a parser, given only a corpus of sentences and a way
to automatically evaluate the extrinsic quality of a candidate parse.
We apply our method to train parsers that excel when used as part of
a reordering component in a statistical machine translation system.
We use a corpus of weakly-labeled reference reorderings to guide
parser training. Our best parsers contribute significant improvements
in subjective translation quality while their intrinsic attachment
scores typically regress.
@InProceedings{das-petrov:2011:ACL-HLT2011,
author = {Das, Dipanjan and Petrov, Slav},
title = {Unsupervised Part-of-Speech Tagging with Bilingual Graph-Based Projections},
booktitle = {Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies},
month = {June},
year = {2011},
address = {Portland, Oregon, USA},
publisher = {Association for Computational Linguistics},
pages = {600--609},
url = {http://www.aclweb.org/anthology/P11-1061}
}
We describe a novel approach for inducing unsupervised part-of-speech
taggers for languages that have no labeled training data, but have
translated text in a resource-rich language. Our method does not
assume any knowledge about the target language (in particular no
tagging dictionary is assumed), making it applicable to a wide array
of resource-poor languages. We use graph-based label propagation for
cross-lingual knowledge transfer and use the projected labels as
features in an unsupervised model (Berg-Kirkpatrick et al. 2010).
Across eight European languages, our approach results in an average
absolute improvement of 10.4% over a state-of-the-art baseline, and
16.7% over vanilla hidden Markov models induced with the Expectation
Maximization algorithm.
@InProceedings{subramanya-petrov-pereira:2010:EMNLP,
author = {Subramanya, Amarnag and Petrov, Slav and Pereira, Fernando},
title = {Efficient Graph-Based Semi-Supervised Learning of Structured Tagging Models},
booktitle = {Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing},
month = {October},
year = {2010},
address = {Cambridge, MA},
publisher = {Association for Computational Linguistics},
pages = {167--176},
url = {http://www.aclweb.org/anthology/D10-1017}
}
We describe a new scalable algorithm for semi-supervised training of
conditional random fields (CRF) and its application to
part-of-speech (POS) tagging. The algorithm uses a similarity graph
to encourage similar n-grams to have similar POS tags. We
demonstrate the efficacy of our approach on a domain adaptation
task, where we assume that we have access to large amounts of
unlabeled data from the target domain, but no additional labeled
data. The similarity graph is used during training to smooth the state
posteriors on the target domain. Standard inference can be used at test
time. Our approach is able to scale to very large problems and yields
significantly improved target domain accuracy.
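The idea of smoothing posteriors over a similarity graph can be illustrated with a deliberately simplified neighbor-averaging scheme. This is a sketch under stated assumptions, not the training objective of the paper: the function name `smooth_posteriors`, the mixing weight `mix`, the iteration count, and the toy graph are all hypothetical.

```python
def smooth_posteriors(posteriors, graph, mix=0.5, iterations=10):
    """Pull each node's tag distribution toward the weighted average of its neighbors.

    posteriors: dict node -> dict tag -> probability (each summing to 1).
    graph: dict node -> dict neighbor -> non-negative similarity weight.
    """
    tags = sorted({t for dist in posteriors.values() for t in dist})
    current = {node: dict(dist) for node, dist in posteriors.items()}
    for _ in range(iterations):
        updated = {}
        for node, dist in current.items():
            neighbors = graph.get(node, {})
            total = sum(neighbors.values())
            new_dist = {}
            for tag in tags:
                own = dist.get(tag, 0.0)
                if total > 0:
                    avg = sum(w * current[m].get(tag, 0.0)
                              for m, w in neighbors.items()) / total
                else:
                    avg = own  # isolated node: nothing to smooth with
                new_dist[tag] = (1 - mix) * own + mix * avg
            updated[node] = new_dist
        current = updated
    return current

# Toy example: an ambiguous n-gram connected to two n-grams that lean NOUN.
posteriors = {
    "a read of": {"NOUN": 0.5, "VERB": 0.5},
    "a copy of": {"NOUN": 0.9, "VERB": 0.1},
    "a view of": {"NOUN": 0.8, "VERB": 0.2},
}
graph = {
    "a read of": {"a copy of": 1.0, "a view of": 1.0},
    "a copy of": {"a read of": 1.0},
    "a view of": {"a read of": 1.0},
}
smoothed = smooth_posteriors(posteriors, graph)
print(smoothed["a read of"])
```

Because every update is a convex combination of distributions, each node's posterior stays normalized, and the ambiguous n-gram drifts toward the tag preferred by its neighbors.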
@InProceedings{petrov-EtAl:2010:EMNLP,
author = {Petrov, Slav and Chang, Pi-Chuan and Ringgaard, Michael and Alshawi, Hiyan},
title = {Uptraining for Accurate Deterministic Question Parsing},
booktitle = {Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing},
month = {October},
year = {2010},
address = {Cambridge, MA},
publisher = {Association for Computational Linguistics},
pages = {705--713},
url = {http://www.aclweb.org/anthology/D10-1069}
}
It is well known that parsing accuracies drop significantly on out-of-domain
data. What is less known is that some parsers suffer more from domain
shifts than others. We show that dependency parsers have more difficulty
parsing questions than constituency parsers. In particular, deterministic
shift-reduce dependency parsers, which are of highest interest for
practical applications because of their linear running time, drop to 60%
labeled accuracy on a question test set. We propose an *uptraining*
procedure in which a deterministic parser is trained on the output of a
more accurate, but slower, latent variable constituency parser (converted
to dependencies). Uptraining with 100K unlabeled questions achieves
results comparable to having 2K labeled questions for training. With 100K
unlabeled and 2K labeled questions, uptraining is able to improve parsing
accuracy to 84%, closing the gap between in-domain and out-of-domain
performance.
@InProceedings{huang-harper-petrov:2010:EMNLP,
author = {Huang, Zhongqiang and Harper, Mary and Petrov, Slav},
title = {Self-Training with Products of Latent Variable Grammars},
booktitle = {Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing},
month = {October},
year = {2010},
address = {Cambridge, MA},
publisher = {Association for Computational Linguistics},
pages = {12--22},
url = {http://www.aclweb.org/anthology/D10-1002}
}
We study self-training with products of latent variable grammars in
this paper. We show that increasing the quality of the
automatically parsed data used for self-training gives higher
accuracy self-trained grammars. Our generative self-trained
grammars reach F scores of 91.6 on the WSJ test set and surpass even
discriminative reranking systems without self-training.
Additionally, we show that multiple self-trained grammars can be
combined in a product model to achieve even higher accuracy. The
product model is most effective when the individual underlying
grammars are most diverse. Combining multiple grammars that were
self-trained on disjoint sets of unlabeled data results in a final
test accuracy of 92.5% on the WSJ test set and 89.6% on our
Broadcast News test set.
@InProceedings{burkett-EtAl:2010:CONLL,
author = {Burkett, David and Petrov, Slav and Blitzer, John and Klein, Dan},
title = {Learning Better Monolingual Models with Unannotated Bilingual Text},
booktitle = {Proceedings of the Fourteenth Conference on Computational Natural Language Learning},
month = {July},
year = {2010},
address = {Uppsala, Sweden},
publisher = {Association for Computational Linguistics},
pages = {46--54},
url = {http://www.aclweb.org/anthology/W10-2906}
}
This work shows how to improve state-of-the-art monolingual natural
language processing models using unannotated bilingual text. We build
a multiview learning objective that enforces agreement between
monolingual and bilingual models. In our method the first,
monolingual view consists of supervised predictors learned separately
for each language. The second, bilingual view consists of log-linear
predictors learned over both languages on bilingual text. Our
training procedure estimates the parameters of the bilingual model
using the output of the monolingual model, and we show how to combine
the two models to account for dependence between views. For the task
of named entity recognition, using bilingual predictors increases
F1 by 16.1% absolute over a supervised monolingual model, and
retraining on bilingual predictions increases *monolingual* model
F1 by 14.6%. For syntactic parsing, our bilingual predictor
increases F1 by 2.1% absolute, and retraining a monolingual model
on its output gives an improvement of 2.0%.
@InProceedings{petrov:2010:NAACLHLT,
author = {Petrov, Slav},
title = {Products of Random Latent Variable Grammars},
booktitle = {Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics},
month = {June},
year = {2010},
address = {Los Angeles, California},
publisher = {Association for Computational Linguistics},
pages = {19--27},
url = {http://www.aclweb.org/anthology/N10-1003}
}
We show that the automatically induced latent variable grammars of
Petrov et al. 2006 vary widely in their underlying representations,
depending on their EM initialization point. We use this to our
advantage, combining multiple automatically learned grammars into
an unweighted product model, which gives significantly improved
performance over state-of-the-art individual grammars. In our model,
the probability of a constituent is estimated as a product of
posteriors obtained from multiple grammars that differ only in the
random seed used for initialization, without any learning or tuning
of combination weights. Despite its simplicity, a product of eight
automatically learned grammars improves parsing accuracy from 90.2%
to 91.8% on English, and from 80.3% to 84.5% on German.
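The combination step can be sketched in a few lines. The snippet below only illustrates the unweighted product of per-grammar posteriors for a single constituent; in a real parser this happens inside a dynamic program over trees. The function names and the example posteriors are hypothetical.

```python
import math

def product_posterior(posteriors):
    """Unweighted product of the posteriors that each grammar assigns to one constituent.

    Computed in log space to avoid underflow; no combination weights are learned.
    """
    if any(p == 0.0 for p in posteriors):
        return 0.0
    return math.exp(sum(math.log(p) for p in posteriors))

def best_label(candidates):
    """Return the label whose product of posteriors is largest.

    candidates: dict label -> list of posteriors, one per grammar.
    """
    return max(candidates, key=lambda label: product_posterior(candidates[label]))

# Hypothetical posteriors from three grammars for two labels of the same span.
candidates = {"NP": [0.9, 0.6, 0.8], "VP": [0.95, 0.2, 0.7]}
print(best_label(candidates))
```

A product rewards labels that all grammars find plausible: a single grammar with a low posterior pulls the score down sharply, even when another grammar is very confident.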
@incollection{bouchard-petrov-klein:2009:NIPS,
title = {Randomized Pruning: Efficiently Calculating Expectations in Large Dynamic Programs},
author = {Alexandre Bouchard-C\^{o}t\'{e} and Slav Petrov and Dan Klein},
booktitle = {Advances in Neural Information Processing Systems 22},
pages = {144--152},
year = {2009},
url = {http://www.petrovi.de/data/nips09.pdf}
}
Pruning can massively accelerate the computation of feature expectations
in large models. However, any single pruning mask will introduce bias.
We present a novel approach which employs a randomized sequence of
pruning masks. Formally, we apply auxiliary variable MCMC sampling to
generate this sequence of masks, thereby gaining theoretical guarantees
about convergence. Because each mask is generally able to skip large
portions of an underlying dynamic program, our approach is particularly
compelling for high-degree algorithms. Empirically, we demonstrate our
method on bilingual parsing, showing decreasing bias as more masks are
incorporated, and outperforming fixed tic-tac-toe pruning.
@incollection{petrov:2009:NIPS-WKSHP,
title = {Generative and Discriminative Latent Variable Grammars},
author = {Slav Petrov},
booktitle = {The Generative and Discriminative Learning Interface Workshop at NIPS 22},
year = {2009},
url = {http://www.petrovi.de/data/nips09w.pdf}
}
Latent variable grammars take an observed (coarse) treebank and induce
more fine-grained grammar categories that are better suited for
modeling the syntax of natural languages. Estimation can be done in a
generative or a discriminative framework, and results in the best
published parsing accuracies over a wide range of syntactically
divergent languages and domains. In this paper we highlight the
commonalities and the differences between the two learning paradigms.
@phdThesis{petrov:PhD,
author = {Petrov, Slav},
title = {Coarse-to-Fine Natural Language Processing},
school = {University of California at Berkeley},
address = {Berkeley, CA, USA},
year = {2009},
url = {http://www.petrovi.de/data/dissertation.pdf}
}
State-of-the-art natural language processing models are anything but
compact. Syntactic parsers have huge grammars, machine translation systems
have huge transfer tables, and so on across a range of tasks. With such
complexity come two challenges. First, how can we learn highly complex
models? Second, how can we efficiently infer optimal structures within
them?
Hierarchical coarse-to-fine methods address both questions.
Coarse-to-fine approaches exploit a sequence of models which introduce
complexity gradually. At the top of the sequence is a trivial model in
which learning and inference are both cheap. Each subsequent model
refines the previous one, until a final, full-complexity model is
reached. Because each refinement introduces only limited complexity,
both learning and inference can be done in an incremental fashion. In
this dissertation, we describe several coarse-to-fine systems.
In the domain of syntactic parsing, complexity is in the grammar. We
present a latent variable approach which begins with an X-bar grammar
and learns to iteratively refine grammar categories. For example, noun
phrases might be split into subcategories for subjects and objects,
singular and plural, and so on. This splitting process admits an
efficient incremental inference scheme which reduces parsing times by
orders of magnitude. Furthermore, it produces the best parsing
accuracies across an array of languages, in a fully language-general
fashion.
In the domain of acoustic modeling for speech recognition, complexity
is needed to model the rich phonetic properties of natural languages.
Starting from a mono-phone model, we learn increasingly refined models
that capture phone internal structures, as well as context-dependent
variations in an automatic way. Our approach reduces error rates
compared to other baseline approaches, while streamlining the learning
procedure.
In the domain of machine translation, complexity arises because there
are too many target language word types. To manage this complexity, we
translate into target language clusterings of increasing vocabulary
size. This approach gives dramatic speed-ups while additionally increasing
final translation quality.
@InProceedings{petrov-haghighi-klein:2008:EMNLP,
author = {Petrov, Slav and Haghighi, Aria and Klein, Dan},
title = {Coarse-to-Fine Syntactic Machine Translation using Language Projections},
booktitle = {Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing},
month = {October},
year = {2008},
address = {Honolulu, Hawaii},
publisher = {Association for Computational Linguistics},
pages = {108--116},
url = {http://www.aclweb.org/anthology/D08-1012}
}
The intersection of tree transducer-based translation models
with n-gram language models results in huge dynamic
programs for machine translation decoding. We propose a
multipass, coarse-to-fine approach in which the language
model complexity is incrementally introduced. In contrast
to previous *order-based* bigram-to-trigram approaches,
we focus on *encoding-based* methods, which use a
clustered encoding of the target language. Across various
hierarchical encoding schemes and for multiple language
pairs, we show speed-ups of up to 50 times over single-pass
decoding while improving BLEU score. Moreover, our entire
decoding cascade for trigram language models is faster than
the corresponding bigram pass alone of a bigram-to-trigram
decoder.
@InProceedings{petrov-klein:2008:EMNLP,
author = {Petrov, Slav and Klein, Dan},
title = {Sparse Multi-Scale Grammars for Discriminative Latent Variable Parsing},
booktitle = {Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing},
month = {October},
year = {2008},
address = {Honolulu, Hawaii},
publisher = {Association for Computational Linguistics},
pages = {867--876},
url = {http://www.aclweb.org/anthology/D08-1091}
}
We present a discriminative, latent variable approach to
syntactic parsing in which rules exist at multiple scales
of refinement. The model is formally a latent variable
CRF grammar over trees, learned by iteratively splitting
grammar productions (not categories). Different regions
of the grammar are refined to different degrees, yielding
grammars which are three orders of magnitude smaller
than the single-scale baseline and 20 times smaller than
the split-and-merge grammars of Petrov et al. 2006.
In addition, our discriminative approach integrally admits
features beyond local tree configurations. We present a
multi-scale training method along with an efficient
CKY-style dynamic program. On a variety of domains
and languages, this method produces the best published
parsing accuracies with the smallest reported grammars.
@inproceedings{favre-etal:2008:SLT,
author = {Favre, Benoit and Hakkani-Tur, Dilek and Petrov, Slav and Klein, Dan},
title = {{Efficient Sentence Segmentation Using Syntactic Features}},
booktitle = {Spoken Language Technologies (SLT)},
year = {2008},
address = {Goa, India},
url = {http://petrovi.de/data/slt08.pdf}
}
To enable downstream language processing, automatic speech
recognition output must be segmented into its individual sentences.
Previous sentence segmentation systems have typically been very
local, using low-level prosodic and lexical features to independently
decide whether or not to segment at each word boundary position.
In this work, we leverage global syntactic information from a syntactic
parser, which is better able to capture long-distance dependencies.
While some previous work has included syntactic features,
ours is the first to do so in a tractable, lattice-based way, which is
crucial for scaling up to long-sentence contexts. Specifically, an initial
hypothesis lattice is constructed using local features. Candidate
sentences are then assigned syntactic language model scores. These
global syntactic scores are combined with local low-level scores in
a log-linear model. The resulting system significantly outperforms
the most popular long-span model for sentence segmentation (the
hidden event language model) on both reference text and automatic
speech recognizer output from news broadcasts.
@InProceedings{petrov-klein:2008:PaGe,
author = {Petrov, Slav and Klein, Dan},
title = {Parsing {German} with Latent Variable Grammars},
booktitle = {Proceedings of the Workshop on Parsing German at ACL '08},
month = {June},
year = {2008},
address = {Columbus, Ohio},
publisher = {Association for Computational Linguistics},
pages = {33--39},
url = {http://www.aclweb.org/anthology/W/W08/W08-1005}
}
We describe experiments on learning latent variable
grammars for various German treebanks, using a
language-agnostic statistical approach. In our method,
a minimal initial grammar is hierarchically refined
using an adaptive split-and-merge EM procedure,
giving compact, accurate grammars. The learning
procedure directly maximizes the likelihood of the
training treebank, without the use of any language
specific or linguistically constrained features.
Nonetheless, the resulting grammars encode many
linguistically interpretable patterns and give the best
published parsing accuracies on three German
treebanks.
@InProceedings{petrov-klein:2008:NIPS2008,
author = {Slav Petrov and Dan Klein},
title = {Discriminative Log-Linear Grammars with Latent Variables},
booktitle = {Advances in Neural Information Processing Systems 20 (NIPS)},
editor = {J.C. Platt and D. Koller and Y. Singer and S. Roweis},
publisher = {MIT Press},
address = {Cambridge, MA},
pages = {1153--1160},
year = {2008},
url = {http://books.nips.cc/papers/files/nips20/NIPS2007_0630.pdf}
}
We demonstrate that log-linear grammars with latent variables can be
practically trained using discriminative methods. Central to
efficient discriminative training is a hierarchical pruning procedure
which allows feature expectations to be efficiently approximated
in a gradient-based procedure. We compare L1 and L2 regularization
and show that L1 regularization is superior, requiring fewer iterations
to converge, and yielding sparser solutions. On full-scale treebank
parsing experiments, the discriminative latent models outperform both
the comparable generative latent models as well as the discriminative
non-latent baselines.
@InProceedings{petrov-pauls-klein:2007:EMNLP-CoNLL2007,
author = {Petrov, Slav and Pauls, Adam and Klein, Dan},
title = {Learning Structured Models for Phone Recognition},
booktitle = {Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)},
pages = {897--905},
year = {2007},
url = {http://www.aclweb.org/anthology/D/D07/D07-1094}
}
We present a maximally streamlined approach to learning
HMM-based acoustic models for automatic speech recognition.
In our approach, an initial monophone HMM is iteratively
refined using a split-merge EM procedure which makes no
assumptions about subphone structure or context-dependent
structure, and which uses only a single Gaussian per HMM
state. Despite the much simplified training process, our
acoustic model achieves state-of-the-art results on phone
classification (where it outperforms almost all other methods) and
competitive performance on phone recognition (where it
outperforms standard CD triphone / subphone / GMM approaches).
We also present an analysis of what is and is not learned by
our system.
@InProceedings{liang-EtAl:2007:EMNLP-CoNLL2007,
author = {Liang, Percy and Petrov, Slav and Jordan, Michael and Klein, Dan},
title = {The Infinite {PCFG} Using Hierarchical {Dirichlet} Processes},
booktitle = {Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)},
pages = {688--697},
year = {2007},
url = {http://www.aclweb.org/anthology/D/D07/D07-1072}
}
We present a nonparametric Bayesian model
of tree structures based on the hierarchical
Dirichlet process (HDP). Our HDP-PCFG
model allows the complexity of the grammar
to grow as more training data is available.
In addition to presenting a fully Bayesian
model for the PCFG, we also develop an efficient
variational inference procedure. On
synthetic data, we recover the correct grammar
without having to specify its complexity
in advance. We also show that our techniques
can be applied to full-scale parsing
applications by demonstrating their effectiveness
in learning state-split grammars.
@inproceedings{Petrov-Klein-2007:AAAI,
author = {Slav Petrov and Dan Klein},
title = {Learning and Inference for Hierarchically Split {PCFG}s},
booktitle = {AAAI 2007 (Nectar Track)},
year = {2007},
url = {http://www.petrovi.de/data/aaai2007.pdf},
}
Treebank parsing can be seen as the search for an optimally
refined grammar consistent with a coarse training treebank.
We describe a method in which a minimal grammar is
hierarchically refined using EM to give accurate, compact
grammars. The resulting grammars are extremely compact
compared to other high-performance parsers, yet the parser gives
the best published accuracies on several languages, as well
as the best generative parsing numbers in English. In addition,
we give an associated coarse-to-fine inference scheme
which vastly improves inference time with no loss in test set
accuracy.
@InProceedings{petrov-klein:2007:main,
author = {Petrov, Slav and Klein, Dan},
title = {Improved Inference for Unlexicalized Parsing},
booktitle = {Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference},
month = {April},
year = {2007},
address = {Rochester, New York},
publisher = {Association for Computational Linguistics},
pages = {404--411},
url = {http://www.aclweb.org/anthology/N/N07/N07-1051}
}
We present several improvements to unlexicalized
parsing with hierarchically state-split PCFGs. First,
we present a novel coarse-to-fine method in which
a grammar's own hierarchical projections are used
for incremental pruning, including a method for efficiently
computing projections of a grammar without
a treebank. In our experiments, hierarchical
pruning greatly accelerates parsing with no loss in
empirical accuracy. Second, we compare various
inference procedures for state-split PCFGs from the
standpoint of risk minimization, paying particular
attention to their practical tradeoffs. Finally, we
present multilingual experiments which show that
parsing with hierarchical state-splitting is fast and
accurate in multiple languages and domains, even
without any language-specific tuning.
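The pruning step of a coarse-to-fine pass can be sketched as follows. This is a minimal illustration, not the parser's implementation: chart items are modeled as `(span, label)` pairs, and the threshold value, function names, and example numbers are assumptions.

```python
def prune(coarse_posteriors, threshold=1e-3):
    """Keep only the chart items whose coarse posterior reaches the threshold."""
    return {item for item, p in coarse_posteriors.items() if p >= threshold}

def fine_candidates(surviving, refinements):
    """Expand each surviving coarse item into its refined subcategories.

    refinements: dict coarse label -> list of refined labels (the projection, inverted).
    """
    return {(span, fine) for span, coarse in surviving for fine in refinements[coarse]}

# Hypothetical posteriors from the coarse pass for items (span, label).
coarse_posteriors = {
    ((0, 2), "NP"): 0.92,
    ((0, 2), "VP"): 0.0004,   # falls below the threshold and is pruned
    ((2, 5), "VP"): 0.85,
}
refinements = {"NP": ["NP-0", "NP-1"], "VP": ["VP-0", "VP-1"]}

surviving = prune(coarse_posteriors)
print(sorted(fine_candidates(surviving, refinements)))
```

The finer grammar then only has to score refinements of items that survived the coarser pass, which is where the savings come from; repeating the step across a hierarchy of projections gives the incremental scheme.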
@inproceedings{Petrov-EtAl:2006:TRECVID,
author = {Slav Petrov and Arlo Faria and Pascal Michaillat and Alexander Berg and Andreas Stolcke and Dan Klein and Jitendra Malik},
title = {Detecting Categories in News Video using Acoustic, Speech and Image Features},
booktitle = {Proceedings of (VIDEO) TREC (TrecVid 2006)},
year = {2006},
url = {http://www.petrovi.de/data/trecvid06.pdf},
}
This work describes systems for detecting semantic categories
present in news video. The multimedia data was processed in
three ways: the audio signal was converted to a sequence of
acoustic features, automatic speech recognition provided a
word-level transcription, and image features were computed for
selected frames of the video signal. Primary acoustic, speech,
and vision systems were trained to discriminate instances of
the categories. Higher-level systems exploited correlations
among the categories, incorporated sequential context, and
combined the joint evidence from the three information sources.
We present experimental results from the TREC video retrieval
evaluation.
@InProceedings{petrov-EtAl:2006:COLACL,
author = {Petrov, Slav and Barrett, Leon and Thibaux, Romain and Klein, Dan},
title = {Learning Accurate, Compact, and Interpretable Tree Annotation},
booktitle = {Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics},
month = {July},
year = {2006},
address = {Sydney, Australia},
publisher = {Association for Computational Linguistics},
pages = {433--440},
url = {http://www.aclweb.org/anthology/P/P06/P06-1055}
}
We present an automatic approach to tree annotation
in which basic nonterminal symbols are alternately
split and merged to maximize the likelihood
of a training treebank. Starting with a simple X-bar
grammar, we learn a new grammar whose nonterminals
are subsymbols of the original nonterminals.
In contrast with previous work, we are able
to split various terminals to different degrees, as appropriate
to the actual complexity in the data. Our
grammars automatically learn the kinds of linguistic
distinctions exhibited in previous work on manual
tree annotation. On the other hand, our grammars
are much more compact and substantially more accurate
than previous work on automatic annotation.
Despite its simplicity, our best grammar achieves
an F1 of 89.9% on the Penn Treebank, higher than
most fully lexicalized systems.
@InProceedings{petrov-barrett-klein:2006:CoNLL-X,
author = {Petrov, Slav and Barrett, Leon and Klein, Dan},
title = {Non-Local Modeling with a Mixture of {PCFG}s},
booktitle = {Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL-X)},
month = {June},
year = {2006},
address = {New York City},
publisher = {Association for Computational Linguistics},
pages = {14--20},
url = {http://www.aclweb.org/anthology/W/W06/W06-2903}
}
While most work on parsing with PCFGs
has focused on local correlations between
tree configurations, we attempt to model
non-local correlations using a finite mixture
of PCFGs. A mixture grammar fit
with the EM algorithm shows improvement
over a single PCFG, both in parsing
accuracy and in test data likelihood. We
argue that this improvement comes from
the learning of specialized grammars that
capture non-local correlations.
@inproceedings{Tomasi-Petrov-Sastry-2003:ICCV,
author = {Carlo Tomasi and Slav Petrov and Arvind Sastry},
title = {3{D} Tracking = {C}lassification + {I}nterpolation},
booktitle = {Proceedings of the Ninth IEEE International Conference on Computer Vision (ICCV)},
year = {2003},
url = {http://www.petrovi.de/data/iccv03.pdf},
}
Hand gestures are examples of fast and complex motions.
Computers fail to track these in fast video, but sleight of
hand fools humans as well: what happens too quickly we
just cannot see. We show a 3D tracker for these types of
motions that relies on the recognition of familiar configurations
in 2D images (classification), and fills the gaps
in-between (interpolation). We illustrate this idea with experiments
on hand motions similar to finger spelling. The
penalty for a recognition failure is often small: if two
configurations are confused, they are often similar to each
other, and the illusion works well enough, for instance, to
drive a graphics animation of the moving hand. We contribute
advances in both feature design and classifier training:
our image features are invariant to image scale, translation,
and rotation, and we propose a classification method
that combines VQPCA with discrimination trees.
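The "classification + interpolation" idea can be sketched with a toy example. The snippet assumes that a classifier has already labeled some frames with a known configuration and left the others as `None`, and it fills the gaps by linear interpolation. The linear scheme, the function name, and the joint-angle values are all assumptions for illustration.

```python
def interpolate_track(frame_labels, pose_library):
    """Assign a pose to every frame from sparse per-frame classifications.

    frame_labels: list holding a configuration label for recognized frames, None otherwise.
    pose_library: dict label -> tuple of joint angles (hypothetical values).
    """
    known = [(i, pose_library[label])
             for i, label in enumerate(frame_labels) if label is not None]
    if not known:
        raise ValueError("no frame was recognized")
    track = []
    for i in range(len(frame_labels)):
        before = [k for k in known if k[0] <= i]
        after = [k for k in known if k[0] >= i]
        if not before:        # before the first recognized frame: hold its pose
            track.append(after[0][1])
        elif not after:       # after the last recognized frame: hold its pose
            track.append(before[-1][1])
        else:
            (i0, p0), (i1, p1) = before[-1], after[0]
            t = 0.0 if i1 == i0 else (i - i0) / (i1 - i0)
            track.append(tuple(a + t * (b - a) for a, b in zip(p0, p1)))
    return track

# Two recognized configurations with one unrecognized frame in between.
pose_library = {"open": (0.0, 0.0), "fist": (90.0, 60.0)}
track = interpolate_track(["open", None, "fist"], pose_library)
print(track)
```

Recognized frames act as anchors, so a misclassification only affects the stretch between its two neighboring anchors.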
@mastersthesis{Petrov-Masters,
author = {Slav Petrov},
title = {Computer vision, sensor fusion, and behavior control for soccer playing robots},
school = {Freie Universitaet Berlin},
year = {2004},
url = {http://www.petrovi.de/data/slav_diplom_arbeit.pdf},
}
This Master's thesis describes parts of the control software
used by the soccer robots of the Free University of Berlin,
the so-called FU-Fighters. The FU-Fighters compete in the
Middle Size League of RoboCup and reached the semi-finals
during the 2004 RoboCup World Cup in Lisbon, Portugal. The
thesis covers several independent topics:
- Automatic White Balance: It is shown how to improve the
white balancing of an omni-directional camera by using a
reference color and a PID-controller.
- Ball Tracking: The reliable tracking of the ball is vital
in robot soccer. Therefore, a Kalman-filter-based system for
estimating the ball position and velocity in the presence
of occlusions is developed.
- Sensor Fusion: The robot perceives its environment through
several independent sensors (camera, odometer, etc.), which
have different delays. We propose a novel method for fusing
the sensor data and show our results through examples of
self-localization.
- Behavior Control: Finally we show how all these elements
can be incorporated into a goal keeping robot. We develop
simple behaviors that can be used in a layered architecture
and enable the robot to block most balls that are being shot
at the goal.
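The ball-tracking item above can be illustrated with a minimal sketch: a constant-velocity Kalman filter for one coordinate that simply keeps predicting while the ball is occluded. This is not the thesis implementation; the class name, the noise variances, and the simulated trajectory are assumptions.

```python
class BallTracker1D:
    """Constant-velocity Kalman filter for one coordinate of the ball."""

    def __init__(self, pos=0.0, vel=0.0, process_var=1e-3, meas_var=1e-2):
        self.pos, self.vel = pos, vel
        self.P = [[1.0, 0.0], [0.0, 1.0]]   # state covariance
        self.q, self.r = process_var, meas_var

    def predict(self, dt):
        # Propagate state and covariance with the constant-velocity motion model.
        self.pos += self.vel * dt
        (a, b), (c, d) = self.P
        self.P = [[a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
                  [c + dt * d, d + self.q]]

    def update(self, z):
        # During an occlusion there is no measurement: keep the prediction.
        if z is None:
            return
        (a, b), (c, d) = self.P
        s = a + self.r                # innovation variance
        k0, k1 = a / s, c / s         # Kalman gain for position and velocity
        y = z - self.pos              # innovation
        self.pos += k0 * y
        self.vel += k1 * y
        self.P = [[(1 - k0) * a, (1 - k0) * b],
                  [c - k1 * a, d - k1 * b]]

# Simulated ball moving at 1.0 unit/s, observed every 0.1 s, occluded for five frames.
tracker = BallTracker1D()
for step in range(1, 41):
    tracker.predict(0.1)
    occluded = 20 <= step < 25
    tracker.update(None if occluded else 0.1 * step)
print(round(tracker.pos, 2), round(tracker.vel, 2))
```

Skipping the update while the ball is hidden lets the covariance grow, so the first measurement after the occlusion receives a large gain and pulls the estimate back quickly.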
Other materials:
Office address:
Google Research
76 Ninth Ave, New York, NY 10011
Email: slav@petrovi.de
Slav Petrov - Слав Петров, May 2012