The EACL 2009 accepted paper list is up. Here's my reading list:
WEAKLY SUPERVISED PART-OF-SPEECH TAGGING FOR RESOURCE-SCARCE LANGUAGES
Kazi Saidul Hasan and Vincent Ng
USING CYCLES AND QUASI-CYCLES TO DISAMBIGUATE DICTIONARY GLOSSES
Roberto Navigli
SYNTACTIC AND SEMANTIC KERNELS FOR SHORT TEXT PAIR CATEGORIZATION
Alessandro Moschitti
SENTIMENT SUMMARIZATION: EVALUATING AND LEARNING USER PREFERENCES
Kevin Lerman, Sasha Blair-Goldensohn and Ryan McDonald
PERSON IDENTIFICATION FROM TEXT AND SPEECH GENRE SAMPLES
Jade Goldstein-Stewart, Ransom Winder and Roberta Sabin
OUTCLASSING WIKIPEDIA IN OPEN-DOMAIN INFORMATION EXTRACTION: WEAKLY-SUPERVISED ACQUISITION OF ATTRIBUTES OVER CONCEPTUAL HIERARCHIES
Marius Pasca
GROWING FINELY-DISCRIMINATING TAXONOMIES FROM SEEDS OF VARYING QUALITY AND SIZE
Tony Veale, Guofu Li and Yanfen Hao
GENERATING A NON-ENGLISH SUBJECTIVITY LEXICON: RELATIONS THAT MATTER
Valentin Jijkoun and Katja Hofmann
CONTEXTUAL PHRASE-LEVEL POLARITY ANALYSIS USING LEXICAL AFFECT SCORING AND SYNTACTIC N-GRAMS
Apoorv Agarwal, Fadi Biadsy and Kathleen McKeown
COMPANY-ORIENTED EXTRACTIVE SUMMARIZATION OF FINANCIAL NEWS
Katja Filippova, Mihai Surdeanu, Massimiliano Ciaramita and Hugo Zaragoza
ANALYSING WIKIPEDIA AND GOLD-STANDARD CORPORA FOR NER TRAINING
Joel Nothman, Tara Murphy and James R. Curran
Friday, December 19, 2008
EACL Reading
- Delip Rao at 11:45 AM 0 comments
Tuesday, December 2, 2008
And we're back ...
Some time back I wrote about Wordle, which visualizes textual information using frequency counts. Change.gov, the Obama transition team's website, uses it on the comments it received about the health care system. This is very interesting, but I think Wordle should display the top 100 collocations instead of the top 100 words. But oh, we also learnt at the last ACL how to learn collocation information from unigram frequencies.
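To make the words-versus-collocations point concrete, here is a minimal sketch. It is not how Wordle works internally; the function names (`tokenize`, `top_words`, `top_collocations`) are made up for illustration, and ranking adjacent word pairs by pointwise mutual information (PMI) is just one common choice of collocation score that I am assuming here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def top_words(tokens, k=100):
    """Return the k most frequent tokens with their counts."""
    return Counter(tokens).most_common(k)


def top_collocations(tokens, k=100, min_count=2):
    """Rank adjacent word pairs by PMI (an assumed scoring choice).

    Pairs seen fewer than min_count times are dropped, since PMI
    overrates rare pairs.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    scored = []
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        scored.append(((w1, w2), math.log2(p_pair / (p_w1 * p_w2))))
    # Highest PMI first.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    sample = "Health care reform, health care costs: the the the plan"
    toks = tokenize(sample)
    print(top_words(toks, 3))
    print(top_collocations(toks, 3))
```

On this toy input the most frequent single word is a function word, while the top-scoring pair is a content phrase, which is the reason a collocation cloud might be more informative than a word cloud.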
- Delip Rao at 4:08 PM 0 comments
Thursday, July 17, 2008
Too many cooks?
Computational Linguistics is becoming like Science or Nature. For instance, see this paper in the current issue: (In this case, the broth wasn't spoiled ;-)
Guess which paper in the ACL Anthology has the largest number of authors?
- Delip Rao at 5:10 AM 0 comments
Tuesday, July 8, 2008
To theory or not to theory
I stumbled upon this paper "Reflections after Refereeing Papers for NIPS" by Leo Breiman that gives some really candid insights into theory papers. (Unfortunately, I could not find a soft copy to share, except this link.) Some noteworthy observations:
"No theorems" implies "No theory"
"... more than 99% of the published papers are useless exercises."
"Mathematical theory is not critical to development of machine learning."
"Our fields would be better off with far fewer theorems, less emphasis on faddish stuff, and much more into scientific inquiry and engineering."
I really liked this article, especially coming from someone who has been working in theory all his life, but I would still rather read papers giving theoretical insight, however useless, than pages and pages of feature engineering & experimentation using classifier X on problem Y -- the current trend at ACL.
- Delip Rao at 10:55 AM 1 comments
Principal Components: machine learning, ML, NIPS, theory
Monday, July 7, 2008
A quick scan at ACL
Mendicant Bug writes about a new tag-cloud service called Wordle. Here is a look at this year's ACL. It gives a clear idea of what is going on! A larger image is available here.
- Delip Rao at 6:49 AM 0 comments
Sunday, May 11, 2008
Powerset Natural Language Search
Powerset, a company we only remember seeing as a conference sponsor, now actually has something working. After receiving an email from them, I tried out several queries. At best, it seems to answer most wh-questions and certain whole-part relations.
Try out the same query on Google.
- Delip Rao at 10:43 PM 1 comments
Principal Components: Information Retrieval, IR, NLP, search, semantic web
Thursday, April 3, 2008
Writing style
The sweetest thing ever written in a paper: "The reader who is unfamiliar with this field or who has allowed his or her facility with some of its concepts to fall into disrepair may profit from a brief perusal of Feller (1950) and Gallagher (1968)."
- Brown et al., "Class-Based n-gram Models of Natural Language", Computational Linguistics, 1992
- Delip Rao at 8:35 PM 1 comments
Principal Components: writing
Friday, March 28, 2008
Searching ACL anthology
If you look things up in the ACL Anthology regularly, my friend Markus has a nice Firefox search plugin for that. You can get that and others from this page.
- Delip Rao at 11:54 AM 0 comments
Thursday, March 27, 2008
ACL accepted papers
Hal posted a while back about the ACL accepted papers, and I only got around to reading it now -- I've been living under a rock for some time. You can get a printer-friendly version here. I know, my paper did not make it to that list :(
New additions to my reading list:
Distributional Identification of Non-Referential Pronouns
Shane Bergsma, Dekang Lin and Randy Goebel
An Unsupervised Approach to Biography Production using Wikipedia
Fadi Biadsy, Julia Hirschberg and Elena Filatova
Resolving Personal Names in Email Using Context Expansion
Tamer Elsayed, Douglas Oard and Galileo Namata
Mining Wiki Resources for Multilingual Named Entity Recognition
Alexander Richman and Patrick Schone
Inducing Gazetteers for Named Entity Recognition by Large-scale Clustering of Dependency Relations
Jun'ichi Kazama and Kentaro Torisawa
Name Translation in Statistical Machine Translation - Learning When to Transliterate
Ulf Hermjakob, Kevin Knight and Hal Daume
The Tradeoffs Between Open and Traditional Relation Extraction
Michele Banko and Oren Etzioni
(Longest paper title)
Unsupervised Discovery of Generic Relationships Using Pattern Clusters and its Evaluation by Automatically Generated SAT Analogy Questions
Dmitry Davidov and Ari Rappoport
Finding Contradictions in Text
Marie-Catherine de Marneffe, Anna Rafferty and Christopher Manning
Extracting Question-Context-Answer Triples from Online Forums
Shilin Ding, Gao Cong, Chin-Yew Lin and Xiaoyan Zhu
EM Can Find Pretty Good HMM POS-Taggers (When Given a Good Start)
Yoav Goldberg, Meni Adler and Michael Elhadad
Extraction of Entailed Semantic Relations Through Syntax-based Comma Resolution
Vivek Srikumar, Roi Reichart, Mark Sammons, Ari Rappoport and Dan Roth
Learning Bigrams from Unigrams
Xiaojin Zhu, Andrew Goldberg, Michael Rabbat and Robert Nowak
Evaluating Roget's Thesauri
Alistair Kennedy and Stan Szpakowicz
Randomized Language Models via Perfect Hash Functions
David Talbot and Thorsten Brants
Solving Relational Similarity Problems Using the Web as a Corpus
Preslav Nakov and Marti Hearst
- Delip Rao at 9:26 PM 0 comments
Sunday, February 24, 2008
What do you do?
As a grad student working on NLP, how do you explain what you are working on to friends and family? I inevitably end up referring to the Google search engine even though what I do is quite far from IR. Actually, that's not true. These days IR seems to consume everything, but that's another story.
This reminds me of a funny conversation at CLSP recently:
Sanjeev is telling us about an incident in which the concerned parent of a young child with a speech disability asked him for his opinion. Apparently, she was confused by the "Language and Speech Processing" in CLSP.
Keith butts in: "Run a few more iterations of EM and he'll be fine."
- Delip Rao at 5:50 PM 1 comments
Principal Components: Geek Humor, NLP
Thursday, February 14, 2008
A song on parsing
We all know Jason's love for parsing from his work, but it takes a different level of dedication to write a Valentine's Day song about parsing.
As Jason says, "Parsers just want to be appreciated, like everyone else."
- Delip Rao at 1:23 AM 0 comments
Principal Components: Geek Humor, NLP, Parsing