Friday, July 3, 2009
Dealing with large scale graphs
- Delip Rao at 1:01 PM 0 comments
Tuesday, March 31, 2009
Sentiment Analysis is AI-Hard
In a breezy article on sentiment analysis, Alex Wright quotes Bo Pang saying:
"We are dealing with sentiment that can be expressed in subtle ways."
This is so true of the examples I've encountered in my own work. My favorite is one I saw on iTunes recently.
While I commend Alex for writing an informative yet accessible article on the topic, I disagree with the article's opinion that sentiment analysis is a series of "filters". That is clearly a euphemism. Any working sentiment analysis system is actually an engineering feat, often consisting of a series of hacks held together with duct tape and glue code that handles special cases.
The article also seems to suggest that extracting factual information is somehow easier than extracting opinions. I invite them to participate here.
- Delip Rao at 3:04 PM 0 comments
Principal Components: "information extraction", "sentiment analysis", NLP, research
Saturday, February 28, 2009
On the way to Brewer's Art
- Delip Rao at 9:02 AM 0 comments
Principal Components: graduate students, Humor, NLP, research
Friday, December 19, 2008
EACL Reading
The EACL 2009 accepted paper list is up. Here's my reading list:
WEAKLY SUPERVISED PART-OF-SPEECH TAGGING FOR RESOURCE-SCARCE LANGUAGES
Kazi Saidul Hasan and Vincent Ng
USING CYCLES AND QUASI-CYCLES TO DISAMBIGUATE DICTIONARY GLOSSES
Roberto Navigli
SYNTACTIC AND SEMANTIC KERNELS FOR SHORT TEXT PAIR CATEGORIZATION
Alessandro Moschitti
SENTIMENT SUMMARIZATION: EVALUATING AND LEARNING USER PREFERENCES
Kevin Lerman, Sasha Blair-Goldensohn and Ryan McDonald
PERSON IDENTIFICATION FROM TEXT AND SPEECH GENRE SAMPLES
Jade Goldstein-Stewart, Ransom Winder and Roberta Sabin
OUTCLASSING WIKIPEDIA IN OPEN-DOMAIN INFORMATION EXTRACTION: WEAKLY-SUPERVISED ACQUISITION OF ATTRIBUTES OVER CONCEPTUAL HIERARCHIES
Marius Pasca
GROWING FINELY-DISCRIMINATING TAXONOMIES FROM SEEDS OF VARYING QUALITY AND SIZE
Tony Veale, Guofu Li and Yanfen Hao
GENERATING A NON-ENGLISH SUBJECTIVITY LEXICON: RELATIONS THAT MATTER
Valentin Jijkoun and Katja Hofmann
CONTEXTUAL PHRASE-LEVEL POLARITY ANALYSIS USING LEXICAL AFFECT SCORING AND SYNTACTIC N-GRAMS
Apoorv Agarwal, Fadi Biadsy and Kathleen Mckeown
COMPANY-ORIENTED EXTRACTIVE SUMMARIZATION OF FINANCIAL NEWS
Katja Filippova, Mihai Surdeanu, Massimiliano Ciaramita and Hugo Zaragoza
ANALYSING WIKIPEDIA AND GOLD-STANDARD CORPORA FOR NER TRAINING
Joel Nothman, Tara Murphy and James R. Curran
- Delip Rao at 11:45 AM 0 comments
Tuesday, December 2, 2008
And we're back ...
Some time back I wrote about Wordle, which visualizes textual information using frequency counts. Change.gov, the Obama transition team's website, uses it on the comments in response to their health care system. This is very interesting, but I think Wordle should display the top 100 collocations instead of the top 100 words. But oh, we also learnt at the last ACL how to learn collocation information from unigram frequencies.
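To make the words-versus-collocations distinction concrete, here is a minimal sketch in Python. It is not how Wordle works internally, and the function names (`tokenize`, `top_words`, `top_collocations`) are made up for illustration. It also assumes a very simple notion of collocation: adjacent word pairs ranked by pointwise mutual information (PMI), with a minimum count to keep rare pairs from dominating. Other association measures would do just as well.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def top_words(tokens, k=100):
    """Return the k most frequent single words (what a word cloud shows)."""
    return Counter(tokens).most_common(k)


def top_collocations(tokens, k=100, min_count=2):
    """Rank adjacent word pairs by pointwise mutual information (PMI).

    Assumption for this sketch: a "collocation" is just an adjacent pair.
    Pairs seen fewer than min_count times are skipped, since PMI tends
    to overrate rare pairs.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    scored = []
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        scored.append(((w1, w2), math.log2(p_pair / (p_w1 * p_w2))))
    # Highest PMI first; keep only the top k pairs.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]
```

Run on a set of comments, `top_words` would give what the word cloud displays today, while `top_collocations` would surface pairs such as "health care" as single units instead of splitting them into two unrelated words.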
- Delip Rao at 4:08 PM 0 comments
Thursday, July 17, 2008
Too many cooks?
Computational Linguistics is becoming like Science or Nature. For instance, see this paper in the current issue. (In this case, the broth wasn't spoiled ;-)
Guess which paper in the ACL Anthology has the largest number of authors?
- Delip Rao at 5:10 AM 0 comments
Tuesday, July 8, 2008
To theory or not to theory
I stumbled upon the paper "Reflections after Refereeing Papers for NIPS" by Leo Breiman, which gives some really candid insights into theory papers. (Unfortunately, I could not find a soft copy to share, except this link.) Some noteworthy observations:
"No theorems" implies "No theory"
"... more than 99% of the published papers are useless exercises."
"Mathematical theory is not critical to development of machine learning."
"Our fields would be better off with far fewer theorems, less emphasis on faddish stuff, and much more into scientific inquiry and engineering."
I really liked this article, especially coming from someone who has been working in theory all his life. Still, I would prefer reading papers that give theoretical insight, however useless, to pages and pages of feature engineering and experimentation using classifier X on problem Y, which is the current trend at ACL.
- Delip Rao at 10:55 AM 1 comments
Principal Components: machine learning, ML, NIPS, theory