Thursday, August 2, 2007

Recommending scientific papers

I noticed a new feature in Citeseer which tries to suggest an "alternate document" for a paper. Clearly it does not do what its name implies, and it doesn't show up for all papers. (Experimental?)

So, an interesting question is: how does one recommend scientific papers? Something more than mere document similarity is required. If I am reading a CRF paper, then there is no point in listing all papers containing similar words. Just listing the papers it cites and the papers that cite it in the citation graph won't suffice either.

Ideal recommendations for a paper would depend on the role the user is playing. When I am reading a paper about some new topic, I would like to be pointed to the original papers on the topic, some recent papers on the topic, and maybe some survey papers or books. On the other hand, when I am writing a paper, I would like to be pointed to all papers related to the topic (recall is more important than precision here, to avoid reviewer comments on "missing references") in some magical order that puts the papers more relevant to my work on top. Also, these papers might not be related directly through citations. If there is a recent related work in the Annals of Statistics, for instance, then it should show up when I am working on, say, approximate inference methods for graphical models. (Possible to deduce this from my previous queries?)
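
To make the role idea concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the `Paper` record, the topic labels, the toy data, and the heuristics (earliest paper stands in for "original", topic overlap with my own work stands in for the "magical order") are all made up, and this is not how Citeseer or any real system works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    # Hypothetical record; a real system would also need text and citation links.
    title: str
    year: int
    topics: frozenset
    is_survey: bool = False


def recommend(papers, topic, role, my_topics=frozenset(), k=3):
    """Return role-dependent recommendations for a topic (toy heuristics)."""
    on_topic = [p for p in papers if topic in p.topics]
    if role == "reading":
        # Assumption: the earliest on-topic paper approximates "the original paper".
        original = sorted(on_topic, key=lambda p: p.year)[:1]
        surveys = [p for p in on_topic if p.is_survey and p not in original][:k]
        picked = original + surveys
        recent = sorted(
            (p for p in on_topic if p not in picked), key=lambda p: -p.year
        )[:k]
        return {"original": original, "recent": recent, "surveys": surveys}
    if role == "writing":
        # Recall over precision: keep every on-topic paper and only reorder,
        # putting papers that share more topics with my own work first.
        return sorted(on_topic, key=lambda p: -len(p.topics & my_topics))
    raise ValueError(f"unknown role: {role}")


# Toy data with placeholder titles (not real papers).
PAPERS = [
    Paper("paper-seminal", 2001, frozenset({"crf"})),
    Paper("paper-survey", 2006, frozenset({"crf"}), is_survey=True),
    Paper("paper-recent", 2007, frozenset({"crf", "approximate inference"})),
    Paper("paper-older", 2005, frozenset({"crf"})),
    Paper("paper-unrelated", 2007, frozenset({"parsing"})),
]

reading = recommend(PAPERS, "crf", "reading")
writing = recommend(
    PAPERS, "crf", "writing", my_topics=frozenset({"approximate inference"})
)
```

The "reading" role returns a few buckets, while the "writing" role returns the full on-topic list and only changes the order. Note that the sketch still misses the hard case from above: a related paper with no shared citations or topic labels would never show up.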

Even though a scientific paper carries more information than just its text (its citations, for example), recommending or ranking papers appears to be quite challenging.

