Inferring Probability of Relevance Using the Method of Logistic Regression.

Fredric C. Gey: Inferring Probability of Relevance Using the Method of Logistic Regression. SIGIR 1994: 222-231
@inproceedings{DBLP:conf/sigir/Gey94,
  author    = {Fredric C. Gey},
  editor    = {W. Bruce Croft and
               C. J. van Rijsbergen},
  title     = {Inferring Probability of Relevance Using the Method of Logistic
               Regression},
  booktitle = {Proceedings of the 17th Annual International ACM-SIGIR Conference
               on Research and Development in Information Retrieval. Dublin,
               Ireland, 3-6 July 1994 (Special Issue of the SIGIR Forum)},
  publisher = {ACM/Springer},
  year      = {1994},
  isbn      = {3-540-19889-X},
  pages     = {222-231},
  ee        = {db/conf/sigir/Gey94.html},
  crossref  = {DBLP:conf/sigir/94},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

This research evaluates a model for probabilistic text and document retrieval; the model utilizes the technique of logistic regression to obtain equations which rank documents by probability of relevance as a function of document and query properties. Since the model infers probability of relevance from statistical clues present in the texts of documents and queries, we call it logistic inference. By transforming the distribution of each statistical clue into its standardized distribution (one with mean µ=0 and standard deviation σ=1), the method allows one to apply logistic coefficients derived from a training collection to other document collections, with little loss of predictive power. The model is applied to three well-known information retrieval test collections, and the results are compared directly to the particular vector space model of retrieval which uses term-frequency/inverse-document-frequency (tfidf) weighting and the cosine similarity measure. In the comparison, the logistic inference method performs significantly better than (in two collections) or equally well as (in the third collection) the tfidf/cosine vector space model. The differences in performances of the two models were subjected to statistical tests to see if the differences are statistically significant or could have occurred by chance.
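
The ranking idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the clue values, the intercept, and the coefficients are hypothetical placeholders, and the function names (standardize, relevance_probability, rank_documents) are invented for illustration. The sketch standardizes each clue within a collection to mean 0 and standard deviation 1, feeds the standardized clues into a logistic function, and sorts documents by the resulting probability.

```python
import math
from statistics import mean, pstdev


def standardize(values):
    """Map raw clue values to z-scores (mean 0, standard deviation 1)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant clue carries no ranking information in this collection.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def relevance_probability(z_clues, intercept, coefficients):
    """Logistic model: P = 1 / (1 + exp(-(c0 + sum(c_i * z_i))))."""
    log_odds = intercept + sum(c * z for c, z in zip(coefficients, z_clues))
    return 1.0 / (1.0 + math.exp(-log_odds))


def rank_documents(clue_table, intercept, coefficients):
    """Rank documents by estimated probability of relevance.

    clue_table maps a document id to its list of raw clue values
    (hypothetical clues; the paper defines its own set).
    Returns (doc_id, probability) pairs, highest probability first.
    """
    doc_ids = list(clue_table)
    # Standardize each clue column over the whole collection.
    columns = list(zip(*(clue_table[d] for d in doc_ids)))
    z_columns = [standardize(col) for col in columns]
    scored = []
    for i, doc_id in enumerate(doc_ids):
        z_clues = [col[i] for col in z_columns]
        scored.append((doc_id, relevance_probability(z_clues, intercept, coefficients)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Because the coefficients act on standardized clues rather than raw values, the same intercept and coefficients could in principle be reused on a second collection after standardizing that collection's clues with its own mean and standard deviation, which is the transfer property the abstract describes.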

Copyright © 1994 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


ACM SIGMOD Anthology

CDROM Version: Load the CDROM "Volume 2 Issue 3, SIGIR, DASFAA'97, OODBS'86" and ...
DVD Version: Load "ACM SIGMOD Anthology DVD 1" and ...

Printed Edition

W. Bruce Croft, C. J. van Rijsbergen (Eds.): Proceedings of the 17th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval. Dublin, Ireland, 3-6 July 1994 (Special Issue of the SIGIR Forum). ACM/Springer 1994, ISBN 3-540-19889-X

Online Edition: ACM Digital Library

ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:38:46 2009