Review

Ouri Wolfson:
Review - Automatic Discovery of Language Models for Text Databases. ACM SIGMOD Digit. Rev. 2 (2000)

The web provides access to many searchable text databases (e.g. the 1988 Wall Street Journal). Given a term query, the user must decide which database to search. This question can be answered by constructing a term-frequency index (which the paper calls a "language model") for each database: a list of the terms that occur in the database, each associated with a frequency (the number of documents in which it occurs). The paper lists a variety of reasons why a database may not provide its term-frequency index to users.
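To make the idea of a term-frequency index concrete, here is a minimal sketch in Python. It is not taken from the paper: the toy databases, the whitespace tokenization, and the scoring rule (summing document frequencies of the query terms) are all assumptions chosen for illustration, and the names `document_frequencies` and `rank_databases` are hypothetical.

```python
from collections import Counter


def document_frequencies(documents):
    """Map each term to the number of documents that contain it."""
    df = Counter()
    for text in documents:
        # set() so a term counts once per document, however often it repeats
        df.update(set(text.lower().split()))
    return df


def rank_databases(query_terms, models):
    """Order database names by the summed document frequency of the query terms.

    This scoring rule is an illustrative assumption, not the paper's.
    """
    scores = {name: sum(df[t] for t in query_terms) for name, df in models.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy databases (made up for this example)
databases = {
    "finance": ["stocks fell sharply", "bond yields rose", "stocks rallied"],
    "medicine": ["new vaccine trial", "vaccine safety data"],
}
models = {name: document_frequencies(docs) for name, docs in databases.items()}
ranking = rank_databases(["stocks"], models)
```

With one such index per database, a query for "stocks" would be routed to the database whose index reports the term in the most documents.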

The paper presents a method of building a term-frequency representation of a document database. The method is based on sampling. Specifically, it constructs a term-frequency index from a subdatabase obtained by repeatedly presenting single-term queries; to each query the system responds with the top n documents. The method's parameters include the selection of the single-term queries, the number of documents retrieved for each term, the stopping criteria, etc. These parameters are indeed important; in other words, the paper examines the right issues of the problem. The method is evaluated and shown to work in an experimental environment that uses three real and very different document databases. I got the sense that at least some of the questions addressed in the paper could have been answered, or verified, by statistical analysis. However, the paper does not do so.
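The sampling loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's implementation: the `search` function is a toy stand-in for a remote search interface, the next query term is drawn at random from terms already seen in the sample, and sampling stops after a fixed number of documents or queries. The paper studies several choices for each of these parameters; the ones here are picked only to keep the example short.

```python
import random
from collections import Counter


def search(database, term, n):
    """Toy stand-in for a remote search interface: top n documents containing term."""
    hits = [doc for doc in database if term in doc.split()]
    # Rank by how often the term occurs (a crude relevance score)
    hits.sort(key=lambda doc: doc.split().count(term), reverse=True)
    return hits[:n]


def sample_language_model(database, seed_term, n=2, max_queries=20, max_docs=4, rng=None):
    """Build a document-frequency index from documents returned by single-term queries."""
    rng = rng or random.Random(0)  # fixed seed so the sketch is repeatable
    sampled, seen = [], set()      # distinct documents retrieved so far
    df = Counter()                 # term -> number of sampled documents containing it
    used = set()                   # terms already issued as queries
    term = seed_term
    for _ in range(max_queries):
        used.add(term)
        for doc in search(database, term, n):
            if doc not in seen:
                seen.add(doc)
                sampled.append(doc)
                df.update(set(doc.split()))
        # Assumed stopping criterion: enough distinct documents collected
        if len(sampled) >= max_docs:
            break
        # Assumed query selection: a random unused term from the sample so far
        candidates = sorted(set(df) - used)
        if not candidates:
            break
        term = rng.choice(candidates)
    return df, sampled


# Toy database (made up for this example)
db = [
    "stocks fell as bond yields rose",
    "bond markets and stocks rallied",
    "central bank raised rates",
    "rates and yields moved together",
    "vaccine trial data released",
]
df, sampled = sample_language_model(db, "stocks")
```

The resulting `df` only approximates the database's true index, since it is computed over the sampled subdatabase rather than over every document.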

Overall I liked the problem addressed by the paper, and I found the paper interesting and well written.
