DBLP FAQ: How does the 'author search' work?
A query is interpreted as a set of prefixes of name parts. If you enter several words, the search returns every name in which each of these words is a prefix of some name part:
query = A Meyer — answers = Achim Meyer, Andrea Meyer, Anne Meyer, Hans-Albert Meyer, A. Meyers, Anton Smith-Meyer, ...
query = Ar b c — answers = Clark B. Archer, Arnold B. Calica, Arnab B. Chowdry, Armin B. Cremers, ...
More details:
- The query and the names stored in DBLP are broken into parts. This tokenization uses spaces and punctuation marks as delimiters. The punctuation marks themselves are not relevant for the matching: "Ar-b-c." produces the same result as "Ar b c".
- The matching is NOT case-sensitive.
- The order of the query words does not matter, i.e. the queries "Petra M A" and "M Petra A" are equivalent.
- If you end a query word with a $-sign, only exact matches of this word are shown. Try the queries "xi li" vs. "xi$li" vs. "xi li$" vs. "xi$li$" ("xi$ li" and "xi$li" are equivalent).
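The rules above can be illustrated with a minimal Python sketch. This is not the DBLP implementation; the function names (tokenize_name, tokenize_query, matches) are hypothetical, and the sketch assumes that several query words may match the same name part, which the rules above do not specify.

```python
import re

def tokenize_name(name):
    # Split on spaces and punctuation; lower-case for case-insensitive matching.
    return re.findall(r"\w+", name.lower())

def tokenize_query(query):
    # A trailing "$" marks a word as exact and also ends the word,
    # so "xi$li" yields the same tokens as "xi$ li".
    words = []
    for token in re.findall(r"\w+\$?", query.lower()):
        exact = token.endswith("$")
        words.append((token.rstrip("$"), exact))
    return words

def matches(query, name):
    # Every query word must match some name part, in any order.
    # Assumption: two query words may match the same name part.
    parts = tokenize_name(name)
    for word, exact in tokenize_query(query):
        if exact:
            found = word in parts
        else:
            found = any(part.startswith(word) for part in parts)
        if not found:
            return False
    return True
```

For example, matches("Ar-b-c.", "Clark B. Archer") holds because "ar" is a prefix of "archer", "b" matches "b", and "c" is a prefix of "clark", while matches("xi$ li", "Xiang Li") fails because no name part equals "xi".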
Diacritic marks:
- Most parts of DBLP are restricted to the Latin-1 character set. This includes characters like ä, é, è, ñ, å, ç, etc., but NOT ł, č, ű, ş, ... In DBLP, we try to transliterate all person names to Latin-1.
- As long as you restrict your query to ASCII (Basic Latin in Unicode), the search engine ignores diacritic marks when matching, i.e. the query "moller" matches "moller", "möller", "møller", "móller" etc.
- As soon as your query contains any diacritic mark, the matching becomes exact. Now "René" matches "René" but not "Rene" or "Renè".
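The two diacritic rules can be sketched as follows. This is only an illustration under stated assumptions: the names fold, part_matches and EXTRA_FOLDS are hypothetical, the folding table is deliberately incomplete, and it is assumed that a query word with diacritics is still matched as a prefix, only without folding.

```python
import unicodedata

# Letters such as "ø" have no Unicode decomposition, so they need an explicit
# mapping. This table is a hypothetical, incomplete example.
EXTRA_FOLDS = str.maketrans({"ø": "o", "Ø": "O"})

def fold(text):
    # Decompose accented letters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFD", text.translate(EXTRA_FOLDS))
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def part_matches(query_word, name_part):
    q = unicodedata.normalize("NFC", query_word.lower())
    p = unicodedata.normalize("NFC", name_part.lower())
    if q.isascii():
        # Pure ASCII query word: compare against the folded name part.
        return fold(p).startswith(q)
    # Query word contains a non-ASCII character: compare without folding.
    return p.startswith(q)
```

With this sketch, part_matches("moller", "Møller") is true, while part_matches("René", "Rene") and part_matches("René", "Renè") are both false.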
Encoding, form method, ...:
- The preferred encoding to transmit the query is UTF-8. As soon as the query contains a byte sequence which is illegal in UTF-8, the incoming byte sequence is interpreted as Latin-1.
- Additionally, the search engine understands character entities like &auml; for the Latin-1 characters.
- The author search accepts queries using the GET method.
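
The decoding behaviour described above (UTF-8 first, Latin-1 as fallback, then character entities) could look roughly like the following sketch. The function name decode_query is hypothetical. Note that Python's html.unescape resolves all HTML entities, which is more than the Latin-1 entities mentioned above.

```python
import html

def decode_query(raw: bytes) -> str:
    # Prefer UTF-8; if the bytes are not valid UTF-8, read all of them as Latin-1.
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        text = raw.decode("latin-1")
    # Resolve character entities such as "&auml;".
    return html.unescape(text)
```

Because every byte sequence is valid Latin-1, the fallback never fails; a Latin-1 encoded "ö" (byte 0xF6) followed by an ASCII letter is not valid UTF-8 and therefore takes the fallback path.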



