dblp.xml (see here for details) contains only the most recent version of each publication and person record in dblp, and it provides only the most recent modification date. To give a better view of the historical development of dblp, we provide a historical data set, hdblp.xml, that contains all historical revisions (with some limitations) of all records in dblp. hdblp can be used to study questions like:
- What types of records were indexed in a specific year?
- What kind of modifications to records occurred over time?
- What kind of defects were corrected in the past?
hdblp.xml is intended for studying the development of dblp. If you are interested in the current state of dblp, please use dblp.xml, which is available as daily and monthly releases (see here for details).
hdblp.xml has the same structure as dblp.xml and uses the same DTD. However, instead of a single entry for each publication or person record, hdblp.xml contains a full copy of the record's metadata for each time the record was modified (including its creation). The example below shows a publication record with multiple revisions:
<article key="journals/jsyml/NewmanT42" mdate="2017-05-28">
  <author>M. H. A. Newman</author>
  <author>Alan M. Turing</author>
  <title>A Formal Theorem in Church's Theory of Types.</title>
  <pages>28-33</pages>
  <year>1942</year>
  <volume>7</volume>
  <journal>J. Symb. Log.</journal>
  <number>1</number>
  <url>db/journals/jsyml/jsyml7.html#NewmanT42</url>
  <ee>https://doi.org/10.2307/2267552</ee>
  <ee>http://projecteuclid.org/euclid.jsl/1183389307</ee>
</article>
...
<article key="journals/jsyml/NewmanT42" mdate="2003-10-13">
  <author>M. H. A. Newman</author>
  <author>A. M. Turing</author>
  <title>A Formal Theorem in Church's Theory of Types.</title>
  <pages>28-33</pages>
  <year>1942</year>
  <volume>7</volume>
  <journal>The Journal of Symbolic Logic</journal>
  <number>1</number>
  <url>db/journals/jsyml/jsyml7.html#NewmanT42</url>
</article>
The publication was first indexed in dblp on 2003-10-13; the most recent revision is from 2017-05-28. Between these two revisions, the name of the second author was expanded, the journal title was replaced by its abbreviation, and web links (ee elements) were added.
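A comparison like this can also be done programmatically. The following is a minimal Python sketch, not an official dblp tool: it groups revisions by their key attribute, orders them by mdate, and reports which fields differ between the oldest and the newest revision. The sample data repeats the two revisions from the example, wrapped in a dblp root element (assumed here to match the root of dblp.xml). The function names load_revisions, fields and diff are hypothetical. The sketch loads everything into memory, so it is suitable only for small excerpts and not for the full file.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Two revisions of one record, taken from the example in the text.
# The <dblp> root element is an assumption made for this sketch.
SAMPLE = """<dblp>
<article key="journals/jsyml/NewmanT42" mdate="2017-05-28">
  <author>M. H. A. Newman</author>
  <author>Alan M. Turing</author>
  <title>A Formal Theorem in Church's Theory of Types.</title>
  <pages>28-33</pages>
  <year>1942</year>
  <volume>7</volume>
  <journal>J. Symb. Log.</journal>
  <number>1</number>
  <url>db/journals/jsyml/jsyml7.html#NewmanT42</url>
  <ee>https://doi.org/10.2307/2267552</ee>
  <ee>http://projecteuclid.org/euclid.jsl/1183389307</ee>
</article>
<article key="journals/jsyml/NewmanT42" mdate="2003-10-13">
  <author>M. H. A. Newman</author>
  <author>A. M. Turing</author>
  <title>A Formal Theorem in Church's Theory of Types.</title>
  <pages>28-33</pages>
  <year>1942</year>
  <volume>7</volume>
  <journal>The Journal of Symbolic Logic</journal>
  <number>1</number>
  <url>db/journals/jsyml/jsyml7.html#NewmanT42</url>
</article>
</dblp>"""


def load_revisions(root):
    """Return {key: [(mdate, element), ...]}, each list sorted oldest first."""
    revisions = defaultdict(list)
    for record in root:
        revisions[record.get("key")].append((record.get("mdate"), record))
    for revs in revisions.values():
        # mdate uses the YYYY-MM-DD format, so string order is date order.
        revs.sort(key=lambda pair: pair[0])
    return revisions


def fields(record):
    """Collect the child elements of a record as (tag, text) pairs."""
    return [(child.tag, "".join(child.itertext())) for child in record]


def diff(old, new):
    """Return the fields present only in `old` and only in `new`."""
    old_fields, new_fields = fields(old), fields(new)
    removed = [f for f in old_fields if f not in new_fields]
    added = [f for f in new_fields if f not in old_fields]
    return removed, added


revisions = load_revisions(ET.fromstring(SAMPLE))
history = revisions["journals/jsyml/NewmanT42"]
removed, added = diff(history[0][1], history[-1][1])
print("first indexed:", history[0][0])
print("removed:", removed)
print("added:", added)
```

A changed value shows up as one removed and one added pair with the same tag, which is enough to spot the author name expansion, the journal title change, and the new ee links in this record.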
hdblp.xml has the following limitations:
- Exact tracking of record metadata before June 1999 is not possible because of an error in the data processing.
- Modifications of a record that occur on the same day are merged into a single revision.
- In the early years of dblp, not every author profile had its own person record. Therefore, the person records of some authors cannot be tracked back to their beginning.
A more detailed description is provided with the downloadable data file.
We provide the hdblp.xml data set at https://zenodo.org/record/3051910 under the same licence as dblp.xml (ODC-BY v1.0). We plan to update the file approximately twice per year.
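Once the file is downloaded, a question such as "what types of records were indexed in a specific year" can be approached by taking the earliest mdate of each key as the date of first indexing. The following Python sketch illustrates the idea under several assumptions: the function name first_indexed_counts is hypothetical, the sample data is invented for illustration, and the standard-library parser is used as is. Character entities declared in the DTD may require a DTD-aware parser for the real file, which is not covered here. Because of the limitations listed above, counts for the years before June 1999 should be treated with caution.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter

# Invented sample data for illustration only: one article with two
# revisions and one person record with a single revision.
SAMPLE = b"""<dblp>
<article key="journals/x/A42" mdate="2017-05-28"><title>T.</title></article>
<article key="journals/x/A42" mdate="2003-10-13"><title>T.</title></article>
<www key="homepages/00/1" mdate="2010-01-01"><author>A. Person</author></www>
</dblp>"""


def first_indexed_counts(source):
    """Count records per (year of first indexing, record type).

    `source` is a filename or a binary file object. Records are read as a
    stream, so the whole file is never parsed into one tree at once.
    """
    first_seen = {}  # key -> (earliest mdate seen so far, element tag)
    for _event, elem in ET.iterparse(source, events=("end",)):
        key, mdate = elem.get("key"), elem.get("mdate")
        if key is None or mdate is None:
            continue  # a field inside a record (or the root), not a record
        if key not in first_seen or mdate < first_seen[key][0]:
            first_seen[key] = (mdate, elem.tag)
        elem.clear()  # drop the record's children to limit memory use
    # The first four characters of a YYYY-MM-DD date are the year.
    return Counter((mdate[:4], tag) for mdate, tag in first_seen.values())


counts = first_indexed_counts(io.BytesIO(SAMPLE))
for (year, tag), n in sorted(counts.items()):
    print(year, tag, n)
```

For the real data set, the filename of the downloaded file would be passed instead of the in-memory sample. The dictionary of keys still grows with the number of distinct records, so this sketch trades some memory for simplicity.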