Harvard Libraries recently released bibliographic data from their collections – 12 million works in all – under a CC0 license, which lets other sites and researchers reuse that data in any way they like.
This is the biggest release of bibliographic data of its kind, four times the size of a similar release by the British Library in late 2010. (Without an explicit release under a free license, such collections of metadata are covered by ‘database rights’.)
How would you reuse these records in your own work and dreams? Some quick ideas:
- Wikipedia or Wikisource could create 12 million stubs from those records
- Open Library will improve and update its own metadata collection, which was built from scraped subsets of such data
- We can write scripts that autogenerate “lists of works” for authors, and lists of authors or categories for works
- We can automatically find mismatches between our person-data and title-data and those in the MARC records
- We can publicly clean up mistakes in the MARC catalog and suggest updates
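
To make a couple of these ideas concrete, here is a rough Python sketch of the "lists of works" and mismatch-finding scripts. It assumes the MARC records have already been parsed into plain dictionaries; the keys `author`, `title` and `year`, the function names, and the sample records are all made up for illustration and are not part of the Harvard release.

```python
from collections import defaultdict

# Hypothetical, already-parsed records. Real MARC data would need a parsing
# step first; these keys and values are invented for illustration.
sample_records = [
    {"author": "Author A", "title": "Title Two", "year": 1905},
    {"author": "Author A", "title": "Title One", "year": 1900},
    {"author": "Author B", "title": "Title Three", "year": 1951},
]


def works_by_author(records):
    """Group titles under each author, sorted so the output is stable."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["author"]].append(rec["title"])
    return {author: sorted(titles) for author, titles in grouped.items()}


def find_mismatches(local_years, records):
    """Return (title, local_year, catalog_year) wherever the two disagree.

    local_years maps a title to the year our own data claims for it.
    Titles we have no local data for are skipped, not reported.
    """
    mismatches = []
    for rec in records:
        local = local_years.get(rec["title"])
        if local is not None and local != rec["year"]:
            mismatches.append((rec["title"], local, rec["year"]))
    return mismatches


if __name__ == "__main__":
    for author, titles in works_by_author(sample_records).items():
        print(author, "->", ", ".join(titles))
    # Our (hypothetical) local data disagrees about one publication year.
    print(find_mismatches({"Title One": 1900, "Title Three": 1950}, sample_records))
```

Matching on exact title strings is the weakest part of this sketch. Real catalog data would probably need normalized titles or shared identifiers before the mismatch report is trustworthy.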