Early in the development of the DASe project we decided (or rather, realized) that the ONLY way we would be able to quickly and efficiently deal with all of the various digital collections we hoped to incorporate would be to NOT enforce any kind of metadata scheme on anyone, but rather simply let folks describe their “stuff” any way they wish. On top of that, many of these were legacy collections set up in a FileMaker or Access database or even an Excel spreadsheet, so there was often already a schema in place and folks (rightly) didn’t want to change it. Note that we are talking about faculty members and department administrators who have much better things to do than figure out how to use Dublin Core to describe the images that they have already been using for years in their classes, research, and publications.
We (Liberal Arts Instructional Technology Services at The University of Texas at Austin) had an interest in “rationalizing” this hodgepodge of data and metadata toward two ends: one, we wanted folks to be able to share their collections easily if they wished, and two, we wanted a means by which we could easily repurpose the digital assets in all sorts of ways: podcasts, websites, specialized search interfaces, etc. So we went with what are essentially key-value pairs: collection managers create “attributes” (e.g., title, description, person depicted, time period) that best describe their assets, and we provide an interface that allows them to add metadata to any item by filling in a value for any or all attributes that apply. Well, it turns out this works REALLY well. We currently have 88 collections comprising over 300,000 items (images, audio, video, documents, etc.), and the system holds over 4 million pieces of metadata (i.e., the “values” table has over 4 million rows). Searching is fast, adding new collections is easy, and application maintenance (including backing up collections as XML documents) is painless.
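To make the key-value idea concrete, here is a minimal sketch in Python with an in-memory SQLite database. It is not the DASe schema or code (DASe itself is PHP on PostgreSQL); the table names, column names, and helper functions are all hypothetical, chosen only to show how per-collection attributes and a single values table can hold arbitrary metadata.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical attribute/value layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE items (
            id INTEGER PRIMARY KEY,
            collection TEXT NOT NULL
        );
        -- Each collection defines its own attributes; nothing is shared or enforced.
        CREATE TABLE attributes (
            id INTEGER PRIMARY KEY,
            collection TEXT NOT NULL,
            name TEXT NOT NULL,
            UNIQUE (collection, name)
        );
        -- One row per piece of metadata: (item, attribute, value).
        CREATE TABLE item_values (
            item_id INTEGER NOT NULL REFERENCES items(id),
            attribute_id INTEGER NOT NULL REFERENCES attributes(id),
            value_text TEXT NOT NULL
        );
        """
    )
    return conn


def add_item(conn, collection, metadata):
    """Insert an item and one value row per (attribute name, value) pair."""
    item_id = conn.execute(
        "INSERT INTO items (collection) VALUES (?)", (collection,)
    ).lastrowid
    for name, value in metadata.items():
        # Create the attribute on first use; reuse it afterwards.
        conn.execute(
            "INSERT OR IGNORE INTO attributes (collection, name) VALUES (?, ?)",
            (collection, name),
        )
        attribute_id = conn.execute(
            "SELECT id FROM attributes WHERE collection = ? AND name = ?",
            (collection, name),
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO item_values (item_id, attribute_id, value_text) "
            "VALUES (?, ?, ?)",
            (item_id, attribute_id, value),
        )
    return item_id


def search(conn, term):
    """Return ids of items with any value containing the term, in any collection."""
    rows = conn.execute(
        "SELECT DISTINCT item_id FROM item_values "
        "WHERE value_text LIKE ? ORDER BY item_id",
        (f"%{term}%",),
    )
    return [row[0] for row in rows]


def describe(conn, item_id):
    """Return the item's metadata as (attribute name, value) pairs, sorted by name."""
    return conn.execute(
        "SELECT a.name, v.value_text FROM item_values v "
        "JOIN attributes a ON a.id = v.attribute_id "
        "WHERE v.item_id = ? ORDER BY a.name",
        (item_id,),
    ).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    add_item(conn, "art_history", {"title": "Temple at Luxor", "time period": "New Kingdom"})
    add_item(conn, "music", {"composer": "Bartok", "instrument": "piano"})
    for item_id in search(conn, "Luxor"):
        print(item_id, describe(conn, item_id))
```

The point of the sketch is that the two collections use completely different attributes, yet a single query over the values table searches both, and adding a new collection requires no schema change.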
The current version of DASe runs on PHP4 with a PostgreSQL back end. The next rev, which is a significant retooling of the current architecture and code base, will run on PHP5 and will be able to use PostgreSQL, MySQL, SQLite, or XML files as a back end. How that all works, where Atom, REST, RDF, and more fit in, problems encountered along the way, and the solutions settled on (tentative and otherwise) will be some of the topics explored in future posts.