What is Data’s Killer App?

I read with great interest both Peter Brantley’s Reality dreams (for Libraries) and Dorothea Salo’s Top-down or bottom-up? as both address the increasingly obvious need for data systems support in higher ed. The issue, which in practice comes in many shades and hues, is that our increasingly digital and connected world offers challenges that are not being met and opportunities that are not being exploited. Centralized data curation, the formation of “communities of practice,” individualized faculty support — these are a few directions that institutions might look to help matters, but I cannot help but think that a lasting solution demands a more fundamental foundation that simply does not yet exist. To put it more bluntly (cliché-and-all) we don’t yet have our “killer app” for data management.

Dorothea wisely asks if a single tool could possibly fit the bill. Well, no, but it’s likely not a single tool we are after. While it might have seemed in the early 1990s that the Mosaic web browser was the Internet’s killer app, it was actually HTML and HTTP that allowed browsers to be created and the Web to explode in popularity and usefulness. Likewise, our solution will likely be a set of protocols, formats, and practices, all of which will enable the creation of end-user applications that can “hide the plumbing.” Indeed, training our users to practice better data “hygiene” will be a fruitless task unless they have applications that force them (by way of the path of least resistance) to do so. It’s not a stretch: every day we send emails, post to blogs, upload pictures to Facebook, etc. without a thought that our data must be properly structured to achieve our aims.

Here are a few essential characteristics of our killer app:

  • The inherent structure of the data must be captured and maintained from the moment of its creation through its entire lifecycle.
  • Separable concerns need to be separated (data, presentation, access control, etc. each have their own “layer”).
  • Reuse, repurposing, and “remixing” must be first-class operations. In fact, the line between use and reuse should simply be erased (i.e., reuse IS use and vice-versa).
  • The feedback loop with the user should be as tight as possible. I.e., I can immediately visualize a whole range of possible uses of the data I have created.
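
To make the “layers” idea a bit more concrete, here is a minimal sketch in Python. Everything in it (the `Item` class, the two view functions, the `can_read` check, and the sample values) is hypothetical and invented for illustration; it is not taken from any particular system. The point is only that the data knows nothing about how it will be shown or who may see it.

```python
import json
from dataclasses import dataclass, field
from html import escape


@dataclass(frozen=True)
class Item:
    """Data layer: the content plus its metadata, with no notion of 'use'."""
    title: str
    metadata: dict = field(default_factory=dict)


def to_json(item: Item) -> str:
    """Presentation layer: a machine-facing view of the item."""
    return json.dumps({"title": item.title, "metadata": item.metadata}, sort_keys=True)


def to_html(item: Item) -> str:
    """Presentation layer: a human-facing view of the very same item."""
    rows = "".join(
        f"<li>{escape(key)}: {escape(str(value))}</li>"
        for key, value in sorted(item.metadata.items())
    )
    return f"<h1>{escape(item.title)}</h1><ul>{rows}</ul>"


def can_read(user: str, readers: set) -> bool:
    """Access-control layer: decided outside the item and outside its views."""
    return user in readers


# Hypothetical sample data.
survey = Item("Survey results", {"collected": "2009-07-04", "instrument": "ExampleForm"})
```

Because neither view is privileged, “reuse” here is just another call: `to_json(survey)` for a script and `to_html(survey)` for a web page both draw on the same structured item.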

These are not pie-in-the-sky demands: lots of applications already do a decent job at this (almost any web application fits the bill, since the medium practically demands it). But the tools I see faculty regularly using (Excel, FileMaker, Microsoft Word, the desktop computer filesystem) do NOT. That these are seen as anything but disastrous for the creation of structured data is surprising to me.

Which leads me to my next point: I don’t think there is such a thing (anymore) as data that “does not need to be structured” or “data that will never need to be shared.” If there is one point of data hygiene that we do really need to get across to all of our non-technical folks it is that everything you create will be reused and will be shared no matter how unlikely that might seem right now. If not by someone else, then by you. Systems that don’t distinguish use from reuse or originator access from secondary/shared access are exactly what is called for (n.b. I’m not suggesting that authorization/access controls go by the wayside, but rather that they happen in another layer).

Too often our digital systems perpetuate distinctions that, while logical in a pre-digital world, are actively harmful in the digital realm. Consider, for example, a digital photograph taken by a faculty member on a family vacation. The camera automatically attaches useful metadata: time and date, location (if the camera is geo-enabled), camera model and settings, etc. The data and metadata are in no way tied to a specific use, but will be useful no matter how that photo is used/reused. I’ve seen plenty of cases where that vacation photo is as likely to appear on a holiday greeting card as it is to be used in a classroom lecture, as part of a research data set, or as an addendum to a published paper. As things stand, our faculty member would turn to different systems for the various use cases (e.g., Flickr or Picasa for personal uses, PowerPoint for classroom use, DSpace for “institutional” use). While I’m not suggesting that Picasa be expected to serve the purposes of an institutional repository or DSpace that of a classroom presentation tool, part of me thinks “well, why not?”

A more practical expectation would be that our systems interoperate more seamlessly than they do, and that moving an item (data plus metadata) from one to the other is a trivial, end-user task. As I mentioned above, we need protocols, practices, and formats to allow this sort of interoperation. I think that for the most part we have all of the pieces that we need; we simply lack applications that use them. For protocols, I think that HTTP and related Open Web standards (AtomPub, OpenID, OAuth, OpenSearch, etc.) offer a fairly complete toolset. Too often, our systems either don’t interoperate, or offer services that are simply proprietary, application-specific protocols on top of HTTP (e.g., SOAP-based RPC-style web services), which misses the whole point of the Web: HTTP is not just a transport protocol, but IS the application protocol itself. The growing awareness of and interest in REST-based systems is simply that: using the Web (specifically HTTP) exactly as it was intended. Thus REST-based architectures and the design principles they promote offer the “practices” part of the equation.

As an example of a REST-based system (or standard) you cannot do much better than the Atom Publishing Protocol. While it may not have taken the world by storm the way its creators had hoped, it still has loads of potential as “plumbing” — perhaps not something that end-users would be aware of, but hugely useful for application developers. And were one tempted to try some other approach, or “just use HTTP” I am pretty sure they’d end up developing something quite like Atom/AtomPub anyway. In either case, there is no hiding or abstracting away HTTP — it’s in full view in any RESTful system on the web and is in all cases (this being the Web we are talking about) simply unavoidable.
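
For readers who have not seen AtomPub up close, here is a rough sketch of what “HTTP in full view” looks like when creating a new item. It uses only the Python standard library and builds the request without sending it. The collection URL, the entry id, the author name, and the timestamp are all made-up placeholders; a real server would publish its collection URIs in a service document and would typically answer a successful POST with `201 Created` and a `Location` header.

```python
import urllib.request
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ET.register_namespace("", ATOM_NS)  # serialize Atom as the default namespace


def build_entry(title: str, entry_id: str, updated: str, author: str) -> bytes:
    """Build a minimal Atom entry document as UTF-8 bytes."""
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = entry_id
    ET.SubElement(entry, f"{{{ATOM_NS}}}updated").text = updated
    author_el = ET.SubElement(entry, f"{{{ATOM_NS}}}author")
    ET.SubElement(author_el, f"{{{ATOM_NS}}}name").text = author
    return ET.tostring(entry, encoding="utf-8")


def build_create_request(collection_url: str, entry_xml: bytes) -> urllib.request.Request:
    """Creating a member is a plain HTTP POST of the entry to the collection URI."""
    return urllib.request.Request(
        collection_url,
        data=entry_xml,
        method="POST",
        headers={"Content-Type": "application/atom+xml;type=entry"},
    )


# All values below are hypothetical placeholders.
entry_xml = build_entry(
    "Vacation photo",
    "urn:uuid:00000000-0000-0000-0000-000000000000",
    "2009-07-04T12:00:00Z",
    "A. Faculty Member",
)
request = build_create_request("https://example.org/collections/photos", entry_xml)
```

Nothing here is specific to one application: the verb, the media type, and the URI carry the whole interaction, which is the sense in which HTTP is the application protocol rather than a tunnel for something else.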

The next bit to tackle is the format: what data formats allow the sort of interoperability we are seeking? Certainly as the format behind AtomPub, Atom Syndication Format would be an obvious choice. But there are others: JSON, HTML, XHTML, RDF/XML, RDFa, etc. I tend to favor Atom and JSON, having found both quite suited to the sort of tasks I have in mind. RDF is widely viewed as the basis of the “Web of Data,” but its lack of a containment model makes it ill-suited for the sort of RESTful interactions that are a critical component of the interoperability I’m envisioning. What RDF does offer, which is key, is the ability to “say anything about anything.” I’d contend, though, that RDF is not the only way to do that (or indeed even the best way, in many cases). Atom itself offers some RDFish extension points that I quite like, and efforts like OAI-ORE/Atom do so as well. The point is that if a content publisher has some “stuff” they want to describe, they should be able to do so in whatever way they wish, and this original “description” should stay with the item in full fidelity, even if it is mapped to standardized description/metadata schemes if/when necessary. I could say more, especially about OAI-ORE/Atom, since it so closely captures the sort of aggregating and re-aggregating that we see in academic data work. And as an Atom-based format (with an emphasis on the entry rather than the feed), it has a “write-back” story (AtomPub) built in.
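
As a small illustration of the “description stays with the item” idea, the sketch below puts a publisher’s own metadata into an Atom entry as elements in a separate namespace and then reads it back unchanged. Atom permits such foreign markup, but the namespace URI, the field names, and the sample values here are hypothetical, and a complete entry would also carry the required `id`, `updated`, and author elements, omitted here for brevity.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
LOCAL_NS = "https://example.org/ns/fieldnotes"  # hypothetical publisher vocabulary


def describe(title: str, local_metadata: dict) -> str:
    """Wrap an item in an Atom entry, keeping the publisher's own fields as-is."""
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    for name, value in local_metadata.items():
        # Each local field becomes an extension element in the publisher's namespace.
        ET.SubElement(entry, f"{{{LOCAL_NS}}}{name}").text = value
    return ET.tostring(entry, encoding="unicode")


def read_local_metadata(entry_xml: str) -> dict:
    """Recover the publisher's original description from the entry."""
    prefix = f"{{{LOCAL_NS}}}"
    entry = ET.fromstring(entry_xml)
    return {
        child.tag[len(prefix):]: child.text
        for child in entry
        if child.tag.startswith(prefix)
    }


# Hypothetical sample description.
original = {"site": "Trench 4", "stratum": "B"}
entry_xml = describe("Pottery fragment", original)
recovered = read_local_metadata(entry_xml)
```

A consumer that only understands Atom can ignore the extra elements, while one that knows the local vocabulary gets the original description back intact; mapping to a standard scheme can then happen alongside it rather than in place of it.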

I believe things are moving in the direction I describe, but by and large NOT (sadly) in academia and (more sadly) not within the library community. AtomPub is built into products by IBM, Microsoft, Google, and others. In fact, Google’s suite of GData-based web applications (GData itself is based on Atom/AtomPub) comes quite close to what I describe. At UT Austin, we’ve been making very wide use of our DASe platform, which is basically a reference implementation of just the sort of application/framework I have described above. A faculty member with “stuff” they need to manage, archive, repurpose, and share (images, audio, video, PDFs, data sets, etc.) can add it to a DASe “collection” and thus have it maintained as interoperable, structured data and enjoy the benefits of a rich web interface and a suite of web-based RESTful APIs for further development. We have over one hundred such collections, and while most live behind UT authentication, some have seen the light of day on the open web as rich media-based web resources, including eSkeletons and Onda Latina among others.

My conclusion after a few years of working with DASe is that yes, this is definitely the way to go: RESTful architectures built on standard protocols and formats offer HUGE benefits from the start and are “engineered for serendipity” such that new uses for existing data are regularly hit upon. I’d also note that it requires buy-in from all who wish to play: the hard work of developing specifications and standards, and of understanding and following those specifications, is the “cost of admission.” Likewise, a willingness to move away from (or build better, more RESTful interfaces around) legacy systems that don’t play well is crucial. This means a shared understanding among technologists, application developers, managers, administrators, and users must be promoted. It’s no easy task, in my experience; pushback can be fairly dramatic, especially when investment (as resources, mindshare, pride, etc.) in our legacy systems is significant. Our work as librarians, repository managers, and information technologists is NOT, though, as much a matter of educating users as it is educating ourselves, looking “outside the walls,” and beginning the difficult conversations that start moving us in the right direction.
