It is pretty common in litigation for a party to demand, via discovery, “all email correspondence between the defendant and Joe Smith” or “all email correspondence in which ‘rebar failure’ is discussed.”
I’m wondering if there is a good automated tool for extracting this from Gmail or another IMAP source.
It would be nice if the tool could do the selection by sending a search term to Gmail. But it would also be acceptable if the tool were capable only of grabbing one IMAP folder. In that case the Gmail user could use Gmail tools to search for a particular recipient or subject line substring, then put all of the results in a folder named “JoeSmith” and have the tool pull all of JoeSmith.
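A minimal sketch of both selection modes, using Python's standard imaplib, might look like the following. It is an illustration under assumptions, not a finished tool: the function names and the "JoeSmith" default are hypothetical, the search relies on Gmail's X-GM-RAW IMAP extension (so it will not work against other IMAP servers), the query is assumed to be ASCII, and the account is assumed to allow IMAP login with an app password.

```python
import email
import imaplib
from email import policy


def build_search_args(gmail_query=None):
    """Return IMAP SEARCH arguments: a Gmail query if one is given, else ALL."""
    if gmail_query:
        # X-GM-RAW is a Gmail-only extension. imaplib does not quote
        # arguments for us, so the query is escaped and quoted by hand.
        escaped = gmail_query.replace("\\", "\\\\").replace('"', '\\"')
        return ("X-GM-RAW", f'"{escaped}"')
    return ("ALL",)


def fetch_messages(host, user, password, folder="JoeSmith", gmail_query=None):
    """Download every matching message in one folder as parsed messages."""
    messages = []
    with imaplib.IMAP4_SSL(host) as conn:
        conn.login(user, password)
        # Open the folder read-only so nothing is marked as read.
        conn.select(f'"{folder}"', readonly=True)
        status, data = conn.search(None, *build_search_args(gmail_query))
        if status != "OK":
            raise RuntimeError("IMAP search failed")
        for num in data[0].split():
            status, parts = conn.fetch(num, "(RFC822)")
            if status == "OK" and parts and isinstance(parts[0], tuple):
                raw_bytes = parts[0][1]
                messages.append(
                    email.message_from_bytes(raw_bytes, policy=policy.default)
                )
    return messages
```

Calling fetch_messages("imap.gmail.com", user, password) would pull the whole "JoeSmith" folder; passing gmail_query="rebar failure" together with a broader folder would let Gmail do the selection instead.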
Another nice-to-have feature would be to preserve the conversation threading, but I don’t think it is necessary to fulfill the requirements of the legal discovery process.
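One way threading could be preserved is by following the standard Message-ID, In-Reply-To, and References headers. The sketch below is a simplification under stated assumptions: group_by_thread is a hypothetical name, messages are assumed to arrive oldest first, and two threads that turn out to be related only by a later message are not merged. It will not necessarily match the conversations that Gmail itself displays.

```python
from email import message_from_string, policy


def group_by_thread(messages):
    """Group messages into conversations using their threading headers."""
    thread_of = {}  # Message-ID -> key of the thread it belongs to
    threads = {}    # thread key -> messages in that thread, in input order
    for msg in messages:
        msg_id = (msg.get("Message-ID") or "").strip()
        refs = (msg.get("References") or "").split()
        parent = (msg.get("In-Reply-To") or "").strip()
        ancestors = refs + ([parent] if parent else [])
        # Reuse the thread of any ancestor already seen.
        key = next((thread_of[a] for a in ancestors if a in thread_of), None)
        if key is None:
            # Otherwise start a new thread; messages with no IDs at all
            # each get their own synthetic key.
            key = ancestors[0] if ancestors else (msg_id or f"no-id-{len(threads)}")
        for ident in ancestors + [msg_id]:
            if ident:
                thread_of.setdefault(ident, key)
        threads.setdefault(key, []).append(msg)
    return list(threads.values())


if __name__ == "__main__":
    raw = [
        "Message-ID: <1@example.com>\nSubject: rebar failure\n\nFirst note",
        "Message-ID: <2@example.com>\nIn-Reply-To: <1@example.com>\n"
        "Subject: Re: rebar failure\n\nReply",
    ]
    parsed = [message_from_string(r, policy=policy.default) for r in raw]
    for thread in group_by_thread(parsed):
        print([m["Subject"] for m in thread])
```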
Output could be one text file per conversation, one PDF file per conversation, or one huge file with page breaks between messages or, preferably, between conversations.
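The "one huge file" option could be sketched roughly as below, assuming the messages have already been downloaded, parsed with Python's email package, and grouped into conversations. The function names and the "----" separator are hypothetical choices, only the plain-text body is kept, and the page break is an ASCII form feed. PDF output would need a third-party library and is not shown.

```python
from email import message_from_string, policy

PAGE_BREAK = "\f"  # ASCII form feed, conventionally treated as a page break


def render_message(msg):
    """Render one message as text: key headers, a blank line, then the body."""
    headers = [
        f"{name}: {msg.get(name, '')}"
        for name in ("From", "To", "Date", "Subject")
    ]
    body_part = msg.get_body(preferencelist=("plain",))
    if body_part is not None:
        body = body_part.get_content()
    else:
        body = "[no plain-text body]"
    return "\n".join(headers) + "\n\n" + body.rstrip() + "\n"


def render_conversations(conversations):
    """Separate messages with a rule and conversations with a page break."""
    rendered = [
        "\n----\n".join(render_message(m) for m in convo)
        for convo in conversations
    ]
    return PAGE_BREAK.join(rendered)


if __name__ == "__main__":
    demo = message_from_string(
        "From: a@example.com\nTo: joe@example.com\n"
        "Subject: rebar failure\n\nSee attached report.\n",
        policy=policy.default,
    )
    print(render_conversations([[demo]]))
```

Writing the returned string to a single file gives the page-break-per-conversation layout; calling render_conversations once per conversation and writing each result separately gives one text file per conversation.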
To me the most obvious way to implement this is as an IMAP client to Gmail. However, given that Gmail already does most of what one would want, I wonder if it wouldn’t make more sense to implement this as a browser automation script that simulates a user’s clicks and keystrokes within a Web browser.
Does this exist already? If not, does it seem like it would make a welcome open-source software product?
Thanks in advance for ideas.
Note: About a year ago, I developed a specification for software to do something fairly similar. This was for a friend’s startup that was going to build databases of corporate email, but the company ended up moving in a different direction: Extracting Conversation Timelines from Email
[Update: I should note that the Gmail web interface already does virtually everything that is required above. All that the Google programmers would need to add is an option to "print everything to PDF" where "everything" is a set of search results, a folder, or all the messages that are selected via a checkbox.]