Personal analytics on my email data
I’ve been playing with the JavaMail library (www.oracle.com), a friendly and clean API that abstracts a range of common email protocols such as POP, SMTP, and IMAP. Connecting to my own Gmail account over IMAP, I downloaded the headers of all 2700 of my sent messages from the last 5 years and loaded them into Mathematica.
There is all manner of stuff one can do with this kind of data, but a fun starting point for my explorations was a punch-card-style visualization that shows my proclivity to send emails at different times of the day and days of the week (inspired by GitHub’s version of the same for commits, github.com).
What’s special about 4am on a Tuesday morning? No idea!
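The binning behind a punch card like this can be sketched in a few lines of Java. This is only an illustration under assumptions, not the Mathematica analysis itself: the class name `PunchCard` and the method `count` are made up for the example, the dates in `main` are invented sample values, and the `Date` headers are assumed to be in plain RFC 1123 form (real headers sometimes carry a trailing comment such as "(PDT)" that would need stripping first).

```java
import java.time.DayOfWeek;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.List;

// Hypothetical helper: bins email Date headers into a day-of-week x hour grid.
class PunchCard {

    // Returns counts[day][hour], where day 0 = Monday ... 6 = Sunday and hour is 0..23.
    // Each timestamp is converted to the given zone before binning.
    static int[][] count(List<String> dateHeaders, ZoneId zone) {
        int[][] counts = new int[7][24];
        for (String header : dateHeaders) {
            ZonedDateTime sent = ZonedDateTime
                    .parse(header, DateTimeFormatter.RFC_1123_DATE_TIME)
                    .withZoneSameInstant(zone);
            counts[sent.getDayOfWeek().getValue() - 1][sent.getHour()]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Invented sample headers, not real data.
        List<String> headers = List.of(
                "Tue, 3 Jun 2008 04:12:08 +0000",
                "Tue, 10 Jun 2008 04:47:30 +0000",
                "Fri, 6 Jun 2008 17:05:00 +0000");
        int[][] counts = count(headers, ZoneId.of("UTC"));
        // Print only the non-empty cells of the grid.
        for (DayOfWeek day : DayOfWeek.values()) {
            for (int hour = 0; hour < 24; hour++) {
                int n = counts[day.getValue() - 1][hour];
                if (n > 0) {
                    System.out.printf("%s %02d:00 -> %d%n", day, hour, n);
                }
            }
        }
    }
}
```

The choice of zone matters: binning in UTC and binning in local time can shift a message across both the hour and the day boundary, so a punch card is only meaningful once every timestamp is converted to one zone.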
Another simple statistic is the volume of emails that I’ve assigned to different Gmail labels. I picked four interesting labels: emails between me and my family, my friends, my university, and work-related emails.
It’s interesting how directly the structure of my life is visible in these email counts. Take a guess when I finished my degree and started working in industry, and when I moved to the United States.
There are loads of other things to try. For example, I have all the header text, so along with message IDs I can infer the thread structure of all my email conversations. How complex are my conversations with different people? Who do I have the most intricate conversations with? The simplest?
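As a rough idea of how thread structure could be inferred, here is a minimal Java sketch. It assumes each reply carries an `In-Reply-To` header naming the `Message-ID` of its parent, which is common but not guaranteed, and that the headers contain no reply cycles. The names `ThreadTree`, `Header`, `childrenOf`, `depth`, and `size` are hypothetical, and the IDs in `main` are invented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: rebuilds reply trees from Message-ID / In-Reply-To pairs.
class ThreadTree {

    // Minimal view of one message's headers; inReplyTo is null for a thread starter.
    static final class Header {
        final String messageId;
        final String inReplyTo;

        Header(String messageId, String inReplyTo) {
            this.messageId = messageId;
            this.inReplyTo = inReplyTo;
        }
    }

    // Maps each parent Message-ID to the Message-IDs of its direct replies.
    static Map<String, List<String>> childrenOf(List<Header> headers) {
        Map<String, List<String>> children = new HashMap<>();
        for (Header h : headers) {
            if (h.inReplyTo != null) {
                children.computeIfAbsent(h.inReplyTo, k -> new ArrayList<>())
                        .add(h.messageId);
            }
        }
        return children;
    }

    // Number of messages on the longest reply chain starting at id (assumes no cycles).
    static int depth(String id, Map<String, List<String>> children) {
        int deepest = 0;
        for (String child : children.getOrDefault(id, List.of())) {
            deepest = Math.max(deepest, depth(child, children));
        }
        return 1 + deepest;
    }

    // Total number of messages in the subtree rooted at id (assumes no cycles).
    static int size(String id, Map<String, List<String>> children) {
        int total = 1;
        for (String child : children.getOrDefault(id, List.of())) {
            total += size(child, children);
        }
        return total;
    }

    public static void main(String[] args) {
        // Invented sample thread: a -> b -> c, plus a second direct reply a -> d.
        List<Header> headers = List.of(
                new Header("<a@example.com>", null),
                new Header("<b@example.com>", "<a@example.com>"),
                new Header("<c@example.com>", "<b@example.com>"),
                new Header("<d@example.com>", "<a@example.com>"));
        Map<String, List<String>> children = childrenOf(headers);
        System.out.println("depth = " + depth("<a@example.com>", children));
        System.out.println("size  = " + size("<a@example.com>", children));
    }
}
```

Depth and size are only two possible measures of how intricate a conversation is; the branching factor (how many direct replies each message attracts) falls out of the same `children` map.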