Any of you have a tool/method/process/reference/idea on extracting header information from large quantities of e-mail messages for statistical analysis?
I need to extract at least from, to, subject, date/time sent, and if possible the MUID.
Any suggestion is appreciated.
P.S. I found some info, but it seems to be only for people who know little to nothing about forensics. It is through Computer Forensics Analysis & Training Center, Inc, an intro of sorts to forensics…
If you mean "extracting from MS Outlook", try MS Access.
2003 & 2007 have a feature to import MAPI folders into an Access table. The caveat is that it needs to be a "mail folder" in Outlook.
No, I would prefer not to use Outlook. The volume makes Outlook choke through MAPI, and as you described it, it only handles a single folder at a time…
I have been looking for a C, Perl, Python, or pretty much any other implementation of the PST format. I am not up to writing it myself from the MS pages…
Have a look at this, not sure but it might be of help.
http//
H
I'm using TextPipe. But at ~100k mails it can be quite complicated to create a rule set that extracts the fields from every single message correctly and suitable for pattern recognition.
Greetings,
Microsoft opened up their PST spec. I've been toying with implementing tools in Python for "doing things" with PST files based on that spec but it isn't a small project so I've not dived into it yet.
-David
Thanks all.
I think TextPipe would not be of value for extracting from PST/MSG, as they are "binary". On the other hand, if I convert the PST to mbox or something similar, it may be a handy tool.
Arrgh! I was hoping something other than code writing would be available. Thanks Harry.
David, I have looked at the docs, and that is what I was planning to do - except I do not have time, nor the resources to read through the documentation.
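For the mbox route mentioned above, here is a rough sketch of what the header extraction could look like once the PST has already been converted by some other tool. It uses only Python's standard library (mailbox, email, csv). The function names header_rows and write_csv, the column names, and the reading of "MUID" as the Message-ID header are assumptions made for illustration; nothing here reads PST or MSG files directly.

```python
import csv
import mailbox
from email.header import decode_header, make_header
from email.utils import getaddresses, parsedate_to_datetime

# Column names are illustrative, not taken from the thread.
FIELDS = ["from", "to", "subject", "date", "message_id"]


def _decode(value):
    """Decode RFC 2047 encoded words; return '' for a missing header."""
    if value is None:
        return ""
    try:
        return str(make_header(decode_header(str(value))))
    except Exception:
        # Malformed encoded words: fall back to the raw header text.
        return str(value)


def _iso_date(value):
    """Return the Date header as ISO 8601, or the raw text if unparsable."""
    if not value:
        return ""
    try:
        return parsedate_to_datetime(str(value)).isoformat()
    except (TypeError, ValueError):
        return str(value)


def header_rows(mbox_path):
    """Yield one dict of header fields per message in an mbox file."""
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            # Collect every To header and keep only the bare addresses.
            to_values = [str(v) for v in msg.get_all("To", [])]
            to_addrs = [addr for _, addr in getaddresses(to_values) if addr]
            yield {
                "from": _decode(msg.get("From")),
                "to": "; ".join(to_addrs),
                "subject": _decode(msg.get("Subject")),
                "date": _iso_date(msg.get("Date")),
                # Assumption: "MUID" in the question means Message-ID.
                "message_id": str(msg.get("Message-ID", "")).strip(),
            }
    finally:
        box.close()


def write_csv(rows, out):
    """Write the extracted rows to an open text file as CSV."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
```

Usage would be along the lines of opening an output file with newline="" and calling write_csv(header_rows("export.mbox"), out), where export.mbox is a placeholder name. The resulting CSV can then go into whatever statistics package you prefer. Because the messages are read one at a time, the volume should matter less than it does through MAPI, though I have not tried it at ~100k mails.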
I'm not sure whether this will help, but try