Spam Analysis and Reputation Project

3. Conclusion

This report discussed two approaches to distinguishing non-spam from spam messages: a server-based approach and a standalone approach. In the server-based approach, the mail server acting as a honeypot did not attract enough spam because it had only recently been set up. The standalone approach was therefore used to analyze messages in existing mailboxes.

The report also covered email source analysis and attachment analysis, including their design, implementation, results, and the observations drawn from them. Mail received from a known source was observed to be likely non-spam. Messages with content type multipart/MIXED or multipart/REPORT were also likely to be non-spam. These metrics, combined with other metrics, can therefore help distinguish non-spam from spam messages and reduce the undesirable classification of non-spam messages as spam.
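
As an illustration only, the two observations above could be turned into simple per-message signals along the following lines. This is a minimal sketch using Python's standard email module, not the project's actual implementation. It assumes that "known source" means a sender address found in an allow-list; the names KNOWN_SOURCES and non_spam_signals, and the example addresses, are hypothetical.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical allow-list of senders treated as "known sources" (assumption).
KNOWN_SOURCES = {"alice@example.org"}

# Content types the report observed to be associated with non-spam.
NON_SPAM_CONTENT_TYPES = {"multipart/mixed", "multipart/report"}


def non_spam_signals(raw_message, known_sources=KNOWN_SOURCES):
    """Return the two non-spam indicators for one raw RFC 822 message."""
    msg = message_from_string(raw_message)
    # parseaddr splits 'Alice <alice@example.org>' into (name, address).
    _, sender = parseaddr(msg.get("From", ""))
    return {
        "known_source": sender.lower() in known_sources,
        # get_content_type() returns the type in lower case.
        "non_spam_content_type": msg.get_content_type() in NON_SPAM_CONTENT_TYPES,
    }


sample = (
    "From: Alice <alice@example.org>\n"
    "Subject: status\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/MIXED; boundary="b"\n'
    "\n"
    "--b\n"
    "Content-Type: text/plain\n"
    "\n"
    "hello\n"
    "--b--\n"
)

signals = non_spam_signals(sample)
```

Neither signal is decisive on its own. As noted above, they are meant to be combined with other metrics, for example as inputs to a scoring scheme that lowers the chance of marking legitimate mail as spam.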


Last updated: 2008-08-19 by Nirav Shah