Hilary Wang
This website represents a selection of projects and research created while attending graduate school at the Pratt Institute School of Information, 2019-2021.


Email Archiving Second-Order Materials

Link: Email Archiving Report

Project Title: Email Archiving Second-Order Materials

Project Description: This research project was my final for INFO 655 Digital Preservation and Curation. It focused on email archiving, specifically on tools and software that detect second-order materials such as email attachments and URLs in the message body. It aimed to consider how the presence of second-order materials impacts the workflow of appraising and processing emails in archives. I researched and tested tools and software that archivists could employ to efficiently appraise, acquire, and process emails. This project was presented and submitted as a final report.

Methods: I first created a collection of ‘control’ emails with various types of attachments and URLs in the message body. This collection served as the submission information package (SIP) for my explorations of tools and software. The majority of the project involved researching and testing industry tools and software used in digital preservation and archives to address email archiving. It also entailed researching how institutions are or aren’t addressing the preservation of and access to second-order materials. Finally, after selecting libratom, Thunderbird, and Python 3, I used each tool to identify attachments and URLs in the control email collection. The final report and presentation highlighted this process, my assessment, and my findings on the success and usability of these tools.
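To illustrate the kind of check the Python 3 step involves, the following is a minimal sketch that uses only the standard library's email package to list attachment filenames and message body URLs for a single message. It is not the script from the report: the function names, the sample message, and the simple URL pattern are all assumptions made for illustration, and a real collection would likely need broader URL matching and handling for mbox or PST sources.

```python
import re
from email import message_from_bytes, policy
from email.message import EmailMessage

# Hypothetical, deliberately simple pattern: http(s) URLs up to whitespace or a quote.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def find_second_order_materials(raw_bytes):
    """Return attachment filenames and body URLs found in one raw email."""
    msg = message_from_bytes(raw_bytes, policy=policy.default)

    # iter_attachments() yields the parts that are not the main message body.
    attachments = [part.get_filename() for part in msg.iter_attachments()]

    # Prefer the plain-text body; fall back to HTML if that is all there is.
    body = msg.get_body(preferencelist=("plain", "html"))
    text = body.get_content() if body is not None else ""

    # Trim trailing sentence punctuation that the pattern may have captured.
    urls = [u.rstrip(".,;:!?)") for u in URL_PATTERN.findall(text)]
    return {"attachments": attachments, "urls": urls}


def build_sample_message():
    """Build a small 'control' email (illustrative data only)."""
    msg = EmailMessage()
    msg["Subject"] = "Control email"
    msg["From"] = "sender@example.org"
    msg["To"] = "archive@example.org"
    msg.set_content("See https://example.org/report for details.")
    msg.add_attachment(
        b"hello",
        maintype="application",
        subtype="octet-stream",
        filename="notes.bin",
    )
    return msg.as_bytes()


if __name__ == "__main__":
    print(find_second_order_materials(build_sample_message()))
```

Running the same function over every message in a control collection would give a quick inventory of second-order materials to compare against what libratom and Thunderbird report.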

My Role: I am the sole author of this work.

Learning Outcome Achieved: Research

Rationale: This project demonstrates my ability to develop a thesis on a specific topic and then conduct goal-oriented research. It was informed by my ability to identify the needs of the users, in this case archivists, and possible barriers to their work in email archiving such as budget, training, and IT support. My research ranged from understanding the technical anatomy of an email, to analyzing white papers and reports published by the Council on Library and Information Resources, to reviewing archival documentation and attending a BitCurator workshop introducing the command-line tool libratom. I then conducted first-hand research by personally using libratom, Thunderbird, and Python 3 to identify second-order materials. My final assessment of each tool was framed by how feasible it would be to incorporate it into an established digital workflow.