From Catalog to Code

Museum catalogs and databases hold a wealth of information about the specimens held in each division’s collections. The Invertebrate Zoology catalog here at the Peabody Museum is no exception. For every specimen housed in the collections, the catalog keeps track of specimen data, from species identification and preparation techniques to the collector and precise collection locality. The Peabody uses EMu, a computer-based collection management system, to store specimen data. Accessing the information associated with a particular specimen, collector or site is as easy as typing a search query and viewing the output.

Although it is fairly easy to search the database today, before computers all catalog information was housed in a series of handwritten books. Every time a new specimen was added to the collection, a string of key descriptors had to be written by hand into a ledger. And every time a specimen needed to be found in the collections or used for research, the specimen number and storage location had to be found by leafing through the pages of the ledger.

Museum specimens are nothing without their accompanying catalog data; the contextual information in the database greatly increases the scientific importance and relevance of these specimens. In today’s data-driven society, even historically description-based sciences, such as many of the fields within natural history, have had to evolve to encompass more quantitative aspects. Although specimen data has always been incredibly valuable, the push for data-driven conclusions drawn from statistics and numbers has further increased the value of catalog information in specimen-based research. Even with today’s increased capabilities and technologies to mine catalog data, the present day is far from the golden age of museums. With the value of collections constantly debated and funding for museums decreasing, collections-based research is in a precarious position.

For this summer internship, Sarah and I have been tasked with elucidating, compiling and sharing the stories of the Invertebrate Zoology collections. Over the years, people working within the division have made serendipitous discoveries of quirky specimens and uncovered stories of the people who worked with them. To complement this approach, various computational tools can also be used to uncover the patterns and stories hidden within the collections.

Details from one of the original ledgers.

It is incredible to consider the evolution of the catalog from the original entries in the late 1800s, handwritten by Yale’s first Professor of Zoology, Addison Verrill, to today’s museum-wide convergence of all its catalogs into the EMu database. Having specimen data in an online database such as EMu makes it incredibly accessible for wide-scale statistical analysis and graphic illustration.

To carry out such analysis, the information must be exported from the database in a “data dump” into a machine-readable spreadsheet format. This file can then be read by a program such as RStudio. In this format, collection-wide patterns can be analyzed, interpreted and illustrated. Such a data file has been created for the specimens of the Invertebrate Zoology collections, and over the past couple of weeks I have dived into the data and begun to explore the collections from this perspective. Prior to this internship I had limited experience using R and RStudio; however, over the past couple of weeks I feel like I have gotten wind under my wings. The program is intuitive and very rewarding to learn, and as a result I have been able to uncover some interesting patterns and points of interest.
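As a rough illustration of the kind of collection-wide summary described above, here is a minimal sketch. It is written in Python rather than R, and it is not the analysis carried out for this internship. The column names (`catalog_number`, `phylum`, `collector`, `year_collected`) and the sample rows are invented placeholders standing in for a real export, whose actual layout is not described in this post.

```python
import csv
import io
from collections import Counter

# Hypothetical stand-in for an exported catalog file.
# Column names and rows are invented for illustration only.
SAMPLE_EXPORT = """catalog_number,phylum,collector,year_collected
IZ-0001,Mollusca,Collector A,1872
IZ-0002,Cnidaria,Collector A,1879
IZ-0003,Mollusca,Collector B,1881
IZ-0004,Arthropoda,Collector C,
IZ-0005,Mollusca,Collector B,1958
"""


def load_records(handle):
    """Read a CSV export into a list of dictionaries, one per specimen."""
    return list(csv.DictReader(handle))


def count_by_decade(records, year_field="year_collected"):
    """Count specimens per collection decade, skipping rows with no usable year."""
    counts = Counter()
    for row in records:
        year = (row.get(year_field) or "").strip()
        if year.isdigit():
            counts[int(year) // 10 * 10] += 1
    return dict(sorted(counts.items()))


def count_by_field(records, field):
    """Count specimens per distinct value of a column, ignoring blanks."""
    return Counter(row[field] for row in records if row.get(field))


records = load_records(io.StringIO(SAMPLE_EXPORT))
decades = count_by_decade(records)
phyla = count_by_field(records, "phylum")

for decade, n in decades.items():
    print(f"{decade}s: {n} specimen(s)")
```

With a real export, the `io.StringIO(SAMPLE_EXPORT)` handle would be replaced by an opened file. Rows with a missing year are skipped rather than guessed at, since old ledger entries are often incomplete.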

So far, this approach to learning more about the collections has not only been informative but has also shown me the potential of computational approaches, both for collections-based research and for understanding biodiversity. Beyond specimens and species, and for this internship in particular, this computational approach has also revealed a lot about the history and development of the Invertebrate Zoology Collection from a Cabinet of Curiosities into the Peabody division that we see today.

Although debugging lines of R on a laptop screen can feel very removed from the physical specimens, as soon as an analysis runs successfully or a graphic comes together, the visual representation of the data powerfully reconnects it with the specimens, the collectors and the stories they have to tell. In my following posts I plan to share these stories.

Original Catalog Ledgers stored in the IZ lab.

