OPenn: Read Me

Welcome to OPenn!

This website contains complete sets of high-resolution archival images of manuscripts from the University of Pennsylvania Libraries and other institutions, along with machine-readable TEI P5 descriptions and technical metadata. All materials on this site are in the public domain or released under Creative Commons licenses as Free Cultural Works. Please see specific collections and documents for applicable license terms.

Licenses and use

All manuscript images and metadata descriptions provided here are released under licenses that Creative Commons has approved for Free Cultural Works.

You are free to download and use the images and metadata on this website under the license assigned to each document. You do not need to apply to holding institutions prior to using the images. We do ask that whenever possible you cite this website and the holding institution when you use any of these resources. See the section below on Citation Style for help with formulating citations. Consult each institution's website for the most accurate information on formulating citations.

To determine the license under which images from each institution have been released, please refer to that institution's repository web page on OPenn.

Unless otherwise stated, all manuscript descriptions and other cataloging metadata are copyrighted by the holding institution and licensed for use under a Creative Commons Attribution License, version 4.0 (CC-BY-4.0).

For a description of the terms of use see the Creative Commons Deed:

Citation style

For each document under copyright, and for all metadata, be sure to indicate that the work is under copyright and cite the corresponding Creative Commons license in accordance with the terms of that license.

The title of an image should include the manuscript number and the folio or binding part of the manuscript.

Here is a sample citation for an image:

Here is a sample citation for a manuscript description:

Please consult each institution's website for authoritative information when citing images from its collection.


Many of the manuscripts on OPenn were digitized through grants and awards from public and private donors. See each repository page or curated collection page for information about the sponsorship and support that made possible the wealth of materials available through OPenn.

Intended audiences

The data on OPenn is intended for aggregators, digital humanists, and scholars who have been directed here to procure high-resolution images of manuscript pages. It is presented in a manner most likely to ensure its long-term digital preservation. Many of the images here are available via more user-friendly page-turning applications on institutional websites. See individual repository pages for details.

Let us know how we're doing

We are most grateful for your feedback. If you find any errors, have suggestions or comments about what we've done well or need to do better, or if you need help using OPenn, please let us know at



The documents on this site represent the richness and breadth of their holding repositories. See each repository's page for more information on its holdings. The images of these documents are accompanied by detailed manuscript descriptions in machine-readable TEI format. Images and TEI manuscript descriptions are added frequently, so check often to see new additions.

We will regularly add new repositories and new documents to bring more OPenn data to the public.

Document images

We present the highest-quality images available. As a general rule, these include a 24-bit archival TIFF image of at least 400 ppi.


Three types of images are delivered for each manuscript element:

  1. a full-sized archival image, typically a TIFF image, 400 ppi or better;
  2. a standard, all-purpose JPEG intended for web use that is 1800 pixels on its longest side; and
  3. a thumbnail JPEG that is 190 pixels on its longest side.
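
The derivative sizes listed above are defined by the longest side only, so the other dimension depends on the shape of the master image. The following is a minimal Python sketch of that arithmetic, not OPenn's actual derivative pipeline. The function name, the master dimensions, and the choice to round the shorter side to the nearest pixel are all assumptions made for illustration.

```python
def fit_longest_side(width: int, height: int, longest: int) -> tuple[int, int]:
    """Scale (width, height) so the longer side equals `longest`.

    The aspect ratio is preserved; the shorter side is rounded to the
    nearest pixel (an assumption, the source does not specify rounding).
    """
    if width <= 0 or height <= 0 or longest <= 0:
        raise ValueError("dimensions must be positive")
    if width >= height:
        return longest, max(1, round(height * longest / width))
    return max(1, round(width * longest / height)), longest


# Sizes stated in the list above.
WEB_LONGEST_SIDE = 1800
THUMB_LONGEST_SIDE = 190

# Hypothetical master image dimensions, for illustration only.
master_width, master_height = 4000, 6000

web_size = fit_longest_side(master_width, master_height, WEB_LONGEST_SIDE)
thumb_size = fit_longest_side(master_width, master_height, THUMB_LONGEST_SIDE)
print(web_size, thumb_size)
```

For a portrait-oriented page the height becomes 1800 or 190 pixels; for a landscape image the width does.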

All TIFF images are master files. JPEG derivative image types are indicated by a "tag." For example,

The file name base (e.g., 0023_0012) has two parts: a four-digit prefix, representing an arbitrarily assigned identifier, like 0023, and a four-digit serial number indicating the document order of each image relative to the others in the set. For example, notice these master image names:

Label information for images can be found on each document's browse page and in the TEI XML manuscript description.
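
The file name base convention described above (a four-digit identifier, an underscore, and a four-digit serial number) lends itself to simple parsing when you want to put downloaded images back into document order. The sketch below is a hedged illustration in Python. The names ImageName and parse_base are hypothetical, the ".tif" extension in the sample names is an assumption, and anything following the base (such as a derivative tag or extension) is simply ignored.

```python
import re
from typing import NamedTuple


class ImageName(NamedTuple):
    prefix: str   # arbitrarily assigned four-digit identifier, e.g. "0023"
    serial: int   # position of the image in document order, e.g. 12


# Four digits, an underscore, four digits, and no further digit after that.
_BASE = re.compile(r"^(\d{4})_(\d{4})(?!\d)")


def parse_base(file_name: str) -> ImageName:
    """Extract the prefix and serial number from a name like 0023_0012."""
    match = _BASE.match(file_name)
    if match is None:
        raise ValueError(f"not an OPenn-style file name base: {file_name!r}")
    return ImageName(match.group(1), int(match.group(2)))


# Hypothetical file names, deliberately out of order.
names = ["0023_0012.tif", "0023_0002.tif", "0023_0007.tif"]
in_document_order = sorted(names, key=parse_base)
print(in_document_order)
```

Sorting on the parsed pair keeps images from the same set together and orders them by serial number.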

Manuscript descriptions

Manuscript cataloging incorporates not only the identification of the author, title, date of origin, and provenance, but also detailed descriptions intended to aid the palaeographer, codicologist, art historian, historian, and philologist. A description of the manuscript cataloging, with technical and non-technical detail, is given in the Technical ReadMe document.


An OPenn repository is a group of documents belonging to a single institutional collection, all having the same metadata format. Each repository is assigned a numeric ID; for example, 0001 is assigned to the Lawrence J. Schoenberg Manuscripts and 0020 to The Walters Art Museum.

The scope of the documents available from an institutional collection is determined by the institution. An institution may have a single repository for all its items or choose to have multiple repositories. For example, the University of Pennsylvania is represented by several repositories, including 'University of Pennsylvania Books & Manuscripts', 'Lawrence J. Schoenberg Manuscripts', and 'Penn Museum Archives'.

Curated Collections

An OPenn curated collection is a group of documents belonging to one or more repositories. A curated collection allows for the grouping of items by topic, theme, or project. Items in a curated collection are not required to have the same metadata format. Curated collections do not have numeric identifiers like the repositories, but are identified by names; for example, 'PACSCL Diaries' and 'Bibliotheca Philadelphiensis'.

Each curated collection page explains why the documents in the collection have been grouped together and gives any other pertinent information about its curation. It lists the repositories from which the documents are drawn, and under each repository it lists the curated documents with links to the human-readable document page, the TEI manuscript description, and the document's location in the OPenn directory structure.

How to use this data set

This data set provides complete digital surrogates of its manuscripts. All manuscripts are provided with machine-readable TEI manuscript descriptions. We provide HTML access to the files, which is easier for the individual user to navigate. HTML and directory structure access are described below.

HTML access

To ease human navigation of the site, a series of web pages is provided. The entry point is the Repositories.html file, which lists all of the repositories currently on the site and provides links to repository-specific manuscript listings (files like 0001.html for the L. J. Schoenberg manuscripts). The amount of detail for each manuscript varies, based on the manuscript's catalog record.
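
As a small illustration of the naming pattern mentioned above, the Python sketch below derives a repository listing file name from a repository ID. It assumes that every repository ID is rendered as four zero-padded digits, as in the examples 0001 and 0020; the function name is hypothetical, and the sketch produces only the file name, not a full address.

```python
def repository_listing_page(repo_id: int) -> str:
    """Return a listing file name such as '0001.html' for a repository ID.

    Assumes IDs are written as four zero-padded digits, following the
    examples given in this document.
    """
    if not 0 <= repo_id <= 9999:
        raise ValueError("repository ID must fit in four digits")
    return f"{repo_id:04d}.html"


print(repository_listing_page(1))    # 0001.html
print(repository_listing_page(20))   # 0020.html
```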

A typical entry for a manuscript will look something like this:

Alternate access methods and other technical stuff

We also provide access via anonymous FTP and anonymous rsync. For information on accessing data from OPenn using those methods and the command-line tool wget, as well as detailed information on OPenn descriptive and structural metadata, please see the Technical ReadMe document.