# HMML Frag 32 Multispectral Imaging ReadMe

**Prepared by R.B. Toth Associates**

## Multispectral Imaging for the Hill Museum & Manuscript Library

**Authors**: Michael B. Toth, William A. Christens-Barry, Cerys Jones

**Date**: 15 May 2020

## 1 HMML Multispectral Imaging

The Hill Museum and Manuscript Library (HMML) data set includes captured and processed image data from the multispectral imaging of an HMML Palimpsest in a Washington DC area imaging laboratory on 11 and 20 September, 2018 by R.B. Toth Associates, in partnership with Equipoise Imaging and Phase One A/S.

The narrowband multispectral imaging system used for this project includes commercial-off-the-shelf hardware and software for digital spectral image capture and viewing with the integrated system. It also includes customized image processing software for processing and exploitation of the spectral images, utilizing techniques from other cultural heritage studies.

The medium-format, high-pixel-count camera takes a series of high-quality digital images, each illuminated by a specific wavelength of light. The resulting image set is then digitally processed and combined to reveal residues and features in the manuscript or book (or artwork) that are not visible to the eye in natural light. These processed images, which are generated from the captured images, clarify and offer new insights to support research into the objects.

### 1.1 Camera System

A Phase One iXG Camera System with a 100 Megapixel Achromatic CMOS sensor with a 72 mm lens produced images of over 700 ppi. The higher resolution CMOS sensor and greater dynamic range allow greater resolution and autofocus for increased efficiency and improved results.

### 1.2 Illumination System

The imaging system provides narrowband illumination with light in specific wavelengths from low heat and low maintenance, long-lifetime light emitting diodes (LEDs).
It includes two integrated illuminators, each with multiple LEDs, providing illumination for imaging in distinct ultraviolet, visible and infrared narrow spectral bands (see wavelengths in "General File Conventions" below). It is integrated with software to allow simplified system operation and unified metadata capture.

### 1.3 Filter System

To capture fluorescence from an object, a 6-position motorized filter wheel contains five 2-inch square optical glass filters, with control software and computer interface. Filtered images can increase the range of captured information to include both fluorescence emissions and UV reflectance. This allows the characteristic spectra of substrate, colorant, and contaminant materials to be more completely determined and analyzed. The filter wheel is driven by computer control with a removable carousel containing a selection of filters (UV bandpass; visible bandpass and longpass filters).

### 1.4 Image Capture Integration

The Spectral XV integrated image capture operating software developed by Equipoise Imaging LLC provides integrated control of the digital camera back, filters and illumination as a single system. This software – based on the CaptureCore application engine developed by Phase One A/S to control camera capture operations and processing workflow – allows streamlined operation and metadata capture from a single interface with simple setup and imaging.

### 1.5 Spectral Imaging Processing

Images are initially processed with ImageJ open-source image processing software and a customized Paleo Toolbox – a spectral imaging toolkit created by Equipoise Imaging LLC, for applications in cultural heritage imaging. The Paleo toolkit comprises plugin modules that integrate into ImageJ, an open source image processing tool originally developed at the US National Institutes of Health. ImageJ has been widely adopted and extended by scientists working in remote sensing, biological science, and cultural heritage world-wide.
It offers a wide range of digital operations for the enhancement and reproduction of non-visible features from the manuscripts and books based on their spectral response in images captured with the full set of illumination wavelengths and emission bands.

## 2 Rights

These images are licensed for free use under the Creative Commons Attribution 4.0 International License (CC BY 4.0). Users are free to copy and redistribute the material in any medium or format, and remix, transform, and build upon the material for any purpose, even commercially, with appropriate credit to HMML. Since Michael B. Toth and Bill Christens-Barry conducted this multispectral imaging and digital processing for HMML on a pro bono basis, we request published images be credited to "HMML, R.B. Toth Associates and Equipoise Imaging".

## 3 HMML Data Set Contents

This data set comprises a _core_ content set of digital images of St. John's University Manuscript Fragment 32 (listed as SJUMsFrag32). The data set contains the following folders:

`README.txt file`: This description of the data set in txt form, providing an orientation to the data and rights management.

`SJUMs_[Filename]`: Data captured from multispectral imaging of the HMML manuscripts and books and converted into TIFF images. The filename is not intended to substitute for the descriptive metadata in the JSON file.

`Flattened` (if applicable): Converted images that have been processed with reference "flats" images to balance illumination and other imaging artifacts.

`Processed`: Digitally processed images from the captured multispectral images of the HMML Palimpsests taken with the 100 MP camera and integrated illumination system. These may be in output folders or folders with the prefix `PROC-`.
The directory structure, starting from the root, is as follows:

    |-- Data
    |   |-- Frag32r
    |   |   |-- Frag32rXRF
    |   |   `-- Processed
    |   `-- Frag32v
    |       |-- Frag32vXRF
    |       `-- Processed
    |-- manifest-sha1.txt
    |-- ReadMe_Multispectral.html
    |-- ReadMe_Multispectral.txt
    |-- ReadMe_XRF.html
    `-- ReadMe_XRF.txt

### 3.1 Core Data

For each manuscript side, the data set provides sequences or stacks of captured and registered images converted to TIFF and JPEG thumbnail images with metadata. These images should be retained as archival images and will be easiest to read with most image viewers. Images are captured in IIQ format as working images and then converted to TIFF, because IIQ is a proprietary format that can only be viewed with Phase One's Capture One software.

The data set includes:

1. Multispectral images captured using Spectral XV and converted from .IIQ format to 16-bit .TIF format by use of Capture One Software. Converted images have `_R` at the end of the rootname.
2. Reference "flats" images used to calibrate the light levels across the image.
3. (If applicable) Subject images flattened using reference flats images; flattened images have the string `_F` at the end of the rootname.

The core data include:

- Captured image data consisting of captured IIQ image files and those converted to TIFF. These are individual images from each of the imaging systems taken with different energy levels.
- Computer processed images. These images have been digitally produced through the application of computer algorithms that combine and enhance captured images to improve the visibility of manuscript artifacts and text. All processed images are TIFF images or AVI video clips of a series of processed images.

Metadata is included in associated JSON files for multispectral images.
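The root of the data set includes a `manifest-sha1.txt` file. The following is a minimal sketch of how such a manifest could be checked after download. It assumes, without confirmation from this ReadMe, that each manifest line has the form `<sha1 hex digest> <relative path>`; the function names `sha1_of` and `verify_manifest` are hypothetical and not part of any delivered tool.

```python
import hashlib
from pathlib import Path


def sha1_of(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(root, manifest_name="manifest-sha1.txt"):
    """Compare every file listed in the manifest against its recorded checksum.

    ASSUMPTION: each non-empty line is '<sha1 hex digest> <relative path>'.
    Returns a list of (relative_path, problem) tuples; an empty list means
    every listed file was found and matched.
    """
    root = Path(root)
    problems = []
    manifest_text = (root / manifest_name).read_text(encoding="utf-8")
    for line in manifest_text.splitlines():
        if not line.strip():
            continue
        expected, relative_path = line.split(maxsplit=1)
        relative_path = relative_path.strip()
        target = root / relative_path
        if not target.is_file():
            problems.append((relative_path, "missing"))
        elif sha1_of(target) != expected.lower():
            problems.append((relative_path, "checksum mismatch"))
    return problems
```

Calling `verify_manifest` with the path to the data set root returns an empty list when all listed files are present and unchanged. If the actual manifest uses a different line layout, the `line.split` step would need to be adapted.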
Each multispectral capture image folder is provided with descriptive metadata in the JSON file giving details of the image capture for the project, scene and sequence and processing methods used to generate integrated images from the various captured images. This includes basic Archimedes Palimpsest Metadata Standard metadata, such as:

    {
      "Project": {
        "ProjectID": "100001",
        "Name": "HMML",
        "Rights": "CC4.0-BY",
        "Publisher": "R.B. Toth Associates",
        "ProjectNickName": "HMML",
        "Creator": "M.B. Toth, W.A. Christens-Barry",
        "Contributors": "Cerys Jones, Columba Stewart, HMML, Equipoise Imaging, R.B. Toth Associates, Phase One A/S",
        "Description": "Pro Bono Multispectral imaging of St. John’s University Manuscript Fragment 32 - SJUMsFrag32",

## 4 General File Conventions

The file names of the unflattened and flattened captured images include six fields plus an extension. The initial three fields match the short forms of the project name, scene name, and sequence name. The first and second fields are delimited by `_`, and the second and third fields are delimited by `-`. The fourth field consists of a three-digit number, indicating the illumination wavelength (in nm), plus a single letter identifier for the camera filter. This file naming convention supports automated processing of the captured images using delimited information to define needed parameters used by the processing tools and algorithms.

The illumination or illuminations used to produce each image cited in the filename of the flattened images include multiple illumination types.
The illumination symbol is one of the following symbols, or a combination of symbols for processed images:

- 365 - 365 nm UV LED illumination
- 385 - 385 nm UV LED illumination
- 405 - 405 nm borderline UV-Visible LED illumination
- 420 - 420 nm visible LED illumination
- 445 - 445 nm visible LED illumination
- 475 - 475 nm visible LED illumination
- 505 - 505 nm visible LED illumination
- 535 - 535 nm visible LED illumination
- 590 - 590 nm visible LED illumination
- 635 - 635 nm visible LED illumination
- 660 - 660 nm visible LED illumination
- 0700 - 700 nm LED IR illumination
- 0735 - 735 nm LED IR illumination
- 0780 - 780 nm LED IR illumination
- 0870 - 870 nm LED IR illumination
- 0940 - 940 nm LED IR illumination

The camera filter identifier is one of the following letters:

- N = CLEAR (no filter)
- U = BP365 (UV bandpass filter)
- V = LP400 (long pass filter that passes wavelengths longer than 400 nm - blue and above)
- G = LP515 (long pass filter that passes wavelengths longer than 515 nm - green and above)
- R = LP590 (long pass filter that passes wavelengths longer than 590 nm - red and above)
- I = LP715 (long pass filter that passes wavelengths longer than 715 nm - IR and above)

Examples for captured images are:

    Unflattened: project_scene-sequence-<wavelength and filter>_<index number>_R.tif
    Flattened: project_scene-sequence-<wavelength and filter>_<index number>_F.tif

Processed images amend this naming convention to indicate the type of processing employed. The initials of the individual who created the processed images are (optionally) given in the fourth field of the filename of processed files. Since processing operations most often utilize all of the captured images of a sequence, identification of the individual images used as inputs for processing operations is generally omitted. One or more following, underscore-delimited fields describe the processing operations and parameters that were used, appended in order of their application.
Within an underscore-delimited field, single hyphens are used to delimit parameter values or image indices used during that processing operation. Usually the parameters refer to the index number of a component image. A typical filename exemplifies the naming practices used for processed images:

    HMML_[ManuscriptName]-[Sequencename]__MBT__PCA_pc05-pc06-pc07.tif

- Project name: HMML
- Scene name: [ManuscriptName]
- Sequence name: [Sequencename]
- _Creator: Michael B. Toth (this field is sometimes not used)_
- Processing:
  1. Principal Components Analysis (PCA)
  2. PCA components 05, 06, and 07 were used in the R, G, and B channels, respectively, of the final (synthetic) RGB TIFF image

In some cases, two rounds of PCA processing were performed. Selected components from the first round of PCA processing were used during the second round of processing. In these cases, the string `PCAx2` is followed by the `-` delimited indices of the first round components that were used in the second round. For example, the file name:

    HMML_[ManuscriptName]-[Sequencename]__WCB__PCAx2-05-08-12-15_pc02-pc03--pc06_RGB.tif

This indicates that William Christens-Barry used components 2, 5, 8, 12, and 15 from the first PCA round in the second round, and used components 02, 03, and 09 from the second round of processing in the red, green, and blue channels, respectively, of the final RGB TIFF image. Note that the use of `--` as a delimiter indicates that a range of component images was used, e.g. `3--6` would indicate that components 3, 4, 5, and 6 were used.

Please note that the practice of including a leading `0` is not followed consistently, and that the `pc` prefix in front of a principal component used in an RGB channel may be omitted to avoid excessively long filenames.
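The processed-image naming rules above can be sketched in code. This is a minimal illustration, not the project's own tooling: the function names `expand_components` and `parse_processed_name` are hypothetical, and the parser assumes the three-part layout of the examples shown here (name fields, creator initials, and operations, separated by double underscores). Files that omit the creator field would need different handling.

```python
import re


def expand_components(field):
    """Expand a hyphen-delimited component field into a list of indices.

    A single '-' separates individual indices; '--' marks an inclusive
    range, so '3--6' yields [3, 4, 5, 6]. An optional 'pc' prefix and
    leading zeros are ignored.
    """
    # The capture group keeps the delimiters; '--' is tried before '-'.
    tokens = re.split(r"(--|-)", field)
    indices = []
    for position in range(0, len(tokens), 2):
        token = tokens[position]
        number = int(token[2:] if token.startswith("pc") else token)
        delimiter = tokens[position - 1] if position > 0 else None
        if delimiter == "--":
            # Fill in the values between the previous index and this one.
            indices.extend(range(indices[-1] + 1, number + 1))
        else:
            indices.append(number)
    return indices


def parse_processed_name(filename):
    """Split a processed-image file name into its descriptive fields.

    ASSUMPTION: the name has the form
    project_scene-sequence__creator__operation[_operation...].extension
    """
    stem = filename.rsplit(".", 1)[0]
    head, creator, operations = stem.split("__")
    project, rest = head.split("_", 1)
    scene, sequence = rest.split("-", 1)
    return {
        "project": project,
        "scene": scene,
        "sequence": sequence,
        "creator": creator,
        "operations": operations.split("_"),
    }
```

For the first example file name, `parse_processed_name` gives the creator `MBT` and the operations `PCA` and `pc05-pc06-pc07`, and `expand_components("pc05-pc06-pc07")` gives `[5, 6, 7]`. Because the leading `0` and `pc` prefix are not used consistently, the sketch accepts indices with or without them.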
Other strings in processed file names include:

- `dS8_BasicRGB` an RGB image has been synthesized from flattened images and desaturated computationally by a factor of x0.8;
- `8gs` a single channel (grayscale) image stored in 8-bit format;
- `desat` an RGB image has been synthesized from flattened images and desaturated visually by an unspecified factor;
- `Combi` multiple grayscale images (captured or processed) were used combinatorially to create many different synthetic RGB images. The resulting files are very large, and are frequently stored in the "AVI" movie format for viewing and selection of the best images.

The remainder of the file name, including the extension, indicates the file type:

1. TIFF still image files, ending in `tif`
2. JPEG still image files, ending in `jpg`
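The captured-image naming convention described earlier in this section lends itself to automated parsing. The following is a minimal sketch under stated assumptions: the regular expression, the function name `parse_captured_name`, and the sample file name in the usage note are hypothetical, the index number is assumed to be purely numeric, and scene and sequence names are assumed not to contain hyphens.

```python
import re

# Filter letters as listed in this section (description abbreviated).
FILTERS = {
    "N": "no filter",
    "U": "UV bandpass",
    "V": "long pass, longer than 400 nm",
    "G": "long pass, longer than 515 nm",
    "R": "long pass, longer than 590 nm",
    "I": "long pass, longer than 715 nm",
}

# project_scene-sequence-<wavelength and filter>_<index number>_<R or F>.tif
# The wavelength is three digits, or four with a leading 0 for the IR bands.
CAPTURED_NAME = re.compile(
    r"^(?P<project>[^_]+)_(?P<scene>[^-]+)-(?P<sequence>[^-]+)-"
    r"(?P<wavelength>\d{3,4})(?P<filter>[NUVGRI])_"
    r"(?P<index>\d+)_(?P<state>[RF])\.tif$"
)


def parse_captured_name(filename):
    """Return the fields encoded in a captured-image file name as a dict."""
    match = CAPTURED_NAME.match(filename)
    if match is None:
        raise ValueError(f"not a captured-image file name: {filename!r}")
    fields = match.groupdict()
    return {
        "project": fields["project"],
        "scene": fields["scene"],
        "sequence": fields["sequence"],
        "wavelength_nm": int(fields["wavelength"]),  # int() drops the leading 0
        "filter": fields["filter"],
        "filter_description": FILTERS[fields["filter"]],
        "index": int(fields["index"]),
        "flattened": fields["state"] == "F",  # _F flattened, _R unflattened
    }
```

For a hypothetical name such as `HMML_Frag32r-seq1-0940I_016_F.tif`, the sketch reports a wavelength of 940 nm, filter `I`, index 16, and a flattened image. Names that do not follow the pattern raise a `ValueError` rather than being guessed at.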