I mentioned to a colleague that I would be happy to do a short writeup of some of the interfaces that we have for our digital preservation system. This post is trying to move forward that conversation a bit.
System 1, System 2
At UNT we manage our digital objects in a consistent and unified way. In practice this means there is one way to do everything: items are digitized, collected, or created, then staged for ingest into the repository, and everything moves into the system in the same way. We use two software stacks to manage our digital items, Aubrey and Coda.
Aubrey is our front-end interface, which provides end-user access to resources through search, browsing, and display. For managers it provides a framework for defining collections and partners and, most importantly, for creating and managing metadata for the digital objects. Nearly all (99.9%) of the daily interaction with the UNT Libraries Digital Collections happens through Aubrey, by way of one of its front-end user interfaces: The Portal to Texas History, the UNT Digital Library, or The Gateway to Oklahoma History.
Aubrey manages the presentation versions of a digital object. Locally we refer to this package of files as an Access Content Package, or ACP. The other system in this pair is Coda. Coda is responsible for managing the Archival Information Packages (AIPs) in our infrastructure. It was designed to manage a collection of BagIt Bags, help with the replication of these Bags, and allow curators and managers to access the master digital objects if needed.
What does it look like though?
The conversation I had with a colleague was about user interfaces to the preservation archive: how much or how little we provide, and our general thinking about that system's user interfaces. Typically these interfaces are "back-end" and are rarely seen by a larger audience because of layers of authentication and restriction. I wanted to take a few screenshots and talk about some of the interactions that users have with these systems.
The primary views for the system include a dashboard view which gives you an overview of the happenings within the Coda Repository.
From this page you can navigate to lists for the various sub-areas within the repository. If you want to view a list of all of the Bags in the system you are able to get there by clicking on the Bags tile.
The storage nodes that are currently registered with the system are available via the Nodes button. This view is especially helpful in gauging the available storage resources and deciding which storage node to write new objects to. Typically we use one storage node until it is completely filled and then move onto another storage node.
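The "fill one node, then move on" policy described above can be sketched in a few lines. This is only an illustration, not how Coda itself chooses a node: the `Node` class, its field names, and the `pick_node` helper are all hypothetical, and the sketch assumes nodes are considered in the order they were registered.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    """Hypothetical record for a storage node (not Coda's actual model)."""
    name: str
    capacity_bytes: int
    used_bytes: int

    @property
    def free_bytes(self) -> int:
        return self.capacity_bytes - self.used_bytes


def pick_node(nodes: Sequence[Node], bag_size: int) -> Optional[Node]:
    """Return the first node, in the given order, with room for the Bag.

    Walking the list in order keeps writes on one node until it can no
    longer hold a new Bag, and only then moves on to the next one.
    """
    for node in nodes:
        if node.free_bytes >= bag_size:
            return node
    return None  # no registered node has enough free space
```

A curator-facing version of this is simply reading the Nodes view and picking the first node that still has room.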
For events in the Coda system, including ingest, replication, migration, and fixity check, we create and store a PREMIS Event. These are aggregated using the PREMIS Event Service.
The primary Coda instance is considered the Coda instance of record, and additional Coda instances poll the primary for new items to replicate. Replication relies on ResourceSync to broadcast available resources and their constituent files. Because the primary Coda system does not have queued items, this list is empty.
To manage information about what piece of software is responsible for an event on an object we have a simple interface to list PREMIS Agents that are known to the system.
With the primary views out of the way, the next level of screens is the detail views. Most of the previous screens lead to a detail view once you've clicked on a link.
Below is the detail view of a Bag in the Coda system. You will see the parsed bag-info.txt fields as well as the PREMIS Events that are associated with this resource. The buttons at the top take you to a list of URLs that, when downloaded, will reconstitute a given Bag of content, and to the Atom Feed for the object.
Here is a URLs list. If you download all of these files and keep the hierarchy of the folders, you can validate the Bag and have a validated version of the item plus additional metadata. This is effectively the Dissemination Information Package for the system.
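For readers who have not looked inside a Bag, bag-info.txt is a plain-text file of "Label: value" lines, where an indented line continues the previous value and a label may repeat. Here is a minimal sketch of parsing it. The `parse_bag_info` function is my own illustration and not the code Coda uses, and the sample field values in the usage note are made up.

```python
from typing import Dict, List


def parse_bag_info(text: str) -> Dict[str, List[str]]:
    """Parse bag-info.txt content into {label: [values...]}.

    Values are kept in a list because a label may appear more than once.
    Lines starting with a space or tab continue the previous value.
    """
    fields: Dict[str, List[str]] = {}
    last_label = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line[0] in " \t" and last_label is not None:
            # Continuation line: append to the most recent value.
            fields[last_label][-1] += " " + line.strip()
            continue
        label, sep, value = line.partition(":")
        if not sep:
            continue  # not a "Label: value" line; ignore it in this sketch
        last_label = label.strip()
        fields.setdefault(last_label, []).append(value.strip())
    return fields
```

Calling `parse_bag_info("Bagging-Date: 2015-01-01\nPayload-Oxum: 10.2\n")` gives a dictionary with one value under each of the two labels.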
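Once the files are downloaded with their folder hierarchy intact, checking them against the Bag's payload manifest is straightforward. The sketch below assumes the Bag carries a `manifest-md5.txt` (the algorithm actually used in our Bags is not stated here, so treat the default as a placeholder) and it only checks the payload manifest. A complete BagIt validation also covers things like tag manifests, for which a dedicated BagIt library is the better tool. The `verify_manifest` name is hypothetical.

```python
import hashlib
from pathlib import Path
from typing import List


def verify_manifest(bag_dir: Path, manifest_name: str = "manifest-md5.txt") -> List[str]:
    """Return the payload paths that are missing or fail their checksum.

    Each manifest line has the form "<checksum> <relative/path>".
    The hash algorithm is taken from the manifest file name.
    """
    algorithm = manifest_name[len("manifest-"):-len(".txt")]
    failures: List[str] = []
    manifest = (bag_dir / manifest_name).read_text(encoding="utf-8")
    for line in manifest.splitlines():
        if not line.strip():
            continue
        expected, rel_path = line.split(None, 1)
        rel_path = rel_path.strip()
        target = bag_dir / rel_path
        if not target.is_file():
            failures.append(rel_path)  # listed in the manifest but not on disk
            continue
        digest = hashlib.new(algorithm)
        with target.open("rb") as handle:
            # Read in 1 MiB chunks so large master files are not loaded whole.
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        if digest.hexdigest() != expected.lower():
            failures.append(rel_path)
    return failures
```

An empty list back from `verify_manifest(Path("path/to/bag"))` means every file named in the manifest is present and matches its checksum.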
An Atom Feed is created for each document as well which can be used by the AtomPub interface for the system. Or just to look at and bask in the glory of angle brackets.
Below is the detail view of a PREMIS Event in the repository. You can view the Atom Feed for this document or navigate to the Bag in the system that is associated with this event.
Below is the detail view of a storage node in the system. Each node record is updated to reflect that node's current storage statistics.
The detail view of a PREMIS Agent is not too exciting but is included for completeness.
Interacting with Coda
When there is a request for the master/archival/preservation files for a given resource, we find the local identifier for the resource and run a quick search for it in the Coda repository.
You will end up with search results for one or more Bags in the repository. If there is more than one for that identifier, select the one you want (based on the date, size, or number of files) and go grab the files.
The following screens show some of the statistics views for the system. They include the Bags added per month and over time, number of files added per month and over time, and finally the number of bytes added per month and over time.
There are a few things missing from this system that one might notice. First of all is the process of authentication to the system. At this time the system is restricted to a small list of IPs in the library that have access to the system. We are toying around with how we want to handle this access as we begin to have more and more users of the system and direct IP based authentication becomes a bit unwieldy.
Secondly, there is a full set of AtomPub interfaces for each of the Bag, Node, PREMIS Event, PREMIS Agent, and Queue sections. This is how new items are added to the system, but that is a little bit out of scope for this post.
If you have any specific questions for me, let me know on Twitter.