Metadata Edit Events: Part 1 – When

This is going to be another multi-post series as I wade through some of the data we have been collecting for the past year related to metadata editing and various events within a metadata record’s lifecycle.

Background

For the past few years the UNT Libraries has been collecting data about how long our metadata editors spend editing records in our systems.  We’ve written about the overall change of metadata in our digital library and presented those findings at last year’s Dublin Core Metadata Initiative conference in Austin, Texas, in a paper called “How Descriptive Metadata Changes in the UNT Libraries’ Collection: A Case Study”. The goal of collecting data about metadata change is to give us a better idea of how our metadata editors interact with our systems.

What is an edit event?

Our metadata system creates a log entry when a user opens a record to begin editing.  This log entry acts as the start of a timer for the given edit session of that specific record by a given user.  When the user publishes that metadata record back into the system, the log entry is queried and the amount of time that has passed is recorded along with the metadata editor’s username, the identifier for the record, and the state (hidden or unhidden) the record is in when it is saved.  This information is submitted to the Metadata Event Service and logged.
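
To make the timer idea concrete, here is a minimal sketch of how an open/publish pair could be turned into an edit event. This is not the code our system runs: the function names, the in-memory dictionary standing in for the log, and the choice of whole seconds as the unit for duration are all assumptions made for illustration.

```python
from datetime import datetime, timezone

# Hypothetical in-memory stand-in for the log of open edit sessions,
# keyed by (username, record_id).
open_sessions = {}


def open_record(username, record_id, now=None):
    """Log the moment a user opens a record for editing (starts the timer)."""
    open_sessions[(username, record_id)] = now or datetime.now(timezone.utc)


def publish_record(username, record_id, hidden, now=None):
    """Look up the log entry and build the edit event for this publish."""
    started = open_sessions.pop((username, record_id))
    finished = now or datetime.now(timezone.utc)
    return {
        # Assumption: the event is stamped with the publish time.
        "event_date": finished.isoformat(timespec="seconds"),
        # Assumption: duration is stored as whole seconds.
        "duration": int((finished - started).total_seconds()),
        "username": username,
        "record_id": record_id,
        "hidden": hidden,
    }
```

The `now` argument exists only so the sketch can be exercised with fixed timestamps instead of the system clock.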

An edit event ends up looking like this once it has been created:

id: 73515
event_date: 2014-01-04T22:57:00
duration: 24
username: mphillips
record_id: ark:/67531/metadc265646
record status: 1
record status change: 0
record quality: 1
record quality change: 0

With this information we are able to create a number of views into the metadata editing workflow in our environment.  We can easily see the number of metadata edits on a given day, within a month, and for the entire period we’ve been collecting data.  We can view the total number of edits, the number of unique records edited, and finally the number of hours that our users have spent editing records within a given period.
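
The three headline numbers are simple aggregates over the logged events. The sketch below shows one way they could be computed. The sample events are made up, the record identifiers are placeholders, and treating duration as seconds is an assumption.

```python
def summarize(events):
    """Return total edits, unique records edited, and hours spent editing."""
    # Assumption: each event's duration is a number of seconds.
    total_seconds = sum(event["duration"] for event in events)
    return {
        "edits": len(events),
        "unique_records": len({event["record_id"] for event in events}),
        "hours": total_seconds / 3600,
    }


# Made-up sample: three edits touching two placeholder records.
sample_events = [
    {"record_id": "record-a", "duration": 1800},
    {"record_id": "record-a", "duration": 900},
    {"record_id": "record-b", "duration": 900},
]
summary = summarize(sample_events)
```

Filtering the event list to a day, month, or year before calling `summarize` would give the per-period views.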

Below are a few screenshots from our Edit Event Service web interface.

Homepage for the UNT Libraries Edit Event Service

Daily View for the UNT Libraries Edit Event Service

Monthly View for the UNT Libraries Edit Event Service

Yearly View for the UNT Libraries Edit Event Service

User Detail View for the UNT Libraries Edit Event Service

We are able to query a given day, month, or year to view statistics, as well as show the rankings and information for a specific user or digital object in the system.

Analyzing a year of data.

We were interested in taking a deeper look at the metadata edit events, and that is what the following posts in this series will cover.  A year’s worth of metadata edit data was extracted from the event service.  This was paired with two other datasets.  The first is descriptive metadata about the items edited, including the contributing institution, collection, resource type and format fields.  The second is a classification of each user in the dataset by their status as either a UNT-Employee or Non-UNT-Employee, and by their rank as either Librarian, Staff, Student, or Unknown.  These datasets were merged to form a complete record for each metadata event in the Edit Events Dataset.  The merged records were added to a Solr index that was used in analyzing this data.
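
The merge step amounts to two lookups per event, one by record and one by user. Here is a rough sketch of that idea. The field names, the function name, and the sample values are all hypothetical, and this is not the script that built the actual dataset.

```python
def merge_event(event, item_metadata, user_info):
    """Combine one edit event with its item and user lookups into one flat record."""
    merged = dict(event)
    # Descriptive metadata for the edited item, keyed by record identifier.
    merged.update(item_metadata.get(event["record_id"], {}))
    # Status and rank for the editor; users we could not classify get rank "Unknown".
    merged.update(user_info.get(event["username"], {"status": None, "rank": "Unknown"}))
    return merged


# Hypothetical lookup tables with placeholder values.
items = {"record-a": {"institution": "Example Institution", "resource_type": "text"}}
users = {"editor1": {"status": "UNT-Employee", "rank": "Staff"}}

known = merge_event({"record_id": "record-a", "username": "editor1", "duration": 24}, items, users)
unknown = merge_event({"record_id": "record-a", "username": "editor2", "duration": 60}, items, users)
```

Each merged dictionary is flat, which makes it straightforward to send to a search index as a single document.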

A total of 94,222 edit events occurred from January 1, 2014 to December 31, 2014 and are the base dataset for the analysis presented here.

Month, Day, Hour

During 2014 we averaged 7,852 metadata edits per month, ranging from a low of 5,082 in February to a high of 12,840 in October.

January 10,133
February 5,082
March 5,960
April 5,543
May 6,622
June 5,136
July 8,099
August 10,508
September 10,989
October 12,840
November 7,712
December 5,598
Monthly Metadata Edit Events for the University of North Texas
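
As a quick check on the arithmetic, the monthly figures in the table can be summed and averaged in a few lines. The numbers are copied from the table; the variable names are just for illustration.

```python
monthly_edits = {
    "January": 10133, "February": 5082, "March": 5960, "April": 5543,
    "May": 6622, "June": 5136, "July": 8099, "August": 10508,
    "September": 10989, "October": 12840, "November": 7712, "December": 5598,
}

total = sum(monthly_edits.values())
average = total / len(monthly_edits)
busiest = max(monthly_edits, key=monthly_edits.get)
quietest = min(monthly_edits, key=monthly_edits.get)

print(total, round(average), busiest, quietest)
# 94222 7852 October February
```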

Looking at the day of the week that metadata edits occurred shows the expected pattern of the majority of metadata editing activities taking place during the week with fewer happening on the weekend.  The breakdown by day of the week is presented in the table below.

Sunday Monday Tuesday Wednesday Thursday Friday Saturday
2,765 17,506 19,580 16,876 20,838 14,416 2,241
Metadata Edit Events for the University of North Texas by weekday

The hour of the day that metadata is edited is interesting to look at.  For the most part, the majority of editing is done during the working day, with the afternoons being the time of day that most records are edited.  The full data is presented below.

Hour Edit Events
0:00 237
1:00 77
2:00 58
3:00 41
4:00 19
5:00 86
6:00 290
7:00 601
8:00 1,836
9:00 6,189
10:00 8,948
11:00 8,868
12:00 8,134
13:00 10,760
14:00 11,653
15:00 11,184
16:00 9,114
17:00 4,868
18:00 3,564
19:00 2,439
20:00 1,947
21:00 1,787
22:00 937
23:00 585
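
To put a number on the afternoon pattern, the hourly counts above can be compared directly. The sketch below uses the figures from the table. Treating 8:00 to 11:59 as "morning" and 12:00 to 16:59 as "afternoon" is my own assumption for this comparison, not a definition used elsewhere in the analysis.

```python
# Edit events per hour of day, index 0 = 0:00 ... index 23 = 23:00.
hourly_edits = [
    237, 77, 58, 41, 19, 86, 290, 601, 1836, 6189, 8948, 8868,
    8134, 10760, 11653, 11184, 9114, 4868, 3564, 2439, 1947, 1787, 937, 585,
]


def share(start_hour, end_hour):
    """Fraction of all edits made from start_hour up to (not including) end_hour."""
    return sum(hourly_edits[start_hour:end_hour]) / sum(hourly_edits)


peak_hour = max(range(24), key=hourly_edits.__getitem__)
morning = share(8, 12)     # assumed window: 8:00 to 11:59
afternoon = share(12, 17)  # assumed window: 12:00 to 16:59

print(peak_hour, f"{morning:.0%}", f"{afternoon:.0%}")
# 14 27% 54%
```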

Presented as a graph you can easily see the swell of metadata editing in the afternoons.

Metadata Edit Events for the University of North Texas by hour of the day

If you combine the day of the week and hour of the day data into a single table you will get something like this.

94,222 edit events plotted to the time and day they were performed

In the image above, green represents lower numbers of edits and red represents higher numbers of edits.  It shows that Thursday afternoons tend to be very busy, while Friday is much lighter compared to the other weekdays.
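
A grid like this can be built by bucketing each event's timestamp by weekday and hour. The following is a minimal sketch of that bucketing, assuming timestamps in the same ISO format as the event_date field shown earlier. The function name is hypothetical, and this is not the code that produced the image.

```python
from datetime import datetime


def weekday_hour_grid(event_dates):
    """Count events into a 7 x 24 grid: rows Sunday..Saturday, columns hour 0..23."""
    grid = [[0] * 24 for _ in range(7)]
    for stamp in event_dates:
        moment = datetime.fromisoformat(stamp)
        # datetime.weekday() is Monday=0 ... Sunday=6; shift so Sunday=0.
        row = (moment.weekday() + 1) % 7
        grid[row][moment.hour] += 1
    return grid


# The first timestamp is the sample event from earlier (a Saturday evening);
# the second is a made-up Sunday morning event.
grid = weekday_hour_grid(["2014-01-04T22:57:00", "2014-01-05T09:00:00"])
```

Coloring each cell by its count, from green for low to red for high, gives the kind of heatmap shown above.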

That’s it for the first post in this series.  I plan to cover Who is editing records, What records they are editing, and then finally How Much time we are spending on metadata editing.  Check back for future posts.

As always, feel free to contact me via Twitter if you have questions or comments.