Metadata Quality Interfaces: Element Count Dashboard

Next up in our review of the new metadata quality interfaces we have implemented this summer is our Element Count Dashboard.

The basics of this are that whenever we index metadata records in our Solr index we go ahead and count the number of instances of a given element, or a given element with a specific qualifier and store those away in the index.  This results in hundreds of fields that are the counts of element instances in those fields.

We built an interface on top of these counts because we had a hunch that we would be able to use this information to help us identify problems in our metadata records.  It feels like I’m showing some things in our metadata that we probably don’t want to really highlight but it is all for helping others understand.  So onward!

Element Count Dashboard

The dashboard is similar to other dashboards in the Edit system.  You have the ability to limit your view to just the collection, partner or system you are interested in working with.

Count Dashboard

From there you can select an element you are interested in viewing counts for.  In the example below I am interested in looking at the Description element or field.

Select an Element to View Counts

Once your selection is made you are presented with the number of instances of the description field in a record.  This is a little more helpful if you know that in our metadata world, a nice clean record will generally have two description fields.  One for a content description and one for a physical description of the item. More than two is usually strange and less than one is usually bad.

Counts for Description Elements

To get a clearer view you can see the detail below.  This again is for the top level Description element where we like to have two descriptions.

Detail of Description Counts

You can also limit to a qualifier specifically.  In the example below you see the counts of Description elements with a content qualifier.  The 1,667 records that have two Description elements with a content qualifier are pretty strange.  We should probably fix those.

Detail of Description Counts for Content Qualifier

Next we limit to just the physical description qualifier. You will see that there are a bunch that don’t have any sort of physical description and then 76 that have two. We should fix both of those record sets.

Detail of Description Counts for Physical Qualifier

Because of the way that we index things we can also get at the Description elements that don’t have either a content or physical qualifier selected.  These are identified with a value of none for the qualifier.  You can see that there are 1,861,356 records that have zero Description elements with a none qualifier.  That’s awesome.  You can also see 52 that have one element and 261 that have two elements that are missing qualifiers.  That’s not awesome.

Detail of Description Counts for None Qualifier

I’m hoping you are starting to see how this kind of interface could be useful to drill into records that might look a little strange.  When you identify something strange all you have to do is click on the number and you are taken directly to the records that match what you’ve asked for.  In the example below we are seeing all 76 of the records that have two physical descriptions because this is something we are interested in correcting.

Records with Multiple Duplicate Physical Qualifiers

If you open up a record to edit you will see that yes, in fact there are two Physical Descriptions in this record. It looks like the first one should actually be a Content Description.

Example of two physical descriptions that need to be fixed

Once we change that value we can hit the Publish button and be on our way fixing other metadata records.  The counts will update about thirty seconds later to reflect the corrections that you have made.

Fixed Physical and Content Descriptions

Even more of a good thing.

Because I think this is a little different than other interfaces you might be used to, it might be good to see another example.

This time we are looking at the Creator element in the Element Count Dashboard.

Creator Counts

You will see that there are 112 different counts from zero way up into way way too many creators on an item (silly physics articles).

I was curious to see what the counts looked like for Creator elements that were missing a role qualifier.  These are identified by selecting the none value from the qualifier dropdown.

Creator Counts for Missing Qualifiers

You can see that the majority of our records don’t have Creator elements missing the role qualifier but there are a number that do.  We can fix those.  If you wanted to look at those records that have five different Creator elements that don’t have a role you would end up getting to records that loo like the one below.

Example of Multiple Missing Types and Roles

You will notice that when a record has a problem there are often multiple things wrong with it. In this case not only is it missing role information for each of these Creator elements but there is also name type information that is missing.  Once we fix those we can move along and edit some more.

And a final example.

I’m hoping you are starting to see how this interface could be useful.  Here is another example if you aren’t convinced yet.  We are completing a retrospective digitization of theses and dissertations here at UNT.  Not only is this a bunch of digitization but it is quite a bit of metadata that we are adding to both the UNT Digital Library as well as our traditional library catalog.   Let’s look at some of those records.

You can limit your dashboard view to the collection you are interested in working on.  In this case we choose the UNT Theses and Dissertations collection.

Next up we take a look at the number of Creator elements per record. Theses and dissertations are generally authored by just one person.  It would be strange to see counts other than one.

Creator Counts for These and Dissertations Collection

It looks lie there are 26 records that are missing Creator elements and a single record that for some reason has two Creator elements.  This is strange and we should take a look.

Below you will see the view of the 26 records that are missing a Creator element.  Sadly at the time of writing there are seven of these that are visible to the public so that’s something we really need to fix in a hurry.

Example Theses that are Missing Creators

That’s it for this post about our Element Count Dashboard.  I hope that you find this sort of interface interesting.  I’d be interested to hear if you have interfaces like this for your digital library collections or if you think something like this would be useful in your metadata work.

If you have questions or comments about this post,  please let me know via Twitter.