Next up in our review of the new metadata quality interfaces we have implemented this summer is our Element Count Dashboard.
The basic idea is that whenever we index metadata records in Solr we count the number of instances of a given element, or of a given element with a specific qualifier, and store those counts in the index. This results in hundreds of fields that each hold the number of times an element, or an element and qualifier combination, appears in a record.
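To make the counting step a little more concrete, here is a minimal sketch in Python. It is not our actual indexing code: the record layout (a dict of element names mapped to lists of instances) and the field names like description_content_count are assumptions made up for illustration.

```python
from collections import Counter

def count_elements(record):
    """Return hypothetical count fields for one metadata record.

    `record` maps an element name to a list of instances, and each
    instance may carry a "qualifier". This layout is an assumption.
    """
    counts = Counter()
    for element, instances in record.items():
        for instance in instances:
            # Instances without a qualifier are counted under "none".
            qualifier = instance.get("qualifier") or "none"
            counts[f"{element}_count"] += 1
            counts[f"{element}_{qualifier}_count"] += 1
    return dict(counts)

record = {
    "description": [
        {"qualifier": "content", "content": "Photograph of a courthouse."},
        {"qualifier": "physical", "content": "1 photograph : b&w ; 8 x 10 in."},
    ],
    "creator": [{"qualifier": None, "content": "Doe, Jane"}],
}
fields = count_elements(record)
```

In this sketch a combination that never appears in a record simply has no field, so anything reading the counts would need to treat a missing field as zero.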
We built an interface on top of these counts because we had a hunch that we would be able to use this information to help us identify problems in our metadata records. It feels like I’m showing some things in our metadata that we probably don’t want to really highlight but it is all for helping others understand. So onward!
Element Count Dashboard
The dashboard is similar to other dashboards in the Edit system. You have the ability to limit your view to just the collection, partner or system you are interested in working with.
From there you can select an element you are interested in viewing counts for. In the example below I am interested in looking at the Description element or field.
Once your selection is made you are presented with the number of instances of the description field in a record. This is a little more helpful if you know that in our metadata world, a nice clean record will generally have two description fields: one for a content description and one for a physical description of the item. More than two is usually strange and none at all is usually bad.
To get a clearer view you can see the detail below. This again is for the top level Description element where we like to have two descriptions.
You can also limit to a specific qualifier. In the example below you see the counts of Description elements with a content qualifier. The 1,667 records that have two Description elements with a content qualifier are pretty strange. We should probably fix those.
Next we limit to just the physical description qualifier. You will see that there are a bunch that don’t have any sort of physical description and then 76 that have two. We should fix both of those record sets.
Because of the way that we index things we can also get at the Description elements that don’t have either a content or physical qualifier selected. These are identified with a value of none for the qualifier. You can see that there are 1,861,356 records that have zero Description elements with a none qualifier. That’s awesome. You can also see 52 that have one element and 261 that have two elements that are missing qualifiers. That’s not awesome.
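The numbers in a view like this amount to a histogram: for each count value, how many records have it. Here is a rough sketch of that tally, assuming each record's counts are available as a dict and that a missing field means zero. The field name description_none_count and the sample data are hypothetical.

```python
from collections import Counter

def count_histogram(count_docs, field):
    """Map each count value to the number of records that have it.

    A record with no value for `field` is treated as a count of zero.
    """
    return Counter(doc.get(field, 0) for doc in count_docs)

# Made-up sample of per-record count fields.
docs = [
    {"description_count": 2},
    {"description_count": 2},
    {"description_count": 1, "description_none_count": 1},
    {"description_count": 2, "description_none_count": 2},
]
hist = count_histogram(docs, "description_none_count")
```

With this sample, two records land in the zero bucket and one each in the one and two buckets, which is the same shape as the table the dashboard shows.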
I’m hoping you are starting to see how this kind of interface could be useful to drill into records that might look a little strange. When you identify something strange all you have to do is click on the number and you are taken directly to the records that match what you’ve asked for. In the example below we are seeing all 76 of the records that have two physical descriptions because this is something we are interested in correcting.
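Behind a click like that, the lookup can be as simple as filtering on the stored count. The sketch below builds a Solr query string with the standard q, fq and rows parameters. The count field name, the collection field and the collection code are assumptions for illustration, not necessarily what our Edit system uses.

```python
from urllib.parse import urlencode

def records_with_count(field, count, collection=None, rows=25):
    """Build a Solr select query for records whose count field equals `count`."""
    params = [("q", "*:*"), ("fq", f"{field}:{count}"), ("rows", rows)]
    if collection:
        # Hypothetical field used to limit the view to one collection.
        params.append(("fq", f"collection:{collection}"))
    return "select?" + urlencode(params)

# Records with exactly two physical descriptions (hypothetical field name).
query = records_with_count("description_physical_count", 2)

# The same lookup limited to a made-up collection code.
limited = records_with_count("description_physical_count", 2, collection="EXAMPLE")
```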
If you open up a record to edit you will see that yes, in fact there are two Physical Descriptions in this record. It looks like the first one should actually be a Content Description.
Once we change that value we can hit the Publish button and be on our way fixing other metadata records. The counts will update about thirty seconds later to reflect the corrections that you have made.
Even more of a good thing.
Because I think this is a little different than other interfaces you might be used to, it might be good to see another example.
This time we are looking at the Creator element in the Element Count Dashboard.
You will see that there are 112 different counts from zero way up into way way too many creators on an item (silly physics articles).
I was curious to see what the counts looked like for Creator elements that were missing a role qualifier. These are identified by selecting the none value from the qualifier dropdown.
You can see that the majority of our records don’t have Creator elements missing the role qualifier but there are a number that do. We can fix those. If you wanted to look at those records that have five different Creator elements that don’t have a role you would end up getting to records that look like the one below.
You will notice that when a record has a problem there are often multiple things wrong with it. In this case not only is it missing role information for each of these Creator elements but there is also name type information that is missing. Once we fix those we can move along and edit some more.
And a final example.
I’m hoping you are starting to see how this interface could be useful. Here is another example if you aren’t convinced yet. We are completing a retrospective digitization of theses and dissertations here at UNT. Not only is this a bunch of digitization but it is quite a bit of metadata that we are adding to both the UNT Digital Library and our traditional library catalog. Let’s look at some of those records.
You can limit your dashboard view to the collection you are interested in working on. In this case we choose the UNT Theses and Dissertations collection.
Next up we take a look at the number of Creator elements per record. Theses and dissertations are generally authored by just one person. It would be strange to see counts other than one.
It looks like there are 26 records that are missing Creator elements and a single record that for some reason has two Creator elements. This is strange and we should take a look.
Below you will see the view of the 26 records that are missing a Creator element. Sadly at the time of writing there are seven of these that are visible to the public so that’s something we really need to fix in a hurry.
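The same kind of check could be written down as a rule: for this collection, a record should have exactly one Creator. The sketch below flags anything else, grouped by the count that was found. The field name creator_count, the record ids and the idea of an expected set are all assumptions for illustration. The dashboard itself just shows the counts and leaves the judgment to the person looking at them.

```python
def flag_unexpected(count_docs, field, expected):
    """Group record ids by count wherever the count is not in `expected`.

    A record with no value for `field` is treated as a count of zero.
    """
    flagged = {}
    for doc in count_docs:
        n = doc.get(field, 0)
        if n not in expected:
            flagged.setdefault(n, []).append(doc["id"])
    return flagged

# Made-up sample: one clean record, one with no creator, one with two.
docs = [
    {"id": "rec-1", "creator_count": 1},
    {"id": "rec-2"},
    {"id": "rec-3", "creator_count": 2},
]
flagged = flag_unexpected(docs, "creator_count", expected={1})
```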
That’s it for this post about our Element Count Dashboard. I hope that you find this sort of interface interesting. I’d be interested to hear if you have interfaces like this for your digital library collections or if you think something like this would be useful in your metadata work.
If you have questions or comments about this post, please let me know via Twitter.