Packaging Video DVDs for the Repository

For a while I’ve had two large boxes of DVDs that a partner institution dropped off with the hopes of having them added to The Portal to Texas History.  These DVDs were from oral histories conducted by the local historical commission from 1998-2002 and were converted from VHS to DVD sometime in the late 2000s.  They were interested in adding these to the Portal so that they could be viewed by a wider audience and also be preserved in the UNT Libraries’ digital repository.

So these DVDs sat on my desk for a while because I couldn’t figure out what I wanted to do with them.  I wanted to figure out a workflow that I could use for all Video DVD based projects in the future, and it hurt my head whenever I started to work on the project.  So they sat.

When the partner politely emailed about the discs and asked about the delay in getting them loaded, I decided it was finally time to work out a workflow so that I could get the originals back to the partner.  I’m sharing the workflow that I came up with here because I didn’t see much prior information on this sort of thing when I was researching the process.

Goals:

I had two primary goals for the conversion workflow.  First, I wanted to retain an exact copy of the disc that we were working with.  All of these videos were VHS to DVD conversions, most likely completed with a standalone recorder.  They had very simple title screens and lacked other features, but other Video DVDs that we work with in the future might have more features that I wouldn’t want to lose by just extracting the video.  The second goal was to pull the video off the DVD without introducing additional compression during the process.  When these files get ingested into the repository and the final access system, they will be converted into an mp4 container using the h.264 codec, so they will get another round of compression later.

With these two goals in mind here is what I ended up with.

For the conversion I used my MacBook Pro and SuperDrive.  I first created an iso image of the disc using the hdiutil command.

hdiutil makehybrid -iso -joliet -o image.iso /Volumes/DVD_VR/
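For a box of discs it may be convenient to wrap the command above in a small shell function.  This is only a sketch: the function name make_iso and the two guard checks are hypothetical additions for illustration, and the hdiutil call itself is unchanged from the one shown above.

```shell
# make_iso: hypothetical helper wrapping the hdiutil command from this post.
# Usage: make_iso /Volumes/DVD_VR/ image.iso
make_iso() {
    volume="$1"
    output="$2"

    # Stop early if the disc is not mounted where we expect it.
    if [ ! -d "$volume" ]; then
        echo "make_iso: volume not found: $volume" >&2
        return 1
    fi

    # Do not overwrite an image that was already created.
    if [ -e "$output" ]; then
        echo "make_iso: output already exists: $output" >&2
        return 1
    fi

    # Same flags as the command shown in this post.
    hdiutil makehybrid -iso -joliet -o "$output" "$volume"
}
```

Both checks run before hdiutil is called, so a typo in the volume path fails right away instead of partway through a long imaging run.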

Once this image was created I mounted it by double clicking on the image.iso file in the Finder.

I then loaded makeMKV and created an MKV file from the video and audio on the disc that I was interested in.  The resulting mkv file contains the primary video content that users will interact with in the future.  I saved this file as title00.mkv.

makeMKV screenshot

Once this step was completed I used ffmpeg to convert the mkv container to an mpeg container to add to the repository.  I could have kept the container as an mkv but decided to move it over to mpeg because we already have a number of those files in the repository and no mkv files to date.  The ffmpeg command is as follows.

ffmpeg -i title00.mkv -vcodec copy -acodec copy -f vob -copyts -y video.mpg

Because the makeMKV and ffmpeg steps are just muxing the video and audio and not compressing them, they tend to finish in just a few seconds.  The most time consuming part of the process is getting the iso created in the first step.
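The remux step can be sketched as a shell function along the same lines.  The names remux_to_mpg and stream_codecs are hypothetical, and the ffprobe check is an assumption added here rather than part of the workflow above.  Matching codec names before and after is consistent with a stream copy, but it is a quick sanity check and not proof that nothing was re-encoded.

```shell
# remux_to_mpg: hypothetical helper wrapping the ffmpeg command from this post.
# Usage: remux_to_mpg title00.mkv video.mpg
remux_to_mpg() {
    input="$1"
    output="$2"

    if [ ! -f "$input" ]; then
        echo "remux_to_mpg: input not found: $input" >&2
        return 1
    fi

    # Stream copy: video and audio are moved into the new container
    # without being re-encoded. Flags match the command in this post.
    ffmpeg -i "$input" -vcodec copy -acodec copy -f vob -copyts -y "$output"
}

# stream_codecs: print one codec name per stream in a file, using ffprobe.
# Usage: stream_codecs title00.mkv
stream_codecs() {
    ffprobe -v error -show_entries stream=codec_name -of csv=p=0 "$1"
}
```

After a remux, running stream_codecs on title00.mkv and on video.mpg should list the same codec names if the streams were only copied.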

With all of these files now created I packaged them up for loading into the repository.  Here is what a pre-submission package looks like for a Video DVD using this workflow.

DI028_dodo_parker_1998-07-15/
├── 01_mpg/
│   └── DI028_dodo_parker_1998-07-15.mpg
├── 02_iso/
│   └── DI028_dodo_parker_1998-07-15.iso
└── metadata.xml

You can see that we place the mpg and iso files in separate folders, 01_mpg for the mpg and 02_iso for the iso file.  When we create the SIP for these files we will note that the 02_iso format should not be pushed to the dissemination package (what we locally call an Access Content Package or ACP), so the iso file and folder will just live with the archival package.
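The layout above can be built with a few lines of shell.  This is a minimal sketch that assumes the mpg, iso, and metadata.xml files already exist; the function name build_package is hypothetical, and the package is created in the current directory.

```shell
# build_package: hypothetical helper that lays out a pre-submission package.
# Usage: build_package DI028_dodo_parker_1998-07-15 video.mpg image.iso metadata.xml
build_package() {
    id="$1"
    mpg="$2"
    iso="$3"
    metadata="$4"

    # All three inputs must exist before any folder is created.
    for f in "$mpg" "$iso" "$metadata"; do
        if [ ! -f "$f" ]; then
            echo "build_package: missing file: $f" >&2
            return 1
        fi
    done

    mkdir -p "$id/01_mpg" "$id/02_iso" || return 1

    # Copy the files into place, renaming them after the package identifier.
    # Swap cp for mv to avoid a second temporary copy of a large iso.
    cp "$mpg" "$id/01_mpg/$id.mpg" &&
    cp "$iso" "$id/02_iso/$id.iso" &&
    cp "$metadata" "$id/metadata.xml"
}
```

Naming the files after the package identifier keeps the mpg and iso recognizable if they are ever separated from their folder.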

This seemed to work for me to get these Video DVDs converted over and placed in the repository.  The workflow satisfied my two goals of retaining a full copy of the original disc as an iso and also getting a copy of the video from the disc in a format that didn’t introduce an extra compression step.  I think that there is probably a way of getting from the iso straight to the mpg version, probably with the handy ffmpeg (or possibly mplayer?) but I haven’t taken the time to look into that.

There is a downside to this way of handling Video DVDs, which is that it will most likely take up twice the amount of storage as the original disc.  For a 4 GB Video DVD, we will be storing 8 GB of data in the repository.  This would probably add up for a very large project, but that’s a worry for another day (and a worry that honestly gets smaller year after year).
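The doubling is easy to turn into a rough estimate for a whole project.  The sketch below assumes every disc is stored twice at its full size, which is an upper bound because the mpg holds only the extracted video; the function name estimate_storage_gb and the count of 100 discs are hypothetical.

```shell
# estimate_storage_gb: hypothetical helper; rough upper bound in whole GB.
# Usage: estimate_storage_gb NUMBER_OF_DISCS GB_PER_DISC
estimate_storage_gb() {
    discs="$1"
    gb_per_disc="$2"
    # Each disc is stored twice: once as the iso and once as the mpg.
    # Shell arithmetic is integer only, so pass whole numbers.
    echo $(( $discs * $gb_per_disc * 2 ))
}

estimate_storage_gb 1 4     # one 4 GB disc: prints 8
estimate_storage_gb 100 4   # a hypothetical 100 discs: prints 800
```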

I hope that this explanation of how I processed Video DVDs for inclusion into our repository is useful to someone else.

Let me know what you think via Twitter if you have questions or comments.