Video File Metadata: First Thoughts
A few years ago, I published a series on metadata in image files with a focus on old slides and photos that had been scanned. I've since turned my attention to metadata for video files, because I had a number of old home movies digitized.
I assumed that most of what I'd learned about image file metadata would apply. I also assumed that because widespread consumer use of digital video came after mass adoption of digital photography, the inconsistent, competing, overlapping metadata standards for images would have been avoided and that a single, widely-supported standard for video metadata would exist. I was wrong on both counts. I was reminded again of the remark at Stack Exchange Photography that "Image and video metadata is a complete hot mess."
The video mess seems even hotter than for images, because:
- There's no accepted standard for video metadata. That's worse than the three competing standards that bedevil metadata for image files.
- It's harder to find video players that show metadata.
- Google Photos (GP) exhibits less predictable behavior for video file metadata.
Nevertheless, the case for metadata in videos is as strong as it is for images, so into the mire we wade!
Data to Store
I store the following information in image file metadata, and I want to store the same things in video files:
- A description of what's in the video.
- When the video was taken.
- When the video was digitized.
- Who did the digitizing.
- A copyright notice.
I also want to store something I can't believe I overlooked for image files:
- GPS coordinates for where the video was filmed.
My excuse for this oversight with images is that when I started on the metadata problem, I was mostly dealing with pictures where I had only a vague idea where they were taken.
Metadata Fields to Use
In a perfect world, there'd be a widely-supported standard for video file metadata, and that standard would prescribe which fields (i.e., tags) should be used for the information I want to store. Heck, in a really perfect world, that standard would apply not just to videos, but to images as well.
Our world falls short of perfection. Fields for image file metadata are largely ignored in the video world, and while there is an IPTC standard for video metadata, it has nothing to say about digitization, e.g., when it was done and by whom. Furthermore, my experiments indicate that support for the IPTC standard by Google Photos (GP), Windows File Explorer, and other programs I care about is poor.
I've written elsewhere about how GP's search capabilities make it indispensable for me, and that indispensability gives it a lot of clout in the metadata decisions I make. For images, I didn't find that to be constraining, but for videos, GP seems to like to throw its weight around. One of my goals is to be able to round-trip a video to GP (i.e., upload a video, then download it) without a loss of metadata. GP seems to enjoy interfering with that process. Often, fields present in uploaded videos are missing when those videos are downloaded. Sometimes GP keeps the data in downloads, but puts it in different fields. It's irritating.
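One way to check a round trip is to dump the tags from a file before upload and from the downloaded copy, then compare the two. The sketch below is only an illustration, not the procedure used for the tests described in this post. It assumes ExifTool is on the PATH and uses its -j (JSON output) and -G1 (show group names) options. The function names and the sample values are made up, and the group prefixes ExifTool actually reports depend on the -G setting, so the keys in the sample data are illustrative too.

```python
import json
import subprocess


def read_tags(path):
    """Return {"Group:Tag": value} for one file, as reported by ExifTool."""
    # -j requests JSON output; -G1 prefixes each tag with its group name.
    result = subprocess.run(
        ["exiftool", "-j", "-G1", path],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)[0]


def compare_round_trip(before, after, watched):
    """Classify each watched tag as kept, changed, moved, or lost."""
    report = {}
    for tag in watched:
        if tag not in before:
            continue  # nothing was uploaded in this field
        value = before[tag]
        if after.get(tag) == value:
            report[tag] = "kept"
        elif tag in after:
            report[tag] = "changed"
        else:
            # The same value under a tag that did not exist before the
            # upload suggests the data was moved rather than dropped.
            new_homes = sorted(
                k for k, v in after.items() if v == value and k not in before
            )
            report[tag] = (
                "moved to " + ", ".join(new_homes) if new_homes else "lost"
            )
    return report


# Invented sample data standing in for two read_tags() results.
before = {
    "QuickTime:CreateDate": "1998:07:04 12:00:00",
    "Keys:CreationDate": "1998:07:04 12:00:00",
    "Example:SomeField": "hello",
}
after = {
    "QuickTime:CreateDate": "1998:07:04 12:00:00",
    "QuickTime:ContentCreateDate": "1998:07:04 12:00:00",
}

for tag, status in compare_round_trip(before, after, list(before)).items():
    print(f"{tag}: {status}")
```

With real files, the two dictionaries would come from calls such as read_tags("original.mp4") and read_tags("downloaded.mp4"), where both file names are hypothetical.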
I aspire to use metadata fields with these features:
- They are part of the IPTC standard.
- They are visible in commonly-used programs (e.g., video players).
- They have these GP-related characteristics:
  - Are displayed in the web browser UI.
  - Are consulted by GP during searches. For example, the results of searching for videos taken in 1998 in California should include videos with such information in the metadata for when and where they were taken as well as in the videos' "description" field.
  - Survive round-tripping.
I didn't find any fields that check all these boxes, so compromise was the name of the game.
Identifying and testing candidate fields was a big job. I initially got help from a capable human, Amelia Huchley, but her time was limited, and when it ran out, I turned to marginally-capable AI chatbots. They've read pretty much everything on the internet, I figured, so they should be able to help choose metadata fields suited to my purposes.
I've blogged about how AI chatbots--or at least the free versions of them--are lousy search assistants, but desperate times, any port in a storm, etc. I turned to ChatGPT, Perplexity, Claude, Gemini, and Copilot and sought to use them as a sort of advisory committee to help me identify prospective fields and evaluate how well each was likely to achieve my goals. As advisory committees go, this one left a lot to be desired. The chatbots helped me put together a list of candidate fields, but their advice on how well the fields were likely to do on things like being displayed in the GP interface or surviving GP round-tripping was unreliable to the point of uselessness. This reinforces my impression that, in their current form, the free AI chatbots are worth about what you pay for them when it comes to internet research.
In the end, these are the fields I decided to use:
- When taken: QuickTime:CreateDate. Choosing this field should have been easy. From a metadata perspective, you can't get much more fundamental than when a video was made. I considered six fields for this information. What I found perfectly exemplifies why video file metadata is a mess.
  Four of the fields come from QuickTime--meaning that QuickTime has four metadata fields for the same information. Two have ridiculously similar names: QuickTime:CreateDate and QuickTime:CreationDate. Two have names differing only in namespace: Keys:CreationDate and QuickTime:CreationDate. (ExifTool unhelpfully treats these fields as the same, sigh.) Two of the fields are changed into a third during a GP round trip: uploading files containing either of the two CreationDates yields downloaded files where those fields have become QuickTime:ContentCreateDate. You get the idea.
  I ultimately settled on QuickTime:CreateDate, because it has good GP support (showing in the UI, being found in searches, and remaining intact when round-tripped) and, in contrast to the other candidates, is also visible via Windows' File Explorer and MediaInfo.
- Where taken: QuickTime:GPSCoordinates. I identified six candidate fields for this information, but I tested only four, quitting when I found that this one offered full GP support, unlike the others I tried.
  "Full" GP support is quirky. Copying GPS coordinates from Google Maps and pasting them into an ExifTool command line to set QuickTime:GPSCoordinates works, but if the coordinate values have six or more decimal places, GP will refuse to show where the video was shot. When coordinates have at most five decimal places, no such problem arises. Enlivening the situation is that (at least under Windows) right-clicking a location in Google Maps brings up a context menu showing the location's coordinates to five decimal places, and clicking these coordinates copies them to the clipboard, but pasting them yields values with up to 14 decimal places! So when pasting coordinates copied from Google Maps into an ExifTool command, the values have to be edited back to at most five decimal places. (This restriction doesn't seem to apply to image files. In that case, you can use ExifTool to write coordinates with up to 14 decimal places, yet GP will still show where the photo was taken.)
- Description: QuickTime:Title. Amelia and I tested a dozen fields in an attempt to find one that GP would display. None did. Many also failed the round trip test. When I found that QuickTime:Title was found in GP searches, was retained in GP round-trips, and was displayed in VLC Media Player, Windows Media Player, Windows Media Player Legacy, and Windows Explorer, I figured that no field would do better, and I stopped looking.
- When digitized: XMP:DateTimeDigitized. This was the only field I found that is specifically designed for when digitization took place, can be set by ExifTool, and is retained across round trips to GP.
- Who did the digitizing: XMP-xmpDM:LogComment. I was not able to find a field designed to store this information (Claude suggested the apparently-non-existent XMP:DigitizedBy), but the general-purpose XMP-xmpDM:LogComment seems like a reasonable choice, and it survives a round trip to GP.
- Copyright: QuickTime:Copyright. This was the only easy metadata field to choose. It's specifically designed to hold copyright information, it's retained across round-trips to GP, it's part of the IPTC standard, and it's visible in at least one video player (QuickTime Player).
These fields are my preliminary choices. To date, my experience with them is mostly limited to simple tests. My thinking may change as I get more experience with real video files, but given the dearth of guidance about metadata in video files, I thought it would be useful to document where I am now. That's why this post has "First Thoughts" in its title.
Adding Transcripts
An important part of a video file is its audio track. For many videos, there's a lot of speech in such tracks, and it's easy to imagine uses for transcripts of what's said, e.g., subtitle generation and full-text search. The transcript for a video is a form of metadata, so, at least in concept, I'd like to include it in the video file. Given the availability of speech-to-text software, it's not unreasonable to hope for initial drafts of such transcripts to be generated automatically. (YouTube, Vimeo, and Dailymotion all offer automatic generation of captions or subtitles.) I haven't gone beyond the thinking-about-it stage for generating and embedding transcripts, but it's clear that this could be a useful avenue to explore.
On the other hand, it's possible that this is an area where advances in technology could render the issue moot. If speech can be converted to text quickly and accurately enough for real-time display and search, there'd be no need to create and store transcripts inside video files. The metadata they represent could simply be generated when needed, thus sparing people like me a lot of work.
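Returning to the field choices made earlier in this post, the pieces fit together roughly as follows. This is a sketch under assumptions, not a tested workflow. It rounds coordinates pasted from Google Maps to five decimal places (the limit GP appeared to tolerate for videos) and builds an ExifTool argument list that sets the six chosen fields. The function names, the file name, and all sample values are hypothetical. The tag names are the ones listed earlier, and the date strings use ExifTool's usual "YYYY:MM:DD HH:MM:SS" form.

```python
from decimal import Decimal

FIVE_PLACES = Decimal("0.00001")


def trim_coordinates(pasted):
    """Round a 'lat, lon' string to five decimal places each."""
    lat, lon = (Decimal(part.strip()) for part in pasted.split(","))
    return f"{lat.quantize(FIVE_PLACES)}, {lon.quantize(FIVE_PLACES)}"


def exiftool_args(video, *, taken, where, title, digitized,
                  digitized_by, copyright_notice):
    """Build an ExifTool command that writes the six chosen fields."""
    return [
        "exiftool",
        f"-QuickTime:CreateDate={taken}",
        f"-QuickTime:GPSCoordinates={trim_coordinates(where)}",
        f"-QuickTime:Title={title}",
        f"-XMP:DateTimeDigitized={digitized}",
        f"-XMP-xmpDM:LogComment={digitized_by}",
        f"-QuickTime:Copyright={copyright_notice}",
        video,
    ]


# Hypothetical values for a single home movie.
args = exiftool_args(
    "beach_trip.mp4",
    taken="1998:07:04 12:00:00",
    where="36.57859430000001, -118.29225790000001",  # as pasted from Google Maps
    title="Family trip, July 1998",
    digitized="2024:03:15 09:30:00",
    digitized_by="Digitized by Example Transfer Service",
    copyright_notice="Copyright 1998 Example Family",
)
print(args)
```

Passing the resulting list to subprocess.run(args, check=True) would run the command. Building it as a list rather than a single string sidesteps shell quoting problems with values that contain spaces.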
Scott Meyers's Blog

