Given (A) foo.mov, (B) bux.avi, and (C) baz.m4v, which can your video software handle correctly?
- A & B
- All of them
- All or some or none of them depending on their codecs, your software’s codecs, and whether they’re doing anything special with the container format.
And the answer is… the last one!
A quick introduction to video codecs and why editing video is hard
Unlike most of the other files on your computer, a video file’s extension only hints at what sort of content might be inside. An extension like .mov or .avi refers to the type of the container format, not the video itself. Think of the container format like a PowerPoint file with text and images and perhaps a few other multimedia bits embedded in it. As anyone who’s had to give many presentations on a computer other than their own can attest, just because software can open the container doesn’t mean that it has the right fonts to avoid turning all your text into strange symbols.
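To make the container idea a bit more concrete, here is a minimal sketch that guesses the container family from the first few bytes of a file instead of trusting the extension. The function names are my own invention, and the check is deliberately rough: AVI files start with a RIFF header marked "AVI ", and most modern .mov, .mp4, and .m4v files carry an "ftyp" marker at byte 4, but older QuickTime files may not.

```python
def sniff_container(header: bytes) -> str:
    """Guess the container family from the first 12 bytes of a file."""
    # AVI is a RIFF file whose form type (bytes 8-11) is "AVI ".
    if len(header) >= 12 and header[:4] == b"RIFF" and header[8:12] == b"AVI ":
        return "avi"
    # .mov, .mp4 and .m4v usually begin with a size field followed by "ftyp".
    if len(header) >= 8 and header[4:8] == b"ftyp":
        return "mp4-family"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read just enough of a file to run sniff_container on it."""
    with open(path, "rb") as f:
        return sniff_container(f.read(12))
```

Note what this does not tell you: anything about the codec inside. Knowing the container is only the first half of the problem.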
There are only a handful of video container formats in popular use, so what really matters for handling video is the codec used. The codec is how the video inside the container was compressed. Video usually needs to be compressed because it’s big. Just as you couldn’t keep your entire music collection on your computer before the advent of the MP3 audio codec, you’re not going to have much video if it’s not compressed using a good video codec. Unfortunately, unlike audio, there are lots of video codecs out in the world. FFmpeg, the go-to video-editing software on my computer, has 147 different video codecs that it understands to some degree or another. There are so many codecs partially for the usual bad reasons (politics, patents, not-invented-here syndrome, etc.), but also for some good reasons. Efficiently recording video is a very different problem from efficiently compressing it for playback, both are different problems from streaming it, and, as mentioned, video is really big.
A brief aside on the size of video. Pixels on your computer screen are each generally represented by a single 32-bit number. We’ll assume that video isn’t transparent and call it 24 bits. With a small 1024 by 768 screen, that means that a raw fullscreen image is 2.25MB. Video needs to be about 30 frames per second to look smooth. If video just consisted of flashing images past you at 30fps, one hour of video would be over 200GB. Thank goodness for good codecs.
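If you want to check that arithmetic, here is a small sketch of it. The function names are made up for illustration, and it assumes 3 bytes (24 bits) per pixel and no compression at all.

```python
def raw_frame_bytes(width: int, height: int, bytes_per_pixel: int = 3) -> int:
    """Size of one uncompressed frame in bytes."""
    return width * height * bytes_per_pixel


def raw_video_bytes(width: int, height: int, fps: int, seconds: int,
                    bytes_per_pixel: int = 3) -> int:
    """Size of uncompressed video: one full frame for every tick of the clock."""
    return raw_frame_bytes(width, height, bytes_per_pixel) * fps * seconds


if __name__ == "__main__":
    frame = raw_frame_bytes(1024, 768)
    hour = raw_video_bytes(1024, 768, fps=30, seconds=3600)
    print(f"one frame: {frame / 2**20:.2f} MB")  # 2.25 MB
    print(f"one hour:  {hour / 2**30:.1f} GB")   # about 237 GB
```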
So dealing with video correctly depends on your software understanding whatever codec was used to encode the video you want to use as input, and then being able to write out video in a codec that the software you’re using for playback will be able to understand. And the container format.
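One way to find out what you are actually dealing with is to ask ffprobe, the inspection tool that ships with FFmpeg. The sketch below assumes ffprobe is installed and on your PATH; the helper names probe and summarize are my own. It pulls out the container name and the codec of each stream from ffprobe's JSON output.

```python
import json
import subprocess


def probe(path: str) -> dict:
    """Run ffprobe on a file and return its JSON report as a dict."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-print_format", "json",
         "-show_format", "-show_streams", path],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def summarize(report: dict) -> tuple:
    """Return (container name, [(stream type, codec name), ...])."""
    container = report.get("format", {}).get("format_name", "unknown")
    streams = [
        (s.get("codec_type", "unknown"), s.get("codec_name", "unknown"))
        for s in report.get("streams", [])
    ]
    return container, streams
```

Calling summarize(probe("foo.mov")) on two different .mov files can easily give you two different video codecs, which is the whole point: same container, different contents.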
Remember a couple paragraphs ago when I told you that there were only a few popular container formats and that what really causes problems is the codec? That’s true, except for when it isn’t. I recently discovered the hard way that iPhones only record video in one orientation. As far as the phone is concerned, the top of the screen is opposite the volume buttons, and holding your phone in any other orientation should give the same result as doing that with a video camera: sideways or upside-down video. Except that it doesn’t. Video recorded on a phone plays back pleasantly upright on the phone and in QuickTime, regardless of the phone’s orientation. This is because of the container format.
When you record a movie on your phone, the phone stores its orientation in the metadata in the container format. Software like QuickTime reads that information, and rotates the video appropriately before playing it back. Your video editing software probably doesn’t. FFmpeg can do all sorts of tricks, rotating video this way and that, but the version I was using doesn’t read the .mov headers that give the orientation information. When you convert the video to some other codec in some other container file, the header information is lost and the video is sideways.
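If you already know how far the video needs to be turned, you can have FFmpeg do the rotation during the conversion. Here is a rough sketch that builds the command line; the function names are hypothetical, and it assumes the rotation is given in degrees clockwise. The transpose, hflip, and vflip filters are real FFmpeg video filters. Newer FFmpeg builds may apply the stored rotation automatically, so look at the output before rotating by hand, or you could end up turning the video twice.

```python
def rotation_filter(degrees: int) -> str:
    """Map a clockwise rotation in degrees to an FFmpeg -vf filter string."""
    filters = {
        0: "",               # already upright, nothing to do
        90: "transpose=1",   # 90 degrees clockwise
        180: "hflip,vflip",  # upside down
        270: "transpose=2",  # 90 degrees counterclockwise
    }
    degrees %= 360
    if degrees not in filters:
        raise ValueError("rotation must be a multiple of 90 degrees")
    return filters[degrees]


def rotate_command(src: str, dst: str, degrees: int) -> list:
    """Build (but do not run) an ffmpeg command that rotates src into dst."""
    cmd = ["ffmpeg", "-i", src]
    vf = rotation_filter(degrees)
    if vf:
        cmd += ["-vf", vf]
    cmd.append(dst)
    return cmd
```

For example, rotate_command("foo.mov", "foo.mp4", 90) gives a list you can hand to subprocess.run. Since rotating means re-encoding, it is worth doing in the same pass as the conversion you were going to do anyway.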
On a final note, video encoding is generally a lossy process. That much compression depends on not keeping every last bit of data. The upshot for editing video is that you want to re-encode the video as few times as possible, preferably only once, to best preserve the quality.
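One case where you can skip re-encoding entirely is when only the container needs to change. As a sketch, assuming the target container can actually hold the codec you already have (not always true), FFmpeg's "-c copy" option copies the compressed streams across untouched. The helper name here is made up.

```python
def remux_command(src: str, dst: str) -> list:
    """Build (but do not run) an ffmpeg command that changes the container
    without re-encoding: "-c copy" copies every stream as-is."""
    return ["ffmpeg", "-i", src, "-c", "copy", dst]
```

Because nothing is decoded or re-compressed, this costs no quality. It also means you cannot apply filters such as rotation in the same step, since filtering requires re-encoding.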
Given all this, I hope you’ll see why editing video is a harder problem than you thought it was, and that you’re probably better off avoiding it.