A bit confused: what's a HEIF and why do we need 10-bit stills and video?

Photo: dpreview.com

The latest cameras from most brands are adding 10-bit video (and often stills) capture options; the Canon EOS R5 and R6, for instance, gain both. The ability to shoot 'HEIF' files is becoming increasingly common, too.

But what's the benefit and when should you use these modes? We're going to look at how data is captured, how it's stored and hence what benefits you should (and shouldn't) expect from 10-bit capture.

Linear encoding

The inefficiency of linear encoding

Linear encoding devotes half the values to the brightest stop, a quarter to the next stop, and so forth.

Now consider photon shot noise: the randomness of the light you captured. Shot noise is essentially the square root of the signal. So the very bright parts of the image (that you're devoting the bulk of your raw values to) have the highest amount of noise, because the square root of a larger number is bigger than that of a smaller number. They don't look as noisy, because noisiness relates more closely to the signal-to-noise ratio than to the absolute noise level. But it means that you're capturing lots of very fine detail about something with very high variance.

Worse still, the human visual system is less good at discerning detail and color in bright areas than it is in dark: you're over-encoding a noisy signal that isn't especially visually meaningful. In short: linear encoding is hugely inefficient. (Some Raw compression takes advantage of this, compressing the over-encoded highlights in a way that has no meaningful effect on the image visually or in terms of editing flexibility.)
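To put rough numbers on the shot noise point, here is a minimal sketch. It assumes the signal is purely shot-noise limited (no read noise) and the photon counts are hypothetical values chosen only for illustration.

```python
import math

def shot_noise_stats(signal_photons):
    """Return (noise, snr) for a purely shot-noise-limited signal."""
    noise = math.sqrt(signal_photons)   # shot noise is roughly sqrt(signal)
    snr = signal_photons / noise        # which works out to sqrt(signal) too
    return noise, snr

# Hypothetical photon counts, each four stops (16x) brighter than the last.
for photons in (100, 1_600, 25_600):
    noise, snr = shot_noise_stats(photons)
    print(f"{photons:>6} photons: noise ~ {noise:.0f}, SNR ~ {snr:.0f}")
```

The absolute noise rises with brightness, but so does the signal-to-noise ratio, which is why the highlights look cleaner even though they vary more in absolute terms.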

Unlike the human visual system, cameras record light in a linear manner: twice as many photons hitting the sensor results in a signal that's twice as large, and is recorded using a digital number that's twice as large.

This means half of your available raw values are always consumed by the brightest stop of light that you captured. This is just basic logic: whatever the brightest value you captured was, half as much light (i.e. one stop less light) will be captured with a value half as large.
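The halving can be counted directly. This is a small sketch that assumes an idealized linear encoding using the full code range; the function name is ours, not part of any camera's software.

```python
def linear_values_per_stop(bit_depth, stops):
    """Count how many raw values fall in each stop, brightest stop first."""
    top = 2 ** bit_depth            # one past the largest code value
    counts = []
    for _ in range(stops):
        bottom = top // 2           # one stop down means half the light
        counts.append(top - bottom)
        top = bottom
    return counts

print(linear_values_per_stop(bit_depth=14, stops=6))
# [8192, 4096, 2048, 1024, 512, 256]
```

In this 14-bit example, the brightest stop takes 8192 of the 16,384 available values, while the sixth stop down is left with 256.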

The upshot of this is that in Raw, you can only store roughly the same number of stops of dynamic range as your camera's bit depth. Or, to get cause and effect the right way round: the analog-to-digital converter will have been selected because it has sufficient bit-depth to encode the signals coming off the sensor. It's primarily a question of ensuring you can capture and retain all the information coming off the sensor. Upping the bit-depth beyond what's required to fully encode the signal will not give you 'more subtle gradation' or 'X million more colors,' it'll just mean generating much bigger files that have recorded the noise in the shadows in more detail.

So, why do we record linear Raw files? Because it's the easiest thing to do from a processing perspective, retains all* of the information you first captured and isn't unmanageably big, since you're typically only capturing a single raw value at each pixel, not separate red, green and blue values.

Gamma encoding

Gamma encoding is the process of applying a non-linear transformation to the linear data or, more simply: redistributing the Raw data in a more space-efficient manner. Typically the opposite gamma curve is applied when you open the image to view it, so that you get back to something that looks like the original thing you were trying to capture.
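As a rough illustration of that round trip, the sketch below assumes a simple power-law gamma of 2.2 and full-range 8-bit quantization. Real standards use slightly different curves, so treat the numbers as indicative only.

```python
GAMMA = 2.2  # assumed power; real standards such as sRGB differ slightly

def gamma_encode(linear, gamma=GAMMA):
    """Map linear light in [0, 1] to a gamma-encoded value in [0, 1]."""
    return linear ** (1.0 / gamma)

def gamma_decode(encoded, gamma=GAMMA):
    """Invert gamma_encode, returning linear light in [0, 1]."""
    return encoded ** gamma

# Middle grey (about 18% of clipping), then two and four stops darker.
for linear in (0.18, 0.045, 0.01125):
    code = round(gamma_encode(linear) * 255)    # quantize to 8 bits
    restored = gamma_decode(code / 255)         # what the viewer gets back
    print(f"linear {linear:.5f} -> 8-bit code {code} -> linear {restored:.5f}")
```

Encoding lifts the dark values to much higher code numbers than a linear mapping would, which is where the space saving comes from; decoding applies the opposite curve to recover something close to the original.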

Gamma encoding + tone curve

Because this encoding is non-linear, you can squeeze some or all of the data of a linear Raw file into a much lower bit-depth file. Almost all modern cameras output 8-bit JPEGs, which typically include around nine stops of DR (more when DR expansion modes and adaptive tone curves are used). In principle you could probably fit still more, but in addition to the gamma encoding**, an 'S' curve is typically applied, which gives a nice punchy image.

After gamma encoding, an 'S'-shaped curve is applied to most JPEG files to give an image with a good level of contrast. This assigns nearly 70% of the available 256 values to the four stops around middle grey. This reduces the scope for making changes.

With clever compression, a JPEG can easily be 1/6th the size of a Raw file but still do a good job of conveying everything you saw when you took the image. Or, at least, everything your 8-bit, Standard Dynamic Range display can show. However, because a lot of data has been disposed of, and because the 'S' curve has crushed the highlights and shadows, a JPEG offers much less flexibility if you want to make major changes to it.

An 8-bit file has 256 data values for each color channel, and by the time you've shared those values out between nine stops, there's not much scope for adjusting the values without gaps starting to appear, and posterization creeping in, instead of smooth tonal transitions. It's a great endpoint, though, especially for SDR displays.
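The gaps are easy to demonstrate with a toy adjustment. This sketch assumes a crude edit (multiplying the darkest quarter of the code values by four) rather than any real editing tool's behavior, and counts how many distinct levels reach an 8-bit display.

```python
def distinct_levels_after_stretch(bit_depth, gain=4):
    """Brighten the darkest 1/gain of the code range by `gain`, then count
    how many distinct 8-bit display levels the result occupies."""
    levels = 2 ** bit_depth
    shadow_codes = range(levels // gain)                 # darkest part of the range
    stretched = [code * gain for code in shadow_codes]   # simple linear push, no clipping
    shift = bit_depth - 8                                # drop extra bits for an 8-bit display
    return len({value >> shift for value in stretched})

print(distinct_levels_after_stretch(8))    # 64
print(distinct_levels_after_stretch(10))   # 256
```

Starting from 8 bits, the stretched shadows occupy only 64 of the 256 display levels, leaving gaps that show up as banding. The same push applied to a 10-bit source still fills every level.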

The decision of whether to shoot Raw or JPEG is, broadly speaking, a question of whether you plan to edit the results or present them more or less as-shot. Raw data is typically 12 or 14-bit and although its linear encoding is really inefficient, the fact it hasn't been demosaiced makes it manageable. But for most uses, the final image can be well expressed in an 8-bit JPEG. So why do we need 10-bit options?

Log encoding - a middle ground

Log encoding shares its available values out more evenly: most stops are given the same number of data values, rather than being spectacularly weighted towards the highlights, as they are with linear encoding, or focused on the mid-tones, as they are with most standard tone curves.

Log curves (Sony's S-Log3 in this instance) share the available values much more evenly between the captured data, retaining a good level of flexibility for editing but without the inefficiency of linear encoding. Note that the relationship in the shadows is not logarithmic (where that approach would devote more data values than the original linear capture).

Essentially, it's a clever way of retaining a good degree of editability in a much more efficient file. You can see why it would become a popular way of working with video, where you can retain the flexibility to edit, but still benefit from the very efficient, well-optimized codecs and filetypes that have been developed for video.
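The even sharing can be sketched with a toy curve. The code below assumes a pure logarithmic mapping of 12 stops onto 10-bit codes; it is not S-Log3 or any other manufacturer's curve, which depart from a pure log in the shadows.

```python
import math

def log_encode(linear, stops=12, bit_depth=10):
    """Toy pure-log curve: spreads `stops` stops below clipping evenly
    across the code range. Not any manufacturer's actual curve."""
    max_code = 2 ** bit_depth - 1
    floor = 2.0 ** -stops
    linear = min(max(linear, floor), 1.0)             # clamp to the encoded range
    position = (math.log2(linear) + stops) / stops    # 0 at the floor, 1 at clipping
    return round(position * max_code)

def codes_per_stop(stops=12, bit_depth=10):
    """Code values spanned by each stop, brightest stop first."""
    return [
        log_encode(2.0 ** -s, stops, bit_depth)
        - log_encode(2.0 ** -(s + 1), stops, bit_depth)
        for s in range(stops)
    ]

print(codes_per_stop())
```

Every stop ends up with roughly 85 code values, in contrast to linear encoding, where each stop gets half as many as the one above it.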


A move from 8 to 10-bit means you have 1024 values to share out, so you can retain four times more information about each stop you captured. In turn, this means lots more flexibility if you try to make significant adjustments to color and contrast, with much less risk of posterization.

Typically a manufacturer will look at the performance of its cameras then develop a log curve that can encode most of the camera's usable dynamic range. This is why most manufacturers have ended up with multiple Log curves: you don't want to share your 1024 values out across 14 stops if your camera's output is unusably noisy beyond stop 11.
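The arithmetic behind these choices is simple enough to sketch. This assumes the values are shared out perfectly evenly across the stops, which real log curves only approximate.

```python
def values_per_stop(bit_depth, stops):
    """Average number of code values per stop if shared out evenly."""
    return 2 ** bit_depth / stops

for bit_depth, stops in [(8, 11), (10, 14), (10, 11)]:
    per_stop = values_per_stop(bit_depth, stops)
    print(f"{bit_depth}-bit over {stops} stops: ~{per_stop:.0f} values per stop")
```

Moving from 8 to 10 bits quadruples the values available to each stop, and targeting 11 usable stops instead of 14 leaves noticeably more values for each stop you actually keep.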

For most applications, though, 10-bit encoding of Log gives a big boost in editability without the file sizes becoming too unwieldy.

Why else would I need 10-bit?

So, 10-bit capture lets cameras offer much more editable video, without the added size, and potential compatibility issues (and legal complications) of shooting Raw video. But there are other uses, which promise benefit for both video and stills shooters.

A new generation of TVs is now widely available that can show a wider dynamic range than older, SDR displays. An increasing number of movies and TV shows are being shot in HDR and streaming services can deliver this HDR footage to people's homes.

Hybrid Log Gamma (HLG) and Perceptual Quantizer (PQ) are the two most common ways of encoding HDR data. Both require 10 bits of data because they are trying to preserve a wider tonal range than typical 8-bit footage. Don't get fooled by the word 'Log' in the name HLG: only part of the curve is logarithmic. Both HLG and PQ are, like JPEGs, designed to be end-points, rather than intermediates.
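The 'hybrid' nature of HLG is visible in its encoding function. The sketch below follows the HLG OETF as published in ITU-R BT.2100, with scene light normalized to the range 0 to 1; the function and variable names are ours.

```python
import math

# Constants from the HLG OETF in ITU-R BT.2100.
A = 0.17883277
B = 1 - 4 * A                     # 0.28466892
C = 0.5 - A * math.log(4 * A)     # about 0.55991073

def hlg_oetf(scene_linear):
    """Map normalized scene-linear light in [0, 1] to an HLG signal in [0, 1]."""
    if scene_linear <= 1 / 12:
        return math.sqrt(3 * scene_linear)             # square-root (gamma-like) segment
    return A * math.log(12 * scene_linear - B) + C     # logarithmic segment

for light in (0.01, 1 / 12, 0.5, 1.0):
    print(f"scene light {light:.4f} -> HLG signal {hlg_oetf(light):.4f}")
```

The lower half of the signal range uses a square-root curve, much like a conventional gamma, and only the upper half is logarithmic; the two segments meet at a signal value of 0.5.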

An increasing number of cameras can shoot HLG or PQ video for playback on HDR TVs. But it's also increasingly common for them to offer 10-bit stills based around these standards, for playback on HDR displays.

So 10-bit stills are for lifelike HDR?

What the HEIC? What are HEIF files?

The 10-bit stills modes on most recent cameras shoot the single-image 'HEIC' form of HEIF files, which are essentially single frames of H.265 video. Despite being 10-bit, they can retain the same image quality in a much smaller file, thanks to the more efficient compression. Panasonic instead uses the less common 'HSP' format for its 10-bit stills, which is a part of the HLG standard. This is even less widely supported than HEIF, but you're likely to have to plug your camera into a display to view the files properly, either way.

It's fair to say that there isn't much consistency in what the different camera makers are offering. Some camera makers only allow you to shoot 10-bit files when you're capturing true HDR images, whereas others only offer SDR profiles in HEIF mode, and some let you shoot whatever combination you wish.

From our perspective, there's not a lot of point shooting 10-bit stills with conventional SDR curves: the data isn't stored in a manner that's designed for editing and Raw remains a more powerful option anyway, so all you're doing is capturing something that'll end up looking a lot like a JPEG but isn't as widely supported.

Shooting true HDR images (HLG or PQ) in 10-bit makes a lot more sense. These can look spectacular when shown on an HDR display, with highlights that glisten in a way that's hard to convey in conventional photos. At the moment, though, you typically have to plug your camera into a TV using an HDMI lead to view the images, which isn't very practical. But for us, this is where the value of 10-bit stills lies, photographically.

There's a lot of work to be done across the imaging industry to boost support for true HDR stills. We need editing tools to let us fine-tune Raws into HDR stills, just as we're used to doing when we produce our own JPEGs. But above all, we need wider support and cross-compatibility so that we can share and view 10-bit files without having to connect our camera to the display. Until this is resolved, the ability to shoot 10-bit stills is of disappointingly limited use.

*Or nearly all if your camera offers lossy Raw compression

**Strictly speaking the 'Electro Optical Transfer Function' (EOTF) of sRGB isn't a simple 2.2 gamma transformation, but it's similar enough that it's often referred to as such.
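For the curious, the similarity can be checked numerically. This sketch compares the piecewise sRGB encoding function (the inverse of the display-side function the footnote mentions) with a plain 2.2 power law; the function names are ours.

```python
def srgb_encode(linear):
    """Piecewise sRGB encoding of linear light in [0, 1]."""
    if linear <= 0.0031308:
        return 12.92 * linear                      # linear toe near black
    return 1.055 * linear ** (1 / 2.4) - 0.055     # offset power-law segment

def gamma22_encode(linear):
    """Plain power-law encoding with an assumed gamma of 2.2."""
    return linear ** (1 / 2.2)

for linear in (0.001, 0.01, 0.18, 0.5, 1.0):
    print(f"linear {linear:.3f}: sRGB {srgb_encode(linear):.4f}, "
          f"gamma 2.2 {gamma22_encode(linear):.4f}")
```

The two track each other closely through the mid-tones and highlights, and diverge mainly in the deepest shadows, where sRGB switches to a straight line.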


2022-6-14 17:00