Linguofreak
Well-known member
I'm wondering why an OS (or 3rd party antimalware program) can't just flag and/or kill processes that excessively try to read protected memory. The OS has that sort of info, doesn't it? Sure, you want to allow a few bits/bytes every so often due to innocent software bugs, but at least you would stop an attacker from systematically reading your memory at max rate.
Well, by default, a program *will* crash on the first such attempted access, but there's generally a way for programs to be notified instead of killed so that they can attempt to recover, and this generally isn't rate limited.
But the broader problem is that it can be rather tricky to determine what "trying to read protected memory" actually means.
Does the following code attempt to read protected memory?
Code:
if(never_happens)
{
int i = array[*pointer_to_kernel * cache_line_size];
}
Now, if never_happens is obviously false, the compiler will simply not emit the if statement at all, and even if the body of the if statement is present in the compiled code (or the programmer writes the assembly directly), the CPU will never waste time executing the instructions inside it. However, cleverly written code can make the CPU's branch predictor think that never_happens will usually be true while ensuring that it is actually always false. The result is that the CPU runs that code provisionally while it waits for never_happens to be evaluated, then rolls those instructions back when it figures out that never_happens is false. So in terms of the architectural state of the machine, the program never actually tried to dereference pointer_to_kernel.
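To make that concrete, the training might look something like this (a sketch using the same illustrative names as above; a real exploit would also flush the condition from the cache so the branch resolves slowly, widening the window in which the body runs provisionally):
Code:
```c
/* Run the branch many times with a harmless pointer so the predictor
 * learns "taken", then swap in the kernel pointer on the one round
 * where the condition is actually false. On that round the body runs
 * only speculatively, but its cache side effects survive the rollback. */
char dummy = 0;
for (int round = 0; round < 1000; round++)
{
    int taken = (round < 999);  /* true 999 times, false on the last */
    char *p = taken ? &dummy : pointer_to_kernel;
    if (taken)
    {
        int i = array[*p * cache_line_size];
    }
}
```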
The Meltdown bug is that, on Intel CPUs, as well as some (but not all) ARM CPUs, the CPU will start dereferencing pointer_to_kernel, fetch the value at that memory location, and execute subsequent instructions that depend on that value, all before checking that the program has permission to access that address. If the dereference falls on a code path that is actually taken, an exception will be thrown (the permission check does take place by the time the CPU determines that the execution of the instruction is no longer provisional), and whether or not it falls on a taken code path, the value at that address is never directly accessible to the program. However, whenever the CPU reads a value from memory, it stores it in an on-chip cache in case it needs it later, since accessing RAM takes some time. So when we evaluate
Code:
int i = array[*pointer_to_kernel * cache_line_size];
the value at array[*pointer_to_kernel * cache_line_size] is read from memory and stored in the variable "i", pulling that element of array into the cache, even though the store into "i" is eventually rolled back (either by the exception, or by the branch on never_happens actually going the "false" way rather than the predicted "true" way). It takes longer to access a value that isn't in the cache than one that is, so after doing this, we can step through the array in increments of the cache line size, timing each access, and we'll find that one access to the array is faster than all the others. We take the index of that access, divide it by the cache line size, and that gives us the value at pointer_to_kernel.
And the thing is, this doesn't even all need to happen in the same process. Even if we don't pull shenanigans with branch prediction, and even if the OS kills the process on every segfault with no option for the process to handle the fault, the kernel is mapped at the same addresses in every process, when it's mapped at all (the KPTI patch and the related Windows and macOS patches deal with Meltdown precisely by unmapping the kernel except while a system call is in progress), so we can always do something like this:
Code:
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

pid_t childpid = fork();
if(childpid == -1)
{
//fork failed, retry or exit with an error, whatever we want to do
}
else if (childpid == 0) //we're the child process
{
int i = array[*pointer_to_kernel * cache_line_size]; //this will kill us, but will load a value into the cache
}
else //we're the parent process
{
wait(NULL); //wait until the child process has died. We don't need the status value (we just need to know when the child is dead) so we pass a null pointer.
exfiltrated_secret = get_data_from_cache_timings(array);
}
In other words, we just start a new process, which attempts the access and dies, but still loads data into the cache. This might fail if the parent process isn't the first one to be scheduled after the child dies (another process could evict the interesting cache line in the meantime), but if other programs aren't loading the CPU heavily, which is the case most of the time on desktop machines, this may not be much of an issue. The code above is written for a Unix-type system; process creation semantics will be different on Windows, and array will likely need to be in a shared memory region (whereas with fork() on Unix the parent and child share their whole memory copy-on-write).