I hate software: 2020

Wednesday, August 5, 2020

SVE-2019-15230: A bug collision

Researchers from Team T5 recently published their write-up on exploiting a bug in the Samsung Secure Bootloader (S-Boot) and obtaining code execution in it.

This week, they're going to present it at the BlackHat 2020 conference.

Their write-up contains a lot of technical details and I recommend reading it.

https://teamt5.org/en/posts/blackhat-s-talk-breaking-samsung-s-root-of-trust-exploiting-samsung-secure-boot/

In their report, they say "Another security research team found this vulnerability at the same time and report it to Samsung. ID: SVE-2019-15230".

This one-man security research team was me.

As I described in the previous two posts, in 2019 I got myself a Samsung Galaxy S10 phone with an Exynos SoC and decided to hunt for security bugs.

After finding the first issue (which is also my first SVE and my first report rated "Critical"), "SVE-2019-14371", I decided to carefully review the code around the location where I found the first bug.

I found an integer overflow which could potentially lead to memory corruption, overwriting the entirety of S-Boot code and data.

Funnily enough, I got the SVE even though I had not submitted a POC which achieves code execution (well, to be fair, I reported it almost two months earlier).

What I had submitted was a POC which demonstrates that the device handles non-underflowed values correctly whereas a "huge" buffer size causes it to freeze.
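
To illustrate this class of bug, here is a minimal sketch in Python. It is not the actual S-Boot code: the function names, the header size and the buffer capacity are all hypothetical. It only shows how an unsigned 32-bit subtraction can turn a small attacker-supplied size into a "huge" one, and what a defensive check might look like.

```python
MASK32 = 0xFFFFFFFF  # emulate uint32_t wrap-around in Python

HEADER_SIZE = 16          # hypothetical protocol header size
BUFFER_CAPACITY = 0x1000  # hypothetical download buffer size


def payload_size_unchecked(total_size: int) -> int:
    """Mimics `uint32_t payload = total_size - HEADER_SIZE;` in C."""
    return (total_size - HEADER_SIZE) & MASK32


def payload_size_checked(total_size: int) -> int:
    """Rejects sizes that would wrap around or exceed the buffer."""
    if total_size < HEADER_SIZE:
        raise ValueError("size is smaller than the header")
    payload = total_size - HEADER_SIZE
    if payload > BUFFER_CAPACITY:
        raise ValueError("payload does not fit into the buffer")
    return payload
```

With these assumed constants, a size of 32 yields a payload of 16 bytes in both variants, whereas a size of 8 wraps around to 0xFFFFFFF8 in the unchecked variant and is rejected by the checked one.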

I came to the conclusion that it's hard to exploit because I could not find a device with a suitable memory layout.

I have downloaded a ton of images: for S10, S9, S7, A50, J series phones.

Team T5's trick was to find a condition where the error handler code will make the memory layout good.

I have unfortunately overlooked that in S8 the download buffer is right before the S-Boot code AND it was not using the newer ("compressed" or "smp") download modes.

Since I never figured out how to make S-Boot fall back to the "legacy" buffer at 0xc0000000, I was focusing on the first underflow here, and came to the conclusion that there's not much I could do about it.

I wrote in my report to Samsung that USB transfer is done with DMA and I have not seen S-Boot initialize SMMU so it's surely exploitable.

If we could make the buffer point before S-Boot, of course (which I have not found out how to do).

I like how Team T5 discusses that normally it would be hard to exploit the bug as most code and data could be cached, but as they found a bunch of pointers in an uncached area, overwriting them works, even if it's done by the USB controller (which is not necessarily cache-coherent) and the CPU is unaware.

Then again, it was already the third Critical issue I had found in S-Boot by that time, and I was completely burnt out on trying to develop yet another POC.

Is the security of Samsung phones bad? NO.

I think it's quite on par with the competitors. While the Android Security Updates page (https://security.samsungmobile.com/securityUpdate.smsb) regularly lists "High" and "Critical" issues, very few of them are in the S-Boot bootloader (which is one of the earliest pieces of code to execute on the device). What this means is that for most issues the attacker needs to first unlock the phone. So keeping your phone locked and BT/WLAN off should give a reasonable level of protection in situations when you can't keep an eye on your phone.

There are of course things that could be improved, like adding some mitigations to the bootloader (stack cookies, heap guard pages). However, in this case this would not help at all, because the memory copy (and thus overwriting code/data) is not done by the S-Boot code, but by the USB DMA controller.

Bootloaders almost never implement ASLR, and even kernels which do implement it only randomize virtual addresses, while the physical addresses remain constant or predictable. In fact, as it's overwriting uncached code (exception handlers), it could even work if the CPU supported ARM MTE. So this is in some sense the mother of all S-Boot bugs.

In some respects, the root cause here is very similar to the IROM bug found by Frederic Basse: an integer overflow and copying data from USB, although in the case of IROM it seems that IROM's code is copying USB data in small chunks.

https://fredericb.info/2020/06/exynos-usbdl-unsigned-code-loader-for-exynos-bootrom.html#exynos-usbdl-unsigned-code-loader-for-exynos-bootrom

Of course it would not be fair to judge the design decisions now that we know about these bugs. It would be nice to add some checks to ensure DMA regions don't intersect with code/data. Enabling SysMMU would not hurt. This is becoming somewhat worrisome now that USB4 has been announced with Thunderbolt-like DMA capabilities. It's unfortunate that most bootloaders do not focus on this.
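
As a rough illustration of the kind of check I mean, here is a minimal sketch in Python. Everything in it is an assumption made for the example: the Region type, the function names and all addresses are hypothetical and do not describe any real S-Boot memory map. The idea is simply to validate a DMA destination against the RAM bounds and a list of protected regions before programming the controller.

```python
from typing import NamedTuple, Sequence


class Region(NamedTuple):
    start: int  # address of the first byte
    size: int   # length in bytes

    @property
    def end(self) -> int:
        """Address one past the last byte (exclusive)."""
        return self.start + self.size


def overlaps(a: Region, b: Region) -> bool:
    """True if the two half-open ranges share at least one byte."""
    return a.start < b.end and b.start < a.end


def dma_transfer_allowed(dst: Region, ram: Region,
                         protected: Sequence[Region]) -> bool:
    """Allow a transfer only if it stays in RAM and avoids protected regions."""
    if dst.size <= 0:
        return False
    if dst.start < ram.start or dst.end > ram.end:
        return False
    return not any(overlaps(dst, p) for p in protected)


# Hypothetical memory map, for illustration only.
RAM = Region(0x80000000, 0x80000000)
BOOTLOADER = Region(0xC1000000, 0x200000)

small = Region(0xC0000000, 0x800000)   # ends before the bootloader image
huge = Region(0xC0000000, 0x1100000)   # runs into the bootloader image
```

In this made-up layout, dma_transfer_allowed(small, RAM, [BOOTLOADER]) accepts the transfer and the same call with huge rejects it. Python integers do not wrap, so a C version would additionally have to guard the start + size addition against overflow.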

Now, the problem is that adding mitigations is quite hard, as is reasoning about their effectiveness. Without memory safety you can never be sure that the code is not exploitable. And as we've just seen, even hardware features aimed at memory safety are likely to be bypassed. Maybe a combination of MTE/KASAN and instrumenting all DMA memory management would work, but again it relies on the individual developers thinking of all corner cases. At some point bootloaders/firmware could become as complex as the Linux kernel itself.

To this end, an interesting approach is moving firmware update, including USB download, to user-space and indeed running it from Linux, as Google started doing recently.

https://source.android.com/devices/bootloader/fastbootd

Why are the bugs present then?

It often happens that bugs appear in two areas: the just-written code with new functionality (which not so many people had a chance to review) or the old code (people got tired of trying to find bugs there and gave up).

I think we can draw two conclusions from this.

As a vendor, one should not assume that security or code review is a one-off effort; the code needs to be re-reviewed once in a while, especially bringing in the perspective of new team members.
As a user, if you expect maximum security, perhaps don't switch to the new tech immediately, give it 2-3 months for the most obvious and annoying issues (not only security bugs) to get ironed out.

What can I do to protect myself?

Enable the "Find My Mobile" feature of your Samsung phone.

Before TeamT5 and I reported a few issues to Samsung, S-Boot implemented one mechanism for preventing the phone from being tampered with: when MDM (device administration by a workplace or school) was enabled, certain partitions were not allowed to be flashed.

This, unfortunately, did not protect the user in two scenarios: when they were not using MDM (which is the majority of users) or when the vulnerability is in the USB stack before the flashing code (in which case even enabling MDM would buy you nothing).

In mid-2019, Samsung changed "Find My Mobile" in such a way that it disallows any USB operations (such as ODIN mode) when the device is not unlocked and FMM is enabled.

This should provide a reasonable level of protection for most users.

It is now much harder for an attacker to hack your phone in a few seconds when you're not looking at it (such as at a conference or in the hotel room). Of course, it is still possible to re-program the storage chip by desoldering it, which could expose other potential vulnerabilities, but it's hard to do it quickly and without tamper evidence.

Will you blog about other findings?

Not sure. It's a tricky question.

In general, Samsung is asking to not disclose the issues, at least until the patching and reward process is done, which is understandable. I am grateful that Samsung allowed me to participate in their security reporting program, even though I work for another mobile SoC company, so I'm not particularly interested in making this relationship go sour.

I do, however, see the immense value in describing the bug contents, because for me personally write-ups on Phrack or later Google Project Zero have been useful for understanding how attackers think, which came in handy both when writing code and later when working as a security engineer.

I guess vendors can have their own reasons to not like when bugs are disclosed. Before working as a security engineer myself, I thought negative PR/news was the major reason. Turns out, it's impossible to predict this factor, and often the minor bugs are over-hyped but serious ones go unnoticed. So maybe this factor is not that important after all.

Another thing I've noticed is that there is a significant interest in some hacking and "mobile repair" forums in bugs even for older firmware revisions, which means there are regions where people rarely update (expensive internet) or... a source of phones which for some reason remain unused and therefore not updated for a while.

Given that mobile phones are supported for about 3-4 years (security updates and carrier contracts), maybe half of this term (1.5-2 years from the time the bug is patched) is a reasonable delay for holding off disclosure.

Tuesday, May 5, 2020

On Samsung and Exynos hacking, again

Introduction.

Last year I published a post (http://allsoftwaresucks.blogspot.com/2019/05/reverse-engineering-samsung-exynos-9820.html) about reverse-engineering TEEGRIS and S-Boot
on Samsung Exynos Galaxy S10. This is kind of a follow-up to that post
which has received a lot of attention and led to interesting conversations
with fellow security researchers.

Funnily enough, this very blog with its distinctive URL got into academic papers and
conference talks.
I guess that counts as a success because that's more citations than all my
previous academic work combined. Slowly but steadily I'm progressing on track to receive my PhD from the Shitposting University.

Citations.

(All links have been retrieved on 2020-04-17).
https://gsec.hitb.org/materials/sg2019/D2%20-%20Launching%20Feedback-Driven%20Fuzzing%20on%20TrustZone%20TEE%20-%20Andrey%20Akimov.pdf
Andrey Akimov: Launching feedback-driven fuzzing on TrustZone TEE (HITB GSEC 2019 Singapore).

https://zeronights.ru/wp-content/themes/zeronights-2019/public/materials/5_ZN2019_andrej_akimovLaunching_feedbackdriven_fuzzing_on_TrustZone_TEE.pdf
Andrey Akimov: Launching Feedback-Driven Fuzzing on TrustZone TEE (ZeroNights 2019).

https://blog.quarkslab.com/a-deep-dive-into-samsungs-trustzone-part-1.html
Alexandre Adamski, Joffrey Guilbon, Maxime Peterlin of Quarkslab: A Deep Dive Into Samsung's TrustZone (Part 1)

https://www.usenix.org/system/files/sec20summer_harrison_prepub.pdf
Lee Harrison, Hayawardh Vijayakumar, Rohan Padhye, Koushik Sen, and Michael Grace: PARTEMU: Enabling Dynamic Analysis of Real-World TrustZone Software Using Emulation

https://www.ndss-symposium.org/wp-content/uploads/2020/04/bar2020-23014.pdf
Marcel Busch and Kalle Dirsch: Finding 1-Day Vulnerabilities in Trusted Applications using Selective Symbolic Execution

Follow-up on reverse-engineering and security research.

I also found a few bugs in TEEGRIS and S-Boot that got assigned CVEs by Samsung (check 2019/2020 here).
I'm somewhat happy about this achievement. Prior to that, I mostly worked on
the defense side both implementing mitigations/OS kernels and then debugging
security issues submitted by other researchers. So I was glad to receive this
external validation of my ability to find bugs on my own, although a little
bit surprised at how easy it was to find them with the code review of the
code decompiled with Ghidra.

I have not really found any bugs with fuzzing using the QEMU emulators for
S-Boot and TEEGRIS described in my previous blog post. However, these came
in handy for debugging proof-of-concepts as I could use GDB and dump memory as if
it was just a regular Linux app on the PC.

I would also like to draw your attention to this paper on Phrack
about emulating RKP (Samsung Hypervisor) with QEMU by Aris Thallas.
http://phrack.org/papers/emulating_hypervisors_samsung_rkp.html

I have used a similar approach with full-system QEMU emulation for debugging some RKP bugs.
However, after having spent so much effort on emulating S-Boot and TEEGRIS,
I was not in the mood to boot Linux in EL1 and put all the pieces together.
I used a different approach for testing Hypervisor Calls (HVCs). Instead
of having a proper EL1 client, I wrote a piece of C code that invoked the
EL2 exception handler directly. I then linked it to the address of some
uninteresting function in RKP and used GDB to overwrite the code in QEMU
memory and jump to my stub.

In the Phrack paper, I especially like the part about using QEMU instrumentation
to provide coverage information to AFL.
I have also implemented a similar approach (based on the QEMU and Unicorn modes
from the AFL source tree) for my TEEGRIS QEMU emulator.
https://github.com/astarasikov/qemu/commits/teegris_usermode_persist_rewriteafl
https://twitter.com/astarasikov/status/1187902865710428160

Unfortunately, I have not found any bugs with fuzzing (although I have with code review).
I believe better results could be achieved with the CompareCoverage plugin which
would prevent the fuzzer from getting stuck on magic values/constants.
https://andreafioraldi.github.io/articles/2019/07/20/aflpp-qemu-compcov.html
Additionally, please check out this blog about implementing ASAN (Address Sanitizer)
for binary-mode QEMU within the TCG interpreter/JIT.
https://andreafioraldi.github.io/articles/2019/12/20/sanitized-emulation-with-qasan.html

Finally, if you're interested in fuzzing at the source-code level and are
getting stuck with magic values/constants, please check out this
post from 2016 about a strategy for splitting comparisons (which is related
to CompareCoverage).
https://lafintel.wordpress.com/
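
To show why splitting comparisons helps, here is a minimal sketch in Python. It is not how laf-intel or CompareCoverage are implemented (those work on LLVM IR and inside QEMU respectively); the magic constant and function names are hypothetical. It only models the idea: a single 32-bit comparison gives the fuzzer one bit of feedback, while byte-wise comparisons reward every correctly guessed byte.

```python
MAGIC = 0xDEADBEEF  # hypothetical magic constant the target checks for


def monolithic_check(value: int) -> bool:
    """One comparison: the fuzzer only learns 'equal' or 'not equal'."""
    return value == MAGIC


def split_check(value: int) -> int:
    """Byte-wise comparison, most significant byte first.

    Returns the number of leading bytes that match (0..4). In a transformed
    program each matched byte would be a separate branch, so coverage grows
    as the input gets closer to the magic value.
    """
    matched = 0
    for shift in (24, 16, 8, 0):
        if (value >> shift) & 0xFF != (MAGIC >> shift) & 0xFF:
            break
        matched += 1
    return matched
```

An input such as 0xDEAD0000 fails the monolithic check exactly like a completely wrong input does, but the split version reports two matching bytes, which a coverage-guided fuzzer can keep as progress.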

This is already implemented in libFuzzer, but
if you have to use AFL, consider using AFL++ which maintains LLVM plugins
for these strategies. In any case, check out AFL++ because it attempts to unify
most of the forks developed in academia.
https://github.com/AFLplusplus/AFLplusplus

Other interesting news.

I9100 (Samsung Galaxy S2) upstream work.

I was surprised when I got a GitHub notification in 2020 about a project I have
not worked on before. Turns out, people have been resurrecting the work I've
done back in 2012, which was a nice surprise.

In 2012 I was doing some work on getting FOSS software to run on
the Samsung Galaxy S2 phone. It was a hobby project: I got this phone
after completing my work on porting Linux and Android to Sony Xperia X1 and
hoped that starting with a device which ran Linux out of the box would be
advantageous for this goal.

So the first problem that I solved was getting multi-boot working.
I solved it by porting the U-Boot bootloader.
This eventually resulted in a weird chain of events that landed me several interesting
jobs and gigs.

Anyway, the u-boot.

I then attempted porting the Galaxy S2 board support to the mainline kernel tree.
I was using the latest Linaro tree. I had some limited success in getting most
hardware working with upstream drivers (WIFI, Camera with V4L2) and by porting
some non-upstream ones (Sound, Modem).
https://github.com/astarasikov/i9100-proper-linux-kernel/commits/i9100_linaro_33

Eventually I had to resort to using the Android kernel with some changes
but I got dual-boot working with Ubuntu on the SD Card.

Native Ubuntu (with X11) on Samsung Galaxy S2 (2012)
https://www.youtube.com/watch?v=VHl8PytVt50

Back in 2012 I made a post to summarize my efforts related to S2.
https://www.mail-archive.com/smartphones-userland@linuxtogo.org/msg02865.html

Mainline linux port by Sekil

Fast-forward to 2020, I was surprised to learn that not only are people still
using the device, they are also using my U-Boot port, and one developer even
went as far as resurrecting the attempts to run the mainline Linux tree.
They made great progress and independently authored patches for the mainline
tree which have a high chance of being accepted.

See this port by Evgeniy Stenkin.

This effort is acknowledged and is used by the PostmarketOS project.
https://wiki.postmarketos.org/wiki/Samsung_Galaxy_SII_(samsung-i9100)

FOSS RIL for Samsung Galaxy S2, Galaxy Nexus

Later, my focus switched to reverse-engineering the userspace libraries
in order to provide a fully open-source build of Android for Samsung Galaxy Nexus,
a device which shared the modem with Galaxy S2.

For the previous-generation phone (Galaxy S1, I9000) an open-source implementation
of the Radio Interface Layer (RIL) was provided by the engineers from the Replicant
and OpenMoko projects (Paul Kocialkowski, Simon Busch and morphis).

In 2012 I was asked by Ksys Labs to provide an open-source RIL for Samsung
Galaxy Nexus which happened to have the same modem as Galaxy S2.
So I have done the following:

Firmware loader for these modems (based on reverse-engineering and a C++ implementation by another engineer)
Fixing SMS character encoding so that we could receive SMS in Russian
Fixing some edge cases for USSD support
Providing some rudimentary socket callback protocol so that a proprietary GPS library could be used by those who really wanted to.

These changes have been fully integrated into the Replicant project
and served as the basis for supporting many more Samsung modems.
Some builds of LineageOS for Galaxy S3 also use these libraries from the
Replicant project to avoid the overhead of supporting the ABI for the
proprietary driver libraries from 2012.

I even saw the Replicant stand at the CCC last year so these phones
are living on.
And the dream of supporting it in a non-Android setting such as Ofono
seems to have never materialized. Oh well.

Summary

I am happy to see that my work on both U-Boot and RIL got reused by many projects.
Back in 2012 having your phone run upstream software was a very ambitious goal,
especially for a single developer. It usually took around a year and a half
to get familiar with all the hardware and reverse-engineer it to a decent level
in order to develop all the support by which time the device would get obsolete.
However, if you're more interested in using upstream SW than using the latest
HW, there is some hope.

Oh, and Pinephone looks like a nice alternative these days. The hardware is similar to Galaxy S2, but the CPU is 64-bit and it's FOSS out of the box.

U-Boot without the proprietary bootloader.

Here's another interesting development that happened in those years to another
related Exynos device (Galaxy S3 I9300).
Simon Shields ported U-Boot to Galaxy S3, but unlike my port this one
does not rely on the Samsung bootloader in any way and allows booting the phone
with even fewer proprietary components.
https://blog.forkwhiletrue.me/posts/u-boot-on-galaxy-s3/

Back when I was porting U-Boot to S2, I flashed it into the Linux kernel
partition and made it so that it's loaded by the phone's original bootloader.
My motivation was to avoid bricking the device (back when it was not known
how to use Exynos USB recovery mode) and it was assumed that the bootloader
needed to be signed. As it turned out later around 2014, on these early
Exynos chips the initial bootloader shared the same signing key and device
ID with development boards and it was possible to work around the signing
requirement and replace the original bootloader by using the stage-1 bootloader
from a development board.

KVM on the phone.

Ever since working on the ARM para-virtualization with L4/Genode I wanted
to use real virtualization.
I was very enthusiastic about the first (32-bit) ARM boards with the HYP extension
when they arrived in 2013.
http://allsoftwaresucks.blogspot.com/2013/11/kvm-on-arm-cortex-a15-omap5432-uevm.html

Since then, I've always wanted to get virtualization working on a mobile phone
for the fun of running multiple operating systems.
Unfortunately, most of them enable "secure" booting and require that the EL2
hypervisor image is signed by the OEM.

Some early phones did not implement a hypervisor or left it writable by the OS
but I was wondering if I could do that on a fairly recent and powerful phone.

Here's some small showcase of an attempt to run Windows 10 in KVM on a Samsung
A50 phone with the Exynos9610 CPU.

The bug I found works only on the unlocked phone (with KNOX tripped/fuse blown) before Linux MMU
is on. In principle one might be able to find a variant that works with MMU on,
but even passing arbitrary arguments to RKP would require compromising (rooting)
Linux first. Therefore, this bug does not (IMHO) have a big security impact
(because on older generation Exynos RKP/EL2 was only used for the kernel
memory protection and ROPP/JOPP but not for IOMMU) but is interesting for research purposes.

This is in no way a statement on the security of Samsung devices. I think
their efforts are definitely above average for Android. However, given enough time
any system can be broken, even the ones previously regarded as unbreakable such
as PS4 or iPhone with PAC. Here, timely patching before the issues get disclosed
is important, and it looks like things have improved a lot in the Android world recently.
https://www.zdnet.com/article/android-oem-patch-rates-have-improved-with-nokia-and-google-leading-the-charge/

The bug was patched in October 2019 anyway so users with the latest updates
should not be affected (SVE-2019-15221, SVE-2019-15143).

What I've also learnt from watching a lot of talks and following the discussions
by other researchers is that security issues often concentrate in two areas:
where no one has looked before, and where many people have looked and then given
up because they decided that they had found all the low-hanging fruit. So RKP seemed
like an interesting target given the previous research from Google Project Zero
in 2017 (https://googleprojectzero.blogspot.com/2017/02/lifting-hyper-visor-bypassing-samsungs.html).

I will not be providing additional details on that bug but here are some nice
screenshots and videos:

Ubuntu X11 running on Samsung Galaxy A50. KVM guest runs Windows 10.
Here, we can see that the colors are swapped as the framebuffer driver is configured
to output BGR instead of RGB by default in Android.
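
For reference, here is a minimal sketch in Python of what fixing the swapped colors in software would involve. It assumes a 32-bit pixel with 8 bits per channel; the actual framebuffer format on the A50 is not something I am stating here, and the function names are hypothetical.

```python
def swap_red_blue(pixel: int) -> int:
    """Swap the R and B channels of a 32-bit pixel (8 bits per channel).

    The top byte (alpha or padding) and the green byte stay in place.
    """
    high_channel = (pixel >> 16) & 0xFF
    low_channel = pixel & 0xFF
    return (pixel & 0xFF00FF00) | (low_channel << 16) | high_channel


def swap_frame(pixels):
    """Convert a whole frame given as an iterable of 32-bit pixels."""
    return [swap_red_blue(p) for p in pixels]
```

Applying the function twice returns the original pixel, so the same routine converts in both directions. In practice it would be simpler to reconfigure the framebuffer driver than to convert every frame.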

Video of UEFI booting Windows 10 installer in KVM.
https://twitter.com/astarasikov/status/1249904283098796033

A mysterious BSOD (yes it's actually supposed to be blue) in the USB controller
driver, possibly related to how the controller is emulated in QEMU.

Unfortunately for now I had to stop further work on this project because I accidentally
upgraded the phone to the latest firmware revision and now due to the rollback protection
I can no longer install the vulnerable RKP image.

If you're interested in this kind of stuff, there is good news.
Recently a few open-source phones have appeared which do not enforce secure boot/
signature verification and you can run KVM (or any other hypervisor) out of the box.

For example, multiple people have reported getting KVM and Windows 10 working
on the Pinephone and Pinebook.
Pinephone has a Cortex-A53 CPU with an old Mali GPU so in terms of hardware
it's almost an exact copy of the Galaxy S2 discussed above, but it's more
FOSS-friendly.

https://twitter.com/RealDanct12/status/1231607283412426756
https://twitter.com/Manawyrm/status/1197981073101271040