Congratulations and thanks to @jurajsomorovsky, @NimrodAviram and @ic0nz1 who reported this to AWS in November last year. This was a really interesting find that took a lot of diving! Story ... https://twitter.com/ic0nz1/status/1100413895141773312
We also have our own tests and monitoring for padding oracles. You can see s2n's at https://github.com/awslabs/s2n/blob/master/tests/unit/s2n_cbc_verify_test.c, and re-running these checks would actually show no impact. Head-scratcher!
At this point Steven Collison and Andrew Hourselt from our TLS/SSL team started having to dive really deep to find out what was going on. https://twitter.com/jurajsomorovsky/status/1100420753256841218
First we were able to determine that most of the IPs were using OpenSSL to terminate traffic. But OpenSSL users weren't always impacted. In fact the really common ones, software like nginx and Apache ... not impacted at all!
Tracing through the code of things that were impacted showed that the problem only happened when the application called SSL_shutdown() twice, even after a protocol error.
Calling SSL_shutdown() twice is normal when there's no problem with a connection, and it should be harmless in the error case, so it's understandable that some applications do it ... but thankfully it's not common.
The actual leak of info, whether it was a padding or MAC error, would effectively show up as a timing or connection close difference between these calls. Impacted applications would either seem to time out or close connections differently, depending on the error. Subtle.
O.k. so next question: why don't existing padding oracle tests find this? Well, it turns out to happen only with zero-byte records. Records that have no data in them. And the scanning tool happens to send zero-byte records.
Zero-byte records aren't common: browsers don't send them afaict, and packet dumps seem to show that they are exceedingly rare, which makes sense: if you have no data to send, why would you bother? So that's very reassuring.
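To make "zero-byte record" concrete: in a CBC cipher suite, a record with no application data still carries a MAC and padding, so there is something for a padding check and a MAC check to disagree about. Here's a minimal sketch of the length arithmetic, assuming minimal padding and ignoring the explicit IV that TLS 1.1+ prepends. The function names are made up for illustration and aren't from OpenSSL or s2n.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: value of each padding byte (and of the trailing
 * padding-length byte) when the minimum amount of CBC padding is used. */
size_t cbc_pad_value(size_t data_len, size_t mac_len, size_t block_len)
{
    assert(block_len > 0);
    /* data + MAC + 1 length byte must be rounded up to a whole block. */
    size_t used = (data_len + mac_len + 1) % block_len;
    return (block_len - used) % block_len;
}

/* Hypothetical helper: size of the CBC-encrypted portion of the record
 * (data || MAC || padding || padding-length byte), without any IV. */
size_t cbc_record_len(size_t data_len, size_t mac_len, size_t block_len)
{
    return data_len + mac_len + cbc_pad_value(data_len, mac_len, block_len) + 1;
}
```

With HMAC-SHA1 (20-byte MAC) and a zero-byte payload, that works out to 32 encrypted bytes for AES (16-byte blocks) and 24 for 3DES (8-byte blocks): all MAC and padding, no data. A padding oracle test suite probably wants that case covered explicitly.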
Next weird thing: the problem also only happened if OpenSSL wasn't using AES-NI hardware acceleration. In practice this means it impacted 3DES (which people should have turned off for other reasons!) and older hardware.
This also explained why FIPS software appeared in the list, because FIPS software generally can't use AES-NI.
At this point, a lot of factors have to be combined: TLS software would have to be coded in an uncommon way, using OpenSSL, negotiating older cipher suites, on older hardware, with clients that send 0-byte records and can be made to repeat the same data over and over, with an active MITM.
But that makes it more interesting! How do we find and prevent even these kinds of rarefied cases? Automation, like the scanning tool, is clearly critical - but can we do more at the point of code?
One thing I'm grateful for is that in s2n we kill connections on any error, and we do it in a way where s2n will completely refuse to interact with the connection after the error has happened. Just with a closed flag ... https://github.com/awslabs/s2n/blob/master/tls/s2n_connection.c#L1031
s2n uses OpenSSL's libcrypto for the underlying cryptography, and the same issue in that code /could/ have caused impact within s2n were it not for that practice. Basically this check ... https://github.com/awslabs/s2n/blob/master/tls/s2n_send.c#L94
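For anyone who wants the shape of that pattern without reading the s2n source, here's a minimal sketch. To be clear, this is not s2n's code: the struct, enum and function names are all hypothetical, and the "record_ok" argument just stands in for any failure (bad padding, bad MAC, whatever). The idea is only that every error sets a sticky flag, and every entry point checks the flag before doing anything else.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical connection type for illustration; not s2n's real struct. */
struct demo_conn {
    int closed;        /* sticky: set on the first error, never cleared */
    size_t bytes_sent; /* stand-in for real connection state */
};

enum demo_status { DEMO_OK = 0, DEMO_ERR_CLOSED = -1, DEMO_ERR_PROTOCOL = -2 };

/* Simulated record processing. Any failure kills the connection, and the
 * caller sees the same status regardless of which check failed. */
enum demo_status demo_recv(struct demo_conn *conn, int record_ok)
{
    assert(conn != NULL);
    if (conn->closed) {
        return DEMO_ERR_CLOSED; /* refuse to touch a dead connection */
    }
    if (!record_ok) {
        conn->closed = 1;
        return DEMO_ERR_PROTOCOL;
    }
    return DEMO_OK;
}

/* Simulated send. The closed check comes first, before any other work. */
enum demo_status demo_send(struct demo_conn *conn, const void *buf, size_t len)
{
    assert(conn != NULL);
    (void)buf; /* the payload is not inspected in this sketch */
    if (conn->closed) {
        return DEMO_ERR_CLOSED;
    }
    conn->bytes_sent += len;
    return DEMO_OK;
}
```

Once demo_recv() has failed, a later demo_send() or demo_recv() returns DEMO_ERR_CLOSED straight away, so there is no second call whose behavior could differ depending on what the original error was.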
Of course the impact still would have been small, because of the other factors, but I'm glad we have that check! Anyway, thanks again to the issue reporters (read their paper when it comes out!) and thanks to Andrew and Steven from the TLS team. That's it, unless AMA.