1 00:00:00,179 --> 00:00:03,710 Presenting Jacob Thompson, Independent Security Evaluators. 2 00:00:03,710 --> 00:00:10,710 (Applause) JACOB THOMPSON: I'm Jacob Thompson, and 3 00:00:12,270 --> 00:00:19,270 this is C.R.E.A.M., Cache Rules Evidently Ambiguous, Misunderstood. 4 00:00:22,910 --> 00:00:28,690 Websites use HTTPS for things such as banking and payroll due to the sensitive nature of 5 00:00:28,690 --> 00:00:35,690 the personal information contained therein. The reason they use HTTPS is that the data is too 6 00:00:35,780 --> 00:00:41,829 sensitive to transfer over an open network without encryption. 7 00:00:41,829 --> 00:00:46,440 If it is too sensitive to be sent over the network without protection, maybe it shouldn't 8 00:00:46,440 --> 00:00:52,540 be written to disk without encryption either, especially without the user's knowledge. 9 00:00:52,540 --> 00:00:59,540 In the past, many web browsers were cautious about persistently caching the information, 10 00:01:02,469 --> 00:01:08,830 whether or not the header said to cache it. I am not concerned about memory caching, but 11 00:01:08,830 --> 00:01:15,830 only persistent caching to disk, as in you close the browser and it is still there. 12 00:01:16,200 --> 00:01:23,200 I was offline and bored at one point, so I opened my copy of Firefox. And having nothing 13 00:01:24,640 --> 00:01:31,640 to read, I went to the disk cache. I was very surprised to see things there from my bank, 14 00:01:38,680 --> 00:01:44,979 like check images and account summaries and recent transactions. 15 00:01:44,979 --> 00:01:50,990 My impression was that browsers did not cache this information. After that, we at Independent 16 00:01:50,990 --> 00:01:57,990 Security Evaluators looked at 30 sites, along with other analysts. 
We found 21 of them were 17 00:01:59,680 --> 00:02:06,040 causing information to be persistently cached in the latest browsers, either sending no‑caching‑related 18 00:02:06,040 --> 00:02:12,409 headers, or headers that were nonstandard or obsolete which only worked with certain 19 00:02:12,409 --> 00:02:16,950 browsers. The first thing I will do is show you a couple 20 00:02:16,950 --> 00:02:22,980 pages where we found information cached to disk, and what it was. Then I will look at 21 00:02:22,980 --> 00:02:29,180 the history of why this has been so inconsistent, and how somebody can be confused about whether 22 00:02:29,180 --> 00:02:36,180 this happens at all, and how to prevent it. (Cheers and applause) 23 00:02:41,809 --> 00:02:48,809 Sorry, but this is a first‑time DEF CON speaker. He sincerely asked us: Please do 24 00:02:57,459 --> 00:03:04,459 not interrupt my talk, so we're not going to interrupt his talk. We will just sit up 25 00:03:06,609 --> 00:03:13,609 here quietly and have a drink. Here's to DEF CON! 26 00:03:15,099 --> 00:03:22,099 (Applause) Thank you all for coming. 27 00:03:24,560 --> 00:03:31,560 JACOB THOMPSON: Okay, guys. So the first thing I will show you is some of the information 28 00:03:33,709 --> 00:03:38,769 we found in the disk cache. Then I will go over some of the history of 29 00:03:38,769 --> 00:03:43,900 what browsers used to do, what they do now, what the standard says and what the impressions 30 00:03:43,900 --> 00:03:48,739 are of people when you go on the Internet to places like Stack Overflow and look for this, 31 00:03:48,739 --> 00:03:55,059 and then some recommendations about how we think it can be more secure. 32 00:03:55,059 --> 00:04:02,059 ADP is a popular payroll processing company. Anybody have their checks done with this company? 33 00:04:02,760 --> 00:04:09,409 Lots of you. We had somebody at ISE with previous ADP history. 34 00:04:09,409 --> 00:04:14,939 He logged into the web interface and looked at a payroll statement. 
It was cached with 35 00:04:14,939 --> 00:04:20,919 nice information like the last four digits of the Social Security number and bank account 36 00:04:20,919 --> 00:04:25,430 number which might be used for authentication purposes on other sites, so you can see how 37 00:04:25,430 --> 00:04:30,599 things can possibly go wrong. ADP was sending some caching headers, but 38 00:04:30,599 --> 00:04:37,599 they were obsolete and only interpreted by IE. If you went to Firefox or Chrome, it was 39 00:04:38,189 --> 00:04:43,970 left behind. Another site, Argus, processes claims for 40 00:04:43,970 --> 00:04:50,970 health insurance companies. In Maryland, our Blue Cross Blue Shield company uses Argus 41 00:04:51,509 --> 00:04:58,330 to handle these claims. So we went there to see the pharmacy claims, 42 00:04:58,330 --> 00:05:03,340 and without any caching headers at all, there was the name of the patient and the medications 43 00:05:03,340 --> 00:05:07,740 they were on, as well as the dosages ‑‑ maybe not the best thing to have sitting on 44 00:05:07,740 --> 00:05:13,009 your hard‑drive. Also, it was sent with no caching headers, so even IE might cache 45 00:05:13,009 --> 00:05:18,610 this. The final one was more surprising, Equifax, 46 00:05:18,610 --> 00:05:25,610 which does credit reports. After an ISE analyst accessed his credit report there, it was cached, 47 00:05:28,520 --> 00:05:33,319 including information such as the obvious credit score and name. But also by definition, 48 00:05:33,319 --> 00:05:40,319 a credit report has a list of all your accounts. If you recently have applied for credit, a 49 00:05:43,729 --> 00:05:47,770 common question might be: It looks like you have a mortgage from three years ago, what 50 00:05:47,770 --> 00:05:51,370 is the payment, stuff like that, which you can get from the credit report. 51 00:05:51,370 --> 00:05:58,249 Here is a list of the 21 sites we found with some forms of caching issues. 
Some of them 52 00:05:58,249 --> 00:06:04,919 were big‑name banks. Others were not so big, but a pretty broad spectrum of sites. 53 00:06:04,919 --> 00:06:11,919 Here are some data‑types we found. Some were not so severe, others were more concerning 54 00:06:12,030 --> 00:06:18,800 with information like birthdate and the last four digits of your SSN, and credit card numbers 55 00:06:18,800 --> 00:06:25,800 without expiration dates. Then there were some things for auto insurance, et cetera. 56 00:06:26,620 --> 00:06:32,990 Before I go into the nonstandard headers the sites may have been using, if at all, how 57 00:06:32,990 --> 00:06:39,990 do you prevent disk caching in today's popular browsers? With these two headers. Don't use 58 00:06:40,020 --> 00:06:44,560 them as metatags. There is a historical precedent, but it is 59 00:06:44,560 --> 00:06:51,560 not reliable. Pragma no‑cache goes back to the mid‑'90s when SSL was introduced, 60 00:06:54,099 --> 00:07:01,099 and you need to pass the header due to a special case with IE 8 when the server speaks HTTP 61 00:07:01,900 --> 00:07:08,900 1.0, as opposed to 1.1. For all other cases, including IE 9, cache 62 00:07:11,860 --> 00:07:17,779 control no‑store is what is in the standard specifically for this purpose, talking about preventing 63 00:07:17,779 --> 00:07:23,289 information from being revealed on backup tapes, which is the same concept. 64 00:07:23,289 --> 00:07:30,289 So what are some headers we saw that don't work and fail? Cache control no‑cache. That 65 00:07:31,569 --> 00:07:37,029 is in the standard, but it is about preventing a user from seeing stale information, telling 66 00:07:37,029 --> 00:07:43,340 the browser that you have to revalidate before serving another request; it has nothing to 67 00:07:43,340 --> 00:07:47,960 do with security. Despite that, when Microsoft first implemented 68 00:07:47,960 --> 00:07:54,960 support, they decided to interpret it with the same meaning as no‑store. 
They stayed with that all 69 00:07:55,740 --> 00:08:02,189 the way through IE 9. In IE 10, they started to follow the standard, and this is still 70 00:08:02,189 --> 00:08:09,189 changing up to today. Pragma no‑cache predates HTTP 1.1. And what 71 00:08:09,319 --> 00:08:15,550 it means, it has the meaning of: if it is over SSL, do not write it to disk 72 00:08:15,550 --> 00:08:21,550 cache, and it still works in IE. We saw "cache control private" on a handful 73 00:08:21,550 --> 00:08:28,550 of sites, not intended at all for web browsers. It is about caching proxy servers shared by multiple 74 00:08:28,779 --> 00:08:32,919 users, saying this information is specific to one user, you shouldn't use it to service 75 00:08:32,919 --> 00:08:39,919 another request by a different user. Cache control in meta http‑equiv tags doesn't work. 76 00:08:40,300 --> 00:08:45,230 The pragma does, but there is buggy behavior, and there are more details about that in the 77 00:08:45,230 --> 00:08:52,230 white paper. So it doesn't work, at least for the purpose of preventing disk caching. 78 00:08:53,500 --> 00:09:00,500 Finally, passing the cache control no‑store header when the server uses 1.0 is ignored 79 00:09:00,959 --> 00:09:06,560 by IE 8 and earlier versions, which seems weird. Why would a server send a header 80 00:09:06,560 --> 00:09:13,560 it doesn't understand because it is too old? Until recently, Apache mod_ssl would 81 00:09:15,510 --> 00:09:22,329 automatically downgrade a connection from Version 1.1 to 1.0, if IE was the requesting 82 00:09:22,329 --> 00:09:27,800 browser. This was a bug workaround, and it was still 83 00:09:27,800 --> 00:09:34,430 there until two years ago when it was patched in the main Apache branch. The change still 84 00:09:34,430 --> 00:09:41,430 has not percolated to the various Linux distributions, including the latest CentOS. 
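The list of headers that fail can be turned into a small check. What follows is only a sketch under stated assumptions: the function names `parse_directives` and `audit_cache_headers` and the warning strings are hypothetical, not from the talk or the white paper, and the input is assumed to be the real HTTP response headers (not meta tags) of one HTTPS response.

```python
def parse_directives(value):
    """Split a header value such as 'no-cache, max-age=0' into lowercase names."""
    names = set()
    for part in value.split(","):
        name = part.split("=", 1)[0].strip().lower()
        if name:
            names.add(name)
    return names


def audit_cache_headers(headers):
    """Return a list of warnings for a dict of HTTP response headers.

    Hypothetical helper: it encodes the talk's advice that only
    'Cache-Control: no-store' (plus 'Pragma: no-cache' for older IE)
    keeps a response out of the disk cache.
    """
    lowered = {key.lower(): value for key, value in headers.items()}
    directives = parse_directives(lowered.get("cache-control", ""))
    pragma = parse_directives(lowered.get("pragma", ""))

    warnings = []
    if "no-store" not in directives:
        warnings.append("missing Cache-Control: no-store")
        if "no-cache" in directives:
            # no-cache is about revalidation, not about disk storage.
            warnings.append("no-cache only forces revalidation")
        if "private" in directives:
            # private is aimed at shared proxy caches, not browsers.
            warnings.append("private only restricts shared proxy caches")
    if "no-cache" not in pragma:
        warnings.append("missing Pragma: no-cache for older IE")
    return warnings
```

For example, `audit_cache_headers({"Cache-Control": "private"})` reports the missing `no-store`, the note about `private`, and the missing `Pragma: no-cache`, while a response carrying both recommended headers produces an empty list.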
85 00:09:42,550 --> 00:09:49,550 That behavior is a little weird, something you would never realize until you followed 86 00:09:53,139 --> 00:10:00,139 it through and had a site send the header, and then looked in your cache. In the past, like 87 00:10:06,279 --> 00:10:13,279 with early Netscape, it didn't cache anything over HTTPS. And even today, Safari still works 88 00:10:16,480 --> 00:10:22,209 like that. When a server is sending something over SSL, it is never cached, and the server 89 00:10:22,209 --> 00:10:26,750 has no way to mark something "nonsensitive" as an exception. 90 00:10:26,750 --> 00:10:33,750 Firefox experimented in Version 3 to allow certain things marked "not sensitive," what 91 00:10:34,540 --> 00:10:41,540 I call an opt‑in policy of using a public header, go ahead, it's just a CSS file or 92 00:10:45,100 --> 00:10:48,810 something, it is okay. Then there are other browsers, especially 93 00:10:48,810 --> 00:10:55,300 older ones, that only allow the pragma no‑cache header as a way to mark individual 94 00:10:55,300 --> 00:11:01,500 resources not to be written to the disk cache, which I call "nonstandard opt out" because 95 00:11:01,500 --> 00:11:06,459 nonstandard behavior works, and standard headers don't. 96 00:11:06,459 --> 00:11:13,190 Then IE, as it came into newer versions, started to support cache control no‑store, but pragma 97 00:11:13,190 --> 00:11:18,600 no‑cache is still there. So I call it a "generous" opt out, be generous in what you 98 00:11:18,600 --> 00:11:24,360 accept. It's often applied to other parts of rendering. 99 00:11:24,360 --> 00:11:29,850 I have the three IE versions listed separately because there is individual variance, but 100 00:11:29,850 --> 00:11:35,519 the main idea is still there, old behavior, new behavior, both work. 
101 00:11:35,519 --> 00:11:42,430 New browsers like Chrome and Firefox 4, either the server sends cache control no‑store, 102 00:11:42,430 --> 00:11:49,430 or it gets cached, period, the end. Because of this discrepancy, there is 103 00:11:49,880 --> 00:11:55,500 a lot of confusion out there in the community about what they really do. 104 00:11:55,500 --> 00:12:02,500 If you go on your search engine and search for "browsers do not cache SSL or HTTPS," 105 00:12:04,380 --> 00:12:10,399 you will find old and new results saying web browsers don't cache things to disk if they came 106 00:12:10,399 --> 00:12:15,339 over SSL. Some are from Stack Overflow and blog posts 107 00:12:15,339 --> 00:12:22,339 and even a W3C mailing list, which may have been true when written, and some say SSL is 108 00:12:22,560 --> 00:12:29,560 not cached, except in IE browsers. But that's not true today, especially with Chrome and 109 00:12:30,230 --> 00:12:34,019 Firefox. This quote comes from the OWASP Application 110 00:12:34,019 --> 00:12:41,019 Security FAQ, who should know better. This may have been true with Firefox 2 and earlier, 111 00:12:46,220 --> 00:12:49,810 if those were the browsers you were looking at, but as far as SSL, that part of the standard 112 00:12:49,810 --> 00:12:56,810 is just not there; there is no specific behavior that all browsers follow. 113 00:12:57,279 --> 00:13:03,560 So let's look at the browser developers like Mozilla who decided to change caching to increase 114 00:13:03,560 --> 00:13:10,560 performance. On the "bug" end of things, in deciding to switch the policy from opt‑in 115 00:13:12,329 --> 00:13:18,680 to strict standards compliant opt‑out, one comment on the bug said: Among sites not using 116 00:13:18,680 --> 00:13:24,110 cache control no‑store, the correlation between SSL and sensitive is very low. 117 00:13:24,110 --> 00:13:31,110 But it doesn't work that way for those 21 sites, so where do we go from here? 
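The browser policies described so far can be written down as a small decision function. This is a simplified model based only on the categories named in the talk; the `Policy` enum, the function `written_to_disk`, and its parameters are hypothetical names. It deliberately ignores the per-version IE variance, the IE 8 HTTP 1.0 special case, and IE's older treatment of `no-cache`.

```python
from enum import Enum


class Policy(Enum):
    NEVER_CACHE_HTTPS = "never"                  # early Netscape, Safari
    OPT_IN = "opt-in"                            # Firefox 3: needs Cache-Control: public
    NONSTANDARD_OPT_OUT = "nonstandard opt-out"  # only Pragma: no-cache is honored
    GENEROUS_OPT_OUT = "generous opt-out"        # IE: no-store or Pragma: no-cache
    STRICT_OPT_OUT = "strict opt-out"            # Chrome, Firefox 4+: only no-store


def written_to_disk(policy, cache_control=(), pragma_no_cache=False):
    """Return True if an HTTPS response would reach the disk cache.

    cache_control is an iterable of Cache-Control directive names;
    pragma_no_cache says whether 'Pragma: no-cache' was sent.
    """
    directives = {name.lower() for name in cache_control}
    if policy is Policy.NEVER_CACHE_HTTPS:
        return False
    if policy is Policy.OPT_IN:
        return "public" in directives and "no-store" not in directives
    if policy is Policy.NONSTANDARD_OPT_OUT:
        return not pragma_no_cache
    if policy is Policy.GENEROUS_OPT_OUT:
        return not (pragma_no_cache or "no-store" in directives)
    # STRICT_OPT_OUT: cached unless the server sent no-store.
    return "no-store" not in directives
```

Under this model, a response with only `no-cache` is still written to disk by a strict opt-out browser, which matches the point that the same headers give different results depending on which policy the browser follows.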
Should 118 00:13:35,420 --> 00:13:41,339 browsers assume websites will mark things "sensitive"? The web sites either use headers 119 00:13:41,339 --> 00:13:46,459 they think mark things sensitive, or they don't realize they need to do it in the first place. 120 00:13:46,459 --> 00:13:52,589 What do we think should be done? First of all, the obvious thing is to fix web applications. 121 00:13:52,589 --> 00:13:58,079 After all, the HTTP standard does say this is the header to use if you want to prevent 122 00:13:58,079 --> 00:14:05,079 disk caching. Cross‑browser compatibility is about more than the tags and HTTP requests; 123 00:14:08,839 --> 00:14:15,839 it is also about deeper meanings of the standard that can have security consequences. 124 00:14:17,190 --> 00:14:22,889 I have "fix browsers" as a "maybe," because it can be reasonably said by a browser vendor 125 00:14:22,889 --> 00:14:29,870 that the standard says you send this if you want to prevent caching; however, at the minimum, 126 00:14:29,870 --> 00:14:32,829 browsers should interpret the pragma no‑cache header. 127 00:14:32,829 --> 00:14:39,829 Despite being nonstandard, it had the meaning in IE and earlier versions of Netscape when 128 00:14:42,259 --> 00:14:49,259 they had the opt‑out policy. They could switch from opt out to opt in; browsers are not 129 00:14:51,170 --> 00:14:56,970 required to cache anything. They can go back to the Firefox 3 policy where nothing is cached 130 00:14:56,970 --> 00:15:03,970 unless the server says "cache control public." More important, the bad documentation out 131 00:15:04,709 --> 00:15:11,709 there saying browsers don't cache this or use pragma no‑cache, that should be fixed. 132 00:15:12,839 --> 00:15:17,720 We obviously can't fix mailing list archives, but at least wikis and other things should 133 00:15:17,720 --> 00:15:22,690 be updated, and anyone doing security assessments of web applications, they should be aware 134 00:15:22,690 --> 00:15:25,649 of this issue. 
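On the "fix web applications" side, one way to send the two recommended headers from every response is a thin wrapper around the application. The sketch below assumes a Python WSGI application; the name `no_store_middleware` is hypothetical, and applying it to every response (rather than only sensitive ones) is a design choice made here for simplicity, not something the talk prescribes.

```python
def no_store_middleware(app):
    """Wrap a WSGI app so every response carries the anti-disk-cache headers."""

    def wrapped(environ, start_response):
        def patched_start(status, headers, exc_info=None):
            # Drop any existing caching headers so conflicting values
            # such as 'private' or 'no-cache' are not sent alongside.
            kept = [
                (name, value)
                for name, value in headers
                if name.lower() not in ("cache-control", "pragma")
            ]
            kept.append(("Cache-Control", "no-store"))  # standard header
            kept.append(("Pragma", "no-cache"))         # for older IE
            return start_response(status, kept, exc_info)

        return app(environ, patched_start)

    return wrapped
```

A site that wants static, nonsensitive resources to stay cacheable would wrap only the sensitive routes instead of the whole application.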
Finally ‑‑ and this is probably the most 135 00:15:25,649 --> 00:15:30,459 controversial and least likely to happen. Maybe the HTTP standard should take this into 136 00:15:30,459 --> 00:15:37,459 account that if you look at RFC 2616 and you search for SSL, there is a grand total of 137 00:15:37,839 --> 00:15:42,110 one occurrence. And if you search for HTTPS, there are no 138 00:15:42,110 --> 00:15:47,110 occurrences. So while that may make for a nice layered architecture with the protocol 139 00:15:47,110 --> 00:15:54,110 on top, encryption underneath, you shouldn't be ignoring security consequences or assumptions 140 00:15:54,180 --> 00:16:00,160 people are making, especially when there is a historical behavior among different web 141 00:16:00,160 --> 00:16:03,639 browsers. Finally, I have actually put an HTTPS site 142 00:16:03,639 --> 00:16:10,300 out there that tries different combinations of caching headers so that you can look and 143 00:16:10,300 --> 00:16:13,680 see what really happens in the browser you try. 144 00:16:13,680 --> 00:16:20,680 Before I bring up the site for an experiment, are there any questions? If not, I will go 145 00:16:22,300 --> 00:16:29,300 to the demo site. Safari was on the list as no browser. Chrome was strict compliance, 146 00:16:41,610 --> 00:16:48,339 and the same for Android. On Android, it is possible these things can be getting cached 147 00:16:48,339 --> 00:16:52,970 just like on desktop, and it's harder to change browsers on mobile. 148 00:16:52,970 --> 00:16:59,970 (Question or comment from audience member) JACOB THOMPSON: Something about pdf? I 149 00:17:09,160 --> 00:17:16,160 can't hear the question. The fact it is a pdf file, how does it affect the caching? 150 00:17:23,350 --> 00:17:30,350 Okay. It is possible that due to the fact when it is a pdf, if using Adobe plug‑in, 151 00:17:31,400 --> 00:17:35,400 depending on the implementation, it might be caching to a temporary file. 
152 00:17:35,400 --> 00:17:42,370 We tried it in Firefox, and sending cache control no‑store does work, especially with 153 00:17:42,370 --> 00:17:49,370 the new built‑in reader. This person is asking if you are using a web browser, can 154 00:17:51,280 --> 00:17:55,890 you reconfigure to go back to the old policy or not. 155 00:17:55,890 --> 00:18:02,890 In Internet Explorer there is an option: Do not save encrypted pages to disk. In IE 10 and 11, 156 00:18:04,250 --> 00:18:11,250 it is supposed to work, but there were problems with downloads in earlier versions. 157 00:18:11,760 --> 00:18:18,760 In Firefox there is a hidden browser preference you can set the opposite way to go back to 158 00:18:19,970 --> 00:18:25,590 the no‑cache SSL policy. We tried writing an extension for Chrome, but we didn't get 159 00:18:25,590 --> 00:18:30,020 anywhere with the APIs. There's nothing to do with Safari, because it doesn't cache in 160 00:18:30,020 --> 00:18:37,020 the first place. Would it make sense to encrypt (?) 161 00:18:37,750 --> 00:18:43,440 JACOB THOMPSON: Where do you store the key? You want to exit the browser, go back 162 00:18:43,440 --> 00:18:48,830 in. If you can recover the key, potentially, but so can any other application. 163 00:18:48,830 --> 00:18:55,830 (Question or comment from audience member) JACOB THOMPSON: Or just use a memory cache, 164 00:18:59,060 --> 00:19:04,690 but that is valid for a mobile phone without as much memory. 165 00:19:04,690 --> 00:19:11,690 Ready for the demo site? I am in Firefox. The first thing I will do is clear the 166 00:19:19,180 --> 00:19:26,180 cache so that this is valid. All right, zero bytes. Next I will visit our test page. 167 00:19:33,690 --> 00:19:40,690 You can also do this, if you would want to. It did work earlier. I even have an /etc/hosts 168 00:19:45,650 --> 00:19:52,650 entry. So this is a main page with a little description of the issue, how to check for 169 00:19:52,810 --> 00:19:58,710 it. 
Then I have some iframes linking to pages that 170 00:19:58,710 --> 00:20:03,460 send various combinations of the headers. I have a small explanation of what they are 171 00:20:03,460 --> 00:20:07,710 supposed to do. After you visit this page, I will close the 172 00:20:07,710 --> 00:20:14,550 browser so that we are guaranteed it is in the disk cache, and then go back in. We will 173 00:20:14,550 --> 00:20:21,550 go to the magic URL. It shows all the disk cache entries. I named 174 00:20:24,460 --> 00:20:31,460 those files on the demo site so that it tells you what headers it is sending. So no headers 175 00:20:31,870 --> 00:20:38,870 dot‑HTML is not sending cache‑related headers, no cache control or pragma. 176 00:20:39,990 --> 00:20:46,990 Then cache control no‑cache, I am proving this does not work. So it might be influencing 177 00:20:48,540 --> 00:20:52,490 the decision of whether or not to validate before reuse, but it has nothing to do with 178 00:20:52,490 --> 00:20:59,490 disk caching. Cache control no‑store as a meta-tag, not 179 00:20:59,650 --> 00:21:06,650 there in the headers. It is in here if you look, and that doesn't work either. And it 180 00:21:07,440 --> 00:21:13,580 probably shouldn't work, because for meta-tags to affect caching, you have some weird condition 181 00:21:13,580 --> 00:21:20,180 where you can cache and then parse, or parse and then cache; it would be buggy behavior. 182 00:21:20,180 --> 00:21:26,650 IE does support the meta-tag for pragma headers, and there is documentation of bugs saying 183 00:21:26,650 --> 00:21:30,540 if it is over a certain size, put the tag at the bottom of the page, something like 184 00:21:30,540 --> 00:21:37,510 that, just crazy. Pragma no‑cache is not working here in Firefox. 185 00:21:37,510 --> 00:21:44,510 It would have worked in IE. One more, cache control public and private, and private doesn't 186 00:21:45,730 --> 00:21:50,800 work. 
So you can check the behavior and prove it 187 00:21:50,800 --> 00:21:57,800 to yourself. Any further questions? (Comment or question from audience member) 188 00:22:10,880 --> 00:22:17,880 JACOB THOMPSON: We haven't tested that. The best thing would be the justification 189 00:22:25,420 --> 00:22:32,420 in Firefox about why they changed it between Version 3 and 4. Maybe somebody has done statistical 190 00:22:34,230 --> 00:22:39,500 testing. Nowadays we have bigger (?) 191 00:22:39,500 --> 00:22:46,500 JACOB THOMPSON: That is why they changed it in the first place. As long as the site 192 00:22:47,930 --> 00:22:54,490 does the proper behavior, we would be okay, but that is not happening right now. 193 00:22:54,490 --> 00:23:01,490 This is Firefox 22. It hasn't changed. Anything further? All right. Thank you very much. 194 00:23:08,750 --> 00:23:09,600 (Applause)