>>Okay thank you for come today. Um, our talk is on, uh, hardware tol-tolerant uh hardware. Um, that’s tolerant of Trojans and uh we’re gonna talk a little bit more on that and supply chain security in practice. Um, so the highlights of this talk is uh we start first discussing the private life of keys um there are some weak- also some weak links on the supply chain so we’re gonna bring this up. And then we have the, some lessons learned from airplanes. And then we’ll see how can transfer this crypto hardware, uh then Dan is going to do a demo and describe the architecture we came up with. And then I’ll describe the protocols some mods and and some magic stuff we do. And finally uh we’ll close the presentation with, uh, some talk about politics and how we can explode this to crypto hardware. To begin with, um, there’s this thing that we have priv-private uh key and public keeper. Um, how do we generate that? So first someone somewhere in the developer design house, uh, designs an integrated circuit which is then fabricated to a foundry somewhere else probably. Um And then this integrated circuit is delivered to the hardware vendor to that actually ordered it. Uh, the vendor then, um, loads it’s firm--firmware on it and assembles the actual device that’s gonna use it. The integrated circuit. And um then the device is sent with the customer that bought it. The customer use it-uses the device to generate and store, uh, to generate and store the key. Only, um, the problem with this, uh, the life cy-cycle of keys is that, um, in practice if any of these, uh, steps, gets compromised or an attack happens in them, the final key’s weak or compromised completely. Um, and this is, um, for this reason we have hardware security modules so, um, hardware security modules were built with a purpose of, um, um, protecting keys and tech performing audio prints for the device. So it provides very neat features. Some of them are, uh, cryptographic key generation, storage and management. A lot of these things happen on the device. And then they have a-a whole set of features that have to do with, uh, tamper-proof, tamper-resistance, and, uh, tamper-response. So what they do is basically if you physically manipulate, uh, a HSM, um, then this is gonna be, uh, visible to the owner of it. Uh, also the, uh, manipulating physically any system to retrieve the secrets of uh, that sort of thing. Inside that’s not trivial so they provide tamper resistance. And then tamper response means that the actual HSM, actually the HSM can take action if, uh, detects its being manipulating so it may raise the keys or lock down completely. Um, so this makes it very hard for the adversary to retrieve what’s in there. Um, plus for companies that want to check all the boxes. They HSMs are usually certified with a very high level- really high level of security. And then, valida-validated for it. Um, the bottom line for HSMs is that the all the impressions are being carried out on the device so the secret keys and all the secrets stored inside never have to leave the device. And the- for this reason because they provide really high security they’re using a very um, in lots of publications where high service is needed. So like public key infrastructures, um SSL connections, accelerators, um payment systems is very popular and actually vendors are willing to pay the very high cost that comes with them, um ten k that we have here on the slides is the very very low um end of it, usually it's much higher. And then there are lots of other costs that usually um are hard to quantify but they are- they are quite uh high uh that have to do with integrating the actual HSM once you buy it and then uh operating it and supporting throughout the years. Despite the the high cost HSMs protect only the last few steps of the um private key life cycle so the-the top four um are also are still exposed. And um, there’ve been cases where we’ve seen, um, things going wrong in the first four steps meaning that the actual security of the device is being completely um broken and nonexistent. For this reason and because people know that they try to come up with solutions and there lot-there’s lots of academic literature on it actually. Um the most popular of those are trust foundries so this means that you sent your uh circuit design to a foundry or factory uh that you completely trust not to insert any Trojans in it. Um this, the problem with it is that it’s very expensive and of course mistakes can still happen during publication. The other approach is more academic it’s split manufacturing. There should be few um foundries that support that. It’s still expensive and again errors may happen. And the final one is the post fabrication inspection so what happens is that you order your credit, check it um they manufacture it, you get it back and then you run some tests on it. The problem with this is that it’s expensive. You need expensive tools to do that. You need to constantly retool because the connects uh uh advanced uh and then it's a huge pain because if you’re there are like a thousand a few thousand chips you cannot test all of them. So it doesn’t scale very well. Um, in general overall um it's an arms, an arms race because because uh hardware Trojans connects are constantly advancing and uh adversaries are always and will be always a step forward. So you can never be a hundred percent certain that nothing went wrong throughout the process. Even your trust foundry may sometime uh betray you and cooperate with uh someone. So another note there is another community. A fault tolerant community so not security but had a similar problem and they solved it uh using um tandem systems.So what they do is basically uh instead of using one integrated circuit they use three coming from completely different supply chains and uh the they build therefore they build either dual redundancy systems which allows them to detect if one of those two circuits is misbehaving uh and detect errors in the final results or a triple redundancy systems um where all the computations are being replicated between the three different um processors and they in the end they perform a majority vote about what the uh correct output is. And this is actually used in the autopilots uh on uh commercial aircrafts and uh I think also they use it in space. The problem with uh fault tolerant systems is that uh they are they are built for safety and they do their job very well for that. But um for complicated they replicated computations but for security tuh they don't transfer well at all. Actually they’re bought for security because what you end up having is a system that has three processors starting your secret key. Um Meaning that uh if one of your uh processors is uh compromised and then um uh you're prone to attacks so instead of actually improving your security you increase your attack surface. For this reason we came up with uh the solution where we represent today. Which provides protection and the option for uh first steps of these uh life cycle of keys. I am Vasilios uh I did this work with George Danezis, Dan Cvrcek and Petr Svenda. And here are the ingredients of our solution. So we have two ingredients. One of them is a hardware components and the second is a cryptographic protocols. And we need specific things from uh these kind of uh from these components. So for ICs we need the independent fabrications so they must be fabricated in different facilities and the supply chain leading to them should, their supply chains should be non overlapping. Uh they must be programmable um hopefully affordable and if they are commercial off the shelf but that’s actually even better. Um for cryptographic protocols we want protocols that uh all the parties that participate in them are not trusted. That the secrets are completely distributed and allow them to perform operations um in a distributed matter instead of a centralized one. And they’re provably secure meaning there are maths that support their uh security. So, our um hardware components are smart cards. Because you have many dependent manufacturers with who have their own facilities to produce smart cards. And the supply chains are indeed disjoint. Both in terms of locations, design, and the foundries. They’re programmable and they certify at a very high level, high standard. And they’re commercial off the shelf and pretty cheap actually. And then for the protocols we have um multiparty computation protocols which allows you to do um distributed operations meaning that the key is not on a single point at anytime and you can generate random numbers in a restricted matter. Um key pair do key pair generation which is what we are interested in. Decryption is high which is what we are also interested in. Um the two nice products of those protocols is that they allow you to be securing cases where all but one of your components are malicious and they cooperate with each other. Or they allow you to be secure in cases where all your components are malicious but they don’t cooperate. So now Dan is gonna take over and he’s going to introduce our prototype and then move on with our actual demos. >>Alright thank you. [applause] So with help of Jack I’ll try to show you some live demo. Uh and what was so far pretty much slide here I’ll try to turn it into a product. And reproduct look about this. And we got one prototype or one one one piece here on the table and I will try to use it for to show how the multi party computation security of the design actually works. Uh so what’s inside the box, uh we got many smart cards. This this particular one has got one hundred twenty of them. Uh and the we were able to use them in groups of three to basically shows how scalable it is in properties of protocol that we designed. Uh Probably you say well smart cards is pretty slow cheap device. Well we can talk to them directly at about one megabit per second so in the box basically we are talking for than one hundred twenty megabits per second to smart cards. So I don’t see how one hundred twenty megabits is really that slow even today. Um there is FPGA to connect all the pins together and those ports are connected with standard ethernet to through some internal hardware in mine internal port. And just put it to a rack and use it in scale. So here’s just the main the main parts. So hundred twenty smart cards. Use JavaCards because they are easy to program. And we need some department for JavaCards so we can use really very easily JavaCards from different manufacturers. Something presented earlier at Def- at Black Hat. Uh, so each smart card gives you physical security. Very good. Several layers of physical security that makes it very very difficult to get inside and extract any keys. Command of those things uh all the memory and addressing in the smart card is encrypted so just, uh, deleting one AS key basically destroys all the information uh and makes it basically random data. Uh we’ve FPGA. That basically connects JavaCards. All we use here are protocol to [indiscernible] JavaCards into basically TCP packets. And then you got internal network hop and main Linux server that runs for us basically on untrusted restful server that allows connectivity uh to outside. So we got three demonstrations. Uh, so I will try the first one which is about showing geographically disputed uh control of integrated circuits. Uh what I will use is my laptop. I will use a black box that is next to me. It’s got one hundred twenty smart cards inside and runs the RESTful server. I will talk to but I will also try to connect to instead of one hundred twenty smart cards that are just now sitting in our Cambridge office in UK. So if everything goes well. So we’ll load final reports. So as almost all grey or red because we don’t have any data here yet. Let’s start uh the root connector connect RESTful server to the ritualization report and now let’s reach the pond. Now you can see that what I actually did is I started the server with integration that shows two IP address feeds that hide two sets of smart cards. One is local that goes in the server and there’s one that goes all the way to uh to the UK. Uh, through some commercial ISP uh this is an enumeration of the smart cards. What it’s opening now is basically the uh the RESTful server uh should be starting any second. So what is happening when we look at the server try to connect to all those smart cards. It takes longer than elsewhere. Still not-not started. [inaudible off mic] Yeah here we go. So it took a while uh but again we go there so now you could basically available two hundred forty processors that are each able to basically run uh multiparty computation for us. So the nice thing about the system as we designed it is that not only you can use uh micro control different geographic locations, but each multi party computation, each group can contain processors that are in different physical locations so you can run that com processor here in the room, another processing Cambridge that one actually can be JavaCard simulator. And as such on any platform with its own Intel and SPARC so it gives you really wide range of options to provide different supply chains and complete independent manufacturing processes so basically the only come on point. When you start using crypto-generated keys is a lot of key stop generator key. So, I will quickly switch pond off so it should be much much bigger now I go basically just local. Local. I press for abort in here. I change different report. Also put. So something there already and what I’m going to do now is um use my laptop as a low-low generator. I’m start running the request against the smartcards. So what I’m going to do is basically create circuit independent groups of smartcards with search different keys that can basically uh surf different customers at the same time. So instances been allocated and now basically this is uh transactions per second. We run for about a minute. If I just sort of bit of context imagine that use bit chain or bock chain uh technologic and you’ve got five ten parties in a private scheme and basically each transaction needs ten signatures by ten parties. What you can do with this, this computation is basically involve all those ten parties but it will result in just one signature. But you know the signature needed cooperation of all of them. So it’s much easy to verify signatures cause instead of going to ten different uh lectures, you’ve got uh just one, uh one master copy that you could replicate and verify independently. Right, so this basically just demonstration of the throughput of the whole system. And the scalability and the possibilities will show a bit more uh uh veros about our previous tests. Uh that will it. So the last demo is actual showing how someone can try to attack the whole system. But he needs to put a lot of effort into it to actually succeed. So again just one server that is running here. Uh A laptop connecting it to it and I will have a small group or chart of smartcards that have got backdoor inside. And yet I will try to use a RESTful server, the thing that is facing the internet. I will try to set the key that I will use to some kind of default value so it can easily decrypt data.So I gotta change the dashboard to something new. So I’m using dashboard uh and also uh no threats to basically connect uh all the flows and show you something meaningful. Oh it’s bigger than I thought. Um anyway there are two main flows. One is to generate key. I don’t know how well it is-it is visible. So this is to generate the key and then I got three triggers that will allow me to compromise card one, two, or three. Uh how it happens is uh each time I’ll need to create uh a RESTful request that will allow open the backdoor that will part in the chip and then basically I will try to generate key and see if the key uh is all that I can use and I assume that I expect so I cannot take the whole system. Of the first one will be, so let’s go in slide on the dashboard. So we’ve got sort of an initial state. So now there are three cards used because we needed some extra than before. All is green but you see that there is no key and there is no-no group that we can use. So let’s run create. As a result I got now three addresses so I can certificate address for integrated-integrate, uh-integrated uh-uh circuits that I will use. That requires a little bit of clicking. What I’m going to do is basically do all that CR tech would do figure out which of the uh processors uh that he wants to compromise. So this is the first one done. So it’s second one. And the last one. And I’m almost there. Bear with me. So last bit I need is to say to tell the key generate algorithm which uh which group of cards it should be using. So confirm deploy. Make sure that we’ve got all cards that there are. Uh that they are secure. Now we try to generate new key. We got some delays here so you cannot show the switch quickly. So you see that there is now public key and it’s different than the fixed key that we know the attacker set uh and wants to wants to wants to run the chips to share. So the first one that is to compromise card one. So in the inspector. So the card is compromised. Takes a few seconds. And now basically the new uh new cards new configuration will generate new key that is changed. But you can see that it’s still green. It’s still secure. Uh imagine that basically to get this it is either easier to change the firmware that can be uh verified by us or controlled by different parties or have to compromise manufacturing of the chips anywhere. Uh doing their manufacturing process so if I do part cards two. So compromised. Now imagine that all those keys are elected curve to two five six bit keys. And if at least one part is secure we’ve still got two hundred and fifty six bits of random data, random key. So second attempt still didn’t succeed. I’ll try finally the third card. Now basically expect assume the CR tech are compromised three different places, three different chips that can be under control. Three different parties. Manufactured or running on different hardware from-from chip manufacturers. Now only that many does all three of them you can see that the keys as expected and now you can basically decrypt all the data that we’ve tried to encrypt with the key. All for the signature. On the other hand, if I expect that this can happen and regularly try to refresh the uh the chips that I use. If I refresh just one of them and turn it into secure state then I gain- get key that is absolutely secure from uh cryptographing point of view. Alright, so that was bit of long demo and uh Vasilios will tell you what it- what is actually doing inside. >> [applause] Kay thank you again. So yes we build that system um however in um so in for our demos we used um a group of cards that uh had three cards inside to do all the computations. However someone for whatever reason he may wants to use less or more cards so we tried to obtain other protocols rescalable. So for assigning a decryption we do super well. This means that uh you can use as many cards as you want and the processing time doesn’t increase. In the processing time cooperation. Uh for keys in relation because we need very high assurance. Uh this is not the case but as you can see the increase is linearly so it’s it’s nicely it’s not like you get a devastating delay. Uh for scalability uh on our hardware here we used a hundred and twenty or two hundred and forty if you use hardware remotely. You can have as many um processors as you want. As you can see both operations decryptions in signing signing uh the throughput increases um linearly so the more you add the faster you become so depending on your need you can uh you can needs you can uh decide how many processors you can you want to use. So a little bit more about the magic that’s going on behind the scenes. Uh I keep I kept it uh extremely light in terms of mathematics. So there is nothing there. Um so there are three plus one key points that uh we wanted for for it so far in the algorithm you use. So the first one is that there must be no single processor handling sensitive stuff such as secret keys or anything else at any time. Uh the second one is that uh if one of the processors is misbehaving and is trying to actively attack other processors or trick them into doing stuff uh honest ones, honest processors can detect that. The third one is that if one of the honest processors is being excluded from the protocol execution uh the user can actually tell that this happen. And finally if we could uh-uh come up with a protocol or an algorithm that’s uh is doing well in terms of performance which we did. So a little bit of a side note, um secret setting is a very neat concept. So imagine you have three people and they want to share uh a treasure map. Uh the simple solution would be uh so that they can retrieve the treasure only if all of them uh get together again otherwise no one can actually retrieve the treasure map. So the nice solution would actually be to actually uh cut the uh the treasure map in three pieces and then uh each one of them gets one of the pieces. However there is a problem with this because each people is leaking uh part of the information of where the treasure is and someone may be able to successfully use this information to retrieve the sec- the treasure by himself. So we use there are some schemes called secret setting schemes that they allow you to split the secret into shares. And then you they allow you to recombine the shares to retrieve the original secret to reconstruct it. Um but they have this very neat property that allows- that is- that each share doesn’t leak any information about the-the actual secret. So as long as not all shares um are present you learn nothing about the actual secret that they are hiding. And then there are two parameters that they usually can actually chose to tweek. One is how many shares you split the secret into. And the second one is how many of those shares you need to reconstruct the original secret. So you may actually cut the treasure map into hundred shares and then you need only three of them to reconstruct the original uh map. But in this case for our hardware we use um three out of three um scheme. So this means that we split our secrets into three and then you need three uh processors to uh come together to reconstruct them. Okay so here are the operations. Classic key generation. You go to the HSM. You-you inquire a new private key for yourself. If the HSM um responds he generates a key internally, stores a private key inside, pretends to use a public key. The problem is that if the HSM has a malicious processor this means that uh the processor gains full access to the private key and then the public key that you are getting back you have no idea if this is uh some sort of a weak public key or there’s any other problem with that. Instead what we’ll do is something different. So we have three processors as you can see the bottom of this slide. So on the step one uh you inquire them to generate the public k-uh public private keeper. So they generate the public keys and then through a process they combine them to form uh the common public key that you can see on step four. So what’s interesting about the common public key is that despite the fact that all the keys except one may be compromised as long as the-as long as there’s one key that’s strong uh the final key is uh maintained and secured. So going back to the key points that we evaluate algorithms with. Um we have all of them except the one that uh accounts for performance and this is because on the step, on step three there is no interaction between the processors so this means that there is some slowness there. We’ve seen it previously on the-on the graph where we saw how it scales. So then we have decryption. Pretty similar process. You go to the HSM. You say I want to decrypt this email. HSM knows you’re a key so it decrypts the email for you it tends to do in plaintext. Again this is problem. The HSM needs to have in a single place your full key. Uh instead we do something else. We do distributed decryption so in step one inquire uh from the HSM to decrypt your ciphertext. Uh what happens in the second step the different processors generate what we call decryption shares. So the decryption shares hold no information about the plaintext. They are just shares that then they send to-to Bob and Bob can then combine them by himself to retrieve the plaintext. So the reason I didn’t benefit from that because the HSM never sees the actual plaintext. All the decryption process happens actually by the user himself. Again we have all the key points checked except there must be having a processor one. Because there is no interaction between the um um processors in this protocol it makes no sense. I mean if you misbehave then the cipher the plaintext in the end will make no sense. And it seems silly the Trojan, the hardware of the Trojan hardware will reveal its existence and then the user will know that something is going on with his hardware. Um then we have classic signing. Same process. You supply plaintext and you say I want to sign the plaintext um the IC again needs full access to your private and then it returns you the signature for uh for the document you provided. Instead uh what we do is a bit different. So there is a first step. Um or step zero. Which is casing. So when you set up the device you do some casing uh you do this once for thousands of signatures. So you takes uh about um few minutes and you do this only once in the lifetime of the user. And then you move on with the actual protocol. So this is what you execute when you want to sign the document. What you do is you send the document to the HSM. They generate the signature uh signature cells which are then step three returned to the user and the user combines them to retrieve the signature for the document. Again we’re very efficient and at no point no one learns uh the full um private key of the user. However, so far we’ve discussed cases where you have only uh three processors interacting but our hardware uses many more so we have the problem of how to make that thing scale. And by scaling is basically adding more groups of cards. So we ended up adding forty groups of cards and then we had this key- this problem with a key that yes, it’s card, it’s group of cards can generate its own public private keeper. The problem is that how we can make that key be consistent with what’s the public key of Bob. So can all those groups serve request for- coming from Bob. And we had to come up with another protocol if that’s that. And this is key what we call key replication. So the naive way to do that is you have let’s say group A and group B. Group A has the key for Bob. Um and then uh the the processors inside the inside the group serve um the key serves with the processors inside the Group B. On a one to one basis. So this looks fine however what happens is that if processors um A1, uh B2, and A3 which we can see here is malicious uh collude, they all cooperate then they can retrieve the actual secret. So this is really bad and we don’t want this to happen. Uh so this is not clearly- Clearly this is not the right way to do key replication. What you do instead is you split um each of the processors is splitting its secret with three secrets and it distributes them to um um a processors of Group B and then this is how you do that uh in a secure way. I’m not going into details. It’s pretty easy from a mathematical perspective but we don’t need to know that. Um what’s important is that both by the end of the protocol both groups A and B or whichever other groups you may have. We have forty here. You can have more. Uh can serve requests for the same public key that belongs to Bob or whoever else. So now the politics part of the of the uh talk. So initially. So so far we’ve been saying that our system provides security as long as there is at least one processor in the system so you can have many malicious ones but if there is one that is not honest them you’re okay. However this is not always the case or you cannot always be sure that not all your components are a backdoor. And to be to be um accurate. The adversaries that are capable of introducing hardware backdoors or Trojan horses are mainly governments and because they have access to deep access fabrication facilities. They use very sophisticated techniques and they-their Trojans and their techniques are very hard to detect. And if you detect usually you’re not sure if it's an error or a bug or a manufacturing mistake or an actually malicious act. However they are very secretive and all those things are highly classified and there is no chance that they will share the details of their backdoors with anyone. And we were thinking if we could exploit this and how. So what what this entails is that we are unlikely to collude or cooperate with any adversary. Uh any other adversary. So if you remember the um uh MPC protocols, multiparty computation protocols provide the security guarantees against another class of adversaries. Adversaries that are all malicious but they don’t cooperate. So in this case what you can do is basically you can buy processor from the US, another one from China and another one from Russia but they are fabricated there. And even if all of them are backdoored you can be certain that they will never cooperate and they will never reveal their backdoors um one to the other. So you’re safe and secure despite your hardware being super compromised. So concluding this talk um we yes we introduced the hardware that can tolerate faulty and malicious hardware. Uh we have decent performance that we have scaled nicely so you can move this uh you can serve as many requests as you need. Um we use uh off the shelf components and this is neat so we will show you later how this is very nice. And then all the techniques that we’ve discussed about trusted foundries, split manufacturing and these kind of things um you can of course uh source components that manufacture securely to increase the security of the system. We’re not competing with them. We can actually use them. So what’s in this thing is that yes we took it to an extreme. We built a hardware but what you can do is you can build your own um hardware tolerant device. Uh it's pretty easy. Uh you can buy USB hub and a few card readers depending on how many processors you want to have. Uh you can download our MPC applet. Buy some cards from different countries. Do your research there. Where they’re-where they’re coming from. Which manufacturer has its own fabrication facility and where this is located. Uh some of them are actually um providing um details or you have to pay a little bit more and then they are produced in specific fabrication facilities. So download your applet afterwards. Review the code. Please do that. And then uh yes we upload your applet in the cards and you have your homemade HSM that can serve not many requests but I don’t expect that the single user will generate thousands of keys per hour or something. And yeah. That was it. Thank you. [applause]