Introduction to Homomorphic Encryption

Last edited: January 1, 1970

This is the first of a series of blog posts about the use of homomorphic encryption for deep learning. Here I introduce the basics and terminology as well as link to external resources that might help with a deeper understanding of the topic. I also wrap-up with challenges of applying HE to DL that I should discuss in later blog posts.

Introduction

A breakthrough in the history of cryptography was the introduction of the first fully homomorphic scheme by Craig Gentry in 2009, however, the scheme wasn't practical at that time, so many researchers and engineers were putting efforts to making such scheme practical in industries where privacy is a big concern. Since then many researches have been done to improve performance, many open source tools have been released, and papers have been published showing how practical HE has become in the case of logistic regression, as well as deep learning. We also hope to see tools that lower the barrier of applying homomorphic encryption in deep learning. But let's first take a look at what HE is.

Black Box View

Say we have a function F that performs a certain computation on two elements x and y and outputs a result z : z = F(x, y), an HE scheme lets you perform a certain function (we use the same F in our case) on encrypted elements he_encrypt(F(x, y)) = F(he_encrypt(x), he_encrypt(y)). That function F is generally an addition or a multiplication, having a scheme that supports arbitrary function F is a dream at the time of writing this post (but who knows?), which resulted in some challenges we will discuss later on.

So, what do we get from all this? Well, let's say that you want to perform some computation in the cloud using some data, but still want to preserve its privacy, using HE you can send the encrypted version of your data to the cloud, perform the computation there and get back the encrypted result that you can decrypt later on. All these steps don't require the client to stay connected (to provide additional help during the computation) and that's one benefit of HE.

More Details

Like any encryption, HE schemes use a public key to encrypt plaintexts into ciphertexts, and a secret or private key to decrypt ciphertexts into plaintexts, optionally, they may use an evaluation key to perform operations (addition/multiplication) on ciphertexts.

HE schemes (here I'm talking about the ones that are based on lattices and the RLWE problem) operate in similar manners where they hide the plaintext by adding a secret part that can be taken off using only the secret key plus a noise that doesn't affect decryption, however, some issues arise when this noise gets multiplied or added with other noises while performing evaluations on ciphertexts, we will not go into the details here but you should keep in mind this when you choose parameters for your schemes and how to make sure the noise doesn't completely blind your plaintext.

I will dedicate a blog post to talk more about the CKKS scheme and how it can be used in practice using the Microsoft SEAL library.

Types of HE

Depending on the possible functions F you can compute and how many operations can be chained on a ciphertext, HE schemes can be classified into 3 main types:

Fully-HE (FHE): That's the most useful type to be used for deep learning currently, it allows any number of addition and multiplication operations.
Somewhat-HE (SHE): It allows both addition and multiplication, but we are limited in term of the number of operations we can perform.
Partially-HE (PHE): This type of scheme either allows addition or multiplication, but in an unlimited fashion.

I have dedicated a blog post that you can find here that details more about each type.

About the Security of HE Schemes

If you are concerned with the security of HE schemes, and how broken they can be in the near future, then you should know that most current schemes use lattice-based cryptography which is post-quantum secure. If those words doesn't make sense to you then just keep in mind that there is no algorithm that can break the security of this kind of schemes in polynomial time both in classical and quantum computers, actually, the best known attack runs in exponential time. For more details on how to set parameters to achieve a certain security level, please refer to the Homomorphic Encryption Standard listed in the Additional Resources section.

Challenges of Applying HE for Deep Learning

Most of deep neural networks operations are tensor multiplication and nonlinear activation functions, the former is natively supported by HE schemes, but the latter is not, functions like sigmoid can't be simply applied on encrypted data, however, they can be approximated using a linear function, we can use low degree polynomials to do that, of course there is a tradeoff between precision and performance, you get better precision when you choose a higher degree polynomial but you will also need to do more computation which is something you should care about when doing HE, in general, you should choose the lowest degree polynomial that has a good precision on a range a, b that most of your data fall into.

Deep neural networks also involve a long chain of multiplications which is hard to do with practical schemes (which are basically SHE), we can't apply many multiplications to ciphertexts and still be able to decrypt to the correct value, possible solutions would be to send back ciphertexts to the data owner at some point to be decrypted, re-encrypted and sent back so further multiplications can be done, this requires an active connection with the client which isn't always possible. We can also either adapt the parameters of our scheme to make a certain multiplicative depth possible or use bootstrapping, which can allow any multiplicative depth and thus transforming SHE schemes into FHE, however, bootstrapping operations are costly but some works are trying to make it more practical.

A single ciphertext in the CKKS scheme (and many others) can hold more than a simple number, so someone can increase efficiency by batching a vector into a single ciphertext thus multiplying all the value in the vector with a single instruction (SIMD: Single Instruction Multiple Data), this will also reduce both memory consumption (doesn't require to hold N ciphertexts) and communication costs (as we send and get back ciphertexts to the cloud), however, batching introduces complexity in how to handle operations, for instance, if we put a whole vector into a ciphertext then how should we do dot product operations? Designing a good structure for tensors here is key to make HE practical in deep learning.

Conclusion

Only a decade after introducing the first FHE scheme and we have seen practical application of it, of course, there is still room for improvements and for things to become standardized and production-ready, but let's hope for the best and build solutions that make privacy a trivial thing.