
Signature
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
We are required to encrypt SSN and customer credit card number. The user
will need to perform an exact or partial search on the encrypted SSN or
credit card number. Below select statement can do the job but very slow...
I am still searching the net to see if I can find a solution.
Is there any sample that you can share how the Hashbytes() function work?
Can we do like search on Hashbytes function?
> > Can someone provide me a sample of how to search encrypted column? Using
> > below method seems pretty slow.
[quoted text clipped - 17 lines]
> customer calls in, and don't know his customer number, but only the SSN,
> the helpdesk will not be able to find his data.
Uri Dimant - 28 Jan 2007 10:40 GMT
Brain
More info
http://blogs.msdn.com/lcris/---Enscrypt
> We are required to encrypt SSN and customer credit card number. The user
> will need to perform an exact or partial search on the encrypted SSN or
[quoted text clipped - 28 lines]
>> customer calls in, and don't know his customer number, but only the SSN,
>> the helpdesk will not be able to find his data.
dineshasanka@gmail.com - 28 Jan 2007 10:50 GMT
Yes it is slow, that is the nature of the encryption
> We are required to encrypt SSN and customer credit card number. The user
> will need to perform an exact or partial search on the encrypted SSN or
[quoted text clipped - 33 lines]
> > Books Online for SQL Server 2000 at
> >http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx- Hide quoted text -- Show quoted text -
Erland Sommarskog - 28 Jan 2007 10:56 GMT
> We are required to encrypt SSN and customer credit card number.
Requirements are sometimes set with considering what is technically
possible.
> The user will need to perform an exact or partial search on the
> encrypted SSN or credit card number.
And you cannot encrypt safely and also search quickly - least of all
partial matches. So there is all reason to review the requirements.
> Is there any sample that you can share how the Hashbytes() function work?
> Can we do like search on Hashbytes function?
INSERT tbl (hashedssn, ....)
VALUES (HashBytes(@ssn), ...)
SELECT ...
FROM tbl
WHERE hashedssn = HashBytes(@ssn)
It may be better to hash client-side, since it that case you will not
send the key value in clear-text over the wire. This is particularly
important for credit-card numbers. Yes, I assume that you use a encrypted
connection, but never the less.

Signature
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
Mike C# - 28 Jan 2007 16:50 GMT
As Erland, Uri and Dinesh pointed out, encrypting data and searching quickly
are opposing goals. Basically you cannot have both.
For a full string exact match (like a full CC# or full SSN) you can do as
Erland suggested and use HashBytes() to store hashes of the encrypted data
in additional columns. As he also pointed out, this defeats the purpose of
the Random IV generated during the encryption process. It also makes your
system less secure. You can use Raul Garcia's proposed MAC indexing method
at http://blogs.msdn.com/raulga/archive/2006/03/11/549754.aspx to index your
encrypted data. Again this only works for full string exact matches.
If you want to perform "partial" matches of encrypted data, you are stuck
with decrypting each piece of data to do the comparison. As you found out,
this is very slow.
Your best bet is to re-visit those requirements. I don't know of any
businesses that allow partial searches on SSN or CC#. What good does it
really do you to perform a partial search on an SSN for '01' or '987'? Odds
are good you will pull back a lot more results than you really want, and the
performance is going to suck. Similarly for credit card numbers, most
companies that actually store full CC information require either: a) the
full CC#, or b) the last 4 or 5 digits of the CC#, for verification
purposes. Storing a hash of the last 4 or 5 digits of a CC# for searching
or verification purposes is a lot more reasonable than being able to pull up
all CC#s with "98" anywhere in them. I can't imagine the big CC corps would
even allow partial matching of CC#s in the manner you propose. There's no
reason for them to allow it. You might want to double-check your agreements
with your CC processor or with the CC corps themselves to make sure it's
even allowable under your contracts.
> We are required to encrypt SSN and customer credit card number. The user
> will need to perform an exact or partial search on the encrypted SSN or
[quoted text clipped - 28 lines]
>> customer calls in, and don't know his customer number, but only the SSN,
>> the helpdesk will not be able to find his data.
Brian - 28 Jan 2007 19:41 GMT
Thanks for all your advise. It helps to understand the encryption and
searching are in opposite goal.
As HashBytes() defeats the purpose of the Random IV generated during
encryption process, I'll look at Raul Garcia's proposed MAC indexing method.
It could help to improve the searching. Raul's method is a bit complex and I
will need to study in order to understand it.
To clarify the search function on CC#, we need to provide a partial or exact
CC# search for our security department to find the suspicious transaction.
We print the last 4 digits of the CC# on the transaction along with the
transaction amount. To identify the CC#, they will need to perform the CC#
search along with the date/time and transaction amount from our system. We
even have FBI came in to investigate fraudulent credit card activities. This
will be the only system that we can provide them a way to recall the entire
card number. For now, it will be slow to decrypt all this column since we
are required to encrypt them. I will check with Raul's method to see if we
can improve the search for them later.
Again, thanks for all your advise!
Brian
> As Erland, Uri and Dinesh pointed out, encrypting data and searching quickly
> are opposing goals. Basically you cannot have both.
[quoted text clipped - 58 lines]
> >> customer calls in, and don't know his customer number, but only the SSN,
> >> the helpdesk will not be able to find his data.
Mike C# - 28 Jan 2007 23:17 GMT
> Thanks for all your advise. It helps to understand the encryption and
> searching are in opposite goal.
[quoted text clipped - 22 lines]
> we
> can improve the search for them later.
From what you say, it sounds reasonable to assume that your partial CC#
searches will be performed only on the last 4 digits of the CC#. If so,
storing a hash of the last 4 digits in another column can speed up your
searches, and should not compromise the security of your encrypted CC#
column. If, however, you need to perform searches on the middle 8 digits of
a CC# or something else "outside the norm" like that, you're in for some
serious performance headaches.
Erland Sommarskog - 28 Jan 2007 23:24 GMT
> To clarify the search function on CC#, we need to provide a partial or
> exact CC# search for our security department to find the suspicious
[quoted text clipped - 7 lines]
> them. I will check with Raul's method to see if we can improve the
> search for them later.
If the partial search on credit-card number is always on the same digits,
you could stored these in a separate column. Possibly it could here be
acceptable to stored it unencrypted, or at least hashed.

Signature
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx