While trying to solve the problem of anomaly detection in accesses to a public cloud resource, in order to identify troubling access patterns, I think I have convinced myself that privacy and security are mutually exclusive.
From the lawsuit that killed the second Netflix Prize, and subsequent research into user identification (Naren Ramakrishnan, Benjamin J. Keller, Batul J. Mirza, Ananth Y. Grama, and George Karypis (2001). "Privacy Risks in Recommender Systems". IEEE Internet Computing 5(6): 54-62), it appears to me that you can't solve security on a public resource without violating privacy.
Is there research that proves this one way or another?
There is research that addresses these privacy issues under the name of privacy-respecting secure systems. Privacy-respecting intrusion detection is an example: http://www.springer.com/computer/security+and+cryptology/book/978-0-387-34346-4
In general, the fundamental question to answer (beyond the specific issues of public resources raised by the IEEE article you mention) is how we design a system that needs to identify users as part of a process without compromising the privacy of those users. This is a very difficult problem, which is evident not only in public systems but also in monitoring systems (I am trying to do some research on logging/audit engines) and in systems not exposed to public/open/web interfaces.
In terms of what you can prove with this research: you can certainly prove that this is a big challenge to tackle. However, the theory and protocols are there, and in most cases they are waiting for practical implementation.
I hope this helps you a bit.
GM
Privacy is a form of logical security, and it differs from one context to another.
In response to George: thank you very much for the pointer to the privacy-respecting security research.
A quick summary of the research you linked is: "Privacy and surveillance by intrusion detection are potentially conflicting organizational and legal requirements. In order to support a balanced solution, audit data is inspected for personal data and identifiers referring to real persons are substituted by transaction-based pseudonyms. These pseudonyms are constructed as shares for a suitably adapted version of Shamir's cryptographic approach to secret sharing. Under sufficient suspicion, expressed as a threshold on shares, audit analyzers can perform reidentification."
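To make that threshold mechanism concrete for myself, here is a minimal sketch of Shamir-style secret sharing (my own illustration, not the authors' implementation; the prime, the integer encoding of identities, and the one-share-per-flagged-event policy are all assumptions):

```python
# Sketch of threshold reidentification via Shamir secret sharing.
# One pseudonymous share is emitted per flagged event; the analyzer can
# only recover the identity once it holds at least `threshold` shares.
import random

PRIME = 2**127 - 1  # field large enough to encode user identities

def make_shares(secret: int, threshold: int, count: int) -> list[tuple[int, int]]:
    """Split `secret` into `count` shares; any `threshold` of them recover it."""
    # Random polynomial of degree threshold-1 with the secret as constant term.
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(threshold - 1)]
    def f(x: int) -> int:
        return sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME
    return [(x, f(x)) for x in range(1, count + 1)]

def recover(shares: list[tuple[int, int]]) -> int:
    """Lagrange interpolation at x=0 over GF(PRIME)."""
    secret = 0
    for xi, yi in shares:
        num, den = 1, 1
        for xj, _ in shares:
            if xj != xi:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

user_id = 424242                                   # integer-encoded identity
shares = make_shares(user_id, threshold=3, count=5)
print(recover(shares[:3]) == user_id)              # True: enough suspicion
# With fewer than `threshold` shares, recovery yields an unrelated value
# (with overwhelming probability), so the user stays pseudonymous.
```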
I read the quoted approach as a simple firewall in front of the security data, which pushes the privacy legality question to a decision about 'reasonable cause'. That is a legal-system solution, not a technical one.
Fundamentally, to identify a security breach, I need to be able to answer who, what, when, where, and how (not necessarily in that order). To answer those questions, I need to record these variables over reasonably long periods of time, so that the system can learn behaviors and determine what the norm is. Behaviors that fall outside that norm are by definition 'special', and that fundamental ability to identify these special cases appears to be a privacy concern, because it is not a causal relationship you are triggering on but an associative one. The person of interest is labeled by a behavior that in itself is not necessarily illegal. A human judgement would then take this to 'reasonable cause', and that seems to be a murky threshold. A system using pseudonyms does not fundamentally change this uncertainty.
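To illustrate the kind of norm-learning I mean, a minimal sketch (the single numeric feature and the z-score threshold are my own simplifying assumptions; a real system would track many variables over long periods):

```python
# Sketch of norm-learning and deviation flagging.
from statistics import mean, stdev

def build_baseline(history: list[float]) -> tuple[float, float]:
    """Learn the 'norm' from long-term per-user observations."""
    return mean(history), stdev(history)

def is_special(observation: float, baseline: tuple[float, float],
               k: float = 3.0) -> bool:
    """Flag behavior outside the norm (a z-score test). Note: 'special' is
    an associative label, not a causal one; the flag says nothing about
    intent, which is exactly the privacy concern."""
    mu, sigma = baseline
    return abs(observation - mu) > k * sigma

baseline = build_baseline([10, 12, 9, 11, 10, 13, 11])  # e.g. requests/hour
print(is_special(48, baseline))                          # True: out of norm
```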
I'd love to learn more about your insights into this problem.
More about privacy: Biskup and Flegel,
http://anorien.cs.uni-dortmund.de/static/user/flegel/publications/Biskup_Flegel_2000b.pdf
"Apart from several exceptions, processing personal data is legitimate only
when the data subject has unambiguously given his consent. Systems therefore
must be able to request and store freely given, specific and informed indication of his wishes by which the data subject signifies his agreement. Controllers
are obliged to inform data subjects before, or notify them after collection, respectively, about certain facts and rights. This can partly be carried out by
the system. Data subjects have the right to choose acting anonymously within
a system, unless laws collide. Referring to the instruments of information and
notification, the data subjects must be enabled to make informed decisions regarding their options. Default settings should choose the maximum achievable
anonymity and settings should allow for differentiation of various transaction
contexts. Systems need to be designed such that data subjects are not subject
of solely automatic decisions producing legal effects concerning them."
To add clarity to my concerns: I am designing a security system for a multi-tenant public cloud. This implies that inside the perimeter of my data center operate public-facing web services that could attack other tenants in my ecosystem. These web services may have a rider that forces their customers to abide by the above system of consent, but they do not have that legal relationship with the cloud provider which offers the physical infrastructure. That is the first point, and it directs me to a contract-law-based solution, in which such relationships need to become transitive. The second point: most users may cycle between wanting to be anonymous and wanting to be well served. These legal constructs make it impossible to separate those two modes: either you consent or you don't get service. The third point is geographical: with public-facing web servers, requests can come from anywhere in the world. The answer to that would be to serve bits to areas of a certain legal framework only from within that area, which would kill any lean start-up business model.
Some research from the legal community:
Paul Ohm, "Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization", UCLA Law Review, Vol. 57, p. 1701, 2010; University of Colorado Law Legal Studies Research Paper No. 9-12, August 13, 2009.
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1450006
Abstract:
Computer scientists have recently undermined our faith in the privacy-protecting power of anonymization, the name for techniques for protecting the privacy of individuals in large databases by deleting information like names and social security numbers. These scientists have demonstrated they can often 'reidentify' or 'deanonymize' individuals hidden in anonymized data with astonishing ease. By understanding this research, we will realize we have made a mistake, labored beneath a fundamental misunderstanding, which has assured us much less privacy than we have assumed. This mistake pervades nearly every information privacy law, regulation, and debate, yet regulators and legal scholars have paid it scant attention. We must respond to the surprising failure of anonymization, and this Article provides the tools to do so.
Keywords: privacy, information privacy, anonymization, reidentification, deidentification, HIPAA, Data Protection Directive
And the conclusions from that paper:
CONCLUSION
Easy reidentification represents a sea change not only in technology but in our understanding of privacy. It undermines decades of assumptions about robust anonymization, assumptions that have charted the course for business relationships, individual choices, and government regulations. Regulators must respond rapidly and forcefully to this disruptive technological shift, to restore balance to the law and protect all of us from imminent, significant harm. They must do this without leaning on the easy-to-apply, appealingly nondisruptive, but hopelessly flawed crutch of personally identifiable information. This Article offers the difficult but necessary way forward: Regulators must use the factors provided to assess the risks of reidentification and carefully balance these risks against countervailing values.

Although reidentification science poses significant new challenges, it also lifts the veil that for too long has obscured privacy debates. By focusing regulators and other participants in these debates much more sharply on the costs and benefits of unfettered information flow, reidentification will make us answer questions we have too long avoided. We face new challenges, indeed, but we should embrace this opportunity to reexamine old privacy questions under a powerful new light.
The basic idea in the paper is to ask new questions about how we regulate the use of private information.
I would like to see research that has looked at the information-theoretic aspects, to obtain a proof of whether security and privacy are truly mutually exclusive. They must, by their very nature, draw from the same information base: for the security system to identify you as nefarious and/or fraudulent, it must map your behavior and contrast it with non-nefarious and non-fraudulent behavior.
Dear Theodore,
Thanks for your kind words. To address some of your issues: you wrote, "to identify a security breach, I need to be able to answer who, what, when, where, and how (not necessarily in that order). To answer those questions, I need to record these variables over reasonably long periods of time, so that the system can learn behaviors and determine what the norm is".
I see that you have requested the LUARM publication. I have uploaded the full text here, but in case you and other folks would like to review author-downloadable copies of most of my papers, you can find them here:
http://folk.uio.no/georgios/
LUARM should give you a good platform to monitor a set of events. It's a prototype engine written for Linux, but the principle applies to other OSes. The source code is available at SourceForge: http://sourceforge.net/projects/luarm/
Give it a go and I can give you a hand if you wish.
What I am trying to do at the moment (I need to obtain funding first) is to look at the data set the engine produces and see, first of all, which parts can be anonymized, in addition to a "technical binding procedure" that binds the request to de-anonymize things and reach the user ID to multiple independent authorities.
In terms of your envisaged multi-tenant public cloud design, what I would do is look at how I could adapt transient authentication techniques. Instead of having a token that only certifies the authenticity of a user, what would happen if that token could have two operation modes (a full-id mode and a pseudo-id mode, i.e. a reference to a binding request)? That way you could maintain anonymity and have some confidence that you maintain a service level to a user that does exist but wishes to remain anonymous for the purposes of some allowed transactions.
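A rough sketch of what such a dual-mode token could look like (my own illustration, not a vetted design; the HMAC construction, the key split between the provider and an independent binding authority, and the field names are all assumptions):

```python
# Sketch of a two-mode authentication token: full-id vs. pseudo-id.
import hashlib
import hmac
import json
import secrets

BINDING_KEY = secrets.token_bytes(32)  # held by the independent authority
SERVICE_KEY = secrets.token_bytes(32)  # held by the cloud provider

def pseudonym(user_id: str, ctx: str) -> str:
    """Transaction-scoped pseudonym; unlinkable to the real identity
    without the binding key, which only the binding authority holds."""
    msg = f"{user_id}|{ctx}".encode()
    return hmac.new(BINDING_KEY, msg, hashlib.sha256).hexdigest()[:16]

def issue_token(user_id: str, ctx: str, full_id: bool) -> str:
    """Issue a token in full-id mode or in pseudo-id mode."""
    subject = user_id if full_id else pseudonym(user_id, ctx)
    body = json.dumps({"sub": subject, "ctx": ctx,
                       "mode": "full" if full_id else "pseudo"})
    tag = hmac.new(SERVICE_KEY, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}.{tag}"

def verify_token(token: str):
    """The provider checks authenticity without learning the real identity."""
    body, tag = token.rsplit(".", 1)
    expected = hmac.new(SERVICE_KEY, body.encode(), hashlib.sha256).hexdigest()
    return json.loads(body) if hmac.compare_digest(tag, expected) else None

tok = issue_token("alice@example.com", "billing-ctx", full_id=False)
print(verify_token(tok))  # authentic, but the subject is a pseudonym
```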
Best regards,
GM
Privacy and security are mutual to each other. Look at it this way: why do we have fences up around our homes? If privacy and security were opposing each other, then why bother with having walls in our homes? Why put locks on our doors? I would imagine you have curtains or shades over your windows to protect your spouse and children from unwanted viewers.
From the standpoint of digital security, would you want your medical records available for public review? Security enables us to have privacy. Privacy does not enable us to have security.
The old saying of "security through obscurity" was shredded to bytes years ago. This demonstrates their interdependence: for privacy to work, we must have security.
ARPANET was designed to provide information freedom, and IPv4 shows how security was never designed into digital traffic flow. For those who wish to believe that information should be free: ask them about having their credit history, medical records, family history, criminal records of family members, and such posted as free information.
We all need privacy, which is why we all need security to protect that privacy. By the way, any idiot can break into a database and expose information. It takes intelligence to keep that same act from happening.
Hello,
Let me propose another point of view. If you pay for a service that you receive, for example from an internet provider, they need to know your ID, and you need to know where to send the money. Both parties then know some information about the other, and this is very important to guarantee the service. Does this process violate privacy? Maybe yes. But it is necessary.
The same occurs in the world of security. You can solve the problem by doing any number of things to guarantee privacy in the first place, but then the cost is transferred to the process of identifying intruders or attackers. You decide whether identification is important or not. But if you put privacy ahead of security, security will be affected; if you put privacy after security, privacy will be affected.
My apologies for my English.
Thanks
Habib: Excellent information. A simple Google search yielded a lot of papers on unlinkability. I need to do some reading to see if this fits the bill. Thanks for the pointers.
@Bob, I'd disagree with your statement "Security enables us to have privacy. Privacy does not enable us to have security." Privacy does enable us to have security.
Habib: thanks for the paper. It has added new information to my search. However, to give you a better idea about the problem I am trying to solve, take a look at this paper:
http://www.public.asu.edu/~kxu01/papers/2012/SPCC12.pdf
I am looking for BKMs (best-known methods) that address cross-tenant attacks inside a public cloud. Abnormality-flagging methods model normal patterns of behavior and thus can contrast them against abnormal behaviors that might be suspect. However, that ability to model normalcy is a privacy concern, because with this information the system will be able to detect completely different behaviors that are none of the public cloud's concern. For example, the system could pick up the abnormality of insomnia, possibly a sign of stress, with all its negative connotations. It could point to fraudulent behavior, but it could point to having money problems too. The first is a legitimate concern for a security system; the second is not. But if the information exists, a subpoena is not far away, in this case possibly from a mortgage lender. I am looking for research that addresses this tension between the demands for security and the conflicting demands for privacy.
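To make the insomnia example concrete, here is a minimal sketch of how a per-user baseline built for security purposes incidentally encodes lifestyle information (assuming we log nothing more than access timestamps):

```python
# Sketch: the baseline a detector needs also leaks lifestyle signals.
from collections import Counter
from datetime import datetime

def hour_profile(timestamps: list[datetime]) -> Counter:
    """Per-user histogram of access hours: the 'normal pattern' a
    detector learns in order to flag abnormal access."""
    return Counter(t.hour for t in timestamps)

def night_activity_ratio(profile: Counter) -> float:
    """Fraction of accesses between 01:00 and 05:00. Useful to the
    detector as a baseline feature, but it equally reveals insomnia,
    night work, or a different time zone: none of the cloud's business."""
    total = sum(profile.values()) or 1
    return sum(profile[h] for h in range(1, 5)) / total
```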
This is a great paper that researched the problem of cross-tenant information leakage on AWS.
http://cseweb.ucsd.edu/~hovav/dist/cloudsec.pdf
Their appendix is a great list of questions, but offers no answers to the tension between security and privacy.
And this is even better as an example of the threat of connecting behavioral information with public information.
http://m.youtube.com/#/watch?v=F7pYHN9iC9I&desktop_uri=%2Fwatch%3Fv%3DF7pYHN9iC9I
My former PhD student Wouter Teepe wrote an interesting PhD thesis motivated by exactly your question: do security and privacy requirements clash? His answer is a resounding "no", as evidenced by the title of the thesis, "Reconciling information exchange and confidentiality: A formal approach" (2007), downloadable from http://www.rinekeverbrugge.nl/PhD-1.html.
Teepe proposes several protocols for 'comparing information without leaking it'. A possible application of his methods (in Chapters 9 and 10 of the thesis) is that, for example, the FBI does not have to see the whole passenger list of a flight, but can inspect only those passengers that were already on its own list of suspicious people.
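To give a flavor of 'comparing information without leaking it', here is a minimal sketch of a Diffie-Hellman-style private set intersection (a generic illustration, not Teepe's actual construction, and it omits the hardening, such as proper hashing into a group, that a real deployment would need):

```python
# Sketch of DH-style private set intersection: each party blinds hashed
# items with a secret exponent; because the exponentiations commute,
# matches can be found without revealing the non-matching items.
import hashlib
import secrets

P = 2**255 - 19  # a large prime; a vetted group would be used in practice

def h(item: str) -> int:
    return int.from_bytes(hashlib.sha256(item.encode()).digest(), "big") % P

def blind(values, key: int) -> set:
    return {pow(v, key, P) for v in values}

# The FBI holds a watchlist; the airline holds a passenger list.
fbi_key = secrets.randbelow(P - 2) + 1
air_key = secrets.randbelow(P - 2) + 1
watchlist = ["mallory", "trudy"]
passengers = ["alice", "bob", "trudy"]

# Each side blinds its own items, exchanges them, then re-blinds the
# other side's items with its own key.
fbi_once = blind((h(x) for x in watchlist), fbi_key)
air_once = blind((h(x) for x in passengers), air_key)
fbi_twice = blind(air_once, fbi_key)   # FBI re-blinds airline's items
air_twice = blind(fbi_once, air_key)   # airline re-blinds FBI's items

# Only the doubly-blinded overlap matches; everything else stays hidden.
print(len(fbi_twice & air_twice))  # 1 -> exactly one passenger matches
```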
Rineke: thank you for that reference. I see conceptually that this approach could work, but I don't yet see a good solution to manage the complexity of designing information designators when the demands on information integration explode. The more information needs to be integrated to identify malicious access (firewall logs, application logs, file system logs, etc.), the faster the number of proper designators grows. Furthermore, during discovery, when you don't yet know which designators might reveal the information you are seeking, there seems to be a period of limbo in which the technique would not apply.
In the context of public video surveillance, privacy and security are important topics. You might want to look at the research of Thomas Winkler; he has developed a trust-based surveillance system which provides the required security but still protects privacy.
@Bernhard Googling Thomas Winkler doesn't provide me with any useful links. Can you provide more direct links for reference papers?
Sure
http://pervasive.uni-klu.ac.at/publications/pdf/Winkler_AVSS2010.pdf
There are also areas where security and privacy have common goals and fit very well together. For example, my area of usable access control: the goal of access control is to give access only to those who should have it, and the goal of privacy is to make sure that only the people who you want accessing your stuff can access your stuff.
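A minimal sketch of how one check serves both goals (the resource and principal names are illustrative):

```python
# Sketch: an access-control list is simultaneously a privacy control.
acl: dict[str, set[str]] = {"medical_record_17": {"alice", "dr_bob"}}

def can_access(principal: str, resource: str) -> bool:
    """The security goal (authorized access only) and the privacy goal
    (only the people the owner chose) are enforced by the same check."""
    return principal in acl.get(resource, set())

print(can_access("dr_bob", "medical_record_17"))  # True
print(can_access("eve", "medical_record_17"))     # False
```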
I don't think it's a question of 'if they clash', but of how to manage them when they do. If security requires information and privacy requires anonymity, the two concepts will often 'bump heads' in practice. It's the role of systems designers and decision makers to be mindful that both are equally important from a user perspective, and to manage the conflicts, when they arise, as ethically and deliberately as possible.
We live in a new age, with new technological possibilities, but the underlying concepts are as old as civilization. We just have to be smarter about how we use the technology to avoid these conflicts on a case by case basis and make it a priority to develop and maintain best practices as we go.
That is the realization that this particular problem has cemented in my thinking as well. One security vector in particular, non-repudiation, brings this trade-off (privacy vs. security) front and center, as non-repudiation requires demonstration of genuine authorization AND proof of authenticity and origin. Strong measures to establish identity are essential to achieve non-repudiation, but protecting this identity from malicious misuse is just as important. Q.E.D.
I just wanted to chime in with a comment about the difference between authenticating a person and authenticating a persona or a set of credentials. When having security versus privacy debates I find that people commonly conjoin the two. But the distinction can be very important when talking about authenticating someone in a privacy preserving way.
We frequently discuss "identity" as if it is tied to a particular human name, when we actually mean that we want to be able to identify if the person we are interacting with is the same person that was previously authenticated. Credit card purchases are a good example of this. Amazon doesn't need to know my name or address for me to purchase and download an MP3. But they do need to know that I am authorized to charge the provided credit card.
Good point on the distinction between person and persona. However, I wonder if, as more communication, social networking and financial systems, etc. become integrated, that distinction will become increasingly blurred as partially identifiable attributes and records can be cross referenced among large data sets, allowing for the creation of robust 'persona profiles' that might give a clear representation of the user's identity. This is already happening to some degree with online targeted marketing. A given individual system may be secure and relatively private on the front end, but when multiple data sets can be collectively mined on the back end, it may get pretty easy for data miners, regardless of intent, to 'read between the lines'.
This was exactly the problem with the second phase of the Netflix Prize: researchers were able to associate real people with the obfuscated raw data set through contextual information, which instigated the cancellation of the second phase. It was that paper that made me think that it is impossible to provide security AND privacy at the same time. However, Kami and Habib demonstrated that privacy CAN be offered through access controls on the identifying data. It is like the current legal system: you are offered privacy until you get a subpoena, but the subpoena needs to be reasoned and approved, so there is a defined process for getting to private information.
So in the use case that started this question: you are a cloud infrastructure provider and you want to secure your infrastructure against internal and external threats, so you need to record all transactions in and out of the infrastructure and piece together whether they were part of a legal/valid transaction or a nefarious one. The system could flag a nefarious transaction, and then a legal process could be started to articulate the specifics of that nefarious transaction so that the specific actors and nefarious actions are identified. The system thus offers security and privacy, simply leveraging the existing legal process to properly process offenses.
Encapsulating the 'policing' function into a 'black box' of code is a good example of maintaining security and privacy. What would you do with the records afterwards? Destroy them? Encrypt and archive them? Just the legal ones, and keep the nefarious records for later processing? The security vs. privacy conflict seems to become an issue when humans interact with the system. What human interaction would be required to manage and process the records?
Assuming security is about this: http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=1335465&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D1335465
And with that premise, when we look at security from a discrete-system perspective, this paper would suggest that privacy is a non-functional property of security: http://link.springer.com/article/10.1007%2Fs10796-009-9164-1
Privacy should be an inherent design issue for a system, whereas security is something negotiable and varied.
Interesting premise but without providing a foundation it doesn't add much to the discussion. Do you care to elaborate your assertion?
I think we are talking about an open research question here. In business terms: organizations struggle to keep their consumers' data secure from theft and damage, while on the other hand they struggle to operate inter-organizational systems (data transfer). So how does this affect consumers? We need much more work here.
Thank you all.
Are privacy and security mutually exclusive? No, they can co-exist (under certain conditions). In my opinion, for the domain of online transactions, they will be mutually exclusive only if one can obtain information from all the entities involved. For example, if the only interactions in the system are between the user and the service provider (e.g. a bank), users will have to reveal their identity (which lets the bank identify each user) in order to authenticate to the bank's web site (i.e. achieve security so that bad people cannot misuse bank accounts). In this case, one cannot protect the privacy of users, as doing so would result in a loss of security. However, if there is another trusted entity, one can protect the privacy of users from the bank without compromising the security goals that the bank wants to achieve. This is exactly what privacy-preserving access control techniques do. In this case, privacy co-exists with security on the condition that the trusted entity does not collude with the bank. So, for the problem you have mentioned, if the goal is to protect the privacy of the users from the cloud while at the same time performing auditing for security purposes, one possible approach to reconcile privacy and security is to utilize one or more additional trusted entities which can mask the actual identity of users.
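A minimal sketch of the masking idea (my own illustration; the deterministic pseudonym and the use of Fernet symmetric encryption to stand in for the trusted entity's sealing key are simplifying assumptions):

```python
# Sketch of identity escrow: the provider audits pseudonymous records;
# real identities are sealed so only the trusted entity can unmask them.
import hashlib
from cryptography.fernet import Fernet

ESCROW_KEY = Fernet.generate_key()  # held only by the trusted entity
escrow = Fernet(ESCROW_KEY)

def audit_record(user_id: str, action: str) -> dict:
    """Audit entry the cloud provider keeps: pseudonym plus sealed identity."""
    pseudo = hashlib.sha256(user_id.encode()).hexdigest()[:12]
    return {"who": pseudo, "action": action,
            "sealed_id": escrow.encrypt(user_id.encode())}

rec = audit_record("alice@example.com", "bulk-download")
# The provider audits on rec["who"] alone; only the trusted entity, on
# sufficient cause, can run escrow.decrypt(rec["sealed_id"]) to unmask.
print(rec["who"])
```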
In that scenario, who would provide non-repudiation guarantees? The cloud provider doesn't have the right information to guarantee non-repudiation, so the trusted entity would have to, and that exposes the private information again. Introducing transitive relations does not remove the problem that non-repudiation requires the combination of authentication guarantees with transactional guarantees, and from both the transaction sequence and the whitespace in the sequence, privacy-violating inferences can be made. Think, for example, about associative relationships that could be inferred: they would be purely circumstantial but could be very damaging to an individual.
Dear Theodore Omtzigt
"Privacy should be an inherent design issue for a system, while as security is something negotiable and varied"
Description : At the time of system development (any system) , the system may warrant us to make some part of the system as private because of so many factors(Goal/Contention/Efficiency/curiosity/propriety/etc). The private part (Data/Resources) thus inherits Privacy, and had nothing to do with security as such. As we all know few object oriented languages allow us to make some data/functionality private, so what is the ultimate goal nothing but privacy that too which will keep resources/data/subsystem private for a long(If not for ever).Concluding that privacy is to be decided at the time of development(inherent design issue).
Security, on the other had are preventive measures that ensure safety(Data/Resource).The security is often implemented as an additional (non-functional requirement) which may vary from system to system(perception to perception). Two organisations may have different security measures for the same kind of resource while they hardly can differ on privacy issues. The measures of safety(security) are time variant.
Privacy is a functional requirement while as Security is non-functional.
Regards:
Majid Ahmad
@Theodore - yes, the trusted entity needs to uncover the identity of selected users to the cloud, for non-repudiation for example. Exposing privacy for a non-repudiation case is unavoidable in my opinion. What I would ideally like to see is the preservation of the privacy of "well-behaved" users. I think this matches real-life scenarios. For example, when a cop stops you for speeding, you have to let go of your privacy by showing your driver's license (which includes your date of birth as well), car registration, and insurance policy. We need a legal framework, in addition to security and privacy measures, to protect the privacy of good users in online systems as well.
I would be very much interested to know if there is any significant research work that preserves the privacy of good users while uncovering the identity of only the bad users.
@Majid, I disagree with your remark that "security is non-functional". Security is a functional requirement. For example, the security requirement of requiring two people to sign a check is a functional requirement that banks follow. As another example, the security requirement of allowing only authorized users (and no one else) to access some protected content is a functional requirement.
Both security and privacy should be considered from the day one of system design and development.
@Nabeel: I wish to make the point that security is variant and privacy is inherent.
Your example is quite right: a bank may ask two people to sign a check, whereas another bank (or the same bank after some time) may ask for only a single signature. But this is a business rule (it has nothing to do with privacy).
Regards:
Majid
The US-SWIFT case is an important example for this subject, I think. US intelligence units took all the financial records and data in the name of providing security, secretly from their EU partners, from the 9/11 attacks until the story showed up in a newspaper in 2006. These articles might give you some ideas from a terrorist-financing perspective: Mara Wesseling, Marieke de Goede and Louise Amoore, "Data Wars Beyond Surveillance: Opening the Black Box of SWIFT"; Anthony Amicelle & Gilles Favarel-Garrigues (2012): "Financial Surveillance", Journal of Cultural Economy, 5:1, 105-124; Marc Parker & Max Taylor (2010): "Financial Intelligence: A Price Worth Paying?", Studies in Conflict & Terrorism, 33:11, 949-959.
All the best,
Burke
I believe a lot of this depends on TRUST, no matter how rigorously you define these topics and how they align with laws and standards. E.g.:
SECURITY in Computing has two distinct areas:
DATA / INFORMATION SECURITY: protecting information / data from unauthorized access, use, disclosure, disruption, modification, perusal, inspection, recording or destruction
HARDWARE / SOFTWARE SECURITY: processes and mechanisms by which computer-based equipment and services are protected from unintended or unauthorized access, change or destruction.
PRIVACY:
protecting the personally identifiable information of an individual to the degree where they should have control over the extent to which their data can be accessed, stored and transferred
In order to maintain assurance that privacy and the respective privacy laws are being upheld, it is necessary to utilize security to derive compliance. In most circumstances anonymity should not matter, as long as the entity you have entrusted your data with is compliant, because they cannot transfer/manipulate your data beyond the purposes for which you have given them consent (IN THEORY). However, you ask, how can this be proved? There is work being done in this area regarding auditing and compliance for cloud computing and distributed systems to tackle these assurance problems.
So in essence security and privacy are separate fields of study, research and application which can overlap and work together. But overall TRUST is needed somewhere, whether it be in the software, the cloud, the government or even the postman who delivers your mail.
All the Best,
Jonny
I guess you are referring to an extremely complex topic by stating that
"you can't solve security on a public resource without violating privacy".
Security already refers to a very broad set of mechanisms, protocols, and practices; the same is true for privacy, which has an even more malleable meaning, depending on personal preferences, data protection legislation, and so on.
So, the question is what you mean by security, and whether it could already be achieved "alone" (without explicitly considering privacy). Steering your understanding of security towards "risk management", meaning that you aim to implement countermeasures against the most serious threats, already acknowledges that you can't protect against every possible attack, particularly when keeping the cost in mind.
Again, the same is true for privacy (or privacy-enhancing technologies).
However, there are examples that show it is possible to fairly balance security and privacy requirements, at least in the research community.
Examples: "anonymous credentials" in the area of cryptographic mechanisms, "privacy-preserving approaches to auditing", or "security and privacy by design" proposals in the area of development processes.
I guess you can dig into this nearly arbitrarily deeply.
Interesting thread. In short, I would say security is external and privacy is internal, so the requirements would be different.