Monday, January 26, 2009

Sensitive Information in a network world

Topic: Sensitive Information in a network world
Speaker's research area: applications of game theory to internet privacy and security issues.

Introduction:
The speaker talked about her recent five-year project, "Privacy Obligations and Rights in Technologies of Information Assessment" (PORTIA).
The project revolved around dealing with sensitive information. Sensitive information was chosen as the focus because information has to be handled correctly, yet one needs to understand that not all information is private. For instance, copyrighted information is published, but it can be used only under certain norms.
The PORTIA project involved people from various domains. The major research themes addressed the following issues:
Massive data set algorithms: sensitive data mining and information retrieval are also a concern.
Client-side defenses
Sensitive data in distributed systems
Policy enforcement tools for database systems
Contextual integrity
The talk involved detailed discussion of client-side defenses and sensitive data in distributed systems.

Technical understanding:
The basic research problem revolved around the question: what information can a search engine collect?
A typical search engine can access the following information about the network and clients:
1. TCP/IP: can reveal OS and server-side platform details
2. HTTP headers: information about the client side, such as cookies
3. HTML: server-side content
4. Query terms
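The layers above can be illustrated with a small sketch of what a plain HTTP search request exposes to the server side. The request text and helper function here are hypothetical, made up for illustration, not from the talk:

```python
# Hypothetical HTTP request, showing the pieces a search engine's server can read.
raw_request = (
    "GET /search?q=rare+disease+treatment HTTP/1.1\r\n"
    "Host: search.example.com\r\n"
    "User-Agent: Mozilla/5.0 (Windows NT 5.1)\r\n"  # hints at the client OS
    "Cookie: uid=12345\r\n"                         # persistent identifier
    "\r\n"
)

def visible_to_server(request: str) -> dict:
    """Extract the identifying pieces a server sees in a plain HTTP request."""
    request_line, *header_lines = request.strip().split("\r\n")
    headers = dict(line.split(": ", 1) for line in header_lines)
    path = request_line.split(" ")[1]
    query = path.split("q=")[1] if "q=" in path else ""
    return {"query": query,
            "user_agent": headers.get("User-Agent", ""),
            "cookie": headers.get("Cookie", "")}

leak = visible_to_server(raw_request)
print(leak["query"])  # the search terms themselves, linkable to the cookie
```

Even without the query terms, the cookie and User-Agent string together let the server link requests to one client over time.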
The speaker talked about the software requirements and design used for solving this issue. The software is PWS (Private Web Search).
Such data collection may not be acceptable to all users; hence many approaches have been adopted to avoid this data mining by search engines. The problems mainly involve anonymity and avoiding search-engine-based data mining.

Approaches and solutions:
Some different approaches:
TrackMeNot: this plugin generates fake traffic (cover traffic) and confuses the search engine by hiding the actual search query among decoys.
Tor + Privoxy: a TCP/IP-layer anonymizer for the source of the search query. Tor is the anonymizer, while Privoxy is a browser-level proxy. This combination targets any kind of web-based system, but it is vulnerable to active components and is difficult to use.
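The cover-traffic idea behind TrackMeNot can be sketched as a toy example. This is illustrative only, not TrackMeNot's actual query generator; the decoy pool and function names are made up:

```python
import random

# Hypothetical decoy pool; the real plugin draws from larger, evolving lists.
DECOYS = ["weather tomorrow", "football scores", "pasta recipe",
          "used cars", "movie times"]

def with_cover_traffic(real_query: str, n_decoys: int = 3, seed: int = 0) -> list:
    """Mix one real query into a shuffled batch of decoy queries, so an
    observer of the traffic cannot tell which query the user actually issued."""
    rng = random.Random(seed)
    batch = [real_query] + rng.sample(DECOYS, n_decoys)
    rng.shuffle(batch)
    return batch

batch = with_cover_traffic("symptoms of flu")
print(batch)  # the real query sits somewhere among the decoys
```

The privacy gained depends on how believable the decoys are; machine-generated queries that look unlike human searches can be filtered out by the observer.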

Speaker's team effort:
The goal was to make users indistinguishable and to handle the active-component problem. The speculation is that this might reduce the effectiveness of the Google search engine.

Design overview of the software:
It has a 3-tiered architecture and can be used as a Firefox plugin:
HTTP proxy
HTTP filter (strips out unnecessary information) + HTML filter (removes active components)
Tor client
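The two filtering stages can be sketched roughly as below. This is assumed behavior for illustration, not the PWS source; the header whitelist and function names are invented:

```python
import re

# Assumed whitelist: everything outside it is treated as identifying and dropped.
SAFE_HEADERS = {"Host", "Accept"}

def http_filter(headers: dict) -> dict:
    """HTTP filter stage: keep only headers that do not identify the client."""
    return {k: v for k, v in headers.items() if k in SAFE_HEADERS}

def html_filter(page: str) -> str:
    """HTML filter stage: remove active components such as <script> blocks."""
    return re.sub(r"<script\b.*?</script>", "", page,
                  flags=re.DOTALL | re.IGNORECASE)

clean_headers = http_filter({"Host": "search.example.com",
                             "User-Agent": "Mozilla/5.0",
                             "Cookie": "uid=12345"})
clean_page = html_filter("<html><script>track(user)</script>"
                         "<p>results</p></html>")
```

Stripping active content matters because scripts in a returned page could otherwise bypass Tor and contact the server directly, deanonymizing the client.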

The Tor network sends the anonymized packets to the server in an anonymous way and hence hides the client information.
There is broad scope for future work in this domain. A few points are listed here:
Reduce the impact of Tor path selection on performance.
Develop a formal model to measure privacy.
Semantic anonymization.
Also, this might reduce the quality of future search results.

PORTIA's antiphishing tools include SpoofGuard, PwdHash, SafeCache, SafeHistory, and SpyBlock. These are available on the internet.
The later half of the presentation was based around data mining in distributed computing:
Secure multiparty function evaluation.
Problem statement:
In a distributed environment, how can a collective computation be done such that no party gets to know the others' data, yet the final result is obtained?
The speaker's team adopted a survey-based approach and applied it to the CRA Taulbee survey. The intention was to convey salary distribution statistics per tier and rank to the CS community without revealing department-specific information.
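The underlying idea can be illustrated with secure summation via additive secret sharing, a standard building block of multiparty computation. This is a minimal sketch under simplifying assumptions (honest-but-curious parties, a trusted channel for shares); the actual protocol used for the Taulbee survey may differ:

```python
import random

MODULUS = 2**31 - 1  # all arithmetic is done modulo a public prime

def share(secret: int, n_parties: int, rng: random.Random) -> list:
    """Split a secret into n random shares that sum to it mod MODULUS.
    Any n-1 shares look uniformly random and reveal nothing."""
    shares = [rng.randrange(MODULUS) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % MODULUS)
    return shares

def secure_sum(secrets: list) -> int:
    """Each party splits its value into shares, one per participant.
    Summing the recombined shares reveals only the total."""
    rng = random.Random(42)
    n = len(secrets)
    all_shares = [share(s, n, rng) for s in secrets]
    # Party i adds up the i-th share it received from every participant...
    partials = [sum(col) % MODULUS for col in zip(*all_shares)]
    # ...and only these partial sums are published and combined.
    return sum(partials) % MODULUS

salaries = [90_000, 110_000, 105_000]  # hypothetical per-department values
print(secure_sum(salaries))  # 305000 -- the total, with no salary exposed
```

Each department learns the aggregate statistic while its own figure never leaves the department in the clear, which matches the survey's stated goal.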
