Dear Sir,
I read your papers titled: " Preserving Privacy in Social Networks Against Neighborhood Attacks" ;
Please, I work on anonymizing social networks and I want to ask you some questions:
1. you have used several terms: " noise, dummy and fake " vertex or/and edges, what is the difference between them? Do the vertices or edges added are real accounts created in the social network in the anonymization?
2. Please what is the translation in french of the sentence "aggregate network query", I don't find the documentation about it? Please send me any link about it, what is the difference with aggregate queries: “sum, count, avg, etc” used in database? I found the document « Privacy Risks and Countermeasures In Publishing and Mining Social Network Data” , they classify the queries into three types, do these types the aggregate network query that You're talking about?
3. You have chosen as utility measure “the aggregation queries”, is it possible to use structural Properties of the social graph in addition to aggregation queries, for example seek to preserve the property APL by adding noise nodes to the original graph in addition to adding links as your algorithm do ?
4. When we must say that a vertex violates k-anonymity with respect to neighborhood attacks? Why when the value of k increases the percentage of vertices violating k-anonymity increases (according to what you've shown on Table 1 in the second paper)?
5. In your example of «Anonymizing two neighborhoods", please I want to have the original graph where you extracted the neighborhoods, and how you extracted the neighborhood components and DFS codes for each component? and please how we compare two neighborhood with different number of components? when we have two components with single vertex can we connect them ? if we make this we will have one component? can we do this? for example in the example of "Ada" and "Bob" you are connected between two components of the neighborhood of "Ada" which are single vertices?
6. In the step where we have to choose two similar vertices to conduct a breadth-first search, you said: "If there is no such a matching pair of vertices, we relax the matching requirement (vertex degree and label), calculate the difference of degrees and the normalized certainty penalty of generalizing the labels in the label hierarchy, and choose the one with the minimum cost anonymization. ", How it’s done? Please give me an example? For example if you have the vertices (u1, l1) and (v1, l2), (v2, l1), such as, u1 of degree 1, v1 of degree 1, and v2 of degree 2, what that one chooses between v1 and v2 to start the tests of matching? We based on degree or NCP?
7. In the algorithm 'Anonymization of social network' you say. "Iteratively, we pick the first vertex in the list SeedVertex vertexList The anonymization cost of SeedVertex and Any Other vertices in vertexList is Calculated using the anonymization method for two vertices Discussed in Sect. III-B.2 ", do we calculate the cost of anonymization of seedvertex neighborhood and other neighborhoods as Discussed in Sect. III-B.2 or just calculate the NCP and the difference in degrees between the two vertices? If we calculate the costs as discussed in Sect. III-B.2, then we must anonymise the neighborhood before calculating the cost?
8. We can easily give a lower bound of the anonymization cost based on the number of vertices and the number of edges in two Neighborhoods. Please how can we calculate this cost?
9. How do you exploit the property of social networks "small world" in your approach? I do not see where?
While waiting of your answers accept Sir my distinguished greeting