I have proposed a new algorithm for semantic link prediction in social networks, and I want to evaluate it and compare it with other algorithms. How can I do that? Where can I find a benchmark to use in my tests?
This is what he said about evaluation in his article:
"Evaluating a link predictor. Each link predictor p that we consider outputs a ranked list Lp of pairs in A×A−Eold ; these are predicted new collaborations, in decreasing order of confidence. For our evaluation, we focus on the set Core, so we define E∗new := Enew ∩(Core×Core) and n := |E∗new |.
Our performance measure for predictor p is then determined as follows: from the ranked list Lp, we take the first n pairs in Core × Core, and determine the size of the intersection of this set of pairs with the set E∗new ."
The problem is that with this measure they get relatively low raw performance, which may not accurately represent the results. Also, measuring against a random baseline makes it easier to compare very disparate algorithms.
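To make the quoted measure concrete, here is a minimal Python sketch of how I read it: count how many of the first n Core × Core predictions fall in E∗new, then compare with what a uniformly random predictor would be expected to get. The function names and the toy data are my own assumptions, not something taken from the article, and edges are treated as undirected pairs.

```python
def top_n_hits(ranked_pairs, new_edges, core):
    """Return (hits, n): hits among the first n Core x Core predictions."""
    core = set(core)
    # E*_new: new edges with both endpoints in Core, as unordered pairs
    e_star = {frozenset(e) for e in new_edges if set(e) <= core}
    n = len(e_star)
    # keep only predictions inside Core x Core, preserving rank order
    in_core = [frozenset(p) for p in ranked_pairs if set(p) <= core]
    return len(set(in_core[:n]) & e_star), n


def random_expected_hits(core, old_edges, n):
    """Expected hits if n candidate pairs were picked uniformly at random."""
    core = set(core)
    old = {frozenset(e) for e in old_edges if set(e) <= core}
    # candidate pairs: all unordered Core pairs that are not already linked
    candidates = len(core) * (len(core) - 1) // 2 - len(old)
    return n * n / candidates if candidates else 0.0


# Toy data (hypothetical): node "e" is outside Core
core = {"a", "b", "c", "d"}
old_edges = [("a", "b")]
new_edges = [("a", "c"), ("b", "d"), ("a", "e")]
ranked = [("c", "a"), ("a", "e"), ("c", "d"), ("b", "d")]

hits, n = top_n_hits(ranked, new_edges, core)
baseline = random_expected_hits(core, old_edges, n)
factor = hits / baseline if baseline else float("nan")
```

Reporting `factor` (hits divided by the random expectation) next to the raw hit count is one way to put very different predictors on the same scale.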
In the Results and Discussion section they highlight the following:
"As discussed in the first section, many collaborations form (or fail to form) for reasons outside the scope of the network; thus, the raw performance of our predictors is relatively low. To more meaningfully represent predictor quality, we use as
I agree with Arturo in that the paper he suggested is arguably the best source on link prediction evaluation.
I would also recommend taking a look at:
Predicting Positive and Negative Links in Online Social Networks (attached)
also by Jon Kleinberg and two more authors, one of whom is Jure Leskovec.
They predict the sign of an edge (u,v) given the signs of all edges in the network. You can re-formulate the problem as: predict the existence of an edge (u,v) given the rest of the network.
I think sub-section 3.4 will be most useful for baseline definitions.
In terms of a general evaluation scheme, you can calculate Precision, Recall and F-measure against the ground truth. You can then take one or more alternative baselines and do the same.
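For illustration, a minimal sketch of Precision, Recall and F-measure for a set of predicted links against ground-truth links. It assumes undirected links given as pairs; the function name and sample pairs are hypothetical.

```python
def precision_recall_f1(predicted, ground_truth):
    """Precision, recall and F1 of predicted links vs. true links."""
    # store each link as an unordered pair so (u, v) matches (v, u)
    pred = {frozenset(p) for p in predicted}
    truth = {frozenset(t) for t in ground_truth}
    tp = len(pred & truth)  # correctly predicted links
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(truth) if truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: 3 predicted links, 4 true links, 2 in common
predicted = [(1, 2), (2, 3), (3, 4)]
ground_truth = [(2, 1), (3, 4), (4, 5), (5, 6)]
p, r, f = precision_recall_f1(predicted, ground_truth)
```

Running the same function on the output of each baseline gives numbers that are directly comparable with those of your own algorithm.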
Hope it helps.
node attributes (info about the individuals in the network)
I have data from organization networks that has been anonymized... many with 100% response rates on network surveys asking employees: who are you connected to? I could randomly remove links from the complete data set and see if your algorithm predicts the holes left in the network. Does this fit your needs?
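A rough sketch of that hold-out idea, in case it helps: hide a random fraction of the links, run a predictor on what is left, and count how many hidden links it recovers. The common-neighbours ranking below is only a stand-in for the algorithm under test, and all names, the seed and the toy network are assumptions for illustration.

```python
import random
from itertools import combinations


def hide_links(edges, fraction, seed=0):
    """Split undirected edges into (kept, hidden) by random removal."""
    edges = sorted({tuple(sorted(e)) for e in edges})
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    k = min(len(edges), max(1, int(len(edges) * fraction)))
    hidden = set(rng.sample(edges, k))
    kept = [e for e in edges if e not in hidden]
    return kept, hidden


def common_neighbors_ranking(kept):
    """Stand-in predictor: rank unlinked pairs by shared neighbours."""
    nbrs = {}
    for u, v in kept:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    scores = {}
    for u, v in combinations(sorted(nbrs), 2):
        if v not in nbrs[u]:
            scores[(u, v)] = len(nbrs[u] & nbrs[v])
    # highest score first; ties broken by pair order for determinism
    return sorted(scores, key=lambda pair: (-scores[pair], pair))


def recovered(kept, hidden, rank_fn):
    """Number of hidden links among the top len(hidden) predictions."""
    top = rank_fn(kept)[:len(hidden)]
    return len(set(top) & hidden)


# Toy network (hypothetical): a 4-cycle plus one chord
all_edges = [(1, 2), (2, 4), (4, 3), (3, 1), (1, 4)]
kept, hidden = hide_links(all_edges, fraction=0.4)
found = recovered(kept, hidden, common_neighbors_ranking)
```

Note that a node whose links are all hidden drops out of the stand-in predictor's view; a real test would pass the full node list separately.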
How do you get link data from ResearchGate? Screen-scraping?
But, I follow many people I do not know, and have followers like that also... so we are not really a "social" network. ResearchGate is probably more of an interest graph, with some "social" thrown in.
I have a paper in press in the Revue française de sociologie in which I propose an agent-based model (ABM) to generate a network of posts and replies between users, and test it on three datasets. It will be out in December, and I will be happy to share it as soon as I have the final PDF.
If it is still useful, you can see this paper: https://www.researchgate.net/publication/264420419_Link_Prediction_in_Online_Social_Networks_Using_Group_Information
Conference Paper Link Prediction in Online Social Networks Using Group Information