I am studying the assessment of linked data quality dimensions, and I am trying to evaluate one quality dimension (in this case Timeliness) from another dimension (Semantic Accuracy). Since I found a correlation between Timeliness and Semantic Accuracy, I presumed that regression analysis is the next step.

- The Semantic Accuracy formula I used is: msemTriple = |G ∧ S| / |G|

msemTriple measures the extent to which the triples in the repository G (the original LOD dataset) and in the gold standard S have the same values.
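As a rough illustration of the msemTriple ratio, here is a minimal Python sketch. It assumes that "∧" is read as set intersection and that a triple is a (subject, predicate, value) tuple; the function name `msem_triple` and the sample triples are hypothetical and are not taken from the SPSS file.

```python
def msem_triple(G, S):
    """Share of the triples in G that also appear, with the same value, in S."""
    G, S = set(G), set(S)
    if not G:
        raise ValueError("G must contain at least one triple")
    return len(G & S) / len(G)


# Hypothetical (subject, predicate, value) triples; the numbers are invented.
G = {
    ("Algeria", "cases", 100),
    ("Algeria", "deaths", 5),
    ("France", "cases", 200),
    ("France", "deaths", 9),
}
S = {
    ("Algeria", "cases", 100),
    ("Algeria", "deaths", 6),  # value differs from G, so this triple does not match
    ("France", "cases", 200),
    ("France", "deaths", 9),
}

score = msem_triple(G, S)
print(score)
```

Three of the four triples in G have an identical counterpart in S here, so the ratio is 3/4.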

- The Timeliness formula I used is:

Timeliness(de) = 1 - max{1 - Currency(de) / Volatility(de), 0}

where:

Currency(de) = (1 - (lastModificationTime(de) - lastModificationTime(pe)) / (currentTime - startTime)) * Ratio

(Ratio measures the extent to which the triples in the LOD dataset, in my case Wikidata, and in the gold standard, Wikipedia, have the same values.)

and

Volatility(de) = (ExpiryTime(de) - InputTime(de)) / (ExpiryTime(pe) - InputTime(pe))

(de is the entity document of the datum in the linked data dataset, and pe is the corresponding entity document in the gold standard.)
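The three formulas above can be sketched in Python as follows. This is only an illustration under stated assumptions: all times are plain numbers on a common scale (for example, days since some reference date), the function and parameter names are hypothetical, and the input values are invented rather than taken from the SPSS file.

```python
def currency(last_mod_de, last_mod_pe, current_time, start_time, ratio):
    """Currency(de) as written above; all times share one numeric scale."""
    return (1 - (last_mod_de - last_mod_pe) / (current_time - start_time)) * ratio


def volatility(expiry_de, input_de, expiry_pe, input_pe):
    """Volatility(de): validity span of de relative to that of pe."""
    return (expiry_de - input_de) / (expiry_pe - input_pe)


def timeliness(cur, vol):
    """Timeliness(de) = 1 - max{1 - Currency/Volatility, 0}."""
    return 1 - max(1 - cur / vol, 0)


# Invented values, for illustration only.
cur = currency(last_mod_de=10, last_mod_pe=8, current_time=20, start_time=0, ratio=0.75)
vol = volatility(expiry_de=30, input_de=0, expiry_pe=20, input_pe=0)
t = timeliness(cur, vol)
print(cur, vol, t)
```

Note that, as written, the Timeliness expression is algebraically the same as min(Currency/Volatility, 1): it equals the ratio itself whenever the ratio is at most 1, and it is capped at 1 otherwise.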

NB: As a sample dataset, I worked on Covid-19 statistics per country, specifically the number of cases, recoveries, and deaths.
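For the regression step itself, a minimal sketch of a simple linear regression with Semantic Accuracy as the predictor and Timeliness as the outcome could look like the code below. It is plain ordinary least squares written out by hand, not a reproduction of the SPSS procedure; the function name `simple_ols` is hypothetical and the per-country scores are placeholders, not my actual results.

```python
def simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares; return slope, intercept, R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


# Placeholder scores, one pair per country (NOT real measurements).
semantic_accuracy = [0.2, 0.4, 0.6, 0.8]
timeliness_scores = [0.25, 0.45, 0.55, 0.85]

slope, intercept, r_squared = simple_ols(semantic_accuracy, timeliness_scores)
print(slope, intercept, r_squared)
```

One point that may be worth checking before interpreting such a fit: Ratio enters Currency as a multiplier and is described in the same terms as msemTriple, so part of the association between the two dimensions could follow from the way Timeliness is defined.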

This is my SPSS file: https://drive.google.com/file/d/1DqMqVv4JHPbo3-pAXmavuC91pMlImFlu/view?usp=drive_link

This is the output of my SPSS file: https://drive.google.com/file/d/1JxVf542Kq9KfxeWIqmm1deLfJv67HOUh/view?usp=drive_link
