I'm working on a paper about a bestseller list that publishes a TOP20 ranking of books every month.
My problem is to show that this list is representative of the population of books sold each year.
For 2011-2020 I have, for each year, the total number of books sold (an official estimate) and the number of books sold by the titles on the TOP20 list.
Using a simple linear regression (total sold vs. list sold), I get an adjusted R^2 of 0.92 and a p-value of 6.035e-06.
But because my sample is so small (one observation per year, so only 10 points), this result doesn't seem trustworthy to me. Is there a better way to demonstrate the relationship between the total sold and the number sold from the bestseller list?
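
For reference, here is a minimal sketch of the kind of regression described above, written in plain Python. The numbers are made-up placeholders (not the actual sales figures), `simple_regression` is only an illustrative helper name, and regressing total sold on list sold is an assumption about the direction of the fit.

```python
import math

# Hypothetical placeholder data (NOT the real figures): one value per year,
# 2011-2020, in millions of copies.
list_sold = [2.1, 2.4, 2.2, 2.8, 3.0, 2.7, 3.3, 3.1, 3.6, 3.4]
total_sold = [40.5, 44.0, 42.1, 49.8, 52.3, 48.0, 57.1, 54.2, 61.0, 58.7]


def simple_regression(x, y):
    """Ordinary least squares fit of y on a single predictor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # With one predictor, R^2 is the squared Pearson correlation.
    r2 = sxy ** 2 / (sxx * syy)

    # Adjusted R^2 for one predictor: residual degrees of freedom are n - 2.
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - 2)

    # t statistic for the slope, carrying the sign of the slope.
    t_stat = math.copysign(math.sqrt(r2 * (n - 2) / (1 - r2)), slope)

    return {
        "n": n,
        "slope": slope,
        "intercept": intercept,
        "r2": r2,
        "adj_r2": adj_r2,
        "t_stat": t_stat,
    }


fit = simple_regression(list_sold, total_sold)
for name, value in fit.items():
    print(f"{name}: {value:.4f}")
```

The p-value for the slope would then come from comparing `t_stat` against a t distribution with n - 2 = 8 degrees of freedom, which is where the small sample size enters.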