I have a large set of stimuli (N = 1,000). Each rater has seen a subset of these stimuli. How would I compute the ICC on these data? I believe that ICC(1,1) and ICC(1,k) are the correct ICC models to compute.
I agree with Kirill, but I'll also offer another way to look at it.
I think that instead of a traditional ICC, the goal in your case is to answer the question, "Which of these 5,000 stimuli are easy to rate reliably given 10 different raters, and which are not?" If we choose that goal, then you could use the following procedure:
Let's call the first stimulus S1. Let's call the raters of this stimulus R1 through R10. Extract the data for S1 for R1 through R10.
Compute a Pearson correlation across the rater pairs for that stimulus. I believe the dataset layout might be tricky: each row needs to hold one X and one Y, so R1 and R2 go on one row, R1 and R3 on the next row, and so on.
This way, you'd have a correlation (significant or not) for each stimulus, because some stimuli may be easier to rate reliably than others, and you could use it as a proxy for an ICC.
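A minimal sketch of the pairing step for one stimulus, in plain Python. The ratings in `s1` are made up for illustration, and `pair_rows` and `pearson` are hypothetical helper names, not part of any package. One caveat I'm sure of: which rater lands in the X column is arbitrary, so the resulting value changes if you reorder the raters. Treat it as a rough proxy only.

```python
from itertools import combinations
from math import sqrt


def pair_rows(ratings):
    """Turn {rater: rating} for ONE stimulus into (X, Y) rows, one per rater pair."""
    raters = list(ratings)  # keeps insertion order; the order is arbitrary
    return [(ratings[a], ratings[b]) for a, b in combinations(raters, 2)]


def pearson(xs, ys):
    """Plain Pearson r; returns None when either column has no variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


# Made-up ratings of stimulus S1 by raters R1 through R10 (illustration only).
s1 = {"R1": 4, "R2": 5, "R3": 3, "R4": 4, "R5": 5,
      "R6": 2, "R7": 4, "R8": 3, "R9": 5, "R10": 4}

rows = pair_rows(s1)  # 10 raters give 45 (X, Y) rows
r = pearson([x for x, _ in rows], [y for _, y in rows])
print(len(rows), r)
```

Repeating this for every stimulus gives one value per stimulus, which you could then sort or plot to see which stimuli stand out.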
I'm sorry, but I don't think you can do an ICC. That is because an ICC relies on two requirements: 1) you have pairs of ratings (which you kind of do, if you take R1-R10 on S1 and pair up R1 with R2, R1 with R3, etc.), and 2) those pairs have to cover more than one experimental unit (e.g., in this case, you'd have to have R1 and R2 rate S1, S2, S3, S4, etc.). It's the second requirement that messes things up for you.
Since you have 5000 stimuli with 10 ratings each and 500 raters, it is possible that some pairs of raters happened to rate a group of stimuli together. Imagine R123 and R456 just happen to both rate S123, S456, S789, and S999. Then you'd have a little dataset that would qualify for an ICC.
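To check whether such overlaps exist, something like the sketch below could work. It assumes the data are in long format as `(rater, stimulus, rating)` tuples, which is my assumption about the layout. `shared_stimuli` and its `min_shared` threshold are hypothetical names, and the records are made up.

```python
from collections import defaultdict
from itertools import combinations


def shared_stimuli(records, min_shared=2):
    """Map each rater pair to the set of stimuli both rated, keeping
    only pairs that share at least `min_shared` stimuli."""
    raters_by_stimulus = defaultdict(set)
    for rater, stimulus, _rating in records:
        raters_by_stimulus[stimulus].add(rater)

    shared = defaultdict(set)
    for stimulus, raters in raters_by_stimulus.items():
        # sorted() makes each pair appear under a single key, e.g. (R123, R456)
        for a, b in combinations(sorted(raters), 2):
            shared[(a, b)].add(stimulus)

    return {pair: stims for pair, stims in shared.items()
            if len(stims) >= min_shared}


# Made-up long-format records: (rater, stimulus, rating).
records = [
    ("R123", "S123", 4), ("R456", "S123", 5), ("R777", "S123", 3),
    ("R123", "S456", 2), ("R456", "S456", 2),
    ("R123", "S789", 5), ("R456", "S789", 4),
    ("R123", "S999", 1), ("R456", "S999", 2),
]

overlap = shared_stimuli(records)
for pair, stims in overlap.items():
    print(pair, sorted(stims))
```

Each pair that survives the filter gives you one of those little datasets; how large `min_shared` should be before an ICC on it is worth reporting is a judgment call.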
Although that would be possible, the question in my mind is the utility of the answer. You will get a hodgepodge of answers, and it won't be clear what they mean taken together. ICC is usually more about the rater than the stimulus. That's why I suggested a different approach (assuming I intuited your goal correctly).