That depends on your data and on how you configure the MLP for your task. My advice is to compare its performance against that of other approaches so you can reach a clear conclusion.
My data contains a set of sentences for each sense of the ambiguous word, like the file I've enclosed. Some senses have, for example, 200 sentences while others have 600. I tried it with WEKA (I converted the text file into 0/1 features over the 1000 most frequent words and built an ARFF file), but during training it only learns one of the senses and returns 0 for all the others. Is such data acceptable for an MLP?
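For reference, the 0/1 encoding described above (one binary feature per frequent word) can be sketched as follows. This is only an illustration, not the asker's actual script; the toy sentences, labels, and vocabulary size are made up, and the real task would use Persian sentences with a 1000-word vocabulary.

```python
from collections import Counter

# Hypothetical toy corpus: (sentence, sense label) pairs.
# In the real task each sense would have 200-600 Persian sentences.
corpus = [
    ("the bank approved the loan", "FINANCE"),
    ("she sat on the river bank", "RIVER"),
    ("the bank raised interest rates", "FINANCE"),
]

# 1) Take the K most frequent words as the feature vocabulary
#    (K = 1000 in the question; tiny here for illustration).
K = 5
counts = Counter(w for sent, _ in corpus for w in sent.split())
vocab = [w for w, _ in counts.most_common(K)]

# 2) Encode each sentence as a 0/1 vector over that vocabulary:
#    1 if the vocabulary word occurs in the sentence, else 0.
def binarize(sentence):
    words = set(sentence.split())
    return [1 if w in words else 0 for w in vocab]

vectors = [(binarize(s), label) for s, label in corpus]
```

Each row of `vectors` then corresponds to one `@data` line of the ARFF file, with the sense label as the class attribute.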
Your data structure is not directly usable in WEKA; it first needs to be converted into ARFF format. After that, apply the "StringToWordVector" filter and run the MLP method.
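As a sketch, an ARFF file in the form that StringToWordVector expects (one raw-text string attribute plus a nominal class attribute) might look like this; the relation, attribute, and sense names are illustrative:

```
@relation word_senses

@attribute sentence string
@attribute sense {sense1, sense2, sense3}

@data
'a sentence containing the ambiguous word', sense1
'another example sentence with the ambiguous word', sense2
```

StringToWordVector then replaces the `sentence` string attribute with one numeric attribute per word, which is the representation the classifier actually sees.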
NB: you may need to use a Stanford parser for Persian in conjunction with the StringToWordVector filter. In addition, instead of MLP you could also try "NaiveBayesMultinomial", the fastest algorithm for text data, which often gives quite decent results.
I have converted the files to ARFF format; I've enclosed one. But the problem remains: WEKA only recognizes the sense with the largest number of sentences and returns 0 for all the other senses in the confusion matrix. How can I solve this? Thanks.