Accuracy is not an ideal performance metric when the dataset is imbalanced. So which metric should we use instead, especially when the dataset is both imbalanced and small?
The F1-score is a popular metric for imbalanced classification. It is effective because it combines precision and recall into a single score that balances both concerns.
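To make the "combines precision and recall" point concrete, here is a minimal sketch of how the F1-score is computed as the harmonic mean of the two; the confusion-matrix counts below are made-up illustration values, not from any real model.

```python
# Hypothetical confusion-matrix counts for an imbalanced binary problem
# (assumed values, for illustration only).
tp, fp, fn = 8, 4, 2

precision = tp / (tp + fp)  # fraction of positive predictions that are correct
recall = tp / (tp + fn)     # fraction of true positives that were found
# F1 is the harmonic mean of precision and recall: it is high only when
# BOTH are high, so it penalizes a classifier that trades one for the other.
f1 = 2 * precision * recall / (precision + recall)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Note that the harmonic mean drops sharply if either precision or recall is low, which is exactly why F1 is harder to "game" on an imbalanced dataset than plain accuracy.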
You can also try the MCC (Matthews correlation coefficient), and the ROC-AUC score is another good evaluation metric for an imbalanced dataset. You may find this article useful: https://www.kdnuggets.com/2017/06/7-techniques-handle-imbalanced-data.html
These metrics give a finer-grained picture of how well a classifier is performing, rather than just its overall accuracy.
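A small sketch of why these metrics disagree with accuracy on a skewed dataset; the labels and scores below are toy values I made up (10% positive class), not from any real experiment.

```python
# Compare accuracy with imbalance-aware metrics on a toy imbalanced dataset.
from sklearn.metrics import (accuracy_score, f1_score,
                             matthews_corrcoef, roc_auc_score)

# 18 negatives, 2 positives (assumed toy data, ~10% minority class)
y_true  = [0] * 18 + [1] * 2
y_pred  = [0] * 18 + [1, 0]         # the model misses one of the two positives
y_score = [0.1] * 18 + [0.9, 0.4]   # predicted probabilities, for ROC-AUC

print("accuracy:", accuracy_score(y_true, y_pred))   # 0.95 -- looks great
print("F1      :", f1_score(y_true, y_pred))         # much lower: exposes the miss
print("MCC     :", matthews_corrcoef(y_true, y_pred))
print("ROC-AUC :", roc_auc_score(y_true, y_score))   # uses scores, not hard labels
```

Accuracy rewards the model simply for getting the large majority class right, while F1 and MCC are pulled down by the missed minority example; ROC-AUC additionally needs predicted probabilities rather than hard labels, which matters if you want a ranking-quality view of the classifier.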
There are a number of considerations that need to be taken into account: Is it binary classification or not? Do you want the classifier to predict only the class an instance belongs to, or also the probability of class membership? How imbalanced is your dataset? Are both classes equally important? And so on. Jason Brownlee summarizes all of these aspects very well with a nice flowchart, which you can find in his books and also at this link: https://machinelearningmastery.com/tour-of-evaluation-metrics-for-imbalanced-classification/
However, beyond the choice of evaluation metric, imbalanced datasets are difficult to deal with in general, and having a small dataset, as you say you do, makes things even more complex.
Ganga Gautam, for example, suppose you are working on a binary classification problem with two classes: disease-affected plants and healthy plants. The cost of mislabeling a disease-affected plant as healthy, i.e. overlooking the disease, can sometimes be extremely high. If your main goal is to avoid mistaking disease-affected plants for healthy ones, then you should focus on recall; giving recall high priority when measuring the model's performance on disease classification is good practice. More generally, the model's F1-score is usually more informative than its accuracy, especially when you have an uneven class distribution.
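The recall-first evaluation described above could be sketched as follows; the plant-disease labels here are hypothetical values chosen for illustration (1 = disease-affected, 0 = healthy), not real data.

```python
# Recall-first evaluation for a hypothetical plant-disease classifier.
from sklearn.metrics import confusion_matrix, f1_score, recall_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]   # 4 diseased, 6 healthy (assumed)
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 0, 0]   # one diseased plant missed

# Recall answers: of the truly diseased plants, how many did we catch?
recall = recall_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"recall={recall:.2f} f1={f1:.2f} missed diseased plants={fn}")
```

Here `fn` counts exactly the costly error the answer warns about (a diseased plant passed off as healthy), so tracking recall and false negatives directly is a natural way to monitor that risk during model selection.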