I am a bit confused about how to arrange the data.

I have 2 classes, and if I arrange the data like this:

Case 1

Feature1 Feature2 Feature3 Class 1

Feature1 Feature2 Feature3 Class 1

Feature1 Feature2 Feature3 Class 1

Feature1 Feature2 Feature3 Class 2

Feature1 Feature2 Feature3 Class 2

Feature1 Feature2 Feature3 Class 2

the k-fold accuracy is lower than the accuracy from a single train/test split.

Case 2

And if I arrange the data like this:

Feature1 Feature2 Feature3 Class 1

Feature1 Feature2 Feature3 Class 2

Feature1 Feature2 Feature3 Class 1

Feature1 Feature2 Feature3 Class 2

the k-fold accuracy is higher than the accuracy from a single train/test split.

So which one is the right approach?
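To make the two arrangements concrete, here is a minimal sketch in plain Python of what a k-fold splitter that cuts the rows into contiguous blocks without shuffling would put in each test fold. The function names, the 12-row toy label lists, and k = 3 are assumptions made up for illustration; they are not taken from the actual data or from any particular library.

```python
def contiguous_folds(n_samples, k):
    """Split indices 0..n_samples-1 into k contiguous test folds (no shuffling)."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        # The first `extra` folds get one additional sample.
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fold_class_counts(labels, k):
    """Count how many samples of each class land in each test fold."""
    result = []
    for fold in contiguous_folds(len(labels), k):
        counts = {}
        for idx in fold:
            counts[labels[idx]] = counts.get(labels[idx], 0) + 1
        result.append(counts)
    return result


# Hypothetical toy labels: 6 rows per class.
case1_labels = [1] * 6 + [2] * 6   # Case 1: all of class 1, then all of class 2
case2_labels = [1, 2] * 6          # Case 2: classes alternate row by row

case1_counts = fold_class_counts(case1_labels, 3)
case2_counts = fold_class_counts(case2_labels, 3)

print("Case 1 test folds:", case1_counts)
print("Case 2 test folds:", case2_counts)
```

In this toy setup the Case 1 folds at both ends contain a single class, while every Case 2 fold is balanced. Whether this applies to the real experiment depends on whether the splitter actually shuffles before cutting the folds.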

thanks

----------------------------------------------------

edit

Let's discuss the results. In case 1:

Train/test split accuracy with a test size of 25% is 90%.

Accuracy with k-fold CV is 69%, with std 0.07%.

In case 2:

Train/test split accuracy with a test size of 25% is 54%.

Accuracy with k-fold CV is 78%, with std 0.08%.

In all cases, the data is shuffled and randomized.
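If the rows really are shuffled before splitting, the original row order should not change the class mix of the folds. One way to check this is a stratified split, sketched below in plain Python under stated assumptions: `stratified_folds`, the seed, the toy labels, and k = 3 are all hypothetical, and this is not the splitter used in the experiment above.

```python
import random


def stratified_folds(labels, k, seed=0):
    """Shuffle indices within each class, then deal them round-robin into k folds."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def class_counts(labels, fold):
    """Count the samples of each class inside one fold."""
    counts = {}
    for idx in fold:
        counts[labels[idx]] = counts.get(labels[idx], 0) + 1
    return counts


# Hypothetical toy labels matching the two arrangements, 6 rows per class.
sorted_labels = [1] * 6 + [2] * 6
alternating_labels = [1, 2] * 6

sorted_fold_counts = [class_counts(sorted_labels, f)
                      for f in stratified_folds(sorted_labels, 3)]
alternating_fold_counts = [class_counts(alternating_labels, f)
                           for f in stratified_folds(alternating_labels, 3)]

print("Sorted rows:     ", sorted_fold_counts)
print("Alternating rows:", alternating_fold_counts)
```

With this kind of split both arrangements give folds with the same class balance, so a large accuracy gap between them would more likely come from something else, such as the split not being shuffled in practice or a small test set producing noisy estimates.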
