I have 205,000 negative records and 82,000 positive records. Each record contains 22 features. I am going to train an SVM classifier.
A stratified k-fold partition with k = 10 is generally a good starting point. This is well explained on Wikipedia:
http://en.wikipedia.org/wiki/Cross-validation_(statistics)#K-fold_cross-validation
"Stratified" means that every fold should keep the original balance between positive and negative records.
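To make the idea concrete, here is a minimal sketch of a stratified k-fold split in plain Python. The function name `stratified_kfold` and the toy label list are assumptions made for illustration (the labels use the same 205:82 ratio as the question, scaled down by 1000). In practice a library routine such as scikit-learn's `StratifiedKFold` would likely be a better choice than hand-written code.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs that keep each class's share in every fold.

    Hypothetical helper for illustration, not a library API.
    """
    rng = random.Random(seed)

    # Group record indices by class label.
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    # Shuffle each class separately, then deal its indices round-robin
    # so every fold receives roughly 1/k of that class.
    folds = [[] for _ in range(k)]
    for idx in by_class.values():
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)

    # Each fold serves once as the test set; the other k-1 folds form the training set.
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


# Toy labels: 205 negatives (0) and 82 positives (1), mirroring the question's ratio.
labels = [0] * 205 + [1] * 82
splits = list(stratified_kfold(labels, k=10))

for train, test in splits:
    positives = sum(labels[i] for i in test)
    print(len(test), positives, round(positives / len(test), 3))
```

Each test fold should contain close to 82 / 287 ≈ 28.6% positives, the same proportion as the full data set. The SVM would then be trained on `train` and evaluated on `test` once per fold, and the ten scores averaged.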
Dear Simone,
Actually, I am looking for a more intelligent subsampling algorithm, one that can select representative samples and ignore outliers.