How to extract the confidence score value of the SVM classifier?

More Sabari Nathan's questions See All

How to use machine learning techniqueto extract the tables from scanned document images?

i want to extract the tables from scanned document images with help of ML. Please suggest robust method for extracting the tables. I need to extract the table details with help of ML functions. I...

05 June 2015 6,916 7 View

How to classify the scanned document images?

Hi friends, i want to classify the scanned document images and so many methods are there. But it depends high with texts in the document. Please suggest any best algorithm that can classify the...

05 June 2015 4,502 6 View

How can I convert the MSR 3D depth video into Mat file in matlab?

Hi Friends, I am working on human action classification. I have tried to convert depth.bin into Matlab mat file but i am not able to do . please share some code for reading depth file and...

03 April 2015 7,600 3 View

Which is the best feature for natural image character recongnition?

Hi friends, Now I am working on NCR. I have used hog,phog,co-occurence hog for natural character recognition. Even i have tried deep learning with RBM. But I got 50% accuracy only so please...

01 February 2015 5,895 8 View

Where can I find a handwritten character dataset ?

hi everyone, I am working on handwritten character recognition. I need some sample images for training. I have searched a lot but I got only few samples. So please share with me dataset links.

11 December 2014 6,310 13 View

How can I segment joined printed characters?

I am working on degraded document image enhancement. Most of them are using complex techniques If anyone knows a simple method for character segmentation, tell me the paper name or code link....

11 December 2014 3,236 2 View

Which author book is best for Hidden markov model learning?

Currently i am working on handwritten charater recognition so i want to learn markov model. Please suggest me good book.

10 November 2014 9,628 4 View

How to convert emxArray_real_T * image into Opencv Mat Image?

I have tried to convert Matlab code into opencv code. I have used few inbuilt Matlab c++ code for my opencv implementation. I don't know emxArray_real_T to Mat conversion. Does anyone know the answer?

07 August 2014 4,510 3 View

How to convert matlab mat file into opencv mat file?

Without using yml file.My mat size 12000X400(feature vector) so not able to create the YML file. please give me any suggestion and code.

06 July 2014 7,625 0 View

How can I extract the without border tables in the scanned document?

I am working on without border table detection and extraction. Please suggest an efficient algorithm.

02 March 2014 1,470 2 View

How to learn more about SPSS and its Application?

I would like to learn more about SPSS and Its application especially in regards to data analysis. Please suggest me how I can learn more about it. Thank you so much.

11 August 2024 9,101 4 View

Can I base on reverse DNA sequences to perform alignment, convert to amino acids and GenBank submission?

I have reverse sequences (AB1 format), can I base on reverse DNA sequences to perform nucleotide alignment, convert nucleotides to amino acids and deposit the sequence in GenBank database?

11 August 2024 5,138 1 View

Baseline drift in HPLC? What causes this?

Hello, Why do i see this baseline drift when i compare my blank (black) to the sample (blue)? Any suggestions as to why this happened? Thank you!

11 August 2024 3,770 4 View

Text-Communication from the M1 Hand Area using BCI—and then there is Elon Musk?

Willett, Shenoy et al. (2021) have developed a brain computer interface (BCI) that used neural signal collected from the hand area of the motor cortex (area M1) of a paralyzed patient. The...

10 August 2024 7,180 0 View

Has anyone applied Python in the field of textile engineering for data analysis, automation, or smart textiles?

I'm currently exploring the application of Python in textile engineering, specifically in areas like data analysis, process automation, and the development of smart textiles. I'm interested in...

10 August 2024 7,429 2 View

How can I use the cif data obtained from rietveld refinement extracted via gsas2, for microstructural analysis using ETEX software?

09 August 2024 7,718 0 View

How are iso-frequency contours plotted?

Let's say we have a standard, regular hexagonal honeycomb with a 3-arm primitive unit cell (something like the figure attached; the figure is only representative and not drawn to scale). The...

07 August 2024 1,937 1 View

How to prepare the nanoparticle treated fungal sample for Environmental SEM analysis?

A fungal strain was treated with nanoparticles. We want to do an environmental SEM analysis. So could anyone share your views on preparing the sample? Thank you.

07 August 2024 5,307 1 View

How to normalize and take the significance of the MTT OD values with 3 replicates for the same cell-line?

Hi, I have a question about normalizing the MTT OD values for doing the statistical analysis. So, if we have 3 different plates and we call them 3 different replicates, so, first we would...

07 August 2024 8,106 4 View

Why does my protein refolded to beta sheet during thermal denaturation analysis?

Hi! So i attempted to understand a novel protein behavior towards heat application by analyzing its secondary structure change. I subjected the protein to a thermal denaturation analysis using...

06 August 2024 1,989 3 View

Suresh Merugu

Hello Sabari,

Please go through this, this may help to you i think so,

TO EXTRACT UNIGRAM FEATURES

Index *ind = IndexManager::openIndex("index-file.key");

int d1;

TermInfoList *tList = ind->termInfoList(d1);

tList->startIteration();

while (tList->hasMore()) {

TermInfo * entry = tList->nextEntry();

cout

Xiaomeng Wu

If you use libSVM, it can output the confidence score for you. SHOGUN library also supports this functionality.

If you have to implement the algorithm by yourself, please read the articles listed below.

Chang, C.C., Lin, C.J.: LIBSVM: A library for support vector machines. ACM TIST 2

(2011) 27 (Section 8)

LIN, H.-T., LIN, C.-J., ANDWENG, R. C. 2007. A note on Platt’s probabilistic outputs for support vector machines.

Mach. Learn. 68, 267–276.

Sample codes:

http://www.work.caltech.edu/~htlin/program/libsvm/ (Probabilistic Outputs for SVM)

Sabari Nathan

Suresh and Xiaomeng thank you

Francesco Orabona

The use of the margin of SVMs (or a non-linear transformation of the margin, for example a sigmoid as in Platt's method) as a measure of confidence, while widely used, has no support at all from the theory. In other words, things can be arbitrarily bad.

The problem is the hinge loss. It is possible to show that when you use the hinge loss, even with an infinite amount of data, the margin will not converge to any measure of confidence or to the conditional probabilities.

On the other hand, it is enough to use the log-loss, the squared hinge, or even the modified Huber's loss, to have the guarantee that the margin will carry some confidence information. The exact mathematical details are in

http://stat.rutgers.edu/home/tzhang/papers/aos04_consistency.pdf

Santosh Tirunagari

Already several answers are given here which discussed the theoretical aspects. This answer will shed light onto the practical implementation in Python. Here, we need to use “predict_proba” function. This method computes the probability that a given datapoint belongs to a particular class using Platt scaling. You can check out the original paper by Platt (http://citeseer.ist.psu.edu/viewdoc/download;jsessionid=92D78A0432AC435DC3DADF0A86A70E1D?doi=10.1.1.41.1639&rep=rep1&type=pdf). Basically, Platt scaling computes the probabilities using the following method:

P(class/input) = 1 / (1 + exp(A * f(input) + B))

Here, P(class/input) is the probability that “input” belongs to “class” and f(input) is the signed distance of the input datapoint from the boundary, which is basically the output of “decision_function”. We need to train the SVM as usual and then optimize the parameters A and B. The value of P(class/input) will always be between 0 and 1. Bear in mind that the training method would be slightly different if we want to use Platt scaling. We need to train a probability model on top of our SVM. Also, to avoid overfitting, it uses n-fold cross validation. So this is a lot more expensive than training a non-probabilistic SVM (like we did earlier). Let’s see how to do it:

>>> classifier_conf = SVC(kernel='linear', probability=True) >>> classifier_conf.fit(X, y) >>> classifier_conf.predict_proba([1, 3]) array([[ 0.67902586, 0.32097414]])

It is 67.9% sure that this point belongs to class 0 and 32.1% sure that it belongs to class 1.