What are some tips for someone who wants to learn Machine Learning?

Ali Kadhum M. Al-Qurabat @Ali_Al-Qurabat2

12 December 2017

Helene Dörksen Popular answer

Hello Ali,

I would recommend the book

"Introduction to Machine Learning" by Ethem Alpaydin

Stu Shulman

First posted on LinkedIn, but still relevant:

https://www.linkedin.com/pulse/20140602174041-36684981-tips-for-building-effective-classifiers/

Custom classifiers (or what we call "sifters") are not difficult to build. A small team, or a spirited individual using DiscoverText, can build one before lunch. The trick is to understand some basic principles.

Fewer categories is easier than many. So, what is optimal? The answer is two or three. We have built many effective classifiers with four, five, or six codes. The trade-off is that you need to do more coding to get to a point where you are confident in the classification. Our advice is to start with two to three codes and then use the "split" feature to drill into the finer-grained categories.

Balance your training sets. If you create a dataset that is coded 95% category A, 3% category B, and 2% category C, the results may be very disappointing. To get reliable classification results, ensure that your dataset has a good mix of items from all categories. Using the search and bucket features is a good way to prepare a balanced dataset.
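As a rough illustration of the balance check described above, here is a minimal Python sketch. It is not a DiscoverText feature: the function names are hypothetical, and random downsampling to the size of the rarest category is just one simple way to even out a set (the search and bucket approach, which adds more items from rare categories, is usually preferable when more data is available).

```python
import random
from collections import Counter, defaultdict

def category_shares(labels):
    """Return each category's share of the coded items."""
    counts = Counter(labels)
    total = len(labels)
    return {cat: n / total for cat, n in counts.items()}

def downsample_to_balance(items, labels, seed=0):
    """Keep the same number of items per category (the size of the rarest one)."""
    by_cat = defaultdict(list)
    for item, label in zip(items, labels):
        by_cat[label].append(item)
    n = min(len(group) for group in by_cat.values())
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    balanced = []
    for cat, group in by_cat.items():
        for item in rng.sample(group, n):
            balanced.append((item, cat))
    rng.shuffle(balanced)
    return balanced

# The skewed example from the text: 95% A, 3% B, 2% C.
labels = ["A"] * 95 + ["B"] * 3 + ["C"] * 2
items = [f"item-{i}" for i in range(len(labels))]

shares = category_shares(labels)
balanced = downsample_to_balance(items, labels)
print(shares)
print(Counter(cat for _, cat in balanced))
```

Downsampling this skewed set leaves only two items per category, which shows why collecting more examples of the rare categories is the better fix when possible.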

Find good coders. There is tremendous variation in the quality of the annotations that coders produce. Some are fast but inaccurate. Others are slow but highly accurate. The best coders are both quick and accurate. Contact Texifter if you need help finding coders; we know some good ones. Note: Not all good coders are good for all tasks. In some cases, you need domain-knowledgeable coders.

Use the validate dataset feature early in the process. Our adjudication procedures are novel. We started building software (the Coding Analysis Toolkit) specifically to support the process of adjudication of coder disagreement. Our patent filing on "Coder Rank for Enhanced Machine Learning" builds on years of experience seeing widespread variation in coder ability. When you first set out to create a classifier, try to get four of your DiscoverText peers to all code the same 100 items. Then go through the adjudication process. You will learn a lot about your data, the codes, boundary cases, and the coders.
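To make the "four peers code the same items" exercise concrete, the sketch below computes simple pairwise percent agreement and lists the items that need adjudication. This is an assumption-laden illustration, not the Coder Rank method or any DiscoverText procedure; the coder names, the codes, and both function names are made up for the example.

```python
from itertools import combinations

def pairwise_agreement(codes_by_coder):
    """Share of items on which each pair of coders chose the same code."""
    result = {}
    for a, b in combinations(codes_by_coder, 2):
        pairs = list(zip(codes_by_coder[a], codes_by_coder[b]))
        result[(a, b)] = sum(x == y for x, y in pairs) / len(pairs)
    return result

def items_to_adjudicate(codes_by_coder):
    """Indices of items where the coders were not unanimous."""
    columns = zip(*codes_by_coder.values())
    return [i for i, codes in enumerate(columns) if len(set(codes)) > 1]

# Hypothetical data: four coders, the same five items (a real round would use ~100).
codes_by_coder = {
    "ann": ["A", "A", "B", "B", "A"],
    "bo":  ["A", "A", "B", "A", "A"],
    "cy":  ["A", "B", "B", "B", "A"],
    "di":  ["A", "A", "B", "B", "A"],
}

agreement = pairwise_agreement(codes_by_coder)
disputed = items_to_adjudicate(codes_by_coder)
print(agreement)
print(disputed)
```

Items in the disputed list are the boundary cases worth discussing. Raw percent agreement does not correct for chance, so a chance-corrected statistic may be more appropriate for a formal report.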

Iterate, iterate, iterate. Repeat this process as many times as needed. After each round, retrain the classifier and be sure to exclude the invalid items when you "rebuild" the classifier via ActiveLearning. Gradually you weed out the false positives that result from the classification.

Use the classifier scores to pull new samples of high value items. If a new classifier is being developed and you have completed one or two rounds of coding, set up a filter with the following two criteria: a) filter for items not coded, and b) filter for a code or all codes above 95% likely to be in a category. Then put the results in a bucket and create a new dataset using the random sampling tool. These are your high value items. Coding this new dataset helps cut down the false positives in your classification results.
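The filter-then-sample step above could be sketched in Python roughly as follows. The `Item` record, its field names, and `high_value_sample` are hypothetical stand-ins for the filter, bucket, and random sampling tools; the sketch treats an item as high value when any single code scores above the threshold.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class Item:
    text: str
    human_code: Optional[str]   # None means no coder has labeled it yet
    scores: Dict[str, float]    # classifier probability per code

def high_value_sample(items: List[Item], threshold: float = 0.95,
                      sample_size: int = 100, seed: int = 0) -> List[Item]:
    """Uncoded items with some code scored above the threshold, randomly sampled."""
    bucket = [
        it for it in items
        if it.human_code is None
        and max(it.scores.values(), default=0.0) > threshold
    ]
    rng = random.Random(seed)
    return rng.sample(bucket, min(sample_size, len(bucket)))

items = [
    Item("a", None, {"spam": 0.99, "ham": 0.01}),
    Item("b", "spam", {"spam": 0.98, "ham": 0.02}),  # already coded, skipped
    Item("c", None, {"spam": 0.60, "ham": 0.40}),    # below threshold, skipped
    Item("d", None, {"spam": 0.03, "ham": 0.97}),
]

sample = high_value_sample(items, sample_size=10)
print(sorted(it.text for it in sample))
```

Coding these confident-but-unverified items is where a human coder is most likely to catch false positives.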

Kah Huo Leong

I always recommend that my staff start with Google Cloud Platform: https://cloud.google.com/ml-engine/

https://edu.google.com/higher-ed-solutions/google-cloud-platform/?modal_active=none

Besides, Google is giving researchers access to its most advanced machine learning technologies for free.

https://services.google.com/fb/forms/tpusignup

Alternatively, you may try Microsoft Azure, which I think is a lot easier to get started with, but you will have to pay after the trial period (for most of the services).

https://azure.microsoft.com/en-in/

Do have fun with your ML projects.

Ali Kadhum M. Al-Qurabat

Thank you, Helene Dörksen, for your answer.

Ali Kadhum M. Al-Qurabat

Thank you, Stu Shulman, for your answer.

Ali Kadhum M. Al-Qurabat

Thank you, Kah Huo Leong, for your answer.

Rajesh N V P S Kandala

The best way to learn is to practice. First, go through some tutorials to understand the theory behind the basic machine learning algorithms. Next, do some small exercises until you feel comfortable. Then take some online courses that involve mathematically flavored assignments. Finally, practice more and explore more tools on various kinds of problems. With that, one can definitely become an expert in machine learning.

Ali Kadhum M. Al-Qurabat

Thank you Rajesh N V P S Kandala for your answer

Amin Ullah

First, find a simple dataset such as MNIST and implement SVM, KNN, decision trees, etc. Python or MATLAB is easy for beginners.

You may run into some issues while implementing them, and working through those issues is how you learn more.
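Of the algorithms mentioned, k-nearest neighbours (KNN) is short enough to write from scratch, so it makes a reasonable first exercise. The sketch below uses only the Python standard library (Python 3.8+ for `math.dist`). A tiny hand-made 2-D dataset stands in for MNIST, and the function name and data are assumptions made for illustration.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbours."""
    # Euclidean distance from the query to every training point, nearest first.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Toy data: two well-separated clusters.
train_points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_labels = ["low", "low", "low", "high", "high", "high"]

pred_low = knn_predict(train_points, train_labels, (0.5, 0.5))
pred_high = knn_predict(train_points, train_labels, (5.5, 5.5))
print(pred_low, pred_high)
```

The same function works on MNIST once each image is flattened into a vector of pixel values, although a brute-force search like this one becomes slow on the full dataset.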

Ali Kadhum M. Al-Qurabat

Many thanks Amin Ullah for your answer

Hamit Can

https://www.quora.com/What-are-some-tips-for-someone-who-wants-to-learn-Machine-Learning
