Potential impact of imperatively hidden objects on machine learning, predictions and aging?

08 August 2017

1.) Introduction to machine learning.

Imagine we want to predict the mode of transportation between UALR and UAMS, which are about 3.1 miles apart.

We start with 100 training samples for each class. The machine learning algorithm is trained on two time points per sample, i.e. the starting time and the finishing time. The ride takes 23 minutes by bike, and getting to the classroom takes another 5 minutes. For our machine learning algorithm, a sample can only belong to the class "bike" if the travel takes at least 23 minutes. We can therefore expect a normal distribution peaking at 28 minutes (+/- 5 min), i.e. ranging from 23 to 33 minutes.

When traveling by car, it only takes 9 min plus 5 min for parking. To be of class "car", the travel time must be at least 14 min, with a peak around 19 min and an upper extreme of 24 min.

Only if the interval falls between 23 and 24 min, where the two ranges overlap, may our model predict the wrong class.
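The two-class setup above can be sketched in a few lines of Python. This is only an illustration: the ranges come from the text, while the standard deviation of 2.5 min (reading "+/- 5 min" as roughly two standard deviations) and the function names are assumptions.

```python
from statistics import NormalDist

# Ranges stated in the text (minutes between start and arrival).
RANGES = {"car": (14, 24), "bike": (23, 33)}

# Assumption: "+/- 5 min" is read as about +/- 2 standard deviations.
SIGMA = 2.5
MODELS = {"car": NormalDist(19, SIGMA), "bike": NormalDist(28, SIGMA)}


def candidate_classes(minutes):
    """Return every class whose stated range contains the travel time."""
    return [name for name, (lo, hi) in RANGES.items() if lo <= minutes <= hi]


def most_likely(minutes):
    """Inside the overlap, pick the class with the higher density."""
    return max(MODELS, key=lambda name: MODELS[name].pdf(minutes))


if __name__ == "__main__":
    for t in (16, 23.2, 23.8, 30):
        print(t, candidate_classes(t), most_likely(t))
```

A travel time of 23.2 or 23.8 min is listed under both classes, which is exactly the 23-24 min zone where a wrong prediction is possible.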

2.) Automatic prediction of new classes/clusters using machine learning.

After having trained it on 100 samples per class, I give it the rule that anything outside the range of the classes "bike" and "car" must not be assigned to either class but must be treated as a new class instead.

2.1.) Examples for training that allows independent class discoveries using machine learning.

Walking takes 65 min +/- 10 min, i.e. range: 55-75 min. The machine learning algorithm cannot assign such an interval to "car" or "bike" and must conclude that it has discovered a new class of transportation, i.e. walking. The distribution peak should be at 65 min. We must tell our machine learning algorithm that it has just discovered a new class, i.e. a mode of transportation other than "car" and "bike". Thus, if a similar situation arises again, our algorithm can discover new classes on its own.
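A minimal sketch of this "outside every known range means new class" rule follows. The function names and the sample walking times are made up for illustration, and the class label is still supplied by a person, as described above.

```python
from statistics import mean

# Known classes and their ranges in minutes, as given in the text.
known = {"car": (14, 24), "bike": (23, 33)}


def classify(minutes, classes):
    """Return the names of all classes whose range contains the travel time."""
    return [name for name, (lo, hi) in classes.items() if lo <= minutes <= hi]


def discover(samples, classes, label):
    """Register a new class from the samples that fit no known class.

    `label` is provided by a human (e.g. "walking"); the algorithm itself
    only notices that these samples are not explained by any known class.
    """
    unexplained = [t for t in samples if not classify(t, classes)]
    if not unexplained:
        return None
    classes[label] = (min(unexplained), max(unexplained))
    return {"range": classes[label], "peak": mean(unexplained)}


if __name__ == "__main__":
    walks = [55, 60, 65, 70, 75]  # hypothetical walking times in minutes
    print(discover(walks, known, "walking"))
    print(classify(64, known))
```

After the call to `discover`, a new interval of 64 min is assigned to the newly registered class instead of being reported as unknown.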

2.2.) Introducing a fourth class, i.e. bussing

Taking the bus takes at least 56 min. The range is 56 min to 116 min, peaking at 86 min, with two local maxima at 66 and 106 minutes if the bus goes every 20 minutes. Based on this distribution, the machine learning algorithm should be able to distinguish between walking and bussing when given a set of intervals that all resulted from the same mode of transportation.

Can the machine learning algorithm conclude that "bussing" is influenced by a periodic feature whose period equals the time difference between neighboring maxima, which in our case is caused by the buses running every 20 min? This should separate bussing more from the other transportation options than they differ from one another.

2.3.) Recap of the 4 classes based on the time interval between starting and arriving only.
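One way to look for such a periodic feature is to locate the maxima of the bus distribution and measure their spacing. The sketch below assumes a mixture of three Gaussians centered at 66, 86 and 106 min; the weights (0.25, 0.5, 0.25) and the standard deviation of 4 min are illustrative assumptions, not values from the text.

```python
from statistics import NormalDist

# Assumed mixture: (weight, component). Only the three centers come from the text.
COMPONENTS = [
    (0.25, NormalDist(66, 4)),
    (0.50, NormalDist(86, 4)),
    (0.25, NormalDist(106, 4)),
]


def bus_density(minutes):
    """Density of the assumed trimodal bus distribution."""
    return sum(weight * dist.pdf(minutes) for weight, dist in COMPONENTS)


def local_maxima(density, start, stop, step=0.5):
    """Grid positions where the density is higher than at both neighbors."""
    n = int(round((stop - start) / step))
    xs = [start + i * step for i in range(n + 1)]
    ys = [density(x) for x in xs]
    return [xs[i] for i in range(1, n) if ys[i - 1] < ys[i] > ys[i + 1]]


peaks = local_maxima(bus_density, 56, 116)
spacings = [b - a for a, b in zip(peaks, peaks[1:])]

if __name__ == "__main__":
    print("peaks:", peaks)
    print("spacing between neighboring peaks:", spacings)
```

A constant spacing between neighboring peaks is the signature of the bus schedule. A unimodal class such as walking yields a single peak and therefore no spacing at all.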

Our machine learning model can now distinguish between the following four classes.

For our machine learning algorithm the world looks as described below. It has no concept of bussing, driving, walking or biking.

2.3.1.) to be of class car: range 14-24 min with normal distribution around 19 min

2.3.2.) to be of class bike: range 23-33 min with normal distribution around 28 min

2.3.3.) to be of class walking: range 55-75 min with normal distribution around 65 min

2.3.4.) to be of class bus: range 56-116 min with trimodal distribution with global maximum at 86 min and two local maxima at 66 and 106 min, given that buses come every 20 min
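Because the walking and bus ranges overlap, a single interval is often ambiguous, but a series of intervals from the same mode can be scored against each class distribution. The following is a rough sketch using summed log-likelihoods. The standard deviations, the bus mixture weights and the function names are assumptions for illustration.

```python
import math
from statistics import NormalDist


def bus_pdf(t):
    """Assumed trimodal bus density (weights and widths are illustrative)."""
    return (0.25 * NormalDist(66, 4).pdf(t)
            + 0.50 * NormalDist(86, 4).pdf(t)
            + 0.25 * NormalDist(106, 4).pdf(t))


# Assumption: each stated range is read as mean +/- 2 standard deviations.
CLASSES = {
    "car": NormalDist(19, 2.5).pdf,
    "bike": NormalDist(28, 2.5).pdf,
    "walking": NormalDist(65, 5).pdf,
    "bus": bus_pdf,
}


def log_likelihood(series, pdf):
    """Sum of log densities; the floor avoids log(0) far out in the tails."""
    return sum(math.log(max(pdf(t), 1e-300)) for t in series)


def best_class(series):
    """Class under which the whole series is most probable."""
    return max(CLASSES, key=lambda name: log_likelihood(series, CLASSES[name]))


if __name__ == "__main__":
    print(best_class([60, 65, 70, 62, 68]))
    print(best_class([66, 86, 106, 84, 90]))
```

The first series clusters around 65 min and scores best as walking. The second spreads over the bus peaks, which the walking distribution cannot explain.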

2.4.) Learning a new transportation class on its own, i.e. taxi

If I take a taxi, I save 5 minutes because no time needs to be spent on parking. This would result in an interval of 9-19 min, with a normal distribution around 14 min. The machine learning algorithm should define it as a new mode of transportation because its distribution, in particular the 9-14 min part, lies outside the ranges of the other four transportation classes for which it has training samples. Based on this training, our machine learning algorithm should now be able to correctly decide whether a series of travel intervals fits any of the 5 classes or whether it is a newly discovered transportation mode.
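Since the taxi range (9-19 min) partly overlaps the car range (14-24 min), a range check on a single interval is not enough. One possible approach, sketched here under stated assumptions, is to compare the mean of a whole series with each class mean and to declare a new class when no class is compatible. The standard deviations, the threshold of 3 standard errors and the sample series are all hypothetical.

```python
import math
from statistics import mean

# (mean, standard deviation) per known class, in minutes.
# Assumption: each stated range is read as mean +/- 2 standard deviations.
known = {
    "car": (19, 2.5),
    "bike": (28, 2.5),
    "walking": (65, 5.0),
    "bus": (86, 15.0),
}


def matching_classes(series, classes, z=3.0):
    """Classes whose mean lies within z standard errors of the series mean."""
    m, n = mean(series), len(series)
    return [name for name, (mu, sigma) in classes.items()
            if abs(m - mu) <= z * sigma / math.sqrt(n)]


def decide(series, classes):
    """Return the compatible classes, or flag the series as a new class."""
    matches = matching_classes(series, classes)
    return matches if matches else "new class"


if __name__ == "__main__":
    taxi_series = [12, 14, 15, 13, 16, 14, 11, 17]  # hypothetical taxi trips
    print(decide(taxi_series, known))   # no known class fits
    known["taxi"] = (14, 2.5)           # register the discovery
    print(decide(taxi_series, known))   # now explained by "taxi"
```

The longer the series, the smaller the standard error, so overlapping classes such as car and taxi become easier to separate. A single interval of 16 min would remain ambiguous.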

I want to draw the distributions in R or Python to help people understand machine learning.
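As a starting point for such a drawing, here is a small Python sketch that evaluates the five class distributions on a common time axis and, if matplotlib is installed, saves them as one figure. The peak positions come from the text; the standard deviations, the bus mixture weights, the function names and the output file name are assumptions.

```python
from statistics import NormalDist


def bus_pdf(t):
    """Assumed trimodal bus density with peaks at 66, 86 and 106 min."""
    return (0.25 * NormalDist(66, 4).pdf(t)
            + 0.50 * NormalDist(86, 4).pdf(t)
            + 0.25 * NormalDist(106, 4).pdf(t))


# Assumption: each stated range is read as mean +/- 2 standard deviations.
DENSITIES = {
    "taxi": NormalDist(14, 2.5).pdf,
    "car": NormalDist(19, 2.5).pdf,
    "bike": NormalDist(28, 2.5).pdf,
    "walking": NormalDist(65, 5).pdf,
    "bus": bus_pdf,
}


def curves(start=0, stop=125, step=0.25):
    """Evaluate every class density on a shared grid of travel times."""
    n = int(round((stop - start) / step))
    xs = [start + i * step for i in range(n + 1)]
    return xs, {name: [pdf(x) for x in xs] for name, pdf in DENSITIES.items()}


def plot_classes(path="transport_classes.png"):
    """Save all curves to an image file (requires matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    xs, ys = curves()
    fig, ax = plt.subplots(figsize=(9, 4))
    for name, y in ys.items():
        ax.plot(xs, y, label=name)
    ax.set_xlabel("time between start and arrival (min)")
    ax.set_ylabel("probability density")
    ax.legend()
    fig.savefig(path, dpi=150)
    return path
```

Calling `plot_classes()` writes the figure to disk. The overlaps between taxi and car, car and bike, and walking and bus then show up directly as intersecting curves.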

The aim is to use machine learning for new discoveries of classes and concepts.

Renaud Di Francesco

Can you give examples of imperatively hidden objects?

There is work on adversarial approaches to machine learning, which means hacking into your training or learning dataset so that what the machine learns is unproductive or counterproductive.

Can you clarify your text, separating questions you want to ask from each other? Thanks

Arturo Geigel

I would encourage you to follow Renaud's advice and clarify your text. It also sounds as if what you want to do is similar to non-parametric techniques such as Parzen windows (in the spirit of a probabilistic NN) or adaptive resonance. I have also used adversarial techniques to embed Trojans in ML algorithms, but these techniques can be used to insert metadata or even guide the execution of the machine learning algorithm (and not just the learning, as in other adversarial ML work).
