Thompson sampling with Bernoulli prior and non-binary reward update?

12 November 2020

I am working on a problem in which I have to select the best possible server (level 1) to hit for a given piece of data. Each level 1 server in turn hits other servers (level 2) to complete the request. All level 1 servers have the same set of level 2 servers integrated with them. For each request I get success or failure as the response.

For this I am using Thompson Sampling with Bernoulli rewards (Beta prior). On success I take the reward to be 1, and on failure 0. But in case of failure I also receive an error. For some errors it is evident that the problem lies with the level 1 server, so a reward of 0 makes sense. Other errors result from bad request data or from issues at the level 2 servers. For these kinds of errors we can't penalize the level 1 server with a reward of 0, nor can we reward it with a value of 1.

Currently I am using 0.5 as the reward for such cases.
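For concreteness, the setup described above can be sketched roughly as follows. This is a minimal illustration and not the actual implementation: the class `BetaArm`, the function `select_arm`, the outcome labels in `REWARD`, and the uniform Beta(1, 1) starting prior are all assumptions made for the example. The fractional update (adding the reward to `alpha` and one minus the reward to `beta`) is one common way to feed a reward in [0, 1] into a Beta posterior, and is assumed here as the meaning of "using 0.5 as the reward".

```python
import random


class BetaArm:
    """Beta posterior over the success probability of one level 1 server."""

    def __init__(self, alpha=1.0, beta=1.0):
        # Beta(1, 1) is a uniform prior (an assumption for this sketch).
        self.alpha = alpha
        self.beta = beta

    def sample(self, rng):
        # Draw one plausible success rate from the current posterior.
        return rng.betavariate(self.alpha, self.beta)

    def update(self, reward):
        # Fractional update: reward 1 adds to alpha, reward 0 adds to beta,
        # and reward 0.5 adds half a pseudo-count to each.
        if not 0.0 <= reward <= 1.0:
            raise ValueError("reward must lie in [0, 1]")
        self.alpha += reward
        self.beta += 1.0 - reward


def select_arm(arms, rng):
    """Thompson step: sample every posterior, pick the largest draw."""
    samples = [arm.sample(rng) for arm in arms]
    return max(range(len(arms)), key=samples.__getitem__)


# Hypothetical outcome labels mapped to the rewards described in the question.
REWARD = {
    "success": 1.0,
    "level1_error": 0.0,
    "unattributable_error": 0.5,  # bad request data or level 2 failure
}

if __name__ == "__main__":
    rng = random.Random(0)
    arms = [BetaArm() for _ in range(3)]
    chosen = select_arm(arms, rng)
    arms[chosen].update(REWARD["unattributable_error"])
    print(chosen, arms[chosen].alpha, arms[chosen].beta)
```

Under this update, each 0.5 reward moves the posterior mean of the chosen server toward 0.5 and also narrows the posterior, even though the outcome says nothing about that server.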

Searching the Internet, I couldn't find any method or algorithm for calculating the reward in such cases in a principled (informed) way.

What would be a sound way to calculate the reward in such cases?

Taoufik Yeferny

I think it would be natural to use separate server selection algorithms: one for selecting the level 1 server and another for selecting the level 2 server. Each algorithm then has its own reward function.
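One way to read this suggestion is sketched below, under several assumptions that are not in the original post: that the level 2 server used for a request is known (or chosen) by the caller, that errors can be labelled as `"level1_error"`, `"level2_error"`, or `"request_error"`, and that a level whose server is not implicated in a failure simply receives no update. All function names here are hypothetical.

```python
import random


def make_posteriors(names):
    # One [alpha, beta] pair per server, starting from a uniform Beta(1, 1).
    return {name: [1.0, 1.0] for name in names}


def choose(posteriors, rng):
    # Thompson step: sample each Beta posterior and take the largest draw.
    return max(posteriors, key=lambda n: rng.betavariate(*posteriors[n]))


def record(posteriors, name, success):
    # Standard Bernoulli update: success increments alpha, failure beta.
    posteriors[name][0 if success else 1] += 1.0


def update_both(level1, level2, l1_name, l2_name, outcome):
    """Apply each level's own reward rule for a single request outcome."""
    if outcome == "success":
        record(level1, l1_name, True)
        record(level2, l2_name, True)
    elif outcome == "level1_error":
        record(level1, l1_name, False)  # level 2 is not implicated
    elif outcome == "level2_error":
        record(level2, l2_name, False)  # level 1 is not implicated
    # "request_error": the request data was bad, so neither level is updated.


if __name__ == "__main__":
    rng = random.Random(0)
    level1 = make_posteriors(["l1-a", "l1-b"])
    level2 = make_posteriors(["l2-x", "l2-y"])
    l1, l2 = choose(level1, rng), choose(level2, rng)
    update_both(level1, level2, l1, l2, "level2_error")
    print(level1[l1], level2[l2])
```

With this split, a level 2 failure counts only against the level 2 server, so the level 1 posterior is left untouched rather than being given a neutral reward.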
