I found a paper entitled "Multimodal representation: Kneser-Ney Smoothing/Skip-Gram based neural language model". I am curious how the Kneser-Ney smoothing technique can be integrated into a feed-forward neural language model with one linear hidden layer and a softmax output. What is the purpose of Kneser-Ney smoothing in such a neural network, and how can it be used for learning the conditional probability of the next word given its context?
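For context, here is a minimal sketch (mine, not the paper's implementation) of the kind of feed-forward model described above: the context word embeddings are concatenated, passed through a single linear hidden layer, and a softmax over the vocabulary gives P(next word | context). How Kneser-Ney enters this pipeline is exactly what I am asking; one speculative possibility, shown purely as an assumption, would be to interpolate the network's softmax distribution with a Kneser-Ney-smoothed n-gram estimate.

```python
# Hypothetical sketch, NOT the paper's method: a feed-forward n-gram neural LM
# with one linear hidden layer and a softmax output over the vocabulary.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeedForwardLM(nn.Module):
    def __init__(self, vocab_size, context_size=3, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)                  # word embeddings
        self.hidden = nn.Linear(context_size * embed_dim, hidden_dim)     # single linear hidden layer
        self.out = nn.Linear(hidden_dim, vocab_size)                      # projection to vocabulary

    def forward(self, context_ids):
        # context_ids: (batch, context_size) indices of the preceding words
        e = self.embed(context_ids).flatten(start_dim=1)   # concatenate context embeddings
        h = self.hidden(e)                                 # linear hidden layer (no non-linearity)
        return F.log_softmax(self.out(h), dim=-1)          # log P(next word | context)

# One speculative way Kneser-Ney could be combined with the network (an assumption,
# not the paper's stated method): interpolate the network's distribution with a
# KN-smoothed n-gram estimate p_kn using a mixing weight lam.
def interpolate(log_p_nn, p_kn, lam=0.5):
    # log_p_nn: (batch, vocab) from the network; p_kn: (batch, vocab) KN probabilities
    return lam * log_p_nn.exp() + (1.0 - lam) * p_kn

model = FeedForwardLM(vocab_size=10000)
context = torch.randint(0, 10000, (2, 3))                          # two dummy 3-word contexts
p_kn = torch.full((2, 10000), 1.0 / 10000)                         # placeholder KN distribution
print(interpolate(model(context), p_kn).shape)                     # torch.Size([2, 10000])
```

Is this kind of interpolation what the paper does, or is the Kneser-Ney estimate used differently, e.g. as an input feature or a training target?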
