To understand this, one most likely needs not only the overall concept of a transformer architecture using an attention mechanism to capture global input-output dependencies, but also the specific mathematical, logical, and computational details. How is the attention mechanism employed, and what is its essence from an information-processing point of view? How are inputs of varying lengths handled, and how does the mechanism adapt its attention to the length of the sequence? Why can it replace sophisticated recurrent or convolutional neural networks?
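To make the question about variable-length inputs concrete: here is a minimal NumPy sketch of scaled dot-product attention, the core operation of the transformer. It is an illustrative simplification (no learned projections, no multi-head structure, no masking), but it shows the key property being asked about: the attention weights are computed from pairwise dot products, so no parameter depends on the sequence length.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)       # (n_q, n_k) pairwise similarities
    weights = softmax(scores, axis=-1)  # each query's weights sum to 1
    return weights @ V                  # weighted average of the values

# The same function handles any sequence length: the weight matrix
# is (n, n) and is recomputed per input, not stored as a parameter.
rng = np.random.default_rng(0)
for n in (3, 7, 50):
    X = rng.normal(size=(n, 8))
    out = attention(X, X, X)            # self-attention over n tokens
    assert out.shape == (n, 8)
```

Because every output position attends directly to every input position in one step, no recurrence over time steps or stacking of convolutional layers is needed to connect distant positions, which is the usual argument for why attention can replace them.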