Why does the resource utilization of an FPGA vary with input size?


Peter Schulz

Hi Anishchandran,

Resource utilization in this case increases superlinearly (quadratically) because of the number of multiplications. But what I don't understand in your question: how can the architecture stay fixed while you observe increased resource utilization? To observe increased resource utilization, different synthesis and implementation runs are necessary.

Best Regards

Peter
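The point about the number of multiplications can be made concrete with a small count. This is only a sketch: it assumes square n x n matrices and the schoolbook algorithm, neither of which the thread states, and `matmul_op_counts` is a hypothetical helper name. It counts arithmetic operations, not FPGA resources; how many of them become physical multipliers depends on the design.

```python
def matmul_op_counts(n: int) -> tuple[int, int]:
    """Scalar operations in a schoolbook n x n matrix product (assumed algorithm).

    Each of the n*n output elements is a dot product of length n:
    n multiplications and n - 1 additions.
    """
    multiplications = n * n * n
    additions = n * n * (n - 1)
    return multiplications, additions


if __name__ == "__main__":
    for n in (8, 16, 64):
        mults, adds = matmul_op_counts(n)
        print(f"n={n}: {mults} multiplications, {adds} additions")
```

If a design processes all of these with one time-shared multiplier, only the run time grows with n; if the tool or the HDL replicates hardware per operation, the resource count grows with it.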

Anishchandran .C

Greetings sir

Thank you for the reply.

By "architecture stays fixed" I meant that once we have implemented an architecture on an FPGA, it does not dynamically change with respect to the input. For example, in a single-precision floating-point pipelined multiplier, each module in the architecture is defined for a 32-bit width and the corresponding logic resources are used. Obviously this stays the same for the entire execution of the algorithm. So how does multiplying two matrices of 8 elements each differ in resource utilization from multiplying matrices of 16 elements? The input is fed serially, and on each clock pulse only two elements (32 bits each, because of single precision) are multiplied, irrespective of the input matrix size.

If you don't mind, please clarify this for me.

Thank you

@Anishchandran: sorry, I didn't get it completely: how do you know about the different resource utilization if you don't rearrange your architecture?

What toolchain are you using? Could you please briefly describe what design steps you perform between size 8 and size 16?

I designed a single-precision floating-point matrix multiplication architecture on an FPGA using Xilinx ISE. For matrix sizes up to 16x16 (say A and B), the implementation uses very few FPGA resources. But when I chose 64x64 as the size of A and B, the synthesis report showed more than 100% utilization of the FPGA resources.

Here the architecture is defined and designed for single precision. The method of inputting matrices A and B, whether they are 16x16 or 64x64, remains the same, i.e. serial input. So at each clock cycle the architecture receives one multiplier and one multiplicand, irrespective of the size of the matrix. Then how does an increase in the size of the input matrices increase the utilization of FPGA resources?

Thank you
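One thing that does scale with matrix size even when the multiplier itself is fixed is any on-chip storage for the operands. The thread does not say whether the design buffers A and B, so the following is only a sketch under that assumption; `operand_storage_bits` is a hypothetical helper, and whether such storage ends up in flip-flops, LUT RAM, or block RAM depends on how the HDL is written.

```python
def operand_storage_bits(n: int, width_bits: int = 32, matrices: int = 2) -> int:
    """Bits needed to hold `matrices` full n x n operands (assumed to be buffered)."""
    return n * n * width_bits * matrices


if __name__ == "__main__":
    small = operand_storage_bits(16)   # two 16x16 single-precision matrices
    large = operand_storage_bits(64)   # two 64x64 single-precision matrices
    print(f"16x16: {small} bits, 64x64: {large} bits, ratio {large // small}x")
```

Going from 16x16 to 64x64 multiplies the element count by 16, so a design that keeps whole matrices in registers would see its register usage grow by roughly the same factor.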

@Anishchandran: are you instantiating the multipliers explicitly, or do you leave the choice to the synthesis tool? If you leave it to the synthesis tool, then whether it optimizes for speed or for area depends on your constraints. If the optimization is for speed, the synthesizer will probably use all the multiplier instances the chip has available.
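The speed versus area trade-off mentioned here can be illustrated with a simple idealized model. It assumes a schoolbook n x n product (n cubed multiplications) and that each multiplier starts one multiplication per cycle with no stalls; `cycles_needed` is a hypothetical helper, not something reported by any synthesis tool.

```python
def cycles_needed(n: int, parallel_multipliers: int) -> int:
    """Idealized cycle count for an n x n product with a given number of multipliers.

    Assumes n**3 multiplications spread evenly over the multipliers,
    one multiplication issued per multiplier per cycle.
    """
    if parallel_multipliers < 1:
        raise ValueError("need at least one multiplier")
    total_multiplications = n ** 3
    # Ceiling division without floating point.
    return -(-total_multiplications // parallel_multipliers)


if __name__ == "__main__":
    for multipliers in (1, 8, 64):
        print(f"n=64, {multipliers} multipliers: {cycles_needed(64, multipliers)} cycles")
```

An area-oriented result sits at the one-multiplier end of this range, while a speed-oriented result spends more multipliers to reduce the cycle count, which is one way the utilization can change even though the source description looks the same.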