CUDA is a C/C++ extension that allows you to program NVIDIA GPUs efficiently. If your program needs to run on non-NVIDIA GPUs, you should use something more general: OpenCL is a widespread standard for parallel programming on accelerators. Though less portable, CUDA programs typically run significantly faster, because the compiler can produce code optimized for specific NVIDIA GPU architectures.
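To give a feel for what the C/C++ extension looks like, here is a minimal vector-addition sketch: a `__global__` kernel is launched with the `<<<blocks, threads>>>` syntax, and each GPU thread processes one array element independently. This assumes a CUDA-capable GPU and the CUDA toolkit; error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Kernel: each thread computes one element of c, independently of all others.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1024;
    const size_t bytes = n * sizeof(float);
    float *h_a = (float *)malloc(bytes);
    float *h_b = (float *)malloc(bytes);
    float *h_c = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) { h_a[i] = (float)i; h_b[i] = 2.0f * i; }

    // Allocate device memory and copy the inputs to the GPU.
    float *d_a, *d_b, *d_c;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMalloc(&d_c, bytes);
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    // Launch enough blocks of 256 threads to cover all n elements.
    const int threadsPerBlock = 256;
    const int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;
    vecAdd<<<blocks, threadsPerBlock>>>(d_a, d_b, d_c, n);
    cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);

    printf("c[10] = %.1f\n", h_c[10]);  // 10 + 20 = 30.0

    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    free(h_a); free(h_b); free(h_c);
    return 0;
}
```

Compiled with `nvcc`, this runs the addition of all 1024 element pairs in parallel, which is exactly the kind of independent, fine-grained work GPUs are built for.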
When considering the use of a GPU with CUDA or OpenCL, it is important that the computations can be executed in parallel without conflicts. GPUs are highly efficient when many small, mutually independent calculations can be performed simultaneously. Here are some examples where GPUs are used efficiently:
1) Rendering: The rendering of many triangles in a scene can be done as a sequence of parallel computations. This was the initial use case for GPUs, as they allowed for impressively fast frame rates in games and other interactive 3D applications. Initially, GPUs acted as fixed-function pipelines [1], but as their impressive aggregate processing power was sought after in other disciplines, GPUs became programmable through high-level languages such as CUDA or OpenCL. Now, GPUs can be programmed for rendering tasks such as ray tracing or path tracing [2], where thousands of rays are emitted from an eye position through the scene.
2) Linear algebra: Calculations such as matrix multiplications or solving systems of linear equations are performed frequently in high-performance computing problems such as physical simulations. Reed et al. (including Turing Award winner Jack Dongarra) survey the computations most frequently performed in HPC today [3]. These applications can benefit drastically from GPU-accelerated linear algebra libraries such as cuBLAS [4], which can save days of computation time.
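As a sketch of how little code such a library call requires, the snippet below multiplies two 2x2 matrices with cuBLAS's `cublasSgemm` (single-precision general matrix multiply, C = alpha*A*B + beta*C). It assumes a CUDA-capable GPU with the cuBLAS library installed; note that cuBLAS expects matrices in column-major order, and error checking is again omitted.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <cublas_v2.h>

int main() {
    // 2x2 matrices, stored column-major as cuBLAS expects.
    const int n = 2;
    const float h_a[] = {1.0f, 2.0f, 3.0f, 4.0f};  // A = [[1, 3], [2, 4]]
    const float h_b[] = {1.0f, 0.0f, 0.0f, 1.0f};  // B = identity
    float h_c[4] = {0.0f};

    float *d_a, *d_b, *d_c;
    cudaMalloc(&d_a, sizeof(h_a));
    cudaMalloc(&d_b, sizeof(h_b));
    cudaMalloc(&d_c, sizeof(h_c));
    cudaMemcpy(d_a, h_a, sizeof(h_a), cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, sizeof(h_b), cudaMemcpyHostToDevice);

    cublasHandle_t handle;
    cublasCreate(&handle);

    // C = alpha * A * B + beta * C, with no transposition of A or B.
    const float alpha = 1.0f, beta = 0.0f;
    cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N, n, n, n,
                &alpha, d_a, n, d_b, n, &beta, d_c, n);

    cudaMemcpy(h_c, d_c, sizeof(h_c), cudaMemcpyDeviceToHost);
    // B is the identity, so C equals A.
    printf("C = [[%.0f, %.0f], [%.0f, %.0f]]\n", h_c[0], h_c[2], h_c[1], h_c[3]);

    cublasDestroy(handle);
    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    return 0;
}
```

The same call scales to matrices with thousands of rows and columns, where the GPU's parallelism pays off; the library chooses a tuned kernel for the target architecture automatically.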
3) Machine learning: Training models on large data sets is computationally intensive and frequently requires a lot of time. For this reason, GPUs are widely used to accelerate the process, though most users rely on existing packages that drive the GPU for them. See [5] for a detailed explanation.
As a closing note: before implementing a CUDA program yourself, first check whether your problem maps well onto the GPU architecture, and second, check whether a library for the problem already exists.
[2] SJOHOLM, J. Best practices: Using NVIDIA RTX ray tracing. NVIDIA Developer Blog, 2020. URL: https://developer.nvidia.com/blog/best-practices-using-nvidia-rtx-ray-tracing/
[3] REED, Daniel; GANNON, Dennis; DONGARRA, Jack. Reinventing high performance computing: challenges and opportunities. arXiv preprint arXiv:2203.02544, 2022.
[5] ILIEVSKI, Andrej; ZDRAVESKI, Vladimir; GUSEV, Marjan. How CUDA powers the machine learning revolution. In: 2018 26th Telecommunications Forum (TELFOR). IEEE, 2018. pp. 420-425.