Apache Spark™ is a fast, general-purpose engine for large-scale data processing and the core of a broader computing ecosystem.
Scala is a general-purpose programming language providing support for functional programming and a strong static type system. It is multi-paradigm: functional, object-oriented, imperative, and concurrent.
So you can use Scala to program within the Spark ecosystem.
Apache Spark is a distributed computation framework that simplifies and speeds up the data crunching and analytics workflow for data scientists and engineers working over large datasets. It offers a unified interface for prototyping as well as for building production-quality applications, which makes it particularly suitable for an agile approach. I personally believe that Spark will inevitably become the de facto Big Data framework for Machine Learning and Data Science.
Why only Scala and Python? Apache Spark comes with four APIs: Scala, Java, Python and, more recently, R. The reason I am only considering "PyScala" is that each of these two mostly provides features similar to its counterpart (Scala over Java, and Python over R) with, in my opinion, a better overall score. Moreover, R is not a general-purpose language, and its API is still in an experimental phase.
Spark is nothing but a general-purpose, lightning-fast cluster computing platform. In other words, it is an open-source, wide-range data processing engine. It exposes development APIs that enable data workers to accomplish streaming, machine learning, or SQL workloads that demand repeated access to data sets.
Apache Spark currently supports multiple programming languages, including Scala and Python.
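One reason Scala feels natural for Spark is that Spark's RDD API mirrors Scala's own collection API. As an illustrative analogy (plain Scala collections, not actual Spark code, so it needs no cluster), the classic word count looks like this; on a real cluster the same flatMap/map/reduce chain would run over an RDD obtained from a SparkContext:

```scala
object WordCountSketch {
  def main(args: Array[String]): Unit = {
    val lines = Seq("spark is fast", "scala runs spark")

    // The same transformation chain Spark's RDD API uses:
    // flatMap to split lines into words, map to key-value pairs,
    // then aggregate the counts per word.
    val counts: Map[String, Int] = lines
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .groupBy(_._1)
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) }

    println(counts("spark")) // prints 2
  }
}
```

In Spark itself, the `groupBy`/`map` pair would typically be replaced by `reduceByKey(_ + _)`, but the shape of the code is the same.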
Still, Scala is often chosen over Python for Spark for the following reasons:
1. Python is in general slower than Scala. If you have significant processing logic written in your own code, Scala will definitely offer better performance.
2. Scala is statically typed. It looks like a dynamically typed language because it uses a sophisticated type-inference mechanism, but the compiler still catches type errors at compile time for me. Call me old school.
3. Apache Spark is built on Scala, so being proficient in Scala helps you dig into the source code when something does not work as you expect. This is especially true for a young, fast-moving open-source project like Spark.
4. When the Python wrapper calls the underlying Spark code written in Scala and running on a JVM, the translation between two different environments and languages can be a source of additional bugs and issues.
5. Last but not least, because Spark is implemented in Scala, using Scala allows you to access the latest and greatest features. Most features are first available in Scala and then ported to Python.
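The static-typing point above can be illustrated with plain Scala (no Spark required). No type annotations are written, yet the compiler infers and enforces precise types:

```scala
object TypeInferenceDemo {
  def main(args: Array[String]): Unit = {
    // No annotation needed: the compiler infers Map[String, Int].
    val wordLengths = List("spark", "scala").map(w => (w, w.length)).toMap

    // Because the type is known statically, passing wordLengths where a
    // Map[String, String] is expected would be a compile-time error,
    // rather than a runtime failure as in a dynamically typed language.
    println(wordLengths("spark")) // prints 5
  }
}
```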
Read more here: https://data-flair.training/blogs/spark-tutorial/