Thank you for your reply. I meant: how should the processing be done? For example, the naive solution is to collect all the data into one data center and then process it there. Are there any other methods or techniques for solving the problem?
- Using Apache Hadoop: an open-source software framework, written in Java, for distributed storage and distributed processing of very large data sets on clusters built from commodity hardware. Tutorials for Hive and Impala, two SQL engines in that ecosystem: https://www.tutorialspoint.com/hive/hive_tutorial.pdf https://www.tutorialspoint.com/impala/impala_tutorial.pdf
You can use Apache NiFi or Sqoop to import the data from the disparate data islands, and then process it with Apache Hive, Apache Pig, Python (Pandas or NumPy), Spark, etc., as in the sketch below.
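As a rough illustration of the processing step, here is a minimal PySpark sketch that aggregates data already landed on HDFS by NiFi or Sqoop. The path `hdfs:///data/landing/events` and the `region` column are hypothetical placeholders, not part of the original answer:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("process-landed-data").getOrCreate()

# Hypothetical landing directory written by NiFi or Sqoop.
events = spark.read.option("header", True).csv("hdfs:///data/landing/events")

# Example aggregation: count events per region across the combined data.
counts = events.groupBy("region").agg(F.count("*").alias("event_count"))
counts.show()
```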
If you don't want to import the data from the different sources into the Hadoop ecosystem (HDFS), then you have to create a federation layer that queries each source in place, using tools such as Impala, Kudu, or Drill.
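Impala and Drill expose that federation through SQL; as a loose stand-in, here is a hedged PySpark sketch of the same query-in-place idea, reading a table over JDBC without copying it into HDFS first. The connection URL, table name, and credentials are made-up placeholders:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("federated-query").getOrCreate()

# Hypothetical JDBC source; the rows stay in the remote database
# and are fetched on demand rather than imported into HDFS.
orders = (spark.read.format("jdbc")
          .option("url", "jdbc:postgresql://dbhost:5432/sales")  # placeholder
          .option("dbtable", "public.orders")                    # placeholder
          .option("user", "reader")
          .option("password", "REPLACE_ME")
          .option("driver", "org.postgresql.Driver")
          .load())

orders.createOrReplaceTempView("orders")
spark.sql("SELECT region, SUM(amount) AS total FROM orders GROUP BY region").show()
```

A real deployment would point Impala or Drill at the sources directly; the Spark JDBC reader is only meant to show the shape of querying data without ingesting it.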