How to statistically compare two maps?

27 August 2014

I need to statistically compare two maps in order to determine whether the spatial distribution of their data is correlated. Any suggestions? Thanks!

Gianmarco Alberti Popular answer

I am replying to this question since I recently came across a similar issue. I will give my two cents here, bearing in mind that this solution applies to the specific issue at hand.

I have two rasters, each representing a path system (actually, a least-cost path network). Each cell belonging to a path is given a value of 1; the off-path cells are given 0. The rasters have the same resolution and spatial extent.

I wanted to quantify if and to what extent they can be considered correlated, that is, how "strong" the overlap between them is. I focused on the Jaccard coefficient (e.g., http://people.revoledu.com/kardi/tutorial/Similarity/Jaccard.html).

This coefficient is equal to: the INTERSECTION between the two rasters divided by the UNION between the two rasters.

Now, in terms of this specific example, the INTERSECTION is the number of cells that the two rasters have in common (i.e., the number of overlapping path cells). The UNION is the total number of path cells (belonging to either of the two rasters).

In ArcGIS, we can use RASTER CALCULATOR to compute the INTERSECTION and the UNION.

To get the INTERSECTION, we just feed the following formula into RASTER CALCULATOR: "RASTER A" & "RASTER B" (where Raster A and Raster B are the names of the two rasters being analysed).

The same for UNION: "RASTER A" | "RASTER B"

Once we have obtained the two new output rasters, to get the Jaccard coefficient we simply open the attribute table of each raster, take note of the count of cells with value equal to 1, and divide accordingly (remember: INTERSECTION divided by UNION).

In my case, the count of cells with value 1 in the INTERSECTION raster is 22,822, while in the UNION raster it is 37,716. The Jaccard coefficient turns out to be about 0.61.
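For anyone working outside ArcGIS, the same intersection-over-union calculation can be sketched in Python with NumPy. This is only a minimal illustration, assuming the two rasters have already been read into equally shaped 0/1 arrays; the toy arrays and the function name `jaccard` are made up for the example and are not the real path networks.

```python
import numpy as np

def jaccard(a, b):
    """Jaccard coefficient of two binary rasters with the same shape."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    intersection = np.logical_and(a, b).sum()  # cells equal to 1 in both rasters
    union = np.logical_or(a, b).sum()          # cells equal to 1 in either raster
    if union == 0:
        raise ValueError("both rasters are empty; the coefficient is undefined")
    return intersection / union

# Toy 3x3 rasters (hypothetical values)
raster_a = np.array([[1, 1, 0], [0, 1, 0], [0, 0, 0]])
raster_b = np.array([[1, 0, 0], [0, 1, 1], [0, 0, 0]])
toy_value = jaccard(raster_a, raster_b)  # 2 shared cells out of 4 in the union

# The cell counts reported in the post
reported_value = 22822 / 37716
print(round(reported_value, 2))
```

The last two lines simply redo the division from the attribute-table counts, which gives roughly 0.605, i.e. about 0.61 when rounded to two decimals.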

I hope this quite long reply will be useful to anyone that will jump here in the future.

A similar approach (in Matlab) is provided here: http://kawahara.ca/matlab-jaccard-similarity-coefficient-between-images/

Mohammed Layelmam

You can use ArcGIS with the Combine tool ...

Raymond K. Timm

Sofia - one challenge is that no matter what you do, the results will likely be significant. An approach that I've taken in the past is to subtract successive timesteps from one another to quantify where differences exist through time. You can then estimate the fractal dimension (Df) of the differences. Df is estimated by regressing the log of perimeter against the log of area. You can test whether the slopes of the regression lines differ using ANCOVA. I realize this is kind of an inverted way of testing your question, but it does make your type I error rate more manageable if you can make it work. Good luck.

Article Response to disturbance in a highly managed alluvial river: ...

Michael Brian Harman

Use Mantel or partial Mantel tests. The test builds two matrices, one for data values and one for distance, and measures the correlation between them. It then randomizes the locations and repeats the analysis, so each iteration gives a new correlation. Based on where the actual correlation falls within the permuted distribution, it gives you the probability that the spatial arrangement is significant in terms of the data. FYI, consider standardizing the data on each map prior to analysis, in case one location has higher or lower values or the variables are scaled differently. There is free software called PASSaGE.
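The permutation logic described above can be sketched in a few lines of Python. This is a simplified, one-sided version written for illustration only, not what PASSaGE does internally; the function name `mantel`, the number of permutations, and the synthetic data are all assumptions.

```python
import numpy as np

def mantel(dist_a, dist_b, n_perm=999, seed=0):
    """Return (r, p) for a simple one-sided Mantel test.

    dist_a, dist_b: square symmetric distance matrices over the same locations.
    """
    dist_a = np.asarray(dist_a, dtype=float)
    dist_b = np.asarray(dist_b, dtype=float)
    upper = np.triu_indices_from(dist_a, k=1)   # count each pair of locations once
    r_obs = np.corrcoef(dist_a[upper], dist_b[upper])[0, 1]

    rng = np.random.default_rng(seed)
    n = dist_a.shape[0]
    hits = 0
    for _ in range(n_perm):
        order = rng.permutation(n)
        shuffled = dist_a[np.ix_(order, order)]  # relabel locations (rows and columns together)
        if np.corrcoef(shuffled[upper], dist_b[upper])[0, 1] >= r_obs:
            hits += 1
    p_value = (hits + 1) / (n_perm + 1)          # the observed arrangement counts as one permutation
    return r_obs, p_value

# Synthetic example: values that trend along the x axis
rng = np.random.default_rng(42)
coords = rng.uniform(0, 100, size=(30, 2))
values = coords[:, 0] + rng.normal(0, 5, size=30)
geo_dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
val_dist = np.abs(values[:, None] - values[None, :])
r, p = mantel(val_dist, geo_dist)
```

Rows and columns are permuted together so that each location keeps its full set of distances; shuffling individual matrix entries independently would break that structure.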

AJ Sousa

The approach you can use depends on the problem: there is no 'universal' answer to the problem of map comparison. What kind of maps are you dealing with? Are you comparing two different regions or the same region at different times? You must clarify the problem.

Nicola A Wardrop

Interesting question and this is something I have been thinking about for a while also. As others have said, it depends on the specific question, but assuming you are talking about comparing two different distributions in the same location, here are a couple of ideas that might help:

Spatial overlay of polygons and calculation of the proportion of the area of one which overlaps with the other. (example here: http://www.nature.com/nature/journal/v365/n6444/abs/365335a0.html)

Cross-covariance analysis allows you to calculate the correlation between two datasets at the same spatial location, while also accounting for correlations with neighbouring locations. Co-regionalisation type methods produce spatial models which highlight areas with shared spatial patterns. (Another couple of examples: http://biomet.oxfordjournals.org/content/100/3/539.abstract and http://www.jstor.org/stable/2937096?origin=JSTOR-pdf). I hope these will at least give you a couple of ideas to look up.

Md. Rejaur Rahman

Sofia, if you want to find out whether two spatial data sets are correlated, you can use the regression option in the Idrisi image processing software; it reports the correlation and the regression line. For that analysis, your two data sets should be in raster format and have the same pixel size. You can see an example in one of my published papers (http://www.sciencedirect.com/science/article/pii/S0304380009002634). If you need further help, feel free to contact me.

Ari Pramono

I am assuming that your spatial data is formatted as a point distribution. In that case there are several ways to compare the spatial distributions:

By comparing how the points are spatially dispersed. The spatial dispersion, tendency, and direction can be summarized as the standard deviational ellipse (based on a certain p-value).

By modelling the relationship between the spatial position and the non-spatial value of each of your data points. One good way to do this is Geographically Weighted Regression (GWR).

Both can be done rather easily in ArcInfo/ArcMap software.

Good Luck..

Gustavo Henrique Dalposso

One way is to use the Bivariate Moran index.

Here's an example of application

http://www.n-aerus.net/web/sat/workshops/2013/PDF/N-AERUS14_Matkan_Ali%20Akbar_FINAL.pdf
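As a rough illustration of the idea, here is a sketch of a global bivariate Moran index for two rasters on the same grid. It assumes binary rook (edge-sharing) neighbour weights and standardized values, which is only one of several possible weighting schemes; the function name and the toy gradient are hypothetical, and the sketch gives no significance test.

```python
import numpy as np

def bivariate_moran_grid(z1, z2):
    """Global bivariate Moran's I on a regular grid with binary rook weights."""
    z1 = np.asarray(z1, dtype=float)
    z2 = np.asarray(z2, dtype=float)
    a = (z1 - z1.mean()) / z1.std()   # standardized first variable
    b = (z2 - z2.mean()) / z2.std()   # standardized second variable
    # Sum a_i * b_j over every ordered pair of rook neighbours (i, j)
    horizontal = (a[:, :-1] * b[:, 1:]).sum() + (a[:, 1:] * b[:, :-1]).sum()
    vertical = (a[:-1, :] * b[1:, :]).sum() + (a[1:, :] * b[:-1, :]).sum()
    n_links = 2 * (a[:, :-1].size + a[:-1, :].size)   # total weight S0
    return (horizontal + vertical) / n_links

# Toy example: a smooth gradient compared with itself and with its negative
gradient = np.add.outer(np.arange(5), np.arange(5)).astype(float)
i_same = bivariate_moran_grid(gradient, gradient)
i_opposite = bivariate_moran_grid(gradient, -gradient)
```

A positive value means high values on one map tend to sit next to high values on the other; in practice the index is usually paired with a permutation test.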

Jose J. Pereira

I am assuming you are dealing with point data on a map of some sort. If that is the case you can use the Geostatistical Analyst extension in ArcGIS to examine spatial autocorrelation in the data. Output will include measured versus predicted values. You can look at the correlation between the two predicted data sets to see if they are the same.

Sreenivasulu Ganugapenta

Please see the following link; it may help you:

http://resources.arcgis.com/en/help/main/10.1/index.html#//018p00000006000000

Jan Blachowski

If your maps are in raster format there is a tool Band Collection Statistics. See

http://resources.arcgis.com/en/help/main/10.1/index.html#/How_Band_Collection_Statistics_works/009z000000q3000000/

Hope this helps

Pedro Correia

Assuming both maps cover the same area (so every location has 2 values, one on each map), you can calculate correlation (continuous variable) or similarity (discrete variable) directly. For continuous variables you may try:

a) Pearson linear correlation (http://en.wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient)

b) Reflective correlation (variant of Pearson, also on the link above)

c) Spearman rank correlation (http://en.wikipedia.org/wiki/Spearman%27s_rank_correlation_coefficient)

d) Cosine similarity (http://en.wikipedia.org/wiki/Cosine_similarity)

For discrete variables you can try:

a) Jaccard index (http://en.wikipedia.org/wiki/Jaccard_index)

b) Sorensen-Dice coefficient (http://en.wikipedia.org/wiki/S%C3%B8rensen%E2%80%93Dice_coefficient)

c) Hamming distance (http://en.wikipedia.org/wiki/Hamming_distance)

These will give you numbers (each coefficient has its own criteria, so some are better than others in certain fields), and there are many others. If you need something broader than just a number, you can try the P-P plot (http://en.wikipedia.org/wiki/P%E2%80%93P_plot) or the Q-Q plot (http://en.wikipedia.org/wiki/Q%E2%80%93Q_plot).

This assumes that you've already gone for the common scatterplot (http://en.wikipedia.org/wiki/Scatter_plot).
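Several of the coefficients listed above take only a few lines once the two maps are flattened into equally long vectors of cell values. The sketch below uses NumPy only and made-up toy vectors; note that the Spearman helper does not average tied ranks, so it is only appropriate for data without ties.

```python
import numpy as np

def pearson(x, y):
    return np.corrcoef(x, y)[0, 1]

def spearman(x, y):
    # Ranks via double argsort; ties are NOT averaged in this simplified version
    rank = lambda v: np.argsort(np.argsort(v))
    return pearson(rank(x), rank(y))

def cosine(x, y):
    return np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

def dice(a, b):
    # Sorensen-Dice coefficient for two 0/1 vectors
    a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
    return 2 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

def hamming(a, b):
    # Number of cells whose class differs between the two maps
    return int(np.sum(np.asarray(a) != np.asarray(b)))

# Continuous toy maps (flattened): map2 is a linear function of map1
map1 = np.array([1.0, 2.0, 3.0, 4.0])
map2 = 2 * map1 + 1
r_pearson = pearson(map1, map2)
r_spearman = spearman(map1, map2)
cos_sim = cosine(map1, map2)

# Discrete toy maps (flattened)
cat1 = np.array([1, 1, 0, 0])
cat2 = np.array([1, 0, 0, 1])
dice_value = dice(cat1, cat2)
hamming_value = hamming(cat1, cat2)
```

Keep in mind that neighbouring cells are rarely independent, so significance tests attached to these plain coefficients should be read with caution.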

M. A. Malik

It will be helpful to define the nature of data shown on the map. The comparison of topographic maps will be different than comparing thematic maps. Are the data values accompanied by uncertainty estimates ?

Masimalai Palaniyandi

It is very simple: use map overlay analysis (raster map overlay analysis).

David G Dickason

Dr. Bajocco, Comparing two maps displaying polygonal info is much easier when the choropleth classes use the same interval definition method -- i.e., quantile-based class intervals, and the same number of classes on each map. Even if these are not two maps of the same variable at different times, a straightforward "transition matrix" can be constructed. From that you can employ a variety of non-parametric methods of correlation related to the general chi-squared methodology -- i.e., phi coefficients for 2x2 matrices, but there are others. See the very old, but time-tested volume by Hubert Blalock, Social Statistics (published in many editions.) Best, David
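A minimal sketch of the transition matrix and the phi coefficient for the 2x2 case, assuming both maps have been reduced to integer class codes (0, 1, ...) for the same set of polygons or cells. The function names and the toy class vectors are illustrative assumptions.

```python
import numpy as np

def transition_matrix(map1, map2, n_classes):
    """Count units for every (class on map 1, class on map 2) combination."""
    m1 = np.asarray(map1).ravel()
    m2 = np.asarray(map2).ravel()
    table = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(table, (m1, m2), 1)   # accumulate counts, including repeated pairs
    return table

def phi_coefficient(table):
    """Phi for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = np.asarray(table, dtype=float)
    return (a * d - b * c) / np.sqrt((a + b) * (c + d) * (a + c) * (b + d))

# Toy maps with two classes each (e.g. below/above the median)
classes_map1 = [0, 0, 0, 1, 1, 1, 1, 0]
classes_map2 = [0, 0, 1, 1, 1, 1, 0, 0]
table = transition_matrix(classes_map1, classes_map2, n_classes=2)
phi = phi_coefficient(table)
```

For tables larger than 2x2, a chi-squared based measure such as Cramér's V plays the same role.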

Henrique Alves

Assuming you are dealing with raster data, try PONTIUS method of budgeting the sources of error (discordance between two maps) in terms of location and quantity. IDRISI has a module to calculate them. Luck! Henrique
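To give a feel for this kind of budget, here is a sketch that splits the total disagreement between two categorical rasters into a quantity component and an allocation (location) component. It assumes the quantity/allocation formulation computed from the cross-tabulation of the two maps, which may differ in detail from what the IDRISI module reports; the function name and toy maps are hypothetical.

```python
import numpy as np

def disagreement_budget(map1, map2, n_classes):
    """Return (total, quantity, allocation) disagreement as proportions of cells."""
    m1 = np.asarray(map1).ravel()
    m2 = np.asarray(map2).ravel()
    table = np.zeros((n_classes, n_classes))
    np.add.at(table, (m1, m2), 1)          # cross-tabulation of the two maps
    p = table / table.sum()                # proportions of the study area
    total = 1.0 - np.trace(p)              # share of cells whose classes differ
    # Quantity: mismatch in how much of each class the two maps contain
    quantity = np.abs(p.sum(axis=1) - p.sum(axis=0)).sum() / 2
    allocation = total - quantity          # the remainder is due to location
    return total, quantity, allocation

reference = [0, 0, 0, 0, 1, 1, 1, 1]
more_of_class_1 = [0, 0, 0, 1, 1, 1, 1, 1]   # differs only in class amounts
swapped_cells = [1, 0, 0, 0, 0, 1, 1, 1]     # same amounts, different places
budget_quantity = disagreement_budget(reference, more_of_class_1, 2)
budget_allocation = disagreement_budget(reference, swapped_cells, 2)
```

In the first comparison all of the disagreement is quantity; in the second the class totals match and all of it is allocation.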

Mikwa Ngamba Jean-fiston

It is possible to compare two maps by means of spatial statistics tools such as Patch Analyst. You can find this small software package through Google and add it as an extension in ArcGIS 10; you will then generate lots of statistical information related to your map, in either raster or vector format.

Wencheng Yu

You can convert the two maps to raster data using the ArcGIS tool 'Polygon to Raster', then use other raster tools, such as 'Raster Calculator'.

Mohsen Dhieb

I think you have several entries to the issue depending on the type of methods used to portray the data, the general goal and the particular objectives, the context of use and so on. One of these entries is particularly interesting from a semiological viewpoint, that is, the way the data is represented on the map and therefore the way one map could be 'different' from another one.

Bin Jiang

I would compare how the ht-index varies from one map to another. The ht-index captures the complexity of maps.

Jiang B. and Yin J. (2014), Ht-index for quantifying the fractal or scaling structure of geographic features, Annals of the Association of American Geographers, 104(3), 530–541.

Jiang B. and Miao Y. (2015), The evolution of natural cities from the perspective of location-based social media, The Professional Geographer, 67(2), 295 - 306.

https://www.researchgate.net/publication/236627484_Ht-Index_for_Quantifying_the_Fractal_or_Scaling_Structure_of_Geographic_Features?ev=prf_pub

https://www.researchgate.net/publication/259914895_The_Evolution_of_Natural_Cities_from_the_Perspective_of_Location-Based_Social_Media?ev=prf_pub

Jose Lopez Collado

The Map Comparison Kit is free software that addresses the problem of comparing maps. It offers different indices according to the type of spatial information and how the maps are defined. Link:

http://mck.riks.nl/

D. V. Politikos

You can check the paper of Syrjala

A Statistical Test for a Difference between the Spatial Distributions of Two Populations, ECOLOGY 77(1):75 · DECEMBER 1995. An example is included in the paper. You can apply it easily with R.

Ansar Khan

You can also use the SRS method.

Nuno Guiomar

Have you tested the geographically weighted regression? You can use ArcGIS or SAM (https://www.ecoevol.ufg.br/sam/).

Best wishes

Nuno

Nuno Guiomar

See also http://www.exprodat.com/blogs/using-the-arcgis-correlation-coefficient-4/

Sachin Kishor Patil

@Anteneh Zewdie Abiy - Is this kind of analysis possible in ArcGIS, or must we use a tool like R or MATLAB?

Tom Jonesman

The 2015 paper "A method for analysing replicated point patterns in ecology" (Methods in Ecology and Evolution 2015, 6, 482–490) may be of some help.

Donald Myers

The first question is why do you want to compare the two maps, i.e. what will you use the comparison for? What is being "mapped" in each map? If your situation is as described by Correia, which may be the simplest case, there is still the question of what the maps are being used for and why you want to compare them.

E.g., do you want to decide which is the better "quality" map, or which has the most features?

Baoxiang Pan

To find the correlated pattern between two fields, I would recommend the following dimension reduction methods:

1. Combine the two fields and do Principal Component Analysis (PCA) on the result.

2. Do PCA on one field and draw correlation maps of the leading EOFs with the second field.

3. Canonical Correlation Analysis (CCA), which seeks linear combinations of each field that have maximum correlation with each other.

4. Singular Value Decomposition (SVD), which is an extension of PCA.

Also, if you are interested in the deep learning trend, Convolution Neural Network might help for certain cases.

Marek Sobczak

If you want to compare many maps containing data series (statistics, historical data, etc.), look at Kumbi. This algorithm sorts a series of data according to a given set of criteria. If the set of criteria is equal to one of the data series, the remaining series will be sorted according to their degree of similarity to this series.

The image of statistics compared in this way (for Poland) is presented in the attached kumbi_examp01.jpg.

Presentation and a simple demo are available here: http://kumbi.co > Applications > Maps

Shahryar Khalique Ahmad

A one-liner would be to use pkstat from pktools (http://manpages.ubuntu.com/manpages/bionic/en/man1/pkstat.1.html) that offers tons of statistics

Joseph Basconcillo

I was reading through this thread and I wonder if it is possible to compare two historical hurricane track maps (in raster format) using the Jaccard Coefficient (considering it is a measure of similarity, as discussed by Gianmarco Alberti) and Change Vector Analysis (usually applied for land cover change detection).

Hayder Dibs

You simply need to do curve fitting to find the best-fitting curve and the correlation.

Eugene Eremchenko

Plot is a solution

Sofia Bajocco

I can now tell you what I did in my cases. In the first case, I had to compare a remotely-sensed fuel map (map1) with a climatic map (map2); what I did was build a contingency table with the number of wildfires falling in each combination of categories (categories from map1 vs categories from map2) and then test the degree of association through a permutational chi-square test in order to see if the association was statistically significant. For further details and explanation, here is the corresponding paper: Bajocco, S., Dragozi, E., Gitas, I., Smiraglia, D., Salvati, L., Ricotta, C. Mapping Forest Fuels through Vegetation Phenology: The Role of Coarse-Resolution Satellite Time-Series (2015) PLoS ONE 10(3): e0119811. doi:10.1371/journal.pone.0119811.
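For readers who want to try this, a permutational chi-square test can be sketched as follows. This is a generic illustration and not the exact procedure of the paper: it assumes each event (e.g. each wildfire) carries one integer category code from each map, and it shuffles the codes of one map to build the null distribution. All names and the toy data are hypothetical.

```python
import numpy as np

def crosstab(c1, c2, n1, n2):
    """Contingency table of two integer-coded categorical vectors."""
    table = np.zeros((n1, n2), dtype=int)
    np.add.at(table, (np.asarray(c1), np.asarray(c2)), 1)
    return table

def chi_square_stat(table):
    table = np.asarray(table, dtype=float)
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / table.sum()
    keep = expected > 0   # skip cells belonging to empty rows or columns
    return float((((table - expected) ** 2)[keep] / expected[keep]).sum())

def permutation_chi_square(c1, c2, n1, n2, n_perm=999, seed=0):
    """Return (observed chi-square, permutation p-value)."""
    rng = np.random.default_rng(seed)
    observed = chi_square_stat(crosstab(c1, c2, n1, n2))
    hits = sum(
        chi_square_stat(crosstab(c1, rng.permutation(c2), n1, n2)) >= observed
        for _ in range(n_perm)
    )
    return observed, (hits + 1) / (n_perm + 1)

# Toy data: 60 events whose categories on the two maps coincide perfectly
cat_map1 = np.repeat([0, 1, 2], 20)
cat_map2 = cat_map1.copy()
stat, p_value = permutation_chi_square(cat_map1, cat_map2, 3, 3)
```

Shuffling one label vector keeps both sets of marginal totals fixed, so only the association between the two maps is being tested.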

Another time, I performed a correspondence analysis (CA) between the map1 and a fire hotspots map that I derived (map3). Correspondence analysis is used to characterize the relationships between two nominal variables; in our study, categories from map1 vs categories from map3. This is the related paper: Bajocco, S., Koutsias, N., Ricotta, C., Linking fire ignitions hotspots and fuel phenology: The importance of being seasonal (2017) Ecological Indicators, 82, pp. 433-440. Maybe also the selectivity analysis we performed in the same paper could be considered to this aim.

Luca Santoro

Are both maps built on the same Earth model, or do you assume that the mathematical model is the same? I think it is necessary to focus on the geodesy of both maps before any statistical analysis in order to eliminate all possible systematic errors.

All best,

Luca

Hayder Dibs

An easy way is to create a table, put your points and their values from the two maps in two fields, and then do a regression.

Eugene Eremchenko

You can statistically compare only scalar parameters, so you should identify the parameter(s) to compare. Area of polygons? Length of lines? Share of identical objects in the layer? The answer will depend on the selection.

Sreedhar Mahendrakar

The two maps should be in the same coordinate system (horizontal and vertical) and at the same scale for comparison. The comparison can be made in three ways: 1) calculating the positional differences between the two maps (RMSE x, y) at well-defined features; 2) comparing the information content of the two maps (how many layers, etc.); 3) calculating areas for polygon features and linear distances between well-defined points.
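The positional comparison in point 1 boils down to a short calculation once the same well-defined features have been digitized on both maps. A minimal sketch, assuming matched (x, y) coordinate pairs in the same coordinate system and units; the coordinates below are made up.

```python
import numpy as np

def positional_rmse(points_map1, points_map2):
    """RMSE in x, in y, and combined, for matched check points."""
    diff = np.asarray(points_map1, dtype=float) - np.asarray(points_map2, dtype=float)
    rmse_x = np.sqrt(np.mean(diff[:, 0] ** 2))
    rmse_y = np.sqrt(np.mean(diff[:, 1] ** 2))
    rmse_xy = np.sqrt(rmse_x ** 2 + rmse_y ** 2)   # horizontal (radial) RMSE
    return rmse_x, rmse_y, rmse_xy

# Four check points; the second map is shifted by 3 units in x and 4 in y
points_a = np.array([[0.0, 0.0], [10.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
points_b = points_a + np.array([3.0, 4.0])
rmse_x, rmse_y, rmse_xy = positional_rmse(points_a, points_b)
```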

Bawe Gerard Nfor, Jr

As indicated by Anteneh Zewdie Abiy, use Cross Tabulate Tool. https://gis.stackexchange.com/questions/45020/land-use-land-cover-change-post-classification-in-arcgis-for-cross-tabulation

Krzysztof Będkowski

I suggest rasterising the maps and then using Kappa statistics (Kappa Index of Agreement).
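A minimal sketch of the Kappa Index of Agreement (Cohen's kappa) for two rasterised maps, assuming both are integer-coded with the same classes and the same grid. The function name and toy maps are illustrative; the index is undefined when both maps consist of a single identical class, a case not handled here.

```python
import numpy as np

def kappa_index(map1, map2, n_classes):
    """Cohen's kappa between two categorical rasters of equal shape."""
    m1 = np.asarray(map1).ravel()
    m2 = np.asarray(map2).ravel()
    table = np.zeros((n_classes, n_classes))
    np.add.at(table, (m1, m2), 1)                    # cross-tabulate the cells
    p = table / table.sum()
    observed = np.trace(p)                           # proportion of agreeing cells
    expected = np.sum(p.sum(axis=1) * p.sum(axis=0)) # agreement expected by chance
    return (observed - expected) / (1 - expected)

map_a = np.array([[0, 0, 0, 1], [1, 1, 1, 0]])
map_b = np.array([[0, 0, 1, 1], [1, 1, 0, 0]])
kappa_ab = kappa_index(map_a, map_b, n_classes=2)
kappa_aa = kappa_index(map_a, map_a, n_classes=2)
```

Here 6 of 8 cells agree (0.75) against 0.5 expected by chance, so kappa is 0.5; identical maps give 1.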

Donald Myers

The first question is "what kind of maps?", i.e. are these contour maps, road maps, oceanographic charts, aeronautical charts, soil type maps or what. It appears that each respondent has some particular kind of maps in mind but doesn't say what that is. The original poser also does not really say anything about what the maps are, i.e. what kind of information is presented on the maps. The original poser also refers to "spatial correlation" but this could pertain to correlation "within" a map as opposed to correlation "between" the two maps. This reference also suggests that the maps plot numerical values but are those pixel values (if so, what size?), values at the nodes of a grid? Before rushing in and suggesting particular algorithms or software it is necessary to ask fundamental questions first. Remember that correlation has both a theoretical meaning and an empirical meaning.

Hussein El Hage Hassan

Before comparing, it is necessary to use a classification method that allows comparisons between the series.

Temitope E. Idowu

Following...

Srikanta Sannigrahi

If the data is in pixel-level, then do pixel-wise correlation and regression between the control and response variables using R software.

https://www.hakimabdi.com/blog/test-pixelwise-correlation-between-two-time-series-of-raster-data-in-r

https://gis.stackexchange.com/questions/278979/linear-regression-between-every-3%C3%973-pixels-between-two-rasters-using-r

Hayder Dibs

It is easy: you just need to perform change detection with any available algorithm.

Aysar Jameel Abdalkadhum

Use a classification method that allows comparisons between them. Comparing the information content in two maps.

Hussein El Hage Hassan

Some methods of discretization allow comparison between series.

Ismail Mondal

You can use ERDAS IMAGINE software for accuracy assessment to compare the two image results.

Mônica Larissa Aires de Macedo

Following...

Aysar Jameel Abdalkadhum

Following

Arthur Telles Calegario

I compared two categorical rasters by applying kernel density to the differences between them. I transformed them to numerical values before doing so.

The results showed a tendency to overestimate in one part of the study area and underestimate in another.

Atef A. E. Amriche

I would look into the features from each map (possibly after converting to points or polygons).

I would start with a univariate K-function to identify the spatial distribution of each feature of interest (for each map separately). Then I would do a bivariate K-function analysis to study the spatial relationship between features from different maps. This will allow you to identify attraction, repulsion, or complete randomness of one group (from one map) with respect to the second group (from the second map).

Aristides (Aris) Moustakas

Hi Sofia. There are of course several ways to do it, and ultimately the method that you choose depends on what you are trying to answer.

A suggestion is to use spatial cross-correlation. The method can quantify differences between spatial locations or properties and also quantify the difference across scales, i.e. how this changes when the distance increases.

Assuming that you have x1, y1, z1 and x1, y1, z2, where z is the value on the map and x, y the locations, you could do that in R using the ncf package, among others.

The package is here

https://cran.r-project.org/web/packages/ncf/ncf.pdf

An application addressing the question of whether invasive species are more common inside or outside protected areas (2 maps, same x y, but z1 is protected surface area and z2 is alien species richness, calculated across distances...) is here:

Article Sampling alien species inside and outside protected areas: D...

If your 2 maps are identical then spatial cross correlation will be 1 at scale unit distance 1. If the one map is a mirror of the other then spatial cross correlation will be -1 at distance unit 1
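The behaviour described above can be checked with a small Python sketch of a cross-correlogram. This is not the ncf implementation, just an illustration under stated assumptions: values are standardized, the correlation at a given distance class is the mean product of z1 at one location and z2 at another, and "lag 0" (each location paired with itself) stands in for the smallest distance. "Mirror" is taken to mean z2 = -z1. Function name, bin edges, and data are made up.

```python
import numpy as np

def cross_correlogram(x, y, z1, z2, bin_edges):
    """Return (lag-0 cross-correlation, list of values per distance bin)."""
    coords = np.column_stack([x, y]).astype(float)
    z1 = np.asarray(z1, dtype=float)
    z2 = np.asarray(z2, dtype=float)
    s1 = (z1 - z1.mean()) / z1.std()
    s2 = (z2 - z2.mean()) / z2.std()
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    products = np.outer(s1, s2)                 # s1 at location i times s2 at location j
    lag0 = float(np.mean(np.diag(products)))    # same-location pairs only
    off_diagonal = ~np.eye(len(s1), dtype=bool)
    values = []
    for low, high in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = off_diagonal & (dist > low) & (dist <= high)
        values.append(float(products[in_bin].mean()) if in_bin.any() else float("nan"))
    return lag0, values

# Toy data on a 5x5 grid of locations
gx, gy = np.meshgrid(np.arange(5), np.arange(5))
x, y = gx.ravel(), gy.ravel()
rng = np.random.default_rng(1)
z = x + rng.normal(0, 0.5, size=x.size)

lag0_identical, by_distance = cross_correlogram(x, y, z, z, bin_edges=[0, 1.5, 3, 6])
lag0_mirror, _ = cross_correlogram(x, y, z, -z, bin_edges=[0, 1.5, 3, 6])
```

The lag-0 value is the ordinary Pearson correlation between the two maps, while the per-bin values show how the association decays with distance.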

Best regards,

Aris
