What should I do with these two outliers, whose causes I know?

Jochen Wilhelm

I wonder if "per capita" is the right unit if you have neighbourhoods of different kinds that have systematically different "resident densities". In this case, the population size in the neighbourhood is not a good standard. Adjusting the model for the "functionality" of the neighbourhood would be one possible solution (I presume), but you say you don't have the information. Maybe you can make a sensible classification into different "functionalities" and use this as a factor in your model.
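A minimal sketch of the "functionality as a factor" idea, in Python. The zone names, class labels, and counts below are invented for illustration and are not data from this thread. It relies on one fact: in a model whose only predictor is a categorical factor, the fitted value for each level is the mean of that level, so the per-class means show what such a factor would capture.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical zones: (name, functionality class, incidents, residents).
# All values are made up for illustration.
zones = [
    ("Z1", "residential", 30, 6000),
    ("Z2", "residential", 45, 9000),
    ("Z3", "commercial", 80, 800),
    ("Z4", "commercial", 60, 500),
    ("Z5", "mixed", 50, 4000),
    ("Z6", "mixed", 36, 3000),
]


def rate_per_1000(incidents, residents):
    """Incidents per 1000 residents."""
    return 1000 * incidents / residents


# Group the per-capita rates by functionality class.
rates_by_class = defaultdict(list)
for _name, func_class, incidents, residents in zones:
    rates_by_class[func_class].append(rate_per_1000(incidents, residents))

# With the class as the only predictor, these means are the fitted values.
class_means = {c: mean(r) for c, r in rates_by_class.items()}

for func_class, class_mean in class_means.items():
    print(f"{func_class}: {class_mean:.2f} incidents per 1000 residents")
```

If the class means differ systematically, as they do in this made-up data, a single per-capita scale is mixing different kinds of neighbourhood. How to define the classes remains a domain decision.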

Patrik Silva

Thank you Jochen Wilhelm!

Could you please give me some further advice or suggestions on how to make this sensible classification into different functionalities?

Thank you in advance!

That's domain knowledge. I can't really help you there. You should be the expert there...

Ok, thank you! I will look forward to some solutions!

I am thinking of using the volume of built-up area, instead of population, to normalize my variable. I think it may solve the problem of ratio inflation.

But I need to look at the literature to see whether it makes sense.

Joseph L Alvarez

You state that you are looking at criminal incidents to create a per capita ratio.

What type of incidents are you considering? There are crimes of opportunity and crimes of passion. There are crimes against businesses and crimes against persons. There are victimless crimes. You should limit the types of crimes to ensure that there is no confounding by the mixture of crimes. Socio-economic factors affect the types and frequency of crimes. This could be a further source of confounding.

The proportions of permanent and transient residents differ between locations. Permanent and transient residents may be involved in different kinds of crimes, particularly crimes of opportunity. There are many possible sources of confounding. Ratios can be tricky when care is not taken to normalize groups: they can inflate or deflate when the groups are not normalized.

Thank you, dear Joseph L. Alvarez,

I totally agree with you; indeed, I have separated the types of crime.

I just got some data on the number of jobs in each zone. I am thinking of adding it to the population of each neighborhood and then normalizing the crimes (per type) using this variable (population + jobs) as the denominator. I would like to hear your opinion on this procedure.
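To make the proposed denominator concrete, here is a small Python sketch that compares the plain per-resident rate with a rate per (residents + jobs). The two zones and all their numbers are hypothetical, chosen only to show how the two ratios behave. It is not a recommendation of either denominator.

```python
# Hypothetical zones: (name, incidents, residents, jobs).
# All values are made up for illustration.
zones = [
    ("residential zone", 40, 8000, 500),
    ("business district", 90, 600, 12000),
]


def rate_per_1000(incidents, denominator):
    """Incidents per 1000 units of the chosen denominator."""
    return 1000 * incidents / denominator


results = {}
for name, incidents, residents, jobs in zones:
    results[name] = {
        # Classic per-capita rate: residents only.
        "per_resident": rate_per_1000(incidents, residents),
        # Proposed alternative: residents plus jobs.
        "per_resident_or_job": rate_per_1000(incidents, residents + jobs),
    }

for name, rates in results.items():
    print(
        f"{name}: {rates['per_resident']:.1f} per 1000 residents, "
        f"{rates['per_resident_or_job']:.1f} per 1000 (residents + jobs)"
    )
```

In this made-up data the business district has few residents and many jobs, so its per-resident rate is far higher than its combined rate. The combined figure is no longer a rate "per person living there", which affects how it can be interpreted.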

Anyway, I need to normalize the crime incidents, because there is a strong correlation between the number of crimes and the population (and the size of the urban area as well) in the zones.
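The correlation argument can be checked with a few lines of Python. This sketch computes Pearson's r between population and raw counts, and then between population and the per-1000 rate. The five zones are invented so that counts grow roughly in proportion to population. Real data may behave differently.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical zones, made up for illustration.
population = [1000, 2000, 4000, 8000, 16000]
incidents = [12, 18, 44, 76, 168]

# Incidents per 1000 residents in each zone.
rates = [1000 * c / p for c, p in zip(incidents, population)]

r_counts = pearson_r(population, incidents)
r_rates = pearson_r(population, rates)

print(f"population vs. raw counts: r = {r_counts:.3f}")
print(f"population vs. rate per 1000: r = {r_rates:.3f}")
```

In this made-up data the raw counts track population almost perfectly, while the rate does not. That shows only the mechanics of normalizing, not that population is the right denominator for every kind of zone.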

I think this would further deteriorate the interpretability of the data and the usefulness of your conclusions. Just my non-expert opinion...

I think you should make it simpler. Pulling different measures together, using complicated ratios, etc. makes everything extremely complex and eventually hard to interpret. Sometimes different things are just not comparable and need distinct ways of analysis. You could focus on a better-defined subgroup, for instance.

Thank you, Jochen Wilhelm!

I am just starting out with statistics, so I may be complicating things even more. I will take all your comments and suggestions into account; they are very useful.