How do I interpret this scatterplot?

Sal Mangiafico

Do you mean to say that your _dependent_ variable is binary? If so, what do you hope to determine by knowing if your data are homoscedastic or not?

David Eugene Booth

If we're going to do regression, then let's do it right. This is a logistic regression with a binary DV. I don't know what package you use, but download Jared Lander's R for Everyone from the z-library and you will find a logistic regression program that will do it for you. Try it. You'll like it, and you'll get sensible results. An introductory example can be found in Frank Harrell's book Regression Modeling Strategies, also available in the z-library. Best wishes, David Booth
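For readers who want to see what a logistic regression with a binary DV actually does, here is a minimal sketch in plain Python rather than the R tools mentioned above. Everything in it is an assumption made for illustration: the data are simulated, the single predictor `x`, the "true" coefficients, and the helper `fit_logistic` are hypothetical, and the fit uses a basic Newton-Raphson update rather than any particular package's routine.

```python
import math
import random


def sigmoid(z: float) -> float:
    """Inverse logit: maps a linear predictor to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, n_iter=25):
    """Fit logit(p) = b0 + b1 * x by Newton-Raphson (illustrative helper)."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)  # Bernoulli variance at this fitted probability
            # Gradient of the log-likelihood
            g0 += y - p
            g1 += (y - p) * x
            # Entries of the 2x2 information matrix X'WX
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + (X'WX)^{-1} * gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Simulated data (assumed values, not the original poster's data)
rng = random.Random(42)
TRUE_B0, TRUE_B1 = -0.5, 1.2
xs = [rng.uniform(-3.0, 3.0) for _ in range(2000)]
ys = [1 if rng.random() < sigmoid(TRUE_B0 + TRUE_B1 * x) else 0 for x in xs]

b0_hat, b1_hat = fit_logistic(xs, ys)
print(f"estimated intercept = {b0_hat:.3f}, estimated slope = {b1_hat:.3f}")
```

The estimates should land near the values used to simulate the data. In practice you would use an established routine in your statistics package, which also reports standard errors and fit diagnostics.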

Kelvyn Jones

There are quite a few technical things you need to know.

You are in essence trying to model a latent (that is, unmeasured) dependent variable, the probability of saying 'yes', when what you have is an observed binary variable coded 1 for Yes and 0 for No. You are not going to get much insight from plotting the observed data.

As others have said, you need a generalized linear model such as a logit or probit that more properly models the underlying dependent variable, which is constrained to lie between 0 and 1. The random (or error) term of such a model is usually not assumed to follow a Normal distribution; in the binary case it is typically assumed to follow a Bernoulli distribution. This essentially has heteroscedasticity built into it, so that the residual variance depends on the mean predicted probability: it is greatest when the predicted response on the probability scale is around 0.5 and smallest as the response approaches 0 or 1, where there is literally less room to vary. In practice the logit-Bernoulli model takes care of this essential heteroscedasticity.
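To make the built-in heteroscedasticity concrete: a Bernoulli outcome with success probability p has variance p(1 - p). The short sketch below just tabulates that formula over a few probabilities chosen for illustration; the function name `bernoulli_variance` is made up for this example.

```python
def bernoulli_variance(p: float) -> float:
    """Variance of a Bernoulli(p) outcome, which is p * (1 - p)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return p * (1.0 - p)


# Illustrative probabilities, not taken from any data set
probabilities = [0.01, 0.1, 0.25, 0.5, 0.75, 0.9, 0.99]
variances = {p: bernoulli_variance(p) for p in probabilities}

for p, v in variances.items():
    print(f"p = {p:4.2f}  variance = {v:.4f}")
```

The variance peaks at p = 0.5 and shrinks toward zero at either end, so it cannot be constant unless the mean is constant too.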

It is also possible to have additional variability, which is called over-dispersion (that is, overdispersion relative to the standard model). For example, you may have left out an important predictor variable, and this is being picked up in the random term. The binary model, however, does not have sufficient information to model this overdispersion. You can do so if you have data in the form of the proportion saying Yes, to which you can fit a logit-binomial model with a potentially overdispersed random term. See Article Redundant Overdispersion Parameters in Multilevel Models for...

If you want some free learning materials on the logit model that go into the practical application of such technical issues, see http://www.bristol.ac.uk/cmm/software/mlwin/mlwin-resources.html#discrete

Jochen Wilhelm

I also don't understand your approach and what the aim of this analysis should be.

The binary DV cannot be homoscedastic, as its variance must depend on the mean. Hence, checking whether the variance in the two groups is similar is a complicated way to check whether the means are similar.

The question for a binary DV could be whether it makes sense to assume that each observation is the result of a Bernoulli experiment with the same success probability (p). If this probability differs between observations, then you will get a larger variance than expected under the binomial model (this is called over-dispersion). In that case you should use a beta-binomial or a quasi-binomial model, which can handle over-dispersion.
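A small simulation can show what this looks like with grouped counts (successes out of n trials per group). This is only a sketch under assumed settings: the group size, the number of groups, and the Beta(2, 2) distribution for the varying probabilities are arbitrary choices, and the helpers `simulate_counts` and `dispersion_ratio` are made up for this example. It shows the symptom of over-dispersion; it does not fit a beta-binomial or quasi-binomial model.

```python
import random
from statistics import mean, pvariance


def simulate_counts(n_groups, n_trials, draw_p, rng):
    """Simulate one success count per group; draw_p() gives that group's p."""
    counts = []
    for _ in range(n_groups):
        p = draw_p()
        counts.append(sum(rng.random() < p for _ in range(n_trials)))
    return counts


def dispersion_ratio(counts, n_trials):
    """Observed variance of the counts divided by the binomial variance."""
    p_hat = mean(counts) / n_trials
    binomial_variance = n_trials * p_hat * (1.0 - p_hat)
    return pvariance(counts) / binomial_variance


rng = random.Random(1)
N_GROUPS, N_TRIALS = 5000, 20  # assumed sizes for illustration

# Case 1: every group shares the same success probability
same_p = simulate_counts(N_GROUPS, N_TRIALS, lambda: 0.5, rng)

# Case 2: each group has its own probability drawn from Beta(2, 2)
varying_p = simulate_counts(
    N_GROUPS, N_TRIALS, lambda: rng.betavariate(2.0, 2.0), rng
)

ratio_same = dispersion_ratio(same_p, N_TRIALS)
ratio_varying = dispersion_ratio(varying_p, N_TRIALS)
print(f"dispersion ratio, constant p: {ratio_same:.2f}")
print(f"dispersion ratio, varying p:  {ratio_varying:.2f}")
```

A ratio close to 1 is what the binomial model predicts; a ratio well above 1 signals over-dispersion. Note that this check needs grouped counts: with a single 0/1 observation per unit, the variance is fully determined by the mean, so extra variability cannot be detected.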

Babak Jamshidi

As Dr Jochen Wilhelm said, the distribution of the non-standard residuals (or error term) is not uniform, so you need a function, relation, or transformation to describe it. It seems that the variance of the error term is the same for the two groups.

Bruce Weaver

The figure attached to the original question suggests that one variable is Customer Loyalty Behaviour (0=No, 1=Yes). Was this measured as a dichotomy originally? Or do you have some kind of scale that was coarsened into a dichotomy? And what is the other variable, that you presumably treated as the explanatory variable in your regression model? (Or variables, if there were 2 or more explanatory variables.) It would be helpful, I think, if you provided some basic descriptive stats for all variables in your model. It would also help if you spelled out your research question. HTH.

Teshita Uke Chikako

A scatter plot (aka scatter chart, scatter graph) uses dots to represent values for two different numeric variables. The position of each dot on the horizontal and vertical axes indicates the values for an individual data point. Scatter plots are used to observe relationships between variables. If the dots fall on or close to a straight line, we can say there is a linear relationship between the two variables.