Is it problematic to use a covariate derived from the dependent variable in linear regression?

Ivan Avila Caceres

We call this type of data an ARIMA chart: the input depends on the output. You could use more tools to study it. In essence, what matters is that each independent variable is independent of the other independent variables, not necessarily of Y. You can use ACF charts to examine the autoregressive structure of the data. Good day.
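As a rough illustration of the ACF check mentioned above, the sketch below computes the sample autocorrelation of a simulated series in plain Python. The AR(1) series, its coefficient `phi`, and the helper name `acf` are assumptions made for this example; they are not taken from the question's data.

```python
import random

def acf(series, max_lag):
    """Sample autocorrelation of `series` at lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - lag] for t in range(lag, n)) / denom
        for lag in range(max_lag + 1)
    ]

# Hypothetical autoregressive series: each value depends on the previous one.
random.seed(0)
phi = 0.8  # assumed AR(1) coefficient, chosen only for illustration
y = [0.0]
for _ in range(1999):
    y.append(phi * y[-1] + random.gauss(0.0, 1.0))

acf_values = acf(y, max_lag=5)
for lag, value in enumerate(acf_values):
    print(f"lag {lag}: {value:.3f}")
```

Autocorrelations that decay slowly with the lag suggest that the series carries autoregressive structure, which is the situation this answer describes.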

James E. McLean

To answer your question, you must understand what a covariate does and does not do in a regression (or ANOVA). A covariate increases the precision of an analysis by reducing the unexplained within-group variation. What you are suggesting may, indeed, decrease the variation across the independent variables, since it is drawn from the independent variable. This could counterbalance any benefit derived from using a covariate. It is also important to know that a covariate will not correct appropriately for across-group differences. The only way to know the answer to your question for sure is to run it both ways and compare the effect size of the independent variable with and without the covariate (R squared with and without the covariate). I am not sure this paper would answer all of your questions, but you could learn more about covariates by reading "The Care and Feeding of ANCOVA," a paper of mine available on ResearchGate. Good luck!
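The "run it both ways" comparison can be sketched as follows. This is a minimal plain-Python example on synthetic data; the variables `x`, `c`, `y`, the coefficients used to generate them, and the helpers `solve` and `r_squared` are all assumptions for illustration, not the data or code from the question.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def r_squared(columns, y):
    """Fit y on an intercept plus the given predictor columns by OLS; return R squared."""
    rows = [[1.0, *vals] for vals in zip(*columns)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Synthetic data (assumed): x is the independent variable, c the covariate.
random.seed(1)
n = 200
x = [random.gauss(0.0, 1.0) for _ in range(n)]
c = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [2.0 + 0.5 * xi + 0.3 * ci + random.gauss(0.0, 1.0) for xi, ci in zip(x, c)]

r2_without = r_squared([x], y)    # model without the covariate
r2_with = r_squared([x, c], y)    # model with the covariate
print(f"R squared without covariate: {r2_without:.3f}")
print(f"R squared with covariate:    {r2_with:.3f}")
```

Because the two models are nested, R squared cannot decrease when the covariate is added, so the size of the gain (and the change in the coefficient of the independent variable) is what deserves attention.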

Raghavendra Rao Chillarige

Let me state your problem as follows.

X: independent variable; Y: dependent variable

Y: Nighttime lights raster

X: Population raster

Based on the data (xi, yi) for i=1,…,N, you wish to fit the model Y=a+bX+e

Let this data relate to Location 1 (the one of interest to you), say D1={(xi,yi)|i=1,2,…}

Suppose you have a scenario with another location (Source), and let the data set from that location be D2={(uj,vj)|j=1,…}

U: Nighttime lights raster at other location (Source)

V: Population raster at other location (Source)

Let D3 be historical data for Location 1.

Let D4 be historical data for Location 2 (the source location).

%%%%%%%%%%%%%%%%%%%%%%%%

Let Z2 be simply a function of Y, say Z2=g(Y)

One should not attempt to regress Y on Z2

%%%%%%%%%%%%%%%%%%%%%%%%%%
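The warning against regressing Y on Z2=g(Y) can be illustrated with a small sketch. The data are synthetic, and the choice g(Y)=2Y+5 and the helper `r_squared_simple` are assumptions made only for this example.

```python
import random

def r_squared_simple(x, y):
    """R squared of the simple regression of y on x (squared Pearson correlation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

random.seed(2)
x = [random.uniform(0.0, 10.0) for _ in range(300)]       # e.g. population
y = [1.0 + 0.4 * xi + random.gauss(0.0, 2.0) for xi in x]  # e.g. nighttime lights
z2 = [2.0 * yi + 5.0 for yi in y]                          # hypothetical g(Y)

r2_x = r_squared_simple(x, y)    # Y on X: a genuine predictor
r2_z2 = r_squared_simple(z2, y)  # Y on Z2 = g(Y): circular
print(f"R squared, Y on X:  {r2_x:.3f}")
print(f"R squared, Y on Z2: {r2_z2:.3f}")
```

The near-perfect fit for Z2 says nothing about the relationship of interest: Z2 already contains Y, so the regression only recovers the function g.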

If Z2 is a forecast of Y, then one can try regressing Y on Z2

Model 1: Y=a+bX+e

Model 2: Y=a1+b1*Z1+e, where Z1 is a derived variable

Case i: Z1=b0+b1*X+b2*U+b3*V+e, a linear model for Y built on the historical data D3 and D4.

Case ii: Z1=b0+b1*X+e, a linear model for Y built on D3 alone

Case iii: Z1 = forecast of Y based on D1 (autoregressive, with covariates X, U and V at an apt lag)

Model 3: Y=b0+b1*X+b2*U+b3*V+b4*Z1+e
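A minimal sketch of Case ii feeding into Model 2, in plain Python on synthetic data. The "historical" and "current" samples, their generating coefficients, and the helpers `fit_line` and `r_squared_simple` are assumptions for illustration only.

```python
import random

def fit_line(x, y):
    """Least-squares intercept and slope for the simple regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - slope * mx, slope

def r_squared_simple(x, y):
    """R squared of the simple regression of y on x (squared Pearson correlation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

random.seed(3)
# D3 (assumed): historical data for Location 1.
x_hist = [random.uniform(0.0, 10.0) for _ in range(200)]
y_hist = [1.0 + 0.4 * xi + random.gauss(0.0, 1.0) for xi in x_hist]
# D1 (assumed): current data for Location 1.
x_now = [random.uniform(0.0, 10.0) for _ in range(200)]
y_now = [1.2 + 0.45 * xi + random.gauss(0.0, 1.0) for xi in x_now]

# Case ii: build Z1 from a model fitted on the historical data only.
b0, b1 = fit_line(x_hist, y_hist)
z1 = [b0 + b1 * xi for xi in x_now]

r2_model1 = r_squared_simple(x_now, y_now)  # Model 1: Y on X
r2_model2 = r_squared_simple(z1, y_now)     # Model 2: Y on Z1
print(f"Model 1 R squared: {r2_model1:.6f}")
print(f"Model 2 R squared: {r2_model2:.6f}")
```

In this particular sketch Z1 is an affine function of X, so Model 2 reproduces the fit of Model 1. Z1 can only add information when it draws on more than the current X, as in Cases i and iii. The key point is that Z1 is built from data other than the current Y, which is what separates it from Z2=g(Y).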