If my independent variable is categorical and my dependent variable is continuous, and I have to fit a regression model, what kind of regression analysis should I use? Can I still use linear regression?
Hi Ashish, you can fit an ANOVA model, and in practice an ANOVA is often fitted with a "linear regression" function, such as lm() in R. This compares the mean values of the levels of your categorical variable. With n samples you have n - 1 degrees of freedom in total, and a categorical predictor uses more of them than a continuous one: k - 1 for a variable with k levels (so 2 or more once there are at least three levels), compared with 1 for a continuous predictor.
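To make the link between ANOVA and linear regression concrete, here is a minimal sketch in plain Python (not R's lm(), just the same least-squares idea). The data, the level names and the helper solve() are made up for illustration. The categorical variable is turned into 0/1 dummy columns, one per non-reference level, and the fitted coefficients come out as the reference-level mean plus the differences in means.

```python
from statistics import mean

# Hypothetical data: weight gain (kg) by type of food eaten.
data = {
    "none": [0.1, -0.2, 0.0, 0.1],
    "apple": [0.4, 0.6, 0.5, 0.5],
    "doughnut": [1.2, 0.9, 1.1, 1.0],
}
levels = list(data)      # the first level ("none") is the reference
reference = levels[0]

# Design matrix: intercept column plus one 0/1 dummy per non-reference level,
# so a variable with k levels uses k - 1 columns (degrees of freedom).
X, y = [], []
for level, values in data.items():
    for v in values:
        X.append([1.0] + [1.0 if level == other else 0.0 for other in levels[1:]])
        y.append(v)

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

# Ordinary least squares via the normal equations (X'X) beta = X'y.
p = len(levels)
XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
coef = solve(XtX, Xty)

print("intercept (mean of reference level):", round(coef[0], 3))
for level, b in zip(levels[1:], coef[1:]):
    print(f"{level} vs {reference}:", round(b, 3))
```

With a single categorical predictor, each coefficient is simply the mean of that level minus the mean of the reference level, which is why the regression and the one-way ANOVA describe the same model.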
Richard Neil Belcher, thank you for the answer. I know that ANOVA and linear regression work in similar ways. However, will I be able to get correlation coefficients if I run an ANOVA?
Ashish Kumar, the output from an LM or ANOVA will most often be effect sizes, their errors, and significance estimates. You can also get an R² (the proportion of variance explained by your independent variable), which is the closest thing to a correlation coefficient (how correlated the two variables are). However, most people would be interested in the effect size and the error (and p-value) of each independent variable. The effect size of a continuous variable tells you the effect of one additional unit of the independent variable (for example, eating 1 kg of food) on the outcome variable (for example, a person's weight). If you had a categorical explanatory variable (for example, type of food eaten), the effect size would be the difference between one level (doughnut, which would be in your output) and whatever your reference level is. You could code the reference to be a control level, in this case maybe no food at all, and it would not show as a row in your results table.
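As a rough illustration of the R² mentioned above, the sketch below computes it by hand for a one-way layout, using made-up numbers. It relies on the usual decomposition of the total sum of squares into a between-group and a within-group part; the square root of R² is then the correlation between the observed values and the fitted group means.

```python
from math import sqrt
from statistics import mean

# Hypothetical data: weight gain (kg) by type of food eaten.
groups = {
    "none": [0.1, -0.2, 0.0, 0.1],
    "apple": [0.4, 0.6, 0.5, 0.5],
    "doughnut": [1.2, 0.9, 1.1, 1.0],
}

all_values = [v for values in groups.values() for v in values]
grand_mean = mean(all_values)
group_means = {level: mean(values) for level, values in groups.items()}

# Total variation, and the parts explained / not explained by the grouping.
ss_total = sum((v - grand_mean) ** 2 for v in all_values)
ss_between = sum(
    len(values) * (group_means[level] - grand_mean) ** 2
    for level, values in groups.items()
)
ss_within = sum(
    (v - group_means[level]) ** 2
    for level, values in groups.items()
    for v in values
)

r_squared = ss_between / ss_total   # proportion of variance explained
multiple_r = sqrt(r_squared)        # correlation of observed vs fitted values

print("R squared:", round(r_squared, 3))
print("multiple R:", round(multiple_r, 3))
```

Note that this is a single summary for the whole categorical variable; the per-level effect sizes are still the differences from the reference level.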
Anyway, you can find an introduction to these concepts in many places online. The Analysis Factor usually explains them in an understandable way. Here are two links to get you started
A boxplot of your dependent variable as a function of your independent categorical variable is a good descriptive starting point. It will give you insights about location and dispersion differences between groups and atypical values.
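If no plotting library is at hand, the numbers behind such a boxplot can be computed directly. This is only a sketch with invented data: box_summary() is a hypothetical helper, the quartile method ("inclusive") is one of several conventions, and the 1.5 × IQR rule for flagging atypical values is the common default rather than the only choice.

```python
from statistics import quantiles

def box_summary(values):
    """Quartiles, whisker fences and flagged values for one group."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {"q1": q1, "median": median, "q3": q3, "iqr": iqr, "outliers": outliers}

# Hypothetical continuous response, split by a categorical predictor.
groups = {
    "A": [1, 2, 3, 4, 5],
    "B": [1, 2, 3, 4, 5, 20],
}

for level, values in groups.items():
    print(level, box_summary(values))
```

Comparing the medians gives a first impression of location differences, the IQRs of dispersion differences, and the flagged values are the atypical observations worth checking.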
I agree with Richard Neil Belcher, but if the sample sizes for the different categories are very unequal, I suggest a permutation test using the one-way ANOVA F statistic.
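A permutation test of this kind can be sketched as follows. The data are invented and deliberately unbalanced, and the function names, the number of permutations and the seed are arbitrary choices for illustration. The group labels are shuffled many times, the F statistic is recomputed each time, and the p-value is the share of shuffles with an F at least as large as the observed one.

```python
import random
from statistics import mean

def f_statistic(groups):
    """One-way ANOVA F: between-group mean square over within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand = mean(all_values)
    means = [mean(g) for g in groups]
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def permutation_p_value(groups, n_permutations=2000, seed=0):
    """Return the observed F and its permutation p-value."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    pooled = [v for g in groups for v in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        # Re-split the shuffled values into groups of the original sizes.
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= observed:
            hits += 1
    # Add one to numerator and denominator so the p-value is never exactly 0.
    return observed, (hits + 1) / (n_permutations + 1)

# Hypothetical, strongly unbalanced groups (sizes 3, 5 and 12).
groups = [
    [1.0, 1.2, 0.9],
    [2.0, 2.1, 1.9, 2.2, 2.0],
    [3.1, 2.9, 3.0, 3.2, 2.8, 3.3, 3.0, 2.7, 3.1, 2.9, 3.4, 3.0],
]
observed_f, p_value = permutation_p_value(groups)
print("observed F:", round(observed_f, 2), "permutation p-value:", round(p_value, 4))
```

The test statistic is the same F as in the ordinary ANOVA; only the reference distribution changes, from the theoretical F distribution to the one generated by shuffling.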
My reading of your question is that the response is continuous and the predictors are categorical. In that context an important consideration is whether the categories are ordered (e.g. rare, common, ... very common) or not (Black, White, ... Asian), as that may affect how you code your variables. There are good discussions of this available, especially in relation to particular software; for example
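To show what the coding choice looks like in practice, here is a small software-neutral sketch. Both helper functions and the example levels are hypothetical. Unordered categories get 0/1 dummy columns against a reference level, while ordered categories can, as one option, be replaced by integer scores. The scoring assumes the levels are equally spaced, which is a modelling assumption and not something the data guarantee.

```python
def treatment_code(values, levels):
    """0/1 dummy columns for every level except the first (the reference)."""
    return [[1 if v == level else 0 for level in levels[1:]] for v in values]

def ordinal_scores(values, ordered_levels):
    """Integer scores 0, 1, 2, ... following the given order of the levels."""
    rank = {level: i for i, level in enumerate(ordered_levels)}
    return [rank[v] for v in values]

# Unordered predictor: one column per non-reference level.
food = ["apple", "none", "doughnut", "apple"]
print(treatment_code(food, levels=["none", "apple", "doughnut"]))

# Ordered predictor: a single numeric column, assuming equal spacing.
abundance = ["common", "rare", "very common", "common"]
print(ordinal_scores(abundance, ordered_levels=["rare", "common", "very common"]))
```

The dummy coding spends one degree of freedom per non-reference level and makes no assumption about order; the integer scores spend a single degree of freedom but impose a linear trend across the levels.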