The data contain 678 observations with one continuous dependent variable (Y, ranging from 15 to 85) and 12 independent variables (continuous, categorical, and ordinal). X1 (2 levels) and X6 (3 levels) are treated as categorical variables. Here are some questions that I have:

  • Can I assume that all the predictors (except X1 and X6, which are categorical) are linearly related to Y? In particular, what are your thoughts on the linearity of X2:X4 and X8:X11?
  • Can I treat X5 as a continuous variable, even though it is ordinal and ranges from 1 to 7?
  • Can I treat the year as a continuous variable, even though it is ordinal and ranges from 1999 to 2007? (The year itself does not improve the response; rather, other, unknown factors occurring in the same time period produce the improvements.) Does this approach seem logical?
  • What transformations are advisable for each independent variable to make its relationship with Y more linear?
  • Any feedback and insights would be highly appreciated. Thank you.
