I want to prepare a manuscript about exploratory data analysis of psychological data using R. Do you have any recommendations about the topics and packages to include?
I'm considering the following packages at the moment: car, ggplot2, psych.
What kind of data do you have? Survey? Experimental? What kind of exploratory analysis do you want? Factor analysis? Structural equation modeling? Item response theory? Each of these analyses has its own R package... Please tell us more about your research. Thanks!
Hi, Danilo. I don't have any dataset in mind. I'm at the preliminary stage of preparing a manuscript aimed at promoting exploratory data analysis as a regular and necessary practice. Another goal is promoting the use of R among Chilean psychological researchers. Maybe I should focus on the packages that are useful for the analyses most often used in Chilean journals. Any general recommendations?
Ok, Juan... but it is necessary to know something about your data. In general, the "psych" package has various functions that can be used in exploratory analysis, and it is very easy to use. However, if your data are non-normal, I recommend robust statistical analysis using the WRS package (http://dornsife.usc.edu/labs/rwilcox/software/). You should buy one of Rand Wilcox's books to learn its functions, since help files are not provided (Introduction to Robust Estimation and Hypothesis Testing, 2012, Elsevier, or Modern Statistics for the Social and Behavioral Sciences, 2012, CRC Press).
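To illustrate why robust estimates matter with non-normal data, here is a minimal sketch using only base R. It does not use any WRS function; the trimmed mean and the MAD from the stats package stand in for the general idea, and the "reaction time" data are simulated and purely hypothetical.

```r
# Simulated right-skewed "reaction times" plus a few extreme values
# (hypothetical data, for illustration only).
set.seed(42)
rt <- c(rlnorm(95, meanlog = 6, sdlog = 0.3), 3000, 3500, 4000, 4200, 5000)

# Classical estimates: strongly affected by the extreme values.
classical <- c(location = mean(rt), spread = sd(rt))

# Robust alternatives available in base R.
robust <- c(location = mean(rt, trim = 0.2),  # 20% trimmed mean
            spread   = mad(rt))               # median absolute deviation (scaled)

# Compare the two sets of estimates side by side.
print(round(rbind(classical, robust), 1))
```

With a handful of extreme values, the mean and standard deviation are pulled upward, while the trimmed mean and MAD stay close to the bulk of the data.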
Without further knowledge about the data you are looking at or what you want to do with it, I would recommend the "psych" and "nFactors" packages, as well as "WRS" (recommended above). If you want to look at CTT (classical test theory), "psych" and "lavaan" will be very useful. Additionally, "ltm" is useful for fitting IRT models, if by exploratory you mean data reduction or item functioning. There are also tons of packages for other things...
Talking about exploratory data analysis (EDA), I think of "base R" and "ggplot2" (I admit that I also use the "stat.desc" function from the "pastecs" package).
Why am I talking about the "base" package? Because in my experience, psychologists approaching R tend to underestimate the power of these basic tools and start collecting an endless number of packages to solve basic issues.
For example, a frequent error is treating the data frame as the main (if not the only) data structure, without being aware of the power of the list, the true R workhorse. As a result, people overlook how useful it can be, for exploratory data analysis purposes, to read multiple data sets into a list of data frames for quick comparisons.
Related to the previous example is learning to use the *apply family effectively.
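A minimal sketch of that idea in base R follows. The three "studies" are simulated, and the helper make_study() and all object names are hypothetical; with real files the list would more likely be built with something like lapply(list.files(pattern = "\\.csv$"), read.csv).

```r
# Hypothetical data: three small "studies" sharing the same columns (simulated).
set.seed(1)
make_study <- function(n, shift) {
  data.frame(
    group = rep(c("control", "treatment"), each = n / 2),
    score = rnorm(n, mean = 50 + shift, sd = 10)
  )
}

# A named list of data frames: one element per data set.
studies <- list(
  study_a = make_study(40, 0),
  study_b = make_study(60, 5),
  study_c = make_study(30, -3)
)

# sapply() applies one function to every data frame and simplifies the result.
n_rows      <- sapply(studies, nrow)
mean_scores <- sapply(studies, function(d) mean(d$score))

# lapply() keeps a list; here, the group means within each study.
group_means <- lapply(studies, function(d) tapply(d$score, d$group, mean))

# Quick comparison table, one row per study.
overview <- data.frame(n = n_rows, mean_score = round(mean_scores, 2))
print(overview)
```

The same two or three lines work unchanged whether the list holds three data sets or thirty, which is what makes the list plus *apply combination handy for quick comparisons.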
Then, of course, there are domain-specific tools, such as the "tm" package for text mining. I am also a "WRS" user.
I believe that for psychologists approaching R, and EDA is a good example, the essential step is a change of perspective. After that, the "base" package, plus a good plotting tool and the gradual introduction of some "functional programming" elements (as well as some OOP), do the trick.
I always think that if I am not able to teach myself new problem-solving strategies, which is what learning some functional and OO programming amounts to for a non-CS person, then maybe my psychological skills are not that strong.
Thanks to all of you for your replies. My goal is to prepare a manuscript that encourages some "good practices" in statistical analysis (e.g. checking whether assumptions are met). All of your advice is very useful. May I ask for some additional guidance, for example on what would be a more effective approach to my goal with this paper?
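As one concrete illustration of such an assumption check, here is a minimal sketch in base R for a two-group comparison. The data are simulated and the object names are hypothetical. Bartlett's test is used only because it ships with base R; the car package offers leveneTest() as an alternative.

```r
# Hypothetical two-group data (simulated).
set.seed(123)
dat <- data.frame(
  group = factor(rep(c("a", "b"), each = 50)),
  score = c(rnorm(50, mean = 100, sd = 15), rnorm(50, mean = 105, sd = 15))
)

# Normality within each group (Shapiro-Wilk p-values);
# small p-values suggest a departure from normality.
normality <- sapply(split(dat$score, dat$group),
                    function(x) shapiro.test(x)$p.value)

# Homogeneity of variance across groups (Bartlett's test p-value).
homogeneity <- bartlett.test(score ~ group, data = dat)$p.value

print(round(c(normality, bartlett = homogeneity), 3))

# Visual check: one normal Q-Q plot per group.
op <- par(mfrow = c(1, 2))
for (g in levels(dat$group)) {
  x <- dat$score[dat$group == g]
  qqnorm(x, main = paste("Group", g))
  qqline(x)
}
par(op)
```

The plots are usually more informative than the p-values alone, since formal tests say little about how large a departure is or whether it matters for the planned analysis.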
I think EDA is closely related to statistical graphics and data visualization, so I suggest you check, for example, the GGobi software (available from R through the "rggobi" package). I would also encourage you to look at other free statistical software, such as ViSta. In the attached files you can see some examples of how to use ViSta for EDA and data visualization in psychology.