There are many statistical programs on the market, enough that one has to decide which is the best fit for presenting and analyzing a given data set. Suppose we have data on the age, size, growth rate, vitality, and seed production of trees. What is the best statistical program for a multivariate analysis of these parameters?
In my humble opinion, R is the best statistical software and programming language for multivariate analysis. In R you can find packages such as "FactoMineR" and "vegan", along with "multcomp", which let you perform many different multivariate analyses: MANOVA, PCA, CCA, NMDS, MCA, and more.
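For instance, a minimal sketch of a PCA and an NMDS, using vegan's bundled dune vegetation data (this assumes both packages are installed; the data set is chosen purely for illustration):

    # PCA with FactoMineR and NMDS with vegan, on vegan's example data
    library(FactoMineR)
    library(vegan)
    data(dune)                          # 20 sites x 30 plant species

    pca_res <- PCA(dune, scale.unit = TRUE, graph = FALSE)  # principal components
    summary(pca_res)

    nmds_res <- metaMDS(dune, distance = "bray", k = 2, trace = FALSE)
    plot(nmds_res, type = "t")          # ordination plot with site/species labels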
Hi,
For multivariate analysis, SPSS is a very easy option. But if you want to write your own programs, then go with R. You can find R at the following link:
http://www.r-project.org/
Mohamed,
Have you considered using structural equation modeling (SEM) as a way to do linear regression with weights for your variables? Here is a link that might be useful.
http://statisticasoftware.wordpress.com/2012/09/04/structural-equation-modeling/
In that case, SEM can be performed with StatSoft's Statistica or with another multivariate package called EQS, available at this link:
http://www.mvsoft.com/
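If an open-source route is preferred, the same kind of model can also be fit in R with the lavaan package (my suggestion, not one of the tools linked above). A hedged sketch, with hypothetical variable names borrowed from the original question:

    # Hypothetical SEM: a latent "vigor" factor measured by three observed
    # tree variables, predicting seed production. "tree_data" is a placeholder
    # for your own data frame.
    library(lavaan)
    model <- '
      vigor =~ growth_rate + size + vitality
      seeds ~ vigor
    '
    # fit <- sem(model, data = tree_data)
    # summary(fit, standardized = TRUE)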
I think that you will get as many answers as there are programs. In part it depends on what program your coworkers use: can they help when you get stuck, and can you help when they get stuck? If you are all using different programs, then the answer is no. It also depends on what you have used in the past; it takes a fair bit of effort to learn a new program. It also depends on your skill and expected applications. If you have had an introductory statistics class and now want to do a multivariate nonlinear analysis of seasonal changes in allometric growth in an intertidal invertebrate ..... good luck.
I would say use R or SAS. Both can be used in a canned-package kind of way, but both can also be programmed, so I can write a program to do part of the calculation and use existing subroutines to take care of other parts. This gives great flexibility in the kinds of analyses you can do. Of course, ultimately the most flexible "stats" package is something like C++, or the ancient standby Fortran. If you are integrating life table analysis with multivariate regression, you might be forced down this path. I am not sure that I could do this sort of thing in SAS: calculating lx, mx, and r, and modeling how these change as I increase CO2 levels and manipulate fertilizer regimens in FACE plots.
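For what it's worth, the lx/mx piece on its own is small enough to sketch in R; the numbers below are invented purely for illustration:

    # Solve the Euler-Lotka equation sum(exp(-r*x) * lx * mx) = 1 for r
    x  <- 0:4                           # age classes
    lx <- c(1.0, 0.8, 0.6, 0.3, 0.1)    # survivorship to age x
    mx <- c(0.0, 0.5, 1.2, 1.0, 0.4)    # fecundity at age x
    euler_lotka <- function(r) sum(exp(-r * x) * lx * mx) - 1
    r_hat <- uniroot(euler_lotka, interval = c(-1, 2))$root
    r_hat                               # intrinsic rate of increase

It is the coupling of that estimate to a multivariate model of the CO2 and fertilizer treatments that pushes you toward real programming.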
If you are just doing simple field trials, then most of the statistical packages will be sufficient. Go with what others in your company are using. If you risk having non-textbook examples then you need to have a statistical package that has a large number of modules and allows you to write custom programs. That requirement significantly reduces your options.
Some less traditional options might be Matlab or Mathematica. Both should work, and both fall somewhere between programming everything yourself and working with a stats package.
Do not use JMP unless all your experiments are textbook examples.
The real answer is entirely dependent on your experience level and your application. You might get a more helpful answer by talking to people in your Statistics Department. You should also consider cost, and level of support offered. R is free and has a diffuse community of loyal followers. SAS is expensive but has a help desk that I have found very helpful in the few cases where my local help didn't know the answer. I worked with other packages, but I am very much out of date.
Dear Víctor Granda, Dr. Senthilvel Vasudevan, and Dr. Joan Vaccaro, many thanks for your positive response and valuable answers.
Dear Dr. Timothy A Ebert, thanks a lot for your detailed answer! I used to analyze my data using MiniTab and SPSS. The first is very simple, and the latter is sometimes more complicated even though I took a short statistics course on SPSS; that is what made me ask my question. I will try R, as recommended by many. I have heard that it is easy for most students. But is R the best? I will see.
If you are interested in just doing multivariate analysis such as PCA, PLS, and OPLS, then I would recommend Simca-P by Umetrics (Umea, Sweden). It also has lots of other attributes which are extremely useful (cross-validation, S-plots, SUS-plots, time series analysis, and Coomans plots, to name but a few). It is a dedicated multivariate software package and very easy to use: no programming needed, and it is extremely user friendly. One does not need any prior knowledge of specific statistics packages to undertake an analysis. I have found it extremely useful for students, as there is very little to learn before getting to grips with it. In addition, you can purchase a site/department license, which is relatively inexpensive, as is the software itself.
Hope this helps.
I would like to emphasize one thing Paul said. Excel can do some very strange things, especially if you have a large data set. Excel works most of the time, but I always check the result before making any decisions. I would guess that if you design the right problem, you can get any of the statistical packages to fail. The most common problem is in how a computer represents a number (a binary approximation to a decimal number) and the resulting accuracy. A simple test of accuracy is to add 0.001 to 1. You should get 1.001. Now add 0.0001 to 1. Keep going. Excel fails at 1 + 0.000000000000001. This scales, so I get the same problem if I add 10000 to 0.00000000001. Excel is OK out to 14 orders of magnitude; these two problems are at 15 orders of magnitude.
Another game is to add 0.05 to 0. Keep adding 0.05 until you get to 1, then take the log of that value. Log(1) should equal zero, but the binary representation of 0.05 is a repeating value, much like the decimal representation of 1/3, so Excel will not return a zero for this problem. These issues become magnified in the standard matrix manipulations necessary in multivariate analysis. It is simply harder, but not impossible, to get a good program to fail.
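These checks are easy to reproduce. A quick sketch in R (any language using IEEE 754 doubles behaves essentially the same way):

    1 + 1e-15 == 1          # FALSE: the increment is still representable
    1 + 1e-16 == 1          # TRUE: the increment is lost at ~16 digits

    s <- 0
    for (i in 1:20) s <- s + 0.05   # 0.05 has no exact binary form
    s == 1                  # typically FALSE: rounding error accumulates
    log(s)                  # a tiny nonzero number instead of 0
    print(s, digits = 17)   # shows the stored value in full precision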
That is why I always use several statistical packages to cross-validate the results. If I find any large discrepancies, I dig down to the root cause, which could be different estimation methods, or something unexplained and far more dangerous. However, SAS will remain one of the most trusted packages, as Paul has mentioned from his own experience. I never use Excel for any serious regression, only for graphics and data transformation.
Another very good point Paul made about multivariate analysis is that the matrix has to be at least positive semi-definite, and ideally positive definite, which is absolutely crucial.
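A quick way to check this in practice is to inspect the eigenvalues of the covariance matrix; a small R sketch using the built-in mtcars data:

    # A valid covariance matrix should have strictly positive eigenvalues
    S <- cov(mtcars[, c("mpg", "disp", "hp", "wt")])
    eigen(S, only.values = TRUE)$values
    # all > 0  -> positive definite (invertible; safe for multivariate methods)
    # any == 0 -> only semi-definite (singular; e.g., a redundant variable)
    # any  < 0 -> not a valid covariance matrix (often rounding damage)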
I also recommend two other books by Paul and his co-author Laurence on the topic of multivariate analysis:
1) http://www.amazon.com/Reading-Understanding-Multivariate-Statistics-Laurence/dp/1557982732/ref=sr_1_2?ie=UTF8&qid=1418687045&sr=8-2&keywords=Paul+R.+Yarnold
2) http://www.amazon.com/Reading-Understanding-More-Multivariate-Statistics/dp/1557986983/ref=sr_1_1?ie=UTF8&qid=1418687045&sr=8-1&keywords=Paul+R.+Yarnold
You may also want to take a look at his optimal data analysis method, UniODA, here (very easy to follow and intuitive to understand, and it comes with software):
http://www.amazon.com/Optimal-Data-Analysis-Guidebook-Software/dp/1557989818/ref=sr_1_3?ie=UTF8&qid=1418687045&sr=8-3&keywords=Paul+R.+Yarnold
I will recommend three more books:
1) http://www.amazon.com/Practical-Multivariate-Analysis-Chapman-Statistical/dp/1439816808/ref=sr_1_1?s=books&ie=UTF8&qid=1418696350&sr=1-1&keywords=practical+multivariate+analysis
2) http://www.amazon.com/Multivariable-Analysis-Practical-Clinicians-Researchers/dp/0521141079/ref=sr_1_3?s=books&ie=UTF8&qid=1418696350&sr=1-3&keywords=practical+multivariate+analysis
3) http://www.amazon.com/Advanced-Multivariate-Statistical-Methods-Interpretation/dp/1884585590/ref=sr_1_4?s=books&ie=UTF8&qid=1418696350&sr=1-4&keywords=practical+multivariate+analysis
Well, Paul, you have very deep knowledge of multivariate analysis, especially when you mention the positive definite matrix (that caught my eye).
By hand... what... you actually had to think about how something might work before you pressed and checked all the boxes.
I remember the great, now departed Ken Rowe instructing on GLM software... he said you can get software that lets you drive a Mini Minor or a Ferrari, but if you are a shocking driver and crash... well... I know which car I would rather be in...
brad.
SAS is a rock-solid and complete statistics software package for such an application. The algorithms mirror the most current research publications (and even unpublished research). I have been using SAS since 1981. Not many software packages have stayed "the same way" for over 30 years. You can count on SAS to still be around by the time we all retire.
Mostly just rephrasing what Bradley and Paul have said:
Part of the answer to the original question has to be that the best statistical package is no better than the person using it. A part of this is that the person using the software must really understand how things are calculated and the limits of what a computer can do.
I have used SAS, SPSS, Statistica, and JMP, and have written many technical and/or introductory guidebooks about them. JMP is very easy for everybody to use, and its graphics functions are the best among the four packages. The price is reasonable.
As Timothy said already, you will get as many answers as there are programs. Not mentioned up to now is Stata (which I would recommend). Its recent version 13 offers many advanced canned procedures for multivariate analysis.
However, the decision which software to use should not depend only on the procedures offered. Although somewhat outdated by now, a very interesting paper by Michael Mitchell shows how to choose strategically among different software packages; see http://www2.jura.uni-hamburg.de/instkrim/kriminologie/Mitarbeiter/Enzmann/Lehre/StatIIKrim/Mitchell_2007.pdf .
Well, Paul, matrix identity is extremely important in my opinion. Positive definite, not merely semi-definite, is a must. As I state in my LinkedIn profile summary, I have spent my free time playing with all the statistical packages on the market.
My advisor also told me that in the old days one had to punch cards to get results. It's amazing how computers have progressed since then.
I recall using punch cards at late hours, such as 3 AM, when I was a student. I would carry the cards carefully ordered, as each card represented one line of code; I may have had PROC GLM; on a single card! Once, the cards all fell on the floor, and I had to figure out their order all over again.
Then Virginia Tech became the first state university in the US to require freshmen to buy a PC. Its Provost ended up being featured in TIME magazine. Times have changed since then.
How do you define dumber or smarter? That is the key! Does the conceptual definition mirror the operational definition?
It always makes me laugh when grad students running some analysis for the first time yell out with glee... "It runs!"
Well, OK then... good luck driving the Ferrari.
Choose the software that the discipline journals you target publish with most often... there is probably some publication bias toward the common software, though better reviewers will look past it. Then check your results using another software package too. The more emergent or developmental the stats technique or algorithm, the more I would cross-check...
There was a movie, Dumb and Dumber... I still like that movie.
brad
It is more about understanding fundamentals. When I was younger, software for programming Basic, Fortran, C, Pascal, Cobol (yuck!) and other languages was common. Now, computer literacy is knowing Excel and Word. So amongst the many problems is that all I can find are Ferraris. Some are missing wheels, alternators, seat cushions, or ejector seats. But at the end of the day there isn't an alternative. I can't be an effective biologist who also starts with 01001001 and ends up with a full statistical analysis software package. Understanding how hard it is to write these programs, I would put more faith in something like SAS, than I would my own code. So another metric for finding the "best" program is to figure out how many users a program has acquired. The more people that use the program, the more likely that the bugs have been documented. For reference, here is a Wikipedia article on Excel http://en.wikipedia.org/wiki/Microsoft_Excel. Look under the title "Quirks" to find a discussion of problems and citations where you can find more information, and start a broader literature search. Of course bigger is not always better, as Microsoft takes a long time to fix bugs. On the other hand, Excel is not marketed as statistical analysis software and we'll leave it at that.
I was a graduate student at the time when SAS was still "small and growing". SAS had nodes at many research universities at which some scholars were advancing statistical science, while feeding SAS the latest research results. Often, SAS was behind by one year in printing these new results in their SAS manuals. There were no online manuals then. There was no internet!
I recall lectures in regression with Professor Ray Myers during which he would give us SAS procs that were not listed anywhere. They would match exactly the derivations he was teaching us in class. We knew exactly what SAS was giving us. The code matched the theory behind it. Other software packages could not offer such support. Knowing SAS was "knowing statistics". It also meant getting a good job. Job openings often had the line "excellent working knowledge in SAS required".
To add to the ongoing discussion, I would also think it depends on how much time you have to learn new software, how much money you have to invest in purchasing it, the size of the dataset(s) you'll be working with, and the type of manipulations required. In my opinion, SAS is good but very command-driven; I think STATA is more user-friendly, and you can achieve as much with it as you would with SAS. I recently started using R and have found it very efficient as well; now I use it for nearly everything I do. So you'll have to decide based on what you are currently familiar with, how much time you have to learn new software, whether you'll be dealing with large datasets (rows >= 1.5 million), etc.
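On the large-dataset point, a hedged sketch: the data.table package is one common choice in R for tables in the millions of rows ("big.csv" and its columns are hypothetical names for this example):

    library(data.table)
    dt <- fread("big.csv")                          # fast, multi-threaded CSV import
    dt[, .(mean_size = mean(size)), by = species]   # grouped summary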
Also, one of the most important things is the ability to interpret the final results. That ability comes only after one can correctly write the equations and proofs. Software, in my opinion, is a tool like a calculator: at a minimum one must input the data correctly, and often one must code the formulas as well.
Unless you tell SAS to delete a dataset, it is still in memory after program execution. So I can do a part of the analysis using one dataset, and do another part using a different data set. I can now take both datasets and spot check a few of the numbers to make sure they are correct. The SAS Enterprise Guide makes this super easy as all the datasets are a mouse click away in the "Output Data" tab.
Yes, Tim. Although I have a fast computer, SAS takes some time to run a complex equation system. I hope SAS can improve its speed.
I am not too worried about speed. I had SAS running a randomization test that took 2 days to run. I set up my major professor's computer on Friday after he left, and recovered the output on Monday before he got back. That program now takes about an hour to run on today's computer. With that as perspective, the 5 minutes it takes my current analysis to run on the desktop seems fairly trivial. It would run faster on a mainframe if I really cared. It takes longer for Excel and Word to process the output file than it does for SAS to create it. It doesn't sound like the problem in this question is bad enough that Mohamed will be able to detect a difference in runtime between any of the commercially available programs that we have talked about. However, it might become a more important issue as the research question develops.
For me, the speed slows as the sample size increases and the equation system becomes more complex. Multitasking, as you said earlier, might also be affected if one is running many complicated analyses simultaneously. In this case, a multi-core CPU like the i7-5960X, large RAM, and an SSD can definitely help. Still, it would be best if SAS ran as fast on affordable PCs as it does on expensive ones.
Cost is often a major factor in choosing statistical software. R is free, but SAS is not. There is a way for educational institutions to cut down on the cost of SAS licenses: we use one workstation running SAS with a license for 200 simultaneous users. This costs less than buying many individual SAS licenses, or a site license for the entire campus. Students access the workstation remotely.
I agree with Raid. That is why SAS now has a University Edition: http://www.sas.com/en_us/news/press-releases/2014/may/university-edition.html, which is, however, limited in both scope and scale.
Yes, SAS is stupidly expensive if you are a single user and want your own copy. I get my copy through UF, and it costs me something like $80/year (I can't remember the exact figure; it might be a bit less). The UF IT department sends my computer a code, and SAS works for another year. This renewal process used to be very bad: at times it would take months to get the new activation code or to get SAS installed on a new computer -- I was never quite sure whether that was the fault of SAS or of the people in IT. The problem is not entirely solved. The activation code keeps my SAS active, but it does not update to a new version. So if I need a new version, I have to beg/threaten IT to annoy someone on the main campus to send a disk with the newest version of SAS, and then schedule an appointment to get the old version removed and the new version installed. Once the software is here, it takes about half a day to get this done. Thank you, Raid, for jogging my memory of these problems with SAS.
Getting SAS installed is painless at UWF, Timothy. First of all, I stopped buying my own computers; it doesn't make any sense to do so. I get a grant, and then I buy a computer. It is the property of UWF, so I get free updates and free maintenance "forever". I also get SAS for free. Another option is to access my work computer remotely from home; it has SAS on it too. The third option is to access the workstation via my.uwf.edu (e-desktop), which also has SAS on it.
I answered your question two days ago, but I can't find it; I may have answered under another question. I have written textbooks on these software packages. JMP is very easy for many researchers and students because it offers easy operation and good graphing capability. I think the perpetual license is cheap. My research results were analyzed with JMP; see my papers.
It is rather difficult to contribute to such a fast-expanding topic, but browsing the previous contributions, it seems to me that the discussion is focusing on a comparison between R and SAS (with JMP as well), with a lot of interesting information.
However, if I have read all the items correctly, SPSS is cited by only one contribution, and STATA, Statistica, and MATLAB, among many other general-purpose packages, are not cited at all.
Please let me point to a recent professional article giving some figures on the relative popularity of data analysis software: http://r4stats.com/articles/popularity/,
where this topic is discussed from several points of view.
Because popularity is perhaps not the best criterion (nor the only one) for choosing to invest in a piece of software (even a free one), before trying to answer your question, may I ask: what is/are your criterion/criteria?
Best regards
I really liked the article that Dominique posted because it gave a different perspective with loads of data. I was surprised that things like C were listed. Yes, programming languages are capable of doing any type of analysis you desire. Yet it is a very small group of people that have the required skill to use these tools to achieve these results.
I also noted that R now has 5000 packages and is growing rapidly. What the heck am I going to do with 5000 packages, even assuming that I want to take the time to learn them all? How do I search this mountain to find the few pieces that I need?
Another criterion might be documentation. More than any other package, I have liked working with the SAS documentation. It has a large number of worked problems. I can look through them and say, "Ah, I need something like that," copy the program, and get output that looks like the documentation using my own data. I can then read a bit more and at least start to figure out whether I should have done this analysis. A clustering procedure might have 11 different methods. I know average linkage from my statistics class, but what about the Flexible-Beta method? I had never even heard of that, so now I have more questions. Even if I don't understand the manual, it has given me the keywords I need to do more searching and figure things out. Having gone that far also helps me ask better questions and understand the answers if I have to consult a living statistician. In that case, the statistician also tends to put more effort into their answer.
Hi Mohamed, Like you, I am an ecologist. Without knowing the exact analysis you would like to run, providing an exact recommendation is hard. Personally, I use R and recommend you check out the vegan package for multivariate analysis. The package has some great tutorials (called vignettes in R) created by the package's author:
http://cran.r-project.org/web/packages/vegan/index.html
Also, Timothy provided a great answer. R can be hit or miss on documentation. It also requires you to know what you are doing; otherwise, you can get into a lot of trouble quickly.
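As one hedged example of how compact a vegan analysis can be, here is a PERMANOVA on the package's bundled dune data, testing whether management type explains community composition:

    library(vegan)
    data(dune)       # species abundances at 20 sites
    data(dune.env)   # matching site variables, including Management
    adonis2(dune ~ Management, data = dune.env, method = "bray")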
The online multivariate statistics course at Penn State University lists all SAS programs, with annotation. It is a great "educator" website for multivariate statistics. The theory is given in the lectures that are posted, and SAS is linked to each lecture. I call it "Best Link" when I teach a graduate course on multivariate statistics. All steps are explained, and backed with the theoretical background needed to understand what you are doing.
SAS holds major conferences every year or two (SAS Users Group International, SUGI) at which the latest statistical results are integrated with SAS. I often give SUGI publications to my graduate students to study.
I agree with Raid that SAS has great documentation and very detailed steps on how to carry out many different kinds of analysis, from simple to very advanced, not only on its own manual site but also on many other institutions' sites. I like it. The flexibility and scalability of SAS are both great, with proven reliability.
I know from conferences at which SAS division directors gave talks that SAS deliberately does not make its programs "super easy to use", so that the programs are used by people with a good understanding of the theory behind them.
How do you write a program for mixed models with several covariates and missing data, etc.? It is cases like this that require some additional modeling understanding, and knowing exactly which theory is behind the chosen algorithms is very important.
Another case is pseudo-random number generators. What are the possible complications when we use certain algorithms to produce certain distributional properties in pseudo-random variables? How can we anticipate such problems, and how do we go about circumventing them?
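As a small illustration of the reproducibility side of this (an R example of mine; the same idea applies in SAS or any other package):

    RNGkind()           # reports the generator in use (Mersenne-Twister by default)
    set.seed(42)        # fix the state so the run can be reproduced exactly
    x <- rnorm(1e6)
    c(mean(x), sd(x))   # close to 0 and 1, but never exactly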
In over 30 years of doing research in statistics, I have never felt that I needed any software besides SAS, except for special applications such as disease surveillance.
Yes, there are other options, such as R, STATA, MINITAB, SPSS, etc., and it is best to leave the choice to the users, who usually know best which software package fits their needs and their budgets.
I recall teaching a short workshop on biostatistics at the German University in Cairo (GUC); we ended up teaching statistics with SAS and then, an hour later, teaching the same material with SPSS to the same students, wherever the procedures were available in SPSS. I made sure that another professor taught the SPSS portion, since I wanted someone who actually taught SPSS at the university level and knew all parts of SPSS inside and out.
Raid and Paul, Thank you both for discussing your experience with SAS. I am actually enjoying playing with SAS programming and coding now. I also like the fact that SAS gives you many printout choices.
After graduating from Kyoto University in 1971, I joined SCSK, a subsidiary of Sumitomo Corp. I worked on "the automatic electrocardiogram diagnosis system," a joint project between NEC and the Osaka Center for Adult Diseases. My task was to develop logic for discriminating 32 groups of abnormal symptoms from the normal group using the multivariate analysis routines of an NEC package. My four years of research were completely inferior to the decision-tree logic developed by a medical doctor. Finding the reason why statistical discriminant analysis went wrong has therefore become my life's work.
In 1976, I had decided to buy SPSS (3,000 users at the time). But I evaluated SAS (560 users worldwide) and started our computer center service with it instead, because the SAS concept was better: the SAS database was completely separated from the statistical procedures (methods), which means SAS can be used as a system-building language. From around 1986 I was a distributor of the SAS minicomputer versions for about four years. We started a system-integration service combining SAS, VAX, Oracle, LINDO products (mathematical programming), and application development. We sold it to 32 pharmaceutical companies, and a large portfolio system to Toyo Trust Bank (now Mitsubishi).
In 1996, I became a professor at Seikei University. I decided I had to study the Windows version of SPSS because I needed SPSS for my lectures. In 1994, I wrote "Introduction to SPSS for Windows," and in 1997, "Data Analysis by PC using SPSS and Bank.SAV" (Bank.SAV is trial data concerning the bankruptcy of the First Illinois bank (?)). That book was a best seller. I found defects in SPSS: for example, there are two or three procedures for basic statistics on a single variable, and one must choose among many options, which is difficult for beginning users. Therefore, about 60% of my first SPSS guidebook is devoted to explaining how to choose the options. SPSS updated its versions only in the user interface. I paid the version fees, but when I stopped upgrading I could no longer use my SPSS; so I had misunderstood, and my perpetual license was not really perpetual. I suspect the algorithms were not being updated.
I also looked at Statistica. It was designed for the Windows platform. It is fine, but the many output windows are bothersome.
In 2003, during a stay at IIASA, I reviewed "JMP Start Statistics" and wrote "A New Statistical Study with JMP." That textbook includes a JMP demo version (Ver. 5; the current release is Ver. 12) that can run all procedures, limited to the first 20 cases only. Because of a bad decision by Microsoft, the JMP technical staff sent out the message that Ver. 5 could no longer be supported, but I checked it, and my students still use the demo version now; they are happy not to have to collect more than 20 cases. SAS offers many statistical methods, and it is difficult for beginning researchers to understand them from the manuals, so I recommend reusing the SAS examples in the manual. JMP does not support all SAS procedures, because of sales policy: for example, cross-tabulation is restricted to two variables, and exact logistic regression is not available. However, if you can program in the scripting language, you can build complicated procedures on the JMP platform. My script for "100-fold cross-validation of Fisher's LDF and nominal logistic regression" has been very helpful for my research: I compare MP-based linear discriminant functions (LDFs), such as SVM and my new methods, with Fisher's LDF and logistic regression. JMP uses the term "platform"; for example, the one-variable platform covers many statistical procedures, so I do not have to pick through statistics and graph options one by one.
It can save you research time on data analysis. If you can read my Japanese textbook, you can become a good data analyst within one week.
Hi, Paul
Nice to meet you again. I follow your work, but I am very busy in the last stage of my research. I have many results, but I cannot upload them until after the international conferences in the summer of 2015.
I appreciate that you have worked in such a wide area.
My answer may not be enough for you and many others.
SAS offers OR methods, including mathematical programming (MP). However, I think it is not used for serious work. Many years ago, a Japanese mega-bank developed a portfolio system with SAS/IML. The staff asked me why some solutions converged to zero; they could not determine the portfolio. I compared the IML results with LINDO and found that if the ratio of the maximum to the minimum coefficient is larger than 10^8, SAS/IML outputs a wrong solution. SAS/IML is useful for research problems. In addition, LINGO offers full optimization, including LP, QP, IP, and nonlinear programming with global optima. I need the full set of functions.
Your question, "Does JMP wed with LINDO?", is a nice one; I hope so. JMP does not support OR and MP. At INFORMS 2012 I explained to SAS/OR technicians my LDF results using LINGO and a JMP script for Fisher's LDF and logistic regression; they were surprised by LINGO's ability. Two years ago I made the following proposal to the JMP division, along the lines of your question:
1) Fisher’s LDF and quadratic discriminant functions have many problems.
2) My LINGO program implementing my methods and SVM would be a good specification for developing MP-based LDFs with LINDO/API (a C library); JMP could therefore develop MP-based LDFs very cheaply. But JMP refused my proposal.
Therefore, I upload my results to ResearchGate. I regret having written my papers in Japanese, and I hope many users of discriminant analysis will accept my claim. I expect JMP, SAS, SPSS, or other companies to develop MP-based LDFs; they would be useful in medical diagnosis, pattern recognition, and various kinds of rating. It is a very serious problem that ordinary LDFs cannot recognize linearly separable data. I suspect that if we re-examined the medical papers that use discriminant analysis, the error rates would always improve.
Were you a LINDO user? LINDO itself is no longer supported; the current products are LINGO, LINDO/API, and What's Best!. You would be surprised by the improvement in these products. I recommend developing application prototypes in LINGO first; you can build them very quickly and cheaply, and afterwards sell packages recoded with LINDO/API.
Paul, we need an automated universal translator like in Star Trek. That way, Dr. Shinmura's work can be translated instantaneously.
Hi, Paul
I am surprised that you live in Chicago! In Japan, it is very cold.
Please give me about seven days to consider your opinion.
I found a mis-operation in my Excel work, and I am rerunning the important results.
I output all LINGO results to Excel, and I may have mishandled Excel, for example by leaving the recalculation option off.
So far I have found only minor mistakes. Within two or three days I will finish this work; after that, I will read your opinion carefully. I am sorry there are many mistakes in the English of my answers.
I regret not presenting papers in English, because I am weak in English and in the LaTeX that journals request. However, I have stopped presenting at conferences in Japan.
I collect my full papers from international conferences and upload them to ResearchGate. So far I have not gotten good questions from attendees at the conferences, because they are surprised by the unusual results. However, I am very happy to have found you and other ResearchGate fellows.
Criticism of any statistical work is very important, which I learned from my advisor. Otherwise, it is very easy to obtain misleading results due to mis-specification, data issues, or an inappropriate model. And I strongly believe, to quote myself, that "we have to build the model after the data, not the other way around." Shuichi and Paul, I greatly appreciate your contributions to the field.
I join Juehui in supporting solid statistical work as a base for developing statistical software packages and programs. It is often insufficient to grab some software and just use it on some data sets. I often see questions posted on RG that suggest a lack of understanding of statistical theory. It is alarming to find requests for programs, and many suggestions on what to do, in response to incomplete descriptions of experiments. More damage may be caused by such "advice" than by saying: STOP... THINK... Look for a competent statistician close to where you are working... Take additional courses in statistics...
There is a wealth of useful information posted on RG. ...
I thank Raid for agreeing with me. In fact, the most crucial point about any methodology is to understand deeply why one wants to use that specific approach and which research objectives it can reach. I never trust any results I cannot fully interpret or justify. Sometimes simplicity is better than complexity. I have learned a lot from my advisor. It is not unusual nowadays for the computer to gradually replace critical thinking and hard science. We believe that the computer can solve anything for us, but even the android Data from Star Trek, who is obviously capable of far more advanced calculation than any known computer on the planet, is still learning humanity.
Paul, my advisor once quoted in class "garbage in, garbage out," which reminds me of how important the original data are. I bear this warning in mind when I collect any kind of data.
To Shuichi, Paul, and Raid,
Another interesting part is that I happen to have a lot to discuss with my elders, because I think they have a lot of lifetime experience we can learn from and respect, not to mention the knowledge of those senior researchers and scientists. I like their diverse and unique personalities as well.
Best regards,
Richard
To Juehui, Paul and Raid
I wrote books about JMP, LINGO, and What's Best! that come with free demo versions.
In the US, you might try my style of education.
I worked as a play production manager and stage crew member during my undergraduate years. I think I will be doing it again.
I am very happy that you and Mohamed sent me your thanks.
My earlier suggestion about the free demo versions was not complete.
I contacted JMP and LINDO Systems Inc. to obtain permission to bundle the trial CDs with my books.
But researchers can also download those CDs and evaluate the functions, etc.
Paul
I am sorry; let me clarify my answer above about the free CDs.
My books with CDs were published by Japanese publishers.
Therefore, you would have to buy them.
If anybody wants to translate and publish them, I can negotiate translation rights with the publishers under good conditions.
You can find the LINGO guidebook by Linus Schrage under the "Download" tab.
As for JMP, all the manuals are included on the JMP demo CDs.
But they may total over 5,000 pages, so I wrote two guidebooks about JMP.
Readers can learn how to use JMP quickly from them.
I am very weak in English.
Therefore my answers may be simple and not enough for most people.
You live in Chicago. Do you know Speakeasy, which was the origin of Mathematica and/or MATLAB? Stan Cohen was the founder of Speakeasy, but he made mistakes in marketing and was late to develop a Windows version.
I wrote an introductory guidebook on numerical computation, matrix operations, and so on, built around the themes of loans and interest, Markov chains, integration, and portfolio systems, using that US company's product. If Speakeasy is still active in the US, I can send my 50-page article, published by my university, about loans and Google search using Markov chains; but I must make a PDF, which will take some days.
Paul
I have no knowledge about copyright and other such matters. Thank you for your explanation.
After finishing my research, I will negotiate with the publishers.
MicrOsiris is free and provides pretty much everything the others do, as well as efficiently handling large datasets and lots of variables.
Dear Mohamed Awad Dadamouny
I use my own software, QStudioMetrics. It is simple and easy to use (it runs under a graphical user interface), and you do not need to write code. The software is open source, and you can find binary distributions for OS X and 64-bit Windows. When you compute a model (PCA, PLS, and so on), you can extract tables and replot the results using any external software (Excel, OriginLab, GraphPad, and so on).
Have a look here -> https://github.com/gmrandazzo/QStudioMetrics/releases
Best regards
R is a free tool that can be used for multivariate analyses.
I can also recommend Primer-E (multivariate analysis for ecology).