Introduction to R for Excel Users | Thomas Hopper | R-bloggers

“…The quality of our decisions in an industrial environment depends strongly on the quality of our analyses of data. Excel, a tool designed for simple financial analyses, is often used for data analysis simply because it’s the tool at hand, provided by corporate IT departments who are not trained in data science.

Unfortunately, Excel is a very poor tool for data analysis and its use results in incomplete and inaccurate analyses, which in turn result in incorrect or, at best, suboptimal business decisions. In a highly competitive, global business environment, using the right tools can make the difference between a business’ survival and failure. Alternatives to Excel exist that lead to clearer thinking and better decisions. The free software R is one of the best of these…”


Michel Baudin’s comments:

Kudos to Thomas Hopper for writing this guide and for making the complete 87-page PDF file available for download. For over two decades, I limited the analyses offered to my consulting clients to what I could do with Excel, because it was the only tool they had, and I wanted them to be able to reproduce my results.

For the past three years, however, I have been teaching myself R and fully agree with Hopper that it is a much more powerful and reliable tool for analytics. I also agree that it takes time and effort to learn, but it is useful even at a beginner’s level of proficiency.

Many, including Hopper, refer to this gradual learning process as a “steep learning curve,” which, strictly speaking, means the opposite: the steeper the learning curve of a skill, the faster you learn it…

Learning curves - steep vs shallow

The main challenge I see for the manufacturing engineers and managers I know is the switch from a spreadsheet to a coding mindset.

Excel is still preferable for expense reports or project cost justification, and R does not obviate the need for a database management system (DBMS).


2 comments on “Introduction to R for Excel Users | Thomas Hopper | R-bloggers”

  1. Michel,
    Many of us trained in Six Sigma use Minitab or other statistical software for data analysis. Are there advantages to R over these, aside from the cost?

    • From my limited experience with Minitab (Version 15), it is a package with a kit of tools that is richer than Excel’s data analysis add-on, but finite. It won’t help you if you want to use techniques that aren’t in the kit. Looking at the Version 17 feature list, I see many tools, including some I am not familiar with, but I don’t see, for example, MARS, Random Forests, or Bootstrapping.

      In R, by contrast, no matter what technique you want to use, there usually is a package for it, either in the CRAN repository or on GitHub. These packages are contributed by users, and their quality varies. There is some control on CRAN: not every submitted package makes it in. On GitHub, my understanding is that participants maintain their own software independently.

      With R packages, it’s user beware. You need to check the quality of the documentation and run tests before using them on your business data. They have author names and, over time, you come to trust some authors more than others. In any case, R packages cannot be expected to be at the level of consistency and ease of use of a Minitab module, but they cover more ground.

      R is a language, with multiple user interfaces to choose from. Like Hopper, I use RStudio, which is also free. RStudio’s Chief Scientist is Hadley Wickham, who teaches statistics at Rice University and is author of several of my favorite R packages.
