F19 Workshops and Tutorials

Oh yes!  It is that time of year again 🙂  I have to admit that I love fall – my favourite season, and the time for so many new beginnings.  With all this in mind, the new schedule for F19 OACStats workshops is now open for registration at https://oacstats_workshops.youcanbook.me/.   Workshops will be approximately 3 hours long with breaks and hands-on exercises, so bring your laptops with the appropriate software installed.  Please note that the workshops are being held in Crop Science Building Rm 121B (room with NO computers) and will begin at 8:30am.

September 10: Introduction to SAS
September 17: Introduction to R
October 15: Getting comfortable with your data in SAS: Descriptive statistics and visualizing your data
October 29: Getting comfortable with your data in R: Descriptive statistics and visualizing your data
November 5: ANOVA in SAS
November 15: ANOVA in R

I am also trying something new this semester – to stay with the theme of new beginnings 🙂  Tutorials!  These will be held on Friday afternoons from 1:30 to 3:30pm; sorry, that was the only time I could get a lab that worked with all the schedules.  They will be held in Crop Science Building Rm 121A (room with computers).  Topics will jump around a bit, with time to review and work on workshop materials.  To register for these, please visit:  https://oacstatstutorials.youcanbook.me/

September 13: Saving your code and making your research REPRODUCIBLE
Cancelled:  September 20: Introduction to SPSS
September 27: Follow-up questions to Intro to SAS and Intro to R workshops
October 18: More DATA Step features in SAS
October 25: More on Tidy Data in R
November 1: Open Forum
November 15: Questions re: ANOVAs in SAS and R
November 29: Open Forum

I hope to see many of you this Fall!

One last new item – PODCASTS.  I’ll be trying to record the workshops and tutorials.  These will be posted on a new page under the heading PODCASTS.  I will also link to them in each workshop’s post.

Welcome back and let’s continue to make Stats FUN

Name

R vs. SAS

This is a question that comes up more and more in my position, both from graduate students starting their academic careers and from experienced researchers looking to keep up with the “trends”.

A recent article published on the R-bloggers website compared the top statistical packages:  R, Python (?), SAS, SPSS, and Stata.  If you are interested in reading the original article, I’ve linked to it here.  I’d like to summarize it and show a few examples as well.

What do they look like?

RStudio is one of the more common ways that folks are using R today.  It is a comfortable environment, with a little bit of GUI that really doesn’t leave you hanging out in space – ok, maybe a little – but you’re fine once you get comfortable with the coding.

[Screenshot: RStudio]

Yes!  You read that correctly – you need to write code in R, just as you need to write code in SAS.  The code or syntax is different for the 2 programs, but you need to write some code in order to conduct any statistical analyses in either program.

SAS, as you may be aware, has a few different interfaces as well.  There is SAS Studio, used with the free University Edition:

[Screenshot: SAS Studio]

Licensed version of SAS:

[Screenshot: SAS]

Sample coding

As I noted earlier, each program has its own language or syntax.  R is made up of packages, each of which may deal with a particular type of analysis.  Within a package there are several functions.  In SAS we have PROCedures, with options and lines of code that will run the analysis.  Very similar concepts.  Each program has documentation.  Since R is open source and community-driven, the detail of the documentation will depend on the creator of the package.  SAS documentation is extensive but very technical at times.
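To make the package and function idea a little more concrete, here is a minimal sketch in R.  The yield numbers are invented purely for illustration, and the variable names (yield, avg, spread, avg_rounded) are my own assumptions, not part of any workshop dataset.  It only uses functions that ship with R, so nothing needs to be installed.

```r
# Hypothetical yield values, invented for illustration only
yield <- c(12.1, 14.3, 13.8, 15.2, 11.9)

# mean() lives in the base package, which is always available
avg <- mean(yield)

# sd() lives in the stats package; writing package::function
# spells out where the function comes from
spread <- stats::sd(yield)

# Functions take arguments, much as a SAS PROC takes options
avg_rounded <- round(avg, digits = 1)

print(avg_rounded)
print(spread)
```

The package::function form is optional for packages that are already loaded, but it is a handy way to see which package a function belongs to.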

R coding

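# Assumes a data frame named fruit with columns Yield and Variety
library(ggplot2)

# Histogram of Yield using ggplot2
ggplot(fruit, aes(x = Yield)) +
  geom_histogram()

# Base R plot of Yield by Variety, coloured by Variety
plot(Yield ~ Variety,
     col = factor(Variety),
     data = fruit)

# Add a legend in the top-left corner of the base R plot
legend("topleft",
       legend = c(1, 2, 3, 4),
       col = c("black", "red", "green", "blue"),
       pch = 1)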

SAS coding

/* Scatter plot with error bars, overlaid with a line for each entry */
Proc sgplot data=out_asp2010_test;
  scatter x=julian y=mms / group=entry yerrorlower=low4 yerrorupper=high4;
  series x=julian y=mms / group=entry lineattrs=(pattern=solid);
  xaxis label="Julian Day";
  yaxis label="Mms";
  title "Plot of Mms by Julian Day for 2010";
Run;

Support

As noted above, R is open source and community-driven, which also means that it is supported by the community.  For any questions or challenges you may encounter, you will use a variety of sources to find help, such as the author of the package you are using or a listserv.

SAS is a commercial product with a professional support network to assist its users.  There are listservs of users as well.

Conclusion

As pointed out in the R-bloggers article, they both have their strengths and their weaknesses.  I’ll be honest, I never thought I’d see the day when banks and pharma started using R, but it’s here!  The small program that folks used because it was free and accessible has now become a major contender in the statistical analysis world.

Which program you select will depend on your background (what you used in your undergrad or in your courses), the level of support available to you on your campus, and maybe what program your supervisor uses or recommends.  I used to recommend SAS if you were going to work in a workplace that needed standards, but after learning more about R and seeing its growth, I’m not sure that should be a reason to use SAS in academia anymore.

I personally believe that we should be learning both programs – I know, too much time to learn – but they both look awesome on a resume, and they both give you the opportunity to increase your skillset and talk stats to SAS and R users 😉

Name

 

S19 R Workshop

To complete the contents of the day-long R workshop offered on June 11, 2019, we will work through the following sessions:

  1. R Workshop:  Introduction to R and Definitions
  2. R Workshop:  Introduction to RStudio and R packages
  3. R Workshop:  Cleaning and Tidying your data
  4. R Workshop:  Getting the data in, merging files, and creating new variables
  5. R Workshop:  Getting comfortable with your data:  Descriptive statistics, Normality, and Plotting
  6. R Workshop:  ANOVA / Partitioning of Variance with an RCBD

S19 Workshops

A couple of workshops are now available for booking.  I will be hosting two 1-day workshops in June.  June 4 will be a 1-day SAS workshop, followed by a 1-day R workshop on June 11.  The workshops will be held in ANNU Rm 102, starting at 9am and ending by 4pm at the latest.

Please register for the one(s) you would like to attend by visiting https://oacstats_workshops.youcanbook.me/.    Please note you will need to bring a laptop with the software already installed.  If you do not have the software, you may watch the demos – however, I will not be able to help you with any software installations.

June 4 – SAS:  We will begin by touring the different versions of the SAS program that are available to us on campus.  Our next stop will be getting data into SAS, followed by some descriptive statistics. We will then move on to Regression and ANOVAs, and if time permits, PCA and/or Factor analysis.  If you have a particular analysis in mind that you would like to work through in SAS, please let me know beforehand – email oacstats@uoguelph.ca.

June 11 – R/RStudio:  We will again begin our tour with RStudio and discuss the merits and challenges of using the R software.  We will then work through a number of ways to get the data into RStudio, followed by some descriptive statistics and data visualization options.  We will move on to Regression and ANOVAs, and if time permits we will try our hand at some on-demand analyses.  If you have a particular analysis in mind that you would like to work through in R, please let me know beforehand – email oacstats@uoguelph.ca.