Communities of Practice: Coming Fall 2017

“Communities of practice are groups of people who share a concern or a passion for something they do and learn how to do it better as they interact regularly.”  – wenger-trayner.com

The OAC Stats Support Service will facilitate Communities of Practice (COP) to engage the OAC research community and assist with statistical analyses and statistical software. Our researchers use a variety of statistical approaches and statistical software packages to conduct their research. By meeting, sharing perspectives, and learning new aspects of our software and/or statistical approaches as a community, we can create enriched learning environments for all.

Fall 2017 will see the creation and revitalization of four COPs:

  • SASsy Fridays
  • Crimes of Statistics
  • OAC R Users Group
  • OAC Data Visualization

SASsy Fridays

SASsy Fridays started as a COP in W14 in response to the growing interest in SAS-specific topics beyond what was being taught in the workshops. If you use SAS and are interested in learning and sharing new approaches to using the software or new statistical approaches in SAS, this is the COP for you! For past topics, please review the SASsy Fridays blog. If you have a topic you would like to present or would like more information about, please email oacstats@uoguelph.ca. SASsy Fridays sessions will take place in the Crop Science Lab Rm 121A on the following dates and times:

  • Friday, October 13 from 12:30-1:20 p.m.
  • Friday, October 27 from 12:30-1:20 p.m.
  • Friday, November 10 from 12:30-1:20 p.m.
  • Friday, November 24 from 12:30-1:20 p.m.
  • Friday, December 8 from 12:30-1:20 p.m.

Crimes of Statistics

Many of us conduct experiments and run the appropriate statistical analysis, but sometimes we can get caught up in questioning the basics of the theoretical background: topics such as replication, sampling, power, p-values, and many more. This COP will meet to discuss these and other topics. A short presentation on the topic du jour will be followed by a discussion of situations you may have encountered. The Crimes of Statistics COP will meet in the OAC Boardroom (Johnston Hall) on the following dates and times:

  • Tuesday, August 22 from 10:00-10:50 a.m.
  • Tuesday, September 5 from 10:00-10:50 a.m.
  • Tuesday, October 3 from 10:00-10:50 a.m.
  • Thursday, November 2 from 10:00-10:50 a.m.
  • Thursday, November 30 from 10:00-10:50 a.m.
  • Tuesday, December 12 from 10:00-10:50 a.m.

The first meeting on August 22 will be an information gathering session. Please bring any topics you would like to see discussed to this session.

OAC R Users Group

R is growing in popularity and is gaining international acceptance in the research community. The goal of this group will be to exchange knowledge about R-packages and R-libraries that your research field or your lab uses. A short presentation or demonstration of a practical application of an R-package or R-library will be followed by questions and exploration of other uses for the presented material. The OAC R Users Group meetings will take place in Crop Science Lab Rm 121A on the following dates and times:

  • Friday, October 20 from 12:30-1:20 p.m.
  • Friday, November 3 from 12:30-1:20 p.m.
  • Friday, November 17 from 12:30-1:20 p.m.
  • Friday, December 1 from 12:30-1:20 p.m.
  • Friday, December 15 from 12:30-1:20 p.m.

Data Visualization

You have been collecting data for a project and now it’s time to do something with it! What do you do? How do you present it? Should it be a table? A graph? A chart? This COP will discuss different ways of presenting data, the pros and cons of different formats, and will encourage the community to demonstrate their favourite data visualization formats. The Data Visualization COP will meet in the OAC Boardroom (Johnston Hall) on the following dates and times:

  • Tuesday, October 17 from 12:00-12:50 p.m.
  • Tuesday, October 31 from 12:00-12:50 p.m.
  • Tuesday, November 14 from 12:00-12:50 p.m.
  • Tuesday, November 28 from 12:00-12:50 p.m.
  • Tuesday, December 12 from 12:00-12:50 p.m.

Ridgetown Workshop – August 2, 2017

SAS program files:

Data for balanced RCBD example

Complete SAS program for balanced RCBD example

Data rcbd;
input block trmt Nitrogen;
datalines;
1 1 34.98
1 2 40.89
1 3 42.07
1 4 37.18
1 5 37.99
1 6 34.89
2 1 41.22
2 2 46.69
2 3 49.42
2 4 45.85
2 5 41.99
2 6 50.15
3 1 36.94
3 2 46.65
3 3 52.68
3 4 40.23
3 5 37.61
3 6 44.57
4 1 39.97
4 2 41.9
4 3 42.91
4 4 39.2
4 5 40.45
4 6 43.29
;
Run;

Data for unbalanced RCBD example

Complete SAS program for unbalanced RCBD example

Data rcbd_unb;
input block trmt Nitrogen;
datalines;
1 1 34.98
1 2 40.89
1 3 42.07
1 4 37.18
1 5 37.99
1 6 34.89
2 1 41.22
2 2 46.69
2 3 49.42
2 4 45.85
2 5 41.99
2 6 50.15
3 2 46.65
3 3 52.68
3 4 40.23
3 5 37.61
3 6 44.57
4 1 39.97
4 3 42.91
4 4 39.2
4 5 40.45
4 6 43.29
;
Run;

Data for Repeated Measures example

Complete SAS program for Repeated Measures example

Data repeated;
input ID Room trmt day wt;
datalines;
1 1 1 1 13
2 1 1 1 17
3 1 1 1 13
4 1 2 1 16
5 1 2 1 17
6 1 2 1 17
1 2 1 2 22
2 2 1 2 24
3 2 1 2 20
4 2 2 2 23
5 2 2 2 22
6 2 2 2 23
1 3 1 3 36
2 3 1 3 38
3 3 1 3 46
4 3 2 3 45
5 3 2 3 45
6 3 2 3 32
;
Run;

Data for Count example

Complete SAS program for Count data example

Data trial;
input trmt$ block count;
datalines;
A 1 69
A 2 56
A 3 20
A 4 63
B 1 69
B 2 72
B 3 74
B 4 82
C 1 87
C 2 72
C 3 80
C 4 95
D 1 78
D 2 72
D 3 50
D 4 94
;
Run;

S17 SAS Workshop: Proc GLM, Proc MIXED, Proc GLIMMIX – an overview – RCBD

Notes For the CRD and RCBD Workshop – PDF file

This workshop will look at a Randomized Complete Block Design (RCBD) in Proc GLM, Proc MIXED, and Proc GLIMMIX.  The goal is to review the coding similarities & differences, along with the differences & similarities in the respective outputs.

The SAS program can be found here – please note that it is a PDF file

Proc GLM Results

Proc MIXED Results

Proc GLIMMIX Results
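
For orientation before opening the PDF, an RCBD might be coded along the following lines in the three procedures. This is a sketch only and not the workshop's own program: it assumes the rcbd dataset from the Ridgetown workshop data above (variables block, trmt, Nitrogen) and assumes that block is treated as a random effect.

```sas
/* Sketch only: assumes the rcbd dataset (block, trmt, Nitrogen) already exists */
/* and that block is treated as a random effect (an assumption made here)       */

/* Proc GLM: block appears in the MODEL statement;                  */
/* the RANDOM statement requests expected mean squares and tests    */
Proc glm data=rcbd;
  class block trmt;
  model Nitrogen = block trmt;
  random block / test;
  title "RCBD - Proc GLM";
Run;
Quit;

/* Proc MIXED: fixed effects in MODEL, random effects in RANDOM */
Proc mixed data=rcbd;
  class block trmt;
  model Nitrogen = trmt;
  random block;
  title "RCBD - Proc MIXED";
Run;

/* Proc GLIMMIX: same layout as Proc MIXED */
Proc glimmix data=rcbd;
  class block trmt;
  model Nitrogen = trmt;
  random block;
  title "RCBD - Proc GLIMMIX";
Run;
```

The main coding difference to look for is where block goes: in the MODEL statement for Proc GLM, and in the RANDOM statement only for Proc MIXED and Proc GLIMMIX.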

S17 SAS Workshop: Proc GLM, Proc MIXED, Proc GLIMMIX – an overview – CRD

Notes For the CRD and RCBD Workshop – PDF file

The goals of this workshop are:

  • to compare Proc GLM, Proc MIXED, Proc GLIMMIX using a Completely Randomized Design (CRD) for the example by:
    • showing coding differences
    • showing output differences
  • to provide guidelines/explanations as to why and when you would use GLM, MIXED, and GLIMMIX

Proc GLIMMIX appears to be the “new” kid on the block when it comes to analyzing our data.  Believe it or not, GLIMMIX has existed for many years but never really caught on until a few years ago.  Many of us are now relearning our traditional analysis methods in SAS and converting to GLIMMIX.

There will be several workshops that will concentrate on the use of Proc GLIMMIX.  The idea is that we will start with the straightforward experimental designs and increase the complexity to showcase the strengths of GLIMMIX and maybe convince you to make the switch to this more robust SAS procedure.  This workshop will use the basic Completely Randomized Design to primarily show coding and output differences among the 3 procedures.

Completely Randomized Design

Our fictitious design has 6 treatments (A, B, C, D, F, G) with 4 observations per treatment. Our Null Hypothesis states that all treatment means are equal, with our Alternate Hypothesis stating that at least 2 means are not equal.  We will have a model to reflect this design as:

Outcome variable(Weight) = overall mean + Treatment effect + residual error
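
In symbols, the same model and hypotheses can be written as below. The notation is a conventional choice made here for illustration: y_ij is the weight of observation j in treatment i, μ the overall mean, τ_i the treatment effect, and ε_ij the residual error. The normal-error assumption is the usual ANOVA assumption rather than something stated above.

```latex
y_{ij} = \mu + \tau_i + \varepsilon_{ij},
\qquad i \in \{A, B, C, D, F, G\},\quad j = 1, \dots, 4,
\qquad \varepsilon_{ij} \sim N(0, \sigma^2)

H_0:\ \tau_A = \tau_B = \tau_C = \tau_D = \tau_F = \tau_G
\qquad
H_A:\ \tau_i \neq \tau_{i'} \ \text{for at least one pair } i \neq i'
```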

To read in the data we will use a Data Step as follows:

/***************************************************************************/
/* Reading data gathered from a CRD conducted across 6 treatments */
/* Variables are 6 treatments and weight collected in hypothetical units */
/* This is a dummy dataset created for the purposes of a demo and workshop */
/* Created by A.M.Edwards May 23, 2017 */
/***************************************************************************/

Data crd;
input ID trmt$ weight;
datalines;
1 A 41
2 A 24
3 A 33
4 A 38
5 B 24
6 B 21
7 B 16
8 B 43
9 C 46
10 C 33
11 C 14
12 C 19
13 D 32
14 D 38
15 D 15
16 D 17
17 F 31
18 F 15
19 F 36
20 F 46
21 G 28
22 G 40
23 G 37
24 G 39
;
Run;
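
Before running any analysis, it can help to confirm that the data were read as intended. The following is a minimal sketch of such a check, not part of the original workshop program; the choice of statistics and the titles are assumptions.

```sas
/* Sketch: quick checks that the crd dataset was read as intended */

/* List every observation */
Proc print data=crd;
  title "CRD data listing";
Run;

/* Summarize weight within each treatment */
Proc means data=crd n mean std min max;
  class trmt;
  var weight;
  title "Weight summary by treatment";
Run;
```

The listing should contain 24 observations, and the summary should report 4 observations for each of the 6 treatments.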

History of ANOVA analyses in SAS

1966 – SAS is released with Proc ANOVA, which is to be used with:

  • balanced data ONLY!
  • FIXED effects ONLY!
  • NOTE from SAS Online Docs: “Caution: If you use PROC ANOVA for analysis of unbalanced data, you must assume responsibility for the validity of the results.”

1976 – SAS released Proc GLM

  • balanced (Type I SS) and unbalanced (Type III SS)
  • RANDOM statement introduced – provides EMS (expected mean squares) equations, but you need to do the calculations!

1992 – Proc MIXED

  • RANDOM statement incorporated
  • REPEATED statement introduced
  • “Normally distributed” data ONLY
  • linear effects

1992 – Proc GENMOD

  • Non-normal data
  • Fixed effects ONLY

xxxx? – Proc NLMIXED

  • normal, binomial, Poisson distributions
  • nonlinear effects

2005 – Proc GLIMMIX

  • Proc MIXED
  • Proc NLMIXED
  • Non-normal data

Proc GLM – General Linear Model

Proc GLM was the second generation PROCedure developed in SAS to conduct ANOVAs (analysis of variance).  This Proc is still used today for situations where you have a FIXED effects model and a balanced design – same number of observations in each treatment group.

Proc glm data=crd;
  class trmt;
  model weight = trmt;
  title "Proc GLM Results";
Run;
Quit;

Proc glm – calls on the GLM Procedure.  data=crd – specifies the dataset which you want Proc GLM to use.

Class statement – list your classification variables here.  Think of these variables as those that tell you which group your observations fall into.

Model statement – this should be based on your experimental design.  In this case we have a CRD: the dependent variable (weight) goes on the left of the equals sign, and the independent variable, our fixed effect (trmt), goes on the right.

Title statement – another great little habit to start.  Create a title statement for each procedure you use.  This way you will have a title at the top of your output window, and you will never have to guess again what that output was about.  If you want more titles or subtitles, simply type title2, title3, etc.  You can also use the Footnote statement to add notes to the bottom of your output page.
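
As an illustration, a second title line and a footnote could be added as follows. This is a sketch; the title and footnote text are made up for the example.

```sas
/* Sketch: a main title, a subtitle, and a footnote on the Proc GLM output */
Proc glm data=crd;
  class trmt;
  model weight = trmt;
  title "Proc GLM Results";
  title2 "CRD with 6 treatments";
  footnote "Demo dataset created for the workshop";
Run;
Quit;

/* Titles and footnotes stay in effect until changed; */
/* blank statements clear them                        */
title;
footnote;
```

Because titles and footnotes carry over to later procedures, clearing or replacing them keeps old text from appearing on unrelated output.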

Run statement finishes the Procedure.

Quit statement will let SAS know that you do not want to add any more information to the Proc GLM.  Proc GLM is one of the few SAS Procedures that stays active after the Run statement and waits for more instructions.  In order to close it out, you will need to add a Quit statement.
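
The waiting behaviour can be put to use: further requests can be submitted after the first Run without fitting the model again. The sketch below assumes you want treatment least-squares means with Tukey-adjusted pairwise comparisons; that particular request is an example chosen here, not part of the workshop program.

```sas
/* Sketch: Proc GLM keeps running after the first Run statement */
Proc glm data=crd;
  class trmt;
  model weight = trmt;
Run;

  /* Proc GLM is still active, so more statements can be submitted */
  lsmeans trmt / pdiff adjust=tukey;
Run;

/* Quit ends the procedure */
Quit;
```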

View Proc GLM Results

Proc MIXED

With the increasing use of mixed models (models that include both fixed and random effects), Proc MIXED was developed.  Proc MIXED can also account for unbalanced designs.  Using the same CRD dataset:

Proc mixed data=crd;
  class trmt;
  model weight = trmt;
  title "Proc MIXED Results";
Run;

You should obtain the SAME results with both procedures with a basic CRD design. For most straightforward models, Proc GLM and Proc MIXED should yield the same results.

Proc mixed – calls on the MIXED Procedure.  data=crd – specifies the dataset which you want Proc MIXED to use.

Class statement – list your classification variables here.  Think of these variables as those that tell you which group your observations fall into.

Model statement – this should be based on your experimental design.  In this case we have a CRD: the dependent variable (weight) goes on the left of the equals sign, and the independent variable, our fixed effect (trmt), goes on the right.

Run statement finishes the Procedure.
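
The CRD example has no random effects, so it does not show what sets Proc MIXED apart. As a sketch of a mixed model on unbalanced data, the rcbd_unb dataset from the Ridgetown workshop data above (two treatment-by-block combinations missing) could be analyzed as follows. Treating block as random and using the Kenward-Roger degrees-of-freedom method are assumptions made for this illustration.

```sas
/* Sketch: unbalanced RCBD with block as a random effect (assumption) */
Proc mixed data=rcbd_unb;
  class block trmt;
  /* Fixed effect only in MODEL; Kenward-Roger degrees of freedom */
  model Nitrogen = trmt / ddfm=kenwardroger;
  /* Random effect goes in RANDOM */
  random block;
  /* Treatment means adjusted for the missing observations */
  lsmeans trmt;
  title "Unbalanced RCBD - Proc MIXED";
Run;
```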

View Proc MIXED Results

Proc GLIMMIX

Proc GLIMMIX does it all!  OK, almost.  For our purposes, Proc GLIMMIX handles the different types of experimental designs that are used in OAC and in the agricultural field.

Proc glimmix data=crd;
  class trmt;
  model weight = trmt;
  title "Proc GLIMMIX Results";
Run;

Proc glimmix – calls on the GLIMMIX Procedure.  data=crd – specifies the dataset which you want Proc GLIMMIX to use.

Class statement – list your classification variables here.  Think of these variables as those that tell you which group your observations fall into.

Model statement – this should be based on your experimental design.  In this case we have a CRD: the dependent variable (weight) goes on the left of the equals sign, and the independent variable, our fixed effect (trmt), goes on the right.

Run statement finishes the Procedure.
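
Where Proc GLIMMIX goes beyond the other two procedures is non-normal data. As a sketch, the count dataset trial from the Ridgetown workshop data above could be modelled as follows. The Poisson distribution with a log link and the random block effect are assumptions made for this illustration; whether they suit a given dataset needs to be checked.

```sas
/* Sketch: count data with a Poisson distribution (assumption) */
/* and block as a random effect (assumption)                   */
Proc glimmix data=trial;
  class trmt block;
  /* dist= and link= are what set this apart from a normal-theory model */
  model count = trmt / dist=poisson link=log;
  random block;
  /* ilink reports treatment means back on the count scale */
  lsmeans trmt / ilink;
  title "Count data - Proc GLIMMIX";
Run;
```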

View Proc GLIMMIX Results