Tackling an analysis using GLIMMIX

So, you have collected some data, you have a few treatments which you’d like to compare, and you want to analyze it all using Proc GLIMMIX.  So how do you start?

My goal is to provide steps to tackle these types of analyses, whether you are working with weed data, or animal data, or yield data.  I suspect I’ll be updating this post as we clarify these steps.

First Step – your experimental design

Ah yes!  Despite popular belief you DO have an experimental design!  Find it or figure it out now before you go any further.  Why?  Because your model depends on this!  Your analysis comes down to your experimental design.

Second Step – build your MODEL statement

You know what your outcome variable is and you know what your experimental design is, which means you know which factors you’ve measured and whether they are fixed or random.  So…  you now know the basis of your MODEL statement and your initial RANDOM statement.
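As a minimal sketch of how a design maps onto these two statements, assume a randomized complete block design with a fixed treatment effect and a random block effect.  The dataset name (first) and the variable names (outcome, treatment, block) are placeholders for illustration only; substitute your own.

```sas
/* Sketch only: assumes an RCBD with hypothetical names */
/* first = dataset, outcome = response,                 */
/* treatment = fixed effect, block = random effect      */
proc glimmix data=first;
  class block treatment;      /* classification variables  */
  model outcome = treatment;  /* fixed effects go here     */
  random block;               /* random effects go here    */
run;
```

A different design (a split-plot, for example) would change what appears in the MODEL and RANDOM statements, which is why the design has to be settled first.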

Third Step – expected distribution of your outcome variable

You already know whether your outcome variable comes from a normal distribution or not.  Chances are it does not, but what does it come from?  Check out the post on Non-Gaussian Distributions to get an idea of which distribution may suit your outcome variable.  Think of it as the starting point.

Add this distribution (DIST=) and the appropriate link (LINK=) to the end of your MODEL statement, after a slash (/).
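To illustrate where these options go, here is a sketch that assumes, purely for the example, that the outcome is a count and that a Poisson distribution with a log link is a sensible starting point.  The dataset and variable names are hypothetical, and your own outcome may well call for a different DIST= and LINK= pair.

```sas
/* Sketch only: Poisson with a log link is an assumed starting point */
/* for a count outcome; names (first, outcome, treatment, block)     */
/* are placeholders                                                  */
proc glimmix data=first;
  class block treatment;
  model outcome = treatment / dist=poisson link=log;  /* options follow the slash */
  random block;
run;
```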

Fourth Step – run model and check residuals

Remember that when we run Proc GLIMMIX, we need to check our assumptions by looking at the residuals!  How do they look?  How’s the variation between your fixed effect levels?  Homogeneous or not?  Are the residuals evenly distributed?  Are the residuals normally distributed?
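One way to get these residual checks is through the PLOTS= option on the Proc GLIMMIX statement.  The sketch below assumes ODS Graphics is available and reuses hypothetical names (first, outcome, treatment, block).  STUDENTPANEL requests a panel of studentized residual plots, and BOXPLOT(FIXED) requests residual box plots by the levels of the fixed classification effects, which helps when judging whether the variation is similar across treatments.

```sas
/* Sketch only: residual diagnostics for a hypothetical RCBD model */
ods graphics on;

proc glimmix data=first plots=(studentpanel boxplot(fixed));
  class block treatment;
  model outcome = treatment;
  random block;
  /* Optionally save predicted values and residuals for your own plots */
  output out=resids pred=pred resid=resid student=student;
run;
```

The output dataset name (resids) and the new variable names are arbitrary choices; the OUTPUT statement can be dropped if the plots are all you need.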

Fifth Step – residuals NOT normally distributed

Is there another LINK for the DISTribution that you selected?  If so, please try it.
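As an illustration, suppose you had chosen a binomial distribution for events-out-of-trials data.  The sketch below swaps in a probit link in place of the more usual logit.  The variable names y (number of events) and n (number of trials), along with the dataset and effect names, are hypothetical.

```sas
/* Sketch only: same distribution, alternative link          */
/* y = events, n = trials (hypothetical variable names)      */
proc glimmix data=first;
  class block treatment;
  model y/n = treatment / dist=binomial link=probit;  /* try instead of link=logit */
  random block;
run;
```

After refitting, check the residuals again as in the Fourth Step to see whether the new link has helped.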

Sixth Step – fixed treatment effects not homogeneous

Now the fun begins.  To fix this one, we need to add a second RANDOM statement, essentially telling SAS that we need it to use the variation of the individual treatment levels rather than a single residual variation.  As an example, for a design that has a random block effect, the RANDOM statement would be as follows:

RANDOM _residual_ / subject = block*treatment group=treatment;
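To show where this statement sits in a complete call, here is a sketch for a randomized complete block design, again with hypothetical dataset and variable names.  The first RANDOM statement carries the block effect, and the second asks for a separate residual variance for each treatment level through GROUP=.

```sas
/* Sketch only: heterogeneous residual variances by treatment */
/* names (first, outcome, treatment, block) are placeholders  */
proc glimmix data=first;
  class block treatment;
  model outcome = treatment;
  random block;                                  /* random block effect            */
  random _residual_ / subject=block*treatment
                      group=treatment;           /* one residual variance per level */
run;
```

With this in place, the covariance parameter estimates table should list one residual variance per treatment level, which you can compare to see how different the levels really are.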

Seventh Step – try another distribution

Now – we do NOT want you trying ALL the distributions possible – this just doesn’t make sense.  Remember you need to think back to the distribution possibilities for your outcome variable.  Please use the post mentioned in the Third Step as a guide.  However, one distribution I have discovered works for many situations is the lognormal distribution.  At the end of your MODEL statement you would add / DIST=lognormal LINK=identity.
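In a full call, the lognormal option might look like the sketch below, with hypothetical dataset and variable names.  Keep in mind that with DIST=lognormal the procedure works with the log of the outcome, so the outcome values need to be strictly positive and the estimates come back on the log scale.

```sas
/* Sketch only: lognormal distribution with an identity link */
/* outcome must be > 0; names are placeholders               */
proc glimmix data=first;
  class block treatment;
  model outcome = treatment / dist=lognormal link=identity;
  random block;
run;
```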

Another option is to transform the data in the GLIMMIX procedure.  One transformation that researchers like is the arcsine square root transformation, which applies to proportions between 0 and 1.  To try this one, please use the following code:

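Proc GLIMMIX data=first;
/* Programming statement: arcsine square root of the outcome (a proportion between 0 and 1) */
trans = arsin(sqrt(outcome));

/* Model the transformed variable; complete the MODEL statement for your own design */
model trans = …;

Run;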

Last Step – results will not always be perfect!

You will do the best that you can when analyzing your data.  But please recognize that you may not be able to match all the assumptions every time.  Go back, review your data, and review your experimental design to ensure you have the correct Proc GLIMMIX coding.

As I’ve noted earlier, as we continue to learn more about GLIMMIX this post will probably be updated to include and/or refine these steps.


Communities of Practice: Coming Fall 2017

“Communities of practice are groups of people who share a concern or a passion for something they do and learn how to do it better as they interact regularly.”  – wenger-trayner.com

The OAC Stats Support Service will facilitate Communities of Practice (COP) to engage the OAC research community and assist with statistical analyses and statistical software. Our researchers use a variety of statistical approaches and statistical software packages to conduct their research. By meeting, sharing perspectives, and learning new aspects of our software and/or statistical approaches as a community, we can create enriched learning environments for all.

Fall 2017 will see the creation and revitalization of four COPs:

  • SASsy Fridays
  • Crimes of Statistics
  • OAC R Users Group
  • OAC Data Visualization

SASsy Fridays

SASsy Fridays started as a COP in W14 in response to the growing interest of SAS-specific topics beyond what was being taught in the workshops. If you use SAS and are interested in learning and sharing new approaches to using the software or new statistical approaches in SAS, this is the COP for you!  For past topics please review the SASsy Fridays blog.  If you have a topic you would like to present or would like more information about, please email oacstats@uoguelph.ca. SASsy Fridays sessions will take place in the Crop Science Lab Rm 121A on the following dates and times:

  • Friday, October 13 from 12:30-1:20 p.m.
  • Friday, October 27 from 12:30-1:20 p.m.
  • Friday, November 10 from 12:30-1:20 p.m.
  • Friday, November 24 from 12:30-1:20 p.m.
  • Friday, December 8 from 12:30-1:20 p.m.

Crimes of Statistics

Many of us conduct experiments and run the appropriate statistical analysis, but sometimes we can get caught up in questioning the basics of the theoretical background: topics such as replication, sampling, power, p-values, and many more. This COP will meet to discuss these and other topics. A short presentation on the topic du jour will be followed by a discussion of situations you may have encountered. The Crimes of Statistics COP will meet in the OAC Boardroom (Johnston Hall) on the following dates and times:

  • Tuesday, August 22 from 10:00-10:50 a.m.
  • Tuesday, September 5 from 10:00-10:50 a.m.
  • Tuesday, October 3 from 10:00-10:50 a.m.
  • Thursday, November 2 from 10:00-10:50 a.m.
  • Thursday, November 30 from 10:00-10:50 a.m.
  • Tuesday, December 12 from 10:00-10:50 a.m.

The first meeting on August 22 will be an information gathering session. Please bring any topics you would like to see discussed to this session.

OAC R Users Group

R is growing in popularity and is gaining international acceptance in the research community. The goal of this group will be to exchange knowledge about R-packages and R-libraries that your research field or your lab uses. A short presentation or demonstration of a practical application of an R-package or R-library will be followed by questions and exploration of other uses for the presented material. The OAC R Users Group meetings will take place in Crop Science Lab Rm 121A on the following dates and times:

  • Friday, October 20 from 12:30-1:20 p.m.
  • Friday, November 3 from 12:30-1:20 p.m.
  • Friday, November 17 from 12:30-1:20 p.m.
  • Friday, December 1 from 12:30-1:20 p.m.
  • Friday, December 15 from 12:30-1:20 p.m.

Data Visualization

You have been collecting data for a project and now it’s time to do something with it! What do you do? How do you present it? Should it be a table? A graph? A chart? This COP will discuss different ways of presenting data, the pros and cons of different formats, and will encourage the community to demonstrate their favourite data visualization formats. The Data Visualization COP will meet in the OAC Boardroom (Johnston Hall) on the following dates and times:

  • Tuesday, October 17 from 12:00-12:50 p.m.
  • Tuesday, October 31 from 12:00-12:50 p.m.
  • Tuesday, November 14 from 12:00-12:50 p.m.
  • Tuesday, November 28 from 12:00-12:50 p.m.
  • Tuesday, December 12 from 12:00-12:50 p.m.