Labels in SAS – Variable and Value

Do you know what group and trmt represent?  We can probably guess what age, height, and eye_colour mean, but would you know what units age and height were measured in?  Without a codebook or other documentation, such as labels for the variables and labels for their coded values, you would be guessing!

In SAS, as in many other statistical programs, you can add both variable labels and value labels.

Whenever you change the data in SAS, you need to be working in a DATA step.  To draw a parallel with Excel: there you open a worksheet, make the changes, and then save it.  In SAS, you write a new DATA step that reads the existing dataset, makes the changes to the variable(s), and saves the result.

Data tuesday_new;
    set tuesday;    * use the dataset called tuesday that you created earlier;
    label
        group = "Individuals on the trial were randomly assigned to 4 groups"
        trmt = "Treatments were assigned within each group"
        age = "Age of the participant in years"
        height = "Height taken of the participants at the end of the trial, measured in cm"
        eye_colour = "Colour of the participants' eyes";
Run;

To view these changes, try a Proc print.  What happens?
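If the labels do not show up, that is expected: by default Proc print uses the variable names as column headings.  As a minimal sketch, assuming tuesday_new was created by the DATA step above, adding the label option asks Proc print to use the variable labels instead:

```sas
* Assumes tuesday_new was created by the DATA step above;
proc print data=tuesday_new label;
run;
```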

Try the following:

Proc Contents data=tuesday_new;
Run;

What do you see?

Sometimes you will collect variables that are coded.  Rather than writing out blue eyes or brown eyes, you might record a code such as 1, 2, etc.  But how do you remember which code stands for which value?  Writing it down on a piece of paper is fine, but what if you misplace that paper?  Adding value labels to your data is a great way to keep all the information together.

In SAS, this is a 2-step process.  First we create the codes and their labels, and then we apply them to the variables in the dataset.  Because the labels are defined separately from the data, you can re-use them.

CREATING THE VALUE LABELS

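Proc format;
    * the $ marks a format for a character variable, and the coded values are quoted;
    value $groupformat
        "a" = "Group A - Monday morning"
        "b" = "Group B - Monday afternoon"
        "c" = "Group C - Tuesday morning"
        "d" = "Group D - Tuesday afternoon";

    * a format for a numeric variable has no $ and its values are not quoted;
    value trmtformat
        1 = "Treatment 1 - Placebo"
        2 = "Treatment 2 - Vitamin C";
Run;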

This creates two SAS formats: one called $groupformat (the $ marks it as a format for character values) and another called trmtformat.  Think of these as boxes that say a represents Group A - Monday morning, etc.
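One way to check what ended up in those boxes is to list the stored formats.  The sketch below assumes the Proc format step above ran without errors in the current SAS session, so that both formats are in the default WORK.FORMATS catalog; fmtlib prints each format's values and labels, and select limits the listing to the two formats of interest.

```sas
* Assumes the Proc format step above has already been run in this session;
proc format fmtlib;
    select $groupformat trmtformat;
run;
```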

APPLYING THE VALUE FORMATS TO THE DATA

Remember that we are touching the data or making changes to the data, so we need to use a Data Step.  Let’s re-use the one where we added variable labels:

Data tuesday_new;
    set tuesday;

    label
        group = "Individuals on the trial were randomly assigned to 4 groups"
        trmt = "Treatments were assigned within each group"
        age = "Age of the participant in years"
        height = "Height taken of the participants at the end of the trial, measured in cm"
        eye_colour = "Colour of the participants' eyes";

    * group is a character variable, so its format name starts with $;
    format
        group $groupformat.
        trmt trmtformat.;
Run;
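
To see the value labels in action, something like the following may help.  It is only a sketch and assumes that tuesday_new and both formats were created without errors by the steps above.  The first two steps display the formatted values; the last one lists group and trmt in a format statement with no format name, which removes the formats for that step only so that the stored codes are shown.

```sas
* Assumes tuesday_new and both formats exist from the steps above;

* Listing with variable labels as headings and value labels in place of codes;
proc print data=tuesday_new label;
run;

* Frequency tables use the value labels as well;
proc freq data=tuesday_new;
    tables group trmt;
run;

* No format name after the variables: formats are dropped for this step only;
proc print data=tuesday_new;
    format group trmt;
run;
```

The data themselves are not changed by the formats: the dataset still stores a, b, c, d and 1, 2, and the labels are applied only when the values are displayed.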