This tutorial explains various ways to calculate the standard deviation in SAS, along with examples.
Standard deviation measures how spread out the data points are around the mean. A low standard deviation indicates that the data points tend to be close to the mean, while a high standard deviation indicates that they are more scattered and farther from the mean.
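For reference, the sample standard deviation can be written as below, where x_i are the n observed values and \bar{x} is their mean. The divisor n-1 shown here is what PROC MEANS uses by default (VARDEF=DF); a different divisor can be requested through the VARDEF= option.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2}
```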
The STD option within PROC MEANS tells SAS to calculate the standard deviation of the specified variable(s).
/* Single variable */
proc means data=mydata std;
    var variable1;
run;

/* Multiple variables */
proc means data=mydata std;
    var variable1 variable2 variable3;
run;

/* All numeric variables in the dataset */
proc means data=mydata std;
run;

/* By group */
proc means data=mydata std;
    class grouping_variable;
    var numeric_variable;
run;
Let's create a sample SAS dataset for demonstration purposes. This dataset will be used in the examples of this tutorial.
data mydata;
    input ID Age Gender $ Weight Height;
    datalines;
1 25 M 68.5 172
2 31 F 55.2 158
3 22 M 72.3 180
4 28 F 60.1 165
5 35 M 80.0 175
6 29 M 76.9 178
7 27 F 61.8 163
8 33 M 85.5 180
9 30 F 56.4 160
10 26 M 71.2 175
;
run;
3 Ways to Calculate Standard Deviation in SAS
There are 3 ways to calculate the standard deviation in SAS:
- STD option within PROC MEANS
- STD(variable_name) function in PROC SQL
- MOMENTS table in the results generated by PROC UNIVARIATE
Calculate Standard Deviation with PROC MEANS
The STD option in the PROC MEANS procedure tells SAS to calculate the standard deviation (std) of a variable. In this example we are calculating the standard deviation of a variable named "Weight" in the dataset "mydata".
proc means data=mydata std;
    var Weight;
run;
- PROC MEANS: PROC MEANS is a SAS procedure used for summarizing data, providing various statistics for numeric variables, such as means, standard deviations, minimums, maximums, etc.
- data=mydata: This part specifies the dataset named "mydata" that will be used for analysis. Replace "mydata" with the actual name of the dataset you want to analyze.
- std: This option tells PROC MEANS to calculate the standard deviation of the specified variable(s).
- var Weight;: This line indicates that the analysis should be performed on the variable "Weight" in the dataset. Replace "Weight" with the actual variable name you want to calculate the standard deviation for.
- run;: This statement signals the end of the PROC MEANS step and tells SAS to execute the procedure.
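As a rough hand check, assuming the ten Weight values from the sample dataset and the default n-1 divisor, the value reported by the step above should come out close to 10.28:

```latex
\bar{x} = \frac{687.9}{10} = 68.79,
\qquad
s = \sqrt{\frac{951.449}{10-1}} \approx \sqrt{105.717} \approx 10.28
```

Here 687.9 is the sum of the ten weights and 951.449 is the sum of their squared deviations from the mean.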
To save the results of the PROC MEANS procedure into a dataset in SAS, we can use the OUTPUT statement.
proc means data=mydata std;
    var Weight;
    output out=output_data std=Std_Weight;
run;
std=Std_Weight: This option in the OUTPUT statement tells SAS to include the standard deviation of the variable "Weight" in the output dataset. The name "Std_Weight" is the variable name assigned to store the standard deviation in the output dataset.
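To inspect what was saved, the output dataset can be printed. This is a minimal sketch that assumes the dataset name "output_data" from the step above. Besides Std_Weight, a dataset created by the OUTPUT statement also carries the automatic variables _TYPE_ and _FREQ_.

```sas
/* Print the dataset created by the OUTPUT statement.        */
/* NOOBS suppresses the observation-number column.           */
proc print data=output_data noobs;
run;
```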
Calculate Standard Deviation with PROC SQL
In PROC SQL, you can use the STD() function to calculate the standard deviation. Here we are calculating the standard deviation of the "Weight" variable from the "mydata" dataset. The result is named "Std_Weight" in the output.
proc sql;
    select std(Weight) as Std_Weight
    from mydata;
quit;
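The same function can be combined with GROUP BY and CREATE TABLE if the result is needed per group and as a dataset rather than as printed output. The sketch below assumes the sample dataset "mydata"; the table name "std_by_gender_sql" is a hypothetical name chosen for illustration.

```sas
proc sql;
    /* Store one row per Gender with the standard deviation of Weight */
    create table std_by_gender_sql as
    select Gender,
           std(Weight) as Std_Weight
    from mydata
    group by Gender;
quit;
```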
Calculate Standard Deviation with PROC UNIVARIATE
When you run the PROC UNIVARIATE procedure, the standard deviation is displayed in a table named "moments".
ods select moments;
proc univariate data=mydata;
    var Weight;
run;
ODS SELECT moments;: The ODS SELECT statement is used to specify which output tables you want to generate. In this case, you are selecting the "moments" table, which contains the moments statistics, including the standard deviation.
To store the standard deviation in a dataset, we can use the ODS OUTPUT statement.
ods output moments=std_output;
proc univariate data=mydata;
    var Weight;
run;
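The dataset created from the Moments table holds all of the moment statistics, not only the standard deviation. If only the standard deviation is wanted as a single variable, one possible alternative is the OUTPUT statement of PROC UNIVARIATE. In this sketch the dataset name "std_univ" is a hypothetical name chosen for illustration.

```sas
/* NOPRINT suppresses the printed tables; only the dataset is created */
proc univariate data=mydata noprint;
    var Weight;
    output out=std_univ std=Std_Weight;
run;
```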
Calculate Standard Deviation of Multiple Variables in SAS
Here we are calculating the standard deviation for the variables "Weight", "Height" and "Age".
proc means data=mydata std;
    var Weight Height Age;
run;
Calculate Standard Deviation by Group in SAS
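In this example, we are calculating the standard deviation for the variables Weight, Height, and Age, within each level of the variable Gender in the "mydata" dataset. The NONOBS option suppresses the "N Obs" column that PROC MEANS otherwise prints when a CLASS statement is used.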
proc means data=mydata std nonobs;
    class Gender;
    var Weight Height Age;
run;
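To keep the grouped results in a dataset instead of only printing them, the CLASS statement can be combined with an OUTPUT statement. This is a sketch under assumptions: the dataset name "std_by_group" and the output variable names are hypothetical. The NWAY option keeps only the rows for the individual Gender levels and drops the overall summary row.

```sas
proc means data=mydata nway noprint;
    class Gender;
    var Weight Height Age;
    /* Names after std= are matched to the VAR variables in order */
    output out=std_by_group std=Std_Weight Std_Height Std_Age;
run;

proc print data=std_by_group noobs;
run;
```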
Calculate Standard Deviation of All Numeric Variables in a Dataset
If you omit the VAR statement, PROC MEANS reports the standard deviation for all numeric variables in the dataset.
proc means data=mydata std;
run;
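In the sample dataset, ID is also numeric, so it is included in this output even though its standard deviation is rarely meaningful. One way to leave it out, sketched here with the DROP= dataset option, is:

```sas
/* Exclude the identifier column before PROC MEANS reads the data */
proc means data=mydata(drop=ID) std;
run;
```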