
How to Use the SUBSTR Function in SAS (With Examples)

by Erma Khan

You can use the SUBSTR function in SAS to extract a portion of a string.

This function uses the following basic syntax:

SUBSTR(Source, Position, N)

where:

  • Source: The string to analyze
  • Position: The starting position to read
  • N: The number of characters to read
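As a quick illustration of the three arguments, here is a minimal sketch that calls SUBSTR on a literal string and writes the results to the log. The variable names middle and rest are placeholders chosen for this sketch. It also assumes the standard behavior that N is optional: when it is omitted, SUBSTR reads from Position to the end of the string.

```sas
/*minimal sketch: SUBSTR on a literal string, results written to the log*/
data _null_;
    middle = substr('Warriors', 2, 4);  /*start at position 2, read 4 characters*/
    rest   = substr('Warriors', 5);     /*N omitted: read from position 5 to the end*/
    put middle= rest=;
run;
```

The log should show middle=arri rest=iors, since the characters in positions 2 through 5 of Warriors are "arri" and positions 5 through 8 are "iors".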

Here are the four most common ways to use this function:

Method 1: Extract First N Characters from String

data new_data;
    set original_data;
    first_four = substr(string_variable, 1, 4);
run;

Method 2: Extract Characters in Specific Position Range from String

data new_data;
    set original_data;
    two_through_five = substr(string_variable, 2, 4);
run;

Method 3: Extract Last N Characters from String

data new_data;
    set original_data;
    last_three = substr(string_variable, length(string_variable)-2, 3);
run;

Method 4: Create New Variable if Characters Exist in String

data new_data;
    set original_data;
    if substr(string_variable, 1, 4) = 'some' then new_var = 'Yes';
    else new_var = 'No';
run;

The examples below show how to use each method with the following dataset in SAS:

/*create dataset*/
data original_data;
    input team $1-10;
    datalines;
Warriors
Wizards
Rockets
Celtics
Thunder
;
run;

/*view dataset*/
proc print data=original_data;
run;

Example 1: Extract First N Characters from String

The following code shows how to extract the first 4 characters from the team variable:

/*create new dataset*/
data new_data;
    set original_data;
    first_four = substr(team, 1, 4);
run;

/*view new dataset*/
proc print data=new_data;
run;

Notice that the first_four variable contains the first four characters of the team variable.
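One detail worth knowing: in a DATA step, a new variable created from SUBSTR typically takes the length of the source variable (10 here, from the team column) rather than the number of characters extracted. The sketch below is one way to keep the stored length at 4, assuming you want the new variable no wider than its contents. It declares the length before the assignment.

```sas
/*sketch: set the length of the new variable explicitly*/
data new_data;
    set original_data;
    length first_four $4;             /*declare the length before the assignment*/
    first_four = substr(team, 1, 4);  /*same extraction as above*/
run;
```

The printed values are the same as before. Only the storage length of first_four changes.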

Example 2: Extract Characters in Specific Position Range from String

The following code shows how to extract the characters in positions 2 through 5 from the team variable:

/*create new dataset*/
data new_data;
    set original_data;
    two_through_five = substr(team, 2, 4);
run;

/*view new dataset*/
proc print data=new_data;
run;

Example 3: Extract Last N Characters from String

The following code shows how to extract the last 3 characters from the team variable:

/*create new dataset*/
data new_data;
    set original_data;
    last_three = substr(team, length(team)-2, 3);
run;

/*view new dataset*/
proc print data=new_data;
run;
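The expression length(team)-2 works here because every team name has at least three characters. For a shorter value the starting position would drop below 1, and SAS would be expected to write an invalid-argument note to the log and return a missing value. A hedged sketch of one way to guard against that, using a helper variable named start (a name introduced only for this illustration):

```sas
/*sketch: keep the starting position from falling below 1*/
data new_data;
    set original_data;
    start = max(1, length(team) - 2);     /*never start before position 1*/
    last_three = substr(team, start, 3);  /*read up to 3 characters from there*/
    drop start;                           /*helper variable is not kept*/
run;
```

For the five team names in this dataset the result is identical to the original code. The guard only matters for values shorter than three characters.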

Example 4: Create New Variable if Characters Exist in String

The following code shows how to create a new variable called W_Team that takes a value of 'Yes' if the first character in the team name is 'W' or a value of 'No' if the first character is not 'W'.

/*create new dataset*/
data new_data;
    set original_data;
    if substr(team, 1, 1) = 'W' then W_Team = 'Yes';
    else W_Team = 'No';
run;

/*view new dataset*/
proc print data=new_data;
run;
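As an alternative to SUBSTR for this particular check, the DATA step also supports the colon modifier on comparison operators, which compares only as many characters as the shorter operand contains. This is a different technique from the one shown above, sketched here only for comparison. The length statement is an added assumption to make the width of W_Team explicit.

```sas
/*sketch: "starts with" test using the =: operator instead of SUBSTR*/
data new_data;
    set original_data;
    length W_Team $3;                    /*wide enough for 'Yes'*/
    if team =: 'W' then W_Team = 'Yes';  /*true when team begins with W*/
    else W_Team = 'No';
run;
```

With this dataset, Warriors and Wizards should be flagged 'Yes' and the remaining teams 'No', matching the SUBSTR version.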

Additional Resources

The following tutorials explain how to perform other common tasks in SAS:

How to Normalize Data in SAS
How to Replace Characters in a String in SAS
How to Replace Missing Values with Zero in SAS
How to Remove Duplicates in SAS
