Combining SAS data sets

Very often a SAS programmer is required to combine observations from two or more data sets. There are different ways of combining data sets :

  1. One – to – one merging
  2. Concatenating
  3. Appending
  4. Interleaving
  5. Match – merging

We shall discuss each of these methods of combining. But before that we create two sample data sets. Below, I have created two data sets CLASS1 and CLASS2 out of CLASS data set in SASHELP library. I’ve spilt CLASS data set on the basis of age and have also dropped height variable from one data set and weight variable from other data set. Through these data sets, we will be able to clearly distinct one method of combining from the other.

program_sas
FIRST PROGRAM – CREATE DATA SETS

 

Class1_sas
CLASS1 data set

 

Class2_sas
CLASS2 data set

 

One to One Merging

When two or more data sets are combined by using one to one merging method –

– the new data set contains all variables from all input data sets

– the observations are combined on the basis of their relative positions in input data set

one2one_sas
ONE2ONE dataset

– if there are common variables in input data sets, the observations in the last data set overwrite observations in earlier data sets

– number of observations in new data set = number of observations in smallest input data set.

Concatenating

Concatenating method is used to append observations from one data set to another data set. When two or more data sets are combined by using one to one merging method –

– the new data set contains all variables and observations from all input data sets

program_sas_2

CONCAT data set
CONCAT data set

– if there is a common variable, it must have same type attribute

– if there is a common variable and its length , label, format or informat atttribute is different, SAS takes it from the first data set that contains that variable

Interleaving

Interleaving creates a new data set out by combining input data sets in sorted order(BY some variable). To use this method, each input data set must be sorted  in ascending order based on the BY variable(s). In our example, we sort the input data sets by AGE.

INTERLEAVE data set
INTERLEAVE data set

Match – Merging

Match merging method helps in combining data sets on the basis of values of variables in a sorted order. Other methods of combining combine observations in data sets on the basis of their relative positions. This method uses MERGE and SET statements. To show how this method works, we create a new input data set CLASS3.

CLASS3 data set
CLASS3 data set

 

Match – Merging in Ascending Order 

matchmerge1
MATCHMERGE1 data set

Match – Merging in Descending Order

Class1_descending_sas
CLASS1 SORTED IN DESCENDING ORDER BY AGE
Class3_descending_sas
CLASS3 SORTED IN DESCENDING ORDER BY AGE
matchmerge2
MATCHMERGE2 data set

Appending

Appending method of combining uses PROC APPEND. It has a BASE data set and one another data set which it adds to the BASE data set. It does not create a new data set. One PROC step can append only one data set to the base data set. To add more data sets more proc steps need to be executed. After PROC APPEND, the variable description in the BASE data set remains as it is. As this method adds observations to the BASE data set, the program executes only if all variables in BASE data set and other data set are common. CLASS2 and CLASS3 have common variables. Lets append CLASS3 to CLASS2.

appending1
Appending CLASS3 to CLASS2

Appending with Force

appending with force

As CLASS1 and CLASS2 contain some different variables, the program stops and shows an error message in the log.

To overcome this error, we use FORCE option. The result is that SAS retains all original variables in the BASE data set and appends. The observations from other data set contain missing values for the uncommon variables.

appending2
CLASS2 APPENDED to CLASS1 by FORCE


Note :  Run the FIRST PROGRAM each time you begin to play with a different method so as to retain the data sets CLASS1, CLASS2 and CLASS3 in their original form(i.e. unsorted, unappended, etc.)

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s