Very often a SAS programmer is required to combine observations from two or more data sets. There are different ways of combining data sets :
- One – to – one merging
- Match – merging
We shall discuss each of these methods of combining. But before that we create two sample data sets. Below, I have created two data sets CLASS1 and CLASS2 out of CLASS data set in SASHELP library. I’ve spilt CLASS data set on the basis of age and have also dropped height variable from one data set and weight variable from other data set. Through these data sets, we will be able to clearly distinct one method of combining from the other.
One to One Merging
When two or more data sets are combined by using one to one merging method –
– the new data set contains all variables from all input data sets
– the observations are combined on the basis of their relative positions in input data set
– if there are common variables in input data sets, the observations in the last data set overwrite observations in earlier data sets
– number of observations in new data set = number of observations in smallest input data set.
Concatenating method is used to append observations from one data set to another data set. When two or more data sets are combined by using one to one merging method –
– the new data set contains all variables and observations from all input data sets
– if there is a common variable, it must have same type attribute
– if there is a common variable and its length , label, format or informat atttribute is different, SAS takes it from the first data set that contains that variable
Interleaving creates a new data set out by combining input data sets in sorted order(BY some variable). To use this method, each input data set must be sorted in ascending order based on the BY variable(s). In our example, we sort the input data sets by AGE.
Match – Merging
Match merging method helps in combining data sets on the basis of values of variables in a sorted order. Other methods of combining combine observations in data sets on the basis of their relative positions. This method uses MERGE and SET statements. To show how this method works, we create a new input data set CLASS3.
Match – Merging in Ascending Order
Match – Merging in Descending Order
Appending method of combining uses PROC APPEND. It has a BASE data set and one another data set which it adds to the BASE data set. It does not create a new data set. One PROC step can append only one data set to the base data set. To add more data sets more proc steps need to be executed. After PROC APPEND, the variable description in the BASE data set remains as it is. As this method adds observations to the BASE data set, the program executes only if all variables in BASE data set and other data set are common. CLASS2 and CLASS3 have common variables. Lets append CLASS3 to CLASS2.
Appending with Force
As CLASS1 and CLASS2 contain some different variables, the program stops and shows an error message in the log.
To overcome this error, we use FORCE option. The result is that SAS retains all original variables in the BASE data set and appends. The observations from other data set contain missing values for the uncommon variables.