SAS Problem of the Month by Robert Virgile

About the author: Bob Virgile is an independent SAS trainer and consultant. He has written two books for SAS Institute, and writes the problem-solving contests for SUGI (and formerly for NESUG). There is more information about his services posted on this web site Click to see more about Robert Virgile

Or, contact
Bob Virgile
Phone: (781) 862-4642
44 Woburn Street
Lexington, MA 02420
Email:

The solution to each monthly problem will occur a couple of weeks later.

March 2003 Problem of the Month


The following data step generated an unusual result:
 
data nonmissing;
merge males females;
by lastname;
where lastname GT ‘ ‘;
if lastname=’ ‘ then put ‘Impossible!’;
run;
 
Naturally, the log contained the word Impossible!  How could this happen?
 
Level of difficulty compared to previous SUGI problems:  difficult.

Solution:
 
How could the put statement possibly write a message to the log?
 
data nonmissing;
merge males females;
by lastname;
where lastname GT ' ';
if lastname=' ' then put 'Impossible!';
run;
 
Before reading on, note that this is the final regularly scheduled problem
of the month.  If you enjoy tackling these problems, make sure you try this
one before reading the solution.
 
The where statement is not a problem.  As a stand-alone statement, it applies
to both incoming data sets and accurately selects nonmissing values for lastname.
So how does lastname end up being missing?  One key ingredient is that lastname
has a longer length in females than in males.  As a result, the values in females
get truncated as the data sets are merged.  The second key ingredient is that some
lastnames in females contain leading blanks ... enough leading blanks that when
truncation occurs, all that is left are the blanks.


February 2003 Problem of the Month

The following data step prints a four-column report:

 data _null_;
do i=1, 31, 61, 91;
   set employees;
   put @i lastname +(-1) ', ' firstname @;
end;
put;
run;

 However, with the incoming names in alphabetical order, the report is in alphabetical order from left to right. Assuming that the page size is large enough to fit all the data on one page, here is your mission. Reorder the observations so that the report is in alphabetical order from top to bottom (first column top to bottom, then second column top to bottom, etc.). The report should maintain its current shape (number of rows and columns), and must use the data step above to print the report. The trickiest situations occur when the number of observations is not a multiple of 4.

Solution:

The new report must have the same number of rows and columns as the old. Assuming that the data are in order by lastname firstname, the tricky programming occurs when the number of observations is not a multiple of 4. Here is one approach, which assumes that the nobs= option accurately retrieves the number of incoming observations:


data new;
  n_rows = int(totalobs/4);
  leftover = mod(totalobs,4);
  do column_number = 1 to 4;
      do row_number = 1 to n_rows + (leftover > 0);
         set original nobs=totalobs;
         output;
      end;
      if leftover then leftover = leftover - 1;
   end;
   stop;
run;
 
proc sort data=new;
   by row_number column_number;
run;
 
The variable leftover indicates how many observations exist over and above a multiple of 4.


You are Visitor #