Selected Papers on SAS

$29.95

Author: Shaoji Xu

Page Count: 328

Trim Size: 8.25″ × 11″

This book comprehensively covers nearly all aspects of SAS programming—from the DATA step to PROC SQL, PROC REPORT, and Macro programming. It offers numerous practical tips, points that require special attention, and insightful comparisons between statements and options, including BY versus WHERE clauses. As such, it serves as a valuable resource for SAS programmers at every level.

Today, most programmers know that in computing, 0.1 + 0.2 ≠ 0.3. Interestingly, the first person to highlight this phenomenon was the author, at the 2008 NESUG (Northeast SAS Users Group) conference (see lexjansen.com). Using SAS, the author provided a detailed explanation of floating-point precision. SAS has an advantage over many other languages: it can represent numbers using 64‑bit binary format. If you are familiar with SAS and curious about this topic, you will find clear answers here.

For programmers working in pharmaceutical companies, generating tables in RTF format is a common task. Yet few realize that SAS can also read RTF files. The author, who previously worked in a Contract Research Organization (CRO) serving pharmaceutical clients, once received RTF files as raw data. To import these into SAS datasets, the team used a double‑input procedure—one for data entry and another for verification. Recalling a section in this book about RTF handling, the author revisited it and developed a program to read data directly from RTF files. The resulting program proved faster and more accurate than manual input. Without this book, completing the task would have required far more time and effort.

Preface

There is an old saying in Chinese: “Indigo comes from blue but is darker than blue; ice comes from water but is colder than water.” This means, “The master is surpassed by the apprentice.” However, this saying also describes the situation of my two “daughters”: Understanding SAS and Selected Papers on SAS. This new book (Selected Papers on SAS) is derived from the older sister (Understanding SAS), but she is prettier and more charming than the older sister because I have concentrated on important and interesting parts rather than discussing every point. Actually, I should say that papers in this book are the essence of the older sister.

In the book Understanding SAS I included many new ideas, that is, my own research results. I finished the book, but the labor and delivery process takes a long time. Meanwhile, I decided to write some papers. I have taken some ideas from the book and finished them as separate, independent papers. So now the younger sister makes her debut before her older sister.

The book has 17 papers.

Mainly, there are three kinds of content: basic, fundamental topics such as names, the order of statements and options, the end of a data set, and operators; hot topics such as CHKLOG, Page x of y, RTF files and special characters, and transferring data between SAS data sets and Excel files; and topics that are just for fun.

The SAS manual is a great resource for every SAS programmer. However, it is too general in some places. For example, when talking about variable names, it says

You do not assign the names of special SAS automatic variables (such as _N_ and _ERROR_) or variable list names (such as _NUMERIC_, _CHARACTER_, and _ALL_) to variables.

This is far, far from enough. As we know, many computer languages, such as SQL, have a fixed set of reserved words. But SAS is different. “The rules are more flexible for SAS variable names than for other language elements. . . . SAS reserves a few names for automatic variables and variable lists, SAS data sets, and librefs.”

First, what are these “few names”? Second, this gives users great flexibility. They can use almost any word freely. On the other hand, it brings users some inconvenience, because you don’t know how SAS treats a word: Is it treated as a user-defined name or as a keyword? This depends on SAS’s interest and SAS’s understanding. Sometimes you may think it is OK to use a particular word as a name, but SAS says: No, it is a keyword. Then your program will be messed up. The following program creates two printouts. Guess what these printouts are.

DATA s;
   a=5; n=6;
PROC REPORT NOWD;
   COLUMN a n;
RUN;

DATA s;
   not=0;
   yes=not+1;
PROC PRINT; RUN;

Therefore, if you don’t know how SAS behaves, you may not get the results that you want. We need to know about individual names. We need to know which names are forbidden in SAS, and which names SAS interprets in its own way and in which situations. Then we can use these names correctly or simply avoid them.

Another example is subsetting. We all know that we can use a WHERE statement and an IF statement (subsetting IF) to subset a data set. Also, we know that there are some differences between a WHERE statement and a subsetting IF statement. We know that they will produce different data sets in some situations as the SAS manual mentions in the following:

The WHERE statement can produce a different data set from the subsetting IF when a BY statement accompanies a SET, MERGE, or UPDATE statement.

Then we may ask the question: If there is no BY statement, are there any differences? Some books discuss this, but not in enough detail. We need a comprehensive comparison between the two statements. Several papers in this book discuss this topic.

The end of a data set is fundamental in the SAS language. Almost every programmer uses the END= option. In paper [IV], I discuss when we can use this option and how SAS handles it.

Relationships among the options KEEP (DROP), RENAME, and WHERE are basic. In paper [III], I discuss relationships among statements and options. You may not care about the relationship between the OBS= and WHERE= options because you never use them together. However, it is quite possible that you have used the RENAME= and WHERE= options together. The SAS online documentation discusses relationships among the KEEP=, DROP=, and RENAME= options, but not their relationship with the WHERE= option. So what is wrong with the following program?

DATA s;
   a=3;
DATA t;
   SET s(WHERE=(a>2) RENAME=(a=b));
RUN;

3    DATA t;
4    SET s(WHERE=(a>2) RENAME=(a=b));
ERROR: Variable a is not on file WORK.S.
5    RUN;
NOTE: The SAS System stopped processing this step because of errors.
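
One possible reading of the log above is that, by the time the WHERE= condition is evaluated, the variable is already known by its new name b. The following is a minimal sketch under that assumption, not code taken from the book: it keeps the same data set options and only changes the name used in the WHERE= condition. The PROC PRINT step is added here just to show the result.

```sas
/* Sketch only, not from the book.                                   */
/* Assumption: RENAME= takes effect before WHERE= is evaluated, so   */
/* the WHERE= condition refers to the new name b instead of a.       */
DATA s;
   a=3;
RUN;

DATA t;
   SET s(RENAME=(a=b) WHERE=(b>2));
RUN;

PROC PRINT DATA=t;
RUN;
```

If the assumption holds, data set t should contain the single observation with the variable named b. The full discussion of how these options interact is in paper [III].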

Some topics are just for fun, for example, the positive prefix operator. Do you know what the printout of the following program is?

DATA s;
   a=3; OUTPUT;
   a=-3; OUTPUT;
DATA t;
   SET s;
   WHERE +a>0;
PROC PRINT;
RUN;

This is a very simple, trivial program, but I believe that no one has ever tried it. If you are interested or curious, you can find the answer yourself (it is very simple), or you can read this book. This problem is discussed in [V]. Of course, this is not the main topic of that paper. Actually, [V] is my favorite paper. In that paper, I discuss operators in a DATA step (not in a WHERE statement or a WHERE option), in a WHERE statement, in PROC SQL, and in the macro facility, and I make a comprehensive comparison. These are fundamental parts of the SAS language, and the discussion can serve as a supplement to the SAS manual. It touches a basic question: When we learn, use, analyze, and compare operators, what factors do we need to consider? I think this is not very clear to many SAS programmers or in many books. The following program creates three data sets s, t, and p. Do you know the difference between the data set t and the data set p?

DATA s;
   INPUT a b c;
   CARDS;
-3 5 2
2 5 2
;
DATA t;
   SET s;
   IF a=-3 MAX b MIN c;
PROC PRINT;
DATA p;
   SET s;
   WHERE a=-3 MAX b MIN c;
PROC PRINT;
RUN;

Let’s see another example. Do you know what the printouts of the following program are?

%MACRO ss(company);
   %IF &company EQ GE %THEN %PUT The company is GE;
%MEND;
%ss(GE)

%MACRO tt(state);
   %IF &state EQ OR %THEN %PUT The state is Oregon;
%MEND;
%tt(OR)

We can see that the first macro is OK, but the second one is not. This problem has been discussed in many papers and books. There are two questions: Why does this happen and how could we avoid the problem? Many books give the answer to the second question: Use quoting functions. In [V], I give my explanation for the first question.
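
As a hedged illustration of the "use quoting functions" answer mentioned above, the sketch below masks the parameter value and the constant with the standard macro quoting functions %BQUOTE and %STR, so that OR is compared as text. The macro name tt2 is hypothetical and chosen here only to avoid replacing the original tt; this is one common pattern, not necessarily the one the book recommends.

```sas
/* Sketch only, not from the book. tt2 is a hypothetical name.        */
/* %BQUOTE masks the resolved value of &state at execution time;      */
/* %STR masks the constant OR when the macro is compiled.             */
%MACRO tt2(state);
   %IF %BQUOTE(&state) EQ %STR(OR) %THEN %PUT The state is Oregon;
%MEND;

%tt2(OR)
```

The point of the design is that both operands of EQ are masked, so the macro processor no longer has a bare OR to treat as a logical operator. Why the unquoted version fails in the first place is the question taken up in [V].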

Every programmer knows about variable lists and uses them to some extent. Now I ask the following question: Precisely which statement can use which variable list? I think many people may not be clear about this. This is the purpose of paper [VI]. Many people know statistical functions. You can guess, without reading any book, that the function MEAN() returns the mean value of its arguments. But do you know how to use variable lists in statistical functions? For example, in the following, which statements are OK, and which have problems? Assume that all variables and arrays are defined.

A=MEAN(OF x1-x6 p, OF q y1-y5, OF z1 z2,y1+y2);
A=MEAN(OF x1-x6 p q y1-y5 z1 z2);
A=MEAN(OF x1-x6 y1+y2);
A=MEAN(x1 x2 x3);
A=MEAN(OF x1-x5, y2 y3);
A=MEAN(OF x:);
A=MEAN(OF z(1) - z(3));
A=MEAN(OF z(*));

In [XIII], I discuss relationships between formats and the output of statistical procedures. Formats affect not only the appearance of printouts but also the results. For example, the following program creates two printouts. Do you know if they are different?

DATA s;
   a=1.34; b=5; OUTPUT;
   a=1.28; b=4; OUTPUT;
   a=2; b=5; OUTPUT;
   a=2; b=4; OUTPUT;
   FORMAT a 5.1;
PROC MEANS;
   CLASS a;
PROC SQL;
   SELECT a, MEAN(b) FROM s GROUP BY a;
QUIT;

A variable has two kinds of values: original (unformatted) values and formatted values. Every SAS programmer knows that if there is a BY statement, the data set should be sorted, but should the data set be sorted by original values or by formatted values? The following program is supposed to produce two printouts. Is there any problem with the program? If there is no problem, what printouts do you expect? This problem is discussed in [XIV].

PROC FORMAT;
   VALUE aa 1,3,5=odd 2,4=even;
DATA s;
   INPUT a b @@;
   FORMAT a aa.;
   CARDS;
1 1 1 1 3 3 3 3 2 2 2 2 4 4 4 4
;
PROC MEANS;
   BY a; VAR b;
PROC SORT;
   BY a;
PROC MEANS;
   BY a; VAR b;
RUN;

PROC SQL is an important part of SAS. Many people have done some work on this topic and advocate using PROC SQL. However, most SAS programmers still prefer the DATA step because they are used to it. In this book, I discuss PROC SQL in much detail. Moreover, I present two complete programs using PROC SQL to create two tables that are used in pharmaceutical companies, so you can compare these programs with those using DATA steps. Will you turn to PROC SQL after reading this book? Well, it is possible.

Is .1+.2 equal to .3? If you ask a math professor, he will say ‘Yes’. But if you ask Dr. SAS (Dr. C, Dr. JAVA, or Dr. AWK), she will say ‘No’. Moreover, she will tell you that the difference is 2**(-54) or approximately 5.55E-17. In [VIII] I discuss how SAS stores numeric values and does addition so that you can understand why Dr. SAS says No and how SAS’s behavior affects your programming.
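
The question can be tried directly in a short DATA step. The following is a minimal sketch rather than code from the book: the variable names s, d, and p are arbitrary, and the values written to the log depend on the platform (the results in this book are from Windows).

```sas
/* Illustrative sketch, not from the book. Variable names are arbitrary. */
DATA _NULL_;
   s = 0.1 + 0.2;        /* the sum as SAS stores it                      */
   d = s - 0.3;          /* difference from the stored value of 0.3       */
   p = 2**(-54);         /* the difference quoted in the preface          */
   IF s = 0.3 THEN PUT 'Dr. SAS says Yes';
   ELSE PUT 'Dr. SAS says No';
   PUT s= HEX16.;        /* HEX16. writes the stored floating-point bytes */
   PUT d= E12. p= E12.;  /* both in scientific notation, for comparison   */
RUN;
```

Writing s with the HEX16. format is a convenient way to look at the stored value itself instead of a rounded decimal rendering of it. How the stored values arise, and how to avoid the resulting problems, is the subject of [VIII].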

Some parts of the book are state of the art, most advanced, such as Excel and RTF. I am sure that many SAS programmers will encounter them. This book, I think, could be a good reference.

There are many ways to move data between SAS and Excel files. In paper [XV], I not only collect most of the methods that currently exist but also show how to categorize and compare them. You can then choose the one that best fits your needs. For example, many people know that some methods require the installation of SAS/ACCESS, and some methods do not. Do you know the answers to the following questions? What methods need SAS/ACCESS, and what methods do not? What is the difference between these two kinds of methods? What is the advantage of installing SAS/ACCESS? If there is no difference between them, then why would people waste money purchasing SAS/ACCESS software? (SAS/ACCESS has other functionalities, but here we concentrate on Excel.)

I provide my programs for CHKLOG and Page x of y. My criteria for good programs are that they are easy to use, easy to understand, and easy to modify. I hope my programs meet these criteria.

Finally, I have to emphasize that these results were obtained on a Windows PC using SAS v9. Because settings differ, some conclusions may not apply to your situation. In many cases I indicate the differences for v8, because so many SAS programmers are still using SAS v8.

No doubt, some conclusions may be inaccurate. The author welcomes readers’ criticisms, comments, and suggestions.

I want to thank SAS Institute. Whenever I have questions, I can always get answers from them. I want to thank all the colleagues I have worked with, past and present. Our discussions greatly inspire my thinking and my research.

I want to express my sincere thanks to Mrs. Jenny Brain for her help.

It is my hope that everyone who sees my “daughter” will say, “Your daughter is so charming. I like her.” In other words, “I enjoy reading this book. I have learned something from it.” Then, I will be happy.

Contents

I. View a SAS Data Set     1

SAS Viewtable window     1
SAS System viewer window     8

Table 1. Techniques for viewing a data set in SAS     8
Table 2. Comparison between the SAS Viewtable and the SAS System Viewer     10

II. Names in the SAS Language     11

Introduction     11
Data set names     11
Variable names      14
Array names     24
Format names and informat names     24
Names in PROC SQL     25
PROC REPORT: variable names or statistics?     28
Names in Macro     31

Table 1. Options in a BY statement     16
Table 2. Names forbidden or not good     33

III. Statements, Options, and their Order     35

Introduction     35
Definition statements     37
BY statement     41
WHERE statement     47
Effectiveness of declarative statements     53
Order of DROP, KEEP, and RENAME statements and options     56
RENAME statement and RENAME option     59
WHERE= option     62
Order of statements and options     65
Order in PROC SQL     68
Mystery of PROC SORT     75
WHERE statement, BY statement and LENGTH statement     76

Table 1. Comparison between BY statement and WHERE statement     48
Table 2. Effectiveness of attributes and declarative statements     53

Figure 1. Pseudo flow of options and statements in DATA step     67
Figure 2. Pseudo flow of options and statements in some PROC     68
Figure 3. Pseudo flow in PROC SQL     72

IV. Options NOBS=, END=, and EOF=     79

Introduction     79
NOBS= option     79
END= option     80
EOF= option     84

Table 1. Options to indicate the end of a data set     79

V. Operators in the SAS language     89

Introduction     89
DATA step operators     91
Operators in WHERE statement     96
Operators in PROC SQL     106
Operators in Macro     110
Operators in %EVAL     111
Operators in %SYSEVALF     117
Positive prefix +     118

Table 1. Operators in a DATA step    91
Table 2. Operators in a WHERE statement     96
Table 3. Rules to convert character values to Boolean    99
Table 4. Operators in PROC SQL    106
Table 5. Operators in function %EVAL    111

Figure 1. Conversion rules in a non-WHERE statement and a WHERE statement     99

VI. Variable List     123

Introduction     123
Definition     123
Dash list (Numbered range list) – Colon list (Name prefix list) – Double dash list (Name range list) – Type double dash list – Special SAS name list – Array list
Usage     125
KEEP, DROP, and RENAME statements – FORMAT, INFORMAT, ARRAY, and RETAIN statements – PUT statement – INPUT statement – LENGTH statement – BY statement (in DATA step or in PROC) and VAR statement (in PROC) – Arguments in statistical functions
Summary     132
Missing elements     133

Table 1. Usage of variable lists     132
Table 2. Missing variables in variable lists     135

VII. Comparison between a WHERE Statement and a Subsetting IF statement     137

Table 1. Comparison between a subsetting IF statement and a WHERE statement     137
Table 2. Relationship between subsetting and sorting     141

VIII. Is .1+.2 Equal to .3?      145

Introduction     145
How does SAS store numeric values?      147
How does SAS do addition?      150
How can we avoid problems?      152
Magnitude and precision     153
Length attribute     154
Other software     157

Table 1. Conversion between binary values and hex values     146

IX. Binary Expression of Values and Related Formats and Informats     161

Introduction     161
BINARYw. format     161
BINARYw.d informat      162
HEXw. format     164
HEXw. informat     164
Hexadecimal expression     165
Characters and their hex values     166
Converting characters to binary values or hex values     171
Informat $BINARYw. and informat $HEXw.      172

Table 1. Characters and their hex values (SAS monospace font)      166
Table 2. Characters and their hex values (Courier New font)      167
Table 3. Some informats and formats and their width     173

X. Comparison among DATA Step, PROC SQL, and PROC REPORT     175

Introduction     175
Create an empty data set     175
Change order of variables     176
Create new variables     177
Remove duplicate rows     177
Sort tables     178
Combine with summary results     178
PROC REPORT and PCTSUM     179
Macro variable assignment     181
Examples of PROC SQL programming     183

XI. Table Operations in PROC SQL   191

Introduction   191
Match SELECT   191
Join operations     193
Union operations     196
UNION – OUTER UNION – EXCEPT – INTERSECT
INSERT INTO operation      200
Comparisons among table operations     200
Subquery     202
Examples of subqueries     206

Table 1. Comparison of options in union operations     198
Table 2. Comparisons among table operations     200

XII. Summary Report or Detail Report?  – PROC REPORT and PROC SQL     209

Introduction     209
PROC REPORT     209
Summary functions in PROC SQL     212
Summary report in PROC SQL     219

Table 1. Statistics used in PROC REPORT     209 
Table 2. Statistical functions used in a DATA step     212
Table 3. Operands of MAX and MIN and arguments of some functions     215
Table 4. Summary functions used in PROC SQL     217

XIII. Statistical Procedures and Formats     221

Introduction     221
General discussion     221
PROC FREQ     224
PROC MEANS     225
PROC TABULATE     226
PROC SQL     229
PROC REPORT     230
PCTN in PROC REPORT     232

Table 1. Specify class variables and analysis variables     222
Table 2. Non-classic formats in descriptive procedures     223
Table 3. Relationships between format and statistical values     223

XIV. Formats and BY statement     235

Introduction     235
Formats and BY statement     235
Types of formats     238

XV. Data Sets and Excel Files     243

Introduction     243
Compare an Excel file with a SAS data set     243
Import data     246
Import Wizard for an Excel file (Category 1)      247
DIMPORT command, libref and pass-through facility     254
Import wizard from a plain data file (category 2)      256
DATA step (category 3)      259
Comparison between the methods in Category 1 (using SAS/ACCESS) and those in Categories 2 and 3 (not using SAS/ACCESS) when importing data     263
Summary     266
Output to Excel file     267
Limitations     267
Export Wizard and PROC EXPORT (Category 1)      268
Export Wizard and PROC EXPORT without SAS/ACCESS (Category 2)      270
DATA step (Category 3)      271
ODS tool (Category 4)      273
Other methods (Category 5)      275
Compare the methods in Category 1 (using SAS/ACCESS) and those in Categories 2, 3, 4, and 5 (not using SAS/ACCESS) when exporting data     276
Summary     278

Table 1. Comparison of Excel and SAS data set on limitations     244
Table 2. Options in Import Wizard and statements in PROC IMPORT (for xls files)     251
Table 3. Options in libref and pass-through facility     256
Table 4. Options in Import Wizard and statements in PROC IMPORT (for csv files)     258
Table 5. Comparison of ways to import data from an Excel file     267
Table 6. Comparison of ways to export data to an Excel file     278

XVI. SAS and File Editing     281

Introduction     281
CHKLOG program     281
Other programs     284
Wrapped text     286
More than one word     287
Text edit     288
Text count     289
Page x of y     289
Search in RTF files     292
Page number in RTF     293

XVII. RTF Output and Special Characters     299

Introduction     299
How many ways are there to insert a dagger into an ODS RTF output?      299
Use hex values for special characters     301
Symbol characters     303
How does RTF work?      304
Write a macro (FILE + PUT) to create RTF files     305
Unicode characters     308

Table 1. Comparison of ways of inserting special characters into ODS RTF     302

SAS and I

The reason I began writing this book is very simple: I lost my job. As the old saying goes, “Read when bored.” In my case, it could be described as, “Write when jobless.”

There is a world of difference between the United States and China regarding what is expected of a college graduate. In China, if a graduate from Peking University—a prestigious institution of higher learning—earns his living as a butcher, many people would cry foul. They would view it as a tragic waste of talent, blaming the situation on social inequities or systemic flaws. It is quite different in the United States. No one cares when a Ph.D. loses their job. The general consensus is that you should do whatever it takes to earn your own bread. People might feel a pang of sympathy for you because so many years of education seem to go to waste, but no one questions the fairness of society. If you cannot find a job in this land of opportunity, you have only yourself to blame.

Losing a job is no big deal in America. There are no “iron rice bowls” here—no guarantees of lifetime employment. If you lose a job here, you simply find another one. Somehow, though, this formula failed me. To me, finding another job felt as impossible as riding a bicycle to the moon. Having been a software developer for many years—and a pretty good one by my own standards—I was dismayed to find myself so powerless and helpless in the job market. The bearish U.S. economy, combined with heavy overseas outsourcing, had packed the waiting rooms of employment agencies. Nowadays, employers have the luxury of demanding candidates who are jacks-of-all-trades, mastering ten different skills at once. No agency will even look at you if your resume shows you’ve been unemployed for more than a month. Their rationale is brutally simple: if you couldn’t find a job within a month, it implies your skills are inadequate or that you’ve already been rejected by everyone else. Who wants to run a charity? Looking at my resume, I saw no leverage. A Ph.D. degree felt like the last, desperate anchor for my self-esteem. Taking that resume into today’s job market was like sending a delicate, frail girl into a boxing ring. Facing so many heavyweight competitors, how could I possibly win?

Some of my friends wanted to help but didn’t know how. Others tried, but soon grew frustrated by my perceived clumsiness: how hard could it be to tweak a resume, after all? The rest of my acquaintances simply looked away, because recommending me might cause embarrassment with their own bosses. How could anyone pitch such a weak candidate? I had truly become a nuisance, a “chicken rib”: of little use, yet a pity to throw away.

Looking back, my career has been an endless dance with programming languages ever since undergraduate school, starting with ALGOL, then moving on to BASIC, COBOL, FORTRAN, C, C++, UNIX, SQL, and JAVA. They were like a succession of beautiful girls, each younger and prettier than the last. Generally speaking, I got along with them quite well, though no real sparks ever flew between us. On the job, my motto was always, “You give me the requirements, and I will give you the code.” This meant that no matter what the demand was, I could always build it. But now that I am jobless, everything has changed. It feels as though my boat has sunk, and not a single one of those beauties has bothered to offer me a hand.

I saw dead ends in every direction. But just as the old Chinese proverb says, “Heaven never cuts off all paths.” A door opened. Dr. Chen-Ning Yang once said that Miss Wong was a gift from God; for me, SAS was my gift from Heaven.

The first time I met her (SAS) was several years ago, when I was a TA for a statistics class. I taught a little of the DATA step and PROC FREQ from a thin textbook titled Introductory SAS. Back then, SAS seemed like a naive, innocent little girl. Meeting her again after several years, I found she had blossomed into a stunning young lady. As a matter of fact, SAS was born of renowned lineage. Her father is Statistics and her mother is Software, hence her name: SAS (Statistics and Software). She has an older sister named SPSS, but SAS has become much more popular. Yet, looking at her journey, one might still say: a beautiful maiden fully grown, with many suitors yet lacking true devotion.

In my life, I have suffered two episodes of “lost loves” (layoffs) just a few months apart. The two experiences left me with completely different feelings. The first time filled me with depression, anxiety, and pessimism. I wondered if I was destined to remain “loveless” (jobless) for the rest of my days. The second time, however, I was full of confidence because I had met Miss SAS, and I knew I could win her heart.

I made my move immediately after being laid off. Even though well-meaning matchmakers tried to steer me elsewhere, my mind was made up. Some friends questioned my choice: “Why not stick with mainstream IT? It’s a wealthy, prominent family, and you already have a good relationship with them.” In truth, I felt like a fallen prince in an old novel who sells himself into slavery for the sole purpose of wooing the master’s beautiful daughter. Outsiders couldn’t understand why I was willing to sink into such lowliness, but to me, it was a godsend. I was filled with joy.

For SAS and me, it was love at first sight, and an engagement by our second encounter. SAS gave me a chance, a lifeline; to me, she has been a savior. In return, I cherish and adore her. I regard her as my soulmate, inseparable from then on. Every night brought endless longing; every day was filled with the sweet murmurings of lovers. Day after day, I would rise before dawn and sit before the computer monitor, my fingers flying across the keyboard to express my devotion. I skipped showers, forgot breakfast. I took short naps when exhausted and wolfed down simple comfort food when hungry. Leveraging my background as a CRO (Chief Research Officer), I diagnosed Miss SAS’s frequent ailments, rare disorders, common complaints, and hard-to-resolve symptoms. In return, Miss SAS loved to curl up in my arms, chatting endlessly about the quirks of her family. Of course, sometimes she could be a mischievous little sprite, playing hide-and-seek with me. I would clearly see a dataset sitting right there, but the moment I blinked, it vanished.

Within her family, my absolute favorites are REPORT and ODS—the “Golden Boy and Jade Girl” of the clan. REPORT is slender, quiet, and always immaculately dressed, while ODS is warm, open, sexy, and captivating. There is a saying in her family: When the Golden Boy and Jade Girl join hands, they create a beauty beyond compare.

Recalling my days with SAS, it truly felt like a secret, illicit affair. I was obsessed every day, yet I did not dare tell a soul that I was writing a book. If I had confessed to anyone, they would have written me off as a lunatic, and the diagnosis would have been ready-made: clinical depression brought on by unemployment. During those times, I deeply missed my former mentor and the co-author of my previous book Linear Programming, Mr. Jianzhong Zhang. Only he might have understood what I was doing—though even he would likely have been skeptical.

Fortunately, this faithful romance did not go unrewarded. This little book is the fruit of our love, born of true devotion. More than three years have passed since my second encounter with Miss SAS.

Finally, I would like to extend my gratitude to many friends who helped along the way, especially Dr. Jianming Miao and Dr. Qiuhu Shi. Dr. Miao provided me with childhood photos of Miss SAS—early SAS menus—which rekindled my old passion. Dr. Shi, an expert in biostatistics, generously provided the means for Miss SAS and me to meet regularly; it was at his company that I began writing these SAS programs. It is fair to say that without their warm support, my beloved “daughter” would not have come into the world so smoothly. Please accept my deepest bow of gratitude.

I also wish to thank my fellow SAS professionals whose work has deepened my understanding of Miss SAS. I must add that the SAS community is exceptionally warm and welcoming; whenever I approached family members with questions about Miss SAS, they always provided swift and helpful answers.

My final words are reserved for my child.

My child, do you know how fervently I anticipated your birth? For you, I forgot the taste of food and lost the ability to sleep soundly. Every day, every hour, I watched you grow. Every minute, every second, I counted down to your arrival.

At the risk of making you laugh, I must confess that I have never been so afraid of dying as I was before you were born. My untimely end would have meant you would remain forever unborn, trapped in the womb. Long before your arrival, I had already prepared your room in this world. So many times, I smoothed your bed, tucked in your quilt, and imagined the moment you would finally be here.

Now, you have entered the world. Are you pleased with what you see?

Good luck, my child.
