Debugging in SAS combines careful log analysis with system options and tools that trace program execution and macro logic. The most fundamental rule is to check the SAS log for NOTE, WARNING, and ERROR messages, starting from the top.
General debugging
Examine the SAS log
The log is your most important debugging tool, providing information on syntax issues, data processing, and execution errors.
Notes: These are informational messages that are often not critical but should be reviewed. For example, a note may tell you that a variable was converted from character to numeric.
Warnings: These alert you to potential problems that didn't stop the program but may have produced an unintended result. A common example is a misspelled keyword that SAS corrects by guessing the word you meant.
Errors: These are critical issues that cause SAS to stop processing the current step. An error message is usually the most explicit and indicates that the program has failed.
First error: Always focus on the first error listed in the log. An error early in the program can trigger a cascade of subsequent messages that disappear once the root cause is fixed.
Validate your data
Logic errors can produce incorrect output even if the program runs without errors.
Use PROC FREQ for character variables and PROC MEANS for numeric variables to get descriptive statistics. This can help identify issues like unexpected values or missing data that are causing logical flaws.
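A minimal sketch of such a check might look like the following. The dataset WORK.SURVEY and its variables REGION, AGE, and INCOME are hypothetical names used only for illustration.

```sas
/* Hypothetical dataset WORK.SURVEY with character variable REGION
   and numeric variables AGE and INCOME */
proc freq data=work.survey;
   tables region / missing;   /* MISSING counts missing values as a level */
run;

proc means data=work.survey n nmiss min max mean;
   var age income;
run;
```

Unexpected levels in the PROC FREQ table, or a minimum, maximum, or NMISS count that looks implausible in the PROC MEANS output, are usually the first hint of a data problem.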
Macro debugging
The following system options are crucial for debugging macro code by writing additional information to the SAS log:
OPTIONS MPRINT: Writes each SAS statement generated by a macro to the log. Use this to confirm that the macro is generating the SAS code you expect.
OPTIONS MLOGIC: Traces the flow of macro execution. It shows when a macro is invoked, what parameters are passed, and how %IF/%THEN statements are evaluated.
OPTIONS SYMBOLGEN: Displays the value of macro variables as they are resolved. This is essential for confirming that macro variables contain the correct data.
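As an illustration, the three options can be switched on around a single macro call and switched off again afterward. The macro PRINT_HEAD below is a made-up example, not part of any SAS library.

```sas
options mprint mlogic symbolgen;

/* Hypothetical macro: prints the first &n rows of a dataset */
%macro print_head(dsn, n=5);
   %if &n > 0 %then %do;
      proc print data=&dsn (obs=&n);
      run;
   %end;
%mend print_head;

%print_head(sashelp.class, n=3)

/* Turn the options off again to keep later logs short */
options nomprint nomlogic nosymbolgen;
```

With the options on, the log should show the parameter values and the %IF evaluation (MLOGIC), each resolution of &dsn and &n (SYMBOLGEN), and the generated PROC PRINT step (MPRINT).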
Using the %PUT statement
The %PUT statement writes text, including the resolved values of macro variables, directly to the SAS log. It is a simple but powerful way to test specific parts of your macro.
%PUT &macro_variable; displays the value of a macro variable.
%PUT ***&macro_variable***; displays the value with asterisks to reveal any leading or trailing spaces.
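One common source of stray blanks is CALL SYMPUT, which keeps the trailing blanks of a character variable. The sketch below uses hypothetical macro variable names PADDED and TRIMMED to show how the asterisks make the difference visible.

```sas
data _null_;
   length name $10;
   name = 'abc';
   call symput('padded', name);    /* keeps the trailing blanks of NAME */
   call symputx('trimmed', name);  /* strips leading and trailing blanks */
run;

%put ***&padded***;
%put ***&trimmed***;
```

In the log, the first line should show blanks between the value and the closing asterisks, while the second should not.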
External file output
For complex macros that generate a lot of code, you can write the MPRINT output to an external file for easier review.
filename mprint 'c:\temp\macro_debug.sas';
options mprint mfile;
%your_macro(param1, param2);
options nomprint nomfile;
filename mprint clear;
This writes the generated SAS statements to the specified file. You can then edit and run that file separately to debug the base SAS code.
DATA step debugger
For logic errors within a DATA step, the interactive DATA step debugger allows you to step through execution, pause at breakpoints, and examine variable values.
data mydata / debug;
set input_data;
/* ... your DATA step code ... */
run;
When this code runs in the interactive SAS windowing environment, it launches the debugger. From the debugger command line, you can use commands like EXAMINE, STEP, and BREAK to control execution.
Specific use cases
Debugging PROC HTTP: When troubleshooting PROC HTTP requests, add the DEBUG statement with the LEVEL=1 option to get detailed information about the request and response in your log.
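A minimal sketch of a PROC HTTP call with debugging turned on is shown below. The URL is a placeholder and the fileref RESP is an arbitrary name. The DEBUG statement is assumed to be available in your release, since it was added in a later SAS 9.4 maintenance release, so check the documentation for your version.

```sas
filename resp temp;   /* temporary file to hold the response body */

proc http
   url="https://httpbin.org/get"   /* placeholder URL */
   method="GET"
   out=resp;
   debug level=1;                  /* write request and response details to the log */
run;
```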
Debugging stored processes: When working with stored processes, retrieving the SAS log is still the primary method for debugging errors.
Writing SAS Programs That Work
It's not always easy to write a program that works the first time you run it. Even experienced SAS programmers will tell you it's a delightful surprise when their programs run on the first try. The longer and more complicated the program, the more likely it is to have syntax or logic errors. But don't despair: there are a few guidelines you can follow that can make your programs run correctly sooner and help you discover errors more easily.
Make programs easy to read
One simple thing you can do is develop the habit of writing programs in a neat and consistent manner. Programs that are easy to read are easier to debug and will save you time in the long run. The following are suggestions on how to write your programs:
- Put only one SAS statement on a line. SAS allows you to put as many statements on a line as you wish, which may save some space in your program, but the saved space is rarely worth the sacrifice in readability.
- Use indentation to show the different parts of the program. Indent all statements within the DATA and PROC steps. This way you can tell at a glance how many DATA and PROC steps there are in a program and which statement belongs to which step. It's also helpful to further indent any statements between a DO statement and its END statement.
- Use comment statements generously to document your programs. This takes some discipline but is important, especially if anyone else is likely to read or use your program. Everyone has a different programming style, and it is often impossible to figure out what someone else's program is doing and why. Comment statements take the mystery out of the program.
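The suggestions above can be sketched in a short program. The file scores.dat and the variable names are made up for illustration.

```sas
/* Read the hypothetical raw file scores.dat and flag passing scores */
data scores;
   infile 'scores.dat';
   input name $ score;
   if score >= 60 then do;
      passed = 1;
   end;
   else do;
      passed = 0;
   end;
run;

/* Check the result before moving on */
proc print data=scores;
run;
```

Each statement sits on its own line, the statements inside each step are indented, and the statements between DO and END are indented one level further.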
Test each part of the program
You can increase your programming efficiency tremendously by making sure each part of your program is working before moving on to write the next part. If you were building a house, you would make sure the foundation was level and square before putting up the walls. You would test the plumbing before finishing the bathroom. You are required to have each stage of the house inspected before moving on to the next. The same should be done for your SAS program. But you don't have to wait for the inspector to come out; you can do it yourself.
If you are reading data from a file, use PROC PRINT to check the SAS dataset at least once to make sure it is correct before moving on. Sometimes, even though there are no errors or even suspicious notes in your SAS log, the SAS dataset is not correct. This could happen because SAS did not read the data the way you imagined (after all it does what you say, not what you're thinking) or because the data had some peculiarities you did not realize. For example, a researcher who received two data files from Taiwan wanted to merge them together by date. She could not figure out why they refused to merge correctly until she printed both datasets and realized one of the files used Taiwanese dates, which are offset by 11 years.
It's a good habit to look at all the SAS datasets you create in a program at least once to make sure they are correct. As with reading raw data files, sometimes merging and setting datasets can produce the wrong result even though there were no error messages. So when in doubt, use PROC PRINT.
Test programs with small datasets
Sometimes it is not practical to test your program with your entire dataset. If your data files are very large, you may not want to print all the data and it may take a long time for your programs to run. In these cases, you can test your program with a subset of your data.
If you are reading data from a file, you can use the OBS= option in the INFILE statement to tell SAS to stop reading when it gets to that line in the file. This way you can read only the first few lines of data or however many it takes to get a good representation of your data. The following statement will read only the first 100 lines of the raw data file mydata.dat:
INFILE 'mydata.dat' OBS = 100;
You can also use the FIRSTOBS= option to start reading from the middle of the data file. So, if the first 100 data lines are not a good representation of your data but 101 through 200 are, you can use the following statement to read just those lines:
INFILE 'mydata.dat' FIRSTOBS=101 OBS=200;
Here FIRSTOBS= and OBS= relate to the records of raw data in the file. These do not necessarily correspond to the observations in the SAS dataset created. If, for example, you are reading two records for each observation, then you would need to read 200 records to get 100 observations.
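As a sketch of the two-records-per-observation case, assume a hypothetical layout in which the first record holds ID and NAME and the second holds AGE and WEIGHT:

```sas
/* Hypothetical file mydata.dat: each observation spans two records */
data test_sample;
   infile 'mydata.dat' obs=200;   /* stop after record 200 */
   input id name $                /* first record */
       / age weight;              /* the slash moves to the next record */
run;
```

Under that assumption, reading 200 records produces 100 observations in TEST_SAMPLE.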
If you are reading a SAS dataset instead of a raw data file, you can use the OBS= and FIRSTOBS= dataset options in the SET, MERGE, or UPDATE statements. These options control which observations are processed in the DATA step. For example, the following DATA step will read the first 50 observations in the CATS dataset. Note that when reading SAS datasets, OBS= and FIRSTOBS= truly do correspond to the observations and not to data lines:
DATA sample_of_cats;
SET cats (OBS=50);
RUN;
Test with representative data
Using OBS= and FIRSTOBS= is an easy way to test your programs, but sometimes it is difficult to get a good representation of your data this way. You may need to create a small test dataset by extracting representative parts of the larger dataset. Or you may want to make up representative data for testing purposes. Making up data has the advantage that you can simplify the data and make sure you have every possible combination of values to test.
Sometimes you may want to make up data and write a small program just to test one aspect of your larger program. This can be extremely useful for narrowing down possible sources of error in a large, complicated program.
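A small made-up dataset can be typed directly into the program with DATALINES. The variables and values below are invented for illustration. They cover each combination of two categories and include one missing value.

```sas
/* Made-up data covering each combination of sex and age group,
   plus one missing score */
data test_cases;
   input sex $ age_group $ score;
   datalines;
F young 10
F old 20
M young 30
M old .
;
run;
```

Because the dataset is small and every value is known in advance, it is easy to work out by hand what the larger program should produce from it.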
Fixing Programs That Don't Work
In spite of your best efforts, sometimes programs just don't work. More often than not, programs don't run the first time. Even with simple programs it is easy to forget a semicolon or misspell a keyword--everyone does sometimes. If your program doesn't work, the source of the problem may be obvious like an error message with the offending part of your program underlined, or not so obvious as when you have no errors but still don't have the expected results. Whatever the problem, here are a few guidelines you can follow to help fix your program.
Read the SAS Log
The SAS log has a wealth of information about your program. In addition to listing the program statements, it tells you things like how many lines were read from your raw data file and what were the minimum and maximum line lengths. It
Printing Macro Variables in the Log: %PUT
The SAS log is a text file that provides detailed information about the execution of a SAS program. It contains notes, warnings, errors, and other diagnostic information generated while SAS processes the code. The log helps users identify and troubleshoot issues in their programs, such as syntax errors, data errors, or other unexpected behavior.
The %PUT statement, which is analogous to the DATA step PUT statement, prints the current values of macro variables, along with any other text, to the SAS log. For example:
%PUT Libref: &libref;
%PUT Dataset: &dsn;
%PUT Batch: &n;
Several reserved words are available for you to print out macro variables through %PUT:
- _ALL_: List all macro variables in all referencing environments.
- _AUTOMATIC_: List all automatic macro variables.
- _GLOBAL_: List all global macro variables.
- _LOCAL_: List all macro variables that are accessible only in the current referencing environment.
- _USER_: List all user-defined macro variables, both global and local.
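As a brief illustration, the sketch below defines two user macro variables (the names and values are hypothetical) and then lists variables by category:

```sas
%let libref = work;    /* hypothetical user-defined macro variables */
%let dsn    = sales;

%put _user_;           /* user-defined macro variables only */
%put _automatic_;      /* variables created by the macro processor */
```

Each line of the listing in the log shows the scope of the variable, then its name, then its value.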
Automatic Macro Variables
Some macro variables are created automatically by the macro processor. You can employ these variables in the same manner as any other macro variable. Listed below are some commonly used automatic variables:
- &SYSDATE: Date that the session began executing (DATE7. form).
- &SYSDATE9: Date that the session began executing displayed with a four-digit year (DATE9. form).
- &SYSDAY: Day of the week that the session began executing.
- &SYSTIME: Time of the day that the SAS session began executing.
- &SYSLAST: Name of the last SAS data set created, in the form libref.name.
- &SYSDSN: Name of the last SAS data set created with the library and data set name separated with at least one space.
- &SYSERR: Stores the return codes of PROC and DATA steps.
- &SYSCC: Stores the overall session return code.
- &SYSPARM: Specifies a character string that can be passed into SAS programs. Usually used in the batch environment, this macro variable accesses the same value as is stored in the SYSPARM= system option and can also be retrieved using the SYSPARM() DATA step function.
- &SYSRC: Indicates the last return code from your operating environment.
- &SYSSITE: Contains the current site number.
- &SYSSCP: Gives the name of the host operating environment.
- &SYSUSERID: This macro variable stores the operating system username used for the current login session. If the operating system has not captured the user ID, &SYSUSERID receives "default."
- &SYSMACRONAME: This automatic macro variable provides the name of the macro that is currently running. &SYSMACRONAME is commonly used for documenting the execution flow of your SAS application. Note that if &SYSMACRONAME is used in open code (outside of any macro definition), it has a null value. Also note that if macros are nested, &SYSMACRONAME stores the name of the innermost macro.
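A few of these variables can be written to the log as a quick check. The macro SHOW_NAME below is a hypothetical example used only to demonstrate &SYSMACRONAME.

```sas
/* Session information */
%put Run date: &sysdate9 (&sysday) at &systime;
%put User: &sysuserid on &sysscp;

/* Return code of the most recent step; 0 means the step ran cleanly */
data _null_;
   set sashelp.class;
run;
%put Last step return code: &syserr;

/* Hypothetical macro that reports its own name */
%macro show_name;
   %put Now running: &sysmacroname;
%mend show_name;

%show_name
```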
MLOGIC and MPRINT Options
When modifying or debugging a macro, it helps to print a trace of its execution. The MLOGIC option writes the macro's logical flow to the SAS log, while the MPRINT option writes the SAS statements that the macro generates. For example:
OPTIONS MLOGIC MPRINT;
%interleaving_two_datasets(SASHELP.NVST1, SASHELP.NVST2, Date);
OPTIONS NOMLOGIC NOMPRINT;
Programming Tips
People who have no experience in any programming language often get frustrated when their programs don't work correctly on the first try. Don't try to tackle a long complicated program all at once. By starting small, building upon what works, and consistently checking your results, you can enhance your programming efficiency.
Don't get discouraged when you see errors. Even experienced SAS programmers make simple mistakes: they forget a semicolon, misspell a word, or place statements in the wrong order. These small mistakes can cause a whole list of errors. And even when a program runs without errors, its results may still be incorrect, so it is good practice to test your code with small cases.