Chapter 4: Evaluation Step-By-Step
The Art of Appropriate Evaluation


By now you should have gotten the message that evaluation is an integral part of program implementation and needs to be built in right from the start. Therefore, the primary steps involved in evaluation mirror the steps you follow when you implement a program.

  1. Identify the problem you are trying to solve

  2. Develop reasonable objectives

  3. Develop a plan for measuring results

  4. Gather baseline data

  5. Implement your program

  6. Gather data and analyze results

  7. Report results

The remainder of this section provides a brief overview of what you should keep in mind in each of these steps.

Step 1. Identify Your Problem.

It may sound obvious, but you need to understand the problem you are facing before you can expect to solve it. All too often, decisions are made to implement a program based on a reaction to a single, tragic fatal crash. It is always wise to take the time to understand your problem before you try to solve it. Problem identification serves two important functions.

  • It provides you the information you need to select an appropriate countermeasure and target audience for your program. You will be looking for information on the magnitude of the problem, the underlying causes, and the target groups most affected. This information should enable you to select the most effective countermeasure.

  • It may provide you with some of the baseline data you will need to determine if your program meets its objectives. You may start your problem identification with crash data, but you will also need to collect other types of data in order to understand the problem you have and to select the most effective strategy for dealing with it. This might include baseline observations of safety belt use, measures of enforcement levels, public opinion and awareness surveys, or speed counts. At this stage, it is also helpful to gather any trend data that may have been collected over the prior few years so that you will be able to show a trend before and after your program.

For example, a Safety Team was formed to identify, develop, and implement countermeasures to reduce crashes on the Capital Beltway in suburban Washington, D.C. Everyone knew there was a problem, but there were a lot of opinions about what was causing the problem and how it should be solved. An evaluation was conducted at the beginning of the project, rather than at the end, to identify how, why, and where crashes occurred on the Beltway. This research led to a number of specific actions, including engineering changes, increased enforcement, and speedier incident response times.

During the problem identification step, you also lay the foundation for your data collection efforts throughout the program evaluation. As you collect your baseline data, it is critical that you carefully document the procedures you follow, so that data collected later in the project can be compared with your baseline. In order for the data to be compared, it has to be collected at the same locations and times of day, using the same collection forms, and ideally the same observers. Failure to follow the same data collection procedures can make it difficult to document your accomplishments.
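The requirement that later data match the baseline conditions can be checked mechanically. The following is a minimal Python sketch, assuming each observation session is recorded with a site, a day of the week, and a time window. The function name, field names, and sample sessions are hypothetical and are used only for illustration.

```python
def unmatched_sessions(baseline, follow_up,
                       keys=("site", "day_of_week", "time_of_day")):
    """Return follow-up sessions whose collection conditions have no
    counterpart in the baseline data."""
    baseline_conditions = {tuple(s[k] for k in keys) for s in baseline}
    return [s for s in follow_up
            if tuple(s[k] for k in keys) not in baseline_conditions]


# Hypothetical observation sessions, for illustration only.
baseline = [
    {"site": "Main St & 1st Ave", "day_of_week": "Tue", "time_of_day": "07:00-09:00"},
    {"site": "Route 9 at Mile 12", "day_of_week": "Sat", "time_of_day": "16:00-18:00"},
]
follow_up = [
    {"site": "Main St & 1st Ave", "day_of_week": "Tue", "time_of_day": "07:00-09:00"},
    {"site": "Route 9 at Mile 12", "day_of_week": "Sat", "time_of_day": "10:00-12:00"},
]

# Any session printed here was collected under conditions that differ
# from the baseline and may not be directly comparable.
for session in unmatched_sessions(baseline, follow_up):
    print("No matching baseline session for:", session)
```

A check like this only flags differences in the recorded conditions. It cannot tell you whether the same forms or observers were used unless those are recorded as well.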

Step 2. Develop Valid Objectives for Your Traffic Safety Program.


Once you have identified your problem and selected your strategy for addressing it, you need to define what you expect to accomplish. Many would argue that this is the most critical step in the evaluation process because it determines what success will be and how it will be measured.

Volumes have been written on how to write program objectives, each with its own set of do’s and don’ts. These rules are all similar and it is not important which set you follow. The one advantage to the list shown below is that it is easy to remember.

Program objectives should be SMART (Specific, Measurable, Action-oriented, Reasonable, and Time-specific). Let us elaborate.

Objectives should be SPECIFIC: Avoid using generalities like “improving traffic safety” or “increasing awareness.” If you identify exactly what you want to happen, then you can document your success. Sometimes you can be specific about the amount of change you anticipate, expressed either in absolute (increase safety belt use to 75 percent) or relative (increase citations by 15 percent over the baseline) terms. At other times, you can simply observe and record the change in behavior.
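To make the difference between an absolute and a relative target concrete, here is a minimal Python sketch. The function names and the sample figures (77 percent observed belt use, 400 baseline citations, 470 citations during the program) are hypothetical and do not come from any actual program.

```python
def met_absolute_target(observed_rate, target_rate):
    """Absolute objective: the observed rate must reach a fixed level."""
    return observed_rate >= target_rate


def met_relative_target(baseline_count, observed_count, required_increase):
    """Relative objective: the observed count must exceed the baseline
    by at least the required fraction (0.15 means 15 percent)."""
    change = (observed_count - baseline_count) / baseline_count
    return change >= required_increase


# Hypothetical figures, for illustration only.
belt_use_after = 0.77     # observed safety belt use after the program
citations_before = 400    # baseline citations
citations_after = 470     # citations during the program period

print(met_absolute_target(belt_use_after, 0.75))
print(met_relative_target(citations_before, citations_after, 0.15))
```

Note that the relative target cannot be checked at all without the baseline count, which is one more reason to gather baseline data before the program starts.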

Objectives should be MEASURABLE: For an objective to be measurable, there must be something you can quantify, like DWI citations, and you must be able to detect a change over time. To the extent possible you should also be able to isolate the targets of the countermeasure. For example, you want to increase by 10 percent the number of DWI citations issued to young drivers.

Objectives should be ACTION-ORIENTED: Action is good. You usually can see an action and count the number of times it happens. It is much easier to document that safety belt laws were enforced, by counting the number of traffic stops and citations, than it is to document if public support for belt law enforcement increased.

Objectives should be REASONABLE: A small community implemented a public information campaign on the value of traffic safety enforcement. The published objective of this public service campaign was to reduce traffic deaths community-wide. While this would be a desirable outcome, it is not reasonable to expect that an advertising campaign alone would change behaviors and ultimately reduce traffic crashes, at least not within the time frame of the study. This community should take another look at the problem it is trying to solve, select a countermeasure that will address that problem, and then establish a reasonable target for success.

Objectives should be TIME-SPECIFIC: Projects don’t last forever and objectives should have deadlines. Deadlines make it clear to everyone when results can be expected. They also keep people focused on what needs to be accomplished by when.

SAMPLE OBJECTIVES

NOT SO SMART

  • To encourage increased safety belt enforcement
  • To reduce underage drinking
  • To work with the legislature to advocate tougher impaired driving laws

S.M.A.R.T.

  • To increase safety belt citations by 15 percent in 6 months
  • To reduce the number of liquor establishments that serve minors by 40 percent in 12 months
  • To get a .08 law introduced and passed through committee in the next legislative session


 

SMART objectives don’t leave you a lot of wiggle room. It will be very obvious if you meet them or not. They challenge you to accomplish what you set out to do and serve as a constant reminder of your criteria for success. This is all the more reason to be honest and practical when you write them.

Once you have drafted the objectives for your program, you need to circulate them to the decision-makers who hold the fate of your program in their hands. You need to get buy-in at the outset as to what you are trying to accomplish. If they are expecting dramatic bottom-line results (i.e., a reduction in fatalities), now is the time to explain to them why that would be difficult, if not impossible, to demonstrate in the short term. If you wait until the program is over, they will likely conclude that the program failed because it did not meet their objective, even if the program accomplished your objective!

This does not mean that your community will not experience a reduction in deaths and injuries over time. If you continue to implement effective countermeasures targeting specific traffic safety problems, you should begin to observe a downward trend in crashes, deaths, and injuries. Your decision-makers need to understand, however, that this improvement will not occur overnight.

Step 3. Develop a Plan for Measuring Results

Before you can begin implementing your program, you have to plan how you will conduct your evaluation. This plan will address the questions:

  • What will you measure?
  • How will you measure it?
  • How will you analyze your results?

While all of these questions are important, the first, what you will measure, is critical to the success of your evaluation.

What will you measure?

What you will measure must be tied directly to the objectives you have established for your program. If your objective is to reduce speeding on a given roadway, the most logical thing to measure would be average speeds on that given road. Since your objective is tied to speeding, you don’t need to spend time or money trying to measure a reduction in crashes.

One of the reasons that you don’t want to be forced into counting lives is that fatal crashes don’t occur very frequently in most communities. It is very difficult to even observe a change in fatalities, let alone connect that change to a specific countermeasure.
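A rough calculation shows why rare events make poor measures of success. The Python sketch below assumes that counts behave roughly like a Poisson process, so that the typical swing from one period to the next is about the square root of the expected count. That model is a common simplifying assumption added here for illustration and is not a method prescribed by this guide. The counts and the two-standard-deviation threshold are likewise hypothetical.

```python
import math


def poisson_sd(expected_count):
    """Typical period-to-period swing of a Poisson count (its standard deviation)."""
    return math.sqrt(expected_count)


def change_exceeds_noise(expected_before, observed_after, n_sd=2.0):
    """True if the observed count differs from the earlier level by more
    than n_sd standard deviations, a rough rule of thumb only."""
    return abs(observed_after - expected_before) > n_sd * poisson_sd(expected_before)


# Hypothetical community: 4 fatal crashes in a typical year, 3 after the program.
print(change_exceeds_noise(4, 3))       # a drop of 1 is within normal variation

# Hypothetical speed counts: 400 speeders observed before, 300 after.
print(change_exceeds_noise(400, 300))   # a drop of 100 stands out from the noise
```

Both hypothetical drops are 25 percent, but only the frequent behavior produces a change large enough to distinguish from ordinary variation.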
 

The problem that you will face again and again is that everyone else will be urging you to tie program success to saving lives. Rather than getting backed into that corner, you should point out that the traffic safety literature indicates that excessive speed contributes to serious crashes. Since you have documented that there is a problem with excessive speeds on specific highways in your community, you are going to implement a countermeasure whose objective is to reduce speeding. You will measure program success by monitoring speeds on the selected road segments before, during, and after your program is in effect. Wherever possible, you should try to measure observable phenomena: things you can see and quantify, and that occur with a high degree of frequency. The phenomena can include behaviors; public opinion, awareness, and knowledge; and institutional responses. Here are some examples of each of these:

Behaviors

  • Using safety belts and child safety seats
  • Wearing bicycle and motorcycle helmets
  • Speeding
  • Red-light running
  • Jaywalking

Public opinion, awareness, and knowledge

  • Awareness of Public Information and Education campaigns
  • Support for legislative initiatives
  • Knowledge of a safety belt law
  • Teen attitudes about drinking and driving
  • Perceived risk of getting a traffic ticket

Institutional responses

  • Citations issued by the police
  • Special police patrols and check-points
  • Presentations
  • Training programs
  • Media coverage
  • Policies and legislation

Changes in these observable phenomena can be caused by your program or by some other confounding factors such as engineering improvements along a roadway. It will be important to understand what these confounding variables might be and how you can control them. This is an area in which an evaluation specialist can be extremely valuable.

How will you measure it (and when)?

Once you have decided what you will measure to determine if your program achieved its objectives, you will need to decide how you will gather the information needed to make the measurement. There are four basic ways that you can measure program effects:

  • Field Observations,
  • Surveys,
  • Forms, and
  • Archival Data.

Field observations are used to measure changes in safety behaviors. They can detect the presence or absence of a behavior (wearing or not wearing a helmet), or record some measurement of a condition, such as a vehicle’s speed or the size of a traffic gap that a person accepts before pulling into traffic. To conduct a valid field observation, you or your evaluation specialist will need to determine where and when to make the observations, how many observations will be needed, and what procedures will be followed to record the data.


Surveys are used to collect attitude, knowledge, and opinion information about individuals. They can be administered in person, over the telephone, by mail, or, with growing frequency, via e-mail. Each of these approaches has its own strengths and weaknesses, which your evaluation specialist can describe for you. Surveys can provide a wealth of information, but the survey instrument you use must be designed very carefully and tested thoroughly, and the procedures you use for including individuals in the survey (the sampling plan) must be well thought out. The classic problem that can plague a survey effort is bias introduced into the sampling plan. For example, if you conduct a telephone survey with your sample drawn from the telephone directory, you are limiting your respondents to households that have a listed telephone. Your target population might be college students, who are not adequately represented in the telephone directory.


Forms should be used to collect “process” data such as the number of presentations made (and where and when), the number of requests received for a brochure, the number of visits made to liquor establishments and the outcome, etc. These forms should be tailored to the specific data you need to capture and should be designed in coordination with the people who will be using them. There is a fundamental conflict between the people who would like to know the information and the people who actually have to collect it. Consideration has to be given to the conditions under which the forms will be completed (at a busy PTA meeting with people milling around or back at the office with access to a computer) and the amount of time it is reasonable to expect someone to spend on the task. The forms should be tested with real users prior to giving them out to be sure that there is no confusion.

Archival Data can be used to document a variety of issues. They are powerful because they allow you to consider trends, for example how a behavior such as speeding has changed over time. Archival sources would include:

  • Police crash records
  • Department of Motor Vehicle driver records
  • Traffic citation logs
  • EMS transport records
  • Emergency room records
  • Traffic court files
  • Hospital disposition records, etc.

Archival sources could also include newspaper archives, city council or State legislature records, and any other files that document program activity or responses to program activity.

The biggest challenge you will face with archival data is getting access to it. Any organization that maintains databases with any personal information will have very strict guidelines for who can access the information and what can be done with it. Make sure that your evaluation specialist understands these data sources and has experience accessing them. Since your evaluation is not concerned with the identity of individuals, you can usually obtain summary data with the personal identifying information deleted.

Keep in mind that archival data may change over time as improvements are made in data collection. You will need to check each data item that you are interested in to see if it has been recorded consistently over the period you are studying.

If you are conducting a State-level evaluation that focuses on fatalities and injuries, you can also access the Fatality Analysis Reporting System (FARS), the General Estimates System (GES), and the National Automotive Sampling System (NASS), all of which are maintained by NHTSA. Information about these archival data sources can be obtained from:

The National Center for Statistics and Analysis, NHTSA
400 Seventh Street, SW
Washington, D.C. 20590
Phone: 1-800-934-8517
World Wide Web: www.nhtsa.dot.gov/people/ncsa


Once you have determined the type of data you will be collecting, and its source, you will need to develop systematic procedures for data collection. You cannot leave this important step to chance. You will likely have multiple people collecting data and you want to minimize any variations in how they interpret what they are seeing. You accomplish this by designing data collection forms that can be used by everyone, and by providing training on how to make observations, read police forms, etc. You want each individual to collect data in exactly the same way. If you collected observational data as part of your problem identification activity, use the same procedures you used then so that you can make valid comparisons. Your evaluation specialist will be responsible for ensuring systematic data collection procedures.

There is one last consideration under the topic “How Will You Measure”: the timing of your data collection efforts. We have discussed how your pre- and post-program data should be collected under similar conditions, which could include time of year, time of day, etc. You must also consider when the post-program data should be collected in relation to the implementation schedule. For example, you will want to collect safety belt use data immediately after a major enforcement blitz to determine if belt use changed. Traditionally, the increase that follows an enforcement blitz tapers off over time. Belt use won’t go all the way back down to the “pre” level, but it will go down. So, you also need to know what the long-term effects of that enforcement blitz may be. You will therefore need to plan for follow-up data collection at scheduled intervals after implementation is complete.

Your schedule for data collection should be determined before implementation begins, so that it will not be influenced by the implementation itself.
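One simple way to fix the follow-up schedule in advance is to generate the collection dates from the planned end of the enforcement blitz. The Python sketch below is an illustration only. The end date and the follow-up intervals of 1, 4, 12, and 26 weeks are hypothetical, and the appropriate intervals for a real program should come from your evaluation specialist.

```python
from datetime import date, timedelta


def follow_up_dates(blitz_end, weeks_after):
    """Return data collection dates a whole number of weeks after the
    blitz ends, so every collection falls on the same day of the week."""
    return [blitz_end + timedelta(weeks=w) for w in weeks_after]


# Hypothetical plan, fixed before implementation begins.
blitz_end = date(2024, 5, 31)
schedule = follow_up_dates(blitz_end, [1, 4, 12, 26])

for collection_date in schedule:
    print(collection_date.isoformat(), collection_date.strftime("%A"))
```

Spacing the follow-ups in whole weeks keeps the day of the week constant, which helps keep the post-program observations comparable with each other.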

Step 4.  Gather Baseline Data.


During problem identification, you gathered preliminary data on such factors as safety belt use, and documented how you collected this information so that you can repeat these procedures after implementation. Now that you have refined your program objectives and developed a plan for measuring results, you may collect some additional data about other aspects of your program. For example, you may need to conduct an opinion poll to document what your citizens think about DWI enforcement before you implement a campaign to conduct sobriety checkpoints on weekends. This information should all be gathered before you actually start implementing anything, so that you can easily isolate any effect your program may have.

Step 5.  Implement Your Program.

Many people would be surprised to see implementation as a step in the evaluation process. But remember, you should be monitoring how your project is going right from the start, rather than waiting until everything is over. You should be keeping track of project costs and other process data that could indicate if program activity is at expected levels. You might do periodic opinion polls to see if the public is paying attention to your public information campaigns. You should also gather feedback at any training programs or public presentations. You may discover that there is a serious problem that should be fixed before any other contacts are made. If you include any media events in your program, you definitely want to pay attention to the amount of media coverage you receive. This information is much easier to capture in real time rather than to recreate the records weeks or months later.


Whatever you learn during program implementation, it is critical that you document it. You may have planned for weekly sobriety checkpoints at five locations in the county, with support from the State Patrol. Due to unexpected budget cuts, the State Patrol can only support one location per week. This may be a problem that you cannot fix, but you need to factor it in when you analyze your citation data. Based on this development you may want to adjust your program objective or extend the duration of your implementation phase. You will definitely want to document how actual implementation differed from your plan, and what impact you believe this change could have.

Step 6.  Gather and Analyze Data.

Causal or Correlated?

A final word of caution about statistical analyses and how they are reported: Your evaluator will very carefully choose the right words to describe the outcomes observed and their relationship to the countermeasure implemented. Distinctions will be made between a causal relationship (Implementing A caused outcome B.) and a correlation (A was implemented and B happened, and they appear to be connected.). The distinction is an important one, and should not be lost in the excitement of success. If your evaluator does not use the term causal relationship, it is because she does not believe that a causal relationship can be proven with the data available. Even though correlation is harder to explain than cause, don’t undermine the validity of your effort by slipping into sloppy terminology.
 

While the work involved in planning an evaluation is critical to success, it is in this step that your evaluation specialist will earn his or her fee. Gathering the data is the most labor-intensive aspect of the program evaluation, and analyzing it may be the most complex. As a manager, your biggest concern during the data collection phases is that the effort is adequately staffed and that everyone has been trained on the correct procedures to follow. Your evaluation specialist should also keep you informed about any changes that have to be made because of some external event that could influence the outcome. For example, you may have collected baseline data on child safety seat use outside of a child care center. One year later, when you are looking to see if your campaign had an effect, you discover that the center has closed. Your evaluation specialist will need to find a suitable alternate site so that you don’t miss any data.

During the analysis phase, your main focus should be becoming comfortable with the statistics. Your evaluation specialist will determine what statistical tests, if any, are appropriate. There is no point in going into any detail here on the various tests that could be used and the circumstances under which they are most appropriate. Your evaluation specialist should be able to explain them all to you in terms that you understand.


When you start to get results from your evaluator, there is one thing that you should keep in mind. Just because something is “statistically” significant doesn’t mean that it is also “programmatically significant” or meaningful. Meaningful in this context means that both you and your funding sources will be satisfied that the program really made a difference.

Your evaluator may tell you that there is a statistically significant decrease in the number of repeat DWI offenders following implementation of your mandatory sentencing program. She can report with a high degree of confidence that this change is not due to chance. However, when you look at the actual numbers, you discover that the total number of repeat offenders only dropped by ten. While your evaluator is tickled that she was able to prove that your program was a success, you are worried that your funding source may view this result with less enthusiasm. Trust your instincts. You don’t want to be in a position of claiming victory based on statistically significant results that no one else can really see.
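The gap between statistical and programmatic significance can be illustrated with a short Python sketch. It uses a pooled two-proportion z-test as one possible test; your evaluation specialist may well choose a different one. The belt use figures (70.0 percent before and 71.5 percent after, from 20,000 observations each) and the 5-point threshold for a "meaningful" change are hypothetical values chosen for illustration, not results from any program or from the DWI example above.

```python
import math


def two_proportion_p_value(successes_a, n_a, successes_b, n_b):
    """Two-sided p-value for the difference between two proportions,
    using a pooled z-test and the normal approximation."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # erfc(|z| / sqrt(2)) is the two-sided tail probability of a standard normal.
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical observations: belted drivers out of vehicles observed.
before_belted, before_total = 14000, 20000   # 70.0 percent
after_belted, after_total = 14300, 20000     # 71.5 percent

p_value = two_proportion_p_value(before_belted, before_total,
                                 after_belted, after_total)
change = after_belted / after_total - before_belted / before_total

# Hypothetical threshold agreed with the funding source in advance.
minimum_meaningful_change = 0.05

print("Statistically significant:", p_value < 0.05)
print("Programmatically significant:", change >= minimum_meaningful_change)
```

With very large samples, even a small change clears the statistical test, while the second check asks the separate question of whether the change is large enough to matter to you and your funding source.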

Step 7.  Report Results.


The results are in, and your program was a big success. Before you celebrate, however, you need to pay attention to a very important step in the process. A successful evaluation is worthless if no one knows about it or can understand what is being said.

Your purpose in reporting evaluation results is twofold:

  1. You want to convince your funding source that they should continue funding your traffic safety program, and maybe even increase their support.

  2. You want to generate support for your program among the media, the general public and among the other organizations you would like to take a more active role in traffic safety.

As program manager you will need to report your results to your funding source, and to the media, at a minimum. If other organizations were involved in implementation, you should share the results with them, along with appropriate thanks for their participation.

The presentation of your results will vary depending on your audience. You should create a detailed report for your funding source, to convince them that you take evaluation seriously. It must include a short, punchy, Executive Summary which hits the high points and emphasizes the conclusions. The detailed report should include an accounting of how your program funds were spent. The detailed report should follow a standard research format, with the following sections.

  • Table of Contents

  • Executive Summary—No more than three pages in length, ideally shorter.

  • Background—Why the study was conducted and the questions it attempts to answer. It should include the objectives for the program being evaluated and the criteria for success.

  • Methods—Complete descriptions of the design, procedures, techniques etc. that were used to collect and analyze the data. Questionnaires and data collection forms should be included in an appendix.

  • Findings—The outcomes of the research presented in tables and graphs.

  • Discussions and Conclusion—Interpretation of the findings, how they relate to the purpose of the evaluation and the objective of the program being implemented.

  • Recommendations for Action—Discussion of changes that should be made to the program to increase effectiveness. This section could also include proposals for continued, or even increased funding, based on the results provided.

Your evaluation specialist should be principally responsible for the Methods and Findings sections and should have major input to all other sections.

Your report to the media, and through them to the general public, should be very different. It can be issued as a press release which specifies what was done, and why, and what the results were. This information should focus on the impact the program will have on the general public. Will they be seeing more enforcement on the street? Will their children be safer walking to school? A clear table or graph of the most significant findings should be included if possible. Your audience will understand percentages, so use them whenever possible. People also understand the concept of risk when applied to traffic safety. Try to include a discussion of the average person’s risk of being involved in a crash, and how that risk may have changed as a result of your program.
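Turning raw counts into the percentages and risk figures a general audience understands takes only simple arithmetic. The Python sketch below shows one way to do it. The function names, the population of 50,000, and the crash counts of 600 and 510 are hypothetical values used only for illustration.

```python
def percent_change(before, after):
    """Change from the 'before' count to the 'after' count, in percent."""
    return (after - before) / before * 100


def risk_per_thousand(events, population):
    """Number of events per 1,000 residents."""
    return events / population * 1000


# Hypothetical community figures, for illustration only.
population = 50000
crashes_before = 600
crashes_after = 510

print(f"Crashes changed by {percent_change(crashes_before, crashes_after):.0f} percent")
print(f"Risk before: {risk_per_thousand(crashes_before, population):.1f} crashes per 1,000 residents")
print(f"Risk after: {risk_per_thousand(crashes_after, population):.1f} crashes per 1,000 residents")
```

A rate per 1,000 residents is only a rough stand-in for an individual's risk, since it ignores how much each person drives. Whether the change can be credited to your program is a separate question for your evaluator.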


Once you have communicated your results to everyone, you need to turn your attention to what changes should be made before you implement the program the next time. You should review all the documentation on what went right, and what obstacles were encountered, so that you can do some contingency planning the next time. You should also review your performance against your budget and milestone schedule to determine if you need to request more money or allow more time in the future. Did you have enough data collectors? Did the media understand what you were doing? Did you get enough cooperation from the local police or school system? All of these factors should be reviewed and built into your planning for future implementation of this same project or any others. Each experience should provide important lessons learned that can save you time, money, and frustration in the future.