Chapter 9: Basic Statistics for Consumers
CASE STUDY: Gang Involvement and Violent Victimization
Research Study
Understanding the Relationship between Violent Victimization and Gang Membership^{1}
Is gang involvement, along with involvement in gang crime and other risky lifestyles, related to being the victim of a violent crime?
This study, conducted by Katz, Webb, Fox, and Shaffer, examined data collected through the Arrestee Drug Abuse Monitoring (ADAM) program. ADAM was established in 1987 by the National Institute of Justice (NIJ) and consisted of surveys of recently jailed juveniles and adults. The ADAM survey covered a number of topics, from drug use to participation in other risky behavior, and was administered across a variety of cities. A supplement to the ADAM survey also inquired about gang membership, victimization, and gang-related activity, among other related areas.
The final survey sample for the current study included 909 juvenile arrestees from Maricopa and Pima counties in Arizona. Of all potential juveniles eligible for interview, only 5% declined to participate in the survey.
In determining gang membership, the researchers separated juvenile arrestees into four different types: 1) Never in a gang, 2) Gang associate, 3) Former gang member, and 4) Current gang member. In terms of being the victim of a crime, the researchers collected information on lifetime victimization and violent victimization in the past 30 days. Several types of violent victimization were measured by the survey, including but not limited to being threatened, shot, or shot at with a gun; non-gun weapon victimizations; and assaults.
Following interviews with juvenile arrestees, the researchers examined relationships among variables, for example, cross-tabulating gang status with violent victimization. Overall, bivariate analyses revealed that current gang members were more likely to be the victim of a violent crime than all other members of the study sample, including non-gang members, gang associates, and former gang members. For example, current gang members were more likely to be threatened with a gun, shot at, and shot than gang associates, non-gang members, or former gang members.
Beyond bivariate associations, the authors also conducted a multivariate analysis. Multivariate logistic regression models (see discussion on logistic regression later in this chapter) were utilized to determine if the type of gang membership was associated with being the victim of a violent crime (in the past 30 days). The researchers examined this relationship while accounting for the effects of other variables they had collected, such as whether the juvenile was involved in gang-related crime in the past 30 days, the number of prior arrests of the juvenile, and whether or not the juvenile was still in school, among others. Accounting for the influence of these factors on violent victimization in the past 30 days, the authors revealed that being involved in gang-related crime in the past 30 days was the strongest predictor of violent victimization. Being involved in a gang-related crime increased the juvenile’s likelihood of being the victim of a violent crime by 51%, regardless of gang membership status. Thus, according to the authors, it is not so much gang membership that contributes to violent victimization, but rather whether the juvenile engages in gang-related activity.
Limitations with the Study Procedure
In the current study, a potential limitation is that juvenile offenders were surveyed or asked questions about delinquent acts, victimizations, and gang status, among others. Particular to gang status, for example, juveniles were asked to self-report gang status through questions such as “Have you ever been in a gang?” or “Are you currently a member of a gang?” When questioning juveniles in jail or another custodial setting, there is always the potential that the offenders may have exaggerated their ties to a gang to be boastful, or conversely, played down their true level of gang membership for fear of legal repercussions. The same could also be said for any number of questions asked of the juvenile arrestees, for example, their gang-related activity or levels of victimization. Unfortunately, official data (e.g., arrest records) were not available to the authors so as to confirm juvenile self-reports. These and other potential limitations should be considered from the perspective of an informed consumer of research and how such limitations could have impacted the findings of the study.
The results of this study furthered the recognition that those who participate in criminal and delinquent activities are more prone to be victimized. The authors label this effect as the victim-offender overlap. In this study, the authors revealed that being a member of a gang, per se, does not automatically translate into violent victimization. What counts is the type of behavior demonstrated by the individual, and specifically, participation in gang-related crime. Among other potential impacts, such research has much relevance to prevention and intervention programs targeting gang members. Indeed, the results of the research suggest that providing information to gang members on the link between gang-related crime (not simply gang membership) and being the victim of a violent crime might prove effective in reducing gang-related crime. As the authors rightly note, “… policymakers might be wise to focus on the problems associated with gang membership rather than on gang membership itself.”^{2}
In This Chapter You Will Learn
That statistical tests are best considered as a set of tools to help organize, understand, and interpret data collected through a research study
Basic information related to variable coding and measurement
About univariate statistics—such as the mean, mode, and median—or statistics for examining the characteristics of one variable
About bivariate statistics—such as a correlation—or statistics for examining the association or relationship between two variables
About the origin of the .05 level of significance
How to interpret an alpha level of .05, .01, and .001
That the nature of the data being nominal, ordinal, interval, or ratio will dictate the appropriate type of statistical test required to examine the data
About two common statistical techniques for examining the relationship among more than two variables (also called multivariate statistical techniques)
Introduction
This text originated with research consumers in mind. As opposed to a guide on training researchers, it was driven by the reality that most students of research methods are unlikely to ever design and conduct a research study of their own. As a result, previous chapters provided an overview of various research methods and designs focused on the basics of research consumerism—which type of methods and designs are appropriate for different types of research questions and the advantages, disadvantages, and caveats associated with different research designs. Consistent with our focus on consumerism, what remains in this text is an introduction to basic statistics.
As opposed to a chapter on how to conduct statistics, this chapter focuses on the function of statistical tests in the research process. For researchers, statistics are best considered as a set of tools to help organize, understand, and interpret data collected during the course of a research study. These tools then help researchers provide answers to the questions that led to a research study in the first place. For consumers, however, a basic knowledge of statistics—the different types of statistical tests available, when they are appropriate to use, what they accomplish, and how they are interpreted—will help provide better insight into the reported results of a research study. We do not believe that research consumers necessarily need to know how to conduct a full complement of statistical tests to be able to understand the results produced by such tests. As a result, little attention in this chapter is paid to the specific procedures for conducting statistical tests. Rather, this chapter places more emphasis on the use, presentation, and basic interpretation of common statistical tests.
This chapter begins by presenting a hypothetical dataset based on a hypothetical research question. This hypothetical dataset and its associated research question will serve as a foundation for the statistical tests covered in this chapter. This chapter then revisits levels of measurement. Revisiting measurement levels is important because the level at which a variable is measured drives the type of statistical test(s) appropriate to examine the variable. Next, this chapter covers a number of common statistical tests used to explore single variables. Included in this section are typical examples of how statistics on a single variable are presented and interpreted, and generally, their function in the research process. This chapter then covers another set of common statistical tests utilized for exploring relationships or associations between two variables, again with an emphasis on presentation and interpretation. This chapter ends with a discussion of two common statistical tests used to explore relationships among more than two variables, and some important caveats of statistical analyses relevant to consumers of research.
CLASSICS IN CJ RESEARCH
The Criminal Investigation Process
Research Study
The Criminal Investigation Process
Methodology
The goal of this classic study in criminal justice was to come to an understanding of the role and functions of police investigators (as opposed to patrol officers, for example). According to the study authors, Greenwood and Petersilia, there has always been the belief that investigators were crucial to solving crimes. Unfortunately, there was simply little information at the time (1970s) on the role, function, and success of investigators in solving criminal incidents.
To explore the role and function of investigators, Greenwood and Petersilia sent mail surveys to 300 large police departments. Of these 300 departments, 153, or just over 50%, responded to the survey. The mail survey inquired about a number of areas of investigative work, for example, the training and evaluation of investigators, the clearance and arrest rates associated with investigative work, and the organizational structure of investigations, among others. The researchers also conducted interviews with investigators at several police departments and also observed firsthand the work of some investigators. In short, the researchers utilized survey methods and field work to understand the role and function of police investigators.
Results
The results of the Greenwood and Petersilia study called into question the notion that investigators were absolutely essential in solving crime. Survey data indicated that only a small portion of investigators’ time was actually spent in activities that resulted in an arrest. For example, almost half of investigators’ time was spent on such activities as questioning victims, locating witnesses, and gathering evidence. Moreover, it was revealed that investigators were responsible for clearing only a small percent of index crimes they investigated (under 3%).
Perhaps one of the more interesting findings of the study was that routine police procedures, often conducted by patrol officers or other first responders to the crime scene, were most important in solving crimes. Such routine police procedures required no special training or investigative experience beyond what a typical patrol officer would receive. Indeed, when the researchers compared survey findings across the different departments, their overall conclusion was that the contribution of investigators to clearance and arrest rates was minimal.
Limitations with the Study Procedure
Predictably, this study was criticized on a number of fronts. Critics attacked the fact that only 153 of 300 agencies responded to the survey (or just over 50% response rate) and surmised that the authors received a biased picture of the investigative police function. In short, critics rightly questioned whether the 50% of agencies that responded may have differed significantly from the 50% of agencies that did not respond to the mail survey. Moreover, the surveys consisted only of large police departments with at least 150 full-time employees and in areas that served at least 100,000 individuals in the population. Thus, the study looked mostly at large urban settings. Critics argued that the findings likely were not applicable to smaller jurisdictions.
Despite these and other criticisms, this was the first major examination of the role and function and relevance of investigators in large police agencies across the United States.
Impact on Criminal Justice
This study was important to criminal justice in a number of ways. Specific to criminal justice practice, the findings of this study led to changes in how police agencies trained patrol officers. For example, patrol officers in several jurisdictions received extended training and were given greater responsibility in conducting investigations when they responded to a crime incident. This was in direct recognition of the finding that one of the most important aspects to crime solvability was the information gathered by first responders to the crime scene, typically the patrol officer. Specific to criminal justice research, two researchers replicated this study nearly a decade later utilizing a suburban police department in Los Angeles, CA. Overall, their findings were consistent with those found by Greenwood and Petersilia, but their study was more supportive of the important role of investigators. At the most basic level, however, this study was the first of its kind and questioned long-held beliefs on the role of investigators in police agencies. Through the use of data, including face-to-face interviews and field observations, this study spurred additional research into the role and function of investigators to help improve police performance in the solving of crimes.
Source: Greenwood, P., & Petersilia, J. (1975). The criminal investigation process. Santa Monica, CA: Rand.
Hypothetical Research Scenario: Prison Smoking Ban
The Situation
As a result of state laws banning smoking in a number of public places such as restaurants and parks, the director of a large state prison system was notified that smoking would likely be banned inside the state prison system in the near future. In anticipation of this change, the prison system director hired a researcher from a local university to conduct a pilot test to determine the potential impact of a full smoking ban in the state’s prison system. The prison director feared that a prison system smoking ban would possibly inflame tensions among inmates and lead to an increase of violent incidents. As a result, the director believed that a pilot test gauging the potential impact of this change in the prison system would allow prison authorities to anticipate and plan for any potential negative consequences once a full smoking ban went into effect.
The Design
After several meetings with the prison director and other prison officials, the researcher decided to conduct the pilot test at one prison. The researcher designed the study so that one cellblock at one prison was banned from smoking. Designing the study this way, the researcher wanted to compare the number of violent incidents that occurred among prison inmates in the selected cellblock 4 months prior to the pilot smoking ban to the number of violent incidents that occurred during a 4-month period after the ban went into effect.
To more accurately specify the impact that a smoking ban might have on violent incidents, the researcher decided to select a comparison cellblock. Selecting a comparison cellblock of inmates not subject to the smoking ban provides a baseline for comparison. As discussed in Chapter 5, the addition of a comparison cellblock allowed the researcher more insight as to whether any change in the level of violent incidents during the smoking ban was due to the ban or some other rival causal factor.
Because inmates already occupied all cells in each cellblock in the prison system, the researcher was not able to utilize an experimental design for the pilot study. This was because inmates were not able to be randomly assigned to cellblocks prior to the study, nor could other conditions be randomized prior to the smoking ban, such as the number and type of officers in the cellblock. Therefore, the researcher designed the pilot smoking ban as a two-group longitudinal design (also known as a multiple interrupted time series design), a quasi-experiment. In this design, the researcher compared two equivalent cellblocks in one prison: one cellblock whose inmates had their smoking privileges removed (smoking ban cellblock) and a comparison cellblock that was not subject to the ban. At a basic level, the logic of this design is that if incidents increase in the cellblock subject to the smoking ban compared to the equivalent cellblock not subject to the ban, the increase must be attributable only to the ban.
Selection of Cellblocks
In selecting two different cellblocks, the researcher first chose Cellblock A, which is one of four cellblocks within a larger separate prison building located at the south end of the prison grounds. Cellblock A is a smoking cellblock and contains 40 inmates. It is designated as a maximum security cellblock, although inmates are generally free during the day to roam the cellblock to work and participate in a number of other activities. Cellblock A was the cellblock subject to the smoking ban.
After searching for an equivalent cellblock, the researcher identified Cellblock B, which is located in a larger prison building at the north end of the prison grounds. Like Cellblock A, it was a smoking cellblock, designated as maximum security, and in a building with three additional cellblocks. Cellblock B also held 40 inmates, and the researcher believed that it was the most comparable cellblock at the prison in terms of the types and number of inmates housed there, in addition to other attributes such as the number of correctional staff monitoring inmates. Additionally, because Cellblocks A and B are at opposite ends of the prison grounds, there was virtually no contact between inmates of the two cellblocks, a factor that the researcher felt was important to the pilot study. Specifically, the researcher wanted to ensure that inmates in Cellblock B did not find out about the pilot test smoking ban in Cellblock A, or vice versa. In short, the researcher wanted the inmates to behave normally, without knowledge that they might be part of a study.
Collection of Additional Secondary Data
After selecting the treatment and comparison cellblocks, the researcher was allowed access to agency files to collect certain types of secondary or agency-collected data on inmates, for example, the number of previous incarcerations, age at first incarceration, and the number of years incarcerated on current sentence for which the inmate was committed to prison (see Table 9.2 for all variables collected by the researcher). The researcher believed that these and other variables would help explain participation in violent incidents by inmates, independent of the effects of a potential smoking ban in prison. The design of the study is portrayed in Table 9.1.
Although the design is considered a quasi-experiment because the inmates residing in each cellblock were not randomly assigned, the researcher attempted to ensure equivalence among inmates and cellblock characteristics by selecting cellblocks with similar types of inmates and by considering other factors that might relate to involvement in or discovery of violent incidents. Also note that because the researcher was collecting information on the number of violent incidents in each month before and during the smoking ban, this design can be considered longitudinal. However, because the researcher is also interested in the final number of incidents during the entire 4-month pretest and 4-month posttest period, and not necessarily the number of incidents in any particular month of the pretest and posttest, this design could have also been conducted as a nonequivalent group design, which is cross-sectional.
Table 9.1

| Cellblock | Assignment | Pre-Test: Violent Incidents Before Smoking Ban | Treatment | Post-Test: Violent Incidents During Smoking Ban |
|---|---|---|---|---|
| Cellblock A | NR | O_{1} O_{1} O_{1} O_{1} | X Smoking Ban | O_{2} O_{2} O_{2} O_{2} |
| Cellblock B | NR | O_{1} O_{1} O_{1} O_{1} | | O_{2} O_{2} O_{2} O_{2} |

Recall from Chapter 5 that anytime two or more groups of individuals (or cellblocks of inmates) are being compared in an experimental or quasi-experimental design, one consideration is that the groups be equivalent so as to be able to isolate the unique effects of the treatment after it is implemented. In an experimental design, the randomization process helps to ensure that group differences are minimized or eliminated. In a quasi-experiment, such as the two-group longitudinal design discussed previously or the nonequivalent group design, the absence of randomization means that there could indeed be differences between the groups of inmates at the pretest, and hence, such differences could impact the posttest. For example, if inmates in Cellblock A perpetrated a significantly higher number of incidents than inmates in Cellblock B prior to the smoking ban, whatever led to the higher number of incidents in Cellblock A might lead to higher incidents after the smoking ban is put into place. Fortunately, the availability of data on inmates allowed the researcher to explore, through the use of statistics, how similar or different each group of inmates was with respect to factors that might be linked to participation in violent incidents following implementation of the smoking ban. Such information allows a more precise interpretation of what effect, if any, a smoking ban had on participation in violent inmate-on-inmate incidents.
Table 9.2 presents all of the variables collected by the researcher to help examine the impact that a smoking ban may have on the prison setting. The table provides the name of the variable collected, the label or what the variable is measuring, how the variable is coded, and finally, at what level the variable is measured.
Table 9.2

| Variable Name | Label | Coding | Level of Measurement |
|---|---|---|---|
| Race | Race of inmate | 0 = Nonwhite, 1 = White | Nominal |
| Previnc | Previous incarcerations | Number | Ratio |
| Agefirst | Age at first incarceration | Age in years | Ratio |
| Yearinc | Years incarcerated on current sentence | Years | Ratio |
| Classrisk | Classification risk category within custody level | 1 = Low, 2 = Medium, 3 = High | Ordinal |
| Cigsday | Cigarettes smoked per day before the ban | Number | Ratio |
| Cellblock | Cellblock assignment | 0 = Cellblock B, 1 = Cellblock A | Nominal |
| lnc4moprior | Violent incidents 4 months prior to smoking ban | Number | Ratio |
| lnc4modur | Violent incidents 4 months during smoking ban | Number | Ratio |

Levels of Measurement Revisited
Table 9.2 includes nine variables collected by the researcher to help determine the impact of a smoking ban on violent incidents among inmates. Two of the variables are nominal level variables (Race and Cellblock). This means that the variable provides a name or label to some value. For example, labels such as nonwhite and white are characteristics of variables measured at the nominal level, as are indicators of Cellblock A and Cellblock B. Although nominal variables and their labels may be designated by numerical codes, the codes are not numerically meaningful. For example, inmates who occupy Cellblock A are coded as 1 and inmates in Cellblock B are coded as 0. However, the codes 1 and 0 do not represent numbers useful for mathematical manipulation, nor do they indicate rankings; rather, they are simply labels used to indicate a particular category. This is a property of variables measured at the nominal level.
Further examination of Table 9.2 reveals there is one ordinal level variable included in the data collected by the researcher. This variable, Classrisk, is a measure of each inmate’s risk level within their designated prison custody level. For example, all inmates in the smoking ban study were maximum security inmates. However, within that maximum security custody level, Classrisk measures whether each inmate was considered low, medium, or high risk. As opposed to the nominal level variables, the fact that Classrisk is measured at the ordinal level means that the labels of low, medium, and high and the codes of 1, 2, and 3 are ranked or ordered, from low to high risk. However, one of the properties of a variable measured at the ordinal level is that although the categories are ranked (e.g., low to high), it is not known by how much each of the ranks differs. For example, high-risk maximum security inmates are riskier than both low- and medium-risk maximum security inmates; it is just not known how much risk separates these ranked categories.
There are no interval level variables available in this study. Unlike variables measured at the nominal or ordinal level, interval level variables are measured on a scale where the numerical distance between two different intervals or points on the scale is equal. However, an interval scale of measurement does not feature a true or meaningful zero point. For example, temperature is an often cited example of an interval level measurement. With temperature, the distance between 30° and 60° is 30 intervals. However, when the temperature is 0, there is still warmth. This is what is meant by the absence of a true or meaningful zero point.^{4}
As opposed to an interval level of measurement, Table 9.2 indicates that there are six variables measured at the ratio level of measurement. One important characteristic of a variable measured at the ratio level is that two different values are equally spaced apart, just like in an interval level of measurement. For example, an inmate with four previous incarcerations has exactly two more previous incarcerations than an inmate with two previous incarcerations. Another important characteristic is that there is a meaningful or true zero point—where zero actually equals zero and is not arbitrary as with variables measured at the interval level of measurement. For example, variables such as age, height, weight, and number of previous incarcerations are considered variables measured at the ratio level of measurement because zero is meaningful—zero actually means zero.
Because of the preceding properties, variables measured at the interval or ratio level can be subject to a number of statistical tests that would not be appropriate for nominal or ordinal level variables. Additionally, variables measured at these levels can be “scaled down” or “recoded” to create nominal or ordinal level variables. For example, an interval level variable measuring temperature can be scaled down into categories—30°–50°F, 60°–70°F, or “warm” and “cool,” and so on. Although interval and ratio variables can be scaled down to resemble nominal or ordinal level variables, nominal and ordinal level variables cannot be “scaled up” or recoded into interval or ratio level variables. As a result, interval and ratio level variables are much more versatile than nominal or ordinal variables when it comes to statistical tests. Finally, it is important to note that variables are not often called nominal, ordinal, interval, or ratio in research reports and other publications. Often, researchers simply refer to the variables as categorical or qualitative (nominal and ordinal) or quantitative or metric (interval or ratio).
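To make the idea of "scaling down" concrete, the short Python sketch below recodes an interval level temperature into the two labels mentioned above. The function name, the 60°F cutoff, and the sample readings are assumptions chosen only for illustration; they do not come from the smoking ban study.

```python
def recode_temperature(degrees_f: float, cutoff: float = 60.0) -> str:
    """Collapse an interval level temperature into a two-category label.

    The 60-degree cutoff is a hypothetical choice for illustration only.
    """
    return "warm" if degrees_f >= cutoff else "cool"


# Hypothetical temperature readings in degrees Fahrenheit.
readings = [32.0, 45.5, 61.0, 75.0]

# Each numeric reading is replaced by a category label.
labels = [recode_temperature(t) for t in readings]
print(labels)
```

Once only the labels are kept, the original readings cannot be recovered from them (both 32.0 and 45.5 become "cool"), which is why categories cannot be "scaled up" into interval or ratio values.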
Regardless of how they are labeled, revisiting the different levels of measurement is important for the primary reason that certain statistical tests apply only to variables measured at a particular level of measurement. For example, it would not make much sense to compute the average of an inmate’s race as measured in the current hypothetical study because this is a nominal level variable. This aspect about levels of measurement should become clearer as we explore statistics for examining one variable in the following section.
RESEARCH IN THE NEWS
A Just Measure of Crime: The Uniform Crime Reports (UCR)
When considering crime rates, fluctuations in crime over time, or the level of crime in a particular location, there is no better source than the Uniform Crime Reports (UCR). Beginning in 1930, the uniform crime reporting program has relied on city, university and college, and county, state, tribal and federal law enforcement agencies to voluntarily report information on known offenses committed and persons arrested each year. To be sure, the UCR only collects information on known offenses and persons arrested; it is not a count of processes and outcomes following arrest such as findings of the court or decisions of the prosecutor.
The UCR reporting program is perhaps the most accurate count of crime that occurs in the United States each year. This does not mean it is perfect, for it does have many limitations. For example, the UCR cannot account for crimes that go unreported. Moreover, the UCR utilizes what is considered a hierarchy rule. This means that if multiple crimes are committed in one criminal incident, only the highest level or most serious crime is counted—therefore underestimating some crime. Despite these and other limitations, the UCR currently functions as a count of crime for roughly 95% of the total population of the United States.
Despite the important nature of the UCR, it is critical to remember that with any compilation of data, there is always the potential for error and other mistakes. Errors and mistakes can occur during any stage of the collection process, from the initial crime report completion by the individual officer, to the collection of the information by the local law enforcement agency, to the transmission of data to the FBI. Other potential errors can include differential interpretation of information requested by the FBI, missing or incomplete information, or simply failing to report an accurate number of crimes. There have even been reports of law enforcement agencies intentionally omitting some crimes from their UCR reporting sheets or otherwise skewing their local crime data.^{3} In light of these potentials, there are significant training procedures in place for law enforcement agencies and quality standards and reviews to make sure the data is as accurate as possible. Visit www.fbi.gov and explore information about the UCR, including the history of the program, quality control guidelines, limitations of the data, and what is and is not included in the UCR. Also examine other national data crime collection programs such as the National Crime Victimization Survey (NCVS), the National Incident Based Reporting System (NIBRS), and Supplemental Homicide Reports (SHR), to name a few.
Adapted from: http://www.fbi.gov/aboutus/cjis/ucr/frequentlyaskedquestions/ucr_faqs, retrieved August 26, 2011.
Statistics for Examining One Variable
One of the fundamental functions of certain statistical tests is to explore and describe the characteristics of a single variable. For producers of research, the purpose of analyzing a single variable is to determine certain attributes or characteristics of the variable. These characteristics become important in determining which types of statistical tests are appropriate to use with the variable. For research consumers, information on a single variable (or a number of single variables) may lead to a better understanding of the characteristics of data, and ultimately, a better understanding of the outcomes of a research study based on that data.
Measures of Central Tendency
The beginning point to understanding the role of statistics in the research process is measures of central tendency. Measures of central tendency include a number of statistics “designed to find a single number that best represents several numbers.”^{5} Three of the most common statistics for measuring central tendency are the mean, median, and mode. The mean represents the average of all scores for a particular variable, for example, the average age at first incarceration or the average number of cigarettes smoked per day for the entire sample of prison inmates. In sum, the mean is a single number that helps to provide an overall view or “average” of a number of different ages at first incarceration.
Averages or means are calculated and meaningful for variables measured at the ratio, interval, and in certain instances at the ordinal level, but are not appropriate for variables measured at the nominal level. One of the most important properties of the mean is that this statistic is influenced by extreme values. Extremely high values, for example, pull the average higher and extremely low values pull the average downward. While extreme values do not change the mathematical accuracy of the mean, such values can affect how accurate a picture the mean represents for any given variable. For example, if among the hypothetical sample of inmates in the smoking ban study of focus in this chapter, a single inmate had 45 previous incarcerations while the remaining 79 inmates each had one previous incarceration, the average would be inflated and would not be an accurate portrayal of the number of previous incarcerations among the majority of the inmate sample. Based on the figures just presented, the mean number of previous incarcerations would equal 1.55. However, nearly 99% of all inmates have only one previous incarceration. While the mean of 1.55 is mathematically accurate to the sample, it probably does not do a good job of clearly portraying the average or “typical” inmate in terms of the number of previous incarcerations.
As opposed to the mean, the median represents the middle value in a series of ordered values. Unlike the mean, the median is not influenced by extreme low or high values. For example, the median in the following list of numbers [1, 3, 4, 5, 6] is 4, and the median in the next list of numbers [1, 3, 4, 5, 800] is also 4. Despite the presence of an extreme value of 800 relative to the other values in the ordered list, the median remains the same because the middle number in a ranked series is still the middle number, regardless of the values below or above it. As a consumer of research, sometimes the median provides a more accurate overall representation of the characteristics of a particular variable than the mean, because unlike the mean, it is not influenced by extreme values.
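The effect of extreme values on the mean, and the lack of such an effect on the median, can be checked with a short Python sketch. It uses only the figures given in the text (one inmate with 45 previous incarcerations and 79 inmates with one each, plus the two short lists of numbers); the variable names are illustrative assumptions.

```python
from statistics import mean, median

# One inmate with 45 previous incarcerations, 79 inmates with one each
# (the hypothetical figures used in the text).
previous_incarcerations = [45] + [1] * 79

outlier_mean = mean(previous_incarcerations)      # pulled upward by the single value of 45
outlier_median = median(previous_incarcerations)  # still the "typical" value of 1
print(outlier_mean, outlier_median)

# The two ordered lists from the median example.
list_a = [1, 3, 4, 5, 6]
list_b = [1, 3, 4, 5, 800]

print(median(list_a), median(list_b))  # the middle value is unchanged
print(mean(list_a), mean(list_b))      # the mean shifts sharply with the value of 800
```

The single extreme value moves the mean but leaves the median where it was, which is why a large gap between the two is a signal to look for extreme values.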
In addition to the mean and median, the final measure of central tendency is the mode. In simple terms, the mode is the most frequent value of any particular variable. If nearly all inmates in the smoking ban study had one previous incarceration, the mode for the number of previous incarcerations would be one. A variable may also have more than one mode, depending on how often each of its values occurs. For example, presume that among the 80 prisoners in the smoking ban study, 20 prisoners had one previous incarceration, 20 prisoners had three previous incarcerations, 20 prisoners had four previous incarcerations, 15 prisoners had five previous incarcerations, and 5 prisoners had six previous incarcerations. In this example, there would be three modes (1, 3, and 4 previous incarcerations).
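The multiple-mode example can be reproduced with a minimal Python sketch. The data are built directly from the counts in the example (20, 20, 20, 15, and 5 prisoners); the variable names are assumptions for illustration.

```python
from statistics import multimode

# 20 prisoners with 1, 20 with 3, 20 with 4, 15 with 5, and 5 with 6
# previous incarcerations, as in the example.
previous_incarcerations = [1] * 20 + [3] * 20 + [4] * 20 + [5] * 15 + [6] * 5

# multimode returns every value tied for the highest frequency.
modes = multimode(previous_incarcerations)
print(len(previous_incarcerations), modes)
```

Because three values are tied at 20 occurrences each, all three are reported as modes.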
Table 9.3 presents all three measures of central tendency (mean, median, and mode) for the ratio level variables collected by the researcher in the hypothetical smoking ban study. Table 9.4 presents the same information but separates the information by the cellblock in which inmates were assigned.
| Variable | Mean | Median | Mode |
|---|---|---|---|
| Previous incarcerations | 1.28 | 1.00 | 1.00 |
| Age at first incarceration | 25.98 | 23.00 | 18.00 |
| Years incarcerated on current sentence | 3.99 | 3.00 | 1.00 |
| Cigarettes smoked per day | 11.10 | 8.50 | 2.00 |
| Incidents 4 months prior to smoking ban | 2.06 | 2.00 | 2.00 |
| Incidents 4 months during smoking ban | 4.57 | 3.00 | 1.00 |
The information presented in Tables 9.3 and 9.4 represents typical ways the mean, median, and mode might be presented in a research report. Also notice “N” and “n” in the tables. N is an indicator of the size of the population or sample. Although N and n are sometimes used interchangeably, N is usually utilized to denote the size of a population, whereas n is utilized to denote the size of a sample from the population. N can also be used as a sample size indicator with n being used to indicate a subset of a sample. In this example, “N” is indicative of the sample of inmates in the smoking ban study (N = 80 inmates total), and n is used to represent a subset of inmates from the larger sample (n = 40 in Cellblock A and n = 40 in Cellblock B). However N or n is used, the bottom line is that this letter is used to demonstrate the number or size of the sample or population under study.
As mentioned, measures of central tendency are a group of measures that attempt to describe the characteristics of a particular variable with a single number. While measures of central tendency can help provide an overall picture of a particular variable, to more fully understand the characteristics of a variable it is best to have access to the original data that produced the measure of central tendency. If that data is not available to the consumer, certain basic inferences about the data can be made by examining all measures of central tendency together. For example, if the mean and median are extremely different, one could infer that the data include extreme values that pull the mean away from the median. If the average age of prisoners in the smoking ban sample was 35, for example, but the median age was 23, it would be reasonable to assume that the sample contains a number of older inmates who have pulled the mean upward. However, one should be careful not to make large assumptions simply from a comparison of the mean, median, and mode. Although careful examination of these measures is important for an informed consumer of research, what can actually be assumed about data characteristics from examining these measures is somewhat limited. In short, these forms of summary statistics provide only a broad look at variable characteristics and a basic level of understanding about the nature of the data. More detailed information about the characteristics of particular variables can be found by examining a set of statistics commonly referred to as measures of variation.
| Variable | Cellblock A (n = 40 Inmates) | | | Cellblock B (n = 40 Inmates) | | |
|---|---|---|---|---|---|---|
| | Mean | Median | Mode | Mean | Median | Mode |
| Previous incarcerations | 1.13 | 1.00 | 1.00 | 1.43 | 1.00 | 1.00 |
| Age at first incarceration | 25.35 | 21.00 | 18.00 | 26.60 | 23.00 | 23.00 |
| Years incarcerated on current sentence | 4.10 | 3.50 | 2.00 | 3.87 | 3.00 | 1.00 |
| Cigarettes smoked per day | 14.75 | 14.00 | 23.00 | 7.45 | 4.00 | 2.00 |
| Incidents 4 months prior to smoking ban | 2.15 | 2.00 | 2.00 | 1.98 | 2.00 | 2.00 |
| Incidents 4 months during smoking ban | 7.40 | 6.50 | 5.00 | 1.75 | 1.00 | 1.00 |
Measures of Variation
Like measures of central tendency, measures of variation provide a single number that gives information about a single variable. Three of the most commonly used measures of variation include the range, variance, and standard deviation. In the most common instances, these measures of variation are only appropriate and meaningful with interval and ratio level data.
The first measure of variation is the range. The range is calculated by taking the highest value for any individual variable and subtracting the lowest value for that individual variable. For example, the highest age of first incarceration (Agefirst) for any prisoner in the smoking ban sample is age 70, and the lowest age of first incarceration is 18. The range associated with Agefirst is thus 52 (70 – 18 = 52). The range associated with number of previous incarcerations (Previncar) is 3 because the inmate(s) with highest number of previous incarcerations in the sample has 3 and the inmate(s) with the lowest number of previous incarcerations has 0 (3 – 0 = 3). Because the range is calculated using only the highest and lowest values of a particular variable, like the mean, it can be influenced by extremely high or low values. For example, the heaviest smoker in the prisoner sample smoked 41 cigarettes per day, and the least frequent smoker smoked 2 cigarettes per day, resulting in a range of 39 (41 – 2 = 39). Note, however, the range as a measure of variation is used to show how much “variability” exists within a given variable—it is not necessarily meant to depict the typical sample member. As a result, it gives the consumer an idea of how much variation there is between the highest and lowest value for any particular variable.
A second common measure of variation is called the variance. Calculating the variance, especially by hand, is tedious and can sometimes be confusing. Of particular interest for our purposes, however, is what the variance measures. The variance provides an indication of how much the individual values of a particular variable differ from the average or mean of that variable. Because the variance is calculated in part by determining how much each individual value differs from the mean, a larger variance indicates more variability of scores around the mean. Alternatively, a smaller variance (one closer to zero) indicates that individual scores are clustered closer to the mean. For example, the average age at first incarceration for the 80 inmates in the hypothetical study is 25.98 years, and the variance is 115.80. Because the variance is expressed in squared units, the statistic of 115.80 does not provide a simple description of the variability of individual values for each inmate regarding their age at first incarceration. Its size nonetheless suggests that there is considerable variability among the sample of inmates: some inmates are much younger than the mean and others are much older. Indeed, if the variance is 0, this is an indication that all the values for a particular variable are the same as the mean, and thus there is no variance.
An even more detailed picture of the characteristics of the age at first incarceration for the sample is found through an examination of the standard deviation. The standard deviation is calculated by taking the square root of the variance. Mathematical procedures aside, because the standard deviation is derived from the variance, it too provides an idea of the variation of all particular values of a variable compared to the mean of that particular variable. Because taking the square root returns the statistic to the original units of measurement (years, in the case of age at first incarceration), the standard deviation is easier to interpret than the variance. And, just like the variance, the larger the standard deviation, the more individual values of a particular variable stray from the mean.
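The three measures of variation can be illustrated with a brief Python sketch. The range and the variance of 115.80 come from the text. The five ages in the second half are hypothetical values chosen only to keep the arithmetic easy to follow, and the use of the sample variance (dividing by n - 1) is an assumption, since the text does not say which formula produced its figures.

```python
from math import sqrt
from statistics import mean, variance, stdev

# Range: highest minus lowest age at first incarceration (values from the text).
highest_age, lowest_age = 70, 18
age_range = highest_age - lowest_age

# Standard deviation: square root of the variance reported in the text.
reported_variance = 115.80
reported_sd = sqrt(reported_variance)
print(age_range, round(reported_sd, 2))

# Hypothetical ages, used only to show how the variance is built up.
ages = [18, 20, 22, 24, 26]
average = mean(ages)
squared_deviations = [(age - average) ** 2 for age in ages]

# Sample variance: sum of squared deviations divided by (n - 1).
by_hand = sum(squared_deviations) / (len(ages) - 1)
print(by_hand, variance(ages), round(stdev(ages), 2))
```

Each squared deviation measures how far one value sits from the mean, so the variance grows as values spread out and shrinks to zero when every value equals the mean.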
Table 9.5 provides an examination of all measures of central tendency (mean, mode, and median) and variation (range, variance, and standard deviation) for the variable age at first incarceration in the hypothetical smoking ban study. Inspection of these statistics together and how they are typically presented helps in understanding the characteristics of data collected for a research study. Examining these statistics combined with basic knowledge of how they are calculated and what they represent is also a good first step at being an informed consumer of research. Based on what you know so far, what insight can you gather about Agefirst from the measures of central tendency and variation presented?
| Variable | Mean | Median | Mode | Range | Variance | St. Dev. |
|---|---|---|---|---|---|---|
| Age at first incarceration | 25.98 | 23.00 | 18.00 | 52 | 115.80 | 10.76 |
Thus far, the discussion of statistics for single variables has centered on statistics used to examine the characteristics of interval and ratio level data. What about data measured at the nominal or ordinal level? Because most measures of central tendency and variation are not meaningful for nominal or ordinal level variables, researchers typically use other methods to understand the characteristics of data measured at these levels. For example, frequency tables, percentages, and an assortment of charts, such as bar charts or pie charts, are often used to examine the characteristics of variables measured at the nominal or ordinal level. Again, the goal of the basic statistics just covered is to provide an understanding of the characteristics of data. For nominal and ordinal level variables, percentages, frequency tables, and charts help accomplish that goal.
Statistics for Examining Two Variables
Measures of central tendency and variation provide information on a single variable only. It is often the case, however, that researchers are interested in whether there is a relationship or association between two variables. For example, the researcher conducting the pilot smoking ban study may be interested in whether there is an association between a prisoner’s cellblock assignment (Cellblock) and his classification risk upon prison intake (Classrisk), among other associations.
Associations between two variables can be observed by crosstabulating variables. Crosstabulating, or producing a “crosstab,” is accomplished by comparing the categories of one variable to the categories of another variable. Because crosstabulation requires data to be in category form, it is used with nominal and/or ordinal data. As mentioned, however, both interval and ratio level data can be “scaled down” into categories for crosstabulation purposes. For example, suppose the researcher wanted to examine whether there was any association between the number of cigarettes inmates in Cellblock A smoked per day before the smoking ban (Cigsday) and the number of violent incidents in Cellblock A in the 4 months after the smoking ban went into effect (Inc4modur). These ratio level variables can be recoded or scaled down into categories. Once the variables are in categories, a crosstabulation can be produced for visual inspection. For example, the researcher could recode the ratio level variable Cigsday into categories (inmates who smoked 0–6 cigarettes per day, 7–12 cigarettes per day, and so on) and do the same with the individual frequencies for the ratio level variable Inc4modur (inmates who engaged in 0–5 incidents, 6–10 incidents, and so on). Table 9.6 demonstrates a crosstabulation of these two recoded variables.
An examination of Table 9.6 shows the number of cigarettes smoked per day by inmates in Cellblock A before the smoking ban went into effect compared to the number of violent incidents perpetrated by Cellblock A inmates during the 4-month smoking ban. Basic visual inspection of Table 9.6 shows there is a slight association between the number of cigarettes smoked per day before the ban and involvement in violent incidents during the 4 months the smoking ban was in place. Indeed, as the number of cigarettes smoked per day increases, there appears to be an increase in the number of inmates who fall into the categories with the highest number of incidents. For example, of the 22 inmates in Cellblock A who smoked 13 or more cigarettes per day before the ban, 12 had 6–10 violent incidents and 5 had 11 or more violent incidents during the test period.
Although there appears to be a slight positive association between the variables (as the number of cigarettes smoked before the ban increases so does the number of incidents during the smoking ban in Cellblock A), this basic association should not be taken as an indication that more frequent smokers will automatically engage in a higher level of violent incidents if smoking privileges are removed. There are perhaps a number of other variables that could explain this association beyond the number of cigarettes per day. Perhaps those inmates who smoked the most cigarettes per day were also the most risky or violent inmates, and perhaps had a large number of violent incidents even before the smoking ban went into effect and simply continued to act accordingly. In short, a mere association offers no proof of a causal relationship.

| | Inc4modur recoded | | | |
|---|---|---|---|---|
| Cigsday recoded | 0–5 | 6–10 | 11+ | Totals |
| 0–6 | 11 (100%) | 0 (0%) | 0 (0%) | 11 (100%) |
| 7–12 | 4 (57%) | 2 (29%) | 1 (14%) | 7 (100%) |
| 13–20 | 4 (40%) | 3 (30%) | 3 (30%) | 10 (100%) |
| 21+ | 1 (8%) | 9 (75%) | 2 (17%) | 12 (100%) |
| Totals | 20 (50%) | 14 (35%) | 6 (15%) | 40 (100%) |
Also make note of an important attribute of Table 9.6: percentages. Percentages can be calculated in two ways, across (row percentages) or down (column percentages). Table 9.6 provides row percentages. A general rule of thumb is that percentages should be calculated in the direction of the independent or supposed causal variable. In this example, the recoded variable for the number of cigarettes smoked per day is hypothesized to be a factor associated with violent incidents during the 4-month smoking ban, and therefore, the percentages are row percentages. Indeed, it would make little sense to assume that the number of incidents incurred during the smoking ban caused the number of cigarettes smoked before the ban, and therefore Cigsday is considered the independent variable.
Consumers of research often confuse row and column percentages, and for good reason. For example, it might easily be misinterpreted from Table 9.6 that 100% of inmates who incurred 0–5 incidents during the ban smoked 0–6 cigarettes per day before the smoking ban (see the cell in the upper left-hand corner of the table). In fact, the correct interpretation is that 100% of inmates who smoked 0–6 cigarettes a day before the ban (11) were involved in 0–5 incidents during the smoking ban. Indeed, calculation of a column percentage, instead of the row percentages provided, would show that only 55% of inmates who incurred 0–5 incidents during the smoking ban smoked 0–6 cigarettes per day (11/20) before the ban went into effect. In essence, one must be careful when interpreting percentages in a crosstabulation and understand the difference between row and column percentages.
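The difference between row and column percentages can be made concrete with a small Python sketch that recomputes both from the cell counts in Table 9.6. The dictionary layout and variable names are assumptions chosen for readability; the counts themselves are taken from the table.

```python
# Cell counts from Table 9.6. Each row is a Cigsday category; the three
# columns are the Inc4modur categories 0-5, 6-10, and 11+.
counts = {
    "0-6": [11, 0, 0],
    "7-12": [4, 2, 1],
    "13-20": [4, 3, 3],
    "21+": [1, 9, 2],
}

row_totals = {row: sum(cells) for row, cells in counts.items()}
column_totals = [sum(column) for column in zip(*counts.values())]

# Row percentages: each cell divided by its row total (as printed in the table).
row_pct = {
    row: [round(100 * cell / row_totals[row]) for cell in cells]
    for row, cells in counts.items()
}

# Column percentages: each cell divided by its column total.
col_pct = {
    row: [round(100 * cell / total) for cell, total in zip(cells, column_totals)]
    for row, cells in counts.items()
}

for row in counts:
    print(row, "row %:", row_pct[row], "column %:", col_pct[row])
```

The upper left-hand cell reads 100% as a row percentage (11 of 11 light smokers) but 55% as a column percentage (11 of the 20 inmates with 0–5 incidents), which is exactly the distinction described above.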
In addition to examining whether there “appears” to be an association between two variables through face value inspection of a crosstabulation table, researchers are often interested in a number of other considerations concerning associations. These considerations include how strong the association is between variables (is the association weak, moderate, or strong), what the direction of the association or relationship is (is the association positive or negative), and whether the relationship between variables is statistically significant.^{6} Depending on the measurement of the variables, statistical tests exist to address all of these issues.
WHAT THE RESEARCH SHOWS: IMPACTING CRIMINAL JUSTICE OPERATIONS
Why is .05 the Traditional Level Indicating Statistical Significance?
A significance level of .05 is the standard threshold for indicating that a relationship among variables, or a difference between two or more groups, is statistically significant. In simple terms, a significance level of .05 means that, if there were no true relationship between the variables, a relationship as strong as the one observed would be expected to occur by chance only 5 out of 100 times, or 5% of the time.
But who established this traditional cutoff for determining whether a relationship or difference between variables was “good enough” to be considered real and not a chance occurrence? Surprisingly, not much research exists on this subject. This lack of research is even more surprising considering that the .05 level of significance is so universally used across scientific disciplines. In an important article, researchers Cowles and Davis (1982) searched to uncover the origins of the .05 level of significance. Regarding “who” developed this threshold, they revealed that perhaps the earliest formal recognition of the .05 level of significance came from Sir Ronald Fisher. Fisher, a well-known statistician of the early 1900s, wrote a book called Statistical Methods for Research Workers in 1925. In this book he formally introduced the .05 level. But Cowles and Davis found evidence of others using similar thresholds even before this time.
Despite the universal and important nature of the .05 threshold, and regardless of the debate as to who should be credited with creating .05 as the standard threshold of statistical significance, Cowles and Davis revealed that the adoption of the .05 level was more than just some arbitrary threshold. At a basic level, the authors reveal that an event that occurs less than or equal to 5% of the time is generally considered by scientists and nonscientists alike as the result of something other than random chance occurrence. This is also because this threshold has intuitive appeal—something occurring only 5 out of 100 times is quite a rare event, indicating that something special, or other than chance, is occurring. But perhaps the most basic reason it was adopted so universally was because the absence of a common threshold indicating significance would leave the task to individual researchers to establish what is and is not statistically significant for a particular study. The widespread adoption of the .05 level led to a minimum standard in determining the likelihood of an event not being due to chance.
Source: Cowles, M., & C. Davis. (1982). On the origins of the .05 level of statistical significance. American Psychologist 37, 553–558.
See also http://www.jerrydallal.com/LHSP/p05.htm, retrieved on August 26, 2011.
Testing for a Statistically Significant Relationship
Although each of the previous questions is important, we focus on the issue of statistical significance because this is what is often reported in research reports and other outlets and is perhaps of most relevance to our purposes. Note, however, that statistical programs that allow a determination of statistical significance also provide the opportunity to evaluate both the direction and strength of an association between two variables, and this information is also typically found in reported research results. Such tests can be accomplished with variables measured at all levels—nominal, ordinal, interval, and ratio.
Two of the most commonly reported statistics for determining if a relationship between two variables is statistically significant are chi-square tests and t-tests. In a nutshell, chi-square statistical tests determine whether the relationship between two nominal or ordinal variables is significant. As a practical matter, a relationship between two variables is considered statistically significant when the observed relationship is greater than what would be expected to occur by chance alone. For example, a chi-square test could be utilized in the hypothetical study to determine whether the difference in the number (or frequency) of white and nonwhite inmates assigned to Cellblock A and B is significantly different than what would be expected by chance. A chi-square test could also be used to examine whether there is a significant relationship between the recoded variables in Table 9.6 (Cigsday and Inc4modur).
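The logic of comparing what was observed to what would be expected by chance can be sketched in Python using the cell counts from Table 9.6. This is a minimal illustration of the standard Pearson chi-square statistic, not the procedure of any particular statistical package, and the variable names are assumptions. Because several expected counts in this small table fall below 5, the result should be read as a rough illustration only.

```python
# Observed counts from Table 9.6 (rows: Cigsday recoded; columns: Inc4modur recoded).
observed = [
    [11, 0, 0],
    [4, 2, 1],
    [4, 3, 3],
    [1, 9, 2],
]

row_totals = [sum(row) for row in observed]
column_totals = [sum(column) for column in zip(*observed)]
n = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        # Count expected in this cell if the two variables were unrelated.
        expected = row_totals[i] * column_totals[j] / n
        chi_square += (count - expected) ** 2 / expected

# Degrees of freedom: (rows - 1) * (columns - 1).
degrees_of_freedom = (len(observed) - 1) * (len(column_totals) - 1)

# For 6 degrees of freedom, the critical value at the .05 level is 12.59.
print(round(chi_square, 2), degrees_of_freedom, chi_square > 12.59)
```

The statistic comes out at about 22.6 with 6 degrees of freedom, which exceeds the .05 critical value of 12.59, so the pattern in the table is larger than chance alone would typically produce.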
As opposed to chi-square tests, t-tests are primarily used with interval and ratio level data. Although there are different types of t-tests, one of the most common is that which examines whether mean differences between two independent or distinct groups are statistically significant. For example, a t-test could be used by the researcher to answer whether the average number of violent incidents among inmates in Cellblock A and Cellblock B were significantly different before the smoking ban went into effect. The researcher could do similar mean comparison t-tests between Cellblock A and B inmates on the mean number of previous incarcerations, years incarcerated on current sentence, number of violent incidents during the smoking ban period, and cigarettes smoked per day before the ban.
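A minimal Python sketch of an independent-samples t-test follows. The raw data for the smoking ban study are not given in the chapter, so the two small groups below are hypothetical values invented purely for illustration. The pooled-variance (equal variances) form of the test is also an assumption; statistical programs offer other variants.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical cigarettes smoked per day for five inmates in each cellblock.
# These values are NOT from the study; they only illustrate the calculation.
cellblock_a = [12, 15, 14, 18, 16]
cellblock_b = [6, 9, 7, 8, 5]

n_a, n_b = len(cellblock_a), len(cellblock_b)
mean_difference = mean(cellblock_a) - mean(cellblock_b)

# Pooled variance: a weighted average of the two sample variances.
pooled_variance = (
    (n_a - 1) * variance(cellblock_a) + (n_b - 1) * variance(cellblock_b)
) / (n_a + n_b - 2)

# Standard error of the difference between the two means.
standard_error = sqrt(pooled_variance * (1 / n_a + 1 / n_b))

t_statistic = mean_difference / standard_error
degrees_of_freedom = n_a + n_b - 2

# For 8 degrees of freedom, the two-tailed critical value at .05 is 2.306.
print(round(t_statistic, 2), degrees_of_freedom, abs(t_statistic) > 2.306)
```

The t statistic is the mean difference divided by how much that difference would be expected to vary by chance, so a large value relative to the critical value indicates a statistically significant difference. Statistical software converts the statistic into the significance value (Sig.) that appears in research reports.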
Table 9.7 provides results of a t-test that compared the mean number of cigarettes smoked per day by inmates in Cellblock A to the mean number of cigarettes smoked by inmates in Cellblock B prior to the smoking ban. First, note that inmates in Cellblock A, on average, smoked roughly 7 more cigarettes per day than inmates in Cellblock B. Recognizing this difference, a t-test examines whether this mean difference is significant, or alternatively, whether this is a difference that could be expected by chance alone. Examining the significance column in Table 9.7 (Sig.), a value of .000 is presented. This is the significance level produced from the t-test in Table 9.7 and is the number of most interest. In basic terms, this reported significance level (.000) indicates that the difference between the average number of cigarettes smoked per day among inmates in Cellblock A and the average number smoked by inmates in Cellblock B is larger than would be expected by chance alone. It indicates that the probability of two means, from this sample, being this different by a chance occurrence is less than 1 out of 1,000. That is too small a chance to conclude that the difference in the number of cigarettes smoked per day is simply a fluke occurrence or some error in the sampling process. In sum, Cellblock A inmates smoke a significantly greater number of cigarettes than Cellblock B inmates.


| Variable | Cellblock | N | Mean | Sig. |
|---|---|---|---|---|
| Cigsday | Cellblock A | 40 | 14.75 | .000 |
| | Cellblock B | 40 | 7.45 | |
Traditionally, a difference is considered statistically significant when the significance value is less than .05, typically reported in a research report as p < .05 (less than 5 out of 100), p < .01 (less than 1 out of 100), or p < .001 (less than 1 out of 1,000). A value displayed as .000 by a statistical program does not mean the probability is zero; it is a rounded value that corresponds to p < .001. Relative to a t-test of difference between two group means as presented in Table 9.7, the reported level of .000 means that in less than 1 chance out of 1,000 should we expect this large a difference in the number of cigarettes smoked to be due to chance. In short, the difference in the average number of cigarettes smoked among inmates in each respective cellblock is very unlikely to be the result of a chance occurrence and is therefore treated as a real or true difference.
Let us assume that the researcher is also interested in whether there is a difference in the average number of violent incidents recorded among inmates in Cellblock A compared to inmates in Cellblock B during the 4-month smoking ban period. A t-test could also be utilized to examine this question. Recall, however, that inmates in Cellblock B were not subjected to a smoking ban; only inmates in Cellblock A had their smoking privileges removed. Conducting this analysis reveals that inmates in Cellblock A accumulated an average of 7.40 incidents during the 4-month smoking ban and inmates in Cellblock B accumulated an average of 1.75 incidents. According to the t-test results comparing these averages, there is a statistically significant difference in the mean number of violent incidents between inmates in Cellblock A and Cellblock B. In this analysis, the significance value (Sig.) is .000. This means that the probability of this difference being a chance occurrence is less than 1 out of 1,000.
There are two important considerations worth mentioning at this juncture. First, just because a chi-square test or a t-test (or any other statistical test) indicates a statistically significant relationship or a statistically significant difference between group means does not mean the relationship is causal. Again, there could be a number of other intervening variables that might explain why, for example, inmates in Cellblock A incurred a significantly greater number of incidents during the smoking ban, none of which might be related to the fact that their smoking privileges were removed.
Second, sometimes the notion of statistical significance does not translate into practical significance. For example, inmates in Cellblock A may have served a significantly greater number of years incarcerated on their current sentence, but in practical terms, the mean differences between the two groups may be only a few years and not otherwise be meaningful on a practical level. In other words, while a relationship or difference may be statistically significant, it may not indicate anything of practical value. Thus, statistical significance does not indicate a causal relationship between two variables nor does it necessarily equate to practical significance. It is simply an indication of whether differences are greater than what would be expected by chance, relative to the size of the sample and other considerations. Although the notion of practical significance must be viewed in the context of a particular study and relative to a particular research question, the important point here is not to blindly put too much emphasis on statistical significance because it might often be the case that there is no practical or real world significance to such a finding in the larger context of a particular study.
RESEARCH IN THE NEWS
STATS.ORG: The Numbers behind the News
If you question statistics presented in news articles, stats.org is a website for you. Among other activities, researchers at stats.org examine the numbers and associated findings behind major issues and news stories. This organization advocates scientific methods and statistical analyses to examine a variety of social issues today. Highly trained researchers and statisticians associated with stats.org also provide independent review and critique of popular media studies, discussing benefits and limitations. In the quest to become an informed consumer of research, stats.org is a must visit.
To learn more about this organization, visit http://stats.org/index.htm.
Statistics for Examining More Than Two Variables
Thus far, this chapter has introduced the reader to a number of basic statistical tools to explore the characteristics of data and to explore relationships between two variables and whether such relationships are statistically significant. As a whole, this chapter presented only a brief overview of some of the most common statistics typically found in a research report and how they are commonly presented. While there are numerous other types of specific statistical procedures and data considerations that could well fit within this chapter, in our opinion, such a focus detracts from the ultimate goal of providing research consumers with an introductory level of information they need to begin understanding the more complex world of statistical analyses. In short, this chapter was meant as a starting point to help the research consumer, not the potential research producer, to get a basic but important grasp on fundamental statistical concepts and tests to become a more informed consumer of research.
With that said, it is common today to find more advanced statistical analyses presented to research consumers at all levels—general public consumers, field practitioners, and others. As a result, we believe it is important to cover, if only briefly, two of the more common statistical techniques for examining relationships among more than two variables. In statistical terminology, these statistical tests are broadly known as multivariate (multiple variables) statistical tests. Although numerous different types of multivariate statistical tests exist, the two most common are Ordinary Least Squares (or OLS) regression and Logistic Regression. We provide a discussion and presentation of these statistical tests in the following section. However, it is important to note that with any statistical procedure, even those examining one variable, the nature of the data is important. For example, a mean can only be meaningful when used with interval or ratio level data. All multivariate statistical tests also come with a set of assumptions and the most basic of all assumptions is the type of data that they are able to analyze.
OLS Regression
Ordinary Least Squares (OLS) regression (hereafter OLS regression) is a multivariate statistical technique where a researcher is interested in the relationship between multiple independent variables and one dependent variable. Although OLS regression includes a number of assumptions about the characteristics of data, what is most important for our purposes is that the dependent variable must be measured at the interval or ratio level. Independent variables can be measured at any level—nominal, ordinal, interval, or ratio.
Suppose the researcher in the smoking ban study is interested in the relationship between cigarettes smoked per day before the smoking ban (Cigsday) and the number of incidents during the 4-month smoking ban (Inc4modur). Let’s also suppose that the researcher believes other variables, beyond the number of cigarettes an inmate smokes per day, might be related to violent incidents during the smoking ban. For example, the researcher might also be interested in whether inmates’ age at first incarceration (Agefirst), years incarcerated on current sentence (Yearinc), and the number of violent incidents prior to the smoking ban (Inc4moprior) also have an effect on the incidence of violence during the 4-month smoking ban (Inc4modur), even after accounting for the number of cigarettes inmates smoked before the ban.
By including the variables mentioned previously, the researcher is interested in whether there is a relationship between the number of cigarettes smoked per day before the smoking ban and violent incidents that occurred during the smoking ban, after accounting for the effects of the other independent variables. More specifically, the researcher wants to know whether a smoking ban will lead to violence or whether this relationship is actually explained by factors other than the smoking ban, such as the types of inmates, how long they have been incarcerated, their age at first incarceration, and their involvement in violence before the smoking ban took effect. Because the dependent variable is measured at the ratio level, and the researcher is interested in the impact of multiple independent variables on the dependent variable, OLS regression would be an appropriate statistical technique to examine this research question.
It is possible that the level of violence during the smoking ban period may have nothing to do with how many cigarettes an inmate smokes and the smoking ban, but rather, the characteristics of inmates and their previous demonstrated violent behavior, among other pieces of information. Alternatively, the level of violence may in part be related to the number of cigarettes smoked per day and the smoking ban, but other factors may also contribute to involvement in violent incidents. OLS regression can help to answer what effect each variable has on the dependent variable, if any, and whether some variables are more important than others in explaining inmate violence.
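To make the idea of estimating several effects at once more concrete, the following is a minimal sketch of OLS with two independent variables. It is an illustration only: the function name `fit_ols` and all of the numbers are invented for this example and are not the smoking ban data. The outcome values were built from a known rule (1 + 0.2 per cigarette + 0.5 per prior incident) so the reader can see that OLS recovers the separate contribution of each variable.

```python
# Minimal OLS sketch using only the Python standard library.
# It solves the normal equations (X'X) b = X'y by Gauss-Jordan elimination.
# All data are hypothetical and constructed for illustration.

def fit_ols(rows, y):
    """Return [intercept, b1, b2, ...] that minimize the sum of squared errors."""
    X = [[1.0] + list(r) for r in rows]            # add a column of 1s for the intercept
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        # Partial pivoting: move the largest remaining entry into the pivot row
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * c for a, c in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Hypothetical inmates: (cigarettes per day, prior incidents)
cigs_and_prior = [(0, 1), (5, 0), (10, 2), (15, 1), (20, 3), (25, 0)]
# Hypothetical outcome built as 1 + 0.2 * cigarettes + 0.5 * prior incidents
incidents = [1.5, 2.0, 4.0, 4.5, 6.5, 6.0]

intercept, b_cigs, b_prior = fit_ols(cigs_and_prior, incidents)
print(round(intercept, 3), round(b_cigs, 3), round(b_prior, 3))
```

Real data contain noise, so estimated coefficients would not match any underlying rule exactly, and a research report would normally rely on statistical software rather than hand-written code like this.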
Table 9.8 summarizes the results of an OLS regression model to help answer the previous considerations. Table 9.8 is a simplified presentation of how the results from an OLS regression might be presented in a research report or perhaps an academic article or book. As shown, there were five variables “regressed” on the dependent variable measuring the number of incidents during the 4-month smoking ban (Inc4modur). Because only Cellblock A was subjected to the smoking ban, the variable Cellblock was also included to determine whether being in a smoking cellblock (Cellblock B) as opposed to a nonsmoking cellblock (Cellblock A) also influenced incidents during the ban, in addition to the other variables. One way to answer whether banning smoking leads to an increase of violent incidents is to determine whether cellblock assignment has an effect on incidents. In other words, including the variable Cellblock serves as a proxy measure for losing smoking privileges (Cellblock A) or retaining smoking privileges (Cellblock B).
| Variable | Standardized Coefficient | Sig. |
|---|---|---|
| Agefirst | .016 | .840 |
| Yearinc | −.072 | .376 |
| Cigsday | .385 | .000 |
| Cellblock (A) | .510 | .000 |
| Inc4moprior | .013 | .878 |

R^{2} = .560
Notice that there are two major columns in Table 9.8, Standardized Coefficient and Sig. The Standardized Coefficient column reports, for each variable, a coefficient that has been converted to a common scale. It allows us to determine the importance of each variable in the model in explaining Inc4modur, after accounting for the influence of other variables. For example, it tells us the importance of the number of cigarettes smoked per day on the number of incidents incurred during the smoking ban, after accounting for the effects of the number of years incarcerated and other variables in the model. Because these coefficients are standardized, they can be compared directly to determine which variable is most important in explaining the dependent variable. For example, being assigned to Cellblock A is the most influential variable in explaining violent incidents during the 4-month smoking ban because it has the largest Standardized Coefficient; next most important is the number of cigarettes per day smoked before the smoking ban, and so on.
Also notice the negative (−) sign next to the Standardized Coefficient for Yearinc. The sign indicates whether the independent variable has a positive relationship with the dependent variable (meaning higher values are associated with more violent incidents) or a negative relationship (meaning higher values are associated with fewer violent incidents). For example, Agefirst, Cigsday, Cellblock, and Inc4moprior are all positively related to violent incidents during the smoking ban. This means an increase in the value of these variables (e.g., a 1-year increase in the age at first incarceration) is associated with an increase in violent incidents. Specific to Cigsday, inmates who smoked more cigarettes per day were predicted to have an increased level of violence during the smoking ban, after taking into consideration the other variables in the model. More specifically, because the coefficient is standardized, each one standard deviation increase in cigarettes smoked per day is associated with a .385 standard deviation increase in incidents, after accounting for the influence of other variables in the model.
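The difference between an ordinary (unstandardized) slope and a standardized coefficient can be sketched with a single independent variable. The example below is a hedged illustration: the helper names `slope` and `standardized_slope` and the data are hypothetical. It relies on the standard relationship that a standardized slope equals the raw slope multiplied by the standard deviation of the independent variable and divided by the standard deviation of the dependent variable.

```python
from statistics import mean, stdev

def slope(x, y):
    """Unstandardized slope: change in y per 1-unit change in x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def standardized_slope(x, y):
    """Standardized slope: change in y (in SDs) per 1 SD change in x."""
    return slope(x, y) * stdev(x) / stdev(y)

# Hypothetical data, invented for illustration
cigs = [0, 5, 10, 15, 20, 25]           # cigarettes per day
incidents = [1, 2, 2, 4, 5, 5]          # incidents during the ban

# Same information measured in packs (20 cigarettes) instead of cigarettes
packs = [c / 20 for c in cigs]

b_cigs = slope(cigs, incidents)         # incidents per additional cigarette
b_packs = slope(packs, incidents)       # incidents per additional pack
beta_cigs = standardized_slope(cigs, incidents)
beta_packs = standardized_slope(packs, incidents)

print(round(b_cigs, 4), round(b_packs, 4))
print(round(beta_cigs, 4), round(beta_packs, 4))
```

The raw slope changes when the unit of measurement changes (cigarettes versus packs), while the standardized slope does not. That is why standardized coefficients, rather than raw ones, are used to compare variables measured in different units, such as years and cigarettes.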
The third column of Table 9.8 presents the significance level associated with each variable. It tells us which variables are statistically significant in explaining the number of incidents during the 4-month smoking ban. Indeed, variables may be related to violent incidents, but may not have a statistically significant impact on violent incidents (recall the previous discussion on levels of significance, .05, .01, and .001). After accounting for the effects of all variables in the model, only two of the five variables are statistically significant in relation to the dependent variable, Cellblock and Cigsday. We know this because only two of the variables have a significance level that is less than .05, the conventional threshold denoting a significant effect.
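The two reading steps described above (rank variables by the size of the standardized coefficient, then check which Sig. values fall below .05) can be sketched in a few lines. The values are copied from Table 9.8; the variable names `table_9_8`, `ALPHA`, `significant`, and `ranked` are simply labels chosen for this illustration. A Sig. value printed as .000 is a rounded value, not a true zero.

```python
# Values copied from Table 9.8: (standardized coefficient, Sig.)
table_9_8 = {
    "Agefirst": (0.016, 0.840),
    "Yearinc": (-0.072, 0.376),
    "Cigsday": (0.385, 0.000),       # .000 is rounded, not exactly zero
    "Cellblock (A)": (0.510, 0.000),
    "Inc4moprior": (0.013, 0.878),
}

ALPHA = 0.05  # conventional significance threshold

# Variables whose Sig. value is below the threshold
significant = [name for name, (beta, p) in table_9_8.items() if p < ALPHA]

# Variables ordered by the absolute size of the standardized coefficient,
# so that a negative sign does not make a variable look less important
ranked = sorted(table_9_8, key=lambda name: abs(table_9_8[name][0]), reverse=True)

print(significant)
print(ranked)
```

Ranking uses the absolute value because the sign only indicates the direction of a relationship, not its strength.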
At the most basic level, the results in Table 9.8 suggest that the number of cigarettes smoked per day and cellblock assignment were statistically significant explanations of violence during the smoking ban. Here, inmates subject to the smoking ban (as a result of being in Cellblock A) and inmates who smoked more were significantly more likely to engage in violent incidents during the smoking ban than inmates in Cellblock B or inmates who smoked fewer cigarettes per day. These findings are statistically significant even after accounting for inmates’ violence before the smoking ban, their age at first incarceration, and the number of years inmates had been incarcerated. In sum, this model indicates that a smoking ban will increase violence among more frequent smokers subject to the ban, independent of the effects of other variables in the model.
A final piece of information is included in Table 9.8 and is also typically found in the presentation of OLS regression results. This is the R^{2}. R^{2} is a measure that indicates how much variation in violent incidents during the smoking ban is explained by the variables included in the model. In this example, the variables in the model together explained .560, or 56.0%, of the variance in inmates’ involvement in violent incidents measured for 4 months during the smoking ban. In other words, for the sample in the current hypothetical study, more than half of the variation in violent incidents during the smoking ban is accounted for by the model, and most of that explained variance is attributable to Cigsday and assignment to Cellblock A, or the smoking ban. The variables that might explain the remaining 44% of the variance in participation in violent incidents are not included in the statistical model. For example, perhaps the inclusion of variables that measured mental health problems or incarceration as a juvenile offender would explain additional variation in participation in violent incidents. Unfortunately, such variables were not available to the researcher, so the contribution of these and other variables in explaining violent incidents is unknown.
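As a rough illustration of what R^{2} measures, the sketch below computes it as one minus the ratio of unexplained variation to total variation. The function name `r_squared` and the observed and predicted values are hypothetical and chosen only to show the arithmetic; the last line applies the same logic to the R^{2} reported in Table 9.8.

```python
def r_squared(observed, predicted):
    """Share of the variation in the observed values explained by the predictions."""
    mean_obs = sum(observed) / len(observed)
    ss_total = sum((o - mean_obs) ** 2 for o in observed)            # total variation
    ss_resid = sum((o - p) ** 2 for o, p in zip(observed, predicted))  # unexplained variation
    return 1 - ss_resid / ss_total

# Hypothetical incident counts and hypothetical model predictions
observed = [1, 2, 2, 4, 5, 5]
predicted = [1, 2, 3, 3, 4, 6]

r2 = r_squared(observed, predicted)
print(round(r2, 3))

# A model that always predicts the mean explains none of the variation
mean_only = [sum(observed) / len(observed)] * len(observed)
print(r_squared(observed, mean_only))

# Share of variance left unexplained by the model in Table 9.8
unexplained_9_8 = round(1 - 0.560, 3)
print(unexplained_9_8)
```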
Logistic Regression
Logistic regression is another multivariate technique that allows a researcher to determine the impact of multiple independent variables on a single dependent variable. Independent variables can be nominal, ordinal, interval, or ratio level variables. However, the dependent variable in a logistic regression analysis must be a binary categorical variable. An example of a binary dependent variable might include reoffense and no reoffense, arrested or not arrested, or involvement in a violent incident and no involvement in a violent incident.
As noted earlier, it is always possible that a researcher can “scale down” or “recode” an interval or ratio level variable into categories. In the current hypothetical study, for example, the researcher could recode the frequency of violent incidents during the 4-month smoking ban into two exclusive categories: inmates who incurred 1–4 incidents, and inmates who incurred 5 or more incidents. Two points are worth mentioning here. First, the choice of categories is entirely up to the researcher and might be informed by a number of things such as previous research and general knowledge about the characteristics of the data. Second, a general rule is that a researcher would never want to scale down a variable from interval or ratio level to a categorical variable simply to conduct a certain type of multivariate analysis. If a researcher has an interval or ratio level variable, they simply could use OLS or another statistical technique appropriate for dependent variables measured at that level. For our purposes, however, let’s assume the researcher does not have access to an interval or ratio level dependent variable measure of violence during the smoking ban and only has access to a dichotomous dependent variable.
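For readers curious what recoding looks like in practice, here is a minimal sketch. The function name `recode_incidents`, the default cutoff argument, and the incident counts are assumptions made for this illustration; only the 1–4 versus 5 or more split comes from the text.

```python
def recode_incidents(count, cutoff=5):
    """Recode an incident count into a binary category.

    Returns 1 for the higher category (cutoff or more incidents)
    and 0 for the lower category (fewer than cutoff incidents).
    """
    return 1 if count >= cutoff else 0

# Hypothetical incident counts for seven inmates
counts = [1, 3, 4, 5, 7, 2, 9]

# 0 = 1-4 incidents, 1 = 5 or more incidents
binary = [recode_incidents(c) for c in counts]
print(binary)

# A different cutoff produces a different dependent variable from the same data
binary_cut3 = [recode_incidents(c, cutoff=3) for c in counts]
print(binary_cut3)
```

Notice that the recoded variable no longer distinguishes an inmate with 5 incidents from one with 9, which is part of why scaling down a variable discards information.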
Suppose the researcher is interested in the same question that formed the basis for the OLS regression previously. Suppose also that the researcher has a dependent variable that measures whether inmates were involved in 1–4 incidents during the smoking ban (no inmates in the current data had 0 incidents during the smoking ban), or 5 or more incidents during the smoking ban. This would be an appropriate analysis for logistic regression.
Table 9.9 shows the results of the logistic regression model regressing Agefirst, Yearinc, Cigsday, Cellblock (A), and Inc4moprior on the dichotomous dependent variable.
| Variable | Coefficient | Sig. | Exp(B) |
|---|---|---|---|
| Agefirst | −.001 | .979 | .999 |
| Yearinc | −.050 | .753 | .951 |
| Cigsday | .328 | .001 | 1.388 |
| Cellblock (A) | 6.02 | .004 | 412.86 |
| Inc4moprior | 1.13 | .034 | 3.109 |

R^{2} = .60
The table offers a typical presentation of results produced by a logistic regression analysis. It has four columns, but for our purposes, column 3 (Sig.) and column 4 (Exp(B)) are most important. Recall from previous discussions that Sig. indicates whether a particular variable is significant in predicting or explaining the dependent variable. All Sig. values that are less than .05 indicate variables significant in explaining the dependent variable. Knowing this, the results in Table 9.9 show that three variables emerged as significant in explaining the dependent variable (here, the category of having 5 or more incidents during the 4-month smoking ban period is the predicted category in the logistic regression model). We know this because three variables have significance values that are less than .05. These variables are Cigsday, Cellblock A, and Inc4moprior. In a nutshell, the results of this analysis suggest that inmates who smoke more per day, those assigned to Cellblock A (those subject to the smoking ban), and those with a greater number of incidents prior to the smoking ban are more likely to be found in the category of inmates with 5 or more incidents during the smoking ban compared to those who accumulated 1–4 incidents during the ban.
Further information can be found in column 4, or Exp(B). Exp(B), also called the odds ratio, is best viewed as a measure of association between variables. Odds ratios that are greater than 1 indicate a positive relationship between the independent and the dependent variable, those less than 1 indicate a negative relationship, and those that are exactly 1 indicate no relationship between the variables. For example, the odds ratio for the variable Cigsday is 1.388. This means that the more cigarettes an inmate smoked before the ban, the more likely they will be found in the category of inmates with 5 or more incidents: each additional cigarette smoked per day multiplies the odds of being in that category by 1.388, holding the other variables in the model constant.
Without getting too complex, the odds ratio simply compares the odds of being in one group (the group with 5 or more violent incidents) to the odds of being in the other group (the group of inmates with 1–4 violent incidents during the smoking ban). In the current example, the number of cigarettes smoked per day significantly increases the odds of being in the group with 5 or more incidents, but this effect is relatively weak as the odds ratio is only slightly more than 1.0. Interestingly, being in Cellblock A, and hence being subject to the smoking ban, significantly increased the likelihood of being in the higher incident category after accounting for the effects of all other variables in the model. Moreover, those who demonstrated a higher level of previous violent incidents were much more likely to be found in the category of 5 or more incidents, even after accounting for the effects of other variables in the model.
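The label Exp(B) describes how the odds ratio is obtained: it is the exponential of the logistic regression coefficient (B). The sketch below applies that calculation to the coefficients printed in Table 9.9. The dictionary names are labels chosen for this illustration. Because the printed coefficients are rounded, the results for Cellblock (A) and Inc4moprior come out close to, but not exactly equal to, the Exp(B) values shown in the table.

```python
import math

# Coefficients (B) copied from Table 9.9
coefficients = {
    "Agefirst": -0.001,
    "Yearinc": -0.050,
    "Cigsday": 0.328,
    "Cellblock (A)": 6.02,
    "Inc4moprior": 1.13,
}

# Odds ratio = exp(B)
odds_ratios = {name: math.exp(b) for name, b in coefficients.items()}

# Percent change in the odds for a 1-unit increase in each variable
percent_change = {name: (ratio - 1) * 100 for name, ratio in odds_ratios.items()}

for name in coefficients:
    print(name, round(odds_ratios[name], 3), round(percent_change[name], 1))
```

A negative coefficient always produces an odds ratio below 1, a positive coefficient produces one above 1, and a coefficient of 0 produces exactly 1.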
Summing up Table 9.9, the results indicate that inmates who smoke more, are subject to a smoking ban, and have more incidents prior to the smoking ban are significantly more likely to be involved in a higher number of incidents. Note that these results are somewhat different from the results found previously in the OLS model. For example, in the OLS model with the number of violent incidents as the dependent variable, an inmate’s previous history of violent incidents (Inc4moprior) was not a significant factor explaining violence during the smoking ban. One reason for this discrepancy is that the OLS model was predicting the total number of incidents whereas the logistic regression model is predicting a category of incidents. As a result, the outcome of the data analysis may be different because different outcomes are being measured. If in fact the researcher recoded the dependent variable into two different categories, for example, those with 1–2 incidents and those with 3 or more incidents, the results may change further. Such differences can also occur because OLS regression and logistic regression operate on different assumptions about the characteristics of data beyond their level of measurement. Although this discussion is well beyond this chapter, this is one reason why it is important for researchers to understand the nature of their data.
As mentioned, there are a number of other multivariate statistical techniques available to researchers, and like all statistics, whether they are appropriate to use depends on a number of considerations. That said, the previous discussion of OLS and logistic regression demonstrated two of the most common statistical tests to examine relationships among multiple variables, and how the results of these tests are typically presented and interpreted. Although there are a number of particulars about each of these statistical tests that were not covered because they are far beyond the scope of this text, the basic presentation of these tests will help consumers have at least a basic understanding of two common tests and how they are used in answering a research question.
Chapter Summary
The bad news is that no single chapter can fully prepare consumers for the complex world of statistics and their use in the research process. The good news is that this was not the goal of this chapter. Rather, the goal of this chapter was to help readers come away with a basic understanding of the most common statistical tests, why they are used, what they accomplish, and their basic presentation and interpretation. Despite the foundational nature of this chapter, research consumers should feel more confident about reading and interpreting statistics in many common research outlets, and hopefully, will have greater insight into the role statistics play in the final aspects of the research process.
Although statistics are undoubtedly important in seeing a research study through its final stages, it is important for readers to remember that a study is only as “good” as the methods and design that led to the collection of the data being analyzed. The saying “garbage in, garbage out” is a truism when it comes to research methods. Poorly designed and implemented research studies, whether the fault of the researcher or not, will lead to the production of data and statistical analyses that will also suffer from problems experienced at the front end. Therefore, statistics should not be viewed as a set of tools that allow shortcomings in method and design to be remedied through fancy number crunching. Rather, statistics are just one ingredient of a good study that also must include appropriate methods and design. As a result, consumers who are educated about the research process, including an understanding of the role of basic statistics, are on the path to becoming informed consumers of research.
Critical Thinking Questions
1. What is the difference between a variable measured at the ratio level and a variable measured at the interval level? Give one example each of a ratio level and an interval level variable not discussed in this chapter.
2. What are measures of central tendency, and what are measures of variance?
3. What does it mean when a study claims that a variable is significant at the .05 level? Explain.
4. Make a persuasive argument that a study’s methods and design are perhaps more important than statistical tests.
5. What does it mean if a study variable, like age at first arrest, has a large standard deviation relative to the mean of that variable?
Key Terms
categorical/qualitative: Two terms that are often used to describe variables measured at the nominal and ordinal level
chi-square test: A statistical test that examines the association between two variables measured at the ordinal or nominal level
logistic regression: A multivariate statistical test that regresses the independent variables on a categorical dependent variable (e.g., convicted or not convicted, violent or not)
mean: The average of a set of scores; example would be the average score for an entire research methods class
measures of central tendency: Refers to a set of statistics that produce a single number to represent a larger set of numbers; includes the mean, mode, and median
median: The middle score in an ordered set of scores
mode: The most frequent score among a set of scores
multivariate statistical tests: Refers to a set of statistical tests that examine the relationships between multiple independent variables and one dependent variable
N/n: Letter utilized to denote population and/or sample size
Ordinary Least Squares (OLS) regression: A multivariate statistical technique that regresses several independent variables on a metric dependent variable (e.g., number of violent incidents)
pilot test: A pilot test is a “test run” or preliminary study. It can be used to work out any problems before a full study, or to provide a preliminary set of answers to research questions before a full study
quantitative/metric: Term used to sometimes refer to variables measured at the interval or ratio level
range: A measure of variation that is the distance between the lowest and highest score in a set of scores
statistically significant: A relationship between variables, or a difference between variables, is statistically significant when it is larger or smaller than would be expected by chance. The conventional threshold at which a relationship is considered statistically significant is .05, meaning that a relationship or difference this small or large would be expected by chance alone in only 5 of 100 cases
standard deviation: A measure of variation that provides an idea of the variation of all particular values of a variable compared to the mean of that particular variable
t-test: A statistical test that examines whether the mean of an interval or ratio level variable differs significantly between two groups. For example, a t-test can examine whether the difference in final test scores between two classes is statistically significant
variance: Provides an indication of how much each individual value of a particular variable differs from the average of a particular variable
Endnotes
1 Katz, C., V. Webb, K. Fox, & J. Shaffer. (2010). Understanding the relationship between violent victimization and gang membership. Journal of Criminal Justice, 39, 48–59.
2 Ibid.
3 Mosher, C., T. Miethe, & T. Hart. (2011). The mismeasure of crime. Thousand Oaks, CA: SAGE.
4 Vogt, W. P. (1993). Dictionary of statistics and methodology: A nontechnical guide for the social sciences. Newbury Park, CA: SAGE, p. 115.
5 Vogt, W. P. (1993), p. 32.
6 Babbie, E., F. Halley, & J. Zaino. (2003). Adventures in social research. Thousand Oaks, CA: SAGE.