On Spotting Trends…


[NOTE: I encourage you to read the entire post. I ask for your opinions on certain matters later in the entry. You can contact me via e-mail at or at my Twitter handle - .]

Thank you, all, for clicking on my page. The goal of this page is to introduce thoughtful — and perhaps, academic — research into NASCAR’s overarching policies…

And what better way to study the popularity of NASCAR than by researching its top series throughout time?

We have the data to provide some conclusions. But for some reason, many rely on anecdotal evidence, which is often riddled with bias, misinformation, and misunderstanding. All of that perpetuates ignorance across the sanctioning body, its teams, and its fans. To find a solution, one must first identify the problem. So here’s the objective of my first motorsports project:

“To set straight the narrative on NASCAR’s history of popularity.”

To do so, one can generate two theoretical datasets, A and B, to determine the popularity of stock car racing’s top series — NASCAR’s Cup Series:

A — A dataset that measures the television audience for each race
B — A set that tracks the attendance for each race

I choose the household television audience as the better proxy for the series’ popularity. Here are my reasons for selecting the T.V. side:

1 – Availability. The racing experience is available to more people via television than through attendance at the track. Thus, the television audience is a better sample of the population. Additionally, the price of watching a race on T.V. is similar (though not the same) for each person. For in-person attendance, various tiers of ticket prices exist and fans travel different distances to attend, so the cost differs from person to person. We would need some mechanism to hold that cost constant (we could do so in theory, but the data we could gather are not granular enough).

2 – Source. The household audience measurements originate from a single firm unrelated to motorsports — the Nielsen Group. As such, the methodology for viewership compilation is the same for each race. In contrast, attendance figures emerge from varying entities. Each track provides its own numbers. These data are normally rounded and include an upward (or perhaps downward) bias to serve policies and agendas for the track.

3 – Exclusivity. Unlike with many other sports, the only legal way to watch a Cup race, if not in person, is on television (ESPN has made a quiet push to allow use of WatchESPN). Other sports often offer an online stream, which pulls viewers away from television and deflates their television audience numbers. Since online viewership is not yet possible for the Cup Series, we need not worry about downward bias in our television estimates.

4 – Completeness. The television dataset is rich. I have compiled ratings for every race since 1995. Although this sort of depth may be possible for attendance figures, it certainly is not available publicly for 17 seasons.

5 – Value. NASCAR implicitly values television audiences over those who attend a race. The 2001 television deal and NASCAR’s latest deal with FOX and FS1 confirm that much more revenue is generated from television deals than from tickets to races.

6 – Comparison. One can compare Nielsen figures from race to race. Since the supply of tickets varies with each track’s capacity, a potentially binding ceiling exists. In other words, a sell-out at Bristol cannot capture the unobserved demand of those who wanted to attend but could not get a ticket.

So how exactly is the television audience measured? The number you see at the end of the week (like on this page) is simply the percentage of households with televisions that watched the race. Even though the number of households with televisions has declined recently, recall that one cannot legally view a Cup race in any other fashion.

Dating back to the 1996 Daytona 500, I note which races were postponed or delayed for inclement weather. While I obtained those numbers, I did not include them within any model. They are not indicative of the popularity of that race since the race did not occur in its scheduled time slot.

I also take note of the tracks’ configurations as well as other features (like what time the race occurred, what television network it appeared on, etc.):

Okay. This is fine, I guess. We see that more households watched from 2001 through maybe 2006 relative to the rest of the sample time period. We are not quite sure of these dates, however, and the magnitude of those ratings’ changes is very difficult to see.

This noisiness is happening because different race types affect who watches. For instance, more people enjoy plate races, the NFL steals a big chunk of viewers, etc. (these impacts are the bread-and-butter of this blog and will be discussed extensively for the next few entries). But for now, let’s find some sort of average to indicate when the ratings changed and by how much.

The mean (the regular average we all consider) will not work. It cannot differentiate between those different kinds of races. What if our mean includes five plate races over the past year (due to schedule changes)? The average would be artificially high and would not capture the true average audience. Similarly, imagine a scenario where three road course races occurred in a calendar year (note: road courses are the least-viewed configuration in NASCAR). Then the mean would do NASCAR a disservice by underestimating the average audience.

I submit that the median is our best bet. It is just the “middle rating” from the past year: half of the ratings are above it and half are below. When we include the black median line, here’s what we get:

That’s much better. Now we have an idea of when the average rating changed. We can see that the household rating spiked in 2001, grew a little more through 2003, slipped a little through 2006, and then plummeted in late 2006.
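For readers who want to try this at home, here is a minimal sketch of that trailing one-year median. It is not the exact code behind my graph. The `races` list, its ratings, and the `trailing_stat` helper are all hypothetical and made up for illustration, and I am assuming a 365-day window and that weather-affected races are dropped before averaging.

```python
from datetime import date, timedelta
from statistics import mean, median

# Hypothetical records: (race date, household rating, weather-affected?).
# These ratings are invented for illustration; they are NOT real Nielsen numbers.
races = [
    (date(2005, 2, 20), 10.0, False),  # e.g. a plate race with a big audience
    (date(2005, 3, 13), 5.0, False),
    (date(2005, 4, 10), 5.5, False),
    (date(2005, 6, 26), 4.0, False),   # e.g. a road course
    (date(2005, 7, 2), 3.0, True),     # postponed or delayed: left out
    (date(2005, 10, 2), 6.0, False),
]


def trailing_stat(races, as_of, stat, window_days=365):
    """Apply `stat` to ratings of on-schedule races in the window ending at `as_of`.

    Returns None when no qualifying race falls inside the window.
    """
    start = as_of - timedelta(days=window_days)
    ratings = [
        rating
        for race_date, rating, weather_affected in races
        if start < race_date <= as_of and not weather_affected
    ]
    return stat(ratings) if ratings else None


as_of = date(2005, 12, 31)
print("trailing median:", trailing_stat(races, as_of, median))
print("trailing mean:  ", trailing_stat(races, as_of, mean))
```

With these made-up numbers, the single big plate-race rating pulls the mean up to 6.1 while the median stays at 5.5, which is the behavior I am after. Computing the statistic at every race date, rather than at one date, would trace out a line like the black one in the graph.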

Before I provide some empirical analysis in future entries, I want your input. Why the drop in 2006 through 2011? Why the huge spike in 2001? Any thoughts on ratings in the 1990s? Feel free to answer these questions and send me any other opinions and thoughts that you have. You can contact me via e-mail at or on Twitter at .

So the median fixed that noisiness problem from the first graph. But we should want to know more about that noisiness. Do people prefer one T.V. station to another? Which track types are popular? Do night races affect viewership? That’s what statistical modeling can do for us. I’ll talk more about that in my next several entries.

Andrew, a.k.a. “”
