This post is part of a series that considers methodological and reporting flaws that frequently arise in surveys, and uses Travelandleisure.com’s “America’s Favorites Cities” (AFC) survey as its primary point of discussion. Links to other posts in this series are featured on the author’s page.
Reports frequently tout a survey’s sample size, as if a massive number of responses should somehow instill confidence in the survey’s results. It shouldn’t. In fact, a massive sample size should raise a massive red flag.
Travel and Leisure’s “America’s Favorite Cities” survey highlights its huge response count of 40,000. The survey was conducted online and asked respondents to rank up to 35 cities on a wide variety of attributes. The internet makes such large sample sizes relatively easy to obtain, and it seems every news website or blog in cyberspace (including this one) now features a survey of the week. Surveys have proliferated and response counts have exploded thanks to inexpensive data collection. The downside is that cheaply collected data frequently delivers suspect results.
Many casual observers sign off on a survey’s reliability based on response count alone. A look at more rigorous surveys shows that massive response counts (e.g., 40,000) are unnecessary. Take, for example, polls from the 2012 Presidential race. A recent Rasmussen Reports survey had a sample of 1,500 responses and delivered results with a margin of error of +/- 3 percentage points at 95% confidence. Rasmussen could certainly increase its pool of responses at relatively little cost, yet whatever the survey might gain in statistical precision from cheaply collected data would likely be lost in accuracy. Rasmussen and other serious surveys are appropriately more focused on capturing a representative sample than an overwhelmingly large one.
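To see why piling on responses buys so little, here is a rough back-of-the-envelope sketch using the textbook margin-of-error formula for a simple random sample, assuming the worst-case proportion p = 0.5 and z = 1.96 for 95% confidence. (Real pollsters apply weighting and design effects, so this won’t exactly reproduce Rasmussen’s published figure, and the formula only measures sampling error, which presumes the sample is representative in the first place.)

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a simple random sample of size n.

    Uses the normal approximation with the worst-case proportion p = 0.5
    and z = 1.96 for 95% confidence.
    """
    return z * math.sqrt(p * (1 - p) / n)

for n in (1_500, 40_000):
    print(f"n = {n:>6,}: +/- {margin_of_error(n) * 100:.1f} percentage points")

# n =  1,500: +/- 2.5 percentage points
# n = 40,000: +/- 0.5 percentage points
```

Going from 1,500 responses to 40,000 shrinks the margin of error by only about two percentage points, and even that modest gain is illusory if the sample is self-selected rather than representative.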
Huge sample sizes should prompt further inquiry. Several questions are worth pursuing. First, how did respondents qualify for the survey, if there were any requirements at all? Travel and Leisure’s survey doesn’t specify how it identified survey takers. It does say that “respondents were asked whether they live in or had visited the cities they rate.” But it doesn’t say how this response affected the survey’s reporting, or whether a response was counted even if the survey taker had no direct experience with the city. Nor does the survey indicate that it did anything to verify the accuracy of respondents’ travel claims. A more accurate, and much more expensive, approach might involve conducting surveys in person at tourist destinations. Response counts would be far smaller because of the expense, but the results would likely be more accurate.
Second, were respondents permitted to take the survey more than once? One way to guarantee a large number of responses is to allow survey takers to answer multiple times, but not restricting repeat respondents boosts the response count at the expense of accuracy. A city’s tourism board, for instance, would have a strong interest in swaying the results. It is remarkable how well San Juan, Puerto Rico performs across the board in Travel & Leisure’s survey results. Did the San Juan tourism board have something to do with this?
Third, how many respondents actually completed the survey? It may be that everyone who started the survey is included in the response count, yet the number of survey dropouts could be substantial, especially if the survey is very long. Incomplete surveys could hurt the accuracy of the results.
Finally, what other corners did the survey cut in order to obtain such a massive response count? The survey might, for example, have limited its length and overlooked important questions in order to maximize the number of responses. And if the responses were cheaply obtained, was the survey itself cheaply constructed? More on this in other posts.