Almost every American research university and library has made significant investments in digitizing its intellectual and cultural resources and making them publicly available. There is, however, little empirical data about how these resources are actually used or who is using them (Harley, 2007). Those who fund and develop digital resources have identified the general lack of knowledge about the level and quality of their use in educational settings as a pressing concern. As part of a larger investigation into the use and users of digital resources (Harley et al., 2006),1 we conducted an experimental analysis of two commonly used methods for exploring the use of university-based Web resources: transaction log analysis (TLA) and online site surveys. In this article, we first provide an overview of these two methods, including their key challenges and limitations. We then describe an implementation of TLA and online surveys in combination on two local sites and the results of that test, including an exploration of the surveys' response rates and bias. From that test, we draw conclusions about the utility of these two methods and the particular analytic approaches that may provide the most valuable and efficient results.
TLA and online surveys explore slightly different aspects of a site's use and users; they can be complementary tools, and the combination of the two may allow a deeper understanding of a site's use than either alone. For example, many Web sites use online surveys to learn more about their users. Among their strengths, surveys can be used to develop a profile of the site's visitors and their attitudes, behavior, and motivations. In particular, sites often employ surveys to determine personal information about their users, to discover users' reasons and motivations for visiting the site, and to explore user satisfaction levels. Transaction log analysis (TLA), on the other hand, can describe the actual usage of the site, including the relative usage volume of different resources, the details of users' navigation paths, the referring pages that led users to the site, and the search terms used to locate or navigate the site. It is a particularly valuable method, either alone or in combination with online surveys, because the usage data are collected automatically and passively; the method records actual user behavior on a site rather than relying on self-reports.
Although these two methods are widely used, there seems to be some ambiguity about the best way to implement them and to report the results, particularly for educational resources (Troll Covey, 2002; Mento and Rapple, 2003). This lack of consensus makes it difficult to interpret statistics for different sites and to compare one site with another (Bishop, 1998). Both TLA and online surveys can be time-consuming and labor-intensive and, unless research and analytic methods are sound, the results may be ambiguous or even misleading. Online surveys often suffer from disappointingly low response rates and biased samples, resulting in potentially misleading interpretations.
Transaction log analysis
Transaction log analysis (TLA) takes advantage of the computerized log files that automatically record online access to any Web site. By analyzing these logs, one can determine a number of characteristics of the site's users and summarize total site use.
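As a concrete illustration, the sketch below parses one such log entry. It assumes the server writes logs in the Apache Combined Log Format; the sample line and field names are ours, not taken from the study sites:

```python
import re

# Apache Combined Log Format: IP, identity, user, [timestamp], "request",
# status, bytes, "referrer", "user agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of fields for one log entry, or None if malformed."""
    m = LOG_PATTERN.match(line)
    return m.groupdict() if m else None

# Fabricated sample entry for illustration only
sample = ('192.0.2.1 - - [10/Oct/2006:13:55:36 -0700] '
          '"GET /spiro/search HTTP/1.0" 200 2326 '
          '"http://www.google.com/search?q=slides" "Mozilla/4.08"')
entry = parse_line(sample)
# entry["referrer"] reveals the page that led the user to the site;
# its query string holds the search terms used to locate it.
```

From fields like these, an analyst can tally usage volume per resource, reconstruct navigation paths, and extract referring pages and search terms.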
There are significant challenges to assessing the use and usability of digital collections through transaction log analysis (Troll Covey, 2002). Bishop's (1998) previous research suggested many of the same issues.
Despite these challenges and limitations, transaction log analysis still has two major advantages over most other user research methods. First, it captures the actual behavior of real users in their own real-use environments; it does not rely on biased self-reports or artificial, laboratory-based use scenarios. Second, because TLA records behavior passively, without requiring users' active participation, it can capture a much broader spectrum of uses and users than can surveys, focus groups, or other methods.
An online survey can be a valuable complement to transaction log analysis for studying the use and users of a Web site; while TLA can reveal users' actual online behavior and usage patterns, surveys can reveal users' motivations, goals, attitudes, and satisfaction levels (Evans and Mathur, 2005). In the past decade, online surveys have become more widespread for a variety of reasons (Fricker and Schonlau, 2002; Gunn, 2002). Online surveys provide some cost and convenience advantages over other survey modes, but they also raise some problems that warrant careful consideration (Evans and Mathur, 2005).
Online surveys can take a variety of forms. Surveys can be administered online as part of a traditional, well-developed survey methodology involving a defined population of interest; an explicit sampling method for generating a representative sample; a well-thought-out recruitment strategy; carefully calculated response rates; and statistical estimates of the likelihood of response bias. Increasingly, however, online surveys are posted on a Web site and made available to anyone who happens upon them. These surveys rarely have a defined population or sampling method; with no way of tracking those who do or don't complete the survey, it is often impossible to report a response rate or estimate response bias.
When one designs a survey instrument for online administration, a variety of new options are available for question structure, layout, and design (Gunn, 2002; Schonlau et al., 2002; Faas, 2004). Important issues in instrument design include question wording, survey navigation and flow, skip patterns, survey length, and the graphical layout of the instrument. Computerization allows the design of more complicated skip patterns and question randomization. Additionally, it is possible to program automatic data checks and verification to disallow the entry of inconsistent responses.
The automation of data collection and analysis can result in an economy of scale, making online surveys much more cost efficient, especially for large sample sizes. Automation can also mean that data (and basic analyses) are available in a much shorter timeframe, sometimes even instantaneously. [A more detailed exploration of techniques for survey design, administration, and analysis can be found in Rossi, Wright, and Anderson (1983) and Fowler (2002).]
Survey response rates
Survey response rates are of some concern to researchers, as rates for all types of surveys have been on the decline since the 1990s (Johnson and Owens, 2003; Baruch, 1999). Evidence suggests that response rates for online surveys are lower than for other media and continue to shrink (Fricker and Schonlau, 2002). In traditional social science survey research, sampling methods are designed to ensure that the survey respondents are representative of the population of interest. If the sample is representative and the response rate is high, the survey results can shed light on the characteristics of the population. If, on the other hand, response rates are low or the sample is known to be non-representative, it is possible, even likely, that the survey results will be misleading. (A large response rate alone is no guarantee that the respondents are representative.)
Sampling techniques and the measurement of response rates, however, are a particular challenge when a survey is posted online and made available to any Web user anonymously, without active recruitment or sampling. In such an environment, the population of users and the characteristics of the respondents are essentially unknown, making it difficult to report response rates and even more difficult to estimate the survey's response bias. The lack of knowledge of the complete population also makes it difficult to design appropriate sampling frames.

Measuring response rates is a particular challenge for online surveys, partly because of the tricky definition of "response." Bosnjak and Tuten (2001) identify distinct response types, including lurkers (who view a survey without responding), drop-outs (who complete the beginning of a survey without continuing), item non-responders (who omit individual questions), and complete non-responders. Complicating the picture is the common practice of offering various rewards to increase participant motivation. The use of rewards and incentives can introduce response bias, however. Individuals who are motivated to respond by a specific reward may not be representative of the whole study population.
Methods and Results: Testing response bias in online surveys
We conducted a test on two local sites, using a combination of TLA and online surveys, to explore the effectiveness of these two methods for elucidating patterns of use and to explore survey response rates and bias. We selected two sites for our analysis: SPIRO, which provides online access to the UC Berkeley Architecture Department slide library, and The Jack London Collection, which features a wide variety of resources about the early-twentieth-century American author.
We placed short surveys on the homepages of both sites for a two-month period and collected the sites' transaction logs from the same period. After analyzing the logs and the survey responses individually, we combined the two by matching each survey response with the logs from the same Web user. (We identified individual users by the combination of IP address and user agent.) We then used this combined dataset to estimate each survey's response rate and to attempt to quantify the self-selection bias among the respondents. More information about the tests, including the survey instruments and analyses, can be found on our project Web site.
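The matching step described above can be sketched in a few lines, assuming each survey response and log entry has already been reduced to a record with IP address and user agent fields; the field names and sample data here are hypothetical and would depend on the survey tool and log parser used:

```python
def link_surveys_to_logs(survey_rows, log_rows):
    """Match each survey response to the log entries from the same user,
    identifying users by the (IP address, user agent) pair."""
    logs_by_user = {}
    for row in log_rows:
        key = (row["ip"], row["agent"])
        logs_by_user.setdefault(key, []).append(row)
    linked = []
    for resp in survey_rows:
        key = (resp["ip"], resp["agent"])
        linked.append((resp, logs_by_user.get(key, [])))
    return linked

# Illustrative records (not actual study data)
surveys = [{"ip": "192.0.2.1", "agent": "Mozilla/4.08", "first_visit": True}]
logs = [
    {"ip": "192.0.2.1", "agent": "Mozilla/4.08", "url": "/spiro/"},
    {"ip": "192.0.2.2", "agent": "Mozilla/4.08", "url": "/spiro/search"},
]
resp, hits = link_surveys_to_logs(surveys, logs)[0]
# hits now holds only the log entries from the responding user.
```

Note that the (IP, user agent) pair is an imperfect key: shared proxies and dynamic addresses can conflate or split users, which is one reason a purpose-built identifier (discussed later) is preferable when linking is planned in advance.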
Table 1: Transaction log analysis: Selected results
Table 2: Online surveys: Selected responses
A summary of the test results can be found in Tables 1 and 2. Table 1 summarizes the usage of the two sites, based on TLA. The Jack London Collection had nearly three times as many usage sessions and unique users as SPIRO (145,959 vs. 54,375 sessions; 97,284 vs. 38,962 unique IP addresses). Overall, the usage patterns were similar, with a few exceptions. SPIRO received twice as much traffic as Jack London from international top-level domains (22% vs. 11%) and four times as much from .edu domains (9% vs. 2%). The Jack London Collection had approximately fifty percent more repeat visitors than SPIRO (17% vs. 11%).
Table 2 summarizes the responses to each site's online survey. Both surveys were designed with a single question on the site's home page leading to a second page with the remainder of the survey. The initial question was completed by 433 visitors for the Jack London Collection and 106 for SPIRO; fewer than half of these completed the remainder of the survey (196 for Jack London; 45 for SPIRO). In both cases, the number of survey responses was less than one percent of the number of unique IP addresses logged through TLA. Among the survey responders, SPIRO had more visitors from higher education than from K-12 schools (72% vs. 12%) while Jack London had the reverse (25% vs. 47%). For both sites, the majority of responders reported that it was their first time on the site, although the trend was more extreme for Jack London (74% vs. 67%). Only a minority of responders reported using each site at least monthly (24% for SPIRO; 13% for Jack London).
Online survey representativeness
In both tests, fewer than two site visitors in a thousand completed the online survey. Because of the low number of responses relative to site visitors, we had serious concerns about the respondents' representativeness and the value of the survey results. Since users could freely choose whether to answer the survey, it seems reasonable to assume that certain types of people were more or less likely to respond. But could we test that assumption quantitatively?
Combining online surveys with transaction log analysis of the same site during the same time period allows new techniques for measuring the survey's response rate and for estimating response bias. The transaction logs enable us to measure the full population of site users during the study period: every user who viewed the site's homepage and therefore had the opportunity to take the survey. The transaction logs also allow us to describe everyone in the target population according to a few behavioral measures, based on their actual browsing patterns on the site. (Additional analyses would be required to see if site usage during the study period was typical of site usage at other times.)

To assess whether the survey respondents were representative, we identified three behavioral measures from the transaction logs that could be calculated for both survey responders and non-responders: the number of browsing sessions each person had during the logging period, the number of files accessed per session, and the average session length. The first of these measures the frequency of site usage; the second and third estimate the depth of that usage, or the user's level of engagement with the particular site. We compared these measures for the survey responders and the survey non-responders.
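Once log entries have been grouped into per-user sessions, the three measures are straightforward to compute; a minimal sketch, with hypothetical session keys and illustrative values:

```python
def behavioral_measures(sessions):
    """Summarize one user's sessions: session count, mean files per
    session, and mean session length in seconds.
    Each session is a dict with 'files' and 'seconds' (hypothetical keys).
    """
    n = len(sessions)
    files = sum(s["files"] for s in sessions) / n
    seconds = sum(s["seconds"] for s in sessions) / n
    return {"sessions": n, "files_per_session": files, "mean_seconds": seconds}

# One user's (invented) sessions during the logging period
user_sessions = [{"files": 12, "seconds": 300}, {"files": 4, "seconds": 60}]
m = behavioral_measures(user_sessions)
# m == {"sessions": 2, "files_per_session": 8.0, "mean_seconds": 180.0}
```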
To assess the likelihood and magnitude of response bias, we performed a series of t-tests, comparing the two groups on the three behavioral measures above. The t-test focuses on the observed means and provides an estimate of the likelihood that the difference between the means of the respondents and the non-respondents is due to chance (Steel and Torrie, 1980). A low p-value indicates that the survey responders are unlikely to be a representative sample of the population. We performed this analysis for both test sites.
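Such a comparison can be sketched as follows, using Welch's two-sample t statistic with a normal approximation to the p-value (adequate when, as here, the groups number in the thousands); the toy samples below are illustrative, not our actual measurements:

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's two-sample t statistic, with a two-sided p-value from the
    normal approximation (reasonable for large groups)."""
    na, nb = len(sample_a), len(sample_b)
    se = math.sqrt(variance(sample_a) / na + variance(sample_b) / nb)
    t = (mean(sample_a) - mean(sample_b)) / se
    p = math.erfc(abs(t) / math.sqrt(2))  # P(|Z| > |t|), two-sided
    return t, p

# Toy data: sessions per user for responders vs. non-responders
responders = [3, 4, 5, 4, 6, 5, 4]
non_responders = [1, 1, 2, 1, 3, 1, 2, 2, 1, 1]
t, p = welch_t(responders, non_responders)
# A small p suggests the difference in group means is unlikely
# to be due to chance alone.
```

For small samples, the p-value should instead come from the t distribution with the Welch-Satterthwaite degrees of freedom, as in a standard statistics package.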
Table 3: Representativeness t-tests
Survey responders (who submitted both pages of the survey) for whom log data are also available
For both sites, these results indicate that the users who responded to the survey were noticeably different from the typical site user: they used each site more frequently, and each session was longer and more in-depth, involving more files per session. The p-values indicate that these differences are highly statistically significant. The survey clearly suffers from response bias, and the respondents are a non-representative sample on the three measures we compared.
These findings confirmed our fears about survey response bias; the few users who bothered to respond to the surveys are demonstrably different from the average site visitor. Since the results show that the respondents are non-representative on these three behavioral measures, we determined that it would be unwise to draw any conclusions from the survey about the characteristics of the site's visitors overall.
Online surveys and TLA, when properly implemented, can be useful and reliable methods for understanding site use; however, low response rates for online surveys may make it difficult to obtain a representative sample of users (Bishop, 1998; Evans and Mathur, 2005). In this case, a survey may actually yield ambiguous or even misleading results. Based on our tests, we have several observations and suggestions for the use of online surveys and TLA.
First, survey responses will probably not be representative of all site users. Therefore conclusions drawn from survey results can be erroneous if applied to the whole population of a site's users. As a basic-level check, we suggest that researchers at least estimate the survey's response rate by tracking the total number of site users during the survey time period. This analysis should be relatively straightforward, and will not require actually merging the two datasets.
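Using the figures reported in Tables 1 and 2 (completed surveys vs. unique IP addresses logged via TLA), the basic-level check is one line of arithmetic per site:

```python
# Rough response-rate estimates from the figures reported in the text.
# Unique IP addresses only approximate the true number of visitors.
sites = {
    "Jack London": {"completed": 196, "unique_ips": 97_284},
    "SPIRO": {"completed": 45, "unique_ips": 38_962},
}
for name, s in sites.items():
    rate = s["completed"] / s["unique_ips"]
    print(f"{name}: {rate:.2%}")  # a small fraction of one percent
```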
Estimating the survey's response bias quantitatively, as we did in our tests, is more involved; nonetheless, this may be a valuable analysis to conduct before drawing any global conclusions from a potentially biased survey, especially if expensive planning decisions might be drawn from the results. In order to estimate the survey's response bias, we linked survey responses to transaction log data from the same usage session. We found that this process required a high level of expertise and a great deal of time. In addition, we did not discover any readily available software tools to facilitate these analyses. For sites that have the time and expertise, this high-level analysis can allow the calculation of survey response rates and estimation of survey response bias.
When planning to link survey responses with transaction logs, both the site and the survey should be designed to support linking, with unique identifiers visible in both the usage logs and in the survey results. The data manipulations should certainly be part of a pilot test, before the full-scale survey is launched.
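One way to make such an identifier visible in both datasets is to embed a token in the survey link itself: because the token is part of the requested URL, it appears in the transaction log, and if the survey form also stores it, responses and log sessions can be joined on the token rather than the fragile (IP, user agent) pair. This sketch is illustrative only; the URL and parameter name are hypothetical:

```python
import secrets

def survey_link(base_url="/survey"):
    """Build a survey link carrying a one-time token (hypothetical
    'sid' parameter). The same token later appears in both the access
    log and the stored survey response, enabling an exact join."""
    token = secrets.token_hex(8)  # 16 random hex characters
    return f"{base_url}?sid={token}", token

url, token = survey_link()
# e.g. "/survey?sid=3f9a..." is what the server logs when the link is followed.
```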
Our experiment involving the complementary power of online surveys and TLA suggests both benefits and drawbacks to Web site owners in using one or both of these tools.
Understanding the usage and users of an educational Web site can provide valuable insights to facilitate better decision-making, improve site design, and support the site's target users in making better use of the available materials. Though there are many types of sites and models for understanding users, TLA and online surveys have become ubiquitous practice. Supplementing survey results with information from transaction logs can help to reveal survey bias, to better understand users' goals and objectives, and to clarify user behavior patterns. While a comprehensive picture of users is not generally possible even when combining these tools, the extra effort of linking the two methods may provide a more robust understanding of users than if using either surveys or TLA alone.
This work was made possible by generous funding from the William and Flora Hewlett Foundation and the Andrew W. Mellon Foundation. Additional support was provided by the Hewlett-Packard Company, the Center for Information Technology Research in the Interest of Society (CITRIS), the California Digital Library (CDL), and the Vice Chancellor of Research, UC Berkeley. We are very grateful to the staff of SPIRO and the Jack London Collection for being so generous with their time and allowing us access to their sites and records. Invaluable contributions to data collection and analysis were made by Ian Miller, Charis Kaskiris, and David Nasatir. Shannon Lawrence assisted with editing.
1. The final project report and other materials are available on our project Web site at <http://cshe.berkeley.edu/research/digitalresourcestudy/index.htm>.
2. See for example the Web TLA Software Package Comparison in Appendix N: Harley, Diane, Jonathan Henke, Shannon Lawrence, Ian Miller, Irene Perciali, and David Nasatir. 2006. Use and Users of Digital Resources: A Focus on Undergraduate Education in the Humanities and Social Sciences. University of California, Berkeley. Available at: <http://cshe.berkeley.edu/research/digitalresourcestudy/report/>.
Bosnjak, Michael, and Tracy L. Tuten. 2001. Classifying Response Behaviors in Web-based Surveys. Journal of Computer-Mediated Communication (JCMC) 6(3) (April). Available at <http://jcmc.indiana.edu/vol6/issue3/boznjak.html>.
Carson, Stephen E. 2004. MIT Opencourseware Program Evaluation Findings Report. Massachusetts Institute of Technology, Cambridge, Mass. Available at <http://ocw.mit.edu/OcwWeb/Global/AboutOCW/evaluation.htm>.
Chambers, Ray L., and Chris J. Skinner. 2003. Analysis of Survey Data. Chichester, England: Wiley.
Evans, Joel, and Anil Mathur. 2005. The value of online surveys. Internet Research 15(2): 195-219.
Faas, Thorsten. 2004. Online or Not Online? A Comparison of Offline and Online Surveys Conducted in the Context of 2002 German Federal Election. Bulletin de Méthodologie Sociologique 82: 42-57.
Fowler, Floyd J. 2002. Survey research methods. Thousand Oaks, Calif.: Sage Publications.
Fricker, Ronald D., and Matthias Schonlau. 2002. Advantages and Disadvantages of Internet Research Surveys: Evidence From the Literature. Field Methods 14(4): 347-367 (November 1). Available at <http://fmx.sagepub.com/cgi/reprint/14/4/347.pdf>.
Harley, Diane, Jonathan Henke, Shannon Lawrence, Ian Miller, Irene Perciali, and David Nasatir. 2006. Use and Users of Digital Resources: A Focus on Undergraduate Education in the Humanities and Social Sciences. University of California, Berkeley. Available at: <http://cshe.berkeley.edu/research/digitalresourcestudy/report/>.
Harley, Diane. 2007. Why Study Users? An Environmental Scan of Use and Users of Digital Resources in Humanities and Social Sciences Undergraduate Education. First Monday, volume 12, number 1 (January 2007) Available at <http://firstmonday.org/issues/issue12_1/harley/index.html>.
Johnson, Timothy, and Linda Owens. 2003. Survey Response Rate Reporting in the Professional Literature. Paper presented at Annual Conference of the American Association for Public Opinion Research, Nashville, Tenn., May 15.
Rossi, Peter H., James D. Wright, and Andy B. Anderson, eds. 1983. Handbook of Survey Research. New York: Academic Press.
Schonlau, Matthias, Ronald D. Fricker, and Marc N. Elliott. 2002. Conducting research surveys via e-mail and the web. Santa Monica, Calif.: Rand. Available at <http://www.rand.org/publications/MR/MR1480/>.
Steel, Robert George Douglas, and James H. Torrie. 1980. Principles and Procedures of Statistics: A Biometrical Approach. New York: McGraw-Hill.
Troll Covey, Denise. 2002. Usage and Usability Assessment: Library Practices and Concerns. Digital Library Federation and Council on Library and Information Resources (CLIR), Washington, D.C. Available at <http://www.clir.org/pubs/reports/pub105/contents.html>.
Copyright © 2007 Diane Harley and Jonathan Henke