
USER UNCERTAINTIES WITH TABULAR STATISTICAL DATA: IDENTIFICATION AND RESOLUTION

Carol A. Hert and Naybell Hernández

Syracuse University

October 1, 2001

Final Report for Purchase Order #B9J03235

1. EXECUTIVE SUMMARY

1.1. Study Objectives

United States government services are increasingly becoming Web-based, creating opportunities to make potentially useful, even vital, information and services more easily accessible to citizens than in the past. This opportunity has challenged Federal agencies as they work to provide information and services that are easy to use and understandable to an extremely diverse constituency. Federal mandates requiring agencies to provide "universally usable" information and services have added further impetus to resolving these challenges.

The statistical agencies have been addressing these issues through a variety of strategies and approaches. The FedStats website and its related planning and research development activities have been one venue. National Science Foundation funding for projects associated with statistical digital government has been another. The work reported on here was conducted in conjunction with an NSF-funded project investigating statistical information in tabular format.

Enabling universal access and usability of statistical tables can be modeled as a process in which a user with an information need comes to a system in order to locate and then use a table or tables of interest. The NSF project developed an integrated approach to that process. Several specific technologies were developed to support this process, each of which was designed to incorporate a rich understanding of user behavior that the project has developed. The specific piece funded under this BLS purchase order concerned user understanding of tables and the extent to which metadata could be used to support enhanced understanding.

Specifically, this project addressed the following questions:

· What questions and uncertainties do users have when investigating the statistical tables used in the NSF project?

· What are the answers to these questions?

· To what extent is metadata available to answer the questions?

· How do the questions, question types, and answers map to the XML DTD developed by the NSF project to support the Table Browser?

As a result of the investigation of these questions, a number of issues related to metadata creation and use were identified and a set of recommendations developed.

1.2. Findings and Recommendations

Users had a variety of uncertainties when investigating tables. The majority of these related to definitions of terms, categories of variables, etc. A second important class of uncertainties was that concerning rationales for why certain things were done, reported in certain ways, etc. Other uncertainties related to the structure of the tables and lack of information on various aspects of the tables. Users also provided a wide variety of suggestions and complaints about the tables.

Answers were found for all user uncertainties by searching relevant documentation and asking experts. Questions concerning rationales were difficult to resolve through existing documentation while other answers were found in the documentation.

The uncertainty categorization scheme developed in the project can serve to categorize questions in future studies in which the goal is to map to metadata sources and specify tool implementations.

One important implication of this study for metadata design is that it offers some notions of how to translate users’ uncertainties into metadata, and metadata into functional features of an information system. In order to scale the results of this project, it will be necessary to understand the processes by which a user uncertainty can be mapped to a potential answer and then potentially presented via the interface tools. A number of issues related to both the uncertainties and currently available metadata were identified. These include the potential uniqueness of answers needed to respond to user uncertainties, the specificity of answers provided, and the lack of easily retrieved information from documentation (due to lack of encoding within documentation).

One of the obstacles experienced during the project was the incomplete development of existing DTD’s and their lack of compatibility with project needs. This project suggests approaches to further development work.

This work might be furthered with the following additional research:

· Expand the identification and coding of user uncertainties to additional tables in order to further validate the coding scheme and, potentially, begin to determine relative frequencies of uncertainty types.

· Test the extent to which the Table Browser or other tools that incorporate relevant metadata are able to resolve user uncertainties.

· Conduct document analyses to determine the effort involved in resolving user uncertainties with existing documentation.

The most obvious area for further application development is metadata encoding and DTD development. As the statistical community continues to disseminate its information electronically, it will become ever more critical for the metadata behind the data to be easily available to users and applications. The most logical approach would be to encode it in structured and standardized formats. Some metadata already exists in this form (such as data dictionaries) but technical documentation does not. XML has also become the standard of choice for encoding information. Thus the following recommendations for development and for agency action seem relevant:

· Continue efforts to develop metadata standards.

· Build relevant XML DTD’s for agency information.

· Investigate mechanisms for ensuring compatibility of DTD’s across document types and agencies.

2. PROJECT OVERVIEW

Enabling universal access and usability of statistical tables can be modeled as a process in which a user with an information need comes to a system in order to locate and then use a table or tables of interest. The NSF project developed an integrated approach to that process. Several specific technologies were developed to support this process, each of which was designed to incorporate a rich understanding of user behavior that the project has developed. Figure 2.1 represents the larger project. The specific piece funded under this BLS purchase order concerned user understanding of tables and the extent to which metadata could be used to support enhanced understanding. In Figure 2.1, the work of this project is contained within the component at the far right, entitled the Table Browser.

FIGURE 2.1: INTEGRATION

Specifically, this project addressed the following questions:

· What questions and uncertainties do users have when investigating the statistical tables used in the NSF project?

· What are the answers to these questions?

· To what extent is metadata available to answer the questions?

· How do the questions, question types, and answers map to the XML DTD developed by the NSF project to support the Table Browser?

As a result of the investigation of these questions, a number of issues related to metadata creation and use were identified and a set of recommendations developed.

2.1. Universal Usability And The Role Of User Understanding

Shneiderman (2000, pp. 85-86) has framed the universal usability challenge as having three components: 1) the need to support a diverse technology base, 2) the need to provide access to diverse users with diverse skills and tasks, and 3) the need to bridge user knowledge gaps. It is the second and third aspects of universal usability that are the focus of this project, as appropriate technological solutions rest, to some extent, on the characteristics of users and their needs.

The world of Federal statistical information is a challenging one for most users, who must navigate a labyrinth of agencies (over 70 at the Federal level), interpret very distilled information (numbers, often presented in formats such as tables that are difficult to use), and who, to use the information appropriately, may need to understand very specific details of the data collection and analysis that generated the numbers. Statistics and statistical information are not easy for the layperson to use. Most of us are not taught in school how to read or work with statistics, resulting in low statistical literacy for the general population (Moore, 1997). Statistics are often highly distilled (as a specific statistic, a table or a time-series of statistics), have been produced through complex statistical and mathematical procedures (such as sampling design, weighting), and utilize specific and sometimes arcane definitions of concepts and variables (with associated jargon). All these represent potential sources of misunderstanding and barriers to use.

The work reported here is focused on tables. There are several rationales for this focus. Although substantial effort has been given to graphical representations of data (e.g., Carr, 1998; Wainer, 1997; Wilkinson, 1999), tabular displays are treated minimally at best (e.g., Hall, 1943; Walker and Durost, 1936). Tables are a common conceptual and presentational structure by which statistical data are stored and represented. Data in tabular form are often the starting point for additional depictions (such as graphics or analytical reports) and contextualize specific numbers. Tables are, however, difficult to find, interpret and use. Most commercial search engines neither index the contents of tables nor retrieve that information, and often do not even identify the existence of tables within a text. Once a table is found, users face succinct labels and highly distilled numbers and may wish to perform comparisons and calculations that are difficult in static tables. The ubiquity of tables, along with the associated challenges, suggests that research into improvement of table retrieval, interpretation, and use has the potential to significantly improve access to data produced by statistical agencies.

Providing universal access to tables is both a technical challenge and a challenge of understanding users and modeling that understanding. In this project, we first identified user questions and then investigated how to model them as metadata, specifically in terms of the metadata elements available within the NSF project’s DTD.

2.2. The Role Of Metadata In Location And Understanding

Metadata is an often ambiguous and nebulous term and is used variously in different communities. Dempsey and Heery (1998) define metadata as information that enables one to manage and use the data/information to which they refer. This definition highlights two key points, that metadata are defined within a context (there is no one set of metadata associated with a set of data), and that they are information that supports usage. Some of the purposes which metadata may support are information/resource discovery, administrative uses such as tracking terms and conditions of use, the context of creation, and unique identification of objects (see Bearman (1996) for a discussion).

Within the statistical domain, metadata may include subject heading schemes to support resource discovery (such as the list of headings employed by the American Statistical Index (published by the Congressional Information Service) and HASSAT (from the University of Essex)), codebook information, survey instruments and related documentation, as well as reports and other documentation produced by survey methodologists about data collection strategies, analysis of past survey efforts, etc. (Dippo and Gilman, 1999).

2.2.1 Past empirical work on metadata use

The study of user interaction with metadata is not entirely new. Within the traditional library and information science domain, there is a thread of research most commonly known as relevance judgment research that investigates how users make judgments on the relevance (variously defined and operationalized) or potential relevance of information units. Traditionally those information units have been articles and books, and users examine representations of those units (such as citations, which represent the metadata in this case) and indicate those they consider relevant or non-relevant. Users are asked about the criteria they are using in the judgments and how they make those judgments. The intent of this line of work has been to understand the phenomenon of relevance judgment, provide typologies of relevance criteria, and in some cases to suggest enhancements to the representations of the information units (see, for example, Park (1993) and Barry (1994)). For example, if users indicate that having information on the chapter titles in a book is helpful, it may be suggested that such information be added to the description of the book.

The vast majority of work of this type has looked at books (using information on records in online library catalogs) or articles (using periodical databases with or without abstracts). Users may be asked to examine different representations of the same item such as a citation, a citation with an abstract, or the item itself. Only recently have other types of information entities such as maps (Gluck, 1996) and meteorological data (Schamber, 1991) been considered.

In the domain of statistical information seeking, the author and Bosley (as reported in Hert, 1999) have been investigating how experts and other users employ metadata within codebooks (in this case, from the Current Population Survey) as they choose variables for analysis. He and Gey (1996) allude to the value of the codebook data in choosing variables in a paper that discusses a system that might facilitate browsing of such data.

In general, these studies have worked from existing metadata associated with information entities back to user behavior with that metadata. Such an approach limits our ability to see what metadata might actually resolve user uncertainties since we have not begun with those uncertainties. Thus in this project we began with identifying these uncertainties then moved on to the potential of metadata to resolve them.

2.3. Metadata and XML

To make metadata accessible in an automated environment, it needs to be represented and encoded so that software can identify appropriate metadata components and retrieve them. In the last several years, there have been a variety of efforts to encode statistical metadata. The International Organization for Standardization (ISO) has developed a standard, ISO/IEC 11179. The Inter-university Consortium for Political and Social Research (ICPSR) has a program entitled the Data Documentation Initiative (DDI), and a UN/ECE Work Session on Statistical Metadata (see for example: http://www.unece.org/stats/documents/2000.11.metis.htm) has been actively engaged in discussions.

For this project, the DDI encoding was used to encode tables and metadata. This choice was made because the DDI has a specific encoding designed to encode tables and project personnel had expertise with this encoding. However, the DDI encoding was not fully compatible with project needs and some modification was done. Details on the encoding of tables and metadata using the DTD can be found in Marchionini and Mu (2001).
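To illustrate why encoding matters for retrieval, the following is a minimal sketch of how a definition might be attached to a table term in XML and then looked up by software. The element and attribute names (`table`, `termNote`, `definition`, `source`) and all text values are invented for this illustration; they are not taken from the DDI DTD or from the project's modified DTD.

```python
import xml.etree.ElementTree as ET


# Hypothetical element and attribute names, invented for this sketch only.
def build_table_metadata(table_id, term, definition, source):
    """Return an XML element that attaches one definition to one table term."""
    table = ET.Element("table", attrib={"id": table_id})
    note = ET.SubElement(table, "termNote", attrib={"term": term})
    ET.SubElement(note, "definition").text = definition
    ET.SubElement(note, "source").text = source
    return table


def lookup_definition(table, term):
    """Return the definition encoded for a term, or None if none is encoded."""
    for note in table.findall("termNote"):
        if note.get("term") == term:
            return note.findtext("definition")
    return None


# Placeholder values: no real definitions or document references are implied.
table = build_table_metadata(
    "example-table",
    "example term",
    "placeholder definition text",
    "placeholder documentation reference",
)
print(ET.tostring(table, encoding="unicode"))
print(lookup_definition(table, "example term"))
```

Once a definition is encoded in a predictable place, a tool can retrieve it by term name instead of relying on a reader to search free-text documentation.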

3. METHODOLOGY

3.1. Investigating User Uncertainties

3.1.1 Overview

The investigation of user uncertainties involved several different activities. First, a set of respondents interacted with specific tables. The research team then mined transcripts of their sessions for uncertainties, questions, complaints, and suggestions. Answers to all questions were found by the research team. Questions and complaints were categorized.

Eleven people participated in the study. Each participant viewed a total of three tables in a mix of electronic and paper formats. After an initial unstructured period in which each participant was instructed to examine the tables, the researchers asked a series of questions about the participant’s understanding of each table. Demographic information on each respondent was gathered via a self-administered questionnaire at the beginning of the interview.

The team created records of each participant’s comments, responses to interview questions and other data (such as which component of a table was the focus of the comments). Analysts reviewed the records, and extracted uncertainties, suggestions, and complaints. The team coded the resultant lists using the schemes described below.

The team also searched for answers to the specific individual questions (rather than for the derived categories of questions). Answers were sought within the actual table and accompanying text (e.g., footnotes), related documentation (in both electronic and paper format), and in some cases, by consulting experts within the agencies that produced the tables.

3.1.2 Data collection

3.1.2.1 Table Selection

For the study, the team selected four tables from a set of tables nominated by agency partners in the project (tables available in Appendix 1). The four selected differed in their content, complexity, size, and formatting styles. The intent was to provide sufficient variety while still assuring that the researchers could provide the tables in multiple formats as well as be able to find answers to user questions. While this has implications for the generalizability of the results, consensus on what constitutes important differences in table format is generally lacking even among experts on statistical presentation. Additionally, it was important to show users real instances of tables, rather than artificial constructions, in order to identify actual questions.

All participants reviewed a set of three assigned tables about which they would answer questions. Some of these tables were presented in paper format, others in electronic format according to a researcher pre-defined set of combinations of the four tables and the two formats. All combinations had at least one example in each of the two formats (paper and electronic) to account for any difference that might occur when using different presentation media. One table was only available in electronic format.
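The constraints described above (three of the four tables per participant, both formats represented, and one table available only electronically) can be enumerated mechanically. The sketch below lists every assignment that satisfies them. It is an illustration of the constraints only: the report does not say which of these candidates made up the researchers' pre-defined set, and the short table codes are borrowed from the legend of Table 3.2.

```python
from itertools import combinations, product

TABLES = ["AAG", "T14", "LE", "Gas"]
ELECTRONIC_ONLY = {"AAG"}  # the one table available only in electronic format
FORMATS = ("paper", "electronic")


def candidate_assignments(tables=TABLES, per_participant=3):
    """Enumerate table/format assignments that satisfy the stated constraints."""
    results = []
    for subset in combinations(tables, per_participant):
        for formats in product(FORMATS, repeat=per_participant):
            pairs = list(zip(subset, formats))
            # An electronic-only table cannot be presented on paper.
            if any(t in ELECTRONIC_ONLY and f == "paper" for t, f in pairs):
                continue
            # Every assignment must include at least one table in each format.
            if set(formats) != set(FORMATS):
                continue
            results.append(pairs)
    return results


assignments = candidate_assignments()
for pairs in assignments:
    print(pairs)
print(len(assignments), "candidate assignments")
```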

3.1.2.2 Selection of Participants

Study participants were solicited through calls for participation posted in the university library’s government documents section. The researchers assumed that visitors to this section of the library would be more likely to be interested in and potentially knowledgeable about government information and statistics. Potential participants were screened for previous use of government statistical data. The study had a total of eleven participants (three males and eight females). Characteristics of the participants are indicated in Table 3.1. Each person was paid 25 dollars upon completion of participation.

TABLE 3.1. Demographic characteristics (N=11)

Characteristics	Measurement	# Participants
Level of Education	High School	
	College	
	Graduate	
	Post-Master	
	Ph.D.	
	Did not answer	
Gender	F - Female	8
	M - Male	3
Computer Uses	01 - Email	
	02 - Word Processing	
	03 - Web surfing	
	04 - Games	
	05 - Database mgmt.	
	06 - Multimedia	
Web Searching Experience	Novice (1) – Expert (10)	
	1 - 4	
Frequency of Table Use on the Web	Never	
	Occasionally	
	Monthly	
	Weekly	
	Daily	

Most of the participants were undergraduate students at Syracuse University. All of them reported using computers on a daily basis and being highly experienced web searchers. One participant also reported having been exposed to statistics and having used tables from government websites at least occasionally.

Potential participants were recruited throughout the data collection period until the researchers determined that theoretical saturation on the uncertainties was achieved for each table (no matter in what format the table was presented). Theoretical saturation is reached, among other things, when no more relevant data seems to emerge regarding a category or variable (Glaser and Strauss, 1967). In this case, interviewing stopped when no new uncertainties were elicited for a given table. Eleven interviews constitute a reasonable sample; as pointed out by Schamber (2000), as few as ten interviews can be expected to provide representative results when eliciting cognitive perceptions purely for exploratory purposes.
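The stopping rule can be made concrete with a small sketch. The function below reports the first interview that contributes no uncertainty not already seen in earlier interviews. This is one simple way to operationalize saturation, offered as an assumption; the report describes a judgment made by the researchers for each table and does not specify a formal procedure. The uncertainty labels in the example are placeholders.

```python
def first_saturated_interview(uncertainties_per_interview):
    """Return the 1-based position of the first interview that adds nothing new.

    Each item in the input is the collection of uncertainties elicited in one
    interview for a given table. Returns None if every interview adds at least
    one uncertainty not seen before.
    """
    seen = set()
    for position, uncertainties in enumerate(uncertainties_per_interview, start=1):
        new_items = set(uncertainties) - seen
        # The first interview cannot be "saturated": there is nothing to compare with.
        if position > 1 and not new_items:
            return position
        seen |= new_items
    return None


# Placeholder labels, not data from the study.
interviews = [
    {"definition of term A", "meaning of footnote"},
    {"definition of term A", "unit of measure"},
    {"unit of measure"},
]
print(first_saturated_interview(interviews))
```

In practice a team might require several consecutive interviews with nothing new before stopping; the single-interview rule here is the simplest variant.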

3.1.2.3 The Interview

Once a preliminary questionnaire was developed, we started the pretest process with a total of six respondents (with their data not included in analysis). After each interview the questionnaire was revised. The final questionnaire consisted of two sections. The first section contained eleven demographic questions that asked participants about their frequency of computer use, web searching experience, statistical background, statistical packages used, as well as frequency of use of some specific statistical tables. Factual questions allowed the researchers to verify the appropriateness of each participant for the study as well as to collect data to classify each of them based on background information. The second set of questions (the loop section) contained twelve general questions intended to elicit what questions/uncertainties participants faced during exploration of government statistical tables. The underlying purpose of these questions was to determine what kind of metadata and its content would need to be accessible during table usage so that users of these types of tables could better understand the meaning and significance of the data presented. Appendix 2 presents the interview guide.

The choice of these particular questions was supported by defined characteristics of good tables such as those described by UN/ECE (1992); by the standards on the sources, methods and procedures of statistics as defined in Walker and Durost (1936); as well as by the researchers’ own evaluations of each of the tables to be used in the investigation. Some of these standards point out the need for titles to be constructed as an aid to the reader in understanding the facts, for the source of the data to be indicated, as well as for the indication of the unit of measure used and the methods used to compute the data. It is on the basis of these and other standards that our specific questions emerged. Examples of such questions include, but are not limited to: ‘Does the title help you to understand the facts on the table?’, ‘Is there anything in the way the table, its rows or columns are organized that makes the table more difficult to understand?’, and ‘Can you tell from the information in the table how any of the statistical measures were calculated?’.

A member of the team conducted interviews in person at a time convenient to each participant, over a period of three weeks. All the interviews were limited to ninety minutes since during pre-test sessions this amount of time proved to be sufficient for coverage of three tables and not overly tiring.

The interviews followed the interview guide, but the interviewer exercised some flexibility in order to retain more control of the situation. This control allowed the interviewer to clarify terms that were unclear for the participants and to probe for additional information (Frankfort-Nachmias & Nachmias, 1996). All the interviews were tape-recorded and the transcripts were used as the main source for the subsequent content analysis.

3.1.3 Data Analysis

Data analysis had two components. In one component, specific answers to each user question were found. These answers were then forwarded to the project’s system design team for inclusion in tools designed to support manipulation and usage of the tables. The second component was to categorize the questions in order to better understand users’ uncertainties and how they could be resolved.

3.1.3.1 Finding Answers to Questions

Table 3.2 lists all questions asked (by table) and the frequency with which each was asked. Because the answers are lengthy, the full table including answers is presented as Appendix 3. Researchers searched for answers to each question in a variety of paper and online sources. They first examined the table itself for answers (e.g., the footnotes in a table), then examined associated technical documentation. For online tables, links present within the table were also followed. The researchers did not do general searches on the respective websites, as the assumption was that users of tables would not be likely to do so. If no answer was found, a member of the team contacted experts on the tables at the government agencies that were working with the research team. These experts also confirmed the answers that had already been found.
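The search order described in this section can be sketched as an ordered lookup that stops at the first source holding an answer. The source names mirror the order given in the text; the dictionary structure and the placeholder answers are assumptions made for illustration and do not reproduce the answers recorded in Appendix 3.

```python
# Order in which sources were consulted, following the description in the text.
SEARCH_ORDER = [
    "table itself",
    "technical documentation",
    "links within the table",
    "agency expert",
]


def find_answer(question, sources):
    """Return (source_name, answer) from the first source that has an answer.

    `sources` maps a source name to a dict of question -> answer.
    Returns (None, None) when no source answers the question.
    """
    for name in SEARCH_ORDER:
        answer = sources.get(name, {}).get(question)
        if answer is not None:
            return name, answer
    return None, None


# Placeholder content only: these are not the answers found in the study.
example_sources = {
    "table itself": {'What is the "median"?': "placeholder answer from a footnote"},
    "technical documentation": {
        'What is the "median"?': "placeholder answer from documentation",
        "How is this data collected?": "placeholder answer from documentation",
    },
    "agency expert": {"Why were those places picked?": "placeholder answer from an expert"},
}

print(find_answer('What is the "median"?', example_sources))
print(find_answer("Why were those places picked?", example_sources))
```

Recording which source supplied each answer is what later makes it possible to say how much of the needed metadata already exists in retrievable form.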

TABLE 3.2. Questions Asked by Users and Their Frequency by Table and Table Format
Questions Asked	Table	Freq. (Paper)	Freq. (Elec.)
What is the meaning of "seasonally adjusted"?	AAG		1
How is "unemployment rate" calculated?	AAG		1
What is "change in payroll employment"?	AAG		1
Who is classified as "production, non-supervisory workers"?	AAG		1
In Note 4, why does 1982-84=100?	AAG		1
In Note 5, what is meant by "finished goods"?	AAG		1
In Note 5, why does 1982=100	AAG		1
In Note 6, why are the imports not seasonally adjusted?	AAG		1
Clarification of Note 7	AAG		1
Clarification of Note 8	AAG		1
Preliminary- when will the current data become available?	AAG		1
R- does this mean revised? If so when were they revised and how?	AAG		1
What is meant by "civilian labor force"?	AAG		3
What is the difference between "employed" and "unemployed"?	AAG		1
What are the definitions of the job categories?	AAG		3
Why is the Construction and Mining category not seasonally adjusted?	AAG		1
What is included in the Syracuse metropolitan area?	AAG		1
What is the difference between CPI-U and CPI-W?	AAG		1
Note 5- Who is a clerical worker?	AAG		1
Why is there a difference in the information given for different metropolitan areas CPI? i.e. for Syracuse there is annual % change, for LA Orange County there is the % change and the actual numbers, and in Arkansas there is no CPI data given.	AAG		1
Why doesn't the title say more specifically what the table is about?	AAG		8
Does that include the subway, infrastructure, trains, etc.?	AAG		1
Why is non-farm wage on the titles with the table and not listed with other jobs?	AAG		1
What do TXT and PDF mean?	AAG		1
What does T&PV mean?	AAG		1
I don't understand what the numbers are about. Do they mean people in the civilian labor force or something else?	AAG		1
Does employment include civilian and armed forces labor force?	AAG		1
What does non-farm mean?	AAG		4
What is T&P?	AAG		1
What does preliminary mean?	AAG		1
What do they mean by 12-month % change?	AAG		3
What is salary employment	AAG		1
How are employment and unemployment rates different?	AAG		1
What is the definition of services?	AAG		1
What do the dinosaurs do?	AAG		1
What do the different colors mean?	AAG		1
Why do all of the links change color, when I only click on one of them?	AAG		1
Why is the P on each number for October?	AAG		1
What am I supposed to find when I click this link to another page?	AAG		1
Why are the news releases first when I click this link?	AAG		1
Where did they get the information for these tables?	AAG		1
What are these numbers about?	AAG		1
Why can't I get the information directly when I click on this link?	AAG		1
What are the units?	AAG		1
What is meant by "enumerated population"?	T14	1
What is the "median"?	T14	1
In Note 1- why haven't specific group numbers been revised and how does that affect the totals vs. breakdown?	T14	1
Note 2 is confusing	T14	1
What is the difference between Note 1 and Note 2 and therefore what happened in 1980 vs. 1990?	T14	1
How is the population estimated for the in between years?	T14	1
Why is there a second breakdown of school-age children?	T14	2
What are the implications of suddenly switching to 10-year segments of the population after doing the rest in 5-year blocks? And what about 85 and over?	T14	1
"Excludes Armed Forces overseas" -why are they excluded and how long do they have to be overseas?	T14	2
What are the implications of calculating the numbers from April 1 in 1980 and 1990, and July 1 in the interim years?	T14	1
In the title it mentions population, but are they talking about US population? Why aren't they more specific?	T14	3
Residents of where?	T14	1
What are count resolution corrections?	T14	1
What are the texts on the side for?	T14	1
Percentage of what, the respondents?	T14		1
Why do they have the male and female breakdown for only 1980, 1990, and 1997 and not for the other years?	T14	1
I'm not sure if this means that these 3 years are based on the census and others are projected. The answer might be in the notes, but they are really difficult to understand.	T14	1
Why do they only have 1997 in bold?	T14	1
What are those 3 columns before the mean? Why did they group them together?	T14		1
Why were those places picked?	T14		1
Why are some things in purple and not others?	T14		1
They don't tell the total number of people who weren't surveyed and they should at least give a general idea.	T14		1
It doesn't give enough information about the area that the population is from.	T14		1
What is the point of the count? Did they double count?	T14	1
It is confusing. What do they mean by in thousands?	T14	1
What does death registration states mean?	L.E	5
What do they mean by whites? I am not sure what they include	L.E	1
Does this refer to people who are citizen or not?	L.E	1
Does black mean people who was born African-American or people who are black that live here.	L.E	1
Why is area in the column, I don’t understand that?	L.E		1
What does con mean?	L.E		1
What all others include or refer to?	L.E		4
Then it says total, is it the total of all other races?	L.E		2
What the --- are? Does this mean they don’t have data collected or what?	L.E		1
Where did they get the numbers	L.E	1
Who is classified as white?	L.E		1
Who is classified as black?	L.E		1
Do the data points represent more years to live?	L.E		1
How is this number calculated?	L.E		1
What about data for after age 85?	L.E		1
Is it possible to include a mouse over calculator to figure out the age of death?	L.E		1
The table is difficult to read due to a lot of data columns and rows. Gridlines might help.	L.E		1
How does this table relate to other years? I was 20 in 1996 and it says I should live for 60.4 more years, does this mean in subsequent year's tables I will always live to 80.4?	L.E		1
What areas mean? This is pretty vague	Gas	1
What PADD means?	Gas	11
What OPRG means? What is this abbreviation?	Gas	5
Why ozone-non-attainment is abbreviated RFG? What does that mean?	Gas	8
What the subcategories of PADDs 1, 1A, etc means?	Gas	8
Why are they comparing RFG areas with OPRG areas?	Gas	1
What originated areas are?	Gas	1
What are the different gasoline categories?	Gas	2
What attainment conventional areas or oxygenated or carbon monoxide areas are	Gas		5
What are carbon monoxide areas?	Gas		2
What are Oxygenated areas are?	Gas		3
Why do they choose the dates they close? What is the significance of those dates?	Gas		1
Why is there not consistent data between the regions? i.e. Why is there no OPRG in PADD 1C and no Oxygenated in the PADD 1's?	Gas		1
Is there a way that we can compare between rows and columns?	Gas		1
Is there a graphing capacity, to see more clearly the historical changes?	Gas		1
There are a lot of data points and the column headings get lost.	Gas		1
How is this data collected?	Gas		1
It is illegal in NJ to have self service gas stations, so how can these be the "self service prices per gallon" for the country?	Gas		1
I know that state gas tax can vary from state to state, how is that handled in this comparison between states?	Gas		1
TOTAL		69	101

T14 = No. 14. Resident Population, by Age and Sex: 1980 to 1997.

AAG = Economy at a Glance NY (presented only in electronic format).

L.E = Table 5. Estimated average length of life in years, by race and sex: Death-registration states, 1900-28, and United States, 1929-96.

Gas = Retail Gasoline: (Self Service Prices per Gallon, Including Taxes).

3.1.3.2. Categorizing Questions

Inductive open coding (Krippendorff, 1980; Strauss & Corbin, 1990) was used to develop the final coding scheme for questions/uncertainties. With this technique, researchers derive the topics or categories in the data by identifying key issues. Researchers used data from a previous study (Hert, unpublished) to start the process of inductive open coding. The data from the study provided a preliminary set of categories by which to categorize questions. Our set of categories was put together by identifying instances where users had a: 1) direct question about the data, 2) concern about the clarity and completeness of information supporting the table, and 3) confusion about the meaning of terms, data, formatting style of the table and other issues.

To assess the reliability of the coding scheme, two researchers coded the list of users’ questions/uncertainties independently of each other and then calculated a simple level of percent agreement (total number of agreements between the coders divided by total number of possible agreements). In our study, the percentage of agreement reached 91%, which, as suggested by Krippendorff (1980), is an acceptable level.
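
The percent-agreement calculation described above can be sketched in a few lines of Python. This is an illustration only: the function name and the two short lists of codes are invented for the example and are not the study's data.

```python
def percent_agreement(codes_a, codes_b):
    """Share of items to which two coders assigned the same category."""
    if len(codes_a) != len(codes_b):
        raise ValueError("Both coders must code the same list of items.")
    if not codes_a:
        raise ValueError("At least one coded item is required.")
    # Number of agreements divided by number of possible agreements.
    agreements = sum(a == b for a, b in zip(codes_a, codes_b))
    return agreements / len(codes_a)


# Hypothetical codes for ten questions (not the study's data).
coder_1 = ["definition", "definition", "rationale", "structure", "definition",
           "lack_of_info", "rationale", "definition", "structure", "definition"]
coder_2 = ["definition", "definition", "rationale", "structure", "definition",
           "lack_of_info", "definition", "definition", "structure", "definition"]

agreement = percent_agreement(coder_1, coder_2)
print(f"Percent agreement: {agreement:.0%}")
```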

The resulting categories constructed from the data relate to the uncertainties encountered by users during exploration of statistical tables. These categories allowed the researchers to classify and cluster the users’ questions/uncertainties into different levels of metadata to be provided by the Table Browser.

3.1.3.3. Categorizing Complaints and Suggestions

In the process of identifying users’ questions from interview data, researchers realized that there were other forms of statements that users used to express their uncertainties in the process of understanding the statistical tables presented to them. These other forms of statements expressed a suggestion on how users thought it best to present information, some functionality that could be added to the electronic forms, or where the supporting information should be located, among other issues. Some of these comments include: ‘Maybe if the columns, rows and numbers were more spaced it would be easier to read’, ‘I would add to the title what area is the population from’, ‘I would change some of the colors that are difficult to read’, etc.

The other form of statements made by users expressed the same sort of concerns previously mentioned but in the form of a complaint or dislike. These different forms of expressions might be the result of the way in which researcher questions were presented but they all captured the issues that concerned users in the process of understanding the particular statistical tables shown to them during the investigation. Some of the complaints expressed by users include: ‘There are too many numbers and that is confusing’, ‘You would have to pick apart notes underneath to understand what they are saying’, ‘Table does not explain how things are calculated’, ‘The page have a lot of related stuff besides what they explain about the data’, etc.

The researchers chose to tabulate and code complaints and suggestions separately from the uncertainties/questions.

MAPPING TO THE XML DTD

The intent behind gathering and categorizing user uncertainties was to use this knowledge to provide design specifications for the Table Browser developed in the larger NSF project. Specifically, we were trying to 1) determine which user questions could be mapped to specific elements in the DDI DTD, 2) provide details on the specific answer that might be displayed in the Table Browser to a user with a question, and 3) understand the issues in building such a mapping. For the larger project, the ultimate goal would be to automatically identify relevant metadata that resolved user questions, tag it appropriately within the DDI DTD and port it into the Table Browser.

The methodology employed to pursue these objectives is provided in Figure 3.1.

Figure 3.1. METHODOLOGY FOR METADATA MAPPING TO TABLE STRUCTURE

INPUT: TABLE EXAMPLES FROM AGENCIES

Results from this process were to feed into a table such as that depicted by Figure 4.2.

Figure 4.2 TABLE MERGING RESULTS FROM XML MARKUP AND USER QUESTIONS

Question Asked (From SU work)	SOURCE OF INFO
	DDI (From UNC)	Content from Table (identify which item) From UNC	Content from source other than table (identified as specifically as possible) UNC and SU

As the teams began this process, difficulties were quickly discovered. Elements in the DDI DTD tended to be somewhat "structural" in nature. That is, they were able to encode information if it could be determined which structural element contained that information within a given table. Thus, a question such as "What does the P mean in a given cell?" (in the At-a-Glance tables from BLS) resulted in an answer ("it indicates preliminary data") that could be easily mapped to the DDI element which represented a footnote within a table. Other questions that were easily mapped included definitions of terms, headers for tables, columns, rows, and cells, and units of measurement. Questions concerning rationales and questions of a more specific nature were extremely difficult to map from uncertainties to the DTD.

Given these difficulties, the teams briefly considered building an entirely new DTD for the project that would incorporate elements from the ISO metadata standard along with aspects of the DDI. This proved to be beyond the capabilities of the project, so a decision was made to move away from attempting the specific types of mapping above, to begin encoding the tables in the project using the DDI DTD (as further development work on the Table Browser depended on having the tables encoded), and to address issues associated with incorporating the user uncertainties separately. More detailed reports on the DTD and its use are available in Marchionini and Mu (2001) and other NSF-project reports (some of which are available directly from Gary Marchionini and some of which are on the project website: http://istweb.syr.edu/~tables).

Since most of the encoding work that followed these investigations was done by the University of North Carolina team, its results will not be reported in this document, though the discussion and recommendations section includes further information.

4. RESEARCH FINDINGS

4.1. User Uncertainties

Table 3.2 provided the complete list of questions/uncertainties for each table. The questions with their answers are presented in Appendix 3. A total of 170 questions was identified. As stated earlier, the team continued collecting data until new questions were not being identified from the new participants in the study. While this was done in advance of the final coding scheme for questions, the redundancy of questions was assessed against preliminary versions of the scheme as well as the researchers’ sense that little new data were being provided by new respondents. Thus, some categories in the scheme do not have large frequencies, as the other categories were saturated and data collection stopped.

A review of the users’ questions/uncertainties in terms of the type of medium in which tables were presented (paper, electronic) shows that, in general, participants asked the same or similar questions regardless of the format in which the tables were presented to them.

4.1.1 Categorization of Uncertainties

Table 4.1 shows the result of the categorization of questions/uncertainties. The categorization scheme had four main categories (‘Definitional needs’, ‘Rationale of Information’, ‘Table Structure’, and ‘Lack of Information on’) and a fifth category, called ‘User uncertainty is not clear’, to hold user statements that were not clear. The category ‘Definitional needs’ is concerned with the users’ need to have definitions of key terms, data categories and other elements available to them. Another important category that emerged from the data was ‘Rationale of Information’. This category is meant to hold user questions as to why some things were computed or reported in a particular way in the tables studied. ‘Table Structure’, another major category, includes all the issues referring to the layout and organization of the table. The last of the four major categories, ‘Lack of Information on’, refers to the users’ need for explanations of data collection procedures, sources of data, and computational methods, among other issues, that allow them to evaluate the credibility and reliability of the data presented in the tables.

Most of these categories contain a set of subcategories intended to cluster more precisely the different uncertainties expressed by the users that participated in the research experiment. The most frequent type of user uncertainty was ‘definitional needs’ (specifically definitions about the ‘meaning of terms’) followed by uncertainties about the ‘rationale of the information’.

TABLE 4.1. Categories of Users’ Uncertainties during Statistical Tables Exploration
Categories	Subcategories	Definitions/Examples	Freq.
Definitional needs		Users ask about the meaning of something in the table.	98
	Meaning of Terms	What does seasonally adjusted means?	47
	Meaning of Data	I’m not sure what the data cell refers to.	5
	Meaning of Categories	User is uncertain of what belongs to a particular category. Ex. What does Non-Farm wage include?	18
	Meaning of Abbreviations	What does T&P means?	17
	Population Universe	User is uncertain of the population/universe to which data can be generalized. Ex. Is it the population of the US?	9
	Unit of measurement	Is this in number of persons or number of jobs?	1
Rationale of information		User is uncertain of the reasons why something was done, reported, computed, etc. in a particular way. Ex. Why the numbers are reported differently for NY & LA?	28
Table structure		The way the table is organized and formatted makes the user uncertain about the meaning of data.	24
	Formatting, layout, and components	I don’t understand why the numbers are in purple.	10
	Meaning of Labels	I don’t understand this label.	11
	Organization of the links in webpage	I wouldn't expect to find Press releases first when you click on the link	3
Lack of information on		User is uncertain of how data were collected, computed, etc.	17
	Data collection procedures	How was data collected? What method were used?	2
	Sources of data	From where was data collected?	2
	Computation methods	How were rates computed?	4
	Comparability/Relationship of Info.	What is the difference between CPI-U & overall CPI?	6
	Tool Functionality	Can I make a graph right now?	2
	Updates to information	When was the information updated?	1
User uncertainty is not clear		The user didn’t clearly explain his/her uncertainty.	4
TOTAL			170

The most common class of questions was Definitional Needs (98 questions, 58% of total questions) and, specifically, definitions of terms (47 questions, 28% of the total questions). Rationale of Information and Table Structure, with 28 and 24 questions respectively, were the next most frequent. Questions concerned with lack of information numbered 17.
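
The percentages quoted above follow directly from the frequencies in Table 4.1. As a small worked check, the sketch below recomputes each share of the reported total of 170 questions; the variable and function names are chosen for this illustration only.

```python
# Frequencies quoted in the text, taken from Table 4.1.
TOTAL_QUESTIONS = 170  # total number of questions as reported

frequencies = {
    "Definitional needs": 98,
    "Definitional needs: meaning of terms": 47,
    "Rationale of information": 28,
    "Table structure": 24,
    "Lack of information": 17,
}


def share(count, total=TOTAL_QUESTIONS):
    """Percentage of all questions, rounded to the nearest whole percent."""
    return round(100 * count / total)


for category, count in frequencies.items():
    print(f"{category}: {count} questions ({share(count)}% of total)")
```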

It is important to note that these data represent only the questions/uncertainties that users expressed. Users may have other questions that the interview protocol used in this study was not able to elicit. We can assert that users have questions in these categories, but they may also have additional questions, and the relative frequencies with which questions are asked might also change. Steps were taken to ensure elicitation of all questions through the structure and pretesting of the instrument and through the continuation of data collection until redundancy was reached.

4.2. User Suggestions and Complaints

Another significant finding that resulted from the content analysis of the user interviews was a categorization of comments/suggestions and complaints. This categorization reflects users’ concerns about different aspects of the tables. Specifically, the ‘comments/suggestions’ reflect some of the actions that users think would increase table understanding. The ‘complaints’, on the other hand, show user dissatisfaction with several issues about the tables, as shown in Table 4.2.

TABLE 4.2. Categories of Users’ Comments/Suggestions & Complaints during Statistical Tables Exploration
Categories	Subcategories	Definitions/Examples	Freq.
Comment/Suggestion about:		User gives his/her opinion about issues that could improve table understanding.	150
	Adjusting table formatting/layout	Have the numbers in bold in a bigger print.	81
	Facilitating understanding	They should at least give a general idea.	11
	More specific labels	It would help if they added "Total Resident Population by Age & Sex"	6
	Changes in location of information	I would put the dates a little closer to the data.	13
	Added tool functionality	It would help to be able to do splits of the table.	13
	Additional/more specific information	It would help to know what TXT and PDF are.	22
	Nice feature of the table	I think the history button is great because it tells you 10 years worth of information without putting too much information together in the table.	1
	Irrelevant Information	Calling them would be the last thing I would do.	3
Complaint about:		User is not satisfied for different reasons.	32
	Excessive amount of information	There are a lot of data points.	12
	Insufficient amount of information	Titles don’t say anything. Clarification of note 7.	14
	Not specific information	I’d like X information to be spelled out.	2
User’s statement is not clear		The comment/suggestion or complaint stated by the user is not clear.	4
TOTAL			182

4.3. Answers to User Questions

The researchers found answers to all user questions (Appendix 3 reports all answers). They searched a variety of sources: online information within the table (such as footnotes), links on online tables, and associated technical documentation (in both paper and electronic formats). In some instances, when no answers were found through searching, they asked table experts.

Table 4.3 contains an overview of some of the questions users asked during their exploration of statistical tables. An answer to each question is presented, along with an indication of where it was found: either in a document provided with the table or as a response from an expert. It was necessary to consult experts for questions involving requests for rationales. Definitions and clarifications of other terms could be resolved with information in documents.

TABLE 4.3. Overview of questions asked by subjects and answers where they were found.

Question	Answer	Location of Answer
What is the meaning of "seasonally adjusted"?	Normal seasonal fluctuations are smoothed out by a statistical process	Document
How is "unemployment rate" calculated?	Persons are classified as unemployed if they do not have a job, have actively looked for work in the prior 4 weeks, and are currently available for work.	Document
Who is classified as "production, non-supervisory workers"?	Employees who are not owners or who are not primarily employed to direct, supervise, or plan the work of others. Production workers in mining & manufacturing, & construction workers in construct.	Document
In Note 4, why does 1982-84=100?	Most of the specific CPI indexes have a 1982-84 reference base. That is, BLS sets the average index level (representing the average price level)--for the 36-month period covering the years 1982, 1983, and 1984--equal to 100.	Document
Why is there a second breakdown of school-age children?	School age breakdown - for convenience. Those are popular aggregations. NOTE - this may clear up further questions - the national estimates are produced and available for single years of age, by sex, race and Hispanic origin. The figures that appear in the table are put there as space allows and in an attempt to please as many users as possible.	Expert
What are the implications of suddenly switching to 10-year segments of the population after doing the rest in 5-year blocks? And what about 85 and over?	NOTE - this may clear up further questions - the national estimates are produced and available for single years of age, by sex, race and Hispanic origin. The figures that appear in the table are put there as space allows and in an attempt to please as many users as possible.	Expert
How is this data collected?	We don't currently have a link on the site for that. We should for the new one but it is still missing from the test site. The data are collected using computer-assisted telephone interviews from a statistically selected sample of approximately 800 retail gasoline stations each week. The prices are collected every Monday morning and the data released by 5 p.m. every Monday night, except on government holidays the data are released on Tuesday (but still represent Monday's price).	Expert
It is illegal in NJ to have self service gas stations, so how can these be the "self service prices per gallon" for the country?	Yes, some states, NJ for one, do not allow self-serve. In those cases, the prices represent the only service of gasoline provided in that state. Our analysis has always shown, that this is not a big price effect in those states as compared to states allowing self-serve have higher prices for full-serve vs. self serve. I had even heard NJ monitors the impact of their law to help justify it to state resident's as not contributing to higher prices because it is required. I have nothing in writing on any of this though, it is all anecdotal. The industry doesn't make an issue of it nor do we. Some states have other laws such as refiners can't operate gas stations (MD for one), and we don't note them either as non-refiner state stations or anything.	Expert
What exactly do the job categories like transportation and public utilities entail?	Establishments reporting on the schedule (form BLS 790) are classified into industries based on their principal product or activity determined from information on annual sales volume. This industry classification, based on the 1987 Standard Industrial Classification Manual, is collected on a supplement to the quarterly unemployment insurance tax reports filed by each employer. For an establishment making more than one product, the entire employment is included under the industry of the principal product or activity. http://www.bls.gov/790faq2.htm#q6	Document

5. DISCUSSION AND RECOMMENDATIONS

5.1. User Uncertainties

This study has demonstrated that users have a variety of questions, some of which have the potential to be easily resolved with available electronic documentation (see next section for discussion of issues). The preponderance of definitional questions have fairly easy resolutions, and in fact, definitions of variables, categories of variables, etc. are already well documented and considered within existing metadata systems. This makes answers easy to retrieve.

Some uncertainties are much more complex, however, in particular those relating to rationales. Answers to these questions seem to require richer domain knowledge that might be difficult to retrieve. For example, a question such as "Why is there no OPRG in PADD 1C and no Oxygenated in the PADD 1's?" related to a gasoline table could not be resolved with simple definitions. Answering it requires an additional source (a map in another document) and knowledge of how the gasoline formulations and their reporting are changing (Armstrong, personal interview with Paula Weir, 12-15-00).

The categorization scheme developed in the project can serve to categorize questions in future studies in which the goal is to map to metadata sources and specify tool implementations. Table 5.1 provides a demonstration of this utility. In the NSF project, preliminary mappings were made and Marchionini and Mu (2001) can be consulted for those mappings.

Table 5.1 Potential Mappings between User Uncertainty Categories and Tool Designs

Uncertainty Category	Definition of Category	Possible Design Options
Definitional Needs	Users ask about the meaning of something in the table	Mouse-over (at appropriate point, the cell, the row, the column, etc.) with definition, links to technical documentation explaining concepts, variables, variable categories as necessary
Rationale of Information	User is uncertain of the reasons why something was done, reported, computed, etc. in a particular way	Link to online question form (to be submitted to expert) or interactive help
Table Structure	The way the table is organized and formatted makes user uncertain about meaning	Mouse-overs, possible "About the format of the table" help option, pull-down menu with available manipulation options displayed
Lack of information	User is uncertain how data were collected, computed, etc.	Links to technical documentation
User uncertainty is unclear		This might be resolved by a long description of the object of concern or by parsing the content for definitions and providing those definitions
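
One way a tool might use Table 5.1 is as a simple lookup from a coded uncertainty category to candidate design options. The sketch below is only an illustration of that idea: the dictionary restates the table's contents, while the function name and the fallback to the "unclear" category for unknown codes are assumptions made for this example, not part of the project's Table Browser.

```python
# Design options restated from Table 5.1, keyed by uncertainty category.
DESIGN_OPTIONS = {
    "Definitional needs": [
        "Mouse-over with definition at the appropriate point (cell, row, column)",
        "Links to technical documentation explaining concepts and variables",
    ],
    "Rationale of information": [
        "Link to online question form (to be submitted to an expert)",
        "Interactive help",
    ],
    "Table structure": [
        "Mouse-overs",
        "'About the format of the table' help option",
        "Pull-down menu with available manipulation options",
    ],
    "Lack of information": ["Links to technical documentation"],
    "User uncertainty is unclear": [
        "Long description of the object of concern",
        "Parse the content for definitions and provide those definitions",
    ],
}


def options_for(category):
    """Return candidate design options for a coded uncertainty category.

    Assumption for this sketch: unknown categories are treated as unclear.
    """
    return DESIGN_OPTIONS.get(category, DESIGN_OPTIONS["User uncertainty is unclear"])


for option in options_for("Table structure"):
    print(option)
```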

5.2 Mapping User Uncertainties to Metadata

The first issue is the question of whether a user is provided with a somewhat generic answer to his or her question or one that specifically resolves the uncertainty. For example, one user had the question "Why are the imports not seasonally adjusted?" (from the BLS At-A-Glance tables). There is a very specific answer to this question but, more generically, this might be considered a question that concerns a definition, and a user could be provided with the definitions of seasonal adjustment and import. Thus, if definitions of terms were coded as such in related documentation, it would be a straightforward process to retrieve them for a user once the user’s uncertainty had been categorized as definitional. However, it is clear that providing the definitions is only one component of assisting user understanding. One might envision a set of tools that would analyze user questions, perhaps in terms of facets of the question (e.g., a why question concerning the co-joining of two definitions), which might be further assessed in terms of a user’s history (e.g., level of statistical expertise) to provide an answer to the user. That answer could then be modified via a feedback mechanism and stored for use in later, similar queries.

A second issue is that not all answers (generic or otherwise) are easily found. The team has found that answers may not be in electronic format at all (though they may be available in a paper document or in a human expert’s head), or may be buried within a large document (in one instance a document of 92 pages), making them difficult to retrieve. It might be helpful to develop a companion XML DTD for documentation associated with tables and statistical data so that information relevant to user uncertainties can be quickly (and automatically) found in the documentation and ported into a tool such as a Table Browser.
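
To make the idea of tagged documentation concrete, the following sketch shows how a definition might be retrieved automatically once documentation is marked up. The element and attribute names (`documentation`, `definition`, `footnote`, `term`, `symbol`) are invented for this illustration and are not taken from the DDI DTD or any existing standard; the two answers are those reported in this document.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: element and attribute names are invented for this
# sketch and do not come from the DDI DTD.
DOCUMENTATION = """
<documentation table="At-a-Glance">
  <definition term="seasonally adjusted">Normal seasonal fluctuations are smoothed out by a statistical process</definition>
  <footnote symbol="P">Indicates preliminary data</footnote>
</documentation>
"""


def find_definition(xml_text, term):
    """Return the tagged definition of `term`, or None if it is not tagged."""
    root = ET.fromstring(xml_text.strip())
    for element in root.iter("definition"):
        if element.get("term", "").lower() == term.lower():
            return (element.text or "").strip()
    return None


print(find_definition(DOCUMENTATION, "seasonally adjusted"))
```

A term that has not been tagged simply returns `None`, which is the case in which a tool would have to fall back on other sources such as a human expert.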

A third issue identified is that some answers are consistent across tables, while others might only be relevant to one specific instance of a table. A question such as "Why is the 1998 statistic for urban unemployment so high in relationship to the other 1998 numbers?" would relate only to one specific cell on one specific table, while a question such as "what is the definition of seasonally-adjusted" is likely to be at least consistent at the agency level. Knowing the "uniqueness" of an answer would provide insight into strategies for metadata storage and implementation in tools. Currently, it is difficult to assess the uniqueness/consistency of information without expert knowledge. Metadata repositories (such as that being developed by the Census Bureau) can be used to determine the level of consistency.
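
One possible storage strategy that reflects the "uniqueness" of an answer is to record each piece of metadata at the broadest level at which it holds and to look it up from the most specific level outward. The sketch below assumes a three-level scheme (cell, table, agency); the key layout, the function name, and the cell identifiers are hypothetical, and placing the two example answers at these levels is an assumption made for illustration.

```python
# Keys are (agency, table, cell, term). None in the table or cell position
# means the entry applies to every table or cell below that level.
METADATA = {
    ("BLS", None, None, "seasonally adjusted"):
        "Normal seasonal fluctuations are smoothed out by a statistical process",
    ("BLS", "At-a-Glance", None, "P"): "Indicates preliminary data",
}


def resolve(agency, table, cell, term, store=METADATA):
    """Look up an answer, trying cell level, then table level, then agency level."""
    for key in (
        (agency, table, cell, term),
        (agency, table, None, term),
        (agency, None, None, term),
    ):
        if key in store:
            return store[key]
    return None


# "B7" is a made-up cell identifier; the answer is found at the agency level.
print(resolve("BLS", "At-a-Glance", "B7", "seasonally adjusted"))
```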

A point that needs further study is the extent to which the Table Browser should provide specific answers or point to general types of information that users might need when exploring statistical tables. While users often have uncertainties that are highly contextual and related to their specific situation and experience, it is difficult to anticipate those in advance and provide previously encoded solutions. Finding the balance between completely contextualized and general answers needs further exploration.

5.3. XML and DTD’s for Statistical Information

One of the obstacles experienced during the project was the incomplete development of existing DTD’s and their lack of compatibility with project needs. Most approaches to DTD development start with repositories of information and model them, not from a user’s perspective, but from a more conceptual or structural perspective. As a result, existing DTD’s don’t encode available information to support all the user uncertainties identified in the study. Starting with users, as was done here, may be another approach to developing DTD’s worth considering.

5.4. Recommendations

5.4.1. Future Research

This work might be furthered with the following additional research:

o Expand the identification and coding of user uncertainties to additional tables in order to further validate the coding scheme and potentially begin to determine relative frequencies of uncertainty types.

o Test the extent to which the Table Browser or other tools that incorporate relevant metadata are able to resolve user uncertainties.

o Conduct document analyses to determine the effort involved in resolving user uncertainties with existing documentation.

5.4.2. Further Applications Development Work

o Continue efforts to develop metadata standards.

o Build relevant XML DTD’s for agency information.

o Investigate mechanisms for ensuring compatibility of DTD’s across document types and agencies.

6. REFERENCES

Barry, C.L. (1994). User-defined relevance criteria: an Exploratory study. Journal of the American Society for Information Science, 45(3):149-159.

Bearman, D. (1996). Developments in metadata management frameworks. Archives and Museum Informatics 10(2):185-188.

Carr, D. B. (1998). "Multivariate Graphics," Encyclopedia of Biostatistics, Eds. P. Armitage and T. Colton, Vol. 4, pp. 2864-2886.

Dempsey, L. and Heery, R. (1998). Metadata: A Current view of practice and agreements. Journal of Documentation 54(2):145-172.

Dippo, C.S. and Gillman, D.W. (1999). The Role of Metadata in Statistics. Working Paper UN/ECE Work Session on Statistical Metadata, Geneva, Switzerland, Feb. 1999.

Frankfort-Nachmias, C. & Nachmias D. (1996). Research Methods in the Social Sciences, Fifth edition, New York: St. Martin’s Press.

Gluck, M. (1996). Exploring the relationship between user satisfaction and relevance in information systems. Information Processing and Management. 32(1):89-104.

Hall, R. (1943). Handbook of tabular presentation: How to design and edit statistical tables, a style manual and case book. NY: The Ronald Press Co.

He, J. & Gey, F. (1996) Online codebook browsing and conversational survey analysis. Social Science Computer Review 14(2): 181-186.

Hert, C.A. (1999). Federal Statistical Website Users And Their Tasks: Investigations Of Avenues To Facilitate Access: Final Report to the United States Bureau of Labor Statistics. Available at: http://istweb.syr.edu/~hert/BLSphase3.PDF

Krippendorff, K. (1980). Content Analysis. An Introduction to Its Methodology. Newbury Park, CA: Sage publications.

Marchionini, G. and Mu, X. (2001). User Studies Informing E-Table Theory and Interfaces. Submitted to the ACM SIGCHI –02 conference. Available from the authors.

Moore, D.S. (1997). New pedagogy and new content: The Case of statistics. International Statistical Review. 65 (2):123-165.

Park, T. (1993). The Nature of relevance in information retrieval: An Empirical study. Library Quarterly, 63:318-351.

Schamber, L. (1991). Users’ criteria for evaluation in a multimedia environment. ASIS Proceedings 1991, pp. 126-133.

Schamber, L. (2000). "Time-line Interviews and Inductive Content Analysis: Their Effectiveness for Exploring Cognitive Behaviors". Journal of the American Society for Information Science, 51(8): 734-744.

Shneiderman, B. (2000). Universal Usability. CACM. 43(5): 84-

Strauss, A. & Corbin, J. (1990). Basics of Qualitative Research. Grounded Theory Procedures and Techniques. Newbury Park, CA: Sage publications.

Wainer, H. (1997). Visual revelations: Graphical tales of fate and deception from Napoleon Bonaparte to Ross Perot. NY: Copernicus Books.

Wilkinson, L. (1999). The Grammar of Graphics. New York: Springer-Verlag.

UN/ECE, C(47) (1992). "The Fundamental Principles of Official Statistics in the region of the Economic Commission for Europe". Adopted at the 47th session of the ECE, Geneva, Switzerland, 1992. http://www.nso.magnet.mt/principles/principles.htm

Walker, H., & Durost, W. (1936). Statistical Tables. Their structure and use. Bureau of Publications Teachers College, Columbia University.

7. ACKNOWLEDGEMENTS

The authors acknowledge the work of Kristen Armstrong and Hala Annabi at Syracuse University on this project. The University of North Carolina team included Gary Marchionini, Zhen-Zhen Deng, and Xiaming Mu. The expertise of Fred Gey and Dan Gillman was invaluable in understanding existing metadata standards. Cathryn Dippo and Fred Conrad are also to be thanked for their support and ongoing assistance.

APPENDIX 1

INSERT PRINTED VERSIONS OF TABLES HERE (since some of the electronic versions on the websites have been changed since we conducted the user interviews).

These will be included in paper version of document.

APPENDIX 2: INTERVIEW GUIDE

Subject ID _________

Date: ____/____/____ Starting time:_________

Demographic Questions

Highest level of education completed?

______________________________ In which field? _____________________________

Sex: ___ F ___M

How often do you use a computer?

____Never ____Occasionally ____Monthly ____Weekly ____Daily

What applications do you use (please check all that apply)

____Email ____Word processing ____Web surfing ____Games

____Database ____Multimedia ____Programming ____Other

Experience in web searching (1-10, novice to expert):____

Have you ever taken a statistical course? ____Yes ____No

If yes, choose all that apply:

____High school ____College ____Graduate study ____Professional training

Please select any statistical package(s) that you have used:

____Excel or other spread sheet ____SAS ____SPSS ____Others

We’d like to know how often you use statistical tables. Please check the response that best represents your experience

Please tell us how many times (ever) you have used the following tables (including both paper and electronic formats)

Stock market tables/listings