NSF Task-Based Information Search Systems Workshop

Prior to the workshop, participants were asked to identify one or two outstanding research questions that need to be addressed in order for search systems to become more task-aware.

Participant Responses

Eugene Agichtein

Response

Automatically identifying and naturally supporting long-running (multi-session or multi-day) search tasks. Aspects of the problem include:

  1. Building a taxonomy of complex search tasks, and important components of the task, e.g., a template for the kinds of things people find when planning a trip.
  2. Automatically detecting early on that a user is embarking on a (potentially) long search task (e.g., as in [1]); a minimal classification sketch follows this list.
  3. Identifying the type of a task by matching it to the taxonomy from item 1.
  4. Detecting whether the user has completed the task or may resume it later.
  5. Understanding the possible interfaces to help the searcher resume the task from the last state (e.g., by expanding on [2]).
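
As a rough illustration of item 2, here is a minimal sketch, only loosely inspired by [1], of framing "will this become a long-running task?" as a prediction over features observable early in a session. The feature set and weights are illustrative assumptions; in practice they would be learned from search logs.

    def early_session_features(queries, clicks):
        """Signals available after the first few interactions of a session."""
        return {
            "num_queries": len(queries),
            "mean_query_len": sum(len(q.split()) for q in queries) / max(len(queries), 1),
            "num_clicks": len(clicks),
            "planning_language": float(any(w in q for q in queries for w in ("plan", "how to", "best"))),
        }

    def predict_long_running(features, weights, threshold=1.0):
        """Simple linear score; a classifier trained on search logs would replace this."""
        return sum(weights[k] * v for k, v in features.items()) >= threshold

    weights = {"num_queries": 0.2, "mean_query_len": 0.1, "num_clicks": 0.1, "planning_language": 0.8}
    feats = early_session_features(["plan a trip to japan", "tokyo weather in may"], clicks=["site-a"])
    print(predict_long_running(feats, weights))  # True: early signals suggest a long-running task
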
References
  1. Eugene Agichtein, Ryen W. White, Susan T. Dumais, and Paul N. Bennett. Search, interrupted: understanding and predicting search task continuation. In Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’12, pages 315–324, New York, NY, USA, 2012. ACM.
  2. Debora Donato, Francesco Bonchi, Tom Chi, and Yoelle Maarek. Do you want to take notes?: identifying research missions in yahoo! search pad. In Proceedings of the 19th International Conference on World Wide Web, WWW ‘10, pages 321–330, New York, NY, USA, 2010. ACM.

Jae-wook Ahn

Response

What are the limitations of visual user interfaces for task-based search, and how can we overcome them? [1,2] show when a transparent user model (or task model) can fail. Unlike [1,2], which implement an offline search system or a text-based transparent user model, [3] presents a 2-D visualization-based approach that can overcome some of the limitations of these earlier approaches.

What are the properties that should be considered when evaluating task-based search system user interfaces that emphasize transparency? [4] suggests a list of aims for explanatory recommender systems, which could be helpful for defining the aims of task-based search system user interfaces.

References
  1. Annika Waern. User involvement in automatic filtering: An experimental study. User Modeling and User-Adapted Interaction, 14(2-3):201–237, 2004.
  2. Jae-wook Ahn, Peter Brusilovsky, Jonathan Grady, Daqing He, and Sue Yeon Syn. Open user profiles for adaptive news systems: help or harm? In Proceedings of the 16th International Conference on World Wide Web, WWW ‘07, pages 11–20, New York, NY, USA, 2007. ACM.
  3. Jae-wook Ahn and Peter Brusilovsky. Adaptive Visualization for Exploratory Information Retrieval. To appear in Information Processing and Management.
  4. Nava Tintarev and Judith Masthoff. Evaluating the effectiveness of explanations for recommender systems. User Modeling and User-Adapted Interaction, 22:399–439, 2012.

Nicholas Belkin

Response

I think that the most fundamental problem in this respect is the ability to infer the motivating task type from the searcher's past and current information-seeking behaviors. This implies having a typology of motivating search tasks to start with, which in and of itself is a significant research problem. I find it difficult to separate these two research problems, so I consider them here as one.

References
  1. Yuelin Li and Nicholas J. Belkin. A faceted approach to conceptualizing tasks in information seeking. Information Processing and Management, 44(6):1822–1837, 2008.

    This paper proposes a scheme for classification of both motivating search task types, and information searching tasks. The major contribution here is a means for classification that is not just naming different tasks, but rather a principled scheme for characterizing different task types. This means that in experimental situations, task type can be manipulated according to different values of some facets of task.

  2. Chang Liu, Michael Cole, Eun Baik, and Nicholas J. Belkin. Rutgers at the TREC 2012 Session Track. In Proceedings of 21st Text Retrieval Conference, TREC '12. 2012.

    Although this paper does not investigate the issue of predicting motivating task from information-seeking behaviors, it takes a step in this direction by considering how different motivating tasks influence interpretation of information-seeking behaviors for the purpose of identifying "useful" documents during a search session.

Pia Borlund

Response

The research problem that I would like to address is in line with the fourth mentioned example: “The need to develop IR evaluation methods that operate across multiple queries and even multiple search sessions”. To me the objective is to be able to evaluate the IR interaction of the user as realistically as possible, that is, to handle multiple queries and even multiple search sessions – or in other words, to understand and evaluate IIR as it takes place in real life, including multi-faceted information needs and multi-tasking/task-switching. See, e.g., the papers by Belkin (2008; 2010) and Spink (2004).

I would also like to draw attention to the need for research on the searching of work tasks, that is, information searching as part of work task solving, as briefly addressed in the paper by Borlund, Dreier & Byström (2012).

References
  1. Nicholas J. Belkin. Some(what) grand challenges for information retrieval. SIGIR Forum, 42(1):47–54, 2008.

    This paper explicitly points out a number of issues we ought to address, not least with reference to the evaluation of IIR systems.

  2. Nicholas J. Belkin. On the Evaluation of Interactive Information Retrieval Systems. In: Larsen, B., Schneider, J.W., & Åström, F. (Eds.). The Janus Faced Scholar: A Festschrift in Honour of Peter Ingwersen. Copenhagen: Royal School of Library and Information Science. 13–21, 2010.

    This paper contributes a new framework for IIR evaluation focusing on usefulness.

  3. Pia Borlund, Sabine Dreier, and Katriina Byström. What does time spent on searching indicate? In Proceedings of the 4th Information Interaction in Context Symposium, IIIX ‘12, pages 184–193, New York, NY, USA, 2012. ACM.

    The reported information seeking work task study reminds us that information searching also takes place in information intensive work task performance settings. Our impression is that the majority of current IIR research centres on Internet searching and everyday-life information needs – including the two IIR studies reported in this paper. However, there remains a need for IIR research on information searching in relation to information intensive work task performance, in order to optimise information searching and the various platforms used for it, and to understand the conditions under which work task performance takes place.

  4. Amanda Spink. Multitasking information behavior and information task switching: An exploratory study. Journal of Documentation, 60(4):336–351, 2004.

    This paper is an early example of work addressing multi-tasking and task switching; it therefore does not cover the seamless IT and information environment of today, which also has to be taken into account.

Katriina Byström

Response

I think the following issue would be fruitful to address in order to design search systems that are more task-aware:

Contextualizing task properties and search behavior, and the relationship between them, within relevant information practices/behaviour.

Taylor’s (1991) article discusses how different professional groups form around information use environments that in themselves carry traits of what information is valued and consequently sought, as well as through what channels and sources this information is searched for and distributed. Byström & Lloyd (2012) push the idea further by suggesting that each information use environment creates pervasive information practices with time-sensitive professional and local influences. Work tasks fit into these environments as concrete instances where explicit and tacit knowledge culminate, which is why they provide a useful base from which to study information search behavior and understand the role of IR systems. For the field of task-based information search this may provide a possibility to explain search behavior and to design/evaluate IR systems not only from a user-oriented perspective, but also by acknowledging the sociocultural aspects of search.

References

  1. Katriina Byström and Annemaree Lloyd. Practice theory and work task performance: How are they related and how can they contribute to a study of information practices. Proceedings of the American Society for Information Science and Technology, 49(1):1–5, 2012.
  2. Robert S. Taylor. Information use environments. In B. Dervin (Ed.), Progress in Communication Sciences, 10:217–225, 1991.

Ben Carterette

Response I

Whole-session evaluation: being able to evaluate the utility of a search system over the course of a user's interaction with it, ideally from task commencement to task completion. I'm envisioning a "task-aware" system as being one that attempts to determine a user's task from their interactions and adapt accordingly; if nothing else, it seems like some kind of sessiony evaluation would be necessary for use in objective functions. For example, Liu et al. [1] use task type prediction to select a feedback model during the course of a session.

While there are probably many ways to do whole-session evaluation (user studies, log analysis, etc), I am particularly interested in batch-style evaluations with reusable test collections. Batch evaluations allow researchers and developers to quickly perform tests of many possible combinations of features, models, and inputs while maintaining high statistical power. Reusability allows them to go back to any point in that search space and reliably get the same performance.

Creating test collections for whole-session evaluation is a difficult problem. We have been attempting to tackle it through the TREC Session track for the last three years [2, 3], and while we are happy with what we have accomplished, we still have a long way to go. The main problem is that it is difficult to model the fact that user interactions at time t+1 can depend on what the system does at time t; if the same test collection is going to be used to evaluate n different systems, it has to be able to model up to n different possible user actions at each time step. A direction we are considering is to use user simulation; while it is not likely that we will be able to accurately simulate users, we may be able to produce interactions that are at least useful for improving task-aware search systems.
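
To make the simulation idea concrete, here is a minimal sketch (an illustrative assumption, not the Session track methodology) of scoring a system over one simulated session; the key point is that the simulated user's next query depends on what the system just returned, so each of n systems can induce a different trajectory from the same topic.

    def simulate_session(search_fn, reformulate_fn, judge_fn, first_query,
                         max_queries=5, depth=10):
        """Total gain a simulated user accrues over one session with a given system."""
        gain, seen, query, history = 0.0, set(), first_query, []
        for _ in range(max_queries):
            results = search_fn(query, history)     # the system may adapt to the session so far
            for doc_id in results[:depth]:
                if doc_id not in seen:
                    gain += judge_fn(doc_id)         # simulated relevance judgment
                    seen.add(doc_id)
            history.append((query, results))
            query = reformulate_fn(query, results)   # the next query depends on the last results
            if query is None:                        # the simulated user stops searching
                break
        return gain

    # Toy usage: a static system and a simulated user who stops after one query.
    print(simulate_session(lambda q, h: ["d1", "d2", "d3"],
                           lambda q, r: None,
                           lambda d: 1.0 if d == "d2" else 0.0,
                           "session track overview"))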

The two Session track papers describe our efforts towards creating test collections for session evaluation. The second paper on the 2012 track is more specifically related to task-aware search, as our topics were categorized into four different broad task types. The Liu et al. paper describes the participation of Rutgers in the track. They built different feedback models for different task types and showed substantial improvements on some task types. This suggests that such a test collection can actually be useful for training task-aware systems.

References I
  1. Chang Liu, Michael Cole, Eun Baik, and Nicholas J. Belkin. Rutgers at the TREC 2012 Session Track. In Proceedings of 21st Text Retrieval Conference, TREC '12. 2012.
  2. Evangelos Kanoulas, Ben Carterette, Mark Hall, Paul Clough, and Mark Sanderson. Overview of the 2011 Session Track. In Proceedings of 20th Text Retrieval Conference, TREC '11. 2011.
  3. Evangelos Kanoulas, Ben Carterette, Mark Hall, Paul Clough, and Mark Sanderson. Overview of the 2012 Session Track. In Proceedings of 21st Text Retrieval Conference, TREC '12. 2012.
Response II

The notion of "relevance", which is so important to batch-style system-based evaluation, strikes me as limited in its ability to capture what users need from systems in order to actually complete tasks. If we instead talk about "utility"---as in the utility of a document to aid task completion---we can model utility not just by relevance but also by other important criteria such as timeliness, readability, truthfulness and trustworthiness, completeness, novelty, obtainability, and more. Test collections in which documents are judged for utility given a specific task and context would allow researchers and developers to build and train systems that are more aware of tasks and user needs.

This idea is not new; it goes back to the late 60s and especially to a number of papers by Cooper through the 70s (Stefano Mizzaro's review of the concept of relevance briefly describes much of this work [1]). But it hasn't been applied much, possibly because there are so many dimensions on which one can discuss "utility" that looking at only one or two at a time is all that is feasible. A few recent TREC tracks have done this: the Contextual Suggestion track and the Web track's diversity task.

Mark Rorvig argued that utility can be sufficiently modeled with preference judgments [2]: give an assessor two documents and a context, and ask which document they would prefer in that context. These preference judgments capture utility without needing to enumerate and judge against every possible aspect of utility. We have been applying this idea to building large collections of preferences that capture novelty and diversity along with relevance and other aspects of utility [3, 4].
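
As a sketch of how such judgments could be used (an assumption for illustration, not the method of [2–4]), pairwise preferences collected for a given task and context can be turned into a per-document utility ordering by simple win counting:

    from collections import defaultdict

    def utility_from_preferences(preferences):
        """Rank documents by the fraction of pairwise comparisons they won."""
        wins, appearances = defaultdict(int), defaultdict(int)
        for preferred, other in preferences:        # each judgment: (preferred doc, other doc)
            wins[preferred] += 1
            appearances[preferred] += 1
            appearances[other] += 1
        return sorted(appearances, key=lambda d: wins[d] / appearances[d], reverse=True)

    # Example: three judgments over documents A, B, C in the same task context.
    print(utility_from_preferences([("A", "B"), ("A", "C"), ("B", "C")]))  # ['A', 'B', 'C']

In practice one would use a preference-aggregation model rather than raw win fractions, but the point stands that assessors never have to enumerate the individual aspects of utility.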

References II
  1. Stefano Mizzaro. Relevance: the whole history. Journal of the American Society for Information Science, 48(9):810–832, 1997.
  2. Mark E. Rorvig. The simple scalability of documents. Journal of the American Society for Information Science, 41(8):590–598, 1990.
  3. Ben Carterette, Paul N. Bennett, David Maxwell Chickering, and Susan T. Dumais. Here or there: preference judgments for relevance. In Proceedings of the 30th European Conference on Advances in Information Retrieval, ECIR‘08, pages 16–27, Berlin, Heidelberg, 2008. Springer-Verlag.
  4. Praveen Chandar and Ben Carterette. Using preference judgments for novel document retrieval. In Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ‘12, pages 861–870, New York, NY, USA, 2012. ACM.

Arjen P. de Vries

Response I

How do we design and evaluate search systems (and their retrieval models) given that we know that relevance is not just topical?

References I
  1. William S. Cooper. Expected search length: A single measure of retrieval effectiveness based on the weak ordering action of retrieval systems. American Documentation, 19(1):30–41, 1968.

    The paper is useful because it takes an effort-oriented view of the evaluation of systems (indeed, it also develops a measure that quantifies a system's success in terms of the expected reduction in effort). I have been surprised that the measures proposed in this work are hardly ever used. (Last year's best SIGIR paper may bring it back into the picture, who knows.)

  2. Kevyn Collins-Thompson, Paul N. Bennett, Ryen W. White, Sebastian de la Chica, and David Sontag. Personalizing web search results by reading level. In Proceedings of the 20th ACM International Conference on Information and Knowledge Management, CIKM ‘11, pages 403–412, New York, NY, USA, 2011. ACM.

    This paper considers readability in IR, demonstrating how it is a relevance criterion that matters.

Response II

How can we tailor the retrieval model to the task? What part can we automate in this tailoring process, and what part will remain the designer's task?

References II
  1. Stefano Ceri, Alessandro Bozzon, and Marco Brambilla. The anatomy of a multi-domain search infrastructure. In Web Engineering, volume 6757 of Lecture Notes in Computer Science, pages 1–12. Springer Berlin Heidelberg, 2011.

    The Search Computing project developed a powerful search environment that allows one to define searches that integrate multiple sources. The more traditional database approach that motivates this project emphasizes that we could perhaps benefit from more abstraction in defining how search will take place, instead of the usual focus on a specific instance.

  2. Norbert Fuhr. Logical and conceptual models for the integration of information retrieval and database systems. In East/West Database Workshop, Klagenfurt, pages 206–218. Springer Verlag, 1994.

    Describes IR as a generalization of the DB approach, and argues for a view that would treat the DB, IR, and HCI aspects of retrieval in a unified manner.

Fernando Diaz

Response

What are appropriate auxiliary tools for different types of search tasks? Previously studied tools include query and URL history. However, finer-grained specialization of tools may also be helpful. For example, when a user is researching a product, supplying a simple spreadsheet for price or review information may be useful; when a user is planning a trip, decomposing an interface into trip subtasks (e.g. accommodation, plane tickets) may be useful.

Can we adaptively augment traditional search interfaces with these auxiliary tools?
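
One way to read the question is as a mapping from a predicted task type to the auxiliary tools shown alongside results. The sketch below is a deliberately naive illustration of that idea; the task types and tool names are assumptions, and in practice the mapping itself might be learned.

    # Hypothetical mapping from predicted task type to auxiliary interface tools.
    AUXILIARY_TOOLS = {
        "product_research": ["comparison_spreadsheet", "review_summary"],
        "trip_planning":    ["accommodation_panel", "flight_panel", "itinerary_notes"],
    }

    def tools_for_task(predicted_task_type):
        """Return the auxiliary tools to show alongside the standard result list."""
        return AUXILIARY_TOOLS.get(predicted_task_type, [])  # default: plain results page

    print(tools_for_task("trip_planning"))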

References
  1. Dan Morris, Meredith Ringel Morris, and Gina Venolia. Searchbar: a search-centric web history for task resumption and information refinding. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI ‘08, pages 1207–1216, New York, NY, USA, 2008. ACM.
  2. Henry Allen Feild and James Allan. Task-aware search assistant. In Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ‘12, pages 1015–1015, New York, NY, USA, 2012. ACM.
  3. Debora Donato, Francesco Bonchi, Tom Chi, and Yoelle Maarek. Do you want to take notes?: identifying research missions in yahoo! search pad. In Proceedings of the 19th International Conference on World Wide Web, WWW ‘10, pages 321–330, New York, NY, USA, 2010. ACM.

Abdigani Diriye

Response

One of the challenges stifling work on task-aware systems is identifying and mapping out the kind of search support and features needed to help users during different search tasks. The challenge here is identifying the inherent search activities the user might be engaged in, and the set of features and functionality that would best support them.

References
  1. Gene Golovchinsky, Abdigani Diriye, and Tony Dunnigan. The future is in the past: designing for exploratory search. In Proceedings of the 4th Information Interaction in Context Symposium, IIIX ‘12, pages 52–61, New York, NY, USA, 2012. ACM.

    The above paper provides a good introduction on how to design for more complex and exploratory search tasks and some of the factors that need to be kept in mind.

Susan T. Dumais

Response I

Identifying tasks using implicit interactions. This is especially important for tasks that extend across time and devices. The references below provide examples of techniques for identifying queries related to tasks, for predicting whether a task will be resumed, and for looking at tasks over a longer time scale.

References I
  1. Alexander Kotov, Paul N. Bennett, Ryen W. White, Susan T. Dumais, and Jaime Teevan. Modeling and analysis of cross-session search tasks. In Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ‘11, pages 5–14, New York, NY, USA, 2011. ACM.
  2. Debora Donato, Francesco Bonchi, Tom Chi, and Yoelle Maarek. Do you want to take notes?: identifying research missions in yahoo! search pad. In Proceedings of the 19th International Conference on World Wide Web, WWW ‘10, pages 321–330, New York, NY, USA, 2010. ACM.
Response II

Thinking broadly about what support for search tasks looks like. The references below provide examples ranging from simple "answers" seen in web search engines, to apps for specific tasks, to richer environments for exploratory search.

References II
  1. Lydia B. Chilton and Jaime Teevan. Addressing people’s information needs directly in a web search result page. In Proceedings of the 20th International Conference on World Wide Web, WWW ‘11, pages 27–36, New York, NY, USA, 2011. ACM.
  2. 50 ultimate travel apps ... so far

Luanne Freund

Response

We still do not know very much about how tasks influence search, or more specifically: what are the task-based requirements of IR systems? Much of the research on task-based IR has focused on behavioural analyses of searchers in different task contexts, which informs our understanding of task as a contextual variable that influences behaviour, but does not necessarily have design implications for search.

This is a multifaceted problem, as it involves the relationships between task characteristics, document characteristics and characteristics of retrieval systems. We have descriptive models of each of these components that can help us identify key characteristics, but we are lacking theoretical and empirical models that identify the relationships between them that are most likely to influence search outcomes. The empirical studies that we do have are of limited value due to the lack of a standard nomenclature for tasks and the idiosyncratic operationalization of task characteristics in assigned search tasks.

References
  1. Norbert Fuhr. Salton Award Lecture: Information retrieval as engineering science. SIGIR Forum, 46(2):19–28, 2012.

    My thinking about this problem has been influenced by the 2012 SIGIR Salton Award keynote delivered by Norbert Fuhr, in which he discusses the need for an engineering approach in IR that would allow us to predict the kinds of systems and features needed in response to particular domain and task scenarios. The paper points us towards the importance of developing theoretical models of task-based IR, as well as conducting more carefully controlled and systematic empirical studies to test and further develop these models.

  2. Robert Capra, Gary Marchionini, Jung Sun Oh, Fred Stutzman, and Yan Zhang. Effects of structure and interaction style on distinct search tasks. In Proceedings of the 7th ACM/IEEE-CS joint Conference on Digital libraries, JCDL ‘07, pages 442–451, New York, NY, USA, 2007. ACM.

    There is very little published research that predicts and tests for task-based effects of retrieval system features on retrieval outcomes rather than user behaviours. The Capra et al. (2007) study comes close, as it examines relationships between task types, interaction styles and information architecture.

  3. Barbara Wildemuth and Luanne Freund. Search tasks and their role in studies of search behaviors. In Proceedings of the 3rd International Workshop in Human-Computer Interaction and Information Retrieval, HCIR ‘09, New York, NY, USA, 2009. ACM.

    This position paper identifies some of the issues with task characterization and operationalization in interactive IR studies.

Gene Golovchinsky

Response

I think the biggest obstacle to the deployment of task-aware systems is a lack of understanding of when such systems may be useful. When it is clear that records of prior interaction can be used to inform subsequent system behavior, this information is already incorporated into systems. There are no significant technical difficulties in starting down this road. The biggest challenge is one of perception: just because Google doesn't do something doesn't mean that that something isn't possible or desirable in other contexts.

Jaap Kamps

Response I

To build an information access tool that actively supports a searcher in articulating a whole search task, and in interactively exploring the results of every stage of the process. There is a striking difference between how we ask a person for information, giving context and articulating what we want and why, and how we communicate with current search engines. Current search technology requires us to slice-and-dice our problem into several queries and sub-queries, and laboriously combine the answers post hoc to solve our tasks. Combining different sources requires opening multiple windows or tabs, and cutting-and-pasting information between them. Current search engines may have reached a local optimum for answering micro information needs with lightning speed. Supporting the overall task opens up new ways to significantly advance our information access tools, by developing tools that are adapted to our overall tasks rather than having searchers adapt their search tactics to the "things that work."

References I
  1. Ian Ruthven. Interactive information retrieval. Annual Review of Information Science and Technology, 42(1):43–91, 2008.

    Solid overview of how much we know about interaction, which also immediately highlights how little we know about the mechanics of interaction during the performance of a complex task.

  2. Arjen P. de Vries, Wouter Alink, and Roberto Cornacchia. Search by strategy. In Proceedings of the 3rd Workshop on Exploiting Semantic Annotations in Information Retrieval, ESAIR ’10, pages 27–28, New York, NY, USA, 2010. ACM.

    Interesting new approach to formulate a complex query (or search strategy) for tasks of increasing complexity.

Response II

Can we make a retrieval system aware of the searcher’s stage in the information seeking process, tailor the results to each stage, and guide the searcher through the overall process? A search session for a non-trivial search task consists of stages with different sub-goals (e.g., problem identification) and specific search tactics (e.g., reading introductory texts, familiarizing oneself with terminology). Making a system aware of a searcher’s information seeking stage has the potential to significantly improve the search experience. Searchers are stimulated to actively engage with the material, to get a grasp on the information need and articulate effective queries, to critically evaluate retrieved results, and to construct a comprehensive answer. This may be particularly helpful for searchers with poor information or media literacy. This is of obvious importance in many situations: e.g., education, medical information, and search for topics “that matter.” Some special domains, such as patent search and evidence-based practice in medicine, prescribe a particular information seeking process in great detail. Here, building systems to support (and enforce) this process is of obvious value.

References II
  1. Marcia J. Bates. Where should the person stop and the information search interface start? Information Processing and Management, 26(5):575–591, 1990.

    There is a need for a new discussion on what role the system and user play, and how the interface supports the task progress as well as the information seeking process.

  2. Forest Woody Horton. Understanding information literacy: a primer; an easy-to-read, non-technical overview explaining what information literacy means, designed for busy public policy-makers, business executives, civil society administrators and practicing professionals. UNESCO, 2008.

    Information/media literacy research is closely related, and essentially outlines what type of information seeking behavior should be promoted by the system.

Bill Kules

Response

Research Problem: Design of exploratory search tasks for search system evaluation

Evaluation is an essential part of developing search tools that are more task aware, particularly for exploratory search, which is a recognized challenge for information seeking systems and an area of active research and development.

For any user study, tasks must be carefully constructed to balance ecological validity with experimental control. For exploratory search, this is a particular challenge, because we are trying to induce search behaviors that are inherently open-ended. Individual searchers have to interpret the task, formulate their own queries and evaluate the results based on their understanding of the information need and their own knowledge and experience. At the same time, we wish to maintain some level of experimental control to permit comparisons between systems and longitudinally.

Borlund (2003) developed the concept of a simulated work task, which forms the basis for many user evaluations of search systems. Many studies have used the simulated work task as the basis for search tasks, but tasks are rarely comparable between or even within studies, limiting our ability to build up a corpus of results in a manner similar to the TREC studies. Recent work has started to formalize attributes of exploratory search tasks and provide suggestions for how to create and validate such tasks (Kules and Capra, 2012; Wildemuth and Freund, 2012). There are a number of open questions to be investigated. Three of them are:

  1. What is an appropriate, parsimonious set of attributes to define exploratory search tasks?
  2. How can we quantify (and can we quantify at all) measures for these attributes?
  3. Given that searchers individually interpret tasks and results, what comparisons does this allow us to make between systems and studies?
References
  1. Pia Borlund. The IIR evaluation model: a framework for evaluation of interactive information retrieval systems. Information Research, 8(2), 2003.

    This paper developed the concept of a simulated work task. It has formed the basis for much user-focused systems evaluation.

  2. Bill Kules and Robert Capra. Influence of training and stage of search on gaze behavior in a library catalog faceted search interface. Journal of the American Society for Information Science and Technology, 63(1):114–138, 2012.

    This paper extends the simulated work task concept to create exploratory search tasks. It incorporates 7 additional criteria and describes an iterative procedure to create and validate exploratory search scenarios for a semi-controlled laboratory study.

  3. Barbara M. Wildemuth and Luanne Freund. Assigning search tasks designed to elicit exploratory search behaviors. In Proceedings of the Symposium on Human-Computer Interaction and Information Retrieval, HCIR ‘12, pages 4:1–4:10, New York, NY, USA, 2012. ACM.

    This paper reviews and critiques 84 papers related to exploratory search task design and proposes a set of design recommendations.

Birger Larsen

Response

One way of making progress towards more task-aware systems is to facilitate research by considering whether it is possible and fruitful to extend the Cranfield paradigm to support experiments with task-based search. What are the demands on topics and relevance assessments to support task-based experiments, and what additional procedures and performance measures are needed? Can the complexity be handled, and what could be learned from such experiments?

References
  1. Marianne Lykke, Birger Larsen, Haakon Lund, and Peter Ingwersen. Developing a test collection for the evaluation of integrated search. In Proceedings of the 32nd European Conference on Advances in Information Retrieval, ECIR‘2010, pages 627–630, Berlin, Heidelberg, 2010. Springer-Verlag.

    This poster paper describes the iSearch test collection, where we put much more emphasis on obtaining thorough and structured descriptions of the work tasks and information needs. This may be one step towards task based search, as it facilitates experiments with extended task descriptions.

Christina Lioma

Response

One potentially interesting aspect of task-aware search is the ranking model that estimates the relevance of the retrieved results. Traditionally, ranking models are grounded in mathematical estimates, such as metric distances or probabilities, and often include empirically-tuned parameters. It is not uncommon to use the exact same ranking model in different search tasks. However, relevance should not necessarily always be treated uniformly across different tasks. Task-based ranking models could be considered, taking as a starting point advances in dynamic similarity measures, which are partly tuneable at query time manually by the user (Bustos and Skopal 2006), or which accommodate various different task-based similarity functions (Ciaccia and Patella 2009).
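
To make the general idea concrete, here is a minimal sketch of a task-dependent ranking function as a weighted combination of several similarity measures, with weights chosen per task type (or tuned at query time). The measures and weights are illustrative assumptions, not those of Bustos and Skopal or Ciaccia and Patella.

    def task_aware_score(query, doc, measures, task_weights):
        """Combine similarity measures with task-specific weights."""
        return sum(w * measures[name](query, doc) for name, w in task_weights.items())

    measures = {
        # Crude topical overlap between query and document terms.
        "topical": lambda q, d: len(set(q.split()) & set(d.split())) / max(len(set(q.split())), 1),
        # Placeholder for a freshness/recency score.
        "recency": lambda q, d: 0.0,
    }

    # Different task types weight the same measures differently.
    weights_by_task = {
        "navigational":  {"topical": 1.0, "recency": 0.0},
        "news_tracking": {"topical": 0.6, "recency": 0.4},
    }

    print(task_aware_score("trec session track", "overview of the trec session track",
                           measures, weights_by_task["news_tracking"]))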

These papers present the two examples of dynamic similarity measures mentioned above:

References
  1. Benjamin Bustos and Tomáš Skopal. Dynamic similarity search in multi-metric spaces. In Proceedings of the 8th ACM International workshop on Multimedia Information Retrieval, MIR ‘06, pages 137–146, New York, NY, USA, 2006. ACM.
  2. Paolo Ciaccia and Marco Patella. Principles of information filtering in metric spaces. In Proceedings of the 2009 Second International Workshop on Similarity Search and Applications, SISAP ‘09, pages 99–106, Washington, DC, USA, 2009. IEEE Computer Society.

Jingjing Liu

Response I

For multi-session tasks, how can search systems perform better, at different stages, and for different task types (e.g., tasks with different structures, difficulty/complexity levels, life vs. scholarly tasks, actionable vs. informational tasks, etc.)?

Frequently seen in everyday life, multi-session tasks are usually complex and require multiple sessions to complete. While IR systems do a decent job with simple search tasks, there is much room for them to improve on multi-session tasks. How can systems be better designed to facilitate users’ finding and re-finding of information in multi-session tasks? What system features will be supportive and preferred by users? Understanding multi-session task features, user behaviors, and system features is important for addressing this question.

References I
  1. Jaime Arguello, Wan-Ching Wu, Diane Kelly, and Ashlee Edwards. Task complexity, vertical display and user interaction in aggregated search. In Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ‘12, pages 435–444, New York, NY, USA, 2012. ACM.

    Arguello et al. (2012) addressed task features (complexity) and the system interface feature (vertical display) as well as their interaction with users in aggregated search. This could be very relevant to and beneficial in dealing with multi-session tasks.

  2. Alexander Kotov, Paul N. Bennett, Ryen W. White, Susan T. Dumais, and Jaime Teevan. Modeling and analysis of cross-session search tasks. In Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ‘11, pages 5–14, New York, NY, USA, 2011. ACM.

    Kotov et al. (2011) showed that it is possible to effectively model and analyze users’ cross-session search behaviors. They dealt with two problems: 1) identifying queries from previous sessions that are related to the current one, and 2) given a multi-query task, predicting whether the user will return to the task in the future. This research helps search systems determine task context and suggest queries for multi-session tasks.

Response II

What task features make a search difficult? And how can systems better support “difficult” tasks according to the reasons why they are difficult?

Byström & Järvelin (1995) and Byström (2002) explored the effect of task complexity (defined as the a priori determinability of information inputs, processing, and outputs) on people’s information seeking and use in a work task environment. These studies found that as task complexity increased, so did the complexity of the information needed, the need for domain and problem-solving information, and the number of sources used, while success decreased. They also found a strong link between the types of information acquired and the sources used, and that task complexity is directly related to source use.

Although task complexity is not the same concept as task difficulty, according to Li & Belkin (2008) both reflect the information seeker’s perception that information seeking is not easy. More qualitative studies like these are needed to understand what task features make IR system users perceive “difficulty”. These will help us design systems that better support “difficult” tasks according to the reasons why they are difficult.

References II
  1. Katriina Byström and Kalervo Järvelin. Task complexity affects information seeking and use. Information Processing and Management, 31(2):191–213, March 1995.
  2. Katriina Byström. Information and information sources in tasks of varying complexity. Journal of the American Society for Information Science, 53(7):581–591, 2002.
  3. Yuelin Li and Nicholas J. Belkin. A faceted approach to conceptualizing tasks in information seeking. Information Processing and Management, 44(6):1822–1837, 2008.

Gary Marchionini

Response

An overarching problem is two-fold: user context elicitation and use. By this I mean determining what and how information seekers learn over sessions and correspondingly how systems might assist this process.

A second, more specific problem is how to represent search history to users.

References
  1. Robert Capra, Gary Marchionini, Javier Velasco-Martin, and Katrina Muller. Tools-at-hand and learning in multi-session, collaborative search. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI ‘10, pages 951–960, New York, NY, USA, 2010. ACM.

Catherine Smith

Response I

Research Problem 1: the need to study transitions between task-specific applications and search sub-tasks.

As Belkin (2009) stated, “… we might say that an ultimate goal of, and challenge for IR research is to arrange things such that a person never has to engage with a separate IR system at all (although I am quite willing to agree that there are certainly circumstances in which such engagement might be indeed desirable.).” In this view, the burden of acquiring useful task descriptions (useful to the retrieval system) might be handled by applications that support “parent-task” goals (with a parent-task defined as any task that invokes an information search sub-task). In order to exploit task-related data available from such an application, we need to study transitions between search sub-tasks and parent-tasks.

References I
  1. Nicholas J. Belkin. Really supporting information seeking: A position paper. Information Seeking Support Systems. Technical Report. 2009.
Response II

Following from the above, as an example of transitions, one can imagine search sub-tasks interleaved with active reading, where reading is the parent-task. An application like the one described by Hinckley, Bi, Pahud, & Buxton (2012) might collect implicit and/or explicit task-related data, which it could pass to a search utility when search sub-tasks are invoked. We need to describe transitions, and investigate how transitions may be improved for the user. Toms, Villa, & McCay-Peet (2013) is an example of an experimental study along these lines. The authors state their objective as, “… to explore the boundaries of the work task and search process to examine how users integrate search with the larger task” (p. 16). The study used an active reading interface which was developed by the researchers and was an integral component of a larger experimental retrieval system. Work on this problem would be further advanced by collaborations with HCI researchers designing task-specific applications. This is particularly important if we are to consider an architecture that enables coupling of task applications and a search utility.

References II
  1. Elaine G. Toms, Robert Villa, and Lori McCay-Peet. How is a search system used in work task completion? Journal of Information Science, 39(1):15–25, 2013.
  2. Ken Hinckley, Xiaojun Bi, Michel Pahud, and Bill Buxton. Informal information gathering techniques for active reading. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI ‘12, pages 1893–1896, New York, NY, USA, 2012. ACM.

Mark Smucker

Response

In our work on time-biased gain (TBG), Charlie Clarke and I have written about how TBG has no presupposed notion of gain, or of user interfaces, or even of retrieval systems. What matters to time-biased gain is that we have some way of estimating the gain achieved by the user over time.
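
For reference, the general form of TBG accumulates gain discounted by the time at which the user attains it (cf. [4]); the sketch below is a minimal illustration, with the gain events and half-life chosen as assumptions rather than calibrated values.

    import math

    def time_biased_gain(gain_events, half_life=224.0):
        """gain_events: (time_in_seconds, gain) pairs for one session, discounted by an exponential decay."""
        return sum(g * math.exp(-t * math.log(2) / half_life) for t, g in gain_events)

    # A user who gains relevant material 30s, 120s, and 600s into the session.
    print(time_biased_gain([(30, 1.0), (120, 1.0), (600, 1.0)]))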

This workshop concerns itself with search tasks where the user wants to "cultivate a deeper understanding of a problem or topic" and where the task requires "sustained interaction and engagement with information". The notion that lengthy interactions with information are central to task-based search implies that not only will gain likely be spread out over a long time period, but that gain may not be simply accumulated on acquisition of relevant material. I wonder to what extent our notions of gain in search must change. Today we think of gain as finding relevant documents, but will that be the correct model of gain for task-based search?

Cooper (1973) discusses the notion that each document encountered in a search session should have some positive or negative utility. In Cooper's formulation of the problem, the retrieval system's job is to deliver documents, and the user can report to us the utility of each document. If we see our IR systems as becoming more than tools for retrieval of documents, we may need new measures of gain. For example, if our IR systems became designed for supporting creative work, we might need a measure of gain similar to the creativity support index of Carroll and Latulipe (2009). Or, perhaps we need to start measuring and modeling negative utility along the lines of searcher frustration as done by Feild, Allan, and Jones (2010). Once we know how to measure gain for users, we will then be faced with the task of how to incorporate these notions of gain into our Cranfield-style evaluations of task-based search.

References
  1. Erin A. Carroll and Celine Latulipe. The creativity support index. In CHI ‘09 Extended Abstracts on Human Factors in Computing Systems, CHI EA ‘09, pages 4009–4014, New York, NY, USA, 2009. ACM.
  2. William S. Cooper. On selecting a measure of retrieval effectiveness. Journal of the American Society for Information Science, 24(2):87–100, 1973.
  3. Henry Allen Feild and James Allan. Task-aware search assistant. In Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ‘12, pages 1015–1015, New York, NY, USA, 2012. ACM.
  4. Mark D. Smucker and Charles L.A. Clarke. Time-based calibration of effectiveness measures. In Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ‘12, pages 95–104, New York, NY, USA, 2012. ACM.

Simone Stumpf

Response I

There are two areas for task-based search systems that I think are interesting to explore:

  1. Understanding task-based search for non-text items.

    Not all search is for text-based items; users’ search tasks also include images, music and videos. Research into searching for these items is limited and fragmented. Previously, there has been some work to understand how users search for images (Westman 2009); however, there is a growing realization that more information is needed that takes the context and background of the user into account in order to support them in their task-based search. More recently, there has been increasing interest in searching for and in videos (Smeaton 2007).

  2. Providing better cues and “scent” in task-based search.

    Search engine results pages on the web have moved on from being just a collection of ranked items; they now provide subtle cues for the user to get to the information that they want via snippets, visual previews, etc. However, there are two issues surrounding this. Firstly, this functionality is usually not available to users on their personal storage systems, where they may rely on cues of association (Chau et al. 2008). Secondly, there is a lack of understanding of the role these cues play in users’ task-based search (Woodruff et al. 2001).

References I
  1. Allison Woodruff, Andrew Faulring, Ruth Rosenholtz, Julie Morrsion, and Peter Pirolli. Using thumbnails to search the web. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI ‘01, pages 198–205, New York, NY, USA, 2001. ACM.
  2. Alan F. Smeaton. Techniques used and open challenges to the analysis, indexing and retrieval of digital video. Information Systems, 32(4):545–559, 2007.
  3. Stina Westman. Image Users' Needs and Searching Behaviour, pages 63–83. John Wiley & Sons, Ltd, 2009.
Response II

Fundamentally - what is a "task"? There are so many different understandings of this term and it really matters, as any systems that are developed rest on a basic assumption of what is meant by "task".

References II
  1. Victor M. Gonzalez and Gloria Mark. Constant, constant, multi-tasking craziness: managing multiple working spheres. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI ‘04, pages 113–120, New York, NY, USA, 2004. ACM.

    This paper is one take on what a task is, though not necessarily in the context of search.

Jaime Teevan

Response

Defining task boundaries. While some tasks (like buying a car or planning a trip) are clearly defined, others (like planning summer activities or doing research) are much harder to identify because they evolve, change, are part of larger tasks, and consist of sub-tasks. It can be very hard for a person -- let alone a computer -- to clearly identify task boundaries, but clear task definition may be important for tools that want to support task-based search.
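
A minimal sketch of one crude framing of the problem follows; this is an assumption for illustration, not the Jones and Klinkner method cited below. Queries are grouped into tasks by a pairwise "same task" decision (here, simple lexical overlap) rather than by session timeouts, which immediately runs into the boundary questions described above (evolving vocabulary, sub-tasks, tasks nested in larger tasks).

    def same_task(q1, q2, threshold=0.2):
        """Crude lexical signal: Jaccard overlap of query terms."""
        a, b = set(q1.lower().split()), set(q2.lower().split())
        return len(a & b) / len(a | b) >= threshold if a | b else False

    def group_into_tasks(queries):
        """Assign each query to the first existing task it matches, else start a new one."""
        tasks = []
        for q in queries:
            for task in tasks:
                if any(same_task(q, prev) for prev in task):
                    task.append(q)
                    break
            else:
                tasks.append([q])
        return tasks

    print(group_into_tasks(["cheap flights paris", "paris hotels", "summer camp for kids"]))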

References
  1. Rosie Jones and Kristina Lisa Klinkner. Beyond the session timeout: automatic hierarchical segmentation of search topics in query logs. In Proceedings of the 17th ACM Conference on Information and Knowledge Management, CIKM ‘08, pages 699–708, New York, NY, USA, 2008. ACM.

    Jones and Klinkner [CIKM 2008] discuss the challenges with identifying tasks (which they call "search missions") versus sessions, and present one way to do so automatically.

Elaine Toms

Response

In this discipline, we seem trapped in the user-centred paradigm; not everything information-oriented is about the user and search behavior. A task may exist in isolation from the people who accomplish it. Tasks emerge out of an organizational environment, and their resolution supports some organizational outcome. Tasks have clear objectives, but may have multiple outcomes and multiple ways of reaching them. One could conceive of the user as a convenient slave/robot/handmaiden to get the task done. The challenge is twofold:

  1. Understanding the process: think Henry Ford and project the automobile assembly line a century later, when the task has information components that have to be mixed, scrapped, stirred, and moulded, although not in quite the same physical way. We do not know how similar typical “knowledge work” tasks are to, for example, the tasks that occur on an automobile assembly line, although there have been clues, as demonstrated by the work on how people write papers and proposals.
  2. Understanding which “sledge hammer, drill or screwdriver” the “slave” needs to get the job done; perhaps less like the automobile assembly line, much of knowledge work requires human intervention in the form of decision making that requires intense cognitive activity. What tools does the slave need to assist with the job?

In the context of knowledge work, what are those generic tasks that are shared by many contexts, that is, which ones are comparable to, for example, the “cut and paste” tasks of desktop applications? Which ones are context specific, for example comparable to producing a slide show in presentation software? Which ones require finding data and/or information? Which ones require using information? Which ones rely on the talents of the slave because the technology is still not sophisticated enough to do the task from beginning to end, and how do we assist the slave with more useful tools?

Does the approach used by Bartlett in decomposing a bioinformatics task fit with other types of "knowledge work" tasks?

The Kuhlthau and Vakkari work on writing proposals and papers goes a long way toward decomposing task in an educational context (although they may not see it that way).

What "cognitive protheses" do we need to develop to support task completion? An interesting and short note that defines this concept: http://www.lpi.usra.edu/publications/reportsCB-1089/ford.pdf

Why have we never done a formal requirements analysis for any of our information solutions? Take even the digital library. Its design is based on past practices.

References
  1. Joan C. Bartlett and Elaine G. Toms. Developing a protocol for bioinformatics analysis: An integrated information behavior and task analysis approach. Journal of the American Society for Information Science and Technology, 56(5):469–482, 2005.

Pertti Vakkari

Response

Understanding in more detail how larger tasks are related to search tasks and searching. By tasks I mean information intensive work tasks, which generate several search sessions. Empirical results hint that various aspects of the search process, such as term selection, querying, relevance judgment and the information utilized, vary between search sessions as task performance proceeds. In order to understand the role of various activities (stages) in the search process within and between sessions, it is necessary to understand the whole search process and how it is associated with task performance. This is important 1) theoretically, for understanding the phenomenon we are interested in, 2) for system design, to better match the tools with human activities from the angle of both search tasks and work tasks, and 3) for creating evaluation procedures and metrics for task-based search.

References
  1. Jingjing Liu and Nicholas J. Belkin. Searching vs. writing: Factors affecting information use task performance. Proceedings of the American Society for Information Science and Technology, 49(1):1–10, 2012.

    In an experimental longitudinal setting, Liu & Belkin studied the associations between newspaper article writing tasks and information searching and use at three points in time during the preparation of the article. The results in this and other articles from the same experiment are important because they have extended our understanding of how some features of tasks and task performance are related to various aspects of searching and utility assessments.

  2. Pertti Vakkari and Saila Huuskonen. Search effort degrades search output but improves task outcome. Journal of the American Society for Information Science and Technology, 63(4):657–670, 2012.

    In a field study, Vakkari & Huuskonen examined how medical students’ search effort for an assigned essay writing task was associated with precision and relative recall, and how this was associated with the quality of the essay. They found that effort in the search process degraded precision, but improved task outcome: the poorer the precision, the better the quality of the essay. The findings concerning the whole process are important because they suggest that traditional effectiveness measures in information retrieval are not sufficient for task-based searching. They should be complemented with evaluation measures for the search process and the task outcome.

Ryen White

Response
  1. Characterizing and supporting cross-session and/or cross-device search tasks, including “slow search” support that capitalizes on time between search episodes. Motivation: Complex tasks persist over time. People are using multiple devices more frequently. Need ways to support transitions between devices that capitalize on the time that search engines may have – in predicting whether a searcher will resume the task, deciding what action to take to help them (e.g., finding more/better results while the searcher is away from the search engine), and helping them restore their task state.
  2. Leveraging on-task behavior of the current user (personalization) and similar users (those in related cohorts). Motivation: On-task behavior is most relevant for personalization. Need ways to automatically identify search tasks and use this task-relevant information to adapt the search experience (results and UX) within the current session and beyond. Also potential benefit from using other searchers’ on-task search behavior, especially for addressing the “cold start” problem associated with new users.
  3. Understanding and modeling the impact of task and user characteristics on information search behavior. Motivation: Attributes of the user (e.g., their domain knowledge), the search task (e.g., its complexity), or their relationship (e.g., user familiarity with tasks of this type) affect search behavior. Better understanding these effects and developing user/task models that consider them can help us design better systems and methodologies (including user simulations learned from sources such as logs) to evaluate these systems.
  4. Automatically identifying components of search tasks and guiding users through those stages. Motivation: Complex search tasks have multiple aspects. Automatically identifying those parts can help systems guide users through the stages in a useful sequence. Tours or trails could be shown to searchers as an alternative/complement to existing result lists. These tours can be manually created or determined algorithmically from sources such as search log data.
References
  1. Eugene Agichtein, Ryen W. White, Susan T. Dumais, and Paul N. Bennett. Search, interrupted: understanding and predicting search task continuation. In Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’12, pages 315–324, New York, NY, USA, 2012. ACM.
  2. Alexander Kotov, Paul N. Bennett, Ryen W. White, Susan T. Dumais, and Jaime Teevan. Modeling and analysis of cross-session search tasks. In Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ‘11, pages 5–14, New York, NY, USA, 2011. ACM.
  3. Yu Wang, Xiao Huang, and Ryen W. White. Characterizing and supporting cross-device search tasks. In Proceedings of the sixth ACM International Conference on Web Search and Data Mining, WSDM ‘13, pages 707–716, New York, NY, USA, 2013. ACM.
  4. James Pitkow, Hinrich Schütze, Todd Cass, Rob Cooley, Don Turnbull, Andy Edmonds, Eytan Adar, and Thomas Breuel. Personalized search. Communications of the ACM, 45(9):50–55, September 2002.
  5. Bin Tan, Xuehua Shen, and ChengXiang Zhai. Mining long-term search history to improve search accuracy. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ‘06, pages 718–723, New York, NY, USA, 2006. ACM.
  6. Zhen Liao, Yang Song, Li-wei He, and Yalou Huang. Evaluating the effectiveness of search task trails. In Proceedings of the 21st International Conference on World Wide Web, WWW ‘12, pages 489–498, New York, NY, USA, 2012. ACM.
  7. Ryen W. White, Wei Chu, Ahmed Hassan, Xiaodong He, Yang Song, and Hongning Wang. Enhancing personalized search by mining and modeling task behavior. In Proceedings of the 22nd International Conference on World Wide Web, WWW ‘13, New York, NY, USA, To Appear. ACM.
  8. Diane Kelly and Colleen Cool. The effects of topic familiarity on information search behavior. In Proceedings of the 2nd ACM/IEEE-CS joint Conference on Digital libraries, JCDL ‘02, pages 74–75, New York, NY, USA, 2002. ACM.
  9. Katriina Byström and Kalervo Järvelin. Task complexity affects information seeking and use. Information Processing and Management, 31(2):191–213, March 1995.
  10. Yuelin Li and Nicholas J. Belkin. An exploration of the relationships between work task and interactive information search behavior. Journal of the American Society for Information Science and Technology, 61(9):1771–1789, 2010.
  11. Ryen W. White, Susan T. Dumais, and Jaime Teevan. Characterizing the influence of domain expertise on web search behavior. In Proceedings of the Second ACM International Conference on Web Search and Data Mining, WSDM ‘09, pages 132–141, New York, NY, USA, 2009. ACM.
  12. Ryen W. White, Ian Ruthven, Joemon M. Jose, and C. J. Van Rijsbergen. Evaluating implicit feedback models using searcher simulations. ACM Transactions on Information Systems, 23(3):325–361, 2005.
  13. Catherine Guinan and Alan F. Smeaton. Information retrieval from hypertext using dynamically planned guided tours. In Proceedings of the ACM Conference on Hypertext, ECHT ‘92, pages 122–130, New York, NY, USA, 1992. ACM.
  14. Adish Singla, Ryen White, and Jeff Huang. Studying trail finding algorithms for enhanced web search. In Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ‘10, pages 443–450, New York, NY, USA, 2010. ACM.
  15. Ahmed Hassan and Ryen W. White. Task tours: helping users tackle complex search tasks. In Proceedings of the 21st ACM International Conference on Information and Knowledge Management, CIKM ‘12, pages 1885–1889, New York, NY, USA, 2012. ACM.
  16. Randall H. Trigg. Guided tours and tabletops: tools for communicating in a hypertext environment. ACM Transactions on Information Systems, 6(4):398–414, 1988.
  17. Alan Wexelblat and Pattie Maes. Footprints: history-rich tools for information foraging. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI ‘99, pages 270–277, New York, NY, USA, 1999. ACM.

Barbara Wildemuth

Response

Many studies have used one or more attributes of search tasks as an independent variable and examined various search behaviors (e.g., search terms selected, search strategy formulation and re-formulation, or browsing behavior) as the dependent variable. Many of these have found some type of effect, but not all of them have. Which task attributes are most worthwhile to incorporate in future studies of this type? Are there any that consistently show no effect on search behaviors or outcomes?

References
  1. Yuelin Li and Nicholas J. Belkin. An exploration of the relationships between work task and interactive information search behavior. Journal of the American Society for Information Science and Technology, 61(9):1771–1789, 2010.

    Li's taxonomy of task attributes contributed to this empirical study; it's useful because it demonstrates a strong (though not perfect, I think) conceptual foundation for an empirical study.

  2. Elaine G. Toms, Luanne Freund, Richard Kopak, and Joan C. Bartlett. The effect of task domain on search. In Proceedings of the 2003 Conference of the Centre for Advanced Studies on Collaborative research, CASCON ‘03. IBM Press, 2003.

    This is an older study, but it makes me think that we should try to pick some of the low-hanging fruit first. It may be relatively easy to detect the domain in which a person is searching; can we then tune the search engine to better support the searcher?

    Many interactive IR studies use (search) task complexity or difficulty as an independent variable. Yet these concepts are rarely defined clearly and have been operationalized in a variety of ways. So that the results of future studies can be compared with each other, we need to come to some agreement on the definitions of search task difficulty and search task complexity. In addition, in many studies, it’s not clear whether the focus is on the search task or the work task, so we may also need to come to some agreement on definitions of work task difficulty and work task complexity.

  3. Donald J. Campbell. Task complexity: A review and analysis. The Academy of Management Review, 13(1):40–52, 1988.

    This is a classic review, but outside our field. Campbell treats task complexity as 1) primarily a psychological experience, 2) an interaction between task and person characteristics, and 3) a function of objective task characteristics. Applied to our work today, this framework may provide a basis for our own definitions of task complexity.

  4. Katriina Byström and Kalervo Järvelin. Task complexity affects information seeking and use. Information Processing and Management, 31(2):191–213, March 1995.
  5. Jingjing Liu, Chang Liu, Michael Cole, Nicholas J. Belkin, and Xiangmin Zhang. Exploring and predicting search task difficulty. In Proceedings of the 21st ACM International Conference on Information and Knowledge Management, CIKM '12, pages 1313–1322, New York, NY, USA, 2012. ACM.

    The first of this pair focuses on work tasks and the second focuses on search tasks. While we tend to focus on search tasks (as interactive IR researchers), it may be equally important (or more important) to attend to the complexity or difficulty of work tasks.

Organizers: Diane Kelly (PI), Jaime Arguello (Co-PI), Robert Capra (Co-PI)
Student Assistant: Anita Crescenzi
School of Information & Library Science
University of North Carolina
100 Manning Hall, CB#3360
Chapel Hill, NC 27599-3360