Information Retrieval and the Web
The science surrounding search engines is commonly referred to as information retrieval, in which algorithmic principles are developed to match user interests to the best information about those interests.
Google started as a result of our founders' attempt to find the best match between user queries and Web documents, and to do it really fast. In the process, they uncovered a few basic principles: 1) the best pages tend to be those linked to the most; 2) the best description of a page is often derived from the anchor text associated with the links to that page. Theories were developed to exploit these principles to optimize the task of retrieving the best documents for a user query.
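As a rough illustration of these two principles, the sketch below counts inbound links per page and uses the anchor text of those links as a description of the target page. It is a toy under stated assumptions, not the ranking method Google uses: the link data, the function names (inlink_counts, anchor_descriptions, rank_pages), and the scoring rule are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy link graph: (source page, target page, anchor text).
LINKS = [
    ("a.example", "c.example", "search engine research"),
    ("b.example", "c.example", "search engine papers"),
    ("c.example", "a.example", "home page"),
    ("d.example", "c.example", "search research"),
]


def inlink_counts(links):
    """Principle 1: count how many links point at each target page."""
    return Counter(target for _source, target, _anchor in links)


def anchor_descriptions(links):
    """Principle 2: describe each page by the anchor-text words of its inbound links."""
    words = defaultdict(Counter)
    for _source, target, anchor in links:
        words[target].update(anchor.lower().split())
    return words


def rank_pages(query, links):
    """Rank pages by anchor-word overlap with the query, then by in-link count.

    This scoring rule is an assumption made for illustration only.
    """
    counts = inlink_counts(links)
    words = anchor_descriptions(links)
    terms = query.lower().split()
    scored = []
    for page in counts:
        overlap = sum(words[page][term] for term in terms)
        if overlap > 0:
            scored.append((overlap, counts[page], page))
    scored.sort(reverse=True)
    return [page for _overlap, _count, page in scored]
```

Calling rank_pages("search research", LINKS) considers only pages whose inbound anchor text mentions a query term, so a page can be matched by words that never appear in its own content.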
Search and Information Retrieval on the Web have advanced significantly since those early days: 1) the notion of "information" has greatly expanded from documents to much richer representations such as images and videos; 2) users are increasingly searching on their mobile devices, with interaction characteristics very different from those of desktop search; 3) users are increasingly looking for direct information, such as answers to a question, or seeking to complete tasks, such as booking an appointment. In Research at Google, we are continuing to enhance and refine the world's foremost search engine by aiming to scientifically understand the implications of those changes and address the new challenges they bring.
213 Publications
-
Hidden in Plain Sight: Classifying Emails Using Embedded Image Contents
Navneet Potti, James B. Wendt, Qi Zhao, Sandeep Tata, Marc Najork
The Web Conference (2018) (to appear)
-
Position Bias Estimation for Unbiased Learning to Rank in Personal Search
Xuanhui Wang, Nadav Golbandi, Michael Bendersky, Donald Metzler, Marc Najork
Proceedings of the 11th ACM International Conference on Web Search and Data Mining (WSDM), ACM (2018), pp. 610-618
-
WWW ’18 Companion, April 23–27, 2018, Lyon, France (2018) (to appear)
-
Learning from User Interactions in Personal Search via Attribute Parameterization
Mike Bendersky, Xuanhui Wang, Don Metzler, Marc Najork
Proceedings of the 10th ACM International Conference on Web Search and Data Mining (WSDM), ACM (2017), pp. 791-800
-
Learning to Attend, Copy, and Generate for Session-Based Query Suggestion
Mostafa Dehghani, Sascha Rothe, Enrique Alfonseca, Pascal Fleury
CIKM 2017 (2017)
-
Multiscale Quantization for Fast Similarity Search
Xiang Wu, Ruiqi Guo, Ananda Theertha Suresh, Sanjiv Kumar, Dan Holtmann-Rice, David Simcha, Felix X. Yu
NIPS (2017)
-
Neural Ranking Models with Weak Supervision
Mostafa Dehghani, Hamed Zamani, Aliaksei Severyn, Jaap Kamps, W. Bruce Croft
Proceedings of The 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, ACM (2017)
-
Reflections on the REST Architectural Style and “Principled Design of the Modern Web Architecture”
Roy T. Fielding, Richard N. Taylor, Justin Erenkrantz, Michael M. Gorlick, E. James Whitehead, Rohit Khare, Peyman Oreizy
Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering (ESEC/FSE 2017), pp. 4-11
-
Cheng Li, Mike Bendersky, Sujith Ravi, Vijay Garg
Proceedings of WSDM (2017)
-
Situational Context for Ranking in Personal Search
Hamed Zamani, Mike Bendersky, Mingyang Zhang, Xuanhui Wang
WWW (2017)
-
Batz Spear, Betsy (Adrienne Elizabeth) Beyer, Luca Cittadini, Max Saltonstall
Login (2016)
-
Jing Kong, Alex Scott, Georg M. Goerg
Google Inc (2016) (to appear)
-
Incorporating Clicks, Attention and Satisfaction into a Search Engine Result Page Evaluation Model
Aleksandr Chuklin, Maarten de Rijke
CIKM, ACM (2016) (to appear)
-
Qi Guo, Yang Song
CIKM 2016, ACM
-
Learning for Efficient Supervised Query Expansion via Two-stage Feature Selection
Zhiwei Zhang, Qifan Wang, Luo Si, Jianfeng Gao
SIGIR 2016 (2016)
-
Learning to Rank with Selection Bias in Personal Search
Xuanhui Wang, Michael Bendersky, Donald Metzler, Marc Najork
Proc. of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, ACM (2016), pp. 115-124
-
M3A: Model, MetaModel, and Anomaly Detection in Web Searches
Da-Cheng Juan, Neil Shah, Mingyu Tang, Zhiliang Qian, Diana Marculescu, Christos Faloutsos
arXiv preprint arXiv:1606.05978 (2016)
-
Using Machine Learning to Improve the Email Experience
Proc. of the 25th ACM International Conference on Information and Knowledge Management, ACM (2016), pp. 891
-
Wide & Deep Learning for Recommender Systems
Heng-Tze Cheng, Levent Koc, Jeremiah Harmsen, Tal Shaked, Tushar Chandra, Hrishi Aradhye, Glen Anderson, Greg Corrado, Wei Chai, Mustafa Ispir, Rohan Anil, Zakaria Haque, Lichan Hong, Vihan Jain, Xiaobing Liu, Hemal Shah
arXiv:1606.07792 (2016)
-
Wikipedia Tools for Google Spreadsheets
Wiki Workshop @ WWW (2016)
-
AdAlyze Redux: Post-Click and Post-Conversion Text Feature Attribution for Sponsored Search Ads
WWW '15 Companion Proceedings of the 24th International Conference on World Wide Web, ACM (2015)
-
Category-Driven Approach for Local Related Business Recommendations
Yonathan Perez, Michael Schueppert, Matthew Lawlor, Shaunak Kishore
Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, ACM, New York, NY (2015), pp. 73-82
-
Guidelines and Registration Procedures for URI Schemes
Dave Thaler, Tony Hansen, Ted Hardie
IETF RFCs, Internet Engineering Task Force (2015), pp. 19
-
Learning to Extract Local Events from the Web
John Foley, Michael Bendersky, Vanja Josifovski
SIGIR 2015
-
What can be Found on the Web and How: A Characterization of Web Browsing Patterns
Alexey Tikhonov, Arseniy Chelnokov, Gleb Gusev, Ivan Bogatyy, Liudmila Ostroumova Prokhorenkova
WebSci 2015, Oxford (to appear)
-
A Scalable Gibbs Sampler for Probabilistic Entity Linking
Neil Houlsby, Massimiliano Ciaramita
Advances in Information Retrieval (ECIR 2014), Springer International Publishing, pp. 335-346
-
Circumlocution in Diagnostic Medical Queries
Isabelle Stanton, Samuel Ieong, Nina Mishra
The 37th Annual ACM SIGIR Conference (2014)
-
Wei Liu, Cun Mu, Sanjiv Kumar, Shih-Fu Chang
Neural Information Processing Systems (2014)
-
Near Neighbor Join
Herald Kllapi, Boulos Harb, Cong Yu
ICDE (2014)
-
On Reconstructing a Hidden Permutation
Flavio Chierichetti, Anirban Dasgupta, Ravi Kumar, Silvio Lattanzi
RANDOM (2014)
-
Scalable K-Means by ranked retrieval
Andrei Broder, Lluis Garcia-Pueyo, Vanja Josifovski, Sergei Vassilvitskii, Srihari Venkatesan
Proceedings of the 7th ACM international conference on Web search and data mining, ACM, New York, NY, USA (2014), pp. 233-242
-
Storing and Querying Tree-Structured Records in Dremel
Foto N Afrati, Dan Delorey, Mosha Pasumansky, Jeffrey D. Ullman
Proceedings of the VLDB Endowment, vol. 7 (2014), pp. 1131-1142
-
The SMAPH System for Query Entity Recognition and Disambiguation
Marco Cornolti, Paolo Ferragina, Massimiliano Ciaramita, Stefan Rued, Hinrich Schuetze
ERD 2014: Entity Recognition and Disambiguation Challenge. SIGIR Forum., ACM
-
Towards better measurement of attention and satisfaction in mobile search
Dmitry Lagun, Dale Webster, Chih-Hung Hsieh, Vidhya Navalpakkam
SIGIR '14 Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval (2014), pp. 113-122
-
Up Next: Retrieval Methods for Large Scale Related Video Suggestion
Michael Bendersky, Lluis Garcia Pueyo, Vanja Josifovski, Jeremiah J. Harmsen, Dima Lepikhin
Proceedings of KDD 2014, New York, NY, USA, pp. 1769-1778
-
A Framework for Benchmarking Entity-Annotation Systems
Marco Cornolti, Paolo Ferragina, Massimiliano Ciaramita
Proceedings of the International World Wide Web Conference (WWW) (Practice & Experience Track), ACM (2013)
-
Elizabeth Foss, Hilary Hutchinson, Allison Druin, Jason Yip, Whitney Ford, Evan Golub
Journal of the American Society for Information Science and Technology, vol. 64(1) (2013), pp. 173-189
-
Brad Green, Shyam Seshadri
O'Reilly (2013), pp. 196
-
Capturing the functionality of Web services with functional descriptions
Ruben Verborgh, Thomas Steiner, Davy Van Deursen, Jos De Roo, Rik Van de Walle, Joaquim Gabarró Vallés
Multimedia Tools Appl., vol. 64 (2013), pp. 365-387
-
Crawling deep web entity pages
Yeye He, Dong Xin, Venkatesh Ganti, Sriram Rajaraman, Nirav Shah
WSDM (2013), pp. 355-364
-
Distributed affordance: an open-world assumption for hypermedia
Ruben Verborgh, Michael Hausenblas, Thomas Steiner, Erik Mannens, Rik Van de Walle
WWW (Companion Volume) (2013), pp. 1399-1406
-
Packt Publishing, Packt Publishing Limited, 2nd Floor, Livery Place, 35 Livery Street, Birmingham, B3 2PB (2013)
-
Learning to Rank Recommendations with the k-Order Statistic Loss
Jason Weston, Hector Yee, Ron Weiss
ACM International Conference on Recommender Systems (RecSys) (2013)
-
Modelling Score Distributions Without Actual Scores
Stephen Robertson, Evangelos Kanoulas, Emine Yilmaz
Proceedings of the 2013 Conference on the Theory of Information Retrieval, ACM, New York, NY, USA, pp. 85-92
-
Nearest Neighbor Search in Google Correlate
Dan Vanderkam, Rob Schonberger, Henry Rowley, Sanjiv Kumar
Google (2013)
-
Proceedings of the 2013 Conference on the Theory of Information Retrieval
Oren Kurland, Donald Metzler, Christina Lioma, Birger Larsen, Peter Ingwersen
ACM (2013)
-
R-Score: Reputation-based Scoring of Research Groups
Sabir Ribas, Berthier A. Ribeiro-Neto, Edmundo de Souza e Silva, Nivio Ziviani
CoRR, vol. abs/1308.5286 (2013)
-
Random Grids: Fast Approximate Nearest Neighbors and Range Searching for Image Search
Dror Aiger, Efi Kokiopoulou, Ehud Rivlin
ICCV 2013
-
Real-time communications for the web
Cullen Jennings, Ted Hardie, Magnus Westerlund
Communications Magazine, IEEE, vol. 51 (2013), pp. 20-26
-
Top-k Publish-Subscribe for Social Annotation of News
Alexander Shraer, Maxim Gurevich, Marcus Fontoura, Vanja Josifovski
Proceedings of the 39th International Conference on Very Large Data Bases, VLDB Endowment (2013)
-
Philippe Hamel, Matthew E. P. Davies, Kazuyoshi Yoshii, Masataka Goto
14th International Conference on Music Information Retrieval (ISMIR '13) (2013)
-
Web Workers: Multithreaded Programs in JavaScript
O'Reilly, 1005 Gravenstein Hwy N Sebastopol, CA 95472 (2013), pp. 62
-
A Cross-Lingual Dictionary for English Wikipedia Concepts
Valentin I. Spitkovsky, Angel X. Chang
Eighth International Conference on Language Resources and Evaluation (LREC 2012)
-
A Social Description Revolution - Describing Web APIs' Social Parameters with RESTdesc
Ruben Verborgh, Thomas Steiner, Joaquim Gabarró, Erik Mannens, Rik Van de Walle
AAAI Spring Symposium: Intelligent Web Services Meet Social Computing (2012)
-
An Integrated Framework for Spatio-Temporal-Textual Search and Mining
Bingsheng Wang, Haili Dong, Arnold Boedihardjo, Chang-Tien Lu, Harland Yu, Ing-Ray Chen, Jing Dai
20th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (ACM SIGSPATIAL GIS 2012), ACM, 2 Penn Plaza, Suite 701, New York, NY 10121, pp. 570-573
-
Angular Quantization-based Binary Codes for Fast Similarity Search
Yunchao Gong, Sanjiv Kumar, Vishal Verma, Svetlana Lazebnik
Neural Information Processing Systems (NIPS) (2012)
-
Beyond Web Developer Tools: Strace
Web Performance Daybook Volume Two, O'Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472 (2012), pp. 119-121
-
Compact Hyperplane Hashing with Bilinear Functions
Wei Liu, Jun Wang, Yadong Mu, Sanjiv Kumar, Shih-Fu Chang
International Conference on Machine Learning (ICML) (2012)
-
Context-aware querying for multimodal search engines
Jonas Etzold, Arnaud Brousseau, Paul Grimm, Thomas Steiner
Proceedings of the 18th international conference on Advances in Multimedia Modeling, Springer-Verlag, Berlin, Heidelberg (2012), pp. 728-739
-
DOHA: scalable real-time web applications through adaptive concurrent execution
Aiman Erbad, Norman C. Hutchinson, Charles Krasic
Proceedings of the 21st international conference on World Wide Web, ACM, New York, NY, USA (2012), pp. 161-170
-
Dart: Up and Running
O'Reilly Media, 1005 Gravenstein Highway North Sebastopol, CA 95472 USA (2012)
-
Experimental methods for information retrieval
Donald Metzler, Oren Kurland
SIGIR (2012), pp. 1185-1186
-
Hokusai - Sketching Streams in Real Time
Sergiy Matusevych, Alex Smola, Amr Ahmed
Proceedings of the 28th International Conference on Uncertainty in Artificial Intelligence (UAI) (2012)
-
I-SEARCH: a multimodal search engine based on rich unified content description (RUCoD)
Thomas Steiner, Lorenzo Sutton, Sabine Spiller, Marilena Lazzaro, Francesco Nucci, Vincenzo Croce, Alberto Massari, Antonio Camurri, Anne Verroust-Blondet, Laurent Joyeux, Jonas Etzold, Paul Grimm, Athanasios Mademlis, Sotiris Malassiotis, Petros Daras, Apostolos Axenopoulos, Dimitrios Tzovaras
Proceedings of the 21st international conference companion on World Wide Web, ACM, New York, NY, USA (2012), pp. 291-294
-
IR paradigms in computational advertising
SIGIR (2012), pp. 1019
-
Indexing the World Wide Web: The Journey So Far
Next Generation Search Engines: Advanced Models for Information Retrieval, IGI-Global (2012), pp. 1-28
-
Latent Collaborative Retrieval
Jason Weston, Chong Wang, Ron Weiss, Adam Berenzweig
International Conference on Machine Learning (2012)
-
Jason Weston, John Blitzer
UAI (2012)
-
Nowcasting the macroeconomy with search engine data
Hal R. Varian
Proceedings of the fifth ACM international conference on Web search and data mining, ACM, New York, NY, USA (2012), pp. 1-2
-
On the Difficulty of Nearest Neighbor Search
Junfeng He, Sanjiv Kumar, Shih-Fu Chang
International Conference on Machine Learning (ICML) (2012)
-
Online Selection of Diverse Results
Debmalya Panigrahi, Atish Das Sarma, Gagan Aggarwal, Andrew Tomkins
Proceedings of the 5th ACM international Conference on Web Search and Data Mining (2012), pp. 263-272
-
Participatory design of social search experiences
Nick Matterson, David Choi
Proceedings of the 2012 ACM annual conference extended abstracts on Human Factors in Computing Systems Extended Abstracts, ACM, New York, NY, USA, pp. 1937-1942
-
Spotting fake reviewer groups in consumer reviews
Arjun Mukherjee, Bing Liu, Natalie Glance
Proceedings of the 21st international conference on World Wide Web, ACM, New York, NY, USA (2012), pp. 191-200
-
The Shoebox and the Safe: When Once-Personal Information Changes Hands
Proceedings of the 5th International Workshop on Personal Information Management at CSCW 2012
-
Topical clustering of search results
Ugo Scaiella, Paolo Ferragina, Andrea Marino, Massimiliano Ciaramita
Proceedings of the fifth ACM international conference on Web search and data mining, ACM, New York, NY, USA (2012), pp. 223-232
-
Towards a High Quality and Web-Scalable Table Search Engine
Proceedings of the Third International Workshop on Keyword Search on Structured Data (2012), pp. 1-1
-
Web Search - Challenges and Opportunities
AMW (2012), pp. 16-17
-
Who knows?: searching for expertise on the social web: technical perspective.
Commun. ACM, vol. 55, 4 (2012), pp. 110-110
-
YouTube around the world: geographic popularity of videos
Anders Brodersen, Salvatore Scellato, Mirjam Wattenhofer
Proceedings of the 21st international conference on World Wide Web, ACM, New York, NY, USA (2012), pp. 241-250
-
A Four Group Cross-Over Design for Measuring Irreversible Treatments on Web Search Tasks
Li Ma, David Mease, Daniel M. Russell
Proceedings of Hawaii International Conference on System Sciences (HICSS) (2011), pp. 1-9
-
A generic Web-based entity resolution framework
Denilson Alves Pereira, Berthier A. Ribeiro-Neto, Nivio Ziviani, Alberto H. F. Laender, Marcos André Gonçalves
JASIST, vol. 62 (2011), pp. 919-932
-
A generic Web-based entity resolution framework
Denilson Alves Pereira, Berthier Ribeiro-Neto, Nivio Ziviani, Alberto H. F. Laender, Marcos Gonçalves
Journal of the American Society for Information Science and Technology, vol. 62 (2011), pp. 919-932
-
Analysis of an Expert Search Query Log
Yi Fang, Naveen Somasundaram, Luo Si, Jeongwoo Ko, Aditya P. Mathur
SIGIR (2011)
-
Context-sensitive query auto-completion
Ziv Bar-Yossef, Naama Kraus
Proceedings of the 20th International Conference on World Wide Web (WWW) (2011), pp. 107-116
-
CrowdForge: Crowdsourcing Complex Work
Aniket Kittur, Boris Smus, Susheel Khamkar, Robert Kraut
Proceedings of UIST 2011, Santa Barbara, CA
-
O'Reilly Media, 1005 Gravenstein Highway North Sebastopol, CA 95472 (2011), pp. 50
-
DiversiWeb 2011
Elena Paslaru Bontas Simperl, Devika P. Madalli, Denny Vrandecic, Enrique Alfonseca
SIGIR Forum, vol. 45 (2011), pp. 49-53
-
DiversiWeb 2011: first international workshop on knowledge diversity on the web
Elena Paslaru Bontas Simperl, Devika P. Madalli, Denny Vrandecic, Enrique Alfonseca
WWW (Companion Volume) (2011), pp. 319-320
-
Efficient Runtime Service Discovery and Consumption with Hyperlinked RESTdesc
Ruben Verborgh, Thomas Steiner, Davy Van Deursen, Rik Van de Walle, Joaquim Gabarro
The 7th International Conference on Next Generation Web Services Practices (NWeSP 2011), Salamanca, Spain
-
Estimating the size of online social networks
Shaozhi Ye, S. Felix Wu
International Journal of Social Computing and Cyber-Physical Systems, vol. 1 (2011), pp. 160-179
-
Fulfilling the Hypermedia Constraint Via HTTP OPTIONS, the HTTP Vocabulary In RDF, and Link Headers
Thomas Steiner, Jan Algermissen
Proceedings of the Second International Workshop on RESTful Design, ACM, New York, NY, USA (2011), pp. 11-14
-
Matt Mohebbi, Dan Vanderkam, Julia Kodysh, Rob Schonberger, Hyunyoung Choi, Sanjiv Kumar
Google (2011)
-
Learning to Search Efficiently in High Dimensions
Zhen Li, Huazhong Ning, Liangliang Cao, Tong Zhan, Yihong Gong, Thomas S. Huang
Neural Information Processing Systems (2011)
-
Modern Information Retrieval - the concepts and technology behind search, Second edition
Ricardo A. Baeza-Yates, Berthier A. Ribeiro-Neto
Pearson Education Ltd., Harlow, England (2011)
-
Recovering Semantics of Tables on the Web
Petros Venetis, Alon Y. Halevy, Jayant Madhavan, Marius Pasca, Warren Shen, Fei Wu, Gengxin Miao
Proceedings of the VLDB Endowment, vol. 4 (2011), pp. 528-538
-
Reputation Systems for Open Collaboration
B.T. Adler, L. de Alfaro, A. Kulshrestra, I. Pye
Communications of the ACM, vol. 54 No. 8 (2011), pp. 81-87
-
The Future of Browsers - A Primer for HTML5 and Other Modern Browser Game Technologies
Game Developer Magazine, vol. 18 #5 (2011), pp. 35-41
-
The Need for Music Information Retrieval with User-Centered and Multimodal Strategies
Cynthia C.S. Liem, Meinard Müller, Douglas Eck, George Tzanetakis, Alan Hanjalic
MIRUM '11, ACM, Scottsdale, Arizona (2011), pp. 1-6
-
The Snap Framework: A Web Toolkit for Haskell
Gregory Collins, Doug Beardsley
IEEE Internet Computing, vol. 15 (2011), pp. 84-87
-
Capacity Planning for Vertical Search Engines
Claudine Santos Badue, Jussara M. Almeida, Virgilio Almeida, Ricardo A. Baeza-Yates, Berthier A. Ribeiro-Neto, Artur Ziviani, Nivio Ziviani
CoRR, vol. abs/1006.5059 (2010)
-
Children's Roles Using Keyword Search Interfaces in the Home
Allison Druin, Elizabeth Foss, Hilary Hutchinson, Evan Golub, Leshell Hatley
Proceedings of CHI 2010, ACM Press
-
Clustering Query Refinements by User Intent
Eldar Sadikov, Jayant Madhavan, Lu Wang, Alon Halevy
Proceedings of the International World Wide Web Conference (WWW) (2010)
-
Combining Evidence with a Probabilistic Framework for Answer Ranking and Answer Merging in Question Answering
Jeongwoo Ko, Luo Si, Eric Nyberg
Information Processing and Management, vol. 46 (2010), pp. 541-554
-
Generalized Syntactic and Semantic Models of Query Reformulation
Amac Herdagdelen, Massimiliano Ciaramita, Daniel Mahler, Maria Holmqvist, Keith Hall, Stefan Riezler, Enrique Alfonseca
Proceedings of SIGIR-2010
-
Google Squared: web scale, open domain information extraction and presentation
Dan Crow
European Conference on Information Retrieval, Industry Day (2010)
-
How Google is using Linked Data Today and Vision For Tomorrow
Thomas Steiner, Raphael Troncy, Michael Hausenblas
Proceedings of Linked Data in the Future Internet at the Future Internet Assembly (FIA 2010), Ghent, December 2010
-
Information Retrieval: Implementing and Evaluating Search Engines
Stefan Buettcher, Charles L. A. Clarke, Gordon V. Cormack
MIT Press, Cambridge, MA (2010)
-
Personalized News Recommendation Based on Click Behavior
Jiahui Liu, Elin Pedersen, Peter Dolan
2010 International Conference on Intelligent User Interfaces
-
Probabilistic Models for Answer Ranking in Multilingual Question Answering
Jeongwoo Ko, Luo Si, Eric Nyberg, Teruko Mitamura
Transactions on Information Systems (2010)
-
Quantitative Analysis of Culture Using Millions of Digitized Books
Jean-Baptiste Michel, Yuan Kui Shen, Aviva Presser Aiden, Adrian Veres, Matthew K. Gray, The Google Books Team, Joseph P. Pickett, Dale Holberg, Dan Clancy, Peter Norvig, Jon Orwant, Steven Pinker, Martin A. Nowak, Erez Lieberman Aiden
Science (2010)
-
Query Difficulty Prediction for Contextual Image Retrieval
Xing Xing, Yi Zhang, Mei Han
32nd European Conference on Information Retrieval (ECIR'10) (2010)
-
Query Rewriting using Monolingual Statistical Machine Translation
Stefan Riezler, Yi Liu
Computational Linguistics, vol. 36 (2010)
-
Research trails: getting back where you left off
Jiahui Liu, Peter Jin Hong, Elin Rønby Pedersen
Proceedings of the 19th international conference on World Wide Web, ACM, Raleigh, North Carolina (2010), pp. 1151-1152
-
Search flavours - recent updates and trends
SIGIR (2010)
-
Shopping for Top Forums: Discovering Online Discussion for Product Research
Jonathan Elsas, Natalie Glance
KDD SOMA 2010 Workshop on Social Media Analytics
-
Stochastic Models for Tabbed Browsing
Flavio Chierichetti, Ravi Kumar, Andrew Tomkins
Proceedings of the 19th international conference on World Wide Web, ACM, Raleigh, North Carolina (2010), pp. 241-250
-
User browsing models: relevance versus examination
Ramakrishnan Srikant, Sugato Basu, Ni Wang, Daryl Pregibon
Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, Washington, DC (2010), pp. 223-232
-
Using structural information to improve search in Web collections
Edleno Silva de Moura, David Fernandes, Berthier A. Ribeiro-Neto, Altigran Soares da Silva, Marcos André Gonçalves
JASIST, vol. 61 (2010), pp. 2503-2513
-
What the web can't do
David A. Shamma, Seth Fitzsimmonds, Joe Gregorio, Adam Hupp, Ramesh Jain, Kevin Marks
Proceedings of the 19th international conference on World Wide Web, ACM, Raleigh, North Carolina (2010), pp. 1341-1342
-
7th workshop on large-scale distributed systems for information retrieval (LSDS-IR'09)
Claudio Lucchese, Gleb Skobeltsyn, Wai Gen Yee
SIGIR forum, vol. 43 (2009), pp. 34-40
-
A Simple Linear Ranking Algorithm Using Query Dependent Intercept Variables
Nir Ailon
ECIR 2009 (to appear)
-
An Audio Indexing System for Election Video Material
Christopher Alberti, Michiel Bacchiani, Ari Bezman, Ciprian Chelba, Anastassia Drofa, Hank Liao, Pedro Moreno, Ted Power, Arnaud Sahuguet, Maria Shugrina, Olivier Siohan
Proceedings of ICASSP (2009), pp. 4873-4876
-
Answer typing for information retrieval
Christopher Pinchak, Davood Rafiei, Dekang Lin
Proceeding of the 18th ACM conference on Information and knowledge management (CIKM), ACM, Hong Kong (2009), pp. 1955-1958
-
Challenges in building large-scale information retrieval systems: invited talk
WSDM '09: Proceedings of the Second ACM International Conference on Web Search and Data Mining, ACM, New York, NY, USA (2009), pp. 1-1
-
Do not crawl in the DUST: Different URLs with similar text
Ziv Bar-Yossef, Idit Keidar, Uri Schonfeld
ACM Transactions on the Web, vol. 3 (2009), pp. 3
-
Estimating the ImpressionRank of Web Pages
Ziv Bar-Yossef, Maxim Gurevich
Proceedings of the 18th International Conference on World Wide Web (WWW) (2009), pp. 41-50
-
Evaluating web search using task completion time
Ya Xu, David Mease
SIGIR '09: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, ACM, New York, NY, USA (2009), pp. 676-677
-
Expected reciprocal rank for graded relevance
Olivier Chapelle, Donald Metzler, Ya Zhang, Pierre Grinspan
CIKM '09: Proceeding of the 18th ACM conference on Information and knowledge management, ACM, New York, NY, USA (2009), pp. 621-630
-
Fancy a Drink in Canary Wharf?: A User Study on Location-Based Mobile Search
Alia Amin, Sian Townsend, Jacco Ossenbruggen, Lynda Hardman
INTERACT '09: Proceedings of the 12th IFIP TC 13 International Conference on Human-Computer Interaction, Springer-Verlag, Berlin, Heidelberg (2009), pp. 736-749
-
Going Beyond Gzipping
Even Faster Web Sites, O'Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472 (2009), pp. 121-132
-
Good Abandonment in Mobile and PC Internet Search
Jane Li, Scott Huffman, Akihito Tokuda
32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, ACM (Association for Computing Machinery), 2 Penn Plaza, Suite 701, New York 10121-0701 (2009), pp. 43-50
-
Harnessing the Deep Web: Present and Future
Jayant Madhavan, Loredana Afanasiev, Lyublena Antova, Alon Halevy
Proceedings of the Conference on Innovative Data system Research (CIDR) (2009)
-
Harvesting Relational Tables from Lists on the Web
Hazem Elmeleegy, Jayant Madhavan, Alon Halevy
Proceedings of the VLDB Endowment (PVLDB) (2009), pp. 1078-1089
-
High precision retrieval using relevance-flow graph
Jangwon Seo, Jiwoon Jeon
SIGIR '09: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, ACM, New York, NY, USA (2009), pp. 694-695
-
How opinions are received by online communities: A case study on Amazon.com helpfulness votes
Cristian Danescu-Niculescu-Mizil, Gueorgi Kossinets, Jon Kleinberg, Lillian Lee
Proceedings of the 18th International Conference on World Wide Web, WWW 2009, Madrid, Spain, April 20-24, 2009, pp. 141-150
-
Incremental Crawling
Encyclopedia of Database Systems, Springer, New York (2009), pp. 1417-1421
-
Information arbitrage across multi-lingual Wikipedia
Eytan Adar, Michael Skinner, Daniel S. Weld
WSDM '09: Proceedings of the Second ACM International Conference on Web Search and Data Mining, ACM, New York, NY, USA (2009), pp. 94-103
-
Information extraction meets relation databases
Davood Rafiei, Andrei Broder, Edward Chang, Patrick Pantel
CIKM '09: Proceeding of the 18th ACM conference on Information and knowledge management, ACM, New York, NY, USA (2009), pp. 897-897
-
Modeling similarity in the age of data
MAA (2009)
-
Reciprocal rank fusion outperforms condorcet and individual rank learning methods
Gordon V. Cormack, Charles L A Clarke, Stefan Buettcher
SIGIR '09: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, ACM, New York, NY, USA (2009), pp. 758-759
-
Search Engines: Information Retrieval in Practice
W. Bruce Croft, Donald Metzler, Trevor Strohman
Addison Wesley (2009)
-
The impact of result abstracts on task completion time.
Rehan Khan, David Mease, Rajan Patel
WWW 2009 Proceedings
-
Topic and Trend Detection in Text Collections Using Latent Dirichlet Allocation
Levent Bolelli, Şeyda Ertekin, C. Lee Giles
ECIR '09: Proceedings of the 31th European Conference on IR Research on Advances in Information Retrieval, Springer-Verlag, Berlin, Heidelberg (2009), pp. 776-780
-
Using web information for author name disambiguation
Denilson Alves Pereira, Berthier A. Ribeiro-Neto, Nivio Ziviani, Alberto H. F. Laender, Marcos André Gonçalves, Anderson A. Ferreira
JCDL (2009), pp. 49-58
-
Web Derived Pronunciations for Spoken Term Detection
Doğan Can, Erica Cooper, Arnab Ghoshal, Martin Jansche, Sanjeev Khudanpur, Bhuvana Ramabhadran, Michael Riley, Murat Saraçlar, Abhinav Sethy, Morgan Ulinski, Christopher White
32nd Annual International ACM SIGIR Conference (2009), pp. 83-90
-
YouTube's Collaborative Annotations
Michael Fink, Sigalit Bar, Aviad Bazilai, Nir Kerem, Isaac Elias, Julian Frumar, Herb Ho, Ryan Junee, Simon Ratner, Jasson Schrock, Ran Tavory
Webcentives (2009), pp. 18-19
-
Eye Monitoring in Online Search
Laura A. Granka, Matthew Feusner, Lori Lorigo
Passive Eye Monitoring, Springer Verlag, 69121 Heidelberg, Germany (2008), pp. 283-304
-
Generating Diverse Katakana Variants Based on Phonemic Mapping
Kazuhiro Seki, Hiroyuki Hattori, Kuniaki Uehara
Proc. 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, ACM, Singapore (2008), pp. 793-794
-
Generating Links by Mining Quotations
Hypertext, Pittsburgh, Pennsylvania, USA (2008), pp. 117-126
-
Google's Deep-Web Crawl
Jayant Madhavan, David Ko, Lucja Kot, Vignesh Ganapathy, Alex Rasmussen, Alon Halevy
Proceedings of the International Conference on Very Large Databases (VLDB) (2008)
-
How evaluator domain expertise affects search result relevance
Kenneth A. Kinney, Scott B. Huffman, Juting Zhai
Conference on Information and Knowledge Management (2008), pp. 591-598
-
Income Inequality in the Attention Economy
Google, Inc. (2008)
-
Learning From Labeled Features Using Generalized Expectation Criteria
Gregory Druck, Gideon Mann, Andrew McCallum
Proc. 31st International ACM SIGIR Conference on Research and Development in Information Retrieval, ACM, Singapore (2008), pp. 595-602
-
Local Approximation of PageRank and Reverse PageRank
Ziv Bar-Yossef, Li-Tal Mashiach
Proceedings of the 17th ACM Conference on Information and Knowledge Management (CIKM) (2008), pp. 279-288
-
Microscale Evolution of Web Pages
Sean O'Brien, Carrie Grimes
WWW 2008
-
Mining Search Engine Query Logs via Suggestion Sampling
Ziv Bar-Yossef, Maxim Gurevich
Proceedings of the VLDB Endowment (2008), pp. 54-65
-
Next-generation Digital Earth. A position paper from the Vespucci Initiative for the Advancement of Geographic Information Science
M. Craglia, M.F. Goodchild, A. Annoni, G. Camara, M. Gould, W. Kuhn, D.M. Mark, I. Masser, D.J. Maguire, S. Liang, E. Parsons
International Journal of Spatial Data Infrastructure Research, vol. 3 (2008), pp. 146-167
-
Random sampling from a search engine's index
Ziv Bar-Yossef, Maxim Gurevich
Journal of the ACM, vol. 55 (2008)
-
Retrieval models for question and answer archives
Xiaobing Xue, Jiwoon Jeon, W. Bruce Croft
SIGIR (2008), pp. 475-482
-
Rich Media and Web 2.0
Edward Chang, Ken Ong, Susanne Boll, Wei-Ying Ma
Proc. 17th International Conference on World Wide Web, ACM, Beijing (2008), pp. 1259-1259
-
The Mobile Web is Structurally Different
Apoorva Jindal, Chris Crutchfield, Samir Goel, Ravi Jain, Ravi Kolluri
11th IEEE Global Internet Symposium (2008)
-
Translating Queries into Snippets for Improved Query Expansion
Stefan Riezler, Yi Liu, Alexander Vasserman
Proceedings of the 22nd International Conference on Computational Linguistics (COLING'08), Manchester, England (2008)
-
Using Web Information for Creating Publication Venue Authority Files
Denilson Alves Pereira, Berthier Ribeiro-Neto, Nivio Ziviani, Alberto H. F. Laender
Proc. ACM/IEEE Joint Conference on Digital Libraries, ACM, Pittsburgh (2008), pp. 295-304
-
WCAG 2.0: A Web Accessibility Standard for the Evolving Web
Loretta Guarino Reid, Andi Snow-Weaver
Proceedings of the 2008 International Cross-disciplinary Conference on Web Accessibility (W4A)
-
Web-scale extraction of structured data.
Michael Cafarella, Jayant Madhavan, Alon Halevy
SIGMOD Record, vol. 37(4) (2008), pp. 55-61
-
A Fact/Opinion Classifier for News Articles
Adam Stepinski, Vibhu Mittal
Proc. 30th SIGIR, ACM, Amsterdam (2007), pp. 807-808
-
ASAP: An Advertisement-based Search Algorithm for Unstructured Peer-to-peer Systems
Peng Gu, Jun Wang, Hailong Cai
Proc. International Conference on Parallel Processing (ICPP), IEEE Computer Society (2007), pp. 8
-
Analyzing imbalance among homogeneous index servers in a web search system
Claudine Santos Badue, Ricardo A. Baeza-Yates, Berthier A. Ribeiro-Neto, Artur Ziviani, Nivio Ziviani
Inf. Process. Manage., vol. 43 (2007), pp. 592-608
-
Automatic and Versatile Publications Ranking for Research Institutions and Scholars
Jie Ren, Richard N. Taylor
Communications of the ACM, vol. 50, no. 6 (2007), pp. 81-85
-
Corroborate and learn facts from the web
Shubin Zhao, Jonathan Betz
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, San Jose (2007), pp. 995-1003
-
Detecting near-duplicates for web crawling
Gurmeet Singh Manku, Arvind Jain, Anish Das Sarma
WWW 2007 (16th International Conference on the World Wide Web), ACM, Banff, pp. 141-150
-
Do Not Crawl in the DUST: Different URLs with Similar Text
Ziv Bar-Yossef, Idit Keidar, Uri Schonfeld
WWW (2007), pp. 111-120
-
Efficient Search Engine Measurements
Ziv Bar-Yossef, Maxim Gurevich
WWW (2007), pp. 401-410
-
Efficient Search Ranking in Social Networks
Monique V. Vieira, Bruno M. Fonseca, Rodrigo Damazio, Paulo B. Golgher, Davi de Castro Reis, Berthier Ribeiro-Neto
Proc. CIKM, ACM, Lisboa, Portugal (2007)
-
Google News Personalization: Scalable Online Collaborative Filtering
Abhinandan Das, Mayur Datar, Ashutosh Garg, Shyam Rajaram
Proceedings of WWW 2007, pp. 271-280
-
How well does result relevance predict session satisfaction?
Scott B. Huffman, Michael Hochster
Proceedings of the 30th annual international ACM SIGIR, ACM, Amsterdam (2007), pp. 567-574
-
Learning people annotation from the web via consistency learning
Jay Yagnik, Atiq Islam
Proc. International Workshop on Multimedia Information Retrieval, ACM, Augsburg, Germany (2007), pp. 285-290
-
Multiple-Signal Duplicate Detection for Search Evaluation
Scott Huffman, April Lehman, Alexei Stolboushkin, Howard Wong-Toi, Fan Yang, Hein Roehrig
Proceedings of 30th Annual International ACM SIGIR Conference, ACM (2007), pp. 223-230
-
Query logs alone are not enough
Carrie Grimes, Diane Tang, Daniel Russell
WWW 2007 Workshop on Query Log Analysis: Social and Technological Changes
-
Statistical Machine Translation for Query Expansion in Answer Retrieval
Stefan Riezler, Alexander Vasserman, Ioannis Tsochantaridis, Vibhu Mittal, Yi Liu
Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL'07), Prague, Czech Republic (2007)
-
A Web-based Kernel Function for Measuring the Similarity of Short Text Snippets
Mehran Sahami, Tim Heilman
Proceedings of the Fifteenth International World Wide Web Conference, Edinburgh, Scotland (2006), pp. 377-386
-
A comparative study of citations and links in document classification
Thierson Couto, Marco Cristo, Marcos André Gonçalves, Pável Calado, Nivio Ziviani, Edleno Silva de Moura, Berthier A. Ribeiro-Neto
JCDL (2006), pp. 75-84
-
Browsing on Small Screens: Recasting Web-Page Segmentation into an Efficient Machine Learning Framework
Proceedings of the Fifteenth International World Wide Web Conference, Edinburgh, Scotland (2006)
-
Finding Near-Duplicate Web Pages: A Large-Scale Evaluation of Algorithms
Monika Henzinger
Proc. SIGIR, ACM (2006)
-
Identity management on converged networks: a reality check
Arnaud Sahuguet, Stefan Brands, Kim Cameron, Conor Cahill, Aude Pichelin, Fulup Ar Foll, Mike Neuenschwander
WWW (2006), pp. 747
-
Indexing Shared Content in Information Retrieval Systems
Andrei Z. Broder, Nadav Eiron, Marcus Fontoura, Michael Herscovici, Ronny Lempel, John McPherson, Runping Qi, Eugene J. Shekita
EDBT (2006), pp. 313-330
-
Introduction to the special issue on XML retrieval
Ricardo Baeza-Yates, Norbert Fuhr, Yoelle Maarek
ACM Transactions on Information Systems, vol. 24 (2006), pp. 405-406
-
Learning to Advertise
Anísio Lacerda, Marco Cristo, Marcos André Gonçalves, Weiguo Fan, Nivio Ziviani, Berthier Ribeiro-Neto
Proc. SIGIR, ACM Press, Seattle (2006), pp. 549-556
-
Retroactive Answering of Search Queries
Beverly Yang, Glen Jeh
Proc. International World Wide Web Conference, ACM, Edinburgh, Scotland (2006), pp. 457-466
-
Semantic Search via XML Fragments: A High Precision Approach to IR
Jennifer Chu-Carroll, John Prager, Krzysztof Czuba, David Ferrucci, Pablo Duboue
Proc. 29th ACM SIGIR Conference on Research and Development in Information Retrieval, ACM, Seattle, WA (2006), pp. 445-452
-
The Mobile Web in Developing Countries
W3C Workshop on the Mobile Web in Developing Countries, W3C (2006)
-
Using annotations in enterprise search
Pavel A. Dmitriev, Nadav Eiron, Marcus Fontoura, Eugene Shekita
WWW (2006), pp. 811-817
-
Web mining with search engines: A web-based kernel function for measuring the similarity of short text snippets
Mehran Sahami, Timothy D. Heilman
Proc. 15th International World Wide Web Conference, ACM, Edinburgh, Scotland (2006), pp. 377-386
-
Challenges in running a commercial search engine
SIGIR (2005), pp. 432
-
Concept-based interactive query expansion
Bruno M. Fonseca, Paulo Braz Golgher, Bruno Possas, Berthier A. Ribeiro-Neto, Nivio Ziviani
CIKM (2005), pp. 696-703
-
Current trends in the integration of searching and browsing
Andrei Z. Broder, Yoelle S. Maarek, Krishna Bharat, Susan T. Dumais, Steve Papa, Jan O. Pedersen, Prabhakar Raghavan
WWW (Special interest tracks and posters) (2005), pp. 793
-
Hyperlink analysis on the world wide web
Monika Rauch Henzinger
Hypertext (2005), pp. 1-3
-
Information Discovery--Needles and Haystacks
Carl Lagoze, Amit Singhal
IEEE Internet Computing, vol. 9 (2005), pp. 16-18
-
Thresher: automating the unwrapping of semantic content from the World Wide Web
Andrew Hogue, David Karger
WWW '05: Proceedings of the 14th international conference on World Wide Web, ACM Press, New York, NY, USA (2005), pp. 86-95
-
Algorithmic Aspects of Web Search Engines
Monika Rauch Henzinger
ESA (2004), pp. 3
-
Internet Searching
Computer Science: Reflections on the Field, Reflections from the Field, Computer Science and Telecommunications Board of the National Academies (2004)
-
The Happy Searcher: Challenges in Web Information Retrieval
Mehran Sahami, Vibhu Mittal, Shumeet Baluja, Henry A. Rowley
The Eighth Pacific Rim International Conference on Artificial Intelligence (PRICAI-2004)
-
The Past, Present and Future of Web Information Retrieval
Monika Rauch Henzinger
PODS (2004), pp. 46
-
The Past, Present, and Future of Web Search Engines
Monika Rauch Henzinger
ICALP (2004), pp. 3
-
Extracting knowledge from the World Wide Web
Monika Henzinger, Steve Lawrence
Mapping Knowledge Domains, National Academy of Sciences, USA, Irvine, CA (2003)
-
Patterns on the Web
Krishna Bharat
SPIRE (2003), pp. 1-15
-
Monika Henzinger, Bay-Wei Chang, Brian Milch, Sergey Brin
Proceedings of the 12th International World Wide Web Conference (WWW-2003), Budapest, Hungary
-
eBizSearch: An OAI-Compliant Digital Library for eBusiness
Yves Petinot, Pradeep B. Teregowda, Hui Han, C. Lee Giles, Steve Lawrence, Arvind Rangaswamy, Nirmal Pal
JCDL (2003), pp. 199-209
-
eBizSearch: a niche search engine for e-business
C. Lee Giles, Yves Petinot, Pradeep B. Teregowda, Hui Han, Steve Lawrence, Arvind Rangaswamy, Nirmal Pal
SIGIR (2003), pp. 413-414
-
Modern Information Retrieval: A Brief Overview
IEEE Data Eng. Bull., vol. 24 (2001), pp. 35-43
-
Who Links to Whom: Mining Linkage between Web Sites
Krishna Bharat, Bay-Wei Chang, Monika Henzinger, Matthias Ruhl
IEEE International Conference on Data Mining (ICDM '01), San Jose, CA (2001)
-
A Comparison of Techniques to Find Mirrored Hosts on the WWW
Krishna Bharat, Andrei Z. Broder, Jeffrey Dean, Monika Rauch Henzinger
JASIS, vol. 51 (2000), pp. 1114-1122
-
The Anatomy of a Large-Scale Hypertextual Web Search Engine
Computer Networks, vol. 30 (1998), pp. 107-117