
A Biomimetic, 4.5 µW, 120+dB, Log-domain Cochlea Channel with AGC, Andreas G. Katsiamis, Emmanuel M. Drakakis, Richard F. Lyon, IEEE JSSC (Journal of Solid-State Circuits), vol. 44 (2009), pp. 1006-1022.
Adapting the Tesseract Open Source OCR Engine for Multilingual OCR, Ray Smith, Daria Antonova, Dar-Shyang Lee, 2009.
Adaptive, selective, automatic tonal enhancement of faces, Hrishikesh Aradhye, George D. Toderici, Jay Yagnik, ACM Multimedia, 2009, pp. 677-680.
Audiovisual Celebrity Recognition in Unconstrained Web Videos, Mehmet Sargin, Hrishikesh Aradhye, Pedro Moreno, Ming Zhao, Proceedings of the IEEE Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2009.
Automatic, Efficient, Temporally-Coherent Video Enhancement for Large Scale Applications, George Toderici, Jay Yagnik, ACM Multimedia, 2009, pp. 609-612.
Combined Orientation and Script Detection using the Tesseract OCR Engine, Ranjith Unnikrishnan, Ray Smith, Workshop on Multilingual OCR (MOCR), Proc. 10th Intl. Conf. on Document Analysis and Recognition (ICDAR), 2009.
Computer Vision Interfaces for Interactive Art, Andrew Senior, Alejandro Jaimes, Human-Centric Interfaces for Ambient Intelligence, 2009 (to appear).
Efficient and Robust Music Identification with Weighted Finite-State Transducers, Mehryar Mohri, Pedro Moreno, Eugene Weinstein, IEEE Transactions on Audio, Speech, and Language Processing, 2009 (to appear).
Google Newspaper Search – Image Processing and Analysis Pipeline, Krishnendu Chaudhury, Ankur Jain, Shobhit Saxena, 10th International Conference on Document Analysis and Recognition, ICDAR 2009, pp. 621-625.
Hybrid Page Layout Analysis via Tab-Stop Detection, Ray Smith, Proceedings of the 10th International Conference on Document Analysis and Recognition, 2009.
Image Reconstruction in the Gigavision Camera, Feng Yang, Luciano Sbaiz, Edoardo Charbon, Sabine Susstrunk, Martin Vetterli, ICCV Workshop OMNIVIS, 2009 (to appear).
LSH Banding for Large-Scale Retrieval with Memory and Recall Constraints, Michele Covell, Shumeet Baluja, International Conference on Acoustics, Speech, and Signal Processing, 2009.
Large-scale Privacy Protection in Google Street View, Andrea Frome, German Cheung, Ahmad Abdulkader, Marco Zennaro, Bo Wu, Alessandro Bissacco, Hartwig Adam, Hartmut Neven, Luc Vincent, IEEE International Conference on Computer Vision, 2009.
Low Cost Correction of OCR Errors Using Learning in a Multi-Engine Environment, Ahmad Abdulkader, Matthew R. Casey, Proceedings of the 10th International Conference on Document Analysis and Recognition, 2009.
Predictive Models for Music, Jean-Francois Paiement, Yves Grandvalet, Samy Bengio, Connection Science, vol. 21 (2009), pp. 253-272.
Privacy Protection in Video Surveillance, Andrew W. Senior, 2009.
Probabilistic Models for Melodic Prediction, Jean-Francois Paiement, Samy Bengio, Douglas Eck, Artificial Intelligence Journal, vol. 173 (2009), pp. 1266-1274.
Sound Ranking Using Auditory Sparse-Code Representations, Martin Rehn, Richard F. Lyon, Samy Bengio, Thomas C. Walters, Gal Chechik, ICML 2009 Workshop on Sparse Methods for Music Audio.
State of the Art in Example-based Texture Synthesis, Li-Yi Wei, Sylvain Lefebvre, Vivek Kwatra, Greg Turk, Eurographics 2009, State of the Art Report, EG-STAR.
Tour the World: building a web-scale landmark recognition engine, Yantao Zheng, Ming Zhao, Yang Song, Hartwig Adam, Ulrich Buddemeier, Alessandro Bissacco, Fernando Brucher, Tat-Seng Chua, Hartmut Neven, International Conference on Computer Vision and Pattern Recognition (CVPR), 2009.
A Distance Model for Rhythms, Jean-Francois Paiement, Yves Grandvalet, Samy Bengio, Douglas Eck, International Conference on Machine Learning (ICML), 2008.
A Generative Model for Rhythms, Jean-Francois Paiement, Samy Bengio, Yves Grandvalet, Doug Eck, Neural Information Processing Systems, Workshop on Brain, Music and Cognition, 2008.
A new baseline for image annotation, Ameesh Makadia, Vladimir Pavlovic, Sanjiv Kumar, European Conference on Computer Vision (ECCV 08), 2008.
Beyond Sliding Windows: Object Localization by Efficient Subwindow Search, Christoph H. Lampert, Matthew B. Blaschko, Thomas Hofmann, IEEE Computer Vision and Pattern Recognition (CVPR), 2008.
Deploying GOOG-411: Early Lessons in Data, Measurement, and Testing, Michiel Bacchiani, Francoise Beaufays, Johan Schalkwyk, Mike Schuster, Brian Strope, Proc. ICASSP, 2008.
Face Tracking and Recognition with Visual Constraints in Real-World Videos, Minyoung Kim, Sanjiv Kumar, Vladimir Pavlovic, Henry A. Rowley, Computer Vision and Pattern Recognition, 2008.
Fluid in Video: Augmenting Real Video with Simulated Fluids, Vivek Kwatra, Philippos Mordohai, Rahul Narain, Sashi Kumar Penta, Mark Carlson, Marc Pollefeys, Ming C. Lin, Comput. Graph. Forum (Proc. Eurographics), vol. 27 (2008), pp. 487-496.
Large Scale Learning and Recognition of Faces in Web Videos, Ming Zhao, Jay Yagnik, Hartwig Adam, David Bau, FG2008 (2008).
Large-Scale Manifold Learning, Ameet Talwalkar, Sanjiv Kumar, Henry A. Rowley, Computer Vision and Pattern Recognition, 2008.
Markovian Mixture Face Recognition with Discriminative Face Alignment, Ming Zhao, Automatic Face and Gesture Recognition, 2008.
Mass Personalization: Social and Interactive Applications using Sound-Track Identification, Michael Fink, Michele Covell, Shumeet Baluja, Journal of Multimedia Tools and Applications, vol. 36 (2008), pp. 115-132.
PageRank for Product Image Search, Yushi Jing, Shumeet Baluja, WWW-2008.
Permutation Grouping: Intelligent Hash Function Design for Audio & Image Retrieval, Shumeet Baluja, Michele Covell, Sergey Ioffe, International Conference on Acoustics, Speech and Signal Processing (ICASSP-2008).
Reducing Photon Mapping Bandwidth by Query Reordering, Joshua Steinhurst, Greg Coombe, Anselmo Lastra, IEEE Transactions on Visualization and Computer Graphics, vol. 14 (2008).
Speech Recognition with Weighted Finite-State Transducers, Mehryar Mohri, Fernando C. N. Pereira, Michael Riley, Handbook on Speech Processing and Speech Communication, Part E: Speech recognition, 2008.
Visual Synset: Towards a Higher-level Visual Representation, Yantao Zheng, Ming Zhao, Shi-Yong Neo, Tat-Seng Chua, Qi Tian, CVPR, 2008.
VisualRank: Applying PageRank to Large-Scale Image Search, Yushi Jing, Shumeet Baluja, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30 (2008), pp. 1877-1890.
Waveprint: Efficient Wavelet-Based Audio Fingerprinting, Shumeet Baluja, Michele Covell, Pattern Recognition (2008).
Web-scale Image Annotation, Jiakai Liu, Rong Hu, Meihong Wang, Yi Wang, Edward Chang, Pacific-Rim Conference on Multimedia, 2008 (to appear).
An Overview of the Tesseract OCR Engine, Ray Smith, Proc. Ninth Int. Conference on Document Analysis and Recognition (ICDAR), 2007, pp. 629-633.
Audio Fingerprinting: Combining Computer Vision & Data Stream Processing, Shumeet Baluja, Michele Covell, Proceedings of the 2007 International Conference on Acoustics, Speech, and Signal Processing.
Automated Image Orientation Detection: A Scalable Boosting Approach, Shumeet Baluja, Pattern Analysis and Applications (2007).
Automatic Alignment of Large-scale Aerial Rasters to Road-maps, James Xiaqing Wu, Rodrigo Carceroni, Hui Fang, Steve Zelinka, Andrew Kirmse, ACM GIS 2007.
Biometric Person Authentication IS A Multiple Classifier Problem, Samy Bengio, Johnny Mariéthoz, 7th International Workshop on Multiple Classifier Systems, 2007.
Boosting Sex Identification Performance, Shumeet Baluja, Henry A. Rowley, International Journal of Computer Vision, vol. 71 (2007), pp. 111-119.
Classification of Weakly-Labeled Data with Partial Equivalence Relations, Sanjiv Kumar, Henry A. Rowley, International Conference on Computer Vision, 2007.
Detail Preserving Shape Deformation in Image Editing, Hui Fang, John C. Hart, Proc. SIGGRAPH 2007, no. 12.
Efficient Complete and Incomplete Path Openings and Closings, Hugues Talbot, Ben Appleton, Image and Vision Computing, vol. 25, no. 4 (2007), pp. 416-425.
GRADE-IV: Visualizing Graphics Library Operations in an Executing Program, Hidehiko Abe, Takeo Igarashi, SIGGRAPH 2007 Posters, no. 118.
Google Books: Making the public domain universally accessible, Adam Langley, Dan Bloomberg, SPIE, vol. 6500 (2007), 65000H1-65000H10.
Known-Audio Detection Using Waveprint: Spectrogram Fingerprinting By Wavelet Hashing, Michele Covell, Shumeet Baluja, Proceedings of the 2007 International Conference on Acoustics, Speech, and Signal Processing.
Music Identification with Weighted Finite-State Transducers, Eugene Weinstein, Pedro J. Moreno, Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2007.
Music identification, detection, and analysis in adverse conditions, Mehryar Mohri, Pedro J. Moreno, Eugene Weinstein, Proceedings of the International Conference on Music Information Retrieval (ISMIR), 2007.
Ordinal Regression Based Subpixel Shift Estimation for Video Super-Resolution, Mithun Das Gupta, Shyamsundar Rajaram, Thomas S. Huang, Nemanja Petrovic, EURASIP Journal on Advances in Signal Processing, vol. 85963 (2007).
Practical Gammatone-Like Filters for Auditory Modeling, Andreas G. Katsiamis, Emmanuel M. Drakakis, Richard F. Lyon, EURASIP Journal on Audio, Speech, and Music Processing, vol. 2007 (2007), pp. 12.
Practical MythTV: Building a PVR and Media Center PC, Michael Still, Stewart Smith, 2007, 350 pp.
Raising Global Awareness with Google Earth, Rebecca Moore, Imaging Notes, vol. 22, no. 2 (2007), pp. 24-29.
Temporally Consistent Reconstruction from Multiple Video Streams using Enhanced Belief Propagation, E. Scott Larsen, Philippos Mordohai, Marc Pollefeys, Henry Fuchs, Eleventh IEEE International Conference on Computer Vision, 2007.
Speech Recognition with Weighted Finite-State Transducers, Mehryar Mohri, Fernando C. N. Pereira, Michael Riley, Handbook on Speech Processing and Speech Communication, Part E: Speech recognition, 2007.
Advertisement Detection and Replacement using Acoustic and Visual Repetition, Michele Covell, Shumeet Baluja, Michael Fink, Proceedings of the 2006 International Workshop on Multimedia Signal Processing.
Content Fingerprinting Using Wavelets, Shumeet Baluja, Michele Covell, Proceedings of the Conference of Visual Media Production, 2006.
Globally Minimal Surfaces by Continuous Maximal Flows, Ben Appleton, Hugues Talbot, IEEE Trans. Pattern Anal. Mach. Intell., vol. 28 (2006), pp. 106-118.
Large Scale Image-Based Adult-Content Filtering, Henry A. Rowley, Yushi Jing, Shumeet Baluja, 1st International Conference on Computer Vision Theory, 2006.
Query by Semantic Example, Nikhil Rasiwasia, Nuno Vasconcelos, Pedro J. Moreno, CIVR, 2006, pp. 51-60.
Social- and Interactive-Television Applications Based on Real-Time Ambient-Audio Identification, Michael Fink, Michele Covell, Shumeet Baluja, European Interactive TV Conference (Euro-ITV), 2006.
Time-Scale Modification for 3G-Telephony Video, Michele Covell, Sumit Roy, Bo Shen, Proceedings of the 2006 International Workshop on Multimedia Signal Processing.
Boosting Sex Identification Performance, Shumeet Baluja, Henry A. Rowley, Proceedings of the Seventeenth Innovative Applications of Artificial Intelligence Conference, 2005, pp. 1508-1513.
Large Scale Performance Measurement of Content-Based Automated Image-Orientation Detection, Shumeet Baluja, Henry A. Rowley, International Conference on Image Processing, 2005.
The Definitive Guide to ImageMagick, Michael Still, 2005, 335 pp.
Efficient Face Orientation Discrimination, Shumeet Baluja, Mehran Sahami, Henry A. Rowley, International Conference on Image Processing (ICIP-2004).