Research article, DOI: 10.1145/2858036.2858115

Embracing Error to Enable Rapid Crowdsourcing

Published: 07 May 2016

Abstract

Microtask crowdsourcing has enabled dataset advances in social science and machine learning, but existing crowdsourcing schemes are too expensive to scale up with the expanding volume of data. To scale and widen the applicability of crowdsourcing, we present a technique that produces extremely rapid judgments for binary and categorical labels. Rather than punishing all errors, which causes workers to proceed slowly and deliberately, our technique speeds up workers' judgments to the point where errors are acceptable and even expected. We demonstrate that it is possible to rectify these errors by randomizing task order and modeling response latency. We evaluate our technique on a breadth of common labeling tasks such as image verification, word similarity, sentiment analysis and topic classification. Where prior work typically achieves a 0.25x to 1x speedup over fixed majority vote, our approach often achieves an order of magnitude (10x) speedup.
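
The recovery step named in the abstract (randomized task order plus a response-latency model) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the Gaussian latency with a 0.4 s mean and 0.1 s standard deviation, the 0.1 s display interval, and the hypothetical helper names `latency_weight`, `score_items`, and `simulate`. Each simulated worker sees the items in a different random order and presses a key for every positive item; each keypress is then spread over the items shown before it, weighted by how plausible the delay is.

```python
import math
import random


def latency_weight(delay, mean=0.4, std=0.1):
    """Unnormalized Gaussian weight for a keypress `delay` seconds after an item appeared.

    The Gaussian form and its parameters are illustrative assumptions.
    """
    return math.exp(-0.5 * ((delay - mean) / std) ** 2)


def score_items(order, interval, keypress_times, mean=0.4, std=0.1):
    """Spread each keypress over the items already shown, weighted by the latency model."""
    scores = {item: 0.0 for item in order}
    for t in keypress_times:
        weights = {}
        for position, item in enumerate(order):
            onset = position * interval  # time at which this item appeared
            if onset <= t:
                weights[item] = latency_weight(t - onset, mean, std)
        total = sum(weights.values())
        if total > 0:
            # Normalize so that one keypress contributes one unit of evidence.
            for item, w in weights.items():
                scores[item] += w / total
    return scores


def simulate(items, positives, n_workers=20, interval=0.1, seed=0):
    """Simulate workers who each see a different random order, then sum their scores."""
    rng = random.Random(seed)
    totals = {item: 0.0 for item in items}
    for _ in range(n_workers):
        order = list(items)
        rng.shuffle(order)  # randomized order decorrelates neighbors across workers
        presses = [
            position * interval + rng.gauss(0.4, 0.1)  # noisy reaction to a positive item
            for position, item in enumerate(order)
            if item in positives
        ]
        for item, score in score_items(order, interval, presses).items():
            totals[item] += score
    return totals


if __name__ == "__main__":
    totals = simulate(list(range(50)), positives={7, 23})
    ranked = sorted(totals, key=totals.get, reverse=True)
    print(ranked[:5])
```

Because the order differs per worker, an item that merely sat next to a positive in one stream is unlikely to do so in the others, so evidence accumulates on true positives while the spill onto neighbors averages out. A real deployment would need a threshold on the summed score and a latency model fitted to observed worker behavior.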

    Published In

    CHI '16: Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems
    May 2016
    6108 pages
    ISBN: 9781450333627
    DOI: 10.1145/2858036
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. RSVP
    2. crowdsourcing
    3. human computation

    Funding Sources

    • NSF

    Conference

    CHI'16: CHI Conference on Human Factors in Computing Systems
    May 7 - 12, 2016
    San Jose, California, USA

    Acceptance Rates

    CHI '16 Paper Acceptance Rate 565 of 2,435 submissions, 23%;
    Overall Acceptance Rate 6,199 of 26,314 submissions, 24%
