Abstract
The human labor behind AI
Within recent years, AI-driven tools and services such as Alexa, Siri, ChatGPT, and customer service chatbots have become indispensable in our everyday lives. What is widely unknown is that these AIs and algorithms rely on human labor, in the form of data processing, labelling, and reviewing of model performance, to be built and maintained (Gray & Suri, 2019). This labor is commonly mediated through online platforms like Amazon’s Mechanical Turk (MTurk) and makes up a significant proportion of the gig economy (Piasna, Zwysen, & Drahokoupil, 2022). This online platform-mediated labor has been given various names in the literature, including microwork, microtasking, crowdwork, crowdsourcing, and clickwork (Berg, Furrer, Harmon, Rani, & Silberman, 2019; Durward, Blohm, & Leimeister, 2020; Webster, 2016). In this article we use the term microwork, as it refers to the completion of short, independent tasks that contribute to a greater, mostly unknown project (Nakatsu, Grossman, & Iacovou, 2014). Despite this labor being at the heart of technological products that are used daily, very little is known about its hidden workers or the conditions under which they work.
What is microwork like?
The working conditions are reportedly unstable, with workers being considered independent contractors, competing for work with every other worker on a given platform (Berg et al., 2019; Wood, Graham, Lehdonvirta, & Hjorth, 2019). Individual pieces of work, called tasks, often pay only cents, making it difficult for workers to earn an acceptable hourly wage (Berg et al., 2019); requesters can reject work without stating a reason (Berg et al., 2019; McInnis, Cosley, Nam, & Leshed, 2016); and workers tend to work from home with no possibility to connect with other workers, leading to feelings of alienation (Bucher, Fieseler, Lutz, & Buhmann, 2024; Hertwig, Holz, & Lorig, 2024). The fully digitalized organization of this labor further creates novel conditions in this work. Workers have to invest substantial amounts of unpaid labor to maintain their working opportunities on the platforms (Berg et al., 2019; Howson et al., 2023), and are subjected to a high degree of commodification as their identity is hidden behind a user ID (Irani, 2015). On the other hand, workers appreciate a high degree of autonomy and diversity in the content and complexity of their tasks (Wood et al., 2019), finding a game-like enjoyment and a sense of accomplishment in their work (Deng & Joshi, 2016).
The emerging literature investigating microwork paints a complex picture of the working conditions in this labor (Berg et al., 2019; Bucher et al., 2024; Deng & Joshi, 2016; McInnis et al., 2016; Wood et al., 2019), yet some unique aspects that seem integral to microwork are missing from the conceptualization of these working conditions (Brawley & Pury, 2016). Following the call of previous researchers (Brawley & Pury, 2016), this study set out to develop a comprehensive scale assessing the working conditions of microwork. For this study, the basic needs (autonomy, competence, and relatedness) outlined in Self-Determination Theory were employed as a theoretical lens rather than a framework (Ryan, 2023), aiding in translating the experiences of microworkers into working conditions.
Methods
The scale development and validation followed these steps: creation of concept definitions and item generation, content validation, assessment of the measurement model employing exploratory (EFA) and confirmatory factor analysis (CFA), and lastly bivariate analyses establishing the nomological network for validation.
Concept definitions and item generation. The microworking conditions were drawn from the existing body of literature researching microwork (e.g., Berg et al., 2019; Bucher et al., 2024; Deng & Joshi, 2016; McInnis et al., 2016; Wood et al., 2019). This resulted in the conceptualization of 15 working conditions. Nine working conditions, which appeared prominently in the literature and are based on established models including the work characteristics model and the decent work criteria, were adapted for this scale (Deng & Joshi, 2016; Gadiraju, Yang, & Bozzon, 2017; Orhan, Khelladi, Castellano, & Singh, 2022; Silberman et al., 2018). Six novel working conditions were conceptualized following recommended procedures (Mackenzie, Podsakoff, & Podsakoff, 2011). In consideration of the expected attrition of items in the subsequent phases of validation (Hinkin, 1998), 10 items per working condition were developed. For novel working conditions, item creation was guided by the characteristics outlined in the concept definition; for common working conditions adapted into this scale, inspiration was drawn from existing scales.
Content validation. The initial pool of 150 items and 15 definitions was subjected to content validation procedures in a sample of 10 academic and 30 experiential experts (Schilling et al., 2007), who rated the items regarding their relevance, representativeness, and clarity (Scharp, Bakker, Breevaart, Kruup, & Uusberg, 2023).
Measurement Model. Data from a sample of 407 workers on four prominent microworking platforms in the EU (MTurk, Clickworker, Microworker, Picoworker) were collected using the new scale. EFA was performed to examine the scale’s structure and to conduct item reduction. A second sample of 455 workers was recruited on the same platforms to obtain data to perform CFA and to reduce the item pool to three items per working condition.
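The CFA fit indices reported in the Results (CFI, RMSEA) follow standard formulas based on a model’s chi-square statistic. As a minimal illustrative sketch (these are the textbook formulas, not the authors’ analysis software, and exact values depend on the estimator used):

```python
import math

def rmsea(chi2: float, df: int, n: int) -> float:
    """Root Mean Square Error of Approximation from a model chi-square.

    n is the sample size; values near zero indicate close fit.
    """
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

def cfi(chi2: float, df: int, chi2_null: float, df_null: int) -> float:
    """Comparative Fit Index relative to the baseline (null) model."""
    d_model = max(chi2 - df, 0.0)
    d_null = max(chi2_null - df_null, 0.0)
    denom = max(d_model, d_null)
    return 1.0 if denom == 0.0 else 1.0 - d_model / denom
```

For instance, a model whose chi-square equals its degrees of freedom yields an RMSEA of zero; CFI additionally requires the baseline model’s chi-square, which the abstract does not report.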
Nomological network. The second wave of data collection on microworking platforms included further scales assessing constructs hypothesized to relate to the 15 working conditions of the new scale. These data were used to establish convergent validity through bivariate analyses. Discriminant validity was established following the procedure by O’Neill and Seva (2013). Criterion validity was tested against the basic needs of autonomy, competence, and relatedness (Deci et al., 2001; Luong & Flake, 2022), and predictive validity was assessed for the well-being indicators of psychological (World Health Organisation, 1998), social (Keyes, 1998), and psychosomatic well-being (Franke, 2015) by examining partial correlations.
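The partial correlations used here control for other variables by correlating regression residuals. A minimal residual-based sketch in NumPy (illustrative only; the variable names are placeholders, not the study’s data or code):

```python
import numpy as np

def partial_corr(x: np.ndarray, y: np.ndarray, covariates: np.ndarray) -> float:
    """Correlation between x and y after regressing out the covariates.

    x, y: 1-D arrays of length n; covariates: (n, k) array of controls.
    """
    # Design matrix with an intercept column.
    z = np.column_stack([np.ones(len(x)), covariates])
    # Residualize both variables on the controls via least squares.
    resid_x = x - z @ np.linalg.lstsq(z, x, rcond=None)[0]
    resid_y = y - z @ np.linalg.lstsq(z, y, rcond=None)[0]
    return float(np.corrcoef(resid_x, resid_y)[0, 1])
```

This mirrors the standard definition of a partial correlation; dedicated statistical packages report the same quantity together with its p-value.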
Results
Content validation. Data from the content validation procedure showed sufficient support for all 15 conceptualized working conditions. Based on the review of the academic and experiential experts, the initial item pool was reduced from 150 to 102 items, excluding items that were redundant to the concept definitions or difficult to understand.
Measurement model. Different factorial structures were explored through EFA, and the hypothesized 15-factor structure showed sufficient fit (df = 3813; χ2 = 7353.57; CFI = .887; RMSEA = .048). After the item pool was reduced to 63 items, the fit of the 15-factor model remained excellent (df = 1020; χ2 = 1846.38; CFI = .949; RMSEA = .045). CFA confirmed this structure with good fit statistics in the final 45-item version (df = 840; χ2 = 1400.57; CFI = .958; RMSEA = .048). The final model showed good factor loadings of the items on their respective latent constructs, ranging from .703 to .930. Reliability of each subscale was acceptable, with Cronbach’s α values between .791 and .920.
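The subscale reliabilities are Cronbach’s α, computed from the item variances and the variance of the summed scale score. A minimal NumPy sketch of the formula (illustrative, not the study’s code):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, k_items) score matrix."""
    k = items.shape[1]
    # Sum of individual item variances (sample variances, ddof=1).
    item_var_sum = items.var(axis=0, ddof=1).sum()
    # Variance of each respondent's total score across the k items.
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1.0 - item_var_sum / total_var)
```

With only three items per working condition in the final scale, α is sensitive to any single weak item, which makes the reported range of .791 to .920 a reassuring result.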
Nomological network. Convergent validity was established as all 15 working conditions correlated significantly with a hypothesized related construct. These correlations ranged from a small negative correlation of -.248 (p < .01) between the new working condition ‘Digital Working Environment’ and the Digital Stressor Scale (Fischer, Reuter, & Riedl, 2021) to a large positive correlation of .811 (p < .01) between the new ‘Platform Conduct’ construct and the Perceived Civility Scale (Porath, Gerbasi, & Schorch, 2015). No violations of discriminant validity were found (O’Neill & Seva, 2013). The partial correlations examining the scale’s criterion validity revealed relevant correlations with the need for autonomy (r = .222 to .293; p < .01), the need for competence (r = -.112 to .307; p < .01), and the need for relatedness (r = .224 to .490; p < .01). Partial correlations exploring the scale’s predictive properties showed that most microworking conditions (12/15) predicted psychological well-being (r = .159 to .263; p < .01), some (6/15) predicted social well-being (r = -.222 to .217; p < .01), and only a few (3/15) predicted psychosomatic well-being (r = -.101 to .130; p < .05).
Conclusion and Contributions
This study documents the development and validation of the Micro-Working Condition Scale, a measure that conceptualizes the core working conditions defining the experience of microwork. The scale addresses the lack of a measure of the unique working conditions of microwork (Brawley & Pury, 2016) and adapts existing working-condition constructs to more accurately capture them in the specific context of microwork (Deng & Joshi, 2016; Gadiraju et al., 2017; Orhan et al., 2022; Silberman et al., 2018). In doing so, we make several contributions. Firstly, a validated measure of microworking conditions enables more nuanced and targeted research into the dynamics between these conditions and work-related outcomes. Secondly, the continual adoption and implementation of technology across virtually all labor markets reshapes many work contexts, as algorithmic management and isolated working conditions become more prominent (Gagné et al., 2022; Parker & Grote, 2022). As these are integral features of microwork, this scale could be adopted to assess such conditions outside the microworking context. Lastly, considering the EU’s recent legislative efforts to improve working conditions in the platform economy at large (Sikken, 2024), this scale can facilitate tailored research guiding the implementation of this directive to effectively improve working conditions for microworkers.
| Original language | English |
|---|---|
| Publication status | Published - 25 Jul 2025 |
| Event | Annual Meeting Academy of Management 2025 - Copenhagen Business School, Copenhagen, Denmark. Duration: 25 Jul 2025 → 29 Jul 2025. https://aom2025.eventscribe.net/ |
Conference
| Conference | Annual Meeting Academy of Management 2025 |
|---|---|
| Abbreviated title | AoM |
| Country/Territory | Denmark |
| City | Copenhagen |
| Period | 25/07/25 → 29/07/25 |
| Internet address | https://aom2025.eventscribe.net/ |
Funding
ERC grant agreement No. 101003134