When AI Becomes Your Variable: The Hidden Bias in AI-Generated Data

Lille My
Jun 28

Welcome to this week's scan of how AI is reshaping research — curated from Nature, Science and the leading business and management journals.

When you let AI measure your variables, you may be biasing your results

It has become routine: ask a large language model to score the sentiment of a review, rate the aesthetics of an ad, or label the emotion in a customer message, then use that AI-generated score as a variable in a regression to estimate its effect on sales or engagement. It feels like measurement, but the score is a prediction, and predictions carry error. In Information Systems Research, Jingwen Zhang, Wendao Xue, Yifan Yu and Yong Tan show that because AI predictions are imperfect, plugging them directly into a model as regressors introduces a systematic bias into the estimated effects. The error in the AI's labels doesn't simply average out; it propagates into your coefficients and your inferences.
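To see why the error does not average out, here is a minimal simulation sketch. It is not the authors' setup: it assumes the simplest textbook case, in which the AI score equals the true variable plus independent random noise, and every name and number in it (x_true, x_ai, the true effect of 2.0) is hypothetical. Errors from a real model may be correlated with the content being scored, in which case the bias need not point toward zero.

```python
import numpy as np

def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with intercept)."""
    x_centered = x - x.mean()
    return float(x_centered @ (y - y.mean()) / (x_centered @ x_centered))

def simulate(n=200_000, beta=2.0, label_noise_sd=1.0, seed=0):
    """Compare the slope estimated with the true variable and with a noisy AI score."""
    rng = np.random.default_rng(seed)
    x_true = rng.normal(0.0, 1.0, n)              # the construct we wish we observed
    y = beta * x_true + rng.normal(0.0, 1.0, n)   # outcome, e.g. engagement
    # Assumption: AI score = truth + independent noise (classical measurement error)
    x_ai = x_true + rng.normal(0.0, label_noise_sd, n)
    return ols_slope(x_true, y), ols_slope(x_ai, y)

slope_true, slope_ai = simulate()
print(f"slope using the true variable: {slope_true:.2f}")
print(f"slope using the AI score:      {slope_ai:.2f}")
```

Under these assumptions the first slope should land near the true effect of 2, while the second should land near 1: with noise variance equal to the signal variance, the estimate shrinks by about half, and collecting more data does not repair it.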

The authors develop a debiasing procedure for partially linear models that corrects the distortion AI-generated regressors create, restoring valid estimation and inference. The timing matters because the practice they critique is spreading fast across exactly the fields this newsletter serves. This same week, marketing scholars in the Journal of Marketing Research proposed a meta-learning framework to model dynamic consumer preferences from just a handful of observations, and operations researchers in Production and Operations Management built an interpretable machine-learning pipeline to extract competitive intelligence from user-behavior data. As AI-derived measures become standard inputs, the question of whether those measures bias the conclusions drawn from them stops being academic.

Why it matters: If your study uses an LLM or any ML model to generate a variable and then treats that variable as if it were observed without error, your causal estimates may be biased — and no amount of clean study design downstream will fix it. The practical takeaway is not to abandon AI-assisted measurement, which is often the only way to scale, but to treat AI-generated variables as the noisy predictions they are: report the model's accuracy, and apply correction methods like the one above before reading causal meaning into the coefficients.
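As a rough illustration of that workflow, the sketch below reports accuracy on a human-labeled subsample and then applies regression calibration, a standard textbook correction for measurement error. It is a stand-in, not the debiasing procedure from the paper. It assumes simulated data with purely random labeling error, and all names and numbers (x_true, x_ai, the 5,000 labeled rows) are hypothetical.

```python
import numpy as np

def fit_line(x, y):
    """Return (intercept, slope) from an OLS regression of y on x."""
    slope = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
    return float(y.mean() - slope * x.mean()), float(slope)

rng = np.random.default_rng(1)
n, n_labeled, beta = 100_000, 5_000, 2.0

# Simulated data (assumption): AI score = truth + independent noise
x_true = rng.normal(0.0, 1.0, n)
x_ai = x_true + rng.normal(0.0, 1.0, n)
y = beta * x_true + rng.normal(0.0, 1.0, n)

# Pretend only the first n_labeled rows have human labels
labeled = np.arange(n_labeled)

# Step 1: report how well the AI score tracks the human labels
accuracy_r = float(np.corrcoef(x_ai[labeled], x_true[labeled])[0, 1])

# Step 2: regression calibration. Learn E[truth | AI score] on the labeled rows,
# then use the calibrated score in place of the raw score everywhere.
intercept, slope = fit_line(x_ai[labeled], x_true[labeled])
x_calibrated = intercept + slope * x_ai

_, naive_effect = fit_line(x_ai, y)
_, corrected_effect = fit_line(x_calibrated, y)

print(f"AI vs human label correlation: {accuracy_r:.2f}")
print(f"naive effect estimate:         {naive_effect:.2f}")
print(f"calibrated effect estimate:    {corrected_effect:.2f}")
```

In this simplified setting the calibrated estimate should move back toward the true effect of 2, while the naive one stays near 1. The standard errors from the second regression ignore the uncertainty in the calibration step, so valid inference needs an adjustment such as bootstrapping both steps together, or a purpose-built method like the one the authors propose.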

More from this week

Deep learning finds a hidden ECG signal for sudden cardiac death

Sudden cardiac death can often be prevented with a defibrillator, yet the standard risk marker, left ventricular ejection fraction, misses most cases while flagging many low-risk patients. Ziad Obermeyer, Sendhil Mullainathan and colleagues applied deep learning to a Swedish dataset linking every ECG in a region to death records, surfacing a predictive signal hidden in ordinary ECGs. It is a striking example of AI used not to automate a known task but to discover a new biomarker from routine clinical data.

Read the original in Nature, 24 Jun →

Operations research as the backbone of agentic GenAI

Generative AI is shifting from conversational assistants to agentic systems that sense, decide, and act inside operational workflows. Tinglong Dai, David Simchi-Levi and co-authors describe an 'autonomy paradox': the more operational freedom a GenAI system is granted, the more formal structure, explicit constraints, and tail-risk discipline it needs. They argue operations research provides the orchestration layer that makes autonomous AI safe to deploy in real business processes.

Read the original in Production and Operations Management, 18 May →

Meta-learning personalizes from just a few data points

Personalization usually demands rich histories that new or short-session customers simply don't have. Mingzhang Yin, Khaled Boughanmi and Anirban Mukherjee propose meta-temporal processes (MetaTP), a meta-learning framework that learns dynamic, individual-level preferences from only a handful of observations. The approach offers firms a route to scalable personalization and improved retention precisely where data is thinnest.

Read the original in Journal of Marketing Research, 3 Jun →

A thousand AI agents build a civilization in Minecraft

Video games have become a serious testbed for AI. In one project, computer scientists dropped 1,000 AI-driven agents into Minecraft and watched an 'AI civilization' emerge, complete with artists, chefs, and explorers — echoing earlier 'Smallville' sandbox studies of generative agents. The work shows how open-ended game worlds let researchers probe coordination, culture, and emergent social behavior in populations of autonomous agents.

Read the original in PNAS, 18 Jun →

Catching recruitment scams with ChatGPT and supervised learning

Digital hiring platforms have become fertile ground for recruitment scams that prey on job seekers' eagerness and low vigilance. Tianteng Wang and co-authors combine ChatGPT-based text mining with theory-driven supervised learning to detect fraudulent postings, pairing predictive accuracy with interpretable analysis of what makes a posting suspicious. It is a concrete template for using LLMs as part of a transparent, accountable detection pipeline rather than a black box.

Read the original in Production and Operations Management, 26 May →

That's this week. Forward it to a colleague who's been pasting data into ChatGPT without a second thought — and explore AI tools for your own research at gaiforresearch.com.

Generative AI for Research Initiative