How Biased Clinical Research Data Affects Healthcare Practices
Jaskirat Pabla, Samin Sharar Nafi, Sahl Bakshi, Saboor Bakshi
Mar 31, 2025
Introduction
Our project explores how biased clinical data, particularly data lacking racial and gender diversity, can lead to harmful outcomes in healthcare AI. Through interviews and case studies, we examine real-world examples where algorithms made discriminatory decisions and propose ways to make AI in healthcare more inclusive and fair.
Interviews
We interviewed four participants, beginning with brief introductions about their background and research. Each was presented with six real-world cases of biased AI in healthcare: three based on gender and three on race. We first offered limited, skewed details to probe their assumptions, then revealed the full context for deeper discussion. The interviews concluded with reflections on broader impacts and possible solutions.
Analysis of Case Studies
For each of the six case studies, we will analyze the interviewee responses to understand how bias influenced the AI's decisions and explore the ethical and practical implications of each scenario.

Try the quiz before reading the case and discussion below.
It’s a great way to test your intuition first!

Transgender Patient’s Emergency Misdiagnosed by Algorithmic Bias

A 32-year-old transgender man arrived at the emergency room with severe abdominal pain. Although he informed staff that he was transgender, the hospital’s electronic health record (EHR) system listed him as male (Ring, 2019). As a result, clinicians failed to consider pregnancy and misattributed his symptoms to factors like obesity. The patient was, in fact, pregnant and experiencing labour complications. Due to the delay in recognizing this, urgent care was postponed, and tragically, the baby was stillborn (Ring, 2019).

This case highlights how rigid, binary classifications in health records — combined with provider assumptions — can lead to critical misdiagnoses. Standard pregnancy-related alerts were never triggered because the system registered the patient as male, underscoring the dangers of algorithmic bias and the need for more inclusive healthcare systems that reflect the realities of transgender and non-binary individuals (Compton, 2019; Cirillo et al., 2020).

Questions
  1. Imagine you’re a physician treating a patient with severe abdominal pain and no visible injuries. What diagnoses would immediately come to mind, and what additional information would you ask for?
  2. What if you were told the patient was a woman?
  3. How might your diagnostic thinking change if you were told the patient was a man?
  4. How would learning that the patient is a transgender man affect your clinical reasoning, and what challenges might arise in ensuring they receive appropriate care?
  5. In your opinion, now that you know the patient was transgender and pregnant, what role did algorithmic bias, system design, or other factors play in the system’s failure to identify pregnancy risk in this case?
  6. What changes would you suggest — to electronic health records, triage systems, provider training, etc. — to prevent similar outcomes for transgender patients in the future?
Discussion

One issue that stood out across all the interviews was how social stereotypes about gender shape clinical reasoning. When asked how they would manage a patient with abdominal pain, most participants acknowledged that their diagnostic reasoning would differ depending on the patient’s gender, or their perceived gender. Jennifer Aguiar, for example, explained that a female patient would prompt her to include reproductive health problems like ovarian cysts or pregnancy in her differential diagnosis, whereas she, Dan Holtby, and Scott Davidson all noted that for a patient perceived as male they would focus disproportionately on gastrointestinal or urinary pathology. This demonstrates a gendered clinical heuristic that can produce cognitive error, particularly for transgender patients whose anatomy does not match gender-based assumptions.

All participants advocated capturing gender identity as well as sex assigned at birth in health records. Aguiar pointed out that while gender identity must be honoured, clinically critical information, such as whether the patient has a uterus or ovaries, should be readily available for clinical decision-making. Ballester specifically argued for differentiating these two constructs in electronic health record (EHR) systems as more hospitals adopt AI-driven alerts and triage systems. If a system only recognizes gender identity and not biological anatomy, crucial alerts—such as those related to pregnancy—can be missed, with life-threatening consequences.
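To make this distinction concrete, here is a minimal Python sketch contrasting a pregnancy alert keyed only to the gender marker with one keyed to an organ inventory. The field names and age cutoffs are hypothetical and not drawn from any real EHR product; this is an illustration of the design point, not an actual clinical rule.

```python
from dataclasses import dataclass, field

@dataclass
class PatientRecord:
    # Hypothetical fields: real EHR schemas differ by vendor.
    gender_identity: str            # e.g. "male", "female", "non-binary"
    sex_assigned_at_birth: str      # e.g. "female"
    organ_inventory: set = field(default_factory=set)  # e.g. {"uterus", "ovaries"}
    age: int = 0

def pregnancy_alert_naive(patient: PatientRecord) -> bool:
    # Alert keyed only to the gender marker: never fires for transgender men.
    return patient.gender_identity == "female" and 12 <= patient.age <= 55

def pregnancy_alert_inclusive(patient: PatientRecord) -> bool:
    # Alert keyed to anatomy: fires for anyone who could be pregnant.
    return "uterus" in patient.organ_inventory and 12 <= patient.age <= 55

patient = PatientRecord(gender_identity="male",
                        sex_assigned_at_birth="female",
                        organ_inventory={"uterus", "ovaries"},
                        age=32)

print(pregnancy_alert_naive(patient))      # False -- the kind of alert that never fired in the case above
print(pregnancy_alert_inclusive(patient))  # True  -- an anatomy-based rule would have flagged the risk
```

The design choice is the point: the two rules see the same patient, but only the one that separates identity from anatomy surfaces the pregnancy risk.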

Algorithmic bias was another concern shared across the interviews. Aguiar invoked the “garbage in, garbage out” principle, stressing that if transgender individuals are not adequately represented in the datasets that train AI models, the resulting tools will fail to serve them. Ballester agreed and advocated for designing more sophisticated data pipelines that take into account variables like hormone therapy or surgical history. Without such nuance, AI systems risk reinforcing existing biases and becoming blind to outliers.
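As a rough sketch of what “checking the data before trusting the model” could look like, the snippet below (using pandas, with made-up column names and counts) flags subgroups that fall below an arbitrary representation threshold in a training set. It is a simple audit step one might run before training, not a description of any specific pipeline the interviewees use.

```python
import pandas as pd

# Hypothetical training table; column names and counts are illustrative only.
train = pd.DataFrame({
    "gender_identity": ["cis man"] * 480 + ["cis woman"] * 490
                       + ["trans man"] * 20 + ["trans woman"] * 10,
    "outcome": [0, 1] * 500,
})

# Share of each subgroup in the training data.
representation = train["gender_identity"].value_counts(normalize=True)
print(representation)

# Flag subgroups below an (arbitrary) 5% representation threshold before any model training.
underrepresented = representation[representation < 0.05]
if not underrepresented.empty:
    print("Warning: underrepresented groups:", list(underrepresented.index))
```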

Scott Davidson offered a slightly different perspective, attributing the failure less to systems and more to human error. He noted that in some cases, even when EHRs are limited, the necessary information is verbally communicated by the patient but not acted upon. This highlights the need for better training so that healthcare providers can interpret and apply contextual information more effectively.

Ultimately, the interviews point to four clear action items: EHRs should include both sex assigned at birth and gender identity, AI systems must be trained on inclusive data, clinical education should prioritize transgender healthcare, and empathy must remain at the core of every patient interaction. Only through this multidimensional approach can healthcare become truly equitable.

Conclusion
Why does it matter?
Everyone should have equal, unbiased access to healthcare, but right now, that’s not the case. Systemic disparities already exist, and if we’re not careful, AI could make them worse. By 2022, nearly 20% of U.S. hospitals had integrated some form of AI into their practices (Baten & Abdul, 2022). Although this is remarkable progress, it also raises red flags. These technologies now inform decisions about diagnosis, treatment plans, and even who gets access to care. If the data fed to these algorithms is biased, the results will be biased as well. That means people in already marginalized communities, such as racial minorities and transgender people, may be left behind or harmed by the very systems designed to assist them. This is about far more than algorithms and data: these are real situations where lives are helped or cut short, and we must ensure everyone gets the care they need and deserve.
What did we learn?
One of the biggest things we’ve learned is that bias in healthcare is not just a theoretical issue; it has serious, real-world consequences. In one tragic case, a transgender man, assigned female at birth, was marked as male in the system. Although he had disclosed that he was transgender, clinicians failed to consider pregnancy as a possibility. As a result, care was delayed, and the baby was stillborn. In another case, a Black man waited over five years for a kidney transplant because the algorithm used to assess kidney function artificially boosted scores for Black patients, making them seem healthier than they really were. These examples show that bias often doesn’t come from the AI itself; it comes from the data it’s trained on and the assumptions of the people building these systems. We’ve also learned that fixing these systems after they’re deployed is extremely difficult, which is why they need to be designed responsibly from the ground up: collecting inclusive, representative data and ensuring diverse voices are at the table when these tools are being developed.
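The kidney example turns on a single coefficient. The sketch below uses the published 2009 CKD-EPI creatinine equation (whose race multiplier was removed in the 2021 revision) to show how identical lab values yield a higher estimated GFR when the race flag is set. The patient values and the waitlisting threshold in the comments are illustrative, and the specific equation used in the case we describe isn’t stated; this is one widely cited example of a race-corrected eGFR formula.

```python
def egfr_ckd_epi_2009(scr_mg_dl: float, age: int, female: bool, black: bool) -> float:
    """Estimated GFR (mL/min/1.73 m^2) using the 2009 CKD-EPI creatinine equation,
    which included a race coefficient (removed in the 2021 revision)."""
    kappa = 0.7 if female else 0.9
    alpha = -0.329 if female else -0.411
    return (141
            * min(scr_mg_dl / kappa, 1.0) ** alpha
            * max(scr_mg_dl / kappa, 1.0) ** -1.209
            * 0.993 ** age
            * (1.018 if female else 1.0)
            * (1.159 if black else 1.0))

# Hypothetical patient: same labs, same age, same sex -- only the race flag differs.
without_coeff = egfr_ckd_epi_2009(scr_mg_dl=3.5, age=60, female=False, black=False)
with_coeff = egfr_ckd_epi_2009(scr_mg_dl=3.5, age=60, female=False, black=True)

print(f"eGFR without race coefficient: {without_coeff:.1f}")  # ~17.9
print(f"eGFR with race coefficient:    {with_coeff:.1f}")     # ~20.8

# Many transplant programs use an eGFR threshold around 20 for waitlisting;
# the 1.159 multiplier can keep an otherwise identical patient above that line for longer.
```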
What was challenging?
One of the most difficult aspects of this project was the research involved, particularly collecting examples of biased healthcare data that led to negative consequences for someone's care. We were not searching for just numbers or broad patterns; we wanted stories that showed the very real ramifications of these biases on people’s lives. It was important to us that the impact of biased AI wasn't reduced to numbers, but instead framed through human experiences. The interviews posed another challenge: how could we frame questions that stimulate meaningful, thought-provoking interaction? We had to ensure the questions weren’t leading, vague, or overly prescriptive, while still encouraging deep reflection and discussion from our participants. Finding the right balance between being respectful, clear, and insightful required a lot of revision and consideration.
If we had more time, what else might we have done?
Given the opportunity, we would have liked to interview additional participants from varying positions and perspectives within the industry, bringing a wider range of lived experiences and a richer understanding of the problem. Those insights would have deepened our picture of how data bias in healthcare affects diverse communities. The quiz section could also have been made more instructional and engaging: imagine if participants could walk through case studies with contextual guides, explanations, and real-time feedback. This approach could have helped users not only test their understanding but also learn and reflect as they progressed.