Love, AI, and DEI

James McGreggor
9 min read · Feb 15, 2025

“Valentine’s Day in America” created using DALL-E (ChatGPT)

A Valentine’s Day Reflection on Bias and AI

Happy Valentine’s Day, everyone. While today is a day of happiness and connection for many, it can also be one of loneliness for others. Before diving into this article, take a moment to reach out to someone — maybe a friend, family member, or colleague — who might appreciate a reminder that they matter.

Now, let’s talk about something that impacts all of us, whether we know it or not: bias amplification.

The Unseen Bias in AI-Generated Content

AI and the ethical concerns surrounding it have become growing topics of discussion in recent years, especially as companies reconsider or even eliminate DEI programs. What does this have to do with Valentine’s Day? Well, it all started when I wanted to create a fun AI-generated image to post on LinkedIn for Valentine’s Day. I experimented with different prompts and quickly noticed something: the images predominantly featured white individuals, and in most of them, the women were blonde. No offense to any blonde white American women out there.

“Valentine’s Day in America” first pass — created using DALL-E (ChatGPT)

Soylent Green is… white people!

I know, that was a lame movie joke — I could not stop myself from saying it though.

As someone who has experimented a lot with AI-generated images, I have noticed that unless I explicitly request diversity, the outputs reflect the dominant demographic trends in the training data and default to a homogeneous representation. This issue is more than just an annoyance — it’s a reflection of a deeper systemic problem in AI and data science.

Interestingly, and rather humorously, when I asked ChatGPT to create an image that was more inclusive and showed a diverse representation of people, it seemed to have a hard time, and in most cases the results were just outright weird. It seems AI has a hard time understanding love too — when it thinks about diversity and love together, it turns polyamorous? I suppose, in a way, that does express diversity and inclusion.

¯\_(ツ)_/¯

“Valentine's Day in America”, showing “diversity and inclusion” — created using DALL-E (ChatGPT)

Understanding the Science Behind AI Bias

Curious about why this was happening, I asked ChatGPT for a scientific breakdown of AI-generated bias. The explanation pointed to several key factors:

Data Composition & Skewed Representation
AI models, including OpenAI’s DALL·E, are trained on vast datasets sourced from the internet.

Many widely used datasets (e.g., MS COCO, ImageNet, LAION-5B) are Western-centric, resulting in an overrepresentation of white individuals.

Geographic and economic disparities in internet usage further amplify this skew.

Statistical Learning & Bias Amplification
AI models rely on statistical patterns and probabilistic modeling, sometimes reinforcing dominant trends due to the frequency of certain data representations, a phenomenon known as mode collapse.

Since white individuals are overrepresented in training data, the model defaults to them as the “expected” or “average” representation.

When given neutral prompts like “a happy couple,” AI tends to associate them with Western-style imagery.

Empirical Studies on AI Bias
The “Gender Shades” study (Buolamwini & Gebru, 2018) found that AI models had far higher error rates for darker-skinned individuals, particularly women.

Google’s AI Fairness Report (2021) revealed that AI-generated images were 2.4 times more likely to depict white individuals when given neutral prompts.

(ChatGPT-4o)
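
To make the feedback loop concrete, here is a quick Python sketch of my own (separate from the full script at the end of this article) showing how a mode-seeking generator trained on skewed data drifts further toward the majority each time its outputs are folded back into the training pool. The 70/30 starting split and the 10% majority boost are illustrative assumptions, not measured values:

import random

random.seed(42)

# Toy training pool: group "A" starts out overrepresented 70/30.
pool = ["A"] * 70 + ["B"] * 30

for generation in range(1, 6):
    p_a = pool.count("A") / len(pool)
    # A mode-seeking "model": it nudges the majority's probability up 10%
    # before generating, mimicking how frequent patterns get reinforced.
    p_gen = min(1.0, p_a * 1.1)
    generated = ["A" if random.random() < p_gen else "B" for _ in range(100)]
    pool.extend(generated)  # Generated content joins the next training pool
    print(f"Generation {generation}: group A share = {pool.count('A') / len(pool):.2%}")

Run it and group A’s share climbs every generation, even though nobody ever asked for less of group B; the skew compounds simply because the majority pattern is what the model most often reproduces.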

The Dangers of Bias Amplification in AI

Let’s think about what happens when these biases persist unchecked. If users do not consciously prompt AI to include diversity, then AI-generated content will continue reinforcing the same biased patterns. Over time, this leads to:

  • AI models perceiving a skewed reality, further embedding biases into future versions of AI.
  • Societal reinforcement, where media and decision-making tools reflect these biases.
  • Discrimination in critical areas, such as hiring algorithms, healthcare models, and educational resources.

This brings up a difficult but important ethical question: Should AI always be “accurate” based on existing data, or should it be designed to be ethical and inclusive? Can it be both?

A Teachable Moment: Explaining Bias to My Daughter

As I was working on a Python script to illustrate bias amplification, my daughter walked in and asked what I was doing. I explained to her that AI models learn from data that already exists in the world. If that data is biased, whether intentionally or not, the AI will inherit those same patterns.

I told her that some people and demographics had a head start in creating digital content, meaning their perspectives are naturally more prominent. This isn’t necessarily due to malice but rather systemic societal factors. However, the end result remains the same: some groups get left out of AI-generated representations, decisions, and systems without ever having a fair chance to be included, at least not without some intentional actions taken.

I then asked her, “If AI models are just reflecting the world as they know it to be, is there a problem, and if so, should we do anything to change it?” She thought for a moment and replied, “Yes, because just because the data is right does not mean that it is right.”

The Role of DEI in Addressing Bias Amplification

This brings me to DEI programs. Regardless of personal opinions on their execution, we need to think about how we identify and correct systemic biases in data, workplaces, and social structures. Bias has been observed in many domains, including AI, hiring, healthcare, education, and media, often shaped by historical and systemic factors. AI simply mirrors and amplifies the disparities that already exist, and if left unchecked it will keep amplifying them, especially given how easy it now is to generate content at scale.

Some companies have reconsidered DEI programs, questioning their necessity or effectiveness. While perspectives on these programs vary, research indicates that biases persist in systems unless actively addressed, potentially influencing outcomes in inequitable ways. Even if programs are disbanded, we should still keep these concepts in mind as we interact with people and systems throughout our lives.

Moving Forward: A Call to Action

Fixing AI bias isn’t just about tweaking models — it’s about changing the way we create, interact with, and think about data. Here are a few things we can do:

  • Be proactive: If you use AI tools, ask for diversity in your prompts. I found that with images, keeping groups of people small is helpful. I was also advised that rather than prompting “Create an image of a <ethnicity/race> woman in her 40s,” say “Create an image of a 43-year-old <ethnicity/race> woman.” For whatever reason, this level of specificity seems to create more accurate and precise images (see the scripting sketch after this list).
  • Think beyond convenience: Yes, it is a pain to have to work diversity into your prompts (I have gone over some images dozens of times until I got it right), but imagine the people who never see themselves represented because others didn’t take that extra step. It really is worth the effort.
  • Advocate for responsible AI policies: Whether in government, education, or corporate settings, demand transparency and fairness in AI deployment — better yet, if you are able to, demonstrate how it can be done yourself.
  • Support a culture of diversity, equity, and inclusion: You don’t need a program to do this. Everyone wants to feel safe and included. Start by pausing and thinking about the things that you do before you do them — even a one minute pause to think could have a huge impact in someone’s life.
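
If you script your image generation, the same prompting advice carries over. Here is a minimal sketch using the openai Python client (v1+); the model name, prompt wording, and image size are illustrative assumptions on my part, not a tested recipe:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from your environment

# Specific, attribute-first phrasing (an exact age, an explicit ethnicity),
# per the tip above; the exact wording here is just an illustration.
prompt = "A 43-year-old Hispanic woman celebrating Valentine's Day with friends"

result = client.images.generate(
    model="dall-e-3",  # assumed model name; use whichever you have access to
    prompt=prompt,
    size="1024x1024",
    n=1,  # DALL-E 3 returns one image per request
)
print(result.data[0].url)  # link to the generated image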

Final Thoughts

Bias amplification in AI is a reflection of larger societal issues. If we want AI to be more inclusive, we need to be more inclusive in how we approach data and representation. This requires both individual action and systemic change.

Whether it is called DEI or some other variation, we need to foster inclusive representation, behaviors, and environments in our programs and processes, as well as in how we personally engage with data and society. Considering different perspectives can contribute to more balanced and equitable outcomes, but more than that, diversity promotes creativity, and it is through creativity that we really advance as a society. So, the next time you generate an image, analyze data, or make a decision, take a moment to ask: Who is missing? Who might be affected? And most importantly, how can I make this better?

Because in the end, the world we create is the world we live in; let’s find ways to nurture goodness and make it the best it can be.

“Sloth Love” created using DALL-E (ChatGPT)

Want to Know More?

To further illustrate bias amplification, I created a small Python program that loosely simulates how small biases can grow over time. The program runs two processes, each generating values at a compounding growth rate. Every second it tallies the results of each run, determines the mode, and uses it as A in the calculation A + B = C. Because one process gets a small initial head start, it becomes dominant, and that reinforced advantage directly controls the mode and ultimately the value chosen for A, showing how amplified bias shrinks everyone else’s ability to influence the decisions being made. Again, this is a rough example.

Here’s a key function that drives the amplification:

def generate_entries(process_id, queue, start_delay=0):
    """Process function to generate numbers based on controlled exponential growth."""
    time.sleep(start_delay)
    entries = INITIAL_ENTRIES
    start_time = time.time()

    while time.time() - start_time < TIMER_DURATION:
        queue.put((process_id, int(entries)))  # Convert to int only when storing
        entries *= (1 + GROWTH_RATE)  # Keep floating-point precision
        time.sleep(1)
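
For a sense of scale: the script’s computed GROWTH_RATE works out to 0.2, so each process’s value compounds as 1.2^t. After the full 30 seconds, Process 1’s per-second value is roughly 1.2^30 ≈ 237, while Process 2, starting 5 seconds late, only reaches about 1.2^25 ≈ 95. The head start is never closed; it compounds.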

Over time, even a small starting advantage causes one process to dominate, illustrating how biases can compound. If you are interested in testing this model, experiment with the code (full script below) and observe the effects in real time. Feel free to tweak it and send me any recommendations.

Note: the following code was written to run using python3 on a Mac.

import multiprocessing
import sys
import time
from collections import Counter

import matplotlib.pyplot as plt

# Constants
TIMER_DURATION = 30  # seconds
PROCESS_2_DELAY = 5  # Process 2 starts 5 seconds later
INITIAL_ENTRIES = 1
B_VALUE = 2  # Fixed value of B
GROWTH_RATE_HISTORY = [0.2, 0.4]  # Initialize with two values

def calculate_growth_rate():
    """Dynamically calculate the growth rate from the previous two values."""
    Y = GROWTH_RATE_HISTORY[-1] - GROWTH_RATE_HISTORY[-2]
    Y = min(max(Y, 0), 1)  # Ensure Y is between 0 and 1
    new_growth_rate = max(0, 0 + Y)
    GROWTH_RATE_HISTORY.append(new_growth_rate)  # Track growth history
    return new_growth_rate

GROWTH_RATE = calculate_growth_rate()  # Controlled exponential increase

def generate_entries(process_id, queue, start_delay=0):
    """Process function to generate numbers based on controlled exponential growth."""
    time.sleep(start_delay)
    entries = INITIAL_ENTRIES
    start_time = time.time()

    while time.time() - start_time < TIMER_DURATION:
        queue.put((process_id, int(entries)))  # Convert to int only when storing
        entries *= (1 + GROWTH_RATE)  # Keep floating-point precision
        time.sleep(1)

def main():
    manager = multiprocessing.Manager()
    queue = manager.Queue()
    process_1_total, process_2_total = 0, 0
    mode_counts = Counter()
    time_series = []

    # Start processes
    p1 = multiprocessing.Process(target=generate_entries, args=(1, queue))
    p2 = multiprocessing.Process(target=generate_entries, args=(2, queue, PROCESS_2_DELAY))

    p1.start()
    p2.start()

    start_time = time.time()
    while time.time() - start_time < TIMER_DURATION:
        time.sleep(1)  # Simulate real-time updates

        # Collect new data from processes
        new_entries = {1: 0, 2: 0}
        while True:
            try:
                process_id, count = queue.get_nowait()  # Non-blocking retrieval
                new_entries[process_id] += count
                if process_id == 1:
                    process_1_total += count
                else:
                    process_2_total += count
            except Exception:
                break  # No more items in the queue

        # Determine the mode (the dominant process) and compute C = A + B
        mode_value = 1 if process_1_total >= process_2_total else 2
        C = mode_value + B_VALUE
        mode_counts[C] += 1

        # Store time-series data for visualization
        time_series.append((time.time() - start_time, process_1_total, process_2_total))

        # Print real-time updates without flicker
        sys.stdout.write(f"\rTime Left: {TIMER_DURATION - int(time.time() - start_time)}s | "
                         f"Process 1: {process_1_total} | "
                         f"Process 2: {process_2_total} | "
                         f"Mode: {mode_value} | A + B = {C}   ")
        sys.stdout.flush()

    # Ensure processes terminate properly
    p1.join()
    p2.join()

    # Print final statistics
    print("\nFinal Results:")
    for key, value in mode_counts.items():
        print(f"C = {key} occurred {value} times")

    # Plot the results as stacked bars over time
    timestamps, p1_counts, p2_counts = zip(*time_series)
    plt.figure(figsize=(10, 5))
    plt.bar(timestamps, p1_counts, label="Process 1 Count", alpha=0.6)
    plt.bar(timestamps, p2_counts, label="Process 2 Count", alpha=0.6, bottom=p1_counts)
    plt.xlabel("Time (seconds)")
    plt.ylabel("Total Entries")
    plt.title("Amplification Bias: Process 1 vs. Process 2 Growth")
    plt.legend()
    plt.show()

if __name__ == "__main__":
    main()

Save the code to a .py file and launch it from your terminal (e.g., python3 bias_amplification.py; the filename is up to you). The program will run for 30 seconds, and you will see the process counts increase while the timer counts down.

Upon completion, it should display a graph comparing the growth of the two processes.

Written by James McGreggor

I am a digital technologist and business strategist who believes in using technology for good.