Your Data and AI Models
Every time you use an AI tool — from chatbots to job search platforms — you are sharing data. Understanding what happens to that data is one of the most valuable skills you can build in today's digital world.
What Data Do AI Models Actually Use?
AI models like the ones powering Rafiki, ChatGPT, or Google Gemini were built by training on enormous collections of text, images, and other information. But training data is just the beginning. When you use an AI tool today, you are generating new data in real time — your questions, your preferences, even how long you pause before typing.
When you type a message to an AI assistant, that message may be stored on servers, reviewed by engineers to improve the product, used to personalise future responses, or in some cases shared with third parties. The exact rules depend on the platform's privacy policy — a long document most of us never read.
A Nairobi Example: Searching for a Job
Imagine you are a recent Kenyatta University graduate looking for a data analyst job in Nairobi. You open a popular AI-powered job platform and type: "I have a degree in statistics, I live in Westlands, and I need a job paying at least KES 80,000." You have just shared your education level, location, and salary expectations with a system you may know very little about.
How AI Models Learn From You
Many AI services also learn from user feedback. When you give a thumbs-up or thumbs-down to a response, that rating can be used to teach the model. When you rephrase a question because the first answer was bad, that rephrasing is also data. This is how AI companies improve their products, but it also means your interactions can live on long after you close the browser tab. Typical examples of what is collected:
- Prompts you type may be stored and reviewed by human trainers
- Your device type and location are often logged automatically
- How you interact — which suggestions you click, how quickly you respond — is tracked as behavioural data
- Account information links all of this back to your identity if you are logged in
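To make the list above concrete, here is a minimal Python sketch of what a single logged interaction might look like. The class name, field names, and values are all hypothetical and chosen for illustration; real platforms decide for themselves what they record and how.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class InteractionRecord:
    """Hypothetical log entry for one message sent to an AI tool."""
    prompt: str                  # the text you typed
    device_type: str             # often logged automatically
    approx_location: str         # often logged automatically
    seconds_before_reply: float  # behavioural data: how quickly you responded
    clicked_suggestion: bool     # behavioural data: which suggestions you used
    account_id: Optional[str]    # None if you are not logged in

    def is_linked_to_identity(self) -> bool:
        # A record can be tied back to you only if an account is attached.
        return self.account_id is not None


record = InteractionRecord(
    prompt="I have a degree in statistics, I live in Westlands, "
           "and I need a job paying at least KES 80,000.",
    device_type="android-phone",
    approx_location="Nairobi",
    seconds_before_reply=4.2,
    clicked_suggestion=True,
    account_id="user-001",  # made-up identifier
)

print(asdict(record))
print("Linked to identity:", record.is_linked_to_identity())
```

Notice that the prompt is only one of six fields. Even if you typed nothing personal, the other fields could still describe you.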
Personal Data vs. Sensitive Data
Not all data carries the same risk. Your name or email address is personal data: it identifies you. Your health condition, ethnic background, political opinions, or financial situation are sensitive data, a higher-risk kind of personal data that can be used to discriminate against you or cause you real harm if exposed.
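The distinction can be sketched in a few lines of Python. The field names and the sample profile below are hypothetical, and the two sets simply mirror the examples given in this lesson; they are not a legal definition.

```python
# Categories taken from the examples in this lesson (illustrative only).
PERSONAL_FIELDS = {"name", "email"}
SENSITIVE_FIELDS = {
    "health_condition",
    "ethnic_background",
    "political_opinion",
    "financial_situation",
}


def classify_field(field_name: str) -> str:
    """Return 'sensitive', 'personal', or 'unclassified' for a field name."""
    if field_name in SENSITIVE_FIELDS:
        return "sensitive"
    if field_name in PERSONAL_FIELDS:
        return "personal"
    return "unclassified"


# A made-up profile, used only to show the labels.
profile = {
    "name": "Amina",
    "email": "amina@example.com",
    "health_condition": "asthma",
    "favourite_colour": "green",
}

labels = {field: classify_field(field) for field in profile}
print(labels)
```

A real system would need a much fuller list, but the idea is the same: label each piece of data first, then give the sensitive ones extra protection.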
What Responsible AI Companies Should Do
Good AI products are designed with privacy in mind from the start — a principle called privacy by design. This means they collect only what they need, store it safely, tell you clearly what they are doing, and give you control over your data. When a platform gives you options to delete your history, opt out of training, or download your data, those are signs of a more responsible product.
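The three user controls mentioned above (delete your history, opt out of training, download your data) can be sketched as a small Python class. This is an illustration under assumed names, not how any particular platform works; `UserDataStore` and its methods are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class UserDataStore:
    """Hypothetical store for one user's chat history and privacy choices."""
    allow_training: bool = True  # assumed default for this sketch
    history: List[str] = field(default_factory=list)

    def record(self, message: str) -> None:
        self.history.append(message)

    def opt_out_of_training(self) -> None:
        self.allow_training = False

    def training_examples(self) -> List[str]:
        # Messages are offered for training only if the user allows it.
        return list(self.history) if self.allow_training else []

    def export(self) -> Dict[str, object]:
        # "Download your data": return a copy of everything held.
        return {"allow_training": self.allow_training,
                "history": list(self.history)}

    def delete_history(self) -> None:
        self.history.clear()


store = UserDataStore()
store.record("Find data analyst jobs in Nairobi")
store.opt_out_of_training()
print(store.training_examples())  # empty list after opting out
print(store.export())
store.delete_history()
print(store.history)              # empty list after deletion
```

When you explore a platform's settings, look for the equivalents of these three actions.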
In the next lesson, you will learn about Kenya's Data Protection Act, 2019, the law that gives you specific rights over your personal data and holds organisations accountable for how they handle it.