Traditional edtech collected
structured data: names, grades, attendance records, demographic information.
AI-powered tools collect all of that, and they also collect the content of
student interactions: the essays students draft, the questions they ask, the
problems they work through, and in some tools, behavioral signals like how long
a student pauses before answering or which paths through a lesson they take.
The Future of Privacy Forum, a
nonprofit that advises schools on data privacy, identified a critical question
that most vendor reviews miss: will the AI tool use student inputs to improve
the underlying model?2 Many edtech products embed AI through a third-party API and have terms stating
that student data won't be used for model training. Others are less specific.
The distinction matters in practice: if a student's
writing sample is used to train a language model, that information may persist
in the model indefinitely, and there is currently no reliable technical
mechanism to fully remove it later.
This is not hypothetical. It is
a documented gap in how many schools evaluate new tools, and it is the kind of
question that does not appear in a standard app vetting checklist designed
before generative AI existed.
Two federal laws govern most of
what schools are required to do. FERPA, the Family Educational Rights and
Privacy Act, protects personally identifiable information in student education
records and restricts who can access or disclose it. COPPA, the Children's
Online Privacy Protection Act, regulates how commercial platforms collect data
from children under 13. Its implementing rule was amended in January 2025 to
shift the default from opt-out to opt-in consent: vendors can no longer assume
permission for advertising-related data use.3
Both laws are necessary and
neither is sufficient. FERPA was written in 1974 and does not address model
training, behavioral data, or the distinction between a tool that processes
student data and one that ingests it into a system that learns from it. COPPA
applies only to children under 13, so it does not cover high school students. And
beyond these federal frameworks, schools must also track the patchwork of state
law: as of 2025, there are more than 130 state student privacy laws across 43
states.4
The legal framework is
complex, it was not designed for AI, and it is not keeping pace with how
quickly AI tools are entering classrooms. That does not make compliance
impossible. It does mean that a district relying only on standard edtech
vetting processes, without additional scrutiny specific to AI, is likely
leaving questions unanswered.
Before approving any AI tool
for student use, there are four questions worth putting directly to the vendor
in writing.
First: what student data does
the tool collect, and where is it stored? The answer should be specific. Vague
references to 'usage data' or 'interaction logs' are not sufficient.
Second: will student data or
student-generated content be used to train or improve the AI model? If yes, ask
how the data is de-identified, who has access to it, and whether it can be
deleted on request.
Third: does the vendor have a
signed Data Processing Agreement ready to provide? A DPA is the mechanism by
which vendors contractually commit to FERPA-compliant data handling. A vendor
who hesitates on this question is a vendor worth reconsidering.
Fourth: what is the data
retention policy? Students whose records persist in vendor systems long after
they've graduated remain exposed to future breaches and misuse. The PowerSchool
victims include students who enrolled years before the December 2024 breach.
The question of how long data is held is not a formality.
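
For districts that track vetting in a spreadsheet or script, the four questions above could be recorded in a structured form so that missing or vague answers stand out. The following is a minimal sketch, not a compliance tool: the `VendorReview` record, its field names, and the `open_issues` helper are all hypothetical, and the list of vague phrases is taken only from the two examples given under the first question.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VendorReview:
    """Hypothetical record of one vendor's written answers to the four questions."""
    tool_name: str
    data_collected: List[str] = field(default_factory=list)  # Q1: specific data elements
    storage_location: str = ""                                # Q1: where the data is stored
    trains_on_student_data: Optional[bool] = None             # Q2: None means no clear answer
    dpa_signed: bool = False                                  # Q3: signed DPA on file
    retention_days: Optional[int] = None                      # Q4: None means no stated limit


# Phrases the text describes as too vague to accept for question one.
VAGUE_TERMS = {"usage data", "interaction logs"}


def open_issues(review: VendorReview) -> List[str]:
    """Return the questions that still need a better answer from the vendor."""
    issues = []
    if not review.data_collected or not review.storage_location:
        issues.append("Q1: data collected or storage location not specified")
    vague = [d for d in review.data_collected if d.lower() in VAGUE_TERMS]
    if vague:
        issues.append("Q1: vague description: " + ", ".join(vague))
    if review.trains_on_student_data is None:
        issues.append("Q2: no clear answer on model training")
    elif review.trains_on_student_data:
        issues.append("Q2: used for training; ask about de-identification, access, deletion")
    if not review.dpa_signed:
        issues.append("Q3: no signed Data Processing Agreement")
    if review.retention_days is None:
        issues.append("Q4: no stated retention period")
    return issues


if __name__ == "__main__":
    # Illustrative, made-up vendor response.
    draft = VendorReview(tool_name="ExampleTutor", data_collected=["usage data"])
    for issue in open_issues(draft):
        print(issue)
```

An empty list from `open_issues` means only that each question has an answer on file; whether those answers are acceptable remains a judgment for the district.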
The school districts that managed the PowerSchool breach best were the ones that already knew what data they had stored with that vendor, in what form, and under what contractual terms. That knowledge did not prevent the breach. It determined how quickly they could respond, how accurately they could notify families, and how clearly they could answer the questions parents and students were asking.
The same preparedness question applies to every AI tool a district is considering right now. Before the next tool is approved, it is worth asking: if this vendor's systems were compromised tomorrow, what data would be at risk, and would we know?