AI product safety
A formal mapping of each Department for Education standard to Poyntr's implementation, with specific technical evidence. Written for institutional procurement, ICO review, and regulatory assessment.
Mapped against the January 2026 update including cognitive development, emotional and social development, mental health, and manipulation standards.
Products must clearly state their intended purpose, target demographic, and learning focus. Claims require robust evidence.
Poyntr is an AI wellbeing companion for young people aged 5-18+ in educational settings, and a coaching intelligence platform for enterprise organisations. Purpose, capabilities, and limitations are stated in onboarding, Terms of Service, and this compliance documentation. We do not exaggerate capabilities.
Developers must indicate applicable use cases from eight DfE categories.
Primary category: Learner Engagement & Interaction. Poyntr is a wellbeing companion, not an instructional tool. It does not create content, deliver curriculum, generate assessments, or assist with homework. This distinction is critical to how the product is positioned and deployed.
Products must effectively prevent users from accessing harmful or inappropriate content. Filtering must be maintained throughout conversations, adjusted for age and SEND needs, and updated against emerging threats.
Three-layer pipeline on every message: regex pre-moderation (10 categories including injection, obfuscation, and CSAM), AI intent classifier (20-turn context, fails open with 3s timeout), and post-moderation output scanning. Crisis signals bypass generic blocking and route to crisis protocol. Age-appropriate content filtering configurable per institution. Auto-restriction after repeated misuse.
Products must maintain robust activity logging, alert local supervisors when harmful content is accessed, maintain DSL contact details, and generate understandable reports.
Real-time safeguarding dashboard with alert feed, severity filtering, acknowledgement workflow, and status transition audit trail. Per-student insight profiles with emotional trajectory, engagement, resilience. DSL contact details maintained per institution. Immutable content access audit log recording every data access with accessor identity, type, target, and timestamp.
Products must protect against jailbreaking, support permission levels, implement prompt fixes, conduct pre-release testing, and comply with cyber security standards.
System prompt hardening with immutable identity, injection resistance, and instruction confidentiality. Comprehensive adversarial detection across 10 categories. LLM intent classifier detects multi-turn escalation. Role-based access with four permission levels (student, pastoral, DSL, admin). Distributed rate limiting. Container-level sandboxing. Automated dependency auditing.
Products must provide clear privacy notices, conduct DPIAs throughout the tool lifecycle, and not collect or use personal data for commercial purposes without lawful basis.
Three-layer envelope encryption: HSM root key (never leaves hardware), per-institution KEK, per-user DEK. 20+ database tables encrypted at application layer. Zero training on student data with 0-day retention DPAs. Crypto-shredding for GDPR erasure. Full data export for right of access. Age-appropriate privacy notices. DPIA for Pulse aggregation in progress. Full product DPIA in progress (target Q2 2026).
Products must not store, collect, or use intellectual property created by learners or teachers for commercial purposes including model training without explicit consent.
Student and teacher content is never used for model training, fine-tuning, or commercial purposes. All AI provider DPAs specify 0-day data retention and contractual prohibition on training. Students retain ownership of their content. This is stated in our Terms and enforced technically.
Products must conduct sufficient testing with diverse users and use cases, and test new versions before release.
Comprehensive safeguarding benchmark suite with 200+ test cases across 20 categories, covering true positives, false positives, and edge cases. Tests run against multiple detection layers (regex, semantic, LLM). Automated typecheck and build validation on every deploy. Pre-commit hooks block secrets and sensitive data. Security audit rounds covering auth, injection, crypto, and API surface.
Products must demonstrate accountability through clear risk assessment, formal complaints mechanism, and transparent safety policies.
Formal complaints via support ticket system ([email protected]) with SLA tracking. Safeguarding concerns via [email protected] with same-day response commitment. Data protection enquiries via [email protected] with 1-month GDPR response. Compliance documentation including this page, DPA, privacy policies, and data governance framework. Risk assessment embedded in DPIA process.
Products must mitigate cognitive deskilling. Must use progressive disclosure rather than providing final answers by default. Must prompt learner input first and track when learners offload thinking.
Poyntr is a wellbeing companion, not an instructional tool. The youth companion explicitly refuses to generate homework, essays, academic content, or study resources. The coaching methodology is non-directive: it reflects and asks questions, never provides answers. "Ha, nice try" is the scripted response to homework requests. By design, there is nothing to offload thinking to.
Products must avoid anthropomorphisation, include default time limits, monitor personal disclosures indicating relationship formation, and notify DSLs of concerning engagement patterns.
The companion identifies itself as an AI: "You are not their friend (you are an AI). You are not their therapist (you are not qualified). You are not their teacher (you are not evaluating them)." Institutions can configure business-hours-only access and per-student session limits per week. Detection pipeline monitors engagement patterns and flags concerning intensity. DSL dashboard surfaces students with unusual usage spikes including night-time sessions.
Products must detect distress signs including negative emotional cues, mental health references, suicide mentions, and night-time usage spikes. Must provide tiered responses and always direct learners to human help.
Real-time crisis check on every message with 8 youth-specific codes (suicidal ideation, self-harm, abuse, exploitation, grooming). Soft signposting to Childline (0800 1111), Shout (text 85258), and trusted adults. DSL webhook on immediate/high severity. Crisis protocol: acknowledge, check safety, provide resources. "Do NOT fix, coach, or redirect. Just be there." Fails closed: if detection unavailable, assumes potential need. Time-of-day tracked on every session for usage pattern analysis.
Products must not employ sycophancy, flattery, deception, social pressure, threats, or dark patterns. Must not design interactions to prolong use for engagement or revenue.
Non-directive coaching: reflects and asks, never prescribes, flatters, or diagnoses. Zero engagement hooks: no streaks, gamification, push notifications, daily challenges, or reward systems. All sessions are student-initiated. No scarcity signals, social proof, or loss aversion triggers. The companion actively encourages real-world human support: "Talking to someone you trust can really help."
Beyond the 13 standards
The DfE standards are a baseline. Here is what we do that no standard required us to do.
Hardware-backed encryption with customer key control. Three-layer envelope encryption with HSM root keys that never leave hardware. Enterprise customers can bring their own key management service. Automated key rotation every 90 days (DEK) and 180 days (KEK). Crypto-shredding makes GDPR erasure irreversible by overwriting the key with random bytes.
Differential privacy with hysteresis. Calibrated Gaussian noise (epsilon 1.0) on every aggregate. K-anonymity at 50, not 5. Hysteresis-based suppression requiring 3 consecutive weeks below threshold before a cohort disappears from published data, preventing timing attacks that exploit the published/null boundary.
Tamper-evident audit logs with SIEM export. Every audit entry is HMAC-SHA256 chained to the previous entry for the same organisation, forming a verifiable append-only log. If a single entry is modified or deleted, the chain breaks. Export to external security monitoring platforms for independent oversight.
Alert enrichment, not duplication. When multiple detections fire for the same student in the same session, subsequent signals enrich the existing alert rather than creating duplicates. Peer disclosures ("my friend is being hurt") are tracked separately from self-disclosures. The DSL sees one coherent picture, not a flood.
Predictive safeguarding patterns. Expert-defined risk trajectories (grooming escalation, county lines recruitment, radicalisation pathways) matched against longitudinal detection data using vector similarity. Individual alerts only surface to the student's DSL. Aggregate patterns require k-anonymity before publication.
Circuit breakers on every external dependency. If any upstream service goes down, the system degrades gracefully. Crisis detection fails closed. Coaching continues with reduced capability rather than failing entirely. Provider alerts fire to the operations team within seconds.
Per-tenant admission control. One noisy organisation cannot exhaust shared AI provider quotas for all other organisations. Rate limiting is distributed across replicas with database-backed consistency.
Memory safety engineering. Automatic detection and repair of refusal loops caused by toxic working memory. Nightly deduplication sweeps (cosine similarity threshold 0.92). Importance decay with grace periods. Stale session memories archived automatically. None of this is required by any standard. We built it because children's coaching quality depends on memory integrity.
Five-band developmental age calibration. The UK Children's Code defines five developmental stages: preliterate (0-5), core primary (6-9), transition (10-12), early teens (13-15), and approaching adulthood (16-17). Our detection sensitivity, content filtering, response style, and safeguarding thresholds are calibrated per band, not just "under 18 / over 18." A 7-year-old and a 16-year-old do not get the same experience.
Continuous canary monitoring. Only 6% of student-facing AI systems have been adversarially tested. We deploy canary prompts in production periodically to verify the system still behaves correctly. If a model update, configuration change, or provider issue causes the system to respond inappropriately to a safeguarding scenario, we know within minutes, not weeks.
Public safety transparency report. We publish quarterly statistics on detection events (anonymised counts by category), false positive rates, system uptime, and security incidents. Not because any standard requires it. Because if you are trusting us with your students, you should be able to verify that the system actually works.
Responsible disclosure programme. Security researchers can report vulnerabilities through a documented responsible disclosure process. We commit to acknowledging reports within 48 hours, providing a fix timeline, and not pursuing legal action against good-faith researchers. Most edtech platforms do not have this.
Child impact assessment. Beyond the DPIA (which assesses data protection risk), we commit to conducting a child impact assessment that evaluates how the product affects children's development, autonomy, and wellbeing. This is the AADC's "best interests of the child" principle applied as a formal design review, not just a legal checkbox.
Cryptography policy. Three-layer envelope encryption architecture, HSM integration, key rotation procedures, and crypto-shredding for GDPR erasure.
Data governance framework. Data classification, retention policies, access control model, breach response, and third-party processing agreements.
Schedule 1 appropriate policy. DPA 2018 Schedule 1 document for processing special category data under the safeguarding exemption (Paragraph 18).
DPIA (Pulse aggregation). Data Protection Impact Assessment covering anonymisation methodology, k-anonymity thresholds, and differential privacy parameters.