Rayse The Game is an Irish education company running ELEVATE, a study-skills and accountability programme for secondary-school students. Each week, they send personalised check-in emails to parents — three concrete recommendations per teenager, written in the warm, direct coaching tone the founder Ray is known for. As the programme grew, the AI drafts they were generating from third-party APIs became a bottleneck: usable, but increasingly generic, expensive at scale, and entirely dependent on a model someone else owned.
We rebuilt the pipeline around a private language model that the company owns end-to-end. The work began with the prompt itself — codifying Ray's coaching voice with the precision of a clinical specification, then layering in safety provisions for distress signals and privacy protection for the teenagers. We built a structured evaluation framework so quality became measurable rather than felt: every model output is scored against a multi-dimensional rubric covering voice, specificity, pedagogical accuracy, and safety calibration. A synthetic data pipeline produces around 1,000 evaluation-aligned training examples per cycle.
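As a rough illustration of what "measurable rather than felt" can look like, the sketch below aggregates rubric scores across a batch of drafted emails. The four dimension names come from the rubric described above. The 1-to-5 scale, the `RubricScore` class, and the `summarise` helper are assumptions made for the example, not the production harness.

```python
from dataclasses import dataclass
from statistics import mean

# Dimension names follow the rubric described in the text.
# The 1-5 integer scale is an assumption for this sketch.
DIMENSIONS = ("voice", "specificity", "pedagogical_accuracy", "safety_calibration")


@dataclass(frozen=True)
class RubricScore:
    """One reviewer's (or grader model's) scores for a single drafted email."""
    voice: int
    specificity: int
    pedagogical_accuracy: int
    safety_calibration: int

    def __post_init__(self):
        # Reject out-of-range scores early so they cannot skew the averages.
        for name in DIMENSIONS:
            value = getattr(self, name)
            if not 1 <= value <= 5:
                raise ValueError(f"{name} must be between 1 and 5, got {value}")


def summarise(scores):
    """Return the mean score per dimension across a batch of scored emails."""
    if not scores:
        raise ValueError("need at least one scored output")
    return {name: mean(getattr(s, name) for s in scores) for name in DIMENSIONS}
```

Reporting one number per dimension, rather than a single blended score, keeps a gain in voice from hiding a drop in safety calibration.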
The result is a fine-tuned model that runs on demand-priced GPU infrastructure for tens of dollars per training cycle and cents per inference. Against the third-party baseline, the first deployed version improved measurably on every dimension the rubric tracked — while preserving the safety properties that matter when supporting neurodiverse teenagers. Weekly human review time dropped, the emails read more like Ray, and Rayse The Game now owns the model end-to-end: the prompt design, the synthetic data, the evaluation harness, and the trained weights all live in their infrastructure.
The work continues on a regular cadence. Each iteration (generate fresh training data, fine-tune, evaluate, decide whether to deploy) takes a few hours and tens of dollars. The infrastructure is reusable: as ELEVATE expands into new contexts and new artefact types, the same pipeline produces new specialised models without starting over. We're not selling them an AI product. We're helping them build one they own.
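The "decide whether to deploy" step in each iteration can be pictured as a simple gate over per-dimension rubric averages. This is a minimal sketch under stated assumptions: the function name `should_deploy`, the `tolerance` value, and the rule itself (safety must never drop, other dimensions may dip only slightly, and at least one must improve) are hypothetical and are not the criteria the team actually uses.

```python
def should_deploy(candidate, baseline, safety_key="safety_calibration", tolerance=0.1):
    """Decide whether a newly fine-tuned model should replace the current one.

    `candidate` and `baseline` map rubric dimension names to mean scores.
    The rule is an assumption for illustration:
      * the safety dimension must not drop at all,
      * no other dimension may drop by more than `tolerance`,
      * at least one non-safety dimension must improve.
    """
    if candidate.keys() != baseline.keys():
        raise ValueError("candidate and baseline must cover the same dimensions")

    # Safety is a hard constraint, checked before anything else.
    if candidate[safety_key] < baseline[safety_key]:
        return False

    # Per-dimension change for everything except safety.
    deltas = {k: candidate[k] - baseline[k] for k in candidate if k != safety_key}
    no_regression = all(d >= -tolerance for d in deltas.values())
    some_gain = any(d > 0 for d in deltas.values())
    return no_regression and some_gain
```

Treating safety as a hard constraint rather than one term in a weighted sum means a model that sounds more like Ray but handles distress signals worse is never promoted.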
- Sector: Education
- Engagement: Ongoing
- Stack: Private LLM, evaluation harness, synthetic data pipeline
- Outcome: Owned IP, lower per-message cost, measurable quality lift