AI Wrapped

Methodology

Every number links to a published study. We show ranges because these are estimates — not stopwatch measurements.

How time spent is estimated

We sum gaps between your messages when gaps are under 30 minutes, cap each session at 2 hours, and assign a 1-minute minimum per conversation. This reflects estimated active time — not official session logs from AI providers.
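
As a rough illustration, the rule above can be sketched in Python. This is a minimal sketch, not the production code: the function name `estimate_active_time` and the constant names are hypothetical. It assumes that a gap of 30 minutes or more ends a session, and that the 2-hour cap applies to each session separately.

```python
from datetime import datetime, timedelta

# Thresholds taken from the description above; the names are hypothetical.
GAP_LIMIT = timedelta(minutes=30)
SESSION_CAP = timedelta(hours=2)
MIN_PER_CONVERSATION = timedelta(minutes=1)

def estimate_active_time(timestamps):
    """Estimate active time for one conversation from its message timestamps."""
    times = sorted(timestamps)
    total = timedelta()
    session = timedelta()
    for prev, curr in zip(times, times[1:]):
        gap = curr - prev
        if gap < GAP_LIMIT:
            # Short gap: count it as active time in the current session.
            session += gap
        else:
            # Assumption: a long gap closes the session and is not counted.
            total += min(session, SESSION_CAP)
            session = timedelta()
    total += min(session, SESSION_CAP)
    # Every conversation counts for at least one minute.
    return max(total, MIN_PER_CONVERSATION)

start = datetime(2025, 1, 1, 10, 0)
messages = [start + timedelta(minutes=m) for m in (0, 5, 20, 90, 100)]
print(estimate_active_time(messages))
```

In this example the 70-minute gap between the third and fourth messages is dropped, so only the 5, 15, and 10 minute gaps count toward the total.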

How time saved is estimated

Each conversation is classified into a task type (writing, coding, email, etc.). We apply a time-savings rate drawn from peer-reviewed research (see the table below), scaled by assistant output length and your skill level from the quiz.

Category       Savings  Source
writing        40%      Noy & Zhang, Science 2023
email          31%      Jaffe et al., Microsoft 2024
coding         26%      Cui et al., Management Science 2026
support        15%      Brynjolfsson/Li/Raymond, QJE 2025
analysis       25%      Dell'Acqua et al., HBS/BCG 2023
translation    30%      Macken et al., EC DGT 2020
research       40%      Noy & Zhang 2023 (writing proxy)
meeting notes  67%      Cisco Webex AI 2024
brainstorm     30%      Dell'Acqua et al. 2023 (centaur)
image gen      80%      Industry estimate (flagged)
learning       25%      Weighted average
other          15%      Conservative default

Calibration quiz

Before your wrap, eight questions calibrate estimates to your profile. They adjust a base skill multiplier with modifiers for primary use, replacement vs augmentation, verification habits, work context, and coding-on-mature-codebase (METR 2025).

  • Replacement ratio: mostly replacing work you already did (×1.0) vs mostly new tasks you wouldn't otherwise attempt (×0.55).
  • Verification: always reviewing output carefully (×0.9, because review adds time back) vs rarely checking (×1.03, with a wider confidence band).
  • Primary use alignment: savings get a boost when a conversation's category matches your stated primary use.
  • Mature codebase: extra downward adjustment on coding conversations when you work on familiar repos as an expert.
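
To make the modifiers concrete, here is a minimal sketch covering only the two whose values are stated above. The dictionary keys and the function name `quiz_modifier` are hypothetical. Combining the modifiers by multiplication is an assumption, and the primary-use boost and mature-codebase adjustment are left out because their sizes are not given here.

```python
# Modifier values from the list above; the answer labels are hypothetical.
REPLACEMENT_MODIFIER = {
    "mostly_replacing": 1.0,   # AI replaces work you already did
    "mostly_new_tasks": 0.55,  # tasks you would not otherwise attempt
}
VERIFICATION_MODIFIER = {
    "always": 0.9,   # careful review adds time back
    "rarely": 1.03,  # less review time, but a wider confidence band
}

def quiz_modifier(replacement, verification):
    """Combine the two quiz modifiers (assumed to be multiplicative)."""
    return REPLACEMENT_MODIFIER[replacement] * VERIFICATION_MODIFIER[verification]

# Someone who mostly attempts new tasks and always reviews the output.
print(round(quiz_modifier("mostly_new_tasks", "always"), 3))
```

Under these assumptions, this profile ends up with a little under half of the base savings estimate.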

Skill multipliers

  • novice: ×1.5
  • intermediate: ×1
  • expert: ×0.6
  • expert mature_code: ×0.4
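
The sketch below shows one way the category savings rates and skill multipliers could combine. It is an illustration under stated assumptions, not the actual formula: it assumes the savings rate is applied directly to estimated active minutes, and it omits the scaling by assistant output length because that rule is not specified here. The function name `estimate_minutes_saved` and the dictionary names are hypothetical.

```python
# Savings rates from the category table, as fractions.
SAVINGS_RATE = {
    "writing": 0.40, "email": 0.31, "coding": 0.26, "support": 0.15,
    "analysis": 0.25, "translation": 0.30, "research": 0.40,
    "meeting notes": 0.67, "brainstorm": 0.30, "image gen": 0.80,
    "learning": 0.25, "other": 0.15,
}

# Skill multipliers from the list above.
SKILL_MULTIPLIER = {
    "novice": 1.5,
    "intermediate": 1.0,
    "expert": 0.6,
    "expert mature_code": 0.4,  # described as applying to coding conversations
}

def estimate_minutes_saved(category, active_minutes, skill):
    """Assumed formula: active minutes x category rate x skill multiplier."""
    # Unknown categories fall back to the conservative "other" rate.
    rate = SAVINGS_RATE.get(category, SAVINGS_RATE["other"])
    return active_minutes * rate * SKILL_MULTIPLIER[skill]

# An expert with 60 estimated active minutes of coding conversations.
print(round(estimate_minutes_saved("coding", 60, "expert"), 2))
```

Falling back to the "other" rate for unrecognized categories mirrors the conservative default in the table.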

Counter-evidence we disclose

  • METR 2025: Experienced open-source developers were ~19% slower on their own mature repos with AI, while believing they were faster. Expert + mature-code personas use a ×0.4 multiplier.
  • Dell'Acqua jagged frontier: On a task outside AI capability, consultants using AI were 19 percentage points less likely to produce a correct solution. This kind of harm is invisible in exports.
  • Perceived vs measured: Self-reported time savings often exceed measured savings (Jaffe et al., Microsoft 2024).

Platform data gaps

  • Claude: No per-message model in exports.
  • Gemini: Flat activity log; threads are reconstructed heuristically.
  • Grok: No model variant per message.
  • ChatGPT: No token counts; o-series reasoning time not visible.