Methodology

Most numbers link to a published study; the exceptions are labeled in the table below. We show ranges because these are estimates, not stopwatch measurements.

How time spent is estimated

We sum the gaps between your messages, counting only gaps under 30 minutes, then cap each session at 2 hours and assign each conversation a minimum of 1 minute. The result is an estimate of active time, not an official session log from an AI provider.
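The rule above can be sketched roughly as follows. This is a minimal illustration, not the production code: the function name `estimate_active_time` and the constants are hypothetical, and it assumes that a gap of 30 minutes or more ends one session and starts the next within the same conversation.

```python
from datetime import datetime, timedelta

# Thresholds taken from the text; the names are illustrative.
GAP_LIMIT = timedelta(minutes=30)            # gaps this long or longer are not counted
SESSION_CAP = timedelta(hours=2)             # maximum credited time per session
MIN_PER_CONVERSATION = timedelta(minutes=1)  # floor for any conversation


def estimate_active_time(timestamps: list[datetime]) -> timedelta:
    """Estimate active time for one conversation from its message timestamps."""
    ordered = sorted(timestamps)
    total = timedelta(0)
    session = timedelta(0)
    for prev, curr in zip(ordered, ordered[1:]):
        gap = curr - prev
        if gap < GAP_LIMIT:
            # Short gap: treat it as active time in the current session.
            session += gap
        else:
            # Assumption: a long gap closes the session; apply the cap and reset.
            total += min(session, SESSION_CAP)
            session = timedelta(0)
    total += min(session, SESSION_CAP)
    # Every conversation counts for at least one minute.
    return max(total, MIN_PER_CONVERSATION)
```

For example, messages at 10:00, 10:05, 10:20, 11:30 and 11:40 would count the 5-, 15- and 10-minute gaps and skip the 70-minute gap, for 30 minutes in total. A conversation with a single message would receive the 1-minute minimum.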

How time saved is estimated

Each conversation is classified into a task type (writing, coding, email, etc.). We apply that category's time-savings rate from the research listed below, scaled by the length of the assistant's output and by your skill level from the quiz.

Category	Savings	Source
writing	40%	Noy & Zhang, Science 2023
email	31%	Jaffe et al., Microsoft 2024
coding	26%	Cui et al., Management Science 2026
support	15%	Brynjolfsson/Li/Raymond, QJE 2025
analysis	25%	Dell'Acqua et al., HBS/BCG 2023
translation	30%	Macken et al., EC DGT 2020
research	40%	Noy & Zhang 2023 (writing proxy)
meeting notes	67%	Cisco Webex AI 2024
brainstorm	30%	Dell'Acqua et al. 2023 (centaur)
image gen	80%	Industry estimate — flagged
learning	25%	Weighted average
other	15%	Conservative default
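As a rough sketch of how the table might be applied, the snippet below looks up a category's rate and scales a baseline by it. The exact formula is not stated here, so treat this as one possible reading: `baseline_minutes` (the time the task is assumed to take without AI, which in practice would be derived from output length) and the function names are hypothetical.

```python
# Savings rates copied from the table above, expressed as fractions.
SAVINGS_RATE = {
    "writing": 0.40,
    "email": 0.31,
    "coding": 0.26,
    "support": 0.15,
    "analysis": 0.25,
    "translation": 0.30,
    "research": 0.40,
    "meeting notes": 0.67,
    "brainstorm": 0.30,
    "image gen": 0.80,
    "learning": 0.25,
    "other": 0.15,
}


def savings_rate(category: str) -> float:
    """Return the category's rate, falling back to the conservative 'other' default."""
    return SAVINGS_RATE.get(category.strip().lower(), SAVINGS_RATE["other"])


def minutes_saved(category: str, baseline_minutes: float,
                  skill_multiplier: float = 1.0) -> float:
    """Assumed form: baseline time x category rate x skill multiplier."""
    return baseline_minutes * savings_rate(category) * skill_multiplier
```

Under these assumptions, a coding conversation with a 60-minute baseline and an intermediate user (multiplier 1.0) would be credited with about 15.6 minutes saved. An unrecognized category falls back to the 15% default.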

Calibration quiz

Before your wrap, eight questions calibrate estimates to your profile. They adjust a base skill multiplier with modifiers for primary use, replacement vs augmentation, verification habits, work context, and coding-on-mature-codebase (METR 2025).

Replacement ratio: mostly replacing existing work (×1.0) vs mostly new tasks you wouldn't otherwise attempt (×0.55).
Verification: always reviewing carefully adds time back (×0.9) vs rarely checking (×1.03, with a wider confidence band).
Primary use alignment: savings are boosted when a conversation's category matches your stated primary use.
Mature codebase: extra downward adjustment on coding conversations when you work on familiar repos as an expert.

Skill multipliers

novice: ×1.5
intermediate: ×1.0
expert: ×0.6
expert + mature codebase: ×0.4
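One way the skill multiplier and the quiz modifiers could combine is sketched below. It assumes the modifiers are multiplied together, which the text implies but does not state outright. The answer keys (`"mostly_replacing"`, `"always"`, and so on) and the function name are hypothetical, only the two endpoint answers given above are modeled, and the primary-use boost is left out because its size is not specified.

```python
# Multipliers copied from the lists above; dictionary keys are illustrative.
SKILL = {"novice": 1.5, "intermediate": 1.0, "expert": 0.6}
EXPERT_MATURE_CODE = 0.4
REPLACEMENT = {"mostly_replacing": 1.0, "mostly_new_tasks": 0.55}
VERIFICATION = {"always": 0.9, "rarely": 1.03}


def combined_multiplier(skill: str, replacement: str, verification: str,
                        category: str, mature_codebase: bool = False) -> float:
    """Combine the base skill multiplier with the quiz modifiers (assumed multiplicative)."""
    base = SKILL[skill]
    # The mature-codebase adjustment applies only to experts on coding conversations.
    if skill == "expert" and mature_codebase and category == "coding":
        base = EXPERT_MATURE_CODE
    return base * REPLACEMENT[replacement] * VERIFICATION[verification]
```

Under this reading, an expert on a mature codebase who mostly replaces existing work and always verifies would get 0.4 × 1.0 × 0.9 = 0.36 on coding conversations, and 0.6 × 1.0 × 0.9 = 0.54 on everything else.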

Counter-evidence we disclose

METR 2025: Experienced open-source developers were ~19% slower on their own mature repos with AI, while believing they were faster. Expert + mature-code personas therefore use a ×0.4 multiplier.
Dell'Acqua jagged frontier: On a task outside AI's capability, consultants using AI were 19 percentage points less likely to produce a correct answer. This harm is invisible in exports.
Perceived vs measured: Self-reported time savings often exceed measured savings (Jaffe et al., Microsoft 2024).

Platform data gaps

Claude: No per-message model in exports.
Gemini: Flat activity log; threads are reconstructed heuristically.
Grok: No model variant per message.
ChatGPT: No token counts; o-series reasoning time not visible.