Working Title: State of Phishing in the Rise of GenAI

Author: Scott Altiparmak

Status: Active collection (dataset frozen at v1)
Which phishing techniques are humans most likely to miss when linguistic quality is no longer a reliable detection signal?
Prior technique-level phishing detection research was built on datasets where linguistic quality varied across samples, which confounds technique with quality. By holding linguistic quality constant at AI-generation level and using identical difficulty distributions across all techniques, this study isolates technique as the variable of interest. No published baseline exists for technique-level human detection rates under this condition. This study establishes that baseline.
Publication target: Blog post on scottaltiparmak.com, "Which Phishing Techniques Fool You When AI Writes the Email"

Last updated: 2026-03-06

| Property | Value |
|---|---|
| Total cards | 1,000 |
| Phishing cards | 690 |
| Legitimate cards | 310 |
| Types | Email |
| Generation | Claude Haiku + Sonnet (documented prompt templates, 3 Haiku + 1 Sonnet per batch) |
All cards are AI-generated by design. There are no real-world phishing samples in the dataset. This is a deliberate research choice: it eliminates PII handling, sourcing/licensing complexity, and uncontrolled variation in linguistic quality across samples.
Six techniques, 115 cards each. Each technique is split across four difficulty tiers (35 cards each for easy, medium, and hard; 10 for extreme):
| Technique | Easy | Medium | Hard | Extreme | Total |
|-----------|------|--------|------|---------|-------|
| urgency | 35 | 35 | 35 | 10 | 115 |
| authority-impersonation | 35 | 35 | 35 | 10 | 115 |
| credential-harvest | 35 | 35 | 35 | 10 | 115 |
| hyper-personalization | 35 | 35 | 35 | 10 | 115 |
| pretexting | 35 | 35 | 35 | 10 | 115 |
| fluent-prose | 35 | 35 | 35 | 10 | 115 |
| Total | 210 | 210 | 210 | 60 | 690 |
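The totals in the table can be re-derived from the per-tier counts. The sketch below is only an arithmetic check; the constant and function names are hypothetical, and the legitimate-card count of 310 is taken from the dataset summary rather than from this table.

```typescript
// Hypothetical names; the counts come from the dataset summary and the table above.
const TECHNIQUES: string[] = [
  "urgency",
  "authority-impersonation",
  "credential-harvest",
  "hyper-personalization",
  "pretexting",
  "fluent-prose",
];

const CARDS_PER_TIER = { easy: 35, medium: 35, hard: 35, extreme: 10 };
const LEGITIMATE_CARDS = 310;

function datasetComposition() {
  // Cards per technique: 35 + 35 + 35 + 10
  const perTechnique =
    CARDS_PER_TIER.easy +
    CARDS_PER_TIER.medium +
    CARDS_PER_TIER.hard +
    CARDS_PER_TIER.extreme;
  // Phishing cards across all six techniques
  const phishingCards = perTechnique * TECHNIQUES.length;
  // Column totals for one standard tier and for the extreme tier
  const perStandardTier = CARDS_PER_TIER.easy * TECHNIQUES.length;
  const extremeTier = CARDS_PER_TIER.extreme * TECHNIQUES.length;
  return {
    perTechnique,
    phishingCards,
    perStandardTier,
    extremeTier,
    totalCards: phishingCards + LEGITIMATE_CARDS,
  };
}

console.log(datasetComposition());
```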
Technique definitions:
| Technique | Description |
|-----------|-------------|
| urgency | False time pressure or threat of account loss — the most traditional and recognisable phishing vector |
| authority-impersonation | Impersonates IT, management, government entity, or known brand with authority framing |
| credential-harvest | Explicit credential request or redirect to a login page — direct ask for passwords or access |
| hyper-personalization | Uses recipient name, role, company, or context convincingly — a primary GenAI differentiator |
| pretexting | Builds a false scenario before the ask — invoice dispute, ongoing thread, project context |
| fluent-prose | Polished natural language with no traditional tells — phish is embedded in entirely plausible correspondence |
Difficulty calibration:
| Category | Count |
|----------|-------|
| Transactional | 110 |
| Marketing | 100 |
| Workplace | 100 |
| Total | 310 |
Category definitions:
Legitimate cards are generated with the same linguistic quality standard as phishing cards. No intentional quality degradation.
All cards are generated using documented prompt templates stored in docs/prompts/. Each prompt template specifies:
Each card records: ai_model, ai_prompt_version, generation_date.
Cards are generated by scripts/generate-cards.ts using structured prompt templates per technique/category. Output is written directly to cards_staging with status = 'pending' and source_corpus = 'generated'.
Generation runs in batches: 20 cards per call, one technique/category per batch. Each batch is tracked in import_batches.
All generated cards are reviewed via the /admin review UI before approval. Reviewer (Scott) sees:
Reviewer actions:
No card enters cards_real without human review. This is the quality gate — rejecting cards that are low-quality, implausible, or do not represent the technique cleanly.
Once cards_real reaches 1,000 approved cards (690 phishing + 310 legitimate, balanced across techniques), the dataset is frozen as v1. Freeze recorded in dataset_versions.
cards_staging

Holds generated cards awaiting review. Fields used for generated content:

| Field | Used | Notes |
|-------|------|-------|
| id | ✓ | UUID primary key |
| import_batch_id | ✓ | FK to import_batches (generation batch) |
| source_corpus | ✓ | Always 'generated' |
| raw_from | ✓ | Generated sender address |
| raw_subject | ✓ | Generated subject line |
| raw_body | ✓ | Generated email/SMS body |
| inferred_type | ✓ | email / sms |
| is_phishing | ✓ | Set at generation |
| suggested_technique | ✓ | From generation prompt |
| suggested_difficulty | ✓ | From generation prompt |
| suggested_highlights | ✓ | Phrases to highlight in feedback |
| suggested_clues | ✓ | Analyst clues for feedback |
| suggested_explanation | ✓ | Why this is/isn't phishing |
| ai_provider | ✓ | openai / anthropic |
| ai_model | ✓ | e.g. gpt-4o |
| ai_preprocessing_version | ✓ | Prompt template version |
| status | ✓ | pending / approved / rejected |
| raw_email_hash | - | N/A for generated (no dedup needed) |
| email_headers_json | - | N/A |
| genai_detector_score | - | N/A (all cards are known AI-generated) |
| is_genai_suspected | - | N/A |
cards_real

Approved, curated live dataset:

| Field | Type | Notes |
|-------|------|-------|
| id | UUID | Primary key |
| staging_id | UUID | FK to cards_staging |
| card_id | TEXT | Unique game ID e.g. real-p-001 |
| type | TEXT | email / sms |
| is_phishing | BOOLEAN | |
| difficulty | TEXT | easy / medium / hard / extreme |
| secondary_technique | TEXT | Secondary phishing technique, if applicable (null for most cards) |
| from_address | TEXT | |
| subject | TEXT | |
| body | TEXT | |
| technique | TEXT | Primary technique |
| highlights | TEXT[] | Phrases to highlight in feedback |
| clues | TEXT[] | Analyst clues |
| explanation | TEXT | Why this is/isn't phishing |
| auth_status | TEXT | verified / unverified / fail (simulated SPF/DKIM/DMARC result) |
| reply_to | TEXT | Mismatched reply-to address (hard/extreme phishing only) |
| attachment_name | TEXT | Filename shown in ATCH row (when card references an attachment) |
| sent_at | TEXT | RFC 2822 timestamp; odd hours for phishing, business hours for legit |
| ai_model | TEXT | Which model generated the card |
| ai_preprocessing_version | TEXT | Prompt template version |
| dataset_version | TEXT | v1 |
| approved_at | TIMESTAMPTZ | |
answers

Every answer event. All game modes are recorded in this table; research analysis filters on game_mode:

| Field | Type | Notes |
|-------|------|-------|
| id | UUID | Primary key |
| session_id | UUID | Groups answers from same game |
| card_id | TEXT | |
| is_phishing | BOOLEAN | Ground truth |
| technique | TEXT | Primary technique (null for legit cards) |
| difficulty | TEXT | |
| type | TEXT | email / sms |
| user_answer | TEXT | phishing / legit |
| correct | BOOLEAN | |
| confidence | TEXT | guessing / likely / certain |
| time_from_render_ms | INT | Card shown → answer submitted |
| time_from_confidence_ms | INT | Confidence selected → answer submitted |
| confidence_selection_time_ms | INT | Card shown → confidence selected |
| scroll_depth_pct | SMALLINT | 0–100 |
| answer_method | TEXT | swipe / button |
| answer_ordinal | SMALLINT | Position in session (1–10) |
| streak_at_answer_time | SMALLINT | |
| correct_count_at_time | SMALLINT | |
| game_mode | TEXT | freeplay / daily / research / expert / preview |
| is_daily_challenge | BOOLEAN | |
| card_source | TEXT | generated / real |
| dataset_version | TEXT | v1 |
| is_genai_suspected | BOOLEAN | Card flagged as likely GenAI-generated |
| genai_confidence | TEXT | low / medium / high (null if not suspected) |
| grammar_quality | SMALLINT | 0–5 rating from generation metadata |
| prose_fluency | SMALLINT | 0–5 rating from generation metadata |
| personalization_level | SMALLINT | 0–5 rating from generation metadata |
| contextual_coherence | SMALLINT | 0–5 rating from generation metadata |
| secondary_technique | TEXT | Secondary phishing technique if applicable |
| player_id | UUID | FK to players table (pseudonymous) |
| headers_opened | BOOLEAN | Player opened the [HEADERS] panel |
| url_inspected | BOOLEAN | Player tapped a URL to inspect it |
| auth_status | TEXT | Card's SPF/DKIM/DMARC result (verified/unverified/fail) |
| has_reply_to | BOOLEAN | Card had a mismatched Reply-To address |
| has_url | BOOLEAN | Card body contained at least one URL |
| has_attachment | BOOLEAN | Card had an attachment name set |
| has_sent_at | BOOLEAN | Card had a sent timestamp (odd-hours signal available) |
| created_at | TIMESTAMPTZ | |
sessions

One row per game played:

| Field | Type | Notes |
|-------|------|-------|
| session_id | UUID | Primary key |
| game_mode | TEXT | freeplay / daily / research / expert / preview |
| is_daily_challenge | BOOLEAN | |
| started_at | TIMESTAMPTZ | |
| completed_at | TIMESTAMPTZ | Null if abandoned |
| cards_answered | SMALLINT | |
| final_score | INT | |
| final_rank | TEXT | |
| device_type | TEXT | mobile / tablet / desktop |
| viewport_width | SMALLINT | |
| viewport_height | SMALLINT | |
| referrer | TEXT | |
import_batches

Tracks each generation batch:

| Field | Type | Notes |
|-------|------|-------|
| batch_id | UUID | Primary key |
| source_corpus | TEXT | Always 'generated' for v1 |
| import_date | TIMESTAMPTZ | |
| raw_count | INT | Cards generated |
| processed_count | INT | Cards reviewed |
| approved_count | INT | Cards approved |
| rejected_count | INT | |
| notes | TEXT | Technique, difficulty, model, prompt version |
dataset_versions

Version registry:

| Field | Type | Notes |
|-------|------|-------|
| version | TEXT | v1, v2 etc. |
| locked_at | TIMESTAMPTZ | Null until frozen |
| total_cards | INT | |
| phishing_count | INT | |
| legit_count | INT | |
| description | TEXT | |
Research Mode requires a player account. Answers are linked to a pseudonymous player UUID via the player_id foreign key in the answers table. Email addresses are held only in Supabase Auth and are never stored in research tables — our own tables record only UUIDs, game mode, technique, correctness, confidence, and timing signals. A session UUID is generated at game start and persisted to the sessions table to group answers from the same round.
Per-player collection cap: Each player can contribute a maximum of 30 research answers (3 complete sessions of 10 cards each), enforced server-side. After reaching the cap, players are marked as research-graduated and gain access to Expert Mode. This cap prevents any single player from dominating the dataset and creates an incentive structure for completion. Answers beyond the cap are silently discarded at the API layer.
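A minimal sketch of how the cap described above could be applied server-side, assuming the API layer already knows how many research answers are stored for the player. The function and type names are hypothetical and are not taken from the project's codebase.

```typescript
// Cap from the text: 3 complete sessions of 10 cards each.
const RESEARCH_ANSWER_CAP = 30;

interface CapDecision {
  accepted: boolean;  // false means the answer is silently discarded
  graduated: boolean; // true once the player has reached the cap
}

// priorCount: research answers already stored for this player (assumed input).
function applyResearchCap(priorCount: number): CapDecision {
  if (priorCount >= RESEARCH_ANSWER_CAP) {
    // Over the cap: discard the answer, player is already research-graduated.
    return { accepted: false, graduated: true };
  }
  const newCount = priorCount + 1;
  return { accepted: true, graduated: newCount >= RESEARCH_ANSWER_CAP };
}

console.log(applyResearchCap(29)); // the 30th answer is accepted and graduates the player
```

Deciding from a stored count rather than from client state keeps the check on the server, which matches the enforcement point the text describes.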
Timing measurements:
- time_from_render_ms: card first render to answer submission
- time_from_confidence_ms: confidence selection to answer submission (pure decision deliberation)
- confidence_selection_time_ms: card render to confidence selection

Scroll depth: tracked via scrollTop ratio on the card body element. Records maximum scroll percentage reached before answering.
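The three timing fields and the scroll percentage can be derived from a handful of client-side values. The sketch below assumes millisecond timestamps captured at render, confidence selection, and submission, and it assumes the scroll ratio is scrollTop divided by the scrollable distance (scrollHeight minus clientHeight); the source only says "scrollTop ratio", so that denominator and all names here are illustrative.

```typescript
interface AnswerTimestamps {
  renderedAt: number;           // ms, card first rendered
  confidenceSelectedAt: number; // ms, confidence level chosen
  submittedAt: number;          // ms, answer submitted
}

function timingFields(t: AnswerTimestamps) {
  return {
    time_from_render_ms: t.submittedAt - t.renderedAt,
    time_from_confidence_ms: t.submittedAt - t.confidenceSelectedAt,
    confidence_selection_time_ms: t.confidenceSelectedAt - t.renderedAt,
  };
}

// Assumed definition: percentage of the scrollable distance covered, clamped to 0..100.
function scrollDepthPct(scrollTop: number, scrollHeight: number, clientHeight: number): number {
  const scrollable = scrollHeight - clientHeight;
  if (scrollable <= 0) return 100; // assumption: nothing to scroll means the whole body is visible
  const pct = Math.round((scrollTop / scrollable) * 100);
  return Math.min(100, Math.max(0, pct));
}

// Keep the maximum depth seen so far, as the text records the maximum before answering.
function updateMaxDepth(currentMax: number, latestPct: number): number {
  return Math.max(currentMax, latestPct);
}
```

Under these definitions time_from_render_ms equals confidence_selection_time_ms plus time_from_confidence_ms, which gives a cheap consistency check on stored rows.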
No PII in research tables. No IP storage. No behavioural tracking outside the game session.
Consent: Players are informed via the game UI that Research Mode answers contribute to anonymised security awareness research. Participation is voluntary and implicit in selecting Research Mode.
Collection target: ~1,000 research mode answers minimum before publishing. With random sampling from a 1,000-card dataset (115 phishing cards per technique), expected answers per technique per N total answers = N × (115/1000). Reaching 100 answers per technique in expectation requires approximately 870 total research answer events. A target of 1,000 provides a comfortable buffer. This provides a statistically meaningful sample for primary technique-level comparisons.
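The collection-target arithmetic can be reproduced in a few lines. This sketch assumes uniform random sampling over the 1,000-card pool, as stated above; the function names are hypothetical.

```typescript
const TOTAL_CARDS = 1000;
const CARDS_PER_TECHNIQUE = 115;

// Expected answers landing on one technique after `totalAnswers` research answers.
function expectedAnswersPerTechnique(totalAnswers: number): number {
  return (totalAnswers * CARDS_PER_TECHNIQUE) / TOTAL_CARDS;
}

// Smallest number of total answers whose expectation reaches the per-technique target.
function totalAnswersNeeded(targetPerTechnique: number): number {
  return Math.ceil((targetPerTechnique * TOTAL_CARDS) / CARDS_PER_TECHNIQUE);
}

console.log(totalAnswersNeeded(100));           // 870
console.log(expectedAnswersPerTechnique(1000)); // 115
```

These are expectations only; with random sampling the realised count per technique will vary around them, which is the reason for the buffer above 870.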
For each of the 6 techniques:
Expected finding direction: hyper-personalization and fluent-prose have higher bypass rates than urgency and credential-harvest (which have more traditional tells even at high quality).
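A hedged sketch of how a per-technique bypass rate could be computed from rows of the answers table. It assumes "bypass rate" means the share of phishing-card answers where the player answered legit; that definition and the function name are assumptions, while the field names follow the answers schema.

```typescript
interface AnswerRow {
  technique: string | null;          // null for legitimate cards
  is_phishing: boolean;              // ground truth
  user_answer: "phishing" | "legit"; // what the player chose
}

// Assumed definition: bypass rate = phishing cards judged "legit" / phishing cards answered.
function bypassRateByTechnique(rows: AnswerRow[]): Record<string, number> {
  const tally: Record<string, { missed: number; total: number }> = {};
  for (const row of rows) {
    if (!row.is_phishing || row.technique === null) continue; // skip legitimate cards
    const t = tally[row.technique] || { missed: 0, total: 0 };
    t.total += 1;
    if (row.user_answer === "legit") t.missed += 1;
    tally[row.technique] = t;
  }
  const rates: Record<string, number> = {};
  for (const technique of Object.keys(tally)) {
    const t = tally[technique];
    if (t) rates[technique] = t.missed / t.total;
  }
  return rates;
}
```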
Difficulty is distributed across techniques (35 easy/35 medium/35 hard/10 extreme per technique). Extreme is intentionally under-represented as it represents near-indistinguishable attacks unlikely to be detected by most players. Primary analysis uses all difficulties combined. Difficulty-stratified breakdown reported separately to confirm the technique effect is not an artifact of difficulty distribution.
Players can optionally self-report their professional background: other (general users), technical (technical, non-security), or infosec (security/cybersecurity professionals). Background is set on the player profile and linked to answers via player UUID.
Analysis questions:
Background is optional and a significant portion of players may not disclose it. This analysis is reported as a supplementary finding, not a primary result.
Six signals are available to players during gameplay. Three are passive (always visible), three are active (require deliberate interaction):
| Signal | Type | Behavioral field |
|--------|------|-----------------|
| Sender domain (FROM vs body) | Passive | — |
| Send time (SENT row) | Passive | — |
| Attachment name (ATCH row) | Passive | has_attachment on card |
| Authentication headers ([HEADERS]) | Active | headers_opened |
| Reply-To mismatch ([HEADERS]) | Active | headers_opened |
| URL destinations (URL inspector) | Active | url_inspected |
Active tool interactions are logged per answer. Analysis questions:
Players who seek out a retro phishing awareness game are likely more security-aware than the general population. Results should be interpreted as reflecting a security-aware population, not general users. This is a limitation but also produces conservative bypass rates — if even security-aware individuals miss certain techniques at elevated rates, the finding is stronger for the general population.
The terminal interface strips all visual design cues (logos, branding, CSS styling). Results reflect text-based linguistic phishing recognition, not full email client simulation.
This is appropriate for the research question. GenAI's primary advantage over traditional phishing is text quality, not visual design. Testing linguistic cues in isolation directly matches the research question.
The game provides immediate post-answer feedback including the technique label, difficulty, and forensic signal analysis. This creates a within-session and cross-session learning effect: players calibrate over a session and improve across sessions.
How this is controlled for:
answer_ordinal is logged for every answer (position 1–10 within the session). This allows isolation of naive answers (early ordinals) from calibrated answers (later ordinals). Primary analysis uses all ordinals combined. A sensitivity analysis using only ordinals 1–3 (first three cards of each session) tests whether the technique ranking is robust to the learning effect. If the technique ordering holds across both cuts, the finding is valid.
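One way to sketch the ordinal sensitivity check: rank techniques by miss rate on all answers, rank them again on ordinals 1 to 3 only, and compare the two orderings. The exact-match comparison and the names below are illustrative assumptions; a rank correlation would be a softer alternative, and ties are broken here by first appearance.

```typescript
interface OrdinalAnswer {
  technique: string;      // phishing-card answers only
  correct: boolean;
  answer_ordinal: number; // position in session, 1 to 10
}

// Techniques ordered from highest to lowest miss rate.
function rankByMissRate(rows: OrdinalAnswer[]): string[] {
  const tally: Record<string, { missed: number; total: number }> = {};
  for (const r of rows) {
    const t = tally[r.technique] || { missed: 0, total: 0 };
    t.total += 1;
    if (!r.correct) t.missed += 1;
    tally[r.technique] = t;
  }
  const rate = (k: string): number => {
    const t = tally[k];
    return t ? t.missed / t.total : 0;
  };
  return Object.keys(tally).sort((a, b) => rate(b) - rate(a));
}

// True when the early-ordinal cut yields the same technique ordering as all ordinals.
function rankingHoldsForEarlyOrdinals(rows: OrdinalAnswer[], maxOrdinal: number = 3): boolean {
  const all = rankByMissRate(rows);
  const early = rankByMissRate(rows.filter((r) => r.answer_ordinal <= maxOrdinal));
  return all.length === early.length && all.every((t, i) => t === early[i]);
}
```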
Secondary finding enabled: The learning structure is an opportunity, not only a limitation. Players who return for multiple sessions provide data on technique-specific trainability — which attacks remain hard even after repeated exposure to feedback. Techniques with persistently high bypass rates across return players represent harder-to-train threats. This is separately reportable and directly relevant to security awareness program design.
The dataset is entirely AI-generated by design. Two implications:
Both limitations are disclosed in the publication.
The same card can be served to multiple distinct sessions. There is no within-session repetition (each session draws a unique 10-card sample), but across sessions a given card may be seen by many different players. This is by design: repeated exposure across sessions provides the statistical volume needed for per-card and per-technique analysis. Cards are not removed from the pool after being seen. Answers are linked to a pseudonymous player_id, so returning players can be identified for cross-session analysis (e.g., trainability across sessions). The per-player cap of 30 answers (3 sessions) bounds the maximum contribution from any single player.
fluent-prose Confound

fluent-prose phishing cards are defined by polished natural language with no traditional tells. This technique partially overlaps with the GenAI baseline condition shared by all cards in this dataset: all cards are grammatically fluent by construction. As a result, fluent-prose cards may be harder to distinguish from legitimate cards not because of superior technique, but because the technique definition is closest to the baseline condition. This is a design confound that will be disclosed in the publication. The technique is retained in the dataset because it represents a real and distinct category of attack. Readers should interpret elevated bypass rates for fluent-prose as an upper bound that includes baseline noise.
Players who open the [HEADERS] panel and observe SPF/DKIM/DMARC: FAIL have a near-deterministic signal on easy and medium phishing cards, where authentication always fails by design. A player using headers as their primary detection heuristic will produce correct answers that are not attributable to technique recognition — their accuracy reflects forensic hygiene, not response to the technique content.
How this is controlled for in analysis:
- Restrict to headers_opened = false (answers made without opening headers) as the "content-only" detection signal. If the technique ranking is consistent across both cuts, the finding is robust to this confound.
- On hard and extreme cards, where auth_status may be verified (attacker registered their own domain with passing authentication), header inspection provides no correct-direction signal; the technique effect is cleanest in this difficulty stratum and should be reported separately.

The research deck is drawn by purely random sampling from all cards in cards_real: 10 cards per session with no stratification by technique or difficulty. Difficulty balance is guaranteed at the dataset level (35 easy, 35 medium, 35 hard, and 10 extreme cards per technique), not enforced at the session level. Over a sufficient number of sessions, expected exposure across techniques and difficulty tiers converges to the dataset proportions. Difficulty-stratified analysis is a planned secondary analysis that will confirm the technique effect is not an artifact of difficulty imbalance.
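The deck draw described above amounts to uniform sampling without replacement. The sketch below shows one way to do that; it is not the project's actual deck builder, and the function name and the injectable rng parameter are assumptions added for illustration and testing.

```typescript
// Draws `size` distinct items uniformly at random from `pool`.
// No stratification by technique or difficulty is applied.
function drawDeck<T>(pool: readonly T[], size: number = 10, rng: () => number = Math.random): T[] {
  if (size > pool.length) {
    throw new Error("pool is smaller than the requested deck size");
  }
  const remaining = pool.slice(); // copy so the card pool itself is never mutated
  const deck: T[] = [];
  while (deck.length < size) {
    const idx = Math.floor(rng() * remaining.length);
    // splice removes the chosen card, so it cannot repeat within the session
    deck.push(...remaining.splice(idx, 1));
  }
  return deck;
}

// Usage with hypothetical card IDs standing in for rows of cards_real.
const cardIds: string[] = [];
for (let i = 1; i <= 1000; i++) cardIds.push("card-" + i);
console.log(drawDeck(cardIds));
```

Because the pool is copied on every call, cards stay available to later sessions, consistent with cards not being removed from the pool after being seen.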
The answers table records answers from all game modes (research, freeplay, daily). Primary analysis uses only game_mode = 'research' answers from cards_real sourced cards. Non-research mode answers are excluded from all findings. The multi-mode table structure is a product decision (unified schema) and does not contaminate research data.
/intel): live aggregate findings, always current. Methodology note links to this document.

| Version | Date | Notes |
|---------|------|-------|
| 0.1 | 2026-03-01 | Initial methodology draft (real-world corpus plan) |
| 0.2 | 2026-03-01 | Full schema, GenAI classification methodology |
| 1.0 | 2026-03-01 | Pivot to all-generated dataset. New research question locked. 550 cards, 6 techniques. Methodology rewritten. |
| 1.1 | 2026-03-02 | Added auth_status, reply_to, attachment_name, sent_at card fields. Added behavioral tracking (headers_opened, url_inspected, has_reply_to, has_url, has_attachment). Added tool usage secondary analysis. Added training effect / learning section. Signal count corrected to 6. |
| 1.2 | 2026-03-02 | Added disclosures: repeated card exposure, fluent-prose confound, difficulty distribution during collection, answer pool scope. Server-side correct/technique verification added to answer collection pipeline. |
| 1.3 | 2026-03-03 | Added auth header shortcut limitation with sensitivity analysis plan. Updated difficulty distribution section: research deck now stratifies by difficulty tier within technique per session (random tier selection), not pure random sampling. ResearchIntro updated with explicit data collection disclosure. |
| 1.4 | 2026-03-04 | Removed SMS from dataset scope (email only). Added Extreme difficulty tier (15/15/15/15 per technique). Switched research deck from stratified to purely random sampling. Updated collection target to ~1,000 (math updated for random sampling). Updated data collection section: pseudonymous UUID model, corrected anonymity claims. Added secondary_technique to cards_real schema and approve pipeline. |
| 1.5 | 2026-03-04 | Scaled dataset from 550 to 1,000 cards. Phishing: 690 (6 × 35 easy/medium/hard + 6 × 10 extreme). Legit: 310 (110/100/100). Extreme capped at 10 per technique; sufficient for expert mode coverage without over-representing near-indistinguishable attacks. Collection target math updated. |
| 1.6 | 2026-03-06 | Status updated to active collection (dataset frozen at v1). Corrected session UUID claim: sessions ARE persisted to the sessions table. Corrected returning-player claim: player_id FK enables cross-session analysis. Added per-player collection cap (30 answers / 3 sessions, server-side enforced). Added missing answers table fields: player_id, card_source, is_genai_suspected, genai_confidence, grammar_quality, prose_fluency, personalization_level, contextual_coherence, secondary_technique, has_sent_at. Corrected game_mode values across schema tables. |