Phishing Practice Test File (CSV)
=================================

This is a synthetic dataset for building and testing machine-learning phishing defenses.
Each row contains simulated email and URL metadata plus a binary label.

label:
  0 = legitimate
  1 = phishing
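
A quick class-balance check is a useful first step before training. The sketch
below assumes only the label and email_id columns described in this file; the
function name count_labels and the in-memory sample rows are hypothetical and
stand in for the real CSV.

```python
import csv
import io
from collections import Counter

# Mapping taken from the label description above.
LABEL_NAMES = {0: "legitimate", 1: "phishing"}


def count_labels(csv_file) -> Counter:
    """Count rows per class name; raise on any label other than 0 or 1."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        label = int(row["label"])
        if label not in LABEL_NAMES:
            raise ValueError(
                f"unexpected label {label!r} for email_id {row.get('email_id')}"
            )
        counts[LABEL_NAMES[label]] += 1
    return counts


# Made-up rows for illustration only; for the real file, pass
# open(path, newline="", encoding="utf-8") instead.
sample = io.StringIO("email_id,label\nE1,0\nE2,1\nE3,0\n")
print(count_labels(sample))
```

If the counts are heavily skewed toward one class, consider stratified splits
when dividing the data into training and test sets.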

Columns (high-level):
- email_id: unique ID
- timestamp_utc: simulated send time (UTC)
- from_email, sender_domain: sender identity fields
- reply_to_mismatch: 1 if Reply-To differs from From (simulated)
- subject_text, body_text: text fields (safe, synthetic)
- brand_mentioned, department_theme: optional context fields
- num_links: number of links in the message
- primary_url: first URL extracted from the email
- displayed_url: simulated visible (anchor) URL text
- url_length: length of primary_url
- email_length: combined length of subject_text and body_text
- contains_urgent_words: 1 if urgency language appears
- domain_mismatch: 1 if URL host differs from sender_domain
- has_attachment, attachment_type: attachment indicator/type
- domain_age_days: simulated domain age in days (younger domains tend to be riskier)
- https_used: 1 if primary_url uses https
- ip_in_url: 1 if the URL uses a raw IP address
- spf_pass, dkim_pass, dmarc_pass: simulated email auth results
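
Several columns above (url_length, https_used, ip_in_url, domain_mismatch) are
described as functions of primary_url and sender_domain, so they can be
recomputed as a sanity check. The following is a minimal sketch, not the
generator's actual logic: the function name derive_url_features is hypothetical,
url_length is assumed to be a character count, and domain_mismatch is assumed to
be an exact, case-insensitive hostname comparison (the dataset may treat
subdomains differently).

```python
import ipaddress
from urllib.parse import urlsplit


def derive_url_features(primary_url: str, sender_domain: str) -> dict:
    """Recompute URL-derived columns from raw fields (assumed rules)."""
    parts = urlsplit(primary_url)
    host = (parts.hostname or "").lower()

    # ip_in_url: the host parses as a literal IPv4 or IPv6 address.
    try:
        ipaddress.ip_address(host)
        ip_in_url = 1
    except ValueError:
        ip_in_url = 0

    return {
        "url_length": len(primary_url),  # assumption: length in characters
        "https_used": int(parts.scheme == "https"),
        "ip_in_url": ip_in_url,
        # Assumption: exact host match; subdomains count as a mismatch here.
        "domain_mismatch": int(host != sender_domain.lower()),
    }


# Illustrative inputs only, not rows from the dataset.
print(derive_url_features("http://192.168.0.1/login", "example.com"))
```

Comparing these recomputed values against the stored columns is one way to
detect parsing problems after loading the file; disagreements may also simply
mean the dataset was generated with different rules than the ones assumed here.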