Phishing Practice Test File (CSV)
=================================

This is a synthetic dataset for building and testing machine-learning phishing defenses.
It contains simulated email + URL metadata and a binary label.

label:
  0 = legitimate
  1 = phishing

Columns (high-level):
- email_id: unique ID
- timestamp_utc: simulated send time (UTC)
- from_email, sender_domain: sender identity fields
- reply_to_mismatch: 1 if Reply-To differs from From (simulated)
- subject_text, body_text: text fields (safe, synthetic)
- brand_mentioned, department_theme: optional context fields
- num_links: number of links in the message
- primary_url: first URL extracted from the email
- displayed_url: simulated visible (anchor) URL text
- url_length: length of primary_url
- email_length: length of subject + body
- contains_urgent_words: 1 if urgency language appears
- domain_mismatch: 1 if URL host differs from sender_domain
- has_attachment, attachment_type: attachment indicator/type
- domain_age_days: simulated domain age (younger tends to be riskier)
- https_used: 1 if primary_url uses https
- ip_in_url: 1 if the URL uses a raw IP address
- spf_pass, dkim_pass, dmarc_pass: simulated email auth results
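
Example (sketch): recomputing the URL-derived columns

Several columns above (url_length, email_length, https_used, ip_in_url, domain_mismatch) are described as simple functions of the raw fields. The Python sketch below shows one way those definitions could be recomputed for a single row, for instance as a sanity check before training. It is not the generator used to build this dataset. The function name derive_url_features and the sample row are hypothetical, and the exact rules are assumptions: email_length is taken as a plain character count with no separator, and a URL host counts as matching when it equals sender_domain or is a subdomain of it.

```python
import ipaddress
from urllib.parse import urlsplit


def derive_url_features(row: dict) -> dict:
    """Recompute URL-derived columns from one row's raw fields.

    Hypothetical helper: the definitions are assumptions based on the
    column descriptions, not the dataset's actual generation code.
    """
    url = row["primary_url"]
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()

    # ip_in_url: 1 if the host parses as an IPv4 or IPv6 literal.
    try:
        ipaddress.ip_address(host)
        ip_in_url = 1
    except ValueError:
        ip_in_url = 0

    # Assumption: the host matches if it equals the sender domain
    # or is a subdomain of it.
    sender = row["sender_domain"].lower()
    same_domain = host == sender or host.endswith("." + sender)

    return {
        "url_length": len(url),
        # Assumption: plain character count, no separator between fields.
        "email_length": len(row["subject_text"]) + len(row["body_text"]),
        "https_used": int(parts.scheme.lower() == "https"),
        "ip_in_url": ip_in_url,
        "domain_mismatch": int(not same_domain),
    }


# Hypothetical row, made up for illustration; not taken from the dataset.
sample = {
    "sender_domain": "example.com",
    "subject_text": "Action required",
    "body_text": "Please verify your account.",
    "primary_url": "http://192.0.2.10/login",
}
features = derive_url_features(sample)
```

If the recomputed values disagree with the stored columns, the assumed definitions probably differ from the ones used to simulate the data (for example, exact host comparison instead of subdomain matching), so treat a mismatch as a prompt to check the definitions, not as a data error.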