GreySec PHI Scanner — Operational Runbook

Multi-Location PHI Discovery Deployment

Document: Operational Runbook
Scanner: GreySec PHI Scanner (internal, non-billable)
Task: t_e7d202b5
Last Updated: 2026-05-04


Table of Contents

  1. Overview
  2. Per-Location Configuration
  3. LDAP Host Discovery
  4. WinRM Credential Management
  5. Centralized SQLite Inventory Aggregation
  6. Report Consolidation
  7. Deployment Checklist
  8. Troubleshooting
  9. Scheduled Scanning

1. Overview

What the Scanner Does

GreySec PHI Scanner discovers unprotected Protected Health Information (PHI) across Windows hosts, file shares, and databases at multiple physical or logical locations. It supports multi-location orchestrated scans from a single central deployment, aggregates findings into SQLite, and produces both per-location and enterprise-wide HTML reports.

Supported PHI Types and Detection Patterns

| PHI Type | Pattern Used | Severity |
| --- | --- | --- |
| SSN | `\b\d{3}[-\s]\d{2}[-\s]\d{4}\b` | HIGH (3) |
| MRN / Medical Record Number | `\b(MRN\|Medical Record\|EHR\|ID)[:\s#]*\d{6,10}\b` | HIGH (3) |
| Date of Birth | `\b(0[1-9]\|1[0-2])[/.-](0[1-9]` … | MEDIUM (2) |
| Email | `\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z\|a-z]{2,}\b` | MEDIUM (2) |
| Phone | `\b(\+?1[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b` | MEDIUM (2) |
| IP Address | `\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b` | LOW (1) |

Severity mapping: SSN and MRN are HIGH. DOB, Email, and Phone are MEDIUM. IP addresses are LOW.
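
As an illustration of how the patterns and severity scores above fit together, here is a minimal Python sketch. It is not the scanner's actual code: the `PATTERNS` dict and the `find_phi` function are hypothetical names, and DOB and Email are omitted for brevity.

```python
import re

# Patterns copied from the table above; names and structure are illustrative only.
PATTERNS = {
    "SSN": (re.compile(r"\b\d{3}[-\s]\d{2}[-\s]\d{4}\b"), 3),
    "MRN": (re.compile(r"\b(MRN|Medical Record|EHR|ID)[:\s#]*\d{6,10}\b"), 3),
    "Phone": (re.compile(r"\b(\+?1[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b"), 2),
    "IP Address": (re.compile(r"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b"), 1),
}


def find_phi(text):
    """Return one finding dict per regex match, tagged with type and severity."""
    findings = []
    for phi_type, (pattern, severity) in PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append(
                {"type": phi_type, "match": match.group(0), "severity": severity}
            )
    return findings


if __name__ == "__main__":
    for finding in find_phi("Patient SSN: 123-45-6789 from 10.10.50.10"):
        print(finding)
```

Note that the IP pattern accepts any one-to-three-digit octets (for example 999.1.1.1), which is one reason IP findings sit at LOW severity and still deserve a manual spot-check.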

Deployment Architecture

┌─────────────────────────────────────────────────────────┐
│  GreySec PHI Scanner — Multi-Location Deployment        │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  ┌──────────┐    ┌──────────┐    ┌──────────┐          │
│  │ Location │    │ Location │    │ Location │          │
│  │    A     │    │    B     │    │    C     │          │
│  │  (HQ)    │    │ (Branch) │    │ (Cloud)  │          │
│  │          │    │          │    │          │          │
│  │ Win/Linux│    │ Win/Linux│    │ Azure/   │          │
│  │ DBs/Host │    │ DBs/Host │    │ AWS      │          │
│  └────┬─────┘    └────┬─────┘    └────┬─────┘          │
│       │                │                │                │
│       └────────────────┴────────────────┘                │
│                          │                              │
│              ┌───────────▼───────────┐                   │
│              │  Central Reporting    │                   │
│              │  (SQLite aggregation)│                   │
│              └──────────────────────┘                   │
└─────────────────────────────────────────────────────────┘

Remote Execution Method

The scanner uses SMB upload + Task Scheduler (atsvc DCERPC) for Windows hosts:

  1. SMB connect to C$ administrative share
  2. Upload PowerShell scan agent to C:\<remote_dir>\phi_scan.ps1
  3. Trigger execution via atsvc named pipe → SchRpcRegisterTask, then SchRpcRun
  4. Task runs as S-1-5-18 (SYSTEM) — no domain account mapping required
  5. Wait for results file at C:\<remote_dir>\phi_scan_results.json
  6. Download results via SMB

No WinRM required. The atsvc method works on all Windows versions Vista through 2025.
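
Step 5 (waiting for the results file) is the part most sensitive to network latency. The polling loop can be sketched as follows. This is an assumption-laden illustration, not the scanner's implementation: `wait_for_results` and the `fetch` callable are hypothetical, with `fetch` standing in for whatever SMB read the scanner performs.

```python
import json
import time


def wait_for_results(fetch, timeout_s=300.0, interval_s=5.0, sleep=time.sleep):
    """Poll fetch() until it returns the results bytes, then parse them as JSON.

    fetch is a hypothetical callable that returns the results file contents
    as bytes, or None while the file does not exist yet.
    Returns (parsed_results, poll_count).
    """
    deadline = time.monotonic() + timeout_s
    poll_count = 0
    while True:
        poll_count += 1
        raw = fetch()
        if raw is not None:
            return json.loads(raw), poll_count
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no results file after {poll_count} polls")
        sleep(interval_s)
```

Passing `sleep` as a parameter keeps the loop testable without real delays. A production version would also need to handle a partially written results file, which this sketch does not.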

Scanner Entry Points

| Command | Description |
| --- | --- |
| `greysec-phi scan <path>` | Local file system PHI scan |
| `greysec-phi scan --config <yaml>` | Config-driven scan (all target types) |
| `greysec-phi scan-windows --host <ip> --user <u> --pass <p>` | Single Windows host scan |
| `greysec-phi scan-db --engine mssql --host <h> --database <d> --user <u> --pass <p>` | Database PHI scan |
| `greysec-phi discover --domain <d> --bind-user <u> --bind-password <p> --ldap-url <url>` | LDAP host discovery |
| `greysec-phi orchestrate --locations-dir ./locations/ --workers 4` | Multi-location orchestrated scan |
| `greysec-phi aggregate ./findings/*.db --output ./findings/aggregate.db` | SQLite aggregation |
| `greysec-phi report --results <json> --output <html>` | HTML report generation |
| `greysec-phi inventory` | Show inventory summary |

Package Structure

greysec-phi-scanner/
├── locations/                  # Per-location YAML configs
│   ├── acme_hq.example.yaml   # Example: production client HQ
│   └── lab_internal.yaml       # Example: internal lab
├── configs/                   # Optional shared config snippets
├── scripts/
│   └── win_scan_targeted.ps1  # PowerShell agent deployed to Windows hosts
├── results/                   # Scan results and reports
└── credentials.env            # Secrets (not committed to git)

2. Per-Location Configuration

Full Config Reference

Each location is defined by a YAML file. Copy locations/acme_hq.example.yaml as a starting point.

# -----------------------------------------------------------------
# Identity
# -----------------------------------------------------------------
location_id: "clinic_north"           # Unique ID (alphanumeric + underscore)
location_name: "Clinic North"        # Human-readable name
client: "Acme Health"                # Client org name
engagement: "PHI Security Assessment 2026"
description: >
  Primary care clinic north location. Windows file server + MSSQL EHR.

tags:
  - production
  - healthcare

# -----------------------------------------------------------------
# Windows Hosts — scanned via SMB + Task Scheduler DCERPC
# Required: SMB port 445 accessible, credentials with C$ write access
# -----------------------------------------------------------------
windows_hosts:
  - name: "Clinic File Server"       # Optional display name
    hostname: FILE-SERVER-01         # NetBIOS hostname
    ip: 10.10.50.10                 # IP address
    username: scanner_svc           # Service account username
    password: "CHANGE_ME"           # Use credentials.env or env var in production
    domain: ACME                    # AD domain, "" for local accounts
    share: C$                       # Administrative share for upload/download
    remote_dir: phi_test            # Working dir on target (created if needed)
    winrm_port: 5985                # WinRM port (informational; atsvc used instead)
    scan_paths:                     # Paths to scan on this host
      - "C:\\Departments\\HR"
      - "C:\\Departments\\Billing"
      - "C:\\Shares\\PatientData"
      - "C:\\Users"                 # User profile directories
      - "C:\\inetpub\\wwwroot"
      - "C:\\ProgramData"
    tags:
      - production
      - fileserver

  - hostname: APP-SERVER-01
    ip: 10.10.50.20
    username: scanner_svc
    password: "CHANGE_ME"
    domain: ACME
    share: C$
    remote_dir: phi_test
    scan_paths:
      - "C:\\inetpub\\wwwroot"
      - "C:\\ProgramData"
      - "C:\\Users"
    tags:
      - production
      - appserver

# -----------------------------------------------------------------
# Databases — scanned via Presidio analyzer
# Supported: mssql (SQL Server), postgres (PostgreSQL)
# -----------------------------------------------------------------
databases:
  - name: "Epic EMR Primary"         # Display name
    type: mssql                      # mssql | postgres
    host: 10.10.60.5                 # Database server hostname/IP
    port: 1433                       # Default: 1433 (mssql) or 5432 (postgres)
    user: phi_scan_user              # Read-only or minimal-privilege scanner account
    password: "CHANGE_ME"
    database: EpicEMR
    # Custom PHI query (optional). If null, uses default:
    # MSSQL: SELECT FullName, SSN, DOB, MRN, Phone, Email FROM Patients;
    #         SELECT ClaimID, SSN, BilledAmount, DiagnosisCode FROM Claims;
    #         SELECT BreachDate, SSN, PatientID, Description FROM BreachLog;
    query: |
      SELECT pat_id, ssn, first_name, last_name, dob, mrn, home_phone, email_addr
        FROM patient_identity;
    tags:
      - production
      - ehr

  - name: "Epic EMR Reporting"
    type: mssql
    host: 10.10.60.6
    port: 1433
    user: phi_scan_user
    password: "CHANGE_ME"
    database: EpicReporting
    query: null                      # Uses default PHI query
    tags:
      - production
      - ehr

# -----------------------------------------------------------------
# File Paths — local or UNC path scans (Linux scanner node)
# Use for NFS mounts, Samba shares mapped on the scanner host,
# or any directory accessible from the scanner node
# -----------------------------------------------------------------
file_paths:
  - name: "Shared PHI Exports"       # Display name
    path: "\\\\fileserver01\\patient_exports"   # UNC path
    ignore_patterns:                # Files/dirs to skip
      - "*.bak"
      - "*.exe"
      - "*.dll"
      - "*.sys"
      - "node_modules"
      - ".git"
    tags:
      - production

  - name: "Local Scan Root"
    path: /mnt/phi_share              # Local filesystem path
    ignore_patterns:
      - "*.bak"
      - "*.vmdk"
      - "*.iso"
    tags:
      - production

# -----------------------------------------------------------------
# Orchestration Settings
# -----------------------------------------------------------------
max_concurrency: 4          # Max parallel scans per location (default: 4)
scan_timeout_mins: 45       # Per-location timeout in minutes (default: 30)
retry_count: 1             # Retries on failure (default: 1)
continue_on_failure: true   # Continue scanning other targets if one fails (default: true)
report_output_dir: ./reports/clinic_north   # Per-location report output directory

All Config File Options Summary

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| location_id | string | Yes | | Unique identifier |
| location_name | string | No | "" | Display name |
| client | string | No | | Client organization |
| engagement | string | No | | Engagement/project name |
| description | string | No | | Free-text description |
| tags | list[string] | No | [] | Tags for filtering at scan time |
| windows_hosts | list | No | [] | Windows hosts to scan |
| databases | list | No | [] | Databases to scan |
| file_paths | list | No | [] | Filesystem/UNC paths to scan |
| max_concurrency | int | No | 4 | Parallel scans per location |
| scan_timeout_mins | int | No | 30 | Per-location timeout |
| retry_count | int | No | 1 | Retries on failure |
| continue_on_failure | bool | No | true | Continue on target failure |
| report_output_dir | string | No | ./reports | Report output path |
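
To make the required/default columns concrete, here is a small Python sketch that applies them to an already-parsed config dict. It is only an illustration under stated assumptions: `normalize_location` and `DEFAULTS` are hypothetical names, YAML parsing is left out, and the scanner's real validation may differ.

```python
import re

# Defaults taken from the summary table; fields without a stated default are omitted.
DEFAULTS = {
    "location_name": "",
    "tags": [],
    "windows_hosts": [],
    "databases": [],
    "file_paths": [],
    "max_concurrency": 4,
    "scan_timeout_mins": 30,
    "retry_count": 1,
    "continue_on_failure": True,
    "report_output_dir": "./reports",
}


def normalize_location(raw):
    """Validate location_id and fill in defaults for any missing optional field."""
    location_id = raw.get("location_id")
    if not isinstance(location_id, str) or not re.fullmatch(r"[A-Za-z0-9_]+", location_id):
        raise ValueError("location_id is required: alphanumeric characters and underscores only")
    # Copy list defaults so separate configs never share one mutable list.
    config = {
        key: (list(value) if isinstance(value, list) else value)
        for key, value in DEFAULTS.items()
    }
    config.update(raw)
    return config
```

For example, `normalize_location({"location_id": "clinic_north", "scan_timeout_mins": 45})` keeps the 45-minute override and fills `max_concurrency` with 4.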

WindowsHostConfig Fields

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| hostname | string | | NetBIOS hostname |
| ip | string | | Target IP address |
| username | string | | SMB/atsvc username |
| password | string | | SMB/atsvc password |
| domain | string | "" | AD domain (empty for local account) |
| share | string | C$ | Administrative share for file transfer |
| remote_dir | string | phi_test | Working directory on target |
| scan_paths | list[string] | [C:\Users, C:\inetpub, C:\ProgramData] | Paths to scan |
| winrm_port | int | 5985 | WinRM port (informational) |
| tags | list[string] | [] | Filtering tags |

DatabaseTarget Fields

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| name | string | | Display name |
| type | string | | mssql or postgres |
| host | string | | DB server hostname/IP |
| port | int | 1433/5432 | DB port |
| user | string | | Scanner DB account |
| password | string | | DB account password |
| database | string | | Database name |
| query | string | null | Custom SQL (null = default PHI query) |
| tags | list[string] | [] | Filtering tags |

FileTarget Fields

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| name | string | | Display name |
| path | string | | Local path or UNC path |
| ignore_patterns | list[string] | [*.exe, *.dll, ...] | Patterns to skip |
| tags | list[string] | [] | Filtering tags |

3. LDAP Host Discovery

Overview

Use the discover command to query Active Directory via LDAP and enumerate Windows computers eligible for PHI scanning. The command resolves FQDNs to IPs and probes management ports (SMB 445, RDP 3389, WinRM 5985, MSSQL 1433).
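
The port-probing step can be approximated with a plain TCP connect test. The sketch below uses only the Python standard library; `probe_ports` and `MANAGEMENT_PORTS` are hypothetical names, and the scanner's own probe logic (timeouts, parallelism) is not documented here.

```python
import socket

# Ports named in the overview above.
MANAGEMENT_PORTS = {445: "SMB", 3389: "RDP", 5985: "WinRM", 1433: "MSSQL"}


def probe_ports(host, ports=MANAGEMENT_PORTS, timeout_s=1.0):
    """Return the sorted list of ports on host that accept a TCP connection."""
    open_ports = []
    for port in sorted(ports):
        try:
            # A completed TCP handshake is treated as "open"; the socket closes on exit.
            with socket.create_connection((host, port), timeout=timeout_s):
                open_ports.append(port)
        except OSError:
            # Refused, timed out, or unreachable: treat as closed.
            pass
    return open_ports
```

A connect test only shows that the port is reachable from the scanner node. It says nothing about whether the service account can authenticate.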

Prerequisites

  • LDAP service account (read-only domain user sufficient)
  • Port 389 (or 636 for LDAPS) accessible to the scanner node
  • python3-ldap3 installed (pip install greysec-phi-scanner[full])

Running Discovery

greysec-phi discover \
  --domain CONTOSO \
  --bind-user "scanner_svc@CONTOSO" \
  --bind-password "ServicePass123!" \
  --ldap-url ldap://dc01.contoso.com

Filtering by OU

The LDAP query searches the entire naming context by default. To restrict to a specific OU, post-filter the output:

greysec-phi discover \
  --domain CONTOSO \
  --bind-user "scanner_svc@CONTOSO" \
  --bind-password "ServicePass123!" \
  --ldap-url ldap://dc01.contoso.com \
| grep "OU=Clinics"

Output

The command prints discovered hosts and also registers them in the local SQLite inventory:

[PHI Scanner] Discovered 12 hosts
  FILE-SERVER-01 (10.10.50.10) — Windows Server 2019 Standard
  APP-SERVER-01 (10.10.50.20) — Windows Server 2019 Standard
  WIN10-CLINIC-03 (10.10.50.30) — Windows 10 Enterprise

Each host record includes: hostname, os, fqdn, ou, last_logon, ip, open_ports.
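
Given records with the fields listed above, selecting scan candidates for one OU can be sketched in Python as follows. The function name `to_windows_hosts` is hypothetical, and the sketch assumes `ou` is a string containing the OU path and `open_ports` is a list of integers. Neither format is confirmed by this runbook.

```python
def to_windows_hosts(host_records, ou_fragment, required_port=445):
    """Keep hosts in the given OU that expose the required port.

    Returns minimal windows_hosts entries (hostname and ip only); credentials
    and scan_paths still have to be added by hand.
    """
    fragment = ou_fragment.lower()
    selected = []
    for record in host_records:
        in_ou = fragment in str(record.get("ou", "")).lower()
        reachable = required_port in record.get("open_ports", [])
        if in_ou and reachable:
            selected.append({"hostname": record["hostname"], "ip": record["ip"]})
    return selected
```

Filtering on port 445 up front avoids adding hosts to a location config that the SMB upload step could never reach.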

Generating a Location Config from Discovery

After discovery, create a new location YAML by extracting relevant hosts:

# Discover and format as YAML hosts list
greysec-phi discover --domain CONTOSO ... \
  --output-format yaml \
| grep -A 20 "windows_hosts:" > new_location.yaml

LDAP Query Used Internally

The discovery module uses this LDAP filter:

(&(objectClass=computer)(operatingSystem=*Windows*))

Attributes retrieved: cn, operatingSystem, dNSHostName, lastLogonTimestamp, distinguishedName


4. WinRM Credential Management

Important: The Scanner Does NOT Use WinRM

Despite the winrm_port field in configs and the winrm_scan.py module name, remote Windows scanning uses SMB + Task Scheduler DCERPC (atsvc pipe). WinRM is not required. The winrm_port field exists for compatibility and informational purposes only.

Creating a Dedicated Scanner Service Account

For each location, create a dedicated service account in Active Directory:

Step 1: Create the account

# In Active Directory Users and Computers (ADUC) or via PowerShell
New-ADUser `
  -Name "GreySec PHI Scanner" `
  -SamAccountName "scanner_svc" `
  -UserPrincipalName "scanner_svc@CONTOSO.COM" `
  -Path "OU=Service Accounts,OU=Admin,DC=CONTOSO,DC=COM" `
  -AccountPassword(ConvertTo-SecureString "ComplexPass123!" -AsPlainText -Force) `
  -PasswordNeverExpires $true `
  -Enabled $true

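Step 2: Group membership (optional)

Membership in Domain Computers does not grant access to administrative shares such as C$. That access comes from local administrator rights on each target (see Required Permissions). Add the account to a group only if your share ACLs require it:

Add-ADGroupMember -Identity "Domain Computers" -Members "scanner_svc"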

Required Permissions

The scanner service account needs:

  1. Read access to target file shares: grant the account Read on each share and on the underlying NTFS folders (commands below).
  2. Write access to the C$ administrative share: required for the SMB upload step. Administrative shares are reachable only by members of the local Administrators group on the target, so grant the account local admin on each target machine (directly or via GPO). Domain Computers membership alone does not provide this access.
  3. Log on locally / Log on as a service — Usually granted by default for domain users on member servers.

Granting share read access via PowerShell (on file server):

# Grant scanner account read access to a share
Grant-SmbShareAccess -Name "PatientData" `
  -AccountName "CONTOSO\scanner_svc" `
  -AccessRight Read

# For NTFS permissions (rule is inherited by subfolders and files)
$acl = Get-Acl "C:\Shares\PatientData"
$user = New-Object System.Security.Principal.NTAccount("CONTOSO\scanner_svc")
# Leave the folder's existing inheritance settings untouched; only add a rule
$accessRule = New-Object System.Security.AccessControl.FileSystemAccessRule(
  $user, "ReadAndExecute", "ContainerInherit,ObjectInherit", "None", "Allow")
$acl.AddAccessRule($accessRule)
Set-Acl "C:\Shares\PatientData" $acl

Making the account a local admin on target machines (via GPO or direct):

# On each target Windows host, add to Administrators group
Add-LocalGroupMember -Group "Administrators" `
  -Member "CONTOSO\scanner_svc"

Credential Storage

Option A: Environment variables (recommended for CI/CD)

export PHI_SCANNER_USERNAME="scanner_svc"
export PHI_SCANNER_PASSWORD="ServicePass123!"
export PHI_SCANNER_DOMAIN="CONTOSO"

Option B: In the location YAML (for file-based deployments)

windows_hosts:
  - hostname: FILE-SERVER-01
    ip: 10.10.50.10
    username: "{{ env `PHI_SCANNER_USERNAME` }}"
    password: "{{ env `PHI_SCANNER_PASSWORD` }}"
    domain: "{{ env `PHI_SCANNER_DOMAIN` }}"

Option C: credentials.env file (not committed to git)

# ~/.greysec/credentials.env
PHI_SCANNER_USERNAME=scanner_svc
PHI_SCANNER_PASSWORD=ServicePass123!
PHI_SCANNER_DOMAIN=CONTOSO
# Load before running
set -a && source ~/.greysec/credentials.env && set +a
greysec-phi orchestrate --locations-dir ./locations/

Important: Never commit real credentials to version control. Add credentials.env, *.local.yaml, and reports/ to .gitignore.
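
For readers wondering how the `{{ env ... }}` placeholders in Option B might be expanded, here is a minimal Python sketch. It is an assumption, not the scanner's template engine: `resolve_env_refs` is a hypothetical helper, and it accepts backtick, double-quote, or single-quote delimiters around the variable name.

```python
import os
import re

# Matches {{ env `NAME` }}, {{ env "NAME" }} or {{ env 'NAME' }}.
_ENV_REF = re.compile(r"\{\{\s*env\s+[`\"']([A-Za-z_][A-Za-z0-9_]*)[`\"']\s*\}\}")


def resolve_env_refs(value, environ=None):
    """Replace each env placeholder in value with the variable's value.

    Raises KeyError when a referenced variable is unset, so a missing secret
    fails loudly instead of becoming an empty password.
    """
    environ = os.environ if environ is None else environ

    def substitute(match):
        name = match.group(1)
        if name not in environ:
            raise KeyError(f"environment variable {name} is not set")
        return environ[name]

    return _ENV_REF.sub(substitute, value)
```

Failing on an unset variable is a deliberate choice in this sketch: an empty credential would otherwise surface later as a confusing authentication error.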

Testing SMB Connectivity Before Scanning

Test SMB connectivity to verify the service account works before running a full scan:

# Test 1: SMB connectivity (port 445)
nc -zv 10.10.50.10 445

# Test 2: Verify credentials via CrackMapExec or smbmap
smbmap -H 10.10.50.10 -u scanner_svc -p 'ServicePass123!'

# Test 3: Quick scan with a single file target
greysec-phi scan-windows \
  --host 10.10.50.10 \
  --user scanner_svc \
  --password "ServicePass123!" \
  --share C$ \
  --path "C:\\inetpub\\wwwroot"

Local Account Mode (Workgroup/Lab Environments)

For workgroup machines without AD, use local credentials:

windows_hosts:
  - hostname: DESKTOP-1DHNF5M
    ip: 192.168.68.15
    username: labuser
    password: LabPass123!
    domain: ""         # Empty domain = local account
    share: C$
    remote_dir: tmp
    scan_paths:
      - "C:\\Users"
      - "C:\\tmp"

5. Centralized SQLite Inventory Aggregation

Overview

Each location scan produces a JSON results file (<location_id>_results.json) and an HTML report. For centralized tracking, use the orchestrate command to run all locations and automatically merge their results into a master JSON file and a master HTML report.

Running Multi-Location Orchestrated Scans

greysec-phi orchestrate \
  --locations-dir ./locations/ \
  --output-dir ./reports \
  --workers 4

This runs all .yaml files in ./locations/ in parallel (up to 4 concurrent location scans), then produces:

  • ./reports/<location_id>/<location_id>_results.json — per-location JSON
  • ./reports/<location_id>/<location_id>_report.html — per-location HTML
  • ./reports/master_phi_results.json — aggregated JSON across all locations
  • ./reports/master_phi_report.html — master enterprise HTML report

Tag-Based Filtering

Run only locations/targets matching specific tags:

greysec-phi orchestrate \
  --locations-dir ./locations/ \
  --tags production healthcare \
  --workers 4

Targets within a location are also filtered by tags — only targets whose tags overlap with the location's tags are scanned.
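
The overlap rule can be pictured as a set intersection. The Python sketch below is illustrative only: `matches_tags` and `filter_targets` are hypothetical names, and the tag set to compare against is passed in as a parameter because this runbook does not show the scanner's exact comparison.

```python
def matches_tags(item_tags, wanted_tags):
    """True when the two tag lists share at least one tag (or no filter is set)."""
    if not wanted_tags:
        return True
    return bool(set(item_tags) & set(wanted_tags))


def filter_targets(location, wanted_tags):
    """Return the location's targets, per target type, that pass the tag filter."""
    selected = {}
    for kind in ("windows_hosts", "databases", "file_paths"):
        selected[kind] = [
            target
            for target in location.get(kind, [])
            if matches_tags(target.get("tags", []), wanted_tags)
        ]
    return selected
```

In this sketch a target with no tags is excluded whenever a filter is active. That makes untagged targets easy to overlook, so tagging every target explicitly is the safer habit.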

Per-Location Individual Scan + Manual Aggregation

If scanning locations independently (different times, different scanner nodes):

# Scan Location A
greysec-phi scan --config ./locations/clinic_north.yaml \
  --results ./findings/clinic_north.json

# Scan Location B
greysec-phi scan --config ./locations/clinic_south.yaml \
  --results ./findings/clinic_south.json

# Aggregate all location JSONs into one master
greysec-phi aggregate ./findings/clinic_*.json \
  --output ./findings/enterprise_aggregate.json

The aggregate command merges finding counts, severity breakdowns, and generates a master HTML report.

Inventory SQLite Database

The scanner maintains a local SQLite inventory at ~/.greysec/phi_inventory.db (configurable via inventory_db in config):

# View inventory summary
greysec-phi inventory

# Example output:
# === PHI Scanner Inventory ===
#   Total hosts registered: 24
#   Total scan runs:       18
#   Total findings:        143
#     HIGH:   12
#     MEDIUM: 87
#     LOW:    44

ResultAggregator Fields

The ResultAggregator produces a master dict with these top-level keys:

| Field | Type | Description |
| --- | --- | --- |
| aggregated_at | ISO datetime | When aggregation ran |
| locations | list[dict] | Per-location summary entries |
| total_findings | int | Sum of all findings |
| total_high | int | HIGH severity count |
| total_medium | int | MEDIUM severity count |
| total_low | int | LOW severity count |
| total_files_scanned | int | Total files scanned |
| total_targets_scanned | int | Total targets scanned |
| status_counts | dict | success/partial/failed/skipped counts |
| master_report_html | path | Path to master HTML report |
| master_results_json | path | Path to master JSON |
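
A rough Python sketch of how such a master dict could be assembled from per-location summaries is shown below. The top-level keys come from the table above. The per-location keys (`status`, `high`, `medium`, `low`, `files_scanned`, `targets_scanned`) and the function name `aggregate_locations` are assumptions made for illustration, and the two report-path fields are left out.

```python
from datetime import datetime, timezone

SEVERITIES = ("high", "medium", "low")


def aggregate_locations(location_summaries):
    """Sum per-location counts into a master dict with the documented keys."""
    master = {
        "aggregated_at": datetime.now(timezone.utc).isoformat(),
        "locations": [],
        "total_findings": 0,
        "total_high": 0,
        "total_medium": 0,
        "total_low": 0,
        "total_files_scanned": 0,
        "total_targets_scanned": 0,
        "status_counts": {"success": 0, "partial": 0, "failed": 0, "skipped": 0},
    }
    for summary in location_summaries:
        master["locations"].append(summary)
        for severity in SEVERITIES:
            count = summary.get(severity, 0)
            master[f"total_{severity}"] += count
            master["total_findings"] += count
        master["total_files_scanned"] += summary.get("files_scanned", 0)
        master["total_targets_scanned"] += summary.get("targets_scanned", 0)
        # Locations without a status are counted as skipped in this sketch.
        status = summary.get("status", "skipped")
        master["status_counts"][status] = master["status_counts"].get(status, 0) + 1
    return master
```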

6. Report Consolidation

Per-Location HTML Report

Generated automatically by LocationRunner at <report_output_dir>/<location_id>_report.html, or manually:

greysec-phi report \
  --results ./reports/clinic_north/clinic_north_results.json \
  --output ./reports/clinic_north/clinic_north_report.html \
  --client "Acme Health" \
  --engagement "PHI Security Assessment 2026"

Master Enterprise Report

After orchestration or manual aggregation, the master HTML report is at:

./reports/master_phi_report.html

Regenerate from the master JSON:

greysec-phi report \
  --results ./reports/master_phi_results.json \
  --output ./reports/enterprise-phi-audit.html \
  --client "Acme Health" \
  --engagement "Enterprise PHI Audit — All Locations"

Report Contents

Per-location HTML report includes:

  • Executive summary with finding counts by severity
  • Per-target breakdown (Windows host, database, file share)
  • Detailed findings table with: type, severity, file/path, context snippet, line number
  • HIPAA citation references

Master HTML report includes:

  • Cover page with classification and generation timestamp
  • KPI strip: total findings, HIGH/MEDIUM/LOW counts, files scanned, targets scanned
  • Per-location summary table with status badges (success/partial/failed)
  • Scan errors table
  • Links to individual location reports

Severity Mapping in Reports

| Severity Level | Score | Examples | Color |
| --- | --- | --- | --- |
| HIGH | 3 | SSN, MRN | Red |
| MEDIUM | 2 | DOB, Email, Phone | Orange |
| LOW | 1 | IP Address | Green |

7. Deployment Checklist

Deploy a new location using this checklist:

Pre-Deployment

  • Review existing deployment guide at docs/deployment.md
  • Obtain legal/IRB authorization for PHI scanning at the target location
  • Identify target scope: Windows hosts, file shares, databases, or all three
  • Coordinate with location IT to ensure port 445 (SMB) is accessible from scanner node
  • Install scanner: pip install greysec-phi-scanner or pip install -e . from source

Step 1: Create Location YAML Config

  • Copy locations/acme_hq.example.yaml to locations/<new_location_id>.yaml
  • Set location_id (unique, lowercase with underscores)
  • Set location_name and description
  • Add all Windows hosts with IP, credentials, and scan paths
  • Add all database targets with connection strings and custom queries (if needed)
  • Add all file share paths
  • Set appropriate tags for filtering (e.g., production, lab)
  • Set max_concurrency (default 4 is fine for most locations)
  • Do not commit real credentials; use the {{ env `VAR` }} template syntax or credentials.env

Step 2: Set Up Scanner Service Account

  • Create AD service account (e.g., scanner_svc@DOMAIN)
  • Place in appropriate OU (e.g., OU=Service Accounts,OU=Admin)
  • Set strong password; confirm password never expires
  • Grant read access to all target file shares
  • Grant C$ share write access (local admin on target machines)
  • Test credentials: smbmap -H <target_ip> -u scanner_svc -p '<password>'

Step 3: Test SMB Connectivity

  • Verify port 445 reachable: nc -zv <target_ip> 445
  • Verify share access: smbclient -L //<target_ip> -U scanner_svc
  • Run a quick single-path scan to validate the full pipeline:
    greysec-phi scan-windows \
      --host <target_ip> \
      --user scanner_svc \
      --password "ServicePass123!" \
      --share C$ \
      --path "C:\\inetpub\\wwwroot"
    
  • Confirm results file appears and contains expected findings

Step 4: Run Initial Scan

  • Run the full location scan:
    greysec-phi orchestrate \
      --location-file ./locations/<new_location_id>.yaml \
      --output-dir ./reports \
      --workers 4
    
  • Monitor for errors in the output
  • Confirm per-location HTML report generated at ./reports/<location_id>/<location_id>_report.html

Step 5: Verify Findings Match Expectations

  • Open the HTML report and review all HIGH severity findings
  • Spot-check a sample of findings — confirm they are real PHI and not false positives
  • Check for any targets that returned zero findings (may indicate permission issues)
  • Cross-reference with known PHI data locations provided by the client
  • Escalate any unexpected HIGH findings to the engagement lead immediately

Step 6: Configure Scheduled Scans

  • Create systemd timer or cron job (see Section 9)
  • Set schedule: typically weekly or monthly for production environments
  • Configure log rotation for scan results (keep at least 90 days, or longer if your HIPAA retention policy requires it)
  • Set up report delivery (email, secure share, or Supabase)
  • Document schedule in the runbook for this location

Step 7: Set Up Report Delivery

  • Configure automated report delivery to the security team or CISO
  • Reports should go to a secure, access-controlled location
  • Set up alert on master report generation failure
  • Add location to the master locations/ directory index

8. Troubleshooting

WinRM Connection Failures

Symptom: Output shows WinRM connection errors.

Cause: The scanner uses SMB + atsvc, not WinRM. These errors usually trace back to a firewall that opens only the WinRM port and leaves TCP 445 blocked, not to the winrm_port setting itself.

Fix:

  1. Verify port 445 is open: nc -zv <target_ip> 445
  2. The winrm_port field in the YAML is informational only. The scanner ignores it.
  3. If SMB is blocked, work with network security to allow TCP 445 outbound to target subnet.
  4. Do NOT try to enable WinRM on target machines — it is not used.

Error example: RPC_X_NULL_REF_POINTER — None of the routers responded

Cause: SMB handshake succeeded but the atsvc named pipe failed. Usually a credential or share permission issue.

Fix: Verify the scanner account has write access to the C$ share.

LDAP Query Failures

Symptom: greysec-phi discover fails with LDAP bind errors.

Common causes and fixes:

| Error | Cause | Fix |
| --- | --- | --- |
| LDAPBindError: Invalid credentials | Wrong username format | Use user@domain format: scanner_svc@CONTOSO.COM |
| LDAPSocketOpenError: connection refused | LDAPS on 636 without proper config, or port 389 blocked | Use ldap:// (not ldaps://) on port 389, or verify firewall |
| NamingContext error | Scanner account lacks read access to AD | Grant domain read access to the service account |
| Timeout | DC unreachable or LDAP service down | Verify network path; try a different DC |

Test LDAP manually:

# With ldapsearch
ldapsearch -H ldap://dc01.contoso.com:389 \
  -D "scanner_svc@contoso.com" \
  -w "ServicePass123!" \
  -b "DC=contoso,DC=com" \
  "(&(objectClass=computer)(operatingSystem=*Windows*))" \
  cn operatingSystem

Empty Findings (Scanner Not Finding Expected PHI)

Symptom: Scan ran successfully but returned zero findings where PHI is known to exist.

Diagnosis steps:

  1. Verify the path is being scanned. Add a known PHI test file:

    # On the target Windows host
    echo "Patient SSN: 123-45-6789" > C:\tmp\phi_test\PHI_seed\test.txt
    

    Then re-run the scan and check if this file is detected.

  2. Check file type support. The scanner looks for these extensions by default on Windows:

    *.txt, *.csv, *.log, *.json, *.xml, *.doc, *.docx,
    *.xls, *.xlsx, *.pdf, *.mdb, *.accdb, *.sql, *.cfg,
    *.ini, *.dat, *.bak
    

    Max file size: 50MB.

  3. Verify read permissions. If the scanner account can write to the share but not read certain files, those paths are silently skipped.

  4. Check ignore_patterns. Ensure PHI files aren't being excluded by patterns in the config.

  5. Database query returns empty. Run the query manually:

    # Test MSSQL query
    /opt/mssql-tools/bin/sqlcmd -S 10.10.60.5 -U phi_scan_user \
      -P 'ServicePass123!' -d EpicEMR \
      -Q "SELECT TOP 10 SSN FROM patient_identity"
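
Diagnosis steps 2 and 4 (file type support and ignore_patterns) can be reproduced offline with a short Python sketch that explains why a given file would be skipped. It is a hedged approximation: `skip_reason` is a hypothetical helper, "50MB" is assumed to mean 50 * 1024 * 1024 bytes, and ignore patterns are assumed to be glob-matched against each path component.

```python
import fnmatch
import os

# Default Windows extensions listed in step 2.
DEFAULT_EXTENSIONS = {
    ".txt", ".csv", ".log", ".json", ".xml", ".doc", ".docx", ".xls", ".xlsx",
    ".pdf", ".mdb", ".accdb", ".sql", ".cfg", ".ini", ".dat", ".bak",
}
MAX_FILE_SIZE = 50 * 1024 * 1024  # assumed interpretation of "50MB"


def skip_reason(path, size_bytes, ignore_patterns=()):
    """Return why a file would be skipped, or None if it looks scannable."""
    parts = path.replace("\\", "/").split("/")
    for pattern in ignore_patterns:
        # A pattern may hit the file name (*.bak) or a directory name (.git).
        if any(fnmatch.fnmatch(part, pattern) for part in parts):
            return f"ignored by pattern {pattern}"
    ext = os.path.splitext(parts[-1])[1].lower()
    if ext not in DEFAULT_EXTENSIONS:
        return f"unsupported extension {ext or '(none)'}"
    if size_bytes > MAX_FILE_SIZE:
        return "larger than 50MB"
    return None
```

Running this against the path of a file that is known to contain PHI quickly shows whether the config, rather than permissions, explains an empty result.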
    

Permission Denied Errors

Symptom: Permission denied errors in scan output for Windows targets.

Typical causes:

| Error | Cause | Fix |
| --- | --- | --- |
| SMB upload failed | No write access to C$ share | Grant scanner account local admin on the target |
| Cannot create phi_test directory | No write to remote_dir path | Set remote_dir to a path the account can write to, e.g., tmp |
| Database query denied | Scanner DB account lacks SELECT | Grant SELECT on the target tables: GRANT SELECT ON patient_identity TO phi_scan_user; |
| File read denied (Windows) | NTFS ACL prevents scanner account | Add scanner account to the file's ACL with Read access |

Testing share permissions:

# List accessible shares with scanner account
smbmap -H 10.10.50.10 -u scanner_svc -p 'password'

# Test specific share read
smbmap -H 10.10.50.10 -u scanner_svc -p 'password' -r C$

Large Scan Performance Issues

Symptom: Scan times out, runs very slowly, or hits memory limits.

Mitigations:

  1. Reduce scan paths. Don't scan entire drives. Limit to known PHI-containing directories:

    scan_paths:
      - "C:\\Departments\\HR"        # PHI likely here
      - "C:\\Shares\\PatientData"    # PHI likely here
      # NOT "C:\\" — too broad
    
  2. Increase timeout. For large locations, increase scan_timeout_mins:

    scan_timeout_mins: 120   # 2 hours for large file servers
    
  3. Exclude noise. Add patterns for large non-PHI files:

    ignore_patterns:
      - "*.vmdk"
      - "*.iso"
      - "*.bak"
      - "node_modules"
      - ".git"
    
  4. Database scans. Add TOP 10000 or WHERE clauses to limit rows:

    query: |
      SELECT TOP 10000 pat_id, ssn, first_name, last_name, dob, mrn
        FROM patient_identity
        WHERE created_date > '2025-01-01';
    
  5. Concurrency tuning. For a location with many targets, increase max_concurrency:

    max_concurrency: 8   # Up to 8 parallel scans per location
    
  6. Scan polling interval. If results aren't appearing within the expected time, the poll_count in output shows how many SMB read attempts were made. Increase the timeout in remote_scan() if network latency is high.


9. Scheduled Scanning

cron — Nightly Scan at 2 AM

# /etc/cron.d/greysec-phi-scanner
# cron does not support multi-line entries or standalone shell commands,
# so each job is a single line that loads its own credentials.
SHELL=/bin/bash
PATH=/usr/local/bin:/usr/bin:/bin
LC_ALL=en_US.UTF-8

# Nightly orchestrated scan across all production locations (2 AM, Monday-Friday)
0 2 * * 1-5 scanner set -a && source /home/scanner/.greysec/credentials.env && set +a && cd /opt/greysec-phi-scanner && greysec-phi orchestrate --locations-dir /opt/greysec-phi-scanner/locations --tags production --output-dir /opt/greysec-phi-scanner/reports --workers 4 >> /var/log/phi-scanner/nightly.log 2>&1

# Aggregate results every Friday at 3 AM
0 3 * * 5 scanner cd /opt/greysec-phi-scanner && greysec-phi aggregate /opt/greysec-phi-scanner/reports/*/*_results.json --output /opt/greysec-phi-scanner/reports/master_phi_results.json >> /var/log/phi-scanner/weekly-aggregate.log 2>&1

cron — Monthly Full Enterprise Scan

# First of every month at 1 AM
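0 1 1 * * scanner /opt/greysec-phi-scanner/bin/full-enterprise-scan.sh

The wrapper script called by that entry:

#!/bin/bash
# /opt/greysec-phi-scanner/bin/full-enterprise-scan.sh
set -euo pipefail

# Export every variable defined in credentials.env to child processes
set -a && source /home/scanner/.greysec/credentials.env && set +a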
DATE=$(date +%Y-%m-%d)
REPORT_DIR="/opt/greysec-phi-scanner/reports/monthly/${DATE}"
mkdir -p "$REPORT_DIR"

greysec-phi orchestrate \
  --locations-dir /opt/greysec-phi-scanner/locations \
  --output-dir "$REPORT_DIR" \
  --workers 8

# Copy master report to centralized location
cp "$REPORT_DIR/master_phi_report.html" \
  "/opt/greysec-phi-scanner/reports/enterprise-latest.html"

# Send alert if HIGH findings > threshold
HIGH_COUNT=$(python3 -c "
import json
with open('$REPORT_DIR/master_phi_results.json') as f:
    d = json.load(f)
print(d['total_high'])
")
if [ "$HIGH_COUNT" -gt 0 ]; then
  echo "PHI Scanner: $HIGH_COUNT HIGH severity findings — see report" | \
    mail -s "[ALERT] PHI Scanner HIGH Findings" security@contoso.com
fi

systemd — Timer + Service for Nightly Scans

Unit file: /etc/systemd/system/greysec-phi-scanner.service

[Unit]
Description=GreySec PHI Scanner — Nightly PHI Discovery Scan
After=network-online.target
Wants=network-online.target

[Service]
Type=oneshot
User=scanner
Group=scanner
Environment="PATH=/opt/greysec-phi-scanner/bin:/usr/local/bin:/usr/bin:/bin"
EnvironmentFile=/home/scanner/.greysec/credentials.env
ExecStart=/usr/bin/python3 -m greysec_phi_scanner.cli orchestrate \
  --locations-dir /opt/greysec-phi-scanner/locations \
  --output-dir /opt/greysec-phi-scanner/reports \
  --workers 4
StandardOutput=journal
StandardError=journal

Timer file: /etc/systemd/system/greysec-phi-scanner.timer

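[Unit]
Description=Nightly PHI Scanner Run at 2 AM Mon-Fri
# No Requires= on the service: the timer activates the unit of the same name,
# and Requires= would also start a scan whenever the timer itself starts.

[Timer]
OnCalendar=Mon..Fri *-*-* 02:00:00
Persistent=true
RandomizedDelaySec=300

[Install]
WantedBy=timers.target

Enable the timer and check its state:

# Enable and start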
sudo systemctl daemon-reload
sudo systemctl enable --now greysec-phi-scanner.timer

# Check next run
systemctl list-timers greysec-phi-scanner.timer

# View last run result
systemctl status greysec-phi-scanner.service
journalctl -u greysec-phi-scanner.service -n 50

Log Rotation for Scan Results

Add to /etc/logrotate.d/greysec-phi-scanner:

/var/log/phi-scanner/*.log {
    daily
    # Keep 90 daily rotations
    rotate 90
    compress
    delaycompress
    missingok
    notifempty
    create 0640 scanner scanner
    sharedscripts
    postrotate
        systemctl reload greysec-phi-scanner > /dev/null 2>&1 || true
    endscript
}

/opt/greysec-phi-scanner/reports/master_phi_results.json {
    weekly
    rotate 52
    compress
    delaycompress
    missingok
    notifempty
    # Do not leave an empty file behind; an empty master JSON breaks the status check
    nocreate
}

Alerting on Scan Failures

Add a check after each orchestrate run:

#!/bin/bash
# /opt/greysec-phi-scanner/bin/check-scan-status.sh
RESULT_FILE="/opt/greysec-phi-scanner/reports/master_phi_results.json"

if [ ! -f "$RESULT_FILE" ]; then
  echo "CRITICAL: Master results file missing — scan may have failed" | \
    mail -s "[CRITICAL] PHI Scanner Failure" ops@contoso.com
  exit 1
fi

# Check for failed locations
FAILED=$(python3 -c "
import json
with open('$RESULT_FILE') as f:
    d = json.load(f)
print(d['status_counts'].get('failed', 0))
")

if [ "$FAILED" -gt 0 ]; then
  echo "PHI Scanner: $FAILED location(s) failed — review immediately" | \
    mail -s "[WARNING] PHI Scanner Partial Failure" ops@contoso.com
fi
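
The same check can be written in Python when a mail command is not available or the result needs to feed a monitoring system. This is a sketch rather than part of the scanner: `scan_status` is a hypothetical function, and the exit codes (0 ok, 1 critical, 2 warning) are a convention chosen here, not one the tool defines.

```python
import json
import os


def scan_status(result_file):
    """Return (exit_code, message) for the master results file."""
    if not os.path.isfile(result_file):
        return 1, "CRITICAL: master results file missing, scan may have failed"
    try:
        with open(result_file) as handle:
            master = json.load(handle)
    except json.JSONDecodeError:
        # An empty or truncated file is treated like a missing one.
        return 1, "CRITICAL: master results file is not valid JSON"
    failed = master.get("status_counts", {}).get("failed", 0)
    if failed > 0:
        return 2, f"WARNING: {failed} location(s) failed, review immediately"
    return 0, "OK"


if __name__ == "__main__":
    code, message = scan_status("/opt/greysec-phi-scanner/reports/master_phi_results.json")
    print(message)
    raise SystemExit(code)
```

Unlike the shell version, this sketch also reports an unreadable JSON file as critical, which covers the case where a rotation or an interrupted run leaves an empty file in place.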

Scheduling Summary

| Schedule | Method | Command |
| --- | --- | --- |
| Nightly (Mon-Fri, 2 AM) | cron or systemd timer | `orchestrate --locations-dir ./locations/ --tags production` |
| Weekly (Friday 3 AM) | cron | `aggregate ./reports/*/*_results.json --output master.json` |
| Monthly (1st, 1 AM) | cron | Full orchestrate + report delivery script |
| On-demand | Manual | `greysec-phi orchestrate --location-file ./locations/<id>.yaml` |

Quick Reference Card

# === Installation ===
pip install greysec-phi-scanner

# === Single location scan ===
greysec-phi orchestrate --location-file ./locations/clinic_north.yaml \
  --output-dir ./reports

# === Multi-location scan (all locations in dir) ===
greysec-phi orchestrate --locations-dir ./locations/ --workers 4 \
  --output-dir ./reports

# === LDAP host discovery ===
greysec-phi discover --domain CONTOSO \
  --bind-user scanner_svc@CONTOSO \
  --bind-password 'ServicePass123!' \
  --ldap-url ldap://dc01.contoso.com

# === Single Windows host scan ===
greysec-phi scan-windows --host 10.10.50.10 \
  --user scanner_svc --password 'ServicePass123!' \
  --share C$ --path "C:\\Shares\\PatientData"

# === Database scan ===
greysec-phi scan-db --engine mssql --host 10.10.60.5 \
  --database EpicEMR --user phi_scan_user --password 'ServicePass123!'

# === Local filesystem scan ===
greysec-phi scan /mnt/phi_share --output ./results.json

# === Generate HTML report ===
greysec-phi report --results ./results.json \
  --output ./report.html --client "Acme Health"

# === View inventory ===
greysec-phi inventory

# === Aggregate multiple location results ===
greysec-phi aggregate ./reports/*/*_results.json \
  --output ./reports/master_phi_results.json

This runbook is operational. For architecture and installation details, see docs/deployment.md. For issues, consult the troubleshooting section above or contact the GreySec tooling team.