# GreySec PHI Scanner

Detects Protected Health Information (PHI) across files, databases, and Windows hosts to support HIPAA Security Risk Assessments.

## What It Scans

| Target | How | PHI Types |
|--------|-----|-----------|
| File systems | Regex + entropy scan | SSN, MRN, phone, email, DOB, license, account, URL |
| MSSQL / PostgreSQL | Direct SQL query + Presidio NLP | All 18 HIPAA PHI identifiers |
| Windows hosts (remote) | SMB upload + WinRM exec + SMB download | File-based PHI patterns |

## Quick Start

```bash
# Scan a directory
python3 -m greysec_phi_scanner scan /path/to/patient_data

# Scan a database
phi-scan scan --config configs/hq.yaml

# Generate HTML report
phi-scan report --results results.json -o report.html --client "Acme Hospital"
```

## Architecture

```
phi-scanner/
├── src/greysec_phi_scanner/
│   ├── scanner.py            # Core regex file scanner
│   ├── config.py             # Pydantic config models
│   ├── cli.py                # Typer CLI (scan/report/discover/inventory)
│   ├── db/
│   │   └── scanner.py        # MSSQL + PostgreSQL scanning
│   ├── windows/
│   │   ├── winrm_scan.py     # Remote Windows scan (SMB + WinRM)
│   │   └── host_detector.py  # LDAP host discovery
│   ├── inventory/
│   │   └── db.py             # SQLite inventory
│   └── reporting/
│       └── html_report.py    # GreySec-branded HTML reports
├── test_data/                # Synthetic PHI for testing
└── docs/deployment.md        # Multi-location deployment guide
```

## Report Output

- **Cover page** with client name, date, classification
- **Executive summary** with KPI cards (HIGH/MED/LOW)
- **Scope table** with files scanned per source
- **Findings by source** with severity badges
- **Risk & Impact** narrative (no remediation, per GreySec business rule)
- **Appendix** with full raw JSON data

## Multi-Location Deployment

Each engagement location gets its own `config.yaml` with:

- Target-specific paths/credentials
- Environment variable references (`${VAR}`) for secrets
- SQLite inventory at `~/.greysec/phi_inventory.db`
- Reports per location under `~/engagements//reports/`
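
To illustrate the `${VAR}` convention for secrets mentioned above, here is a minimal sketch of how placeholders in a parsed config could be resolved from the environment. It is not the scanner's actual loader: the `expand_env` helper, the placeholder grammar, and the config keys (`db`, `password`, `paths`, `HQ_DB_PASSWORD`) are all hypothetical and chosen only for illustration.

```python
import os
import re

# Matches ${VAR_NAME}. The exact placeholder grammar is an assumption.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value, env=None):
    """Recursively replace ${VAR} placeholders in strings, dicts, and lists."""
    env = os.environ if env is None else env
    if isinstance(value, str):
        def _sub(match):
            name = match.group(1)
            if name not in env:
                # Fail loudly rather than scanning with an empty credential.
                raise KeyError(f"environment variable {name} is not set")
            return env[name]
        return _PLACEHOLDER.sub(_sub, value)
    if isinstance(value, dict):
        return {key: expand_env(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_env(item, env) for item in value]
    return value  # numbers, booleans, None pass through unchanged


# Hypothetical config fragment, as it might look after YAML parsing.
raw = {
    "db": {"host": "sql01.example.internal", "password": "${HQ_DB_PASSWORD}"},
    "paths": ["/srv/records"],
}
resolved = expand_env(raw, env={"HQ_DB_PASSWORD": "s3cret"})
```

Raising on a missing variable, instead of substituting an empty string, keeps a misconfigured location from silently running with blank credentials. Passing `env` explicitly makes the helper easy to test without touching the real process environment.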