* [Bug] KQL Validation Add Wildcard w/ Space token value
## Summary
Fixes KQL parser to support wildcard values containing spaces (e.g., `*S3 Browser*`), which work in Kibana but were rejected by our unit tests.
**Issue:** #5750
## Changes
### Grammar (`lib/kql/kql/kql.g`)
- Added `WILDCARD_LITERAL` token with priority 3 to match wildcard patterns containing spaces
- Uses negative lookahead to stop before `or`/`and`/`not` keywords
- Added to `value` rule (not `literal`) so field names remain unaffected
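The keyword lookahead described above can be sketched with Python's `re` module. This is a simplified illustration, not the pattern in `kql.g`: the names `KEYWORD`, `WORD`, `VALUE`, and `match_value` are hypothetical, and the sketch assumes a value is a run of whitespace-separated words, none of which may be a bare keyword.

```python
import re

# Hypothetical simplification of the idea behind WILDCARD_LITERAL;
# the real terminal is defined in lib/kql/kql/kql.g.
KEYWORD = r"(?:or|and|not)\b"
# A word is any run of non-delimiter characters that is not a bare keyword.
WORD = rf'(?!{KEYWORD})[^\s()":{{}}]+'
VALUE = re.compile(rf"{WORD}(?:\s+{WORD})*", re.IGNORECASE)


def match_value(text: str) -> str:
    """Return the leading value in `text`, stopping before or/and/not."""
    m = VALUE.match(text)
    return m.group(0) if m else ""


print(match_value("*S3 Browser* or other: x"))  # stops before "or"
print(match_value("S3 oracle* AND x"))          # "oracle" is not a keyword
```

Because the lookahead requires a word boundary after the keyword, words that merely begin with `or`, `and`, or `not` are still accepted as part of the value.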
### Parser (`lib/kql/kql/parser.py`)
- Handle new `WILDCARD_LITERAL` token type as wildcards
- Quoted strings (`"*text*"`) now treated as literals, matching Kibana behavior
## Behavior
| Query | Before | After |
|-------|--------|-------|
| `field: *S3 Browser*` | ❌ Parse error | ✅ Wildcard |
| `field: *test*` | ✅ Wildcard | ✅ Wildcard |
| `common.*: value` | ✅ Works | ✅ Works |
| `field: "*text*"` | Wildcard | ✅ Literal (matches Kibana) |
## Test plan
- [x] All 63 existing KQL unit tests pass
- [x] New wildcard-with-spaces patterns parse correctly
- [x] Wildcard field names (`common.*`) still work
- [x] Keywords (`or`, `and`, `not`) correctly recognized as separators
- [x] Tested against rule file from PR #5694
* update pyproject version
* update kibana and kql pyproject.toml versions
* update wildcard_literal pattern to account for false matches with leading keywords
Add a negative lookahead, `(?!(?:or|and|not)\b)`, at the start of Pattern 2 so that it no longer matches values that begin with a keyword, such as `not /path*`.
* adding NOT keyword token and support for wildcard in the middle of spaced phrase
# KQL Parser Changes - Wildcard Spaces and NOT Prefix Fix
## Overview
This update fixes two issues in the KQL parser:
1. **Wildcard values with spaces** - Values like `*S3 Browser*` now parse correctly
2. **NOT prefix false match** - Values like `not /tmp/go-build*` are no longer incorrectly consumed as a single wildcard literal
## Files Modified
### `lib/kql/kql/kql.g` (Grammar)
**Added `optional_not` rule** to handle `NOT` as an explicit grammar element:
```
?list_of_values: "(" or_list_of_values ")"
               | optional_not value
?optional_not: NOT optional_not
             |
```
**Expanded `WILDCARD_LITERAL`** with 4 patterns to support all wildcard-with-space cases:
| Pattern | Description | Example |
|---------|-------------|---------|
| 1 | Starts with `*` | `*S3 Browser`, `*S3 Browser*` |
| 2 | Ends with `*` (doesn't start with `*`) | `S3 Browser*` |
| 3a | `*` appears after a space | `S3 B*owser` |
| 3b | `*` appears before a space | `S3* Browser` |
### `lib/kql/kql/parser.py`
Added methods to handle the new grammar rules:
- `list_of_values()` - handles `optional_not value` structure
- `optional_not()` - counts NOT occurrences and wraps values with `NotValue`
### `lib/kql/kql/kql2eql.py`
Added corresponding methods for EQL conversion:
- `list_of_values()` - handles `optional_not value` structure
- `optional_not()` - counts NOT occurrences and wraps with `eql.ast.Not`
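One way to picture the counting step is the sketch below. It assumes that pairs of `NOT` cancel out, which is consistent with the `not not not foo` → `not foo` test mentioned later in this log; the `Value` and `NotValue` dataclasses and the `apply_nots` helper are stand-ins written for illustration, not the classes or methods in `parser.py` or `kql2eql.py`.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Value:
    """Stand-in for a parsed KQL value node (hypothetical)."""
    text: str


@dataclass(frozen=True)
class NotValue:
    """Stand-in for a negated value node (hypothetical)."""
    inner: Value


def apply_nots(not_count: int, value: Value) -> Union[Value, NotValue]:
    """Wrap `value` once if the number of leading NOTs is odd."""
    # An even number of negations cancels out, so only parity matters.
    return NotValue(value) if not_count % 2 else value


print(apply_nots(3, Value("foo")))  # a single negation remains
print(apply_nots(2, Value("foo")))  # the negations cancel
```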
## Test Results
All 63 kuery tests pass. Verified wildcard cases:
| Input | Result |
|-------|--------|
| `field: *S3 Browser*` | `field:*S3\ Browser*` |
| `field: S3 Browser*` | `field:S3\ Browser*` |
| `field: *S3 Browser` | `field:*S3\ Browser` |
| `field: S3 B*owser` | `field:S3\ B*owser` |
| `field: S3* Browser` | `field:S3*\ Browser` |
| `field: foo* bar* baz` | `field:foo*\ bar*\ baz` |
| `process.executable: not /tmp/go-build*` | `not process.executable:/tmp/go-build*` |
| `field < value` | `field < value` (range expression, not wildcard) |
## Technical Notes
### Pattern 3a Fix
Pattern 3a requires at least one character AFTER the `*` (uses `[...]+` instead of `[...]*`). This prevents Pattern 2 from incorrectly matching shorter strings like `S3 B*` when the full value is `S3 B*owser`.
### NOT Keyword Handling
The `optional_not` grammar approach explicitly parses `NOT` as a keyword before the value, preventing it from being consumed as part of a wildcard literal. This is safer than regex-only approaches because:
- `NOT` token only matches the exact word "not" (case-insensitive)
- Values like `notafile*` are still parsed as `UNQUOTED_LITERAL`
- Edge case: literal value "not" must be quoted: `field: "not"`
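The distinction between the keyword `not` and a value that merely starts with those letters can be illustrated as follows. This is a rough sketch under the assumption that the keyword must be followed by whitespace; `NOT_KEYWORD` and `split_leading_nots` are hypothetical names, and the actual tokenization is done by the grammar in `kql.g`.

```python
import re

# Hypothetical stand-in for the NOT token: the word "not" in any case,
# recognized only when whitespace follows it.
NOT_KEYWORD = re.compile(r"not(?=\s)", re.IGNORECASE)


def split_leading_nots(value: str) -> tuple[int, str]:
    """Return (number of leading NOT keywords, remaining value text)."""
    count = 0
    rest = value.lstrip()
    while (m := NOT_KEYWORD.match(rest)) is not None:
        count += 1
        rest = rest[m.end():].lstrip()
    return count, rest


print(split_leading_nots("not /tmp/go-build*"))  # keyword split from value
print(split_leading_nots("notafile*"))           # left intact as a value
```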
* Changes to Address Review Comments
### Changes to Address Review Comments @Mikaayenson
1. **Fixed regex patterns to prevent trailing whitespace capture** (`kql.g`)
- Added `(?=\s|$|[()":{}])` lookahead to all WILDCARD_LITERAL patterns
- This ensures patterns stop at boundaries without capturing trailing whitespace
2. **Removed `.rstrip()` workaround** (`parser.py`)
- No longer needed since regex now handles boundaries correctly
3. **Added explicit WILDCARD_LITERAL handling** (`kql2eql.py`)
- Now checks `token.type == "WILDCARD_LITERAL"` explicitly
- Mirrors the approach used in `parser.py`
4. **Added unit tests** (`tests/kuery/test_parser.py`)
- `test_wildcard_with_spaces` - all 4 WILDCARD_LITERAL patterns
- `test_wildcard_with_spaces_and_keywords` - wildcards with `and`/`or` boundaries
- `test_not_prefix_with_wildcard` - NOT keyword not consumed as wildcard
- `test_quoted_wildcard_as_literal` - quoted wildcards are literal strings
- `test_triple_not_optimization` - `not not not foo` → `not foo`
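The effect of the boundary lookahead in item 1 can be shown with a small comparison. The lookahead `(?=\s|$|[()":{}])` is taken from the notes above; the pattern body `\*[\w ]*\w\*` is a made-up simplification, so this only illustrates why a zero-width assertion removes the need for `.rstrip()`.

```python
import re

# Hypothetical pattern body; only the trailing boundary check differs.
BODY = r"\*[\w ]*\w\*"
# Consumes the boundary character, so the match can end in whitespace.
CONSUMING = re.compile(BODY + r"(?:\s|$)")
# Asserts the boundary without consuming it (lookahead from the notes above).
LOOKAHEAD = re.compile(BODY + r'(?=\s|$|[()":{}])')


def consuming_match(text: str) -> str:
    m = CONSUMING.match(text)
    return m.group(0) if m else ""


def boundary_match(text: str) -> str:
    m = LOOKAHEAD.match(text)
    return m.group(0) if m else ""


query = "*S3 Browser* and x"
print(repr(consuming_match(query)))  # includes the trailing space
print(repr(boundary_match(query)))   # stops cleanly at the boundary
```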
* changed test directory from tmp
* changed format of new tests
* Update pyproject.toml
---------
Co-authored-by: Eric Forte <119343520+eric-forte-elastic@users.noreply.github.com>
* Add schema validation for AlertSuppressionMapping
* Add support for indicator match alert suppression
* Add unit tests
* Update order and remove validates_schema method
* Add comments
* Add test for query rule duration only
* Add whitespace checking for KQL parse
* Add unit test for blank space check
* Bump patch version
* Add test cases for newline blank space
* Add additional unit tests
* Update to only walk tree once
---------
Co-authored-by: Terrance DeJesus <99630311+terrancedejesus@users.noreply.github.com>
* Bump Version
* updated
* Bump patch version
* Optimization should only occur on single values
* Wildcard semantically equivalent to query_string*
* Add unit test for optimization
* Move code-checks to yml
* Add tests path to code-checks
* Add lib path for code-checks
* Install deps from local
* Update DSL optimization unit test
---------
Co-authored-by: Terrance DeJesus <99630311+terrancedejesus@users.noreply.github.com>
* first pass
* Adding a dedicated code checking workflow
* Type fixes
* linting config and python version bump
* Type hints
* Drop incorrect config option
* More fixes
* Style fixes
* CI adjustments
* Pyproject fixes
* CI & pyproject fixes
* Proper version bump
* Tests formatting
* Resolve circular dependency
* Test fixes
* Make sure the tests are formatted correctly
* Check tweaks
* Bumping python version in CI images
* Pin marshmallow to 3.x because 4.x is not supported
* License fix
* Convert path to str
* Making myself a codeowner
* Missing kwargs param
* Adding a missing kwargs to `set_score`
* Update .github/CODEOWNERS
Co-authored-by: Mika Ayenson, PhD <Mikaayenson@users.noreply.github.com>
* Dropping unnecessary raise
* Dropping skipped test
* Drop unnecessary var
* Drop unused commented-out func
* Disable typehinting for the whole func
* Update linting command
* Invalid type hint on the input param
* Incorrect field type
* Incorrect value used fix
* Stricter values check
* Simpler function call
* Type condition fix
* TOML formatter fix
* Simplify output conditions
* Formatting
* Use proper types instead of aliases
* MITRE attack fixes
* Using pathlib.Path for an argument
* Use proper method to update a set from a dict
* First round of `ruff` fixes
* More fixes
* More fixes
* Hack against cyclic dependency
* Ignore `PLC0415`
* Remove unused markers
* Cleanup
* Fixing the incorrect condition
* Update .github/CODEOWNERS
Co-authored-by: Mika Ayenson, PhD <Mikaayenson@users.noreply.github.com>
* Set explicit default values for optional fields
* Update the guidelines
* Adding None Defaults
---------
Co-authored-by: Mika Ayenson, PhD <Mikaayenson@users.noreply.github.com>
Co-authored-by: eric-forte-elastic <eric.forte@elastic.co>
* Delete RTAs
* Delete RTA-related orchestration code
* Drop RTAs from tests
* Remove RTAs from README
* Further cleanup
* Readme update
* Version bump and no more RTAs
* Styling fixes
* Drop RTAs from config files
* Drop `rule-mapping.yaml`
* Bring back event collector / normalizer
* Drop rta mention
* Cleanup rta leftovers
* Style fix
---------
Co-authored-by: Mika Ayenson, PhD <Mikaayenson@users.noreply.github.com>
* [Rule Tuning] Tighten Up Windows EventLog Indexes, Improve tags
* Format & order
* Update pyproject.toml
* Update credential_access_cookies_chromium_browsers_debugging.toml
* add description to hunting schema; change queries to be a list
* update CreateRemoteThread by process hunt
* update dll hijack and masquerading as MSFT library
* remove sysmon specific DLL hijack via masquerading MSFT library
* updated Masquerading Attempts as Native Windows Binaries
* updates Rare DLL Side-Loading by Occurrence
* updates Rare LSASS Process Access Attempts
* update DNS Queries via LOLBins with Low Occurence Frequency
* updated Low Occurrence of Drivers Loaded on Unique Hosts
* updates Excessive RDP Network Activity by Host and User
* updates Excessive SMB Network Activity by Process ID
* updated Executable File Creation by an Unusual Microsoft Binary
* Frequency of Process Execution and Network Logon by Source Address
* updates Frequency of Process Execution and Network Logon by Source Address
* updated Execution via Remote Services by Client Address
* updated Startup Execution with Low Occurrence Frequency by Unique Host
* updated Low Frequency of Process Execution via WMI by Unique Agent
* updated Low Frequency of Process Execution via Windows Scheduled Task by Unique Agent
* updated Low Occurence of Process Execution via Windows Services with Unique Agent
* Updated High Count of Network Connection Over Extended Period by Process
* update Libraries Loaded by svchost with Low Occurrence Frequency
* updated Microsoft Office Child Processes with Low Occurrence Frequency by Unique Agent
* updated Network Discovery via Sensitive Ports by Unusual Process
* updated PE File Transfer via SMB_Admin Shares by Agent or User
* updated Persistence via Run Key with Low Occurrence Frequency
* updates Persistence via Startup with Low Occurrence Frequency by Unique Host
* updates "Persistence via Run Key with Low Occurrence Frequency"; adjusted file names to remove data source
* updates "Low Occurrence of Suspicious Launch Agent or Launch Daemon"
* updates "Egress Network Connections with Total Bytes Greater than Threshold"
* updates "Rundll32 Execution Aggregated by Command Line"
* updates "Scheduled tasks Creation by Action via Registry"
* updates "Scheduled Tasks Creation for Unique Hosts by Task Command"
* updates "Suspicious Base64 Encoded Powershell Command"
* updates "Suspicious DNS TXT Record Lookups by Process"
* updates "Unique Windows Services Creation by Service File Name"
* Updates "Unique Windows Services Creation by Service File Name"
* updates "Windows Command and Scripting Interpreter from Unusual Parent Process"
* updates "Windows Logon Activity by Source IP"
* updates "Suspicious Network Connections by Unsigned Mach-O"
* updates LLM hunting queries
* re-generated markdown files; updated generate markdown py file
* updated test_hunt_data
* Update hunting/macos/queries/suspicious_network_connections_by_unsigned_macho.toml
Co-authored-by: Mika Ayenson <Mikaayenson@users.noreply.github.com>
* Update hunting/windows/queries/drivers_load_with_low_occurrence_frequency.toml
Co-authored-by: Mika Ayenson <Mikaayenson@users.noreply.github.com>
* Update hunting/windows/queries/domain_names_queried_via_lolbins_and_with_low_occurence_frequency.toml
Co-authored-by: Mika Ayenson <Mikaayenson@users.noreply.github.com>
* Update hunting/windows/queries/excessive_rdp_network_activity_by_source_host_and_user.toml
Co-authored-by: Mika Ayenson <Mikaayenson@users.noreply.github.com>
* Update hunting/windows/queries/excessive_rdp_network_activity_by_source_host_and_user.toml
Co-authored-by: Mika Ayenson <Mikaayenson@users.noreply.github.com>
* updated missing integrations
* updated MD docs according to recent hunting changes
* Update hunting/windows/queries/executable_file_creation_by_an_unusual_microsoft_binary.toml
Co-authored-by: Jonhnathan <26856693+w0rk3r@users.noreply.github.com>
* Update hunting/windows/queries/detect_rare_dll_sideload_by_occurrence.toml
Co-authored-by: Jonhnathan <26856693+w0rk3r@users.noreply.github.com>
* Update hunting/windows/queries/detect_masquerading_attempts_as_native_windows_binaries.toml
Co-authored-by: Jonhnathan <26856693+w0rk3r@users.noreply.github.com>
* Update hunting/windows/queries/detect_dll_hijack_via_masquerading_as_microsoft_native_libraries.toml
Co-authored-by: Jonhnathan <26856693+w0rk3r@users.noreply.github.com>
* Update hunting/llm/queries/aws_bedrock_dos_resource_exhaustion_detection.toml
Co-authored-by: Jonhnathan <26856693+w0rk3r@users.noreply.github.com>
* added enrichment policy link to rule
* Update hunting/windows/docs/execution_via_windows_management_instrumentation_by_occurrence_frequency_by_unique_agent.md
Co-authored-by: Samirbous <64742097+Samirbous@users.noreply.github.com>
* Update hunting/windows/docs/windows_command_and_scripting_interpreter_from_unusual_parent.md
Co-authored-by: Samirbous <64742097+Samirbous@users.noreply.github.com>
* Update hunting/windows/docs/windows_command_and_scripting_interpreter_from_unusual_parent.md
Co-authored-by: Samirbous <64742097+Samirbous@users.noreply.github.com>
* Update hunting/windows/docs/rundll32_execution_aggregated_by_cmdline.md
Co-authored-by: Samirbous <64742097+Samirbous@users.noreply.github.com>
* Update hunting/windows/docs/microsoft_office_child_processes_with_low_occurrence_frequency.md
Co-authored-by: Samirbous <64742097+Samirbous@users.noreply.github.com>
* Update hunting/windows/docs/microsoft_office_child_processes_with_low_occurrence_frequency.md
Co-authored-by: Samirbous <64742097+Samirbous@users.noreply.github.com>
* Update hunting/windows/queries/execution_via_windows_management_instrumentation_by_occurrence_frequency_by_unique_agent.toml
Co-authored-by: Samirbous <64742097+Samirbous@users.noreply.github.com>
* Update hunting/windows/queries/execution_via_windows_management_instrumentation_by_occurrence_frequency_by_unique_agent.toml
Co-authored-by: Samirbous <64742097+Samirbous@users.noreply.github.com>
* Update hunting/index.md
Co-authored-by: Samirbous <64742097+Samirbous@users.noreply.github.com>
* Update hunting/windows/docs/execution_via_network_logon_by_occurrence_frequency_by_top_source_ip.md
Co-authored-by: Samirbous <64742097+Samirbous@users.noreply.github.com>
* Update hunting/windows/queries/execution_via_network_logon_by_occurrence_frequency_by_top_source_ip.toml
Co-authored-by: Samirbous <64742097+Samirbous@users.noreply.github.com>
---------
Co-authored-by: Mika Ayenson <Mikaayenson@users.noreply.github.com>
Co-authored-by: Jonhnathan <26856693+w0rk3r@users.noreply.github.com>
Co-authored-by: Samirbous <64742097+Samirbous@users.noreply.github.com>