7ae298005d
* [Bug] KQL Validation Add Wildcard w/ Space token value

## Summary

Fixes KQL parser to support wildcard values containing spaces (e.g., `*S3 Browser*`), which work in Kibana but were rejected by our unit tests.

**Issue:** #5750

## Changes

### Grammar (`lib/kql/kql/kql.g`)

- Added `WILDCARD_LITERAL` token with priority 3 to match wildcard patterns containing spaces
- Uses negative lookahead to stop before `or`/`and`/`not` keywords
- Added to `value` rule (not `literal`) so field names remain unaffected

### Parser (`lib/kql/kql/parser.py`)

- Handle new `WILDCARD_LITERAL` token type as wildcards
- Quoted strings (`"*text*"`) now treated as literals, matching Kibana behavior

## Behavior

| Query | Before | After |
|-------|--------|-------|
| `field: *S3 Browser*` | ❌ Parse error | ✅ Wildcard |
| `field: *test*` | ✅ Wildcard | ✅ Wildcard |
| `common.*: value` | ✅ Works | ✅ Works |
| `field: "*text*"` | Wildcard | ✅ Literal (matches Kibana) |

## Test plan

- [x] All 63 existing KQL unit tests pass
- [x] New wildcard-with-spaces patterns parse correctly
- [x] Wildcard field names (`common.*`) still work
- [x] Keywords (`or`, `and`, `not`) correctly recognized as separators
- [x] Tested against rule file from PR #5694

* update pyproject version

* update kibana and kql pyproject.toml versions

* update wildcard_literal pattern to account for false matches with leading keywords

Add a negative lookahead at the start of Pattern 2: `(?!(?:or|and|not)\b)` prevents matching values that begin with keywords, like `not /path*`.

* adding NOT keyword token and support for wildcard in the middle of spaced phrase

# KQL Parser Changes - Wildcard Spaces and NOT Prefix Fix

## Overview

This update fixes two issues in the KQL parser:

1. **Wildcard values with spaces** - Values like `*S3 Browser*` now parse correctly
2. **NOT prefix false match** - Values like `not /tmp/go-build*` are no longer incorrectly consumed as a single wildcard literal

## Files Modified

### `lib/kql/kql/kql.g` (Grammar)

**Added `optional_not` rule** to handle `NOT` as an explicit grammar element:

```
?list_of_values: "(" or_list_of_values ")"
               | optional_not value

?optional_not: NOT optional_not
             |
```

**Expanded `WILDCARD_LITERAL`** with 4 patterns to support all wildcard-with-space cases:

| Pattern | Description | Example |
|---------|-------------|---------|
| 1 | Starts with `*` | `*S3 Browser`, `*S3 Browser*` |
| 2 | Ends with `*` (doesn't start with `*`) | `S3 Browser*` |
| 3a | `*` appears after a space | `S3 B*owser` |
| 3b | `*` appears before a space | `S3* Browser` |

### `lib/kql/kql/parser.py`

Added methods to handle the new grammar rules:

- `list_of_values()` - handles `optional_not value` structure
- `optional_not()` - counts NOT occurrences and wraps values with `NotValue`

### `lib/kql/kql/kql2eql.py`

Added corresponding methods for EQL conversion:

- `list_of_values()` - handles `optional_not value` structure
- `optional_not()` - counts NOT occurrences and wraps with `eql.ast.Not`

## Test Results

All 63 kuery tests pass. Verified wildcard cases:

| Input | Result |
|-------|--------|
| `field: *S3 Browser*` | `field:*S3\ Browser*` |
| `field: S3 Browser*` | `field:S3\ Browser*` |
| `field: *S3 Browser` | `field:*S3\ Browser` |
| `field: S3 B*owser` | `field:S3\ B*owser` |
| `field: S3* Browser` | `field:S3*\ Browser` |
| `field: foo* bar* baz` | `field:foo*\ bar*\ baz` |
| `process.executable: not /tmp/go-build*` | `not process.executable:/tmp/go-build*` |
| `field < value` | `field < value` (range expression, not wildcard) |

## Technical Notes

### Pattern 3a Fix

Pattern 3a requires at least one character AFTER the `*` (uses `[...]+` instead of `[...]*`). This prevents Pattern 2 from incorrectly matching shorter strings like `S3 B*` when the full value is `S3 B*owser`.
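
The keyword lookahead described above can be illustrated with a small Python sketch. This is not the `WILDCARD_LITERAL` pattern from `kql.g`: the character class, the single combined pattern, and the function name `match_spaced_wildcard` are all assumptions made for illustration. It only shows how a negative lookahead on each word stops a spaced value before `or`/`and`/`not` and rejects values that begin with a keyword.

```python
import re
from typing import Optional

# Assumption: characters allowed in an unquoted word. The real grammar's
# character class may differ.
CHAR = r'[^\s():"{}<>]'
# Boolean keywords; \b keeps words such as "order" or "notafile" from matching.
KEYWORD = r'(?:or|and|not)\b'
# One word that is not a keyword (negative lookahead, then the word itself).
WORD = r'(?!' + KEYWORD + r')' + CHAR + r'+'

# Space-separated words. Because every word carries the lookahead, the match
# ends before a keyword and never starts on one.
VALUE = re.compile(WORD + r'(?: +' + WORD + r')*', re.IGNORECASE)


def match_spaced_wildcard(text: str) -> Optional[str]:
    """Return the leading wildcard-with-spaces value of `text`, or None.

    Hypothetical helper: values without both a `*` and a space are left
    to the ordinary unquoted-literal handling and yield None here.
    """
    m = VALUE.match(text)
    if m is None:
        return None
    value = m.group(0)
    return value if "*" in value and " " in value else None


print(match_spaced_wildcard("*S3 Browser* or other.field: 1"))
print(match_spaced_wildcard("not /tmp/go-build*"))
```

In this sketch the first call yields `*S3 Browser*` and the second yields `None`, mirroring the two cases the commit sets out to fix. The real token splits the cases into four patterns, which a single combined pattern like this one does not reproduce.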

### NOT Keyword Handling

The `optional_not` grammar approach explicitly parses `NOT` as a keyword before the value, preventing it from being consumed as part of a wildcard literal. This is safer than regex-only approaches because:

- The `NOT` token only matches the exact word "not" (case-insensitive)
- Values like `notafile*` are still parsed as `UNQUOTED_LITERAL`
- Edge case: the literal value "not" must be quoted: `field: "not"`

* Changes to Address Review Comments

### Changes to Address Review Comments

@Mikaayenson

1. **Fixed regex patterns to prevent trailing whitespace capture** (`kql.g`)
   - Added `(?=\s|$|[()":{}])` lookahead to all WILDCARD_LITERAL patterns
   - This ensures patterns stop at boundaries without capturing trailing whitespace
2. **Removed `.rstrip()` workaround** (`parser.py`)
   - No longer needed since the regex now handles boundaries correctly
3. **Added explicit WILDCARD_LITERAL handling** (`kql2eql.py`)
   - Now checks `token.type == "WILDCARD_LITERAL"` explicitly
   - Mirrors the approach used in `parser.py`
4. **Added unit tests** (`tests/kuery/test_parser.py`)
   - `test_wildcard_with_spaces` - all 4 WILDCARD_LITERAL patterns
   - `test_wildcard_with_spaces_and_keywords` - wildcards with `and`/`or` boundaries
   - `test_not_prefix_with_wildcard` - NOT keyword not consumed as wildcard
   - `test_quoted_wildcard_as_literal` - quoted wildcards are literal strings
   - `test_triple_not_optimization` - `not not not foo` → `not foo`

* changed test directory from tmp

* changed format of new tests

* Update pyproject.toml

---------

Co-authored-by: Eric Forte <119343520+eric-forte-elastic@users.noreply.github.com>
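
The NOT counting mentioned in the commit message (`not not not foo` → `not foo`) can be sketched as a parity check. The commit does not say whether the cancellation happens inside `optional_not()` or in a later optimization pass, so this is an assumption; the `Literal` and `Negated` classes and the `wrap_with_nots` function are hypothetical stand-ins, not the parser's `NotValue` or `eql.ast.Not`.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Literal:
    """Hypothetical stand-in for a parsed value."""
    text: str


@dataclass(frozen=True)
class Negated:
    """Hypothetical stand-in for a negated value node."""
    operand: "Node"


Node = Union[Literal, Negated]


def wrap_with_nots(not_count: int, value: Node) -> Node:
    """Apply `not_count` leading NOT keywords to `value`.

    Assumption: pairs of NOTs cancel, so only the parity of the count
    matters. An odd count wraps once; an even count leaves the value as is.
    """
    if not_count < 0:
        raise ValueError("not_count must be non-negative")
    return Negated(value) if not_count % 2 == 1 else value


print(wrap_with_nots(3, Literal("foo")))
print(wrap_with_nots(2, Literal("foo")))
```

Counting the keywords first and wrapping once avoids building a nested chain of negations that a later step would have to collapse.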
29 lines
902 B
TOML
[project]
name = "detection-rules-kibana"
version = "0.4.5"
description = "Kibana API utilities for Elastic Detection Rules"
license = {text = "Elastic License v2"}
keywords = ["Elastic", "Kibana", "Detection Rules", "Security", "Elasticsearch"]
classifiers = [
  "Intended Audience :: Developers",
  "Programming Language :: Python :: 3",
  "Programming Language :: Python :: 3.12",
  "Topic :: Security",
  "Topic :: Software Development :: Build Tools",
  "Topic :: Software Development :: Libraries :: Python Modules",
  "Topic :: Software Development",
]
requires-python = ">=3.12"
dependencies = [
  "requests>=2.25,<3.0",
  "elasticsearch~=8.12.1",
]

[project.urls]
Homepage = "https://github.com/elastic/detection-rules"
License = "https://github.com/elastic/detection-rules/blob/main/LICENSE.txt"

[build-system]
requires = ["setuptools", "wheel"]
build-backend = "setuptools.build_meta"