diff --git a/CLI.md b/CLI.md index c93c5ba7c..20ab7376a 100644 --- a/CLI.md +++ b/CLI.md @@ -5,7 +5,7 @@ the [README](README.md). Basic use of the CLI such as [creating a rule](CONTRIBU [testing](CONTRIBUTING.md#testing-a-rule-with-the-cli) are referenced in the [contribution guide](CONTRIBUTING.md). -## Using a config file or environment variables +## Using a user config file or environment variables CLI commands which are tied to Kibana and Elasticsearch are capable of parsing auth-related keyword args from a config file or environment variables. @@ -17,9 +17,9 @@ follows: * config values * prompt (this only applies to certain values) -#### Setup a config file +#### Set up a user config file -In the root directory of this repo, create the file `.detection-rules-cfg.json` and add relevant values +In the root directory of this repo, create the file `.detection-rules-cfg.json` (or `.yaml`) and add the relevant values. Currently supported arguments: * elasticsearch_url @@ -42,6 +42,8 @@ on the building block rules. Using the environment variable `DR_BYPASS_TAGS_VALIDATION` will bypass the Detection Rules Unit Tests on the `tags` field in toml files. +Using the environment variable `DR_BYPASS_TIMELINE_TEMPLATE_VALIDATION` will bypass the timeline template ID and title validation for rules. + ## Importing rules into the repo You can import rules into the repo using the `create-rule` or `import-rules-to-repo` commands. Both of these commands will @@ -85,9 +87,19 @@ Usage: detection_rules import-rules-to-repo [OPTIONS] [INPUT_FILE]... Import rules from json, toml, yaml, or Kibana exported rule file(s). Options: - --required-only Only prompt for required fields - -d, --directory DIRECTORY Load files from a directory - -h, --help Show this message and exit. + -ac, --action-connector-import Include action connectors in export + -e, --exceptions-import Include exceptions in export + --required-only Only prompt for required fields + -d, --directory DIRECTORY Load files from a directory + -s, --save-directory DIRECTORY Save imported rules to a directory + -se, --exceptions-directory DIRECTORY + Save imported exceptions to a directory + -sa, --action-connectors-directory DIRECTORY + Save imported actions to a directory + -ske, --skip-errors Skip rule import errors + -da, --default-author TEXT Default author for rules missing one + -snv, --strip-none-values Strip None values from the rule + -h, --help Show this message and exit. ``` The primary advantage of using this command is the ability to import multiple rules at once. Multiple rule paths can be @@ -97,10 +109,14 @@ a combination of both. In addition to the formats mentioned using `create-rule`, this will also accept an `.ndjson`/`jsonl` file containing multiple rules (as would be the case with a bulk export). +The `-s/--save-directory` parameter optionally specifies a non-default directory in which to place imported rules. If it is not specified, the first directory listed in the rules config will be used. + This will also strip additional fields and prompt for missing required fields. \* Note: This will attempt to parse ALL files recursively within a specified directory. +Additionally, the `-e` flag can be used to import exceptions along with rules from the export file. + ## Commands using Elasticsearch and Kibana clients @@ -165,6 +181,8 @@ Options: -h, --help Show this message and exit. Commands: + export-rules Export custom rules from Kibana. + import-rules Import custom rules into Kibana. search-alerts Search detection engine alerts with KQL.
upload-rule Upload a list of rule .toml files to Kibana. ``` @@ -272,7 +290,7 @@ directly. ```console Usage: detection_rules export-rules-from-repo [OPTIONS] - Export rule(s) into an importable ndjson file. + Export rule(s) and exception(s) into an importable ndjson file. Options: -f, --rule-file FILE @@ -280,13 +298,16 @@ Options: -id, --rule-id TEXT -o, --outfile PATH Name of file for exported rules -r, --replace-id Replace rule IDs with new IDs before export - --stack-version [7.10|7.11|7.12|7.13|7.14|7.15|7.16|7.8|7.9|8.0|8.1|8.10|8.11|8.12|8.13|8.2|8.3|8.4|8.5|8.6|8.7|8.8|8.9] + --stack-version [7.8|7.9|7.10|7.11|7.12|7.13|7.14|7.15|7.16|8.0|8.1|8.2|8.3|8.4|8.5|8.6|8.7|8.8|8.9|8.10|8.11|8.12|8.13|8.14] Downgrade a rule version to be compatible with older instances of Kibana -s, --skip-unsupported If `--stack-version` is passed, skip rule types which are unsupported (an error will be raised otherwise) --include-metadata Add metadata to the exported rules + -ac, --include-action-connectors + Include Action Connectors in export + -e, --include-exceptions Include Exceptions Lists in export -h, --help Show this message and exit. ``` @@ -317,6 +338,7 @@ Options: --kibana-url TEXT -kp, --kibana-password TEXT -kc, --kibana-cookie TEXT Cookie from an authed session + --api-key TEXT --cloud-id TEXT ID of the cloud instance. Usage: detection_rules kibana import-rules [OPTIONS] @@ -329,7 +351,7 @@ Options: -id, --rule-id TEXT -o, --overwrite Overwrite existing rules -e, --overwrite-exceptions Overwrite exceptions in existing rules - -a, --overwrite-action-connectors + -ac, --overwrite-action-connectors Overwrite action connectors in existing rules -h, --help Show this message and exit. @@ -476,6 +498,51 @@ python -m detection_rules kibana import-rules -d test-export-rules -o ### Exporting rules +This command should be run with the `CUSTOM_RULES_DIR` environment variable set so that the proper versioning validation is applied when the rules are downloaded. See the [custom rules docs](docs/custom-rules.md) for more information. + +``` +python -m detection_rules kibana export-rules -h + +█▀▀▄ ▄▄▄ ▄▄▄ ▄▄▄ ▄▄▄ ▄▄▄ ▄▄▄ ▄▄▄ ▄ ▄ █▀▀▄ ▄ ▄ ▄ ▄▄▄ ▄▄▄ +█ █ █▄▄ █ █▄▄ █ █ █ █ █ █▀▄ █ █▄▄▀ █ █ █ █▄▄ █▄▄ +█▄▄▀ █▄▄ █ █▄▄ █▄▄ █ ▄█▄ █▄█ █ ▀▄█ █ ▀▄ █▄▄█ █▄▄ █▄▄ ▄▄█ + +Kibana client: +Options: + --ignore-ssl-errors TEXT + --space TEXT Kibana space + --provider-name TEXT Elastic Cloud providers: cloud-basic and cloud- + saml (for SSO) + --provider-type TEXT Elastic Cloud providers: basic and saml (for + SSO) + -ku, --kibana-user TEXT + --kibana-url TEXT + -kp, --kibana-password TEXT + -kc, --kibana-cookie TEXT Cookie from an authed session + --api-key TEXT + --cloud-id TEXT ID of the cloud instance. + +Usage: detection_rules kibana export-rules [OPTIONS] + + Export custom rules from Kibana. + +Options: + -d, --directory PATH Directory to export rules to [required] + -acd, --action-connectors-directory PATH + Directory to export action connectors to + -ed, --exceptions-directory PATH + Directory to export exceptions to + -da, --default-author TEXT Default author for rules missing one + -r, --rule-id TEXT Optional Rule IDs to restrict export to + -ac, --export-action-connectors + Include action connectors in export + -e, --export-exceptions Include exceptions in export + -s, --skip-errors Skip errors when exporting rules + -sv, --strip-version Strip the version fields from all rules + -h, --help Show this message and exit.
+ +``` + Example of a rule exporting, with errors skipped ``` @@ -648,4 +715,4 @@ value = "fast" ``` The easiest way to _update_ a rule with existing transform entries is to use `guide-plugin-convert` and manually add it -to the rule. \ No newline at end of file +to the rule. diff --git a/detection_rules/__init__.py b/detection_rules/__init__.py index bd75c6f42..b83894de3 100644 --- a/detection_rules/__init__.py +++ b/detection_rules/__init__.py @@ -11,6 +11,8 @@ import sys assert (3, 12) <= sys.version_info < (4, 0), "Only Python 3.12+ supported" from . import ( # noqa: E402 + custom_schemas, + custom_rules, devtools, docs, eswrap, @@ -28,6 +30,8 @@ from . import ( # noqa: E402 ) __all__ = ( + 'custom_rules', + 'custom_schemas', 'devtools', 'docs', 'eswrap', diff --git a/detection_rules/action.py b/detection_rules/action.py new file mode 100644 index 000000000..95ee9b997 --- /dev/null +++ b/detection_rules/action.py @@ -0,0 +1,64 @@ +# Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one +# or more contributor license agreements. Licensed under the Elastic License +# 2.0; you may not use this file except in compliance with the Elastic License +# 2.0. + +"""Dataclasses for Action.""" +from dataclasses import dataclass +from pathlib import Path +from typing import List, Optional + +from .mixins import MarshmallowDataclassMixin +from .schemas import definitions + + +@dataclass(frozen=True) +class ActionMeta(MarshmallowDataclassMixin): + """Data stored in an action's [metadata] section of TOML.""" + creation_date: definitions.Date + rule_id: List[definitions.UUIDString] + rule_name: str + updated_date: definitions.Date + + # Optional fields + deprecation_date: Optional[definitions.Date] + comments: Optional[str] + maturity: Optional[definitions.Maturity] + + +@dataclass +class Action(MarshmallowDataclassMixin): + """Data object for rule Action.""" + @dataclass + class ActionParams: + """Data object for rule Action params.""" + body: str + + action_type_id: definitions.ActionTypeId + group: str + params: ActionParams + id: Optional[str] + frequency: Optional[dict] + alerts_filter: Optional[dict] + + +@dataclass(frozen=True) +class TOMLActionContents(MarshmallowDataclassMixin): + """Object for action from TOML file.""" + metadata: ActionMeta + actions: List[Action] + + +@dataclass(frozen=True) +class TOMLAction: + """Object for action from TOML file.""" + contents: TOMLActionContents + path: Path + + @property + def name(self): + return self.contents.metadata.rule_name + + @property + def id(self): + return self.contents.metadata.rule_id diff --git a/detection_rules/action_connector.py b/detection_rules/action_connector.py new file mode 100644 index 000000000..8a31c2a8f --- /dev/null +++ b/detection_rules/action_connector.py @@ -0,0 +1,176 @@ +# Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one +# or more contributor license agreements. Licensed under the Elastic License +# 2.0; you may not use this file except in compliance with the Elastic License +# 2.0.
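For orientation, the `TOMLActionContents` schema in `action.py` above implies an action TOML file shaped roughly like the following sketch. All values here are invented for illustration (the UUID, dates, and `.slack` connector type are not taken from this diff):

```toml
[metadata]
creation_date = "2024/05/01"
rule_id = ["fb0fc2a1-5f01-4a43-a217-2c19db4b6e49"]  # hypothetical rule UUID
rule_name = "Example Rule"
updated_date = "2024/05/01"

[[actions]]
action_type_id = ".slack"  # assumed connector type, for illustration only
group = "default"

[actions.params]
body = "Example alert notification body"
```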
+ +"""Dataclasses for Action.""" +from dataclasses import dataclass +from datetime import datetime +from pathlib import Path +from typing import List, Optional, Tuple + +import pytoml +from marshmallow import EXCLUDE + +from .mixins import MarshmallowDataclassMixin +from .schemas import definitions +from .config import parse_rules_config + +RULES_CONFIG = parse_rules_config() + + +@dataclass(frozen=True) +class ActionConnectorMeta(MarshmallowDataclassMixin): + """Data stored in an Action Connector's [metadata] section of TOML.""" + + creation_date: definitions.Date + action_connector_name: str + rule_ids: List[definitions.UUIDString] + rule_names: List[str] + updated_date: definitions.Date + + # Optional fields + deprecation_date: Optional[definitions.Date] + comments: Optional[str] + maturity: Optional[definitions.Maturity] + + +@dataclass +class ActionConnector(MarshmallowDataclassMixin): + """Data object for rule Action Connector.""" + + id: str + attributes: dict + frequency: Optional[dict] + managed: Optional[bool] + type: Optional[str] + references: Optional[List] + + +@dataclass(frozen=True) +class TOMLActionConnectorContents(MarshmallowDataclassMixin): + """Object for action connector from TOML file.""" + + metadata: ActionConnectorMeta + action_connectors: List[ActionConnector] + + @classmethod + def from_action_connector_dict(cls, actions_dict: dict, rule_list: dict) -> "TOMLActionConnectorContents": + """Create a TOMLActionContents from a kibana rule resource.""" + rule_ids = [] + rule_names = [] + + for rule in rule_list: + rule_ids.append(rule["id"]) + rule_names.append(rule["name"]) + + # Format date to match schema + creation_date = datetime.strptime(actions_dict["created_at"], "%Y-%m-%dT%H:%M:%S.%fZ").strftime("%Y/%m/%d") + updated_date = datetime.strptime(actions_dict["updated_at"], "%Y-%m-%dT%H:%M:%S.%fZ").strftime("%Y/%m/%d") + metadata = { + "creation_date": creation_date, + "rule_ids": rule_ids, + "rule_names": rule_names, + "updated_date": updated_date, + "action_connector_name": f"Action Connector {actions_dict.get('id')}", + } + + return cls.from_dict({"metadata": metadata, "action_connectors": [actions_dict]}, unknown=EXCLUDE) + + def to_api_format(self) -> List[dict]: + """Convert the TOML Action Connector to the API format.""" + converted = [] + + for action in self.action_connectors: + converted.append(action.to_dict()) + return converted + + +@dataclass(frozen=True) +class TOMLActionConnector: + """Object for action connector from TOML file.""" + + contents: TOMLActionConnectorContents + path: Path + + @property + def name(self): + return self.contents.metadata.action_connector_name + + def save_toml(self): + """Save the action to a TOML file.""" + assert self.path is not None, f"Can't save action for {self.name} without a path" + # Check if self.path has a .toml extension + path = self.path + if path.suffix != ".toml": + # If it doesn't, add one + path = path.with_suffix(".toml") + with path.open("w") as f: + contents_dict = self.contents.to_dict() + # Sort the dictionary so that 'metadata' is at the top + sorted_dict = dict(sorted(contents_dict.items(), key=lambda item: item[0] != "metadata")) + pytoml.dump(sorted_dict, f) + + +def parse_action_connector_results_from_api(results: List[dict]) -> tuple[List[dict], List[dict]]: + """Filter Kibana export rule results for action connector dictionaries.""" + action_results = [] + non_action_results = [] + for result in results: + if result.get("type") != "action": + non_action_results.append(result) + else: + 
action_results.append(result) + + return action_results, non_action_results + + +def build_action_connector_objects(action_connectors: List[dict], action_connector_rule_table: dict, + action_connectors_directory: Path, save_toml: bool = False, + skip_errors: bool = False, verbose=False, + ) -> Tuple[List[TOMLActionConnector], List[str], List[str]]: + """Build TOMLActionConnector objects from a list of action connector dictionaries.""" + output = [] + errors = [] + toml_action_connectors = [] + for action_connector_dict in action_connectors: + try: + connector_id = action_connector_dict.get("id") + rule_list = action_connector_rule_table.get(connector_id) + if not rule_list: + output.append(f"Warning action connector {connector_id} has no associated rules. Loading skipped.") + continue + else: + contents = TOMLActionConnectorContents.from_action_connector_dict(action_connector_dict, rule_list) + filename = f"{connector_id}_actions.toml" + if RULES_CONFIG.action_connector_dir is None and not action_connectors_directory: + raise FileNotFoundError( + "No Action Connector directory is specified. Please specify either in the config or CLI." + ) + actions_path = ( + Path(action_connectors_directory) / filename + if action_connectors_directory + else RULES_CONFIG.action_connector_dir / filename + ) + if verbose: + output.append(f"[+] Building action connector(s) for {actions_path}") + + ac_object = TOMLActionConnector( + contents=contents, + path=actions_path, + ) + if save_toml: + ac_object.save_toml() + toml_action_connectors.append(ac_object) + + except Exception as e: + if skip_errors: + output.append(f"- skipping actions_connector export - {type(e).__name__}") + if not action_connectors_directory: + errors.append(f"- no actions connector directory found - {e}") + else: + errors.append(f"- actions connector export - {e}") + continue + raise + + return toml_action_connectors, output, errors diff --git a/detection_rules/beats.py b/detection_rules/beats.py index 4af93dfdb..8d695df73 100644 --- a/detection_rules/beats.py +++ b/detection_rules/beats.py @@ -285,6 +285,15 @@ def get_schema_from_kql(tree: kql.ast.BaseNode, beats: list, version: str = None def parse_beats_from_index(index: Optional[list]) -> List[str]: + """Parse beats schema types from index.""" indexes = index or [] - beat_types = [index.split("-")[0] for index in indexes if "beat-*" in index] + beat_types = [] + # Need to split on : to support cross-cluster search + # e.g. mycluster:logs-* -> logs-* + for index in indexes: + if "beat-*" in index: + index_parts = index.split(':', 1) + last_part = index_parts[-1] + beat_type = last_part.split("-")[0] + beat_types.append(beat_type) return beat_types diff --git a/detection_rules/cli_utils.py b/detection_rules/cli_utils.py index 9920d416b..95fdc53f2 100644 --- a/detection_rules/cli_utils.py +++ b/detection_rules/cli_utils.py @@ -5,6 +5,7 @@ import copy import datetime +import functools import os import typing from pathlib import Path @@ -13,18 +14,15 @@ from typing import List, Optional import click import kql -import functools + from . 
import ecs -from .attack import matrix, tactics, build_threat_map_entry -from .rule import TOMLRule, TOMLRuleContents -from .rule_loader import (RuleCollection, - DEFAULT_RULES_DIR, - DEFAULT_BBR_DIR, +from .attack import build_threat_map_entry, matrix, tactics +from .rule import BYPASS_VERSION_LOCK, TOMLRule, TOMLRuleContents +from .rule_loader import (DEFAULT_PREBUILT_BBR_DIRS, + DEFAULT_PREBUILT_RULES_DIRS, RuleCollection, dict_filter) from .schemas import definitions -from .utils import clear_caches, get_path - -RULES_DIR = get_path("rules") +from .utils import clear_caches def single_collection(f): @@ -49,7 +47,7 @@ def single_collection(f): rules.load_directories(Path(d) for d in directories) if rule_id: - rules.load_directories((DEFAULT_RULES_DIR, DEFAULT_BBR_DIR), + rules.load_directories(DEFAULT_PREBUILT_RULES_DIRS + DEFAULT_PREBUILT_BBR_DIRS, obj_filter=dict_filter(rule__rule_id=rule_id)) if len(rules) != 1: client_error(f"Could not find rule with ID {rule_id}") @@ -64,10 +62,10 @@ def multi_collection(f): """Add arguments to get a RuleCollection by file, directory or a list of IDs""" from .misc import client_error - @click.option('--rule-file', '-f', multiple=True, type=click.Path(dir_okay=False), required=False) - @click.option('--directory', '-d', multiple=True, type=click.Path(file_okay=False), required=False, - help='Recursively load rules from a directory') - @click.option('--rule-id', '-id', multiple=True, required=False) + @click.option("--rule-file", "-f", multiple=True, type=click.Path(dir_okay=False), required=False) + @click.option("--directory", "-d", multiple=True, type=click.Path(file_okay=False), required=False, + help="Recursively load rules from a directory") + @click.option("--rule-id", "-id", multiple=True, required=False) @functools.wraps(f) def get_collection(*args, **kwargs): rule_id: List[str] = kwargs.pop("rule_id", []) @@ -76,20 +74,23 @@ def multi_collection(f): rules = RuleCollection() - if not (directories or rule_id or rule_files): - client_error('Required: at least one of --rule-id, --rule-file, or --directory') + if not (directories or rule_id or rule_files or (DEFAULT_PREBUILT_RULES_DIRS + DEFAULT_PREBUILT_BBR_DIRS)): + client_error("Required: at least one of --rule-id, --rule-file, or --directory") rules.load_files(Path(p) for p in rule_files) rules.load_directories(Path(d) for d in directories) if rule_id: - rules.load_directories((DEFAULT_RULES_DIR, DEFAULT_BBR_DIR), - obj_filter=dict_filter(rule__rule_id=rule_id)) + rules.load_directories( + DEFAULT_PREBUILT_RULES_DIRS + DEFAULT_PREBUILT_BBR_DIRS, obj_filter=dict_filter(rule__rule_id=rule_id) + ) found_ids = {rule.id for rule in rules} missing = set(rule_id).difference(found_ids) if missing: client_error(f'Could not find rules with IDs: {", ".join(missing)}') + elif not rule_files and not directories: + rules.load_directories(Path(d) for d in (DEFAULT_PREBUILT_RULES_DIRS + DEFAULT_PREBUILT_BBR_DIRS)) if len(rules) == 0: client_error("No rules found") @@ -101,7 +102,8 @@ def multi_collection(f): def rule_prompt(path=None, rule_type=None, required_only=True, save=True, verbose=False, - additional_required: Optional[list] = None, **kwargs) -> TOMLRule: + additional_required: Optional[list] = None, skip_errors: bool = False, strip_none_values=True, **kwargs, + ) -> TOMLRule: """Prompt loop to build a rule.""" from .misc import schema_prompt @@ -112,6 +114,8 @@ def rule_prompt(path=None, rule_type=None, required_only=True, save=True, verbos kwargs = copy.deepcopy(kwargs) + rule_name = 
kwargs.get('name') + if 'rule' in kwargs and 'metadata' in kwargs: kwargs.update(kwargs.pop('metadata')) kwargs.update(kwargs.pop('rule')) @@ -132,8 +136,8 @@ def rule_prompt(path=None, rule_type=None, required_only=True, save=True, verbos contents[name] = rule_type continue - # these are set at package release time - if name == 'version': + # these are set at package release time depending on the version strategy + if (name == 'version' or name == 'revision') and not BYPASS_VERSION_LOCK: continue if required_only and name not in required_fields: @@ -142,20 +146,20 @@ def rule_prompt(path=None, rule_type=None, required_only=True, save=True, verbos # build this from technique ID if name == 'threat': threat_map = [] + if not skip_errors: + while click.confirm('add mitre tactic?'): + tactic = schema_prompt('mitre tactic name', type='string', enum=tactics, is_required=True) + technique_ids = schema_prompt(f'technique or sub-technique IDs for {tactic}', type='array', + is_required=False, enum=list(matrix[tactic])) or [] - while click.confirm('add mitre tactic?'): - tactic = schema_prompt('mitre tactic name', type='string', enum=tactics, is_required=True) - technique_ids = schema_prompt(f'technique or sub-technique IDs for {tactic}', type='array', - is_required=False, enum=list(matrix[tactic])) or [] - - try: - threat_map.append(build_threat_map_entry(tactic, *technique_ids)) - except KeyError as e: - click.secho(f'Unknown ID: {e.args[0]} - entry not saved for: {tactic}', fg='red', err=True) - continue - except ValueError as e: - click.secho(f'{e} - entry not saved for: {tactic}', fg='red', err=True) - continue + try: + threat_map.append(build_threat_map_entry(tactic, *technique_ids)) + except KeyError as e: + click.secho(f'Unknown ID: {e.args[0]} - entry not saved for: {tactic}', fg='red', err=True) + continue + except ValueError as e: + click.secho(f'{e} - entry not saved for: {tactic}', fg='red', err=True) + continue if len(threat_map) > 0: contents[name] = threat_map @@ -178,8 +182,11 @@ def rule_prompt(path=None, rule_type=None, required_only=True, save=True, verbos ] else: - result = schema_prompt(name, is_required=name in required_fields, **options.copy()) - + if skip_errors: + # return missing information + return f"Rule: {kwargs["id"]}, Rule Name: {rule_name} is missing {name} information" + else: + result = schema_prompt(name, is_required=name in required_fields, **options.copy()) if result: if name not in required_fields and result == options.get('default', ''): skipped.append(name) @@ -187,13 +194,16 @@ def rule_prompt(path=None, rule_type=None, required_only=True, save=True, verbos contents[name] = result - suggested_path = os.path.join(RULES_DIR, contents['name']) # TODO: UPDATE BASED ON RULE STRUCTURE - path = os.path.realpath(path or input('File path for rule [{}]: '.format(suggested_path)) or suggested_path) + # DEFAULT_PREBUILT_RULES_DIRS[0] is a required directory just as a suggestion + suggested_path = Path(DEFAULT_PREBUILT_RULES_DIRS[0]) / contents['name'] + path = Path(path or input(f'File path for rule [{suggested_path}]: ') or suggested_path).resolve() meta = {'creation_date': creation_date, 'updated_date': creation_date, 'maturity': 'development'} try: rule = TOMLRule(path=Path(path), contents=TOMLRuleContents.from_dict({'rule': contents, 'metadata': meta})) except kql.KqlParseError as e: + if skip_errors: + return f"Rule: {kwargs['id']}, Rule Name: {rule_name} query failed to parse: {e.error_msg}" if e.error_msg == 'Unknown field': warning = ('If using a non-ECS field, 
you must update "ecs{}.non-ecs-schema.json" under `beats` or ' '`legacy-endgame` (Non-ECS fields should be used minimally).'.format(os.path.sep)) @@ -218,9 +228,13 @@ def rule_prompt(path=None, rule_type=None, required_only=True, save=True, verbos continue break + except Exception as e: + if skip_errors: + return f"Rule: {kwargs['id']}, Rule Name: {rule_name} failed: {e}" + raise e if save: - rule.save_toml() + rule.save_toml(strip_none_values=strip_none_values) if skipped: print('Did not set the following values because they are un-required when set to the default value') diff --git a/detection_rules/config.py b/detection_rules/config.py new file mode 100644 index 000000000..022d9d048 --- /dev/null +++ b/detection_rules/config.py @@ -0,0 +1,316 @@ +# Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one +# or more contributor license agreements. Licensed under the Elastic License +# 2.0; you may not use this file except in compliance with the Elastic License +# 2.0. + +"""Configuration support for custom components.""" +import fnmatch +import os +from dataclasses import dataclass, field +from pathlib import Path +from functools import cached_property +from typing import Dict, List, Optional + +import yaml +from eql.utils import load_dump + +from .misc import discover_tests +from .utils import cached, load_etc_dump, get_etc_path, set_all_validation_bypass + +ROOT_DIR = Path(__file__).parent.parent +CUSTOM_RULES_DIR = os.getenv('CUSTOM_RULES_DIR', None) + + +@dataclass +class UnitTest: + """Base object for unit tests configuration.""" + bypass: Optional[List[str]] = None + test_only: Optional[List[str]] = None + + def __post_init__(self): + assert (self.bypass is None or self.test_only is None), \ + 'Cannot set both `test_only` and `bypass` in test_config!' 
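For context on the `UnitTest` dataclass above (and `RuleValidation` just below), a minimal sketch of a test config file that `TestConfig.from_dict` could consume might look like this — the test names, pattern, and rule ID are placeholders, and the `pattern:` prefix form is the one handled by `parse_out_patterns` later in this file:

```yaml
unit_tests:
  bypass:  # run everything except these
    - tests.test_packages.TestRegistryPackage.test_registry_package_config
    - "pattern:tests.test_gh_workflows.*"
rule_validation:
  test_only:  # only validate this hypothetical rule_id
    - "34fde489-94b0-4500-a76f-b8a157cf9269"
```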
+ + +@dataclass +class RuleValidation: + """Base object for rule validation configuration.""" + bypass: Optional[List[str]] = None + test_only: Optional[List[str]] = None + + def __post_init__(self): + assert not (self.bypass and self.test_only), 'Cannot use both test_only and bypass' + + +@dataclass +class ConfigFile: + """Base object for configuration files.""" + + @dataclass + class FilePaths: + packages_file: str + stack_schema_map_file: str + deprecated_rules_file: Optional[str] = None + version_lock_file: Optional[str] = None + + @dataclass + class TestConfigPath: + config: str + + files: FilePaths + rule_dir: List[str] + testing: Optional[TestConfigPath] = None + + @classmethod + def from_dict(cls, obj: dict) -> 'ConfigFile': + files_data = obj.get('files', {}) + files = cls.FilePaths( + deprecated_rules_file=files_data.get('deprecated_rules'), + packages_file=files_data['packages'], + stack_schema_map_file=files_data['stack_schema_map'], + version_lock_file=files_data.get('version_lock') + ) + rule_dir = obj['rule_dirs'] + + testing_data = obj.get('testing') + testing = cls.TestConfigPath( + config=testing_data['config'] + ) if testing_data else None + + return cls(files=files, rule_dir=rule_dir, testing=testing) + + +@dataclass +class TestConfig: + """Detection rules test config file""" + test_file: Optional[Path] = None + unit_tests: Optional[UnitTest] = None + rule_validation: Optional[RuleValidation] = None + + @classmethod + def from_dict(cls, test_file: Optional[Path] = None, unit_tests: Optional[dict] = None, + rule_validation: Optional[dict] = None) -> 'TestConfig': + return cls(test_file=test_file or None, unit_tests=UnitTest(**unit_tests or {}), + rule_validation=RuleValidation(**rule_validation or {})) + + @cached_property + def all_tests(self): + """Get the list of all test names.""" + return discover_tests() + + def tests_by_patterns(self, *patterns: str) -> List[str]: + """Get the list of test names by patterns.""" + tests = set() + for pattern in patterns: + tests.update(list(fnmatch.filter(self.all_tests, pattern))) + return sorted(tests) + + @staticmethod + def parse_out_patterns(names: List[str]) -> (List[str], List[str]): + """Parse out test patterns from a list of test names.""" + patterns = [] + tests = [] + for name in names: + if name.startswith('pattern:') and '*' in name: + patterns.append(name[len('pattern:'):]) + else: + tests.append(name) + return patterns, tests + + @staticmethod + def format_tests(tests: List[str]) -> List[str]: + """Format unit test names into expected format for direct calling.""" + raw = [t.rsplit('.', maxsplit=2) for t in tests] + formatted = [] + for test in raw: + path, clazz, method = test + path = f'{path.replace(".", os.path.sep)}.py' + formatted.append('::'.join([path, clazz, method])) + return formatted + + def get_test_names(self, formatted: bool = False) -> (List[str], List[str]): + """Get the list of test names to run.""" + patterns_t, tests_t = self.parse_out_patterns(self.unit_tests.test_only or []) + patterns_b, tests_b = self.parse_out_patterns(self.unit_tests.bypass or []) + defined_tests = tests_t + tests_b + patterns = patterns_t + patterns_b + unknowns = sorted(set(defined_tests) - set(self.all_tests)) + assert not unknowns, f'Unrecognized test names in config ({self.test_file}): {unknowns}' + + combined_tests = sorted(set(defined_tests + self.tests_by_patterns(*patterns))) + + if self.unit_tests.test_only is not None: + tests = combined_tests + skipped = [t for t in self.all_tests if t not in tests] + elif 
self.unit_tests.bypass: + tests = [] + skipped = [] + for test in self.all_tests: + if test not in combined_tests: + tests.append(test) + else: + skipped.append(test) + else: + tests = self.all_tests + skipped = [] + + if formatted: + return self.format_tests(tests), self.format_tests(skipped) + else: + return tests, skipped + + def check_skip_by_rule_id(self, rule_id: str) -> bool: + """Check if a rule_id should be skipped.""" + bypass = self.rule_validation.bypass + test_only = self.rule_validation.test_only + + # neither bypass nor test_only are defined, so no rules are skipped + if not (bypass or test_only): + return False + # if defined in bypass or not defined in test_only, then skip + return (bypass and rule_id in bypass) or (test_only and rule_id not in test_only) + + +@dataclass +class RulesConfig: + """Detection rules config file.""" + deprecated_rules_file: Path + deprecated_rules: Dict[str, dict] + packages_file: Path + packages: Dict[str, dict] + rule_dirs: List[Path] + stack_schema_map_file: Path + stack_schema_map: Dict[str, dict] + test_config: TestConfig + version_lock_file: Path + version_lock: Dict[str, dict] + + action_dir: Optional[Path] = None + action_connector_dir: Optional[Path] = None + auto_gen_schema_file: Optional[Path] = None + bbr_rules_dirs: Optional[List[Path]] = field(default_factory=list) + bypass_version_lock: bool = False + exception_dir: Optional[Path] = None + normalize_kql_keywords: bool = True + bypass_optional_elastic_validation: bool = False + + def __post_init__(self): + """Perform post validation on packages.yaml file.""" + if 'package' not in self.packages: + raise ValueError('Missing the `package` field defined in packages.yaml.') + + if 'name' not in self.packages['package']: + raise ValueError('Missing the `name` field defined in packages.yaml.') + + +@cached +def parse_rules_config(path: Optional[Path] = None) -> RulesConfig: + """Parse the _config.yaml file for default or custom rules.""" + if path: + assert path.exists(), f'rules config file does not exist: {path}' + loaded = yaml.safe_load(path.read_text()) + elif CUSTOM_RULES_DIR: + path = Path(CUSTOM_RULES_DIR) / '_config.yaml' + loaded = yaml.safe_load(path.read_text()) + else: + path = Path(get_etc_path('_config.yaml')) + loaded = load_etc_dump('_config.yaml') + + try: + ConfigFile.from_dict(loaded) + except KeyError as e: + raise SystemExit(f'Missing key `{str(e)}` in _config.yaml file.') + except (AttributeError, TypeError): + raise SystemExit(f'No data properly loaded from {path}') + except ValueError as e: + raise SystemExit(e) + + base_dir = path.resolve().parent + + # testing + # precedence to the environment variable + # environment variable is absolute path and config file is relative to the _config.yaml file + test_config_ev = os.getenv('DETECTION_RULES_TEST_CONFIG', None) + if test_config_ev: + test_config_path = Path(test_config_ev) + else: + test_config_file = loaded.get('testing', {}).get('config') + if test_config_file: + test_config_path = base_dir.joinpath(test_config_file) + else: + test_config_path = None + + if test_config_path: + test_config_data = yaml.safe_load(test_config_path.read_text()) + + # overwrite None with empty list to allow implicit exemption of all tests with `test_only` defined to None in + # test config + if 'unit_tests' in test_config_data and test_config_data['unit_tests'] is not None: + test_config_data['unit_tests'] = {k: v or [] for k, v in test_config_data['unit_tests'].items()} + test_config = 
TestConfig.from_dict(test_file=test_config_path, **test_config_data) + else: + test_config = TestConfig.from_dict() + + # files + # paths are relative + files = {f'{k}_file': base_dir.joinpath(v) for k, v in loaded['files'].items()} + contents = {k: load_dump(str(base_dir.joinpath(v).resolve())) for k, v in loaded['files'].items()} + + contents.update(**files) + + # directories + # paths are relative + if loaded.get('directories'): + contents.update({k: base_dir.joinpath(v).resolve() for k, v in loaded['directories'].items()}) + + # rule_dirs + # paths are relative + contents['rule_dirs'] = [base_dir.joinpath(d).resolve() for d in loaded.get('rule_dirs')] + + # directories + # paths are relative + if loaded.get('directories'): + directories = loaded.get('directories') + if directories.get('exception_dir'): + contents['exception_dir'] = base_dir.joinpath(directories.get('exception_dir')).resolve() + if directories.get('action_dir'): + contents['action_dir'] = base_dir.joinpath(directories.get('action_dir')).resolve() + if directories.get('action_connector_dir'): + contents['action_connector_dir'] = base_dir.joinpath(directories.get('action_connector_dir')).resolve() + + # version strategy + contents['bypass_version_lock'] = loaded.get('bypass_version_lock', False) + + # bbr_rules_dirs + # paths are relative + if loaded.get('bbr_rules_dirs'): + contents['bbr_rules_dirs'] = [base_dir.joinpath(d).resolve() for d in loaded.get('bbr_rules_dirs', [])] + + # kql keyword normalization + contents['normalize_kql_keywords'] = loaded.get('normalize_kql_keywords', True) + + if loaded.get('auto_gen_schema_file'): + contents['auto_gen_schema_file'] = base_dir.joinpath(loaded['auto_gen_schema_file']) + + # Check if the file exists + if not contents['auto_gen_schema_file'].exists(): + # If the file doesn't exist, create an empty JSON file + contents['auto_gen_schema_file'].write_text('{}') + + # bypass_optional_elastic_validation + contents['bypass_optional_elastic_validation'] = loaded.get('bypass_optional_elastic_validation', False) + if contents['bypass_optional_elastic_validation']: + set_all_validation_bypass(contents['bypass_optional_elastic_validation']) + + try: + rules_config = RulesConfig(test_config=test_config, **contents) + except (ValueError, TypeError) as e: + raise SystemExit(f'Error parsing packages.yaml: {str(e)}') + + return rules_config + + +@cached +def load_current_package_version() -> str: + """Load the current package version from config file.""" + return parse_rules_config().packages['package']['name'] diff --git a/detection_rules/custom_rules.py b/detection_rules/custom_rules.py new file mode 100644 index 000000000..6a4d71371 --- /dev/null +++ b/detection_rules/custom_rules.py @@ -0,0 +1,150 @@ +# Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one +# or more contributor license agreements. Licensed under the Elastic License +# 2.0; you may not use this file except in compliance with the Elastic License +# 2.0. 
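The `custom_rules` module below registers a `custom-rules setup-config` command that scaffolds this structure. A hypothetical invocation (the directory name and Kibana version are illustrative; the version argument is optional and defaults to the `name` field in `packages.yaml`):

```console
python -m detection_rules custom-rules setup-config ./custom-rules 8.12.0
export CUSTOM_RULES_DIR=./custom-rules
```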
+ +"""Commands for supporting custom rules.""" +from pathlib import Path + +import click +import yaml + +from .main import root +from .utils import get_etc_path, load_etc_dump, ROOT_DIR + +from semver import Version + +DEFAULT_CONFIG_PATH = Path(get_etc_path('_config.yaml')) +CUSTOM_RULES_DOC_PATH = Path(ROOT_DIR).joinpath('docs', 'custom-rules.md') + + +@root.group('custom-rules') +def custom_rules(): + """Commands for supporting custom rules.""" + + +def create_config_content() -> str: + """Create the initial content for the _config.yaml file.""" + # Base structure of the configuration + config_content = { + 'rule_dirs': ['rules'], + 'bbr_rules_dirs': ['rules_building_block'], + 'directories': { + 'action_dir': 'actions', + 'action_connector_dir': 'action_connectors', + 'exception_dir': 'exceptions', + }, + 'files': { + 'deprecated_rules': 'etc/deprecated_rules.json', + 'packages': 'etc/packages.yaml', + 'stack_schema_map': 'etc/stack-schema-map.yaml', + 'version_lock': 'etc/version.lock.json', + }, + 'testing': { + 'config': 'etc/test_config.yaml' + } + } + + return yaml.safe_dump(config_content, default_flow_style=False) + + +def create_test_config_content(enable_prebuilt_tests: bool) -> str: + """Generate the content for the test_config.yaml with special content and references.""" + + def format_test_string(test_string: str, comment_char: str) -> str: + """Generate a yaml formatted string with a comment character.""" + return f"{comment_char} - {test_string}" + + comment_char = "#" if enable_prebuilt_tests else "" + example_test_config_path = DEFAULT_CONFIG_PATH.parent.joinpath("example_test_config.yaml") + + lines = [ + "# For more details, refer to the example configuration:", + f"# {example_test_config_path}", + "# Define tests to explicitly bypass, with all others being run.", + "# To run all tests, set bypass to empty or leave this file commented out.", + "", + "unit_tests:", + " bypass:", + format_test_string("tests.test_gh_workflows.TestWorkflows.test_matrix_to_lock_version_defaults", comment_char), + format_test_string( + "tests.test_schemas.TestVersionLockSchema.test_version_lock_has_nested_previous", comment_char + ), + format_test_string("tests.test_packages.TestRegistryPackage.test_registry_package_config", comment_char), + format_test_string("tests.test_all_rules.TestValidRules.test_schema_and_dupes", comment_char), + ] + + return "\n".join(lines) + + +@custom_rules.command('setup-config') +@click.argument('directory', type=Path) +@click.argument('kibana-version', type=str, default=load_etc_dump('packages.yaml')['package']['name']) +@click.option('--overwrite', is_flag=True, help="Overwrite the existing _config.yaml file.") +@click.option( + "--enable-prebuilt-tests", "-e", is_flag=True, help="Enable all prebuilt tests instead of default subset." +) +def setup_config(directory: Path, kibana_version: str, overwrite: bool, enable_prebuilt_tests: bool): + """Setup the custom rules configuration directory and files with defaults.""" + + config = directory / '_config.yaml' + if not overwrite and config.exists(): + raise FileExistsError(f'{config} already exists. 
Use --overwrite to update') + + etc_dir = directory / 'etc' + test_config = etc_dir / 'test_config.yaml' + package_config = etc_dir / 'packages.yaml' + stack_schema_map_config = etc_dir / 'stack-schema-map.yaml' + config_files = [ + package_config, + stack_schema_map_config, + test_config, + config, + ] + directories = [ + directory / 'actions', + directory / 'action_connectors', + directory / 'exceptions', + directory / 'rules', + directory / 'rules_building_block', + etc_dir, + ] + version_files = [ + etc_dir / 'deprecated_rules.json', + etc_dir / 'version.lock.json', + ] + + # Create directories + for dir_ in directories: + dir_.mkdir(parents=True, exist_ok=True) + click.echo(f'Created directory: {dir_}') + + # Create version_files and populate with default content if applicable + for file_ in version_files: + file_.write_text('{}') + click.echo( + f'Created file with default content: {file_}' + ) + + # Create the stack-schema-map.yaml file + stack_schema_map_content = load_etc_dump('stack-schema-map.yaml') + latest_version = max(stack_schema_map_content.keys(), key=lambda v: Version.parse(v)) + latest_entry = {latest_version: stack_schema_map_content[latest_version]} + stack_schema_map_config.write_text(yaml.safe_dump(latest_entry, default_flow_style=False)) + + # Create default packages.yaml + package_content = {'package': {'name': kibana_version}} + package_config.write_text(yaml.safe_dump(package_content, default_flow_style=False)) + + # Create and configure test_config.yaml + test_config.write_text(create_test_config_content(enable_prebuilt_tests)) + + # Create and configure _config.yaml + config.write_text(create_config_content()) + + for file_ in config_files: + click.echo(f'Created file with default content: {file_}') + + click.echo(f'\n# For details on how to configure the _config.yaml file,\n' + f'# consult: {DEFAULT_CONFIG_PATH.resolve()}\n' + f'# or the docs: {CUSTOM_RULES_DOC_PATH.resolve()}') diff --git a/detection_rules/custom_schemas.py b/detection_rules/custom_schemas.py new file mode 100644 index 000000000..214d09df7 --- /dev/null +++ b/detection_rules/custom_schemas.py @@ -0,0 +1,107 @@ +# Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one +# or more contributor license agreements. Licensed under the Elastic License +# 2.0; you may not use this file except in compliance with the Elastic License +# 2.0. 
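The `custom_schemas` module below treats any non-reserved key (anything other than `beats`, `ecs`, or `endgame`) in `stack-schema-map.yaml` as a pointer to a custom schema file, resolving relative paths against the map file's parent directory. A hypothetical mapping entry — the version numbers, key name, and path are illustrative:

```yaml
# etc/stack-schema-map.yaml (sketch)
"8.12.0":
  ecs: "8.11.0"
  custom: "schemas/custom-schema.json"  # non-reserved key, treated as a schema path
```

The referenced JSON file would then map index patterns to field-type dictionaries, for example `{"logs-custom.*": {"custom.field": "keyword"}}`, which is the same shape `update_data` writes into the auto-generated schema file.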
+ +"""Custom Schemas management.""" +import uuid +from pathlib import Path + +import eql +import eql.types +from eql import load_dump, save_dump + +from .config import parse_rules_config +from .utils import cached, clear_caches + +RULES_CONFIG = parse_rules_config() +RESERVED_SCHEMA_NAMES = ["beats", "ecs", "endgame"] + + +@cached +def get_custom_schemas(stack_version: str = None) -> dict: + """Load custom schemas if present.""" + custom_schema_dump = {} + + stack_versions = [stack_version] if stack_version else RULES_CONFIG.stack_schema_map.keys() + + for version in stack_versions: + stack_schema_map = RULES_CONFIG.stack_schema_map[version] + + for schema, value in stack_schema_map.items(): + if schema not in RESERVED_SCHEMA_NAMES: + schema_path = Path(value) + if not schema_path.is_absolute(): + schema_path = RULES_CONFIG.stack_schema_map_file.parent / value + if schema_path.is_file(): + custom_schema_dump.update(eql.utils.load_dump(str(schema_path))) + else: + raise ValueError(f"Custom schema must be a file: {schema_path}") + + return custom_schema_dump + + +def resolve_schema_path(path: str) -> Path: + """Helper function to resolve the schema path.""" + path_obj = Path(path) + return path_obj if path_obj.is_absolute() else RULES_CONFIG.stack_schema_map_file.parent.joinpath(path) + + +def update_data(index: str, field: str, data: dict) -> dict: + """Update the schema entry with the appropriate index and field.""" + if index not in data: + data[index] = {} + data[index][field] = "keyword" + return data + + +def update_stack_schema_map(stack_schema_map: dict, auto_gen_schema_file: str) -> dict: + """Update the stack-schema-map.yaml file with the appropriate auto_gen_schema_file location.""" + random_uuid = str(uuid.uuid4()) + auto_generated_id = None + for version in stack_schema_map: + key_found = False + for key, value in stack_schema_map[version].items(): + value_path = resolve_schema_path(value) + if value_path == Path(auto_gen_schema_file).resolve() and key not in RESERVED_SCHEMA_NAMES: + auto_generated_id = key + key_found = True + break + if key_found is False: + if auto_generated_id is None: + auto_generated_id = random_uuid + stack_schema_map[version][auto_generated_id] = str(auto_gen_schema_file) + return stack_schema_map, auto_generated_id, random_uuid + + +def clean_stack_schema_map(stack_schema_map: dict, auto_generated_id: str, random_uuid: str) -> dict: + """Clean up the stack-schema-map.yaml file replacing the random UUID with a known key if possible.""" + for version in stack_schema_map: + if random_uuid in stack_schema_map[version]: + stack_schema_map[version][auto_generated_id] = stack_schema_map[version].pop(random_uuid) + return stack_schema_map + + +def update_auto_generated_schema(index: str, field: str): + """Load custom schemas if present.""" + auto_gen_schema_file = str(RULES_CONFIG.auto_gen_schema_file) + stack_schema_map_file = str(RULES_CONFIG.stack_schema_map_file) + + # Update autogen schema file + data = load_dump(auto_gen_schema_file) + data = update_data(index, field, data) + save_dump(data, auto_gen_schema_file) + + # Update the stack-schema-map.yaml file with the appropriate auto_gen_schema_file location + stack_schema_map = load_dump(stack_schema_map_file) + stack_schema_map, auto_generated_id, random_uuid = update_stack_schema_map(stack_schema_map, auto_gen_schema_file) + save_dump(stack_schema_map, stack_schema_map_file) + + # Clean up the stack-schema-map.yaml file replacing the random UUID with the auto_generated_id + stack_schema_map = 
load_dump(stack_schema_map_file) + stack_schema_map = clean_stack_schema_map(stack_schema_map, auto_generated_id, random_uuid) + save_dump(stack_schema_map, stack_schema_map_file) + + RULES_CONFIG.stack_schema_map = stack_schema_map + # IMPORTANT must clear cache in order to reload schema + clear_caches() diff --git a/detection_rules/devtools.py b/detection_rules/devtools.py index 45bf7bc26..42200a60d 100644 --- a/detection_rules/devtools.py +++ b/detection_rules/devtools.py @@ -33,6 +33,7 @@ from . import attack, rule_loader, utils from .beats import (download_beats_schema, download_latest_beats_schema, refresh_main_schema) from .cli_utils import single_collection +from .config import parse_rules_config from .docs import IntegrationSecurityDocs, IntegrationSecurityDocsMDX from .ecs import download_endpoint_schemas, download_schemas from .endgame import EndgameSchemaManager @@ -52,17 +53,16 @@ from .rule import (AnyRuleData, BaseRuleData, DeprecatedRule, QueryRuleData, RuleTransform, ThreatMapping, TOMLRule, TOMLRuleContents) from .rule_loader import RuleCollection, production_filter from .schemas import definitions, get_stack_versions -from .utils import (dict_hash, get_etc_path, get_path, load_dump, - load_etc_dump, save_etc_dump) -from .version_lock import VersionLockFile, default_version_lock +from .utils import dict_hash, get_etc_path, get_path, load_dump +from .version_lock import VersionLockFile, loaded_version_lock -RULES_DIR = get_path('rules') GH_CONFIG = Path.home() / ".config" / "gh" / "hosts.yml" NAVIGATOR_GIST_ID = '1a3f65224822a30a8228a8ed20289a89' NAVIGATOR_URL = 'https://ela.st/detection-rules-navigator' NAVIGATOR_BADGE = ( f'[![ATT&CK navigator coverage](https://img.shields.io/badge/ATT&CK-Navigator-red.svg)]({NAVIGATOR_URL})' ) +RULES_CONFIG = parse_rules_config() def get_github_token() -> Optional[str]: @@ -87,10 +87,21 @@ def dev_group(): @click.option('--generate-navigator', is_flag=True, help='Generate ATT&CK navigator files') @click.option('--generate-docs', is_flag=True, default=False, help='Generate markdown documentation') @click.option('--update-message', type=str, help='Update message for new package') -def build_release(config_file, update_version_lock: bool, generate_navigator: bool, generate_docs: str, - update_message: str, release=None, verbose=True): +@click.pass_context +def build_release(ctx: click.Context, config_file, update_version_lock: bool, generate_navigator: bool, + generate_docs: str, update_message: str, release=None, verbose=True): """Assemble all the rules into Kibana-ready release files.""" - config = load_dump(str(config_file))['package'] + if RULES_CONFIG.bypass_version_lock: + click.echo('WARNING: You cannot run this command when the versioning strategy is configured to bypass the ' + 'version lock. Set `bypass_version_lock` to `False` in the rules config to use the version lock.') + ctx.exit() + + config = load_dump(config_file)['package'] + + err_msg = f'No `registry_data` in package config. Please see the {get_etc_path("package.yaml")} file for an' \ + f' example on how to supply this field in {PACKAGE_FILE}.' 
+ assert 'registry_data' in config, err_msg + registry_data = config['registry_data'] if generate_navigator: @@ -102,10 +113,11 @@ def build_release(config_file, update_version_lock: bool, generate_navigator: bo if verbose: click.echo(f'[+] Building package {config.get("name")}') - package = Package.from_config(config, verbose=verbose) + package = Package.from_config(config=config, verbose=verbose) if update_version_lock: - default_version_lock.manage_versions(package.rules, save_changes=True, verbose=verbose) + loaded_version_lock.manage_versions(package.rules, save_changes=True, verbose=verbose) + package.save(verbose=verbose) previous_pkg_version = find_latest_integration_version("security_detection_engine", "ga", @@ -192,10 +204,11 @@ def build_integration_docs(ctx: click.Context, registry_version: str, pre: str, @click.option("--new-package", type=click.Choice(['true', 'false']), help="indicates new package") @click.option("--maturity", type=click.Choice(['beta', 'ga'], case_sensitive=False), required=True, help="beta or production versions") +@click.pass_context def bump_versions(major_release: bool, minor_release: bool, patch_release: bool, new_package: str, maturity: str): """Bump the versions""" - pkg_data = load_etc_dump('packages.yaml')['package'] + pkg_data = RULES_CONFIG.packages['package'] kibana_ver = Version.parse(pkg_data["name"], optional_minor_and_patch=True) pkg_ver = Version.parse(pkg_data["registry_data"]["version"]) pkg_kibana_ver = Version.parse(pkg_data["registry_data"]["conditions"]["kibana.version"].lstrip("^")) @@ -236,7 +249,7 @@ def bump_versions(major_release: bool, minor_release: bool, patch_release: bool, click.echo(f"Package Kibana version: {pkg_data['registry_data']['conditions']['kibana.version']}") click.echo(f"Package version: {pkg_data['registry_data']['version']}") - save_etc_dump({"package": pkg_data}, "packages.yaml") + RULES_CONFIG.packages_file.write_text(yaml.safe_dump({"package": pkg_data})) @dataclasses.dataclass @@ -290,7 +303,7 @@ class GitChangeEntry: @click.option("--target-stack-version", "-t", help="Minimum stack version to filter the staging area", required=True) @click.option("--dry-run", is_flag=True, help="List the changes that would be made") @click.option("--exception-list", help="List of files to skip staging", default="") -def prune_staging_area(target_stack_version: str, dry_run: bool, exception_list: list): +def prune_staging_area(target_stack_version: str, dry_run: bool, exception_list: str): """Prune the git staging area to remove changes to incompatible rules.""" exceptions = { "detection_rules/etc/packages.yaml", @@ -313,15 +326,17 @@ def prune_staging_area(target_stack_version: str, dry_run: bool, exception_list: continue # it's a change to a rule file, load it and check the version - if str(change.path.absolute()).startswith(str(RULES_DIR)) and change.path.suffix == ".toml": - # bypass TOML validation in case there were schema changes - dict_contents = RuleCollection.deserialize_toml_string(change.read()) - min_stack_version: Optional[str] = dict_contents.get("metadata", {}).get("min_stack_version") + for rules_dir in RULES_CONFIG.rule_dirs: + if str(change.path.absolute()).startswith(str(rules_dir)) and change.path.suffix == ".toml": + # bypass TOML validation in case there were schema changes + dict_contents = RuleCollection.deserialize_toml_string(change.read()) + min_stack_version: Optional[str] = dict_contents.get("metadata", {}).get("min_stack_version") - if min_stack_version is not None and \ - 
(target_stack_version < Version.parse(min_stack_version, optional_minor_and_patch=True)): - # rule is incompatible, add to the list of reversions to make later - reversions.append(change) + if min_stack_version is not None and \ + (target_stack_version < Version.parse(min_stack_version, optional_minor_and_patch=True)): + # rule is incompatible, add to the list of reversions to make later + reversions.append(change) + break if len(reversions) == 0: click.echo("No files restored from staging area") @@ -334,8 +349,9 @@ def prune_staging_area(target_stack_version: str, dry_run: bool, exception_list: @dev_group.command('update-lock-versions') @click.argument('rule-ids', nargs=-1, required=False) +@click.pass_context @click.option('--force', is_flag=True, help='Force update without confirmation') -def update_lock_versions(rule_ids: Tuple[str, ...], force: bool): +def update_lock_versions(ctx: click.Context, rule_ids: Tuple[str, ...], force: bool): """Update rule hashes in version.lock.json file without bumping version.""" rules = RuleCollection.default() @@ -349,8 +365,13 @@ def update_lock_versions(rule_ids: Tuple[str, ...], force: bool): ): return + if RULES_CONFIG.bypass_version_lock: + click.echo('WARNING: You cannot run this command when the versioning strategy is configured to bypass the ' + 'version lock. Set `bypass_version_lock` to `False` in the rules config to use the version lock.') + ctx.exit() + # this command may not function as expected anymore due to previous changes eliminating the use of add_new=False - changed, new, _ = default_version_lock.manage_versions(rules, exclude_version_update=True, save_changes=True) + changed, new, _ = loaded_version_lock.manage_versions(rules, exclude_version_update=True, save_changes=True) if not changed: click.echo('No hashes updated') @@ -608,7 +629,8 @@ def license_check(ctx, ignore_directory): @dev_group.command('test-version-lock') @click.argument('branches', nargs=-1, required=True) @click.option('--remote', '-r', default='origin', help='Override the remote from "origin"') -def test_version_lock(branches: tuple, remote: str): +@click.pass_context +def test_version_lock(ctx: click.Context, branches: tuple, remote: str): """Simulate the incremental step in the version locking to find version change violations.""" git = utils.make_git('-C', '.') current_branch = git('rev-parse', '--abbrev-ref', 'HEAD') @@ -621,7 +643,8 @@ def test_version_lock(branches: tuple, remote: str): subprocess.check_call(['python', '-m', 'detection_rules', 'dev', 'build-release', '-u']) finally: - diff = git('--no-pager', 'diff', get_etc_path('version.lock.json')) + rules_config = ctx.obj['rules_config'] + diff = git('--no-pager', 'diff', str(rules_config.version_lock_file)) outfile = get_path() / 'lock-diff.txt' outfile.write_text(diff) click.echo(f'diff saved to {outfile}') @@ -718,21 +741,23 @@ def search_rule_prs(ctx, no_loop, query, columns, language, token, threads): @dev_group.command('deprecate-rule') @click.argument('rule-file', type=Path) +@click.option('--deprecation-folder', '-d', type=Path, required=True, + help='Location to move the deprecated rule file to') @click.pass_context -def deprecate_rule(ctx: click.Context, rule_file: Path): +def deprecate_rule(ctx: click.Context, rule_file: Path, deprecation_folder: Path): """Deprecate a rule.""" - version_info = default_version_lock.version_lock + version_info = loaded_version_lock.version_lock rule_collection = RuleCollection() contents = rule_collection.load_file(rule_file).contents rule = 
TOMLRule(path=rule_file, contents=contents) - if rule.contents.id not in version_info: + if rule.contents.id not in version_info and not RULES_CONFIG.bypass_version_lock: click.echo('Rule has not been version locked and so does not need to be deprecated. ' - 'Delete the file or update the maturity to `development` instead') + 'Delete the file or update the maturity to `development` instead.') ctx.exit() today = time.strftime('%Y/%m/%d') - deprecated_path = get_path('rules', '_deprecated', rule_file.name) + deprecated_path = deprecation_folder / rule_file.name # create the new rule and save it new_meta = dataclasses.replace(rule.contents.metadata, @@ -741,6 +766,7 @@ def deprecate_rule(ctx: click.Context, rule_file: Path): maturity='deprecated') contents = dataclasses.replace(rule.contents, metadata=new_meta) new_rule = TOMLRule(contents=contents, path=deprecated_path) + deprecated_path.parent.mkdir(parents=True, exist_ok=True) new_rule.save_toml() # remove the old rule @@ -814,14 +840,20 @@ def update_navigator_gists(directory: Path, token: str, gist_id: str, print_mark @click.argument('stack_version') @click.option('--skip-rule-updates', is_flag=True, help='Skip updating the rules') @click.option('--dry-run', is_flag=True, help='Print the changes rather than saving the file') -def trim_version_lock(stack_version: str, skip_rule_updates: bool, dry_run: bool): +@click.pass_context +def trim_version_lock(ctx: click.Context, stack_version: str, skip_rule_updates: bool, dry_run: bool): """Trim all previous entries within the version lock file which are lower than the min_version.""" stack_versions = get_stack_versions() assert stack_version in stack_versions, \ f'Unknown min_version ({stack_version}), expected: {", ".join(stack_versions)}' min_version = Version.parse(stack_version) - version_lock_dict = default_version_lock.version_lock.to_dict() + + if RULES_CONFIG.bypass_version_lock: + click.echo('WARNING: Cannot trim the version lock when the versioning strategy is configured to bypass the ' + 'version lock. 
Set `bypass_version_lock` to `false` in the rules config to use the version lock.') + ctx.exit() + version_lock_dict = loaded_version_lock.version_lock.to_dict() removed = defaultdict(list) rule_msv_drops = [] diff --git a/detection_rules/ecs.py b/detection_rules/ecs.py index 4ec6bdb15..089bb04cf 100644 --- a/detection_rules/ecs.py +++ b/detection_rules/ecs.py @@ -16,6 +16,8 @@ import requests from semver import Version import yaml +from .config import CUSTOM_RULES_DIR, parse_rules_config +from .custom_schemas import get_custom_schemas from .utils import (DateTimeEncoder, cached, get_etc_path, gzip_compress, load_etc_dump, read_gzip, unzip) @@ -23,6 +25,7 @@ ECS_NAME = "ecs_schemas" ECS_SCHEMAS_DIR = get_etc_path(ECS_NAME) ENDPOINT_NAME = "endpoint_schemas" ENDPOINT_SCHEMAS_DIR = get_etc_path(ENDPOINT_NAME) +RULES_CONFIG = parse_rules_config() def add_field(schema, name, info): @@ -124,6 +127,12 @@ def get_eql_schema(version=None, index_patterns=None): for k, v in flatten(get_index_schema(index_name)).items(): add_field(converted, k, convert_type(v)) + # add custom schema + if index_patterns and CUSTOM_RULES_DIR: + for index_name in index_patterns: + for k, v in flatten(get_custom_index_schema(index_name)).items(): + add_field(converted, k, convert_type(v)) + # add endpoint custom schema for k, v in flatten(get_endpoint_schemas()).items(): add_field(converted, k, convert_type(v)) @@ -147,9 +156,24 @@ def get_non_ecs_schema(): return load_etc_dump('non-ecs-schema.json') +@cached +def get_custom_index_schema(index_name: str, stack_version: str = None): + """Load custom schema.""" + custom_schemas = get_custom_schemas(stack_version) + index_schema = custom_schemas.get(index_name, {}) + ccs_schema = custom_schemas.get(index_name.split(":", 1)[-1], {}) + index_schema.update(ccs_schema) + return index_schema + + @cached def get_index_schema(index_name): - return get_non_ecs_schema().get(index_name, {}) + """Load non-ecs schema.""" + non_ecs_schema = get_non_ecs_schema() + index_schema = non_ecs_schema.get(index_name, {}) + ccs_schema = non_ecs_schema.get(index_name.split(":", 1)[-1], {}) + index_schema.update(ccs_schema) + return index_schema def flatten_multi_fields(schema): @@ -201,9 +225,15 @@ def get_kql_schema(version=None, indexes=None, beat_schema=None) -> dict: indexes = indexes or () converted = flatten_multi_fields(get_schema(version, name='ecs_flat')) + # non-ecs schema for index_name in indexes: converted.update(**flatten(get_index_schema(index_name))) + # custom schema + if CUSTOM_RULES_DIR: + for index_name in indexes: + converted.update(**flatten(get_custom_index_schema(index_name))) + # add endpoint custom schema converted.update(**flatten(get_endpoint_schemas())) diff --git a/detection_rules/eswrap.py b/detection_rules/eswrap.py index b914ddcde..22f0c01c4 100644 --- a/detection_rules/eswrap.py +++ b/detection_rules/eswrap.py @@ -17,6 +17,7 @@ from elasticsearch import Elasticsearch from elasticsearch.client import AsyncSearchClient import kql +from .config import parse_rules_config from .main import root from .misc import add_params, client_error, elasticsearch_options, get_elasticsearch_client, nested_get from .rule import TOMLRule @@ -26,6 +27,7 @@ from .utils import format_command_options, normalize_timing_and_sort, unix_time_ COLLECTION_DIR = get_path('collections') MATCH_ALL = {'bool': {'filter': [{'match_all': {}}]}} +RULES_CONFIG = parse_rules_config() def add_range_to_dsl(dsl_filter, start_time, end_time='now'): @@ -92,7 +94,7 @@ class RtaEvents: rule = 
RuleCollection.default().id_map.get(rule_id)
         assert rule is not None, f"Unable to find rule with ID {rule_id}"
         merged_events = combine_sources(*self.events.values())
-        filtered = evaluate(rule, merged_events)
+        filtered = evaluate(rule, merged_events, normalize_kql_keywords=RULES_CONFIG.normalize_kql_keywords)

         if filtered:
             sources = [e['agent']['type'] for e in filtered]
diff --git a/detection_rules/etc/_config.yaml b/detection_rules/etc/_config.yaml
new file mode 100644
index 000000000..5ad6dd6f2
--- /dev/null
+++ b/detection_rules/etc/_config.yaml
@@ -0,0 +1,74 @@
+# detection-rules config file
+bbr_rules_dirs:
+  - ../../rules_building_block
+rule_dirs:
+  - ../../rules
+files:
+  deprecated_rules: deprecated_rules.json
+  packages: packages.yaml
+  stack_schema_map: stack-schema-map.yaml
+  version_lock: version.lock.json
+normalize_kql_keywords: False
+# Set the versioning strategy.
+# 1. Set to False to use the version.lock.json file
+# 2. Set to True to bypass the lock file; versions are then either:
+#    - set explicitly within rule.version in the TOML file
+#    - deferred to Kibana versions (never manually set)
+# bypass_version_lock: false
+
+# directories:
+  # action_dir: actions
+  # exception_dir: exceptions
+  # action_connector_dir: action_connectors
+
+# To set up a custom rules directory, copy this file to the root of the custom rules directory, which is set
+# using the environment variable CUSTOM_RULES_DIR
+# example structure:
+# custom-rules
+# ├── _config.yaml
+# ├── rules
+# │   ├── example_rule_1.toml
+# │   └── example_rule_2.toml
+# ├── etc
+# │   ├── deprecated_rules.json
+# │   ├── packages.yaml
+# │   ├── stack-schema-map.yaml
+# │   └── version.lock.json
+# ├── actions
+# │   ├── action_1.toml
+# │   └── action_2.toml
+# └── exceptions
+#     ├── exception_1.toml
+#     └── exception_2.toml
+#
+# update the `files:` section of custom-rules/_config.yaml with:
+# deprecated_rules: etc/deprecated_rules.json
+# packages: etc/packages.yaml
+# stack_schema_map: etc/stack-schema-map.yaml
+# version_lock: etc/version.lock.json
+#
+# the paths in this file are relative to the custom rules directory (CUSTOM_RULES_DIR/)
+#
+# Refer to each original source file for purpose and proper formatting
+
+# testing:
+#   config: etc/example_test_config.yaml

+# To turn on automatic schema generation for non-ecs fields via custom schemas, use a line like the following.
+# This will generate a schema file in the specified location that will be used to add entries for each field
+# and index combination that is not already in a known schema. This will also automatically add it to your
+# stack-schema-map.yaml file when using a custom rules directory and config.
+# auto_gen_schema_file: "etc/auto-gen-schema.json"
+
+# To bulk-disable elastic validation for optional fields, use the following line:
+# bypass_optional_elastic_validation: True
+
+# This points to the testing config file (see the example under detection_rules/etc/example_test_config.yaml).
+# It can either be set here or as the environment variable `DETECTION_RULES_TEST_CONFIG`, with precedence
+# going to the environment variable if both are set. Having both options allows for configuring testing on
+# prebuilt Elastic rules without specifying a rules _config.yaml.
+#
+# If set in this file, the path should be relative to the location of this config.
If passed as an environment variable, +# it should be the full path +# Note: Using the `custom-rules setup-config ` command will generate a config called `test_config.yaml` diff --git a/detection_rules/etc/example_test_config.yaml b/detection_rules/etc/example_test_config.yaml new file mode 100644 index 000000000..bdf576fb6 --- /dev/null +++ b/detection_rules/etc/example_test_config.yaml @@ -0,0 +1,41 @@ + +# set the environment variable DETECTION_RULES_TEST_CONFIG + +# `bypass` and `test_only` are mutually exclusive and will cause an error if both are specified. +# +# tests can be defined by their full name or using glob-style patterns with the following notation +# pattern:*rule* +# the patterns are case sensitive + +unit_tests: + # define tests to explicitly bypass, with all others being run + # + # to run all tests, set bypass to empty or leave this file commented out + bypass: +# - tests.test_all_rules.TestValidRules.test_schema_and_dupes +# - tests.test_packages.TestRegistryPackage.test_registry_package_config +# - tests.test_all_rules.TestRuleMetadata.test_event_dataset +# - tests.test_all_rules.TestRuleMetadata.test_integration_tag +# - tests.test_gh_workflows.TestWorkflows.test_matrix_to_lock_version_defaults +# - pattern:*rule* +# - pattern:*kuery* + + # define tests to explicitly run, with all others being bypassed + # + # to bypass all tests, set test_only to empty + test_only: +# - tests.test_all_rules.TestRuleMetadata.test_event_dataset +# - pattern:*rule* + + +# `bypass` and `test_only` are mutually exclusive and will cause an error if both are specified. +# +# both variables require a list of rule_ids +rule_validation: + + bypass: +# - "34fde489-94b0-4500-a76f-b8a157cf9269" + + + test_only: +# - "34fde489-94b0-4500-a76f-b8a157cf9269" diff --git a/detection_rules/etc/test_cli.bash b/detection_rules/etc/test_cli.bash index 1a865f811..1a59fb6bf 100755 --- a/detection_rules/etc/test_cli.bash +++ b/detection_rules/etc/test_cli.bash @@ -12,10 +12,15 @@ echo "Refreshing redirect mappings in ATT&CK" python -m detection_rules dev attack refresh-redirect-mappings echo "Viewing rule: threat_intel_indicator_match_address.toml" -python -m detection_rules view-rule rules/cross-platform/threat_intel_indicator_match_address.toml +python -m detection_rules view-rule rules/threat_intel/threat_intel_indicator_match_address.toml echo "Exporting rule by ID: 0a97b20f-4144-49ea-be32-b540ecc445de" -python -m detection_rules export-rules-from-repo --rule-id 0a97b20f-4144-49ea-be32-b540ecc445de +mkdir tmp-export 2>/dev/null +python -m detection_rules export-rules-from-repo --rule-id 0a97b20f-4144-49ea-be32-b540ecc445de -o tmp-export/test_rule.ndjson + +echo "Importing rule by ID: 0a97b20f-4144-49ea-be32-b540ecc445de" +python -m detection_rules import-rules-to-repo tmp-export/test_rule.ndjson --required-only +rm -rf tmp-export echo "Updating rule data schemas" python -m detection_rules dev schemas update-rule-data diff --git a/detection_rules/etc/test_remote_cli.bash b/detection_rules/etc/test_remote_cli.bash index 235873a71..63131301a 100755 --- a/detection_rules/etc/test_remote_cli.bash +++ b/detection_rules/etc/test_remote_cli.bash @@ -14,7 +14,7 @@ python -m detection_rules kibana search-alerts echo "Performing a rule export..." mkdir tmp-export 2>/dev/null -python -m detection_rules kibana export-rules -d tmp-export --skip-errors +python -m detection_rules kibana export-rules -d tmp-export -sv --skip-errors ls tmp-export echo "Removing generated files..." 
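The relative-path resolution described in the new `_config.yaml` above can be summarized in a short sketch. This is illustrative only: the repo's real loader is `detection_rules.config.parse_rules_config` (not shown in this diff), and the sketch assumes `CUSTOM_RULES_DIR` points at a directory laid out as documented in that file:

```python
import os
from pathlib import Path

import yaml  # PyYAML, already used elsewhere in this repo

# every path in a custom _config.yaml is relative to CUSTOM_RULES_DIR
custom_dir = Path(os.environ["CUSTOM_RULES_DIR"])
config = yaml.safe_load((custom_dir / "_config.yaml").read_text())

rule_dirs = [custom_dir / d for d in config["rule_dirs"]]
version_lock = custom_dir / config["files"]["version_lock"]
bypass_lock = config.get("bypass_version_lock", False)

print(rule_dirs, version_lock, bypass_lock)
```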
rm -rf tmp-export diff --git a/detection_rules/exception.py b/detection_rules/exception.py new file mode 100644 index 000000000..1b89d6ab8 --- /dev/null +++ b/detection_rules/exception.py @@ -0,0 +1,286 @@ +# Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one +# or more contributor license agreements. Licensed under the Elastic License +# 2.0; you may not use this file except in compliance with the Elastic License +# 2.0. +"""Rule exceptions data.""" +from collections import defaultdict +from dataclasses import dataclass +from datetime import datetime +from pathlib import Path +from typing import List, Optional, Union, Tuple, get_args + +import pytoml +from marshmallow import EXCLUDE, ValidationError, validates_schema + +from .mixins import MarshmallowDataclassMixin +from .schemas import definitions +from .config import parse_rules_config + +RULES_CONFIG = parse_rules_config() + +# https://www.elastic.co/guide/en/security/current/exceptions-api-overview.html + + +@dataclass(frozen=True) +class ExceptionMeta(MarshmallowDataclassMixin): + """Data stored in an exception's [metadata] section of TOML.""" + creation_date: definitions.Date + list_name: str + rule_ids: List[definitions.UUIDString] + rule_names: List[str] + updated_date: definitions.Date + + # Optional fields + deprecation_date: Optional[definitions.Date] + comments: Optional[str] + maturity: Optional[definitions.Maturity] + + +@dataclass(frozen=True) +class BaseExceptionItemEntry(MarshmallowDataclassMixin): + """Shared object between nested and non-nested exception items.""" + field: str + type: definitions.ExceptionEntryType + + +@dataclass(frozen=True) +class NestedExceptionItemEntry(BaseExceptionItemEntry, MarshmallowDataclassMixin): + """Nested exception item entry.""" + entries: List['ExceptionItemEntry'] + + @validates_schema + def validate_nested_entry(self, data: dict, **kwargs): + """More specific validation.""" + if data.get('list') is not None: + raise ValidationError('Nested entries cannot define a list') + + +@dataclass(frozen=True) +class ExceptionItemEntry(BaseExceptionItemEntry, MarshmallowDataclassMixin): + """Exception item entry.""" + @dataclass(frozen=True) + class ListObject: + """List object for exception item entry.""" + id: str + type: definitions.EsDataTypes + + list: Optional[ListObject] + operator: definitions.ExceptionEntryOperator + value: Optional[Union[str, List[str]]] + + @validates_schema + def validate_entry(self, data: dict, **kwargs): + """Validate the entry based on its type.""" + value = data.get('value', '') + if data['type'] in ('exists', 'list') and value is not None: + raise ValidationError(f'Entry of type {data["type"]} cannot have a value') + elif data['type'] in ('match', 'wildcard') and not isinstance(value, str): + raise ValidationError(f'Entry of type {data["type"]} must have a string value') + elif data['type'] == 'match_any' and not isinstance(value, list): + raise ValidationError(f'Entry of type {data["type"]} must have a list of strings as a value') + + +@dataclass(frozen=True) +class ExceptionItem(MarshmallowDataclassMixin): + """Base exception item.""" + @dataclass(frozen=True) + class Comment: + """Comment object for exception item.""" + comment: str + + comments: List[Optional[Comment]] + description: str + entries: List[Union[ExceptionItemEntry, NestedExceptionItemEntry]] + list_id: str + item_id: Optional[str] # api sets field when not provided + meta: Optional[dict] + name: str + namespace_type: Optional[definitions.ExceptionNamespaceType] # 
defaults to "single" if not provided + tags: Optional[List[str]] + type: definitions.ExceptionItemType + + +@dataclass(frozen=True) +class EndpointException(ExceptionItem, MarshmallowDataclassMixin): + """Endpoint exception item.""" + _tags: List[definitions.ExceptionItemEndpointTags] + + @validates_schema + def validate_endpoint(self, data: dict, **kwargs): + """Validate the endpoint exception.""" + for entry in data['entries']: + if entry['operator'] == "excluded": + raise ValidationError("Endpoint exceptions cannot have an `excluded` operator") + + +@dataclass(frozen=True) +class DetectionException(ExceptionItem, MarshmallowDataclassMixin): + """Detection exception item.""" + expire_time: Optional[str] # fields.DateTime] # maybe this is isoformat? + + +@dataclass(frozen=True) +class ExceptionContainer(MarshmallowDataclassMixin): + """Exception container.""" + description: str + list_id: Optional[str] + meta: Optional[dict] + name: str + namespace_type: Optional[definitions.ExceptionNamespaceType] + tags: Optional[List[str]] + type: definitions.ExceptionContainerType + + def to_rule_entry(self) -> dict: + """Returns a dict of the format required in rule.exception_list.""" + # requires KSO id to be consider valid structure + return dict(namespace_type=self.namespace_type, type=self.type, list_id=self.list_id) + + +@dataclass(frozen=True) +class Data(MarshmallowDataclassMixin): + """Data stored in an exception's [exception] section of TOML.""" + container: ExceptionContainer + items: Optional[List[DetectionException]] # Union[DetectionException, EndpointException]] + + +@dataclass(frozen=True) +class TOMLExceptionContents(MarshmallowDataclassMixin): + """Data stored in an exception file.""" + + metadata: ExceptionMeta + exceptions: List[Data] + + @classmethod + def from_exceptions_dict(cls, exceptions_dict: dict, rule_list: list[dict]) -> "TOMLExceptionContents": + """Create a TOMLExceptionContents from a kibana rule resource.""" + rule_ids = [] + rule_names = [] + + for rule in rule_list: + rule_ids.append(rule["id"]) + rule_names.append(rule["name"]) + + # Format date to match schema + creation_date = datetime.strptime(exceptions_dict["container"]["created_at"], "%Y-%m-%dT%H:%M:%S.%fZ").strftime( + "%Y/%m/%d" + ) + updated_date = datetime.strptime(exceptions_dict["container"]["updated_at"], "%Y-%m-%dT%H:%M:%S.%fZ").strftime( + "%Y/%m/%d" + ) + metadata = { + "creation_date": creation_date, + "list_name": exceptions_dict["container"]["name"], + "rule_ids": rule_ids, + "rule_names": rule_names, + "updated_date": updated_date, + } + + return cls.from_dict({"metadata": metadata, "exceptions": [exceptions_dict]}, unknown=EXCLUDE) + + def to_api_format(self) -> List[dict]: + """Convert the TOML Exception to the API format.""" + converted = [] + + for exception in self.exceptions: + converted.append(exception.container.to_dict()) + if exception.items: + for item in exception.items: + converted.append(item.to_dict()) + + return converted + + +@dataclass(frozen=True) +class TOMLException: + """TOML exception object.""" + contents: TOMLExceptionContents + path: Optional[Path] = None + + @property + def name(self): + """Return the name of the exception list.""" + return self.contents.metadata.list_name + + def save_toml(self): + """Save the exception to a TOML file.""" + assert self.path is not None, f"Can't save exception {self.name} without a path" + # Check if self.path has a .toml extension + path = self.path + if path.suffix != ".toml": + # If it doesn't, add one + path = 
path.with_suffix(".toml") + with path.open("w") as f: + contents_dict = self.contents.to_dict() + # Sort the dictionary so that 'metadata' is at the top + sorted_dict = dict(sorted(contents_dict.items(), key=lambda item: item[0] != "metadata")) + pytoml.dump(sorted_dict, f) + + +def parse_exceptions_results_from_api(results: List[dict]) -> tuple[dict, dict, List[str], List[dict]]: + """Parse exceptions results from the API into containers and items.""" + exceptions_containers = {} + exceptions_items = defaultdict(list) + errors = [] + unparsed_results = [] + + for result in results: + result_type = result.get("type") + list_id = result.get("list_id") + + if result_type in get_args(definitions.ExceptionContainerType): + exceptions_containers[list_id] = result + elif result_type in get_args(definitions.ExceptionItemType): + exceptions_items[list_id].append(result) + else: + unparsed_results.append(result) + + return exceptions_containers, exceptions_items, errors, unparsed_results + + +def build_exception_objects(exceptions_containers: List[dict], exceptions_items: List[dict], + exception_list_rule_table: dict, exceptions_directory: Path, save_toml: bool = False, + skip_errors: bool = False, verbose=False, + ) -> Tuple[List[TOMLException], List[str], List[str]]: + """Build TOMLException objects from a list of exception dictionaries.""" + output = [] + errors = [] + toml_exceptions = [] + for container in exceptions_containers.values(): + try: + list_id = container.get("list_id") + items = exceptions_items.get(list_id) + contents = TOMLExceptionContents.from_exceptions_dict( + {"container": container, "items": items}, + exception_list_rule_table.get(list_id), + ) + filename = f"{list_id}_exceptions.toml" + if RULES_CONFIG.exception_dir is None and not exceptions_directory: + raise FileNotFoundError( + "No Exceptions directory is specified. Please specify either in the config or CLI." + ) + exceptions_path = ( + Path(exceptions_directory) / filename + if exceptions_directory + else RULES_CONFIG.exception_dir / filename + ) + if verbose: + output.append(f"[+] Building exception(s) for {exceptions_path}") + e_object = TOMLException( + contents=contents, + path=exceptions_path, + ) + if save_toml: + e_object.save_toml() + toml_exceptions.append(e_object) + + except Exception as e: + if skip_errors: + output.append(f"- skipping exceptions export - {type(e).__name__}") + if not exceptions_directory: + errors.append(f"- no exceptions directory found - {e}") + else: + errors.append(f"- exceptions export - {e}") + continue + raise + + return toml_exceptions, output, errors diff --git a/detection_rules/generic_loader.py b/detection_rules/generic_loader.py new file mode 100644 index 000000000..41f91bee8 --- /dev/null +++ b/detection_rules/generic_loader.py @@ -0,0 +1,195 @@ +# Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one +# or more contributor license agreements. Licensed under the Elastic License +# 2.0; you may not use this file except in compliance with the Elastic License +# 2.0. 
+ +"""Load generic toml formatted files for exceptions and actions.""" +from pathlib import Path +from typing import Callable, Dict, Iterable, List, Optional, Union + +import pytoml + +from .action import TOMLAction, TOMLActionContents +from .action_connector import TOMLActionConnector, TOMLActionConnectorContents +from .config import parse_rules_config +from .exception import TOMLException, TOMLExceptionContents +from .rule_loader import dict_filter +from .schemas import definitions + + +RULES_CONFIG = parse_rules_config() + +GenericCollectionTypes = Union[TOMLAction, TOMLActionConnector, TOMLException] +GenericCollectionContentTypes = Union[TOMLActionContents, TOMLActionConnectorContents, TOMLExceptionContents] + + +def metadata_filter(**metadata) -> Callable[[GenericCollectionTypes], bool]: + """Get a filter callback based off item metadata""" + flt = dict_filter(metadata) + + def callback(item: GenericCollectionTypes) -> bool: + target_dict = item.contents.metadata.to_dict() + return flt(target_dict) + + return callback + + +class GenericCollection: + """Generic collection for action and exception objects.""" + + items: list + __default = None + + def __init__(self, items: Optional[List[GenericCollectionTypes]] = None): + self.id_map: Dict[definitions.UUIDString, GenericCollectionTypes] = {} + self.file_map: Dict[Path, GenericCollectionTypes] = {} + self.name_map: Dict[definitions.RuleName, GenericCollectionTypes] = {} + self.items: List[GenericCollectionTypes] = [] + self.errors: Dict[Path, Exception] = {} + self.frozen = False + + self._toml_load_cache: Dict[Path, dict] = {} + + for items in (items or []): + self.add_item(items) + + def __len__(self) -> int: + """Get the total amount of exceptions in the collection.""" + return len(self.items) + + def __iter__(self) -> Iterable[GenericCollectionTypes]: + """Iterate over all items in the collection.""" + return iter(self.items) + + def __contains__(self, item: GenericCollectionTypes) -> bool: + """Check if an item is in the map by comparing IDs.""" + return item.id in self.id_map + + def filter(self, cb: Callable[[TOMLException], bool]) -> 'GenericCollection': + """Retrieve a filtered collection of items.""" + filtered_collection = GenericCollection() + + for item in filter(cb, self.items): + filtered_collection.add_item(item) + + return filtered_collection + + @staticmethod + def deserialize_toml_string(contents: Union[bytes, str]) -> dict: + """Deserialize a TOML string into a dictionary.""" + return pytoml.loads(contents) + + def _load_toml_file(self, path: Path) -> dict: + """Load a TOML file into a dictionary.""" + if path in self._toml_load_cache: + return self._toml_load_cache[path] + + # use pytoml instead of toml because of annoying bugs + # https://github.com/uiri/toml/issues/152 + # might also be worth looking at https://github.com/sdispater/tomlkit + with path.open("r", encoding="utf-8") as f: + toml_dict = self.deserialize_toml_string(f.read()) + self._toml_load_cache[path] = toml_dict + return toml_dict + + def _get_paths(self, directory: Path, recursive=True) -> List[Path]: + """Get all TOML files in a directory.""" + return sorted(directory.rglob('*.toml') if recursive else directory.glob('*.toml')) + + def _assert_new(self, item: GenericCollectionTypes) -> None: + """Assert that the item is new and can be added to the collection.""" + file_map = self.file_map + name_map = self.name_map + + assert not self.frozen, f"Unable to add item {item.name} to a frozen collection" + assert item.name not in name_map, \ + f"Rule 
Name {item.name} collides with {name_map[item.name].name}" + + if item.path is not None: + item_path = item.path.resolve() + assert item_path not in file_map, f"Item file {item_path} already loaded" + file_map[item_path] = item + + def add_item(self, item: GenericCollectionTypes) -> None: + """Add a new item to the collection.""" + self._assert_new(item) + self.name_map[item.name] = item + self.items.append(item) + + def load_dict(self, obj: dict, path: Optional[Path] = None) -> GenericCollectionTypes: + """Load a dictionary into the collection.""" + if 'exceptions' in obj: + contents = TOMLExceptionContents.from_dict(obj) + item = TOMLException(path=path, contents=contents) + elif 'actions' in obj: + contents = TOMLActionContents.from_dict(obj) + item = TOMLAction(path=path, contents=contents) + elif 'action_connectors' in obj: + contents = TOMLActionConnectorContents.from_dict(obj) + item = TOMLActionConnector(path=path, contents=contents) + else: + raise ValueError("Invalid object type") + + self.add_item(item) + return item + + def load_file(self, path: Path) -> GenericCollectionTypes: + """Load a single file into the collection.""" + try: + path = path.resolve() + + # use the default generic loader as a cache. + # if it already loaded the item, then we can just use it from that + if self.__default is not None and self is not self.__default: + if path in self.__default.file_map: + item = self.__default.file_map[path] + self.add_item(item) + return item + + obj = self._load_toml_file(path) + return self.load_dict(obj, path=path) + except Exception: + print(f"Error loading item in {path}") + raise + + def load_files(self, paths: Iterable[Path]) -> None: + """Load multiple files into the collection.""" + for path in paths: + self.load_file(path) + + def load_directory( + self, directory: Path, recursive=True, toml_filter: Optional[Callable[[dict], bool]] = None + ) -> None: + """Load all TOML files in a directory.""" + paths = self._get_paths(directory, recursive=recursive) + if toml_filter is not None: + paths = [path for path in paths if toml_filter(self._load_toml_file(path))] + + self.load_files(paths) + + def load_directories( + self, directories: Iterable[Path], recursive=True, toml_filter: Optional[Callable[[dict], bool]] = None + ) -> None: + """Load all TOML files in multiple directories.""" + for path in directories: + self.load_directory(path, recursive=recursive, toml_filter=toml_filter) + + def freeze(self) -> None: + """Freeze the generic collection and make it immutable going forward.""" + self.frozen = True + + @classmethod + def default(cls) -> 'GenericCollection': + """Return the default item collection, which retrieves from default config location.""" + if cls.__default is None: + collection = GenericCollection() + if RULES_CONFIG.exception_dir: + collection.load_directory(RULES_CONFIG.exception_dir) + if RULES_CONFIG.action_dir: + collection.load_directory(RULES_CONFIG.action_dir) + if RULES_CONFIG.action_connector_dir: + collection.load_directory(RULES_CONFIG.action_connector_dir) + collection.freeze() + cls.__default = collection + + return cls.__default diff --git a/detection_rules/integrations.py b/detection_rules/integrations.py index a54e3bb0a..726cce486 100644 --- a/detection_rules/integrations.py +++ b/detection_rules/integrations.py @@ -20,8 +20,8 @@ from marshmallow import EXCLUDE, Schema, fields, post_load import kql from . 
import ecs +from .config import load_current_package_version from .beats import flatten_ecs_schema -from .misc import load_current_package_version from .utils import cached, get_etc_path, read_gzip, unzip from .schemas import definitions @@ -384,6 +384,10 @@ def parse_datasets(datasets: list, package_manifest: dict) -> List[Optional[dict integration = 'Unknown' if '.' in value: package, integration = value.split('.', 1) + # Handle cases where endpoint event datasource needs to be parsed uniquely (e.g endpoint.events.network) + # as endpoint.network + if package == "endpoint" and "events" in integration: + integration = integration.split('.')[1] else: package = value diff --git a/detection_rules/kbwrap.py b/detection_rules/kbwrap.py index 48f3b775c..27a106b3a 100644 --- a/detection_rules/kbwrap.py +++ b/detection_rules/kbwrap.py @@ -4,6 +4,7 @@ # 2.0. """Kibana cli commands.""" +import re import sys from pathlib import Path from typing import Iterable, List, Optional @@ -13,13 +14,21 @@ import click import kql from kibana import Signal, RuleResource +from .config import parse_rules_config from .cli_utils import multi_collection +from .action_connector import (TOMLActionConnectorContents, + parse_action_connector_results_from_api, build_action_connector_objects) +from .exception import (TOMLExceptionContents, + build_exception_objects, parse_exceptions_results_from_api) +from .generic_loader import GenericCollection from .main import root from .misc import add_params, client_error, kibana_options, get_kibana_client, nested_set from .rule import downgrade_contents_from_rule, TOMLRuleContents, TOMLRule from .rule_loader import RuleCollection from .utils import format_command_options, rulename_to_filename +RULES_CONFIG = parse_rules_config() + @root.group('kibana') @add_params(*kibana_options) @@ -78,7 +87,7 @@ def upload_rule(ctx, rules: RuleCollection, replace_id): @multi_collection @click.option('--overwrite', '-o', is_flag=True, help='Overwrite existing rules') @click.option('--overwrite-exceptions', '-e', is_flag=True, help='Overwrite exceptions in existing rules') -@click.option('--overwrite-action-connectors', '-a', is_flag=True, +@click.option('--overwrite-action-connectors', '-ac', is_flag=True, help='Overwrite action connectors in existing rules') @click.pass_context def kibana_import_rules(ctx: click.Context, rules: RuleCollection, overwrite: Optional[bool] = False, @@ -88,78 +97,253 @@ def kibana_import_rules(ctx: click.Context, rules: RuleCollection, overwrite: Op kibana = ctx.obj['kibana'] rule_dicts = [r.contents.to_api_format() for r in rules] with kibana: + cl = GenericCollection.default() + exception_dicts = [ + d.contents.to_api_format() for d in cl.items if isinstance(d.contents, TOMLExceptionContents) + ] + action_connectors_dicts = [ + d.contents.to_api_format() for d in cl.items if isinstance(d.contents, TOMLActionConnectorContents) + ] response, successful_rule_ids, results = RuleResource.import_rules( rule_dicts, + exception_dicts, + action_connectors_dicts, overwrite=overwrite, overwrite_exceptions=overwrite_exceptions, overwrite_action_connectors=overwrite_action_connectors ) + def handle_response_errors(response: dict): + """Handle errors from the import response.""" + def parse_list_id(s: str): + """Parse the list ID from the error message.""" + match = re.search(r'list_id: "(.*?)"', s) + return match.group(1) if match else None + + # Re-try to address known Kibana issue: https://github.com/elastic/kibana/issues/143864 + workaround_errors = [] + + 
flattened_exceptions = [e for sublist in exception_dicts for e in sublist] + all_exception_list_ids = {exception["list_id"] for exception in flattened_exceptions} + + click.echo(f'{len(response["errors"])} rule(s) failed to import!') + + for error in response['errors']: + click.echo(f' - {error["rule_id"]}: ({error["error"]["status_code"]}) {error["error"]["message"]}') + + if "references a non existent exception list" in error["error"]["message"]: + list_id = parse_list_id(error["error"]["message"]) + if list_id in all_exception_list_ids: + workaround_errors.append(error["rule_id"]) + + if workaround_errors: + workaround_errors = list(set(workaround_errors)) + click.echo(f'Missing exception list errors detected for {len(workaround_errors)} rules. ' + 'Try re-importing using the following command and rule IDs:\n') + click.echo('python -m detection_rules kibana import-rules -o ', nl=False) + click.echo(' '.join(f'-id {rule_id}' for rule_id in workaround_errors)) + click.echo() + if successful_rule_ids: click.echo(f'{len(successful_rule_ids)} rule(s) successfully imported') rule_str = '\n - '.join(successful_rule_ids) - print(f' - {rule_str}') + click.echo(f' - {rule_str}') if response['errors']: - click.echo(f'{len(response["errors"])} rule(s) failed to import!') - for error in response['errors']: - click.echo(f' - {error["rule_id"]}: ({error["error"]["status_code"]}) {error["error"]["message"]}') + handle_response_errors(response) return response, results -@kibana_group.command('export-rules') -@click.option('--directory', '-d', required=True, type=Path, help='Directory to export rules to') -@click.option('--rule-id', '-r', multiple=True, help='Optional Rule IDs to restrict export to') -@click.option('--skip-errors', '-s', is_flag=True, help='Skip errors when exporting rules') +@kibana_group.command("export-rules") +@click.option("--directory", "-d", required=True, type=Path, help="Directory to export rules to") +@click.option( + "--action-connectors-directory", "-acd", required=False, type=Path, help="Directory to export action connectors to" +) +@click.option("--exceptions-directory", "-ed", required=False, type=Path, help="Directory to export exceptions to") +@click.option("--default-author", "-da", type=str, required=False, help="Default author for rules missing one") +@click.option("--rule-id", "-r", multiple=True, help="Optional Rule IDs to restrict export to") +@click.option("--export-action-connectors", "-ac", is_flag=True, help="Include action connectors in export") +@click.option("--export-exceptions", "-e", is_flag=True, help="Include exceptions in export") +@click.option("--skip-errors", "-s", is_flag=True, help="Skip errors when exporting rules") +@click.option("--strip-version", "-sv", is_flag=True, help="Strip the version fields from all rules") @click.pass_context -def kibana_export_rules(ctx: click.Context, directory: Path, - rule_id: Optional[Iterable[str]] = None, skip_errors: bool = False) -> List[TOMLRule]: +def kibana_export_rules(ctx: click.Context, directory: Path, action_connectors_directory: Optional[Path], + exceptions_directory: Optional[Path], default_author: str, + rule_id: Optional[Iterable[str]] = None, export_action_connectors: bool = False, + export_exceptions: bool = False, skip_errors: bool = False, strip_version: bool = False + ) -> List[TOMLRule]: """Export custom rules from Kibana.""" - kibana = ctx.obj['kibana'] + kibana = ctx.obj["kibana"] with kibana: - results = RuleResource.export_rules(list(rule_id)) + results = 
RuleResource.export_rules(list(rule_id), exclude_export_details=not export_exceptions) + + # Handle Exceptions Directory Location + if results and exceptions_directory: + exceptions_directory.mkdir(parents=True, exist_ok=True) + exceptions_directory = exceptions_directory or RULES_CONFIG.exception_dir + if not exceptions_directory and export_exceptions: + click.echo("Warning: Exceptions export requested, but no exceptions directory found") + + # Handle Actions Connector Directory Location + if results and action_connectors_directory: + action_connectors_directory.mkdir(parents=True, exist_ok=True) + action_connectors_directory = action_connectors_directory or RULES_CONFIG.action_connector_dir + if not action_connectors_directory and export_action_connectors: + click.echo("Warning: Action Connector export requested, but no Action Connector directory found") if results: directory.mkdir(parents=True, exist_ok=True) + else: + click.echo("No rules found to export") + return [] + + rules_results = results + if export_exceptions: + # Assign counts to variables + rules_count = results[-1]["exported_rules_count"] + exception_list_count = results[-1]["exported_exception_list_count"] + exception_list_item_count = results[-1]["exported_exception_list_item_count"] + action_connector_count = results[-1]["exported_action_connector_count"] + + # Parse rules results and exception results from API return + rules_results = results[:rules_count] + exception_results = results[rules_count:rules_count + exception_list_count + exception_list_item_count] + rules_and_exceptions_count = rules_count + exception_list_count + exception_list_item_count + action_connector_results = results[ + rules_and_exceptions_count: rules_and_exceptions_count + action_connector_count + ] errors = [] exported = [] - for rule_resource in results: + exception_list_rule_table = {} + action_connector_rule_table = {} + for rule_resource in rules_results: try: - contents = TOMLRuleContents.from_rule_resource(rule_resource, maturity='production') - threat = contents.data.get('threat') - first_tactic = threat[0].tactic.name if threat else '' + if strip_version: + rule_resource.pop("revision", None) + rule_resource.pop("version", None) + rule_resource["author"] = rule_resource.get("author") or default_author or [rule_resource.get("created_by")] + if isinstance(rule_resource["author"], str): + rule_resource["author"] = [rule_resource["author"]] + contents = TOMLRuleContents.from_rule_resource(rule_resource, maturity="production") + threat = contents.data.get("threat") + first_tactic = threat[0].tactic.name if threat else "" rule_name = rulename_to_filename(contents.data.name, tactic_name=first_tactic) - rule = TOMLRule(contents=contents, path=directory / f'{rule_name}') + rule = TOMLRule(contents=contents, path=directory / f"{rule_name}") except Exception as e: if skip_errors: print(f'- skipping {rule_resource.get("name")} - {type(e).__name__}') errors.append(f'- {rule_resource.get("name")} - {e}') continue raise + if rule.contents.data.exceptions_list: + # For each item in rule.contents.data.exceptions_list to the exception_list_rule_table under the list_id + for exception in rule.contents.data.exceptions_list: + exception_id = exception["list_id"] + if exception_id not in exception_list_rule_table: + exception_list_rule_table[exception_id] = [] + exception_list_rule_table[exception_id].append({"id": rule.id, "name": rule.name}) + if rule.contents.data.actions: + # use connector ids as rule source + for action in rule.contents.data.actions: + 
action_id = action["id"] + if action_id not in action_connector_rule_table: + action_connector_rule_table[action_id] = [] + action_connector_rule_table[action_id].append({"id": rule.id, "name": rule.name}) exported.append(rule) + # Parse exceptions results from API return + exceptions = [] + if export_exceptions: + exceptions_containers = {} + exceptions_items = {} + + exceptions_containers, exceptions_items, parse_errors, _ = parse_exceptions_results_from_api(exception_results) + errors.extend(parse_errors) + + # Build TOMLException Objects + exceptions, e_output, e_errors = build_exception_objects( + exceptions_containers, + exceptions_items, + exception_list_rule_table, + exceptions_directory, + save_toml=False, + skip_errors=skip_errors, + verbose=False, + ) + for line in e_output: + click.echo(line) + errors.extend(e_errors) + + # Parse action connector results from API return + action_connectors = [] + if export_action_connectors: + action_connector_results, _ = parse_action_connector_results_from_api(action_connector_results) + + # Build TOMLActionConnector Objects + action_connectors, ac_output, ac_errors = build_action_connector_objects( + action_connector_results, + action_connector_rule_table, + action_connectors_directory=None, + save_toml=False, + skip_errors=skip_errors, + verbose=False, + ) + for line in ac_output: + click.echo(line) + errors.extend(ac_errors) + saved = [] for rule in exported: try: rule.save_toml() except Exception as e: if skip_errors: - print(f'- skipping {rule.contents.data.name} - {type(e).__name__}') - errors.append(f'- {rule.contents.data.name} - {e}') + print(f"- skipping {rule.contents.data.name} - {type(e).__name__}") + errors.append(f"- {rule.contents.data.name} - {e}") continue raise saved.append(rule) - click.echo(f'{len(results)} rules exported') - click.echo(f'{len(exported)} rules converted') - click.echo(f'{len(saved)} saved to {directory}') + saved_exceptions = [] + for exception in exceptions: + try: + exception.save_toml() + except Exception as e: + if skip_errors: + print(f"- skipping {exception.rule_name} - {type(e).__name__}") + errors.append(f"- {exception.rule_name} - {e}") + continue + raise + + saved_exceptions.append(exception) + + saved_action_connectors = [] + for action in action_connectors: + try: + action.save_toml() + except Exception as e: + if skip_errors: + print(f"- skipping {action.name} - {type(e).__name__}") + errors.append(f"- {action.name} - {e}") + continue + raise + + saved_action_connectors.append(action) + + click.echo(f"{len(results)} results exported") + click.echo(f"{len(exported)} rules converted") + click.echo(f"{len(exceptions)} exceptions exported") + click.echo(f"{len(action_connectors)} action connectors exported") + click.echo(f"{len(saved)} rules saved to {directory}") + click.echo(f"{len(saved_exceptions)} exception lists saved to {exceptions_directory}") + click.echo(f"{len(saved_action_connectors)} action connectors saved to {action_connectors_directory}") if errors: - err_file = directory / '_errors.txt' - err_file.write_text('\n'.join(errors)) - click.echo(f'{len(errors)} errors saved to {err_file}') + err_file = directory / "_errors.txt" + err_file.write_text("\n".join(errors)) + click.echo(f"{len(errors)} errors saved to {err_file}") return exported diff --git a/detection_rules/main.py b/detection_rules/main.py index e103619da..151f8a86a 100644 --- a/detection_rules/main.py +++ b/detection_rules/main.py @@ -15,22 +15,30 @@ import pytoml from marshmallow_dataclass import class_schema from 
pathlib import Path from semver import Version -from typing import Dict, Iterable, List, Optional +from typing import Dict, Iterable, List, Optional, get_args from uuid import uuid4 - import click +from .action_connector import (TOMLActionConnectorContents, + build_action_connector_objects, parse_action_connector_results_from_api) from .attack import build_threat_map_entry from .cli_utils import rule_prompt, multi_collection +from .config import load_current_package_version, parse_rules_config +from .generic_loader import GenericCollection +from .exception import (TOMLExceptionContents, + build_exception_objects, parse_exceptions_results_from_api) from .mappings import build_coverage_map, get_triggered_rules, print_converage_summary -from .misc import add_client, client_error, nested_set, parse_config, load_current_package_version +from .misc import ( + add_client, client_error, nested_set, parse_user_config +) from .rule import TOMLRule, TOMLRuleContents, QueryRuleData from .rule_formatter import toml_write from .rule_loader import RuleCollection from .schemas import all_versions, definitions, get_incompatible_fields, get_schema_file from .utils import Ndjson, get_path, get_etc_path, clear_caches, load_dump, load_rule_contents, rulename_to_filename -RULES_DIR = get_path('rules') +RULES_CONFIG = parse_rules_config() +RULES_DIRS = RULES_CONFIG.rule_dirs @click.group('detection-rules', context_settings={'help_option_names': ['-h', '--help']}) @@ -39,8 +47,8 @@ RULES_DIR = get_path('rules') @click.pass_context def root(ctx, debug): """Commands for detection-rules repository.""" - debug = debug if debug is not None else parse_config().get('debug') - ctx.obj = {'debug': debug} + debug = debug if debug is not None else parse_user_config().get('debug') + ctx.obj = {'debug': debug, 'rules_config': RULES_CONFIG} if debug: click.secho('DEBUG MODE ENABLED', fg='yellow') @@ -91,32 +99,148 @@ def generate_rules_index(ctx: click.Context, query, overwrite, save_files=True): return bulk_upload_docs, importable_rules_docs -@root.command('import-rules-to-repo') -@click.argument('input-file', type=click.Path(dir_okay=False, exists=True), nargs=-1, required=False) -@click.option('--required-only', is_flag=True, help='Only prompt for required fields') -@click.option('--directory', '-d', type=click.Path(file_okay=False, exists=True), help='Load files from a directory') -def import_rules_into_repo(input_file, required_only, directory): +@root.command("import-rules-to-repo") +@click.argument("input-file", type=click.Path(dir_okay=False, exists=True), nargs=-1, required=False) +@click.option("--action-connector-import", "-ac", is_flag=True, help="Include action connectors in export") +@click.option("--exceptions-import", "-e", is_flag=True, help="Include exceptions in export") +@click.option("--required-only", is_flag=True, help="Only prompt for required fields") +@click.option("--directory", "-d", type=click.Path(file_okay=False, exists=True), help="Load files from a directory") +@click.option( + "--save-directory", "-s", type=click.Path(file_okay=False, exists=True), help="Save imported rules to a directory" +) +@click.option( + "--exceptions-directory", + "-se", + type=click.Path(file_okay=False, exists=True), + help="Save imported exceptions to a directory", +) +@click.option( + "--action-connectors-directory", + "-sa", + type=click.Path(file_okay=False, exists=True), + help="Save imported actions to a directory", +) +@click.option("--skip-errors", "-ske", is_flag=True, help="Skip rule import errors") 
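+# An illustrative invocation (the export file name is hypothetical), importing rules
+# plus their exceptions and action connectors into a custom save directory:
+#   python -m detection_rules import-rules-to-repo my-export.ndjson -e -ac \
+#       -s custom-rules/rules -ske -da elastic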
+@click.option("--default-author", "-da", type=str, required=False, help="Default author for rules missing one") +@click.option("--strip-none-values", "-snv", is_flag=True, help="Strip None values from the rule") +def import_rules_into_repo(input_file: click.Path, required_only: bool, action_connector_import: bool, + exceptions_import: bool, directory: click.Path, save_directory: click.Path, + action_connectors_directory: click.Path, exceptions_directory: click.Path, + skip_errors: bool, default_author: str, strip_none_values: bool): """Import rules from json, toml, yaml, or Kibana exported rule file(s).""" - rule_files = glob.glob(os.path.join(directory, '**', '*.*'), recursive=True) if directory else [] + errors = [] + rule_files = glob.glob(os.path.join(directory, "**", "*.*"), recursive=True) if directory else [] rule_files = sorted(set(rule_files + list(input_file))) - rule_contents = [] + file_contents = [] for rule_file in rule_files: - rule_contents.extend(load_rule_contents(Path(rule_file))) + file_contents.extend(load_rule_contents(Path(rule_file))) - if not rule_contents: - click.echo('Must specify at least one file!') + if not file_contents: + click.echo("Must specify at least one file!") - for contents in rule_contents: - base_path = contents.get('name') or contents.get('rule', {}).get('name') + exceptions_containers = {} + exceptions_items = {} + + exceptions_containers, exceptions_items, _, unparsed_results = parse_exceptions_results_from_api(file_contents) + + action_connectors, unparsed_results = parse_action_connector_results_from_api(unparsed_results) + + file_contents = unparsed_results + + exception_list_rule_table = {} + action_connector_rule_table = {} + for contents in file_contents: + # Don't load exceptions as rules + if contents["type"] not in get_args(definitions.RuleType): + click.echo(f"Skipping - {contents["type"]} is not a supported rule type") + continue + base_path = contents.get("name") or contents.get("rule", {}).get("name") base_path = rulename_to_filename(base_path) if base_path else base_path - rule_path = os.path.join(RULES_DIR, base_path) if base_path else None + if base_path is None: + raise ValueError(f"Invalid rule file, please ensure the rule has a name field: {contents}") + rule_path = os.path.join(save_directory if save_directory is not None else RULES_DIRS[0], base_path) # handle both rule json formats loaded from kibana and toml data_view_id = contents.get("data_view_id") or contents.get("rule", {}).get("data_view_id") additional = ["index"] if not data_view_id else ["data_view_id"] - rule_prompt(rule_path, required_only=required_only, save=True, verbose=True, - additional_required=additional, **contents) + + # Use additional to store all available fields for the rule + additional += [key for key in contents if key not in additional and contents.get(key, None)] + + # use default author if not provided + contents["author"] = contents.get("author") or default_author or [contents.get("created_by")] + if isinstance(contents["author"], str): + contents["author"] = [contents["author"]] + + output = rule_prompt( + rule_path, + required_only=required_only, + save=True, + verbose=True, + additional_required=additional, + skip_errors=skip_errors, + strip_none_values=strip_none_values, + **contents, + ) + # If output is not a TOMLRule + if isinstance(output, str): + errors.append(output) + + if contents.get("exceptions_list"): + # For each item in rule.contents.data.exceptions_list to the exception_list_rule_table under the list_id + for exception in 
contents["exceptions_list"]: + exception_id = exception["list_id"] + if exception_id not in exception_list_rule_table: + exception_list_rule_table[exception_id] = [] + exception_list_rule_table[exception_id].append({"id": contents["id"], "name": contents["name"]}) + + if contents.get("actions"): + # If rule has actions with connectors, add them to the action_connector_rule_table under the action_id + for action in contents["actions"]: + action_id = action["id"] + if action_id not in action_connector_rule_table: + action_connector_rule_table[action_id] = [] + action_connector_rule_table[action_id].append({"id": contents["id"], "name": contents["name"]}) + + # Build TOMLException Objects + if exceptions_import: + _, e_output, e_errors = build_exception_objects( + exceptions_containers, + exceptions_items, + exception_list_rule_table, + exceptions_directory, + save_toml=True, + skip_errors=skip_errors, + verbose=True, + ) + for line in e_output: + click.echo(line) + errors.extend(e_errors) + + # Build TOMLActionConnector Objects + if action_connector_import: + _, ac_output, ac_errors = build_action_connector_objects( + action_connectors, + action_connector_rule_table, + action_connectors_directory, + save_toml=True, + skip_errors=skip_errors, + verbose=True, + ) + for line in ac_output: + click.echo(line) + errors.extend(ac_errors) + + exceptions_count = 0 if not exceptions_import else len(exceptions_containers) + len(exceptions_items) + click.echo(f"{len(file_contents) + exceptions_count} results exported") + click.echo(f"{len(file_contents)} rules converted") + click.echo(f"{exceptions_count} exceptions exported") + click.echo(f"{len(action_connectors)} actions connectors exported") + if errors: + err_file = save_directory if save_directory is not None else RULES_DIRS[0] / "_errors.txt" + err_file.write_text("\n".join(errors)) + click.echo(f"{len(errors)} errors saved to {err_file}") @root.command('build-limited-rules') @@ -234,9 +358,17 @@ def view_rule(ctx, rule_file, api_format): return rule -def _export_rules(rules: RuleCollection, outfile: Path, downgrade_version: Optional[definitions.SemVer] = None, - verbose=True, skip_unsupported=False, include_metadata: bool = False): - """Export rules into a consolidated ndjson file.""" +def _export_rules( + rules: RuleCollection, + outfile: Path, + downgrade_version: Optional[definitions.SemVer] = None, + verbose=True, + skip_unsupported=False, + include_metadata: bool = False, + include_action_connectors: bool = False, + include_exceptions: bool = False, +): + """Export rules and exceptions into a consolidated ndjson file.""" from .rule import downgrade_contents_from_rule outfile = outfile.with_suffix('.ndjson') @@ -263,6 +395,21 @@ def _export_rules(rules: RuleCollection, outfile: Path, downgrade_version: Optio output_lines = [json.dumps(r.contents.to_api_format(include_metadata=include_metadata), sort_keys=True) for r in rules] + # Add exceptions to api format here and add to output_lines + if include_exceptions or include_action_connectors: + cl = GenericCollection.default() + # Get exceptions in API format + if include_exceptions: + exceptions = [d.contents.to_api_format() for d in cl.items if isinstance(d.contents, TOMLExceptionContents)] + exceptions = [e for sublist in exceptions for e in sublist] + output_lines.extend(json.dumps(e, sort_keys=True) for e in exceptions) + if include_action_connectors: + action_connectors = [ + d.contents.to_api_format() for d in cl.items if isinstance(d.contents, TOMLActionConnectorContents) + ] + actions 
= [a for sublist in action_connectors for a in sublist] + output_lines.extend(json.dumps(a, sort_keys=True) for a in actions) + outfile.write_text('\n'.join(output_lines) + '\n') if verbose: @@ -273,20 +420,42 @@ def _export_rules(rules: RuleCollection, outfile: Path, downgrade_version: Optio click.echo(f'Skipped {len(unsupported)} unsupported rules: \n- {unsupported_str}') -@root.command('export-rules-from-repo') +@root.command("export-rules-from-repo") @multi_collection -@click.option('--outfile', '-o', default=Path(get_path('exports', f'{time.strftime("%Y%m%dT%H%M%SL")}.ndjson')), - type=Path, help='Name of file for exported rules') -@click.option('--replace-id', '-r', is_flag=True, help='Replace rule IDs with new IDs before export') -@click.option('--stack-version', type=click.Choice(all_versions()), - help='Downgrade a rule version to be compatible with older instances of Kibana') -@click.option('--skip-unsupported', '-s', is_flag=True, - help='If `--stack-version` is passed, skip rule types which are unsupported ' - '(an error will be raised otherwise)') -@click.option('--include-metadata', type=bool, is_flag=True, default=False, help='Add metadata to the exported rules') -def export_rules_from_repo(rules, outfile: Path, replace_id, stack_version, - skip_unsupported, include_metadata: bool) -> RuleCollection: - """Export rule(s) into an importable ndjson file.""" +@click.option( + "--outfile", + "-o", + default=Path(get_path("exports", f'{time.strftime("%Y%m%dT%H%M%SL")}.ndjson')), + type=Path, + help="Name of file for exported rules", +) +@click.option("--replace-id", "-r", is_flag=True, help="Replace rule IDs with new IDs before export") +@click.option( + "--stack-version", + type=click.Choice(all_versions()), + help="Downgrade a rule version to be compatible with older instances of Kibana", +) +@click.option( + "--skip-unsupported", + "-s", + is_flag=True, + help="If `--stack-version` is passed, skip rule types which are unsupported " "(an error will be raised otherwise)", +) +@click.option("--include-metadata", type=bool, is_flag=True, default=False, help="Add metadata to the exported rules") +@click.option( + "--include-action-connectors", + "-ac", + type=bool, + is_flag=True, + default=False, + help="Include Action Connectors in export", +) +@click.option( + "--include-exceptions", "-e", type=bool, is_flag=True, default=False, help="Include Exceptions Lists in export" +) +def export_rules_from_repo(rules, outfile: Path, replace_id, stack_version, skip_unsupported, include_metadata: bool, + include_action_connectors: bool, include_exceptions: bool) -> RuleCollection: + """Export rule(s) and exception(s) into an importable ndjson file.""" assert len(rules) > 0, "No rules found" if replace_id: @@ -301,8 +470,15 @@ def export_rules_from_repo(rules, outfile: Path, replace_id, stack_version, rules.add_rule(TOMLRule(contents=new_contents)) outfile.parent.mkdir(exist_ok=True) - _export_rules(rules=rules, outfile=outfile, downgrade_version=stack_version, - skip_unsupported=skip_unsupported, include_metadata=include_metadata) + _export_rules( + rules=rules, + outfile=outfile, + downgrade_version=stack_version, + skip_unsupported=skip_unsupported, + include_metadata=include_metadata, + include_action_connectors=include_action_connectors, + include_exceptions=include_exceptions, + ) return rules @@ -414,8 +590,19 @@ def test_rules(ctx): """Run unit tests over all of the rules.""" import pytest + rules_config = ctx.obj['rules_config'] + test_config = rules_config.test_config + tests, 
skipped = test_config.get_test_names(formatted=True) + + if skipped: + click.echo(f'Tests skipped per config ({len(skipped)}):') + click.echo('\n'.join(skipped)) + clear_caches() - ctx.exit(pytest.main(["-v"])) + if tests: + ctx.exit(pytest.main(['-v'] + tests)) + else: + click.echo('No tests found to execute!') @root.group('typosquat') diff --git a/detection_rules/misc.py b/detection_rules/misc.py index 5d9a2ccb9..989d78f0f 100644 --- a/detection_rules/misc.py +++ b/detection_rules/misc.py @@ -7,15 +7,16 @@ import os import re import time +import unittest import uuid from pathlib import Path - from functools import wraps from typing import NoReturn, Optional import click import requests + # this is primarily for type hinting - all use of the github client should come from GithubClient class try: from github import Github @@ -49,6 +50,9 @@ JS_LICENSE = """ """.strip().format("\n".join(' * ' + line for line in LICENSE_LINES)) +ROOT_DIR = Path(__file__).parent.parent + + class ClientError(click.ClickException): """Custom CLI error to format output or full debug stacktrace.""" @@ -109,8 +113,8 @@ def nest_from_dot(dots, value): nested = {fields.pop(): value} - for field in reversed(fields): - nested = {field: nested} + for field_ in reversed(fields): + nested = {field_: nested} return nested @@ -275,7 +279,7 @@ def get_default_config() -> Optional[Path]: @cached -def parse_config(): +def parse_user_config(): """Parse a default config file.""" import eql @@ -290,10 +294,27 @@ def parse_config(): return config +def discover_tests(start_dir: str = 'tests', pattern: str = 'test*.py', top_level_dir: Optional[str] = None): + """Discover all unit tests in a directory.""" + def list_tests(s, tests=None): + if tests is None: + tests = [] + for test in s: + if isinstance(test, unittest.TestSuite): + list_tests(test, tests) + else: + tests.append(test.id()) + return tests + + loader = unittest.defaultTestLoader + suite = loader.discover(start_dir, pattern=pattern, top_level_dir=top_level_dir or str(ROOT_DIR)) + return list_tests(suite) + + def getdefault(name): """Callback function for `default` to get an environment variable.""" envvar = f"DR_{name.upper()}" - config = parse_config() + config = parse_user_config() return lambda: os.environ.get(envvar, config.get(name)) diff --git a/detection_rules/mixins.py b/detection_rules/mixins.py index 52cee2399..b22677d29 100644 --- a/detection_rules/mixins.py +++ b/detection_rules/mixins.py @@ -17,7 +17,7 @@ import marshmallow_union import marshmallow from marshmallow import Schema, ValidationError, validates_schema, fields as marshmallow_fields -from .misc import load_current_package_version +from .config import load_current_package_version from .schemas import definitions from .schemas.stack_compat import get_incompatible_fields from semver import Version diff --git a/detection_rules/navigator.py b/detection_rules/navigator.py index 2db3a16a0..125ab1bbe 100644 --- a/detection_rules/navigator.py +++ b/detection_rules/navigator.py @@ -17,7 +17,6 @@ import json from .attack import CURRENT_ATTACK_VERSION from .mixins import MarshmallowDataclassMixin from .rule import TOMLRule -from .rule_loader import DEFAULT_RULES_DIR, DEFAULT_BBR_DIR from .schemas import definitions @@ -162,11 +161,13 @@ class NavigatorBuilder: return links def rule_links_dict(self, rule: TOMLRule) -> dict: + """Create a links dictionary for a rule.""" base_url = 'https://github.com/elastic/detection-rules/blob/main/rules/' - try: - base_path = 
str(rule.path.resolve().relative_to(DEFAULT_RULES_DIR)) - except ValueError: - base_path = str(rule.path.resolve().relative_to(DEFAULT_BBR_DIR)) + base_path = rule.get_base_rule_dir() + + if base_path is None: + raise ValueError("Could not find a valid base path for the rule") + url = f'{base_url}{base_path}' return self.links_dict(rule.name, url) diff --git a/detection_rules/packaging.py b/detection_rules/packaging.py index 36044c657..0dda6b507 100644 --- a/detection_rules/packaging.py +++ b/detection_rules/packaging.py @@ -19,16 +19,19 @@ from semver import Version import click import yaml -from .misc import JS_LICENSE, cached, load_current_package_version +from .config import load_current_package_version, parse_rules_config +from .misc import JS_LICENSE, cached from .navigator import NavigatorBuilder, Navigator from .rule import TOMLRule, QueryRuleData, ThreatMapping -from .rule_loader import DeprecatedCollection, RuleCollection, DEFAULT_RULES_DIR, DEFAULT_BBR_DIR +from .rule_loader import DeprecatedCollection, RuleCollection from .schemas import definitions -from .utils import Ndjson, get_path, get_etc_path, load_etc_dump -from .version_lock import default_version_lock +from .utils import Ndjson, get_path, get_etc_path +from .version_lock import loaded_version_lock + +RULES_CONFIG = parse_rules_config() RELEASE_DIR = get_path("releases") -PACKAGE_FILE = get_etc_path('packages.yaml') +PACKAGE_FILE = str(RULES_CONFIG.packages_file) NOTICE_FILE = get_path('NOTICE.txt') FLEET_PKG_LOGO = get_etc_path("security-logo-color-64px.svg") @@ -62,7 +65,7 @@ def filter_rule(rule: TOMLRule, config_filter: dict, exclude_fields: Optional[di unique_fields = get_unique_query_fields(rule) for index, fields in exclude_fields.items(): - if unique_fields and (rule.contents.data.index == index or index == 'any'): + if unique_fields and (rule.contents.data.index_or_dataview == index or index == 'any'): if set(unique_fields) & set(fields): return False @@ -89,18 +92,19 @@ class Package(object): self.historical = historical if min_version is not None: - self.rules = self.rules.filter(lambda r: min_version <= r.contents.latest_version) + self.rules = self.rules.filter(lambda r: min_version <= r.contents.saved_version) if max_version is not None: - self.rules = self.rules.filter(lambda r: max_version >= r.contents.latest_version) + self.rules = self.rules.filter(lambda r: max_version >= r.contents.saved_version) + assert not RULES_CONFIG.bypass_version_lock, "Packaging cannot be used when version locking is bypassed."
self.changed_ids, self.new_ids, self.removed_ids = \ - default_version_lock.manage_versions(self.rules, verbose=verbose, save_changes=False) + loaded_version_lock.manage_versions(self.rules, verbose=verbose, save_changes=False) @classmethod def load_configs(cls): - """Load configs from packages.yml.""" - return load_etc_dump(str(PACKAGE_FILE))['package'] + """Load configs from packages.yaml.""" + return RULES_CONFIG.packages['package'] @staticmethod def _package_kibana_notice_file(save_dir): @@ -223,9 +227,10 @@ class Package(object): return sha256 @classmethod - def from_config(cls, config: dict = None, verbose: bool = False, historical: bool = True) -> 'Package': + def from_config(cls, rule_collection: Optional[RuleCollection] = None, config: Optional[dict] = None, + verbose: Optional[bool] = False) -> 'Package': """Load a rules package given a config.""" - all_rules = RuleCollection.default() + all_rules = rule_collection or RuleCollection.default() config = config or {} exclude_fields = config.pop('exclude_fields', {}) # deprecated rules are now embedded in the RuleCollection.deprecated - this is left here for backwards compat @@ -240,7 +245,7 @@ class Package(object): if verbose: click.echo(f' - {len(all_rules) - len(rules)} rules excluded from package') - package = cls(rules, verbose=verbose, historical=historical, **config) + package = cls(rules, verbose=verbose, **config) return package @@ -473,10 +478,10 @@ class Package(object): bulk_upload_docs.append(create) - try: - relative_path = str(rule.path.resolve().relative_to(DEFAULT_RULES_DIR)) - except ValueError: - relative_path = str(rule.path.resolve().relative_to(DEFAULT_BBR_DIR)) + relative_path = rule.get_base_rule_dir() + + if relative_path is None: + raise ValueError(f"Could not find a valid relative path for the rule: {rule.id}") + + relative_path = str(relative_path) rule_doc = dict(hash=rule.contents.sha256(), source='repo', diff --git a/detection_rules/remote_validation.py b/detection_rules/remote_validation.py index bab264604..c00d3bc37 100644 --- a/detection_rules/remote_validation.py +++ b/detection_rules/remote_validation.py @@ -16,7 +16,8 @@ from requests import HTTPError from kibana import Kibana -from .misc import ClientError, getdefault, get_elasticsearch_client, get_kibana_client, load_current_package_version +from .config import load_current_package_version +from .misc import ClientError, getdefault, get_elasticsearch_client, get_kibana_client from .rule import TOMLRule, TOMLRuleContents from .schemas import definitions diff --git a/detection_rules/rule.py b/detection_rules/rule.py index 5fefcb2d4..d5a8c9a81 100644 --- a/detection_rules/rule.py +++ b/detection_rules/rule.py @@ -14,6 +14,7 @@ from dataclasses import dataclass, field from functools import cached_property from pathlib import Path from typing import Any, Dict, List, Literal, Optional, Tuple, Union +from urllib.parse import urlparse from uuid import uuid4 import eql @@ -21,26 +22,32 @@ import marshmallow from semver import Version from marko.block import Document as MarkoDocument from marko.ext.gfm import gfm -from marshmallow import ValidationError, validates_schema, pre_load +from marshmallow import ValidationError, pre_load, validates_schema import kql from .
import beats, ecs, endgame, utils +from .config import load_current_package_version, parse_rules_config from .integrations import (find_least_compatible_version, get_integration_schema_fields, load_integrations_manifests, load_integrations_schemas, parse_datasets) -from .misc import load_current_package_version from .mixins import MarshmallowDataclassMixin, StackCompatMixin from .rule_formatter import nested_normalize, toml_write from .schemas import (SCHEMA_DIR, definitions, downgrade, get_min_supported_stack_version, get_stack_schemas, strip_non_public_fields) from .schemas.stack_compat import get_restricted_fields -from .utils import cached, convert_time_span, PatchedTemplate +from .utils import PatchedTemplate, cached, convert_time_span, get_nested_value, set_nested_value + _META_SCHEMA_REQ_DEFAULTS = {} MIN_FLEET_PACKAGE_VERSION = '7.13.0' TIME_NOW = time.strftime('%Y/%m/%d') +RULES_CONFIG = parse_rules_config() +DEFAULT_PREBUILT_RULES_DIRS = RULES_CONFIG.rule_dirs +DEFAULT_PREBUILT_BBR_DIRS = RULES_CONFIG.bbr_rules_dirs +BYPASS_VERSION_LOCK = RULES_CONFIG.bypass_version_lock + BUILD_FIELD_VERSIONS = { "related_integrations": (Version.parse('8.3.0'), None), @@ -169,6 +176,14 @@ class BaseThreatEntry: name: str reference: str + @pre_load + def modify_url(self, data: Dict[str, Any], **kwargs): + """Modify the URL to support MITRE ATT&CK URLS with and without trailing forward slash.""" + if urlparse(data["reference"]).scheme: + if not data["reference"].endswith("/"): + data["reference"] += "/" + return data + @dataclass(frozen=True) class SubTechnique(BaseThreatEntry): @@ -361,6 +376,7 @@ class BaseRuleData(MarshmallowDataclassMixin, StackCompatMixin): references: Optional[List[str]] related_integrations: Optional[List[RelatedIntegrations]] = field(metadata=dict(metadata=dict(min_compat="8.3"))) required_fields: Optional[List[RequiredFields]] = field(metadata=dict(metadata=dict(min_compat="8.3"))) + revision: Optional[int] = field(metadata=dict(metadata=dict(min_compat="8.8"))) risk_score: definitions.RiskScore risk_score_mapping: Optional[List[RiskScoreMapping]] rule_id: definitions.UUIDString @@ -376,6 +392,7 @@ class BaseRuleData(MarshmallowDataclassMixin, StackCompatMixin): to: Optional[str] type: definitions.RuleType threat: Optional[List[ThreatMapping]] + version: Optional[definitions.PositiveInteger] @classmethod def save_schema(cls): @@ -455,6 +472,23 @@ class BaseRuleData(MarshmallowDataclassMixin, StackCompatMixin): return obj + @validates_schema + def validates_data(self, data, **kwargs): + """Validate fields and data for marshmallow schemas.""" + + # Validate version and revision fields not supplied. + disallowed_fields = [field for field in ['version', 'revision'] if data.get(field) is not None] + if not disallowed_fields: + return + + error_message = " and ".join(disallowed_fields) + + # If version and revision fields are supplied, and using locked versions raise an error. 
+ if BYPASS_VERSION_LOCK is not True: + msg = (f"Configuration error: Rule {data['name']} - {data['rule_id']} " + f"should not contain rules with `{error_message}` set.") + raise ValidationError(msg) + class DataValidator: """Additional validation beyond base marshmallow schema validation.""" @@ -578,7 +612,7 @@ class DataValidator: f"field, use the environment variable `DR_BYPASS_NOTE_VALIDATION_AND_PARSE`") # raise if setup header is in note and in setup - if self.setup_in_note and self.setup: + if self.setup_in_note and (self.setup and self.setup != "None"): raise ValidationError("Setup header found in both note and setup fields.") @@ -676,6 +710,16 @@ class QueryRuleData(BaseRuleData): language: definitions.FilterLanguages alert_suppression: Optional[AlertSuppressionMapping] = field(metadata=dict(metadata=dict(min_compat="8.8"))) + @cached_property + def index_or_dataview(self) -> list[str]: + """Return the index or dataview depending on which is set. If neither returns empty list.""" + if self.index is not None: + return self.index + elif self.data_view_id is not None: + return [self.data_view_id] + else: + return [] + @cached_property def validator(self) -> Optional[QueryValidator]: if self.language == "kuery": @@ -943,7 +987,7 @@ class BaseRuleContents(ABC): pass def lock_info(self, bump=True) -> dict: - version = self.autobumped_version if bump else (self.latest_version or 1) + version = self.autobumped_version if bump else (self.saved_version or 1) contents = {"rule_name": self.name, "sha256": self.sha256(), "version": version, "type": self.type} return contents @@ -990,20 +1034,43 @@ class BaseRuleContents(ABC): return max_allowable_version - current_version - 1 @property - def latest_version(self) -> Optional[int]: - """Retrieve the latest known version of the rule.""" - min_stack = self.get_supported_version() - return self.version_lock.get_locked_version(self.id, min_stack) + def saved_version(self) -> Optional[int]: + """Retrieve the version from the version.lock or from the file if version locking is bypassed.""" + toml_version = self.data.get("version") + + if BYPASS_VERSION_LOCK: + return toml_version + + if toml_version: + print(f"WARNING: Rule {self.name} - {self.id} has a version set in the rule TOML." + " This `version` will be ignored and defaulted to the version.lock.json file." + " Set `bypass_version_lock` to `True` in the rules config to use the TOML version.") + + return self.version_lock.get_locked_version(self.id, self.get_supported_version()) @property def autobumped_version(self) -> Optional[int]: """Retrieve the current version of the rule, accounting for automatic increments.""" - version = self.latest_version + version = self.saved_version + + if BYPASS_VERSION_LOCK: + raise NotImplementedError("This method is not implemented when version locking is not in use.") + + # Default to version 1 if no version is set yet if version is None: return 1 + # Auto-increment version if the rule is 'dirty' and not bypassing version lock return version + 1 if self.is_dirty else version + def get_synthetic_version(self, use_default: bool) -> Optional[int]: + """ + Get the latest actual representation of a rule's version, where changes are accounted for automatically when + version locking is used, otherwise, return the version defined in the rule toml if present else optionally + default to 1. 
+ """ + return self.autobumped_version or self.saved_version or (1 if use_default else None) + @classmethod def convert_supported_version(cls, stack_version: Optional[str]) -> Version: """Convert an optional stack version to the minimum for the lock in the form major.minor.""" @@ -1051,13 +1118,22 @@ class TOMLRuleContents(BaseRuleContents, MarshmallowDataclassMixin): @cached_property def version_lock(self): # VersionLock - from .version_lock import default_version_lock + from .version_lock import loaded_version_lock - return getattr(self, '_version_lock', None) or default_version_lock + if RULES_CONFIG.bypass_version_lock is True: + err_msg = "Cannot access the version lock when the versioning strategy is configured to bypass the" \ + " version lock. Set `bypass_version_lock` to `false` in the rules config to use the version lock." + raise ValueError(err_msg) + + return getattr(self, '_version_lock', None) or loaded_version_lock def set_version_lock(self, value): from .version_lock import VersionLock + err_msg = "Cannot set the version lock when the versioning strategy is configured to bypass the version lock." \ + " Set `bypass_version_lock` to `false` in the rules config to use the version lock." + assert not RULES_CONFIG.bypass_version_lock, err_msg + if value and not isinstance(value, VersionLock): raise TypeError(f'version lock property must be set with VersionLock objects only. Got {type(value)}') @@ -1095,6 +1171,20 @@ class TOMLRuleContents(BaseRuleContents, MarshmallowDataclassMixin): def type(self) -> str: return self.data.type + def _add_known_nulls(self, rule_dict: dict) -> dict: + """Add known nulls to the rule.""" + # Note this is primarily as a stopgap until add support for Rule Actions + for pair in definitions.KNOWN_NULL_ENTRIES: + for compound_key, sub_key in pair.items(): + value = get_nested_value(rule_dict, compound_key) + if isinstance(value, list): + items_to_update = [ + item for item in value if isinstance(item, dict) and get_nested_value(item, sub_key) is None + ] + for item in items_to_update: + set_nested_value(item, sub_key, None) + return rule_dict + def _post_dict_conversion(self, obj: dict) -> dict: """Transform the converted API in place before sending to Kibana.""" super()._post_dict_conversion(obj) @@ -1262,10 +1352,12 @@ class TOMLRuleContents(BaseRuleContents, MarshmallowDataclassMixin): data: AnyRuleData = value["data"] metadata: RuleMeta = value["metadata"] - data.validate_query(metadata) - data.data_validator.validate_note() - data.data_validator.validate_bbr(metadata.get('bypass_bbr_timing')) - data.validate(metadata) if hasattr(data, 'validate') else False + test_config = RULES_CONFIG.test_config + if not test_config.check_skip_by_rule_id(value['data'].rule_id): + data.validate_query(metadata) + data.data_validator.validate_note() + data.data_validator.validate_bbr(metadata.get('bypass_bbr_timing')) + data.validate(metadata) if hasattr(data, 'validate') else False @staticmethod def validate_remote(remote_validator: 'RemoteValidator', contents: 'TOMLRuleContents'): @@ -1276,7 +1368,13 @@ class TOMLRuleContents(BaseRuleContents, MarshmallowDataclassMixin): cls, rule: dict, creation_date: str = TIME_NOW, updated_date: str = TIME_NOW, maturity: str = 'development' ) -> 'TOMLRuleContents': """Create a TOMLRuleContents from a kibana rule resource.""" - meta = {'creation_date': creation_date, 'updated_date': updated_date, 'maturity': maturity} + integrations = [r.get("package") for r in rule.get("related_integrations")] + meta = { + "creation_date": 
creation_date, + "updated_date": updated_date, + "maturity": maturity, + "integration": integrations, + } contents = cls.from_dict({'metadata': meta, 'rule': rule, 'transforms': None}, unknown=marshmallow.EXCLUDE) return contents @@ -1295,9 +1393,10 @@ class TOMLRuleContents(BaseRuleContents, MarshmallowDataclassMixin): flattened.update(self.metadata.to_dict()) return flattened - def to_api_format(self, include_version: bool = True, include_metadata: bool = False) -> dict: + def to_api_format(self, include_version: bool = not BYPASS_VERSION_LOCK, include_metadata: bool = False) -> dict: """Convert the TOML rule to the API format.""" rule_dict = self.to_dict() + rule_dict = self._add_known_nulls(rule_dict) converted_data = rule_dict['rule'] converted = self._post_dict_conversion(converted_data) @@ -1346,11 +1445,22 @@ class TOMLRule: """Generate the relevant fleet compatible asset.""" return {"id": self.id, "attributes": self.contents.to_api_format(), "type": definitions.SAVED_OBJECT_TYPE} - def save_toml(self): + def get_base_rule_dir(self) -> Path | None: + """Get the base rule directory for the rule.""" + rule_path = self.path.resolve() + for rules_dir in DEFAULT_PREBUILT_RULES_DIRS + DEFAULT_PREBUILT_BBR_DIRS: + if rule_path.is_relative_to(rules_dir): + return rule_path.relative_to(rules_dir) + return None + + def save_toml(self, strip_none_values: bool = True): assert self.path is not None, f"Can't save rule {self.name} (self.id) without a path" - converted = dict(metadata=self.contents.metadata.to_dict(), rule=self.contents.data.to_dict()) + converted = dict( + metadata=self.contents.metadata.to_dict(), + rule=self.contents.data.to_dict(strip_none_values=strip_none_values), + ) if self.contents.transform: - converted['transform'] = self.contents.transform.to_dict() + converted["transform"] = self.contents.transform.to_dict() toml_write(converted, str(self.path.absolute())) def save_json(self, path: Path, include_version: bool = True): @@ -1369,13 +1479,17 @@ class DeprecatedRuleContents(BaseRuleContents): @cached_property def version_lock(self): # VersionLock - from .version_lock import default_version_lock + from .version_lock import loaded_version_lock - return getattr(self, '_version_lock', None) or default_version_lock + return getattr(self, '_version_lock', None) or loaded_version_lock def set_version_lock(self, value): from .version_lock import VersionLock + err_msg = "Cannot set the version lock when the versioning strategy is configured to bypass the version lock." \ + " Set `bypass_version_lock` to `false` in the rules config to use the version lock." + assert not RULES_CONFIG.bypass_version_lock, err_msg + if value and not isinstance(value, VersionLock): raise TypeError(f'version lock property must be set with VersionLock objects only. 
Got {type(value)}') @@ -1400,7 +1514,7 @@ class DeprecatedRuleContents(BaseRuleContents): kwargs['transform'] = obj['transform'] if 'transform' in obj else None return cls(**kwargs) - def to_api_format(self, include_version=True) -> dict: + def to_api_format(self, include_version: bool = not BYPASS_VERSION_LOCK) -> dict: """Convert the TOML rule to the API format.""" data = copy.deepcopy(self.data) if self.transform: @@ -1486,8 +1600,8 @@ def get_unique_query_fields(rule: TOMLRule) -> List[str]: cfg = set_eql_config(rule.contents.metadata.get('min_stack_version')) with eql.parser.elasticsearch_syntax, eql.parser.ignore_missing_functions, eql.parser.skip_optimizations, cfg: - parsed = kql.parse(query) if language == 'kuery' else eql.parse_query(query) - + parsed = (kql.parse(query, normalize_kql_keywords=RULES_CONFIG.normalize_kql_keywords) + if language == 'kuery' else eql.parse_query(query)) return sorted(set(str(f) for f in parsed if isinstance(f, (eql.ast.Field, kql.ast.Field)))) diff --git a/detection_rules/rule_formatter.py b/detection_rules/rule_formatter.py index 766a05d4f..149f4d145 100644 --- a/detection_rules/rule_formatter.py +++ b/detection_rules/rule_formatter.py @@ -67,6 +67,9 @@ def wrap_text(v, block_indent=0, join=False): lines = textwrap.wrap(v, initial_indent=' ' * block_indent, subsequent_indent=' ' * block_indent, width=120, break_long_words=False, break_on_hyphens=False) lines = [line + '\n' for line in lines] + # If there is a single line that contains a quote, add a new blank line to trigger multiline formatting + if len(lines) == 1 and '"' in lines[0]: + lines = lines + [''] return lines if not join else ''.join(lines) @@ -191,6 +194,7 @@ def toml_write(rule_contents, outfile=None): def _do_write(_data, _contents): query = None + threat_query = None if _data == 'rule': # - We want to avoid the encoder for the query and instead use kql-lint. @@ -204,6 +208,7 @@ def toml_write(rule_contents, outfile=None): # # if tags and isinstance(tags, list): # contents['rule']["tags"] = list(sorted(set(tags))) + threat_query = contents['rule'].pop('threat_query', '').strip() top = OrderedDict() bottom = OrderedDict() @@ -214,18 +219,28 @@ def toml_write(rule_contents, outfile=None): if k == 'actions': # explicitly preserve formatting for message field in actions preserved_fields = ["params.message"] - v = [preserve_formatting_for_fields(action, preserved_fields) for action in v] + v = [preserve_formatting_for_fields(action, preserved_fields) for action in v] if v is not None else [] if k == 'filters': # explicitly preserve formatting for value field in filters preserved_fields = ["meta.value"] - v = [preserve_formatting_for_fields(meta, preserved_fields) for meta in v] + v = [preserve_formatting_for_fields(meta, preserved_fields) for meta in v] if v is not None else [] if k == 'note' and isinstance(v, str): # Transform instances of \ to \\ as calling write will convert \\ to \. # This will ensure that the output file has the correct number of backslashes. v = v.replace("\\", "\\\\") + if k == 'setup' and isinstance(v, str): + # Transform instances of \ to \\ as calling write will convert \\ to \. + # This will ensure that the output file has the correct number of backslashes. + v = v.replace("\\", "\\\\") + + if k == 'description' and isinstance(v, str): + # Transform instances of \ to \\ as calling write will convert \\ to \. + # This will ensure that the output file has the correct number of backslashes. 
+ v = v.replace("\\", "\\\\") + if isinstance(v, dict): bottom[k] = OrderedDict(sorted(v.items())) elif isinstance(v, list): @@ -241,9 +256,17 @@ def toml_write(rule_contents, outfile=None): if query: top.update({'query': "XXxXX"}) + if threat_query: + top.update({'threat_query': "XXxXX"}) + top.update(bottom) top = toml.dumps(OrderedDict({data: top}), encoder=encoder) + # we want to preserve the threat_query format, but want to modify it in the context of encoded dump + if threat_query: + formatted_threat_query = "\nthreat_query = '''\n{}\n'''{}".format(threat_query, '\n\n' if bottom else '') + top = top.replace('threat_query = "XXxXX"', formatted_threat_query) + # we want to preserve the query format, but want to modify it in the context of encoded dump if query: formatted_query = "\nquery = '''\n{}\n'''{}".format(query, '\n\n' if bottom else '') diff --git a/detection_rules/rule_loader.py b/detection_rules/rule_loader.py index 51c7b824a..b6ac67421 100644 --- a/detection_rules/rule_loader.py +++ b/detection_rules/rule_loader.py @@ -4,7 +4,6 @@ # 2.0. """Load rule metadata transform between rule and api formats.""" -import io from collections import OrderedDict from dataclasses import dataclass, field from pathlib import Path @@ -17,6 +16,7 @@ import json from marshmallow.exceptions import ValidationError from . import utils +from .config import parse_rules_config from .mappings import RtaMappings from .rule import ( DeprecatedRule, DeprecatedRuleContents, DictRule, TOMLRule, TOMLRuleContents @@ -24,10 +24,9 @@ from .rule import ( from .schemas import definitions from .utils import cached, get_path -DEFAULT_RULES_DIR = get_path("rules") -DEFAULT_BBR_DIR = get_path("rules_building_block") -DEFAULT_DEPRECATED_DIR = DEFAULT_RULES_DIR / '_deprecated' -RTA_DIR = get_path("rta") +RULES_CONFIG = parse_rules_config() +DEFAULT_PREBUILT_RULES_DIRS = RULES_CONFIG.rule_dirs +DEFAULT_PREBUILT_BBR_DIRS = RULES_CONFIG.bbr_rules_dirs FILE_PATTERN = r'^([a-z0-9_])+\.(json|toml)$' @@ -82,7 +81,8 @@ def metadata_filter(**metadata) -> Callable[[TOMLRule], bool]: production_filter = metadata_filter(maturity="production") -def load_locks_from_tag(remote: str, tag: str) -> (str, dict, dict): +def load_locks_from_tag(remote: str, tag: str, version_lock: str = 'detection_rules/etc/version.lock.json', + deprecated_file: str = 'detection_rules/etc/deprecated_rules.json') -> (str, dict, dict): """Loads version and deprecated lock files from git tag.""" import json git = utils.make_git() @@ -104,13 +104,13 @@ def load_locks_from_tag(remote: str, tag: str) -> (str, dict, dict): commit_hash = git('rev-list', '-1', tag) try: - version = json.loads(git('show', f'{tag}:detection_rules/etc/version.lock.json')) + version = json.loads(git('show', f'{tag}:{version_lock}')) except CalledProcessError: # Adding resiliency to account for the old directory structure version = json.loads(git('show', f'{tag}:etc/version.lock.json')) try: - deprecated = json.loads(git('show', f'{tag}:detection_rules/etc/deprecated_rules.json')) + deprecated = json.loads(git('show', f'{tag}:{deprecated_file}')) except CalledProcessError: # Adding resiliency to account for the old directory structure deprecated = json.loads(git('show', f'{tag}:etc/deprecated_rules.json')) @@ -293,8 +293,8 @@ class RawRuleCollection(BaseCollection): """Return the default rule collection, which retrieves from rules/.""" if cls.__default is None: collection = RawRuleCollection() - collection.load_directory(DEFAULT_RULES_DIR) - collection.load_directory(DEFAULT_BBR_DIR) + 
collection.load_directories(DEFAULT_PREBUILT_RULES_DIRS) + collection.load_directories(DEFAULT_PREBUILT_BBR_DIRS) collection.freeze() cls.__default = collection @@ -305,7 +305,7 @@ class RawRuleCollection(BaseCollection): """Return the default BBR collection, which retrieves from building_block_rules/.""" if cls.__default_bbr is None: collection = RawRuleCollection() - collection.load_directory(DEFAULT_BBR_DIR) + collection.load_directories(DEFAULT_PREBUILT_BBR_DIRS) collection.freeze() cls.__default_bbr = collection @@ -359,7 +359,7 @@ class RuleCollection(BaseCollection): # use pytoml instead of toml because of annoying bugs # https://github.com/uiri/toml/issues/152 # might also be worth looking at https://github.com/sdispater/tomlkit - with io.open(path, "r", encoding="utf-8") as f: + with path.open("r", encoding="utf-8") as f: toml_dict = self.deserialize_toml_string(f.read()) self._toml_load_cache[path] = toml_dict return toml_dict @@ -404,13 +404,15 @@ class RuleCollection(BaseCollection): # bypass rule object load (load_dict) and load as a dict only if obj.get('metadata', {}).get('maturity', '') == 'deprecated': contents = DeprecatedRuleContents.from_dict(obj) - contents.set_version_lock(self._version_lock) + if not RULES_CONFIG.bypass_version_lock: + contents.set_version_lock(self._version_lock) deprecated_rule = DeprecatedRule(path, contents) self.add_deprecated_rule(deprecated_rule) return deprecated_rule else: contents = TOMLRuleContents.from_dict(obj) - contents.set_version_lock(self._version_lock) + if not RULES_CONFIG.bypass_version_lock: + contents.set_version_lock(self._version_lock) rule = TOMLRule(path=path, contents=contents) self.add_rule(rule) return rule @@ -442,8 +444,10 @@ class RuleCollection(BaseCollection): from .version_lock import VersionLock, add_rule_types_to_lock git = utils.make_git() - rules_dir = DEFAULT_RULES_DIR.relative_to(get_path(".")) - paths = git("ls-tree", "-r", "--name-only", branch, rules_dir).splitlines() + paths = [] + for rules_dir in DEFAULT_PREBUILT_RULES_DIRS: + rules_dir = rules_dir.relative_to(get_path(".")) + paths.extend(git("ls-tree", "-r", "--name-only", branch, rules_dir).splitlines()) rule_contents = [] rule_map = {} @@ -507,8 +511,8 @@ class RuleCollection(BaseCollection): """Return the default rule collection, which retrieves from rules/.""" if cls.__default is None: collection = RuleCollection() - collection.load_directory(DEFAULT_RULES_DIR) - collection.load_directory(DEFAULT_BBR_DIR) + collection.load_directories(DEFAULT_PREBUILT_RULES_DIRS) + collection.load_directories(DEFAULT_PREBUILT_BBR_DIRS) collection.freeze() cls.__default = collection @@ -519,7 +523,7 @@ class RuleCollection(BaseCollection): """Return the default BBR collection, which retrieves from building_block_rules/.""" if cls.__default_bbr is None: collection = RuleCollection() - collection.load_directory(DEFAULT_BBR_DIR) + collection.load_directories(DEFAULT_PREBUILT_BBR_DIRS) collection.freeze() cls.__default_bbr = collection @@ -629,8 +633,8 @@ rta_mappings = RtaMappings() __all__ = ( "FILE_PATTERN", - "DEFAULT_RULES_DIR", - "DEFAULT_BBR_DIR", + "DEFAULT_PREBUILT_RULES_DIRS", + "DEFAULT_PREBUILT_BBR_DIRS", "load_github_pr_rules", "DeprecatedCollection", "DeprecatedRule", diff --git a/detection_rules/rule_validators.py b/detection_rules/rule_validators.py index a75a13bda..4534f5440 100644 --- a/detection_rules/rule_validators.py +++ b/detection_rules/rule_validators.py @@ -4,6 +4,7 @@ # 2.0. 
"""Validation logic for rules containing queries.""" +import re from enum import Enum from functools import cached_property, wraps from typing import Any, Callable, Dict, List, Optional, Tuple, Union @@ -18,9 +19,10 @@ from semver import Version import kql from . import ecs, endgame +from .config import CUSTOM_RULES_DIR, load_current_package_version, parse_rules_config +from .custom_schemas import update_auto_generated_schema from .integrations import (get_integration_schema_data, load_integrations_manifests) -from .misc import load_current_package_version from .rule import (EQLRuleData, QueryRuleData, QueryValidator, RuleMeta, TOMLRuleContents, set_eql_config) from .schemas import get_stack_schemas @@ -33,6 +35,7 @@ EQL_ERROR_TYPES = Union[eql.EqlCompileError, eql.EqlSyntaxError, eql.EqlTypeMismatchError] KQL_ERROR_TYPES = Union[kql.KqlCompileError, kql.KqlParseError] +RULES_CONFIG = parse_rules_config() class ExtendedTypeHint(Enum): @@ -107,16 +110,21 @@ class KQLValidator(QueryValidator): @cached_property def ast(self) -> kql.ast.Expression: - return kql.parse(self.query) + return kql.parse(self.query, normalize_kql_keywords=RULES_CONFIG.normalize_kql_keywords) @cached_property def unique_fields(self) -> List[str]: return list(set(str(f) for f in self.ast if isinstance(f, kql.ast.Field))) + def auto_add_field(self, validation_checks_error: kql.errors.KqlParseError, index_or_dataview: str) -> None: + """Auto add a missing field to the schema.""" + field_name = extract_error_field(self.query, validation_checks_error) + update_auto_generated_schema(index_or_dataview, field_name) + def to_eql(self) -> eql.ast.Expression: return kql.to_eql(self.query) - def validate(self, data: QueryRuleData, meta: RuleMeta) -> None: + def validate(self, data: QueryRuleData, meta: RuleMeta, max_attempts: int = 10) -> None: """Validate the query, called from the parent which contains [metadata] information.""" if meta.query_schema_validation is False or meta.maturity == "deprecated": # syntax only, which is done via self.ast @@ -125,20 +133,36 @@ class KQLValidator(QueryValidator): if isinstance(data, QueryRuleData) and data.language != 'lucene': packages_manifest = load_integrations_manifests() package_integrations = TOMLRuleContents.get_packaged_integrations(data, meta, packages_manifest) + for _ in range(max_attempts): + validation_checks = {"stack": None, "integrations": None} + # validate the query against fields within beats + validation_checks["stack"] = self.validate_stack_combos(data, meta) - validation_checks = {"stack": None, "integrations": None} - # validate the query against fields within beats - validation_checks["stack"] = self.validate_stack_combos(data, meta) + if package_integrations: + # validate the query against related integration fields + validation_checks["integrations"] = self.validate_integration(data, meta, package_integrations) - if package_integrations: - # validate the query against related integration fields - validation_checks["integrations"] = self.validate_integration(data, meta, package_integrations) + if (validation_checks["stack"] and not package_integrations): + # if auto add, try auto adding and then call stack_combo validation again + if validation_checks["stack"].error_msg == "Unknown field" and RULES_CONFIG.auto_gen_schema_file: + # auto add the field and re-validate + self.auto_add_field(validation_checks["stack"], data.index_or_dataview[0]) + else: + raise validation_checks["stack"] - if (validation_checks["stack"] and not package_integrations): - raise 
validation_checks["stack"] + if (validation_checks["stack"] and validation_checks["integrations"]): + # if auto add, try auto adding and then call stack_combo validation again + if validation_checks["stack"].error_msg == "Unknown field" and RULES_CONFIG.auto_gen_schema_file: + # auto add the field and re-validate + self.auto_add_field(validation_checks["stack"], data.index_or_dataview[0]) + else: + raise ValueError(f"Error in both stack and integrations checks: {validation_checks}") - if (validation_checks["stack"] and validation_checks["integrations"]): - raise ValueError(f"Error in both stack and integrations checks: {validation_checks}") + else: + break + + else: + raise ValueError(f"Maximum validation attempts exceeded for {data.rule_id} - {data.name}") def validate_stack_combos(self, data: QueryRuleData, meta: RuleMeta) -> Union[KQL_ERROR_TYPES, None, TypeError]: """Validate the query against ECS and beats schemas across stack combinations.""" @@ -147,11 +171,11 @@ class KQLValidator(QueryValidator): ecs_version = mapping['ecs'] err_trailer = f'stack: {stack_version}, beats: {beats_version}, ecs: {ecs_version}' - beat_types, beat_schema, schema = self.get_beats_schema(data.index or [], + beat_types, beat_schema, schema = self.get_beats_schema(data.index_or_dataview, beats_version, ecs_version) try: - kql.parse(self.query, schema=schema) + kql.parse(self.query, schema=schema, normalize_kql_keywords=RULES_CONFIG.normalize_kql_keywords) except kql.KqlParseError as exc: message = exc.error_msg trailer = err_trailer @@ -192,11 +216,17 @@ class KQLValidator(QueryValidator): integration_schema_data["integration"], ) integration_schema = integration_schema_data["schema"] + stack_version = integration_schema_data["stack_version"] # Add non-ecs-schema fields - for index_name in data.index: + for index_name in data.index_or_dataview: integration_schema.update(**ecs.flatten(ecs.get_index_schema(index_name))) + # Add custom schema fields for appropriate stack version + if data.index and CUSTOM_RULES_DIR: + for index_name in data.index_or_dataview: + integration_schema.update(**ecs.flatten(ecs.get_custom_index_schema(index_name, stack_version))) + # Add endpoint schema fields for multi-line fields integration_schema.update(**ecs.flatten(ecs.get_endpoint_schemas())) if integration: @@ -206,7 +236,9 @@ class KQLValidator(QueryValidator): # Validate the query against the schema try: - kql.parse(self.query, schema=integration_schema) + kql.parse(self.query, + schema=integration_schema, + normalize_kql_keywords=RULES_CONFIG.normalize_kql_keywords) except kql.KqlParseError as exc: if exc.error_msg == "Unknown field": field = extract_error_field(self.query, exc) @@ -293,35 +325,66 @@ class EQLValidator(QueryValidator): def unique_fields(self) -> List[str]: return list(set(str(f) for f in self.ast if isinstance(f, eql.ast.Field))) - def validate(self, data: 'QueryRuleData', meta: RuleMeta) -> None: + def auto_add_field(self, validation_checks_error: eql.errors.EqlParseError, index_or_dataview: str) -> None: + """Auto add a missing field to the schema.""" + field_name = extract_error_field(self.query, validation_checks_error) + update_auto_generated_schema(index_or_dataview, field_name) + + def validate(self, data: "QueryRuleData", meta: RuleMeta, max_attempts: int = 10) -> None: """Validate an EQL query while checking TOMLRule.""" if meta.query_schema_validation is False or meta.maturity == "deprecated": # syntax only, which is done via self.ast return - if isinstance(data, QueryRuleData) and data.language 
!= 'lucene': + if isinstance(data, QueryRuleData) and data.language != "lucene": packages_manifest = load_integrations_manifests() package_integrations = TOMLRuleContents.get_packaged_integrations(data, meta, packages_manifest) - validation_checks = {"stack": None, "integrations": None} - # validate the query against fields within beats - validation_checks["stack"] = self.validate_stack_combos(data, meta) + for _ in range(max_attempts): + validation_checks = {"stack": None, "integrations": None} + # validate the query against fields within beats + validation_checks["stack"] = self.validate_stack_combos(data, meta) - if package_integrations: - # validate the query against related integration fields - validation_checks["integrations"] = self.validate_integration(data, meta, package_integrations) + if package_integrations: + # validate the query against related integration fields + validation_checks["integrations"] = self.validate_integration(data, meta, package_integrations) - if validation_checks["stack"] and not package_integrations: - raise validation_checks["stack"] + if validation_checks["stack"] and not package_integrations: + # if auto add, try auto adding and then validate again + if ( + "Field not recognized" in validation_checks["stack"].error_msg + and RULES_CONFIG.auto_gen_schema_file # noqa: W503 + ): + # auto add the field and re-validate + self.auto_add_field(validation_checks["stack"], data.index_or_dataview[0]) + else: + raise validation_checks["stack"] - if validation_checks["stack"] and validation_checks["integrations"]: - raise ValueError(f"Error in both stack and integrations checks: {validation_checks}") + elif validation_checks["stack"] and validation_checks["integrations"]: + # if auto add, try auto adding and then validate again + if ( + "Field not recognized" in validation_checks["stack"].error_msg + and RULES_CONFIG.auto_gen_schema_file # noqa: W503 + ): + # auto add the field and re-validate + self.auto_add_field(validation_checks["stack"], data.index_or_dataview[0]) + else: + raise ValueError(f"Error in both stack and integrations checks: {validation_checks}") - rule_type_config_fields, rule_type_config_validation_failed = \ - self.validate_rule_type_configurations(data, meta) + else: + break + + else: + raise ValueError(f"Maximum validation attempts exceeded for {data.rule_id} - {data.name}") + + rule_type_config_fields, rule_type_config_validation_failed = self.validate_rule_type_configurations( + data, meta + ) if rule_type_config_validation_failed: - raise ValueError(f"""Rule type config values are not ECS compliant, check these values: - {rule_type_config_fields}""") + raise ValueError( + f"""Rule type config values are not ECS compliant, check these values: + {rule_type_config_fields}""" + ) def validate_stack_combos(self, data: QueryRuleData, meta: RuleMeta) -> Union[EQL_ERROR_TYPES, None, ValueError]: """Validate the query against ECS and beats schemas across stack combinations.""" @@ -332,9 +395,9 @@ class EQLValidator(QueryValidator): err_trailer = f'stack: {stack_version}, beats: {beats_version},' \ f'ecs: {ecs_version}, endgame: {endgame_version}' - beat_types, beat_schema, schema = self.get_beats_schema(data.index or [], + beat_types, beat_schema, schema = self.get_beats_schema(data.index_or_dataview, beats_version, ecs_version) - endgame_schema = self.get_endgame_schema(data.index or [], endgame_version) + endgame_schema = self.get_endgame_schema(data.index_or_dataview, endgame_version) eql_schema = ecs.KqlSchema2Eql(schema) # validate query against 
the beats and eql schema @@ -383,10 +446,15 @@ class EQLValidator(QueryValidator): stack_version = integration_schema_data["stack_version"] # add non-ecs-schema fields for edge cases not added to the integration - if data.index: - for index_name in data.index: + if data.index_or_dataview: + for index_name in data.index_or_dataview: integration_schema.update(**ecs.flatten(ecs.get_index_schema(index_name))) + # Add custom schema fields for appropriate stack version + if data.index_or_dataview and CUSTOM_RULES_DIR: + for index_name in data.index_or_dataview: + integration_schema.update(**ecs.flatten(ecs.get_custom_index_schema(index_name, stack_version))) + # add endpoint schema fields for multi-line fields integration_schema.update(**ecs.flatten(ecs.get_endpoint_schemas())) package_schemas[package].update(**integration_schema) @@ -536,4 +604,4 @@ def extract_error_field(source: str, exc: Union[eql.EqlParseError, kql.KqlParseE line = lines[exc.line + mod] start = exc.column stop = start + len(exc.caret.strip()) - return line[start:stop] + return re.sub(r'^\W+|\W+$', '', line[start:stop]) diff --git a/detection_rules/schemas/__init__.py b/detection_rules/schemas/__init__.py index acd66982a..edd79e39f 100644 --- a/detection_rules/schemas/__init__.py +++ b/detection_rules/schemas/__init__.py @@ -10,12 +10,13 @@ from typing import OrderedDict as OrderedDictType import jsonschema from semver import Version -from ..misc import load_current_package_version -from ..utils import cached, get_etc_path, load_etc_dump +from ..config import load_current_package_version, parse_rules_config +from ..utils import cached, get_etc_path from . import definitions from .rta_schema import validate_rta_mapping from .stack_compat import get_incompatible_fields + __all__ = ( "SCHEMA_DIR", "definitions", @@ -28,13 +29,14 @@ __all__ = ( "all_versions", ) +RULES_CONFIG = parse_rules_config() SCHEMA_DIR = get_etc_path("api_schemas") migrations = {} def all_versions() -> List[str]: """Get all known stack versions.""" - return [str(v) for v in sorted(migrations)] + return [str(v) for v in sorted(migrations, key=lambda x: Version.parse(x, optional_minor_and_patch=True))] def migrate(version: str): @@ -311,12 +313,15 @@ def downgrade(api_contents: dict, target_version: str, current_version: Optional @cached def load_stack_schema_map() -> dict: - return load_etc_dump('stack-schema-map.yaml') + return RULES_CONFIG.stack_schema_map @cached def get_stack_schemas(stack_version: Optional[str] = '0.0.0') -> OrderedDictType[str, dict]: - """Return all ECS + beats to stack versions for every stack version >= specified stack version and <= package.""" + """ + Return all ECS, beats, and custom stack versions for every stack version. + Only versions >= specified stack version and <= package are returned. + """ stack_version = Version.parse(stack_version or '0.0.0', optional_minor_and_patch=True) current_package = Version.parse(load_current_package_version(), optional_minor_and_patch=True) diff --git a/detection_rules/schemas/definitions.py b/detection_rules/schemas/definitions.py index 67cba8e1c..29e0ceefc 100644 --- a/detection_rules/schemas/definitions.py +++ b/detection_rules/schemas/definitions.py @@ -4,13 +4,48 @@ # 2.0. 
"""Custom shared definitions for schemas.""" +import os +from typing import Final, List, Literal -from typing import List, Literal, Final - -from marshmallow import validate +from marshmallow import fields, validate from marshmallow_dataclass import NewType from semver import Version +from detection_rules.config import CUSTOM_RULES_DIR + + +def elastic_timeline_template_id_validator(): + """Custom validator for Timeline Template IDs.""" + def validator(value): + if os.environ.get('DR_BYPASS_TIMELINE_TEMPLATE_VALIDATION') is not None: + fields.String().deserialize(value) + else: + validate.OneOf(list(TIMELINE_TEMPLATES))(value) + + return validator + + +def elastic_timeline_template_title_validator(): + """Custom validator for Timeline Template Titles.""" + def validator(value): + if os.environ.get('DR_BYPASS_TIMELINE_TEMPLATE_VALIDATION') is not None: + fields.String().deserialize(value) + else: + validate.OneOf(TIMELINE_TEMPLATES.values())(value) + + return validator + + +def elastic_rule_name_regexp(pattern): + """Custom validator for rule names.""" + def validator(value): + if not CUSTOM_RULES_DIR: + validate.Regexp(pattern)(value) + else: + fields.String().deserialize(value) + return validator + + ASSET_TYPE = "security_rule" SAVED_OBJECT_TYPE = "security-rule" @@ -59,6 +94,8 @@ QUERY_FIELD_OP_EXCEPTIONS = ["powershell.file.script_block_text"] # we had a bad rule ID make it in before tightening up the pattern, and so we have to let it bypass KNOWN_BAD_RULE_IDS = Literal['119c8877-8613-416d-a98a-96b6664ee73a5'] KNOWN_BAD_DEPRECATED_DATES = Literal['2021-03-03'] +# Known Null values that cannot be handled in TOML due to lack of Null value support via compound dicts +KNOWN_NULL_ENTRIES = [{"rule.actions": "frequency.throttle"}] OPERATORS = ['equals'] TIMELINE_TEMPLATES: Final[dict] = { @@ -149,6 +186,12 @@ CardinalityFields = NewType('CardinalityFields', List[NonEmptyStr], validate=val CodeString = NewType("CodeString", str) ConditionSemVer = NewType('ConditionSemVer', str, validate=validate.Regexp(CONDITION_VERSION_PATTERN)) Date = NewType('Date', str, validate=validate.Regexp(DATE_PATTERN)) +ExceptionEntryOperator = Literal['included', 'excluded'] +ExceptionEntryType = Literal['match', 'match_any', 'exists', 'list', 'wildcard', 'nested'] +ExceptionNamespaceType = Literal['single', 'agnostic'] +ExceptionItemEndpointTags = Literal['endpoint', 'os:windows', 'os:linux', 'os:macos'] +ExceptionContainerType = Literal['detection', 'endpoint', 'rule_default'] +ExceptionItemType = Literal['simple'] FilterLanguages = Literal["eql", "esql", "kuery", "lucene"] Interval = NewType('Interval', str, validate=validate.Regexp(INTERVAL_PATTERN)) InvestigateProviderQueryType = Literal["phrase", "range"] @@ -161,7 +204,7 @@ Operator = Literal['equals'] OSType = Literal['windows', 'linux', 'macos'] PositiveInteger = NewType('PositiveInteger', int, validate=validate.Range(min=1)) RiskScore = NewType("MaxSignals", int, validate=validate.Range(min=1, max=100)) -RuleName = NewType('RuleName', str, validate=validate.Regexp(NAME_PATTERN)) +RuleName = NewType('RuleName', str, validate=elastic_rule_name_regexp(NAME_PATTERN)) RuleType = Literal['query', 'saved_query', 'machine_learning', 'eql', 'esql', 'threshold', 'threat_match', 'new_terms'] SemVer = NewType('SemVer', str, validate=validate.Regexp(VERSION_PATTERN)) SemVerMinorOnly = NewType('SemVerFullStrict', str, validate=validate.Regexp(MINOR_SEMVER)) @@ -172,8 +215,8 @@ StoreType = Literal['appState', 'globalState'] TacticURL = NewType('TacticURL', str, 
validate=validate.Regexp(TACTIC_URL)) TechniqueURL = NewType('TechniqueURL', str, validate=validate.Regexp(TECHNIQUE_URL)) ThresholdValue = NewType("ThresholdValue", int, validate=validate.Range(min=1)) -TimelineTemplateId = NewType('TimelineTemplateId', str, validate=validate.OneOf(list(TIMELINE_TEMPLATES))) -TimelineTemplateTitle = NewType('TimelineTemplateTitle', str, validate=validate.OneOf(TIMELINE_TEMPLATES.values())) +TimelineTemplateId = NewType('TimelineTemplateId', str, validate=elastic_timeline_template_id_validator()) +TimelineTemplateTitle = NewType('TimelineTemplateTitle', str, validate=elastic_timeline_template_title_validator()) TransformTypes = Literal["osquery", "investigate"] UUIDString = NewType('UUIDString', str, validate=validate.Regexp(UUID_PATTERN)) BuildingBlockType = Literal['default'] @@ -182,3 +225,23 @@ BuildingBlockType = Literal['default'] MachineLearningType = getattr(Literal, '__getitem__')(tuple(MACHINE_LEARNING_PACKAGES)) # noqa: E999 MachineLearningTypeLower = getattr(Literal, '__getitem__')( tuple(map(str.lower, MACHINE_LEARNING_PACKAGES))) # noqa: E999 +## + +ActionTypeId = Literal[ + ".slack", ".slack_api", ".email", ".index", ".pagerduty", ".swimlane", ".webhook", ".servicenow", + ".servicenow-itom", ".servicenow-sir", ".jira", ".resilient", ".opsgenie", ".teams", ".torq", ".tines", + ".d3security" +] +EsDataTypes = Literal[ + 'binary', 'boolean', + 'keyword', 'constant_keyword', 'wildcard', + 'long', 'integer', 'short', 'byte', 'double', 'float', 'half_float', 'scaled_float', 'unsigned_long', + 'date', 'date_nanos', + 'alias', 'object', 'flatten', 'nested', 'join', + 'integer_range', 'float_range', 'long_range', 'double_range', 'date_range', 'ip_range', + 'ip', 'version', 'murmur3', 'aggregate_metric_double', 'histogram', + 'text', 'text_match_only', 'annotated-text', 'completion', 'search_as_you_type', 'token_count', + 'dense_vector', 'sparse_vector', 'rank_feature', 'rank_features', + 'geo_point', 'geo_shape', 'point', 'shape', + 'percolator' +] diff --git a/detection_rules/utils.py b/detection_rules/utils.py index 701f4132a..abffc8885 100644 --- a/detection_rules/utils.py +++ b/detection_rules/utils.py @@ -30,6 +30,7 @@ from eql.utils import load_dump, stream_json_lines import kql + CURR_DIR = Path(__file__).resolve().parent ROOT_DIR = CURR_DIR.parent ETC_DIR = ROOT_DIR / "detection_rules" / "etc" @@ -85,6 +86,17 @@ def get_json_iter(f): return data +def get_nested_value(dictionary, compound_key): + """Get a nested value from a dictionary.""" + keys = compound_key.split('.') + for key in keys: + if isinstance(dictionary, dict): + dictionary = dictionary.get(key) + else: + return None + return dictionary + + def get_path(*paths) -> Path: """Get a file by relative path.""" return ROOT_DIR.joinpath(*paths) @@ -126,6 +138,22 @@ def save_etc_dump(contents, *path, **kwargs): return eql.utils.save_dump(contents, path) +def set_all_validation_bypass(env_value: bool = False): + """Set all validation bypass environment variables.""" + os.environ['DR_BYPASS_NOTE_VALIDATION_AND_PARSE'] = str(env_value) + os.environ['DR_BYPASS_BBR_LOOKBACK_VALIDATION'] = str(env_value) + os.environ['DR_BYPASS_TAGS_VALIDATION'] = str(env_value) + os.environ['DR_BYPASS_TIMELINE_TEMPLATE_VALIDATION'] = str(env_value) + + +def set_nested_value(dictionary, compound_key, value): + """Set a nested value in a dictionary.""" + keys = compound_key.split('.') + for key in keys[:-1]: + dictionary = dictionary.setdefault(key, {}) + dictionary[keys[-1]] = value + + def 
gzip_compress(contents) -> bytes: gz_file = io.BytesIO() @@ -240,9 +268,9 @@ def convert_time_span(span: str) -> int: return eql.ast.TimeRange(amount, unit).as_milliseconds() -def evaluate(rule, events): +def evaluate(rule, events, normalize_kql_keywords: bool = False): """Evaluate a query against events.""" - evaluator = kql.get_evaluator(kql.parse(rule.query)) + evaluator = kql.get_evaluator(kql.parse(rule.query, normalize_kql_keywords=normalize_kql_keywords)) filtered = list(filter(evaluator, events)) return filtered diff --git a/detection_rules/version_lock.py b/detection_rules/version_lock.py index 227021acd..36c62633b 100644 --- a/detection_rules/version_lock.py +++ b/detection_rules/version_lock.py @@ -11,15 +11,14 @@ from typing import ClassVar, Dict, List, Optional, Union import click from semver import Version +from .config import parse_rules_config from .mixins import LockDataclassMixin, MarshmallowDataclassMixin from .rule_loader import RuleCollection from .schemas import definitions -from .utils import cached, get_etc_path +from .utils import cached -ETC_VERSION_LOCK_FILE = "version.lock.json" -ETC_VERSION_LOCK_PATH = get_etc_path() / ETC_VERSION_LOCK_FILE -ETC_DEPRECATED_RULES_FILE = "deprecated_rules.json" -ETC_DEPRECATED_RULES_PATH = get_etc_path() / ETC_DEPRECATED_RULES_FILE + +RULES_CONFIG = parse_rules_config() # This was the original version the lock was created under. This constant has been replaced by # schemas.get_min_supported_stack_version to dynamically determine the minimum @@ -53,7 +52,7 @@ class VersionLockFileEntry(MarshmallowDataclassMixin, BaseEntry): class VersionLockFile(LockDataclassMixin): """Schema for the full version lock file.""" data: Dict[Union[definitions.UUIDString, definitions.KNOWN_BAD_RULE_IDS], VersionLockFileEntry] - file_path: ClassVar[Path] = ETC_VERSION_LOCK_PATH + file_path: ClassVar[Path] = RULES_CONFIG.version_lock_file def __contains__(self, rule_id: str): """Check if a rule is in the map by comparing IDs.""" @@ -78,7 +77,7 @@ class DeprecatedRulesEntry(MarshmallowDataclassMixin): class DeprecatedRulesFile(LockDataclassMixin): """Schema for the full deprecated rules file.""" data: Dict[Union[definitions.UUIDString, definitions.KNOWN_BAD_RULE_IDS], DeprecatedRulesEntry] - file_path: ClassVar[Path] = ETC_DEPRECATED_RULES_PATH + file_path: ClassVar[Path] = RULES_CONFIG.deprecated_rules_file def __contains__(self, rule_id: str): """Check if a rule is in the map by comparing IDs.""" @@ -124,7 +123,12 @@ class VersionLock: def __init__(self, version_lock_file: Optional[Path] = None, deprecated_lock_file: Optional[Path] = None, version_lock: Optional[dict] = None, deprecated_lock: Optional[dict] = None, - name: Optional[str] = None): + name: Optional[str] = None, invalidated: Optional[bool] = False): + + if invalidated: + err_msg = "This VersionLock configuration is not valid when configured to bypass the version lock."
+ raise NotImplementedError(err_msg) + assert (version_lock_file or version_lock), 'Must provide version lock file or contents' assert (deprecated_lock_file or deprecated_lock), 'Must provide deprecated lock file or contents' @@ -182,7 +186,7 @@ class VersionLock: already_deprecated = set(current_deprecated_lock) deprecated_rules = set(rules.deprecated.id_map) - new_rules = set(rule.id for rule in rules if rule.contents.latest_version is None) - deprecated_rules + new_rules = set(rule.id for rule in rules if rule.contents.saved_version is None) - deprecated_rules changed_rules = set(rule.id for rule in rules if rule.contents.is_dirty) - deprecated_rules # manage deprecated rules @@ -331,4 +335,5 @@ class VersionLock: return changed_rules, list(new_rules), newly_deprecated -default_version_lock = VersionLock(ETC_VERSION_LOCK_PATH, ETC_DEPRECATED_RULES_PATH, name='default') +name = str(RULES_CONFIG.version_lock_file) +loaded_version_lock = VersionLock(RULES_CONFIG.version_lock_file, RULES_CONFIG.deprecated_rules_file, name=name) diff --git a/docs/custom-rules.md b/docs/custom-rules.md new file mode 100644 index 000000000..1048b68f7 --- /dev/null +++ b/docs/custom-rules.md @@ -0,0 +1,214 @@ +# Custom Rules + +A custom rule is any rule that is not maintained by Elastic under `rules/` or `rules_building_block`. These docs are intended +to show how to manage custom rules using this repository. + +For a more detailed breakdown and explanation of employing a detections-as-code approach, refer to the +[dac-reference](https://dac-reference.readthedocs.io/en/latest/index.html). + + +## Defining a custom rule config and directory structure + +The simplest way to maintain custom rules alongside the existing prebuilt rules in the repo is to decouple where the rules +are stored to minimize VCS conflicts and overlap. This is accomplished by defining a custom rules directory using a config file. + +### Understanding the structure + +``` +custom-rules +├── _config.yaml +├── rules +│   ├── example_rule_1.toml +│   └── example_rule_2.toml +├── etc +│   ├── deprecated_rules.json +│   ├── packages.yaml +│   ├── stack-schema-map.yaml +│   ├── test_config.yaml +│   └── version.lock.json +├── actions +│   ├── action_1.toml +│   └── action_2.toml +├── action_connectors +│   ├── action_connector_1.toml +│   └── action_connector_2.toml +└── exceptions +    ├── exception_1.toml +    └── exception_2.toml +``` + +This structure represents a portable set of custom rules. This is just an example, and the exact locations of the files +should be defined in the `_config.yaml` file. Refer to the details in the default +[_config.yaml](../detection_rules/etc/_config.yaml) for more information.
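+
+For example, a skeleton like the one above can be bootstrapped as follows (a sketch only; `./custom-rules` is a hypothetical location, and the `setup-config` command and `CUSTOM_RULES_DIR` variable are described below):
+
+```console
+python -m detection_rules custom-rules setup-config custom-rules
+export CUSTOM_RULES_DIR=./custom-rules
+```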
+ +* deprecated_rules.json - tracks all deprecated rules (optional) +* packages.yaml - information for building packages (mostly optional, but the current version is required) +* stack-schema-map.yaml - a mapping of schemas for query validation +* test_config.yaml - a config file for testing (optional) +* version.lock.json - this tracks versioning for rules (optional depending on versioning strategy) + +To initialize a custom rule directory, run `python -m detection_rules custom-rules setup-config ` + +### Defining a config + +```yaml +rule_dirs: + - rules +files: + deprecated_rules: deprecated_rules.json + packages: packages.yaml + stack_schema_map: stack-schema-map.yaml + version_lock: version.lock.json +directories: + action_dir: actions + action_connector_dir: action_connectors + exception_dir: exceptions +``` + +Some notes: + +* The paths in this file are relative to the custom rules directory (CUSTOM_RULES_DIR/) +* Refer to each original [source file](../detection_rules/etc/example_test_config.yaml) for purpose and proper formatting +* You can also add an optional `bbr_rules_dirs` section for custom BBR rules. +* To bypass the version lock versioning strategy (the version lock file), set the optional `bypass_version_lock` value to `True`. +* To normalize the capitalization of KQL keywords in KQL rule queries, set the optional `normalize_kql_keywords` value to `True` or `False` as desired. +* To manage exceptions tied to rules, set an exceptions directory using the optional `exception_dir` value (included above). If an exceptions directory is explicitly specified in a CLI command, the config value will be ignored. +* To manage action connectors tied to rules, set an action-connectors directory using the optional `action_connector_dir` value (included above). If an action_connectors directory is explicitly specified in a CLI command, the config value will be ignored. +* To turn on automatic schema generation for non-ECS fields via custom schemas, add `auto_gen_schema_file: `. This will generate a schema file at the specified location, used to add entries for each field and index combination that is not already in a known schema. It will also automatically be added to your stack-schema-map.yaml file when using a custom rules directory and config. +* Kibana action items are currently included in the rule TOML files themselves. At a later date, we may allow bulk editing of rule action items through separate action TOML files; the `action_dir` config key is left available for that later implementation. For now, use the bulk actions UI in Kibana to add rule actions in bulk. +* To bulk disable Elastic validation for optional fields, use the following line: `bypass_optional_elastic_validation: True`. + + +When using the repo, set the environment variable `CUSTOM_RULES_DIR=` + + +### Defining a testing config + +```yaml +testing: + config: etc/example_test_config.yaml +``` + +This points to the testing config file (see the example under detection_rules/etc/example_test_config.yaml) and can either +be set in `_config.yaml` or as the environment variable `DETECTION_RULES_TEST_CONFIG`, with precedence going to the +environment variable if both are set. Having both options allows for configuring testing on prebuilt Elastic rules +without specifying a rules _config.yaml. + + +* Note: If set in this file, the path should be relative to the location of this config.
+  If passed as an environment variable, it should be the full path.
+
+
+### How the config is used and its designed portability
+
+This repo is designed to operate on certain expectations of structure and config files. Parsing the config as shown
+below makes the design portable: behavior is driven by the defined information rather than by static expectations.
+
+```python
+RULES_CONFIG = parse_rules_config()
+
+# which then makes the following attributes available for use
+
+@dataclass
+class RulesConfig:
+    """Detection rules config file."""
+    deprecated_rules_file: Path
+    deprecated_rules: Dict[str, dict]
+    packages_file: Path
+    packages: Dict[str, dict]
+    rule_dirs: List[Path]
+    stack_schema_map_file: Path
+    stack_schema_map: Dict[str, dict]
+    test_config: TestConfig
+    version_lock_file: Path
+    version_lock: Dict[str, dict]
+
+    action_dir: Optional[Path] = None
+    action_connector_dir: Optional[Path] = None
+    auto_gen_schema_file: Optional[Path] = None
+    bbr_rules_dirs: Optional[List[Path]] = field(default_factory=list)
+    bypass_version_lock: bool = False
+    exception_dir: Optional[Path] = None
+    normalize_kql_keywords: bool = True
+    bypass_optional_elastic_validation: bool = False
+
+# using the stack_schema_map
+RULES_CONFIG.stack_schema_map
+```
+
+### Version Strategy Warning
+
+- General (`bypass_version_lock = False`)
+  - Default
+  - Versions from Kibana or the TOML file are ignored
+  - Version lock file usage is permitted
+- General (`bypass_version_lock = True`)
+  - Must be explicitly set in the config
+  - Versions from Kibana or the TOML file are used
+  - Version lock file usage is not permitted
+- Tactical Warning Messages
+  - Rule import to a TOML file will skip the version and revision fields when supplied (*rule_prompt* & *import_rules_into_repo*) if `bypass_version_lock = False`. No warning message is issued.
+  - The rule version lock will not be updated or used if `bypass_version_lock = True` when building a release package (*build_release*). A warning message is issued.
+  - If versions are in the TOML file and `bypass_version_lock = False`, the versions in the TOML file will not be used (*autobumped_version*). A warning message is issued.
+  - If `bypass_version_lock = False`, when autobumping the version, the version lock file is checked and the version is incremented if the rule is dirty (*autobumped_version*); otherwise the supplied version is used. No warning message is issued.
+  - If `bypass_version_lock = True`, updating the version lock file is disabled (*update_lock_versions*). A warning message is issued.
+  - If `bypass_version_lock = True`, loading the version lock file is disabled and skipped (*from_dict*, *load_from_file*, *manage_versions*, *test_version_lock_has_nested_previous*). A warning message is issued.
+
+### Custom actions, action connectors, and exceptions lists
+
+To convert these to TOML, you can do the following:
+
+1. Export the ndjson from Kibana into a `dict`, or load it from Kibana.
+
+```python
+from detection_rules.action import Action, ActionMeta, TOMLActionContents, TOMLAction
+
+action = Action.from_dict(action_dict)
+meta = ActionMeta(...)
+action_contents = TOMLActionContents(action=[action], meta=meta)
+toml_action = TOMLAction(path=Path, contents=action_contents)
+```
+
+Mimic a similar approach for exception lists. Both can then be managed with the `GenericLoader`:
+
+```python
+from detection_rules.generic_loader import GenericLoader
+
+loader = GenericLoader()
+loader.load_directory(...)
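+# (sketch) the directory argument is elided above; pass the path holding your
+# exception or action-connector TOML files, after which the loaded items are
+# available on the loader object (see detection_rules/generic_loader.py for
+# the full GenericLoader interface)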
+```
+
+### Using Custom Schemas
+
+You can specify custom-defined schemas for custom indexes using the `etc/stack-schema-map.yaml` file in your custom rules directory.
+
+To add a custom schema, add a sub-key in the `etc/stack-schema-map.yaml` file under the stack version to which you wish the custom
+schema to apply. For its value, reference the JSON file, or folder of files, where your schema is defined. Please note: to validate
+rules with a `min_stack_version` set, the `stack-schema-map.yaml` needs an entry for the highest version.
+
+Example:
+
+```yaml
+8.14.0:
+  beats: 8.12.2
+  ecs: 8.11.0
+  endgame: 8.4.0
+  custom: schemas/custom-schema.json
+```
+
+Note: the `custom` key can be any alphanumeric value except `beats`, `ecs`, or `endgame`, as these are reserved terms.
+
+Note: Remember, if you want to turn on automatic schema generation for non-ECS fields via custom schemas, add `auto_gen_schema_file: <path>`.
+
+Example schema JSON:
+
+```json
+{
+    "custom-index*": {
+        "process.NewCustomValue": "keyword",
+        "process.AnotherCustomValue": "keyword"
+    }
+}
+```
+
+This can then be used in a rule query by adding the index to the applicable rule, e.g. `index = ["logs-endpoint.events.*", "custom-index*"]`.
+The index can then be used in the query, e.g. `process where host.os.type == "linux" and process.NewCustomValue == "GoodValue"`.
\ No newline at end of file
diff --git a/docs/developing.md b/docs/developing.md
index b47eb0104..815ad37f1 100644
--- a/docs/developing.md
+++ b/docs/developing.md
@@ -33,10 +33,21 @@
 relativeFrom = "now-48h/h"
 relativeTo = "now"
 ```

-Other transform suppoprt can be found under
+Other transform support can be found under
 `python -m detection-rules dev transforms -h`

+#### Testing bypasses with environment variables
+
+Using the environment variable `DR_BYPASS_NOTE_VALIDATION_AND_PARSE` will bypass the Detection Rules validation on the `note` field in toml files.
+
+Using the environment variable `DR_BYPASS_BBR_LOOKBACK_VALIDATION` will bypass the Detection Rules lookback and interval validation
+on the building block rules.
+
+Using the environment variable `DR_BYPASS_TAGS_VALIDATION` will bypass the Detection Rules Unit Tests on the `tags` field in toml files.
+
+Using the environment variable `DR_BYPASS_TIMELINE_TEMPLATE_VALIDATION` will bypass the timeline template id and title validation for rules.
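+
+For example, to run the unit tests with the `tags` validation bypassed (a sketch; it assumes the bypass only requires the variable to be set):
+
+```console
+DR_BYPASS_TAGS_VALIDATION=1 python -m detection_rules test
+```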
+
 ## Using the `RuleResource` methods built on detections `_bulk_action` APIs
diff --git a/docs/rule_insights.md b/docs/rule-insights.md
similarity index 100%
rename from docs/rule_insights.md
rename to docs/rule-insights.md
diff --git a/lib/kibana/kibana/resources.py b/lib/kibana/kibana/resources.py
index c056d4235..a46d2530f 100644
--- a/lib/kibana/kibana/resources.py
+++ b/lib/kibana/kibana/resources.py
@@ -230,17 +230,28 @@ class RuleResource(BaseResource):
             raise

     @classmethod
-    def import_rules(cls, rules: List[dict], overwrite: bool = False, overwrite_exceptions: bool = False,
-                     overwrite_action_connectors: bool = False) -> (dict, list, List[Optional['RuleResource']]):
+    def import_rules(
+        cls,
+        rules: List[dict],
+        exceptions: List[List[dict]] = [],
+        action_connectors: List[List[dict]] = [],
+        overwrite: bool = False,
+        overwrite_exceptions: bool = False,
+        overwrite_action_connectors: bool = False,
+    ) -> (dict, list, List[Optional["RuleResource"]]):
         """Import a list of rules into Kibana using the _import API and return the response and successful imports."""
         url = f'{cls.BASE_URI}/_import'
         params = dict(
            overwrite=stringify_bool(overwrite),
            overwrite_exceptions=stringify_bool(overwrite_exceptions),
-           overwrite_action_connectors=stringify_bool(overwrite_action_connectors)
+           overwrite_action_connectors=stringify_bool(overwrite_action_connectors),
         )
         rule_ids = [r['rule_id'] for r in rules]
-        headers, raw_data = Kibana.ndjson_file_data_prep(rules, "import.ndjson")
+        flattened_exceptions = [e for sublist in exceptions for e in sublist]
+        flattened_actions_connectors = [a for sublist in action_connectors for a in sublist]
+        headers, raw_data = Kibana.ndjson_file_data_prep(
+            rules + flattened_exceptions + flattened_actions_connectors, "import.ndjson"
+        )
         response = Kibana.current().post(url, headers=headers, params=params, raw_data=raw_data)
         errors = response.get("errors", [])
         error_rule_ids = [e['rule_id'] for e in errors]
diff --git a/lib/kibana/pyproject.toml b/lib/kibana/pyproject.toml
index e2bfb2ef7..903b916a8 100644
--- a/lib/kibana/pyproject.toml
+++ b/lib/kibana/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "detection-rules-kibana"
-version = "0.3.0"
+version = "0.4.0"
 description = "Kibana API utilities for Elastic Detection Rules"
 license = {text = "Elastic License v2"}
 keywords = ["Elastic", "Kibana", "Detection Rules", "Security", "Elasticsearch"]
diff --git a/tests/base.py b/tests/base.py
index 3e7567a56..b56ede1c5 100644
--- a/tests/base.py
+++ b/tests/base.py
@@ -4,11 +4,13 @@
 # 2.0.
"""Shared resources for tests.""" - +import os import unittest +from pathlib import Path from functools import lru_cache from typing import Union +from detection_rules.config import parse_rules_config from detection_rules.rule import TOMLRule from detection_rules.rule_loader import DeprecatedCollection, DeprecatedRule, RuleCollection, production_filter @@ -17,15 +19,25 @@ RULE_LOADER_FAIL = False RULE_LOADER_FAIL_MSG = None RULE_LOADER_FAIL_RAISED = False +CUSTOM_RULES_DIR = os.getenv('CUSTOM_RULES_DIR', None) +RULES_CONFIG = parse_rules_config() + @lru_cache -def default_rules() -> RuleCollection: +def load_rules() -> RuleCollection: + if CUSTOM_RULES_DIR: + rc = RuleCollection() + path = Path(CUSTOM_RULES_DIR) + assert path.exists(), f'Custom rules directory {path} does not exist' + rc.load_directories(directories=RULES_CONFIG.rule_dirs) + rc.freeze() + return rc return RuleCollection.default() -@lru_cache -def default_bbr() -> RuleCollection: - return RuleCollection.default_bbr() +def default_bbr(rc: RuleCollection) -> RuleCollection: + rules = [r for r in rc.rules if 'rules_building_block' in r.path.parent.parts] + return RuleCollection(rules=rules) class BaseRuleTest(unittest.TestCase): @@ -44,17 +56,19 @@ class BaseRuleTest(unittest.TestCase): if not RULE_LOADER_FAIL: try: - rc = default_rules() - rc_bbr = default_bbr() - cls.all_rules = rc.rules - cls.rule_lookup = rc.id_map - cls.production_rules = rc.filter(production_filter) + rc = load_rules() + rc_bbr = default_bbr(rc) + cls.rc = rc + cls.all_rules = rc.filter(production_filter) cls.bbr = rc_bbr.rules cls.deprecated_rules: DeprecatedCollection = rc.deprecated except Exception as e: RULE_LOADER_FAIL = True RULE_LOADER_FAIL_MSG = str(e) + cls.custom_dir = Path(CUSTOM_RULES_DIR).resolve() if CUSTOM_RULES_DIR else None + cls.rules_config = RULES_CONFIG + @staticmethod def rule_str(rule: Union[DeprecatedRule, TOMLRule], trailer=' ->') -> str: return f'{rule.id} - {rule.name}{trailer or ""}' diff --git a/tests/test_all_rules.py b/tests/test_all_rules.py index 3a740c8ed..7918a3a7c 100644 --- a/tests/test_all_rules.py +++ b/tests/test_all_rules.py @@ -19,18 +19,18 @@ from semver import Version import kql from detection_rules import attack +from detection_rules.config import load_current_package_version from detection_rules.integrations import (find_latest_compatible_version, load_integrations_manifests, load_integrations_schemas) -from detection_rules.misc import load_current_package_version from detection_rules.packaging import current_stack_version from detection_rules.rule import (AlertSuppressionMapping, QueryRuleData, QueryValidator, ThresholdAlertSuppression, TOMLRuleContents) -from detection_rules.rule_loader import FILE_PATTERN +from detection_rules.rule_loader import FILE_PATTERN, RULES_CONFIG from detection_rules.rule_validators import EQLValidator, KQLValidator from detection_rules.schemas import definitions, get_min_supported_stack_version, get_stack_schemas from detection_rules.utils import INTEGRATION_RULE_DIR, PatchedTemplate, get_path, load_etc_dump, make_git -from detection_rules.version_lock import default_version_lock +from detection_rules.version_lock import loaded_version_lock from rta import get_available_tests from .base import BaseRuleTest @@ -60,14 +60,14 @@ class TestValidRules(BaseRuleTest): def test_all_rule_queries_optimized(self): """Ensure that every rule query is in optimized form.""" - for rule in self.production_rules: + for rule in self.all_rules: if ( rule.contents.data.get("language") == "kuery" and 
                 not any(
                     item in rule.contents.data.query for item in definitions.QUERY_FIELD_OP_EXCEPTIONS
                 )
             ):
                 source = rule.contents.data.query
-                tree = kql.parse(source, optimize=False)
+                tree = kql.parse(source, optimize=False, normalize_kql_keywords=RULES_CONFIG.normalize_kql_keywords)
                 optimized = tree.optimize(recursive=True)
                 err_message = f'\n{self.rule_str(rule)} Query not optimized for rule\n' \
                               f'Expected: {optimized}\nActual: {source}'
@@ -78,7 +78,7 @@ class TestValidRules(BaseRuleTest):
         mappings = load_etc_dump('rule-mapping.yaml')
         ttp_names = sorted(get_available_tests())

-        for rule in self.production_rules:
+        for rule in self.all_rules:
             if isinstance(rule.contents.data, QueryRuleData) and rule.id in mappings:
                 matching_rta = mappings[rule.id].get('rta_name')
@@ -101,7 +101,7 @@

     def test_rule_type_changes(self):
         """Test that a rule type did not change for a locked version"""
-        default_version_lock.manage_versions(self.production_rules)
+        loaded_version_lock.manage_versions(self.rc)

     def test_bbr_validation(self):
         base_fields = {
@@ -584,25 +584,28 @@ class TestRuleMetadata(BaseRuleTest):
             err_msg = f'The following rules have an updated_date older than the creation_date\n {rules_str}'
             self.fail(err_msg)

+    @unittest.skipIf(RULES_CONFIG.bypass_version_lock, "Skipping deprecated version lock check")
     def test_deprecated_rules(self):
         """Test that deprecated rules are properly handled."""
-        versions = default_version_lock.version_lock
-        deprecations = load_etc_dump('deprecated_rules.json')
+        versions = loaded_version_lock.version_lock
+        deprecations = self.rules_config.deprecated_rules
         deprecated_rules = {}
-        rules_path = get_path('rules')
-        deprecated_path = get_path("rules", "_deprecated")
+        rules_paths = RULES_CONFIG.rule_dirs

         misplaced_rules = []
         for r in self.all_rules:
             if "rules_building_block" in str(r.path):
                 if r.contents.metadata.maturity == "deprecated":
                     misplaced_rules.append(r)
-            elif r.path.relative_to(rules_path).parts[-2] == "_deprecated" \
-                    and r.contents.metadata.maturity != "deprecated":
-                misplaced_rules.append(r)
+            else:
+                for rules_path in rules_paths:
+                    if "_deprecated" in r.path.relative_to(rules_path).parts \
+                            and r.contents.metadata.maturity != "deprecated":
+                        misplaced_rules.append(r)
+                        break

         misplaced = '\n'.join(f'{self.rule_str(r)} {r.contents.metadata.maturity}' for r in misplaced_rules)
-        err_str = f'The following rules are stored in {deprecated_path} but are not marked as deprecated:\n{misplaced}'
+        err_str = f'The following rules are stored in _deprecated but are not marked as deprecated:\n{misplaced}'
         self.assertListEqual(misplaced_rules, [], err_str)

         for rule in self.deprecated_rules:
@@ -615,7 +618,7 @@
             rule_path = rule.path.relative_to(rules_path)

             err_msg = f'{self.rule_str(rule)} deprecated rules should be stored in ' \
-                      f'"{deprecated_path}" folder'
+                      f'"{rule_path.parent / "_deprecated"}" folder'
             self.assertEqual('_deprecated', rule_path.parts[-2], err_msg)

             err_msg = f'{self.rule_str(rule)} missing deprecation date'
@@ -700,7 +703,7 @@
         packages_manifest = load_integrations_manifests()
         valid_integration_folders = [p.name for p in list(Path(INTEGRATION_RULE_DIR).glob("*")) if p.name != 'endpoint']

-        for rule in self.production_rules:
+        for rule in self.all_rules:
             # TODO: temp bypass for esql rules; once parsed, we should be able to look for indexes via `FROM`
             if not rule.contents.data.get('index'):
                 continue
@@ -975,7 +978,7 @@ class TestIntegrationRules(BaseRuleTest):
     def test_rule_demotions(self):
         """Test to ensure a locked rule is not dropped to development, only deprecated"""
-        versions = default_version_lock.version_lock
+        versions = loaded_version_lock.version_lock
         failures = []

         for rule in self.all_rules:
@@ -1146,10 +1149,10 @@ class TestRuleTiming(BaseRuleTest):

 class TestLicense(BaseRuleTest):
     """Test rule license."""
-
+    @unittest.skipIf(os.environ.get('CUSTOM_RULES_DIR'), 'Skipping test for custom rules.')
     def test_elastic_license_only_v2(self):
         """Test to ensure that production rules with the elastic license are only v2."""
-        for rule in self.production_rules:
+        for rule in self.all_rules:
             rule_license = rule.contents.data.license
             if 'elastic license' in rule_license.lower():
                 err_msg = f'{self.rule_str(rule)} If Elastic License is used, only v2 should be used'
@@ -1184,7 +1187,7 @@ class TestBuildTimeFields(BaseRuleTest):
         min_supported_stack_version = get_min_supported_stack_version()

         invalids = []
-        for rule in self.production_rules:
+        for rule in self.all_rules:
             min_stack = rule.contents.metadata.min_stack_version
             build_fields = rule.contents.data.get_build_fields()
@@ -1250,7 +1253,7 @@ class TestNoteMarkdownPlugins(BaseRuleTest):
                     ' introduced in Elastic Stack version 8.8.0. Older Elastic Stack versions will display '
                     'unrendered Markdown in this guide.')

-        for rule in self.production_rules.rules:
+        for rule in self.all_rules:
             if not rule.contents.get('transform'):
                 continue
@@ -1266,7 +1269,7 @@ class TestNoteMarkdownPlugins(BaseRuleTest):

     def test_plugin_placeholders_match_entries(self):
         """Test that the number of plugin entries match their respective placeholders in note."""
-        for rule in self.production_rules.rules:
+        for rule in self.all_rules:
             has_transform = rule.contents.get('transform') is not None
             has_note = rule.contents.data.get('note') is not None
             note = rule.contents.data.note
@@ -1308,7 +1311,7 @@ class TestNoteMarkdownPlugins(BaseRuleTest):

     def test_if_plugins_explicitly_defined(self):
         """Check if plugins are explicitly defined with the pattern in note vs using transform."""
-        for rule in self.production_rules.rules:
+        for rule in self.all_rules:
             note = rule.contents.data.get('note')
             if note is not None:
                 results = re.search(r'(!{osquery|!{investigate)', note, re.I | re.M)
@@ -1323,7 +1326,7 @@ class TestAlertSuppression(BaseRuleTest):
                      "Test only applicable to 8.8+ stacks for rule alert suppression feature.")
     def test_group_field_in_schemas(self):
         """Test to ensure the fields are defined is in ECS/Beats/Integrations schema."""
-        for rule in self.production_rules:
+        for rule in self.all_rules:
             rule_type = rule.contents.data.get('type')
             if rule_type in ('query', 'threshold') and rule.contents.data.get('alert_suppression'):
                 if isinstance(rule.contents.data.alert_suppression, AlertSuppressionMapping):
diff --git a/tests/test_gh_workflows.py b/tests/test_gh_workflows.py
index 0aee5322d..10983d205 100644
--- a/tests/test_gh_workflows.py
+++ b/tests/test_gh_workflows.py
@@ -10,7 +10,7 @@
 from pathlib import Path

 import yaml

-from detection_rules.schemas import get_stack_versions
+from detection_rules.schemas import get_stack_versions, RULES_CONFIG
 from detection_rules.utils import get_path

 GITHUB_FILES = Path(get_path()) / '.github'
@@ -20,6 +20,7 @@ GITHUB_WORKFLOWS = GITHUB_FILES / 'workflows'
 class TestWorkflows(unittest.TestCase):
     """Test GitHub workflow functionality."""

+    @unittest.skipIf(RULES_CONFIG.bypass_version_lock, 'Version lock bypassed')
     def test_matrix_to_lock_version_defaults(self):
         """Test that the default versions in the lock-versions workflow mirror those from the schema-map."""
         lock_workflow_file = GITHUB_WORKFLOWS / 'lock-versions.yml'
diff --git a/tests/test_mappings.py b/tests/test_mappings.py
index abdb7a4ec..32c4584d4 100644
--- a/tests/test_mappings.py
+++ b/tests/test_mappings.py
@@ -9,11 +9,15 @@
 import unittest
 import warnings

 from . import get_data_files, get_fp_data_files
+from detection_rules.config import parse_rules_config
 from detection_rules.utils import combine_sources, evaluate, load_etc_dump
 from rta import get_available_tests

 from .base import BaseRuleTest

+RULES_CONFIG = parse_rules_config()
+
+
 class TestMappings(BaseRuleTest):
     """Test that all rules appropriately match against expected data sets."""

@@ -21,7 +25,7 @@ class TestMappings(BaseRuleTest):

     def evaluate(self, documents, rule, expected, msg):
         """KQL engine to evaluate."""
-        filtered = evaluate(rule, documents)
+        filtered = evaluate(rule, documents, RULES_CONFIG.normalize_kql_keywords)
         self.assertEqual(expected, len(filtered), msg)
         return filtered

@@ -30,7 +34,7 @@ class TestMappings(BaseRuleTest):
         mismatched_ecs = []
         mappings = load_etc_dump('rule-mapping.yaml')

-        for rule in self.production_rules:
+        for rule in self.all_rules:
             if rule.contents.data.type == "query" and rule.contents.data.language == "kuery":
                 if rule.id not in mappings:
                     continue
@@ -63,7 +67,7 @@ class TestMappings(BaseRuleTest):

     def test_false_positives(self):
         """Test that expected results return against false positives."""
-        for rule in self.production_rules:
+        for rule in self.all_rules:
             if rule.contents.data.type == "query" and rule.contents.data.language == "kuery":
                 for fp_name, merged_data in get_fp_data_files().items():
                     msg = 'Unexpected FP match for: {} - {}, against: {}'.format(rule.id, rule.name, fp_name)
diff --git a/tests/test_packages.py b/tests/test_packages.py
index 648d3cfcf..319b78881 100644
--- a/tests/test_packages.py
+++ b/tests/test_packages.py
@@ -13,7 +13,6 @@ from detection_rules import rule_loader
 from detection_rules.schemas.registry_package import (RegistryPackageManifestV1,
                                                       RegistryPackageManifestV3)
 from detection_rules.packaging import PACKAGE_FILE, Package
-from detection_rules.rule_loader import RuleCollection
 from tests.base import BaseRuleTest

@@ -54,20 +53,23 @@ class TestPackages(BaseRuleTest):
     def test_package_loader_production_config(self):
         """Test that packages are loading correctly."""

+    @unittest.skipIf(rule_loader.RULES_CONFIG.bypass_version_lock, 'Version lock bypassed')
     def test_package_loader_default_configs(self):
         """Test configs in detection_rules/etc/packages.yaml."""
-        Package.from_config(package_configs)
+        Package.from_config(rule_collection=self.rc, config=package_configs)

+    @unittest.skipIf(rule_loader.RULES_CONFIG.bypass_version_lock, 'Version lock bypassed')
     def test_package_summary(self):
         """Test the generation of the package summary."""
-        rules = self.production_rules
+        rules = self.rc
         package = Package(rules, 'test-package')
         package.generate_summary_and_changelog(package.changed_ids, package.new_ids, package.removed_ids)

+    @unittest.skipIf(rule_loader.RULES_CONFIG.bypass_version_lock, 'Version lock bypassed')
     def test_rule_versioning(self):
         """Test that all rules are properly versioned and tracked"""
         self.maxDiff = None
-        rules = RuleCollection.default()
+        rules = self.rc
         original_hashes = []
         post_bump_hashes = []
diff --git a/tests/test_schemas.py b/tests/test_schemas.py
index 2ac7fd845..5adb5a770 100644
--- a/tests/test_schemas.py
+++ b/tests/test_schemas.py
@@ -11,9 +11,9 @@ from semver import Version

 import eql

 from detection_rules import utils
-from detection_rules.misc import load_current_package_version
+from detection_rules.config import load_current_package_version
 from detection_rules.rule import TOMLRuleContents
-from detection_rules.schemas import downgrade
+from detection_rules.schemas import downgrade, RULES_CONFIG
 from detection_rules.version_lock import VersionLockFile
 from marshmallow import ValidationError

@@ -271,6 +271,7 @@ class TestVersionLockSchema(unittest.TestCase):
         version_lock_contents = copy.deepcopy(self.version_lock_contents)
         VersionLockFile.from_dict(dict(data=version_lock_contents))

+    @unittest.skipIf(RULES_CONFIG.bypass_version_lock, 'Version lock bypassed')
     def test_version_lock_has_nested_previous(self):
         """Fail field validation on version lock with nested previous fields"""
         version_lock_contents = copy.deepcopy(self.version_lock_contents)
diff --git a/tests/test_specific_rules.py b/tests/test_specific_rules.py
index 52c7ccb8f..bb7f0334f 100644
--- a/tests/test_specific_rules.py
+++ b/tests/test_specific_rules.py
@@ -17,7 +17,7 @@ from detection_rules.integrations import (
     load_integrations_manifests,
     load_integrations_schemas,
 )
-from detection_rules.misc import load_current_package_version
+from detection_rules.config import load_current_package_version
 from detection_rules.packaging import current_stack_version
 from detection_rules.rule import QueryValidator
 from detection_rules.rule_loader import RuleCollection
@@ -38,8 +38,8 @@ class TestEndpointQuery(BaseRuleTest):
     )
     def test_os_and_platform_in_query(self):
         """Test that all endpoint rules have an os defined and linux includes platform."""
-        for rule in self.production_rules:
-            if not rule.contents.data.get("language") in ("eql", "kuery"):
+        for rule in self.all_rules:
+            if not rule.contents.data.get('language') in ('eql', 'kuery'):
                 continue
             if rule.path.parent.name not in ("windows", "macos", "linux"):
                 # skip cross-platform for now
@@ -72,7 +72,7 @@ class TestNewTerms(BaseRuleTest):

     def test_history_window_start(self):
         """Test new terms history window start field."""
-        for rule in self.production_rules:
+        for rule in self.all_rules:
             if rule.contents.data.type == "new_terms":

                 # validate history window start field exists and is correct
@@ -88,7 +88,7 @@ class TestNewTerms(BaseRuleTest):
     )
     def test_new_terms_field_exists(self):
         # validate new terms and history window start fields are correct
-        for rule in self.production_rules:
+        for rule in self.all_rules:
             if rule.contents.data.type == "new_terms":
                 assert (
                     rule.contents.data.new_terms.field == "new_terms_fields"
@@ -100,7 +100,7 @@ class TestNewTerms(BaseRuleTest):
     def test_new_terms_fields(self):
         """Test new terms fields are schema validated."""
         # ecs validation
-        for rule in self.production_rules:
+        for rule in self.all_rules:
             if rule.contents.data.type == "new_terms":
                 meta = rule.contents.metadata
                 feature_min_stack = Version.parse("8.4.0")
@@ -149,7 +149,7 @@ class TestNewTerms(BaseRuleTest):
     def test_new_terms_max_limit(self):
         """Test new terms max limit."""
         # validates length of new_terms to stack version - https://github.com/elastic/kibana/issues/142862
-        for rule in self.production_rules:
+        for rule in self.all_rules:
             if rule.contents.data.type == "new_terms":
                 meta = rule.contents.metadata
                 feature_min_stack = Version.parse("8.4.0")
@@ -174,7 +174,7 @@ class TestNewTerms(BaseRuleTest):
     def test_new_terms_fields_unique(self):
         """Test new terms fields are unique."""
         # validate fields are unique
-        for rule in self.production_rules:
+        for rule in self.all_rules:
             if rule.contents.data.type == "new_terms":
                 assert len(set(rule.contents.data.new_terms.value)) == len(
                     rule.contents.data.new_terms.value
diff --git a/tests/test_transform_fields.py b/tests/test_transform_fields.py
index a2b69cbcc..5c741305d 100644
--- a/tests/test_transform_fields.py
+++ b/tests/test_transform_fields.py
@@ -6,7 +6,6 @@
 """Test fields in TOML [transform]."""
 import copy
 import unittest
-from pathlib import Path
 from textwrap import dedent

 import pytoml

@@ -15,8 +14,6 @@ from detection_rules.devtools import guide_plugin_convert_
 from detection_rules.rule import TOMLRule, TOMLRuleContents
 from detection_rules.rule_loader import RuleCollection

-RULES_DIR = Path(__file__).parent.parent / 'rules'
-

 class TestGuideMarkdownPlugins(unittest.TestCase):
     """Test the Markdown plugin features within the investigation guide."""
@@ -33,8 +30,62 @@ class TestGuideMarkdownPlugins(unittest.TestCase):

     @staticmethod
     def load_rule() -> TOMLRule:
         rc = RuleCollection()
-        windows_rule = list(RULES_DIR.joinpath('windows').glob('*.toml'))[0]
-        sample_rule = rc.load_file(windows_rule)
+        windows_rule = {
+            "metadata": {
+                "creation_date": "2020/08/14",
+                "updated_date": "2024/03/28",
+                "integration": ["endpoint"],
+                "maturity": "production",
+                "min_stack_version": "8.3.0",
+                "min_stack_comments": "New fields added: required_fields, related_integrations, setup",
+            },
+            "rule": {
+                "author": ["Elastic"],
+                "description": "This is a test.",
+                "license": "Elastic License v2",
+                "from": "now-9m",
+                "name": "Test Suspicious Print Spooler SPL File Created",
+                "note": 'Test note',
+                "references": ["https://safebreach.com/Post/How-we-bypassed-CVE-2020-1048-Patch-and-got-CVE-2020-1337"],
+                "risk_score": 47,
+                "rule_id": "43716252-4a45-4694-aff0-5245b7b6c7cd",
+                "setup": "Test setup",
+                "severity": "medium",
+                "tags": [
+                    "Domain: Endpoint",
+                    "OS: Windows",
+                    "Use Case: Threat Detection",
+                    "Tactic: Privilege Escalation",
+                    "Resources: Investigation Guide",
+                    "Data Source: Elastic Endgame",
+                    "Use Case: Vulnerability",
+                    "Data Source: Elastic Defend",
+                ],
+                "timestamp_override": "event.ingested",
+                "type": "eql",
+                "threat": [
+                    {
+                        "framework": "MITRE ATT&CK",
+                        "tactic": {
+                            "id": "TA0004",
+                            "name": "Privilege Escalation",
+                            "reference": "https://attack.mitre.org/tactics/TA0004/",
+                        },
+                        "technique": [
+                            {
+                                "id": "T1068",
+                                "name": "Exploitation for Privilege Escalation",
+                                "reference": "https://attack.mitre.org/techniques/T1068/",
+                            }
+                        ],
+                    }
+                ],
+                "index": ["logs-endpoint.events.file-*", "endgame-*"],
+                "query": 'file where host.os.type == "windows" and event.type != "deletion"',
+                "language": "eql",
+            },
+        }
+        sample_rule = rc.load_dict(windows_rule)
         return sample_rule

     def test_transform_guide_markdown_plugins(self) -> None:
diff --git a/tests/test_version_locking.py b/tests/test_version_locking.py
index 0e2ffa777..37e0e5e42 100644
--- a/tests/test_version_locking.py
+++ b/tests/test_version_locking.py
@@ -10,17 +10,18 @@
 import unittest

 from semver import Version

 from detection_rules.schemas import get_min_supported_stack_version
-from detection_rules.version_lock import default_version_lock
+from detection_rules.version_lock import loaded_version_lock, RULES_CONFIG


 class TestVersionLock(unittest.TestCase):
     """Test version locking."""

+    @unittest.skipIf(RULES_CONFIG.bypass_version_lock, 'Version lock bypassed')
     def test_previous_entries_gte_current_min_stack(self):
         """Test that all previous entries for all locks in the version lock are >= the current min_stack."""
         errors = {}
         min_version = get_min_supported_stack_version()
-        for rule_id, lock in default_version_lock.version_lock.to_dict().items():
+        for rule_id, lock in loaded_version_lock.version_lock.to_dict().items():
             if 'previous' in lock:
                 prev_vers = [Version.parse(v, optional_minor_and_patch=True) for v in list(lock['previous'])]
                 outdated = [f"{v.major}.{v.minor}" for v in prev_vers if v < min_version]