Mozdef_util Library
The mozdef_util library provides helpers for interacting with MozDef components.
Connecting to Elasticsearch
from mozdef_util.elasticsearch_client import ElasticsearchClient
es_client = ElasticsearchClient("http://127.0.0.1:9200")
Creating/Updating Documents
Create a new event
event_dict = {
    "example_key": "example value"
}
es_client.save_event(body=event_dict)
Update an existing event
event_dict = {
    "example_key": "example new value"
}
# Assuming 12345 is the id of the existing entry
es_client.save_event(body=event_dict, doc_id="12345")
Create a new alert
alert_dict = {
    "example_key": "example value"
}
es_client.save_alert(body=alert_dict)
Update an existing alert
alert_dict = {
    "example_key": "example new value"
}
# Assuming 12345 is the id of the existing entry
es_client.save_alert(body=alert_dict, doc_id="12345")
Create a new generic document
document_dict = {
    "example_key": "example value"
}
es_client.save_object(index='randomindex', body=document_dict)
Update an existing document
document_dict = {
    "example_key": "example new value"
}
# Assuming 12345 is the id of the existing entry
es_client.save_object(index='randomindex', body=document_dict, doc_id="12345")
Bulk Importing
from mozdef_util.elasticsearch_client import ElasticsearchClient
es_client = ElasticsearchClient("http://127.0.0.1:9200", bulk_amount=30, bulk_refresh_time=5)
es_client.save_event(body={'key': 'value'}, bulk=True)
- Line 2: bulk_amount (defaults to 100) specifies how many messages should sit in the bulk queue before they are written to Elasticsearch
- Line 2: bulk_refresh_time (defaults to 30) is the interval after which a bulk flush is forced, even if the queue holds fewer than bulk_amount messages
- Line 3: bulk (defaults to False) determines whether an event is added to the bulk queue instead of being written immediately
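The interaction between the size threshold and the forced flush can be illustrated with a small in-memory sketch. This is not the mozdef_util implementation: BulkQueueSketch, flush_func, and clock are hypothetical names, the refresh interval is assumed to be in seconds, and the age check here runs only when an item is added, whereas a real client would more likely flush from a background timer.

```python
import time


class BulkQueueSketch:
    """Hypothetical stand-in for a bulk queue; not the mozdef_util implementation."""

    def __init__(self, flush_func, bulk_amount=100, bulk_refresh_time=30,
                 clock=time.monotonic):
        self.flush_func = flush_func                # called with each flushed batch
        self.bulk_amount = bulk_amount              # size threshold that triggers a flush
        self.bulk_refresh_time = bulk_refresh_time  # assumed seconds before a forced flush
        self.clock = clock                          # injectable so the timing can be tested
        self.items = []
        self.last_flush = clock()

    def add(self, item):
        self.items.append(item)
        self.flush_if_due()

    def flush_if_due(self):
        """Flush when the queue is full or the refresh interval has elapsed."""
        full = len(self.items) >= self.bulk_amount
        stale = self.clock() - self.last_flush >= self.bulk_refresh_time
        if self.items and (full or stale):
            self.flush()

    def flush(self):
        batch, self.items = self.items, []
        self.last_flush = self.clock()
        self.flush_func(batch)


written = []  # each element is one batch that would be sent to Elasticsearch
queue = BulkQueueSketch(written.append, bulk_amount=3, bulk_refresh_time=5)
for n in range(7):
    queue.add({"key": n})

print(len(written), "batches written;", len(queue.items), "item(s) still queued")
```

With a threshold of three, seven quick additions produce two full batches and leave one item waiting for either more items or the forced flush.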
Searching for documents
Simple search
from mozdef_util.query_models import SearchQuery, TermMatch, ExistsMatch
search_query = SearchQuery(hours=24)
must = [
    TermMatch('category', 'brointel'),
    ExistsMatch('seenindicator')
]
search_query.add_must(must)
results = search_query.execute(es_client, indices=['events','events-previous'])
SimpleResults
When you perform a “simple” search (one without any aggregation), a SimpleResults object is returned. This object is a dict with the following format:
Key | Description |
---|---|
hits | Contains an array of documents that matched the search query |
meta | Contains a hash of fields describing the search query (for example, whether the query timed out) |
Example simple result:
{
    'hits': [
        {
            '_id': u'cp5ZsOgLSu6tHQm5jAZW1Q',
            '_index': 'events-20161005',
            '_score': 1.0,
            '_source': {
                'details': {
                    'information': 'Example information'
                },
                'category': 'excategory',
                'summary': 'Test Summary',
                'type': 'event'
            }
        }
    ],
    'meta': {'timed_out': False}
}
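Because the result is a plain dict, it can be processed with ordinary dict and list operations. The following sketch works on a hard-coded copy of the example result rather than a live search; summarize_hits is a hypothetical helper, not part of mozdef_util.

```python
# Sample data in the SimpleResults format described above.
results = {
    'hits': [
        {
            '_id': 'cp5ZsOgLSu6tHQm5jAZW1Q',
            '_index': 'events-20161005',
            '_score': 1.0,
            '_source': {
                'details': {'information': 'Example information'},
                'category': 'excategory',
                'summary': 'Test Summary',
                'type': 'event',
            },
        }
    ],
    'meta': {'timed_out': False},
}


def summarize_hits(results):
    """Return (document id, summary) pairs, refusing possibly partial results."""
    if results['meta']['timed_out']:
        raise RuntimeError('search timed out; results may be incomplete')
    # The original document body lives under '_source' in each hit.
    return [(hit['_id'], hit['_source'].get('summary')) for hit in results['hits']]


for doc_id, summary in summarize_hits(results):
    print(doc_id, summary)
```

Checking meta before reading hits is a design choice in this sketch; whether a timed-out search should be treated as an error depends on the caller.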
Aggregate search
from mozdef_util.query_models import SearchQuery, TermMatch, Aggregation
search_query = SearchQuery(hours=24)
search_query.add_must(TermMatch('category', 'brointel'))
search_query.add_aggregation(Aggregation('source'))
results = search_query.execute(es_client)
AggregatedResults
When you perform an aggregated search (for example, counting the distinct IP addresses in the documents that match a specific query), an AggregatedResults object is returned. This object is a dict with the following format:
Key | Description |
---|---|
aggregations | Contains the aggregation results, grouped by field name |
hits | Contains an array of documents that matched the search query |
meta | Contains a hash of fields describing the search query (for example, whether the query timed out) |
Example aggregated result:
{
    'aggregations': {
        'ip': {
            'terms': [
                {
                    'count': 2,
                    'key': '1.2.3.4'
                },
                {
                    'count': 1,
                    'key': '127.0.0.1'
                }
            ]
        }
    },
    'hits': [
        {
            '_id': u'LcdS2-koQWeICOpbOT__gA',
            '_index': 'events-20161005',
            '_score': 1.0,
            '_source': {
                'details': {
                    'information': 'Example information'
                },
                'ip': '1.2.3.4',
                'summary': 'Test Summary',
                'type': 'event'
            }
        },
        {
            '_id': u'F1dLS66DR_W3v7ZWlX4Jwg',
            '_index': 'events-20161005',
            '_score': 1.0,
            '_source': {
                'details': {
                    'information': 'Example information'
                },
                'ip': '1.2.3.4',
                'summary': 'Test Summary',
                'type': 'event'
            }
        },
        {
            '_id': u'G1nGdxqoT6eXkL5KIjLecA',
            '_index': 'events-20161005',
            '_score': 1.0,
            '_source': {
                'details': {
                    'information': 'Example information'
                },
                'ip': '127.0.0.1',
                'summary': 'Test Summary',
                'type': 'event'
            }
        }
    ],
    'meta': {
        'timed_out': False
    }
}
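The aggregation buckets can be turned into a simple lookup table. The sketch below uses a trimmed, hard-coded copy of the example result (each hit keeps only its id and ip field); term_counts is a hypothetical helper, not part of mozdef_util.

```python
from collections import Counter

# Trimmed copy of the example AggregatedResults dict.
results = {
    'aggregations': {
        'ip': {
            'terms': [
                {'count': 2, 'key': '1.2.3.4'},
                {'count': 1, 'key': '127.0.0.1'},
            ]
        }
    },
    'hits': [
        {'_id': 'LcdS2-koQWeICOpbOT__gA', '_source': {'ip': '1.2.3.4'}},
        {'_id': 'F1dLS66DR_W3v7ZWlX4Jwg', '_source': {'ip': '1.2.3.4'}},
        {'_id': 'G1nGdxqoT6eXkL5KIjLecA', '_source': {'ip': '127.0.0.1'}},
    ],
    'meta': {'timed_out': False},
}


def term_counts(results, field):
    """Map each aggregated value of `field` to its document count."""
    terms = results['aggregations'][field]['terms']
    return {term['key']: term['count'] for term in terms}


counts = term_counts(results, 'ip')
# Recount from the hits themselves to show the buckets agree with the documents.
recount = Counter(hit['_source']['ip'] for hit in results['hits'])

print(counts)
print(counts == dict(recount))
```

In this example the bucket counts match a manual recount of the returned hits; that need not hold in general if a search returns only a subset of the matching documents.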
Match/Query Classes
ExistsMatch
Checks whether a specific field exists in a document.
from mozdef_util.query_models import ExistsMatch
ExistsMatch("randomfield")
TermMatch
Checks whether a specific field matches the given key.
from mozdef_util.query_models import TermMatch
TermMatch("details.ip", "127.0.0.1")
TermsMatch
Checks whether a specific field matches any of the given keys.
from mozdef_util.query_models import TermsMatch
TermsMatch("details.ip", ["127.0.0.1", "1.2.3.4"])
WildcardMatch
Uses a wildcard pattern (such as * in the example below) to find documents in which a field contains all or part of a key.
from mozdef_util.query_models import WildcardMatch
WildcardMatch('summary', 'test*')
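A wildcard pattern such as 'test*' is narrower than a full regular expression. The sketch below shows the idea locally by translating a pattern into a regex, assuming only * (any run of characters) and ? (exactly one character) are special. wildcard_to_regex and wildcard_matches are hypothetical helpers, and Elasticsearch evaluates patterns against indexed terms, so its results can differ from a plain string comparison like this one.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a wildcard pattern into a compiled regular expression.

    Only `*` (any run of characters) and `?` (exactly one character) are
    treated as special; every other character is matched literally.
    """
    parts = []
    for char in pattern:
        if char == '*':
            parts.append('.*')
        elif char == '?':
            parts.append('.')
        else:
            parts.append(re.escape(char))
    return re.compile(''.join(parts), re.DOTALL)


def wildcard_matches(pattern, value):
    """Return True if the whole of `value` matches the wildcard `pattern`."""
    return wildcard_to_regex(pattern).fullmatch(value) is not None


print(wildcard_matches('test*', 'testing'))  # True
print(wildcard_matches('test*', 'a test'))   # False: the value must start with 'test'
print(wildcard_matches('te?t', 'text'))      # True
print(wildcard_matches('a.b', 'axb'))        # False: '.' is literal in a wildcard
```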
PhraseMatch
Checks whether a field contains a specific phrase (the phrase may include spaces).
from mozdef_util.query_models import PhraseMatch
PhraseMatch('summary', 'test run')
BooleanMatch
Used to apply specific “matchers” to a query. It is unlikely to be used outside of SearchQuery.
from mozdef_util.query_models import ExistsMatch, TermMatch, BooleanMatch
must = [
    ExistsMatch('details.ip')
]
must_not = [
    TermMatch('type', 'alert')
]
BooleanMatch(must=must, should=[], must_not=must_not)
MissingMatch
Checks whether a field does not exist in a document.
from mozdef_util.query_models import MissingMatch
MissingMatch('summary')
RangeMatch
Checks whether a field value is within a specific range (mostly used to look for documents in a time frame).
from mozdef_util.query_models import RangeMatch
RangeMatch('utctimestamp', "2016-08-12T21:07:12.316450+00:00", "2016-08-13T21:07:12.316450+00:00")
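Timestamps in the format used above can be generated with the standard library instead of being typed by hand. A minimal sketch, assuming timezone-aware datetimes; time_window is a hypothetical helper, not part of mozdef_util.

```python
from datetime import datetime, timedelta, timezone


def time_window(end, hours):
    """Return ISO 8601 (start, end) strings covering the `hours` before `end`."""
    if end.tzinfo is None:
        raise ValueError('end must be timezone-aware')
    end_utc = end.astimezone(timezone.utc)
    start_utc = end_utc - timedelta(hours=hours)
    return start_utc.isoformat(), end_utc.isoformat()


# A fixed end time keeps the example reproducible; datetime.now(timezone.utc)
# would give a window ending at the current moment.
end = datetime(2016, 8, 13, 21, 7, 12, 316450, tzinfo=timezone.utc)
start_ts, end_ts = time_window(end, hours=24)

print(start_ts)  # 2016-08-12T21:07:12.316450+00:00
print(end_ts)    # 2016-08-13T21:07:12.316450+00:00
```

The two strings have the same shape as the range bounds passed to RangeMatch in the example above.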
QueryStringMatch
Generates the “match” from a custom query string (similar to what you would enter in Kibana).
from mozdef_util.query_models import QueryStringMatch
QueryStringMatch('summary: test')
SubnetMatch
Checks whether an IP field is within the bounds of a subnet.
from mozdef_util.query_models import SubnetMatch
SubnetMatch('details.sourceipaddress', '10.1.1.0/24')
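What "within the bounds of a subnet" means for a CIDR block such as 10.1.1.0/24 can be checked locally with Python's ipaddress module. This sketch only illustrates the membership test; in_subnet is a hypothetical helper, and it makes no claim about how SubnetMatch is evaluated by Elasticsearch.

```python
import ipaddress


def in_subnet(address, cidr):
    """Return True if `address` falls within the CIDR block `cidr`."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(cidr)


network = ipaddress.ip_network('10.1.1.0/24')
print(network[0], network[-1])                # 10.1.1.0 10.1.1.255
print(in_subnet('10.1.1.57', '10.1.1.0/24'))  # True
print(in_subnet('10.1.2.1', '10.1.1.0/24'))   # False
```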
Aggregation
Used to aggregate results based on a specific field.
from mozdef_util.query_models import Aggregation, SearchQuery, ExistsMatch
search_query = SearchQuery(hours=24)
must = [
    ExistsMatch('seenindicator')
]
search_query.add_must(must)
aggr = Aggregation('details.ip')
search_query.add_aggregation(aggr)
results = search_query.execute(es_client, indices=['events','events-previous'])