GeoModel Version 0.1 Specification¶
The first release version of GeoModel will be a minimum viable product (MVP) containing features that replace the functionality of the existing implementation along with a few new requirements.
Terminology¶
Locality¶
The locality of a user is a geographical region from which most of that user’s online activity originates.
Authenticated Actions¶
An authenticated action is an event produced when a user takes any action that requires them to be authenticated, e.g. authenticating to a service with Duo or performing activity in the AWS web console.
Primary Interface¶
GeoModel v0.1 is an alert built into MozDef that:
- Processes authentication-related events.
- Updates user locality information.
- Emits alerts when some specific conditions are met.
Data Stores¶
GeoModel interacts with MozDef both to query for events and to store new alerts.
GeoModel also maintains its own user locality information. Version 0.1 will store this information in the same ElasticSearch instance that MozDef uses, under a configured index.
Functional Components¶
GeoModel v0.1 can be thought of as consisting of two core “components”, each with a distinct set of responsibilities. These two components interact in a pipeline.
Because GeoModel v0.1 is implemented as an Alert in MozDef, it is essentially a distinct Python program run by MozDef’s AlertTask scheduler.
Analysis Engine¶
The first component handles the analysis of events pertaining to authenticated actions made by users. These events are retrieved from MozDef and analyzed to determine the locality of users, which is then persisted in a data store.
This component has the following responsibilities:
- Run configured queries to retrieve events describing authenticated actions taken by users from MozDef.
- Load locality state from ElasticSearch.
- Remove outdated locality information.
- Update locality state with information from retrieved events.
Alert Emitter¶
The second component creates alerts and communicates them to MozDef.
This component has the following responsibilities:
- Inspect localities produced by the Analysis Engine to produce alerts.
- Store alerts in MozDef’s ElasticSearch instance.
The Alert Emitter will, given a set of localities for a user, produce an alert if and only if both:
- User activity is found to originate from a location outside of all previously known localities.
- It would not be possible for the user to have travelled to the new locality from the one they were last active in during the time between the two actions.
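The two conditions above can be sketched as follows. This is a minimal illustration, not the MozDef implementation: the function names, the haversine distance measure, and the `MAX_TRAVEL_SPEED_KMH` threshold are all assumptions introduced here, since the specification does not say how "possible travel" is judged. Localities are assumed to be dictionaries shaped like the entries described under User Locality State, with `lastaction` already parsed into a `datetime`.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0
# Hypothetical threshold (not from the specification): roughly the
# cruising speed of a commercial airliner.
MAX_TRAVEL_SPEED_KMH = 1000.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two coordinates."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def should_alert(localities, latitude, longitude, action_time):
    """Return True only if both alerting conditions hold."""
    if not localities:
        # Assumption: with no history there is nothing to compare against.
        return False

    # Condition 1: the new location lies outside every known locality.
    outside_all = all(
        haversine_km(loc["latitude"], loc["longitude"], latitude, longitude)
        > loc["radius"]
        for loc in localities
    )
    if not outside_all:
        return False

    # Condition 2: travel from the most recently active locality would
    # require exceeding the assumed maximum speed.
    last = max(localities, key=lambda loc: loc["lastaction"])
    distance = haversine_km(
        last["latitude"], last["longitude"], latitude, longitude)
    elapsed_hours = (action_time - last["lastaction"]).total_seconds() / 3600.0
    if elapsed_hours <= 0:
        # Assumption: no elapsed time between distant actions is impossible travel.
        return True
    return distance / elapsed_hours > MAX_TRAVEL_SPEED_KMH
```

With a single locality around Toronto, activity in Berlin one hour later would satisfy both conditions, while the same activity a day later would not, because the implied speed falls under the assumed threshold.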
Data Models¶
The following models describe what data is required to implement the features that each component is responsible for. They are described using a JSON-based format where keys indicate the names of values and values are strings containing those values’ types, which are represented using TypeScript notation. We use this notation because configuration data as well as data stored in ElasticSearch are represented as JSON and JSON-like objects respectively.
General Configuration¶
The top-level configuration for GeoModel version 0.1 must contain the following.
{
  "localities": {
    "es_index": string,
    "valid_duration_days": number,
    "radius_kilometres": number
  },
  "events": {
    "search_window": object,
    "lucene_query": string
  },
  "whitelist": {
    "users": Array<string>,
    "cidrs": Array<string>
  }
}
Using the information above, GeoModel can determine:
- What index to store locality documents in.
- What index to read events from.
- What index to write alerts to.
- What queries to run in order to retrieve a complete set of events.
- When a user locality is considered outdated and should be removed.
- The radius that localities should have.
- Whitelisting rules to apply.
In the above, note that events.queries describes an array of objects. Each of these objects is expected to contain a query for ElasticSearch using Lucene syntax. The username field is expected to be a string describing the path into the result dictionary returned by your query that produces the username of the user taking an authenticated action.
The search_window object can contain any of the keywords passed to Python’s timedelta constructor.
So for example the following:
{
  "events": [
    {
      "search_window": {
        "minutes": 30
      },
      "lucene_query": "tags:auth0",
      "username_path": "details.username"
    }
  ]
}
would query ElasticSearch for all events tagged auth0 and try to extract the username from result["details"]["username"], where result is one of the results produced by executing the query.
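As a minimal sketch of how these two settings could be interpreted, the search window can be expanded directly into a `timedelta`, and the dotted username path can be walked one key at a time. The helper names `search_window_start` and `extract_username` are hypothetical and not part of GeoModel; returning `None` for a missing path is likewise an assumption.

```python
from datetime import datetime, timedelta
from functools import reduce


def search_window_start(search_window, now):
    """Return the earliest timestamp covered by the configured window.

    The keys of search_window are passed straight to timedelta, so any
    keyword timedelta accepts (minutes, hours, days, ...) is valid.
    """
    return now - timedelta(**search_window)


def extract_username(result, username_path):
    """Follow a dotted path such as "details.username" into a result dict.

    Returns None when any key along the path is missing (an assumption;
    the specification does not say how missing usernames are handled).
    """
    try:
        return reduce(lambda d, key: d[key], username_path.split("."), result)
    except (KeyError, TypeError):
        return None
```

For the example configuration, `search_window_start({"minutes": 30}, now)` yields the time 30 minutes before `now`, and `extract_username(result, "details.username")` is equivalent to `result["details"]["username"]` when both keys exist.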
The alerts.whitelist portion of the configuration specifies a couple of parameters for whitelisting activity:
- From any of a list of users (based on events.queries.username).
- From any IPs within the range of any of a list of CIDRs.
For example, the following whitelist configuration would instruct GeoModel not to produce alerts for actions taken by “testuser” or for any users originating from an IP in either of the ranges 1.2.3.0/8 and 192.168.0.0/16.
{
  "alerts": {
    "whitelist": {
      "users": ["testuser"],
      "cidrs": ["1.2.3.0/8", "192.168.0.0/16"]
    }
  }
}
Note however that GeoModel will still retain locality information for whitelisted users and users originating from whitelisted IPs.
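A whitelist check along these lines could be sketched with Python's standard `ipaddress` module. The function name `is_whitelisted` is hypothetical, and `strict=False` is an assumption made so that a CIDR written with host bits set, such as 1.2.3.0/8, is accepted and treated as the enclosing network 1.0.0.0/8.

```python
import ipaddress


def is_whitelisted(username, source_ip, whitelist):
    """Return True if alerts should be suppressed for this action.

    whitelist is a dict with "users" and "cidrs" lists, as in the
    configuration example above.
    """
    if username in whitelist.get("users", []):
        return True

    address = ipaddress.ip_address(source_ip)
    # strict=False tolerates CIDRs whose host bits are set (assumption).
    networks = (
        ipaddress.ip_network(cidr, strict=False)
        for cidr in whitelist.get("cidrs", [])
    )
    return any(address in network for network in networks)
```

Because locality information is still retained for whitelisted activity, a check like this would only gate alert creation, not the locality update.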
User Locality State¶
GeoModel version 0.1 uses one ElasticSearch Type (similar to a table in a relational database) to represent locality information. Under this type, one document exists per user describing that user’s locality information.
{
  "type_": "locality",
  "username": string,
  "localities": Array<{
    "sourceipaddress": string,
    "city": string,
    "country": string,
    "lastaction": date,
    "latitude": number,
    "longitude": number,
    "radius": number
  }>
}
Using the information above, GeoModel can determine:
- All of the localities of a user.
- Whether a locality is older than some amount of time.
- How far outside of any localities a given location is.
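The age check can be illustrated with a short sketch that drops localities whose last action is older than the configured `valid_duration_days`. The function name `remove_outdated` is hypothetical, and `lastaction` is assumed to have been parsed into a `datetime` already; how GeoModel actually stores and parses dates is not specified here.

```python
from datetime import datetime, timedelta


def remove_outdated(localities, valid_duration_days, now):
    """Keep only localities whose last action falls within the valid window."""
    cutoff = now - timedelta(days=valid_duration_days)
    return [loc for loc in localities if loc["lastaction"] >= cutoff]
```

For example, with `valid_duration_days` set to 30, a locality last seen 45 days before `now` would be removed while one last seen 10 days before would be kept.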
Alerts¶
Alerts emitted to the configured index are intended to conform to MozDef’s preferred naming scheme.
{
  "username": string,
  "hops": [
    {
      "origin": {
        "ip": string,
        "city": string,
        "country": string,
        "latitude": number,
        "longitude": number,
        "geopoint": GeoPoint
      },
      "destination": {
        "ip": string,
        "city": string,
        "country": string,
        "latitude": number,
        "longitude": number,
        "geopoint": GeoPoint
      }
    }
  ]
}
Note in the above that the origin.geopoint field uses ElasticSearch’s GeoPoint type.
User Stories¶
User stories here make references to the following categories of users:
- An operator is anyone responsible for deploying or maintaining a deployment of MozDef that includes GeoModel.
- An investigator is anyone responsible for viewing and taking action based on alerts emitted by GeoModel.
Potential Compromises Detected¶
As an investigator, I expect that if a user is found to have performed some authenticated action in one location and then, some short amount of time later, in another, an alert will be emitted by GeoModel.
Realistic Travel Excluded¶
As an investigator, I expect that if someone starts working somewhere, gets on a plane, and continues working after arriving at their destination, an alert will not be emitted by GeoModel.
Diversity of Indicators¶
As an operator, I expect that GeoModel will fetch events pertaining to authenticated actions from new sources (Duo, Auth0, etc.) after I deploy MozDef with GeoModel configured with queries targeting those sources.
Old Data Removed Automatically¶
As an operator, I expect that GeoModel will forget about localities attributed to users that have not been in those geographic regions for a configured amount of time.