User Behavior Analysis - Admin Console
Overview
This document describes the network requirements for communication between VeridiumID Server and the UBA Service, and lists the data collected for UBA analysis, presented from a user-privacy perspective.
Network
The UBA Kubernetes cluster uses a load balancer to make its services reachable from outside the cluster. The IP addresses to be whitelisted are:
Canonical Name | Address | Port |
---|---|---|
prod-public-nlb-uba-da415169d15bdc85.elb.eu-central-1.amazonaws.com | 18.158.83.114 | 443 |
prod-public-nlb-uba-da415169d15bdc85.elb.eu-central-1.amazonaws.com | 18.193.33.215 | 443 |
prod-public-nlb-uba-da415169d15bdc85.elb.eu-central-1.amazonaws.com | 18.192.139.232 | 443 |
The UBA services accessed by VeridiumID Server:
DNS resolving
The client should be able to resolve the above names. They should resolve to the public NLB (in the case of global DNS resolution), or otherwise to the three IPs listed in the table.
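The resolution requirement can be checked from the client side with a short script. The following is a minimal sketch, not an official tool: the host name, port, and addresses are copied from the table in the Network section, while the function names (`resolve_ipv4`, `check_resolution`) are hypothetical. The resolver is passed in as a parameter so the comparison logic can be exercised without network access.

```python
import socket

# Values copied from the table in the Network section.
UBA_HOST = "prod-public-nlb-uba-da415169d15bdc85.elb.eu-central-1.amazonaws.com"
UBA_PORT = 443
EXPECTED_IPS = {"18.158.83.114", "18.193.33.215", "18.192.139.232"}


def resolve_ipv4(host, port):
    """Return the set of IPv4 addresses the local resolver gives for host."""
    infos = socket.getaddrinfo(host, port, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is (address, port).
    return {info[4][0] for info in infos}


def check_resolution(resolver=resolve_ipv4):
    """Compare resolved addresses with the documented ones.

    Returns (ok, unexpected). ok is True when at least one address was
    returned and every returned address is in EXPECTED_IPS.
    """
    resolved = resolver(UBA_HOST, UBA_PORT)
    unexpected = resolved - EXPECTED_IPS
    return bool(resolved) and not unexpected, unexpected
```

Calling `check_resolution()` with no arguments uses the system resolver. A non-empty `unexpected` set suggests that the name resolves to an address outside the documented list.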
Requirements from client
The client must provide the IP addresses (or IP ranges, for example those of a reverse proxy) from which its infrastructure communicates on the Internet, so that they can be whitelisted on the UBA Kubernetes cluster side.
Short description
Our User Behavior Analysis (UBA) consists of two components:
Motion: performs user identity verification by means of analysing signals that are collected by the motion sensors - i.e., accelerometer and gyroscope - of a mobile device during biometric authentication.
Context: uses all the available context information from the authenticator (the device that is used for authentication - e.g., a mobile phone) and the exploiter (the device that the user wants to access after authentication - e.g., a browser from a laptop) at the authentication moment.
Motion
Within this component, we extract features that are unique to the user that initiated the authentication process and, by employing a set of user-specific Machine Learning models, we generate a recognition score.
User-specific model training:
For each user, the associated Machine Learning models are retrained at every 10th authentication, to improve the "knowledge" they have of that user. Up to the 100th authentication, all the available signals from the user are used for training; after that, only the last 100 signal samples are kept.
Since for a newly enrolled user we do not have positive data to train a set of Machine Learning models, we consider the first 10 authentications to be performed by the legitimate (genuine) user and, therefore, we assign a score of 1 for each of them. We explain below the significance of motion score values.
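The schedule described above (retraining at every 10th authentication, a window of the last 100 samples, and a score of 1 for the first 10 authentications) can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: the class `MotionUserState`, the `score_fn` callback, and the `train_runs` counter that stands in for real model training are all hypothetical.

```python
from collections import deque

WINDOW_SIZE = 100      # only the last 100 signal samples are kept
RETRAIN_EVERY = 10     # models are retrained at every 10th authentication
BOOTSTRAP_COUNT = 10   # the first 10 authentications are assumed genuine
BOOTSTRAP_SCORE = 1.0  # score assigned during the bootstrap phase


class MotionUserState:
    """Per-user bookkeeping for the motion component (hypothetical)."""

    def __init__(self):
        # A bounded deque silently drops the oldest sample once it is full.
        self.samples = deque(maxlen=WINDOW_SIZE)
        self.auth_count = 0
        self.train_runs = 0

    def record_authentication(self, signal, score_fn):
        """Score one authentication, store its signal, retrain when due."""
        self.auth_count += 1
        if self.auth_count <= BOOTSTRAP_COUNT:
            score = BOOTSTRAP_SCORE
        else:
            score = score_fn(signal)  # stand-in for the user-specific models
        self.samples.append(signal)
        if self.auth_count % RETRAIN_EVERY == 0:
            self.train_runs += 1      # stand-in for the real training step
        return score
```

The bounded `deque` covers both phases of the description: before the 100th authentication it holds every sample, and afterwards it holds exactly the most recent 100.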
Motion scores:
In the case of motion biometric recognition, the score is a value from the interval [0, 1], where 0 indicates a strong reject (i.e., the person trying to authenticate is an imposter), while 1 represents a strong accept (i.e., the person trying to authenticate is a genuine / legitimate user). We divide this score interval into 6 confidence intervals. An example of these confidence intervals is the following:
[0, t1): Reject with high confidence
[t1, t2): Reject with medium confidence
[t2, 0.5): Reject with low confidence
[0.5, t3): Accept with low confidence
[t3, t4): Accept with medium confidence
[t4, 1]: Accept with high confidence
The decision threshold for motion scoring is 0.5, which corresponds to equal false acceptance and rejection rates. Therefore, every authentication that has a motion signal with a score lower than 0.5 is considered an anomaly, while those with signals that have a score equal to or higher than 0.5 are considered successful authentications. The thresholds t1, t2, t3, t4 are established based on the empirical evidence observed on a dataset used for experimentation and they depend on the number of positive training samples.
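The mapping from a motion score to a confidence interval can be sketched as below. The values chosen for t1 to t4 are placeholders picked only for illustration; the document states that the real thresholds are established empirically and depend on the number of positive training samples. The function names are likewise hypothetical.

```python
from bisect import bisect_right

# Hypothetical threshold values; the real t1..t4 are determined empirically.
T1, T2, T3, T4 = 0.2, 0.35, 0.65, 0.8
DECISION_THRESHOLD = 0.5

# Lower bounds of every interval except the first, in ascending order.
BOUNDS = [T1, T2, DECISION_THRESHOLD, T3, T4]
LABELS = [
    "Reject with high confidence",    # [0, t1)
    "Reject with medium confidence",  # [t1, t2)
    "Reject with low confidence",     # [t2, 0.5)
    "Accept with low confidence",     # [0.5, t3)
    "Accept with medium confidence",  # [t3, t4)
    "Accept with high confidence",    # [t4, 1]
]


def classify_motion_score(score):
    """Return the confidence label for a motion score in [0, 1]."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("motion score must be in [0, 1]")
    # bisect_right counts the bounds that are <= score, so a score equal
    # to a bound falls into the interval that starts at that bound.
    return LABELS[bisect_right(BOUNDS, score)]


def is_motion_anomaly(score):
    """Scores below the decision threshold are treated as anomalies."""
    return score < DECISION_THRESHOLD
```

With these placeholder thresholds, `classify_motion_score(0.5)` returns "Accept with low confidence" and `is_motion_anomaly(0.49)` returns `True`, which matches the half-open intervals listed above.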
Context
Within this component, we extract features that are specific to the authentication context that is commonly used by the user when accessing a service (exploiter).
Global recognition models:
In the case of context, we have a set of global models that have been trained in advance on an internally created dataset. We do not need user-specific models in this scenario because the global models have been trained to recognize certain anomalies (one example is an impossible change in location).
A context score is computed from a set of features generated by comparing the context of the current authentication with the context information from the three previous authentication sessions. Because the global models need these three sessions to extract their input features, we assume, as in the case of motion, that the three authentications preceding the current session are representative of the legitimate user.
For the same reason, the first three authentications after enrollment are assigned the most favorable score, which for context is a risk score of 0.
Score interpretation:
The risk score interval for context is [0, ∞) and the decision threshold is 5. The threshold is established based on the empirical evidence observed on a dataset used for experimentation.
Every authentication that has a risk score lower than or equal to the decision threshold is considered to belong to the legitimate user, while the authentications with risk scores higher than 5 are labeled as anomalies.
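The context decision rule can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical `model` callable that stands in for the global models and takes the current context plus the three previous ones; only the threshold of 5 and the score of 0 for the first three authentications come from the text above.

```python
CONTEXT_THRESHOLD = 5.0   # decision threshold from the text
REQUIRED_HISTORY = 3      # previous sessions needed to build features
BOOTSTRAP_RISK = 0.0      # risk score assigned while history is too short


def context_risk_score(previous_contexts, current_context, model):
    """Return the risk score for the current authentication.

    `model` is a hypothetical callable standing in for the global models.
    """
    if len(previous_contexts) < REQUIRED_HISTORY:
        return BOOTSTRAP_RISK
    return model(current_context, previous_contexts[-REQUIRED_HISTORY:])


def is_context_anomaly(risk_score):
    """Risk scores strictly above the threshold are labeled anomalies."""
    if risk_score < 0:
        raise ValueError("risk score must be in [0, inf)")
    return risk_score > CONTEXT_THRESHOLD
```

Note the direction of the comparison: unlike the motion score, a higher context value means higher risk, and a score exactly equal to 5 is still attributed to the legitimate user.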
List of attributes sent to UBA system
1. General
Name | Type | Description | Example |
---|---|---|---|
session_id | UUID | The ID generated for current authentication. | 843b6084-7b8a-45a0-afda-d7f11c37821c |
device_id | UUID | The ID generated for the authentication device. | 13debdd4-01c8-4be3-ab7d-32f558f274e3 |
account_id | UUID | The ID generated for the account associated with the device. | d9c819cc-4256-4c1e-a2ed-c5457ce85c48 |
service_identifier | String | The name of the service used for authentication. | IAMShowcase |
stage_type | CODE | The method and biometry used for authentication. | CODE |
tenant_id | UUID | Generated when creating the tenant. | 7a63b0f3-4fb7-426d-b954-700091f8a306 |
stage_id | UUID | Generated when creating the stage. | c79c8e99-21d8-45bc-a395-8717e7ad7846 |
audit | CODE | Generated for each request. | CODE |
created_date | long | Generated for each request. | 1596437633874 |
stage_state | CODE | The state of the stage at a certain step. | INGESTING |
session_state | CODE | The state of the session at a certain step. | INGESTING |
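A receiver of these general attributes might validate the identifiers and convert the timestamp before further processing. The sketch below assumes that `created_date` is a Unix epoch value in milliseconds, which the example value suggests but the table does not state. The function name `parse_general` is hypothetical, and the sample record uses only the example values listed in the table.

```python
import uuid
from datetime import datetime, timezone

UUID_FIELDS = ("session_id", "device_id", "account_id", "tenant_id", "stage_id")


def parse_general(record):
    """Validate UUID fields and convert created_date to a UTC datetime."""
    parsed = dict(record)
    for key in UUID_FIELDS:
        # uuid.UUID raises ValueError on a malformed identifier.
        parsed[key] = uuid.UUID(record[key])
    # Assumption: created_date is epoch time in milliseconds.
    parsed["created_date"] = datetime.fromtimestamp(
        record["created_date"] / 1000, tz=timezone.utc
    )
    return parsed


# Example values taken from the table above.
example = {
    "session_id": "843b6084-7b8a-45a0-afda-d7f11c37821c",
    "device_id": "13debdd4-01c8-4be3-ab7d-32f558f274e3",
    "account_id": "d9c819cc-4256-4c1e-a2ed-c5457ce85c48",
    "service_identifier": "IAMShowcase",
    "tenant_id": "7a63b0f3-4fb7-426d-b954-700091f8a306",
    "stage_id": "c79c8e99-21d8-45bc-a395-8717e7ad7846",
    "created_date": 1596437633874,
}
parsed_example = parse_general(example)
```

Under the millisecond assumption, the example `created_date` corresponds to 3 August 2020 (UTC).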
2. Motion
Name | Type | Description | Example |
---|---|---|---|
rotation_rate | CODE | Data collected from gyroscope sensor. | CODE |
linear_acceleration | CODE | Data collected from accelerometer sensor. | CODE |
point_id | UUID | Generated when creating the gesture. | a9f4a4f-a47b-42c7-805e-6afed78d3ef4 |
audit | CODE | Generated for each request. | CODE |
created_date | long | Generated for each request. | 1596437633874 |
The figures below are visual representations of the vectors that are received from the accelerometer and gyroscope sensors.
3. Context
Name | Type | Description | Example |
---|---|---|---|
context_id | UUID | Generated when creating the context. | 5f0e118c-f6b6-4bc6-84dc-c15ce976aba2 |
audit | CODE | Generated for each request. | CODE |
device_make | String | The manufacturer of the authentication device. | Apple |
device_model | String | The model of the authentication device. | IPhone8,1 |
device_id | UUID | The ID generated for the authentication device. | 65772a01-dcda-4714-8558-6104b5845470 |
hardware_crypto_support | Boolean | Hardware encryption information for the authentication device. | True |
internet_connection_type | String | The internet connection type for the authentication device. | WIFI |
is_rooted | Boolean | Whether the authentication device is rooted or jailbroken. | False |
language | String | The language of the authentication device. | en |
current_date_time | long | The date captured by the authentication device. | 1646745304184 |
timezone_offset | Integer | The timezone offset captured by the authentication device. | -120 |
geo_location | CODE | Details of the user's geospatial information (World Geodetic System) for the current authentication. Which fields are included in the location object depends on GDPR restrictions. | CODE |
address | CODE | Details of the user's physical location for the current authentication. The fields are configurable from Admin → Geolocation Settings. Which fields are included in the location object depends on GDPR restrictions. | CODE |
OS | CODE | The operating system details for the current authentication. | CODE |
user_agent | CODE | The user agent details for the current authentication. | CODE |
registration_time | long | The registration time of the authentication device. | 1596437633874 |
IP | CODE | IP address and location (INTERNET or INTRANET). | CODE |
biometric_methods | String | The type of the biometry used in authentication (TOTP, 4F, PIN, TOUCHID, VFACE). | TOUCHID |
application_id | String | Application ID specific for the mobile device used as authenticator. | AD |
app_version | String | Application version specific for the mobile device used as authenticator. | 1.0 |
device_cookie_info | CODE | Indicates whether the exploiter device is trusted. | 1 |