User Behavior Analysis - Admin Console

Overview

This document describes the network requirements for communication between the VeridiumID Server and the UBA Service, and lists the data collected for UBA analysis, presented from the perspective of user privacy.

Network

The UBA Kubernetes cluster uses a Load Balancer to make its services reachable from outside the cluster. The IPs to be whitelisted are:

Canonical Name	Address	Port
prod-public-nlb-uba-da415169d15bdc85.elb.eu-central-1.amazonaws.com	18.158.83.114	443
prod-public-nlb-uba-da415169d15bdc85.elb.eu-central-1.amazonaws.com	18.193.33.215	443
prod-public-nlb-uba-da415169d15bdc85.elb.eu-central-1.amazonaws.com	18.192.139.232	443

The UBA services accessed by VeridiumID Server:

DNS resolving

The client must be able to resolve the names listed above. They should resolve to the public NLB (in the case of global DNS resolution), or otherwise to the three IPs listed in the table.
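The resolution requirement above can be checked from the client side with a short script. The following is a minimal sketch, not a supported tool: the host name, port and IPs are copied from the table above, while the function names and the injectable `resolver` parameter are assumptions introduced here for illustration.

```python
import socket

# Values copied from the table in the Network section.
UBA_HOST = "prod-public-nlb-uba-da415169d15bdc85.elb.eu-central-1.amazonaws.com"
UBA_PORT = 443
LISTED_IPS = {"18.158.83.114", "18.193.33.215", "18.192.139.232"}


def system_resolver(host, port):
    """Resolve host to a set of IPv4 addresses using the system resolver."""
    infos = socket.getaddrinfo(
        host, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET, sockaddr is (address, port).
    return {info[4][0] for info in infos}


def check_resolution(host=UBA_HOST, port=UBA_PORT, resolver=system_resolver):
    """Return (resolved, unexpected).

    resolved   -- every address the name resolved to
    unexpected -- resolved addresses that are not in the documented table
    """
    resolved = set(resolver(host, port))
    return resolved, resolved - LISTED_IPS
```

Calling `check_resolution()` with no arguments performs a real DNS lookup. An empty `unexpected` set means every resolved address matches the table; a non-empty set suggests the name resolves to something other than the three listed IPs.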

Requirements from client

The client must provide the IPs (or classes of IPs, for example those of a reverse proxy) from which its infrastructure communicates on the Internet, so that they can be whitelisted on the UBA Kubernetes cluster side.

Short description

Our User Behavior Analysis (UBA) consists of two components:

Motion: performs user identity verification by means of analysing signals that are collected by the motion sensors - i.e., accelerometer and gyroscope - of a mobile device during biometric authentication.
Context: uses all the available context information from the authenticator (the device that is used for authentication - e.g., a mobile phone) and the exploiter (the device that the user wants to access after authentication - e.g., a browser from a laptop) at the authentication moment.

Motion

Within this component, we extract features that are unique to the user that initiated the authentication process and, by employing a set of user-specific Machine Learning models, we generate a recognition score.

User-specific model training:

For each user, the associated Machine Learning models are retrained at every 10th authentication. Up to the 100th authentication, we use all the available signals from the user to train the user models; after that, we keep only the last 100 signal samples. Retraining the user-specific models for each enrolled user improves the artificial “knowledge” they have of the assigned user.

Since for a newly enrolled user we do not have positive data to train a set of Machine Learning models, we consider the first 10 authentications to be performed by the legitimate (genuine) user and, therefore, we assign a score of 1 for each of them. We explain below the significance of motion score values.
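The training schedule described above can be sketched as follows. This is an illustrative bookkeeping class only, not the actual UBA implementation: `UserMotionState`, `record` and the `model_score` callable are hypothetical names, and the order of scoring and retraining within one authentication is an assumption.

```python
from collections import deque

RETRAIN_EVERY = 10    # retrain at every 10th authentication
MAX_SAMPLES = 100     # keep only the last 100 signal samples
BOOTSTRAP_COUNT = 10  # the first 10 authentications are assumed genuine


class UserMotionState:
    """Hypothetical per-user bookkeeping for the motion component."""

    def __init__(self):
        # A bounded deque drops the oldest sample once 100 are stored.
        self.samples = deque(maxlen=MAX_SAMPLES)
        self.auth_count = 0
        self.retrain_count = 0

    def record(self, signal, model_score):
        """Store one signal and return the motion score to report.

        model_score is a callable standing in for the user-specific models.
        """
        self.auth_count += 1
        self.samples.append(signal)
        if self.auth_count <= BOOTSTRAP_COUNT:
            score = 1.0  # no trained model yet: treated as genuine
        else:
            score = model_score(signal)
        if self.auth_count % RETRAIN_EVERY == 0:
            # Placeholder for retraining the models on self.samples.
            self.retrain_count += 1
        return score
```

With this layout, the first retraining happens at the 10th authentication, so the 11th authentication is the first one scored by a trained model.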

Motion scores:

In the case of motion biometric recognition, the score is a value from the interval [0, 1], where 0 indicates a strong reject (i.e., the person trying to authenticate is an imposter), while 1 represents a strong accept (i.e., the person trying to authenticate is a genuine / legitimate user). We divide this score interval into 6 confidence intervals. An example of these confidence intervals is the following:

[0, t₁): Reject with high confidence
[t₁, t₂): Reject with medium confidence
[t₂, 0.5): Reject with low confidence
[0.5, t₃): Accept with low confidence
[t₃, t₄): Accept with medium confidence
[t₄, 1]: Accept with high confidence

The decision threshold for motion scoring is 0.5, which corresponds to equal false acceptance and false rejection rates. Therefore, every authentication whose motion signal has a score lower than 0.5 is considered an anomaly, while those whose signals have a score equal to or higher than 0.5 are considered successful authentications. The thresholds t₁, t₂, t₃, t₄ are established based on empirical evidence observed on a dataset used for experimentation, and they depend on the number of positive training samples.
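The mapping from a motion score to a confidence interval and a decision can be sketched as below. The numeric values chosen for t₁ to t₄ are placeholders invented for this example (the document does not state them), and `interpret_motion_score` is a hypothetical helper name.

```python
from bisect import bisect_right

# Placeholder thresholds: the real t1..t4 are set empirically and are
# not given in this document.
T1, T2, T3, T4 = 0.2, 0.35, 0.65, 0.8
DECISION_THRESHOLD = 0.5

BOUNDS = [T1, T2, DECISION_THRESHOLD, T3, T4]
LABELS = [
    "Reject with high confidence",    # [0, t1)
    "Reject with medium confidence",  # [t1, t2)
    "Reject with low confidence",     # [t2, 0.5)
    "Accept with low confidence",     # [0.5, t3)
    "Accept with medium confidence",  # [t3, t4)
    "Accept with high confidence",    # [t4, 1]
]


def interpret_motion_score(score):
    """Return (confidence label, is_anomaly) for a score in [0, 1]."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("motion score must lie in [0, 1]")
    # bisect_right places a score equal to a bound into the interval that
    # starts at that bound, matching the left-closed intervals above.
    label = LABELS[bisect_right(BOUNDS, score)]
    return label, score < DECISION_THRESHOLD
```

A score of exactly 0.5 falls in the first accept interval and is not flagged as an anomaly, consistent with the decision rule above.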

Context

Within this component, we extract features that are specific to the authentication context that is commonly used by the user when accessing a service (exploiter).

Global recognition models:

In the case of context, we have a set of global models that have been trained in advance on an internally created dataset. We do not need user-specific models in this scenario because the global models have been trained to recognize certain anomalies (one example is an impossible change in location).
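To make the "impossible change in location" example concrete, the sketch below computes the travel speed implied by two consecutive authentications. It is a hand-written rule for illustration only and is not how the global models work internally; the speed cutoff, the tuple layout and the function names are all assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0
MAX_PLAUSIBLE_SPEED_KMH = 1000.0  # assumed cutoff, roughly airliner speed


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two WGS 84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # min() guards against tiny floating-point overshoot above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def is_impossible_travel(prev, curr):
    """prev and curr are (latitude, longitude, epoch_millis) tuples."""
    distance = haversine_km(prev[0], prev[1], curr[0], curr[1])
    hours = abs(curr[2] - prev[2]) / 3_600_000
    if hours == 0:
        # Two authentications at the same instant from different places.
        return distance > 0.0
    return distance / hours > MAX_PLAUSIBLE_SPEED_KMH
```

For example, an authentication in Bucharest followed one hour later by one in New York implies a speed of several thousand km/h and would be flagged by this rule.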

A context score is computed from a set of features that are generated by comparing the context of the current authentication with the context information from the three previous authentication sessions. Since the last three authentications are needed to extract input features for the global models, we consider, as in the case of motion, that the three previous authentications (the ones before the current session) are representative of the legitimate user.

We therefore assign the most favorable recognition score, which for context is a risk score of 0, to the first three authentications after enrollment.

Score interpretation:

The risk score interval for context is [0, ∞) and the decision threshold is 5. The threshold is established based on the empirical evidence observed on a dataset used for experimentation.

Every authentication that has a risk score lower than or equal to the decision threshold is considered to belong to the legitimate user, while the authentications with risk scores higher than 5 are labeled as anomalies.
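The context decision rule, together with the handling of the first three authentications, can be sketched as follows. The `global_model` callable is a hypothetical stand-in for the pre-trained global models, and the function names are introduced here only for illustration.

```python
CONTEXT_THRESHOLD = 5.0  # decision threshold from the text above
HISTORY_SIZE = 3         # number of previous sessions compared against


def context_risk(previous_contexts, current_context, global_model):
    """Return the risk score in [0, inf) for the current authentication.

    global_model is a hypothetical callable taking the three most recent
    contexts and the current one, and returning a risk score.
    """
    if len(previous_contexts) < HISTORY_SIZE:
        # Not enough history yet: first three authentications score 0.
        return 0.0
    return global_model(previous_contexts[-HISTORY_SIZE:], current_context)


def is_context_anomaly(risk_score):
    """Scores up to and including the threshold count as legitimate."""
    if risk_score < 0:
        raise ValueError("context risk score must be non-negative")
    return risk_score > CONTEXT_THRESHOLD
```

Note that the comparison is strict: a risk score of exactly 5 is still attributed to the legitimate user.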

List of attributes sent to UBA system

1. General

Name	Type	Description	Example
session_id	UUID	The ID generated for the current authentication.	843b6084-7b8a-45a0-afda-d7f11c37821c
device_id	UUID	The ID generated for the authentication device.	13debdd4-01c8-4be3-ab7d-32f558f274e3
account_id	UUID	The ID generated for the account associated with the device.	d9c819cc-4256-4c1e-a2ed-c5457ce85c48
service_identifier	String	The name of the service used for authentication.	IAMShowcase
stage_type	CODE `StageType { String challenge; String activator; }`	The method and biometry used for authentication.	CODE `challenge='TOUCHID' activator='QR'`
tenant_id	UUID	Generated when creating the tenant.	7a63b0f3-4fb7-426d-b954-700091f8a306
stage_id	UUID	Generated when creating the stage.	c79c8e99-21d8-45bc-a395-8717e7ad7846
audit	CODE `AuditEntry { long created_date; long last_update_time; String time_zone; String created_by; String modified_by; String tenant_id; }`	Generated for each request.	CODE `created_by = "App Name" created_date = 1596437633874 last_update_time = 1596437633874 tenant_id = 7a63b0f3-4fb7-426d-b954-700091f8a306`
created_date	long	Generated for each request.	1596437633874
stage_state	CODE `enum StageState { UNKNOWN = 0; INITIALIZED = 1; INGESTING = 2; INFERRING = 3; COMPLETED = 4; ERROR = 5; TIMEOUT = 6; }`	The state of the stage at a certain step.	INGESTING
session_state	CODE `enum SessionState { UNNKOWN = 0; CREATED = 1; INGESTING = 2; COMPLETED = 3; TIMEOUT = 4; }`	The state of the session at a certain step.	INGESTING
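As an illustration of how a consumer might validate the general attributes, the sketch below parses the UUID fields, converts `created_date` and maps `stage_state` to an enum. It assumes the record arrives as a plain dictionary keyed by the names in the table and that `created_date` is a Unix timestamp in milliseconds (suggested by the example value but not stated); `parse_general` is a hypothetical helper.

```python
import uuid
from datetime import datetime, timezone
from enum import IntEnum


class StageState(IntEnum):
    """Mirror of the StageState enum listed in the table."""
    UNKNOWN = 0
    INITIALIZED = 1
    INGESTING = 2
    INFERRING = 3
    COMPLETED = 4
    ERROR = 5
    TIMEOUT = 6


UUID_FIELDS = ("session_id", "device_id", "account_id", "tenant_id", "stage_id")


def parse_general(record):
    """Return a copy of record with typed UUID, date and state fields."""
    parsed = dict(record)
    for name in UUID_FIELDS:
        parsed[name] = uuid.UUID(record[name])  # raises ValueError if malformed
    # Assumption: created_date is milliseconds since the Unix epoch.
    parsed["created_date"] = datetime.fromtimestamp(
        record["created_date"] / 1000, tz=timezone.utc
    )
    parsed["stage_state"] = StageState[record["stage_state"]]
    return parsed
```

Under the millisecond assumption, the example value 1596437633874 corresponds to 3 August 2020, 06:53:53 UTC.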

2. Motion

Name	Type	Description	Example
rotation_rate	CODE `GestureCoordinates { double x_value; double y_value; double z_value; long timestamp; }`	Data collected from the gyroscope sensor.	CODE `[{x_value: 0.5257872343063354 y_value: 1.0781968832015991 z_value: 0.1111893430352211 timestamp: 43914427}, {x_value: 0.6038274765014648 y_value: 0.9051811695098877 z_value: 0.007703691720962524 timestamp: 43914437}, …]`
linear_acceleration	CODE `GestureCoordinates { double x_value; double y_value; double z_value; long timestamp; }`	Data collected from the accelerometer sensor.	CODE `[{x_value: 0.3792180836200714 y_value: -1.1320989251136782 z_value: -0.2763497829437256 timestamp: 43914427}, {x_value: -0.829619163274765 y_value: -1.4498988389968872 z_value: 1.105836057662964 timestamp: 43914437}, …]`
point_id	UUID	Generated when creating the gesture.	a9f4a4f-a47b-42c7-805e-6afed78d3ef4
audit	CODE `AuditEntry { long created_date; long last_update_time; String time_zone; String created_by; String modified_by; String tenant_id; }`	Generated for each request.	CODE `created_by = "App Name" created_date = 1596437633874 last_update_time = 1596437633874 tenant_id = 7a63b0f3-4fb7-426d-b954-700091f8a306`
created_date	long	Generated for each request.	1596437633874

The figures below are visual representations of the vectors that are received from the accelerometer and gyroscope sensors.
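For readers who want to work with the sensor vectors directly, the following sketch models one `GestureCoordinates` sample and derives two simple quantities from it: the vector magnitude and the spacing between consecutive timestamps. The class mirrors the field names in the table; the helper names are hypothetical, and the timestamp unit is not stated in this document, so the intervals are left in raw units.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class GestureCoordinates:
    """One sensor reading, with the field names used in the table."""
    x_value: float
    y_value: float
    z_value: float
    timestamp: int  # unit not stated in the document

    def magnitude(self):
        """Euclidean norm of the (x, y, z) vector."""
        return math.sqrt(self.x_value ** 2 + self.y_value ** 2 + self.z_value ** 2)


def sample_intervals(samples):
    """Differences between consecutive timestamps, in raw timestamp units."""
    ordered = sorted(samples, key=lambda s: s.timestamp)
    return [b.timestamp - a.timestamp for a, b in zip(ordered, ordered[1:])]
```

Applied to the two example readings in the table, `sample_intervals` yields a single interval of 10 timestamp units.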

3. Context

Name	Type	Description	Example
context_id	UUID	Generated when creating the context.	5f0e118c-f6b6-4bc6-84dc-c15ce976aba2
audit	CODE `AuditEntry { long created_date; long last_update_time; String time_zone; String created_by; String modified_by; String tenant_id; }`	Generated for each request.	CODE `created_by = "App Name" created_date = 1596437633874 last_update_time = 1596437633874 tenant_id = 7a63b0f3-4fb7-426d-b954-700091f8a306`
device_make	String	The manufacturer of the authentication device.	Apple
device_model	String	The model of the authentication device.	IPhone8,1
device_id	UUID	The ID generated for the authentication device.	65772a01-dcda-4714-8558-6104b5845470
hardware_crypto_support	Boolean	Hardware encryption information for the authentication device.	True
internet_connection_type	String	The internet connection type for the authentication device.	WIFI
is_rooted	Boolean	Whether the authentication device is rooted or jailbroken.	False
language	String	The language of the authentication device.	en
current_date_time	long	The date captured by the authentication device.	1646745304184
timezone_offset	Integer	The timezone offset captured by the authentication device.	-120
geo_location	CODE `GeoLocation { GeoLocationDoubleValue latitude; GeoLocationDoubleValue longitude; GeoLocationDoubleValue altitude; GeoLocationDoubleValue accuracy; GeoLocationDoubleValue altitude_accuracy; GeoLocationDoubleValue heading; GeoLocationDoubleValue speed; } GeoLocationDoubleValue { double value; bool is_set; }`	The user's geospatial information (World Geodetic System) for the current authentication. The fields included in the location object depend on GDPR restrictions.	CODE `latitude: { value: 44.4432373046875, is_set: true} longitude: { value: 26.110290997903746, is_set: true} accuracy: { value: 65.0, is_set: true}`
address	CODE `Address { String city; String country_code; String country_name; String postal_code; String region_code; String region_name; }`	The user's physical location for the current authentication. The fields are configurable from Admin → Geolocation Settings, and the fields included in the location object depend on GDPR restrictions.	CODE `city: "Targoviste" country_code: "RO" country_name: "Romania" region_code: "16" region_name: "Dambovita"`
OS	CODE `OsData { String os_name; String os_version; }`	The operating system details for the current authentication.	CODE `os_name: "iOS" os_version: "14.6.0"`
user_agent	CODE `UserAgent { String name; String raw; String version; }`	The user agent details for the current authentication.	CODE `name: "Firefox" raw: "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:94.0) Gecko/20100101 Firefox/94.0" version: "94.0"`
registration_time	long	The registration time of the authentication device.	1596437633874
IP	CODE `IpContext { String address; IpLocationType location_type; } enum IpLocationType { LOCATION_UNKNOWN = 0; INTERNET = 1; INTRANET = 2; }`	IP address and location type (INTERNET or INTRANET).	CODE `address: "86.120.11.66"`
biometric_methods	String	The type of biometry used in authentication (TOTP, 4F, PIN, TOUCHID, VFACE).	TOUCHID
application_id	String	Application ID specific for the mobile device used as authenticator.	AD
app_version	String	Application version specific for the mobile device used as authenticator.	1.0
device_cookie_info	CODE `enum DeviceCookieInformation { UNKNOWN = 0; TRUSTED_DEVICE = 1; UNTRUSTED_DEVICE = 2; NOT_APPLICABLE = 3; }`	Indicates whether the exploiter device is trusted.	1
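Because each `geo_location` field is wrapped in a `GeoLocationDoubleValue` with an `is_set` flag, a consumer has to distinguish supplied values from absent ones. The sketch below keeps only the supplied fields. It assumes the object is available as a nested dictionary and that `is_set: false` means no value was provided (for example, because the field is not allowed by the configured restrictions); `set_geo_fields` is a hypothetical helper.

```python
def set_geo_fields(geo_location):
    """Return {field name: value} for fields whose is_set flag is true.

    Assumption: a field with is_set == False carries no meaningful value,
    so it is dropped instead of being read as 0.0.
    """
    return {
        name: field["value"]
        for name, field in geo_location.items()
        if field.get("is_set")
    }
```

Dropping unset fields in this way avoids mistaking a default `0.0` for a real coordinate or altitude.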