Data Layer Connector

The Data Layer Connector is CHEQ’s front-end mechanism for passing CHEQ information to a website’s front end so that tag managers or other site code can use it. The connector sends the user payload to the browser as soon as a user is classified. This is fast, but it is also subject to race conditions (see Race Conditions under Important Considerations).

Each method triggers an event only when the value is initially set and when the value changes (unless toggled otherwise in Advanced Settings).

Where to Find It

The data layer is available to all Analytics clients and select Acquisition clients. If you're an Acquisition customer and would like to use data layer functionality, please contact your Customer Success Manager.

For Analytics customers, the Data Layer tab is located here:

Analytics > Settings > Data Layer

For Acquisition customers, the Data Layer tab is located here:

Acquisition > Settings > Data Layer

Methods of Deployment

Data Layer Push

This method uses an event-driven data layer methodology to push a JSON object payload into an array. It is the standard method of pushing data into Google Tag Manager and is also compatible with popular Adobe Data Collection Tags extensions (like Adobe Client Data Layer).

Field Definitions

| Field | Type | Description |
| --- | --- | --- |
| Object Name | string | Specifies the name of the data layer array object (can be an existing object). |
| Event Name | string | Specifies the name of the event that is pushed into the data layer, which helps the tag management system listen for the updated data. |
| Key | string | Specifies the name of the variable. Any name made up of valid variable-name characters is allowed. |
| Value | selection | Dynamically returns the value of the corresponding key. |

Example Payload

Input
| Field | Value |
| --- | --- |
| Object Name | myDataLayer |
| Event Name | cheq_response |
| Key | agentType |
| Value | Threat Type |

Output
myDataLayer.push({
  "event": "cheq_response",
  "agentType": "crawler"
});
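
As a rough illustration of how site code might consume this payload, the sketch below scans a data layer array for the most recent entry carrying a given event name. The helper name `latestEvent` is hypothetical and not part of CHEQ's library; the array, event, and key names are taken from the example above, and the pushes are simulated so the snippet can run on its own.

```javascript
// Hypothetical helper (not part of CHEQ's library): returns the most recent
// data layer entry whose "event" property matches eventName, or undefined.
function latestEvent(dataLayer, eventName) {
  for (let i = dataLayer.length - 1; i >= 0; i--) {
    const entry = dataLayer[i];
    if (entry && entry.event === eventName) {
      return entry;
    }
  }
  return undefined;
}

// Simulated data layer. On a real page the array would already exist and
// the connector, not this code, would perform the cheq_response push.
const myDataLayer = [];
myDataLayer.push({ event: "page_view" });
myDataLayer.push({ event: "cheq_response", agentType: "crawler" });

const latest = latestEvent(myDataLayer, "cheq_response");
console.log(latest ? latest.agentType : "not classified yet");
```

Scanning from the end of the array means a later reclassification takes precedence over an earlier one. Most tag management systems provide their own triggers for this, so a helper like this is mainly useful for custom site code.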

Local Storage

This method writes a key:value pair to the browser's local storage. It is accessible by any system that can read local storage objects.

Field Definitions

| Field | Type | Description |
| --- | --- | --- |
| Event Name | string | Specifies the name of the JavaScript event that is dispatched from the <body> element, which helps the tag management system listen for the updated data. |
| Key | string | Specifies the name of the variable. Any name made up of valid variable-name characters is allowed. |
| Value | selection | Dynamically returns the value of the corresponding key. |

Example Payload

Input
| Field | Value |
| --- | --- |
| Event Name | cheq_response |
| Key | agentType |
| Value | Threat Type |

Output
localStorage.setItem('agentType', 'crawler');
const cheqEvent = new Event('cheq_response');
document.body.dispatchEvent(cheqEvent);
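
One way site code might react to this output is to listen for the configured event and read the key back from local storage when it fires. The sketch below is only an illustration: `onLocalStorageEvent` is a hypothetical helper, not a CHEQ API, and it takes the event target and the storage object as parameters so it can be exercised outside a browser.

```javascript
// Hypothetical helper (not part of CHEQ's library): calls callback with the
// stored value each time eventName fires on target. Returns a function that
// removes the listener again.
function onLocalStorageEvent(target, storage, eventName, key, callback) {
  const listener = () => callback(storage.getItem(key));
  target.addEventListener(eventName, listener);
  return () => target.removeEventListener(eventName, listener);
}

// Browser wiring, using the names from the example above. The guard skips
// this part in environments without a DOM.
if (typeof document !== "undefined" && typeof localStorage !== "undefined") {
  onLocalStorageEvent(document.body, localStorage, "cheq_response", "agentType",
    (value) => console.log("agentType is now", value));
}
```

A listener registered after the event has already fired will not be called for that event, so code that loads late may also want to read the stored value once at start-up.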

JavaScript Variable

This method assigns values to JavaScript variables. It is accessible by any system that can retrieve JavaScript variables.

Field Definitions

| Field | Type | Description |
| --- | --- | --- |
| Event Name | string | Specifies the name of the JavaScript event that is dispatched from the <body> element, which helps the tag management system listen for the updated data. |
| Key | string | Specifies the name of the variable. Any name made up of valid variable-name characters is allowed. |
| Value | selection | Dynamically returns the value of the corresponding key. |

Example Payload

Input
| Field | Value |
| --- | --- |
| Event Name | cheq_response |
| Key | agentType |
| Value | Threat Type |

Output
var agentType = "crawler";
const cheqEvent = new Event('cheq_response');
document.body.dispatchEvent(cheqEvent);
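
Because the variable may not exist until classification completes, code that reads it should tolerate a missing value. The following is a minimal sketch under the assumption that the variable is declared in the global scope and is therefore reachable as a property of `window`; `readCheqVariable` and the `"unclassified"` fallback are hypothetical names chosen for illustration.

```javascript
// Hypothetical helper (not part of CHEQ's library): reads a variable from a
// scope object and returns a fallback while the value has not been set yet.
function readCheqVariable(scope, key, fallback = "unclassified") {
  const value = scope[key];
  if (value === undefined || value === null || value === "") {
    return fallback;
  }
  return String(value);
}

// Simulated scope object. In a browser you would pass window instead,
// assuming the variable is declared globally.
const fakeWindow = {};
console.log(readCheqVariable(fakeWindow, "agentType")); // before classification
fakeWindow.agentType = "crawler";
console.log(readCheqVariable(fakeWindow, "agentType")); // after classification
```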

Cookie

This method writes key:value pairs to the user's browser cookies. It is accessible by any system that can retrieve cookies.

Field Definitions

| Field | Type | Description |
| --- | --- | --- |
| Event Name | string | Specifies the name of the JavaScript event that is dispatched from the <body> element, which helps the tag management system listen for the updated data. |
| Key | string | Specifies the name of the variable. Any name made up of valid variable-name characters is allowed. |
| Value | selection | Dynamically returns the value of the corresponding key. |

Example Payload

Input
| Field | Value |
| --- | --- |
| Event Name | cheq_response |
| Key | agentType |
| Value | Threat Type |

Output
document.cookie = "agentType=crawler; path=/; max-age=31536000";
const cheqEvent = new Event('cheq_response');
document.body.dispatchEvent(cheqEvent);
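
Reading the value back means parsing the cookie string, since `document.cookie` returns all visible cookies as a single `name=value; name=value` string. The sketch below shows one possible parser. `readCookie` is a hypothetical helper, the sample string is made up for illustration, and whether the connector URI-encodes values is an assumption, so the code falls back to the raw value when decoding fails.

```javascript
// Hypothetical helper (not part of CHEQ's library): returns the value of the
// named cookie from a "name=value; name=value" string, or null if absent.
function readCookie(cookieString, name) {
  for (const part of cookieString.split(";")) {
    const separator = part.indexOf("=");
    if (separator === -1) continue;
    const key = part.slice(0, separator).trim();
    if (key !== name) continue;
    const raw = part.slice(separator + 1).trim();
    try {
      return decodeURIComponent(raw);
    } catch (err) {
      return raw; // value was not valid URI encoding; return it unchanged
    }
  }
  return null;
}

// Made-up sample string. In a browser you would pass document.cookie and
// whatever cookie name corresponds to the key you configured.
const sampleCookies = "session=abc123; agentType=crawler; theme=dark";
console.log(readCookie(sampleCookies, "agentType"));
```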

Using the Data Layer

The Data Layer is typically leveraged by a Tag Management System. To learn how to integrate it into your TMS, please refer to one of our guides:

CHEQ Manage (Ensighten)

Google Tag Manager

Adobe Tags (Launch)

Tealium iQ

Data Layer Values

Threat Group

| Values | Description |
| --- | --- |
| bots | A bot is software that automates tasks and can mimic human behavior, sometimes for malicious purposes like false clicks or fraud. Bot activity is detected by analyzing network and browser traits, fingerprints, and other indicators of non-human activity. |
| malicious | Malicious activity involves excessive, abnormal, or false-identity actions with harmful intent, detected by analyzing user behavior and characteristics. |
| suspicious | Suspicious activity includes abnormal repetition, traffic from data centers, or VPN-based spoofing, and is detected by analyzing visit patterns and sources. |
| good bots | Good bots are automated tools that perform useful tasks without disrupting user experience, though they can resemble malicious bots and must be distinguished carefully. Some are undeclared crawlers that collect data without clear identification, making them harder to tell apart from real users. |
| valid | This entity is likely human. |

Threat Type

| Values | Description |
| --- | --- |
| Abnormal rate limit | An abnormal and disproportionate number of ad clicks performed by users who pose as legitimate users but have no intention of following through and converting. |
| Automation Signal | An automation signal typically refers to an indicator or trigger used in automated systems and processes. These signals can take various forms, such as specific events, conditions, or data inputs, that initiate or control automated actions or workflows. |
| Automation Tools | Automation tools are used to perform automatic activity, usually at scale and in a repetitive, fast manner. Such activity is sometimes executed for malicious purposes, such as false clicks and fraudulent display of web-placed ads. |
| Behavioral anomalies | Behavioral anomaly detection identifies individuals with hostile intentions by observing their behavior and activities on digital assets. |
| Browser Anomalies | Browser anomaly signals refer to unexpected or irregular behavior and issues that occur when using web browsers. These anomalies can encompass a wide range of issues, including rendering errors, compatibility issues with websites, slow performance, and security vulnerabilities. |
| Click Farm | A click farm is an organized fraud operation that leverages tools and large groups of humans to manually click on paid ads online. Click farm workers click on ads with no intention of converting. |
| Click Hijacking | Click hijacking is an attack vector that tricks a user into clicking a web element that is invisible or disguised as another element. This can cause users to unwittingly download malware, visit malicious web pages, provide credentials or sensitive information, transfer money, or purchase products online. |
| Cost-effective Scaling | Bot operators aim to maximize impact while minimizing costs. By using scalable, low-cost technologies and infrastructure, they can carry out large-scale automated operations without significant investment in financial or computing resources. |
| Crawlers | Undeclared crawlers are automated tools used by legitimate organizations to scan and collect information from websites. Unlike declared bots, they do not clearly identify themselves or reveal their purpose, making it harder to detect and classify them. |
| Data Centers | Data-center traffic is any traffic detected as originating in a data center. As such, it is very likely to have come from a server rather than a laptop, smartphone, tablet, or other personal device, which may indicate non-human traffic. |
| Desktop Embedded Browser | A desktop embedded browser is a web browser integrated directly into a desktop application or software, allowing users to access web content seamlessly without leaving the application's environment. |
| Disabled JavaScript | When JavaScript is disabled in the user's browser, certain features on a website might not work, or the website might not work at all. Users who deliberately disable JavaScript do not intend to interact with the website the way it was designed. |
| Disabled Cookies | Disabling the browser’s cookies is common with bots and fraudsters. Users who have cookies disabled cannot use most of a website's functionality. This means that these users cannot progress within the funnel and most likely won’t convert. |
| Excessive rate limit | An excessive number of ad clicks performed by users who pose as legitimate users but have no intention of following through and converting. |
| False Representation | False representation, such as user agent spoofing, is the situation where user information is modified to hide or misrepresent the user's real characteristics and identity. It is most often seen with bots trying to hide their tracks, but some malicious human users occasionally do this as well. |
| Frequency Capping | Ad fatigue occurs when your audience sees your ads too often and disproportionately, which causes your campaigns to become less effective. Using frequency capping, you can limit the number of times your ads appear to the same user. |
| Geo Exclusions | Fraudsters tend to obfuscate their true geolocation by using tools such as VPNs and proxies. This allows them to interact with campaigns that are outside of the original targeting strategy and potentially facilitate targeted attacks. |
| Good bot | A good (known) bot is any bot that performs useful or helpful tasks that aren't detrimental to a user's experience on the Internet. Nevertheless, known bots are still not human users. Because good bots can share similar characteristics with malicious bots, the challenge is ensuring good bots aren’t blocked when putting together a bot management strategy. |
| IP reputation | IP reputation means there were signs that an IP address was behaving poorly, for example, by sending too many unwanted requests. |
| Like Headless | Like Headless refers to a browsing or execution mode where the program runs without a visible browser window or user interface. This is often used for automation tasks such as testing, data scraping, or loading pages in the background without any visual interaction. It's a common behavior of bots and scripts that interact with websites programmatically rather than through real user actions. |
| Location Spoofing | Location spoofing is a technique that allows users to fake or alter the geographical location information reported by their device. This can be done for various purposes, such as enhancing privacy or accessing region-restricted content. |
| Low-quality users | Users showing signs of outdated or limited browsing capabilities often convert at lower rates. This behavior may point to less tech-savvy users or basic bots built to operate in a simplified way to avoid detection. |
| Malicious Bots | A malicious bot is malware designed to perform a variety of attack patterns. |
| Malware | A malware signal is a sign or indicator of the presence or activity of malware (malicious software) on a computer or network. Malware signals can take many forms and may include unusual system behavior, unexpected network traffic, security alerts, or antivirus software notifications. |
| Malicious User Input | Malicious user input often includes fake, invalid, or randomly generated data, which are signs that the activity likely doesn't come from a real human user. |
| Multi Suspicious Signals | The visits showed multiple indicators of invalid traffic. |
| Network Anomalies | User traffic that includes one or more attributes (e.g., IP, user cookie) associated with known irregular patterns, such as non-disclosed auto-refresh traffic, duplicate clicks, and attribute mismatch. |
| Non-Browser HTTP Clients | Non-browser HTTP clients are software tools or applications that use HTTP to send and receive data from web servers without the need for a traditional web browser. These clients are commonly used for various purposes, including automated data retrieval, web service consumption, and API communication, offering developers a way to interact with web resources programmatically. |
| Proxy | Proxies frequently hide or facilitate invalid activity. Invalid proxy activity can originate from an intermediary proxy device that exists to manipulate traffic counts, create or pass on non-human or invalid traffic, or otherwise fail to meet protocol validation. |
| Scrapers | Scrapers are used for content scraping from web applications. A web scraper extracts underlying code as well as stored data. This extracted information can then be used to retrieve business intelligence, replicate the web service, and more. |
| Stealth Signal | A stealth signal typically refers to a signal or communication that is intentionally designed to be discreet, low-profile, or difficult to detect. It is often used to avoid detection by adversaries or unauthorized parties. These signals may employ various techniques to minimize their visibility, such as encryption, camouflage, or low-power transmission methods, to operate covertly or securely in sensitive situations. |
| Suspicious User Input | User input is validated across various form fields, such as email and phone number. Suspicious submissions often include irregular patterns or formats that suggest the data may be invalid. |
| Useragent Spoofing | User agent spoofing is a signal manipulation technique where a device or software alters the information it sends to web servers as its "User-Agent" string. The User-Agent string is part of an HTTP request and typically contains information about the client, including its browser type, version, and operating system. By spoofing or faking this information, a user or application can appear as a different device or browser to web servers. |
| VPN | VPNs are used to access services or websites that are otherwise out of reach and can only be accessed through a VPN or proxy. This may be considered suspicious use associated with fraud. |

AI Traffic Type

| Values | Description |
| --- | --- |
| agent | Identifies AI agents. |
| crawler | Identifies AI crawlers. Typically, these will not show up in the data layer as they do not execute JavaScript. |
| none | Neither an AI agent nor an AI crawler. |

Vendor Name

| Values | Description |
| --- | --- |
| various | Represents the vendor of the AI crawler/agent or the abridged user agent of a non-AI entity (e.g. Google Inc. for Chrome users). |

Client Name

| Values | Description |
| --- | --- |
| various | Represents the more specific client name of the AI crawler/agent or more specific user agent of a non-AI entity (e.g. Chrome). |

Request ID

| Values | Description |
| --- | --- |
| various | Represents the unique ID generated each time the CHEQ library loads. This is typically used to associate hit-level CHEQ data with data from third-party tools. |

Policy

Policy returns a loosely obfuscated value (0, 1, or 2) that identifies traffic based on your policy settings in CHEQ. It is primarily used by Acquisition customers who want to integrate our detection verdicts with third-party consent management tools.

| Values | Description |
| --- | --- |
| 0 | Represents valid traffic. |
| 1 | Represents suspicious traffic. |
| 2 | Represents malicious traffic. |
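
Downstream code will usually want a readable label instead of the raw code. The sketch below maps the three documented values to labels and treats anything else as unknown. The names `POLICY_LABELS` and `describePolicy` are hypothetical, and the string handling assumes the value may arrive as text (for example when read from a cookie or local storage).

```javascript
// Labels for the three documented Policy values.
const POLICY_LABELS = { 0: "valid", 1: "suspicious", 2: "malicious" };

// Hypothetical helper (not part of CHEQ's library): converts a Policy value,
// given as a number or a numeric string, into a label. Missing or
// unrecognized values map to "unknown".
function describePolicy(value) {
  if (value === null || value === undefined || value === "") {
    return "unknown";
  }
  const code = Number(value);
  return code in POLICY_LABELS ? POLICY_LABELS[code] : "unknown";
}

console.log(describePolicy("2"));
console.log(describePolicy(undefined));
```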

Important Considerations

Race Conditions

Race conditions are a key consideration when using the Data Layer connector. If CHEQ has not classified the session as Valid/Malicious before the analytics server call is sent, that call carries no classification. This increases Unspecified values in Adobe or (not set) values in Google Analytics. This outcome is normal and expected when working in an asynchronous environment. Our detection engine also factors behavioral data into classification, meaning it might need some extra time to classify traffic unambiguously.

You can take the following steps to increase coverage:

  1. Trigger the CHEQ library code before the analytics code.
    1. This is recommended for all implementations leveraging the Data Layer connector as the primary means of traffic classification. If the CHEQ code loads after analytics code without supplemental events, the classification rate will be very low.
  2. Send an event on classification.
    1. This eliminates the race condition by sending an event to the analytics tool whenever a classification is returned. This occurs only once per user for the vast majority of cases. There could be cost implications of sending extra events to analytics tools, so please be aware of contract guardrails before implementing.
  3. Leverage the Cloud Storage connector.
    1. The Cloud Storage connector is the most permanent solution and CHEQ’s recommended means of analytics integration. CHEQ sends batches of user data directly into your analytics tool at a cadence as frequent as hourly. This data is propagated in several ways, one of which is a user-specified User ID passed in via CHEQ’s JavaScript SDK.
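
Step 2 above could be sketched roughly as follows. The code listens for the classification event, reads the current value, and forwards it only when it differs from the last value sent. It also checks once at start-up in case classification finished before the listener was attached. Everything here is an assumption made for illustration: `forwardClassification` is a hypothetical helper, the `send` callback stands in for whatever call your analytics tool provides, and the browser wiring assumes the Local Storage method with the `cheq_response` event and `agentType` key used in the examples earlier in this article.

```javascript
// Hypothetical helper (not part of CHEQ's library): forwards a classification
// to send() when eventName fires on target, skipping empty and repeated values.
function forwardClassification(target, eventName, readValue, send) {
  let lastSent = null;
  const forward = () => {
    const value = readValue();
    if (value === null || value === undefined || value === lastSent) {
      return; // nothing classified yet, or already forwarded
    }
    lastSent = value;
    send({ event: eventName, value });
  };
  target.addEventListener(eventName, forward);
  forward(); // covers a classification that completed before this code ran
  return () => target.removeEventListener(eventName, forward);
}

// Browser wiring (skipped where no DOM is available). Replace the console.log
// placeholder with your analytics tool's own event call.
if (typeof document !== "undefined" && typeof localStorage !== "undefined") {
  forwardClassification(
    document.body,
    "cheq_response",
    () => localStorage.getItem("agentType"),
    (payload) => console.log("would send to analytics:", payload)
  );
}
```

Skipping repeated values keeps the number of extra analytics events low, which matters given the cost implications noted in step 2.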
