The Science of Digital Browsing Behavior and Why the Timing of It Matters
Note: Please find this post on our new Medium blog as well.
At Tapad, we have a rich set of privacy-safe digital browsing data available for devices in our Device Graph™, which includes website visits and app usage across time. One way we leverage this data is by using statistical models to find devices with behavior patterns similar to those of past converters -- where converters are people who have taken a desired digital action (e.g. clicking on a specific link or button on a brand website).
During our research we’ve found that it is not just what websites you visit that matters, but also the timing and frequency of those visits that can indicate how likely a device is to make a conversion.
Imagine that we are looking for devices belonging to people who are looking to buy a pickup truck. To find the right people, there need to be two ingredients: (1) Lifestyle - they have interests associated with pickup trucks, e.g. outdoors or construction, and (2) Intent - they intend to purchase a vehicle in the near future. In this case, a digital conversion could be something like clicking on a “Find a dealer” link on a car website, which demonstrates their interest in purchasing a pickup truck.
Being able to distinguish devices that have both ingredients from those that only have one (or none) can make engagement strategies more effective. For example, if a device has the right “lifestyle” and shows an intent to purchase soon, we could serve them a limited time offer to push them through to conversion. If a device has the right lifestyle but shows no sign of intent to purchase, we could serve brand messaging to prime them for conversion farther in the future.
Signs that a device has the right lifestyle or intends to make a purchase are present in the digital browsing behavior of that device. Visits to some websites, like those about the outdoors, may reveal that a person has interests (lifestyle) often associated with pickup trucks, whereas others, like car review sites, demonstrate an intent to buy a vehicle in the near future (intent).
We start by looking for sites that demonstrate intent, because there is a clear temporal relationship that provides a signal: for converters, site visits associated with an intent to convert soon should occur right before conversion time.
Here’s how we did an exploratory analysis to find likely intent sites:
We hypothesize that some sites are intent sites -- i.e. they are visited by devices right before they convert. If this is true, then for converters, visits to these sites should occur more often than expected in time intervals close to an end conversion.
For a particular conversion endpoint (e.g. find a dealer), take all converting devices and look backwards at their website visit history prior to conversion. This window of time (we cut it off at a maximum of 60 days back) is the “observation window” for each device.
Set an “intent window” of time immediately before the conversion, which is a fixed but configurable number of days in length, e.g. one week, three months, etc.
This setting may depend on your conversion endpoint. If the conversion is for buying a car, then your intent window may be longer as that’s a large purchase that is planned over a longer span of time. However, for something like buying a new pair of jeans, the window could be much shorter.
Look at all websites (e.g. sites A, B, C, …) that are visited at least once by any converter during their observation window.
For the analysis, we look marginally at visits to a single website A: we want to know if A visits happen more often in the intent window than what we would reasonably expect due to chance. If this is the case, that suggests website A is an intent site – i.e. is visited most frequently when a device is about to convert.
To determine whether website A visits occur more often than expected in the intent window:
Find the visit rate (visits / day) for website A in the intent window
Compare it to the overall visit rate (visits / day) for website A across all observation windows.
This can be done by measuring the Relative Reporting Ratio (RRR) – the ratio of the intent visit rate to the overall visit rate. The RRR can also be interpreted as how much higher (or lower) the joint probability that a day both (1) has a visit to website A and (2) is in the intent window -- i.e. P(intent yes, visit yes) -- is compared to what we would expect if intent status and visit status were independent: P(intent yes) × P(visit yes).
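To see why the two readings of the RRR agree, the ratio can be written out using day counts. This is a sketch that assumes probabilities are estimated as simple proportions of observed days, with O the total number of observed days, I the number of intent window days, V the number of days with a visit to website A, and V_I the number of intent window days with a visit to website A:

```latex
\begin{aligned}
\mathrm{RRR}
  &= \frac{\text{intent visit rate}}{\text{overall visit rate}}
   = \frac{V_I / I}{V / O}
   = \frac{V_I \, O}{V \, I} \\
  &= \frac{V_I / O}{(I / O)\,(V / O)}
   = \frac{\hat{P}(\text{intent yes},\ \text{visit yes})}
          {\hat{P}(\text{intent yes}) \times \hat{P}(\text{visit yes})}
\end{aligned}
```

An RRR of 1 means website A is visited at the same rate inside the intent window as overall; values above 1 mean visits are concentrated close to conversion.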
Our site visits are recorded at the day level – i.e. on each day the website is either present or absent. So days are our smallest observable unit of time.
We assume that every observed day, regardless of device, can be cross-classified according to the 2 x 2 table below. Days are assumed to be independent of one another, which is equivalent to assuming visits follow a Poisson process where only the Intent Yes vs Intent No categorization modulates the rate of the process.
A separate 2 x 2 table is calculated marginally for every context (e.g. "website A"). We do not try to account for multivariate effects.
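V = # of days with visits to website A (the Visit Yes total)
O = # of days on which a visit could have happened (all observed days, the grand total)
VI = # of intent window days with website A visits (the Intent Yes, Visit Yes cell)
I = # of possible intent window days (the Intent Yes total)
The remaining cells follow by subtraction: Intent Yes, Visit No = I - VI; Intent No, Visit Yes = V - VI; Intent No, Visit No = (O - I) - (V - VI).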
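As a rough illustration of how days might be cross-classified into these four counts, here is a minimal Python sketch. The input layout (a list of dicts with "conversion" and "visits" keys), the function name count_days, and the choices to exclude the conversion day itself and to give every device the full 60 days are all assumptions made for this example, not a description of our production pipeline.

```python
from datetime import date, timedelta

OBSERVATION_DAYS = 60  # maximum look-back stated in the text
INTENT_DAYS = 7        # example intent window length (configurable)


def count_days(devices, site, observation_days=OBSERVATION_DAYS,
               intent_days=INTENT_DAYS):
    """Return (V, O, VI, I) for one website across all converting devices.

    `devices` is a hypothetical list of dicts with keys:
      "conversion": date of the conversion event
      "visits": dict mapping site name -> set of dates with at least one visit
    """
    V = O = VI = I = 0
    for device in devices:
        visit_days = device["visits"].get(site, set())
        # Walk backwards from the day before conversion (assumption: the
        # conversion day itself is not part of the observation window).
        for days_back in range(1, observation_days + 1):
            day = device["conversion"] - timedelta(days=days_back)
            in_intent = days_back <= intent_days
            visited = day in visit_days
            O += 1                      # every observed day
            I += in_intent              # days inside the intent window
            V += visited                # days with a visit to `site`
            VI += in_intent and visited # intent window days with a visit
    return V, O, VI, I


# Made-up example data: two converters.
devices = [
    {"conversion": date(2023, 6, 30),
     "visits": {"A": {date(2023, 6, 29), date(2023, 6, 28), date(2023, 5, 20)}}},
    {"conversion": date(2023, 7, 10),
     "visits": {"A": {date(2023, 7, 9)}, "B": {date(2023, 6, 1)}}},
]
print(count_days(devices, "A"))  # (4, 120, 3, 14)
```

Because visits are stored as sets of dates, multiple visits on the same day count once, which matches the day-level presence/absence recording described above.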
Once all days corresponding to a particular website and conversion event have been categorized via the 2 x 2 table above, we can calculate our relevant metrics:
There will be a separate 2 x 2 table (and RRR and Confidence Interval) generated for each website for a given conversion endpoint.
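In order to discriminate between websites, we rank them by the lower bound of the 95% Confidence Interval for the RRR, from highest to lowest. Ranking on the lower bound rather than on the RRR itself penalizes websites with very few visits, whose RRR estimates are noisy.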
Websites with the highest rankings have stronger Intent-style behavior -- i.e. their visits are happening close to the conversion event more often than we’d expect.
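The calculation and ranking steps can be sketched in a few lines of Python. The RRR follows the definition above (intent visit rate divided by overall visit rate). The confidence interval is an assumption for illustration only: it uses a log-scale normal approximation that treats VI as a Poisson count, giving a standard error of roughly 1/sqrt(VI) for log(RRR). The function names and site names are hypothetical and the counts are made up.

```python
import math

Z_95 = 1.96  # standard normal quantile for a two-sided 95% interval


def rrr(V, O, VI, I):
    """Relative Reporting Ratio: intent visit rate over overall visit rate.

    Assumes V >= 1 and I >= 1 (only sites visited at least once are scored).
    """
    return (VI / I) / (V / O)


def rrr_lower_bound(V, O, VI, I, z=Z_95):
    """Lower bound of an approximate 95% interval for the RRR.

    ASSUMPTION: log-scale normal approximation with standard error
    1/sqrt(VI), i.e. VI is treated as a Poisson count. This is one common
    choice, not necessarily the interval used in the analysis above.
    """
    if VI == 0:
        return 0.0  # no intent window visits: RRR is 0, so is the bound
    return rrr(V, O, VI, I) * math.exp(-z / math.sqrt(VI))


def rank_sites(counts_by_site):
    """Return site names sorted by lower bound, highest first.

    `counts_by_site` maps a site name to its (V, O, VI, I) tuple.
    """
    return sorted(counts_by_site,
                  key=lambda site: rrr_lower_bound(*counts_by_site[site]),
                  reverse=True)


# Made-up counts: (V, O, VI, I) per site.
counts = {
    "car_reviews": (400, 60000, 160, 7000),  # RRR about 3.4, many visits
    "tiny_forum": (4, 60000, 2, 7000),       # RRR about 4.3, very few visits
    "travel_deals": (900, 60000, 105, 7000), # RRR of 1.0 (neutral)
}
print(rank_sites(counts))  # ['car_reviews', 'tiny_forum', 'travel_deals']
```

In this toy example the small forum has the highest RRR but only two intent window visits, so its lower bound is close to 1 and it ranks below the well-supported car review site.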
Let’s take a look at some example results for a particular foreign car campaign. The conversion events on this campaign are things like clicking on “find a dealer”, requesting a quote, etc. on the car brand website.
Here’s what we saw:
Over a 60 day interval, more than 39,000 unique websites were visited by converters for this campaign.
We ranked all of these websites using the approach above. The top 10 websites in our ranking had RRRs in the range of 3.0-4.0, meaning that they received 3-4 times as many visits in the time window close to conversion as we’d expect.
Among the top 10 websites there were:
Four shopping and research websites geared specifically for car buyers
Two online forums specific to this car brand
One car insurance shopping site
Although this is just from a small sample of sites, intuitively it makes sense that potential buyers would look at reviews, car forums, and insurance sites right before they actually purchase a car.
We also noticed that many of the sites with a neutral RRR - i.e. those not any more or less likely to be visited close to conversion - tended to be things like travel aggregator or retail sites. Again, it seems reasonable that these more general shopping, news, and travel sites aren’t visited any more or less frequently close to a conversion event.
Of course these are just a few preliminary results, and there is always going to be some degree of noise present. Nonetheless, we thought it was interesting to see some intuitive results within our exploration.
The timing of website visits can reveal patterns related to conversions over and above what we see from just looking at sets of websites alone.
Incorporating timing information into conversion models provides a fuller picture of device browsing behavior, and can show us what websites to pinpoint and when.
We can use timing information to refine our audience engagement strategy to be more effective. If someone intends to make a purchase soon, for instance, we can show them a limited time offer or discount to push them through to conversion. If someone has only the right “lifestyle”, i.e. looks like a converter but hasn’t demonstrated an intent to purchase soon, we can instead engage them with brand messaging to increase awareness.