May 25 2007
Predicting their Next Move
Most sites have a target goal or call to action, whether it is visiting a landing page, submitting an online form or, better yet, purchasing products. Whatever the end goal is, there are paths that users take to navigate through the site and, hopefully, ultimately find what they are looking for before leaving. Conducting a pathing experiment to understand how users click through from page to page on your site is a good exercise.
Knowing where they have been is great, but what if you could predict where new visitors might go in the future to determine the best placements for promotions, coupons, contact forms, etc.?
Say, for example, you know that users who visit the CD content area and then click through to CD Players most often go on to visit several other pages, like Music CDs, CD disk covers, and DVDs. The order in which they visit these other pages is not much of a factor; what matters is that they visited those pages at some point in their session. If this were the only path users could take on this site, the job would be easy. Place some promos for discounted blank CDs and we're done!
OK, so realistically you have an insane number of possible paths users can take on your site, and you want to know when would be a good time to include an internal promotion (like a discount coupon that links to a product page with savings) given that they “show interest” in those types of products.
So here lies the problem: how do you know which page has the highest probability of being seen next, given that the visitor has already seen some group of previous pages that are related to the promo? Statisticians call this conditional probability.
Conditional probability is denoted by P(B|A) and is read as the “probability of event B given event A”.
Since we know that certain pages on the site have been viewed we know a little more about the visitor and what types of interests they may have on the site. Thus the probability of what they might see next has been affected.
Definition: P(B|A) = P(A and B) / P(A)
This means that the probability of event B given that event A occurred is the probability of both events occurring (their intersection) divided by the probability of event A occurring.
So what do I do with this, you say? Well, here is an example of the application.
Say you have the frequency of top paths for some pages of interest. An example of fictitious data is below, and our goal is to determine the placement for a promo linked to Page C:
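As a minimal sketch of the definition, the Python below estimates P(B|A) from a list of sessions, where each session is the set of pages a visitor viewed. The session data and the function name are made up for illustration; they are not taken from any real site or analytics tool.

```python
def conditional_probability(sessions, a, b):
    """Estimate P(b | a): of the sessions that include page a, the share that also include page b."""
    total = len(sessions)
    count_a = sum(1 for s in sessions if a in s)
    count_a_and_b = sum(1 for s in sessions if a in s and b in s)
    if count_a == 0:
        return None  # P(A) = 0, so P(B|A) is undefined
    p_a = count_a / total              # marginal probability P(A)
    p_a_and_b = count_a_and_b / total  # joint probability P(A and B)
    return p_a_and_b / p_a


# Hypothetical sessions, each a set of page names (illustrative only)
sessions = [
    {"CD Players", "Music CDs"},
    {"CD Players", "DVDs"},
    {"CD Players", "Music CDs", "DVDs"},
    {"Home"},
]

p = conditional_probability(sessions, "CD Players", "Music CDs")
print(p)
```

Here two of the three sessions that include CD Players also include Music CDs, so the estimate is two thirds.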
Calculate the percentages of all the joint probabilities as well as marginal probabilities as shown in the resulting data cube below:
So by inspection of the data you can determine the probability with which each event occurs individually or jointly based on the summary data. But what we really want to know is the conditional probability of event B given event A. Thus we want to know, if the visitor sees a certain selection of pages, what the probability is that they will then view Page A, Page B, or Page C. And in the end we will have determined the best placement for our promotion.
In my example, to calculate each of the conditional probabilities you would divide the joint probability by the corresponding marginal probability.
P(PgA | Pg1/Pg2) = P(Pg1/Pg2 and PgA) / P(Pg1/Pg2)
= 0.161 / 0.415
= 0.389
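The arithmetic above can be checked with a couple of lines of Python. This sketch uses only the two rounded values quoted in the calculation; the variable names are my own. With those rounded inputs the quotient comes out at about 0.388, so the small difference from 0.389 is presumably a rounding effect in the inputs.

```python
# Values quoted in the worked example above (already rounded to three decimals)
p_joint = 0.161     # P(Pg1/Pg2 and PgA)
p_marginal = 0.415  # P(Pg1/Pg2)

# Conditional probability P(PgA | Pg1/Pg2) = joint / marginal
p_conditional = p_joint / p_marginal
print(round(p_conditional, 3))
```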
Here is a chart of all the resulting conditional probabilities:
To visualize this data better, let’s look at the conditional probabilities in a basic bar chart in Excel:

You can see that path Page 2 / Page 3 has the highest probability that the visitor would then click through to the goal page C.
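For anyone who would rather script this than build it in a spreadsheet, here is a rough Python sketch of the whole procedure: path frequencies to joint probabilities, then marginals, then conditionals, then the best placement for Page C. The counts are invented for illustration and are not the figures from my tables; the path and page labels just follow the naming used above.

```python
from collections import defaultdict

# Hypothetical counts (illustrative only): (previous path, next page) -> visits
path_counts = {
    ("Pg1/Pg2", "PgA"): 40,
    ("Pg1/Pg2", "PgB"): 35,
    ("Pg1/Pg2", "PgC"): 25,
    ("Pg2/Pg3", "PgA"): 20,
    ("Pg2/Pg3", "PgB"): 30,
    ("Pg2/Pg3", "PgC"): 50,
}

total = sum(path_counts.values())

# Joint probabilities: P(path and next page)
joint = {key: n / total for key, n in path_counts.items()}

# Marginal probabilities: P(path), summed over all next pages
marginal = defaultdict(float)
for (path, _next_page), p in joint.items():
    marginal[path] += p

# Conditional probabilities: P(next page | path) = joint / marginal
conditional = {
    (path, next_page): p / marginal[path]
    for (path, next_page), p in joint.items()
}

# The path after which visitors are most likely to view the goal page
goal = "PgC"
best_path = max(
    (path for (path, next_page) in conditional if next_page == goal),
    key=lambda path: conditional[(path, goal)],
)
print(best_path, conditional[(best_path, goal)])
```

With these made-up counts, the promo for Page C would go at the end of whichever path the last line prints.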
So now you can go back to your marketing manager and tell them that the promotion should be posted on Page 3 for those visitors who have previously viewed Page 2.
Until next time… safe analyzing.
Hey Wendi,
There’s so little coverage of the math side of web analytics, so I’m glad you’re joining the blogger community. Obviously much of your insight comes from your mathematics background. What do you think is the best way for people with more of a marketing focus to get up to speed on the math side of web analytics?
BTW, your post inspired me to blog!
http://www.searchmarketinggurus.com/search_marketing_gurus/2007/05/the_web_analyti.html
Cheers,
-Alex
Again, great post.
I’m not sure how to measure joint probabilities as well as marginal probabilities, but I love the idea of predictive analysis. Hope to read more.