Archive for July, 2007

Jul 29 2007

Regional Online Marketing Summit – Stop #6: Houston, TX

Published by Wendi under best practices, marketing, seo/sem

I first want to send a “Thank You” to the Web Analytics Association for the free pass to the Houston, TX Online Marketing Summit.  If you are a member of the WAA and don’t read the monthly newsletter, you should.  Case in point – free passes to conferences (summits, forums, seminars, etc…) and other great discounts for just being a member.    

The conference was packed with great presentations covering a vast amount of information.  Between my co-worker and me, we attempted to attend each talk in both tracks.  The Houston, TX location was set up in two tracks, one focusing on “Search Marketing & Website Strategies” and the other on “Email Marketing, Analytics, & Social Media.”  Below are some highlights of the talks I sat in on personally.  Overall the conference was great; I learned a great deal and would recommend it to anyone in the vicinity of the remaining locations.

Highlights from the talks I joined in on:

·         Google Website Optimizer; Dave Underwood, CEO, TopSpot:  Test minor changes, but don’t test things you already know don’t work.  Top variables to test include headline, image position, ‘call-to-action’ placement/look & feel, length of page, registration requirement for downloads, and contact form field list.  Key points from Dave included listening to your audience while testing, planning ahead and identifying what you want to test up front, running the test long enough without overrunning it (you are only hurting yourself in the long run if you allow the ‘bad’ versions to run longer than needed), and lastly “Just Test It.”

·         Top 10 Email Campaigns; Joel Book, Dir. of eStrategy, ExactTarget:  Joel’s presentation focused on permission-based email and covered a great deal of information that I can’t do justice to in a few short lines, but here goes.  Joel stressed that email strategies should be used to maintain customer engagement with your company, while Search is used to attract and the design of a landing page is to convert.  Within a plan (step 1) one should design a communication that will give your customers a reason to Opt-In.  Test, test, test and integrate web analytics to understand the whole picture.  Top things to test included landing page copy, A/B testing the offer, subject lines, and A/B testing email creative.  Focus on understanding what your customers want and leverage a customer preferences center to deliver customized content to fit their needs.   Joel mentioned a few supporting tools and resources that can assist with designing email campaigns – PivotalVeracity.com, EyeTools.com, and EmailExperience.org.  I haven’t personally used them but they sound promising.

·         Beyond Google – Vertical Search; Chris Hulse, Business.com:  Not surprisingly, there is no common list of available vertical search engines, but here are some that were mentioned during the presentation – business.com / ThomasNet.com / GlobalSpec.com / CitySearch.com / SourceTool.com / Shopzilla.com / Shopping.com / KnowledgeStorm.com / VerticalSearch.com.  G Y M is the new acronym for the top 3 search engines – “Google Yahoo MSN.”  Vertical search engines should be used to enhance, not replace, online paid placement marketing.  Your marketing plan should include “Core & Other.”  Scan the landscape of your users and understand their needs and the other resources they use day to day to enhance placements.

·         Social Media – Beyond the Buzz; Jason Breed, Vice President, Neighborhood America:  When integrating social media into an existing marketing strategy, social media should enhance, not replace, traditional online media outlets.  Start out by identifying the goals and selecting the right technology.  Ensure that the infrastructure can support ten times the anticipated response.  Create the right environment for your community.  Develop a community that provides value for its members.  Ensure they have a common interest and that the environment is trustworthy and safe.  And most importantly, establish clear expectations up front.  Measure the metrics that matter: those that increase revenue, those that decrease cost, and those that help drive those two faster.  Lastly, ensure scalability and reliability of the community – Repeat.  Some social media sites mentioned were MySpace.com, Facebook.com, and Digg.com.

·         What’s next: 21st Century Lead Cultivation; Nate Pruitt, Regional VP, Eloqua Corp:  Nate brought to the table the idea of Lead Scoring. Lead Scoring is the process of assigning a numerical value to each incoming lead that is then used to rank them for priority processing.  Developing a lead score is pretty straightforward and includes identifying interest indicators that best predict behavior.  Align these indicators to lead quality and then assign a weight (positive or negative, as you need accelerators and decelerators).  Nate also discussed the idea of Lead Nurturing and the process of marketing to so-called “bad” leads or dead leads.
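To make the lead scoring idea concrete, here is a minimal Python sketch of weighted scoring and ranking. The indicator names, the weights, and the sample leads are all hypothetical and chosen only for illustration; they are not from Nate's talk or from any Eloqua product.

```python
# Hypothetical interest indicators and weights (assumptions for illustration).
# Positive weights act as accelerators, negative weights as decelerators.
WEIGHTS = {
    "downloaded_whitepaper": 15,
    "visited_pricing_page": 20,
    "opened_email": 5,
    "free_email_domain": -10,
    "unsubscribed": -25,
}


def score_lead(indicators):
    """Sum the weights of every known indicator observed for a lead."""
    return sum(WEIGHTS[name] for name in indicators if name in WEIGHTS)


def rank_leads(leads):
    """Return the leads sorted from highest to lowest score."""
    return sorted(leads, key=lambda lead: score_lead(lead["indicators"]), reverse=True)


# Hypothetical leads
leads = [
    {"id": "lead-1", "indicators": ["opened_email", "free_email_domain"]},
    {"id": "lead-2", "indicators": ["downloaded_whitepaper", "visited_pricing_page"]},
    {"id": "lead-3", "indicators": ["opened_email", "unsubscribed"]},
]

ranked = rank_leads(leads)
for lead in ranked:
    print(lead["id"], score_lead(lead["indicators"]))
```

In practice the weights would come from aligning each indicator with observed lead quality rather than being picked by hand as they are here.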

 

After reading all this, you may wonder, what does this have to do with statistics?  Well, if you think about it… it has everything to do with statistics.  In each and every discussion, metrics/measurement was mentioned in some form or fashion.  Whatever the strategy, campaign, or initiative, measurement of success is at the heart of each and every plan.

Until next time… safe analyzing. 

No responses yet

Jul 09 2007

The Butterfly Effect.. or is it just coincidence?

The NY Times had an interesting article today about the eerie postings of death predictions in Wikipedia, like the most recent one regarding the death of Nancy Benoit. The article moves into discussing the fine line between real-time, late-breaking news and predicting future events. I have to admit that I find this article disturbing yet on some level intriguing. The article gets better once you get past all the weird mentions of notable deaths. One thing it reminded me of was Bill Tancer’s ‘searchonomics’ theory.

Bill Tancer, GM of Hitwise, initially proposed his thoughts on ‘searchonomics’ back in 2005, predicting consumer interest in, or rather public fear of, a possible epidemic outbreak based on search history for the technical term “H5N1” and its more consumer-friendly version “bird flu”. He has also dabbled with more fun data, predicting winners for American Idol and the UK version of Dancing with the Stars – both of which he was right on the money with.

I have found ‘searchonomics’ such an interesting phenomenon that I thought I’d start my own predictions to see if there is any predictive power for the 2008 presidential candidates.

Unfortunately I don’t work for Hitwise, nor do I own a membership, so I am limited to free versions of similar data – which limits my visibility a little. Using Google Trends you can see the early few months of the year for some of the top Democratic candidates:

[Image: Google Trends Democratic Presidential Candidates]

 

Based on the traffic so far, it looks like it’s going to be a close race at this point. I’ll wait to make my predictions on the Democratic side until Google Trends decides to update their data a bit more (or if someone is willing to pull data in some fancier tool with more up-to-date data and send it my way, I might be able to make my prediction sooner).

Is this ‘searchonomics’ phenomenon the result of a “Butterfly Effect” or is it just a set of data points that are merely related by coincidence?

Until next time… safe analyzing.

No responses yet

Jul 05 2007

Determining your Sample Size

Published by Wendi under statistics, A/B testing

Robbin Steif asked me today how long she needed to let her test run before she could call it a day and assume that there is really no difference between the treatments (since she isn’t seeing one right now). She sent over the following screen shot of her outcomes in Google’s Website Optimizer from the last two weeks:

[Image: GA Website Optimizer A/B Test]

As you can see, right now she isn’t seeing any lift in her conversion rate. Actually she is seeing similar values and a small drop. But is the drop significant? Does she have enough data to support an outcome at this point?


Before you run a test of significance you first need to know if you have enough data to support the test in the first place. For population proportions the formula for sample size “n” is:

 

n = z²(pq/δ²)

where

p = % of Success (conversions in this example)

q = % of Failures (i.e. 1 – p)

*note: use the Conversion Rate from your control landing page

To finish out this equation you need to make a few assumptions.

Assumptions

1. The Confidence Level (1 – α): the level of certainty that you require, where α (alpha) is the significance level (for example, α = .05 for 95% confidence)

2. Error - δ (delta): the margin of error that you are willing to accept

With these assumptions set, the last step is to calculate the Z value based on your Confidence Level. It’s easy to do in Excel with the NORMSINV() formula. Since we are determining the existence of a “difference” among the conversion rates, rather than whether the conversion rate is specifically higher or lower than the control, we need to divide alpha in half for a two-sided test structure.

=ABS(NORMSINV(α/2))

In this example, with α = .05, our Z = 1.96. With p = .0472 and δ = .01, we now have all the pieces in our formula to calculate the needed sample size.

n = z²(pq/δ²)

= (1.96)² * [(.0472 * .9528)/(.01)²]

= 1728

Thus Robbin is going to need 1,728 page views before she can determine whether the treatments she is testing did or did not make a difference in her conversion rate. You can download this Excel file I put together (nothing fancy) where you can toggle the alpha and delta values so that you can see how each one impacts the needed sample size.

I also included a reference to the maximum sample size one would need if you don’t have a control to set your “p” and “q” values. The maximum occurs at p = q = .5, which with Z = 1.96 and δ = .01 gives 9,604. It’s rather astonishing, but if you are conservative then you can always fall back on this calculation and know that if you get approximately 10,000 samples you are good to go.
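For readers who would rather not open a spreadsheet, the same calculation can be sketched in a few lines of Python. This is only an illustration of the formula above, not a copy of the Excel file; the function name and its default alpha are assumptions, and `NormalDist().inv_cdf` from the standard library stands in for Excel's NORMSINV().

```python
import math
from statistics import NormalDist


def required_sample_size(p, delta, alpha=0.05):
    """Sample size for a population proportion: n = z^2 * (p*q / delta^2).

    p     : conversion rate of the control page (as a fraction)
    delta : margin of error you are willing to accept
    alpha : significance level, split in half for a two-sided test
    """
    z = abs(NormalDist().inv_cdf(alpha / 2))  # same idea as =ABS(NORMSINV(alpha/2))
    q = 1.0 - p
    return math.ceil(z ** 2 * p * q / delta ** 2)  # round up to whole page views


# The worked example: 4.72% control conversion rate, 1% margin of error
n_control = required_sample_size(p=0.0472, delta=0.01)

# Worst case when no control rate is available: p = q = 0.5
n_worst_case = required_sample_size(p=0.5, delta=0.01)

print(n_control, n_worst_case)
```

Changing `alpha` or `delta` in the calls shows how quickly the required sample grows as you ask for more confidence or a tighter margin of error.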

Part II of this question is – Is there a difference? This is different from asking how many samples you need to determine if there is a difference. A hypothesis test is needed to actually determine if the difference in the conversion rates is statistically significant. You can read more about how to do this in my previous post about A/B testing. That post includes a downloadable Excel file in which you can toggle various sample sizes and determine if the conversion rates are different – statistically speaking.
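One common way to run such a hypothesis test is a pooled two-proportion z-test, sketched below in Python. Treat this as an assumption about the kind of test involved, not a reproduction of the method in the earlier post or its Excel file. The visitor and conversion counts are hypothetical and are not Robbin's numbers.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided pooled z-test for a difference between two conversion rates.

    conv_a, conv_b : number of conversions in control (A) and treatment (B)
    n_a, n_b       : number of unique visitors in each group
    Returns (z statistic, two-sided p-value).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both pages convert equally
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: a small drop in the treatment's conversion rate
z, p_value = two_proportion_z_test(conv_a=94, n_a=2000, conv_b=88, n_b=2000)
alpha = 0.05
print(f"z = {z:.3f}, p-value = {p_value:.3f}")
print("significant" if p_value < alpha else "not significant at alpha = 0.05")
```

With these made-up counts the small drop is well within what chance alone would produce, which is the kind of answer the test is meant to give.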

 

Until next time… safe analyzing.

 

*UPDATE* 7/16/07:  Make sure that you use the first page view per visitor - “Unique Page Views” in Google Analytics terms - when making this calculation.  The sample size calculation assumes that each event is independent of the others.

Thank You to Mike & Chris!

15 responses so far

Jul 02 2007

Stop Collecting So Much Data…

Published by Wendi under best practices, data mining

… and stop misusing data mining - is Peter Fader’s message to CIOs. CIO Insight’s interview with Peter highlights the strengths and weaknesses of applied data mining in the business world, and I have to agree with some of his thoughts, especially on the topic of utilizing probabilities to measure the propensity of behavior.

Measuring the probability of users’ actions can be strong and powerful if used properly, and it can easily be done in Excel.

The trap I see so many people fall into is trying to analyze too many variables at once and not taking the time to even look at what they are throwing into the model. If you really wanted to, you could probably find relationships between how fast the sun rises and the stock market closing rates, but does that really make any logical sense? Then why would you try to build relationships between buying behaviors and the fact that customers own an Apple iPhone if you are selling shoes? So many marketers want to know every little detail about their customers – demographics, psychographics, what kind of car they drive, etc…

When you throw too much data at a problem you will have a hard time with independence, and you need to take the structure of your data into careful consideration; otherwise your predictions can lead to false outcomes.

Some rules of thumb from my perspective:

  1. Enhance your data with the VOC (voice of the customer) – take surveys online or by telephone (mailed surveys are costly and too time consuming). This is a great way to get anecdotal data you don’t see in click stream data.
  2. Familiarize yourself with all the variables and truly understand what they mean – not what you think they mean.
  3. Don’t use variables that you can’t reproduce easily. If it’s too hard to calculate, find, or collect from the database then you probably shouldn’t use it. It’s impractical.
  4. Only include variables that make sense and add questionable variables later and determine if they degrade or enhance the predictability. In the end you may not even find a reason to test out the use of those questionable variables. *Make sure to not include variables that are variations of each other. If you include % of visits this month then don’t include the frequency of visits this month too. This can cause problems with multi-collinearity.
  5. Save enough data for testing! The minimum split is 90/10, but I recommend at least an 80/20 split. That is, at a minimum, 90% of your data is used for development and the remaining 10% is for validation of the model. You need to know how predictive your model is before you take it to the market.
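As a small illustration of rule 5, here is a minimal Python sketch of an 80/20 development/validation split. The function name, the fixed seed, and the stand-in records are assumptions made for the example; any tool that shuffles the data once and holds part of it back achieves the same thing.

```python
import random


def split_development_validation(records, validation_fraction=0.2, seed=42):
    """Shuffle the records once, then hold back a fraction for validation.

    Returns (development, validation). The seed is fixed only so that the
    split is reproducible from run to run.
    """
    shuffled = list(records)  # copy so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)
    n_validation = round(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]


# Hypothetical data set: 1,000 customer records represented by their row ids
records = list(range(1000))
development, validation = split_development_validation(records)

print(len(development), len(validation))
```

The model is built only on `development`; `validation` is scored afterwards to see how predictive the model is on data it has never seen.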

Bonus Point -

  1. If you want to get fancy, look at a repeated measures DOE structure for analyzing transactional data.

Until next time… safe analyzing.

2 responses so far