<?xml version="1.0" encoding="UTF-8"?><!-- generator="wordpress/2.1.3" -->
<rss version="2.0" 
	xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel>
	<title>Comments on: Determining your Sample Size</title>
	<link>http://coremarkanalytics.com/blog/2007/07/05/determining-your-sample-size/</link>
	<description>Web Analytics Blog - Paving the way to understanding web data as it relates to statistics and other methodologies.</description>
	<pubDate>Thu, 28 Aug 2008 15:58:37 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.1.3</generator>

	<item>
		<title>By: Steve</title>
		<link>http://coremarkanalytics.com/blog/2007/07/05/determining-your-sample-size/#comment-25</link>
		<author>Steve</author>
		<pubDate>Fri, 06 Jul 2007 02:26:17 +0000</pubDate>
		<guid>http://coremarkanalytics.com/blog/2007/07/05/determining-your-sample-size/#comment-25</guid>
					<description>Thanks Wendi, at this rate I may as well just automatically "star" all your postings in Google Reader... ;-)

I would interpret this to mean that Robbin possibly does have enough info to make a determination (3 x ~ 250 + other combos?). Does that work? Or should we have 1700 for those three only?

Assuming that to be the case, should she stick with the original or???

I see myself torn between the original and #1.
#1 Reduces the error range - not by much. That seems to imply to me that it is a more guaranteed combination? And hence has a value in and of itself???

Or am I dreaming? :-)

I could always handle the math side of statistics (I can just about read ancient greek purely from studying engineering @ uni. ;-) ). Knowing how/when to apply it was ever my problem...

Cheers, and Thanks!</description>
		<content:encoded><![CDATA[<p>Thanks Wendi, at this rate I may as well just automatically &#8220;star&#8221; all your postings in Google Reader&#8230; <img src='http://coremarkanalytics.com/blog/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </p>
<p>I would interpret this to mean that Robbin possibly does have enough info to make a determination (3 x ~ 250 + other combos?). Does that work? Or should we have 1700 for those three only?</p>
<p>Assuming that to be the case, should she stick with the original or???</p>
<p>I see myself torn between the original and #1.<br />
#1 Reduces the error range - not by much. That seems to imply to me that it is a more guaranteed combination? And hence has a value in and of itself???</p>
<p>Or am I dreaming? <img src='http://coremarkanalytics.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>I could always handle the math side of statistics (I can just about read ancient greek purely from studying engineering @ uni. <img src='http://coremarkanalytics.com/blog/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> ). Knowing how/when to apply it was ever my problem&#8230;</p>
<p>Cheers, and Thanks!</p>
]]></content:encoded>
				</item>
	<item>
		<title>By: Wendi</title>
		<link>http://coremarkanalytics.com/blog/2007/07/05/determining-your-sample-size/#comment-26</link>
		<author>Wendi</author>
		<pubDate>Fri, 06 Jul 2007 15:23:10 +0000</pubDate>
		<guid>http://coremarkanalytics.com/blog/2007/07/05/determining-your-sample-size/#comment-26</guid>
					<description>Hi Steve, Thanks for the positive feedback and glad you find this blog useful.  
Actually Robbin needs a full 1700 with those three combinations.  I don't believe that she was running any others.  But yes, she just needs a total of 1700 for the entire sample.  So if she did run other combinations then she would have been closer to what she needed.  
As for picking the right one in the end, well that is up to her since there didn't seem to be any impact from the treatments.  It would be more of a personal preference question at that point.  Maybe over time the conversion will experience a lift just by refreshing the page anyway - who knows.   But right now if it takes an act of Congress to update the page, I would advise against making any changes.  In some organizations there tends to be this struggle for IT resources so if that exists, based on the tested outcomes she may not be wise to make any changes at this point.  But really we are too early in the game to make a sound decision.  I guess it's just a waiting game at this point. 

Take Care, Wendi</description>
		<content:encoded><![CDATA[<p>Hi Steve, Thanks for the positive feedback and glad you find this blog useful.<br />
Actually Robbin needs a full 1700 with those three combinations.  I don&#8217;t believe that she was running any others.  But yes, she just needs a total of 1700 for the entire sample.  So if she did run other combinations then she would have been closer to what she needed.<br />
As for picking the right one in the end, well that is up to her since there didn&#8217;t seem to be any impact from the treatments.  It would be more of a personal preference question at that point.  Maybe over time the conversion will experience a lift just by refreshing the page anyway - who knows.   But right now if it takes an act of Congress to update the page, I would advise against making any changes.  In some organizations there tends to be this struggle for IT resources so if that exists, based on the tested outcomes she may not be wise to make any changes at this point.  But really we are too early in the game to make a sound decision.  I guess it&#8217;s just a waiting game at this point. </p>
<p>Take Care, Wendi</p>
]]></content:encoded>
				</item>
	<item>
		<title>By: Steve</title>
		<link>http://coremarkanalytics.com/blog/2007/07/05/determining-your-sample-size/#comment-30</link>
		<author>Steve</author>
		<pubDate>Mon, 09 Jul 2007 09:05:47 +0000</pubDate>
		<guid>http://coremarkanalytics.com/blog/2007/07/05/determining-your-sample-size/#comment-30</guid>
					<description>Hi Wendi, thanks again (very muchly!) for that.

Can I please query one more issue (hopefully... ;-) )
You've set the margin of error to 0.01. I gather from reading elsewhere that implies a margin of error of 1%.
If we increase that error to 1.5% we drop the number of pages required to 768. Which she already has (783).

So the question: Why would we continue to 1% when by accepting just an extra 1/2 a percent, we're already there? Obviously office/client politics, but are there formulaic reasons why that would be preferable? Is it tied to the Z-value in some way?

If I may be allowed a bonus question? :-)
To try and put it in terms I am familiar with: Are we already at a level of confidence of 95% +/- 0.32%?
Or have I totally got it all wrong? (again)

I think I'm actually starting to understand this stuff... :-)

Cheers! and again, Thanks!</description>
		<content:encoded><![CDATA[<p>Hi Wendi, thanks again (very muchly!) for that.</p>
<p>Can I please query one more issue (hopefully&#8230; <img src='http://coremarkanalytics.com/blog/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> )<br />
You&#8217;ve set the margin of error to 0.01. I gather from reading elsewhere that implies a margin of error of 1%.<br />
If we increase that error to 1.5% we drop the number of pages required to 768. Which she already has (783).</p>
<p>So the question: Why would we continue to 1% when by accepting just an extra 1/2 a percent, we&#8217;re already there? Obviously office/client politics, but are there formulaic reasons why that would be preferable? Is it tied to the Z-value in some way?</p>
<p>If I may be allowed a bonus question? <img src='http://coremarkanalytics.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /><br />
To try and put it in terms I am familiar with: Are we already at a level of confidence of 95% +/- 0.32%?<br />
Or have I totally got it all wrong? (again)</p>
<p>I think I&#8217;m actually starting to understand this stuff&#8230; <img src='http://coremarkanalytics.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>Cheers! and again, Thanks!</p>
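<p>The two sample sizes quoted in this comment can be reproduced with a short sketch. It assumes the post uses the standard sample-size formula for a proportion, n = z^2 * p * (1 - p) / E^2, with z = 1.96 for 95% confidence and the 4.72% conversion rate that Robbin quotes elsewhere in this thread. The function name required_sample_size is hypothetical and is not taken from the post.</p>
<pre>
```python
import math


def required_sample_size(p, margin, z=1.96):
    """Views needed to estimate a proportion p to within +/- margin.

    Assumes the standard formula n = z^2 * p * (1 - p) / margin^2,
    rounded up to the next whole view.
    """
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


baseline = 0.0472  # conversion rate quoted in this thread (assumed input)

n_one_percent = required_sample_size(baseline, 0.01)
n_one_and_half_percent = required_sample_size(baseline, 0.015)
print(n_one_percent, n_one_and_half_percent)
```
</pre>
<p>Because the margin is squared in the denominator, relaxing it from 1% to 1.5% cuts the requirement by a factor of 2.25, which is why the required count falls so quickly.</p>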
]]></content:encoded>
				</item>
	<item>
		<title>By: Robbin Steif</title>
		<link>http://coremarkanalytics.com/blog/2007/07/05/determining-your-sample-size/#comment-33</link>
		<author>Robbin Steif</author>
		<pubDate>Tue, 10 Jul 2007 02:09:59 +0000</pubDate>
		<guid>http://coremarkanalytics.com/blog/2007/07/05/determining-your-sample-size/#comment-33</guid>
					<description>Wendi - is the  "margin of error" a margin of the control, i.e. 1% of the 4.72% conversion rate, or additive, 5.72%? I am guessing the former, since the latter seems too detectable to be considered a tie.</description>
		<content:encoded><![CDATA[<p>Wendi - is the  &#8220;margin of error&#8221; a margin of the control, i.e. 1% of the 4.72% conversion rate, or additive, 5.72%? I am guessing the former, since the latter seems too detectable to be considered a tie.</p>
]]></content:encoded>
				</item>
	<item>
		<title>By: Wendi</title>
		<link>http://coremarkanalytics.com/blog/2007/07/05/determining-your-sample-size/#comment-35</link>
		<author>Wendi</author>
		<pubDate>Tue, 10 Jul 2007 04:17:25 +0000</pubDate>
		<guid>http://coremarkanalytics.com/blog/2007/07/05/determining-your-sample-size/#comment-35</guid>
					<description>Hi Steve - Yes, by widening your margin of error you can reduce the amount of work needed to accomplish your goal.  It just depends on how much precision you are willing to give up.  Remember that the margin of error is plus or minus, so the more you give up, the less precision you have.  And yes, you are almost correct in your margin of error estimate for the current conditions.  It's actually +/- 3.2% not 0.32%.  In many cases you will see polls refer to the standard +/- 3% error rate and right now Robbin is pretty much in line with the standard needs of acceptability.  

Hope this helps.
Wendi</description>
		<content:encoded><![CDATA[<p>Hi Steve - Yes, by widening your margin of error you can reduce the amount of work needed to accomplish your goal.  It just depends on how much precision you are willing to give up.  Remember that the margin of error is plus or minus, so the more you give up, the less precision you have.  And yes, you are almost correct in your margin of error estimate for the current conditions.  It&#8217;s actually +/- 3.2% not 0.32%.  In many cases you will see polls refer to the standard +/- 3% error rate and right now Robbin is pretty much in line with the standard needs of acceptability.  </p>
<p>Hope this helps.<br />
Wendi</p>
]]></content:encoded>
				</item>
	<item>
		<title>By: Wendi</title>
		<link>http://coremarkanalytics.com/blog/2007/07/05/determining-your-sample-size/#comment-36</link>
		<author>Wendi</author>
		<pubDate>Tue, 10 Jul 2007 04:26:45 +0000</pubDate>
		<guid>http://coremarkanalytics.com/blog/2007/07/05/determining-your-sample-size/#comment-36</guid>
					<description>Robbin - The margin of error is the window within which the true conversion rate should fall, with 95% confidence.  In other words, you are 95% confident that the true (population) conversion rate is somewhere between [3.72%,5.72%] based on your sampling (test).  

Cheers! Wendi</description>
		<content:encoded><![CDATA[<p>Robbin - The margin of error is the window within which the true conversion rate should fall, with 95% confidence.  In other words, you are 95% confident that the true (population) conversion rate is somewhere between [3.72%,5.72%] based on your sampling (test).  </p>
<p>Cheers! Wendi</p>
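<p>As a rough illustration of the two readings Robbin asked about, the sketch below computes the interval both ways around the 4.72% rate. The helper names additive_interval and relative_interval are hypothetical. The additive reading, where 1% means one percentage point, is the one that matches the interval given in this reply.</p>
<pre>
```python
def additive_interval(rate, margin):
    """Interval when the margin is in absolute percentage points."""
    return rate - margin, rate + margin


def relative_interval(rate, margin):
    """Interval when the margin is a fraction of the rate itself."""
    return rate * (1 - margin), rate * (1 + margin)


rate = 0.0472   # conversion rate quoted in this thread
margin = 0.01   # the 1% margin of error under discussion

add_low, add_high = additive_interval(rate, margin)
rel_low, rel_high = relative_interval(rate, margin)
print(f"additive: {add_low:.2%} to {add_high:.2%}")
print(f"relative: {rel_low:.2%} to {rel_high:.2%}")
```
</pre>
<p>The additive reading gives a window two percentage points wide, while the relative reading would give a window of only about a tenth of a percentage point.</p>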
]]></content:encoded>
				</item>
	<item>
		<title>By: Wendi</title>
		<link>http://coremarkanalytics.com/blog/2007/07/05/determining-your-sample-size/#comment-38</link>
		<author>Wendi</author>
		<pubDate>Tue, 10 Jul 2007 13:29:06 +0000</pubDate>
		<guid>http://coremarkanalytics.com/blog/2007/07/05/determining-your-sample-size/#comment-38</guid>
					<description>All Readers - I made a booboo - I mistyped the formula above.  I forgot to include the squaring of the delta.  The Excel file is correct, and the actual value calculated and originally posted was also correct.  The post has been updated and is now correct.  Sorry for this.  

Thanks Robbin!!</description>
		<content:encoded><![CDATA[<p>All Readers - I made a booboo - I mistyped the formula above.  I forgot to include the squaring of the delta.  The Excel file is correct, and the actual value calculated and originally posted was also correct.  The post has been updated and is now correct.  Sorry for this.  </p>
<p>Thanks Robbin!!</p>
]]></content:encoded>
				</item>
	<item>
		<title>By: Michael Helbling</title>
		<link>http://coremarkanalytics.com/blog/2007/07/05/determining-your-sample-size/#comment-46</link>
		<author>Michael Helbling</author>
		<pubDate>Sat, 14 Jul 2007 21:29:54 +0000</pubDate>
		<guid>http://coremarkanalytics.com/blog/2007/07/05/determining-your-sample-size/#comment-46</guid>
					<description>I am really enjoying reading your posts. This one was excellent, but I have one issue. You mention that the number of page views should be 1,728. In reality shouldn't it be 1,728 visits or sessions that enter on this page? Since a single visit could realistically see that page 2-4 times, I would think you would only want to count the first page view in a session toward the sample size.</description>
		<content:encoded><![CDATA[<p>I am really enjoying reading your posts. This one was excellent, but I have one issue. You mention that the number of page views should be 1,728. In reality shouldn&#8217;t it be 1,728 visits or sessions that enter on this page? Since a single visit could realistically see that page 2-4 times, I would think you would only want to count the first page view in a session toward the sample size.</p>
]]></content:encoded>
				</item>
	<item>
		<title>By: Steve</title>
		<link>http://coremarkanalytics.com/blog/2007/07/05/determining-your-sample-size/#comment-48</link>
		<author>Steve</author>
		<pubDate>Sun, 15 Jul 2007 22:56:11 +0000</pubDate>
		<guid>http://coremarkanalytics.com/blog/2007/07/05/determining-your-sample-size/#comment-48</guid>
					<description>Apologies for late reply. Thanks Wendi. I did wonder about the 0.32 - it felt wrong. :-)

Helps hugely! Again (x 2, x3 , x4 ...) thanks!
Cheers!</description>
		<content:encoded><![CDATA[<p>Apologies for late reply. Thanks Wendi. I did wonder about the 0.32 - it felt wrong. <img src='http://coremarkanalytics.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>Helps hugely! Again (x 2, x3 , x4 &#8230;) thanks!<br />
Cheers!</p>
]]></content:encoded>
				</item>
	<item>
		<title>By: Wendi</title>
		<link>http://coremarkanalytics.com/blog/2007/07/05/determining-your-sample-size/#comment-49</link>
		<author>Wendi</author>
		<pubDate>Mon, 16 Jul 2007 13:46:02 +0000</pubDate>
		<guid>http://coremarkanalytics.com/blog/2007/07/05/determining-your-sample-size/#comment-49</guid>
					<description>Hi Michael,  Very good point.... but: 
Since Robbin is measuring CTR - Click Thru Rate where 
CTR =  Clicks / Page Views            (aka Impressions)
Then looking at Page Views makes more sense.  Actually in Google Analytics you can pull Unique Page Views which is what you should look at to be most accurate.  Robbin could look at the number of reloads and back those out as well.  

Thanks for the comment.  

Cheers!  
Wendi</description>
		<content:encoded><![CDATA[<p>Hi Michael,  Very good point&#8230;. but:<br />
Since Robbin is measuring CTR - Click Thru Rate where<br />
CTR =  Clicks / Page Views            (aka Impressions)<br />
Then looking at Page Views makes more sense.  Actually in Google Analytics you can pull Unique Page Views which is what you should look at to be most accurate.  Robbin could look at the number of reloads and back those out as well.  </p>
<p>Thanks for the comment.  </p>
<p>Cheers!<br />
Wendi</p>
]]></content:encoded>
				</item>
	<item>
		<title>By: chrisg</title>
		<link>http://coremarkanalytics.com/blog/2007/07/05/determining-your-sample-size/#comment-50</link>
		<author>chrisg</author>
		<pubDate>Mon, 16 Jul 2007 19:06:26 +0000</pubDate>
		<guid>http://coremarkanalytics.com/blog/2007/07/05/determining-your-sample-size/#comment-50</guid>
					<description>I think Michael's point is that the power calculation formula assumes independence of all of the observations, and two views of the same page by one person are not independent observations.   To get the best possible estimate of sample sizes needed, you'd have to use only one data point per visit.  Even though your statistic (CTR) is based on page views, your probability stats, which is what your power calculations are based on, have to use no more than one event (page view) per visit.  If it were my analysis, I would set it up to use only the first view of the page in a visit, both for the sample size formulas and the estimate of CTR itself.

Regards,  Chris</description>
		<content:encoded><![CDATA[<p>I think Michael&#8217;s point is that the power calculation formula assumes independence of all of the observations, and two views of the same page by one person are not independent observations.   To get the best possible estimate of sample sizes needed, you&#8217;d have to use only one data point per visit.  Even though your statistic (CTR) is based on page views, your probability stats, which is what your power calculations are based on, have to use no more than one event (page view) per visit.  If it were my analysis, I would set it up to use only the first view of the page in a visit, both for the sample size formulas and the estimate of CTR itself.</p>
<p>Regards,  Chris</p>
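<p>A minimal sketch of the first-view-per-visit approach described above, assuming the page view log is available as (visit_id, clicked) pairs in time order. That record layout, the sample data, and the function name first_view_ctr are all hypothetical and only serve to illustrate the idea.</p>
<pre>
```python
def first_view_ctr(page_views):
    """CTR that counts only the first view of the page in each visit.

    page_views: iterable of (visit_id, clicked) pairs in time order.
    """
    first_seen = {}
    for visit_id, clicked in page_views:
        # setdefault keeps the earliest record and ignores repeat views
        first_seen.setdefault(visit_id, clicked)
    if not first_seen:
        return 0.0
    return sum(first_seen.values()) / len(first_seen)


# Made-up log: visit "a" sees the page three times, "b" and "c" once each
views = [("a", False), ("a", True), ("b", False), ("a", False), ("c", True)]

raw_ctr = sum(clicked for _, clicked in views) / len(views)
print(raw_ctr, first_view_ctr(views))
```
</pre>
<p>In this made-up log the raw CTR uses five page views, while the first-view CTR uses three observations, one per visit, which is the count that would go toward the sample size.</p>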
]]></content:encoded>
				</item>
	<item>
		<title>By: Wendi</title>
		<link>http://coremarkanalytics.com/blog/2007/07/05/determining-your-sample-size/#comment-51</link>
		<author>Wendi</author>
		<pubDate>Mon, 16 Jul 2007 19:43:26 +0000</pubDate>
		<guid>http://coremarkanalytics.com/blog/2007/07/05/determining-your-sample-size/#comment-51</guid>
					<description>Hey Chris.  Absolutely.  I am following you and Mike.  That is why I included the usage of Unique Page Views that Google Analytics provides in my reply to Mike (and is what Robbin is using on their site).  I agree totally with both of you.  I'll post an update to make that point clear in the post.  

Thanks again!
Wendi</description>
		<content:encoded><![CDATA[<p>Hey Chris.  Absolutely.  I am following you and Mike.  That is why I included the usage of Unique Page Views that Google Analytics provides in my reply to Mike (and is what Robbin is using on their site).  I agree totally with both of you.  I&#8217;ll post an update to make that point clear in the post.  </p>
<p>Thanks again!<br />
Wendi</p>
]]></content:encoded>
				</item>
	<item>
		<title>By: Erik</title>
		<link>http://coremarkanalytics.com/blog/2007/07/05/determining-your-sample-size/#comment-279</link>
		<author>Erik</author>
		<pubDate>Tue, 28 Aug 2007 23:49:41 +0000</pubDate>
		<guid>http://coremarkanalytics.com/blog/2007/07/05/determining-your-sample-size/#comment-279</guid>
					<description>Thanks Wendi,  This has been a really helpful article as I get prepared for a testing program.  I have a question regarding the number of variants (or 'combinations' as Google calls them).

I read the above N as 1728 total views, or an average of 576 views for each of the three variants.  What happens if we are testing 6 different variants or even 9 (as in some multivariate testing)?  If your N is fixed at 1728, the required views per variant would decrease as variants are added.  At some point it seems that you would get too few views per variant to have a statistically significant test.

It seems to me the number of variants should be factored into the total sample size.  In reading about the Power formula, it seems it is based on a 2 sample test.  If that's the case should the above formula be multiplied by (r / 2), where 'r' is the number of variants?  Hope I'm on the right track with this, if not please let me know,
Thanks!</description>
		<content:encoded><![CDATA[<p>Thanks Wendi,  This has been a really helpful article as I get prepared for a testing program.  I have a question regarding the number of variants (or &#8216;combinations&#8217; as Google calls them).</p>
<p>I read the above N as 1728 total views, or an average of 576 views for each of the three variants.  What happens if we are testing 6 different variants or even 9 (as in some multivariate testing)?  If your N is fixed at 1728, the required views per variant would decrease as variants are added.  At some point it seems that you would get too few views per variant to have a statistically significant test.</p>
<p>It seems to me the number of variants should be factored into the total sample size.  In reading about the Power formula, it seems it is based on a 2 sample test.  If that&#8217;s the case should the above formula be multiplied by (r / 2), where &#8216;r&#8217; is the number of variants?  Hope I&#8217;m on the right track with this, if not please let me know,<br />
Thanks!</p>
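<p>The dilution described in this comment can be shown with simple arithmetic. The sketch below assumes traffic is split evenly across variants and keeps the total fixed at 1728. The function name views_per_variant is hypothetical. It only illustrates the concern and does not settle whether the total should scale with the number of variants.</p>
<pre>
```python
TOTAL_VIEWS = 1728  # total N quoted in the comment above


def views_per_variant(total, variants):
    """Average views each variant receives under an even split."""
    return total / variants


for r in (3, 6, 9):
    print(r, views_per_variant(TOTAL_VIEWS, r))
```
</pre>
<p>With three variants each one averages 576 views, matching the figure in the comment. With six or nine variants and the same total, each one gets half or a third of that.</p>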
]]></content:encoded>
				</item>
	<item>
		<title>By: Rachel</title>
		<link>http://coremarkanalytics.com/blog/2007/07/05/determining-your-sample-size/#comment-405</link>
		<author>Rachel</author>
		<pubDate>Wed, 26 Sep 2007 13:32:27 +0000</pubDate>
		<guid>http://coremarkanalytics.com/blog/2007/07/05/determining-your-sample-size/#comment-405</guid>
					<description>I am still a bit confused on the sample size.  Does she need 1700 visitors for each experiment or in total for all three?  Right now in total she is at 783 and I'm trying to figure out if she needs ~1000 more or 4300 more. Thanks.</description>
		<content:encoded><![CDATA[<p>I am still a bit confused on the sample size.  Does she need 1700 visitors for each experiment or in total for all three?  Right now in total she is at 783 and I&#8217;m trying to figure out if she needs ~1000 more or 4300 more. Thanks.</p>
]]></content:encoded>
				</item>
	<item>
		<title>By: seby kallarakkal</title>
		<link>http://coremarkanalytics.com/blog/2007/07/05/determining-your-sample-size/#comment-1529</link>
		<author>seby kallarakkal</author>
		<pubDate>Tue, 27 Nov 2007 02:49:41 +0000</pubDate>
		<guid>http://coremarkanalytics.com/blog/2007/07/05/determining-your-sample-size/#comment-1529</guid>
					<description>Hi Wendi,
Great post. Thanks for sharing. I came across your blog a few days back and have been catching up on the older posts. Your perspective on using stats for analytics is proving to be really useful.</description>
		<content:encoded><![CDATA[<p>Hi Wendi,<br />
Great post. Thanks for sharing. I came across your blog a few days back and have been catching up on the older posts. Your perspective on using stats for analytics is proving to be really useful.</p>
]]></content:encoded>
				</item>
</channel>
</rss>
