In the past when Mike and I would set out to redesign Detailed Image we would make our design, user interface, and user experience decisions based upon a combination of our internal historical data and the industry “best practices” at the time. This strategy had served us well. We’d research what we considered to be the best-in-class e-commerce experiences and study what they did best. We’d read up on all of the latest studies and experiments published on GetElastic, Smashing Magazine, ConversionXL, and other great blogs. Then we’d combine all of that information with what we already knew about our market and our users to create the end product.
When it came to this most recent redesign back in April, things were a little different. Responsive design was (and still is) so fresh, especially in e-commerce, that a lot of the “best practices” hadn’t been determined yet. We ended up looking at a lot more non-retail sites, and we ended up going with our gut more than I think we would have liked to.
Nevertheless, the results have been fantastic. Since the launch in late April, we’ve been up in almost every important metric: revenue, average order value, mobile and tablet revenue, and more. The new shopping experience is so much more frictionless for mobile buyers that you’d be shocked if the mobile numbers weren’t up. However, that doesn’t necessarily mean that we nailed the responsive design. For all we know, we could still be far from the optimal e-commerce shopping experience.
So, we decided we needed to put more of an emphasis on testing. In an email conversation with Linda from GetElastic, she proposed the idea of testing whether our mobile navigation would be better off with text instead of the icons that we had chosen. Icons vs. text is one of the many mobile responsive design elements that is still up for debate. Given how critical the mobile navigation is, we decided to jump this test to the top of our list.
Below are images of the Control (Icons) and the Test (Text) variations that we tested:
In addition to changing the icons to text, we made one other change. With the “DI” gone, we added the full “Detailed Image” into the test version. To conserve space, we removed the Ship & Save promo text.
Each mobile user was randomly served either the control or the test version of the navigation the first time they visited the site. A cookie was placed in their browser so that the nav would remain consistent throughout their current session and also throughout all future sessions within that same browser (so long as the cookie wasn’t deleted of course).
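The assignment logic described above can be sketched in a few lines. This is a minimal, hypothetical Python sketch (the function and cookie names are ours, not the site’s actual code): reuse the cookie if the browser sent one, otherwise flip a fair coin and ask the caller to set it.

```python
import random

CONTROL, TEST = "icons", "text"
COOKIE_NAME = "nav_experiment"  # hypothetical cookie name

def assign_variation(cookie_value=None):
    """Pick a nav variation for the current visitor.

    If the browser already sent our experiment cookie, reuse that value
    so the nav stays consistent throughout this session and all future
    sessions in the same browser. Otherwise flip a fair coin and tell
    the caller to set the cookie.
    Returns (variation, should_set_cookie).
    """
    if cookie_value in (CONTROL, TEST):
        return cookie_value, False
    return random.choice([CONTROL, TEST]), True
```

A repeat visitor whose cookie says `"icons"` keeps seeing the icon nav; only first-time visitors (or visitors who cleared cookies) get randomized.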
There were a few additional steps to setting up this particular experiment because of all that we wanted to track. We weren’t able to take advantage of Analytics to tell us when to stop the experiment because there is no way to configure the experiment to only count mobile visits. When we reviewed the data, we had to use a mobile phone segment to filter the results, and then only look at the raw data because the additional calculations Analytics displayed were for all users.
We also wanted to track interactions with the navigation, in addition to the transactions that the experiment tracked. To accomplish that, we set up events to track every single click on the nav.
We then created custom segments for each variation so that we could compare variations for anything that Analytics tracks.
To analyze the results we used the ABBA split-test calculator and the AB Tester calculator. There’s no shortage of good calculators like this around. I prefer these two because they both have separate, statistically sound methodologies, they do a great job of explaining the math, and their code is publicly available.
We had over 33,000 mobile phone visits for each variation. Here are the results:
We analyzed both the unique events (whether or not a visitor clicked on the nav) and the total events (the number of times the nav was clicked).
[Table: Visits, Total Events, Unique Events, Events per Unique, and Unique Event % for each variation]
| Variation | Successful Event Range | p-value | Improvement Range | Confidence |
| --- | --- | --- | --- | --- |
| Control (Icons) | 3.9% – 4.3% | | | |
| Test (Text) | 4.9% – 5.4% | <0.0001 | 20% – 35% | 100% |
| Variation | Successful Event Range | p-value | Improvement Range | Confidence |
| --- | --- | --- | --- | --- |
| Control (Icons) | 12% – 13% | | | |
| Test (Text) | 15% – 15% | <0.0001 | 15% – 24% | 100% |
In both cases the Text variation was the clear winner. With p-values less than 0.0001 we can be confident that the results are statistically significant (a p-value of less than 0.05 is required for the standard 95% confidence). The Improvement Range shows that the difference is pretty substantial either way you look at it. Users definitely interacted more with the navigation when it was text-based as opposed to icon-based.
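If you want to sanity-check a result like this yourself, a two-proportion z-test captures the shape of the calculation. This is a stdlib-only Python sketch; the click counts below are illustrative round numbers consistent with the ranges in the tables above, not the raw data, and the two calculators we linked each use their own (different) methodologies.

```python
import math

def two_proportion_pvalue(x1, n1, x2, n2):
    """Two-sided two-proportion z-test using a pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # combined click rate across variations
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # convert |z| to a two-sided p-value via the normal distribution
    return math.erfc(abs(z) / math.sqrt(2))

# Illustrative counts only -- the post reports ranges, not raw clicks:
# ~4.1% of 33,094 Control visits vs ~5.1% of 34,066 Test visits.
p = two_proportion_pvalue(1357, 33094, 1754, 34066)
print(f"p-value: {p:.2e}")  # well below the 0.05 threshold
```

With samples this large, even a one-point difference in click rate produces a vanishingly small p-value, which is why the tables above report p < 0.0001.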
After seeing those results, I had a thought: did the Icons group search more? It would stand to reason that if they weren’t using the nav to find what they were looking for, they could be searching for it instead. I hadn’t planned to look at this originally, but with the custom segments already created in Google Analytics, it was only a few clicks away.
| Variation | Visits | Visits w/ Search | Search % | Search Range | p-value | Improvement Range | Confidence |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Control (Icons) | 33,094 | 2,299 | 6.95% | 6.7% – 7.2% | | | |
| Test (Text) | 34,066 | 2,107 | 6.19% | 5.9% – 6.4% | <0.0001 | -16% – -5.6% | 0% |
As I suspected, users searched more when presented with the Icons. The results were again statistically significant.
Lastly, and maybe most importantly from a business standpoint, we looked at the transactions for each variation. (It’s worth noting that this data included visitors to our detailing guide and Ask-a-Pro Detailer blog as well as the store, because they are all shown the same main nav. Those sections generally bring higher traffic but lower conversions, which is one of the reasons our conversion rates look so low, even for mobile. Oftentimes when we want to home in on the performance of just the e-commerce experience, we factor those users out.)
| Variation | Visits | Transactions | % Improvement | AOV % Improvement | Conversion Range | p-value | Improvement Range | Confidence | 95% Confidence Sample Size |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Control (Icons) | 33,093 | 86 | | | 0.21% – 0.32% | | | | 589,744 |
| Test (Text) | 34,064 | 101 | 17.44% | 9.44% | 0.24% – 0.36% | 0.37 | -17% – 44% | 81.63% | 516,702 |
Unfortunately these results weren’t as cut and dried. The Text variation had more transactions, but the difference wasn’t statistically significant, with a p-value of 0.37 (63% confidence). AB Tester calculates confidence differently and puts it at 82%. Either way we’re short of 95%, and AB Tester estimates we’d need roughly 15x the number of visits to reach that 95% confidence. Looking at the Improvement Range, you can see that a negative improvement is even possible. A 9.44% improvement in AOV (average order value) is also nice, but we’re still lacking anything definitive.
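To get a feel for why so many visits are needed, here is the standard normal-approximation sample-size formula as a Python sketch. This version assumes 80% power, so it lands in the hundreds of thousands of visits per variation, the same order of magnitude as the table above, but the calculators use their own methodologies and power assumptions, so don’t expect the numbers to match exactly.

```python
import math

def sample_size_per_variation(p_base, rel_lift, power_z=0.84, alpha_z=1.96):
    """Rough per-variation sample size (normal approximation).

    alpha_z=1.96 corresponds to two-sided 95% confidence; power_z=0.84
    corresponds to 80% power. Tiny base conversion rates and modest
    lifts blow the required sample size up dramatically.
    """
    p_test = p_base * (1 + rel_lift)
    p_bar = (p_base + p_test) / 2
    num = (alpha_z * math.sqrt(2 * p_bar * (1 - p_bar))
           + power_z * math.sqrt(p_base * (1 - p_base)
                                 + p_test * (1 - p_test))) ** 2
    return math.ceil(num / (p_base - p_test) ** 2)

# Observed: ~0.26% base conversion rate, ~17.44% relative lift
n = sample_size_per_variation(0.0026, 0.1744)
print(f"~{n:,} visits per variation needed")
```

The click-rate test above reached significance quickly because the base rate (~4%) and the lift (~25%) were both much larger than what we see in conversions; shrink either and the required traffic explodes.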
And The Winner Is…Text!
We chose the Text variation over the Icons. If you visit Detailed Image you’ll see it in action, just in time for the holiday shopping season.
The decision wasn’t as clear-cut as we had hoped it would be. In a vacuum, learning that users interacted with the text-based menu significantly more than the icon-based menu is very valuable information. We’ll surely use this to guide future design decisions.
However, declaring the Text variation the winner became murkier once we looked at the searches. If users are clicking the menus more now and using the search box less, that doesn’t necessarily mean we’ve created a better user experience; after all, we think our autosuggest is a great search experience. You’d hope the transactional data would have made the final call, but unfortunately it did not. It leans towards the Text variation in both transactions and average order value, but leaning towards and being statistically significant are two different things.
In the end it came down to five main reasons:
- The statistically significant increase in clicks on the navigation for the Text variation.
- Despite not being statistically significant, the transactions and AOV favor the Text variation.
- Our assumption that mobile users would rather touch their screen than type. So even though Icon users searched the site to make up for not using the menus, it’s reasonable to think that they would have rather been clicking through menus instead of typing into a search box.
- Choosing text over icons passes the “common sense check”. It makes sense that users aren’t always 100% sure what each icon means, whereas there’s little to no uncertainty about what the words Home, Menu, and Account mean. I know when Gmail switched their button labels from text to icons everyone in our company struggled…thankfully they then added a setting to switch the labels back to text, which we all did right away.
- The added branding of having “Detailed Image” on every single page. In hindsight we should never have launched without having our store name on every single page. Seemingly this is common sense, but I think we overlooked it because we were focused on saving as much space as possible.
We’re excited to have this improvement in place for the holiday shopping season and looking forward to running more of these types of experiments in 2014!
Update 10/28/2013: Linda at GetElastic wrote up a post summarizing our results!
Update 12/13/2014: ConversionXL did a thorough writeup, including their own test, and all the results point to outcomes similar to ours.