voice tech

How Decreasing Time-to-Cart Can Help Retailers Win More Customers in Grocery E-Commerce

Ottomatias Peura

Jan 12, 2020

6 min read

Decreasing time-to-cart in grocery e-commerce can have a huge impact on the business, because creating the first cart is slow and cumbersome. With speech recognition, time-to-cart can be decreased by up to 90%.

Building and running a successful web store or e-commerce business is a never-ending optimization process. Every step of the purchase funnel must be as slick as possible to maximize sales and revenue. That’s why big players such as Amazon or Shopify are running hundreds of tests every day.

The optimization starts with advertising and customer acquisition, moves on to how the website is laid out, and ends with how orders are paid for and delivered. The goal is to make everything a little better so that the bottom line shows a larger figure at the end of the year. This is basically how a modern e-commerce business is run.

And more often than not, the improvements are very small.

For example, the speed of the website has a relatively big effect on the conversion rate: the faster the pages load, the more willing a user is to keep clicking and possibly purchase more. If the pages are slow, the user might go to another store to browse for similar products or just give up. Google says that every second of mobile page load time can improve (or decrease) conversions by a whopping 20%.

But shaving a full second off the page load time is very hard, and in real life the improvement would probably be closer to 0.1 seconds. That would increase conversions by 2%. That’s of course great if you have up to six million purchases every day like Amazon, but it’s still only 2%. But what if you could improve some key metric in your web store by more than 100%?

This can be the case when adding voice recognition to e-commerce. Let’s first look at why time-to-cart is an important metric in e-commerce, especially in the grocery vertical.

Grocery e-commerce user experience is currently not very good

Despite all the improvements made to e-commerce platforms in recent years, the user experience still consists of searching for products, comparing them and adding some of them to your shopping cart. Once the cart is complete, it is checked out, paid for and eventually delivered to the customer.

If you are looking for a single high-value product, say an iPhone X, that works well. The user goes through different e-commerce sites, looks for a good price and then orders from the one they think has the best combination of price and delivery time. There are even aggregator sites that do this on your behalf: a search from a single page compares prices from dozens or hundreds of sites and gives you the options.

But what if you are not buying a single iPhone but rather dozens of low-value products? Searching and adding can quickly become tedious. This is the case with groceries. While there are easily a hundred different milks in a supermarket and a dozen retailers that deliver groceries, the average user is not interested in comparing them all. They just want their usual milk. And after they’ve found their milk, they don’t want to repeat the process for bread, coffee and minced meat, too. Yet that’s exactly what they are forced to do in an average grocery e-commerce store.

Building a grocery cart in a web store easily takes 30 minutes or more when doing it for the first time. Not exactly a huge time saver compared to visiting a brick-and-mortar store. The good part is that the second time is a lot easier, because the user can take their old cart as the starting point and only edit it according to their current needs. They probably need the same milks and cheeses this time, too, but maybe they’ll want to switch last week’s chicken to beef or add some mascarpone for the weekend’s tiramisu. The first-time user experience is still very bad.

The cumbersomeness of building the initial cart is well known to retailers, and its effect is clearly seen in the data. For example, McKinsey has found that once customers have placed their first order, they are highly likely to come back unless the user experience was particularly bad. Even if the experience was not great, the added ease of having your old shopping cart available outweighs the difference.

This is to say that the competition in grocery e-commerce happens during the first purchase. If that is hard, the customer will probably not even complete the order, at least if they have already purchased from somewhere else before. On the other hand, if the first purchase succeeds and the experience is good, it’s relatively easy to keep the customer and even win customers from someone else.

Voice for the win

Previously I noted that it’s usually very hard to improve any metric by a significant amount. But based on our data, time-to-cart can be improved by up to 90% by adding the possibility to use voice in adding products to the cart.
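To put that figure in perspective, here is a rough back-of-the-envelope sketch. It assumes the 30-minute first-cart estimate mentioned earlier and takes the 90% upper bound at face value; the function name is made up for illustration, and real results will vary by store and user.

```python
def time_to_cart_after(baseline_minutes: float, reduction: float) -> float:
    """Time left after cutting the baseline by the given fraction (0..1)."""
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in [0, 1)")
    return baseline_minutes * (1 - reduction)


baseline = 30.0   # first-cart estimate quoted earlier in this post
reduction = 0.90  # upper bound of the "up to 90%" claim (best case)

remaining = time_to_cart_after(baseline, reduction)
speedup = baseline / remaining
print(f"{remaining:.0f} min, about {speedup:.0f}x faster")
# prints "3 min, about 10x faster"
```

In other words, a 90% cut in time is a tenfold speed-up, which is a very different league from a 2% conversion gain.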

In our applications, users say a product they need, such as “5 tomatoes”, and 5 pieces of the most probable tomato product are added to the cart. If the customer wants to change this to another tomato, say an organic one, they can click on the product to get a list of alternatives, or simply say so out loud when adding the product.
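The idea can be sketched in a few lines of Python. This is not the Speechly API: it works on text that has already been transcribed, and the catalog, the popularity scores and all function names are assumptions invented for this illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Product:
    name: str
    category: str
    popularity: float  # hypothetical score, e.g. share of past orders


# Hypothetical catalog; a real store would have thousands of entries.
CATALOG = [
    Product("Basic tomato", "tomato", 0.70),
    Product("Organic tomato", "tomato", 0.20),
    Product("Cherry tomato", "tomato", 0.10),
    Product("Whole milk 1 l", "milk", 0.60),
]


def parse_utterance(text: str) -> tuple[int, str]:
    """Split '5 tomatoes' into (5, 'tomato'); quantity defaults to 1."""
    match = re.fullmatch(r"\s*(\d+)?\s*(.+?)\s*", text.lower())
    if match is None:
        raise ValueError("empty utterance")
    quantity = int(match.group(1)) if match.group(1) else 1
    noun = match.group(2)
    # Naive singularization, good enough for this illustration only.
    if noun.endswith("oes"):
        noun = noun[:-2]
    elif noun.endswith("s"):
        noun = noun[:-1]
    return quantity, noun


def add_to_cart(cart: dict[str, int], text: str) -> str:
    """Add the most popular product in the spoken category to the cart."""
    quantity, noun = parse_utterance(text)
    candidates = [p for p in CATALOG if p.category == noun]
    if not candidates:
        raise LookupError(f"no product matches {noun!r}")
    best = max(candidates, key=lambda p: p.popularity)
    cart[best.name] = cart.get(best.name, 0) + quantity
    return best.name


cart: dict[str, int] = {}
add_to_cart(cart, "5 tomatoes")
print(cart)  # {'Basic tomato': 5}
```

The remaining entries in `candidates` are what a user would see when clicking the product to pick an alternative such as the organic one.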

This is analogous to how you would purchase groceries in a village shop. If you say "I'd need tomatoes", the clerk is not going to present you with all the different organic cherry tomatoes or ketchups they have available but rather just add five of the most common, basic tomatoes to your cart. If you then say "actually, would you have organic tomatoes" or "I meant tomato sauces", they are happy to substitute them for you.

About 90% of our end users are generally satisfied with the voice-based shopping cart experience. One of the applications has an NPS of about 40.

The experiences are built with the Speechly API, which is trained to extract product information from the user’s speech in real time. For example, when a user starts saying something like “apples”, the platform already guesses from the first ‘a’ that the user might mean apples, anchovies or artichoke and not oranges, flour or beer.
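As a simplified illustration of this narrowing, the snippet below filters a small product list by the text heard so far. It is only a sketch: a real system works on audio with probabilistic models rather than exact text prefixes, and the product list and function name are made up for the example.

```python
from typing import Sequence

# Hypothetical product names, taken from the example above.
PRODUCTS = ("apples", "anchovies", "artichoke", "oranges", "flour", "beer")


def candidates_for(partial: str, products: Sequence[str] = PRODUCTS) -> list[str]:
    """Return the products that are still possible given the text heard so far."""
    prefix = partial.strip().lower()
    return [p for p in products if p.startswith(prefix)]


# Simulate the transcript growing while the user is still speaking.
for heard in ["a", "ap", "apples"]:
    print(heard, "->", candidates_for(heard))
# a -> ['apples', 'anchovies', 'artichoke']
# ap -> ['apples']
# apples -> ['apples']
```

The candidate list shrinks with every new sound, so the interface can start showing likely products before the word is finished.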

Speechly real-time spoken language understanding in e-Commerce

This is different from most solutions currently on the market, which wait until the whole word has been uttered and only then start looking for that word in the product database.

The Speechly experience is analogous to human conversation. When we talk with each other, we don’t wait for the other person to finish speaking and only then start processing what they said. Rather, we process the information as it arrives and keep nodding and saying things like “a-ha”, “yeah” and “okay” to signal that we are following and understand what is being said. If we don’t understand, we might interrupt and ask for clarification.

The currently available voice solutions don’t work like this. They wait silently until the user has said whatever they want to say, transcribe that into text and then send the text to another system that “understands” the text and does what is required.

Because Speechly combines speech recognition (turning voice into text) and natural language understanding (extracting meaning from what was said), the experience is a lot faster and more natural than with smart speakers. Speechly also supports multimodality, which means the user gets instant visual feedback on what was done.


E-commerce is an optimization battle in which each step of the purchase funnel should be as optimized as possible. At the scale at which many e-commerce enterprises operate, an improvement of a few percent can be worth millions.

One of the bottlenecks with current solutions is the creation of the shopping cart. Especially with groceries, where products are of low value and many of them need to be added to the cart, it’s important to make the process as easy and fast as possible. One of the best ways to achieve this is by adding voice capabilities.

Speechly is building a developer tool that enables real-time spoken language understanding. By using Speechly, retailers can add voice user interfaces to their web and mobile apps easily and create a fast, convenient shopping experience. This decreases time-to-cart by up to 90% and can improve user satisfaction and revenue.

If you are interested in adding voice to your e-commerce site, please contact our sales team. If you want to read more about our solutions for the grocery vertical, see our solutions page.

About Speechly

Speechly is a YC-backed company building tools for speech recognition and natural language understanding. Speechly offers flexible deployment options (cloud, on-premise, and on-device), highly accurate custom models for any domain, privacy, and scalability to hundreds of thousands of hours of audio.
