Voice Picking with Modern Technologies

Ottomatias Peura

Feb 05, 2021


Learn how voice picking and voice-directed warehousing with real-time Spoken Language Understanding improves efficiency and key metrics in your warehouse

Voice picking has been employed for decades. But only recently have technologies such as Speechly’s Spoken Language Understanding enabled intuitive and accurate real-time voice-user interfaces that maximize efficiency with minimal customization and development time.

What is voice picking?

Voice-directed warehousing (VDW), voice picking, pick by voice, voice-enabled warehouses, and speech-based picking all refer to the same thing. It’s a paperless, hands-free, and eyes-free computer system that employs voice commands for warehouse processes.

Warehouses have been frontrunners in using voice technology to improve workforce efficiency. Voice picking has been in use since at least the early 1990s, but the technology has recently matured enough to make competing technologies almost obsolete. The market is expected to grow significantly over the coming years as costs fall and accuracy improves.

Voice-directed warehouses typically use a multimodal voice user interface that can both direct the operator and take commands from the operator using voice. However, many warehouse operators use the voice-user interface only for data input from the operator to the system, and the information from the system to the operator is shown visually on their screen.

Voice-directed warehousing is well-suited for keeping an operator's hands and eyes free, allowing them to focus more on the task at hand.

It can be used in all kinds of storage environments; freezing and noisy environments are not a problem for Speechly’s voice technology. It works in warehouses with large and small numbers of SKUs alike, and can be adapted to any process.

How does voice picking work?

In a voice-directed warehouse (VDW), operators are equipped with a device, often a mobile phone, tablet, or a voice-dedicated terminal and headset. Typically, the headset is equipped with noise-canceling features for better performance in loud environments. In addition to voice and touch, the device can also support RFID and barcodes for increased efficiency in certain situations.

Modern voice picking employs speech recognition and natural language understanding technologies for improved accuracy and intuitiveness. Speechly takes a unique approach by combining the two into a single Spoken Language Understanding system that returns accurate results for voice commands in real time.

When an employee starts their shift, orders are imported from the host system — such as an ERP (Enterprise Resource Planning software) or a WMS (Warehouse Management System) — to the device and then processed. After processing and sequencing, the instructions for what item the operator should pick and where to find it are either spoken out loud (with a text-to-speech system) or shown on the screen.
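The processing and sequencing step can be sketched roughly as follows. This is a minimal illustration, not Speechly's or any WMS vendor's implementation: the `PickLine` fields, the "aisle-bay-level" location format, and the wording of the spoken instruction are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PickLine:
    """One line of an imported order (hypothetical shape)."""
    order_id: str
    sku: str
    location: str  # assumed format: "aisle-bay-level", e.g. "A-03-2"
    quantity: int


def location_key(location: str) -> tuple:
    """Split an assumed 'aisle-bay-level' code into a sortable key."""
    aisle, bay, level = location.split("-")
    return (aisle, int(bay), int(level))


def sequence_picks(lines: list) -> list:
    """Order pick lines so the operator walks the aisles in sequence."""
    return sorted(lines, key=lambda line: location_key(line.location))


def instruction_text(line: PickLine) -> str:
    """Text that could be sent to text-to-speech or shown on screen."""
    return f"Go to {line.location}. Pick {line.quantity} of {line.sku}."


if __name__ == "__main__":
    imported = [
        PickLine("o1", "SKU-9", "B-01-1", 2),
        PickLine("o1", "SKU-3", "A-10-1", 1),
        PickLine("o2", "SKU-5", "A-02-3", 4),
    ]
    for line in sequence_picks(imported):
        print(instruction_text(line))
```

A real system would likely sequence by measured walking distance rather than by a simple sort on the location code; the sort is only there to show where sequencing fits between import and instruction.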

When the operator is in the correct location, they confirm that they are picking the correct items by checking in to the location. After that, they confirm the products by speaking the product code or another identifier printed on the product. The operator also confirms the quantity they are about to pick. If a confirmation is incorrect or inaccurate, the voice application can correct the operator multimodally.
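The confirmation sequence described above (location check-in, then product code, then quantity) might be modeled like this. It is a sketch under stated assumptions: `PickTask`, the idea of a short check code posted at the location, and the correction prompts are hypothetical and not taken from any particular product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PickTask:
    """What the system expects for one pick (hypothetical shape)."""
    location_check: str  # assumed: short check code posted at the location
    product_code: str
    quantity: int


def confirm_pick(
    task: PickTask,
    spoken_check: str,
    spoken_code: str,
    spoken_quantity: int,
) -> Optional[str]:
    """Return None when the pick is confirmed, else a correction prompt.

    Checks run in the order the operator speaks them, so the first
    mismatch is the one that gets corrected.
    """
    if spoken_check != task.location_check:
        return "Wrong location. Please check in again."
    if spoken_code != task.product_code:
        return "Wrong product. Please read the product code again."
    if spoken_quantity != task.quantity:
        return f"Wrong quantity. Pick {task.quantity}."
    return None


if __name__ == "__main__":
    task = PickTask(location_check="47", product_code="4711", quantity=3)
    print(confirm_pick(task, "47", "4711", 3))  # confirmed
    print(confirm_pick(task, "47", "4711", 2))  # quantity correction
```

The returned prompt could be spoken back, shown on the device screen, or both, depending on how the interface is designed.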

Depending on the implementation, location information, RFID and other technologies can be used to optimize the route and maximize the efficiency of the operator.

Typical voice commands that operators use include product code strings, quantities, and locations. The operator can also slow down or speed up the voice user interface. A well-designed multimodal voice user interface is the key to the highest efficiency.
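To make these command types concrete, here is a small sketch that maps an already transcribed utterance to a structured command. It does not use or represent Speechly's API; the keywords ("code", "quantity", "slower", "faster") and the tuple output are assumptions chosen for illustration, and a production system would rely on a trained language understanding model rather than keyword matching.

```python
NUMBER_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}


def parse_digits(tokens: list) -> str:
    """Join spoken digits ("four seven") or numerals ("47") into one string."""
    digits = []
    for token in tokens:
        if token.isdigit():
            digits.append(token)
        elif token in NUMBER_WORDS:
            digits.append(NUMBER_WORDS[token])
        else:
            raise ValueError(f"not a digit: {token!r}")
    return "".join(digits)


def parse_command(transcript: str) -> tuple:
    """Map a transcript to (command, value); ("unknown", None) if unmatched."""
    tokens = transcript.lower().split()
    if not tokens:
        return ("unknown", None)
    head, rest = tokens[0], tokens[1:]
    if head in ("slower", "faster") and not rest:
        return ("speed", head)
    if head in ("code", "quantity") and rest:
        try:
            digits = parse_digits(rest)
        except ValueError:
            return ("unknown", None)
        # Product codes stay strings (leading zeros matter); quantities are ints.
        return (head, int(digits) if head == "quantity" else digits)
    return ("unknown", None)


if __name__ == "__main__":
    for utterance in ("code four seven one one", "quantity 12", "Slower"):
        print(utterance, "->", parse_command(utterance))
```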

Benefits of voice picking

Benefits of using voice in a warehouse setting include, but are not limited to:

Faster and more efficient picking: Picking is the most expensive and labor-intensive warehouse process. It can constitute more than half the cost of a typical distribution center. Voice-user interfaces increase the hourly pick rate.
More accurate reporting and better data quality: Anomalies, such as broken or missing items, can be reported in real time, resulting in better data quality and cost savings.
Safer warehouse environments: Safety is a priority in an efficient logistics facility. Hands-free and distraction-free operation of voice-user interfaces reduces injuries and accidents.
No need to print and distribute picking documents on paper: Because orders are imported directly from the ERP or WMS to the employees' mobile devices, operators are ready to start picking right after they start their shift.
Decreased training time for new employees: Unlike traditional barcode and RFID scanners and hard-to-use enterprise software, voice-user interfaces are intuitive and require less than a day of training time for new employees. This can be a great benefit in warehouses with many seasonal employees.
Improved efficiency due to operators being able to do two things at once: Voice-directed warehouses enable operators to spend up to 95% of their work time picking, rather than reporting and searching for documents.
Improved customer satisfaction due to fewer incorrect shipments: Incorrect shipping is costly and reduces customer satisfaction. With voice picking, mistakes are massively reduced.
Effectiveness in cold environments: Traditional user interfaces are hard to use in cold storage and in environments that require operators to wear gloves.
Happier employees: Simplified operations lead to happy, productive employees and decreased employee turnover.

Unlike some older voice systems currently employed in warehouses, voice user interfaces built with Speechly require no per-user training of the speech recognition models. The model is trained once, and it works for all existing and new employees.

All interactions between the system and the operator can be tracked. This lets management follow progress in real time and provides an audit trail for resolving anomalies.
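As a rough illustration of such tracking, the sketch below keeps an append-only list of interaction events and derives two simple views from it: confirmed picks per operator and the list of reported anomalies. The `AuditLog` class, the event kinds, and the field names are hypothetical; a real deployment would persist events in the WMS or a database rather than in memory.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class AuditLog:
    """In-memory, append-only record of operator interactions (illustrative)."""
    events: list = field(default_factory=list)

    def record(self, operator: str, kind: str, detail: str,
               at: Optional[datetime] = None) -> None:
        """Append one event; timestamps default to the current UTC time."""
        self.events.append({
            "at": at or datetime.now(timezone.utc),
            "operator": operator,
            "kind": kind,  # assumed kinds: "pick_confirmed", "anomaly"
            "detail": detail,
        })

    def picks_per_operator(self) -> dict:
        """Count confirmed picks per operator, for progress tracking."""
        counts: dict = {}
        for event in self.events:
            if event["kind"] == "pick_confirmed":
                counts[event["operator"]] = counts.get(event["operator"], 0) + 1
        return counts

    def anomalies(self) -> list:
        """Return anomaly events in the order they were reported."""
        return [e for e in self.events if e["kind"] == "anomaly"]


if __name__ == "__main__":
    log = AuditLog()
    log.record("op-1", "pick_confirmed", "SKU-5 x4 at A-02-3")
    log.record("op-1", "anomaly", "SKU-3 damaged at A-10-1")
    log.record("op-2", "pick_confirmed", "SKU-9 x2 at B-01-1")
    print(log.picks_per_operator())
    print([e["detail"] for e in log.anomalies()])
```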

Voice picking can be easily integrated into any WMS or ERP, with productivity increases of up to 40%. Because of easy implementation and major productivity increases, typical voice projects in warehouses have a relatively short payback period of about 6 to 12 months. The improved data quality also enables warehouse management to track and analyze progress and reallocate resources in almost real time.

The technology doesn’t have to be limited to just picking, though — voice can be used in most other warehouse processes, such as cross-picking, quality control, packing, sortation, replenishment, receiving, and put-away.

How to get started with Speechly in warehouses

Speechly’s Spoken Language Understanding technology offers industry-leading accuracy without the need for special hardware. Our technology works for all accents and can be adapted to all processes. Typically, a proof of concept (POC) that integrates with your current ERP or WMS and supports the most common warehouse processes can be built in less than a month.

Our pricing is competitive and is based on the amount of audio data sent to our API. Typical costs for using our API in a warehouse setting are a few thousand euros per month. Speechly works on all mobile devices and can be used in custom hardware, too.

If you’re interested in learning more about how voice technology can help your logistics workforce be more efficient and improve your business data quality, leave your email address and our industry specialist will contact you with more details.

About Speechly

Speechly is a YC-backed company building tools for speech recognition and natural language understanding. Speechly offers flexible deployment options (cloud, on-premise, and on-device), highly accurate custom models for any domain, and privacy and scalability for hundreds of thousands of hours of audio.
