Recently Amazon released a video that could be the best example of Silicon Valley's favourite word: disruption. The video shows Amazon's latest product, Amazon Go, a grocery store with no check-out lines. In the video, Amazon says it is using computer vision, deep learning and sensor fusion to create the platform: all the buzzwords in one place. As a postgraduate in Artificial Intelligence, I decided to figure out how such a system might be implemented, if it were to be. Below I have detailed my reasoning.
The shopping process is divided into three steps:
- Enter the shop
- Shop (pick up items)
- Checkout and leave
1. Enter the Shop:
In the video we see people placing their phones over the entry gates, which could imply that the gate senses an NFC signal and checks you in directly. We also see a QR code being scanned at the start. This is much like what we have at our university library, where the bar code on the ID card is scanned to mark you in. The QR code would also be the better choice if Amazon wants more customers, since only high-end phones have NFC. Scanning a QR code would link the user's Amazon account with the shop, just as WhatsApp uses a QR code to sync/sign in on a web browser. This lets Amazon check whether the person entering is a valid Amazon user, who they are, and what their customer record looks like. Also, for reasons we will see later, I believe there should be cameras at check-in to capture and store the customer's face for easy facial recognition later.
Use cases checked:
- Check if user is registered
- Check if user is allowed to avail the service
- Establish identity of customer
- Get good quality data for facial recognition/track user throughout the store
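To make the check-in step concrete, here is a minimal sketch of how the gate flow might work, assuming a QR payload of the form `amazon-go:<user_id>`. Every name, payload format and data structure below is my own invention for illustration, not anything Amazon has disclosed.

```python
# Hypothetical sketch of the gate check-in flow: decode the QR code shown by
# the shopper's app, verify the account, and store a face snapshot taken at
# the gate for later in-store tracking. All names here are assumptions.

REGISTERED_USERS = {"user-42": {"name": "Alice", "in_good_standing": True}}
FACE_GALLERY = {}  # user_id -> face embedding captured at the gate

def decode_qr(qr_payload: str) -> str:
    # In practice a QR library would decode the camera frame; here we assume
    # the decoded payload is simply "amazon-go:<user_id>".
    prefix = "amazon-go:"
    if not qr_payload.startswith(prefix):
        raise ValueError("not an Amazon Go check-in code")
    return qr_payload[len(prefix):]

def check_in(qr_payload: str, face_embedding: list) -> bool:
    user_id = decode_qr(qr_payload)
    user = REGISTERED_USERS.get(user_id)
    if user is None or not user["in_good_standing"]:
        return False  # gate stays closed
    FACE_GALLERY[user_id] = face_embedding  # enrol face for in-store tracking
    return True

print(check_in("amazon-go:user-42", [0.1, 0.5, 0.9]))  # True: gate opens
```

The key design point is that check-in does two jobs at once: it authenticates the account and it enrols a fresh, well-lit face capture that later tracking can match against.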
2. Shop:
This is one of the trickiest parts, in my opinion. As Amazon says in the video, they have used "Computer Vision, Deep Learning and Sensor Fusion" technologies. On careful examination of the video I also found something peculiar (Figure 1).
There are monitoring sensors of some kind placed in each section of the shop. They do not look like normal cameras, and they appear too big to be Bluetooth beacons. My guess is that this is a custom fusion device that tracks customers' proximity using Bluetooth signals and then filters the result with deep learning (state-of-the-art convolutional neural networks, CNNs). My recent experience at a start-up, building hyper-localisation technology from WiFi and Bluetooth beacon data using machine learning, leads me to believe that Bluetooth beacons plus facial recognition would be far more accurate and easier for Amazon to do. This is where the facial features recorded at entry come in handy: they let us match the correct user to the correct product.

One question does arise if we rely on image processing: what happens when the store is too crowded? The answer is still image processing; thanks to CNNs we can separate the different faces in a frame and identify each one, so ambiguity is reduced. If ambiguity remains, an alert system could perhaps block the entry gate and ask the user to enter again. None of this is new: Amazon has already implemented face detection and recognition on its Amazon Rekognition platform, available on AWS.

With facial recognition as the source of user identification, Amazon need not worry much about a phone being switched off during shopping. Once the user has been identified, all purchases are added directly to their cart (sitting on Amazon's servers) and the phone plays virtually no role. Thus, with efficient user tracking and a well-catalogued entry, every item a user picks up can be added straight to their cart. The act of picking up an item could be detected in two ways: counting items using computer vision techniques, or placing sensors on the shelf that detect when the rack's contents change. This is a fairly easy task, and other techniques might be used too.
Use Cases solved:
- Keep track of user
- Modify (add or remove items in) the shopping cart of the correct user
- Charge the correct user for the correct items
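The shelf-sensor idea above can be sketched in a few lines: a sensor event says how many units left a shelf, and the event is attributed to the shopper whose enrolled face embedding best matches the face seen near that shelf. The embeddings, similarity measure and data structures below are toy assumptions, not Amazon's actual pipeline.

```python
# Hypothetical sketch of cart updates from shelf sensors. A shelf event
# signals a pick-up (or a put-back), and the event is attributed to the
# shopper whose gate-enrolled face embedding best matches the face observed
# near the shelf. All names and embeddings are invented examples.

import math

CARTS = {}  # user_id -> {item: count}

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def nearest_user(observed, gallery):
    # Match the face seen at the shelf against faces enrolled at the gate.
    return max(gallery, key=lambda uid: cosine_similarity(gallery[uid], observed))

def on_shelf_event(item, units_taken, observed_face, gallery):
    # units_taken > 0 means picked up; < 0 means put back on the shelf.
    user_id = nearest_user(observed_face, gallery)
    cart = CARTS.setdefault(user_id, {})
    cart[item] = max(0, cart.get(item, 0) + units_taken)

# Faces enrolled at check-in (toy 3-number "embeddings" for illustration).
gallery = {"user-42": [0.1, 0.5, 0.9], "user-7": [0.9, 0.1, 0.0]}
on_shelf_event("granola-bar", 2, [0.12, 0.48, 0.91], gallery)
print(CARTS)  # {'user-42': {'granola-bar': 2}}
```

Putting an item back simply fires the same event with a negative count, which keeps the cart consistent without any extra machinery.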
3. Checkout and Leave
Once users finish shopping, they can simply leave the store; the cameras identify who is leaving and check whether the customer has sufficient funds in their account to cover the purchase. If not, the problem could be tackled in two ways, depending on Amazon's customer policy. The user could be sent an alert to pay by a set date and be barred from entering the store until then (hence the authentication in the first step). Alternatively, there could be kiosks where customers who are not Amazon users pay by card for their purchases. The second option seems unlikely, though, as it does not match Amazon's idea of "Just Walk Out".
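The first policy above (charge on exit, and block re-entry until an unpaid balance is settled) can be sketched as follows. The prices, carts and balances are invented examples, and real systems would use proper currency types rather than floats.

```python
# Hypothetical sketch of the "just walk out" checkout: when the exit camera
# recognises a shopper leaving, total up their cart and charge the account.
# If the charge fails, flag the account so the entry gate can block re-entry
# until the balance is settled. All prices and balances are invented.

PRICES = {"granola-bar": 1.50, "coffee": 3.25}

def checkout(user_id, cart, balances, blocked):
    total = sum(PRICES[item] * count for item, count in cart.items())
    if balances.get(user_id, 0.0) >= total:
        balances[user_id] -= total
        return ("charged", round(total, 2))
    # Insufficient funds: alert the user to pay later and block re-entry.
    blocked.add(user_id)
    return ("payment_pending", round(total, 2))

balances = {"user-42": 10.00}
blocked = set()
print(checkout("user-42", {"granola-bar": 2, "coffee": 1}, balances, blocked))
# ('charged', 6.25)
```

Note how the `blocked` set closes the loop back to step 1: the gate check at entry is exactly where an unpaid balance would be enforced.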
The whole process can be summarised, to an extent, by this image from Amazon's web page for a sentiment-analysis application in a retail store (Figure 2):
Amazon Go seems to be one application of this technology, where the application is the in-store shopping and check-out process.
In conclusion, the stores are planned to open in early 2017, and it would be great to see one in action: how checks for theft have been put in place, and how people react to it. It takes away jobs like the cashier's, but it also provides others, from the chefs preparing delicious food in the back to the many IT engineers creating, providing and managing the platform.