Democratizing data for better shopping experiences
Dutch retailer Wehkamp offers its shoppers a wide range of quality products, carrying the latest fashion trends, home goods, electronics, and everything in between. As a leading fashion e-commerce company in the Netherlands, Wehkamp is dedicated to providing a better shopping experience for its customers, and continually looks for ways not only to engage shoppers on its site, but also to create opportunities for its brand partners to clearly demonstrate their value.
Their main marketing focus is relevance: ensuring shoppers can find what they need in the most efficient way, which puts them in a purchasing frame of mind when they visit Wehkamp's website. Using Spark, the data science team develops machine-learning projects for this purpose on top of large-scale product and customer data. A major topic for the team is ranking products: if a visitor enters a search phrase, which products best fit that phrase, and in what order should they be shown? Ranking also matters when a visitor lands on a product overview page, where hundreds or even thousands of products of a certain article type are displayed.
For instance, a search for 'jeans' on the Wehkamp website returns over 4,400 products; navigating to the 'ladies jeans' overview page narrows the result set to 2,176 products. The goal is therefore to order the returned products by relevance to the user's query.
System Design
- Data collection
- Click model - for relevancy scores
- Feature generation - for explaining relevancy
- Ranking model - for estimating feature weights
- Serve model (Elasticsearch LTR) - for productionising
- Evaluation (Tableau)
Tech stack
Data collection
Raw Google Analytics feed (daily) → Google BigQuery → S3 bucket → Spark
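A minimal PySpark sketch of this ingestion step is shown below. The bucket path, partition layout, JSON format, and column names are illustrative assumptions; the real feed is the daily Google Analytics export delivered via BigQuery.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("ga-ingest").getOrCreate()

# Daily Google Analytics export, pushed from BigQuery into S3 as
# newline-delimited JSON (bucket and layout are assumptions).
sessions = spark.read.json("s3://example-bucket/ga_sessions/dt=2020-01-01/")

# Keep only the fields the click model needs; these column names are
# hypothetical stand-ins for the real GA schema.
events = sessions.select(
    F.col("searchQuery").alias("query"),
    F.col("productId").alias("product_id"),
    F.col("position").cast("int").alias("position"),
    F.col("clicked").cast("int").alias("clicked"),
)
```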
Click model
Objective: predict the relevance of products from their impressions and clicks, given the position at which they were shown. Two candidate models were evaluated: a DBN (dynamic Bayesian network) click model and COEC (clicks over expected clicks). COEC gave better results and was easier to train and explain.
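The intuition behind COEC: a product's score is its actual click count divided by the clicks you would expect given the positions it was shown at, where the expectation uses the global click-through rate of each position. A hedged PySpark sketch, reusing the `events` DataFrame and the illustrative column names from the ingestion step:

```python
from pyspark.sql import functions as F

# 1. Global click-through rate per display position, estimated over all
#    impressions in the log.
pos_ctr = events.groupBy("position").agg(F.avg("clicked").alias("ctr_at_position"))

# 2. Per query-product pair: actual clicks divided by the clicks expected
#    from the positions it was shown at. COEC > 1 means the product drew
#    more clicks than its positions alone would explain.
coec = (
    events.join(pos_ctr, "position")
    .groupBy("query", "product_id")
    .agg(
        F.sum("clicked").alias("clicks"),
        F.sum("ctr_at_position").alias("expected_clicks"),
        F.count(F.lit(1)).alias("impressions"),
    )
    .withColumn("coec", F.col("clicks") / F.col("expected_clicks"))
)
```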
Feature generation
Notebook jobs were used to process the raw data and generate features for each query-product pair.
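The source does not list the concrete features, so the ones below are assumptions chosen for illustration; the pattern shown is the real point: joining product attributes onto each query-product pair and writing the result out for training. `coec` is the DataFrame from the click-model sketch and `products` is a hypothetical product-attribute table.

```python
from pyspark.sql import functions as F

# Hypothetical product-attribute table (price, release date, etc.).
products = spark.read.parquet("s3://example-bucket/products/")

features = (
    coec.join(products, "product_id")
    .select(
        "query",
        "product_id",
        F.col("coec").alias("label"),  # relevance target from the click model
        F.log1p("impressions").alias("log_impressions"),
        "price",
        "days_since_release",
    )
)
features.write.mode("overwrite").parquet("s3://example-bucket/ltr_features/")
```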
Ranking model
An XGBoost model was trained on these features. Hyperopt and MLflow were used for hyperparameter optimization and experiment tracking, respectively. SHAP was used to identify and explain feature importances.
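A condensed sketch of how these pieces fit together: Hyperopt searches the space, each trial logs to MLflow, and SHAP explains the final model. Regressing on the COEC score is one plausible setup (an XGBoost ranking objective is another); the search space, feature columns, and file path carry over from the illustrative sketches above.

```python
import mlflow
import pandas as pd
import shap
import xgboost as xgb
from hyperopt import STATUS_OK, fmin, hp, tpe
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Feature table produced by the notebook jobs (illustrative path/columns).
df = pd.read_parquet("ltr_features.parquet")
X = df[["log_impressions", "price", "days_since_release"]]
y = df["label"]  # COEC relevance score
X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.2)

def objective(params):
    # One Hyperopt trial: train, evaluate, and log everything to MLflow.
    with mlflow.start_run(nested=True):
        mlflow.log_params(params)
        model = xgb.XGBRegressor(
            n_estimators=200,
            max_depth=int(params["max_depth"]),
            learning_rate=params["learning_rate"],
        )
        model.fit(X_train, y_train)
        rmse = mean_squared_error(y_valid, model.predict(X_valid)) ** 0.5
        mlflow.log_metric("rmse", rmse)
        return {"loss": rmse, "status": STATUS_OK}

space = {
    "max_depth": hp.quniform("max_depth", 3, 10, 1),
    "learning_rate": hp.loguniform("learning_rate", -5, -1),
}
best = fmin(objective, space, algo=tpe.suggest, max_evals=50)

# Retrain with the best parameters, then use SHAP to see which features
# drive the predicted relevance.
final_model = xgb.XGBRegressor(
    n_estimators=200,
    max_depth=int(best["max_depth"]),
    learning_rate=best["learning_rate"],
)
final_model.fit(X_train, y_train)
shap_values = shap.TreeExplainer(final_model).shap_values(X_valid)
shap.summary_plot(shap_values, X_valid)
```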
Serve model
The trained model was saved in an Elasticsearch index and served through the Elasticsearch Learning to Rank (LTR) plugin, which scores documents with the stored model at query time.
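A hedged sketch of what publishing and querying could look like with the Elasticsearch LTR plugin. The index, featureset, and model names are made up for illustration, and the endpoints shown follow the plugin's documented REST API; the source only states that the model was stored in an Elastic index.

```python
import requests

ES = "http://localhost:9200"

# Serialize the XGBoost booster as the JSON tree dump the LTR plugin
# accepts ("model/xgboost+json"); `final_model` is from the training sketch.
trees = final_model.get_booster().get_dump(dump_format="json")
definition = "[" + ",".join(trees) + "]"

# Register the model against a (hypothetical) previously uploaded featureset.
requests.post(
    f"{ES}/_ltr/_featureset/product_features/_createmodel",
    json={
        "model": {
            "name": "product_ranker_v1",
            "model": {"type": "model/xgboost+json", "definition": definition},
        }
    },
)

# At query time, retrieve candidates cheaply, then rescore the top window
# with the stored LTR model via the plugin's `sltr` query.
query = {
    "query": {"match": {"title": "jeans"}},
    "rescore": {
        "window_size": 1000,
        "query": {
            "rescore_query": {
                "sltr": {"model": "product_ranker_v1", "params": {"keywords": "jeans"}}
            }
        },
    },
}
requests.get(f"{ES}/products/_search", json=query)
```

Rescoring only a top window keeps the expensive model off the full candidate set, which is the usual way this plugin is deployed at scale.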