Opportunities and challenges in personalization of online hotel search
David Zibriczky
Data Science & Analytics Lead, User Profiling
Introduction
Introduction: About
Mission: Helping travelers find their ideal hotel at the best price
Main product: Hotel metasearch
  Aggregates hotels and advertisers
  Availability and price comparison
  User interface for hotel search
  Redirecting users to advertisers
Company facts:
  1.8M+ hotels
  400+ advertisers (booking sites)
  190 countries
  HQ in Düsseldorf, Germany
Introduction: Hotel Metasearch vs. OTA
Metasearch != Online Travel Agency (OTA):
  Aggregator of OTAs
  Redirecting visitors to OTAs
  Price comparison
  No direct feedback about hotels
Common:
  Booking is the ultimate goal
  From traditional booking to online booking
  Helps users in hotel search
Source: http://www.otrams.com/
Introduction: CPC
Referral revenue model: CPC (cost-per-click)
  CPC bidding per hotel by the advertisers
Features:
  Simple model to calculate revenue
  No influence on measurement by advertisers
  CPC ~ expected value after clicking
Difficulties:
  Effect on CPC bidding takes time to measure
  Easy to cheat on short-term revenue
  Indirect measure of performance
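The CPC referral model can be illustrated with a small calculation. This is a minimal sketch, not the company's actual accounting: the hotel ids, click counts, and bid values are hypothetical, and it assumes revenue is simply the sum over hotels of click-outs times the advertiser's CPC bid.

```python
from dataclasses import dataclass


@dataclass
class HotelClicks:
    hotel_id: str    # hypothetical identifier
    clicks: int      # click-outs redirected to the advertiser
    cpc_bid: float   # advertiser's cost-per-click bid (hypothetical value)


def referral_revenue(rows):
    """Sum of clicks * CPC bid over all hotels."""
    return sum(r.clicks * r.cpc_bid for r in rows)


rows = [
    HotelClicks("h1", clicks=120, cpc_bid=0.50),
    HotelClicks("h2", clicks=40, cpc_bid=1.25),
]
total = referral_revenue(rows)
print(f"Referral revenue: {total:.2f}")
```

Because revenue is a product of clicks and bids, raising clicks alone does not help if the extra clicks convert poorly and advertisers later lower their bids.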
Introduction: Business Goals
One of the main goals is to increase revenue by increasing the following factors:
  CPC bidding (per hotel)
  Price of clicked hotels
  Number of clicks (bookings)
  Number of visitors
Challenges:
  CPC vs. number of clicks (click-boosters don't help)
  Hotel price vs. number of bookings
  Short-term vs. long-term optimization
  Multi-criteria optimization
  Bouncers
Introduction: User Value
From the user's perspective:
  1. Utility function
  2. Effort to find the hotel
Assumption: Increasing user value leads to KPI improvement
Goals:
  Maximize the utility of the hotel
    Bigger likelihood to book, higher CPC
    Better representation of value for price
    Increase in clicked price
  Minimize the effort for finding the hotel
    Better churn rate
    Increase the user experience
    Higher user retention
Introduction: Personalization
Personalization: Tailoring the hotel search process to individual preferences
Goal: Adding value to a non-personalized solution (usability, serendipity, decision support)
Techniques:
  Personalized recommendations
  User interface
  Improving search process
  Visualizing relevant features
  Personal campaigns, targeting
Product: Personalization service that learns in real time and adapts to context
Potential Use Cases for Personalization
Use Cases: Destinations
Function: Type/select a destination
Goal of personalization:
  1. Best destination to travel to
  2. List of best cities/destinations to travel to
Challenges:
  User cold start
  Difficult to predict the next best destination
  One suggestion for use case #1
Use Cases: Search Suggestions
Function: Autocomplete of search terms
Goal of personalization: Suggesting cities, POIs or keywords
Challenges:
  User cold start
  Diversity of suggestions
Use Cases: Hotel listing
Function: List of hotels
Goal of personalization:
  Personalized sorting of hotels
  Matching the search criteria
Challenges:
  Depends on the deals and CPC
  Positional bias
  Ranking method for any filtering
  Influence of context
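Positional bias means that hotels shown near the top of the list collect clicks regardless of relevance. One common way to correct for it is "clicks over expected clicks" (COEC), which divides observed clicks by the clicks a hotel would be expected to get from its positions alone. The sketch below is illustrative only: the per-position click rates and the impression log are hypothetical, and the talk does not state which correction, if any, is used.

```python
from collections import defaultdict

# Hypothetical prior click-through rate per list position (1-based).
POSITION_CTR = {1: 0.30, 2: 0.15, 3: 0.10}


def coec(impressions):
    """Clicks over expected clicks per hotel.

    impressions: iterable of (hotel_id, position, clicked) tuples.
    A score above 1 means the hotel is clicked more often than its
    positions alone would predict.
    """
    clicks = defaultdict(int)
    expected = defaultdict(float)
    for hotel_id, position, clicked in impressions:
        clicks[hotel_id] += int(clicked)
        expected[hotel_id] += POSITION_CTR[position]
    return {h: clicks[h] / expected[h] for h in expected}


log = [
    ("A", 1, True), ("A", 1, False),   # hotel A shown at the top
    ("B", 3, True), ("B", 3, False),   # hotel B shown lower down
]
scores = coec(log)
print(scores)
```

Both hotels have the same raw click rate here, but hotel B earns its clicks from a weaker position, so its position-normalized score is higher.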
Use Cases: Advertisers
Function: List of advertisers
Goal of personalization:
  1. Best advertiser at View Deal button
  2. Ranking all advertisers
Challenges:
  Brand awareness
  Influence of CPC
  Price vs. brand?
Use Cases: Images
Function: Images about the hotel
Goal of personalization: Image recommendation
  1. Top image
  2. Images in details
Challenges:
  Labeling of images
  Diversity of topics
  Redundancy
  Positional bias
Use Cases: Hotel listing on Map
Function: Showing hotels on map
Goal of personalization:
  Most relevant hotels
  Visualization of relevance
Challenges:
  Geospatial dependence
  Influence of POI
  Ranking is not trivial
  Number of hotels to show
Use Cases: Explanation
Function: Explanation of recommendations
Goal of personalization:
  Distance from a POI
  Amenities or other keywords
Challenges:
  Trivial explanation doesn't add value
  Optimal number of explanations
  No feedback on explanations
Use Cases: Search Criteria
Function: Filter boxes
Goal of personalization: Search criteria suggestions
Challenges:
  Personalization vs. default settings
  How to visualize suggestions?
Common challenges in the hotel industry
  1. Episodic interactions (next travel, in-session modeling)
  2. Unstable preference (seasonality, context, lack of domain knowledge)
  3. Tracking (limited tracking, less registration, lack of feedback, cold start)
  4. User engagement (redirection, bouncers)
  5. Price sensitivity
  6. Online booking vs. real world
How? A quick overview
How: Data
User interactions (via frontend/backend logging):
  Identification: cookie, members
  Actions: search criteria, hotel interactions, booking, navigation
Hotel inventory:
  Static: metadata, amenities, images, ratings (via partner API, crawler)
  Dynamic: availability, price, advertiser deals, CPC
Other entities: destinations, POIs, filters
Context: time, seasonality, device, platform, referrer, location, parameter box
How: Application of ML, Overview
Classification: Visitor classification, churn, filter usage prediction (XGB, GBDT, NN, RF, SVM, LOGR)
Regression: Price preference, expected LTV, value for price, CPC bidding (GBRT, RFR, SVR, LR)
Clustering: User/hotel segmenting, discriminative features (K-Means, K-Medoid, DBSCAN)
Association Rule Mining: Next best hotels, filters, destinations (Apriori)
Feature Engineering: Image features (CNN), entity embedding (PCA, t-SNE, MF)
Natural Language Processing: Sentiment analysis, topic modeling (Word2vec, LDA)
Ensemble Learning: Combining multiple algorithms (boosting, stacking, linear combination)
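As an illustration of association rule mining for "next best filter" suggestions, the sketch below computes support and confidence for pairs of filters used in the same session. It is a simplified pairwise version, not the full Apriori algorithm, and the filter names and sessions are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=0.5):
    """Return {(a, b): confidence of the rule a -> b} for frequent pairs.

    baskets: list of sets of filters used together in one session.
    Support of a pair is the share of sessions containing both items;
    confidence of a -> b is count(a and b) / count(a).
    """
    n = len(baskets)
    item_count = Counter()
    pair_count = Counter()
    for basket in baskets:
        items = sorted(set(basket))
        item_count.update(items)
        pair_count.update(combinations(items, 2))

    rules = {}
    for (a, b), count in pair_count.items():
        if count / n >= min_support:
            rules[(a, b)] = count / item_count[a]
            rules[(b, a)] = count / item_count[b]
    return rules


# Hypothetical filter usage per session.
sessions = [
    {"pool", "wifi"},
    {"pool", "wifi", "spa"},
    {"wifi"},
    {"pool", "wifi"},
]
rules = pair_rules(sessions)
print(rules)
```

In this toy data every session that uses the pool filter also uses the wifi filter, so wifi would be a natural suggestion once a user selects pool, while the reverse rule is weaker.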
How: Application of ML, RecSys
Segment-based popularity: Most popular destinations/hotels/filters in a specific user segment
Case-based Reasoning: Actions of other users in the same/similar contexts
Content-based Filtering: Similar hotels, most preferred hotel features
Collaborative Filtering:
  K-Nearest Neighbors: Next best destinations, similar hotels, clustering (Item-KNN, User-KNN)
  Matrix Factorization: Personalized rec., user/hotel modeling, tensors (SGD, IALS, SVD)
  Deep Learning: Next best hotels or actions (RNN, GRU4Rec)
Knowledge-based RS: Conversational recommenders, domain knowledge representation
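To make the Item-KNN idea concrete, here is a minimal sketch that finds similar hotels from implicit feedback using cosine similarity over the sets of users who clicked each hotel. The user ids, hotel ids, and the choice of cosine similarity on binary click data are assumptions for illustration; the talk does not specify the similarity measure.

```python
from collections import defaultdict
from math import sqrt


def item_knn(clicks_by_user, k=2):
    """Return the k most similar hotels for each hotel.

    clicks_by_user: dict mapping user id -> set of clicked hotel ids.
    Similarity is cosine similarity between binary user vectors:
    |users(i) & users(j)| / sqrt(|users(i)| * |users(j)|).
    """
    users_by_item = defaultdict(set)
    for user, items in clicks_by_user.items():
        for item in items:
            users_by_item[item].add(user)

    neighbors = {}
    for i, users_i in users_by_item.items():
        scored = []
        for j, users_j in users_by_item.items():
            if i == j:
                continue
            overlap = len(users_i & users_j)
            if overlap:
                sim = overlap / sqrt(len(users_i) * len(users_j))
                scored.append((sim, j))
        # Highest similarity first; ties broken by hotel id for determinism.
        scored.sort(key=lambda s: (-s[0], s[1]))
        neighbors[i] = [j for _, j in scored[:k]]
    return neighbors


# Hypothetical click data.
clicks = {
    "u1": {"h1", "h2"},
    "u2": {"h1", "h2", "h3"},
    "u3": {"h3", "h4"},
}
neighbors = item_knn(clicks)
print(neighbors["h1"])
```

The neighbor lists can serve directly as "similar hotels", or be aggregated over a user's recent clicks to produce "next best hotel" candidates.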
How: Evaluation
Offline evaluation
  Goal:
    Reducing the cost of experiments
    Prototyping
    Evidence
    Good-enough candidates
  Techniques:
    Data analysis and insights
    Finding offline metrics
    Finding a ground truth
    Avoid over-optimization (offline != online)
Online evaluation
  Goal:
    Testing the feature in production
    KPI optimization
    Monitoring
    Accept/reject
  Techniques:
    Real-time manual testing
    A/B testing
    Surveys
    Parameter tuning
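The accept/reject decision in an A/B test is often supported by a significance test on the difference between two rates. The sketch below uses a two-sided two-proportion z-test with a pooled standard error; the visitor and click-out counts are hypothetical, and the talk does not say which test is actually applied.

```python
from math import erf, sqrt


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates.

    Returns (z, p_value), where z > 0 means variant B converts better.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    p_value = 2 * (1 - cdf)
    return z, p_value


# Hypothetical experiment: control (A) vs. personalized ranking (B).
z, p_value = two_proportion_z_test(conv_a=1000, n_a=10000,
                                   conv_b=1100, n_b=10000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A statistically significant lift on clicks alone is not sufficient under the CPC model, since the KPI of interest is revenue, so in practice the same comparison would be repeated for the revenue-related metrics.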
How: Limitations/risks of Personalization
  1. Data-driven solution (quality of the data)
  2. User cold start
  3. Misprediction
  4. Self-reinforcement loop
  5. Over-personalization
  6. Cost of experiments
Thank You! Questions?
Appendix: Abbreviations
XGB        Extreme Gradient Boosting
GBDT       Gradient Boosted Decision Trees
NN         Neural Network
RF         Random Forest
SVM        Support Vector Machine
LOGR       Logistic Regression
GBRT       Gradient Boosted Regression Trees
RFR        Random Forest Regressor
SVR        Support Vector Regression
LR         Linear Regression
K-Means    K-Means Clustering
K-Medoid   K-Medoid Clustering
DBSCAN     Density-based spatial clustering of applications with noise
PCA        Principal Component Analysis
t-SNE      t-distributed Stochastic Neighbor Embedding
CNN        Convolutional Neural Network
MF         Matrix Factorization
Apriori    Apriori algorithm
Word2vec   Word2vec embedding
LDA        Latent Dirichlet Allocation
Item-KNN   Item-based k-nearest-neighbor algorithm
User-KNN   User-based k-nearest-neighbor algorithm
SGD        Stochastic Gradient Descent
IALS       Implicit Alternating Least Squares
SVD        Singular Value Decomposition
RNN        Recurrent neural network
GRU4Rec    Recurrent neural network with Gated Recurrent Units for Recommender Systems