Realtime Recommendations


1 Realtime Recommendations with Redis
Torben Brodt, plista GmbH
April 25th, 2013, NoSQL Search Roadshow


2 Introduction
Torben Brodt, Head of Data Engineering
- computer science studies
- 5 years plista
- publication: collaborative filtering
- evangelist for the "power of algorithms"
plista GmbH
- recommendations & advertising
- founded in 2008, Berlin [DE]
- ~5k recommendations/second

4 Contents
1. How to feed a recommender?
2. How to build a recommendation?
3. How to scale a recommender?

5 How to feed a recommender?

6 How to feed a recommender?
- to show recommendations we are integrated on the website
- we have URL + HTTP headers
- user agent
- IP address -> geolocation
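
The inputs listed on this slide can be turned into context attributes roughly as follows. This is a minimal sketch, not plista's code: the function name `extract_context`, the stand-in `fake_geolocate` lookup, the example path and the IP address are all assumptions made for illustration.

```python
from urllib.parse import urlsplit


def extract_context(url, headers, remote_ip, geolocate):
    """Derive context attributes from one page request.

    `geolocate` is any callable mapping an IP address to a location name;
    the real lookup service is not described in the slides.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Treat "www.welt.de" and "welt.de" as the same publisher.
    publisher = host[4:] if host.startswith("www.") else host
    return {
        "publisher": publisher,
        "path": parts.path,
        "user_agent": headers.get("User-Agent", ""),
        "geolocation": geolocate(remote_ip),
    }


def fake_geolocate(ip):
    # Hypothetical lookup table standing in for a real geolocation database.
    return {"203.0.113.7": "dortmund"}.get(ip, "unknown")


context = extract_context(
    "http://www.welt.de/football/example.html",  # hypothetical path
    {"User-Agent": "Mozilla/5.0"},
    "203.0.113.7",
    fake_geolocate,
)
print(context)
```

Passing the geolocation lookup in as a parameter keeps the sketch independent of any particular geolocation database.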

7 How to feed a recommender?
- push the data away quickly
- make use of data quickly
- RULE: be quick


10 Technology overview
- Apache Lucene for content
- MySQL for relational data
- Machine learning
- Hadoop? No! It's batch + slow
- In memory? Yes, stream computing
- Redis for statistics
- Live backup

11 How to build a recommendation?

12 How to build a recommendation?
Different recommender families:
- Behavioral (based on interaction between user and article): Most Popular, Collaborative Filtering, Item to Item
- Content (based on the articles): Content Similarity, Latest Item, Classification

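13 Most popular
Example page: welt.de/football/.html
- ZINCRBY "p:welt.de"
- ZREVRANGEBYSCORE p:welt.de
Figure: sorted set p:welt.de with members such as summer_is_coming and plista_company, ranked by score (420 and 135 appear on the slide).
Live Read + Live Write = Real Time Recommendations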
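
The write and read path on this slide can be sketched without a Redis server. The class below is a hypothetical in-memory stand-in: `zincrby` mirrors the semantics of the Redis ZINCRBY command, and `top` plays the role of a ZREVRANGEBYSCORE call with a LIMIT. The class and method names are assumptions for illustration, and the ordering of members with equal scores is not modelled after Redis.

```python
from collections import defaultdict


class SortedSets:
    """Tiny in-memory stand-in for the two sorted-set operations on the slide."""

    def __init__(self):
        # key -> {member: score}
        self._sets = defaultdict(dict)

    def zincrby(self, key, increment, member):
        """Add `increment` to the member's score, creating it at 0 if missing."""
        scores = self._sets[key]
        scores[member] = scores.get(member, 0) + increment
        return scores[member]

    def top(self, key, limit):
        """Return up to `limit` (member, score) pairs, highest score first."""
        ranked = sorted(self._sets[key].items(), key=lambda kv: (-kv[1], kv[0]))
        return ranked[:limit]


store = SortedSets()
# Live write: every article view increments the article's counter.
for article in ["summer_is_coming", "plista_company", "summer_is_coming"]:
    store.zincrby("p:welt.de", 1, article)

# Live read: the most popular articles for this publisher.
top_articles = store.top("p:welt.de", 1)
print(top_articles)
```

Because reads and writes hit the same structure, a view counted now is visible to the next recommendation request, which is the "live read + live write" point of the slide.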

14 Recap: data types
- String, Lists, Set, ..
- Hash: map between string fields and string values, very fast
- HINCRBY complexity: O(1)
- Sorted Set
- ZINCRBY complexity: O(log(N)), where N is the number of elements in the sorted set
- allows limiting the number of results: ZREVRANGEBYSCORE
- UNION + INTERSECT
Figure: the sorted set p:welt.de with summer_is_coming and plista_company (689 and 420 appear on the slide).

15 Most popular with timeseries
Example page: welt.de/football/.html
- ZINCRBY "p:welt.de: "
- ZUNION "p:welt.de: " "p:welt.de: " "p:welt.de: "
- ZREVRANGEBYSCORE
Figure: one sorted set per time slot (keys p:welt.de: ), each holding members such as summer_is_coming and plista_best_company (135 and 689 appear on the slide).

16 Most popular with timeseries
Example page: welt.de/football/.html
- ZINCRBY "p:welt.de: "
- ZUNION... WEIGHTS "p:welt.de: ".. "p:welt.de: ".. "p:welt.de: "..
- ZREVRANGEBYSCORE
Figure: one sorted set per time slot (keys p:welt.de: ), each holding members such as summer_is_coming and plista_best_company (135 and 689 appear on the slide), combined with one weight per time slot.
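
A weighted union over time slots can be sketched in plain Python. The function below follows the documented behaviour of Redis ZUNIONSTORE with WEIGHTS and the default SUM aggregate (ZUNION in Redis 6.2 and later): each set's scores are multiplied by that set's weight and then summed per member. The bucket contents and the weights 4, 2, 1 are made-up illustration values, not data from the talk.

```python
def weighted_union(buckets, weights):
    """Sum member scores across buckets, scaling each bucket by its weight."""
    if len(buckets) != len(weights):
        raise ValueError("need exactly one weight per bucket")
    combined = {}
    for bucket, weight in zip(buckets, weights):
        for member, score in bucket.items():
            combined[member] = combined.get(member, 0.0) + weight * score
    return combined


# Hypothetical hourly buckets for one publisher, newest first.
current_hour = {"summer_is_coming": 10, "plista_best_company": 5}
one_hour_ago = {"summer_is_coming": 20}
two_hours_ago = {"plista_best_company": 100}

# Newer buckets count more than older ones (assumed weights).
scores = weighted_union([current_hour, one_hour_ago, two_hours_ago], [4, 2, 1])
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores, ranking)
```

Keeping one sorted set per time slot means old traffic fades out through the weights alone, without any batch job rewriting counters.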

17 Most popular with timeseries
Figure: weight per time slot, for the slots from -1h to -8h (the values 4, 2 and 1 appear on the slide).

18 Most popular to any context
It's not only publisher, we use ~50 context attributes.
Context attributes:
- publisher
- weekday
- geolocation
- demographics
- ...
Figure: example contexts publisher = welt.de, weekday = sunday and geolocation = dortmund, with the articles summer_is_coming, dortmund_wins and plista_company (420 and 200 appear on the slide).

19 Most popular to any context
What it looks like in Redis:
ZUNION... WEIGHTS
p:welt.de: p:welt.de: p:welt.de:
w:sunday: w:sunday: w:sunday:
g:dortmund: g:dortmund: g:dortmund:
Figure: example contexts publisher = welt.de, weekday = sunday and geolocation = dortmund, with the articles summer_is_coming, dortmund_wins and plista_company (420 and 200 appear on the slide).

20 Most popular with effect size
Which context has an influence?
ZUNION... WEIGHTS
p:welt.de: * 70%, p:welt.de: * 70%, p:welt.de: * 70%
w:sunday: * 10%, w:sunday: * 10%, w:sunday: * 10%
g:dortmund: * 30%, g:dortmund: * 30%, g:dortmund: * 30%
Effect size examples:
- small effect: weather
- big effect: publisher
Data with a small effect should not be taken into account; otherwise we get average results.
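
The effect-size idea can be sketched as a weighted sum per context with a cut-off for weak contexts. The weights 0.7, 0.1 and 0.3 are taken from the slide; the function name `combine_contexts`, the `min_weight` threshold and the article counts are assumptions for illustration, and the slides do not say how the effect sizes are estimated.

```python
def combine_contexts(context_scores, effect_weights, min_weight=0.0):
    """Weighted sum of per-context popularity, skipping low-effect contexts."""
    combined = {}
    for context, scores in context_scores.items():
        weight = effect_weights.get(context, 0.0)
        if weight < min_weight:
            # Contexts with a small effect would only pull results to the average.
            continue
        for article, score in scores.items():
            combined[article] = combined.get(article, 0.0) + weight * score
    return combined


# Hypothetical popularity counts per context.
context_scores = {
    "publisher": {"summer_is_coming": 100, "plista_company": 50},
    "weekday": {"plista_company": 200},
    "geolocation": {"dortmund_wins": 200},
}
effect_weights = {"publisher": 0.7, "weekday": 0.1, "geolocation": 0.3}

combined = combine_contexts(context_scores, effect_weights, min_weight=0.2)
ranking = sorted(combined, key=combined.get, reverse=True)
print(ranking)
```

With `min_weight=0.2` the weekday context is ignored entirely, so only publisher and geolocation shape the ranking.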

21 SUM over..
- timeseries
- different context
- previous hits of the user
- similar publisher knowledge
ZUNION... WEIGHTS
p:welt.de: p:welt.de: p:welt.de:
w:sunday: w:sunday: w:sunday:
g:dortmund: g:dortmund: g:dortmund:
Figure: Σ over the sets above for publisher = welt.de, giving a ranking that contains summer_is_coming and plista_company (689, 420 and 135 appear on the slide).
Redis can do it ;)

22 Even more matrix operations ;)
- Similarity Matrix
- Human Control Matrix
- Meta-learning Matrix
All of them feed into the sum (Σ).

23 More recommenders possible
This was only about most popular. Other algorithms using Redis:
- incremental collaborative filtering
- article to article paths (~graph)
- ..
- using external data sources

24 How to scale a recommender?

25 How to scale a recommender?
Distribution to many servers:
- 1 client to access n servers
- partitioning of data using hashing

26 How to scale a recommender?
Distribution to many servers:
- 1 client to access n servers
- partitioning of data using hashing
- for UNION we run into problems
- combined keys need to be on the same server
- NO consistent hashing possible
- workaround: prefix hashing
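
One way to read the prefix-hashing workaround is to hash only a leading part of each key, so that all keys sharing that part land on the same server and can be combined there. The sketch below assumes a hypothetical key layout in which a partition prefix is placed before a "|" separator; the slides do not show the actual key layout, and the names `shard_for` and `separator` are illustrative only.

```python
import zlib


def shard_for(key, num_servers, separator="|"):
    """Pick a server index from the key's partition prefix only.

    Everything before the first separator is hashed; the rest of the key
    does not influence placement. Keys without a separator are hashed whole.
    """
    prefix = key.split(separator, 1)[0]
    # crc32 is deterministic across processes, unlike Python's built-in hash().
    return zlib.crc32(prefix.encode("utf-8")) % num_servers


# Hypothetical keys that must be combined in one union, so they share a prefix.
keys = ["welt.de|p:welt.de", "welt.de|w:sunday", "welt.de|g:dortmund"]
shards = [shard_for(key, 4) for key in keys]
print(shards)
```

All three keys map to the same index because only the shared prefix is hashed. The cost is less even data distribution, since a popular prefix concentrates load on one server.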


27 How to scale a recommender?
Low latency:
- master/slave replication
- should be close to edge servers
- e.g. 1 Redis instance per 1 webserver

28 How to scale a recommender?
Application in database:
- Lua support is shipped
- but: single-core process
- a long read blocks all writes
- concurrency issue

29 How to scale a recommender?
In spite of all those disadvantages:
- Redis fits perfectly for simple operations: SUM + AGGREGATE + MIN + MAX
- in-memory operations are pretty fast
- real-time features feel better in a real-time database (e.g. time series)
- we don't need batch

30 What else in Redis?
- message bus
- many recommenders
- live statistics
- caching
"One technology to rule them all"

31 Questions?
www.plista.com
torben.brodt@plista.
xing.com/profile/torben_brodt
