E l a s t i c s e a r c h F e a t u r e s. Contents

Similar documents
Securing the Elastic Stack

Infrastructure at your Service. Elking your PostgreSQL Database Infrastructure

Application monitoring with BELK. Nishant Sahay, Sr. Architect Bhavani Ananth, Architect

Ingest. David Pilato, Developer Evangelist Paris, 31 Janvier 2017

Ingest. Aaron Mildenstein, Consulting Architect Tokyo Dec 14, 2017

Improving Drupal search experience with Apache Solr and Elasticsearch

Monitor your containers with the Elastic Stack. Monica Sarbu

Search and Time Series Databases

Goal of this document: A simple yet effective

by Cisco Intercloud Fabric and the Cisco

Log Analysis When CLI get's complex. ITNOG3 Octavio Melendres Network admin - Fastnet Spa

Search Engines and Time Series Databases

Amazon Elasticsearch Service

COMPARISON WHITEPAPER. Snowplow Insights VS SaaS load-your-data warehouse providers. We do data collection right.

ADVANCED DATABASES CIS 6930 Dr. Markus Schneider. Group 5 Ajantha Ramineni, Sahil Tiwari, Rishabh Jain, Shivang Gupta

FUJITSU Software ServerView Cloud Monitoring Manager V1.1. Release Notes

Monitor your infrastructure with the Elastic Beats. Monica Sarbu

Table 1 The Elastic Stack use cases Use case Industry or vertical market Operational log analytics: Gain real-time operational insight, reduce Mean Ti

Building a Scalable Recommender System with Apache Spark, Apache Kafka and Elasticsearch

Datasheet FUJITSU Software ServerView Cloud Monitoring Manager V1.1

The webinar will start soon... Elasticsearch Performance Optimisation

Deep dive into analytics using Aggregation. Boaz

Big Data Technology Ecosystem. Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara

HDInsight > Hadoop. October 12, 2017

Tizen apps with. Context Awareness, powered by AI. by Shashwat Pradhan, CEO Emberify

Towards a Real- time Processing Pipeline: Running Apache Flink on AWS

Big Data on AWS. Big Data Agility and Performance Delivered in the Cloud. 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Overview. SUSE OpenStack Cloud Monitoring

Geo Capabilities in Elasticsearch Nicholas

ArcGIS Online: Developing Web Applications with Geocoding and Routing Services. Deelesh Mandloi Dmitry Kudinov Brad Niemand

ATTACHMENT MANAGEMENT USING AZURE BLOB STORAGE

Information Security Policy

Table of Contents HOL-PRT-1463

Integrated and Hyper-converged Data Protection

Developing Enterprise Cloud Solutions with Azure

FUJITSU Software ServerView Cloud Monitoring Manager V1.0. Overview

Bonus #1 - Client Extractor (Get Paying Clients in 1-Click using Google Maps API Technology)

AALOK INSTITUTE. DevOps Training

Cloudline Autonomous Driving Solutions. Accelerating insights through a new generation of Data and Analytics October, 2018

How to Select the Right Marketing Cloud Edition

API, DEVOPS & MICROSERVICES

edirectory Change log

3) CHARLIE HULL. Implementing open source search for a major specialist recruiting firm

MQ Monitoring on Cloud

Implementing Operational Analytics Using Big Data Technologies to Detect and Predict Sensor Anomalies

Real-time Streaming Applications on AWS Patterns and Use Cases

Java Architectures A New Hope. Eberhard Wolff

Fluentd + MongoDB + Spark = Awesome Sauce

Using Elastic with Magento

Identifying Workloads for the Cloud

Analytics Platform for ATLAS Computing Services

Tour of Database Platforms as a Service. June 2016 Warner Chaves Christo Kutrovsky Solutions Architect

Data Analytics at Logitech Snowflake + Tableau = #Winning

kjhf MIS 510 sharethisdeal - Final Project Report Team Members:- Damini Akash Krittika Karan Khurana Agrawal Patil Dhingra

Amazon Search Services. Christoph Schmitter

MAPR DATA GOVERNANCE WITHOUT COMPROMISE

To the Designer Where We Need Your Help

VMware Cloud on AWS Technical Deck VMware, Inc.

Using Internet as a Data Source for Official Statistics: a Comparative Analysis of Web Scraping Technologies

PNDA.io: when BGP meets Big-Data

API Connect. Arnauld Desprets - Technical Sale

The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*, Kafka* Elizabeth K. Dublin Apache Kafka Meetup, 30 August 2017.

Relevancy Workbench Module. 1.0 Documentation

Automating Elasticity. March 2018

Ninja Level Infrastructure Monitoring. Defensive Approach to Security Monitoring and Automation

DevOps Using VSTS and Azure

Integrated and Hyper-converged Data Protection

Sigox APIs beginners API HOW TO. External Use, version 1.3

Agenda. Introduce the Tale of Two developers. Domino Top Secret. Back to the Future with the Domino

Big Data with Hadoop Ecosystem

ATTACHMENT MANAGEMENT USING AZURE BLOB STORAGE

INSTITUTIONAL REPOSITORY SERVICES

Europeana Core Service Platform

Fusion Registry 9 SDMX Data and Metadata Management System

ElasticSearch in Production

Distributed CI: Scaling Jenkins on Mesos and Marathon. Roger Ignazio Puppet Labs, Inc. MesosCon 2015 Seattle, WA

DOLLAR GENERAL CAREER SITE CANDIDATE ONLINE APPLICATION REFERENCE GUIDE

Who am I? Identity Product Group, CXP Team. Premier Field Engineer. SANS STI Student GWAPT, GCIA, GCIH, GCWN, GMOB

Microservices log gathering, processing and storing

Problem Code: #ISR13. College Code :

Cloud Foundry- 开放的应用平台

Coding for OCS. Derek Endres Software Developer Research #OSIsoftUC #PIWorld 2018 OSIsoft, LLC

Layer2 Term Set Glossary (TSG) for SharePoint

Cybercasing the Joint: On the Privacy Implications of Geo-Tagging

Creating a Recommender System. An Elasticsearch & Apache Spark approach

Developing and Testing Java Microservices on Docker. Todd Fasullo Dir. Engineering

Grandstream Networks, Inc. Captive Portal Authentication via Twitter

Harvesting Logs and Events Using MetaCentrum Virtualization Services. Radoslav Bodó, Daniel Kouřil CESNET

mimacom & LiferayDXP Campaign

DevOps + Infrastructure TRACK SUPPORTED BY

A PRACTICAL GUIDE TO SHAREPOINT 2013: NO FLUFF! JUST PRACTICAL EXERCISES TO ENHANCE YOUR SHAREPOINT 2013 LEARNING! BY SAIFULLAH SHAFIQ

RINGTAIL A COMPETITIVE ADVANTAGE FOR LAW FIRMS. Award-winning visual e-discovery software delivers faster insights and superior case strategies.

LOG AGGREGATION. To better manage your Red Hat footprint. Miguel Pérez Colino Strategic Design Team - ISBU

Datasheet FUJITSU Software Cloud Monitoring Manager V2.0

Hidden Gems in JD Edwards Orchestrator and AIS Server

Uber Push and Subscribe Database

MariaDB MaxScale 2.0, basis for a Two-speed IT architecture

Analyze Bug Statistics using Kibana Dashboard and Get Voice Alerts

St Ursula s College is pleased to announce a new partnership with Monitor as we install the Monitor School System in our College Tuckshop.

The Neutron Series Distributed Network Management Solution

Transcription:

Elasticsearch Features A n Overview

Contents Introduction... 2 Location Based Search... 2 Search Social Media(Twitter) data from Elasticsearch... 4 Query Boosting in Elasticsearch... 4 Machine Learning in Elasticsearch... 6 Possible Use cases... 8 Alternative Solutions... 8 Next Steps... 8 Conclusion... 9 Figure 1:Search result based on location... 3 Figure 2: Integrating various data sources to Elasticsearch... 4 Figure 3: Machine Learning jobs... 7 Figure 4:Anomaly Detection in ML job... 7 Figure 5:Forecasting in ML job... 8 P a g e 1

Introduction Elasticsearch is an open source search product which is a scalable search product with great features. Elasticsearch cluster can be setup including number of servers or machines where search indices can be distributed. Most of search features of the Elasticsearch is free. There are features like encryption, machine learning which come under a package called XPack which require license. This document covers some interesting features provided by Elasticsearch. Location Based Search In today s world where people use various mobile devices and want information based on their current location, location specific search has great relevance. In general search is performed based on keyword provided by user. But if search result can be provided based on current location of the user then it adds more value and relevance. Elasticsearch provides feature which can return search result relevant to user s location. When data is indexed in Elasticsearch, the data need to be indexed with metadata like latitude and longitude. Below is an example of indexing data with latitude and longitude: PUT conferences/conference/1 { "text": "Conference on AI enhanced sports","country":"us","category":"sports", "location": { "lat": 41.12, "lon": -71.34 While performing the search query, user s current latitude and longitude need to be obtained and then passed in the search query. Modern browsers like Chrome have feature where they can provide user s latitude and longitude. Indexed data can be searched based on user s latitude, longitude within a particular radius of the user current location. Example search query with co-ordinates: { P a g e 2

"filtered" : { "query" : { "field" : { "text" : "restaurant", "filter" : { "geo_distance" : { "distance" : "12km", "pin.location" : { "lat" : 40, "lon" : -70 Figure 1:Search result based on location P a g e 3

Search Social Media(Twitter) data from Elasticsearch Elasticsearch stack consists of Elasticsearch, Kibana and Logstash. Kibana is front end for Elasticsearch and Logstash is an open source, server-side data processing pipeline that ingests data from a multitude of sources simultaneously, transforms it, and then sends it to your favourite stash such as Elasticsearch. Logstash has number of plugins to pull data from various sources like PostgreSQL, MySQL, Twitter etc. For reading data from Twitter, an app need to be registered in Twitter. Then consumer key, consumer secret, oauth token, oauth token secret of the app can be used by Logstash to read data from Twitter. Logstach can continuously read data from Twitter based on a keyword and index the data to Elasticsearch. User can search indexed Twitter data from Elasticsearch. Figure 2: Integrating various data sources to Elasticsearch Query Boosting in Elasticsearch Search result can be promoted in Elasticsearch based on condition. The search query can be defined in such a way to promote results matching certain condition. Examples of query boosting: GET /_search { "query": { "bool": { "must": { "match": { "content": { "query": "full text search", "operator": "and" P a g e 4

, "should": [ { "match": { "content": "Elasticsearch", { "match": { "content": "Lucene" ] In the above example the search query is looking for data containing text full text search. Again in the should clause of the query there is text Elasticsearch and Lucene. If the document contains additional text Elasticsearch and Lucene then that document will be promoted in the search result. Examples of query boosting GET /_search { "query": { "bool": { "must": {, "match": { "content": { "should": [ "query": "full text search", "operator": "and" { "match": { P a g e 5

"content": { "query": "Elasticsearch", "boost": 3, { "match": { "content": { "query": "Lucene", "boost": 2 ] In the above query, data containing Elasticsearch and Lucene will be boosted by weight of 3 and 2 which will promote any data containing these texts. The query boosting can be used to implement relevance and additional promoting factor while displaying search result to end user. Machine Learning in Elasticsearch Elasticsearch offers Machine Learning feature as part of XPack Subscription. Machine Learning jobs can be created on the data indexed in Elasticsearch. These jobs can provide insight and help in anomaly detection by analyzing large volume of data over a period of time. It has forecasting feature for selected future time period. This feature is useful for analyzing anomalies in log data and their root causes. This feature can help analyzing non time series data to detect anomalies. The below diagram shows how data indexed over a period of can be used to learn and then predict future behavior by Machine Learning jobs. Once any anomaly is detected then notifications can be setup to inform relevant people about the anomaly. P a g e 6

Figure 3: Machine Learning jobs The below screenshot is an example of the Machine Learning job detecting anomaly in Kibana UI. Figure 4:Anomaly Detection in ML job The below screenshot is an example of the Machine Learning job doing forecasting for a future period in Kibana UI. P a g e 7

Figure 5:Forecasting in ML job Possible Use cases Location based search: There can be many use cases for this feature. For example, an user looking for a Food store or Hospital or Shopping Mall near his location or a parent looking for a school for his child near his residence. Machine Learning: If daily production logs of a web server is indexed in Elasticsearch then Machine Learning jobs can point to any anomaly in server resource usage or end user behavior. For example, more server resources are being used during peak hours. For a Shopping portal, during New Year Sale there may be more user hits. Machine Learning jobs can point to any such anomaly and forecast future behavior. Query Boosting: This feature can be used to promote search result based on user preferences such as user whose interest include sports will always get search results related to sports higher in order in search result. Social Media Search: This feature can be used to get combined search result for a user from within the application. The content source of search can be both data indexed in Elasticsearch and data from social media. Alternative Solutions There are other open source search products like Apache Solr, Lucene, Sphinx which can be explored as alternative solution. Next Steps Elasticsearch provides cloud based infrastructure (SAAS) and services which can be explored. P a g e 8

Conclusion As an open source search product, Elasticsearch offers great features. Organizations not willing to spend too much for a search product can definite explore Elasticsearch which is scalable and fast. A large number organizations do use Elasticsearch already. Elasticsearch have been adding new features like Machine Learning to improve their offerings. Also, products like Logstash allow users to write their own plugins to connect to any data source which already do not have any plugin available. On the whole, the Elastic stack including Elasticsearch, Kibana and Logstash give lot of features and overall coverage for any organization looking for implementing search and analysis. P a g e 9