From Silicon Valley to the Test Bed: Bringing Big-Data Technologies into ODS

Size: px

Start display at page:

Download "From Silicon Valley to the Test Bed: Bringing Big-Data Technologies into ODS"

Jonah Tucker
5 years ago
Views:

1 AVL List GmbH (Headquarters) From Silicon Valley to the Test Bed: Bringing Big-Data Technologies into ODS ASAM General Assembly 2018 Open Technical Seminar Dr. Sandi Pohorec

Agenda Motivation ASAM ODS an industry standard New challenges New technologies What ODS can offer in the area of Big Data ODS Big Data roadmap ODS Big Data goals for

2 Agenda Motivation ASAM ODS an industry standard New challenges New technologies What ODS can offer in the area of Big Data ODS Big Data roadmap ODS Big Data goals for 2018 ODS BD enables multi-vendor solutions ODS BD standardizes the best practices from end-to-end solutions ODS BD from a user view Summary Dr. Sandi Pohorec 14 June

3 Motivation Information is the oil of the 21st century, and analytics is the combustion engine (Peter Sondergaard, Senior Vice President, Gartner) ASAM Open Data Services will enable scalable analytics by using the Big Data technology stack Dr. Sandi Pohorec 14 June

STORAGE RDBMS ASAM ODS an industry standard ASAM Open Data Services (ODS) focuses on persistent storage and retrieval of testing meta and measurement data Important features: IT architecture

4 STORAGE RDBMS ASAM ODS an industry standard ASAM Open Data Services (ODS) focuses on persistent storage and retrieval of testing meta and measurement data Important features: IT architecture independent data access Well designed balance between High adaptability of the database Support of well-defined different application scenarios API s and file based data exchange support Major advantages: Truly an open data service: Vendor lock-in not possible Complete solution for test data management ASAM MDF Client applications RPC-API OO-API HTTP-API ODS Server Meta data Application model Base model internal linked Measurement data File based data exchange Dr. Sandi Pohorec 14 June

New challenges Real driving emissions Electrification Autonomous driving More testing:

Cost optimization: fusion of simulation and real-world testing Responsive real-time

challenges and enable horizontably scalable storage, search, processing, advanced

5 New challenges Real driving emissions Electrification Autonomous driving More testing: data volume grows exponentially More applications: data variety and variability increase Cost optimization: fusion of simulation and real-world testing Responsive real-time analytics: velocity goes from batch to real-time ODS needs to successfuly manage all challenges and enable horizontably scalable storage, search, processing, advanced analytics, machine learning and vizualization capability for all data in its domain Dr. Sandi Pohorec 14 June

ACHIEVED PLANNED ODS Big Data roadmap Participants in 2017 Define the problem and collect the information on: Big data use cases Scalable meta storage Scalable mass storage Near data processing

6 ACHIEVED PLANNED ODS Big Data roadmap Participants in 2017 Define the problem and collect the information on: Big data use cases Scalable meta storage Scalable mass storage Near data processing Analyze the work packages and define the solution: General approach and architecture Meta and mass storage Scalable near data processing Create and release the standard: Denormalized meta data representation for indexing services Big data formats for ODS: PARQUET and AVRO Impact on and adaption of ODS 6 Implement and improve: Vendors start to implement the initial release of the standard Standard workgroup gathers feedback and plans refinements of the standard Standard workgroup works on extensions of the standard: streaming, complex data types, Workpackages have been defined RFQ for technical service provider is published: extensive knowledge of both ODS and Big Data required to formulate a successful standard AVL, MHP and National Instruments provide the technical expertise to the workgroup Overall architecture encompassing all relevant aspects has been defined and demonstrated on AVL Santorin MX and AVL Concerto Workgroups (meta data, mass data) are active -still possible to join and contribute. Dr. Sandi Pohorec 14 June

What ODS can offer in the area of Big Data Define interfaces to existing Big Data technologies while considering: Approaches for scalable performance for ODS: HPC vs Big Data Big data ecosystem/map

7 What ODS can offer in the area of Big Data Define interfaces to existing Big Data technologies while considering: Approaches for scalable performance for ODS: HPC vs Big Data Big data ecosystem/map and system architecture: architecture layers, Hadoop distributions and components stacks Current and future data sources in the ODS ecosystem: not just time series but also complex object data in many formats Current and future use cases: autonomous driving, Provide industry wide Big Data standard for ODS that brings consistent interoperability and independence from vendor lock-in Enable the usage of powerful Big Data components and frameworks for: Near data processing, Scalable data analytics, Machine learning, Advanced and highly interactive visualizations, Data analytics partnerships (OEM2OEM, OEM2Tier1) while keeping the benefits of ODS! Dr. Sandi Pohorec 14 June

Monitoring Security New technologies: Hadoop Hadoop is a platform solution to manage the scalable storage and processing of very large data Scalable storage (fault tolerant) Distribution of data

8 Monitoring Security New technologies: Hadoop Hadoop is a platform solution to manage the scalable storage and processing of very large data Scalable storage (fault tolerant) Distribution of data (also hot-cold) Parallelization of the computation, Assigning cluster resources, managing job execution WHITEPAPER: BEST PRACTICES FOR BIG AUTOMOTIVE DATA Batch data processing Interface towards clients (API) Numerous purpose-build frameworks Scalable search In-memory data processing (Spark): Batch, ODS will provide an interface to new technologies 2018 A P I A P I MapReduce Pig Hive HBase SolrCloud SPARK YARN HDFS Streaming, Graph processing, Machine learning. Enterprise ready with full support for security and monitoring (data lineage) Dr. Sandi Pohorec 14 June

ODS Big Data goals for 2018 Searchable representation of meta data Standardize Transformation rules for the denormalized representation Industry standard JSON format Enable Enable utilization by

9 ODS Big Data goals for 2018 Searchable representation of meta data Standardize Transformation rules for the denormalized representation Industry standard JSON format Enable Enable utilization by non-ods clients/tools Choice of indexing technology/platform Hor. scalabilty and HA meta queries should not block analytics Powerful matching capabilities with near real time response Big Data format for measurement (time series) data Standardize Formats AVRO (row major format) PARQUET (column major format) For each format multiple data schemas Different use cases (Random access for advanced visualizations, Scalable analytics parallelism per file/per channel/, Write/Read optimized) Different properties of measurement data (Various sizes, number of signals, One/multiple sampling frequencies) For each schema defined transformation rules Enable Horizontally scalable storage and near data processing of ODS measurement data on Big Data clusters 2018 ODS BD STD Transformation rules Meta data 2018 ODS BD STD DATA SCHEMAS (avro, parquet) Meas data 2018 ODS BD STD Data transformation rules JSON PARQUET AVRO Dr. Sandi Pohorec 14 June

10 ODS enables multi-vendor solutions Components from different vendors can be used for each step Standard only defines the interfaces and data formats/schemas vendors are free to use different approaches how to best satisfy the NFRs: performance, scalability, reliability, availability, data integrity and security Vendor 1 Vendor 2 Vendor 3 ODS BD Big Data platform Enable big data at data recording Dr. Sandi Pohorec 14 June

ODS standardizes the best practices from end-to-end solutions AVL products form an end-to-end solution which provides more value that a combination of individual components: highest consistency,

11 ODS standardizes the best practices from end-to-end solutions AVL products form an end-to-end solution which provides more value that a combination of individual components: highest consistency, speed, integration, enterprise readiness, advanced data search of all data types, advanced analytics capability with insightful data visualization in near-real time. AVL Data Vizualizations AVL Data Analytics AVL Advanced Search context, time series, object data, events,. AVL Processing API ODS BD Big Data platform AVL BD PLAT. ABST. LAYER Enable big data at data recording Scalability already before Big Data (data conversion) AVL DATA HIGHWAY Guaranteed consistency with highest possible speed of data Integration with Enterprise: Data lineage, Security, Monitoring Dr. Sandi Pohorec 14 June

12 ODS standardizes the best practices from end-to-end solutions AVL products form an end-to-end solution which provides more value that a combination of individual components: highest consistency, speed, integration, enterprise readiness, advanced data search of all data types, advanced analytics capability with insightful data visualization in near-real time. AVL will bring the knowledge from its products and projects into the ODS Big Data standard AVL Data Vizualizations AVL Data Analytics AVL Advanced Search context, time series, object data, events,. AVL Processing API ODS BD Big Data platform ODS BD PLAT. ABST. LAYER Enable big data at data recording Scalability already before Big Data (data conversion) AVL DATA HIGHWAY Guaranteed Guaranteed consistency consistency with with highest possible possible speed speed of of data data Integration Integration with Enterprise: with Enterprise: Data Data lineage, lineage, Security, Security, Monitoring Dr. Sandi Pohorec 14 June

13 User view on the ODS Big Data Search for measurements simply by selecting an interesting part of a signal Dr. Sandi Pohorec 14 June

14 Summary Big-Data is a top-level requirement for ODS Workgroup for ODS Big Data was formed in 2016 In 2017 workgroup got to a common understanding Major considerations of scalability Focus points for standardization Result: Architecture of an ODS solution (ODS server + Big Data platform) and the interfaces In 2018 the workgroup is working on specifying the first artefacts: Meta data transformation and denormalization to industry standard JSON form Mass data transformation and conversion to the most important formats in the Big Data ecosystem: avro and parquet The work does not stop in 2018, the standardization will continuously cover more and more aspects of the automotive big data world Based on best practice from vendors AVL was leading the technical supplier group in 2017 and will continue to bring its knowledge and experience to the standard Dr. Sandi Pohorec 14 June

15 Thank You

Overview. Prerequisites. Course Outline. Course Outline :: Apache Spark Development::

Overview. Prerequisites. Course Outline. Course Outline :: Apache Spark Development:: Title Duration : Apache Spark Development : 4 days Overview Spark is a fast and general cluster computing system for Big Data. It provides high-level APIs in Scala, Java, Python, and R, and an optimized