Technical Report - Distributed Database Victor FERNANDES - Université de Strasbourg /2000 TECHNICAL REPORT


TECHNICAL REPORT
Distributed Databases and Implementation of the TPC-H Benchmark
Victor FERNANDES
DESS Informatique, Promotion: 1999 / 2000

TABLE OF CONTENTS
ABSTRACT
INTRODUCTION
HOW TO CREATE A DISTRIBUTED DATABASE
ORACLE SETUP
  Database Links
  Private, Public, and Global Database Links
  Transparency in a Distributed Database System
  Location Transparency
BENCHMARK
RESULTS
GRAPHICAL RESULTS
CONCLUSION
ANNEXES
NETWORK SETUP
  First stage Net8 Assistant for a client
  First stage Net8 Assistant for a server
  Second stage Net8 Easy Config
GLOBAL ADMINISTRATION
QUERIES
TIME REPORT

ABSTRACT
This technical report describes the basic concepts of a distributed database (DB) and presents several examples of how to write a query on one Oracle DB against a remote Oracle DB.

Introduction
A distributed database is a set of databases stored on multiple computers that typically appears to applications as a single database. Consequently, an application can simultaneously access and modify the data in several databases in a network. Each database in the system is controlled by its local server but cooperates to maintain the consistency of the global distributed database. The following figure illustrates a representative Oracle distributed database system.

[Figure: a representative Oracle distributed database system. Two database servers, Drau.itec.uni-klu.ac.at and Isel.itec.uni-klu.ac.at (Node 1 and Node 2), are connected over the network by a database link. The Db1 database holds the DEPT table and the Db0 database holds the EMP table. A transaction issues: INSERT INTO EMP@db0...; DELETE FROM DEPT...; SELECT... FROM EMP@db0...; COMMIT;]

How to create a distributed database
As described on the previous page, to create a distributed database you need at least two servers, each running a database instance, and a network. For our study, we have two Windows NT 4.0 servers running Oracle database 8.0.5, and seven PCs under Linux running Oracle database 8i. The next figure illustrates our study cluster.

[Figure: the study cluster. Gigabit LAN switch (FDDI network, 1 Gb/s): LPC1, LPC2, LPC4, LPC5. 100 Mb/s LAN switch: LPC22, LPC23, LPC24. 100 Mb/s LAN hub and firewall: ISEL, DRAU.]

Computer configuration                                   Computer name
Linux SuSE 6.3, PII 450 MHz, 128 MB RAM, 8 GB HDD        LPC1, LPC2, LPC4, LPC5, LPC22, LPC23, LPC24
Windows NT 4.0 Server, PII 450, 128 MB RAM, 8 GB HDD     ISEL, DRAU

Oracle setup
Before starting the Oracle setup you must set up the network (see 'Network setup' in the annexes). Once the network works, we can create a database link.

Each database in a distributed database is distinguished from all other databases in the system by its own global database name. Oracle forms a database's global database name by prefixing the database's network domain with the individual database's name. For example, the next figure illustrates a representative hierarchical arrangement of databases throughout a network.

[Figure: hierarchical arrangement of databases in a network. Under the COM domain are ACME_TOOLS (Division 1, Division 2, Division 3) and ACME_AUTO (Asia, Americas, Europe); below the regions are Japan, US, Mexico, UK and Germany; the leaves are individual databases labelled HQ, Fin., Sales and Mfgt.]

While several databases can have the same individual name, each database must have a unique global database name. For example, the network domains US.AMERICAS.ACME_AUTO.COM and UK.EUROPE.ACME_AUTO.COM each contain a SALES database:

  SALES.US.AMERICAS.ACME_AUTO.COM
  SALES.UK.EUROPE.ACME_AUTO.COM

Database Links
To facilitate application requests in a distributed database system, Oracle uses database links. A database link defines a one-way communication path from an Oracle database to another database. Database links are essentially transparent to the users of an Oracle distributed database system, because the name of a database link is the same as the global name of the database to which the link points.
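The global naming rule described above can be sketched in a few lines of Python. This is only an illustration of the rule: the function name `global_database_name` is hypothetical and is not part of Oracle.

```python
def global_database_name(db_name: str, network_domain: str) -> str:
    """Prefix the network domain with the individual database name."""
    return f"{db_name}.{network_domain}"


# The two SALES databases from the example above.
names = [
    global_database_name("SALES", "US.AMERICAS.ACME_AUTO.COM"),
    global_database_name("SALES", "UK.EUROPE.ACME_AUTO.COM"),
]

# Same individual name, but every global name must be unique.
assert len(set(names)) == len(names)
print(names)
```

Both databases share the individual name SALES, and the domain suffix is what keeps their global names distinct.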

For example, the following SQL statement creates a database link in the local database that describes a path to the remote Db0 database on the DRAU server, service_test.itec.uni.klu.ac.at. The name service_test.itec.uni.klu.ac.at is the service name defined in Net8 Easy Config.

  CREATE DATABASE LINK my_link USING 'service_test.itec.uni.klu.ac.at';

After creating a database link, applications connected to the local database can access data in the remote service_test.itec.uni.klu.ac.at database. Now you can issue a query such as:

  SELECT * FROM dept@my_link;

or

  INSERT INTO dept@my_link VALUES (...);

or

  DELETE FROM dept@my_link WHERE...;

Private, Public, and Global Database Links
Oracle allows you to create private, public, and global database links.

Private Database Link
You can create a private database link in a specific schema of a database. Only the owner of a private database link or PL/SQL subprograms in the schema can use a private database link to access data and database objects in the corresponding remote database. E.g.:

  CREATE DATABASE LINK my_link USING 'service_test.itec.uni.klu.ac.at';

Public Database Link
You can create a public database link for a database. All users and PL/SQL subprograms in the database can use a public database link to access data and database objects in the corresponding remote database. E.g.:

  CREATE PUBLIC DATABASE LINK my_link USING 'service_test.itec.uni.klu.ac.at';

Global Database Link
When an Oracle network uses Oracle Names, the names servers in the system automatically create and manage global database links for every Oracle database in the network. All users and PL/SQL subprograms in any database can use a global database link to access data and database objects in the corresponding remote database.

For more information, see the Oracle documentation.
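The difference between the private and public forms is a single keyword, which the following Python sketch makes explicit. It only builds the statement text and does not connect to any database; the helper names `create_link_sql` and `remote_table` are hypothetical.

```python
def create_link_sql(link_name: str, service_name: str, public: bool = False) -> str:
    """Build the CREATE DATABASE LINK statement text (private by default)."""
    scope = "PUBLIC " if public else ""
    return f"CREATE {scope}DATABASE LINK {link_name} USING '{service_name}';"


def remote_table(table: str, link_name: str) -> str:
    """Qualify a table name with a database link, as in dept@my_link."""
    return f"{table}@{link_name}"


service = "service_test.itec.uni.klu.ac.at"
print(create_link_sql("my_link", service))
print(create_link_sql("my_link", service, public=True))
print(f"SELECT * FROM {remote_table('dept', 'my_link')};")
```

Generating the statements this way can help when the same link has to be created on every node of a cluster, although the statements still have to be run by a tool such as SQL*Plus.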

Transparency in a Distributed Database System
With minimal effort, you can make the functionality of an Oracle distributed database system transparent to users that work with the system. The goal of transparency is to make a distributed database system appear as though it is a single Oracle database. Consequently, the system does not burden developers and users of the system with complexities that would otherwise make distributed database application development challenging and detract from user productivity. The following sections explain more about transparency in a distributed database system.

Location Transparency
An Oracle distributed database system has features that allow application developers and administrators to hide the physical location of database objects from applications and users. Location transparency exists when a user can universally refer to a database object such as a table, regardless of the node to which an application connects. Location transparency has several benefits, including:
- Access to remote data is simple, because database users do not need to know the physical location of database objects.
- Administrators can move database objects with no impact on end-users or existing database applications.

Most typically, administrators and developers use synonyms to establish location transparency for the tables and supporting objects in an application schema. For example, the following statements create synonyms in a database for tables in another, remote database.

  CREATE PUBLIC SYNONYM emp FOR emp@my_link;
  CREATE PUBLIC SYNONYM dept FOR dept@my_link;

Now, rather than access the remote tables with a query such as:

  SELECT e.ename, d.dname FROM emp@my_link e, dept@my_link d WHERE e.deptno = d.deptno;

an application can issue a much simpler query that does not have to account for the location of the remote tables:

  SELECT e.ename, d.dname FROM emp e, dept d WHERE e.deptno = d.deptno;

In addition to synonyms, developers can also use views and stored procedures to establish location transparency for applications that work in a distributed database system.
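The idea of location transparency can be tried without an Oracle installation. The sketch below is an analogy, not Oracle behaviour: it uses Python's built-in sqlite3 module, an attached in-memory database (named `remote_db` here, a hypothetical name) stands in for the remote database, and temporary views stand in for the synonyms. The sample rows are invented for the illustration.

```python
import sqlite3

# The main connection plays the role of the local database.
conn = sqlite3.connect(":memory:")
# A second, attached database stands in for the remote database.
conn.execute("ATTACH DATABASE ':memory:' AS remote_db")

conn.execute("CREATE TABLE remote_db.dept (deptno INTEGER PRIMARY KEY, dname TEXT)")
conn.execute("CREATE TABLE remote_db.emp (empno INTEGER PRIMARY KEY, ename TEXT, deptno INTEGER)")
conn.executemany("INSERT INTO remote_db.dept VALUES (?, ?)",
                 [(10, "ACCOUNTING"), (20, "RESEARCH")])
conn.executemany("INSERT INTO remote_db.emp VALUES (?, ?, ?)",
                 [(1, "KING", 10), (2, "SCOTT", 20)])

# Temporary views play the role of the synonyms: they hide where the tables live.
conn.execute("CREATE TEMP VIEW emp AS SELECT * FROM remote_db.emp")
conn.execute("CREATE TEMP VIEW dept AS SELECT * FROM remote_db.dept")

# The application query no longer mentions the location of the tables.
rows = conn.execute(
    "SELECT e.ename, d.dname FROM emp e, dept d "
    "WHERE e.deptno = d.deptno ORDER BY e.ename"
).fetchall()
print(rows)
```

If the tables were later moved to another attached database, only the two view definitions would change and the application query would stay the same, which is the benefit the section describes for synonyms.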

BENCHMARK
The TPC-H (ad-hoc, decision support) benchmark represents decision support environments where users don't know which queries will be executed against a database system; hence the "ad-hoc" label. Because the queries are ad hoc, no prior knowledge of them can be built into the DBMS, and the query execution times can be very long. The next figure shows the schema of the database used for our tests. The size of the database is about 100 MB, which corresponds to a scale of 0.1 when you generate the flat files for loading the database.

[Figure: schema of the TPC-H database.]

PART
  P_PARTKEY, P_NAME, P_MFGR, P_BRAND, P_TYPE, P_SIZE, P_CONTAINER, P_RETAILPRICE, P_COMMENT
SUPPLIER (1000 rows)
  S_SUPPKEY, S_NAME, S_ADDRESS, S_NATIONKEY, S_PHONE, S_ACCTBAL, S_COMMENT
PARTSUPP
  PS_PARTKEY, PS_SUPPKEY, PS_AVAILQTY, PS_SUPPLYCOST, PS_COMMENT
CUSTOMER
  C_CUSTKEY, C_NAME, C_ADDRESS, C_NATIONKEY, C_PHONE, C_ACCTBAL, C_MKTSEGMENT, C_COMMENT
NATION (25 rows)
  N_NATIONKEY, N_NAME, N_REGIONKEY, N_COMMENT
LINEITEM
  L_ORDERKEY, L_PARTKEY, L_SUPPKEY, L_LINENUMBER, L_QUANTITY, L_EXTENDEDPRICE, L_DISCOUNT, L_TAX, L_RETURNFLAG, L_LINESTATUS, L_SHIPDATE, L_COMMITDATE, L_RECEIPTDATE, L_SHIPINSTRUCT, L_SHIPMODE, L_COMMENT
REGION (5 rows)
  R_REGIONKEY, R_NAME, R_COMMENT
ORDERS
  O_ORDERKEY, O_CUSTKEY, O_ORDERSTATUS, O_TOTALPRICE, O_ORDERDATE, O_ORDERSPRIORITY, O_CLERK, O_SHIPPRIORITY, O_COMMENT

The benchmark has been tested on different topologies (machines) and on each node of the cluster, but for the reported benchmark we use only two nodes, LPC1 and LPC2, because not all the nodes have the same configuration and disk space. In any case, this was beyond the scope of this training period.

The first test was run on a single node holding the whole database, in order to obtain a reference benchmark time.
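To relate the scale of 0.1 mentioned above to table sizes, the following Python sketch estimates row counts from the nominal scale-factor-1 cardinalities given in the TPC-H specification. These base figures come from the specification, not from this report, and the LINEITEM figure is approximate; the function name `estimated_rows` is hypothetical.

```python
# Nominal row counts at scale factor 1 (TPC-H specification).
BASE_ROWS_SF1 = {
    "SUPPLIER": 10_000,
    "PART": 200_000,
    "PARTSUPP": 800_000,
    "CUSTOMER": 150_000,
    "ORDERS": 1_500_000,
    "LINEITEM": 6_000_000,  # approximate: the exact count varies slightly
}

# NATION and REGION do not grow with the scale factor.
FIXED_ROWS = {"NATION": 25, "REGION": 5}


def estimated_rows(scale_factor: float) -> dict:
    """Estimate the number of rows per table for a given scale factor."""
    rows = {table: round(n * scale_factor) for table, n in BASE_ROWS_SF1.items()}
    rows.update(FIXED_ROWS)
    return rows


for table, count in sorted(estimated_rows(0.1).items()):
    print(f"{table:<9} {count:>9}")
```

At a scale of 0.1 the estimate for SUPPLIER is 1000 rows, which agrees with the figure shown in the schema above.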

The next test was run on two nodes connected by a 100 Mb/s network. The first node holds the tables:
- REGION,
- PART,
- PARTSUPP,
- SUPPLIER,
- NATION,
and the second node holds:
- CUSTOMER,
- LINEITEM,
- ORDERS.

The last test was carried out with the same distribution of the tables across the nodes, but with a faster network (1 Gb/s FDDI).

Remark: The size of the database is about 100 MB (70 MB for the LINEITEM table). The database was loaded with the database generator (a TPC-H tool), which creates eight flat files; we used the SQL*Loader tool to insert them into the database.

Results
We ran several tests with different database sizes. The first test used a database of 1 GB. However, we could not carry out the tests correctly (some queries ran for more than two nights), so we chose smaller sizes for our tests.

Important remark: For a well-performing cluster, the most important thing is for each node to have a fast hard disk and a lot of memory. A good configuration for each node of the cluster:
- a dual-processor PC,
- at least 256 MB of RAM,
- 2 SCSI controllers (UW or U2W),
- 2 to 3 SCSI hard disks (UW or U2W): one disk for the system, one for the indexes and swap, and one for the datafiles (tablespaces),
- a good network (1 Gb/s Ethernet).

[Table: Reference benchmark on a single node. Columns: computer name; SQL request; time in seconds for the 100 MB database; time in seconds for the 250 MB database; time in seconds for the 500 MB database. Rows: LPC1 with Q7.sql to Q12.sql and Q14.sql to Q20.sql.]
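The two-node table distribution described above determines which tables a query has to fetch over the network. The Python sketch below works this out for query Q7, whose table list is taken from the annex. The node labels `node1` and `node2` and the helper `remote_tables` are hypothetical; the report does not state which of LPC1 and LPC2 is the first node.

```python
# Table placement used in the two-node tests.
PLACEMENT = {
    "node1": {"REGION", "PART", "PARTSUPP", "SUPPLIER", "NATION"},
    "node2": {"CUSTOMER", "LINEITEM", "ORDERS"},
}


def remote_tables(query_tables, local_node):
    """Return the tables of a query that are not stored on the local node."""
    local = PLACEMENT[local_node]
    return sorted(t for t in set(query_tables) if t not in local)


# Tables referenced by Q7 (NATION is used twice, as n1 and n2).
Q7_TABLES = ["SUPPLIER", "LINEITEM", "ORDERS", "CUSTOMER", "NATION"]

print(remote_tables(Q7_TABLES, "node1"))  # ['CUSTOMER', 'LINEITEM', 'ORDERS']
print(remote_tables(Q7_TABLES, "node2"))  # ['NATION', 'SUPPLIER']
```

Run from the first node, Q7 has to reach LINEITEM, the largest table, through the database link; run from the second node it only fetches the much smaller SUPPLIER and NATION tables. This may be one reason why the tests were repeated with the queries issued from each node.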

[Table: Reference benchmark on two nodes (network 100 Mb/s), queries run on LPC1. Columns: computer name; SQL request; time in seconds for the 100 MB, 250 MB and 500 MB databases. Rows: LPC1, LPC2 with Q7.sql to Q12.sql and Q14.sql to Q20.sql.]

[Table: Reference benchmark on two nodes (network 100 Mb/s), queries run on LPC2. Same columns and rows.]

[Table: Reference benchmark on two nodes (network 1 Gb/s), queries run on LPC1. Same columns and rows.]

[Table: Reference benchmark on two nodes (network 1 Gb/s), queries run on LPC2. Same columns and rows.]

Graphical results

[Figure: Benchmark, 100 MB of data. Time in seconds per query for five configurations: single node; two nodes at 100 Mb/s; two nodes at 100 Mb/s, test on LPC2; two nodes at 1 Gb/s; two nodes at 1 Gb/s, test on LPC2.]

Two queries are missing from this figure: their execution times were so high that the details of the other queries would not have been visible. The next figure therefore shows queries Q8 and Q9.

[Figure: Benchmark, 100 MB of data, queries Q8 and Q9. Time in seconds for the same five configurations.]

[Figure: Benchmark, 250 MB of data, queries Q7, Q10, Q11, Q12 and Q14 to Q20. Time in seconds for the same five configurations.]

As on the previous page, two queries are missing from this figure. The next figure shows queries Q8 and Q9.

[Figure: Benchmark, 250 MB of data, queries Q8 and Q9. Time in seconds for the same five configurations.]

[Figure: Benchmark, 500 MB of data, queries Q7, Q10, Q11, Q12 and Q14 to Q20. Time in seconds for the same five configurations.]

As on the previous page, two queries are missing from this figure. The next figure shows queries Q8 and Q9.

[Figure: Benchmark, 500 MB of data, queries Q8 and Q9. Time in seconds for the same five configurations.]

Conclusion
This technical report presented the various mechanisms used to set up a distributed database. The goal of the project was to show that a distributed database can improve the execution time of SQL requests. The tests that were carried out show that the results differ from one request to another: for some distributed requests, the execution time decreases compared to the single-node case.

The questions raised by this project are not closed. Many points remain to be studied in order to get a global view of the subject. It would be necessary, for example, to carry out tests on a cluster with more powerful nodes, or to test various distributions of the tables over 2, 3 or even 4 databases. Finally, the basic concept of distributed data is only an introduction to other subjects, such as database replication.

To conclude, I would like to thank Mr Harald KOSCH.

ANNEXES

Network setup
The way the network is set up depends on whether the machine acts as a server, a client, or both.

First stage Net8 Assistant for a client
Start the Net8 Assistant program:

  Prompt>netasst &

The only thing you have to set up is a domain (WORLD is the default value, but you can enter whatever you want). I recommend using an internet domain such as itec.uni.klu.ac.at.

First stage Net8 Assistant for a server
For your server, you have to add a listener, and configure only the database name, the listening location, and the default domain name.

Tips and Tricks: avoid special characters such as " - " (as in uni-klu), " \ " or " / " in these names, because creating a database link with them does not work.

Second stage Net8 Easy Config
On each server and client, you have to set up the Net8 Easy Config program to make sure that communication works between all the nodes of the cluster. To do this, launch the Net8 Easy Config program:

  Prompt>netec &

[Screenshot: Net8 Easy Config start screen]

First, add a new service and give it a service name (this name is important for all the following steps). Remark: the service name is automatically completed with the default domain that you chose before. With the Net8 Easy Config program you can also test the communication with an Oracle database server.

After this, select a network protocol (TCP/IP is selected by default).

Next, specify the host name of the remote database and the port number (port 1521 is the default value).

Then specify the System Identifier (SID). The SID is only used for older versions of Oracle such as 8.0.5; for Oracle 8i you can specify either the Oracle database name or, to reach an older version of Oracle, a System Identifier.

Tips and Tricks: before testing, I recommend clicking the Next button and then the Finish button, because the program sometimes crashes ;-). After restarting the program you will be able to test whether your communication service works correctly.
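The settings entered above (service name, protocol, host, port, and SID or database name) end up as an entry in the Net8 tnsnames.ora file. As a rough sketch, the Python function below renders such an entry so the pieces can be seen together. The function name `tnsnames_entry`, the host `drau.itec.uni-klu.ac.at` and the SID `db0` are assumptions made for the illustration; the file written by Net8 Easy Config may differ in layout.

```python
def tnsnames_entry(alias, host, port=1521, sid=None, service_name=None):
    """Render a tnsnames.ora-style entry for one TCP/IP service."""
    # Exactly one of sid (older releases) or service_name (8i) must be given.
    if (sid is None) == (service_name is None):
        raise ValueError("give exactly one of sid or service_name")
    if sid is not None:
        connect = f"(SID = {sid})"
    else:
        connect = f"(SERVICE_NAME = {service_name})"
    return (
        f"{alias} =\n"
        f"  (DESCRIPTION =\n"
        f"    (ADDRESS = (PROTOCOL = TCP)(HOST = {host})(PORT = {port}))\n"
        f"    (CONNECT_DATA = {connect})\n"
        f"  )\n"
    )


# Hypothetical values: the host and SID are assumed, not taken from the report.
entry = tnsnames_entry(
    "service_test.itec.uni.klu.ac.at",
    host="drau.itec.uni-klu.ac.at",
    sid="db0",
)
print(entry)
```

The alias on the first line is the service name that is later passed to the USING clause of CREATE DATABASE LINK.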

Now you can test your communication. Normally you obtain this screen:

[Screenshot: result of the connection test]

If it doesn't work, see the Oracle documentation, or look at the following tips and tricks.

Tips and Tricks: Under Windows NT, you can check whether the Oracle services are running correctly. The most important service is OracleTNSListener80. If this service is not started correctly, you will have problems communicating with a remote database and will get TNS error messages. In that case, reinstalling Oracle may be necessary ;-(.

Under Linux, you can test whether the listener is working:

  PROMPT>lsnrctl
  LSNRCTL>status
  LSNRCTL>start

Use start to start the listener, stop to stop the listener, and help for help.

Global administration
To manage all of these databases, I recommend using the Oracle Enterprise Manager tool. This tool works only on Windows. Oracle Enterprise Manager combines a graphical console, agents, common services, and tools to provide an integrated, comprehensive systems management platform for managing Oracle products. From Enterprise Manager's Console, you can:
- Administer, diagnose, and tune multiple databases.
- Distribute software to multiple servers and clients.
- Schedule jobs on multiple nodes at varying time intervals.
- Monitor objects and events throughout the network.
- Customise your display using multiple graphic maps and groups of network objects, such as nodes and databases.
- Administer Oracle Parallel Servers.
- Integrate participating Oracle and third-party tools (Fail Safe,...).

Remark: On each NT server, the OracleAgent80 service must be started. On Linux, start the agent from the listener control program:

  Prompt>lsnrctl
  LSNRCTL>dbsnmp_start

Use dbsnmp_start to start the agent and dbsnmp_stop to stop it.

Queries

Query no. 1:
-- TPC-H/TPC-R Volume Shipping Query (Q7)
-- Functional Query Definition
-- Approved February 1998
select
  supp_nation, cust_nation, l_year, sum(volume) as revenue
from (
  select
    n1.n_name as supp_nation,
    n2.n_name as cust_nation,
    to_char(l_shipdate, 'YYYY') as l_year,
    l_extendedprice * (1 - l_discount) as volume
  from
    supplier, lineitem, orders, customer, nation n1, nation n2
  where
    s_suppkey = l_suppkey
    and o_orderkey = l_orderkey
    and c_custkey = o_custkey
    and s_nationkey = n1.n_nationkey
    and c_nationkey = n2.n_nationkey
    and (
      (n1.n_name = ':1' and n2.n_name = ':2')
      or (n1.n_name = ':2' and n2.n_name = ':1')
    )
    and l_shipdate between '01-jan-95' and '31-dec-96'
) shipping
group by supp_nation, cust_nation, l_year
order by supp_nation, cust_nation, l_year
/

Query no. 2:
-- TPC-H/TPC-R National Market Share Query (Q8)
-- Functional Query Definition
-- Approved February 1998
select
  o_year, sum(volume) mkt_share
from (
  select
    to_char(o_orderdate, 'YYYY') o_year,
    l_extendedprice * (1 - l_discount) volume,
    n2.n_name nation
  from
    part, supplier, lineitem, orders, customer, nation n1, nation n2, region
  where
    p_partkey = l_partkey
    and s_suppkey = l_suppkey
    and l_orderkey = o_orderkey
    and o_custkey = c_custkey
    and c_nationkey = n1.n_nationkey
    and n1.n_regionkey = r_regionkey
    and r_name = ':2'
    and s_nationkey = n2.n_nationkey
    and o_orderdate between date '...' and date '...'
    and p_type = ':3'
) all_nations
group by o_year
order by o_year

Query no. 3:
-- TPC-H/TPC-R Product Type Profit Measure Query (Q9)
-- Functional Query Definition
-- Approved February 1998
select
  nation, o_year, sum(amount) sum_profit
from (
  select
    n_name nation,
    to_char(o_orderdate, 'yyyy') o_year,
    l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity amount
  from
    part, supplier, lineitem, partsupp, orders, nation
  where
    s_suppkey = l_suppkey
    and ps_suppkey = l_suppkey
    and ps_partkey = l_partkey
    and p_partkey = l_partkey
    and o_orderkey = l_orderkey
    and s_nationkey = n_nationkey
    and p_name like '%:1%'
) profit
group by nation, o_year
order by nation, o_year desc

Query no. 4:
-- TPC-H/TPC-R Returned Item Reporting Query (Q10)
-- Functional Query Definition
-- Approved February 1998
select
  c_custkey, c_name, c_acctbal, n_name, c_address, c_phone,
  sum(l_extendedprice * (1 - l_discount)) revenue
from
  customer, orders, lineitem, nation
where
  c_custkey = o_custkey
  and l_orderkey = o_orderkey
  and o_orderdate >= '01-JAN-1999'
  and o_orderdate < '01-MAR-1999'
  and l_returnflag = 'R'
  and c_nationkey = n_nationkey
group by c_custkey, c_name, c_acctbal, c_phone, n_name, c_address

Query no. 5:
-- TPC-H/TPC-R Important Stock Identification Query (Q11)
-- Functional Query Definition
-- Approved February 1998
select
  ps_partkey, sum(ps_supplycost * ps_availqty) as value
from
  partsupp, supplier, nation
where
  ps_suppkey = s_suppkey
  and s_nationkey = n_nationkey
  and n_name = ':1'
group by ps_partkey
having sum(ps_supplycost * ps_availqty) > (
  select sum(ps_supplycost * ps_availqty) * 150
  from partsupp, supplier, nation
  where
    ps_suppkey = s_suppkey
    and s_nationkey = n_nationkey
    and n_name = ':1'
)
order by value desc

Query no. 6:
-- TPC-H/TPC-R Shipping Modes and Order Priority Query (Q12)
-- Functional Query Definition
-- Approved February 1998
select
  l_shipmode, sum(l_quantity) high_line_count
from
  orders, lineitem
where
  o_orderkey = l_orderkey
  and l_shipmode in ('AIR', 'SHIP')
  and l_commitdate < l_receiptdate
  and l_shipdate < l_commitdate
  and l_receiptdate >= '01-JAN-1950'
  and l_receiptdate < '01-JAN-1998'
group by l_shipmode
order by l_shipmode

Query no. 7:
-- TPC-H/TPC-R Promotion Effect Query (Q14)
-- Functional Query Definition
-- Approved February 1998
select
  ... * sum(l_extendedprice * (1 - l_discount)) as promo_revenue
from
  lineitem, part
where
  l_partkey = p_partkey
  and l_shipdate >= '01-JAN-1950'
  and l_shipdate < '01-JAN-1999'

Query no. 8:
-- TPC-H/TPC-R Top Supplier Query (Q15)
-- Functional Query Definition
-- Approved February 1998
create view revenue (supplier_no, total_revenue) as
  select
    l_suppkey, sum(l_extendedprice * (1 - l_discount))
  from
    lineitem
  where
    l_shipdate >= '01-JAN-1950'
    and l_shipdate < '01-JAN-1999'
  group by l_suppkey
/
select
  s_suppkey, s_name, s_address, s_phone, total_revenue
from
  supplier, revenue
where
  s_suppkey = supplier_no
  and total_revenue = (
    select max(total_revenue) from revenue
  )
order by s_suppkey
/
drop view revenue

Query no. 9:
-- TPC-H/TPC-R Parts/Supplier Relationship Query (Q16)
-- Functional Query Definition
-- Approved February 1998
select
  p_brand, p_type, p_size, count(distinct ps_suppkey) as supplier_cnt
from
  partsupp, part
where
  p_partkey = ps_partkey
  and p_brand <> ':1'
  and p_type not like ':2%'
  and p_size in (10, 100, 2000, 3000, 50, 69, 5000, 10000)
  and ps_suppkey not in (
    select s_suppkey
    from supplier
    where s_comment like '%Customer%Complaints%'
  )
group by p_brand, p_type, p_size
order by supplier_cnt desc, p_brand, p_type, p_size

Query no. 10:
-- TPC-H/TPC-R Small-Quantity-Order Revenue Query (Q17)
-- Functional Query Definition
-- Approved February 1998
select
  sum(l_extendedprice) / 7.0 as avg_yearly
from
  lineitem, part
where
  p_partkey = l_partkey
  and p_brand = ':1'
  and p_container = ':2'
  and l_quantity < (
    select 0.2 * avg(l_quantity)
    from lineitem
    where l_partkey = p_partkey
  );

Query no. 11:
-- TPC-H/TPC-R Large Volume Customer Query (Q18)
-- Functional Query Definition
-- Approved February 1998
select
  c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity)
from
  customer, orders, lineitem
where
  o_orderkey in (
    select l_orderkey
    from lineitem
    group by l_orderkey
    having sum(l_quantity) > ...
  )
  and c_custkey = o_custkey
  and o_orderkey = l_orderkey
group by c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice
order by o_totalprice desc, o_orderdate;

Query no. 12:
-- TPC-H/TPC-R Discounted Revenue Query (Q19)
-- Functional Query Definition
-- Approved February 1998
select
  sum(l_extendedprice * (1 - l_discount)) as revenue
from
  lineitem, part
where
  (
    p_partkey = l_partkey
    and p_brand = ':1'
    and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
    and l_quantity >= 1 and l_quantity <= ...
    and p_size between 1 and 5
    and l_shipmode in ('AIR', 'AIR REG')
    and l_shipinstruct = 'DELIVER IN PERSON'
  )
  or
  (
    p_partkey = l_partkey
    and p_brand = ':2'
    and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
    and l_quantity >= 1 and l_quantity <= ...
    and p_size between 1 and 10
    and l_shipmode in ('AIR', 'AIR REG')
    and l_shipinstruct = 'DELIVER IN PERSON'
  )
  or
  (
    p_partkey = l_partkey
    and p_brand = ':3'
    and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')
    and l_quantity >= 1 and l_quantity <= ...
    and p_size between 1 and 15
    and l_shipmode in ('AIR', 'AIR REG')
    and l_shipinstruct = 'DELIVER IN PERSON'
  );

Query no. 13:
-- TPC-H/TPC-R Potential Part Promotion Query (Q20)
-- Functional Query Definition
-- Approved February 1998
select
  s_name, s_address
from
  supplier, nation
where
  s_suppkey in (
    select ps_suppkey
    from partsupp
    where
      ps_partkey in (
        select p_partkey
        from part
        where p_name like ':1%'
      )
      and ps_availqty > (
        select 0.5 * sum(l_quantity)
        from lineitem
        where
          l_partkey = ps_partkey
          and l_suppkey = ps_suppkey
          and l_shipdate >= '01-JAN-1950'
          and l_shipdate < '01-JAN-1999'
      )
  )
  and s_nationkey = n_nationkey
  and n_name = ':3'
order by s_name;

Time report


More information

vsql: Verifying Arbitrary SQL Queries over Dynamic Outsourced Databases

vsql: Verifying Arbitrary SQL Queries over Dynamic Outsourced Databases vsql: Verifying Arbitrary SQL Queries over Dynamic Outsourced Databases Yupeng Zhang, Daniel Genkin, Jonathan Katz, Dimitrios Papadopoulos and Charalampos Papamanthou Verifiable Databases client SQL database

More information

Parallelism Strategies In The DB2 Optimizer

Parallelism Strategies In The DB2 Optimizer Session: A05 Parallelism Strategies In The DB2 Optimizer Calisto Zuzarte IBM Toronto Lab May 20, 2008 09:15 a.m. 10:15 a.m. Platform: DB2 on Linux, Unix and Windows The Database Partitioned Feature (DPF)

More information

XWeB: the XML Warehouse Benchmark

XWeB: the XML Warehouse Benchmark XWeB: the XML Warehouse Benchmark CEMAGREF Clermont-Ferrand -- Université de Lyon (ERIC Lyon 2) hadj.mahboubi@cemagref.fr -- jerome.darmont@univ-lyon2.fr September 17, 2010 XWeB: CEMAGREF the XML Warehouse

More information

Oracle. Professional. WITH Function-Based Indexes (FBIs), I was able to alter an execution. Avoid Costly Joins with FBIs Pedro Bizarro.

Oracle. Professional. WITH Function-Based Indexes (FBIs), I was able to alter an execution. Avoid Costly Joins with FBIs Pedro Bizarro. Oracle Solutions for High-End Oracle DBAs and Developers Professional Avoid Costly Joins with FBIs Pedro Bizarro In this article, Pedro Bizarro describes how to use Function-Based Indexes to avoid costly

More information

Two-Phase Optimization for Selecting Materialized Views in a Data Warehouse

Two-Phase Optimization for Selecting Materialized Views in a Data Warehouse Two-Phase Optimization for Selecting Materialized Views in a Data Warehouse Jiratta Phuboon-ob, and Raweewan Auepanwiriyakul Abstract A data warehouse (DW) is a system which has value and role for decision-making

More information
