Big Data NoSQL Databases Individual Assignment 20 Points


If this lab is an Individual assignment, you must do all coded programs on your own. You may ask others for help on the language syntax, but you must organize and present your own logical solution to the problem. No lab is complete until the student submits the signed pledge form associated with that lab. I realize that no coded programs will be graded until I turn in the sign & pledge form associated with that program; any late penalties will continue to compound until the pledge form is submitted.

If this lab is a team assignment, both team members may share logic as they program side by side on their own computers. Each person must type all of his/her own code as part of the learning process. Team assignments are never to be "You do this portion and I'll do that portion" or "You do this lab and I'll do the next lab".

Some of the lab assignments will have short answer questions. These short answer questions will be spot checked and graded for completion, but not checked for accuracy. Once these labs are graded and returned, I encourage you to compare answers with another class member who has also had the lab graded and returned.

I/We realize that the penalty for turning in work that is not my own, or assisting others in doing so, can range from an "F" in the class to dismissal from Trinity University. I realize that it is a violation of academic integrity to share any portion of this lab with any person (outside my 2320 team & professor)!

Print Name _?_    Time Required = _?_ Hrs.    Signature (pledged) _?_

Big Data NoSQL Databases Individual Assignment 20 Points

Big Data and Data Analytics are here to stay. Much of the NoSQL research and development has focused on the need for speed and access on large Internet web sites.

10% Extra Credit For Each: -- Do This After You Finish The Lab ---
1] {Sign/Pledge} I own a personal computer that runs Windows _?_ {7/8/10/????}. I have successfully installed MongoDB on this system.
2] {Sign/Pledge} I own a personal computer that runs Linux _?_ {what flavor}. I have successfully installed MongoDB on this system.
3] {Sign/Pledge} I own a Mac computer that runs OS X _?_ {what version}. I have successfully installed MongoDB on this system.

Install MongoDB
1] Find a YouTube video which enables you to successfully install, and start, MongoDB on your Windows server. Set up your databases at C:\Data\db. Set up your application in C:\Program Files\MongoDB.
2] The URL for the tutorial I used was _?_.
3] {Sign/Pledge} I have successfully installed MongoDB on my Windows server.
4] The IP Address Of My Windows Server is _?_.
5] The DNS Entry For My Windows Server is _?_. Hint: CS-??.cs.trinity.edu

Presentation #1
Watch "MongoDB Tutorial 1: What is MongoDB?" These homework questions can be answered as you watch the video. They are ordered to follow the content. There will be times when you should stop the video and do some web searching along the way.

Big-Data-NoSQL-MongoDB-1-HW.docx Database Systems (3343) Name
1] MongoDB is a high performance D_?_ O_?_ Database.
2] MongoDB is developed and supported by a company that was initially called 1_?_.
3] {T/F} MongoDB is open source.
4] According to the presentation, there are three major database types. List them. R_?_ O_?_ N_?_
5] The problem that generated the need for NoSQL was that Relational Databases could not handle B_?_ D_?_.
6] RDBMS systems are not horizontally S_?_.
7] Horizontal Scalability means that you keep adding more and more _?_ as you need more power.

Big Data
Stop the Presentation! Look up and read about Big Data. Start with Wikipedia.
1] Big data is a term for data sets that are so L_?_ or complex that traditional data processing applications are inadequate. Challenges include analysis, capture, data curation, search, sharing, storage, transfer, visualization, querying, updating and information privacy.
2] The term Big Data often refers simply to the use of predictive An_?_ or certain other advanced methods to extract value from data, and seldom to a particular size of data set. Accuracy in big data may lead to more confident decision making, and better decisions can result in greater operational efficiency, cost reduction and reduced risk.
3] Analysis of the data sets, extracted from Big Data, can find new Correl_?_ to "spot business trends, prevent diseases, combat crime and so on."
4] Data sets are Gro_?_ rapidly in part because they are increasingly gathered by cheap and numerous information-sensing mobile devices, aerial (remote sensing), software logs, cameras, microphones, radio-frequency identification (RFID) readers and wireless sensor networks.
5] According to research, the world's technological per-capita capacity to store information has roughly Do_?_ every 40 months since the 1980s; as of 2012, every day 2.5 exabytes (2.5 × 10^18 bytes) of data are created.
6] Relational database management systems and desktop statistics and visualization packages often have difficulty handling B_?_ D_?_. The work instead requires "massively parallel software running on tens, hundreds, or even thousands of servers".
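
A rough illustration of horizontal scaling, as described in the Presentation #1 and Big Data questions above: the data is split across several servers, and more servers are added as the load grows. The sketch below is only one simple way to do this (hash partitioning); the function names and server names are hypothetical and are not taken from the presentation or from any particular database.

```python
import hashlib


def pick_server(key, servers):
    """Choose a server for a key by hashing the key (hypothetical helper)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Turn the first 8 bytes of the hash into an integer, then wrap it
    # around the number of servers that are currently available.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


def distribute(keys, servers):
    """Return a dict mapping each server name to the keys it would store."""
    placement = {server: [] for server in servers}
    for key in keys:
        placement[pick_server(key, servers)].append(key)
    return placement
```

Calling `distribute(keys, ["server-a", "server-b", "server-c"])` spreads the keys over three machines; passing a longer server list spreads the same keys over more machines. Real systems usually use more careful schemes so that adding a server moves as few keys as possible.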

7] {T/F} What is considered "Big Data" varies depending on the capabilities of the users and their tools, and expanding capabilities make big data a moving target. "For some organizations, facing hundreds of gigabytes of data for the first time may trigger a need to reconsider data management options. For others, it may take tens or hundreds of terabytes before data size becomes a significant consideration."

Key-Value NoSQL Databases
Look up and read about Key-Value Database. Start with Wikipedia.
1] A key-value store, or key-value database, is a data storage paradigm designed for storing, retrieving, and managing associative arrays, a data structure more commonly known today as a D_?_ or hash.
2] In key-value databases, dictionaries contain a Col_?_ of objects, or records, which in turn have many different fields within them, each containing data. These records are stored and retrieved using a key that uniquely identifies the record, and is used to quickly find the data within the database.
3] Key-value stores work in a very different fashion from the better known relational databases (RDB). RDBs pre-define the data structure in the database as a series of tables containing fields with well-defined data types. Exposing the data types to the database program allows it to apply a number of optimizations. In contrast, key-value systems treat the data as a single opaque collection which may have different fields for every Rec_?_. This offers considerable flexibility and more closely follows modern concepts like object-oriented programming. Because optional values are not represented by placeholders as in most RDBs, key-value stores often use far less memory to store the same database, which can lead to large performance gains in certain workloads.
4] Performance, a lack of standardization and other issues limited key-value systems to niche uses for many years, but the rapid move to CL_?_ computing after 2010 has led to a renaissance as part of the broader NoSQL movement. Some graph databases are also key-value stores internally, adding the concept of the relationships (pointers) between records as a first class data type.

Memcached Key-Value Store NoSQL
Look up and read about Memcached. Start with Wikipedia. Memcached is a Key-Value Store NoSQL Database. It is one of the solutions used to manage Big Data.
1] Memcached is a general-purpose distributed M_?_ C_?_ system. Memcached is often used to speed up dynamic database-driven websites by caching data and objects in RAM to reduce the number of times an external data source (such as a database or API) must be read.
2] Memcached is Op_?_ So_?_ software, licensed under the Revised BSD license.
3] Memcached runs on L_?_, OS_?_, Microsoft W_?_. It depends on the libevent library.

4] Memcached's APIs provide a very large H_?_ table distributed across multiple machines. When the table is full, subsequent inserts cause older data to be purged in least recently used (LRU) order. Applications using Memcached typically layer requests and additions into RAM before falling back on a slower backing store, such as a database.
5] The size of Memcached's hash table is often very large. It is limited to available M_?_ across all the servers in the cluster of servers in a data center. Where high volume, wide audience web publishing requires it, this may stretch to many gigabytes. Memcached can be equally valuable for situations where either the number of requests for content is high, or the cost of generating a particular piece of content is high.
6] Memcached was originally developed by Danga Interactive for LiveJournal. List at least two other systems, besides Facebook, that use Memcached. Google App Engine, AppScale, Microsoft Azure and Amazon Web Services also offer a Memcached service through an API.

Redis Key-Value Store NoSQL
Look up and read about Redis. Start with Wikipedia. Redis is a Key-Value Store NoSQL Database.
1] Redis is a data structure server. It is open-source, networked, in-memory, and stores K_?_ with optional durability. The development of Redis has been sponsored by Redis Labs since June. According to the monthly ranking by DB-Engines.com, Redis is the most popular key-value database.
2] The name Redis means REmote D_?_ Server.
3] _?_ is an open source data structure server; it is the most popular key-value database.

Oracle Coherence Key-Value Store NoSQL
Look up and read about Oracle Coherence. Start with Wikipedia. Oracle Coherence is a Key-Value Store NoSQL Database.
1] Oracle Coherence is a proprietary Java-based in-memory data grid, designed to have better reliability, Sca_?_ and performance than traditional relational database management systems.
2] Oracle's C_?_ product, which it acquired by purchasing Tangosol, provides Oracle with an entry into the NoSQL market.
3] _?_ is a proprietary Java-based in-memory data grid, designed to have better reliability, scalability and performance than traditional relational database management systems; Oracle purchased the product from Tangosol in order to move into the NoSQL market.
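
The caching behavior described in the Memcached questions earlier on this page (check RAM first, fall back on a slower backing store, and purge the least recently used entries when the table is full) can be sketched in a few lines. This is a minimal single-process illustration, not Memcached's implementation or API; the class name `TinyLRUCache` and the helper `cached_lookup` are hypothetical.

```python
from collections import OrderedDict


class TinyLRUCache:
    """A small in-memory key-value cache with least-recently-used eviction."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest entry last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        # A successful read makes this entry the most recently used one.
        self._items.move_to_end(key)
        return self._items[key]

    def set(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        # When the table is over capacity, purge the least recently used entry.
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)


def cached_lookup(cache, key, slow_fetch):
    """Return the cached value, calling slow_fetch only on a cache miss."""
    value = cache.get(key)
    if value is None:
        value = slow_fetch(key)  # stands in for a database or API call
        cache.set(key, value)
    return value
```

Here `slow_fetch` stands in for the slower backing store; it is called once per key for as long as that key stays in the cache. A real deployment spreads the table across many machines, which this sketch does not attempt.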

Riak Key-Value Store NoSQL
Look up and read about Riak. Start with Wikipedia. Riak is a Key-Value Store NoSQL Database.
1] Riak is a distributed NoSQL key-value data store that offers high availability, fault tolerance, operational simplicity, and Sc_?_. Riak is written in Erlang.
2] In addition to the open-source version, Riak comes in a supported En_?_ version and a cloud storage version. Riak has fault tolerance data replication and automatic data distribution across the cluster for performance and resilience.
3] Riak is an Op_?_ So_?_ database that offers high availability, fault tolerance, simplicity, and scalability.
4] _?_ is a distributed NoSQL key-value data store that offers high availability, fault tolerance, operational simplicity, and scalability. It is written in Erlang and offers a free community version as well as an enterprise version and a cloud storage version.

Bigtable Tabular NoSQL
Look up and read about Bigtable. Start with Wikipedia. Bigtable is a Tabular NoSQL Database.
1] Bigtable is a compressed, high performance, and proprietary data storage system built on G_?_ F_?_ S_?_, Chubby Lock Service, SSTable (log-structured storage like LevelDB) and a few other Google technologies.
2] Google's reasons for developing its own Bigtable database include Sc_?_ and better control of performance characteristics.
3] _?_ is a proprietary, tabular NoSQL database built on the Google File System.

HBase Tabular NoSQL
Look up and read about Apache HBase. Start with Wikipedia. HBase is a Tabular NoSQL Database.
1] HBase is an open source, non-relational, distributed database modeled after Google's Bigtable and written in Java. It is developed as part of A_?_ Software Foundation's Hadoop project and runs on top of HDFS (Hadoop Distributed Filesystem), providing Bigtable-like capabilities for Hadoop.
2] HBase is an Op_?_ So_?_, non-relational, distributed database modeled after Google's Bigtable.
3] HBase provides Apache with a fault-tolerant way of storing large quantities of Sp_?_ data (small amounts of information caught within a large collection of empty or unimportant data, such as finding the 50 largest items in a group of 2 billion records, or finding the non-zero items representing less than 0.1% of a huge collection).
4] {T/F} HBase is now serving several data-driven websites, including Facebook's Messaging Platform. As of 2015, HBase is the second most popular NoSQL Tabular Database; first was Apache Cassandra.

5] _?_ is an Apache Software tabular NoSQL database that is modeled after Google's Bigtable; it is written in Java and used in the Hadoop project.

Accumulo Tabular NoSQL
Look up and read about Apache Accumulo. Start with Wikipedia. Accumulo is a Tabular NoSQL Database.
1] Apache Accumulo is a computer software project that developed a sorted, distributed key/value store based on the Bi_?_ technology from Google.
2] {T/F} Apache Accumulo is a system built on top of Apache Hadoop, Apache ZooKeeper, and Apache Thrift. Written in Java, Accumulo has cell-level access labels and server-side programming mechanisms. As of 2015, Accumulo is the third most popular NoSQL Tabular Database; first and second were Apache Cassandra and Apache HBase.

Cassandra Key-Value Tabular Hybrid NoSQL
Look up and read about Apache Cassandra. Start with Wikipedia. Cassandra is a Hybrid NoSQL Database Combining Tabular & Key-Value Stores.
1] Apache Cassandra is a free and open-source distributed database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of Fa_?_. Cassandra offers robust support for clusters spanning multiple datacenters, with asynchronous masterless replication allowing low latency operations for all clients.
2] Cassandra is an Op_?_ So_?_, non-relational, distributed database that is a hybrid between key-value stores and tabular NoSQL.
3] Cassandra also places a high value on performance. In 2012, University of Toronto researchers studying NoSQL systems concluded that "In terms of scalability, there is a clear winner throughout our experiments. Cassandra achieves the highest Throu_?_ for the maximum number of nodes in all experiments" although "this comes at the price of high write and read latencies."
4] _?_ is essentially a hybrid between a key-value and a column-oriented (or tabular) database management system. Its data model is a partitioned row store with tunable consistency.

CouchDB Document-Oriented NoSQL
Look up and read about CouchDB. Start with Wikipedia. CouchDB is a Document-Oriented NoSQL Database.
1] Apache CouchDB, commonly referred to as CouchDB, is open source database software that focuses on ease of use and having an architecture that "completely embraces the W_?_".

2] CouchDB is Op_?_ So_?_ database software that focuses on ease of use and having an architecture that "completely embraces the Web".
3] Unlike a relational database, a CouchDB database does not store data and relationships in tables. Instead, each database is a collection of independent Do_?_.
4] Each CouchDB document maintains its own data and self-contained schema. An application may access multiple databases, such as one stored on a user's mobile phone and another on a server. Document metadata contains Rev_?_ information, making it possible to merge any differences that may have occurred while the databases were disconnected.
5] _?_ is an Apache document-oriented NoSQL database architecture and is implemented in the concurrency-oriented language Erlang; it uses JSON to store data, JavaScript as its query language using MapReduce, and HTTP for an API. Each document maintains its own data and self-contained schema.

Cloudant Document-Oriented NoSQL
Look up and read about Cloudant. Start with Wikipedia. Cloudant is a Document-Oriented NoSQL Database.
1] Cloudant is an IB_?_ software product, which is primarily delivered as a cloud-based service. Cloudant is an open source NoSQL database that is based on the Apache-backed CouchDB project and the open source BigCouch project.
2] Cloudant was purchased by IBM, from a company called Cloudant, in order to have an entry into the document-oriented NoSQL market. Cloudant's service provides an integrated data management, search, and analytics engine designed for W_?_ applications.
3] Cloudant scales databases on the CouchDB framework and provides hosting, administrative tools, analytics and commercial support for CouchDB and BigCouch; it has the added advantage of data being redundantly distributed over multiple M_?_.
4] _?_ is an IB_?_ software product, which is primarily delivered as a cloud-based NoSQL service to support document management in Web applications over multiple machines.

MongoDB Document-Oriented NoSQL
Look up and read about MongoDB. Start with Wikipedia. MongoDB is a Document-Oriented NoSQL Database.
1] The name MongoDB was taken from "humongous". MongoDB is a free and open-source cross-platform program whose focus is to provide a D_?_-oriented database.
2] MongoDB is classified as a NoSQL database; it avoids the traditional table-based relational database structure in favor of JSON-like documents with dynamic schemas. MongoDB calls the format BS_?_.
3] {T/F} As of July 2015, MongoDB is the fourth most popular type of database management system, and the most popular for document stores.

4] _?_ is a free and open-source cross-platform program whose focus is to provide a document-oriented database; it is based on the BSON format.

Presentation #2
Return To The Presentation!
1] According to the presentation, Relational Databases have three major features that are missing from NoSQL databases. List them.
   No J_?_ Support
   No Complex T_?_ Support
   No C_?_ Support
2] According to the presentation, NoSQL databases offer three major features that are missing from Relational databases. List them.
   There is Qu_?_ Language support designed for non-relational structures.
   F_?_ Performance
   Horizontally S_?_ to handle much more data.
3] {Relational Database/NoSQL} provides much more functionality.
4] {Relational Database/NoSQL} offers much better performance when using thousands of servers in hundreds of data centers.
5] A Table in a Relational Database will have R_?_.
6] A Table in a Relational Database would be comparable to a C_?_ in NoSQL.
7] In a Relational Database each record represents one I_?_ of entity.
8] In a Relational Database each record will have multiple F_?_.
9] MongoDB is a D_?_ oriented database.

10] The JSON-like format used by MongoDB is called B_?_.
11] The BSON code, above, might represent one document; it would be equivalent to one R_?_ in a relational database.
12] In MongoDB, the BSON code, above, might represent _?_ documents.
13] In MongoDB, the BSON documents might also be called O_?_.
14] In MongoDB, multiple documents form a C_?_.
15] In MongoDB, a document is comparable to a _?_ in a relational database.
16] In MongoDB, a collection is comparable to a _?_ in a relational database.
17] Each F_?_ in the BSON document is a key-value pair.
18] The BSON code above has _?_ key-value pairs; one of those key-value pairs is Department & 20.

19] The BSON code above has _?_ key-value pairs; one of those key-value pairs is First Name & Bill.
20] In MongoDB, there is a mandatory field, called _?_; it is the equivalent of a primary key in a relational database.
21] {T/F} In a relational database, all records must have the same number of fields.
22] {T/F} In a MongoDB database, all records must have the same number of fields.
23] {T/F} In MongoDB, the project field above might have multiple values.
24] In the BSON code above, Hassan is involved in _?_ projects.
25] In the BSON code above, Bill is involved in _?_ projects.
26] {T/F} MongoDB supports 1-to-Many Relationships.
27] {T/F} The BSON code above shows how two addresses might be embedded within the Address field for Hassan.
28] {T/F} NoSQL means you Can't Query!
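
The document, collection, and field vocabulary used in the questions above can be tried out with plain Python dictionaries. The sketch below is not the BSON code from the presentation and does not use a MongoDB driver; every name and value in it is made up for illustration, and `field_count` and `find_by_id` are hypothetical helpers.

```python
# A "collection" modeled as a list of "documents" (plain dictionaries).
# The two documents deliberately have different numbers of fields.
employees = [
    {
        "_id": 1,                      # unique identifier for the document
        "first_name": "Ada",           # made-up value
        "department": 10,
        "projects": ["P1", "P2"],      # one field holding several values
        "addresses": [                 # documents embedded inside a document
            {"city": "Austin", "state": "TX"},
            {"city": "Dallas", "state": "TX"},
        ],
    },
    {
        "_id": 2,
        "first_name": "Grace",
        "department": 20,
        "projects": ["P3"],
    },
]


def field_count(document):
    """Count the top-level key-value pairs in one document."""
    return len(document)


def find_by_id(collection, doc_id):
    """Return the first document whose _id matches, or None if none does."""
    for document in collection:
        if document["_id"] == doc_id:
            return document
    return None
```

Counting pairs with `field_count` and looking up a document with `find_by_id` mirrors, very loosely, the kind of inspection the questions ask for; a real MongoDB collection would be queried through the database rather than scanned in a Python loop.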

29] The MongoDB query above shows how one might search the Employee collection for a record whose _id = _?_.
30] Write the MongoDB query that might be used to search the Employee collection for records in which the name = "Hicks, Tom".
31] Write the MongoDB query that might be used to search the Employee collection for records in which the age = 21.
32] Write the MongoDB query that might be used to find all of the employees.
33] Write the MongoDB query that might be used to find all of the employees and sort them by name.
34] {T/F} MongoDB supports Ad Hoc Queries.
35] {T/F} MongoDB supports Search By Field Queries.
36] {T/F} MongoDB supports Search By Range Queries.
37] {T/F} MongoDB supports Search By Regular Expression Queries.
38] {T/F} MongoDB supports Indexing.
39] {T/F} MongoDB supports Replication.
40] {T/F} MongoDB supports Duplication of Data on Multiple Computers.
41] {T/F} MongoDB supports Load Balancing.
42] {T/F} MongoDB supports Horizontal Scalability; new machines can be added to a running system.
43] {T/F} MongoDB could be used as a File Storage System using GridFS.
44] {T/F} MongoDB supports Aggregation; MapReduce can be used for the batch processing of Data. SQL GROUP BY is supported.
45] M_?_ is where you would "divide the big computation into smaller pieces and send each piece to a smaller subset of data and once each small piece of computation is finished you abbreviate the results back".
46] {T/F} In MongoDB, you can use JavaScript functions in the queries.
47] MongoDB is developed and supported by a company called 10gen and was production ready in the year 2 _.
48] The company 10gen has recently changed its name to _?_.
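
The batch-processing idea quoted in question 45 above (split the work into small pieces, run each piece on part of the data, then combine the partial results) can be sketched in ordinary Python. This is a single-process illustration with hypothetical function names and made-up records; it is not MongoDB's aggregation API, and a real system would run the pieces on many machines.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record, yielding (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def group_by_key(pairs):
    """Collect all values that share a key into one list per key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Combine each key's list of values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}


def map_reduce(records, mapper, reducer):
    """Run the three steps in order on an in-memory list of records."""
    return reduce_phase(group_by_key(map_phase(records, mapper)), reducer)


def department_mapper(employee):
    # Emit one (department, 1) pair per employee record.
    yield (employee["department"], 1)


def count_reducer(department, ones):
    # Add up the 1s to get the number of employees in the department.
    return sum(ones)
```

Running `map_reduce(records, department_mapper, count_reducer)` on a list of employee dictionaries returns a head count per department, which is roughly what a SQL GROUP BY with COUNT would produce.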

Search The Web
Go to
1] The current stable release of MongoDB is _._._?
2] MongoDB is available for {Mac/Windows/Linux}.
Go to
3] In your opinion, what are the six largest companies that use MongoDB? List them.
4] {T/F} MongoDB can be used for Big Data applications.
5] {T/F} MongoDB can be used for Small Data applications.
6] {T/F} MongoDB will be better than Relational Databases for Small Data Applications.
7] Check those languages and platforms that have MongoDB driver support: Java, JavaScript, Python, Ruby, C#, PHP, C++, .NET, Node.js, Perl, Scala, Fortran
8] Go to Indeed.com. Do the search above. There are currently _?_,000+ database jobs listed (at that one site) in the United States.
9] Go to Indeed.com. Do the search above. There are currently _?_ database jobs listed (at that one site) in the United States that pay $100,000 - $300,000 a year.
10] Go to Indeed.com. Do the search above. There are currently _?_ MongoDB database jobs listed (at that one site) in the United States.

11] The graphic above shows the downloads of several NoSQL databases. Many folks conclude from this collection of downloads, and others, that M_?_ is the most popular NoSQL database.

Differing Opinions
1] I believe that NoSQL has its place and is here to stay, but there are a variety of opinions. Watch the video below! MySQL vs MongoDB other side of the coin. Language is not great!
2] {Sign/Pledge} I have watched the Video above.

Optional
1] Mongodb Tutorial for Beginners Nice to watch - long

What To Turn In
No Lab Is Complete Until Both Are Complete
1] You sign & submit the Pledge form.
   a) Review the Pledge statement.
   b) Record the amount of time you think you spent on this lab.
   c) Staple all pages of this lab. Fold in half length-wise (like a hot-dog). Put your name on the outside. Place it on the professor's desk before the beginning of lecture on the day it is due. The penalty for late homework will not exceed 25% off per day.
2] Place all programming code associated with this program, if any, in the Professor's Code Drop Box.
   a) I do not accept programs by mail; do not submit labs via !

Comments
A] Programs that do not compile are worth little, if anything.
B] If a print statement format is off, the penalties will often be less than the 25% per day late penalty; turn in the lab. You would not be happy if you went to Best Buy and purchased a large screen TV that did everything except show the picture; you would consider it pretty worthless. Most users consider software that does not work properly pretty useless as well. If the lab is not working correctly, credit will be small (if any); you might be better to accept a 25% (1 day) late penalty and turn in the lab working correctly!
C] Start all programs early so that you can get in contact with the professor if you have problems.
D] If you are turning in this lab late, you may:
   hand it to me if I am in the office
   put it in the mail box outside my office door
   slide it under the outer door to our suite {if locked}
   slide it under my office door
   The sooner I get late labs, the sooner the late penalty meter quits clicking.
E] Back up your programs in at least three places. Put a copy on your Y drive. Put a copy on your flash drive. Put a copy on your personal computer. Send yourself a copy in your .


More information

NOSQL EGCO321 DATABASE SYSTEMS KANAT POOLSAWASD DEPARTMENT OF COMPUTER ENGINEERING MAHIDOL UNIVERSITY

NOSQL EGCO321 DATABASE SYSTEMS KANAT POOLSAWASD DEPARTMENT OF COMPUTER ENGINEERING MAHIDOL UNIVERSITY NOSQL EGCO321 DATABASE SYSTEMS KANAT POOLSAWASD DEPARTMENT OF COMPUTER ENGINEERING MAHIDOL UNIVERSITY WHAT IS NOSQL? Stands for No-SQL or Not Only SQL. Class of non-relational data storage systems E.g.

More information

Cassandra- A Distributed Database

Cassandra- A Distributed Database Cassandra- A Distributed Database Tulika Gupta Department of Information Technology Poornima Institute of Engineering and Technology Jaipur, Rajasthan, India Abstract- A relational database is a traditional

More information

Cloud Computing 2. CSCI 4850/5850 High-Performance Computing Spring 2018

Cloud Computing 2. CSCI 4850/5850 High-Performance Computing Spring 2018 Cloud Computing 2 CSCI 4850/5850 High-Performance Computing Spring 2018 Tae-Hyuk (Ted) Ahn Department of Computer Science Program of Bioinformatics and Computational Biology Saint Louis University Learning

More information

Databases and Big Data Today. CS634 Class 22

Databases and Big Data Today. CS634 Class 22 Databases and Big Data Today CS634 Class 22 Current types of Databases SQL using relational tables: still very important! NoSQL, i.e., not using relational tables: term NoSQL popular since about 2007.

More information

DB-Queries-1 - REVIEW Individual 20 Points

DB-Queries-1 - REVIEW Individual 20 Points DB-Queries-1.docx CSCI 2320 Initials P a g e 1 If this lab is an Individual assignment, you must do all coded programs on your own. You may ask others for help on the language syntax, but you must organize

More information

Server-Configuration-2-MySQL-1-HW.docx CSCI 3343 Initials P a g e 1

Server-Configuration-2-MySQL-1-HW.docx CSCI 3343 Initials P a g e 1 Server-Configuration-2-MySQL-1-HW.docx CSCI 3343 Initials P a g e 1 The short answer questions will be spot checked and graded for completion, but not checked for accuracy. I encourage you to form a study

More information

Microsoft Big Data and Hadoop

Microsoft Big Data and Hadoop Microsoft Big Data and Hadoop Lara Rubbelke @sqlgal Cindy Gross @sqlcindy 2 The world of data is changing The 4Vs of Big Data http://nosql.mypopescu.com/post/9621746531/a-definition-of-big-data 3 Common

More information

Design Relationships, Indexes, Queries, & More (Individual/Team Of 2) Assignment 20 USE PENCIL

Design Relationships, Indexes, Queries, & More (Individual/Team Of 2) Assignment 20 USE PENCIL Relationships-1-HW.docx CSCI 3321 Initials P a g e 1 If this lab is an Individual assignment, you must do all coded programs on your own. You may ask others for help on the language syntax, but you must

More information

DEMYSTIFYING BIG DATA WITH RIAK USE CASES. Martin Schneider Basho Technologies!

DEMYSTIFYING BIG DATA WITH RIAK USE CASES. Martin Schneider Basho Technologies! DEMYSTIFYING BIG DATA WITH RIAK USE CASES Martin Schneider Basho Technologies! Agenda Defining Big Data in Regards to Riak A Series of Trade-Offs Use Cases Q & A About Basho & Riak Basho Technologies is

More information

OOP-10 BTree & B+Tree Individual Assignment 15 Points

OOP-10 BTree & B+Tree Individual Assignment 15 Points OOP-10-B+Tree-HW CSCI 2320 Initials P a g e 1 If this lab is an Individual assignment, you must do all coded programs on your own. You may ask others for help on the language syntax, but you must organize

More information

What is database? Types and Examples

What is database? Types and Examples What is database? Types and Examples Visit our site for more information: www.examplanning.com Facebook Page: https://www.facebook.com/examplanning10/ Twitter: https://twitter.com/examplanning10 TABLE

More information

Getting to know. by Michelle Darling August 2013

Getting to know. by Michelle Darling August 2013 Getting to know by Michelle Darling mdarlingcmt@gmail.com August 2013 Agenda: What is Cassandra? Installation, CQL3 Data Modelling Summary Only 15 min to cover these, so please hold questions til the end,

More information

CISC 7610 Lecture 2b The beginnings of NoSQL

CISC 7610 Lecture 2b The beginnings of NoSQL CISC 7610 Lecture 2b The beginnings of NoSQL Topics: Big Data Google s infrastructure Hadoop: open google infrastructure Scaling through sharding CAP theorem Amazon s Dynamo 5 V s of big data Everyone

More information

Hadoop An Overview. - Socrates CCDH

Hadoop An Overview. - Socrates CCDH Hadoop An Overview - Socrates CCDH What is Big Data? Volume Not Gigabyte. Terabyte, Petabyte, Exabyte, Zettabyte - Due to handheld gadgets,and HD format images and videos - In total data, 90% of them collected

More information

DIVING IN: INSIDE THE DATA CENTER

DIVING IN: INSIDE THE DATA CENTER 1 DIVING IN: INSIDE THE DATA CENTER Anwar Alhenshiri Data centers 2 Once traffic reaches a data center it tunnels in First passes through a filter that blocks attacks Next, a router that directs it to

More information

10/18/2017. Announcements. NoSQL Motivation. NoSQL. Serverless Architecture. What is the Problem? Database Systems CSE 414

10/18/2017. Announcements. NoSQL Motivation. NoSQL. Serverless Architecture. What is the Problem? Database Systems CSE 414 Announcements Database Systems CSE 414 Lecture 11: NoSQL & JSON (mostly not in textbook only Ch 11.1) HW5 will be posted on Friday and due on Nov. 14, 11pm [No Web Quiz 5] Today s lecture: NoSQL & JSON

More information

Introduction to NoSQL Databases

Introduction to NoSQL Databases Introduction to NoSQL Databases Roman Kern KTI, TU Graz 2017-10-16 Roman Kern (KTI, TU Graz) Dbase2 2017-10-16 1 / 31 Introduction Intro Why NoSQL? Roman Kern (KTI, TU Graz) Dbase2 2017-10-16 2 / 31 Introduction

More information

PROFESSIONAL. NoSQL. Shashank Tiwari WILEY. John Wiley & Sons, Inc.

PROFESSIONAL. NoSQL. Shashank Tiwari WILEY. John Wiley & Sons, Inc. PROFESSIONAL NoSQL Shashank Tiwari WILEY John Wiley & Sons, Inc. Examining CONTENTS INTRODUCTION xvil CHAPTER 1: NOSQL: WHAT IT IS AND WHY YOU NEED IT 3 Definition and Introduction 4 Context and a Bit

More information

Study of NoSQL Database Along With Security Comparison

Study of NoSQL Database Along With Security Comparison Study of NoSQL Database Along With Security Comparison Ankita A. Mall [1], Jwalant B. Baria [2] [1] Student, Computer Engineering Department, Government Engineering College, Modasa, Gujarat, India ank.fetr@gmail.com

More information

CS639: Data Management for Data Science. Lecture 1: Intro to Data Science and Course Overview. Theodoros Rekatsinas

CS639: Data Management for Data Science. Lecture 1: Intro to Data Science and Course Overview. Theodoros Rekatsinas CS639: Data Management for Data Science Lecture 1: Intro to Data Science and Course Overview Theodoros Rekatsinas 1 2 Big science is data driven. 3 Increasingly many companies see themselves as data driven.

More information

Comparing SQL and NOSQL databases

Comparing SQL and NOSQL databases COSC 6397 Big Data Analytics Data Formats (II) HBase Edgar Gabriel Spring 2014 Comparing SQL and NOSQL databases Types Development History Data Storage Model SQL One type (SQL database) with minor variations

More information

NoSQL systems. Lecture 21 (optional) Instructor: Sudeepa Roy. CompSci 516 Data Intensive Computing Systems

NoSQL systems. Lecture 21 (optional) Instructor: Sudeepa Roy. CompSci 516 Data Intensive Computing Systems CompSci 516 Data Intensive Computing Systems Lecture 21 (optional) NoSQL systems Instructor: Sudeepa Roy Duke CS, Spring 2016 CompSci 516: Data Intensive Computing Systems 1 Key- Value Stores Duke CS,

More information

Cassandra, MongoDB, and HBase. Cassandra, MongoDB, and HBase. I have chosen these three due to their recent

Cassandra, MongoDB, and HBase. Cassandra, MongoDB, and HBase. I have chosen these three due to their recent Tanton Jeppson CS 401R Lab 3 Cassandra, MongoDB, and HBase Introduction For my report I have chosen to take a deeper look at 3 NoSQL database systems: Cassandra, MongoDB, and HBase. I have chosen these

More information

A Survey Paper on NoSQL Databases: Key-Value Data Stores and Document Stores

A Survey Paper on NoSQL Databases: Key-Value Data Stores and Document Stores A Survey Paper on NoSQL Databases: Key-Value Data Stores and Document Stores Nikhil Dasharath Karande 1 Department of CSE, Sanjay Ghodawat Institutes, Atigre nikhilkarande18@gmail.com Abstract- This paper

More information

Intro Cassandra. Adelaide Big Data Meetup.

Intro Cassandra. Adelaide Big Data Meetup. Intro Cassandra Adelaide Big Data Meetup instaclustr.com @Instaclustr Who am I and what do I do? Alex Lourie Worked at Red Hat, Datastax and now Instaclustr We currently manage x10s nodes for various customers,

More information

Scaling Up HBase. Duen Horng (Polo) Chau Assistant Professor Associate Director, MS Analytics Georgia Tech. CSE6242 / CX4242: Data & Visual Analytics

Scaling Up HBase. Duen Horng (Polo) Chau Assistant Professor Associate Director, MS Analytics Georgia Tech. CSE6242 / CX4242: Data & Visual Analytics http://poloclub.gatech.edu/cse6242 CSE6242 / CX4242: Data & Visual Analytics Scaling Up HBase Duen Horng (Polo) Chau Assistant Professor Associate Director, MS Analytics Georgia Tech Partly based on materials

More information

10 Million Smart Meter Data with Apache HBase

10 Million Smart Meter Data with Apache HBase 10 Million Smart Meter Data with Apache HBase 5/31/2017 OSS Solution Center Hitachi, Ltd. Masahiro Ito OSS Summit Japan 2017 Who am I? Masahiro Ito ( 伊藤雅博 ) Software Engineer at Hitachi, Ltd. Focus on

More information

Large-Scale Web Applications

Large-Scale Web Applications Large-Scale Web Applications Mendel Rosenblum Web Application Architecture Web Browser Web Server / Application server Storage System HTTP Internet CS142 Lecture Notes - Intro LAN 2 Large-Scale: Scale-Out

More information

5/1/17. Announcements. NoSQL Motivation. NoSQL. Serverless Architecture. What is the Problem? Database Systems CSE 414

5/1/17. Announcements. NoSQL Motivation. NoSQL. Serverless Architecture. What is the Problem? Database Systems CSE 414 Announcements Database Systems CSE 414 Lecture 15: NoSQL & JSON (mostly not in textbook only Ch 11.1) 1 Homework 4 due tomorrow night [No Web Quiz 5] Midterm grading hopefully finished tonight post online

More information

Big Data Technology Ecosystem. Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara

Big Data Technology Ecosystem. Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara Big Data Technology Ecosystem Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara Agenda End-to-End Data Delivery Platform Ecosystem of Data Technologies Mapping an End-to-End Solution Case

More information

Big Data with Hadoop Ecosystem

Big Data with Hadoop Ecosystem Diógenes Pires Big Data with Hadoop Ecosystem Hands-on (HBase, MySql and Hive + Power BI) Internet Live http://www.internetlivestats.com/ Introduction Business Intelligence Business Intelligence Process

More information

CSE 444: Database Internals. Lecture 23 Spark

CSE 444: Database Internals. Lecture 23 Spark CSE 444: Database Internals Lecture 23 Spark References Spark is an open source system from Berkeley Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing. Matei

More information

DATABASE DESIGN II - 1DL400

DATABASE DESIGN II - 1DL400 DATABASE DESIGN II - 1DL400 Fall 2016 A second course in database systems http://www.it.uu.se/research/group/udbl/kurser/dbii_ht16 Kjell Orsborn Uppsala Database Laboratory Department of Information Technology,

More information

Fusion iomemory PCIe Solutions from SanDisk and Sqrll make Accumulo Hypersonic

Fusion iomemory PCIe Solutions from SanDisk and Sqrll make Accumulo Hypersonic WHITE PAPER Fusion iomemory PCIe Solutions from SanDisk and Sqrll make Accumulo Hypersonic Western Digital Technologies, Inc. 951 SanDisk Drive, Milpitas, CA 95035 www.sandisk.com Table of Contents Executive

More information

Goal of the presentation is to give an introduction of NoSQL databases, why they are there.

Goal of the presentation is to give an introduction of NoSQL databases, why they are there. 1 Goal of the presentation is to give an introduction of NoSQL databases, why they are there. We want to present "Why?" first to explain the need of something like "NoSQL" and then in "What?" we go in

More information

What is the maximum file size you have dealt so far? Movies/Files/Streaming video that you have used? What have you observed?

What is the maximum file size you have dealt so far? Movies/Files/Streaming video that you have used? What have you observed? Simple to start What is the maximum file size you have dealt so far? Movies/Files/Streaming video that you have used? What have you observed? What is the maximum download speed you get? Simple computation

More information

NOSQL DATABASE SYSTEMS: DECISION GUIDANCE AND TRENDS. Big Data Technologies: NoSQL DBMS (Decision Guidance) - SoSe

NOSQL DATABASE SYSTEMS: DECISION GUIDANCE AND TRENDS. Big Data Technologies: NoSQL DBMS (Decision Guidance) - SoSe NOSQL DATABASE SYSTEMS: DECISION GUIDANCE AND TRENDS h_da Prof. Dr. Uta Störl Big Data Technologies: NoSQL DBMS (Decision Guidance) - SoSe 2017 163 Performance / Benchmarks Traditional database benchmarks

More information

Hadoop. Introduction / Overview

Hadoop. Introduction / Overview Hadoop Introduction / Overview Preface We will use these PowerPoint slides to guide us through our topic. Expect 15 minute segments of lecture Expect 1-4 hour lab segments Expect minimal pretty pictures

More information

CSE 344 JULY 9 TH NOSQL

CSE 344 JULY 9 TH NOSQL CSE 344 JULY 9 TH NOSQL ADMINISTRATIVE MINUTIAE HW3 due Wednesday tests released actual_time should have 0s not NULLs upload new data file or use UPDATE to change 0 ~> NULL Extra OOs on Mondays 5-7pm in

More information

HDFS: Hadoop Distributed File System. CIS 612 Sunnie Chung

HDFS: Hadoop Distributed File System. CIS 612 Sunnie Chung HDFS: Hadoop Distributed File System CIS 612 Sunnie Chung What is Big Data?? Bulk Amount Unstructured Introduction Lots of Applications which need to handle huge amount of data (in terms of 500+ TB per

More information

A Review Of Non Relational Databases, Their Types, Advantages And Disadvantages

A Review Of Non Relational Databases, Their Types, Advantages And Disadvantages A Review Of Non Relational Databases, Their Types, Advantages And Disadvantages Harpreet kaur, Jaspreet kaur, Kamaljit kaur Student of M.Tech(CSE) Student of M.Tech(CSE) Assit.Prof.in CSE deptt. Sri Guru

More information

CPSC 426/526. Cloud Computing. Ennan Zhai. Computer Science Department Yale University

CPSC 426/526. Cloud Computing. Ennan Zhai. Computer Science Department Yale University CPSC 426/526 Cloud Computing Ennan Zhai Computer Science Department Yale University Recall: Lec-7 In the lec-7, I talked about: - P2P vs Enterprise control - Firewall - NATs - Software defined network

More information

A Study of NoSQL Database

A Study of NoSQL Database A Study of NoSQL Database International Journal of Engineering Research & Technology (IJERT) Biswajeet Sethi 1, Samaresh Mishra 2, Prasant ku. Patnaik 3 1,2,3 School of Computer Engineering, KIIT University

More information

Rule 14 Use Databases Appropriately

Rule 14 Use Databases Appropriately Rule 14 Use Databases Appropriately Rule 14: What, When, How, and Why What: Use relational databases when you need ACID properties to maintain relationships between your data. For other data storage needs

More information

Non-Relational Databases. Pelle Jakovits

Non-Relational Databases. Pelle Jakovits Non-Relational Databases Pelle Jakovits 25 October 2017 Outline Background Relational model Database scaling The NoSQL Movement CAP Theorem Non-relational data models Key-value Document-oriented Column

More information

Topics. Big Data Analytics What is and Why Hadoop? Comparison to other technologies Hadoop architecture Hadoop ecosystem Hadoop usage examples

Topics. Big Data Analytics What is and Why Hadoop? Comparison to other technologies Hadoop architecture Hadoop ecosystem Hadoop usage examples Hadoop Introduction 1 Topics Big Data Analytics What is and Why Hadoop? Comparison to other technologies Hadoop architecture Hadoop ecosystem Hadoop usage examples 2 Big Data Analytics What is Big Data?

More information

Next-Generation Cloud Platform

Next-Generation Cloud Platform Next-Generation Cloud Platform Jangwoo Kim Jun 24, 2013 E-mail: jangwoo@postech.ac.kr High Performance Computing Lab Department of Computer Science & Engineering Pohang University of Science and Technology

More information

A NoSQL Introduction for Relational Database Developers. Andrew Karcher Las Vegas SQL Saturday September 12th, 2015

A NoSQL Introduction for Relational Database Developers. Andrew Karcher Las Vegas SQL Saturday September 12th, 2015 A NoSQL Introduction for Relational Database Developers Andrew Karcher Las Vegas SQL Saturday September 12th, 2015 About Me http://www.andrewkarcher.com Twitter: @akarcher LinkedIn, Twitter Email: akarcher@gmail.com

More information

5/2/16. Announcements. NoSQL Motivation. The New Hipster: NoSQL. Serverless. What is the Problem? Database Systems CSE 414

5/2/16. Announcements. NoSQL Motivation. The New Hipster: NoSQL. Serverless. What is the Problem? Database Systems CSE 414 Announcements Database Systems CSE 414 Lecture 16: NoSQL and JSon Current assignments: Homework 4 due tonight Web Quiz 6 due next Wednesday [There is no Web Quiz 5 Today s lecture: JSon The book covers

More information

Distributed Non-Relational Databases. Pelle Jakovits

Distributed Non-Relational Databases. Pelle Jakovits Distributed Non-Relational Databases Pelle Jakovits Tartu, 7 December 2018 Outline Relational model NoSQL Movement Non-relational data models Key-value Document-oriented Column family Graph Non-relational

More information

Database Systems CSE 414

Database Systems CSE 414 Database Systems CSE 414 Lecture 16: NoSQL and JSon CSE 414 - Spring 2016 1 Announcements Current assignments: Homework 4 due tonight Web Quiz 6 due next Wednesday [There is no Web Quiz 5] Today s lecture:

More information

TITLE: PRE-REQUISITE THEORY. 1. Introduction to Hadoop. 2. Cluster. Implement sort algorithm and run it using HADOOP

TITLE: PRE-REQUISITE THEORY. 1. Introduction to Hadoop. 2. Cluster. Implement sort algorithm and run it using HADOOP TITLE: Implement sort algorithm and run it using HADOOP PRE-REQUISITE Preliminary knowledge of clusters and overview of Hadoop and its basic functionality. THEORY 1. Introduction to Hadoop The Apache Hadoop

More information

CS November 2017

CS November 2017 Bigtable Highly available distributed storage Distributed Systems 18. Bigtable Built with semi-structured data in mind URLs: content, metadata, links, anchors, page rank User data: preferences, account

More information

Windows Azure Overview

Windows Azure Overview Windows Azure Overview Christine Collet, Genoveva Vargas-Solar Grenoble INP, France MS Azure Educator Grant Packaged Software Infrastructure (as a Service) Platform (as a Service) Software (as a Service)

More information

Perspectives on NoSQL

Perspectives on NoSQL Perspectives on NoSQL PGCon 2010 Gavin M. Roy What is NoSQL? NoSQL is a movement promoting a loosely defined class of nonrelational data stores that break with a long history of relational

More information

New Approaches to Big Data Processing and Analytics

New Approaches to Big Data Processing and Analytics New Approaches to Big Data Processing and Analytics Contributing authors: David Floyer, David Vellante Original publication date: February 12, 2013 There are number of approaches to processing and analyzing

More information

Distributed Databases: SQL vs NoSQL

Distributed Databases: SQL vs NoSQL Distributed Databases: SQL vs NoSQL Seda Unal, Yuchen Zheng April 23, 2017 1 Introduction Distributed databases have become increasingly popular in the era of big data because of their advantages over

More information

Getting Started with Memcached. Ahmed Soliman

Getting Started with Memcached. Ahmed Soliman Getting Started with Memcached Ahmed Soliman In this package, you will find: A Biography of the author of the book A synopsis of the book s content Information on where to buy this book About the Author

More information

Oral Questions and Answers (DBMS LAB) Questions & Answers- DBMS

Oral Questions and Answers (DBMS LAB) Questions & Answers- DBMS Questions & Answers- DBMS https://career.guru99.com/top-50-database-interview-questions/ 1) Define Database. A prearranged collection of figures known as data is called database. 2) What is DBMS? Database

More information

High-Performance Distributed DBMS for Analytics

High-Performance Distributed DBMS for Analytics 1 High-Performance Distributed DBMS for Analytics 2 About me Developer, hardware engineering background Head of Analytic Products Department in Yandex jkee@yandex-team.ru 3 About Yandex One of the largest

More information

CS 655 Advanced Topics in Distributed Systems

CS 655 Advanced Topics in Distributed Systems Presented by : Walid Budgaga CS 655 Advanced Topics in Distributed Systems Computer Science Department Colorado State University 1 Outline Problem Solution Approaches Comparison Conclusion 2 Problem 3

More information

Advanced Data Management Technologies

Advanced Data Management Technologies ADMT 2017/18 Unit 15 J. Gamper 1/44 Advanced Data Management Technologies Unit 15 Introduction to NoSQL J. Gamper Free University of Bozen-Bolzano Faculty of Computer Science IDSE ADMT 2017/18 Unit 15

More information

CS November 2018

CS November 2018 Bigtable Highly available distributed storage Distributed Systems 19. Bigtable Built with semi-structured data in mind URLs: content, metadata, links, anchors, page rank User data: preferences, account

More information

Cloud Computing and Hadoop Distributed File System. UCSB CS170, Spring 2018

Cloud Computing and Hadoop Distributed File System. UCSB CS170, Spring 2018 Cloud Computing and Hadoop Distributed File System UCSB CS70, Spring 08 Cluster Computing Motivations Large-scale data processing on clusters Scan 000 TB on node @ 00 MB/s = days Scan on 000-node cluster

More information

Data Informatics. Seon Ho Kim, Ph.D.

Data Informatics. Seon Ho Kim, Ph.D. Data Informatics Seon Ho Kim, Ph.D. seonkim@usc.edu HBase HBase is.. A distributed data store that can scale horizontally to 1,000s of commodity servers and petabytes of indexed storage. Designed to operate

More information

Developer Internship Opportunity at I-CC

Developer Internship Opportunity at I-CC Developer Internship Opportunity at I-CC Who We Are: Technology company building next generation publishing and e-commerce solutions Aiming to become a leading European Internet technology company by 2015

More information

Module - 17 Lecture - 23 SQL and NoSQL systems. (Refer Slide Time: 00:04)

Module - 17 Lecture - 23 SQL and NoSQL systems. (Refer Slide Time: 00:04) Introduction to Morden Application Development Dr. Gaurav Raina Prof. Tanmai Gopal Department of Computer Science and Engineering Indian Institute of Technology, Madras Module - 17 Lecture - 23 SQL and

More information

Increase Value from Big Data with Real-Time Data Integration and Streaming Analytics

Increase Value from Big Data with Real-Time Data Integration and Streaming Analytics Increase Value from Big Data with Real-Time Data Integration and Streaming Analytics Cy Erbay Senior Director Striim Executive Summary Striim is Uniquely Qualified to Solve the Challenges of Real-Time

More information

PLATFORM AND SOFTWARE AS A SERVICE THE MAPREDUCE PROGRAMMING MODEL AND IMPLEMENTATIONS

PLATFORM AND SOFTWARE AS A SERVICE THE MAPREDUCE PROGRAMMING MODEL AND IMPLEMENTATIONS PLATFORM AND SOFTWARE AS A SERVICE THE MAPREDUCE PROGRAMMING MODEL AND IMPLEMENTATIONS By HAI JIN, SHADI IBRAHIM, LI QI, HAIJUN CAO, SONG WU and XUANHUA SHI Prepared by: Dr. Faramarz Safi Islamic Azad

More information

Database Management Systems

Database Management Systems Database Management Systems Fall 2017 Knowledge is of two kinds: we know a subject ourselves, or we know where we can find information upon it. -- Samuel Johnson (1709-1784) Queries for Today Why? Who?

More information

Abstract. The Challenges. ESG Lab Review InterSystems IRIS Data Platform: A Unified, Efficient Data Platform for Fast Business Insight

Abstract. The Challenges. ESG Lab Review InterSystems IRIS Data Platform: A Unified, Efficient Data Platform for Fast Business Insight ESG Lab Review InterSystems Data Platform: A Unified, Efficient Data Platform for Fast Business Insight Date: April 218 Author: Kerry Dolan, Senior IT Validation Analyst Abstract Enterprise Strategy Group

More information

NOSQL OPERATIONAL CHECKLIST

NOSQL OPERATIONAL CHECKLIST WHITEPAPER NOSQL NOSQL OPERATIONAL CHECKLIST NEW APPLICATION REQUIREMENTS ARE DRIVING A DATABASE REVOLUTION There is a new breed of high volume, highly distributed, and highly complex applications that

More information

ΕΠΛ 602:Foundations of Internet Technologies. Cloud Computing

ΕΠΛ 602:Foundations of Internet Technologies. Cloud Computing ΕΠΛ 602:Foundations of Internet Technologies Cloud Computing 1 Outline Bigtable(data component of cloud) Web search basedonch13of thewebdatabook 2 What is Cloud Computing? ACloudis an infrastructure, transparent

More information

Migrating Oracle Databases To Cassandra

Migrating Oracle Databases To Cassandra BY UMAIR MANSOOB Why Cassandra Lower Cost of ownership makes it #1 choice for Big Data OLTP Applications. Unlike Oracle, Cassandra can store structured, semi-structured, and unstructured data. Cassandra

More information

CompSci 516 Database Systems

CompSci 516 Database Systems CompSci 516 Database Systems Lecture 20 NoSQL and Column Store Instructor: Sudeepa Roy Duke CS, Fall 2018 CompSci 516: Database Systems 1 Reading Material NOSQL: Scalable SQL and NoSQL Data Stores Rick

More information

The NoSQL Ecosystem. Adam Marcus MIT CSAIL

The NoSQL Ecosystem. Adam Marcus MIT CSAIL The NoSQL Ecosystem Adam Marcus MIT CSAIL marcua@csail.mit.edu / @marcua About Me Social Computing + Database Systems Easily Distracted: Wrote The NoSQL Ecosystem in The Architecture of Open Source Applications

More information

Intro To Big Data. John Urbanic Parallel Computing Scientist Pittsburgh Supercomputing Center. Copyright 2017

Intro To Big Data. John Urbanic Parallel Computing Scientist Pittsburgh Supercomputing Center. Copyright 2017 Intro To Big Data John Urbanic Parallel Computing Scientist Pittsburgh Supercomputing Center Copyright 2017 Big data is a broad term for data sets so large or complex that traditional data processing applications

More information