Redshift Queries Playbook


Scalable analytics built for growth

SQL queries for understanding user behavior
Updated June 23, 2015

This playbook shows how you can use Amplitude's Amazon Redshift database to answer common questions about user behavior in your app. All queries are written in the PostgreSQL syntax used by Redshift and can be executed directly from the Redshift prompt. You can use Amazon's Redshift documentation to help you understand the supported functions.

Quick and Essential Tips

1. Data for each app is kept in its own schema (a namespace in Redshift).
By default, every Redshift command you run will be under the public schema. However, you can select which schema you want to work under instead by using the SET search_path command:

    SET search_path = app123;
    SELECT COUNT(*) FROM events;

Or you can include the schema as a prefix to the table name:

    SELECT COUNT(*) FROM app123.events;

2. Query directly from each app's table instead of the entire events table when possible.
The events from each of your Amplitude apps are stored in their own tables. The table name for each app is 'events###', where ### is the app number, which you can find in the URL of the Amplitude dashboard. The union of each app's events appears in a table called 'events'. Selecting FROM events### when possible will make your queries faster and more efficient.

3. Custom event properties and custom user properties associated with an event_type are pulled into their own columns in the respective event_type table.
Custom user properties are prefixed with 'u_' and custom event properties are prefixed with 'e_'. Note: there is a limit of 400 user properties and 50 event properties that will be pulled into their own columns. Anything past the limit still requires the JSON_EXTRACT_PATH_TEXT function.

4. Always include a date range in the WHERE clause of your query.
Our Redshift tables do not have a primary key but are sorted by the event_time column. Adding a date range in the WHERE clause of your query will significantly increase query speed. We recommend using the DATE() function with event_time as the input.

5. Avoid SELECT * queries when possible.
The more columns you select, the slower your query will be. Selecting only the relevant columns, as opposed to all (*) columns, will significantly increase query speed and show only relevant data.
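Tip 3 mentions that properties beyond the column limit stay inside the event_properties and user_properties JSON strings. As a rough illustration of what path-based extraction from such a string does, here is a minimal Python sketch. The helper name extract_path_text and the sample JSON are hypothetical, and the sketch does not claim to reproduce how Redshift's JSON_EXTRACT_PATH_TEXT treats missing keys or malformed input.

```python
import json

def extract_path_text(json_string, *path):
    """Walk a JSON string key by key; return the value as text, or None if absent.

    Hypothetical Python stand-in for the idea behind JSON_EXTRACT_PATH_TEXT.
    """
    value = json.loads(json_string)
    for key in path:
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    # Strings are returned as-is; other JSON values are rendered as JSON text.
    return value if isinstance(value, str) else json.dumps(value)

# Hypothetical contents of one row's event_properties column.
event_properties = '{"type": "facebook", "level": {"current": 8}}'

print(extract_path_text(event_properties, "type"))
print(extract_path_text(event_properties, "level", "current"))
print(extract_path_text(event_properties, "missing"))
```

Each additional argument descends one level into the JSON document, which is the same shape of call the Redshift function takes (a JSON string followed by path elements).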

Table of Contents

0. Schema Description

1. Active Users
   Count the active users on a given day

2. New Users
   Count the new users on a given day

3. Composition
   Show the breakdown of devices for users in a two week period

4. Sessions
   Show the distribution of session lengths on a specific date
   Show the average session length per segment

5. Events
   Show the distribution of event property totals
   Count the number of users who did an event more than twice
   Count the number of events done by a specific set of users who did another event
   Show the distribution of users who have done an event by number of times done
   Find out the last three events a user does before churning

6. Funnels
   Obtain a list of users for each step of a funnel
   Adding steps to a funnel
   Getting the list of users who did (or did not) reach a step in a funnel
   Funnels where users did event X, then Y, with no other events in between
   Funnels where users did event Y after event X, within 24 hours of event X

7. Revenue
   Obtain the number of paying users and total revenue
   Obtain a list of top paying users

8. User Properties
   Obtain the most common values for a given user property
   Obtain a list of users who have certain properties
   Obtain the most common advertising referral networks for users
   Obtain the number of users whose current level is greater than 7 but less than

Event properties
   Obtain the list of items bought and how frequently that item was purchased
   Obtain the number of users who placed a bet and wagered between credits

Additional Resources

0. Schema Description

Below is the list of the columns in the table for the event type played_song. Included are the column type and a brief description of each column.

Column                 Type                         Description
id                     bigint                       A deprecated column
app                    integer                      App ID from the dashboard
amplitude_id           bigint                       Internal ID used to count unique users
device_id              character varying (256)      Device specific identifier
user_id                character varying (256)      A readable ID specified by you
event_time             timestamp w/o time zone      Event time (UTC) after reconciliation
client_event_time      timestamp w/o time zone      Local event time
client_upload_time     timestamp w/o time zone      Local upload time
server_upload_time     timestamp w/o time zone      Server time when event was received
event_id               integer                      Counter distinguishing events
session_id             bigint                       Session start time in milliseconds since epoch
event_type             character varying (256)      A unique identifier for your event
amplitude_event_type   character varying (256)      Amplitude specific identifiers based on event
first_event            boolean                      True if event is first for a given amplitude_id
version_name           character varying (256)      App version
os_name                character varying (256)      OS name
os_version             character varying (256)      OS version

Column                 Type                         Description
device_brand           character varying (256)      Device brand
device_manufacture     character varying (256)      Device manufacturer
device_model           character varying (256)      Device model
device_carrier         character varying (256)      Device carrier
country                character varying (256)      Country
language               character varying (256)      Language
revenue                double precision             Revenue generated by a revenue event
product_id             character varying (256)      Product ID of a revenue event
quantity               integer                      Quantity of a revenue event
price                  double precision             Price of a revenue event
location_lat           double precision             Latitude
location_lng           double precision             Longitude
ip_address             character varying (256)      IP address
event_properties       character varying (65535)    JSON string of event properties
user_properties        character varying (65535)    JSON string of user properties
region                 character varying (256)      Region
city                   character varying (256)      City
dma                    character varying (256)      Designated Marketing Area (DMA)
device_family          character varying (256)      Device Family
device_type            character varying (256)      Device Type
platform               character varying (256)      Platform (iOS, Android, or Web)
e_type                 character varying (2048)     Custom event property 'type'
e_length               character varying (2048)     Custom event property 'length'
u_age                  character varying (2048)     Custom user property 'age'
u_gender               character varying (2048)     Custom user property 'gender'

1. Active Users

The number of active users that an app has over a given period of time is one of the most basic and important metrics for measuring an app's level of user engagement. This metric counts the number of distinct users who performed at least one tracked event during the specified time period. A basic example of an active user count query is:

Query Objective: Count the active users on a given day

    SELECT COUNT(DISTINCT amplitude_id)
    FROM events123
    WHERE DATE(event_time) = ' ';

Explanation
This query returns the number of users who logged at least one event on March 1. The red text of the query above should be adjusted to your specific case.

amplitude_id vs. device_id vs. user_id
Notice that amplitude_id is used in the query above; this is the most accurate field for identifying unique users, as it combines information from device_id and user_id. Still, results based on either user_id or amplitude_id will usually be similar, so you can use either one in most cases. Further, in certain situations (see below) device_id and user_id are more useful because they contain information usable outside of Amplitude: for example, user_id can be used for contacting users by email (as user_ids are often users' email addresses) and device_id can be used for push notifications. For more discussion of ID types and to understand how we count unique users, see our documentation.

Modifications

Time Zones
Dates and times are in UTC (formatted yyyy-mm-dd hh:mm:ss), so if you are interested in getting active user counts for a different time zone, forgo the DATE() function and offset the full timestamps by the appropriate differential. For example, to obtain the number of daily active users in the 24-hour period corresponding to March 1st Pacific Time, modify the event_time part of the query above to:

    WHERE event_time >= ' :00:00' AND event_time < ' :00:00'

Users Who Did Specific Events
The basic query above counts users who did any event as active users. If you are instead interested in users who did (or did not do) certain event types, you can easily modify the query to do so. For example, if you only want users who did the 'sentmessage' event, modify the WHERE part of the query to:

    WHERE event_type = 'sentmessage' AND DATE(event_time) = ' ';

Similarly, you can query for users who logged events other than certain events. For example, if your app tracks passive events such as push notifications, an active user might be best defined as a user who performs some active action. So, if the event you want to exclude is called 'receivedpush', modify the query to:

    WHERE event_type != 'receivedpush' AND DATE(event_time) = ' ';
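The time zone offset described above can be worked out mechanically. The following is a minimal Python sketch that turns one local calendar day into the two UTC bounds for the WHERE clause. It assumes Pacific Standard Time is a fixed UTC-8 on the chosen day (no daylight saving), and the year 2015 is a hypothetical choice for illustration only.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Pacific Standard Time is UTC-8 on the chosen day (no daylight saving).
PST = timezone(timedelta(hours=-8))

def utc_bounds_for_local_day(year, month, day, tz=PST):
    """Return (start, end) of a local calendar day as naive UTC datetimes."""
    local_start = datetime(year, month, day, tzinfo=tz)
    local_end = local_start + timedelta(days=1)

    def to_utc(moment):
        # Convert to UTC, then drop the tzinfo so it prints like event_time values.
        return moment.astimezone(timezone.utc).replace(tzinfo=None)

    return to_utc(local_start), to_utc(local_end)

start, end = utc_bounds_for_local_day(2015, 3, 1)  # hypothetical date
print(f"WHERE event_time >= '{start}' AND event_time < '{end}'")
```

Using a half-open range (>= start, < end) covers exactly 24 hours without counting the boundary instant twice when consecutive days are queried.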

Obtaining a List of Users
If you want to see who the active users are, rather than simply how many there are, you can obtain the list of user ids (which, depending on your app, may be a list of user email addresses, log-in names, etc.). The query is the same except for the beginning:

    SELECT DISTINCT user_id FROM events

Note that we use user_id instead of amplitude_id because user_id is the identifier that your app recognizes (e.g. user email addresses, log-in names, etc.), while amplitude_id is Amplitude's internal id for users, which is not meaningful outside of Amplitude.

Saving Output to a File
The modification above returns a table with one user id per row, so if your app has thousands (or more) of users per day, this can be a very long table. It is often more useful to save the results of the query in a file instead of just viewing them in the Redshift terminal. To do this, type the following command at the Redshift prompt:

    \o your_file_name.csv

All query results for the remainder of your Redshift session will be written to the file your_file_name.csv on your local machine. To stop writing query results to the file, quit your session with:

    \q

A variety of SQL UI tools exist that let you save tables generated from queries to Excel directly. A couple of these programs are SQL Workbench/J and Navicat.

2. New Users

Another fundamental and important metric of app performance is the number of new users (per day, week, month, etc.). New users for a given day are the users whose first Amplitude-recorded event occurred on that day. The basic query for new user count is:

Query Objective: Count the new users on a given day

    SELECT COUNT(amplitude_id)
    FROM events123
    WHERE first_event = 'True'
      AND DATE(event_time) = ' ';

Explanation
The query above returns the number of users who logged their first event (specified by first_event = 'True'), and hence were new users, on March 1. The red text of the query above should be adjusted to your specific case.

Modifications

Time Zones
Dates and times are in UTC (formatted yyyy-mm-dd hh:mm:ss), so if you are interested in getting new user counts for a different time zone, forgo the DATE() function and offset the full timestamps by the appropriate differential.

For example, to obtain the number of new users in the 24-hour period corresponding to March 1st Pacific Time, modify the event_time part of the query above to:

    WHERE event_time >= ' :00:00' AND event_time < ' :00:00'

Number of Users Who Did a Specific Event
The basic query above counts users who did any event as their first event. If you are instead interested in users who did a certain event type, you can easily modify the query to do so. For example, if you only want users who did the 'signedup' event, query the signedup event table:

    SELECT COUNT(amplitude_id)
    FROM app123.signedup
    WHERE first_event = 'True'
      AND DATE(event_time) = ' ';

Obtaining a List of Users
Just as with active users, it is often useful to obtain a list of the actual new user ids in addition to the count:

    SELECT DISTINCT amplitude_id FROM events

3. Composition

Grouping your users by user properties will give you insight into who is using your app.

Query Objective: Show the breakdown of devices for users in a two week period

    SELECT device_model, COUNT(DISTINCT(amplitude_id))
    FROM events123
    WHERE DATE(event_time) BETWEEN ' ' AND ' '
    GROUP BY device_model
    ORDER BY COUNT DESC;

Explanation
The query above counts the number of distinct users by device for the first two weeks in March. It's worth noting that if a user does events on multiple devices during the time period, she will be counted in each device bucket. The red text of the query above should be adjusted to your specific case.

Modifications

Filter on another user property
If you want to filter on another user property, add a condition on it to the WHERE clause, for example:

    AND country = 'India'

The query will now only include users in India.
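The note about users being counted once per device bucket is easy to check on a toy table. The sketch below uses Python's built-in sqlite3 module as a local stand-in for Redshift; the rows, device names, and dates are all hypothetical, and only standard SQL constructs shared by both engines are used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events123 (amplitude_id INTEGER, device_model TEXT, event_time TEXT)"
)
rows = [
    (1, "iPhone 6", "2015-03-01 10:00:00"),
    (1, "iPad Air", "2015-03-02 11:00:00"),  # same user on a second device
    (2, "iPhone 6", "2015-03-03 09:30:00"),
    (2, "iPhone 6", "2015-03-04 09:30:00"),  # repeat events do not add to the count
]
conn.executemany("INSERT INTO events123 VALUES (?, ?, ?)", rows)

# Same shape as the composition query: distinct users per device model.
breakdown = conn.execute(
    """
    SELECT device_model, COUNT(DISTINCT amplitude_id) AS users
    FROM events123
    WHERE DATE(event_time) BETWEEN '2015-03-01' AND '2015-03-14'
    GROUP BY device_model
    ORDER BY users DESC, device_model
    """
).fetchall()

total_users = conn.execute(
    "SELECT COUNT(DISTINCT amplitude_id) FROM events123"
).fetchone()[0]

print(breakdown)
print("sum of buckets:", sum(users for _, users in breakdown), "distinct users:", total_users)
```

Because user 1 appears under two device models, the bucket counts add up to more than the number of distinct users, so the per-device rows should not be summed to obtain a total.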

4. Sessions

You can see how long people spend using your app. On the dashboard, session lengths are calculated by subtracting the session_id (the session start time, in milliseconds since epoch) from MAX(client_event_time).

Query Objective: Show the distribution of session lengths on a specific date

    SELECT DATEDIFF('milliseconds',
                    timestamp 'epoch' + session_id / 1000 * INTERVAL '1 second',
                    max) AS diff_millisec
    FROM (SELECT session_id, amplitude_id,
                 MIN(client_event_time) AS min,
                 MAX(client_event_time) AS max
          FROM events123
          WHERE session_id != -1
            AND DATE(event_time) BETWEEN ' ' AND ' '
          GROUP BY session_id, amplitude_id)
    WHERE DATE(min) = ' '
    ORDER BY diff_millisec ASC;

Explanation
The inner SELECT chooses distinct pairs of session_id and amplitude_id, as well as the minimum and maximum timestamps per unique pair. The outer SELECT uses the DATEDIFF function to subtract the session start from MAX(client_event_time) by turning the session_id into a timestamp. It does so by dividing by 1000 (which converts milliseconds to seconds), multiplying by the 1 second interval, and then adding the result to the epoch timestamp (which is 0).

The final WHERE clause restricts the calculation to sessions that started on March 1 (because they could have extended into March 2). The red text of the query above should be adjusted to your specific case.

Query Objective: Show the average session length per segment

    SELECT (
        SELECT SUM(length)
        FROM (
            SELECT DISTINCT session_id, amplitude_id,
                   DATEDIFF('milliseconds',
                            timestamp 'epoch' + session_id / 1000 * INTERVAL '1 second',
                            max) AS length
            FROM (
                SELECT amplitude_id, session_id,
                       MAX(client_event_time) OVER (PARTITION BY session_id
                           ORDER BY amplitude_id, client_event_time
                           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS max,
                       MIN(client_event_time) OVER (PARTITION BY session_id
                           ORDER BY amplitude_id, client_event_time
                           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS min
                FROM events123
                WHERE country = 'United States'
                  AND DATE(event_time) BETWEEN ' ' AND ' '
                  AND session_id != '-1')
            WHERE DATE(min) = ' ')
    ) / (
        SELECT CAST(COUNT(DISTINCT session_id) AS float)
        FROM events123
        WHERE session_id != '-1'
          AND DATE(event_time) = ' '
          AND country = 'United States'
    ) / 1000 AS average;

Explanation
In the red (innermost) subquery, we select amplitude_id, session_id, the MAX value of client_event_time in a given session, and the MIN value of client_event_time from your events table, looking ONLY at users from the United States on January 1st. We partition the table by session_id. PARTITION BY is similar to GROUP BY, but it does not aggregate the rows (each row with the same amplitude ID stays independent), and within each partition the client_event_time is sorted from earliest to latest.

The blue subquery selects the distinct session_ids and amplitude_ids along with the difference between the maximum client_event_time and the session start time derived from session_id (which gives the session length in milliseconds).

The orange subquery sums the lengths of the sessions, which gives the TOTAL time for all sessions.

The green subquery gives the number of distinct sessions from users who were in the United States on January 1st.

Finally, the black outer query divides the TOTAL session time by the number of sessions, giving the average session length. We then divide by 1000 to get the average in seconds. Text in purple can be adjusted for your specific case.
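The arithmetic behind both session queries (treat session_id as milliseconds since epoch, subtract it from the last client_event_time, then average and divide by 1000) can be sketched in a few lines of Python. The session ids and timestamps below are hypothetical, and the sketch keeps full millisecond precision rather than reproducing any rounding the SQL expression may apply.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

def session_length_ms(session_id, last_client_event_time):
    """Milliseconds between the session start (session_id) and the last event."""
    session_start = EPOCH + timedelta(milliseconds=session_id)
    return (last_client_event_time - session_start) // timedelta(milliseconds=1)

# Hypothetical sessions: session_id -> MAX(client_event_time) for that session.
sessions = {
    1425196800000: datetime(2015, 3, 1, 8, 5, 30),   # started 2015-03-01 08:00:00 UTC
    1425200400000: datetime(2015, 3, 1, 9, 1, 0),    # started 2015-03-01 09:00:00 UTC
}

lengths = [session_length_ms(sid, last) for sid, last in sessions.items()]
average_seconds = sum(lengths) / len(lengths) / 1000

print(lengths)
print(average_seconds)
```

The first session lasts 5 minutes 30 seconds and the second lasts 1 minute, so the total is summed in milliseconds, divided by the session count, and only then converted to seconds, mirroring the order of operations in the average query.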

5. Events

Analyzing custom events will help you understand what users are actually doing when they're in your app. There are many different types of questions you can ask, so we'll provide you with some examples below.

Query Objective: Show the distribution of event property totals

    SELECT DATE(event_time) AS DATE, e_type, COUNT(*)
    FROM app123.signup
    WHERE DATE(event_time) BETWEEN ' ' AND ' '
    GROUP BY DATE, e_type
    ORDER BY DATE, COUNT DESC;

Explanation
The query shows the distribution of the type property of the signup event for every day in the first week of March. Because event properties are pulled into their own columns, we can query on the event property type directly and use GROUP BY to capture each property value on each day. The red text of the query above should be adjusted to your specific case.

Query Objective: Count the number of users who did an event at least twice on a specific date

    SELECT amplitude_id, COUNT(*) AS total
    FROM app123.game_initiated
    WHERE DATE(event_time) = ' '
    GROUP BY amplitude_id
    HAVING COUNT(*) >= 2;

Explanation
The query above lists the users who did the Game Initiated event two or more times on March 1, along with how many times each did it; the number of rows returned is the number of such users. The GROUP BY builds the per-user totals for the Game Initiated event, and the HAVING clause keeps only the users who did it two or more times. The red text of the query above should be adjusted to your specific case.

Query Objective: Count the number of events done by a specific set of users who did another event

Specifically, we will count the number of sentmessage events done by people who did the signup event in California during the first two weeks of March. There are two steps. First, we need to get the set of users who did signup in California from March 1 through March 14. The query below gets us this set. This is an intermediate query that we will use in the final query.

    SELECT DISTINCT(amplitude_id)
    FROM app123.signup
    WHERE region = 'California'
      AND DATE(event_time) BETWEEN ' ' AND ' ';

The red text of the query above should be adjusted to your specific case.

The next step is to answer the question: for users in this set, how many sentmessage events happened during the same time period? There are two ways to get this. One way is to use an IN and the other is to use a JOIN. Both require the intermediate query we defined above. We'll explain both so you can choose whichever you feel more comfortable using.

Using an IN

    SELECT COUNT(*)
    FROM app123.sentmessage
    WHERE DATE(event_time) BETWEEN ' ' AND ' '
      AND amplitude_id IN (SELECT DISTINCT(amplitude_id)
                           FROM app123.signup
                           WHERE region = 'California'
                             AND DATE(event_time) BETWEEN ' ' AND ' ');

Explanation
The outer SELECT counts the number of sentmessage events. The condition amplitude_id IN (...) means it will only select rows where the amplitude_id is in the set of users returned by the subquery. We place our intermediate query inside the IN so that we only count messages from users who did signup in California from March 1 to March 14. The red text of the query above should be adjusted to your specific case.

Using a JOIN

    CREATE OR REPLACE VIEW CalisignUp0301to0314 AS
    SELECT DISTINCT(amplitude_id)
    FROM app123.signup
    WHERE region = 'California'
      AND DATE(event_time) BETWEEN ' ' AND ' ';

    SELECT COUNT(*)
    FROM app123.sentmessage
    INNER JOIN CalisignUp0301to0314
      ON app123.sentmessage.amplitude_id = CalisignUp0301to0314.amplitude_id
    WHERE DATE(event_time) BETWEEN ' ' AND ' ';

Explanation
The first part of the query is the intermediate query we defined above. We have turned it into a view (CalisignUp0301to0314) to make the query cleaner. The second part of the query is a JOIN: we JOIN the sentmessage table with the created view. The JOIN keeps only the rows whose amplitude_id appears in both tables (the users who did signup), and the rest of the query only picks from these rows. The red text of the query above should be adjusted to your specific case.

Query Objective: Show the distribution of users who have done an event by number of times done

    SELECT messages, COUNT(*) AS users
    FROM (SELECT amplitude_id, COUNT(*) AS messages
          FROM app123.sentmessage
          WHERE DATE(event_time) BETWEEN ' ' AND ' '
          GROUP BY amplitude_id) AS per_user
    GROUP BY messages
    ORDER BY messages DESC;

Explanation
The query's output is a table of message counts and the number of users who sent that number of messages in the first two weeks of March. The output has two columns, messages and users, with one row per message count.

The inner SELECT creates a table of unique users and how many messages they've logged during the first two weeks of March. The outer SELECT creates a table based on the number of messages and the number of users who fell into that bucket.
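The two-level aggregation described in the explanation (first messages per user, then users per message count) can be illustrated with a short Python sketch. The list of amplitude_ids below is hypothetical sample data, with one entry per sentmessage event.

```python
from collections import Counter

# Hypothetical sentmessage events: one amplitude_id per logged event.
sentmessage_events = [101, 102, 101, 103, 101, 102, 104]

# Inner step: how many messages each user sent.
messages_per_user = Counter(sentmessage_events)

# Outer step: how many users fall into each message-count bucket.
users_per_bucket = Counter(messages_per_user.values())

print("messages users")
for messages, users in sorted(users_per_bucket.items(), reverse=True):
    print(messages, users)
```

Here one user sent three messages, one sent two, and two sent one each, so the distribution has three buckets even though there are four users and seven events.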

Query Objective: Find out the last three events a user does before churning

We'll limit the analysis to people who used the app in the month prior to last.

1. Define churned users as people who were active in the month prior to last but have not logged in during the last month:

    CREATE VIEW churned AS (
        SELECT DISTINCT(amplitude_id)
        FROM events123
        WHERE DATE(event_time) BETWEEN ' ' AND ' '
          AND amplitude_id NOT IN (
              SELECT DISTINCT(amplitude_id)
              FROM events123
              WHERE DATE(event_time) BETWEEN ' ' AND ' '
          )
    );

2. Fetch the last three events per user:

    CREATE TEMPORARY TABLE last3 AS (
        SELECT * FROM (
            SELECT amplitude_id, event_type,
                   row_number() OVER (PARTITION BY amplitude_id ORDER BY event_time DESC)
            FROM events123
            WHERE event_type NOT IN ('session_start', 'session_end')
              AND amplitude_id IN (SELECT amplitude_id FROM churned)
        )
        WHERE row_number <= 3
    );

3. Join the tables to combine the three events into one row:

    CREATE VIEW last3joined AS (
        SELECT a.amplitude_id,
               a.event_type AS e1,
               b.event_type AS e2,
               c.event_type AS e3
        FROM (SELECT * FROM last3 WHERE row_number = 1) AS a
        JOIN (SELECT * FROM last3 WHERE row_number = 2) AS b
          ON a.amplitude_id = b.amplitude_id
        JOIN (SELECT * FROM last3 WHERE row_number = 3) AS c
          ON a.amplitude_id = c.amplitude_id
    );

4. What were the last three events before the user churned?

    SELECT e1 || ',' || e2 || ',' || e3 AS last3, COUNT(*)
    FROM last3joined
    GROUP BY last3
    ORDER BY count DESC;

Explanation
This query shows the last three events users did, out of the set of users who were active in January but not active in February. Because rows are numbered by event_time in descending order, e1 is the most recent event and e3 is the earliest of the three. The red text of the query above should be adjusted to your specific case.
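To make the four churn steps concrete, here is a small Python sketch that follows the same logic on an in-memory list: find users active in January but not February, take each one's three most recent events, and count the resulting combinations. The rows, months, and event names are hypothetical sample data.

```python
from collections import Counter, defaultdict

# Hypothetical (amplitude_id, event_time, event_type) rows.
events = [
    (1, "2015-01-05 10:00", "openapp"),
    (1, "2015-01-05 10:01", "viewmessage"),
    (1, "2015-01-05 10:02", "sendmessage"),
    (1, "2015-01-06 09:00", "openapp"),
    (2, "2015-01-07 12:00", "openapp"),
    (2, "2015-01-07 12:01", "viewmessage"),
    (2, "2015-02-03 08:00", "openapp"),      # active in February, so not churned
    (3, "2015-01-09 15:00", "viewmessage"),
    (3, "2015-01-09 15:01", "sendmessage"),
    (3, "2015-01-09 15:02", "session_end"),  # excluded, as in step 2
    (3, "2015-01-09 15:03", "openapp"),
]

# Step 1: active in January but not in February.
active_jan = {uid for uid, t, _ in events if t.startswith("2015-01")}
active_feb = {uid for uid, t, _ in events if t.startswith("2015-02")}
churned = active_jan - active_feb

# Step 2: last three events per churned user, most recent first (row_number 1..3).
last_events = defaultdict(list)
for uid, _, event_type in sorted(events, key=lambda e: e[1], reverse=True):
    if uid not in churned or event_type in ("session_start", "session_end"):
        continue
    if len(last_events[uid]) < 3:
        last_events[uid].append(event_type)

# Steps 3 and 4: one string per user with all three events, then count combinations.
last3 = Counter(",".join(types) for types in last_events.values() if len(types) == 3)
print(last3.most_common())
```

As with the inner joins in step 3, a churned user with fewer than three qualifying events drops out of the final count.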

6. Funnels

For almost any app, there are key sequences of events that users should progress through in order to successfully begin or continue using the app; such a sequence is commonly called a funnel. For example, for a messaging app, the key initial funnel might have three steps:

(1) The openapp event
(2) The viewmessage event
(3) The sendmessage event

Note: Tracking the number of users who make it (and don't make it) to each stage in a funnel is crucial, as it identifies which parts of your app's user experience flow are smooth and which parts are bottlenecks that need improvement.

In this section, we'll demonstrate how to do funnel analysis in Redshift, using the three-stage messaging app funnel described above as an example. To do this, we will create each step in the funnel as a SQL view, which is essentially a saved query that we can use without retyping it.

Query Objective: Obtain a list of users for each step of a funnel

    CREATE VIEW Funnel_Step_1 AS (
        SELECT DISTINCT user_id
        FROM app123.openapp
        WHERE DATE(event_time) BETWEEN ' ' AND ' '
    );

This view, which we name 'Funnel_Step_1', captures the users who opened the app during March 1st and 2nd. Next, we use the Funnel_Step_1 view to construct the view for the second step in the funnel:

    CREATE VIEW Funnel_Step_2 AS (
        SELECT DISTINCT app123.viewmessage.user_id
        FROM app123.viewmessage
        INNER JOIN Funnel_Step_1
          ON app123.viewmessage.user_id = Funnel_Step_1.user_id
        WHERE DATE(event_time) BETWEEN ' ' AND ' '
    );

Funnel_Step_2 captures the subset of the users from Funnel_Step_1 who also did the viewmessage event during the first two days of March; that is, the users who did both openapp and viewmessage. Finally, we use Funnel_Step_2 to construct the view for the third step of the funnel:

    CREATE VIEW Funnel_Step_3 AS (
        SELECT DISTINCT app123.sendmessage.user_id
        FROM app123.sendmessage
        INNER JOIN Funnel_Step_2
          ON app123.sendmessage.user_id = Funnel_Step_2.user_id
        WHERE DATE(event_time) BETWEEN ' ' AND ' '
    );

Funnel_Step_3 captures the subset of the users from Funnel_Step_2 (which, recall, is itself a subset of the users from Funnel_Step_1) who also did the sendmessage event during the first two days of March.
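Since each view keeps only users who appear in both the previous step and the current event table, the chain of INNER JOINs behaves like repeated set intersection. The Python sketch below shows that idea with hypothetical sets of user_ids; the names are made up for illustration.

```python
# Hypothetical sets of user_ids who did each event within the date range.
did_openapp = {"ann", "bob", "cat", "dan"}
did_viewmessage = {"ann", "bob", "cat", "eve"}
did_sendmessage = {"ann", "eve"}

funnel_step_1 = did_openapp
funnel_step_2 = funnel_step_1 & did_viewmessage  # INNER JOIN keeps users present in both
funnel_step_3 = funnel_step_2 & did_sendmessage

print(len(funnel_step_1), len(funnel_step_2), len(funnel_step_3))
print(sorted(funnel_step_3))
```

Note that "eve" viewed and sent a message but never opened the app in the range, so she is absent from every step. Also note that, like the views as written, this checks only that each event happened within the date range, not that the events happened in funnel order.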

Now that we have created the views for our funnel, we can analyze each step. First, we can look at the count of users who made it to steps 1, 2, and 3, respectively, using the queries:

    SELECT count(*) FROM Funnel_Step_1;
    SELECT count(*) FROM Funnel_Step_2;
    SELECT count(*) FROM Funnel_Step_3;

Query Objective: Adding steps to a funnel

While our example funnel here has three steps, you can add as many steps to your funnel as you'd like. Let's add a step to the funnel above:

    CREATE VIEW Funnel_Step_4 AS (
        SELECT DISTINCT app123.next_event.user_id
        FROM app123.next_event
        INNER JOIN Funnel_Step_3
          ON app123.next_event.user_id = Funnel_Step_3.user_id
        WHERE DATE(event_time) BETWEEN ' ' AND ' '
    );

Query Objective: Getting the list of users who did (or did not) reach a step in a funnel

In addition to getting the counts of users for each step in the funnel, you can also get the list of user_ids for the users who did (or did not) reach a given step in the funnel. To get the list of users who reached step X but did not reach step X+1 (referred to as users who dropped off the funnel at step X+1), use the query below; here we obtain the users who reached step 2 of our example funnel (so they did the openapp and viewmessage events) but did not reach step 3 (so they did not do the sendmessage event):

    SELECT Funnel_Step_2.user_id
    FROM Funnel_Step_2
    LEFT JOIN Funnel_Step_3
      ON Funnel_Step_2.user_id = Funnel_Step_3.user_id
    WHERE Funnel_Step_3.user_id IS NULL;

Query Objective: Funnels where users did event X, then Y, with no other events in between

In our dashboard, users are counted as converted as long as they complete the next funnel step on the same day or up to 30 days after they have entered the funnel. To get a list of users who did the first step in the funnel and immediately proceeded to do the next event, we need to start using partition (window) functions.

Let's say we are looking at a funnel with the events openapp, then viewmessage, and we only want to look at the number of users who did viewmessage immediately after openapp, with no other events in between. In this case, we must query the events table instead of the individual event tables, as an individual event table does not tell us which event immediately follows. To count the users who did viewmessage immediately after openapp, use the query:

    SELECT COUNT(DISTINCT amplitude_id)
    FROM (SELECT amplitude_id, event_type, event_time,
                 LEAD(event_type, 1) OVER (PARTITION BY amplitude_id ORDER BY event_time) AS next_event_type
          FROM events123)
    WHERE next_event_type = 'viewmessage'
      AND event_type = 'openapp'
      AND DATE(event_time) BETWEEN ' ' AND ' ';

Explanation
The inner subquery selects amplitude_id, event_type, and event_time, along with a window function. PARTITION BY is similar to GROUP BY, but it does not aggregate the rows (each row with the same amplitude ID stays independent), and within each partition we have chosen to sort by event_time, so the rows are ordered from earliest to latest. The LEAD function with offset 1 returns the value from the row one after the current row, and AS names that column next_event_type. Note that the LEAD function only works within the partition (see the null values in the sample table below).

From the resulting table, we select only the rows where next_event_type has the value viewmessage (the second event in the funnel) and event_type has the value openapp (the first event in the funnel), and only events on March 1st and March 2nd.

A simplified example of the partition function can be seen below:

    amplitude_id  event_time  event_type   next_event_type
    a             1:01        openapp      viewmessage
    a             1:03        viewmessage  viewmessage
    a             1:05        viewmessage  null
    b             1:06        openapp      viewmessage
    b             1:10        viewmessage  null
    c             1:12        openapp      null

In the example above, two rows and two users satisfy the requirement (users a and b, rows 1 and 4).
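The behavior of LEAD within a partition can be reproduced on the sample rows from the explanation above. This is a minimal Python sketch of the same idea, not Redshift code; the helper name with_next_event_type is hypothetical, and sorting the time strings directly works only because these sample values share one format.

```python
from itertools import groupby

# The sample rows from the explanation above: (amplitude_id, event_time, event_type).
rows = [
    ("a", "1:01", "openapp"),
    ("a", "1:03", "viewmessage"),
    ("a", "1:05", "viewmessage"),
    ("b", "1:06", "openapp"),
    ("b", "1:10", "viewmessage"),
    ("c", "1:12", "openapp"),
]

def with_next_event_type(rows):
    """Emulate LEAD(event_type, 1) OVER (PARTITION BY amplitude_id ORDER BY event_time)."""
    out = []
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    for _, group in groupby(ordered, key=lambda r: r[0]):
        group = list(group)
        # The last row of each partition has no successor, so it gets None (null).
        next_types = [r[2] for r in group[1:]] + [None]
        out.extend((*r, nxt) for r, nxt in zip(group, next_types))
    return out

matches = [
    r for r in with_next_event_type(rows)
    if r[2] == "openapp" and r[3] == "viewmessage"
]
users = {r[0] for r in matches}
print(len(matches), sorted(users))
```

User c is excluded because the only openapp row is the last in its partition, so its next_event_type is null rather than borrowing a value from another user's rows.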

Query Objective: Funnels where users did event Y after event X, within 24 hours of event X

    CREATE OR REPLACE VIEW openapp_funnel1 AS
    SELECT * FROM (
        SELECT amplitude_id, event_time,
               row_number() OVER (PARTITION BY amplitude_id ORDER BY event_time ASC)
        FROM app123.openapp
        WHERE DATE(event_time) BETWEEN ' ' AND ' ')
    WHERE row_number = 1;

The inner SELECT creates a table with the Amplitude ID and the time at which the user did the openapp event. The table is partitioned by amplitude_id, and within each partition the event times are sorted from earliest to latest. Each row in each partition is given a row number. The outer SELECT picks only the first row of each partition, which is the first time the user did the openapp event in the given window.

The inner SELECT makes a table that looks like this:

    amplitude_id  event_time  row_number()
    a             1:00        1
    a             1:30        2
    b             1:04        1
    c             1:05        1
    c             1:10        2
    c             1:15        3

The outer SELECT makes a table that looks like this:

    amplitude_id  event_time  row_number()
    a             1:00        1
    b             1:04        1
    c             1:05        1

    CREATE OR REPLACE VIEW openapp_funnel2 AS
    SELECT DISTINCT(amplitude_id) FROM (
        SELECT openapp_funnel1.amplitude_id,
               DATEDIFF('milliseconds', openapp_funnel1.event_time, app123.viewmessage.event_time) AS dt
        FROM openapp_funnel1
        INNER JOIN app123.viewmessage
          ON openapp_funnel1.amplitude_id = app123.viewmessage.amplitude_id
        WHERE DATE(app123.viewmessage.event_time) BETWEEN ' ' AND ' ')
    WHERE dt > 0 AND dt <= 86400000;

The inner SELECT JOINs the openapp_funnel1 view with the viewmessage table on amplitude_id. It selects the id and the time difference between the 2nd event and the 1st event ('dt'). For the time difference we have to use the DATEDIFF() function because Redshift doesn't recognize intervals (the output you would get if you simply subtracted the dates). In the WHERE clause, the upper bound of the date range is +1 day because the second event could happen during the next day. The outer SELECT picks just the ids where the difference is greater than 0 milliseconds (meaning the 2nd event happened after the first event) and at most 86400000 milliseconds (1 day).

    SELECT COUNT(*) FROM openapp_funnel1;
    SELECT COUNT(*) FROM openapp_funnel2;

To get the weekly rate, divide the 2nd value by the 1st value.
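The 24-hour window logic (first openapp per user, then any viewmessage more than 0 and at most one day's worth of milliseconds later) can be checked with a short Python sketch. All ids and timestamps below are hypothetical.

```python
from datetime import datetime, timedelta

DAY_MS = 24 * 60 * 60 * 1000  # one day expressed in milliseconds

# Hypothetical events: amplitude_id -> list of event times.
openapp = {
    "a": [datetime(2015, 3, 1, 1, 0), datetime(2015, 3, 1, 1, 30)],
    "b": [datetime(2015, 3, 1, 1, 4)],
    "c": [datetime(2015, 3, 1, 1, 5)],
}
viewmessage = {
    "a": [datetime(2015, 3, 1, 0, 30), datetime(2015, 3, 1, 20, 0)],  # second is within 24h
    "b": [datetime(2015, 3, 2, 1, 5)],                               # 24h and 1 minute later
}

# Equivalent of keeping row_number = 1: the first openapp per user.
first_open = {uid: min(times) for uid, times in openapp.items()}

def ms_between(start, end):
    """Signed difference in whole milliseconds, like DATEDIFF('milliseconds', start, end)."""
    return (end - start) // timedelta(milliseconds=1)

converted = {
    uid
    for uid, start in first_open.items()
    for t in viewmessage.get(uid, [])
    if 0 < ms_between(start, t) <= DAY_MS
}

print(DAY_MS, len(first_open), sorted(converted))
print(len(converted) / len(first_open))
```

User a converts through the later viewmessage (the earlier one precedes the openapp, so its difference is negative), user b misses the window by one minute, and user c never viewed a message.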

7. Revenue

If your app tracks revenue-generating events through Amplitude, such as in-app purchases, you can query for users and actions based on these revenue-generating events in Redshift.

Note: Amplitude offers highly accurate revenue tracking by verifying purchases with Apple iTunes and Google Play. This section assumes that your app has instrumented revenue verification. Verified revenue events are stored in Amplitude's Redshift database with event_type verified_revenue, and the actual monetary amount for that purchase is stored in the revenue column.

Query Objective: Obtain the number of paying users and total revenue

A very useful summary query is to find the number of distinct users who spent money on purchases over a period of time and the total amount of money they spent:

SELECT count(distinct amplitude_id), sum(revenue)
FROM app123.verified_revenue
WHERE revenue > 0
    AND DATE(event_time) BETWEEN ' ' AND ' ';

Query Objective: Obtain a list of top paying users

Next, we can obtain a list of our app's so-called whales (i.e. users who are highly engaged and are the highest spenders on in-app purchases). The following query returns the user_ids of paying users and the total amount they have purchased over a specified time period, in descending order (highest paying users first):

SELECT user_id, sum(revenue) AS totalspent
FROM events123
WHERE DATE(event_time) BETWEEN ' ' AND ' '
    AND revenue IS NOT NULL
GROUP BY user_id
ORDER BY totalspent DESC;

8. User Properties

A very common query is selecting users who satisfy some property intrinsic to them - their country, language, device platform (iOS or Android), the ad network that directed them to the app, etc. Amplitude tracks all of this data, so finding the users who satisfy user properties is a simple query on Redshift.

There are two primary types of user properties: properties tracked automatically by Amplitude, and custom-defined user properties. Each requires different query syntax, which we will go over below.

Properties tracked automatically by Amplitude

These properties are stored for every event in their own Redshift column and include:

  version: the version of your app being used
  country: the country as set on the user's device
  city: the city of the user
  region: the region of the user (states within the United States, and provinces in other countries)
  DMA: designated marketing area, a marketing area that shares media
  Language: language as set on the user's device
  Platform: operating system type, e.g. Android, iOS, Chrome, etc.
  OS: version number of the operating system, e.g. Android
  Device family: e.g. Samsung, Casio, Kyocera, Acer
  Device Type: e.g. iPhone 6, Galaxy
  Carrier: e.g. Verizon, Vodafone, AT&T

It is often useful to first look at the most common values for a given user property. For example, perhaps we are interested in knowing the countries in which our app has the most users. To do this, use the following query:

Query Objective: Obtain the most common values for a given user property

SELECT country, count(distinct amplitude_id) AS count
FROM events123
WHERE DATE(event_time) BETWEEN ' ' AND ' '
GROUP BY country
ORDER BY count DESC;

This will return a table with the country name as the first column and the number of distinct users from that country as the second column; the 'ORDER BY count DESC' option at the end will list the countries from highest number of users to lowest.

Once we have a sense of the relevant property values, we can then query for the list of users who have certain user properties. For example, if we are interested in getting a list of active users on March 1st and 2nd who are from either Canada or the United Kingdom, we can perform the following query:

Query Objective: Obtain a list of users who have certain properties

SELECT DISTINCT user_id, country, platform
FROM events123
WHERE (country = 'Canada' OR country = 'United Kingdom')
    AND DATE(event_time) BETWEEN ' ' AND ' ';

Here we return all of the relevant columns (user_id, country, platform) so we can see the corresponding property values for each user satisfying the query; however, you can choose to return just the user_id, or you can return other columns that are not part of the WHERE clause.

Custom-defined User Properties

In addition to the user properties automatically tracked by Amplitude, your app can specify additional user-level properties. Possible examples include the advertising network the user was referred from, the number of photos the user has saved in the app, the amount of in-game currency the user has, etc. User properties are pulled into their own columns in each event table. There is a limit of 400 user properties that can be put into their own columns. All other properties are saved in JSON format in a single column in Redshift called user_properties.

Conceptually, these are very similar to the Amplitude-tracked user properties discussed above; they track one aspect of the current state of a user and they are not event-specific (so the same user properties and values appear on all events for a user at a point in time).

As an example, say we want to see the most common advertising referral networks for users and we have stored this value in the user_properties column under the key 'Referral'. Then the query is:

Query Objective: Obtain the most common advertising referral networks for users

SELECT JSON_EXTRACT_PATH_TEXT(user_properties, 'Referral') AS Referral_Type,
    count(distinct amplitude_id) AS count
FROM events123
WHERE DATE(event_time) BETWEEN ' ' AND ' '
GROUP BY Referral_Type
ORDER BY count DESC;

This will return a table with the referral network name as the first column (which we have chosen to call Referral_Type, but you can name it anything you want) and the number of associated distinct users as the second column; the 'ORDER BY count DESC' option at the end will list the referral network names in descending order from highest number of users to lowest.
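To see what this query computes, the same grouping can be reproduced on a handful of rows in Python. This is a rough sketch, not the Redshift implementation: the sample rows, the network names, and the function name users_per_property_value are all hypothetical, and missing keys are simply grouped under an empty string here, which may differ from how Redshift reports them.

```python
import json
from collections import defaultdict

# Hypothetical event rows: (amplitude_id, user_properties JSON text).
events = [
    (1, '{"Referral": "NetworkA", "Current Level": "8"}'),
    (1, '{"Referral": "NetworkA", "Current Level": "9"}'),
    (2, '{"Referral": "NetworkB"}'),
    (3, '{"Referral": "NetworkA"}'),
    (4, '{}'),
]


def users_per_property_value(rows, key):
    """Count distinct users per value of one JSON key, highest count first."""
    users_by_value = defaultdict(set)
    for amplitude_id, properties_json in rows:
        # Missing keys are grouped under the empty string in this sketch.
        value = str(json.loads(properties_json).get(key, ""))
        users_by_value[value].add(amplitude_id)
    counts = [(value, len(ids)) for value, ids in users_by_value.items()]
    return sorted(counts, key=lambda pair: pair[1], reverse=True)


referral_counts = users_per_property_value(events, "Referral")
print(referral_counts)
```

User 1 appears on two events but is counted once, which is the effect of count(distinct amplitude_id).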

Numerical Custom-defined User Properties

If the user property you are interested in has numerical values instead of text, you can query for ranges of values. For example, below we query for the number of users whose Current Level in our game app is greater than 7 but less than 10 (i.e. Level 8 or 9):

Query Objective: Obtain the number of users whose current level is greater than 7 but less than 10

SELECT count(DISTINCT amplitude_id)
FROM events123
WHERE NULLIF(JSON_EXTRACT_PATH_TEXT(user_properties, 'Current Level'), '')::int > 7
    AND NULLIF(JSON_EXTRACT_PATH_TEXT(user_properties, 'Current Level'), '')::int < 10
    AND DATE(event_time) BETWEEN ' ' AND ' ';

Be sure to use the syntax above - specifically the NULLIF() function (which converts empty strings '' to the special SQL value NULL) and the ::int cast (which converts strings to integers). This is necessary for numerical property values since the json_extract_path_text() function returns strings.
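The role of NULLIF() and the ::int cast can be illustrated with a small Python analogue. This is a sketch of the idea only: the helper names nullif, to_int, and level_in_range are made up, None stands in for SQL NULL, and the sample strings are hypothetical. The presumable reason for the NULLIF() step is that an empty string cannot be parsed as an integer, much as int("") fails in Python.

```python
def nullif(value, sentinel):
    """Mirror SQL NULLIF: return None (NULL) when value equals sentinel."""
    return None if value == sentinel else value


def to_int(value):
    """Mirror a ::int cast that passes NULL through unchanged."""
    return None if value is None else int(value)


def level_in_range(raw_level, low=7, high=10):
    """True when the level is strictly between low and high; NULL never matches."""
    level = to_int(nullif(raw_level, ""))
    return level is not None and low < level < high


# Hypothetical extracted values; the extraction function returns text, not numbers.
raw_levels = ["8", "", "10", "9", "7"]
matching = [lvl for lvl in raw_levels if level_in_range(lvl)]
print(matching)
```

The empty string is turned into None before the cast, so it is skipped rather than raising an error, and only levels 8 and 9 pass the range check.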

9. Event Properties

In addition to user properties, Amplitude also allows tracking of event properties, which provide deeper data on user actions, specific to the type of event that occurred. For example, in a gambling game app, when the user does a 'BET' event on a hand of cards, an event property called 'amount' can capture the amount of in-game currency they wagered. Or in a shopping app, when a user purchases an item, triggering a 'PURCHASE' event, an event property called 'item_name' can capture the name of the specific item that was purchased.

Amplitude stores these event-based properties in Redshift in their own individual columns for each event type. There is a limit of 50 event properties that can be pulled out into their own columns. All other event properties will be stored in a special JSON column called event_properties. To query for them, we use the same syntax that we use for custom-defined user properties (as described in section 8), based on the Redshift json_extract_path_text() function.

Taking the shopping app example from the previous paragraph, the following query finds the names of the items bought and the count of how many times each item was purchased over a period of time, ordered by the count:

Query Objective: Obtain the list of items bought and how frequently each item was purchased

SELECT item_name, count(*) AS count
FROM app123.purchase
WHERE DATE(event_time) BETWEEN ' ' AND ' '
GROUP BY item_name
ORDER BY count DESC;

Numerical Event Properties

If the event property you are interested in has numerical values instead of text, you can query for ranges of values. Taking the gambling game app example from the last section, we can query for the number of users who, when doing a 'BET' event, wagered between 100 and 500 credits:

Query Objective: Obtain the number of users who did BET and wagered between 100 and 500 credits

SELECT count(distinct amplitude_id) AS count
FROM app123.bet
WHERE DATE(event_time) BETWEEN ' ' AND ' '
    AND NULLIF(e_credits, '')::int >= 100
    AND NULLIF(e_credits, '')::int <= 500;

Additional Resources

Amplitude Docs
Case Studies
Blog
Amazon Redshift docs

Questions?


QUERYING MICROSOFT SQL SERVER COURSE OUTLINE. Course: 20461C; Duration: 5 Days; Instructor-led

QUERYING MICROSOFT SQL SERVER COURSE OUTLINE. Course: 20461C; Duration: 5 Days; Instructor-led CENTER OF KNOWLEDGE, PATH TO SUCCESS Website: QUERYING MICROSOFT SQL SERVER Course: 20461C; Duration: 5 Days; Instructor-led WHAT YOU WILL LEARN This 5-day instructor led course provides students with

More information

CSC 343 Winter SQL: Aggregation, Joins, and Triggers MICHAEL LIUT

CSC 343 Winter SQL: Aggregation, Joins, and Triggers MICHAEL LIUT SQL: Aggregation, Joins, and Triggers CSC 343 Winter 2018 MICHAEL LIUT (MICHAEL.LIUT@UTORONTO.CA) DEPARTMENT OF MATHEMATICAL AND COMPUTATIONAL SCIENCES UNIVERSITY OF TORONTO MISSISSAUGA Aggregation Operators

More information

1Z Oracle Database 11g - SQL Fundamentals I Exam Summary Syllabus Questions

1Z Oracle Database 11g - SQL Fundamentals I Exam Summary Syllabus Questions 1Z0-051 Oracle Database 11g - SQL Fundamentals I Exam Summary Syllabus Questions Table of Contents Introduction to 1Z0-051 Exam on Oracle Database 11g - SQL Fundamentals I 2 Oracle 1Z0-051 Certification

More information

Acquiring, Exploring and Preparing the Data

Acquiring, Exploring and Preparing the Data Technical Appendix Catch the Pink Flamingo Analysis Produced by: Prabhat Tripathi Acquiring, Exploring and Preparing the Data Data Exploration Data Set Overview The table below lists each of the files

More information

The State of Mobile Advertising Q2 2012

The State of Mobile Advertising Q2 2012 Q2 2012 Executive summary In our first edition of the State of Mobile Advertising report, we take an in-depth look at the monetization of mobile advertising from four perspectives within the ad delivery

More information

Information Systems Engineering. SQL Structured Query Language DML Data Manipulation (sub)language

Information Systems Engineering. SQL Structured Query Language DML Data Manipulation (sub)language Information Systems Engineering SQL Structured Query Language DML Data Manipulation (sub)language 1 DML SQL subset for data manipulation (DML) includes four main operations SELECT - used for querying a

More information

Finance on mobile: Canada

Finance on mobile: Canada Finance on mobile: Canada Exploring how people use their smartphones for finance activities Commissioned: 2016 Q3 This is a title and it will Google has commissioned this study for Think with Google. We're

More information

APPLICATION USER GUIDE

APPLICATION USER GUIDE APPLICATION USER GUIDE Application: Analytics Version: 1.0 Description: Analytics provides a complete view of your website analytics and usage. Page 2 of 59 Analytics 1.0 Summary Contents 1 ANALYTICS...

More information

Slides by: Ms. Shree Jaswal

Slides by: Ms. Shree Jaswal Slides by: Ms. Shree Jaswal Overview of SQL, Data Definition Commands, Set operations, aggregate function, null values, Data Manipulation commands, Data Control commands, Views in SQL, Complex Retrieval

More information

SQL CHEAT SHEET. created by Tomi Mester

SQL CHEAT SHEET. created by Tomi Mester SQL CHEAT SHEET created by Tomi Mester I originally created this cheat sheet for my SQL course and workshop participants.* But I have decided to open-source it and make it available for everyone who wants

More information

Business Analytics Nanodegree Syllabus

Business Analytics Nanodegree Syllabus Business Analytics Nanodegree Syllabus Master data fundamentals applicable to any industry Before You Start There are no prerequisites for this program, aside from basic computer skills. You should be

More information

Tableau Tutorial Using Canadian Arms Sales Data

Tableau Tutorial Using Canadian Arms Sales Data Tableau Tutorial Using Canadian Arms Sales Data 1) Your data comes from Industry Canada s Trade site. 2) If you don t want to download the data yourself, use this file. You can also download it from the

More information

Experimental Finance, IEOR. Mike Lipkin, Alexander Stanton

Experimental Finance, IEOR. Mike Lipkin, Alexander Stanton Experimental Finance, IEOR Mike Lipkin, Alexander Stanton Housekeeping Make sure your tables/views/functions are prefixed with UNI ID One zip file please, containing other files Describe your process Project

More information

Querying Microsoft SQL Server

Querying Microsoft SQL Server Querying Microsoft SQL Server Course 20461D 5 Days Instructor-led, Hands-on Course Description This 5-day instructor led course is designed for customers who are interested in learning SQL Server 2012,

More information

L07: SQL: Advanced & Practice. CS3200 Database design (sp18 s2) 1/11/2018

L07: SQL: Advanced & Practice. CS3200 Database design (sp18 s2) 1/11/2018 L07: SQL: Advanced & Practice CS3200 Database design (sp18 s2) 1/11/2018 254 Announcements! Please pick up your name card - always bring it - move closer to the front HW3 will be again SQL - individual

More information

Introduction to SQL Part 2 by Michael Hahsler Based on slides for CS145 Introduction to Databases (Stanford)

Introduction to SQL Part 2 by Michael Hahsler Based on slides for CS145 Introduction to Databases (Stanford) Introduction to SQL Part 2 by Michael Hahsler Based on slides for CS145 Introduction to Databases (Stanford) Lecture 3 Lecture Overview 1. Aggregation & GROUP BY 2. Set operators & nested queries 3. Advanced

More information

TURN DATA INTO ACTIONABLE INSIGHTS. Google Analytics Workshop

TURN DATA INTO ACTIONABLE INSIGHTS. Google Analytics Workshop TURN DATA INTO ACTIONABLE INSIGHTS Google Analytics Workshop The Value of Analytics Google Analytics is more than just numbers and stats. It tells the story of how people are interacting with your brand

More information

MIS NETWORK ADMINISTRATOR PROGRAM

MIS NETWORK ADMINISTRATOR PROGRAM NH107-7475 SQL: Querying and Administering SQL Server 2012-2014 136 Total Hours 97 Theory Hours 39 Lab Hours COURSE TITLE: SQL: Querying and Administering SQL Server 2012-2014 PREREQUISITE: Before attending

More information

Implementing Table Operations Using Structured Query Language (SQL) Using Multiple Operations. SQL: Structured Query Language

Implementing Table Operations Using Structured Query Language (SQL) Using Multiple Operations. SQL: Structured Query Language Implementing Table Operations Using Structured Query Language (SQL) Using Multiple Operations Show Only certain columns and rows from the join of Table A with Table B The implementation of table operations

More information

ASSIGNMENT NO Computer System with Open Source Operating System. 2. Mysql

ASSIGNMENT NO Computer System with Open Source Operating System. 2. Mysql ASSIGNMENT NO. 3 Title: Design at least 10 SQL queries for suitable database application using SQL DML statements: Insert, Select, Update, Delete with operators, functions, and set operator. Requirements:

More information

Computing for Medicine (C4M) Seminar 3: Databases. Michelle Craig Associate Professor, Teaching Stream

Computing for Medicine (C4M) Seminar 3: Databases. Michelle Craig Associate Professor, Teaching Stream Computing for Medicine (C4M) Seminar 3: Databases Michelle Craig Associate Professor, Teaching Stream mcraig@cs.toronto.edu Relational Model The relational model is based on the concept of a relation or

More information

Hustle Documentation. Release 0.1. Tim Spurway

Hustle Documentation. Release 0.1. Tim Spurway Hustle Documentation Release 0.1 Tim Spurway February 26, 2014 Contents 1 Features 3 2 Getting started 5 2.1 Installing Hustle............................................. 5 2.2 Hustle Tutorial..............................................

More information

4. SQL - the Relational Database Language Standard 4.3 Data Manipulation Language (DML)

4. SQL - the Relational Database Language Standard 4.3 Data Manipulation Language (DML) Since in the result relation each group is represented by exactly one tuple, in the select clause only aggregate functions can appear, or attributes that are used for grouping, i.e., that are also used

More information

CSC Web Programming. Introduction to SQL

CSC Web Programming. Introduction to SQL CSC 242 - Web Programming Introduction to SQL SQL Statements Data Definition Language CREATE ALTER DROP Data Manipulation Language INSERT UPDATE DELETE Data Query Language SELECT SQL statements end with

More information

Guide to Google Analytics: Admin Settings. Campaigns - Written by Sarah Stemen Account Manager. 5 things you need to know hanapinmarketing.

Guide to Google Analytics: Admin Settings. Campaigns - Written by Sarah Stemen Account Manager. 5 things you need to know hanapinmarketing. Guide to Google Analytics: Google s Enhanced Admin Settings Written by Sarah Stemen Account Manager Campaigns - 5 things you need to know INTRODUCTION Google Analytics is vital to gaining business insights

More information

NCSS: Databases and SQL

NCSS: Databases and SQL NCSS: Databases and SQL Tim Dawborn Lecture 1, January, 2016 Motivation SQLite SELECT WHERE JOIN Tips 2 Outline 1 Motivation 2 SQLite 3 Searching for Data 4 Filtering Results 5 Joining multiple tables

More information