Canvas Data Utilities Documentation Release 0.0.1a Kajigga Dev Mar 07, 2017
Contents

1 CanvasData Utilities
    1.1 Module Usage
    1.2 Config File
    1.3 Command-line Tool
    1.4 Usage
2 Using the module programmatically
3 Indices and tables
CHAPTER 1

CanvasData Utilities

This Python module is designed to make it easy to access Canvas Data files. Currently, this module makes it possible to:

- convert downloaded Canvas Data files to CSV files with headers
- export SQL table creation statements
- list files for a table
- download all files or files for a specific table
- view the Canvas Data schema (fields, field types, etc.)
- connect to a database (uses SQLAlchemy) to create tables, import data, and run SQL queries

Module Usage

This module can be used programmatically in other scripts and software. An example of creating a canvas_data object is found below:

    from canvas_data_utils.canvas_data_auth import CanvasData

    canvas_data_object = CanvasData(
        API_KEY=YOUR_API_KEY,
        API_SECRET=YOUR_API_SECRET,
        base_folder=YOUR_BASE_DIR,
        data_folder=YOUR_DATA_DIR)

Once you have that object created, you can...

- generate MySQL table creation statements

    mysql_table_creation_statement = canvas_data_object.table_creation_statement('mysql')

- generate SQLite table creation statements

    sqlite_table_creation_statement = canvas_data_object.table_creation_statement('sqlite')

- generate PostgreSQL table creation statements

    postgres_table_creation_statement = canvas_data_object.table_creation_statement('postgres')

- create tables in a database given by a connection string

    canvas_data_object.create_tables('sqlite:///{}'.format(db_filename))

- fetch the current schema (as JSON)

    schema = canvas_data_object.fetch_schema()

- get a list of columns in a table

    user_dim_columns = canvas_data_object.get_schema_columns('user_dim')

- convert a downloaded text file from TSV (Tab Separated Values) to CSV

    canvas_data_object.convert_tsv_to_csv(tsv_filepath)

- list all the tables in the schema

    table_list = canvas_data_object.table_list()

- download and convert all files to CSV

    canvas_data_object.convert_all_to_csv()

- list all downloadable files for a table

    file_list = canvas_data_object.list_all_files('user_dim')

Config File

You need to create a config file. It is a typical .ini file and should look something like the following example.

    [config]
    API_SECRET = replace_with_api_secret_from_canvas_data
    API_KEY = replace_with_api_key_from_canvas_data
    base_folder = /path/to/base/folder/for/downloads/
    data_folder = %(base_folder)s/test2
    connection_string = sqlite:///%(base_folder)s/sample.db
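The %(base_folder)s references in the sample config use the interpolation syntax understood by Python's standard configparser module. The following is a minimal sketch of how those values expand, assuming the file is read with configparser and its default interpolation; the source does not state which parser the library actually uses.

```python
import configparser

# Same content as the sample config, inlined so the sketch is self-contained.
SAMPLE_CONFIG = """
[config]
API_SECRET = replace_with_api_secret_from_canvas_data
API_KEY = replace_with_api_key_from_canvas_data
base_folder = /path/to/base/folder/for/downloads/
data_folder = %(base_folder)s/test2
connection_string = sqlite:///%(base_folder)s/sample.db
"""

# Assumption: default BasicInterpolation, which expands %(name)s references
# to other keys in the same section.
parser = configparser.ConfigParser()
parser.read_string(SAMPLE_CONFIG)
config = parser["config"]

data_folder = config["data_folder"]
connection_string = config["connection_string"]

print(data_folder)
print(connection_string)
```

Because base_folder ends with a slash in the sample, the expanded values contain a doubled slash (for example .../downloads//test2). Dropping the trailing slash from base_folder avoids this.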
Note: The connection_string configuration follows the connection pattern needed by SQLAlchemy, described at http://docs.sqlalchemy.org/en/rel_1_0/core/engines.html. This library supports any database type that SQLAlchemy does.

Command-line Tool

This library includes a command line utility called canvasdata.

Usage

    canvasdata [-h] [--config CONFIG] [-t TABLE] [--offline OFFLINE]
               {convert_to_csv,import,create_tables,reset,sql_create_statement,list_files,download,sample_queries,schema}

    optional arguments:
      -h, --help         show this help message and exit
      --config CONFIG    path to the configuration file
      -t TABLE           specify a specific table
      --offline OFFLINE  run in offline mode
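To show how the options and the subcommand combine, here is a sketch of an argparse parser that mirrors the usage text. It is an illustration only, not the tool's actual source; the positional name command and the file name canvasdata.ini are hypothetical. The parsed argument list corresponds to running canvasdata --config canvasdata.ini -t user_dim list_files in a shell.

```python
import argparse

# Subcommands copied from the usage text.
COMMANDS = [
    "convert_to_csv", "import", "create_tables", "reset",
    "sql_create_statement", "list_files", "download",
    "sample_queries", "schema",
]


def build_parser():
    """Build a parser that mirrors the documented usage (illustrative only)."""
    parser = argparse.ArgumentParser(prog="canvasdata")
    parser.add_argument("--config", help="path to the configuration file")
    parser.add_argument("-t", dest="table", metavar="TABLE",
                        help="specify a specific table")
    parser.add_argument("--offline", help="run in offline mode")
    # "command" is a hypothetical name for the positional subcommand.
    parser.add_argument("command", choices=COMMANDS)
    return parser


# Hypothetical invocation: list the downloadable files for the user_dim table.
args = build_parser().parse_args(
    ["--config", "canvasdata.ini", "-t", "user_dim", "list_files"]
)
print(args.command, args.table, args.config)
```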
CHAPTER 2

Using the module programmatically

class canvas_data_utils.canvas_data_auth.CanvasData(*args, **kwargs)
    Bases: object

    clear_table(schema_table)
        Deletes all records from the table schema_table.

    convert_all_to_csv()
        Converts all files to CSV, downloading them first if necessary.

    create_tables(db_connect_string=None)
        Creates all of the tables in the database.

    download_all_files(table=None)
        Downloads all of the files for a given table or, if no table is specified, downloads all files for all tables.

    download_single_file(file_url, filename)
        Downloads a single file given the file_url and the filename to give the file.

    file_imported(filename)
        Returns True if the file has been imported into the database; otherwise returns False.

    get_latest_download(table)
        Returns the latest downloaded file for a table.

    import_all_requests()
        Imports all requests files. Requests files are incremental, so to get a full picture of web traffic in a given time period you must import the requests files individually.

    import_data(schema_table=None, with_download=True)
        Downloads and imports all tables unless schema_table is defined, in which case it only imports that table.

    import_file(schema_table, csv_filename)
        Imports the table specified by schema_table with the data from csv_filename.

    latest_files()
        Returns the latest downloadable file for a table.

    list_all_files(schema_table)
        Lists all files for a table.

    normalize_values_for_db(obj, schema_table)
        Normalizes some of the data for the database. For example, many fields come across with \N representing a null value. This method changes that to None so it imports correctly into the database. This method also ensures that floats are floats, integers are integers, date fields have proper dates or are blank, etc.

    remove_caches()
        Removes all cached data (schema, imported_files, etc.) from the base folder.

    reset_database(db_connect_string=None)
        Resets the database by dropping all known tables and then recreating them. This method does not import data.
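The kind of clean-up described for normalize_values_for_db can be sketched as a standalone function. This is not the library's implementation: the function name normalize_row, the column_types mapping, the \N null marker, and the timestamp layout are all assumptions made for illustration.

```python
from datetime import datetime

NULL_MARKER = "\\N"  # assumed null marker in the downloaded flat files


def normalize_row(row, column_types):
    """Return a copy of row with null markers and typed strings converted.

    column_types maps column name -> "int", "float", "date", or "text".
    It is a hypothetical, simplified stand-in for the real schema.
    """
    normalized = {}
    for name, value in row.items():
        kind = column_types.get(name, "text")
        if value is None or value == NULL_MARKER or value == "":
            # Null markers and blanks become None so the database stores NULL.
            normalized[name] = None
        elif kind == "int":
            normalized[name] = int(value)
        elif kind == "float":
            normalized[name] = float(value)
        elif kind == "date":
            # Assumed timestamp layout; adjust to the actual file format.
            normalized[name] = datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
        else:
            normalized[name] = value
    return normalized


row = {"id": "42", "score": "3.5",
       "created_at": "2017-03-07 12:00:00", "name": "\\N"}
types = {"id": "int", "score": "float", "created_at": "date", "name": "text"}
result = normalize_row(row, types)
print(result)
```

Converting values before insertion, rather than relying on the database to coerce strings, keeps the behavior the same across the SQLite, MySQL, and PostgreSQL back ends mentioned earlier.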
CHAPTER 3

Indices and tables

- genindex
- modindex
- search