keras-pandas Documentation

Size: px
Start display at page:

Download "keras-pandas Documentation"

Transcription

1 keras-pandas Documentation Release Brendan Herger Sep 29, 2018

2

3 Contents: 1 keras-pandas Quick Start Usage Contact Changelog Automater 7 3 lib 11 4 constants 13 5 transformations 15 Python Module Index 17 i

4 ii

5 Welcome to keras-pandas! I d recommend that you start with the keras-pandas section & Quick Start. Contents: 1

6 2 Contents:

7 CHAPTER 1 keras-pandas tl;dr: keras-pandas allows users to rapidly build and iterate on deep learning models. Getting data formatted and into keras can be tedious, time consuming, and difficult, whether your a veteran or new to Keras. keras-pandas overcomes these issues by (automatically) providing: A cleaned, transformed and correctly formatted X and y (good for keras, sklearn or any other ML platform) An input nub, without the hassle of worrying about input shapes or data types An output layer, correctly formatted for the kind of response variable provided With these resources, it s possible to rapidly build and iterate on deep learning models, and focus on the parts of modeling that you enjoy! For more info, check out the: Code Documentation Issue tracker Author s website 1.1 Quick Start Let s build a model with the titanic data set. This data set is particularly fun because this data set contains a mix of categorical and numerical data types, and features a lot of null values. We ll keras-pandas pip install -U keras-pandas And then run the following snippet to create and train a model: 3

8 from keras import Model from keras.layers import Dense from keras_pandas.automater import Automater from keras_pandas.lib import load_titanic observations = load_titanic() # Transform the data set, using keras_pandas categorical_vars = ['pclass', 'sex', 'survived'] numerical_vars = ['age', 'siblings_spouses_aboard', 'parents_children_aboard', 'fare'] text_vars = ['name'] auto = Automater(categorical_vars=categorical_vars, numerical_vars=numerical_vars, text_vars=text_vars, response_var='survived') X, y = auto.fit_transform(observations) # Start model with provided input nub x = auto.input_nub # Fill in your own hidden layers x = Dense(32)(x) x = Dense(32, activation='relu')(x) x = Dense(32)(x) # End model with provided output nub x = auto.output_nub(x) model = Model(inputs=auto.input_layers, outputs=x) model.compile(optimizer='adam', loss=auto.loss, metrics=['accuracy']) # Train model model.fit(x, y, epochs=4, validation_split=.2) 1.2 Usage Installation You can install keras-pandas with pip: pip install -U keras-pandas Creating an Automater The core feature of keras-pandas is the Automater, which accepts lists of variable types (all optional), and a response variable (optional, for supervised problems). Together, all of these variables are the user_input_variables, which may be different than the variables fed into Keras. As a side note, the response variable must be in one of the variable type lists (e.g. survived is in categorical_vars) 4 Chapter 1. keras-pandas

9 One variable type If you only have one variable type, only use that variable type! categorical_vars = ['pclass', 'sex', 'survived'] auto = Automater(categorical_vars=categorical_vars, response_var='survived') Multiple variable types If you have multiple variable types, throw them all in! categorical_vars = ['pclass', 'sex', 'survived'] numerical_vars = ['age', 'siblings_spouses_aboard', 'parents_children_aboard', 'fare'] auto = Automater(categorical_vars=categorical_vars, numerical_vars=numerical_vars, response_var='survived') No response_var If all variables are always available, and / or your problems space doesn t have a single response variable, you can omit the response variable. categorical_vars = ['pclass', 'sex', 'survived'] numerical_vars = ['age', 'siblings_spouses_aboard', 'parents_children_aboard', 'fare'] auto = Automater(categorical_vars=categorical_vars, numerical_vars=numerical_vars) In this case, an output nub will not be auto-generated Fitting the Automater Before use, the Automator must be fit. The fit() method accepts a pandas DataFrame, which must contain all of the columns listed during initialization. auto.fit(observations) Transforming data Now, we can use our Automater to transform the dataset, from a pandas DataFrame to numpy objects properly formatted for Keras s input and output layers. X, y = auto.transform(observations, df_out=false) This will return two objects: X: An array, containing numpy object for each Keras input. This is generally one Keras input for each user input variable. y: A numpy object, containing the response variable (if one was provided) 1.2. Usage 5

10 1.2.5 Using input / output nubs Setting up correctly formatted, heuristically good input and output layers is often Tedious Time consuming Difficult for those new to Keras With this in mind, keras-pandas provides correctly formatted input and output nubs. The input nub is correctly formatted to accept the output from auto.transform(). It contains one Keras Input layer for each generated input, may contain addition layers, and has all input piplines joined with a Concatenate layer. The output layer is correctly formatted to accept the response variable numpy object. 1.3 Contact Hey, I m Brendan Herger, avaiable at Please feel free to reach out to me at 13herger <at> gmail <dot> com 1.4 Changelog Development Adding regression example w/ inverse_transformation (#64) Fixing issue where web socket connections were being opened needlessly (#65) Adding Manifest.in, with including files references in setup.py (#54) Fixed poorly written text embedding index unit test (#52) Added license (#49) Earlier Lots of things happened. Break things and move fast 6 Chapter 1. keras-pandas

11 CHAPTER 2 Automater class Automater.Automater(numerical_vars=[], categorical_vars=[], boolean_vars=[], datetime_vars=[], text_vars=[], non_transformed_vars=[], response_var=none, df_out=false) Bases: object init (numerical_vars=[], categorical_vars=[], boolean_vars=[], datetime_vars=[], text_vars=[], non_transformed_vars=[], response_var=none, df_out=false) x. init (... ) initializes x; see help(type(x)) for signature _check_input_dataframe_columns_(input_dataframe) _check_output_dataframe_columns_(output_dataframe) _create_input_nub(variable_type_dict, input_dataframe) Generate a nub, appropriate for use as an input (and possibly additional Keras layers). Each Keras input variable has on input pipeline, with: One Input (required) Possible additional layers (optional, such as embedding layers for text) All input pipelines are then joined with a Concatenate layer Parameters variable_type_dict ({str:[str]}) A dictionary, with keys describing variables types, and values listing particular variables input_dataframe (pandas.dataframe) A pandas dataframe, containing all keras input layers Returns A Keras layer, which can be fed into future layers Return type ([keras,input], Layer) _create_mappers(variable_type_dict) Creates two sklearn-pandas mappers, one for the input variables, and another for the output variable(s) Parameters variable_type_dict ({str:[str]}) A dictionary, with keys describing variables types, and values listing particular variables 7

12 Returns Two sklearn-pandas mappers, one for the input variables, and another for the output variable(s) Return type (DataFrameMapper, DataFrameMapper) _create_output_nub(variable_type_dict, output_variables_df, y) Generate a nub, appropriate for use as an output / final Keras layer. The structure of this nub will depend on the y variable s data type Parameters variable_type_dict ({str:[str]}) A dictionary, with keys describing variables types, and values listing particular variables output_variables_df (pandas.dataframe) A dataframe containing the output variable. This is necessary for some data types (e.g. a categorical output needs to know how levels the categorical variable has) y (str) The name of the response variable Returns A single Keras layer, correctly formatted to output the response variable provided Return type Layer _suggest_loss(variable_type_dict, y) fit(input_dataframe) Get the data and layers ready for use Train the input transformation pipelines Create the keras input layers Train the output transformation pipeline(s) (optional, only if there is a response variable) Create the output layer(s) (optional, only if there is a response variable) Set self.fitted to True Parameters input_dataframe Returns self, now in a fitted state. The Automater now has initialized input layers, output layer(s) (if response variable is present), and can be used for the transform step Return type Automater fit_transform(input_dataframe) Perform a fit, and then a transform. See transform for return documentation get_transformer(variable) get_transformers() list_default_transformation_pipelines() transform(input_dataframe, df_out=none) Validate that the provided input_dataframe contains the required input columns Transform the keras input columns Transform the response variable, if it is present Format the data for return 8 Chapter 2. Automater

13 Parameters input_dataframe (pandas.dataframe) A pandas dataframe, containing all keras input layers Returns Either a pandas dataframe (if df_out = True), or a numpy object (if df_out = False). This object will contain: the transformed input variables, and the transformed output variables (if the output variable is present in input_dataframe 9

14 10 Chapter 2. Automater

15 CHAPTER 3 lib lib.check_variable_list_are_valid(variable_type_dict) Checks that the provided variable_type_dict is valid, by: Confirming there is no overlap between all variable lists Parameters variable_type_dict ({str:[str]}) A dictionary, with keys describing variables types, and values listing particular variables Returns True, if there is no overlap Return type bool lib.download_file(url, local_file_path, filename) Download the file at url in chunks, to the location at local_file_path Parameters url (str) URL to a file to be downloaded local_file_path (str) Path to download the file to Returns The path to the file on the local machine (same as input local_file_path) Return type str lib.get_variable_type(variable_name, variable_type_dict, response_var) lib.load_iris() lib.load_lending_club() lib.load_mushroom() lib.load_mushrooms() Load the mushrooms data set, as a pandas DataFrame Returns A DataFrame, containing the mushroom dataset Return type pandas.dataframe 11

16 lib.load_titanic() Load the titanic data set, as a pandas DataFrame Returns A DataFrame, containing the titanic dataset Return type pandas.dataframe 12 Chapter 3. lib

17 CHAPTER 4 constants constants.input_nub_categorical_handler(variable, input_dataframe) constants.input_nub_numeric_handler(variable, input_dataframe) constants.input_nub_text_handler(variable, input_dataframe) Create an input nub for text data, by: Finding all derived variables. With a variable name and sequence length of 4, there would be 4 derived variables, name_0 through name_4 - Creating an appropriately shaped input layer, embedding layer, and all future layers - Return both the input layer and last layer (both are necessary for creating a model) Parameters variable (str) Name of the variable input_dataframe (pandas.dataframe) A dataframe, containing either the specified variable, or derived variables Returns A tuple containing the input layer, and the last layer of the nub 13

18 14 Chapter 4. constants

19 CHAPTER 5 transformations class transformations.embeddingvectorizer(embedding_sequence_length=none) Bases: sklearn.base.transformermixin, sklearn.base.baseestimator Converts text into padded sequences. The output of this transformation is consistent with the required format for Keras embedding layers For example the fat man might be transformed into [2, 0, 27, 1, 1, 1], if the embedding_sequence_length is 6. There are a few sentinel values used by this layer: 0 is used for the UNK token (tokens which were not seen during training) 1 is used for the padding token (to fill out sequences that shorter than embedding_sequence_length) init (embedding_sequence_length=none) x. init (... ) initializes x; see help(type(x)) for signature fit(x, y=none) generate_embedding_sequence_length(observation_series) static pad(input_sequence, length, pad_char) Pad the given iterable, so that it is the correct length. Parameters input_sequence Any iterable object length (int) The desired length of the output. pad_char (str or int) The character or int to be added to short sequences Returns A sequence, of len length Return type [] static prepare_input(x) process_string(input_string) Turn a string into padded sequences, consisten with Keras s Embedding layer 15

20 Simple preprocess & tokenize Convert tokens to indices Pad sequence to be the correct length transform(x) Parameters input_string (str) A string, to be converted into a padded sequence of token indices Returns A padded, fixed-length array of token indices Return type [int] 16 Chapter 5. transformations

21 Python Module Index a Automater, 7 c constants, 13 l lib, 11 t transformations, 15 17

22 18 Python Module Index

23 Index Symbols init () (Automater.Automater method), 7 init () (transformations.embeddingvectorizer method), 15 _check_input_dataframe_columns_() (Automater.Automater method), 7 _check_output_dataframe_columns_() (Automater.Automater method), 7 _create_input_nub() (Automater.Automater method), 7 _create_mappers() (Automater.Automater method), 7 _create_output_nub() (Automater.Automater method), 8 _suggest_loss() (Automater.Automater method), 8 A Automater (class in Automater), 7 Automater (module), 7 C check_variable_list_are_valid() (in module lib), 11 constants (module), 13 D download_file() (in module lib), 11 E EmbeddingVectorizer (class in transformations), 15 F fit() (Automater.Automater method), 8 fit() (transformations.embeddingvectorizer method), 15 fit_transform() (Automater.Automater method), 8 G generate_embedding_sequence_length() (transformations.embeddingvectorizer method), 15 get_transformer() (Automater.Automater method), 8 get_transformers() (Automater.Automater method), 8 get_variable_type() (in module lib), 11 I input_nub_categorical_handler() (in module constants), 13 input_nub_numeric_handler() (in module constants), 13 input_nub_text_handler() (in module constants), 13 (Auto- L lib (module), 11 list_default_transformation_pipelines() mater.automater method), 8 load_iris() (in module lib), 11 load_lending_club() (in module lib), 11 load_mushroom() (in module lib), 11 load_mushrooms() (in module lib), 11 load_titanic() (in module lib), 11 P pad() (transformations.embeddingvectorizer static method), 15 prepare_input() (transformations.embeddingvectorizer static method), 15 process_string() (transformations.embeddingvectorizer method), 15 T transform() (Automater.Automater method), 8 transform() (transformations.embeddingvectorizer method), 16 transformations (module), 15 19

Problem Based Learning 2018

Problem Based Learning 2018 Problem Based Learning 2018 Introduction to Machine Learning with Python L. Richter Department of Computer Science Technische Universität München Monday, Jun 25th L. Richter PBL 18 1 / 21 Overview 1 2

More information

COMP 364: Computer Tools for Life Sciences

COMP 364: Computer Tools for Life Sciences COMP 364: Computer Tools for Life Sciences Intro to machine learning with scikit-learn Christopher J.F. Cameron and Carlos G. Oliver 1 / 1 Key course information Assignment #4 available now due Monday,

More information

Figure 3.20: Visualize the Titanic Dataset

Figure 3.20: Visualize the Titanic Dataset 80 Chapter 3. Data Mining with Azure Machine Learning Studio Figure 3.20: Visualize the Titanic Dataset 3. After verifying the output, we will cast categorical values to the corresponding columns. To begin,

More information

ARTIFICIAL INTELLIGENCE AND PYTHON

ARTIFICIAL INTELLIGENCE AND PYTHON ARTIFICIAL INTELLIGENCE AND PYTHON DAY 1 STANLEY LIANG, LASSONDE SCHOOL OF ENGINEERING, YORK UNIVERSITY WHAT IS PYTHON An interpreted high-level programming language for general-purpose programming. Python

More information

Package embed. November 19, 2018

Package embed. November 19, 2018 Version 0.0.2 Package embed November 19, 2018 Title Extra Recipes for Encoding Categorical Predictors Description Factor predictors can be converted to one or more numeric representations using simple

More information

rerpy Documentation Release Nathaniel J. Smith

rerpy Documentation Release Nathaniel J. Smith rerpy Documentation Release 0.2.1 Nathaniel J. Smith February 04, 2014 Contents 1 Vital statistics 3 2 User manual 5 2.1 Getting oriented............................................. 5 2.2 Loading data...............................................

More information

ˆ How to develop a naive LSTM network for a sequence prediction problem.

ˆ How to develop a naive LSTM network for a sequence prediction problem. Chapter 27 Understanding Stateful LSTM Recurrent Neural Networks A powerful and popular recurrent neural network is the long short-term model network or LSTM. It is widely used because the architecture

More information

DEEP LEARNING IN PYTHON. Creating a keras model

DEEP LEARNING IN PYTHON. Creating a keras model DEEP LEARNING IN PYTHON Creating a keras model Model building steps Specify Architecture Compile Fit Predict Model specification In [1]: import numpy as np In [2]: from keras.layers import Dense In [3]:

More information

Hands-on Machine Learning for Cybersecurity

Hands-on Machine Learning for Cybersecurity Hands-on Machine Learning for Cybersecurity James Walden 1 1 Center for Information Security Northern Kentucky University 11th Annual NKU Cybersecurity Symposium Highland Heights, KY October 11, 2018 Topics

More information

DEEP LEARNING IN PYTHON. Introduction to deep learning

DEEP LEARNING IN PYTHON. Introduction to deep learning DEEP LEARNING IN PYTHON Introduction to deep learning Imagine you work for a bank You need to predict how many transactions each customer will make next year Example as seen by linear regression Age Bank

More information

Code Mania Artificial Intelligence: a. Module - 1: Introduction to Artificial intelligence and Python:

Code Mania Artificial Intelligence: a. Module - 1: Introduction to Artificial intelligence and Python: Code Mania 2019 Artificial Intelligence: a. Module - 1: Introduction to Artificial intelligence and Python: 1. Introduction to Artificial Intelligence 2. Introduction to python programming and Environment

More information

Python With Data Science

Python With Data Science Course Overview This course covers theoretical and technical aspects of using Python in Applied Data Science projects and Data Logistics use cases. Who Should Attend Data Scientists, Software Developers,

More information

Keras: Handwritten Digit Recognition using MNIST Dataset

Keras: Handwritten Digit Recognition using MNIST Dataset Keras: Handwritten Digit Recognition using MNIST Dataset IIT PATNA January 31, 2018 1 / 30 OUTLINE 1 Keras: Introduction 2 Installing Keras 3 Keras: Building, Testing, Improving A Simple Network 2 / 30

More information

DATA STRUCTURE AND ALGORITHM USING PYTHON

DATA STRUCTURE AND ALGORITHM USING PYTHON DATA STRUCTURE AND ALGORITHM USING PYTHON Common Use Python Module II Peter Lo Pandas Data Structures and Data Analysis tools 2 What is Pandas? Pandas is an open-source Python library providing highperformance,

More information

Introduction to Data Science. Introduction to Data Science with Python. Python Basics: Basic Syntax, Data Structures. Python Concepts (Core)

Introduction to Data Science. Introduction to Data Science with Python. Python Basics: Basic Syntax, Data Structures. Python Concepts (Core) Introduction to Data Science What is Analytics and Data Science? Overview of Data Science and Analytics Why Analytics is is becoming popular now? Application of Analytics in business Analytics Vs Data

More information

Platypus Analytics Documentation

Platypus Analytics Documentation Platypus Analytics Documentation Release 0.1.0 Platypus LLC December 24, 2015 Contents 1 platypus package 3 1.1 Subpackages............................................... 3 1.2 Module contents.............................................

More information

Avpy Documentation. Release sydh

Avpy Documentation. Release sydh Avpy Documentation Release 0.1.3 sydh May 01, 2016 Contents 1 Overview 1 2 Getting Help 3 3 Issues 5 4 Changes 7 5 Contributions 9 6 Indices and tables 11 6.1 Examples.................................................

More information

As a reference, please find a version of the Machine Learning Process described in the diagram below.

As a reference, please find a version of the Machine Learning Process described in the diagram below. PREDICTION OVERVIEW In this experiment, two of the Project PEACH datasets will be used to predict the reaction of a user to atmospheric factors. This experiment represents the first iteration of the Machine

More information

About Intellipaat. About the Course. Why Take This Course?

About Intellipaat. About the Course. Why Take This Course? About Intellipaat Intellipaat is a fast growing professional training provider that is offering training in over 150 most sought-after tools and technologies. We have a learner base of 700,000 in over

More information

How to Develop Encoder-Decoder LSTMs

How to Develop Encoder-Decoder LSTMs Chapter 9 How to Develop Encoder-Decoder LSTMs 9.0.1 Lesson Goal The goal of this lesson is to learn how to develop encoder-decoder LSTM models. completing this lesson, you will know: After ˆ The Encoder-Decoder

More information

Neural networks. About. Linear function approximation. Spyros Samothrakis Research Fellow, IADS University of Essex.

Neural networks. About. Linear function approximation. Spyros Samothrakis Research Fellow, IADS University of Essex. Neural networks Spyros Samothrakis Research Fellow, IADS University of Essex About Linear function approximation with SGD From linear regression to neural networks Practical aspects February 28, 2017 Conclusion

More information

Python Working with files. May 4, 2017

Python Working with files. May 4, 2017 Python Working with files May 4, 2017 So far, everything we have done in Python was using in-memory operations. After closing the Python interpreter or after the script was done, all our input and output

More information

df2gspread Documentation

df2gspread Documentation df2gspread Documentation Release Eduard Trott Apr 05, 2017 Contents 1 df2gspread 3 1.1 Description................................................ 3 1.2 Status...................................................

More information

Programming for Data Science Syllabus

Programming for Data Science Syllabus Programming for Data Science Syllabus Learn to use Python and SQL to solve problems with data Before You Start Prerequisites: There are no prerequisites for this program, aside from basic computer skills.

More information

nptdms Documentation Release Adam Reeve

nptdms Documentation Release Adam Reeve nptdms Documentation Release 0.11.4 Adam Reeve Dec 01, 2017 Contents 1 Contents 3 1.1 Installation and Quick Start....................................... 3 1.2 Reading TDMS files...........................................

More information

Ch.1 Introduction. Why Machine Learning (ML)? manual designing of rules requires knowing how humans do it.

Ch.1 Introduction. Why Machine Learning (ML)? manual designing of rules requires knowing how humans do it. Ch.1 Introduction Syllabus, prerequisites Notation: Means pencil-and-paper QUIZ Means coding QUIZ Code respository for our text: https://github.com/amueller/introduction_to_ml_with_python Why Machine Learning

More information

Converting categorical data into numbers with Pandas and Scikit-learn -...

Converting categorical data into numbers with Pandas and Scikit-learn -... 1 of 6 11/17/2016 11:02 AM FastML Machine learning made easy RSS Home Contents Popular Links Backgrounds About Converting categorical data into numbers with Pandas and Scikit-learn 2014-04-30 Many machine

More information

Package reticulate. November 24, 2017

Package reticulate. November 24, 2017 Type Package Title R Interface to Python Version 1.3.1 Package reticulate November 24, 2017 R interface to Python modules, classes, and functions. When calling into Python R data types are automatically

More information

Data Acquisition and Processing

Data Acquisition and Processing Data Acquisition and Processing Adisak Sukul, Ph.D., Lecturer,, adisak@iastate.edu http://web.cs.iastate.edu/~adisak/bigdata/ Topics http://web.cs.iastate.edu/~adisak/bigdata/ Data Acquisition Data Processing

More information

span Documentation Release 0.2 Phillip Cloud

span Documentation Release 0.2 Phillip Cloud span Documentation Release 0.2 Phillip Cloud August 30, 2013 CONTENTS 1 Introduction 3 1.1 Name......................................... 3 1.2 Motivation...................................... 3 1.3 Disclaimer......................................

More information

Data Analyst Nanodegree Syllabus

Data Analyst Nanodegree Syllabus Data Analyst Nanodegree Syllabus Discover Insights from Data with Python, R, SQL, and Tableau Before You Start Prerequisites : In order to succeed in this program, we recommend having experience working

More information

Event: PASS SQL Saturday - DC 2018 Presenter: Jon Tupitza, CTO Architect

Event: PASS SQL Saturday - DC 2018 Presenter: Jon Tupitza, CTO Architect Event: PASS SQL Saturday - DC 2018 Presenter: Jon Tupitza, CTO Architect BEOP.CTO.TP4 Owner: OCTO Revision: 0001 Approved by: JAT Effective: 08/30/2018 Buchanan & Edwards Proprietary: Printed copies of

More information

Python Tutorial. CS/CME/BioE/Biophys/BMI 279 Oct. 17, 2017 Rishi Bedi

Python Tutorial. CS/CME/BioE/Biophys/BMI 279 Oct. 17, 2017 Rishi Bedi Python Tutorial CS/CME/BioE/Biophys/BMI 279 Oct. 17, 2017 Rishi Bedi 1 Python2 vs Python3 Python syntax Data structures Functions Debugging Classes The NumPy Library Outline 2 Many examples adapted from

More information

pyblock Documentation

pyblock Documentation pyblock Documentation Release 0.4 James Spencer Apr 02, 2018 Contents 1 Installation 3 2 pyblock API 5 3 pyblock tutorial 11 4 References 17 5 Indices and tables 19 Bibliography 21 Python Module Index

More information

Python review. 1 Python basics. References. CS 234 Naomi Nishimura

Python review. 1 Python basics. References. CS 234 Naomi Nishimura Python review CS 234 Naomi Nishimura The sections below indicate Python material, the degree to which it will be used in the course, and various resources you can use to review the material. You are not

More information

Ch.1 Introduction. Why Machine Learning (ML)?

Ch.1 Introduction. Why Machine Learning (ML)? Syllabus, prerequisites Ch.1 Introduction Notation: Means pencil-and-paper QUIZ Means coding QUIZ Why Machine Learning (ML)? Two problems with conventional if - else decision systems: brittleness: The

More information

Importing data sets in R

Importing data sets in R Importing data sets in R R can import and export different types of data sets including csv files text files excel files access database STATA data SPSS data shape files audio files image files and many

More information

Pandas. Data Manipulation in Python

Pandas. Data Manipulation in Python Pandas Data Manipulation in Python 1 / 27 Pandas Built on NumPy Adds data structures and data manipulation tools Enables easier data cleaning and analysis import pandas as pd 2 / 27 Pandas Fundamentals

More information

GoCD Python API client Documentation

GoCD Python API client Documentation GoCD Python API client Documentation Release 1.0.1 Grigory Chernyshev Dec 08, 2017 Contents 1 Intro 1 2 Contents: 3 2.1 Yet another GoCD python client..................................... 3 2.2 Usage...................................................

More information

PrettyPandas Documentation

PrettyPandas Documentation PrettyPandas Documentation Release 0.0.4 Henry Hammond Mar 26, 2018 Contents 1 Features 3 2 Installation 5 3 Contributing 7 4 Contents 9 4.1 Quick Start................................................

More information

BanzaiDB Documentation

BanzaiDB Documentation BanzaiDB Documentation Release 0.3.0 Mitchell Stanton-Cook Jul 19, 2017 Contents 1 BanzaiDB documentation contents 3 2 Indices and tables 11 i ii BanzaiDB is a tool for pairing Microbial Genomics Next

More information

Coclust Documentation

Coclust Documentation Coclust Documentation Release 0.2.1 Francois Role, Stanislas Morbieu, Mohamed Nadif May 30, 2017 Contents 1 Installation 3 2 Examples 5 3 Python API 11 4 Scripts 31 Python Module Index 37 i ii Coclust

More information

The UOB Python Lectures: Part 3 - Python for Data Analysis

The UOB Python Lectures: Part 3 - Python for Data Analysis The UOB Python Lectures: Part 3 - Python for Data Analysis Hesham al-ammal University of Bahrain Small Data BIG Data Data Scientist s Tasks Interacting with the outside world Reading and writing with a

More information

ENGR 102 Engineering Lab I - Computation

ENGR 102 Engineering Lab I - Computation ENGR 102 Engineering Lab I - Computation Learning Objectives by Week 1 ENGR 102 Engineering Lab I Computation 2 Credits 2. Introduction to the design and development of computer applications for engineers;

More information

Business Club. Decision Trees

Business Club. Decision Trees Business Club Decision Trees Business Club Analytics Team December 2017 Index 1. Motivation- A Case Study 2. The Trees a. What is a decision tree b. Representation 3. Regression v/s Classification 4. Building

More information

An Introduction to Deep Learning with RapidMiner. Philipp Schlunder - RapidMiner Research

An Introduction to Deep Learning with RapidMiner. Philipp Schlunder - RapidMiner Research An Introduction to Deep Learning with RapidMiner Philipp Schlunder - RapidMiner Research What s in store for today?. Things to know before getting started 2. What s Deep Learning anyway? 3. How to use

More information

SCHEME 7. 1 Introduction. 2 Primitives COMPUTER SCIENCE 61A. October 29, 2015

SCHEME 7. 1 Introduction. 2 Primitives COMPUTER SCIENCE 61A. October 29, 2015 SCHEME 7 COMPUTER SCIENCE 61A October 29, 2015 1 Introduction In the next part of the course, we will be working with the Scheme programming language. In addition to learning how to write Scheme programs,

More information

Lotus IT Hub. Module-1: Python Foundation (Mandatory)

Lotus IT Hub. Module-1: Python Foundation (Mandatory) Module-1: Python Foundation (Mandatory) What is Python and history of Python? Why Python and where to use it? Discussion about Python 2 and Python 3 Set up Python environment for development Demonstration

More information

Release Clearcode < and associates (see AUTHORS)

Release Clearcode <  and associates (see AUTHORS) pytest s aucedocumentation Release 0.3.3 Clearcode and associates (see AUTHORS) July 14, 2014 Contents 1 Contents 3 1.1 Usage................................................... 3

More information

Intro. Scheme Basics. scm> 5 5. scm>

Intro. Scheme Basics. scm> 5 5. scm> Intro Let s take some time to talk about LISP. It stands for LISt Processing a way of coding using only lists! It sounds pretty radical, and it is. There are lots of cool things to know about LISP; if

More information

Course May 18, Advanced Computational Physics. Course Hartmut Ruhl, LMU, Munich. People involved. SP in Python: 3 basic points

Course May 18, Advanced Computational Physics. Course Hartmut Ruhl, LMU, Munich. People involved. SP in Python: 3 basic points May 18, 2017 3 I/O 3 I/O 3 I/O 3 ASC, room A 238, phone 089-21804210, email hartmut.ruhl@lmu.de Patrick Böhl, ASC, room A205, phone 089-21804640, email patrick.boehl@physik.uni-muenchen.de. I/O Scientific

More information

Deep Nets with. Keras

Deep Nets with. Keras docs https://keras.io Deep Nets with Keras κέρας http://vem.quantumunlimited.org/the-gates-of-horn/ Professor Marie Roch These slides only cover enough to get started with feed-forward networks and do

More information

Django-CSP Documentation

Django-CSP Documentation Django-CSP Documentation Release 3.0 James Socol, Mozilla September 06, 2016 Contents 1 Installing django-csp 3 2 Configuring django-csp 5 2.1 Policy Settings..............................................

More information

Pandas III: Grouping and Presenting Data

Pandas III: Grouping and Presenting Data Lab 8 Pandas III: Grouping and Presenting Data Lab Objective: Learn about Pivot tables, groupby, etc. Introduction Pandas originated as a wrapper for numpy that was developed for purposes of data analysis.

More information

Pypeline Documentation

Pypeline Documentation Pypeline Documentation Release 0.2 Kyle Corbitt May 09, 2014 Contents 1 Contents 3 1.1 Installation................................................ 3 1.2 Quick Start................................................

More information

pysharedutils Documentation

pysharedutils Documentation pysharedutils Documentation Release 0.5.0 Joel James August 07, 2017 Contents 1 pysharedutils 1 2 Indices and tables 13 i ii CHAPTER 1 pysharedutils pysharedutils is a convenient utility module which

More information

Kaggle See Click Fix Model Description

Kaggle See Click Fix Model Description Kaggle See Click Fix Model Description BY: Miroslaw Horbal & Bryan Gregory LOCATION: Waterloo, Ont, Canada & Dallas, TX CONTACT : miroslaw@gmail.com & bryan.gregory1@gmail.com CONTEST: See Click Predict

More information

Data Wrangling with Python and Pandas

Data Wrangling with Python and Pandas Data Wrangling with Python and Pandas January 25, 2015 1 Introduction to Pandas: the Python Data Analysis library This is a short introduction to pandas, geared mainly for new users and adapted heavily

More information

CSC Advanced Scientific Computing, Fall Numpy

CSC Advanced Scientific Computing, Fall Numpy CSC 223 - Advanced Scientific Computing, Fall 2017 Numpy Numpy Numpy (Numerical Python) provides an interface, called an array, to operate on dense data buffers. Numpy arrays are at the core of most Python

More information

Python data pipelines similar to R Documentation

Python data pipelines similar to R Documentation Python data pipelines similar to R Documentation Release 0.1.0 Jan Schulz October 23, 2016 Contents 1 Python data pipelines 3 1.1 Features.................................................. 3 1.2 Documentation..............................................

More information

Exceptions & a Taste of Declarative Programming in SQL

Exceptions & a Taste of Declarative Programming in SQL Exceptions & a Taste of Declarative Programming in SQL David E. Culler CS8 Computational Structures in Data Science http://inst.eecs.berkeley.edu/~cs88 Lecture 12 April 18, 2016 Computational Concepts

More information

Certified Data Science with Python Professional VS-1442

Certified Data Science with Python Professional VS-1442 Certified Data Science with Python Professional VS-1442 Certified Data Science with Python Professional Certified Data Science with Python Professional Certification Code VS-1442 Data science has become

More information

IMPORTING DATA IN PYTHON I. Welcome to the course!

IMPORTING DATA IN PYTHON I. Welcome to the course! IMPORTING DATA IN PYTHON I Welcome to the course! Import data Flat files, e.g..txts,.csvs Files from other software Relational databases Plain text files Source: Project Gutenberg Table data titanic.csv

More information

IMPORTING & MANAGING FINANCIAL DATA IN PYTHON. Read, inspect, & clean data from csv files

IMPORTING & MANAGING FINANCIAL DATA IN PYTHON. Read, inspect, & clean data from csv files IMPORTING & MANAGING FINANCIAL DATA IN PYTHON Read, inspect, & clean data from csv files Import & clean data Ensure that pd.dataframe() is same as csv source file Stock exchange listings: amex-listings.csv

More information

weibull Documentation

weibull Documentation weibull Documentation Release 0.0 Jason R. Jones Jan 11, 2018 Contents 1 Installation 1 1.1 Using pip................................................ 1 1.2 Using setup.py............................................

More information

pymapd Documentation Release dev3+g4665ea7 Tom Augspurger

pymapd Documentation Release dev3+g4665ea7 Tom Augspurger pymapd Documentation Release 0.4.1.dev3+g4665ea7 Tom Augspurger Sep 07, 2018 Contents: 1 Install 3 2 Usage 5 2.1 Connecting................................................ 5 2.2 Querying.................................................

More information

Announcements for this Lecture

Announcements for this Lecture Lecture 6 Objects Announcements for this Lecture Last Call Quiz: About the Course Take it by tomorrow Also remember survey Assignment 1 Assignment 1 is live Posted on web page Due Thur, Sep. 18 th Due

More information

Lab and Assignment Activity

Lab and Assignment Activity Lab and Assignment Activity 1 Introduction Sometime ago, a Titanic dataset was released to the general public. This file is given to you as titanic_data.csv. This data is in text format and contains 12

More information

Welcome to CS61A! Last modified: Thu Jan 23 03:58: CS61A: Lecture #1 1

Welcome to CS61A! Last modified: Thu Jan 23 03:58: CS61A: Lecture #1 1 Welcome to CS61A! This is a course about programming, which is the art and science of constructing artifacts ( programs ) that perform computations or interact with the physical world. To do this, we have

More information

matchbox Documentation

matchbox Documentation matchbox Documentation Release 0.3.0 Clearcode.cc Oct 27, 2018 Contents 1 Package status 3 2 Package resources 5 3 Contents 7 3.1 Glossary................................................. 7 3.2 Rationale.................................................

More information

Package BiocSklearn. July 14, 2018

Package BiocSklearn. July 14, 2018 Package BiocSklearn July 14, 2018 Title interface to python sklearn via Rstudio reticulate This package provides interfaces to selected sklearn elements, and demonstrates fault tolerant use of python modules

More information

Lecture #12: Quick: Exceptions and SQL

Lecture #12: Quick: Exceptions and SQL UC Berkeley EECS Adj. Assistant Prof. Dr. Gerald Friedland Computational Structures in Data Science Lecture #12: Quick: Exceptions and SQL Administrivia Open Project: Starts Monday! Creative data task

More information

Keras: Handwritten Digit Recognition using MNIST Dataset

Keras: Handwritten Digit Recognition using MNIST Dataset Keras: Handwritten Digit Recognition using MNIST Dataset IIT PATNA February 9, 2017 1 / 24 OUTLINE 1 Introduction Keras: Deep Learning library for Theano and TensorFlow 2 Installing Keras Installation

More information

Python for Non-programmers

Python for Non-programmers Python for Non-programmers A Gentle Introduction 1 Yann Tambouret Scientific Computing and Visualization Information Services & Technology Boston University 111 Cummington St. yannpaul@bu.edu Winter 2013

More information

databuild Documentation

databuild Documentation databuild Documentation Release 0.0.10 Flavio Curella May 15, 2015 Contents 1 Contents 3 1.1 Installation................................................ 3 1.2 Quickstart................................................

More information

PySoundFile Documentation Release 0.6.0

PySoundFile Documentation Release 0.6.0 PySoundFile Documentation Release 0.6.0 Bastian Bechtold, Matthias Geier January 30, 2015 Contents 1 Breaking Changes 1 2 Installation 1 3 Read/Write Functions 2 4 Block Processing 2 5 SoundFile Objects

More information

Page 1. Human-computer interaction. Lecture 1b: Design & Implementation. Building user interfaces. Mental & implementation models

Page 1. Human-computer interaction. Lecture 1b: Design & Implementation. Building user interfaces. Mental & implementation models Human-computer interaction Lecture 1b: Design & Implementation Human-computer interaction is a discipline concerned with the design, implementation, and evaluation of interactive systems for human use

More information

Data Science and Machine Learning Essentials

Data Science and Machine Learning Essentials Data Science and Machine Learning Essentials Lab 3A Visualizing Data By Stephen Elston and Graeme Malcolm Overview In this lab, you will learn how to use R or Python to visualize data. If you intend to

More information

Introduction to Python

Introduction to Python Introduction to Python Jon Kerr Nilsen, Dmytro Karpenko Research Infrastructure Services Group, Department for Research Computing, USIT, UiO Why Python Clean and easy-to-understand syntax alldata = cpickle.load(open(filename1,

More information

COMP1730/COMP6730 Programming for Scientists. Sequence types, part 2

COMP1730/COMP6730 Programming for Scientists. Sequence types, part 2 COMP1730/COMP6730 Programming for Scientists Sequence types, part 2 Lecture outline * Lists * Mutable objects & references Sequence data types (recap) * A sequence contains n 0 values (its length), each

More information

NFLWin Documentation. Release Andrew Schechtman-Rook

NFLWin Documentation. Release Andrew Schechtman-Rook NFLWin Documentation Release 1.0.0 Andrew Schechtman-Rook Aug 17, 2016 Links 1 Quickstart 3 2 Current Default Model 5 3 Why NFLWin? 7 4 Resources 9 4.1 Installation................................................

More information

RPLidar Documentation

RPLidar Documentation RPLidar Documentation Release 0.9.1 Artyom Pavlov May 13, 2017 Contents Python Module Index 5 i ii RPLidar Documentation, Release 0.9.1 Simple and lightweight module for working with RPLidar rangefinder

More information

ENGR 102 Engineering Lab I - Computation

ENGR 102 Engineering Lab I - Computation ENGR 102 Engineering Lab I - Computation Week 07: Arrays and Lists of Data Introduction to Arrays In last week s lecture, 1 we were introduced to the mathematical concept of an array through the equation

More information

streamio Documentation

streamio Documentation streamio Documentation Release 0.1.0.dev James Mills April 17, 2014 Contents 1 About 3 1.1 Examples................................................. 3 1.2 Requirements...............................................

More information

wget

wget CIFAR 10 In 1]: %matplotlib inline %reload_ext autoreload %autoreload 2 The CIFAR-10 dataset consists of 60000 32x32 color images in 10 classes, with 6000 images per class. There are 50000 training images

More information

Sergey Fogelson VP of Analytics, Viacom

Sergey Fogelson VP of Analytics, Viacom EXTREME GRADIENT BOOSTING WITH XGBOOST Review of pipelines using sklearn Sergey Fogelson VP of Analytics, Viacom Pipeline Review Takes a list of named 2-tuples (name, pipeline_step) as input Tuples can

More information

Intermediate/Advanced Python. Michael Weinstein (Day 2)

Intermediate/Advanced Python. Michael Weinstein (Day 2) Intermediate/Advanced Python Michael Weinstein (Day 2) Topics Review of basic data structures Accessing and working with objects in python Numpy How python actually stores data in memory Why numpy can

More information

Supervised Learning Classification Algorithms Comparison

Supervised Learning Classification Algorithms Comparison Supervised Learning Classification Algorithms Comparison Aditya Singh Rathore B.Tech, J.K. Lakshmipat University -------------------------------------------------------------***---------------------------------------------------------

More information

fastparquet Documentation

fastparquet Documentation fastparquet Documentation Release 0.1.5 Continuum Analytics Jul 12, 2018 Contents 1 Introduction 3 2 Highlights 5 3 Caveats, Known Issues 7 4 Relation to Other Projects 9 5 Index 11 5.1 Installation................................................

More information

Emel: Deep Learning in One Line using Type Inference, Architecture Selection, and Hyperparameter Tuning

Emel: Deep Learning in One Line using Type Inference, Architecture Selection, and Hyperparameter Tuning Emel: Deep Learning in One Line using Type Inference, Architecture Selection, and Hyperparameter Tuning BARAK OSHRI and NISHITH KHANDWALA We present Emel, a new framework for training baseline supervised

More information

Principles of Functional Programming

Principles of Functional Programming 15-150 Principles of Functional Programming Some Slides for Lecture 16 Modules March 20, 2018 Michael Erdmann Signatures & Structures A signature specifies an interface. A structure provides an implementation.

More information

WordEmbeddingLoader Documentation

WordEmbeddingLoader Documentation WordEmbeddingLoader Documentation Release 0.2.0 Yuta Koreeda Aug 14, 2017 Modules 1 Issues with encoding 3 2 Development 5 3 CHANGELOG 7 3.1 v0.2.................................................... 7

More information

pescador Documentation

pescador Documentation pescador Documentation Release Brian McFee and Eric Humphrey July 28, 2016 Contents 1 Simple example 1 1.1 Batch generators............................................. 1 1.2 StreamLearner..............................................

More information

Deep Learning Frameworks. COSC 7336: Advanced Natural Language Processing Fall 2017

Deep Learning Frameworks. COSC 7336: Advanced Natural Language Processing Fall 2017 Deep Learning Frameworks COSC 7336: Advanced Natural Language Processing Fall 2017 Today s lecture Deep learning software overview TensorFlow Keras Practical Graphical Processing Unit (GPU) From graphical

More information

lime Documentation Release 0.1 Marco Tulio Ribeiro

lime Documentation Release 0.1 Marco Tulio Ribeiro lime Documentation Release 0.1 Marco Tulio Ribeiro Aug 10, 2017 Contents 1 lime package 3 1.1 Subpackages............................................... 3 1.2 Submodules...............................................

More information

Python for Data Analysis. Prof.Sushila Aghav-Palwe Assistant Professor MIT

Python for Data Analysis. Prof.Sushila Aghav-Palwe Assistant Professor MIT Python for Data Analysis Prof.Sushila Aghav-Palwe Assistant Professor MIT Four steps to apply data analytics: 1. Define your Objective What are you trying to achieve? What could the result look like? 2.

More information

A. Python Crash Course

A. Python Crash Course A. Python Crash Course Agenda A.1 Installing Python & Co A.2 Basics A.3 Data Types A.4 Conditions A.5 Loops A.6 Functions A.7 I/O A.8 OLS with Python 2 A.1 Installing Python & Co You can download and install

More information

mri Documentation Release Nate Harada

mri Documentation Release Nate Harada mri Documentation Release 1.0.0 Nate Harada September 18, 2015 Contents 1 Getting Started 3 1.1 Deploying A Server........................................... 3 1.2 Using Caffe as a Client..........................................

More information

CS111 Jeopardy: The Home Version

CS111 Jeopardy: The Home Version CS111 Jeopardy: The Home Version The game that turns CS111 into CSfun11! Fall 2018 This is intended to be a fun way to review some of the topics that will be on the CS111 final exam. These questions are

More information

Package CatEncoders. March 8, 2017

Package CatEncoders. March 8, 2017 Type Package Title Encoders for Categorical Variables Version 0.1.1 Author nl zhang Package CatEncoders Maintainer nl zhang March 8, 2017 Contains some commonly used categorical

More information