Extending Jython. with SIM, SPARQL and SQL

Similar documents
Overview of the Ruby Language. By Ron Haley

CS 349 / SE 382 Scripting. Professor Michael Terry March 18, 2009

Compiler principles, PS1

Syntax and Grammars 1 / 21

A Simple Syntax-Directed Translator

Basic Python 3 Programming (Theory & Practical)

Programming Paradigms

fjyswan Dr. Philip Cannata 1

1. true / false By a compiler we mean a program that translates to code that will run natively on some machine.

Lecture content. Course goals. Course Introduction. TDDA69 Data and Program Structure Introduction

Lecture 16: Static Semantics Overview 1

Announcements. My office hours are today in Gates 160 from 1PM-3PM. Programming Project 3 checkpoint due tomorrow night at 11:59PM.

Getting Started Values, Expressions, and Statements CS GMU

This is an introductory tutorial, which covers the basics of Jython and explains how to handle its various modules and sub-modules.

Seminar on Languages for Scientific Computing Aachen, 6 Feb Navid Abbaszadeh.

CIS192 Python Programming

Introduction to Python programming, II

Weiss Chapter 1 terminology (parenthesized numbers are page numbers)

1 Lexical Considerations

Contents. Figures. Tables. Examples. Foreword. Preface. 1 Basics of Java Programming 1. xix. xxi. xxiii. xxvii. xxix

JVM ByteCode Interpreter

Introduc)on. Tristan Naumann. Sahar Hasan. Steve Hehl. Brian Lu

UNIVERSITY OF CALIFORNIA Department of Electrical Engineering and Computer Sciences Computer Science Division. P. N. Hilfinger

The current topic: Python. Announcements. Python. Python

Sprite an animation manipulation language Language Reference Manual

Python Implementation Strategies. Jeremy Hylton Python / Google

New Programming Paradigms

PYTHON. Varun Jain & Senior Software Engineer. Pratap, Mysore Narasimha Raju & TEST AUTOMATION ARCHITECT. CenturyLink Technologies India PVT LTD

Programming Languages Third Edition. Chapter 7 Basic Semantics

Learning R via Python...or the other way around

Table of Contents EVALUATION COPY

About Python. Python Duration. Training Objectives. Training Pre - Requisites & Who Should Learn Python

Overview of OOP. Dr. Zhang COSC 1436 Summer, /18/2017

(CC)A-NC 2.5 by Randall Munroe Python

Course Introduction and Python Basics

This course is designed for anyone who needs to learn how to write programs in Python.

CS152 Programming Language Paradigms Prof. Tom Austin, Fall Syntax & Semantics, and Language Design Criteria

Table of Contents. Preface... xxi

Key Differences Between Python and Java

Object Model Comparisons

GIS 4653/5653: Spatial Programming and GIS. More Python: Statements, Types, Functions, Modules, Classes

Reminder About Functions

PYTHON TRAINING COURSE CONTENT


CIS192: Python Programming

Program Abstractions, Language Paradigms. CS152. Chris Pollett. Aug. 27, 2008.

SD314 Outils pour le Big Data

Lecture Set 2: Starting Java

Data Structures (list, dictionary, tuples, sets, strings)

Time : 1 Hour Max Marks : 30

The Structure of a Syntax-Directed Compiler

CS 11 Ocaml track: lecture 6

Functions, Scope & Arguments. HORT Lecture 12 Instructor: Kranthi Varala

Parsing Scheme (+ (* 2 3) 1) * 1

GE PROBLEM SOVING AND PYTHON PROGRAMMING. Question Bank UNIT 1 - ALGORITHMIC PROBLEM SOLVING

CSC326 Python Imperative Core (Lec 2)

Lecture Set 2: Starting Java

STATS 507 Data Analysis in Python. Lecture 2: Functions, Conditionals, Recursion and Iteration

Introduction to Python. Genome 559: Introduction to Statistical and Computational Genomics Prof. James H. Thomas

CSC312 Principles of Programming Languages : Functional Programming Language. Copyright 2006 The McGraw-Hill Companies, Inc.

Syntax Errors; Static Semantics

Semantic actions for declarations and expressions

CS321 Languages and Compiler Design I. Winter 2012 Lecture 1

Course Title: Python + Django for Web Application

Matrex Table of Contents

Python I. Some material adapted from Upenn cmpe391 slides and other sources

Introduction to Python. Genome 559: Introduction to Statistical and Computational Genomics Prof. James H. Thomas

CSE450. Translation of Programming Languages. Lecture 11: Semantic Analysis: Types & Type Checking

TEXT MINING INTRO TO PYTHON

An introduction introduction to functional functional programming programming using usin Haskell

What are compilers? A Compiler

Semantic actions for declarations and expressions. Monday, September 28, 15

CIS192 Python Programming

M/s. Managing distributed workloads. Language Reference Manual. Miranda Li (mjl2206) Benjamin Hanser (bwh2124) Mengdi Lin (ml3567)

CIS192: Python Programming

Scala : an LLVM-targeted Scala compiler

Topics Covered Thus Far. CMSC 330: Organization of Programming Languages. Language Features Covered Thus Far. Programming Languages Revisited

Programming Languages and Program Development Life Cycle Fall Introduction to Information and Communication Technologies CSD 102


RYERSON POLYTECHNIC UNIVERSITY DEPARTMENT OF MATH, PHYSICS, AND COMPUTER SCIENCE CPS 710 FINAL EXAM FALL 96 INSTRUCTIONS

Closing the Case for Groovy (and Ruby, and Python)

Introduction to Compilers and Language Design Copyright (C) 2017 Douglas Thain. All rights reserved.

Compilers. Prerequisites

Faculty of Electrical Engineering, Mathematics, and Computer Science Delft University of Technology

Accelerating Information Technology Innovation

Computer Components. Software{ User Programs. Operating System. Hardware

CS 231 Data Structures and Algorithms, Fall 2016

Pace University. Fundamental Concepts of CS121 1

Python Programming, bridging course 2011

CMSC 330: Organization of Programming Languages

Introduction to Functional Programming (Python) John R. Woodward

Interactive use. $ python. >>> print 'Hello, world!' Hello, world! >>> 3 $ Ctrl-D

CIS192 Python Programming

STUDY NOTES UNIT 1 - INTRODUCTION TO OBJECT ORIENTED PROGRAMMING

SMURF Language Reference Manual Serial MUsic Represented as Functions

Where We Are. Lexical Analysis. Syntax Analysis. IR Generation. IR Optimization. Code Generation. Machine Code. Optimization.

Haske k ll An introduction to Functional functional programming using Haskell Purely Lazy Example: QuickSort in Java Example: QuickSort in Haskell

7. Introduction to Denotational Semantics. Oscar Nierstrasz

Accelerating Information Technology Innovation

Introduction to: Computers & Programming: Review prior to 1 st Midterm

Transcription:

Extending Jython with SIM, SPARQL and SQL 1

Outline of topics Interesting features of Python and Jython Relational and semantic data models and query languages, triple stores, RDF Extending the Jython grammar, abstract syntax, code compiler, runtime behavior Project suggestions and opportunities 2

Why is Python lovely? High level, multi-paradigmed: object oriented, imperative, functional, the best of many worlds Dynamic-duck-strong typing, automatic memory management, garbage collection Emphasizes code readability, clear syntax Off-side rule block delimiting instead of braces Scripting, prototyping, scientific computing, data visualization, web frameworks, Google, bittorrent CPython is the mainstream implementation 3

Isn t Jython reinventing the wheel? Python does not compile into universal binaries which run on every platform Jython dynamically compiles Python into Java bytecode which runs directly on a JVM Jython programs can interact with other Java packages and libraries Sacrifice a little speed for portability Extend language with Java tools instead C 4

J/Python features and caveats Compile or Interpret Off-side rule Duck Typing Multiple return values Argument passing List comprehension Functional programming Decorators 5

Interactive interpreter 1 $./dist/bin/jython 2 Jython 2.5.1+ (unknown:exported, Feb 24 2010, 14:11:17) 3 [Java HotSpot(TM) Client VM (Apple Inc.)] on java1.5.0_22 4 Type "help", "copyright", "credits" or "license" for more information. 5 6 >>> from math import * 7 8 >>> 2**32/1024**3 # how many gigabytes in 32 bit address space? 9 4L 10 11 >>> 2**64/1024**3 # how many gigabytes in 64 bit address space? 12 17179869184L 13 14 >>> 2**1024 # try doing this on a TI-89 without overflow... 15 179769313486231590772930519078902473361797697894230657273430081157732675805500963132 708477322407536021120113879871393357658789768814416622492847430639474124377767893424 865485276302219601246094119453082952085005768838150682342462881473913110540827237163 350510684586298239947245938479716304835356329624224137216L 16 17 >>> log(exp(2)) == exp(log(2)) # are nat log and e^x inverse functions? 18 True 19 20 >>> [(x, log(x)) for x in xrange(1, 11)] # log of first ten positive integers 21 [(1, 0.0), (2, 0.6931471805599453), (3, 1.0986122886681096), (4, 1.3862943611198906), (5, 1.6094379124341003), (6, 1.791759469228055), (7, 1.9459101490553132), (8, 2.0794415416798357), (9, 2.1972245773362196), (10, 2.302585092994046)] 6

Offside-rule block delimiting Indentation using spaces or tabs at the very left of statements is used to delimit blocks instead of curly braces. The exact amount of indentation does not matter, instead the relative indentation of nested blocks matters. Bruce Eckel: Because blocks are denoted by indentation in Python, indentation is uniform in Python programs. And indentation is meaningful to us as readers. So because we have consistent code formatting, I can read somebody else's code and I'm not constantly tripping over, "Oh, I see. They're putting their curly braces here or there." I don't have to think about that. 7

Offside-rule implementation The tokenizer uses a stack to store indentation levels. Whenever a nested block begins, the new indentation level is pushed on the stack, and an <INDENT> token is passed to the parser, though there can never be more than one <INDENT> token in a row. When a line is encountered with a smaller indentation level, values are popped from the stack until a value is on top which is equal to the new indentation level (if none is found, a syntax error occurs). For each value popped, a <DEDENT> token is generated, and there can be multiple <DEDENT> tokens in a row. At the end of the source code, <DEDENT> tokens are generated for each indentation level left on the stack. 8

Offside-rule example Sample code:! if foo:!!if bar:!!!x = 1! else:! x = 0! Token stream and indent stack:! <if> <foo> <:>!!! [0]! <INDENT> <if> <bar> <:>!! [0, 4]! <INDENT> <x> <=> <1>!! [0, 4, 8]! <DEDENT> <DEDENT> <ELSE> <:> [0]! <INDENT> <x> <=> <0>!! [0, 2]! <DEDENT>!!!! [0]! 9

Duck typing 1 def function(x, y, z): 2 return (x+y)*z 3 4 >>> print function(1, 2, 3) 5 9 6 7 >>> type(function(1, 2, 3)) 8 <type 'int'> 9 10 >>> print function("one", "two", 3) 11 onetwoonetwoonetwo 12 13 >>> type(function("one", "two", 3)) 14 <type 'str'> An object's current set of methods and properties determines the valid semantics Allows polymorphism without inheritance The name of the concept refers to the duck test, attributed to James Whitcomb Riley: "when I see a bird that walks like a duck and swims like a duck and quacks like a duck, I call that bird a duck." 10

Multiple return values 1 def function(): 2 x = 1 3 y = "two" 4 z = [1, 2, 3] 5 return x, y, z 6 7 >>> print function(); 8 9 (1, 'two', [1, 2, 3]) 10 11 >>> a, b, c = function() 12 >>> print a, b, c 13 14 1 two [1, 2, 3] 15 16 >>> d = function() 17 >>> print d 18 19 (1, 'two', [1, 2, 3]) Python will automatically pack multiple values into a tuple You can otherwise pack your own lists, tuples, dicts 11

Function arguments 1 def function(arg0, arg1, *args, **kwargs): 2 print "arg0: %s" % arg0 3 print "arg1: %s" % arg1 4 print "args: ", args 5 print "kwargs: ", kwargs 6 7 >>> function(1, 2, "three", "four", key1="five", key2="six") 8 9 arg0: 1 10 arg1: 2 11 args: ('three', 'four') 12 kwargs: {'key2': 'six', 'key1': 'five'} Arguments are passed by reference, but immutable types (tuples, ints, strings) behave like pass by value Positional arguments precede keyword arguments *args receives a tuple (automatically packed) **kwargs receives a dictionary (automatically packed) 12

List comprehensions and generator expressions 1 list = [<expression> for <value> in <collection> if <condition>] 2 3 itr = (<expression> for <value> in <collection> if <condition>) List comprehension is a construct for creating a new list from an existing list This is analogous to mathematical set builder notation Generator expressions are the lazy evaluation equivalent of list comprehensions These are extremely useful, and you should master them! 13

Anonymous functions 1 map(lambda x: x**2, range(1, x+1)) 2 3 == [x**2 for x in range(1, x+1)] 4 5 filter(lambda x: x%2 == 0, range(1, x+1)) 6 7 == [x for x in range(1, x+1) if x%2 == 0] 8 9 reduce(lambda x,y: x*y, xxrange(2, x+1)) 10 11 ==? In Python, lambda functions are a matter of style. You can always define a separate normal function instead. Lambdas are appropriate when encapsulating specific, non-reusable code without littering the code base with simple one-line functions. List comprehensions and generator expressions are generally used instead of filter() and reduce() reduce() has been removed from Python 3.0 standard library 14

Decorator pattern A design pattern for use with any objected oriented programming language Uses composition rather than inheritance to extend classes Allows new behavior to be added to a particular object at runtime, independently of other instances of the same class Java I/O Streams implementation employs the decorator pattern 15

Function Decorators 1 def entryexit(f): 2 def new_f(): 3 print "Entering", f. name 4 f() 5 print "Exited", f. name 6 return new_f 7 8 @entryexit 9 def func1(): 10 print "-inside func1-" 11 12 @entryexit 13 def func2(): 14 print "-inside func2-" 15 16 >>> func1() 17 18 Entering func1 19 -inside func1-20 Exited func1 21 22 >>> func2() 23 24 Entering func2 25 -inside func2-26 Exited func2 16

Data Models How well does a model capture the meaning of data and inter-data relationships? 17

University database )1,.)*!?*5),%! <2.0%(! @/+.2&A! )&.3*%&)H /3;()(%G 7&.3*%&! 0/I,5(%GH(%! /3;(),5 C5/3D7&.3*%&! $%)&5.+&,5 E*)*/5+'*5 +2/))*)H &*/+'(%G! (%)&5.+&,5! 3*1&H)&.3*%&) 3*1&H! 6/+.2&(*) 6/+.2&AH3*1&! B*1/5&0*%&! +2/))*)H 5*G()&*5*3 )&.3*%&)H *%5,22*3! +,22*G*H3*1/5&0*%&)! F,22*G*! +,22*G*H(% 7*+&(,%! KLCLMB! 3*1&H/& +,.5)*)H,66*5*3 <! <!()!/!+2/))!%/0*! <! J! <!()!/!).4+2/))!,6!J! N N!()!/!)(%G2*!;/2.*3!LO<! N N!()!/!0.2&(!;/2.*3!LO<! F,.5)*! Diagram created by: Saurabh Boyed )*+&(,%)H,6! +,.5)*H,6! 18

Relational data models Based on mathematical relations and firstorder predicate logic By far the most popular model for data management in use today Complex relationships are difficult to describe and slow to query over Referential integrity issues Biased toward the implementation 19

Semantic data models Representation of how data relates to the real world is part of the model (rather than the implementation as per SQL databases) Class hierarchies and inheritance similar to objected oriented programming paradigm Better for models with hierarchic structure or entity relationships (many-to-many relationships are easy to define and query) 20

Hacking Antlr and Jython /grammar/python.g sqlquery_stmt, sqlinsert_statement - define new types of small_stmt; decompose SQL statements into respective lists; expressions in where clause stored in separate list /src/org/python/antlr/ast/sqlquery.java SQLGetQueryProcessor - call SQLProcessQuery of current SQLQuery object SQLProcessQuery - retrieve query results from database /src/org/python/compiler/codecompiler.java visitsqlquery - push where clause expressions onto runtime stack for evaluation /src/org/python/core/py.java sqlwhereclause - store evaluated where clause expression of current SQLQuery sqlwhereclausefinal - callback to SQLQuery.SQLGetQueryProcessor sending back evaluated where clause expression /src/org/python/antlr/ast/visitorbase.java, /src/org/python/antlr/ast/visitorif.java visitsqlquery, visitsqlinsert, visitsimquery - updates to handle new AST types 21

Grammar for sqlquery_stmt SELECT NAME STAR COMMA NAME CAPSFROM NAME COMMA NAME WHERE NAME ASSIGN LESS... ALT_NOTEQUAL expr (boxes with double lines are accepting) CAPSAND CAPSOR NAME ASSIGN LESS... ALT_NOTEQUAL expr 22

References The Python Language Reference Quick reference for Python 2.6 The Jython Wiki The Definitive Guide to Jython SPARQL Query Language for RDF A Semantic Database Management System: SIM Web Database (WDB): A Java Semantic Database 23