OQL to SQL translator

License Codehaus OpenEJB OpenJMS OpenORB Tyrex

Old releases

General

Release 1.3

Release 1.3rc1

Release 1.2

Main

Home

About

Features

Download

Dependencies

Reference guide

Publications

JavaDoc

Maven 2 support

Maven 2 archetypes

DTD & Schemas

Recent HTML changes

News Archive

RSS news feed

Project Wiki

Development/Support

Mailing Lists

SVN/JIRA

Contributing

Support

Continuous builds

Prof. services

Related projects

Spring ORM support

Spring XML factories

WS frameworks

XML

XML Code Generator

JDO

Introduction

First steps

Using JDO

JDO Config

Types

JDO Mapping

JDO FAQ

JDO Examples

JDO HOW-TOs

Tips & Tricks

Other Features

JDO sample JAR

Tools

Schema generator

Advanced JDO

Caching

OQL

Trans. & Locks

Design

KeyGen

Long Trans.

Nested Attrs.

Pooling Examples

LOBs

Best practice

DDL Generator

Using DDL Generator

Properties

Ant task

Type Mapping

The Examples

3rd Party Tools

JDO Tests

XML Tests

Configuration

About

License

User stories

Contributors

Marketplace

Status, Todo

Changelog

Library

Contact

Project Name

OQL to SQL translator

Reference: The Java OQL to SQL translator

News
Status
Introduction
Overview
Syntax
Type and validity checking
SQL Generation
OQL FAQ
Summary
Examples
Java class files
Limit Clause

News

Release 0.9.6:

- Added support for LIMIT clause for MS SQL Server.

- In the case a RDBMS does not support LIMIT/OFFSET clauses, a SyntaxNotSupportedException is thrown.

- Added support for a limit clause and an offset clause. Currently, only HSQL, mySQL and postgreSQL are supported.

- Added an example section.

Status

The Castor OQL implementation is currently in phase 3 of development.

NOTE: This documentation is not yet finished

Introduction

This document describes an OQL to SQL translator to be added to the Castor JDO Java object Persistence API. The translator will accept OQL queries passed as strings, and generate a parse tree of the OQL. It will then traverse the tree creating the appropriate SQL. The user will then be able to bind parameters to parameterized queries. Type checking will be performed on the bound parameters. When the user executes the query, the system will submit the query to the SQL database, and then postprocess the SQL resultset to create the appropriate result as a Java Object or literal. The current mapping and persistence packages will be used for metadata and RDBMS communication.

Four of the (now defunct) SourceXchange milestones for this project call for java source code. These milestones will be referred to here as phase 1, 2, 3, and 4. There are many possible OQL features that can be supported, but weren't discussed in the proposal or RFP. Many of these are probably unwanted. These additional features are specified as phase 5, which is out of the scope of this SourceXChange project.

Overview

The parser will construct a parse tree as output from an OQL query string given as input. The OQL syntax is a subset of the syntax described in the ODMG 3.0 specification section 4.12, with some additional constructs. Following is a description of the supported OQL syntax, and its SQL equivalent.

Certain features of OQL may not be directly translatable to SQL, but may still be supported, by post processing the query. For example, the first() and last() collection functions supported in OQL are not directly translatable to standard SQL, but a resultset can be post-processed to return the appropriate values. Features requiring post-processing of SQL resultsets will be documented as such below.

Currently the OQLQuery checks for correct syntax at the same time as it does type checking and other types of error checking. The new code will involve a multiple pass strategy, with the following passes:

Parse the String query checking for syntax errors, and return a parse tree.
Traverse the parse tree checking for correct types, valid member and method identifiers, and use of features which are unsupported. This pass may also generate some data necessary for creating the SQL.
Traverse the tree one final time, creating the equivalent SQL statement to the OQL Query originally passed.

Syntax

This section describes the first pass which will be done by the parser. The parser will create a StringTokenizer like this:

StringTokenizer tokenizer
    = new StringTokenizer(oql,
    "\n\r\t,.()[]+-*/<>=:|$", true);

This will create a StringTokenizer with the delimiter characters listed in the second argument, and it will return delimeters as well as tokens. The parser will also create a Vector to be used as a token buffer. As tokens are returned from the StringTokenizer they will be added to the Vector. Older tokens will be removed from the Vector when it reaches a certain size. The Vector will also be modified when the StringTokenizer returns multi character operators as seperate tokens, for example the -> method invocation operator.

The parser will consume tokens from the StringTokenizer, generating a ParseTree. Each ParseTree node will have a nodeType corresponding to its symbol in the OQL syntax. After each node is created it will look at the next token and act acordingly, either modifying its properties (i.e. for DISTINCT property of selectExpr), creating a new child node or returning an error. If the error travels up to the root node of the ParseTree, there is a syntax error in the OQL submitted. At the end of this pass, the ParseTree will contain an apropriate representation of the query, which will be analyzed, and used to create SQL. Below is the modified EBNF which will be the Castor OQL syntax.

query                   ::= selectExpr
                            | expr

selectExpr              ::= select [distinct] 
                            projectionAttributes
                            fromClause
                            [whereClause]
                            [groupClause]
                            [orderClause]
                            [limitClause [offsetClause]]

projectionAttributes    ::= projectionList
                          | *

projectionList          ::= projection {, projection }

projection              ::= field
                          | expr [as identifier]

fromClause              ::= from iteratorDef {, iteratorDef}

iteratorDef             ::= identifier [ [as ] identifier ]
                          | identifier in identifier

whereClause             ::= where expr

groupClause             ::= group by fieldList {havingClause}

havingClause            ::= having expr

orderClause             ::= order by sortCriteria

limitClause             ::= limit  queryParam

offsetClause            ::= offset queryParam

sortCriteria            ::= sortCriterion {, sortCriterion }

sortCriterion           ::= expr [ (asc | desc) ]

expr                    ::= castExpr

castExpr                ::= orExpr
                            | ( type ) castExpr

orExpr                  ::= andExpr {or andExpr}

andExpr                 ::= quantifierExpr {and quantifierExpr}

quantifierExpr          ::= equalityExpr
                            | for all inClause : equalityExpr
                            | exists inClause : equalityExpr

inClause                ::= identifier in expr

equalityExpr            ::= relationalExpr
                            {(= | !=) 
                                [compositePredicate] relationalexpr }
                            | relationalExpr {like relationalExpr}

relationalExpr          ::= additiveExpr
                            {(< | <= 
                                | > | >= ) 
                                    [ compositePredicate ] additiveExpr }
                                | additiveExpr between 
                                    additiveExpr and additiveExpr

compositePredicate      ::= some | any | all

additiveExpr            ::= multiplicativeExpr
                                {+ multiplicativeExpr}
                            | multiplicativeExpr
                                {- multiplicativeExpr}
                            | multiplicativeExpr
                                {union multiplicativeExpr}
                            | multiplicativeExpr
                                {except multiplicativeExpr}
                            | multiplicativeExpr
                                {|| multiplicativeExpr}

multiplicativeExpr      ::= inExpr {* inExpr}
                            | inExpr {/ inExpr}
                            | inExpr {mod inExpr}
                            | inExpr {intersect inExpr}

inExpr                  ::= unaryExpr {in unaryExpr}

unaryExpr               ::= + unaryExpr
                            | - unaryExpr
                            | abs unaryExpr
                            | not unaryExpr
                            | postfixExpr

postfixExpr             ::= primaryExpr{[ index ]}
                            | primaryExpr
                            {(. | ->)identifier[arglist]}

index                   ::= expr {, expr}
                            | expr : expr

argList                 ::= ([ valueList ])

primaryExpr             ::= conversionExpr
                            | collectionExpr
                            | aggregateExpr
                            | undefinedExpr
                            | collectionConstruction
                            | identifier[ arglist ]
                            | queryParam
                            | literal
                            | ( query )

conversionExpr          ::= listtoset( query )
                            | element( query )
                            | distinct( query )
                            | flatten( query )

collectionExpr          ::= first( query )
                            | last( query )
                            | unique( query )
                            | exists( query )

aggregateExpr           ::= sum( query )
                            | min( query )
                            | max( query )
                            | avg( query )
                            | count(( query | * ))

undefinedExpr           ::= is_undefined( query )
                            | is_defined( query )

fieldList               ::= field {, field}

field                   ::= identifier: expr

collectionConstruction  ::= array([valueList])
                            | set([valueList])
                            | bag([valueList])
                            | list([valueList])
                            | list(listRange)

valueList               ::= expr {, expr}

listRange               ::= expr..expr

queryParam              ::= $[(type)]longLiteral

type                    ::= [unsigned] short
                            | [unsigned] long
                            | long long
                            | float
                            | double
                            | char
                            | string
                            | boolean
                            | octet
                            | enum [identifier.]identifier
                            | date
                            | time
                            | interval
                            | timestamp
                            | set <type>
                            | bag <type>
                            | list <type>
                            | array <type>
                            | dictionary <type, type>
                            | identifier

identifier              ::= letter{letter| digit| _}

literal                 ::= booleanLiteral
                            | longLiteral
                            | doubleLiteral
                            | charLiteral
                            | stringLiteral
                            | dateLiteral
                            | timeLiteral
                            | timestampLiteral
                            | nil
                            | undefined

booleanLiteral          ::= true
                            | false

longLiteral             ::= digit{digit}

doubleLiteral           ::= digit{digit}.digit{digit}
                            [(E | e)[+|-]digit{digit}]

charLiteral             ::= 'character'

stringLiteral           ::= "{character}"

dateLiteral             ::= date
                                'longliteral-longliteral-longliteral'

timeLiteral             ::= time
                                'longliteral:longLiteral:floatLiteral'

timestampLiteral        ::= timestamp
                                'longLiteral-longLiteral-longLiteral
                                longliteral:longLiteral:floatLiteral'

floatLiteral            ::= digit{digit}.digit{digit}

character               ::= letter
                            | digit
                            | special-character

letter                  ::= A|B|...|Z|
                            a|b|...|z

digit                   ::= 0|1|...|9

special-character       ::= ?|_|*|%|\

The following symbols were removed from the standard OQL Syntax for the following reasons:
- andthen: Cannot be implemented in a single SQL query.

- orelse: Same as above.

- import: This is advanced functionality which may be added later. This phase will use the castor mapping mechanism to define the namespace.

- Defined Queries: This is another feature which can be added later. It is unclear where the queries would be stored, and what their scope would be seeing as how this project is an OQL to SQL translator, and not an ODBMS.

- iteratorDef was changed so that all instances of expr were replaced by identifier. This means that the from clause can only contain extent names (class names), rather than any expression. This is the most common case and others could create complicated SQL sub-queries or post-processing requirements.

- objectConstruction and structConstruction were removed. What is the scope of the constructed object or struct, and how is a struct defined in Java?

The following symbols were added or modified.
- between added to relationalExpr.

- Optional type specification added to queryParam.

The rest of the standard OQL syntax remains unchanged. Certain syntactically correct queries may not be supported in Castor. For example, top level expressions which do not contain a selectExpr anywhere in the query may not be supported. This will be discussed further in the next section.

Type and validity checking

The first pass over the ParseTree will do type checking, and create some structures used in the SQL generation pass. It will also check whether the identifiers used are valid, and whether the query uses unsupported features. The following table describes each type of node in the ParseTree, and how it will be processed in the first pass.

expr
- A query whose top level element is an expr, rather than a selectExpr will not be supported within the scope of this project. These queries can either be stated as a selectExpr, like aggregateExpr's, or they would require post-processing of the SQL results, like element()first() and last().
Phase 5

projectionAttributes
- select * will return a Collection of Arrays of Objects.
Phase 5

projectionList
- Selecting multiple fields will return a Collection of Arrays of Objects.

- When there are multiple fields selected, a list of field names and aliases will be kept for checking validity of expr's in the whereClause, groupClause, and orderClause.

Phase 5

projection
- Alias identifier will be stored.

- expr in projection may only be identifier, without an arglist.

Phase 1

projection
- expr in projection may only be identifier (with optional argList), aggregateExpr, undefinedExpr, and postfixExpr (for selecting fields and accessors).

- The subquery in aggregateExpr and undefinedExpr can be identifier (with optional arglist), or postfixExpr for applying these functions to fields and accessors.

- If an identifier before the . or -> contains an arglist, it will be considered a SQL function, and passed through to the RDBMS.

- If the postfixExpr contains one of the above operators, the mapping mechanism will be used to determine if the path expression is valid and to generate a list of required join tables, using the manyKey and manyTable from the JDOFieldDescriptor.
Phase 2

fromClause
- The class of the extent being selected from will be stored, and ClassDescriptor objects will be instantiated.
Phase 1

whereClause
- expr in whereClause may only contain orExpr, andExpr, equalityExpr (without compositePredicate), relationalExpr, additiveExpr (without set operators union and except), multiplicativeExpr (without set operator intersect), unaryExpr, postFixExpr (must be only primaryExpr, no array or property reference or method calls).

- primaryExpr may only contain identifier (without an argList), literal and queryParam. Identifier will be checked against object name and alias in projectionList.

- For equalityExpr, relationalExpr, aditiveExpr, multiplicativeExpr, the left side and right side expr's must evaluate to comparable types.

- For unaryExpr, simple type checking for numerical or character based types will be performed.

- If the operands for any of the relational, equality, additive, multiplicative, or unary operators is a query parameter, an expected type will be determined. If the parameter included a specified type which is incompatible with the system determined type, an error will be generated.
Phase 1

whereClause
- Support for built in OQL functions will be added to the whereClause: is_defined, is_undefined.

- inExpr will be supported in whereClause.

- inExpr will only allow collectionConstruction for the right side argument to in. No subQueries will be allowed.
Phase 2

whereClause
- identifiers will be able to contain an optional arglist. If the arglist is before a . or -> the identifier will be considered a SQL function and will be passed through to the DBMS. Otherwise, the identifier will be for an accessor method, or a property name.

- Accessor methods and property references will cause a check through the ClassDescriptor and FieldDescriptors for the object type, and the required join tables.
Phase 3

whereClause
- compositePredicate will be suported in equalityExpr.

- exists(query) will be supported.

- quantifierExpr will support for all and exists.

- Subqueries will be supported on the right side of the in operator
Phase 4

groupClause, havingClause
- Will identify appropriate fields in SQL schema for each expr.

- aggregateExpr will be supported.

- Only expr's which translate to SQL columns whcih are already being selected will be supported.
Phase 4

orderClause
- May only contain expr's which translate into SQL columns which are already being selected.
Phase 3

SQL Generation

After the first pass, the ParseTree is free of errors, and ready for the SQL generation step. The existing implementation of the OQLParser uses the persistence API for SQL generation. This API lacks the necessary features to generate SQL from any OQL. The SQLEngine class which implements Persistence is used to create a JDBCQueryExpression. The SQL is derived from the finder, which is a JDBCQueryExpression produced by the SQLEngine. The problem is that the SQLEngine only supports single objects. It cannot generate SQL for path expressions like this:

select p.address from Person p

This query requires a SQL statement like this:

select address.* from person, address
    where person.address_id = address.address_id

The buildFinder method should not be used to generate a queryExpression. The SQLEngine should be used to get a ClassDescriptor, and to create a new QueryExpression. The OQLParser should use the methods in the QueryExpression to generate the SQL. The JDBCQueryExpression which is an implementation of QueryExpression is also lacking in necessary features. This class should continue to be used, but the following features will need to be added:

addColumn(String): For adding something to select without specifying the tablename, for use with functions (i.e. select count(*))
addTable(String): For when the table has to be added manually.
addCondition(String): Add a condition created outside the class, for nested expressions, and other expressions that are not of the form table.column op table.column.
setDistinct(boolean): Used for select distinct.
addOrderColumn(String tableName, String columnName, boolean desc): Used for order by
addGroupExpr(String): Used for group by
addHavingExpr(String): Used for having.

The following table lists each type of tree node, and how it will be processed in the SQL generation pass.

selectExpr
- distinct in the selectExpr will result in a call to setDistinct(true) in the queryExpr.
Phase 2

projection
- The queryExpr will be populated with the columns and tables necessary to retrieve the object. This will use code similar to SQLEngine.addLoadSql(...).
Phase 1

projection
- aggregateExpr and SQL functions will be passed to addColumn.

- undefinedExpr will be translated to is null and is not null

- postfixExpr (for selecting fields and accessors) will result in a different group of select expressions and "from tables" being generated.
Phase 2

whereClause
- Entire expr in where clause will be translated, and then added to the QueryExpr, using a single call to addCondition(String), and multiple calls to addTable(String).
Phase 1

whereClause
- is_defined() will translate into is not null and is_undefined() will translate into is null.

- inExpr will translate directly, with the collectionConstruction removed.
Phase 2

whereClause
- compositePredicate and exists(query) translate directly to SQL.

- For quantifierExpr, exists will translate into an exists() SQL subquery. for all will translate into the contrapositive(?) exists query, for example:
for all x in teachers: x.name = 'Nis'
translates to:
not exists (select * from teachers where name != 'Nis')

Phase 4

OQL FAQ

Please see the OQL section of the JDO FAQ.

Summary

The Parser and ParseTree classes will be improved through the phases of this project. The top level of OQL to SQL translation will look very simple, like this:

OQLParser parser = new OQLParser(query);
ParseTree pt = parser.getParseTree();
pt.checkPass();
//the SQL generation pass
_expr = pt.getQueryExpr();

These methods will have some additional parameters passed for storing and retrieving data relevant to the query. Following is a table containing a list of what will be introduced in each coding phase of the project.

Phase 1
- New parser structure which generates symbol tree

- Parser still supports only limited OQL

- selected field aliases

- whereClause supports or, and, equality, additive, multiplicative, and unary Operators.

- support for specifying parameter types

- check specified parameter type against system determined type

- specifying ordered parameters.

- JDBCQueryExpression must support addCondition(String condition)

Phase 2
- Distinct keyword will be supported in selectExpr

- aggregateExpr and undefinedExpr supported in projection (Select statement)

- isDefined, isUndefined in whereClause

- inExpr in whereClause

- type checking/conversion in bind()

- fields and accessors in the projection. Mapping mechanism may need some additional features.

- SQL functions in the projection

- order by

Phase 3
- fields and accessors in whereClause

- SQL functions in the where clause

Phase 4
- sub queries

- exists()

- compositePredicate

- quantifierExpr's: for all and exists

- group by

- having

Phase 5:
- expr as top level symbol

- queries selecting multiple fields or as comma separated list or *

Examples

Please find below various examples of OQL queries using the Java class files as outlined below.

Java class files

The following fragment shows the Java class declaration for the Product class:

package myapp;

public class Product 
{
    private int       _id;

    private String    _name; 

    private float     _price; 

    private ProductGroup _group;


    public int getId() { ... }

    public void setId( int anId ) { ... }

    public String getName() { ... }

    public void setName( String aName ) { ... }

    public float getPrice() { ... }

    public void setPrice( float aPrice ) { ... }

    public ProductGroup getProductGroup() { ... }

    public void setProductGroup( ProductGroup aProductGroup ) { ... }
}

The following fragment shows the Java class declaration for the ProductGroup class:



public class ProductGroup
{

    private int       _id;

    private String    _name;

    public int getId() { ... }

    public void setId( int id ) { ... }

    public String getName() { ... }

    public void setName( String name ) { ... }

}

Limit Clause

On a selected number of RDBMS, Castor JDO now supports the use of LIMIT/OFFSET clauses.

As per this release, the following RDBMS have full/partial support for this feature.

RDBMS	LIMIT	OFFSET
postgreSQL	Yes	Yes
mySQL	Yes	Yes
Oracle - 1)2)	Yes	Yes
HSQL	Yes	Yes
MS SQL	Yes	-
DB2	Yes	-

1) Oracle has full support for LIMIT/OFFSET clauses for release 8.1.6 and later.

2) For the LIMIT/OFFSET clauses to work properly the OQL query is required to include a ORDER BY clause.

The following code fragment shows an OQL query that uses the LIMIT keyword to select the first 10 ProductGroup instances.

query = db.getOQLQuery("select p from ProductGroup as p LIMIT $1");
query.bind(10);

Below is the same OQL query again, restricting the number of ProductGroup instances returned to 10, though this time it is specified that the ProductGroup instances 11 to 20 should be returned.

query = db.getOQLQuery ("select p from ProductGroup as p LIMIT $1 OFFSET $2");
query.bind(10);

In the case a RDBMS does not support LIMIT/OFFSET clauses, a SyntaxNotSupportedException will be thrown.

Copyright © 1999-2005 ExoLab Group, Intalio Inc., and Contributors. All rights reserved.

Java, EJB, JDBC, JNDI, JTA, Sun, Sun Microsystems are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and in other countries. XML, XML Schema, XSLT and related standards are trademarks or registered trademarks of MIT, INRIA, Keio or others, and a product of the World Wide Web Consortium. All other product names mentioned herein are trademarks of their respective owners.