License     Codehaus     OpenEJB     OpenJMS     OpenORB     Tyrex     

Old releases
  Release 1.3
  Release 1.3rc1
  Release 1.2

  Reference guide
  Maven 2 support
  Maven 2 archetypes
  DTD & Schemas
  Recent HTML changes
  News Archive
  RSS news feed
  Project Wiki

  Mailing Lists
  Continuous builds
  Prof. services

Related projects
  Spring ORM support
  Spring XML factories
  WS frameworks


XML Code Generator
  XML Code Generator

  First steps
  Using JDO
  JDO Config
  JDO Mapping
  JDO Examples
  Tips & Tricks
  Other Features
  JDO sample JAR

  Schema generator

Advanced JDO
  Trans. & Locks
  Long Trans.
  Nested Attrs.
  Pooling Examples
  Best practice

DDL Generator
  Using DDL Generator
  Ant task
  Type Mapping

  The Examples
  3rd Party Tools
  JDO Tests
  XML Tests

  User stories
  Status, Todo
  Project Name


OQL to SQL translator

Reference: The Java OQL to SQL translator

Type and validity checking
SQL Generation
    Java class files
    Limit Clause


Release 0.9.6:
-Added support for LIMIT clause for MS SQL Server.
-In the case a RDBMS does not support LIMIT/OFFSET clauses, a SyntaxNotSupportedException is thrown.
-Added support for a limit clause and an offset clause. Currently, only HSQL, mySQL and postgreSQL are supported.
-Added an example section.


The Castor OQL implementation is currently in phase 3 of development.

NOTE: This documentation is not yet finished


This document describes an OQL to SQL translator to be added to the Castor JDO Java object Persistence API. The translator will accept OQL queries passed as strings, and generate a parse tree of the OQL. It will then traverse the tree creating the appropriate SQL. The user will then be able to bind parameters to parameterized queries. Type checking will be performed on the bound parameters. When the user executes the query, the system will submit the query to the SQL database, and then postprocess the SQL resultset to create the appropriate result as a Java Object or literal. The current mapping and persistence packages will be used for metadata and RDBMS communication.

Four of the (now defunct) SourceXchange milestones for this project call for java source code. These milestones will be referred to here as phase 1, 2, 3, and 4. There are many possible OQL features that can be supported, but weren't discussed in the proposal or RFP. Many of these are probably unwanted. These additional features are specified as phase 5, which is out of the scope of this SourceXChange project.


The parser will construct a parse tree as output from an OQL query string given as input. The OQL syntax is a subset of the syntax described in the ODMG 3.0 specification section 4.12, with some additional constructs. Following is a description of the supported OQL syntax, and its SQL equivalent.

Certain features of OQL may not be directly translatable to SQL, but may still be supported, by post processing the query. For example, the first() and last() collection functions supported in OQL are not directly translatable to standard SQL, but a resultset can be post-processed to return the appropriate values. Features requiring post-processing of SQL resultsets will be documented as such below.

Currently the OQLQuery checks for correct syntax at the same time as it does type checking and other types of error checking. The new code will involve a multiple pass strategy, with the following passes:

  1. Parse the String query checking for syntax errors, and return a parse tree.
  2. Traverse the parse tree checking for correct types, valid member and method identifiers, and use of features which are unsupported. This pass may also generate some data necessary for creating the SQL.
  3. Traverse the tree one final time, creating the equivalent SQL statement to the OQL Query originally passed.


This section describes the first pass which will be done by the parser. The parser will create a StringTokenizer like this:

StringTokenizer tokenizer
    = new StringTokenizer(oql,
    "\n\r\t,.()[]+-*/<>=:|$", true);
This will create a StringTokenizer with the delimiter characters listed in the second argument, and it will return delimeters as well as tokens. The parser will also create a Vector to be used as a token buffer. As tokens are returned from the StringTokenizer they will be added to the Vector. Older tokens will be removed from the Vector when it reaches a certain size. The Vector will also be modified when the StringTokenizer returns multi character operators as seperate tokens, for example the -> method invocation operator.

The parser will consume tokens from the StringTokenizer, generating a ParseTree. Each ParseTree node will have a nodeType corresponding to its symbol in the OQL syntax. After each node is created it will look at the next token and act acordingly, either modifying its properties (i.e. for DISTINCT property of selectExpr), creating a new child node or returning an error. If the error travels up to the root node of the ParseTree, there is a syntax error in the OQL submitted. At the end of this pass, the ParseTree will contain an apropriate representation of the query, which will be analyzed, and used to create SQL. Below is the modified EBNF which will be the Castor OQL syntax.

query                   ::= selectExpr
                            | expr

selectExpr              ::= select [distinct] 
                            [limitClause [offsetClause]]

projectionAttributes    ::= projectionList
                          | *

projectionList          ::= projection {, projection }

projection              ::= field
                          | expr [as identifier]

fromClause              ::= from iteratorDef {, iteratorDef}

iteratorDef             ::= identifier [ [as ] identifier ]
                          | identifier in identifier

whereClause             ::= where expr

groupClause             ::= group by fieldList {havingClause}

havingClause            ::= having expr

orderClause             ::= order by sortCriteria

limitClause             ::= limit  queryParam

offsetClause            ::= offset queryParam

sortCriteria            ::= sortCriterion {, sortCriterion }

sortCriterion           ::= expr [ (asc | desc) ]

expr                    ::= castExpr

castExpr                ::= orExpr
                            | ( type ) castExpr

orExpr                  ::= andExpr {or andExpr}

andExpr                 ::= quantifierExpr {and quantifierExpr}

quantifierExpr          ::= equalityExpr
                            | for all inClause : equalityExpr
                            | exists inClause : equalityExpr

inClause                ::= identifier in expr

equalityExpr            ::= relationalExpr
                            {(= | !=) 
                                [compositePredicate] relationalexpr }
                            | relationalExpr {like relationalExpr}

relationalExpr          ::= additiveExpr
                            {(< | <= 
                                | > | >= ) 
                                    [ compositePredicate ] additiveExpr }
                                | additiveExpr between 
                                    additiveExpr and additiveExpr

compositePredicate      ::= some | any | all

additiveExpr            ::= multiplicativeExpr
                                {+ multiplicativeExpr}
                            | multiplicativeExpr
                                {- multiplicativeExpr}
                            | multiplicativeExpr
                                {union multiplicativeExpr}
                            | multiplicativeExpr
                                {except multiplicativeExpr}
                            | multiplicativeExpr
                                {|| multiplicativeExpr}

multiplicativeExpr      ::= inExpr {* inExpr}
                            | inExpr {/ inExpr}
                            | inExpr {mod inExpr}
                            | inExpr {intersect inExpr}

inExpr                  ::= unaryExpr {in unaryExpr}

unaryExpr               ::= + unaryExpr
                            | - unaryExpr
                            | abs unaryExpr
                            | not unaryExpr
                            | postfixExpr

postfixExpr             ::= primaryExpr{[ index ]}
                            | primaryExpr
                            {(. | ->)identifier[arglist]}

index                   ::= expr {, expr}
                            | expr : expr

argList                 ::= ([ valueList ])

primaryExpr             ::= conversionExpr
                            | collectionExpr
                            | aggregateExpr
                            | undefinedExpr
                            | collectionConstruction
                            | identifier[ arglist ]
                            | queryParam
                            | literal
                            | ( query )

conversionExpr          ::= listtoset( query )
                            | element( query )
                            | distinct( query )
                            | flatten( query )

collectionExpr          ::= first( query )
                            | last( query )
                            | unique( query )
                            | exists( query )

aggregateExpr           ::= sum( query )
                            | min( query )
                            | max( query )
                            | avg( query )
                            | count(( query | * ))

undefinedExpr           ::= is_undefined( query )
                            | is_defined( query )

fieldList               ::= field {, field}

field                   ::= identifier: expr

collectionConstruction  ::= array([valueList])
                            | set([valueList])
                            | bag([valueList])
                            | list([valueList])
                            | list(listRange)

valueList               ::= expr {, expr}

listRange               ::= expr..expr

queryParam              ::= $[(type)]longLiteral

type                    ::= [unsigned] short
                            | [unsigned] long
                            | long long
                            | float
                            | double
                            | char
                            | string
                            | boolean
                            | octet
                            | enum [identifier.]identifier
                            | date
                            | time
                            | interval
                            | timestamp
                            | set <type>
                            | bag <type>
                            | list <type>
                            | array <type>
                            | dictionary <type, type>
                            | identifier

identifier              ::= letter{letter| digit| _}

literal                 ::= booleanLiteral
                            | longLiteral
                            | doubleLiteral
                            | charLiteral
                            | stringLiteral
                            | dateLiteral
                            | timeLiteral
                            | timestampLiteral
                            | nil
                            | undefined

booleanLiteral          ::= true
                            | false

longLiteral             ::= digit{digit}

doubleLiteral           ::= digit{digit}.digit{digit}
                            [(E | e)[+|-]digit{digit}]

charLiteral             ::= 'character'

stringLiteral           ::= "{character}"

dateLiteral             ::= date

timeLiteral             ::= time

timestampLiteral        ::= timestamp

floatLiteral            ::= digit{digit}.digit{digit}

character               ::= letter
                            | digit
                            | special-character

letter                  ::= A|B|...|Z|

digit                   ::= 0|1|...|9

special-character       ::= ?|_|*|%|\

The following symbols were removed from the standard OQL Syntax for the following reasons:
- andthen: Cannot be implemented in a single SQL query.
- orelse: Same as above.
- import: This is advanced functionality which may be added later. This phase will use the castor mapping mechanism to define the namespace.
- Defined Queries: This is another feature which can be added later. It is unclear where the queries would be stored, and what their scope would be seeing as how this project is an OQL to SQL translator, and not an ODBMS.
- iteratorDef was changed so that all instances of expr were replaced by identifier. This means that the from clause can only contain extent names (class names), rather than any expression. This is the most common case and others could create complicated SQL sub-queries or post-processing requirements.
- objectConstruction and structConstruction were removed. What is the scope of the constructed object or struct, and how is a struct defined in Java?

The following symbols were added or modified.
- between added to relationalExpr.
- Optional type specification added to queryParam.

The rest of the standard OQL syntax remains unchanged. Certain syntactically correct queries may not be supported in Castor. For example, top level expressions which do not contain a selectExpr anywhere in the query may not be supported. This will be discussed further in the next section.

Type and validity checking

The first pass over the ParseTree will do type checking, and create some structures used in the SQL generation pass. It will also check whether the identifiers used are valid, and whether the query uses unsupported features. The following table describes each type of node in the ParseTree, and how it will be processed in the first pass.

- A query whose top level element is an expr, rather than a selectExpr will not be supported within the scope of this project. These queries can either be stated as a selectExpr, like aggregateExpr's, or they would require post-processing of the SQL results, like element()first() and last().
Phase 5
- select * will return a Collection of Arrays of Objects.
Phase 5
- Selecting multiple fields will return a Collection of Arrays of Objects.
- When there are multiple fields selected, a list of field names and aliases will be kept for checking validity of expr's in the whereClause, groupClause, and orderClause.
Phase 5
- Alias identifier will be stored.
- expr in projection may only be identifier, without an arglist.
Phase 1
- expr in projection may only be identifier (with optional argList), aggregateExpr, undefinedExpr, and postfixExpr (for selecting fields and accessors).
- The subquery in aggregateExpr and undefinedExpr can be identifier (with optional arglist), or postfixExpr for applying these functions to fields and accessors.
- If an identifier before the . or -> contains an arglist, it will be considered a SQL function, and passed through to the RDBMS.
- If the postfixExpr contains one of the above operators, the mapping mechanism will be used to determine if the path expression is valid and to generate a list of required join tables, using the manyKey and manyTable from the JDOFieldDescriptor.
Phase 2
- The class of the extent being selected from will be stored, and ClassDescriptor objects will be instantiated.
Phase 1
- expr in whereClause may only contain orExpr, andExpr, equalityExpr (without compositePredicate), relationalExpr, additiveExpr (without set operators union and except), multiplicativeExpr (without set operator intersect), unaryExpr, postFixExpr (must be only primaryExpr, no array or property reference or method calls).
- primaryExpr may only contain identifier (without an argList), literal and queryParam. Identifier will be checked against object name and alias in projectionList.
- For equalityExpr, relationalExpr, aditiveExpr, multiplicativeExpr, the left side and right side expr's must evaluate to comparable types.
- For unaryExpr, simple type checking for numerical or character based types will be performed.
- If the operands for any of the relational, equality, additive, multiplicative, or unary operators is a query parameter, an expected type will be determined. If the parameter included a specified type which is incompatible with the system determined type, an error will be generated.
Phase 1
- Support for built in OQL functions will be added to the whereClause: is_defined, is_undefined.
- inExpr will be supported in whereClause.
- inExpr will only allow collectionConstruction for the right side argument to in. No subQueries will be allowed.
Phase 2
- identifiers will be able to contain an optional arglist. If the arglist is before a . or -> the identifier will be considered a SQL function and will be passed through to the DBMS. Otherwise, the identifier will be for an accessor method, or a property name.
- Accessor methods and property references will cause a check through the ClassDescriptor and FieldDescriptors for the object type, and the required join tables.
Phase 3
- compositePredicate will be suported in equalityExpr.
- exists(query) will be supported.
- quantifierExpr will support for all and exists.
- Subqueries will be supported on the right side of the in operator
Phase 4
groupClause, havingClause
- Will identify appropriate fields in SQL schema for each expr.
- aggregateExpr will be supported.
- Only expr's which translate to SQL columns whcih are already being selected will be supported.
Phase 4
- May only contain expr's which translate into SQL columns which are already being selected.
Phase 3

SQL Generation

After the first pass, the ParseTree is free of errors, and ready for the SQL generation step. The existing implementation of the OQLParser uses the persistence API for SQL generation. This API lacks the necessary features to generate SQL from any OQL. The SQLEngine class which implements Persistence is used to create a JDBCQueryExpression. The SQL is derived from the finder, which is a JDBCQueryExpression produced by the SQLEngine. The problem is that the SQLEngine only supports single objects. It cannot generate SQL for path expressions like this:

select p.address from Person p 
This query requires a SQL statement like this:
select address.* from person, address
    where person.address_id = address.address_id

The buildFinder method should not be used to generate a queryExpression. The SQLEngine should be used to get a ClassDescriptor, and to create a new QueryExpression. The OQLParser should use the methods in the QueryExpression to generate the SQL. The JDBCQueryExpression which is an implementation of QueryExpression is also lacking in necessary features. This class should continue to be used, but the following features will need to be added:

For adding something to select without specifying the tablename, for use with functions (i.e. select count(*))
For when the table has to be added manually.
Add a condition created outside the class, for nested expressions, and other expressions that are not of the form table.column op table.column.
Used for select distinct.
addOrderColumn(String tableName, String columnName, boolean desc)
Used for order by
Used for group by
Used for having.

The following table lists each type of tree node, and how it will be processed in the SQL generation pass.

- distinct in the selectExpr will result in a call to setDistinct(true) in the queryExpr.
Phase 2
- The queryExpr will be populated with the columns and tables necessary to retrieve the object. This will use code similar to SQLEngine.addLoadSql(...).
Phase 1
- aggregateExpr and SQL functions will be passed to addColumn.
- undefinedExpr will be translated to is null and is not null
- postfixExpr (for selecting fields and accessors) will result in a different group of select expressions and "from tables" being generated.
Phase 2
- Entire expr in where clause will be translated, and then added to the QueryExpr, using a single call to addCondition(String), and multiple calls to addTable(String).
Phase 1
- is_defined() will translate into is not null and is_undefined() will translate into is null.
- inExpr will translate directly, with the collectionConstruction removed.
Phase 2
- compositePredicate and exists(query) translate directly to SQL.
- For quantifierExpr, exists will translate into an exists() SQL subquery. for all will translate into the contrapositive(?) exists query, for example:
for all x in teachers: = 'Nis'
translates to:
not exists (select * from teachers
    where name != 'Nis')
Phase 4


Please see the OQL section of the JDO FAQ.


The Parser and ParseTree classes will be improved through the phases of this project. The top level of OQL to SQL translation will look very simple, like this:

OQLParser parser = new OQLParser(query);
ParseTree pt = parser.getParseTree();
//the SQL generation pass
_expr = pt.getQueryExpr();
These methods will have some additional parameters passed for storing and retrieving data relevant to the query. Following is a table containing a list of what will be introduced in each coding phase of the project.

Phase 1
- New parser structure which generates symbol tree
- Parser still supports only limited OQL
- selected field aliases
- whereClause supports or, and, equality, additive, multiplicative, and unary Operators.
- support for specifying parameter types
- check specified parameter type against system determined type
- specifying ordered parameters.
- JDBCQueryExpression must support addCondition(String condition)
Phase 2
- Distinct keyword will be supported in selectExpr
- aggregateExpr and undefinedExpr supported in projection (Select statement)
- isDefined, isUndefined in whereClause
- inExpr in whereClause
- type checking/conversion in bind()
- fields and accessors in the projection. Mapping mechanism may need some additional features.
- SQL functions in the projection
- order by
Phase 3
- fields and accessors in whereClause
- SQL functions in the where clause
Phase 4
- sub queries
- exists()
- compositePredicate
- quantifierExpr's: for all and exists
- group by
- having
Phase 5:
- expr as top level symbol
- queries selecting multiple fields or as comma separated list or *


Please find below various examples of OQL queries using the Java class files as outlined below.

Java class files

The following fragment shows the Java class declaration for the Product class:

package myapp;

public class Product 
    private int       _id;

    private String    _name; 

    private float     _price; 

    private ProductGroup _group;

    public int getId() { ... }

    public void setId( int anId ) { ... }

    public String getName() { ... }

    public void setName( String aName ) { ... }

    public float getPrice() { ... }

    public void setPrice( float aPrice ) { ... }

    public ProductGroup getProductGroup() { ... }

    public void setProductGroup( ProductGroup aProductGroup ) { ... }

The following fragment shows the Java class declaration for the ProductGroup class:

public class ProductGroup

    private int       _id;

    private String    _name;

    public int getId() { ... }

    public void setId( int id ) { ... }

    public String getName() { ... }

    public void setName( String name ) { ... }


Limit Clause

On a selected number of RDBMS, Castor JDO now supports the use of LIMIT/OFFSET clauses.

As per this release, the following RDBMS have full/partial support for this feature.

Oracle - 1)2)YesYes

1) Oracle has full support for LIMIT/OFFSET clauses for release 8.1.6 and later.

2) For the LIMIT/OFFSET clauses to work properly the OQL query is required to include a ORDER BY clause.

The following code fragment shows an OQL query that uses the LIMIT keyword to select the first 10 ProductGroup instances.

query = db.getOQLQuery("select p from ProductGroup as p LIMIT $1");

Below is the same OQL query again, restricting the number of ProductGroup instances returned to 10, though this time it is specified that the ProductGroup instances 11 to 20 should be returned.

query = db.getOQLQuery ("select p from ProductGroup as p LIMIT $1 OFFSET $2");

In the case a RDBMS does not support LIMIT/OFFSET clauses, a SyntaxNotSupportedException will be thrown.


Copyright © 1999-2005 ExoLab Group, Intalio Inc., and Contributors. All rights reserved.
Java, EJB, JDBC, JNDI, JTA, Sun, Sun Microsystems are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and in other countries. XML, XML Schema, XSLT and related standards are trademarks or registered trademarks of MIT, INRIA, Keio or others, and a product of the World Wide Web Consortium. All other product names mentioned herein are trademarks of their respective owners.