ll.pysql – Oracle import script

Overview

The module/script pysql can be used to import data into an Oracle database. It reads pysql files which are an extension of normal Oracle SQL files.

A PySQL file can contain different types of commands.

SQL commands

A PySQL file may contain normal SQL commands. For the pysql script to be able to execute these commands they must be terminated with a comment line -- @@@. pysql will strip off a trailing ; or / from the command and execute it. Any exception that is raised as a result of executing the command will stop the script and be reported. This is in contrast to how sqlplus executes SQL commands. sqlplus would continue after an error and exit with status code 0 even if there were errors. (It is also possible to explicitely ignore any exception raised by the command by specifying a different exception handling mode.)

A PySQL file that only contains SQL commands is still a valid SQL file from the perspective of Oracle, so it still can be executed via sqlplus.

Literal Python blocks

A literal Python block starts with a line that only contains #>>> and ends with a line that only contains #<<<. Python code within the block gets executed when the block is encountered. The following objects are available within the block as global variables:

sqlexpr

Can be used to specify that an argument for a procedure should be an SQL expression instead of a Python value or a var object;

datetime

Python’s datetime module;

connection

The active database connection (or None if there is no active database connection).

Furthermore all PySQL commands (see below) are available.

Variables that get set within a literal Python block will be available (and retain their value) in subsequent literal Python blocks or other PySQL commands.

PySQL commands

A PySQL file may also contain PySQL commands. A PySQL command looks and behaves like a Python function call. This function call must either be contained in a single line (i.e. start with name( and end with ) or it must start with a line that only contains name( and end at a line that only contains ). (name must be the name of a PySQL command).

The following commands are available:

include

Includes another PySQL file;

connect

Connects to a database;

disconnect

Disconnects from the active database connection;

procedure

Call a procedure in the database (and handles OUT parameter via var objects);

sql

Executes an SQL statement in the database (and handles OUT parameter via var objects);

literalsql

Executes an SQL statement in the database (this is what SQL commands get converted to);

commit

Commits the transaction in the active database connection;

rollback

Rolls back the transaction in the active database connection;

literalpy

Executes Python code (this is what literal Python blocks get converted to);

setvar

Sets a variable;

unsetvar

Deletes a variable;

raiseexceptions

Set the exception handling mode;

pushraiseexceptions

Temporarily modifies the exception handling mode;

popraiseexceptions

Reverts to the previously active exception handling mode;

checkerrors

Checks whether there are invalid database objects;

scp

Creates a file on a remote host via scp;

file

Creates a file on the local machine;

resetsequence

Resets a database sequence to the maximum value of a field in a table;

user_exists

Tests whether a database user exists;

object_exists

Tests whether a database object (table, package, procedure, etc.) exists;

constraint_exists

Tests whether a database constraint (primary key, foriegn key, unique or check constraint) exists;

drop_types

Drops all database objects of a certain type;

comment

A comment

loadbytes

Loads the binary content of a file;

loadstr

Loads the text content of a file;

var

Marks an argument for a procedure or sql command as being an OUT parameter (or passes the value of the variable in subsequent procedure/sql commands);

env

Returns the value of an environment variable.

Comments

A line starting with # (outside of a SQL command or literal Python block) is considered a comment and will be ignored.

Example

The following is a complete PySQL file that will create a sequence, table and procedure and will call the procedure to insert data into the table:

create sequence person_seq
   increment by 10
   start with 10
   maxvalue 1.0e28
   minvalue 10
   nocycle
   cache 20
   noorder
;

-- @@@

create sequence contact_seq
   increment by 10
   start with 10
   maxvalue 1.0e28
   minvalue 10
   nocycle
   cache 20
   noorder
;

-- @@@

create table person
(
   per_id integer not null,
   per_firstname varchar2(200),
   per_lastname varchar2(200)
);

-- @@@

alter table person add constraint person_pk primary key(per_id);

-- @@@

create table contact
(
   con_id integer not null,
   per_id integer not null,
   con_type varchar2(200),
   con_value varchar2(200)
);

-- @@@

alter table contact add constraint contact_pk primary key(con_id);

-- @@@

create or replace procedure person_insert
(
   c_user in varchar2,
   p_per_id in out integer,
   p_per_firstname in varchar2 := null,
   p_per_lastname in varchar2 := null
)
as
begin
   if p_per_id is null then
      select person_seq.nextval into p_per_id from dual;
   end if;

   insert into person
   (
      per_id,
      per_firstname,
      per_lastname
   )
   values
   (
      p_per_id,
      p_per_firstname,
      p_per_lastname
   );
end;
/

-- @@@

create or replace procedure contact_insert
(
   c_user in varchar2,
   p_con_id in out integer,
   p_per_id in integer := null,
   p_con_type in varchar2 := null,
   p_con_value in varchar2 := null
)
as
begin
   if p_con_id is null then
      select contact_seq.nextval into p_con_id from dual;
   end if;

   insert into contact
   (
      con_id,
      per_id,
      con_type,
      con_value
   )
   values
   (
      p_con_id,
      p_per_id,
      p_con_type,
      p_con_value
   );
end;
/

-- @@@

# import data

procedure(
   'person_insert',
   args=dict(
      c_user='import',
      p_per_id=var('per_id_max'),
      p_per_firstname='Max',
      p_per_lastname='Mustermann',
   )
)

procedure(
   'contact_insert',
   args=dict(
      c_user='import',
      p_per_id=var('per_id_max'),
      p_con_id=var('con_id_max'),
      p_con_type='email',
      p_con_value='max@example.org',
   )
)

file(
   'portrait_{per_id_max}.png',
   b'\x89PNG\r\n\x1a\n...',
)

resetsequence(
   'person_seq',
   table='person',
   field='per_id',
}

checkerrors()

This file can then be imported into an Oracle database with the following command:

python -m ll.pysql data.pysql -d user/pwd@database

This will create two sequences, two tables and two procedures. Then it will import two records, one by calling person_insert and one by calling contact_insert. The PL/SQL equivalent of the above is:

declare
   v_per_id_max integer;
   v_con_id_max integer;
begin
   person_insert(
      per_id=v_per_id_max,
      per_firstname='Max',
      per_lastname='Mustermann'
   );
   contact_insert(
      con_id=v_con_id_max,
      per_id=v_per_id_max,
      con_type='email',
      con_value='max@example.org'
   )
end;

Furthermore it will create one file (named something like portrait_42.png) and reset the sequence person_seq to the maximum value of the field per_id in the table person. Finally it will make sure that no errors exist in the schema.

Multiple database connections

PySQL can handle multiple database connections. New database connections can be opened with the connect command. This command opens a new database connection. Subsequent commands that talk to the database will use this connection until a disconnect command disconnects from the database and reverts to the previous connection (which might not exist). An example looks like this:

connect("user/pwd@db")
procedure("test")
disconnect()

Variables

Variable objects can be used to receive out parameters of procedure calls or SQL statements. A variable object can be specified like this var("foo"). "foo" is the “name” of the variable. When a variable object is passed to a procedure the first time (i.e. the variable object is uninitialized), a cx_Oracle var object will be passed and the resulting value after the call will be stored under the name of the variable. When the variable is used in a later command the stored value will be used instead. (Note that it’s not possible to use the same variable twice in the same procedure call, if it hasn’t been used before, however in later commands this is no problem).

The type of the variable defaults to int, but a different type can be passed when creating the object like this: var("foo", str).

It is also possible to create variable objects via command line parameters.

As a PySQL command is a Python literal, it is possible to use Python expressions inside a PySQL command.

External files

Inside a PySQL command it is possible to load values from external files. The loadbytes command loads a bytes object from an external file like this:

loadbytes("path/to/file.png")

A string can be loaded with the loadstr command like this:

loadstr("path/to/file.txt", encoding="utf-8", errors="replace")

The second and third argument are the encoding and error handling name respectively.

The filename is treated as being relative to the file containing the loadbytes or loadstr call.

This file content can then be used in other PySQL commands (e.g. as parameters in procedure commands, or as file content in scp or file commands).

Command line usage

pysql.py has no external dependencies except for cx_Oracle and can be used as a script for importing a PySQL file into the database (However some commands require ll.orasql). As a script it supports the following command line options:

file

The name of one or more PySQL files that will be read and imported. If no filename is given, commands are read from stdin.

-v, --verbose

Gives different levels of output while data is being imported to the database. The default is no output (unless an exception occurs). Possible modes are: dot (one dot for each command), type (each command type), file (the file names and line numbers from which code gets executed), log (the log messages output by the commands) or full (source code that will be executed and the log messages output by the commands)

-d, --database

The value is an Oracle connectstring to specify the initial database connection that will be used before any additional connect commands.

-z, --summary

Give a summary of the number of commands executed and procedures called.

-r, --rollback

Specifies that transactions should be rolled back at the end of the script run, or when a disconnect command disconnects from the database. The default is to commit at the end or on each disconnect. (But note that DDL in the script will still commit everything up to the DDL statement.)

-s, --scpdirectory

The base directory for scp file copy commands. As files are copied via scp this can be a remote filename (like root@www.example.org:~/uploads/) and must include a trailing /.

-f, --filedirectory

The base directory for file file save commands. It must include a trailing /.

--tabsize

The tab size when PySQL source is printed in full mode.

--context

The number of lines at the start and end of the source code of a block to print in full mode. The default is to print the complete source code.

-D, --define

Can be used multiple times to define variables. Supported formats are:

name

Defines a string variable named name and sets the value to the empty string.

name=value

Defines a string variable named name and sets the value to value.

name:type

Defines a variable named name of type type and sets the value to False, 0, 0.0 or the empty string depending on the type. Supported types are str, bool, int and float.

name:type=value

Defines a variable named name of type type and sets the value to value. For type bool supported values are 0, no, false, False, 1, yes, true and True.

class ll.pysql.Command[source]

Bases: object

The base class of all commands. A Command object is created from a function call in a PySQL file and then immediatetel the method execute() will be called to execute the command.

The only parameters in the call that is supported by all commands are the following:

raiseexceptionsbool (optional)

Specifies whether exceptions that happen during the execution of the command should be reported and terminate the script (True), or should be ignored (False). None (the default) uses the global configuration.

condbool (optional)

Specifies whether this command should be executed or not. If cond is true (the default), the command will be executed, else it won’t.

ll.pysql.register(cls)[source]

Register a Command subclass as a PySQL command.

This is used as a class decorator.

class ll.pysql.include[source]

Bases: ll.pysql.Command

The include command includes another PySQL file. The filename is passed in the first parameter filename. This filename is interpreted as being relative to the directory with the file containing the include command.

For the parameters raiseexceptions and cond see the base class Command.

class ll.pysql.connect[source]

Bases: ll.pysql.Command

The connect command connects to the database given in the connectstring in the parameter connectstring. After the connect command until the matching disconnect command, all commands that talk to the database will use this connection. After a disconnect command pysql will revert back to the previously active database connection. Parameter have the following meaning:

modestring or None (optional)

The connection mode: This can be either 'sysdba' or None.

retryint (optional)

The number of times PySQL tries to get a database connection.

retrydelayint (optional)

The number of seconds to wait between connection tries.

For the parameters raiseexceptions and cond see the base class Command.

class ll.pysql.disconnect[source]

Bases: ll.pysql.Command

The disconnect command disconnects from the active database connection and reverts back to the previously active database connection.

commit specifies whether the transaction should be committed. If commit is None, the default commit mode is used (which can be changed on the command line via the -r/--rollback option).

For the parameters raiseexceptions and cond see the base class Command.

class ll.pysql.procedure[source]

Bases: ll.pysql._SQLCommand

A procedure command calls an Oracle procedure in the database. The following parameters are supported:

namestring (required)

The name of the procedure to be called (This may include . for calling a procedure in a package or one owned by a different user).

argsdictionary (optional)

A dictionary with the names of the parameters as keys and the parameter values as values. PySQL supports all types as values that cx_Oracle supports. In addition to those, three special classes are supported:

  • sqlexpr objects can be used to specify that the paramater should be literal SQL. So e.g. sqlexpr("sysdate") will be the date when the PySQL script was executed.

  • var objects can be used to hold values that are OUT parameters of the procedure. For example on first use of var("foo_10") the value of the OUT parameter will be stored under the key "foo_10". The next time var("foo_10") is encountered the value stored under the key "foo_10" will be passed to the procedure. The type of the variable defaults to int. If a different type is required it can be passed as the second argument to var, e.g. var("foo_10", str).

  • Finally all other commands can be called to get a value (for example the two commands loadbytes and loadstr to load values from external files (as long as they are of type bytes or str). loadbytes("foo/bar.txt") will return with the content of the external file foo/bar.txt (as a bytes object). If a str object is required, loadstr can be used. Encoding info can be passed like this:

    loadstr("foo/bar.txt", encoding="utf-8", errors="replace")
    

For the rest of the parameters see the base class _DatabaseCommand.

class ll.pysql.sql[source]

Bases: ll.pysql._SQLCommand

An sql command directly executes an SQL statement in the Oracle database. The following parameters are supported:

sqlstring (required)

The SQL to be executed. This may contain parameters in the form of :paramname. The values for those parameters will be taken from args.

argsdictionary (optional)

A dictionary with the names of the parameters as keys and the parameter values as values. Similar to procedure calls var, loadbytes and loadstr objects are supported. However sqlexpr objects are not supported (they will be ignored).

For the rest of the parameters see the base class _DatabaseCommand.

class ll.pysql.literalsql[source]

Bases: ll.pysql._SQLCommand

A literalsql is used for SQL that appears literally in the PySQL file. Apart from the sql attribute it supports the parameters raiseexceptions and cond, but those parameters can’t be passed when the literalsql object is created from literal SQL, only when the literalsql command is invoked directly (which won’t work in a PySQL file).

class ll.pysql.commit[source]

Bases: ll.pysql._SQLCommand

A commit command commits the current transaction in the activate database connection (or the one specified via the connection parameter).

For the rest of the parameters see the base class _DatabaseCommand.

class ll.pysql.rollback[source]

Bases: ll.pysql._SQLCommand

A rollback command rolls back the current transaction in the activate database connection (or the one specified via the connection parameter).

For the rest of the parameters see the base class _DatabaseCommand.

class ll.pysql.literalpy[source]

Bases: ll.pysql._DatabaseCommand

A literalpy is used for Python code that appears literally in the PySQL file. Apart from the code attribute it supports the parameters raiseexceptions and cond, but those parameters can’t be passed when the literalpy object is created via a Python block, only when the literalpy command is invoked directly (which won’t work in a PySQL file and doesn’t make much sense anyway, since the functionality of raiseexceptions and cond can be implemented in the Python block itself).

class ll.pysql.setvar[source]

Bases: ll.pysql.Command

The setvar command sets a variable to a fixed value. The following parameters are supported:

namestring (required)

The name of the variable to set.

valueobject (required)

The value of the variable.

For the parameters raiseexceptions and cond see the base class Command.

class ll.pysql.unsetvar[source]

Bases: ll.pysql.Command

The unsetvar command deletes a variable. The parameter name must be given and must contain the name of the variable.

For the parameters raiseexceptions and cond see the base class Command.

class ll.pysql.raiseexceptions[source]

Bases: ll.pysql.Command

The raiseexceptions command changes the global error reporting mode for all subsequent commands. After:

raiseexceptions(False)

for all subsequent commands any exception will be ignored and reported and command execution will continue with the next command.

raiseexceptions(True)

will switch back to aborting the execution of the PySQL script once an exception is encountered.

Note that the global configuration will only be relevant for commands that don’t specify the raiseexceptions parameter themselves.

For the parameters raiseexceptions and cond see the base class Command.

class ll.pysql.pushraiseexceptions[source]

Bases: ll.pysql.Command

The pushraiseexceptions command changes the global error reporting mode for all subsequent commands, but remembers the previous exception handling mode. After:

pushraiseexceptions(False)

for all subsequent commands any exception will be ignored and reported and command execution will continue with the next command. It is possible to switch back to the previous exception handling mode via:

popraiseexceptions()

Note that this global configuration will only be relevant for commands that don’t specify the raiseexceptions parameter themselves.

For the parameters raiseexceptions and cond see the base class Command.

class ll.pysql.popraiseexceptions[source]

Bases: ll.pysql.Command

The popraiseexceptions command restores the previously active exception handling mode (i.e. the one active before the last pushraiseexceptions command).

For the parameters raiseexceptions and cond see the base class Command.

class ll.pysql.checkerrors[source]

Bases: ll.pysql._DatabaseCommand

The checkerrors command checks that there are no compilation errors in the active database schema. If there are, an exception will be raised.

For the rest of the parameters see the base class _DatabaseCommand (but the value of the raiseexceptions key will be ignored).

class ll.pysql.scp[source]

Bases: ll.pysql.Command

The scp command creates a file by copying it via the scp program. The following parameters are supported:

namestring (required)

The name of the file to be created. It may contain format() style specifications containing any variable (for example those that appeared in a procedure or sql command). These specifiers will be replaced by the correct variable values. As these files will be copied via the scp program, ssh file names can be used.

contentbytes (required)

The content of the file to be created. This can also be a loadbytes command to load the content from an external file.

For the parameters raiseexceptions and cond see the base class Command.

class ll.pysql.file[source]

Bases: ll.pysql.Command

The file command creates a file by directly saving it from Python. The following parameters are supported:

namestring (required)

The name of the file to be created. It may contain format() style specifications containing any variable (for example those that appeared in a procedure or sql command). These specifiers will be replaced by the correct variable values.

contentbytes (required)

The content of the file to be created. This can also be a loadbytes command to load the content from an external file.

modeinteger (optional)

The file mode for the new file. If the mode is specified, os.chmod() will be called on the file.

ownerinteger or string (optional)

The owner of the file (as a user name or a uid).

groupinteger or string (optional)

The owning group of the file (as a group name or a gid). If owner or group is given, os.chown() will be called on the file.

For the parameters raiseexceptions and cond see the base class Command.

class ll.pysql.resetsequence[source]

Bases: ll.pysql._DatabaseCommand

The resetsequence command resets a sequence in the database to the maximum value of a field in a table. The following parameters are supported:

sequencestring (required)

The name of the sequence to reset.

tablestring (required)

The name of the table that contains the field.

fieldstring (required)

The name of the field in the table table. The sequence will be reset to a value so that fetching the next value from the sequence will deliver a value that is larger than the maximum value of the field field in the table table.

minvalueinteger (optional, default taken from sequence)

The minimum value for the sequence.

incrementinteger (optional, default taken from sequence)

The increment (i.e. the step size) for the sequence.

For the rest of the parameters see the base class _DatabaseCommand.

class ll.pysql.user_exists[source]

Bases: ll.pysql._DatabaseCommand

The user_exists command returns whether a user with a specified name exists in the database. It supports the following parameters:

namestring (required)

The name of the user to be checked for existence.

For the rest of the parameters see the base class _DatabaseCommand.

class ll.pysql.object_exists[source]

Bases: ll.pysql._DatabaseCommand

The object_exists command returns whether an object with a specified name exists in the database. It supports the following parameters:

namestring (required)

The name of the object to be checked for existence.

ownerstring (optional)

The owner of the object (defaults to the current user if not specified or None).

For the rest of the parameters see the base class _DatabaseCommand.

Note that object_exists won’t test for constraints. For this use constraint_exists.

class ll.pysql.constraint_exists[source]

Bases: ll.pysql._DatabaseCommand

The constraint_exists command returns whether a constraint (i.e. a primary key, foreign key, unique or check constraint) with a specified name exists in the database. It supports the following parameters:

namestring (required)

The name of the object to be checked for existence.

ownerstring (optional)

The owner of the constraint (defaults to the current user if not specified or None).

For the rest of the parameters see the base class _DatabaseCommand.

class ll.pysql.drop_types[source]

Bases: ll.pysql._DatabaseCommand

The drop_types command drops database objects.

Unlike all other commands this command requires the ll.orasql module.

drop_types supports the following parameters:

droplist of strings (optional)

The types of objects to drop (value must be names for ll.orasql object types.

keeplist string (required)

The types of objects to keep (value must be names for ll.orasql object types.

drop and keep are mutually exclusive. When neither of them is specified all database objects will be dropped.

For the rest of the parameters see the base class _DatabaseCommand.

class ll.pysql.comment[source]

Bases: ll.pysql.Command

The comment command does nothing.

class ll.pysql.loadbytes[source]

Bases: ll.pysql.Command

The loadbytes command can be used to load a bytes object from an external file. The following parameters are supported:

filenamestring (required)

The name of the file to be loaded. The filename is treated as being relative to the directory containing the PySQL file that contains loadbytes command.

For the parameters raiseexceptions and cond see the base class Command.

execute(context)[source]

Read the file and return the file content as a bytes object.

class ll.pysql.loadstr[source]

Bases: ll.pysql.Command

The loadstr command can be used to load a str object from an external file. The following parameters are supported:

filenamestring (required)

The name of the file to be loaded. The filename is treated as being relative to the directory containing the PySQL file that contains the the loadstr command.

encodingstring (optional)

The encoding used for decoding the bytes in the file to text.

errorsstring (optional)

The error handling mode for decoding.

For the parameters raiseexceptions and cond see the base class Command.

__init__(filename, *, encoding=None, errors='strict', raiseexceptions=None, cond=True)[source]

Create a new loadbytes object.

execute(context)[source]

Read the file and return the file content as a str object.

class ll.pysql.var[source]

Bases: ll.pysql.Command

var commands are used to mark procedure values that are OUT parameters. On first use the parameter is used as an OUT parameter and PySQL will remembers the OUT value under the unique key specified in the constructor. When a var object is used a second time its value will be passed to the procedure as a normal IN parameter instead. The following parameters are supported:

keystring (required)

A unique name for the value.

typeclass (optional)

The type of the value (defaulting to int).

Note that when the key is None, PySQL will not remember the value, instead each use of var(None) will create a new OUT parameter. This can be used for OUT parameters whose values is not required by subsequent commands.

class ll.pysql.env[source]

Bases: ll.pysql.Command

A env command returns the value of an environment variable.

The following parameters are supported:

namestring (required)

The name of the environment variable.

defaultstring (optional)

The default to use, if the environment variable isn’t set. This defaults to None.

class ll.pysql.log[source]

Bases: ll.pysql.Command

log commands generate logging output.

The following parameters are supported:

objectsAny

The objects to log. String will be logged directly. For all other objects repr() will be called.

class ll.pysql.CommandExecutor[source]

Bases: object

A CommandExecutor object wraps a Command object in a callable. Calling the CommandExecutor object executes the command using the specified context and returns the command result.

This class exists because Command objects serve two purposes:

  1. They can be created to print them to a file (via the method Command.source());

  2. They can be put into a PySQL file which will then be read and executed, with must then create the Command object and execute it immediately. This is the job of CommandExecutor objects.

class ll.pysql.Context[source]

Bases: object

A Context objects contains the configuration and run time information required for importing a PySQL file.

executeall(*filenames)[source]

Execute all commands in the PySQL files specified by filenames. If filenames is empty sys.stdin is read.

class ll.pysql.sqlexpr[source]

Bases: object

An sqlexpr object can be used to specify an SQL expression as a procedure parameter instead of a fixed value. For example passing the current date (i.e. the date of the import) can be done with sqlexpr("sysdate").

class ll.pysql.pyexpr[source]

Bases: object

A pyexpr object can be used to embed literal Python source code in a PySQL file.

Note

As PySQL source code is evaluated via eval()/exec() anyway, it it always possible to embed Python expressions in PySQL source code. However this doesn’t roundtrip, i.e. printing the PySQL command via source() outputs the value of a “literal” Python expression.

exception ll.pysql.CompilationError[source]

Bases: Exception

Exception raised by checkerrors when invalid database objects are encountered.

exception ll.pysql.SCPError[source]

Bases: Exception

Exception raised by scp when a call to the scp command fails.

class ll.pysql.Location[source]

Bases: object

The location of a PySQL/SQL command in a PySQL file.