GGA Sofware Services

User Manual for PostgreSQL

Contents

Basics

Molecules

Reactions

Importing and Exporting Data

Utility functions

Permissions management

Maintenance

Basics

Data representation

Bingo supports Daylight SMILES with some ChemAxon extensions and MDL (Symyx) Molfile/Rxnfile formats both in the text and binary representation. Please look at the corresponding section of Bingo User Manual for Oracle for details.

Storage

Suppose you have a table with a text, varchar or bytea column containing a Molfiles/Rxnfiles or SMILES with molecules or reactions.

Once you have prepared your table, you can execute CheckMolecule or CheckReaction to ensure that all the records are valid. These functions return not null string with invalid records corresponding error messages. All molecules or reactions that are in this table will be excluded from the chemical index. You can update these molecules later after indexing.

After you have prepared and checked your table, you can execute Create Index to make Bingo search procedures available for you table. The more records the table contains, the longer it takes to create an index.

Queries

You can specify the query molecule as a varchar string containing a Molfile (including various query features), a SMILES, or a SMARTS string. For reaction queries, use Rxnfiles, reaction SMILES, or reaction SMARTS.

Molecules

Creating an Index

The following command creates the index:

create index $index on $table using bingo_idx ($column bingo.molecule)

Updating and Dropping Index

You can add, remove, or edit records in the table after the index is created. Adding records does not slow down the queries, i.e. the performance will be the same as if you had indexed the whole table at once. No re- indexing is required after adding the records.

After you delete, you must call autovacuum utility functions to clean up the index.

The general form of substructure search query is as follows:

select * from $table where $column @ ('$query', '$parameters')::bingo.sub

Please see the corresponding section of Bingo User Manual for Oracle to learn the rules of Bingo substructure matching (including Resonance search, Conformation search, Affine transformation search), and various query features available.

The syntax of SMARTS expression search is similar to the ordinary substructure search:

select * from $table where $column @ ('$query', '$parameters')::bingo.smarts

Please see the corresponding section of Bingo User Manual for Oracle to learn the rules of SMARTS matching in Bingo.

The general form of exact search query is as follows:

select * from $table where $column @ ('$query', '$parameters')::bingo.exact

Please see the corresponding section of Bingo User Manual for Oracle to learn the rules of Bingo exact matching and various flags available for $parameters string.

Tautomer search is implemented within Substructure and Exact search functions, and requires TAU flag to be specified in $parameters string. Please see the corresponding section of Bingo User Manual for Oracle to learn the rules of Bingo exact and substructure tautomer matching.

Customizing the Rules

Your database (to which you have installed Bingo) contains a table called bingo.bingo_tau_config. By default it contains 3 records with predefined rules. You can add, remove, or update the defined rules. Please see the corresponding section of Bingo User Manual for Oracle to learn the format of the tautomer matching rules.

The general form of similarity search query is as follows:

select * from $table where $column @ ($bottom, $top, '$query', '$metric')::bingo.sim

By default, the bottom limit is zero and the top limit is 1, which is the maximum possible value of similarity. You can specify null in place of $bottom or $top to disable the lower or upper bound. In most cases, you may want to cancel the upper bound:

select * from $table where $column @ (0.8, null, '$query', 'Tanimoto')::bingo.sim

Please see the corresponding section of Bingo User Manual for Oracle to learn more about the metrics.

The general form of gross formula search query is as follows:

select * from $table where $column @ ('$query', '$parameters')::bingo.gross

Please see the corresponding section of the Bingo User Manual for Oracle to see some examples.

The general form of molecular weight query is as follows:

select bingo.getWeight('$molecule', '$mass_type')

To use the bingo index the search query is:

select * from $table where $column > '$bottom'::bingo.mass AND $column < '$top'::bingo.mass

Format Conversion

You can convert a molecule to SMILES string using the function:

select bingo.SMILES('$molecule');

You can convert a molecule to Molfile using the function:

select bingo.Molfile('$molecule');

The automatic layout procedure is performed to calculate the 2D coordinates of the resulting molecule.

You can convert a molecule to CML format using the function:

select bingo.CML('$molecule');

Canonical SMILES computation

You can use the function:

select bingo.CanSMILES('$molecule')

to generate canonical SMILES strings for molecules represented as Molfiles or SMILES strings. Please see the corresponding section of Bingo User Manual for Oracle to learn the benefits of Bingo canonical SMILES format.

Reactions

Creating an Index

The following command creates the index:

create index $index on $table using bingo_idx ($column bingo.reaction)

The general form of reaction substructure search query is as follows:

select * from $table where $column @ ('$query', '$parameters')::bingo.rsub

Please see the corresponding section of Bingo User Manual for Oracle to learn the rules of Bingo reaction substructure matching and various query features available.

The syntax of SMARTS expression search is similar to the ordinary substructure search:

select * from $table where $column @ ('$query', '$parameters')::bingo.rsmarts

Please see the corresponding section of Bingo User Manual for Oracle to learn the rules of SMARTS matching in Bingo.

The general form of exact search query is as follows:

select * from $table where $column @ ('$query', '$parameters')::bingo.rexact

Please see the corresponding section of Bingo User Manual for Oracle to learn the rules of Bingo exact matching and various flags available for $parameters string.

Automatic Atom-to-Atom mapping

You can compute reaction AAM by calling the function:

select bingo.AAM('$reaction', '$strategy');

The corresponding section of Bingo User Manual for Oracle describes the allowable values of the $strategy parameter and shows some examples.

Format Conversion

You can convert a reaction to reaction SMILES string using the function:

select bingo.RSMILES('$reaction')

You can convert a reaction SMILES string to Rxnfile using the function:

select bingo.Rxnfile('$reaction');

The automatic layout procedure is performed to calculate the 2D coordinates of the resulting reaction.

You can convert a reaction to a reaction CML using the function:

select bingo.RCML('$reaction');

Importing and Exporting Data

Importing SDFiles, RDFiles and SMILES

You can import a molecule or reaction table from an SDF file. You can also import SDF fields corresponding to each record in the SDF file. Prior to importing, you have to create the table manually:

create table $table ($id int, $column text, ...);
select bingo.ImportSDF ('$table', '$column', '$sdf_id $id[, $other_columns]', '$filename.sdf[.gz]');

A simple example of importing the NCI 2D compound database would be the following:

create table nci (nsc int, molfile text);
select bingo.ImportSDF('nci', 'molfile', 'nsc nsc', 'C:/Users/Administrator/july2008_2d.sdf');

Importing RDF files is done with ImportRDF() function the same way as SDF files:

create table $table ($id int, $column text, ...);
select bingo.ImportRDF ('$table', '$column', '$rdf_id $id[, $other_columns]', '$filename.rdf[.gz]');

Importing multi-line molecule or reaction SMILES file is done the similar way with the ImportSMILES() function:

select bingo.ImportSMILES('$table', '$column', '$id', '$filename');

Note: When you import the file contents to a table, the old table contents are not removed. Thus, you can import multiple files into the same table.

Exporting SDFiles and RDFiles

You can export a molecule table to a SDF file. You can also export table fields corresponding to each record in the SDF file:

create table $table ($id int, $column text, ...);
...
select bingo.ExportSDF ('$table', '$column', '$id[, $other_columns]', '$filename.sdf');

A simple example of exporting the NCI 2D compound database would be the following:

create table nci (nsc int, molfile text);
...
select bingo.ExportSDF('nci', 'molfile', 'nsc', 'C:/Users/Administrator/july2008_2d.sdf');

Exporting RDF files is done with ExportRDF() function the same way as SDF files:

create table $table ($id int, $column text, ...);
...
select bingo.ExportRDF ('$table', '$column', '$id[, $other_columns]', '$filename.rdf');

Utility functions

Extracting Names

bingo.Name function extracts the molecule or reaction name from Molfile, Rxnfile, or SMILES string.

select bingo.Name(molfile) from mytable;

select bingo.Name('c1ccc2ccccc2c1 Naphthalene');

Calculating Molecule Properties

bingo.getMass function returns the molecular weight of the given molecule, represented as a Molfile or SMILES string. It has an additional parameter which defines the 'kind' of the resulting molecular mass value.

Here are some examples of using the Bingo.getMass() operator:

select bingo.getMass('C1C=CC=CC=1', '');

select bingo.getWeight(molfile, 'most-abundant-mass') from mytable;

Similarly, bingo.Gross() function returns the gross formula of the given molecule

select bingo.Gross('C1C=CC=CC=1');

select bingo.Gross(molfile) from mytable;

Checking for Correctness

You can use the bingo.CheckMolecule() function to check that molecules are presented in acceptable form. If the molecule has some problems (unsupported format, exceeded valence, incorrect stereochemistry), the functions returns a string with the description of the problem. Is the molecule is represented with a correct Molfile or SMILES string, the function returns null.

select bingo.CheckMolecule($molecule);

select $table.*, bingo.CheckMolecule($column) from $table where bingo.CheckMolecule($column) is not null;

Similarly, you can check reactions for correctness with the bingo.CheckReaction() function:

select bingo.CheckReaction($reaction);

select $table.*, bingo.CheckReaction($column) from $table where bingo.CheckReaction($column) is not null;

Reading Files on Server

The Bingo.FileToText() function accepts a text file path and loads a file from the server file system to PostgreSQL text.

select bingo.FileToText($path);

Usually you may want to load the query molecule in the following way:

select * form $table where $column @ (bingo.FileToText($path), '')::bingo.sub;

The Bingo.FileToBlob() function accepts a text file path and loads a file from the server file system to PostgreSQL bytea.

select bingo.FileToBlob($path);

Permissions management

Let the schema name bingo was specified during installation of the Bingo cartridge. Let the user test_user is going to create the bingo index on a table. There are two objects user will work with to create the bingo index.

So for precise permissions management you need:

Set permissions for building the index

grant usage on schema bingo to test_user;
grant select on table bingo.bingo_config to test_user;
grant select on table bingo.bingo_tau_config to test_user;

Maintenance

Obtaining Bingo Version Number

select bingo.GetVersion();

Viewing the Log File

All operation of Bingo is logged into the PostgreSQL native LOG. All error and warning messages (not necessarily visible in SQL session) are logged. Some performance measures of the SQL queries are written to the log as well.