A SQLite extension for gawk (part II)

Yann Neuhaus - Fri, 2018-09-28 17:21

Welcome to part II of a three-part article on extending gawk with a SQLite binding. Part I is here. Part II is followed by Part III, which gives some explanations of the code presented here and shows how to use the extension with a stress test.
Here, I’ll list the source code of the extension and give instructions to compile and use it in gawk. Beware though that the code should be taken with several grains of salt, actually a whole wheelbarrow of it, because it has been only superficially tested. Some more serious testing is required in order to trust it entirely. So, caveat emptor !
I assume the source code of gawk is already installed (see for example the instructions here). We still need the SQLite source code. Go here and download the amalgamation zip file. Unzip it somewhere and then copy the file sqlite3.c to the gawk extension directory, ~/dmgawk/gawk-4.2.1/extension. Compile it with the command below:

gcc -c sqlite3.c -DHAVE_READLINE -fPIC -lpthread -ldl
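
For completeness, the preceding download/unzip/copy steps might look roughly like this (just a sketch; the exact archive and directory names depend on the SQLite version that was downloaded):

unzip ~/sqlite-amalgamation-3240000.zip -d ~/
cp ~/sqlite-amalgamation-3240000/sqlite3.c ~/dmgawk/gawk-4.2.1/extension/
cd ~/dmgawk/gawk-4.2.1/extension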

Now, edit the file Makefile.am and add the references to the new extension, as shown below:

vi Makefile.am
pkgextension_LTLIBRARIES = \
filefuncs.la \
...
sqlite_gawk.la <-----
 
noinst_LTLIBRARIES = \
...
time_la_SOURCES = time.c
time_la_LDFLAGS = $(MY_MODULE_FLAGS)
time_la_LIBADD = $(MY_LIBS)
sqlite_gawk_la_SOURCES = sqlite_gawk.c <-----
sqlite_gawk_la_LDFLAGS = $(MY_MODULE_FLAGS) <-----
sqlite_gawk_la_LIBADD = $(MY_LIBS) -lpthread -ldl -lreadline <-----

...

Save and quit; that’s all for the make file;
We are still in the extension directory. Let’s now edit the interface file sqlite_gawk.c and insert the code below:
vi sqlite_gawk.c

/*
 * sqlite_gawk.c - an interface to the sqlite3 library;
 * Cesare Cervini
 * dbi-services.com
 * 8/2018
*/
#ifdef HAVE_CONFIG_H
#include <config.h>
#endif

#include <stdio.h>
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#include <sys/types.h>
#include <sys/stat.h>

#include "gawkapi.h"

// extension;
#include <time.h>
#include <errno.h>
#include <limits.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <regex.h>
#include <sqlite3.h>

#include "gettext.h"
#define _(msgid)  gettext(msgid)
#define N_(msgid) msgid

static const gawk_api_t *api;   /* for convenience macros to work */
static awk_ext_id_t ext_id;
static const char *ext_version = "an interface to sqlite3: version 1.0";

int plugin_is_GPL_compatible;

/* internal structure and variables */
/*
internally stores the db handles so that an integer can be returned to gawk as the index into an array of handles;
*/
#define MAX_DB 100
static unsigned nb_sqlite_free_handles = MAX_DB;
static sqlite3 *sqlite_handles[MAX_DB];

/* init_sqlite_handles */
/*
since gawk does not know pointers, we use integers as db handles, which are actually indexes into a table of sqlite db handles;
initializes the sqlite_handles array to null pointers and resets the number of free handles;
called at program start time;
*/
static awk_bool_t init_sqlite_handles(void) {
   for (unsigned i = 0; i < MAX_DB; i++)
      sqlite_handles[i] = NULL;
   nb_sqlite_free_handles = MAX_DB;

   register_ext_version(ext_version);

   return awk_true;
}

/*
readfile() and writefile() functions for blob I/Os from/to file into/from memory;
e.g.:
   INSERT INTO chickenhouse my_blob = readfile('chicken_run.mp4');
   SELECT writefile("'Mary Poppins Returns.mp4'", my_blob) FROM children_movies;
they are taken directly from the source file sqlite3.c, i.e. sqlite3's shell program;
those functions are later registered in do_sqlite_open() via a call to sqlite3_create_function() for use as extension SQL functions;
to be done while opening a db in do_sqlite_open() and for each opened db where the extended functions must be available, i.e. all for short;
*/

// ------------------------------------------- begin of imported functions from sqlite3.c ---------------------------------------------
/*
** This function is used in place of stat().  On Windows, special handling
** is required in order for the included time to be returned as UTC.  On all
** other systems, this function simply calls stat().
*/
static int fileStat(
  const char *zPath,
  struct stat *pStatBuf
){
  return stat(zPath, pStatBuf);
}

/*
** Set the result stored by context ctx to a blob containing the 
** contents of file zName.
*/
static void readFileContents(sqlite3_context *ctx, const char *zName){
  FILE *in;
  long nIn;
  void *pBuf;

  in = fopen(zName, "rb");
  if( in==0 ) return;
  fseek(in, 0, SEEK_END);
  nIn = ftell(in);
  rewind(in);
  pBuf = sqlite3_malloc( nIn );
  if( pBuf && 1==fread(pBuf, nIn, 1, in) ){
    sqlite3_result_blob(ctx, pBuf, nIn, sqlite3_free);
  }else{
    sqlite3_free(pBuf);
  }
  fclose(in);
}

/*
** Implementation of the "readfile(X)" SQL function.  The entire content
** of the file named X is read and returned as a BLOB.  NULL is returned
** if the file does not exist or is unreadable.
*/
static void readfileFunc(
  sqlite3_context *context,
  int argc,
  sqlite3_value **argv
){
  const char *zName;
  (void)(argc);  /* Unused parameter */
  zName = (const char*)sqlite3_value_text(argv[0]);
  if( zName==0 ) return;
  readFileContents(context, zName);
}

/*
** This function does the work for the writefile() UDF. Refer to 
** header comments at the top of this file for details.
*/
static int writeFile(
  sqlite3_context *pCtx,          /* Context to return bytes written in */
  const char *zFile,              /* File to write */
  sqlite3_value *pData,           /* Data to write */
  mode_t mode,                    /* MODE parameter passed to writefile() */
  sqlite3_int64 mtime             /* MTIME parameter (or -1 to not set time) */
){
  if( S_ISLNK(mode) ){
    const char *zTo = (const char*)sqlite3_value_text(pData);
    if( symlink(zTo, zFile)<0 ) return 1;
  }else
  {
    if( S_ISDIR(mode) ){
      if( mkdir(zFile, mode) ){
        /* The mkdir() call to create the directory failed. This might not
        ** be an error though - if there is already a directory at the same
        ** path and either the permissions already match or can be changed
        ** to do so using chmod(), it is not an error.  */
        struct stat sStat;
        if( errno!=EEXIST
         || 0!=fileStat(zFile, &sStat)
         || !S_ISDIR(sStat.st_mode)
         || ((sStat.st_mode&0777)!=(mode&0777) && 0!=chmod(zFile, mode&0777))
        ){
          return 1;
        }
      }
    }else{
      sqlite3_int64 nWrite = 0;
      const char *z;
      int rc = 0;
      FILE *out = fopen(zFile, "wb");
      if( out==0 ) return 1;
      z = (const char*)sqlite3_value_blob(pData);
      if( z ){
        sqlite3_int64 n = fwrite(z, 1, sqlite3_value_bytes(pData), out);
        nWrite = sqlite3_value_bytes(pData);
        if( nWrite!=n ){
          rc = 1;
        }
      }
      fclose(out);
      if( rc==0 && mode && chmod(zFile, mode & 0777) ){
        rc = 1;
      }
      if( rc ) return 2;
      sqlite3_result_int64(pCtx, nWrite);
    }
  }

  if( mtime>=0 ){
#if defined(AT_FDCWD) && 0 /* utimensat() is not universally available */
    /* Recent unix */
    struct timespec times[2];
    times[0].tv_nsec = times[1].tv_nsec = 0;
    times[0].tv_sec = time(0);
    times[1].tv_sec = mtime;
    if( utimensat(AT_FDCWD, zFile, times, AT_SYMLINK_NOFOLLOW) ){
      return 1;
    }
#else
    /* Legacy unix */
    struct timeval times[2];
    times[0].tv_usec = times[1].tv_usec = 0;
    times[0].tv_sec = time(0);
    times[1].tv_sec = mtime;
    if( utimes(zFile, times) ){
      return 1;
    }
#endif
  }

  return 0;
}

/*
** Argument zFile is the name of a file that will be created and/or written
** by SQL function writefile(). This function ensures that the directory
** zFile will be written to exists, creating it if required. The permissions
** for any path components created by this function are set to (mode&0777).
**
** If an OOM condition is encountered, SQLITE_NOMEM is returned. Otherwise,
** SQLITE_OK is returned if the directory is successfully created, or
** SQLITE_ERROR otherwise.
*/
static int makeDirectory(
  const char *zFile,
  mode_t mode
){
  char *zCopy = sqlite3_mprintf("%s", zFile);
  int rc = SQLITE_OK;

  if( zCopy==0 ){
    rc = SQLITE_NOMEM;
  }else{
    int nCopy = (int)strlen(zCopy);
    int i = 1;

    while( rc==SQLITE_OK ){
      struct stat sStat;
      int rc2;

      for(; zCopy[i]!='/' && i<nCopy; i++);
      if( i==nCopy ) break;
      zCopy[i] = '\0';

      rc2 = fileStat(zCopy, &sStat);
      if( rc2!=0 ){
        if( mkdir(zCopy, mode & 0777) ) rc = SQLITE_ERROR;
      }else{
        if( !S_ISDIR(sStat.st_mode) ) rc = SQLITE_ERROR;
      }
      zCopy[i] = '/';
      i++;
    }

    sqlite3_free(zCopy);
  }

  return rc;
}

/*
** Set the error message contained in context ctx to the results of
** vprintf(zFmt, ...).
*/
static void ctxErrorMsg(sqlite3_context *ctx, const char *zFmt, ...){
  char *zMsg = 0;
  va_list ap;
  va_start(ap, zFmt);
  zMsg = sqlite3_vmprintf(zFmt, ap);
  sqlite3_result_error(ctx, zMsg, -1);
  sqlite3_free(zMsg);
  va_end(ap);
}

/*
** Implementation of the "writefile(W,X[,Y[,Z]]])" SQL function.  
** Refer to header comments at the top of this file for details.
*/
static void writefileFunc(
  sqlite3_context *context,
  int argc,
  sqlite3_value **argv
){
  const char *zFile;
  mode_t mode = 0;
  int res;
  sqlite3_int64 mtime = -1;

  if( argc<2 || argc>4 ){
    sqlite3_result_error(context, 
        "wrong number of arguments to function writefile()", -1
    );
    return;
  }

  zFile = (const char*)sqlite3_value_text(argv[0]);
  if( zFile==0 ) return;
  if( argc>=3 ){
    mode = (mode_t)sqlite3_value_int(argv[2]);
  }
  if( argc==4 ){
    mtime = sqlite3_value_int64(argv[3]);
  }

  res = writeFile(context, zFile, argv[1], mode, mtime);
  if( res==1 && errno==ENOENT ){
    if( makeDirectory(zFile, mode)==SQLITE_OK ){
      res = writeFile(context, zFile, argv[1], mode, mtime);
    }
  }

  if( argc>2 && res!=0 ){
    if( S_ISLNK(mode) ){
      ctxErrorMsg(context, "failed to create symlink: %s", zFile);
    }else if( S_ISDIR(mode) ){
      ctxErrorMsg(context, "failed to create directory: %s", zFile);
    }else{
      ctxErrorMsg(context, "failed to write file: %s", zFile);
    }
  }
}
// ------------------------------------------- end of imported functions from sqlite3.c ---------------------------------------------

/* get_free_sqlite_handle */
/*
looks for a free slot in sqlite_handles;
return its index if found, -1 otherwise;
*/
static unsigned get_free_sqlite_handle(void) {
   if (0 == nb_sqlite_free_handles) {
       fprintf(stderr, "maximum of open db [%d] reached, no free handles !\n", MAX_DB);
       return -1;
   }
   for (unsigned i = 0; i < MAX_DB; i++)
      if (NULL == sqlite_handles[i])
         return i;
   // should never come so far;
   return -1;
}

/* do_sqlite_open */
/* returns -1 if error, a db handle in the range 0 .. MAX_DB - 1 otherwise; */
static awk_value_t *
do_sqlite_open(int nargs, awk_value_t *result, struct awk_ext_func *unused) {
   awk_value_t db_name;
   short int ret;

   assert(result != NULL);

   unsigned int db_handle = get_free_sqlite_handle();
   if (-1 == db_handle)
      return make_number(-1, result);

   if (get_argument(0, AWK_STRING, &db_name)) {
      sqlite3 *db;

      ret = sqlite3_open(db_name.str_value.str, &db);

      if (ret) {
         char error_string[1000];
         sprintf(error_string, "sqlite3_open(): cannot open database [%s], error %s\n", db_name.str_value.str, sqlite3_errmsg(db));
         fprintf(stderr, "%s\n", error_string);
         update_ERRNO_string(_(error_string));
         ret = -1;
      }
      else {
         sqlite_handles[db_handle] = db;
         nb_sqlite_free_handles--;
         // keep the db handle as the return value and use a separate variable for the
         // registration return codes, so the handle does not get overwritten;
         ret = db_handle;

         // register the extension functions readfile() and writefile() for blobs;
         int rc = sqlite3_create_function(db, "readfile", 1, SQLITE_UTF8, 0, readfileFunc, 0, 0);
         if (SQLITE_OK != rc)
            fprintf(stderr, "%s\n", "could not register function readfile()");
         else {
            rc = sqlite3_create_function(db, "writefile", -1, SQLITE_UTF8, 0, writefileFunc, 0, 0);
            if (SQLITE_OK != rc)
               fprintf(stderr, "%s\n", "could not register function writefile()");
         }
      }
   }
   else {
      update_ERRNO_string(_("sqlite3_open(): missing parameter database name"));
      ret = -1;
   }

   return make_number(ret, result);
}

/* do_sqlite_close */
/* returns -1 if error, 0 otherwise; */
static awk_value_t *
do_sqlite_close(int nargs, awk_value_t *result, struct awk_ext_func *unused) {
   awk_value_t db_handle;
   int ret;

   assert(result != NULL);

   if (get_argument(0, AWK_NUMBER, &db_handle)) {
      sqlite3_close(sqlite_handles[(int) db_handle.num_value]);
      sqlite_handles[(int) db_handle.num_value] = NULL;
      nb_sqlite_free_handles++;
      ret = 0;
   }
   else {
      update_ERRNO_string(_("sqlite3_close(): missing parameter database handle"));
      ret = -1;
   }
   return make_number(ret, result);
}

/* do_sqlite_exec */
/*
returns -1 if error, 0 otherwise;
sqlite_exec is overloaded;
if 2 parameters, usual DML/DDL statements;
if 6 parameters, then incremental blob I/O;
sqlite_exec(db, db_name, table, column, rowid, readfile(file_name))
or
sqlite_exec(db, db_name, table, column, rowid, writefile(file_name))
implements sqlite3's shell readfile()/writefile() syntax with incremental blob I/Os;
Example of usage:
first, get the rowid of the row that contains the blob to access;
   sqlite_select(db, "select rowid from <table> where <condition>", array)
then, call the sqlite_exec function with the tuple (<db_name>, <table>, <blob_column>, <rowid>) and the action to do, either readfile() or writefile();
   sqlite_exec(db, <db_name>, '<table>', '<blob_column>', array[0]["rowid"], readfile(file_name))
   sqlite_exec(db, <db_name>, '<table>', '<blob_column>', array[0]["rowid"], writefile(file_name))
e.g.:
   rc = sqlite_exec(my_db, "main", "test_with_blob", "my_blob", a_test[0]["rowid"], "readfile(/home/dmadmin/setup_files/documentum.tar)")
note how the file name is not quoted;
in case of readfile(), if the blob's size changes, the blob is first updated to a zero-filled blob of the new size, then reopened;
see doc here for incremental blob I/Os: https://sqlite.org/c3ref/blob_open.html;
int sqlite3_blob_open(sqlite3*, const char *zDb, const char *zTable, const char *zColumn, sqlite3_int64 iRow, int flags, sqlite3_blob **ppBlob);
int sqlite3_blob_reopen(sqlite3_blob *, sqlite3_int64);
int sqlite3_blob_read(sqlite3_blob *, void *Z, int N, int iOffset);
int sqlite3_blob_write(sqlite3_blob *, const void *z, int n, int iOffset);
int sqlite3_blob_close(sqlite3_blob *);
*/
static awk_value_t *
do_sqlite_exec(int nargs, awk_value_t *result, struct awk_ext_func *unused) {
   awk_value_t db_handle;
   int ret = 1;

   assert(result != NULL);

   if (!get_argument(0, AWK_NUMBER, &db_handle)) {
      fprintf(stderr, "in do_sqlite_exec, cannot get the db handle argument\n");
      ret = -1;
      goto end;
   }
   if (2 == nargs) {
      awk_value_t sql_stmt;
      if (!get_argument(1, AWK_STRING, &sql_stmt))  {
         fprintf(stderr, "in do_sqlite_exec, cannot get the sql_stmt argument\n");
         ret = -1;
         goto end;
      }
      char *errorMessg = NULL;
      ret = sqlite3_exec(sqlite_handles[(int) db_handle.num_value], sql_stmt.str_value.str, NULL, NULL, &errorMessg);
      if (SQLITE_OK != ret) {
         fprintf(stderr, "in do_sqlite_exec, SQL error %s while executing [%s]\n", sqlite3_errmsg(sqlite_handles[(int) db_handle.num_value]), sql_stmt.str_value.str);
         sqlite3_free(errorMessg);
         ret = -1;
         goto end;
      }
   }
   else if (6 == nargs) {
      awk_value_t arg, number_value;
      char *db_name = NULL, *table_name = NULL, *column_name = NULL, *file_stmt = NULL, *file_name = NULL;

      if (!get_argument(1, AWK_STRING, &arg))  {
         fprintf(stderr, "in do_sqlite_exec, cannot get the db_name argument\n");
         ret = -1;
         goto abort;
      }
      db_name = strdup(arg.str_value.str);

      if (!get_argument(2, AWK_STRING, &arg))  {
         fprintf(stderr, "in do_sqlite_exec, cannot get the table_name argument\n");
         ret = -1;
         goto abort;
      }
      table_name = strdup(arg.str_value.str);

      if (!get_argument(3, AWK_STRING, &arg))  {
         fprintf(stderr, "in do_sqlite_exec, cannot get the column_name argument\n");
         ret = -1;
         goto abort;
      }
      column_name = strdup(arg.str_value.str);
      
      if (!get_argument(4, AWK_NUMBER, &number_value))  {
         fprintf(stderr, "in do_sqlite_exec, cannot get the rowid argument\n");
         ret = -1;
         goto abort;
      }
      long int rowid = number_value.num_value;
      
      if (!get_argument(5, AWK_STRING, &arg))  {
         fprintf(stderr, "in do_sqlite_exec, cannot get the readfile()/writefile() argument\n");
         ret = -1;
         goto abort;
      }
      file_stmt = strdup(arg.str_value.str);
      
      unsigned short bRead2Blob;
      char *RE_readfile = "^readfile\\(([^)]+)\\)$";
      char *RE_writefile = "^writefile\\(([^)]+)\\)$";
      regex_t RE;
      regmatch_t pmatches[2];

      if (regcomp(&RE, RE_readfile, REG_EXTENDED)) {
         fprintf(stderr, "in do_sqlite_exec, error compiling REs %s\n", RE_readfile);
         ret = -1;
         goto abort;
      }
      if (regexec(&RE, file_stmt, 2, pmatches, 0)) {
         // no call to readfile() requested, try writefile();
         regfree(&RE);
         if (regcomp(&RE, RE_writefile, REG_EXTENDED)) {
            fprintf(stderr, "in do_sqlite_exec, error compiling REs %s\n", RE_writefile);
            ret = -1;
            goto abort;
         }
         if (regexec(&RE, file_stmt, 2, pmatches, 0)) {
            fprintf(stderr, "in do_sqlite_exec, error executing RE %s and RE %s against %s;\nneither readfile(file_name) nor writefile(file_name) was found\n", RE_readfile, RE_writefile, file_stmt);
            ret = -1;
            goto abort;
         }
         else bRead2Blob = 0;
      }
      else bRead2Blob = 1;
      file_name = strndup(file_stmt + pmatches[1].rm_so, pmatches[1].rm_eo - pmatches[1].rm_so);
      regfree(&RE);
      sqlite3_blob *pBlob;
      if (bRead2Blob) {
         ret = sqlite3_blob_open(sqlite_handles[(int) db_handle.num_value], db_name, table_name, column_name, rowid, 1, &pBlob);
         if (SQLITE_OK != ret) {
            fprintf(stderr, "in do_sqlite_exec, at reading blob, with parameters: db_name=%s, table_name=%s, column_name=%s, rowid=%ld, file statement=%s, error in sqlite3_blob_open %s\n%s\n",
                            db_name, table_name, column_name, rowid, file_stmt,
                            sqlite3_errmsg(sqlite_handles[(int) db_handle.num_value]), sqlite3_errstr(ret));
            ret = -1;
            goto abort;
         }

         FILE *fs_in = fopen(file_name, "r");
         if (NULL == fs_in) {
            fprintf(stderr, "in do_sqlite_exec, error opening file %s for reading\n", file_name);
            ret = -1;
            goto local_abort_w;
         }

         // will the blob size change ?
         fseek(fs_in, 0, SEEK_END);
         unsigned long file_size = ftell(fs_in);
         rewind(fs_in);
         unsigned long blobSize = sqlite3_blob_bytes(pBlob);
         if (file_size != blobSize) {
            // yes, must first update the blob with the new size and reopen it;
            char stmt[500];
            char *errorMessg = NULL;
            sprintf(stmt, "update %s set %s = zeroblob(%ld) where rowid = %ld", table_name, column_name, file_size, rowid);
            ret = sqlite3_exec(sqlite_handles[(int) db_handle.num_value], stmt, NULL, NULL, &errorMessg);
            if (SQLITE_OK != ret) {
               fprintf(stderr, "in do_sqlite_exec, SQL error %s while changing the blob's size through [%s]:\n%s\n", sqlite3_errmsg(sqlite_handles[(int) db_handle.num_value]), stmt, errorMessg);
               sqlite3_free(errorMessg);
               ret = -1;
               goto local_abort_w;
            }
            ret = sqlite3_blob_reopen(pBlob, rowid);
            if (SQLITE_OK != ret) {
               fprintf(stderr, "in do_sqlite_exec, error while reopening the blob: %s\n%s\n", sqlite3_errmsg(sqlite_handles[(int) db_handle.num_value]), sqlite3_errstr(ret));
               ret = -1;
               goto local_abort_w;
            }
         }

         // let's work with a 10 MiB large buffer;
         unsigned long BUFFER_SIZE = 10 * 1024 * 1024;
         char *pBuffer = (char *) malloc(sizeof(char) * BUFFER_SIZE);
         unsigned long nbBytes;
         unsigned long offset = 0;
         while ((nbBytes = fread(pBuffer, sizeof(char), BUFFER_SIZE, fs_in)) > 0) {
            ret = sqlite3_blob_write(pBlob, pBuffer, nbBytes, offset);
            if (SQLITE_OK != ret) {
               fprintf(stderr, "in do_sqlite_exec, sqlite3_blob_write, error %s\n%s\n", sqlite3_errmsg(sqlite_handles[(int) db_handle.num_value]), sqlite3_errstr(ret));
               ret = -1;
               free(pBuffer);
               goto local_abort_w;
            }
            offset += nbBytes;
         }
         free(pBuffer);
local_abort_w:
         fclose(fs_in);
         ret = sqlite3_blob_close(pBlob);
         if (SQLITE_OK != ret) {
            fprintf(stderr, "in do_sqlite_exec, sqlite3_blob_close, error %s\n%s\n", sqlite3_errmsg(sqlite_handles[(int) db_handle.num_value]), sqlite3_errstr(ret));
            ret = -1;
         }
      }
      else {
         ret = sqlite3_blob_open(sqlite_handles[(int) db_handle.num_value], db_name, table_name, column_name, rowid, 0, &pBlob);
         if (SQLITE_OK != ret) {
            fprintf(stderr, "in do_sqlite_exec at writing blob, error %d in sqlite3_blob_open with parameters: db_name=%s, table_name=%s, column_name=%s, rowid=%ld, file statement=%s\n",
                            ret,
                            db_name, table_name, column_name, rowid, file_stmt);
            ret = -1;
            goto abort;
         }
         unsigned long BUFFER_SIZE = 10 * 1024 * 1024;
         char *pBuffer = (char *) malloc(sizeof(char) * BUFFER_SIZE);
         unsigned long offset = 0;
         FILE *fs_out = fopen(file_name, "w");
         if (NULL == fs_out) {
            fprintf(stderr, "in do_sqlite_exec, error %d opening file %s for writing\n", errno, file_name);
            ret = -1;
            goto local_abort_r;
         }
         unsigned long blobSize = sqlite3_blob_bytes(pBlob);
         if (BUFFER_SIZE >= blobSize) {
            ret = sqlite3_blob_read(pBlob, pBuffer, blobSize, offset);
            if (SQLITE_OK != ret) {
               fprintf(stderr, "in do_sqlite_exec, sqlite3_blob_read, error %s\n%s\n", sqlite3_errmsg(sqlite_handles[(int) db_handle.num_value]), sqlite3_errstr(ret));
               ret = -1;
               goto local_abort_r;
            }
            unsigned long nbBytes = fwrite(pBuffer, sizeof(char), blobSize, fs_out);
            if (nbBytes < blobSize) {
               fprintf(stderr, "in do_sqlite_exec, error in fwrite()\n");
               ret = -1;
               goto local_abort_r;
            }
         }
         else {
            unsigned long nbBytes;
            while ((nbBytes = (blobSize <= BUFFER_SIZE ? blobSize : BUFFER_SIZE)) > 0) {
               ret = sqlite3_blob_read(pBlob, pBuffer, nbBytes, offset);
               if (SQLITE_OK != ret) {
                  fprintf(stderr, "in do_sqlite_exec, sqlite3_blob_read, error %s\n%s\n", sqlite3_errmsg(sqlite_handles[(int) db_handle.num_value]), sqlite3_errstr(ret));
                  ret = -1;
                  goto local_abort_r;
               }
               ret = fwrite(pBuffer, sizeof(char), nbBytes, fs_out);
               if (ret < nbBytes) {
                  fprintf(stderr, "in do_sqlite_exec, error in fwrite()\n");
                  ret = -1;
                  goto local_abort_r;
               }
               offset += nbBytes;
               blobSize -= nbBytes;
            }
         }
local_abort_r:
         fclose(fs_out);
         free(pBuffer);
         ret = sqlite3_blob_close(pBlob);
         if (SQLITE_OK != ret) {
            fprintf(stderr, "in do_sqlite_exec, processing of writefile(), sqlite3_blob_close, error %s\n%s\n", sqlite3_errmsg(sqlite_handles[(int) db_handle.num_value]), sqlite3_errstr(ret));
            ret = -1;
         }
      }
abort:
      free(db_name);
      free(table_name);
      free(column_name);
      free(file_stmt);
      free(file_name);
   }
   else {
      fprintf(stderr, "in do_sqlite_exec, unsupported number of parameters in statement while processing [%d]\n", nargs);
      ret = -1;
   }
end:
   return make_number(ret, result);
}

static unsigned max(unsigned n1, unsigned n2) {
   if (n1 > n2)
      return n1;
   else return n2;
}

// this struct is used to pass parameters to the callbacks;
typedef struct DISPLAYED_TABLE {
   char *sqlStmt;

   unsigned short bHeaderPrinted;
   char *COL_SEPARATOR;
   char *ELLIPSIS;
   unsigned short len_ellipsis;
   unsigned short MAX_WIDTH;
   unsigned short MIN_WIDTH;

   unsigned nb_columns;
   unsigned *max_col_widths;
   unsigned *actual_col_widths;
   char *col_overflow_action;
   char **headers;
   unsigned widest_column;

   char *size_list;  // list of blank- or comma-separated column widths;

   unsigned long NR;
   unsigned short bStoreOrDisplay;
   awk_array_t gawk_array;

   unsigned short bEpilog;
} DISPLAYED_TABLE;

void cleanup(DISPLAYED_TABLE *dt) {
   if (dt -> sqlStmt)
      free(dt -> sqlStmt);
   if (dt -> max_col_widths)
      free(dt -> max_col_widths);
   if (dt -> actual_col_widths)
      free(dt -> actual_col_widths);
   for (unsigned i = 0; i < dt -> nb_columns; i++) {
      if (dt -> headers)
         free(dt -> headers[i]);
   }
   if (dt -> headers)
      free(dt -> headers);
   if (dt -> size_list)
      free(dt -> size_list);
}

// strips the trailing blanks so they are not counted toward the column width
// and returns the number of useful characters from the beginning;
// a former version attempted to insert a '\0' terminator in place but it caused an error when the string resides in a read-only area (e.g. SELECT sqlite_version()) because modifications are not allowed there for obvious reasons;
unsigned getUsefulLen(char * str) {
   unsigned len = strlen(str);
   char *p = str + len - 1;
   while (' ' == *p && p > str) p--;
   if (' ' != *p)
      len = p - str + 1;
   else
      len = 0;
   return(len);
}

char *fillStr(char *S, char ch, unsigned max_len) {
   S[max_len] = '\0';
   for (char *p = S; max_len; p++, max_len--)
      *p = ch;
   return S;
}

/* select_callback_raw */
// displays the data without truncation and without cleaning up trailing blanks;
// columns are separated by dt -> COL_SEPARATOR, which is useful to import data as CSV;
static int select_callback_raw(void *vdt, int nb_columns, char **column_values, char **column_names) {
   DISPLAYED_TABLE *dt = (DISPLAYED_TABLE *) vdt;

   if (dt -> bEpilog) {
      printf("%ld rows selected\n", dt -> NR);
      cleanup(dt);
      return 0;
   }

   if (!dt -> bHeaderPrinted) {
      // header has not been printed yet, print it and afterwards print the first row;
      for (unsigned i = 0; i < nb_columns; i++)
         printf("%s%s", column_names[i], i  COL_SEPARATOR : "");
      printf("\n");
      dt -> bHeaderPrinted = 1;
   }

   for (unsigned i = 0; i < nb_columns; i++)
      printf("%s%s", column_values[i], i  COL_SEPARATOR : "");
   printf("\n");
   dt -> NR++;
   return 0;
}

/* select_callback_draft */
/*
display the data in maximum 15-character wide columns, with possible truncation, in which case an ellipsis (...) is appended;
at the end, the optimum widths for each column are listed so they can be passed as a string list in the call to sqlite_select(,, "....") to avoid truncation;
this output is convenient as a quick draft;
*/
static int select_callback_draft(void *vdt, int nb_columns, char **column_values, char **column_names) {
   DISPLAYED_TABLE *dt = (DISPLAYED_TABLE *) vdt;

   char col_str[dt -> MAX_WIDTH + 1];

   if (dt -> bEpilog) {
      printf("%ld rows selected\n", dt -> NR);

      printf("\nOptimum column widths\n");
      printf("=====================\n");
      printf("for query: %s\n", dt -> sqlStmt);
      for (unsigned i = 0; i < dt -> nb_columns; i++)
         printf("%-*s  %d\n", dt -> widest_column + 5, dt -> headers[i], dt -> max_col_widths[i]);

      cleanup(dt);
      return 0;
   }

   if (!dt -> bHeaderPrinted) {
      // header has not been printed yet, print it and afterwards print the first row;
      dt -> nb_columns = nb_columns; 
      dt -> max_col_widths = (unsigned *) malloc(sizeof(unsigned) * nb_columns);
      dt -> actual_col_widths = (unsigned *) malloc(sizeof(unsigned) * nb_columns);
      dt -> headers = (char **) malloc(sizeof(char *) * nb_columns);

      char *header_line = NULL;

      for (unsigned i = 0; i < nb_columns; i++) {
         char *tmp_s;
         unsigned len = strlen(column_names[i]);
         dt -> max_col_widths[i] = len;
         dt -> widest_column = max(dt -> widest_column, len);
         if (len > dt -> MAX_WIDTH) {
            // column overflow, apply a truncation with ellipsis;
            dt -> actual_col_widths[i] = dt -> MAX_WIDTH;
            strncpy(col_str, column_names[i], dt -> MAX_WIDTH - dt -> len_ellipsis);
            col_str[dt -> MAX_WIDTH - dt -> len_ellipsis] = '\0';
            strcat(col_str, dt -> ELLIPSIS);
            tmp_s = col_str;
         }
         else if (len < dt -> MIN_WIDTH) {
            dt -> actual_col_widths[i] = dt -> MIN_WIDTH;
            tmp_s = column_names[i];
         }
         else {
            dt -> actual_col_widths[i] = len;
            tmp_s = column_names[i];
         }
         printf("%-*s%s", dt -> actual_col_widths[i], tmp_s, i  COL_SEPARATOR : "");
         dt -> headers[i] = strdup(column_names[i]);
      }
      printf("\n");

      for (unsigned i = 0; i < nb_columns; i++) {
         header_line = (char *) realloc(header_line, sizeof(char) * (dt -> actual_col_widths[i] + 1));
         fillStr(header_line, '-', dt -> actual_col_widths[i]);
         printf("%s%s", header_line, i  COL_SEPARATOR : "");
      }
      printf("\n");
      free(header_line);

      dt -> bHeaderPrinted = 1;
   }
   // header has been printed, print the rows now;
   for (unsigned i = 0; i < nb_columns; i++) {
      char *tmp_s;
      unsigned len = getUsefulLen(column_values[i]);
      dt -> max_col_widths[i] = max(dt -> max_col_widths[i], len);
      if (len > dt -> actual_col_widths[i]) {
         strncpy(col_str, column_values[i], dt -> actual_col_widths[i] - dt -> len_ellipsis);
         col_str[dt -> actual_col_widths[i] - dt -> len_ellipsis] = '\0';
         strcat(col_str, dt -> ELLIPSIS);
         tmp_s = col_str;
      }
      else {
         tmp_s = column_values[i];
      }
      printf("%-*.*s%s", dt -> actual_col_widths[i], dt -> actual_col_widths[i], tmp_s, i  COL_SEPARATOR : "");
   }
   printf("\n");
   dt -> NR++;
   return 0;
}

/* printConstrained */
// prints the row's column in constrained column widths;
static void printConstrained(DISPLAYED_TABLE *dt, char **data, unsigned nb_columns) {
   // let's replicate the data because they will be modified locally;
   char **ldata = (char **) malloc(sizeof(char *) * nb_columns);
   for (unsigned i = 0; i < nb_columns; i++)
      ldata[i] = strndup(data[i], getUsefulLen(data[i]));
   unsigned bWrapOccured;
   do {
      bWrapOccured = 0;
      for (unsigned i = 0; i < nb_columns; i++) {
         char *col_str = NULL;
         unsigned len = strlen(ldata[i]);
         dt -> actual_col_widths[i] = dt -> max_col_widths[i];
         if (len > dt -> max_col_widths[i]) {
            // column width overflow, apply the requested action: either wrap-around, truncate with ellipsis or truncate without ellipsis;
            if ('e' == dt -> col_overflow_action[i]) {
               if (dt -> max_col_widths[i] < dt -> len_ellipsis)
                  dt -> actual_col_widths[i] = dt -> len_ellipsis;
               // allocate enough room for the truncated value plus the ellipsis;
               col_str = (char *) malloc(dt -> actual_col_widths[i] + 1);
               strncpy(col_str, ldata[i], dt -> actual_col_widths[i] - dt -> len_ellipsis);
               col_str[dt -> actual_col_widths[i] - dt -> len_ellipsis] = '\0';
               strcat(col_str, dt -> ELLIPSIS);
               sprintf(ldata[i], "%*s", len, " ");
            }
            else if ('t' == dt -> col_overflow_action[i]) {
               col_str = strndup(ldata[i], dt -> actual_col_widths[i]);
               sprintf(ldata[i], "%*s", len, " ");
            }
            else if ('w' == dt -> col_overflow_action[i]) {
               col_str = strndup(ldata[i], dt -> actual_col_widths[i]);
               // shift the column names by as many printed characters;
               // the new column names will be printed at the next cycle of the inner loop 
               unsigned j;
               for (j = dt -> actual_col_widths[i]; j < len; j++)
                  ldata[i][j - dt -> actual_col_widths[i]] = ldata[i][j];
               ldata[i][len - dt -> actual_col_widths[i]] = '\0';
               bWrapOccured = 1;
            }
         }
         else {
            col_str = strdup(ldata[i]);
            // no wrap-around necessary here but prepare the str for the next cycle just in case;
            sprintf(ldata[i], "%*s", len, " ");
         }
         printf("%-*s%s", dt -> actual_col_widths[i], col_str, i  COL_SEPARATOR : "");
         free(col_str);
      }
      printf("\n");
   } while (bWrapOccured);
   for (unsigned i = 0; i < nb_columns; i++)
      free(ldata[i]);
   free(ldata);
}

/* select_callback_sized */
// displays the columns within predetermined sizes, wrap-around if overflow;
static int select_callback_sized(void *vdt, int nb_columns, char **column_values, char **column_names) {
   DISPLAYED_TABLE *dt = (DISPLAYED_TABLE *) vdt;

   if (dt -> bEpilog) {
      printf("%ld rows selected\n", dt -> NR);
      cleanup(dt);
      return 0;
   }

   if (!dt -> bHeaderPrinted) {
      // header has not been printed yet, print it and afterwards print the first row;
      dt -> actual_col_widths = (unsigned *) malloc(sizeof(unsigned) * nb_columns);
      if (dt -> nb_columns < nb_columns) {
         unsigned last_width = dt -> max_col_widths[dt -> nb_columns - 1];
         fprintf(stderr, "warning: missing column sizes, extending the last provided one %d\n", last_width);
         dt -> max_col_widths = (unsigned *) realloc(dt -> max_col_widths, sizeof(unsigned) * nb_columns);
         dt -> col_overflow_action = (char *) realloc(dt -> col_overflow_action, sizeof(char) * nb_columns);
         char last_overflow_action = dt -> col_overflow_action[dt -> nb_columns - 1];
         for (unsigned i = dt -> nb_columns; i < nb_columns; i++) {
            dt -> max_col_widths[i] = last_width;
            dt -> col_overflow_action[i] = last_overflow_action;
         }
         dt -> nb_columns = nb_columns;
      }
      else if (dt -> nb_columns > nb_columns) {
         fprintf(stderr, "warning: too many columns widths given, %d vs actual %d, ignoring the %d in excess\n", dt -> nb_columns, nb_columns, dt -> nb_columns - nb_columns);
         dt -> nb_columns = nb_columns;
      }
      printConstrained(dt, column_names, nb_columns);

      char *header_line = NULL;
      for (unsigned i = 0; i < nb_columns; i++) {
         header_line = (char *) realloc(header_line, sizeof(char) * (dt -> actual_col_widths[i] + 1));
         fillStr(header_line, '-', dt -> actual_col_widths[i]);
         printf("%s%s", header_line, i < nb_columns - 1 ? dt -> COL_SEPARATOR : "");
      }
      printf("\n");
      free(header_line);
      dt -> bHeaderPrinted = 1;
   }
   printConstrained(dt, column_values, nb_columns);
   dt -> NR++;
   return 0;
}

/* select_callback_array */
/*
returns the database rows in the gawk associative array passed as parameter;
its structure is as follows:
array[0] = sub-array_0
array[1] = sub-array_1
...
array[count-1] = sub-array_count-1
where the sub-arrays are associative arrays too with structure:
sub-array0[col1] = value1
sub-array0[col2] = value2
...
sub-array0[coln] = valuen
sub-array1[col1] = value1
...
sub-array1[coln] = valuen
...
sub-arraym[col1] = value1
...
sub-arraym[coln] = valuen
Said otherwise, the returned array is an array of associative arrays whose first dimension contains the rows and second dimension contains the columns, 
i.e. it's a table of database rows;
in Perl lingo, it's an array of hashes;
in Python, it would be a list of dictionaries;
*/
static int select_callback_array(void *vdt, int nb_columns, char **column_values, char **column_names) {
   DISPLAYED_TABLE *dt = (DISPLAYED_TABLE *) vdt;

   if (dt -> bEpilog) {
      printf("%ld rows selected\n", dt -> NR);
      cleanup(dt);
      return 0;
   }

   awk_array_t row;
   awk_value_t value;
   awk_value_t row_index;
   awk_value_t col_index, col_value;

   if (!dt -> bHeaderPrinted) {
      // create the main array once;
      // doesn't work; keep the code in case a fix is found;
      //db_table = create_array();
      //value.val_type = AWK_ARRAY;
      //value.array_cookie = db_table; 
   
      // add it to gawk's symbol table so it appears magically in gawk's script namespace;
      //if (!sym_update(dt -> array_name, &value)) 
      //   fatal(ext_id, "in select_callback_array, creation of table array %s failed\n", dt -> array_name);
      //db_table = value.array_cookie;

      // nothing special to do here;
      dt -> bHeaderPrinted = 1;
   }
   char index_str[50];
   unsigned len = sprintf(index_str, "%ld", dt -> NR);
   make_const_string(index_str, len, &row_index);

   // create the sub-array for each row;
   // indexes are the column names and values are the column values;
   row = create_array();
   value.val_type = AWK_ARRAY;
   value.array_cookie = row;
   if (! set_array_element(dt -> gawk_array, &row_index, &value))
      fatal(ext_id, "in select_callback_array, creation of row array %ld failed\n", dt -> NR);
   row  = value.array_cookie;

   for (unsigned i = 0; i  < nb_columns; i++) {
      make_const_string(column_names[i], strlen(column_names[i]), &col_index);
      make_const_string(column_values[i], strlen(column_values[i]), &col_value);
      if (! set_array_element(row, &col_index, &col_value))
         fatal(ext_id, "in select_callback_array, assigned value %s to index %s at row %ld failed\n", column_values[i], column_names[i], dt -> NR);
   }

   dt -> NR++;
   return 0;
}

/* do_sqlite_select */
/*
generic select entry point;
possible invocations:
Case: call profile:                                          --> action;
   0: sqlite_select(db, sql_stmt)                            --> draft output, default fixed width columns, with needed column widths list at the end;
   1: sqlite_select(db, sql_stmt, "")                        --> raw output, no truncation, | as default separator;
   2: sqlite_select(db, sql_stmt, "separator-string")        --> raw output, no truncation, use given string as separator;
   2: sqlite_select(db, sql_stmt, "list-of-columns-widths")  --> fixed sized column output, a w|t|e suffix is allowed for wrapping-around or truncating too large columns without or with ellipsis;
   3: sqlite_select(db, sql_stmt, dummy, gawk_array)         --> raw output into the gawk associative array gawk_array;
the appropriate callback will be called based on the invocation's profile;
returns -1 if error, 0 otherwise;
*/
static awk_value_t *
do_sqlite_select(int nargs, awk_value_t *result, struct awk_ext_func *unused) {
   awk_value_t db_handle, sql_stmt, col_sizes;
   int ret = 0;

   assert(result != NULL);

   if (!get_argument(0, AWK_NUMBER, &db_handle)) {
      fprintf(stderr, "in do_sqlite_select, cannot get the db handle argument\n");
      ret = -1;
      goto quit;
   }
   if (!get_argument(1, AWK_STRING, &sql_stmt)) {
      fprintf(stderr, "do_sqlite_select, cannot get the sql_stmt argument\n");
      ret = -1;
      goto quit;
   }
   DISPLAYED_TABLE dt;
   dt.sqlStmt = strdup(sql_stmt.str_value.str);
   dt.bHeaderPrinted = 0;
   dt.COL_SEPARATOR = "  ";
   dt.ELLIPSIS = "..."; dt.len_ellipsis = strlen(dt.ELLIPSIS); 
   dt.MAX_WIDTH = 15;
   dt.MIN_WIDTH = dt.len_ellipsis + 5;
   dt.nb_columns = 0;
   dt.max_col_widths = NULL;
   dt.actual_col_widths = NULL;
   dt.col_overflow_action = NULL;
   dt.headers = NULL;
   dt.widest_column = 0;
   dt.size_list = NULL;
   dt.NR = 0;
   dt.bStoreOrDisplay = 1;
   dt.gawk_array = NULL;
   dt.bEpilog = 0;

   unsigned short bCase;
   unsigned short bFoundSeparator = 0;
   char *errorMessg = NULL;

   if (4 == nargs) {
      bCase = 3;
      awk_value_t value;
      if (!get_argument(3, AWK_ARRAY, &value))
         fatal(ext_id, "in do_sqlite_select, accessing the gawk array parameter failed\n");
      dt.gawk_array = value.array_cookie;
      clear_array(dt.gawk_array);
   }
   else if (get_argument(2, AWK_STRING, &col_sizes)) {
      if (0 == strlen(col_sizes.str_value.str))
         // raw, unformatted output;
         bCase = 1;
      else {
         // columns are output with constrained widths and possible wrapping-around or truncation with/without ellipsis;
         bCase = 2;
         char *width_str, *tmp_str, *next_tok_iter;
         long width_value;
         tmp_str = strdup(col_sizes.str_value.str);
         next_tok_iter = tmp_str;
         while ((width_str = strtok(next_tok_iter, " ,/"))) {
            errno = 0;
            char *overflow_action_suffix;
            width_value = strtol(width_str, &overflow_action_suffix, 10);
            if ((errno == ERANGE && (width_value == LONG_MAX || width_value == LONG_MIN)) ||
                (errno != 0 && width_value == 0) ||
                (width_value < 0)) {
               if (0 == dt.nb_columns) {
                  // let's take this as a separator for select_callback_raw();
                  dt.COL_SEPARATOR = strdup(width_str);
                  bFoundSeparator = 1;
                  bCase = 0;
               }
               else {
                  fprintf(stderr, "invalid number in size string [%s], exiting ...\n", width_str);
                  if (dt.nb_columns > 0) {
                     free(dt.max_col_widths);
                     free(dt.col_overflow_action);
                  }
                  free(tmp_str);
                  ret = -1;
                  goto quit;
               }
            }
            else if (bFoundSeparator) {
               // nothing else is accepted after a separator;
               fprintf(stderr, "separator [%s] must be the only parameter in raw output, exiting ...\n", dt.COL_SEPARATOR);
               free(tmp_str);
               ret = -1;
               goto quit;
            }
            dt.max_col_widths = (unsigned *) realloc(dt.max_col_widths, sizeof(unsigned) * (dt.nb_columns + 1));
            dt.col_overflow_action = (char *) realloc(dt.col_overflow_action, sizeof(char) * (dt.nb_columns + 1));
            dt.max_col_widths[dt.nb_columns] = width_value;
            if (NULL == overflow_action_suffix || ! *overflow_action_suffix)
               dt.col_overflow_action[dt.nb_columns] = 'e';
            else if ('t' == *overflow_action_suffix || 'w' == *overflow_action_suffix || 'e' == *overflow_action_suffix)
               dt.col_overflow_action[dt.nb_columns] = *overflow_action_suffix;
            else if (0 == dt.nb_columns) {
               bCase = 0;
               dt.COL_SEPARATOR = strdup(width_str);
               bFoundSeparator = 1;
               dt.nb_columns++;
               break;      
            }
            else {
               // allowed overflow suffix is one of t, w or e;
               fprintf(stderr, "invalid overflow action suffix [%c]; it must be one of w (wrap-around), t (truncation without ellipsis) or e (truncation with ellipsis), exiting ...\n", *overflow_action_suffix);
               free(tmp_str);
               ret = -1;
               goto quit;
            }
            if ('e' == dt.col_overflow_action[dt.nb_columns] && width_value < dt.len_ellipsis) {
               fprintf(stderr, "column [%d] has maximum width [%ld] and requests a truncation with ellipsis [%s] but a minimum width of [%d] characters is necessary for this, assuming that minimum width\n", dt.nb_columns, width_value, dt.ELLIPSIS, dt.len_ellipsis);
               dt.max_col_widths[dt.nb_columns] = dt.len_ellipsis;
            }
            dt.nb_columns++;
            next_tok_iter = NULL;
         }
         free(tmp_str);
      }
   }
   else
      // draft output, i.e. default column width, possible truncation, optimal column widths listed at the end;
      bCase = 0;

   switch (bCase) {
      case 0: ret = sqlite3_exec(sqlite_handles[(int) db_handle.num_value], sql_stmt.str_value.str, select_callback_draft, &dt, &errorMessg);
              break;
      case 1: if (!bFoundSeparator)
                 // use default separator
                 dt.COL_SEPARATOR = "|";
              ret = sqlite3_exec(sqlite_handles[(int) db_handle.num_value], sql_stmt.str_value.str, select_callback_raw, &dt, &errorMessg);
              break;
      case 2: ret = sqlite3_exec(sqlite_handles[(int) db_handle.num_value], sql_stmt.str_value.str, select_callback_sized, &dt, &errorMessg);
              break;
      case 3: ret = sqlite3_exec(sqlite_handles[(int) db_handle.num_value], sql_stmt.str_value.str, select_callback_array, &dt, &errorMessg);
              break;
      default: fprintf(stderr, "programming error: did you forget a case?\n");
   }
   if (SQLITE_OK == ret) {
      dt.bEpilog = 1;
      0 == bCase ? select_callback_draft(&dt, 0, NULL, NULL) :
      1 == bCase ? select_callback_raw(&dt, 0, NULL, NULL) :
      2 == bCase ? select_callback_sized(&dt, 0, NULL, NULL) :
      3 == bCase ? select_callback_array(&dt, 0, NULL, NULL) :
      0;
   }
   else {
      fprintf(stderr, "do_sqlite_select, SQL error %s while executing [%s]\n", errorMessg, sql_stmt.str_value.str);
      sqlite3_free(errorMessg);
   }
quit:
   return make_number(ret, result);
}

/* these are the exported functions along with their min and max arities; */
   static awk_ext_func_t func_table[] = {
        {"sqlite_open", do_sqlite_open, 1, 1, awk_false, NULL},
        {"sqlite_close", do_sqlite_close, 1, 1, awk_false, NULL},
        {"sqlite_exec", do_sqlite_exec, 6, 2, awk_false, NULL},
        {"sqlite_select", do_sqlite_select, 4, 2, awk_false, NULL},
};

static awk_bool_t (*init_func)(void) = init_sqlite_handles;

/* define the dl_load function using the boilerplate macro */

dl_load_func(func_table, sqlite_gawk, "")

Quite the extension! Sorry for this lengthy listing, but there is a lot of stuff going on here.
Next, let’s build gawk and the new extension. Here are the incantations:

pwd
/home/dmadmin/dmgawk/gawk-4.2.1/extension
./configure
make
cd .libs; gcc -o sqlite_gawk.so -shared sqlite_gawk.o ../sqlite3.o -pthread

That’s it. As said elsewhere, an additional sudo make install will install the new gawk and its extension to their canonical locations, i.e. /usr/local/bin/gawk for gawk and /usr/local/lib/gawk for the extensions. But for the moment, let’s test it; for this, we still need a test gawk script.
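
For reference, that installation step might look roughly like the sketch below (presumably run from the top-level gawk-4.2.1 directory; it is not needed for the test that follows):

cd ~/dmgawk/gawk-4.2.1
sudo make install
# gawk should now be /usr/local/bin/gawk and the stock extensions should land under /usr/local/lib/gawk;
# the hand-linked sqlite_gawk.so may have to be copied there manually:
sudo cp extension/.libs/sqlite_gawk.so /usr/local/lib/gawk

Now, on to the test script: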
vi tsqlite.awk

# test program for the sqlite_gawk, interface to sqlite3;
# Cesare Cervini
# dbi-services.com
# 8/2018

@load "sqlite_gawk"

BEGIN {
   my_db = sqlite_open("/home/dmadmin/sqlite-amalgamation-3240000/test.db")
   print "db opened:", my_db

   my_db2 = sqlite_open("/home/dmadmin/sqlite-amalgamation-3240000/test.db")
   print "db opened:", my_db2

   sqlite_close(my_db)
   sqlite_close(my_db2)

   my_db = sqlite_open("/home/dmadmin/sqlite-amalgamation-3240000/test.db")
   print "db opened:", my_db

   printf "\n"

   rc = sqlite_exec(my_db, "CREATE TABLE IF NOT EXISTS test1(n1 NUMBER, s1 TEXT, s2 CHAR(100))")
   print "return code = ", rc

   rc = sqlite_exec(my_db, "INSERT INTO test1(n1, s1, s2) VALUES(100, \"hello1\", \"hello0101\")")
   print "return code = ", rc
   rc = sqlite_exec(my_db, "INSERT INTO test1(n1, s1, s2) VALUES(200, \"hello2\", \"hello0102\")")
   print "return code = ", rc
   rc = sqlite_exec(my_db, "INSERT INTO test1(n1, s1, s2) VALUES(300, \"hello3\", \"hello0103\")")
   print "return code = ", rc
   rc = sqlite_exec(my_db, "INSERT INTO test1(n1, s1, s2) VALUES(400, \"hello4\", \"hello0104\")")
   print "return code = ", rc
   rc = sqlite_exec(my_db, "INSERT INTO test1(n1, s1, s2) VALUES(400, \"hello5 with spaces       \", \"hello0105 with spaces              \")")
   print "return code = ", rc
   rc = sqlite_exec(my_db, "INSERT INTO test1(n1, s1, s2) VALUES(400, \"hello6 with spaces        \", \"hello0106   \")")
   print "return code = ", rc
   printf "\n"

   stmt = "SELECT * FROM test1";
   split("", a_test)
   print "sqlite_select(my_db, " stmt ", 0, a_test)"
   rc = sqlite_select(my_db, stmt, 0, a_test)
   dumparray("a_test", a_test);
   for (row in a_test) {
      printf("row %d: ", row)
      for (col in a_test[row])
         printf("  %s = %s", col, a_test[row][col])
      printf "\n"
   }
   printf "\n"

   # print in draft format;
   stmt = "SELECT name FROM sqlite_master WHERE type IN ('table','view') AND name NOT LIKE 'sqlite_%' ORDER BY 1"
   print "sqlite_select(my_db, \"" stmt "\")"
   rc = sqlite_select(my_db, stmt)
   print "return code = ", rc
   printf "\n"

   # print in draft format;
   stmt = "SELECT sql FROM sqlite_master ORDER BY tbl_name, type DESC, name"
   print "sqlite_select(my_db, \"" stmt "\", \"100\")"
   rc = sqlite_select(my_db, stmt , "100")
   print "return code = ", rc
   printf "\n"

   # print in draft format;
   stmt = "SELECT * FROM test1"
   print "sqlite_select(my_db, " stmt ")"
   rc = sqlite_select(my_db, stmt)
   print "return code = ", rc
   printf "\n"

   # print in raw format with non default separator;
   stmt = "SELECT * FROM test1"
   print "sqlite_select(my_db, " stmt ", \"||\")"
   rc = sqlite_select(my_db, stmt, "||")
   print "return code = ", rc
   printf "\n"

   # now that we know the needed column widths, let's use them;
   # trailing spaces are removed to compact the column somewhat;
   stmt = "SELECT * FROM test1" 
   print "sqlite_select(my_db, " stmt ", \"3 18 21\")"
   rc = sqlite_select(my_db, stmt, "3 18 21")
   print "return code = ", rc
   printf "\n"

   # print in raw format, with default | separator;
   stmt = "SELECT * FROM test1"
   print "sqlite_select(my_db, " stmt ", \"\")"
   rc = sqlite_select(my_db, stmt, "")
   print "return code = ", rc
   printf "\n"

   stmt = "INSERT INTO test1(n1, s1, s2) VALUES(400, \"hello6-with-spaces        \", \"hello0106-12345\")" 
   print "sqlite_exec(my_db, " stmt ")"
   rc = sqlite_exec(my_db, stmt)
   print "return code = ", rc
   printf "\n"

   stmt = "SELECT * FROM test1"
   print "sqlite_select(my_db, " stmt ", \"2e 15e 10w\")"
   rc = sqlite_select(my_db, stmt, "2e 15e 10w")
   print "return code = ", rc
   printf "\n"

   stmt = "SELECT count(*) FROM test1"
   print "sqlite_select(my_db," stmt ")"
   rc = sqlite_select(my_db, stmt)
   print "return code = ", rc
   printf "\n"

   stmt = "DELETE FROM test1"
   print "sqlite_exec(my_db, " stmt ")"
   rc = sqlite_exec(my_db, stmt)
   print "return code = ", rc
   printf "\n"

   stmt = "SELECT count(*) FROM test1"
   print "sqlite_select(my_db," stmt ")"
   rc = sqlite_select(my_db, stmt)
   print "return code = ", rc
   printf "\n"

   rc = sqlite_exec(my_db, "CREATE TABLE IF NOT EXISTS test_with_blob(n1 NUMBER, my_blob BLOB)")
   print "return code = ", rc

   rc = sqlite_exec(my_db, "DELETE FROM test_with_blob")
   print "return code = ", rc

   stmt = "INSERT INTO test_with_blob(n1, my_blob) VALUES(1, readfile(\"gawk-4.2.1.tar.gz\"))" 
   print "sqlite_exec(my_db," stmt ")"
   #rc = sqlite_exec(my_db, stmt)
   print "return code = ", rc
   printf "\n"

   
   stmt = "SELECT n1, writefile('yy' || rowid, my_blob) FROM test_with_blob" 
   print "sqlite_select(my_db, " stmt ")"
   rc = sqlite_exec(my_db, stmt)
   print "return code = ", rc
   printf "\n"

   # file too large, > 3 Gb, fails silently;
   # so don't do it;
   # stmt = "INSERT INTO test_with_blob(n1, my_blob) VALUES(1000, readfile(\"/home/dmadmin/setup_files/documentum.tar\"))" 

   # this one is OK at 68 Mb;
   stmt = "INSERT INTO test_with_blob(n1, my_blob) VALUES(1000, readfile(\"/home/dmadmin/setup_files/instantclient-basic-linux.x64-12.2.0.1.0.zip\"))" 
   print "sqlite_exec(my_db," stmt ")"
   rc = sqlite_exec(my_db, stmt)
   print "return code = ", rc
   printf "\n"

   stmt = "SELECT n1, writefile('\"yy' || rowid || '\"', my_blob) FROM test_with_blob where n1 = 1000" 
   print "sqlite_select(my_db, " stmt ")"
   #rc = sqlite_exec(my_db, stmt)
   print "return code = ", rc
   printf "\n"

   stmt = "INSERT INTO test_with_blob(n1, my_blob) VALUES(5000, readfile('/home/dmadmin/dmgawk/gawk-4.2.1/extension/sqlite_gawk.c'))" 
   print "sqlite_exec(my_db," stmt ")"
   rc = sqlite_exec(my_db, stmt)
   print "return code = ", rc
   printf "\n"

   stmt = "UPDATE test_with_blob set my_blob = readfile('/home/dmadmin/dmgawk/gawk-4.2.1/extension/sqlite_gawk.c') where n1 = 1000" 
   print "sqlite_exec(my_db," stmt ")"
   rc = sqlite_exec(my_db, stmt)
   print "return code = ", rc
   printf "\n"

   # xx is a 999'000'000 bytes file; the import using a memory buffer with that size takes some time to complete;
   # the incremental blob I/Os below seem faster;
   # to make one, use: dd if=/dev/zero of=xx count=990 bs=1000000
   stmt = "UPDATE test_with_blob set my_blob = readfile('/home/dmadmin/dmgawk/xx') where n1 = 1000" 
   print "sqlite_exec(my_db," stmt ")"
   rc = sqlite_exec(my_db, stmt)
   print "return code = ", rc
   printf "\n"

   # this is needed to enforce typing of a_test as an array;
   # split("", a_test)
   delete(a_test)
   print "sqlite_select(db, \"select rowid from test_with_blob where n1 = 1000 limit 1\", 0, a_test)"
   sqlite_select(db, "select rowid from test_with_blob where n1 = 1000 limit 1", 0, a_test)
   print "after getting blob"
   dumparray("a_test", a_test)
   print "sqlite_exec(my_db, 'main', 'test_with_blob', 'my_blob', " a_test[0]["rowid"] ", writefile(~/dmgawk/my_blob_" a_test[0]["rowid"] "))"
   rc = sqlite_exec(my_db, "main", "test_with_blob", "my_blob", a_test[0]["rowid"], "writefile(/home/dmadmin/dmgawk/my_blob_" a_test[0]["rowid"] ")")
   print "return code = ", rc
   printf "\n"

   #print "sqlite_exec(my_db, 'main', 'test_with_blob', 'my_blob', " a_test[0]["rowid"] ", readfile(/home/dmadmin/setup_files/documentum.tar))"
   #rc = sqlite_exec(my_db, "main", "test_with_blob", "my_blob", a_test[0]["rowid"], "readfile(/home/dmadmin/setup_files/documentum.tar)")
   #rc = sqlite_exec(my_db, "main", "test_with_blob", "my_blob", a_test[0]["rowid"], "readfile(/home/dmadmin/setup_files/patch.bin)")
   rc = sqlite_exec(my_db, "main", "test_with_blob", "my_blob", a_test[0]["rowid"], "readfile(/home/dmadmin/dmgawk/xx)")
   print "return code = ", rc
   printf "\n"

   stmt = "SELECT n1, hex(my_blob) FROM test_with_blob where n1 = 2000 limit 1" 
   stmt = "SELECT n1, my_blob FROM test_with_blob where n1 = 2000 limit 1" 
   stmt = "SELECT n1, substr(my_blob, 1) FROM test_with_blob where n1 = 2000 limit 1" 
   rc = sqlite_select(my_db, stmt)
   rc = sqlite_select(my_db, stmt, "10 100w")
   print "return code = ", rc
   printf "\n"

   stmt = "SELECT n1, replace(my_blob, '\n', '\\n') as 'noLF' FROM test_with_blob where n1 = 5000 limit 2" 
   print "sqlite_select(my_db," stmt ", 10, 100w)"
   rc = sqlite_select(my_db, stmt, "10, 100w")
   print "return code = ", rc
   printf "\n"

   sqlite_close(my_db)

   exit(0)
}

function dumparray(name, array, i) {
   for (i in array)
      if (isarray(array[i]))
         dumparray(name "[\"" i "\"]", array[i])
      else
         printf("%s[\"%s\"] = %s\n", name, i, array[i])
      }

To execute the test:

AWKLIBPATH=gawk-4.2.1/extension/.libs gawk-4.2.1/gawk -f tsqlite.awk
db opened: 0
db opened: 0
db opened: 0
 
return code = 0
return code = 0
return code = 0
return code = 0
return code = 0
return code = 0
 
sqlite_select(my_db, SELECT * FROM test1, 0, a_test)
6 rows selected
a_test["0"]["n1"] = 100
a_test["0"]["s1"] = hello1
a_test["0"]["s2"] = hello0101
a_test["1"]["n1"] = 200
a_test["1"]["s1"] = hello2
a_test["1"]["s2"] = hello0102
a_test["2"]["n1"] = 300
a_test["2"]["s1"] = hello3
a_test["2"]["s2"] = hello0103
a_test["3"]["n1"] = 400
a_test["3"]["s1"] = hello4
a_test["3"]["s2"] = hello0104
a_test["4"]["n1"] = 400
a_test["4"]["s1"] = hello5 with spaces
a_test["4"]["s2"] = hello0105 with spaces
a_test["5"]["n1"] = 400
a_test["5"]["s1"] = hello6 with spaces
a_test["5"]["s2"] = hello0106
row 0: n1 = 100 s1 = hello1 s2 = hello0101
row 1: n1 = 200 s1 = hello2 s2 = hello0102
row 2: n1 = 300 s1 = hello3 s2 = hello0103
row 3: n1 = 400 s1 = hello4 s2 = hello0104
row 4: n1 = 400 s1 = hello5 with spaces s2 = hello0105 with spaces
row 5: n1 = 400 s1 = hello6 with spaces s2 = hello0106
 
sqlite_select(my_db, "SELECT name FROM sqlite_master WHERE type IN ('table','view') AND name NOT LIKE 'sqlite_%' ORDER BY 1")
name
--------
test
test1
test_...
3 rows selected
1 columns displayed
 
Optimum column widths
=====================
for query: SELECT name FROM sqlite_master WHERE type IN ('table','view') AND name NOT LIKE 'sqlite_%' ORDER BY 1
name 14
return code = 0
 
sqlite_select(my_db, "SELECT sql FROM sqlite_master ORDER BY tbl_name, type DESC, name", "100")
sql
----------------------------------------------------------------------------------------------------
CREATE TABLE test(a1 number)
CREATE TABLE test1(n1 NUMBER, s1 TEXT, s2 CHAR(100))
CREATE TABLE test_with_blob(n1 NUMBER, my_blob BLOB)
3 rows selected
return code = 0
 
sqlite_select(my_db, SELECT * FROM test1)
n1 s1 s2
-------- -------- --------
100 hello1 hello...
200 hello2 hello...
300 hello3 hello...
400 hello4 hello...
400 hello... hello...
400 hello... hello...
6 rows selected
3 columns displayed
 
Optimum column widths
=====================
for query: SELECT * FROM test1
n1 3
s1 18
s2 21
return code = 0
 
sqlite_select(my_db, SELECT * FROM test1, "||")
n1 ||s1 ||s2
--------||--------||--------
100 ||hello1 ||hello...
200 ||hello2 ||hello...
300 ||hello3 ||hello...
400 ||hello4 ||hello...
400 ||hello...||hello...
400 ||hello...||hello...
6 rows selected
3 columns displayed
 
Optimum column widths
=====================
for query: SELECT * FROM test1
n1 3
s1 18
s2 21
return code = 0
 
sqlite_select(my_db, SELECT * FROM test1, "3 18 21")
n1 s1 s2
--- ------------------ ---------------------
100 hello1 hello0101
200 hello2 hello0102
300 hello3 hello0103
400 hello4 hello0104
400 hello5 with spaces hello0105 with spaces
400 hello6 with spaces hello0106
6 rows selected
return code = 0
 
sqlite_select(my_db, SELECT * FROM test1, "")
n1|s1|s2
100|hello1|hello0101
200|hello2|hello0102
300|hello3|hello0103
400|hello4|hello0104
400|hello5 with spaces |hello0105 with spaces
400|hello6 with spaces |hello0106
6 rows selected
return code = 0
 
sqlite_exec(my_db, INSERT INTO test1(n1, s1, s2) VALUES(400, "hello6-with-spaces ", "hello0106-12345"))
return code = 0
 
sqlite_select(my_db, SELECT * FROM test1, "2e 15e 10w")
column [0] has maximum width [2] and requests a truncation with ellipsis [...] but a minimum width of [3] characters is necessary for this, assuming that minimum width
n1 s1 s2
--- --------------- ----------
100 hello1 hello0101
200 hello2 hello0102
300 hello3 hello0103
400 hello4 hello0104
400 hello5 with ... hello0105
with space
s
400 hello6 with ... hello0106
400 hello6-with-... hello0106-
12345
7 rows selected
return code = 0
 
sqlite_select(my_db,SELECT count(*) FROM test1)
count(*)
--------
7
1 rows selected
1 columns displayed
 
Optimum column widths
=====================
for query: SELECT count(*) FROM test1
count(*) 8
return code = 0
 
sqlite_exec(my_db, DELETE FROM test1)
return code = 0
 
sqlite_select(my_db,SELECT count(*) FROM test1)
count(*)
--------
0
1 rows selected
1 columns displayed
 
Optimum column widths
=====================
for query: SELECT count(*) FROM test1
count(*) 8
return code = 0
 
return code = 0
return code = 0
sqlite_exec(my_db,INSERT INTO test_with_blob(n1, my_blob) VALUES(1, readfile("gawk-4.2.1.tar.gz")))
return code = 0
 
sqlite_select(my_db, SELECT n1, writefile('yy' || rowid, my_blob) FROM test_with_blob)
return code = 0
 
sqlite_exec(my_db,INSERT INTO test_with_blob(n1, my_blob) VALUES(1000, readfile("/home/dmadmin/setup_files/instantclient-basic-linux.x64-12.2.0.1.0.zip")))
return code = 0
 
sqlite_select(my_db, SELECT n1, writefile('"yy' || rowid || '"', my_blob) FROM test_with_blob where n1 = 1000)
return code = 0
 
sqlite_exec(my_db,INSERT INTO test_with_blob(n1, my_blob) VALUES(5000, readfile('/home/dmadmin/dmgawk/gawk-4.2.1/extension/sqlite_gawk.c')))
return code = 0
 
sqlite_exec(my_db,UPDATE test_with_blob set my_blob = readfile('/home/dmadmin/dmgawk/gawk-4.2.1/extension/sqlite_gawk.c') where n1 = 1000)
return code = 0
 
sqlite_exec(my_db,UPDATE test_with_blob set my_blob = readfile('/home/dmadmin/dmgawk/xx') where n1 = 1000)
return code = 0
 
sqlite_select(db, "select rowid from test_with_blob where n1 = 1000 limit 1", 0, a_test)
1 rows selected
after getting blob
a_test["0"]["rowid"] = 1
sqlite_exec(my_db, 'main', 'test_with_blob', 'my_blob', 1, writefile(~/dmgawk/my_blob_1))
return code = 0
 
sqlite_exec(my_db, 'main', 'test_with_blob', 'my_blob', 1, readfile(/home/dmadmin/setup_files/documentum.tar))
return code = 0
 
sqlite_select(my_db)
return code = 0
 
sqlite_select(my_db,SELECT n1, replace(my_blob, '\n', '\n') as 'noLF' FROM test_with_blob where n1 = 5000 limit 2, 10, 100w)
n1 noLF
---------- ----------------------------------------------------------------------------------------------------
5000 /*\n * sqlite-gawk.c - an interface to sqlite() library;\n * Cesare Cervini\n * dbi-services.com\n *
8/2018\n*/\n#ifdef HAVE_CONFIG_H\n#include \n#endif\n\n#include \n#include \n#include \n#include \n#include \n\n#include \n#inc
lude \n\n#include "gawkapi.h"\n\n// extension;\n#include \n#include \n#
include \n#include \n#include \n#include \n#inclu
de \n#include \n\n#include "gettext.h"\n#define _(msgid) gettext(msgid)\n#defi
ne N_(msgid) msgid\n\nstatic const gawk_api_t *api; /* for convenience macros to work */\nstatic a
wk_ext_id_t ext_id;\nstatic const char *ext_version = "an interface to sqlite3: version 1.0";\n\nint
plugin_is_GPL_compatible;\n\n/* internal structure and variables */\n/*\ninternally stores the db h
...
c)(void) = init_sqlite_handles;\n\n/* define the dl_load function using the boilerplate macro */\n\n
dl_load_func(func_table, sqlite_gawk, "")\n\n
1 rows selected
return code = 0

That was a very long second part. If you are still there, please turn now to Part III for some explanation of all this.

 

This article A SQLite extension for gawk (part II) appeared first on Blog dbi services.

An SQLite extension for gawk (part I)

Yann Neuhaus - Fri, 2018-09-28 17:20

Quick: what is the most used database management system on our planet ? Oracle ? Wrong. SQL server ? Wrong again ! MySQL ? You’re almost there. It’s SQLite. Surprised ? I must confess that I was too. Actually, SQLite is special in that it is not the traditional 2-tiers client/server but one-tier and embedded, which means that it works as a library linked to an application. As such, it is used to fulfill the database needs of browsers, portable devices such as the iPods, iPhones, Android, etc… (see a short list of famous users here). Look also here for a succinct intro and here for a list of features. Here you’ll find distinctive features of SQLite and here common uses for it.
Let’s be clear from the start: although light, this is no toy software but a solid, time-proven, globally used, rigorously tested open-source product.
So, what the relation with gawk ? Well, none. Until now. As you may know, gawk has had for some time now an easy way to be extended with useful libraries. I already talked about this in my previous blogs, e.g. here. In particular, it has also a binding for PostgreSQL (see here). So, I told to myself, wouldn’t it be nice to make one for Oracle too ? Or for some key-value xDBMs such as Berkeley DB ? But fate decided otherwise: I already used SQLite in the past as an almost no-installation SQL database and during the present research I landed fortuitously on SQLite’s web pages. I was immediately hooked on. SQLite is petite (500 KiB in one unique source file for the whole RDBMS library and about 1 Mb for the shared library object file, unbelievable !), easy to learn and use, and open-source. Actually, the SQLite creators propose such an unorthodox license agreement that I feel compelled to list it here:

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.

So inspirational and quite a departure from the traditional, indigestible EULAs ! There is enough here to push the religion business to bankruptcy. And the lawyers. And make our sorry planet a paradise again.
In effect, the only data structure known to gawk is the associative array, or hashes in perl parlance, or dictionaries for pythonists. And they are entirely held in memory. I thought it would be a nice addition to gawk to be able to work with on-disk tables for those cases where huge amounts of textual data have to be processed in random order.
As this article is rather large, I’ve split it into 3 parts. Part I, the one you’re reading now, presents the requirements and proposes an API. Part II lists the extension code and shows how to compile and use it in gawk. Part III comments the code and describes a stress test for the interface and SQLite from within a gawk script. So, let’s see how I hammered that screw !

The Objectives

At the very minimum, the gawk interface to SQLite (for short sqlite_gawk henceforth) shall be able to open a database file (a SQLite database is entirely self-contained in a single file, I told you it’s light), send DML/DDL statements to it and run SELECT queries against it. The SELECT statement shall be able to present the result in a nice and flexible way to avoid messing up the screen with wrapped-around, illegible lines, i.e. the columns’ width shall be individually configurable and some truncation, with or without an ellipsis, or wrap-around be possible within those limits.
Also, as SQLite supports blobs, sqlite_gawk should be able to deal with them, i.e. insert, retrieve and display them if they are textual. SQLite comes with an interactive shell functionally akin to Oracle's sqlplus and named sqlite3 which extends the collection of SQL functions (yes, SQLite allows that too). The SQL functions the shell adds are readfile() and writefile(), which respectively read the content of a file into a blob and dump a blob into a file. We definitely want that functionality in the interface too.
Another requirement is that, along with displaying the retrieved rows on the screen, sqlite_select shall be able to store them into an integer-indexed gawk array of associative arrays. The goal is to permit further in-memory processing from within gawk.

The API

All the above requirements converge into the following API:

int sqlite_open(db_filename)
-- opens the database file db_filename;
-- in case of success, returns a non-negative integer which is a handle to the db;
-- returns -1 and sends to stderr an error message from SQLite if an error occurred;

int sqlite_close(db_handle)
-- closes the database with the handle db_handle;
-- returns 0 on success and -1 plus an error message from SQLite to stderr in case of error;

int sqlite_exec(db_handle, sql_stmt)
-- sends the DML/DDL SQL statement sql_stmt to the database with db_handle handle;
-- returns -1 plus an error message from SQLite to stderr in case of error, a non-negative integer from SQLite on success;

int sqlite_select(db_handle, select_stmt [,"" | separator_string | list-of-columns-formats | , dummy, gawk_array])
-- sends the SELECT select_stmt statement with or without formatting options;
-- the output is either displayed or sent into a gawk array of associative arrays;
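
To make the calling convention concrete, here is a minimal gawk skeleton that uses only the functions above. It is a sketch: it assumes the shared library is named sqlite_gawk, is found through AWKLIBPATH and is pulled in with gawk's @load directive, that the open function is exposed to awk as sqlite_open (consistently with the other three names), and that the table is just an example:

@load "sqlite_gawk"

BEGIN {
   # open (or create) the database file; a negative value means failure;
   db = sqlite_open("my_test.db")
   if (db < 0)
      exit(1)

   # DDL and DML go through sqlite_exec();
   if (sqlite_exec(db, "CREATE TABLE IF NOT EXISTS t(n NUMBER, s TEXT)") < 0)
      exit(2)
   sqlite_exec(db, "INSERT INTO t(n, s) VALUES(1, 'hello')")

   # quick look at the data with the default fixed-width output;
   sqlite_select(db, "SELECT * FROM t")

   sqlite_close(db)
}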

sqlite_exec takes INSERT, DELETE, UPDATE, etc., i.e. all SQL statements other than SELECT. In addition, as said above, INSERT, UPDATE and SELECT also accept the SQL extensions readfile() and writefile() for blob I/Os from/to file. Here are a few examples of usage from gawk:

rc = sqlite_exec(my_db, "INSERT INTO test_with_blob(n1, my_blob) VALUES(1000, readfile('/home/dmadmin/setup_files/instantclient-basic-linux.x64-12.2.0.1.0.zip'))"

rc = sqlite_select(my_db, "SELECT n1, writefile('blob_2000.dmp', my_blob) FROM test_with_blob where n1 = 2000 limit 1")

rc = sqlite_exec(my_db, "UPDATE test_with_blob set my_blob = readfile('/home/dmadmin/dmgawk/gawk-4.2.1/extension/sqlite_gawk.c') where n1 = 1000")
sqlite_select()

As expected from a function that must produce human-readable output, this is the most feature-rich, and complex, function of the interface:

int sqlite_select(db_handle, select_stmt [, "" | , "col_separator_string" | , "list-of-columns-formats" | , dummy, gawk_array])

This compact syntax can be split into the following five variants:

int sqlite_select(db_handle, select_stmt)
int sqlite_select(db_handle, select_stmt, "")
int sqlite_select(db_handle, select_stmt, "col_separator_string")
int sqlite_select(db_handle, select_stmt, "list-of-columns-formats")
int sqlite_select(db_handle, select_stmt, dummy, gawk_array)

The function sqlite_select is overloaded and takes from 2 to 4 parameters; when the arities are identical (variants 2 to 4), the format of the 3rd parameter makes the difference. Let’s see what service they provide.

int sqlite_select(db_handle, select_stmt)

The first variant outputs its result as a table with fixed column widths; those widths range from 8 to 15 characters. Its purpose is to give a quick overview of a query's result without messing up the screen with wrapped-around, illegible lines. Here is an example with fake data:

sqlite_select(my_db, 'SELECT * FROM test1')
n1 s1 s2
-------- -------- --------
100 hello1 hello...
200 hello2 hello...
300 hello3 hello...
400 hello4 hello...
400 hello... hello...
400 hello... hello...
6 rows selected
3 columns displayed

We can see that values too large for their column are truncated and an ellipsis string (…) is appended to them to show this fact.
This quick overview terminates with a table of optimum column widths so that, if used later, screen width permitting, the columns can be displayed entirely without truncation.

Optimum column widths
=====================
for query: SELECT * FROM test1
n1 3
s1 18
s2 21

This variant is simple to use and handy for having a quick peek at the data and their display needs.
The 4th sqlite_select() variant lets us provide column formats. Here is an example of how to do that with the previous optimum column widths:

sqlite_select(my_db, 'SELECT * FROM test1', "3 18 21")
n1 s1 s2
--- ------------------ ---------------------
100 hello1 hello0101
200 hello2 hello0102
300 hello3 hello0103
400 hello4 hello0104
400 hello5 with spaces hello0105 with spaces
400 hello6 with spaces hello0106
6 rows selected

The columns are now displayed without truncation.

int sqlite_select(db_handle, select_stmt, "")

The second variant takes an empty string as the 3rd parameter, which means "use | as a column separator". Here is an example of output:

sqlite_select(my_db, 'SELECT * FROM test1', "")
n1|s1|s2
100|hello1|hello0101
200|hello2|hello0102
300|hello3|hello0103
400|hello4|hello0104
400|hello5 with spaces |hello0105 with spaces
400|hello6 with spaces |hello0106
6 rows selected

If such a default character is not appropriate, a more suitable string can be provided, which is the purpose of sqlite_select()'s 3rd variant, e.g. '||' as shown below:

sqlite_select(my_db, 'SELECT * FROM test1', "||")
n1 ||s1 ||s2
--------||--------||--------
100 ||hello1 ||hello...
200 ||hello2 ||hello...
300 ||hello3 ||hello...
400 ||hello4 ||hello...
400 ||hello...||hello...
400 ||hello...||hello...
6 rows selected
3 columns displayed

Since this format is easy to parse, it is handy for exporting the data into a file and subsequently importing them into a spreadsheet program.

int sqlite_select(db_handle, select_stmt, "list-of-columns-formats")

We’ve already seen the function’s 4th variant but there is more to the column formats.
The parameter “list-of-columns-formats” is a comma- or space-separated ordered list of numeric values, the column widths, one number for each column in the SELECT clause of that statement. If they are too many values, the superfluous ones are ignored. If they are fewer, the last one is extended to cover for the missing values.
They can also end with one of t, e or w characters where t stands for t(runcation), e stands for e(llipsis) suffix if truncation and w stands for w(rap-around).
If an ellipsis is requested, the minimal width is 3 characters, to accommodate the suffix itself.
Here is an example of invocation and output:

sqlite_select(my_db, 'SELECT * FROM test1', "2e 15e 10w")
n1 s1 s2
--- --------------- ----------
100 hello1 hello0101
200 hello2 hello0102
300 hello3 hello0103
400 hello4 hello0104
400 hello5 with ... hello0105
with space
s
400 hello6 with ... hello0106
400 hello6-with-... hello0106-
12345
7 rows selected

Here, we asked for a 2-character wide first column with truncation allowed and an ellipsis suffix; since an ellipsis needs at least 3 characters, the column is widened to 3. No truncation occurred in this column.
The second column is displayed in a 15-character wide column, also with an ellipsis suffix in case of truncation, which is visible here.
The third column is displayed in a 10-character wide column with wrapping-around.
This variant attempts to emulate some of the flexibility of Oracle sqlplus “col XX format YY” command.

int sqlite_select(db_handle, select_stmt, dummy, gawk_array)

Finally, the last variant below does not output anything on the screen but fills in a gawk array with the query's result.
A dummy parameter has been introduced to allow resolving the overloaded function and invoke the expected code. It can be set to any value since this formal parameter is ignored or, as they say, is reserved for future use.
Here is an example of invocation:

sqlite_select(my_db, 'SELECT * FROM test1', 0, a_test)
6 rows selected

If we iterate and print the gawk array:

   for (row in a_test) {
      printf("row %d: ", row)
      for (col in a_test[row])
         printf("  %s = %s", col, a_test[row][col])
      printf "\n" 
   }
   printf "\n"

, we can guess its structure:

row 0: n1 = 100 s1 = hello1 s2 = hello0101
row 1: n1 = 200 s1 = hello2 s2 = hello0102
row 2: n1 = 300 s1 = hello3 s2 = hello0103
row 3: n1 = 400 s1 = hello4 s2 = hello0104
row 4: n1 = 400 s1 = hello5 with spaces s2 = hello0105 with spaces
row 5: n1 = 400 s1 = hello6 with spaces s2 = hello0106

As we can see, the array’s structure is the following (say the query returns a table of count rows and ncolumns):

array[0] = sub-array_0
array[1] = sub-array_1
...
array[count-1] = sub-array_count-1

where array is indexed by an integer and the sub-arrays are associative arrays with the following composition:

sub-array_0[col_0] = value_0,0
sub-array_0[col_1] = value_0,1
...
sub-array_0[col_n-1] = value_0,n-1
sub-array_1[col_0] = value_1,0
...
sub-array_1[col_n-1] = value_1,n-1
...
sub-array_count-1[col_0] = value_count-1,0
...
sub-array_count-1[col_n-1] = value_count-1,n-1

In other words, the returned array is an integer-indexed array of associative arrays; its first dimension contains the rows and its second dimension contains the columns, i.e. it's a table of database rows and value columns.
This feature is very interesting but comes with a limitation since all the data are held in memory. Maybe a future release of gawk will be able to transparently paginate gawk arrays to/from disk, providing a kind of virtual memory for them. Its implementation could even use SQLite. The problem boils down to mapping multi-dimensional associative arrays onto relational tables. Some performance degradation is expected, unless an efficient pagination mechanism is introduced. Documentum implemented the mapping partially for repeating attributes, i.e. flat arrays of values, with joins between _s and _r tables, but there is much more to it in gawk's arrays. This would be a fascinating subject for another blog on its own, as one can imagine.
An implicit and free-of-charge benefit of this implementation would be persistence. Another would be serialization.
For the time being, if a query returns too many rows to be held in memory at once, it may be better to first print them into a file as delimited values (use variants 2 or 3 of sqlite_select for that, as described above) and later work on them sequentially from there, with as many passes as needed. Or use the SQLite shell and work the data on the fly, e.g.:

cat - << EoQ | sqlite3 | gawk -v FS="|" '{
# do something, e.g.:
print "row", NR, $0
}'
.open my_db
.separator |
select * from test1;
.exit
EoQ
row 1 100|hello1|hello01001
row 2 200|hello2|hello01002
row 3 300|hello3|hello01003

How could Documentum benefit from SQLite ?

Whereas it would not be realistic to use SQLite to store documents' metadata as a replacement for a full 2-tier RDBMS, it could still find a purpose within Documentum.
Documentum, and lots of other software, stores its configuration, such as the server.ini and the old dmcl.ini, in ini files, e.g.:

[SERVER_STARTUP]
docbase_id = 1000000
docbase_name = mydocbase
database_name = mydb
database_conn = dbconn
database_owner = dmadmin
database_password_file = /home/dmadmin/dba/mydocbase/dbpasswd.txt

[DOCBROKER_PROJECTION_TARGET]
host = myhost

[DOCBROKER_PROJECTION_TARGET_n]
#n can be 0-49
key=value

[FUNCTION_SPECIFIC_STORAGE]
#Oracle & DB2 only
key=value

[TYPE_SPECIFIC_STORAGE]
key=value
#Oracle & DB2 only

[FUNCTION_EXTENT_SIZE]
key=value
#Oracle only

[TYPE_EXTENT_SIZE]
key=value

There are also key-value files such as the dfc.properties.
By storing these data inside a SQLite table, in its own database for example (databases are so cheap in SQLite, just one file), common typos could be avoided. A classic error with ini files in particular consists in misplacing a setting in the wrong section; such errors are not easy to spot because Documentum silently ignores them.
Consider this snippet from dfcfull.properties for instance:

# ACS configuration
# =================
# Preferences prefixed dfc.acs are used by dfc for distributed content services 
# ACS.                                                                          


# Defines how often dfc verifies acs projection, specified in seconds.          
# min value:  0, max value: 10000000
# 
dfc.acs.avail.refresh_interval = 360


# Indicates whether to verify if ACS server projects.                           
# 
dfc.acs.check_availability = true


# Defines how often dfc verifies that docbase related config objects are 
# modified, specified in seconds.                                               
# min value:  0, max value: 10000000
# 
dfc.acs.config.refresh_interval = 120


# Defines how often dfc verifies that global registry related config objects are 
# modified, specified in seconds.                                               
# min value:  0, max value: 10000000
# 
dfc.acs.gr.refresh_interval = 120
...

Wouldn’t it be nice to move these settings into a SQLite table with the structure dfc_properties(key PRIMARY KEY, value, comment, is_enabled) ? Both could still coexist and be merged at server startup time, with more priority to the file (i.e. if a given setting is present in both the table and in the file, the latter would prevail). We could have a classic file dfc.propertes along with the database file dfc.properties.db.
Common operations on the parameters would be done through SQL statements, e.g.:

SELECT key, value, comment from dfc_properties where is_enabled = TRUE;
SELECT 'other_values', key, value from dfc_properties where is_enabled = FALSE and key = ... ORDER BY key, value;
SELECT 'all_values', is_enabled, key, value from dfc_properties ORDER BY is_enabled DESC, key, value;
UPDATE server_ini SET value = 'mydb.world' WHERE key = 'database_conn';

All the supported parameters would already be present in the table with correct names and default values. Alternative or deactivated values would have their is_enabled value set to FALSE.
But the main interest of using SQlite’s tables here would be the CHECK constraints (or FOREIGN keys) to prevent typos in the parameter names. And for a server.ini.db with structure server_ini(section, key, value, comment, is_enabled), the section + key would be a composite foreign key with a domain CHECK constraints in the parent table, to prevent misplaced parameters. This would require a real database schema with some complexity, and would be delivered by Documentum. The point here is that SQLite would be an excellent tool for this kind of data.
But let’s leave it at that. Such an design and implementation could also deserve its own blog.
I hope this interface got your interest. In the next part, I’ll present its gory details, comment on the implementation and its limits and show how to compile and use it within gawk. See you there !

 

This article An SQLite extension for gawk (part I) appeared first on Blog dbi services.

Adding a timeout in monitoring probes

Yann Neuhaus - Fri, 2018-09-28 15:40

A few months ago, as I was writing the documentation for a monitoring probe, I suddenly realized that that probe, along with others I wrote during that time to monitor Documentum installations, all had a big, unexpected flaw. Indeed, it struck me that if it hung for some reason while running, it could stay there well after the next monitoring cycle had begun, which could be affected by the same problem too, and so on, until lots of such processes could be hanging and possibly hogging valuable repository sessions, causing their complete exhaustion with catastrophic consequences on the client applications. How ironic would it be that health checks would actually endanger an application ?

A true story

I realized all this because it already happened once, years ago, at a different client’s. It was just after we migrated from Documentum content server v5.3 to v6.x. A big shift was introduced in that new version: the command-line tools iapi, idql, dmbasic and dmawk went java. More precisely, they switched from the native libdmcl40.so C library to the library libdmcl.so which calls the DfCs behind the scenes, with this sorcery made possible thanks to JNI. The front is still native code but all the Documentum stuff is henceforth delegated to the java DfCs.
What was the impact on those tools ? It was huge: all those tools that used to start in less than a second now took around 10 seconds or more to start because of all the bloatware initialization. We vaguely noticed it during the tests and supposed it was caused by a big load in that less powerful environment so we went confidently to production one weekend.
The next Monday morning, panicked calls flooded the Help Desk; users were complaining that part of their applications did not work any more. A closer look in the application's log showed that it had become impossible to open new sessions to some repositories. The process list on the server machine showed tens of documentum and idql processes running at once. Those idql processes were stuck instances of a monitoring probe that ran once per minute. Its job was just to connect to the target docbase, run a quick query and exit with a status. For some reason, it was probably waiting for a session, or idql was taking a lot more than the expected few seconds to do its job; therefore, the next monitoring cycle started before the previous one was completed and it too hung there, and so on until affected users became vocal. The real root cause was programmatic since one developer thought it was a good idea to periodically and too frequently connect to docbases from within Ajax code in the clients' home page, without informing the docbases' administrators of this new resource-hungry feature. This resulted in a saturation of the allowed sessions, stuck idql processes, weblogic threads waiting for a connection and, ultimately, application downtime.
Needless to say, the flashy Ajax feature was quickly removed, the number of allowed concurrent sessions was boosted up and we decided to keep around a copy of those fast, full binary v5.3 tools for low-level tasks such as our monitoring needs.
So let’s see how to protect the probes from themselves and from changing environments or well-meaning but ingenuous developers.

The requirements

1. If the monitoring cycles are tight, the probes shall obviously do very simple things; complex things can take time, and be fragile and buggy. Simple things complete quickly and are less subject to hazards.
2. As seen, unless the probe is started only once and runs constantly in the background, the probe's interpreter shall start very quickly, which excludes java code and its JVM; this also avoids recent issues such as the random number generator entropy that has been plaguing java programs for some time now and, I'm sarcastic but confident, the next ones still lurking around the corner. The interpreter that executes the probe shall be that of some well known scripting language such as the bash or ksh shells, python or perl with the needed binding to access the resource to be monitored, e.g. a Documentum repository, or some native binary tool that is part of the product to monitor, such as idql or sqlplus, launched by the shell, or even a custom compiled program.
3. While a probe is running, no other instance of it shall be allowed to start; i.e. the next instance shall not start until after the current one completes.
4. A probe shall only be allowed to execute during an allotted time; once this delay is elapsed, the probe shall be terminated manu militari with a distinct return status.
5. The probe’s cycles too shall be monitored, e.g. missing cycles should be reported.

Point 1 is easy to implement; e.g. to check the availability of a repository or a docbase, just try a connection to it and exit. If a more exhaustive test is required, a quick and simple query could be sent to the server. It all depends on how exhaustive we want to be. A SELECT query won't be able to detect, say, unusable database indexes or indexes being rebuilt off-line, if it still completes within the allowed delay. Some neutral UPDATE could be attempted to detect those kinds of issues or, more straightforwardly yet, just query the state of the indexes. But whatever is monitored, let's keep it quick and direct. The 3rd and 4th requirements can help detect anomalies such as the preceding index problem (in an Oracle database, unusable indexes cause UPDATEs to hang, so timeout detection and forced termination are mandatory in such cases).
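
To illustrate the last point, and assuming an Oracle database behind the repository, the indexes' state can be queried directly with something as simple as the statement below (to be run as the schema owner, or adapted to dba_indexes):

SELECT index_name, status
FROM   user_indexes
WHERE  status = 'UNUSABLE';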

Point 2 is quite obvious: if the monitoring is so aggressive that it runs in one-minute cycles, the script that it executes shall complete in less than one minute; i.e. start time + execution time shall be less than the monitoring period, let's say less than half that time to be safe. If the monitoring tools and script cannot keep up with the stringent timing, a different, more efficient approach shall be considered, unless the timing requirement is relaxed somewhat. For example, a reverse approach could be considered where instead of pulling the status from a target, it's the target that publishes its status, like a heartbeat; that would permit very tight monitoring cycles.

Point 3 requires a barrier to prevent the next cycle to start. This does not need to be a fancy test-and-set semaphore because concurrency is practically nonexistent. A simple test of existence of a conventional file is enough. If the file exists, it means a cycle is in progress and the next cycle is not allowed in. If the file does not exist, create it and continue. There may be a race condition but it is unlikely to occur given that the monitoring cycles are quite widely spread apart, one minute at the minimum if defined in the crontab.

Point 4 means that a timer shall be set up upon starting the probe. This is easy to do from a shell, e.g. thanks to the “timeout” command. Some tools may have their own command-line option to run in batch mode within a timeout duration, which is even better. Nonetheless, an external timer offers a double protection and is still desirable.
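
As a reminder, the timeout command reports a timed out command through the special exit status 124, which is easy to verify interactively:

timeout 2s sleep 10
echo $?
# displays 124 because sleep was killed after 2 seconds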

Point 5: Obviously, this part is only possible from outside the probe. On some systems (e.g. nagios), the probe's log file itself is monitored and, if not updated within some time interval, an alert is raised. This kind of passive or indirect heartbeat makes it possible to detect disabled or stuck probes, but doesn't remove them. Resilience shall be auto-applied whenever possible in order to minimize human intervention. This check is useful to detect cases where the probe or the scheduler itself has been suspended abruptly or is no longer available on the file system (it can even happen that the file system itself has been unmounted by mistake or due to some technical problem or unscheduled intervention).
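
Purely as an illustration, a minimal external check of that kind (reusing the probe's log file from the example below and an arbitrary 6-minute threshold, i.e. one missed cycle plus some slack) could look like this:

# alert if the probe's log file has not been refreshed during the last 6 minutes;
if [ -z "$(find /var/dctm/monitor_docbase_doctest -mmin -6 2>/dev/null)" ]; then
   echo "ALERT: the monitoring probe for doctest looks stuck, disabled or missing"
fi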

An example

Let’s say that we want to monitor the availability of a docbase “doctest”. We propose to attempt a connection with idql as “dmadmin” from the server machine so trusted mode authentication is used and no password is needed. A response from the docbase shall arrive within 15s. The probe shall run with a periodicity of 5 minutes, i.e. 12 times per hour. Here is a no frills attempt:

#!/bin/bash

BARRIER=/tmp/sentinel_file
DOCBASE=doctest
LOG_FILE=/var/dctm/monitor_docbase_${DOCBASE}
TIMEOUT=15s
export DOCUMENTUM=/u01/app/documentum53/product/5.3

if [ -f $BARRIER ]; then
   echo "WARNING: previous $DOCBASE monitoring cycle still running" > $LOG_FILE
   exit 100
fi
touch $BARRIER
if [ $? -ne 0 ]; then
   echo "FATAL: monitoring of $DOCBASE failed while touch-ing barrier $BARRIER" > $LOG_FILE
   exit 101
fi

timeout $TIMEOUT $DOCUMENTUM/bin/idql $DOCBASE -Udmadmin -Pxx > /dev/null 2>&1 <<EoQ
   select * from dm_server_config;
   go
   quit
EoQ
rc=$?
if [ $rc -eq 124 ]; then
   echo "FATAL: monitoring of $DOCBASE failed in timeout of $TIMEOUT" > $LOG_FILE
elif [ $rc -eq 1 ]; then
  echo "FATAL: connection to $DOCBASE was unsuccessful"               > $LOG_FILE
else
   echo "OK: connection to $DOCBASE was successful"                   > $LOG_FILE
fi

rm $BARRIER
exit $rc

Line 3: the barrier is an empty file whose existence or inexistence simulates the state of the barrier; if the file exists, then the barrier is down and the access is forbidden; if the file does not exist, then the barrier is up and the access is allowed;
Line 7: we use the full native, DMCL-based idql utility for a quick start up;
Line 9: the barrier is tested by checking the file’s existence as written above; if the file already exists, it means that an older monitoring cycle is still running, so the new cycle aborts and returns an error message and an exit code;
Line 13: the barrier has been lowered to prevent the next cycle from executing the probe;
Line 19: the idql command is launched and monitored by the command timeout with a duration of $TIMEOUT;
Line 24: the timeout command’s return status is tested; if it is 124 (line 25), it means a timeout has occurred; the probe aborts with an appropriate error message; otherwise, it’s the command’s error code: if it is 1, idql could not connect; if it is 0, the connection was OK;
Lines 27 and 29: the connection attempt returned within the $TIMEOUT time interval, meaning that idql reported a connection status;
Line 33: the barrier is removed so the next monitoring cycle has the green light;
Line 34: the exit code is returned; it should be 124 for timeout, 1 for no connection to the docbase, 0 if connection OK;

The timeout command belongs to the coreutils package so install that package through your linux distribution’s package manager if the command is missing.

If cron is used as a scheduler, the crontab entry could look like below (assuming the probe’s name is test-connection.sh):

0,5,10,15,20,25,30,35,40,45,50,55 * * * * /u01/app/documentum/monitoring/test-connection.sh > /dev/null 2>&1

cron is sufficient most of the time, even though its time granularity is 1 minute.
The probe could be enhanced very easily in such a way that, once deployed, it optionally installs itself in dmadmin’s crontab, e.g.:

/u01/app/documentum/monitoring/test-connection.sh --install "0,5,10,15,20,25,30,35,40,45,50,55 * * * *"

for the maximum simplicity. But this is a different topic.
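
Purely as an illustration of the idea (this is not part of the probe above), such a --install option could boil down to a few lines at the top of the script:

if [ "$1" = "--install" ]; then
   SCHEDULE="$2"
   # remove any previous entry for this script and append the new one;
   ( crontab -l 2>/dev/null | grep -v -F "$0"; echo "$SCHEDULE $0 > /dev/null 2>&1" ) | crontab -
   exit $?
fi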

Some comments

On some large infrastructures, centralized scheduler and orchestration software may be in use (CTRL-M, Activeeon, Rundeck, Dkron, etc. Just check the web, there are plenty to shop for) which have their own management of rogue jobs and commands. Still, dedicated probes such as the preceding one have to be called but the timeout logic could be removed and externalized into the launcher. Better yet, only externalize the hard-coded timeout parameter so it is passed to the probe as a command-line parameter and the probe can still work independently from the launcher in use.
Other systems use a centralized monitoring system (e.g. nagios, Icinga, BMC TrueSight, Cacti, etc. Again, search the web) but whatever the software, be prepared to write probes manually because, unless it is as ubiquitous as apache, tomcat or mysql, it is unlikely that the system to be monitored is supported out of the box, particularly for specialized products such as Documentum.
Some of the above software need an agent process deployed and permanently running on each monitored or enrolled machine. There are pros and cons in this architecture but we won’t go there as the subject is out of scope.

Conclusion

Monitoring a target is interacting with it. Physics tells us that it is impossible to be totally invisible while observing but at least we can minimize the probes’ footprint. While it is impossible to anticipate every abnormality on a system, these few very simple guidelines can help a long way in making monitoring probes more robust, resilient and as unobtrusive as possible.

 

This article Adding a timeout in monitoring probes appeared first on Blog dbi services.

Linux: journalctl and systemd - better than /var/log/messages?

Dietrich Schroff - Fri, 2018-09-28 13:51
Nearly 8 years ago systemd was introduced on some Linux distributions (see here). Last week I discovered some helpful commands, which I share with you.

If you want to take a look at kernel messages (for example from the system boot), your command is
dmesg
The new equivalent is
journalctl -k
Ok - not really amazing.
But all of you know the message
See "systemctl status nginx.service" and "journalctl -xe" for details.
You can run the "systemctl start/restart/stop" and in case of error open the logs with "journalctl -xe". I would recommend to open a seperate shell and run there
journalctl -f
This is something like "tail -f" to the systemd-journal.
If you do a "systemctl restart network" the shell with journalcctl -f shows the DHCP waiting for a answer from the server and you know why its so slow. You especially know, that your fifth interface has DHCP enabled and there is no DHCP, which slows down every "systemctl restart network".

journalctl has some nice filters like
journalctl -p 0..4
This just shows the messages with priorities 0 to 4. The full priority scale is:
  • "emerg" (0), 
  • "alert" (1), 
  • "crit" (2), 
  • "err" (3), 
  • "warning" (4), 
  • "notice" (5),
  • "info" (6),
  • "debug" (7)
Or filter for something like network messages:
journalctl -u NetworkManager
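These filters can be combined; for example, to get only warnings and worse from the NetworkManager unit during the last hour:
journalctl -u NetworkManager -p 0..4 --since "1 hour ago"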

And my favourite: Pipe your own log messages into the systemd-journal:
echo This is important | systemd-cat -t MightyJournal -p notice
Which result in this entry:
Sep 28 20:48:55 zerberus MightyJournal[28520]: This is important

Oracle Offline Persistence Toolkit - After Request Sync Listener

Andrejus Baranovski - Fri, 2018-09-28 11:15
In my previous post, we learned how to handle replay conflict - Oracle Offline Persistence Toolkit - Reacting to Replay Conflict. Another important thing to know is how to handle the response from a request which was replayed during sync (we are talking here about PATCH). It is not as obvious as handling the response from a direct REST call in a callback (there is no callback for a response which is synchronised later). You may wonder why you would need to handle the response after a successful sync. Well, there could be multiple reasons - for instance, you may read the returned value and update the value stored on the client.

The listener is registered in the Persistence Manager configuration, by adding an event listener of type syncRequest for a given endpoint:


This is the listener code. We are getting the response, reading the change indicator value (it was updated on the backend and the new value is returned in the response) and storing it locally on the client. Additionally, we maintain an array with a mapping of change indicator value to updated row ID (in my next post I will explain why this is needed). The after-request listener must return a promise:


At runtime - when request sync is executed, you should see a message printed in the log, which shows the new change indicator value:


Double-check the payload, to make sure the request was submitted with the previous value:


Check the response; you will see the new value for the change indicator (same as in the after-request listener):


Sample code can be downloaded from GitHub repository.

Hacking for Skew

Jonathan Lewis - Fri, 2018-09-28 07:23

In my presentation to the UKOUG SIG yesterday "Struggling with Statistics – part 2" I described a problem that I wrote about a few months ago: when you join a fact table with a massively skewed distribution on one of the surrogate key columns to a dimension holding the unique list of keys and descriptions, a query against a description "loses" the skew. Here's a demo of the problem that's a little simpler than the one in the previous article.


rem
rem     Script:         bitmap_join_histogram.sql
rem     Author:         Jonathan Lewis
rem     Dated:          June 2016
rem     Updated:        Sep 2018
rem 

execute dbms_random.seed(0)

create table facts
nologging
as
with generator as (
        select  --+ materialize
                rownum id
        from dual 
        connect by 
                level <= 1e4 --> comment to avoid wordpress format issue
)
select
        rownum                                  id,
        trunc(3 * abs(dbms_random.normal))      id_status,
        lpad(rownum,10,'0')                     v1,
        lpad('x',100,'x')                       padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e5 --> comment to avoid wordpress format issue
;

alter table facts add constraint fct_pk primary key(id);
alter table facts modify id_status not null;

create table statuses
as
select
        id,
        chr(65 + id)            status_code,
        rpad('x',100,'x')       description
from    (
        select
                distinct(id_status)             id
        from
                facts
        )
;

alter table statuses modify status_code not null;

alter table statuses add constraint sta_pk primary key (id);
alter table facts add constraint fct_fk_sta foreign key (id_status) references statuses(id);

create bitmap index fct_b1 on facts(id_status);

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'facts',
                method_opt       => 'for all columns size skewonly'
        );

        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'statuses',
                method_opt       => 'for all columns size 254'
        );
end;
/

The definition of the facts.id_status column means I get a nice skewing effect on the data and this is what my data looks like:


select id_status, count(*) from facts group by id_status order by id_status;

 ID_STATUS   COUNT(*)
---------- ----------
         0      26050
         1      23595
         2      18995
         3      13415
         4       8382
         5       4960
         6       2643
         7       1202
         8        490
         9        194
        10         55
        11         17
        12          2

13 rows selected.

The statuses table translates the numbers 0 – 12 into the letters ‘A’ – ‘M’.

A quick check will show you that there are 55 rows for id_status = 10, which means 55 rows for status_code = 'K'. So what happens when we write the two queries that should show us these results? I don't really care what the execution plans are at this point, I'm interested only in the optimizer's estimate of cardinality – so here are two queries, each followed by its execution plan:


select
        sum(fct.id)
from
        facts   fct
where
        fct.id_status = 10
;


-----------------------------------------------------------------------------------------------
| Id  | Operation                            | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |        |     1 |     8 |    12   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE                      |        |     1 |     8 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| FACTS  |    55 |   440 |    12   (0)| 00:00:01 |
|   3 |    BITMAP CONVERSION TO ROWIDS       |        |       |       |            |          |
|*  4 |     BITMAP INDEX SINGLE VALUE        | FCT_B1 |       |       |            |          |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("FCT"."ID_STATUS"=10)


select
        sum(fct.id)
from
        facts           fct,
        statuses        sta
where
        fct.id_status = sta.id
and     sta.status_code = 'K'
;

--------------------------------------------------------------------------------
| Id  | Operation           | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |          |     1 |    13 |   233   (4)| 00:00:01 |
|   1 |  SORT AGGREGATE     |          |     1 |    13 |            |          |
|*  2 |   HASH JOIN         |          |  7692 | 99996 |   233   (4)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| STATUSES |     1 |     5 |     2   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL| FACTS    |   100K|   781K|   229   (3)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("FCT"."ID_STATUS"="STA"."ID")
   3 - filter("STA"."STATUS_CODE"='K')

The estimated cardinality for the query against the base column reflects the value 55 from the histogram, but the estimated cardinality of the join is 7,692 – which is num_rows(facts) / num_distinct(id_status). Oracle has lost information about the skew. There is a way to get Oracle to produce a correct estimate (shown in the previous article) by rewriting the join as an IN subquery with the (undocumented) “precompute_subquery” hint, but there is an alternative which David Kurtz hypothesized in a conversation after the presentation was over (in fact someone else had described their use of exactly his suggested approach in a comment on a much older blog note about this problem): take the histogram from the id_status column on the facts table and “apply it” to the status_code column on the statuses table. In discussion with David I expressed the opinion that this probably shouldn’t work, and it wasn’t really a bit of fakery I’d want to apply to a production system – but we both tried it when we got home … with differing degrees of success.
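
For reference, the subquery rewrite mentioned above looks something like this (a sketch only, not run in this test):

select
        sum(fct.id)
from
        facts   fct
where
        fct.id_status in (
                select  /*+ precompute_subquery */
                        sta.id
                from    statuses        sta
                where   sta.status_code = 'K'
        )
;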

Here’s a piece of code that I inserted into my script immediately after gathering stats on the statuses table. I’ll explain the details below as it makes a couple of assumptions that need to be pointed out:


declare

        srec                    dbms_stats.statrec;

        m_distcnt               number;
        m_density               number;
        m_nullcnt               number;
        m_avgclen               number;

        c_array                 dbms_stats.chararray;

begin

        dbms_stats.get_column_stats(
                ownname         => 'test_user',
                tabname         => 'facts',
                colname         => 'id_status',
                distcnt         => m_distcnt,
                density         => m_density,
                nullcnt         => m_nullcnt,
                srec            => srec,
                avgclen         => m_avgclen
        ); 

        srec.bkvals := dbms_stats.numarray();
        c_array     := dbms_stats.chararray();

        for r in (
                select  stt.status_code, count(*) ct
                from    facts fct, statuses stt
                where   stt.id = fct.id_status
                group by
                        stt.status_code
                order by
                        stt.status_code
        ) loop

                c_array.extend;
                c_array(c_array.count) := r.status_code;
                srec.bkvals.extend;
                srec.bkvals(srec.bkvals.count) := r.ct;

        end loop;

        dbms_stats.prepare_column_values(srec, c_array);

        dbms_stats.set_column_stats(
                ownname         => 'test_user',
                tabname         => 'statuses',
                colname         => 'status_code',
                distcnt         => m_distcnt,
                density         => m_density,
                nullcnt         => m_nullcnt,
                srec            => srec,
                avgclen         => m_avgclen
        ); 

end;
/

alter system flush shared_pool;

The code isn’t intended to be efficient, and I’ve been a bit lazy in setting up the content.

The first step gets the column stats from facts.id_status – and I know that I've got a frequency histogram that covers exactly the right number of distinct values on that column so almost everything is set up correctly to copy the stats across to statuses.status_code, except one column is numeric and the other is character and (although I know it's true because of the way I defined the status_code values) I need to ensure that the bucket values I write to status_code are arranged in alphabetic order of status_code.

So my second step is to run a query against the facts table to get the counts of status_code in alphabetical order and copy the results in order into a pair of arrays – one being a standalone array of the type defined in the dbms_stats package as an array of character types, the other being the array of bucket values that already exists in the stats record for the facts.id_status column that I’ve pulled into memory. (The bucket values array is stored as cumulative frequency values, so I do have to overwrite it with the simple frequency values at this point).

Finally I “prepare column values” and “set column stats” into the correct column, and the job is done. The flush of the shared pool is there to avoid any accidents of cursors surviving previous tests and causing confusion.

So what happens when I run a couple of queries with these faked stats in place ?

set autotrace traceonly explain

select  
        sum(fct.id)
from
        facts           fct,
        statuses        sta
where
        fct.id_status = sta.id
and     sta.status_code = 'K'
;


--------------------------------------------------------------------------------
| Id  | Operation           | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |          |     1 |    14 |   233   (4)| 00:00:01 |
|   1 |  SORT AGGREGATE     |          |     1 |    14 |            |          |
|*  2 |   HASH JOIN         |          |    55 |   770 |   233   (4)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| STATUSES |     1 |     6 |     2   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL| FACTS    |   100K|   781K|   229   (3)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("FCT"."ID_STATUS"="STA"."ID")
   3 - filter("STA"."STATUS_CODE"='K')



select
        sum(fct.id)
from
        facts           fct,
        statuses        sta
where
        fct.id_status = sta.id
and     sta.status_code = 'D'
;


--------------------------------------------------------------------------------
| Id  | Operation           | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |          |     1 |    14 |   233   (4)| 00:00:01 |
|   1 |  SORT AGGREGATE     |          |     1 |    14 |            |          |
|*  2 |   HASH JOIN         |          | 13415 |   183K|   233   (4)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| STATUSES |     2 |    12 |     2   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL| FACTS    |   100K|   781K|   229   (3)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("FCT"."ID_STATUS"="STA"."ID")
   3 - filter("STA"."STATUS_CODE"='D')

Querying for ‘K’ the prediction is 55 rows, querying for ‘D’ the prediction is for 13,415 rows – both estimates are exactly right. Wow !!!

Problem – that’s not what David Kurtz saw. In an email to me he said: “To my surprise, if I fake a histogram on the dimension table using the skew on the join column from the fact table I do get the correct number of rows calculated in the execution plan (provided it is less than the value if the histogram was not present)”. To make that concrete – when he queried for ‘K’ he got the correct prediction, when he queried for ‘D’ he was back to a prediction of 7,692. Looking at the report of the actual data, he’d get the right prediction for codes ‘F’ to ‘M’ and the wrong prediction for codes ‘A’ to ‘E’.

So what went wrong (and with whom) ?

When I run up new tests I tend to test Oracle versions in the order 12.1.0.2, then 11.2.0.4, then 12.2.0.1, then 18.3.0.0 – it’s the order of popularity that I currently see. So I was running my test on 12.1.0.2; David was running his test on 18.3.0.0. So I jumped a step and ran my test on 12.2.0.1: here are my results when querying for status_code = ‘D’:


--------------------------------------------------------------------------------
| Id  | Operation           | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |          |     1 |    14 |   233   (4)| 00:00:01 |
|   1 |  SORT AGGREGATE     |          |     1 |    14 |            |          |
|*  2 |   HASH JOIN         |          |  7692 |   105K|   233   (4)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| STATUSES |     1 |     6 |     2   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL| FACTS    |   100K|   781K|   229   (3)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("FCT"."ID_STATUS"="STA"."ID")
   3 - filter("STA"."STATUS_CODE"='D')

As David has seen with 18.3, Oracle used the num_distinct to estimate the cardinality for ‘D’. (It still used the value indicated by the histogram for ‘K’.) When I set the optimizer_features_enable parameter back to 12.1.0.2 the cardinality estimate for ‘D’ went back to 13,415 – so it looks as if this is a deliberate piece of coding. 172 fix controls and 31 optimizer state parameters changed, but none of the more likely looking candidates had any effect when I tried testing them separately; possibly there’s a new sanity check when the number of rows recorded for the table is a long way off the total histogram bucket count.
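
If you want to reproduce the comparison on your own system, a minimal sketch would be something like the following: switch the optimizer_features_enable value, re-run the explain plan, and compare the hash join cardinality (the 10053 event line is optional, and is what produces the trace extracts shown below):

alter session set optimizer_features_enable = '12.1.0.2';
-- optional: alter session set events '10053 trace name context forever, level 1';

explain plan for
select
        sum(fct.id)
from
        facts           fct,
        statuses        sta
where
        fct.id_status = sta.id
and     sta.status_code = 'D'
;

select * from table(dbms_xplan.display);

alter session set optimizer_features_enable = '12.2.0.1';
-- repeat the explain plan / display and compare the HASH JOIN "Rows" estimate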

I took a quick look at the 10053 trace in 12.2, with and without the change to optimizer_features_enable. The key difference was in the single table access path analysis – which didn’t give me any further clues.

With optimizer_features_enable = 12.1.0.2
=========================================
Access path analysis for STATUSES
***************************************
SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for STATUSES[STA]
  SPD: Return code in qosdDSDirSetup: NOCTX, estType = TABLE

 kkecdn: Single Table Predicate:"STA"."STATUS_CODE"='K'
  Estimated selectivity: 5.5000e-04 , endpoint value predicate, col: #2

Access path analysis for STATUSES
***************************************
SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for STATUSES[STA]
  SPD: Return code in qosdDSDirSetup: NOCTX, estType = TABLE

 kkecdn: Single Table Predicate:"STA"."STATUS_CODE"='D'
  Estimated selectivity: 0.134150 , endpoint value predicate, col: #2


With optimizer_features_enable defaulting to 12.2.0.1
=====================================================
Access path analysis for STATUSES
***************************************
SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for STATUSES[STA]
  SPD: Return code in qosdDSDirSetup: NOCTX, estType = TABLE

 kkecdn: Single Table Predicate:"STA"."STATUS_CODE"='K'
  Estimated selectivity: 5.5000e-04 , endpoint value predicate, col: #2


Access path analysis for STATUSES
***************************************
SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for STATUSES[STA]
  SPD: Return code in qosdDSDirSetup: NOCTX, estType = TABLE

 kkecdn: Single Table Predicate:"STA"."STATUS_CODE"='D'
  Estimated selectivity: 0.076923 , endpoint value predicate, col: #2


Bottom line on this – there’s at least one person who already uses this method to work around the optimizer limitation, and they need to be careful when they upgrade to 12.2 (or above) as the method no longer works in all cases.

 

 

[Troubleshooting] Forms Issue: FRM-92050 failed to connect to server: forms/servlet in EBS (R12)

Online Apps DBA - Fri, 2018-09-28 07:16

Oracle Forms can be configured in two modes: Servlet and Socket In servlet mode, a Java servlet (called the Forms Listener servlet) manages the communication between the Forms Java client and the OracleAS Forms services. In socket mode, the desktop clients access the Forms server directly. What if you face Oracle Forms issue ‘FRM-92050: Failed […]

The post [Troubleshooting] Forms Issue: FRM-92050 failed to connect to server: forms/servlet in EBS (R12) appeared first on Oracle Trainings for Apps & Fusion DBA.

Categories: APPS Blogs

HOW TO RESOLVE THE ERROR :- ora-01406 fetched column value was truncated

Tom Kyte - Fri, 2018-09-28 01:26
HOW TO RESOLVE THE ERROR :- ora-01406 fetched column value was truncated
Categories: DBA Blogs

Outer join with row archival

Tom Kyte - Fri, 2018-09-28 01:26
LiveSQL-Link: https://livesql.oracle.com/apex/livesql/s/hblhxmq40jtini45sivyqj4le <code> create table test_table (id number(10),name varchar2(10)) row archival; insert into test_table (id,name) values (1,'name1'); insert into test_table (id...
Categories: DBA Blogs

Find Closest Matching Single Record

Tom Kyte - Fri, 2018-09-28 01:26
I want a Query that should fetch a single record based on match conditions: table looks like <code> create table SERVICES ( srvc VARCHAR2(10) not null, location VARCHAR2(10), grp VARCHAR2(10), empno VARCHAR2(10), pric...
Categories: DBA Blogs

Documentum – Checking warnings&errors from an xPlore full re-index

Yann Neuhaus - Fri, 2018-09-28 00:00

When working with xPlore as a Full Text Server (indexing), there are a few ways to perform a full re-index. You can potentially do it from the IndexAgent UI, from the Dsearch UI, from the file system (with an ids.txt file for example; it is usually for a “small” number of r_object_id so that’s probably not an ideal way) or from the docbase (mass-queue, which isn’t really a good way to do it either). Performing a full re-index from the xPlore Server directly will be faster because you remove a few layers where the Content Server asks for an index (the index queues) and expects an answer/result. That’s why in this blog I will only talk about the full re-index performed from the xPlore Server directly, and below I will use a full re-index from the IndexAgent UI. In each of these cases, there might be a few warnings or errors along the way, some of which might be normal (password protected file) while others might not (timeout because xPlore is heavily loaded).

The whole purpose of this blog is to show you how you can check these warnings/errors because there is no information about them displayed directly on the UI; you need to go find that information manually. These warnings/errors aren’t shown in the index queues since they weren’t triggered from the docbase but from the xPlore Server directly.

So first of all, you need to trigger a re-index using the IndexAgent:

  • Open the IndexAgent UI (https://<hostname>:<ia_port>/IndexAgent)
  • Login with the installation owner’s account
  • Stop the IndexAgent if it is currently running in Normal mode and then launch a re-index operation

It should look like that (for xPlore 1.6):
[Screenshot IA1: IndexAgent UI showing the re-index progress]

On the above screenshot, the green represents the success count and the blue is for the filtered count. Once completed and as shown above, you might have a few warnings/errors but you don’t have any information about them as I mentioned previously. To narrow down and facilitate the check of the warnings/errors, you need to know (approximately) the start and end time of the re-index operation: 2018-06-12 11:55 UTC to 2018-06-12 12:05 UTC for the above example. From that point, the analysis of the warnings/errors can be done in two main ways:

 

1. Using the Dsearch Admin

I will start with the way that most of you probably already know: use the Dsearch reports to see the errors/warnings. That’s not the fastest way, and clearly not the most fun either, but it is an easy way for sure…

Accessing the reports from the Dsearch Admin:

  • Open the Dsearch Admin UI (https://<hostname>:<ds_port>/dsearchadmin)
  • Login with the admin account (or any other valid account with xPlore 1.6+)
  • Navigate to: Home > Diagnostic and Utilities > Reports
  • Select the “Document Processing Error Summary” report and set the following:
    • Start from: 2018-06-12 11:55
    • To: 2018-06-12 12:05
    • Domain name (optional): leave empty if you only have one IndexAgent, otherwise you can specify the domain name (usually the same name as the docbase)
  • Click on Run to get the report

At this point, you will have a report with the number of warnings/errors per type, meaning that you do not have any information about the documents yet; you only know the number of errors for each of the pre-defined error types (=error code). For the above example, I had 8 warnings once the re-index was completed and I could see them all (seven warnings for ‘777’ and one warning for ‘770’):
[Screenshot IA2: Document Processing Error Summary report]

Based on the information from this “Document Processing Error Summary” report, you can go deeper and find the details about the documents, but you can only do it for one type, one Error Code, at a time. Therefore, you will have to loop over all Error Codes returned:

  • For each Error Code:
    • Select the “Document Processing Error Detail” report and set the following:
      • Start from: 2018-06-12 11:55
      • To: 2018-06-12 12:05
      • Domain name (optional): leave empty if you only have 1 IndexAgent, otherwise you can specify the domain name (usually the same name as the docbase)
      • Processing Error Code: Select the Error Code you want to see (either 777 or 770 in my case)
      • Number of Results to Display: Set here the number of items you want to display, 10, 20, …
    • Click on Run to get the report

And there you finally have the details about the documents with warnings/errors that weren’t indexed properly, for the Error Code you chose. In my case, I selected 770 so I have only 1 document:
[Screenshot IA3: Document Processing Error Detail report]

You can export this list to Excel if you want, to do some processing on these items for example, but you will need to do it for all Error Codes and then merge them.

 

2. Using the logs

In the above example, I used the IndexAgent to perform the re-index so I will use the IndexAgent logs to find what happened exactly. This section is really the main purpose of this blog because I assume that most people are using the Dsearch Admin reports already but probably not the logs! If you want to script the check of warnings/errors after a re-index, or just if you want to play and have fun while doing your job, then this is what you need ;).

So let’s start simple: listing all errors and warnings and keeping only the lines that contain an r_object_id.

[xplore@full_text_server_01 ~]$ cd $JBOSS_HOME/server/DctmServer_Indexagent_DocBase1/logs/
[xplore@full_text_server_01 logs]$
[xplore@full_text_server_01 logs]$ echo; egrep -i "err|warn" Indexagent_*.log* \
                                   | egrep --color "[ (<][0-9a-z]{16}[>) ]"

Indexagent_DocBase1.log:2018-06-12 11:55:26,456 WARN PrepWorkItem [full_text_server_01_9200_IndexAgent-full_text_server_01.dbi-services.com-1-full_text_server_01.dbi-services.com-StatusUpdater][DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f12345007f40e message: DOCUMENT_WARNING CPS Warning [Corrupt file].
Indexagent_DocBase1.log:2018-06-12 12:01:00,752 WARN PrepWorkItem [full_text_server_01_9260_IndexAgent-full_text_server_01.dbi-services.com-1-full_text_server_01.dbi-services.com-StatusUpdater][DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aaa97 message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].
Indexagent_DocBase1.log:2018-06-12 12:01:00,752 WARN PrepWorkItem [full_text_server_01_9260_IndexAgent-full_text_server_01.dbi-services.com-1-full_text_server_01.dbi-services.com-StatusUpdater][DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aaa98 message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].
Indexagent_DocBase1.log:2018-06-12 12:01:00,754 WARN PrepWorkItem [full_text_server_01_9260_IndexAgent-full_text_server_01.dbi-services.com-1-full_text_server_01.dbi-services.com-StatusUpdater][DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aa9f6 message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].
Indexagent_DocBase1.log:2018-06-12 12:01:00,754 WARN PrepWorkItem [full_text_server_01_9260_IndexAgent-full_text_server_01.dbi-services.com-1-full_text_server_01.dbi-services.com-StatusUpdater][DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aaa9a message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].
Indexagent_DocBase1.log:2018-06-12 12:01:01,038 WARN PrepWorkItem [full_text_server_01_9260_IndexAgent-full_text_server_01.dbi-services.com-1-full_text_server_01.dbi-services.com-StatusUpdater][DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aaa99 message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].
Indexagent_DocBase1.log:2018-06-12 12:01:01,038 WARN PrepWorkItem [full_text_server_01_9260_IndexAgent-full_text_server_01.dbi-services.com-1-full_text_server_01.dbi-services.com-StatusUpdater][DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aaa9b message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].
Indexagent_DocBase1.log:2018-06-12 12:01:01,038 WARN PrepWorkItem [full_text_server_01_9260_IndexAgent-full_text_server_01.dbi-services.com-1-full_text_server_01.dbi-services.com-StatusUpdater][DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aaa9d message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].
Indexagent_DocBase1.log:2018-06-12 12:01:27,518 INFO ReindexBatch [Worker:Finalization Action:#6][DM_INDEX_AGENT_REINDEX_BATCH] Updating queue item 1b0f1234501327f0 with message= Incomplete btch. From a total of 45, 44 done, 0 filtered, 0 errors, and 8 warnings.
[xplore@full_text_server_01 logs]$

 

As you can see above, there is also one queue item (1b0f1234501327f0) listed because I kept everything that is 16 characters long with 0-9 or a-z. If you want, you can instead select only the r_object_id starting with 09 to get all dm_documents (using this: “[ (<]09[0-9a-z]{14}[>) ]” ), or you can just remove the r_object_id starting with 1b, which are the queue items.

In the above example, all the results are in the timeframe I expected them to be in, but it is possible that there are older or newer warnings/errors, so you might want to apply another filter on the date. Since I want everything from 11:55 to 12:05 on the 12-Jun-2018, this is how I can do it (removing the log file name too) using a time regex:

[xplore@full_text_server_01 logs]$ time_regex="2018-06-12 11:5[5-9]|2018-06-12 12:0[0-5]"
[xplore@full_text_server_01 logs]$ echo; egrep -i "err|warn" Indexagent_*.log* \
                                   | sed 's,^[^:]*:,,' \
                                   | egrep "${time_regex}" \
                                   | egrep --color "[ (<][0-9a-z]{16}[>) ]"

2018-06-12 11:55:26,456 WARN PrepWorkItem [full_text_server_01_9200_IndexAgent-full_text_server_01.dbi-services.com-1-full_text_server_01.dbi-services.com-StatusUpdater][DM_INDEX_AGNT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f12345007f40e message: DOCUMENT_WARNING CPS Warning [Corrupt file].
2018-06-12 12:01:00,752 WARN PrepWorkItem [full_text_server_01_9260_IndexAgent-full_text_server_01.dbi-services.com-1-full_text_server_01.dbi-services.com-StatusUpdater][DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aaa97 message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].
2018-06-12 12:01:00,752 WARN PrepWorkItem [full_text_server_01_9260_IndexAgent-full_text_server_01.dbi-services.com-1-full_text_server_01.dbi-services.com-StatusUpdater][DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aaa98 message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].
2018-06-12 12:01:00,754 WARN PrepWorkItem [full_text_server_01_9260_IndexAgent-full_text_server_01.dbi-services.com-1-full_text_server_01.dbi-services.com-StatusUpdater][DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aa9f6 message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].
2018-06-12 12:01:00,754 WARN PrepWorkItem [full_text_server_01_9260_IndexAgent-full_text_server_01.dbi-services.com-1-full_text_server_01.dbi-services.com-StatusUpdater][DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aaa9a message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].
2018-06-12 12:01:01,038 WARN PrepWorkItem [full_text_server_01_9260_IndexAgent-full_text_server_01.dbi-services.com-1-full_text_server_01.dbi-services.com-StatusUpdater][DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aaa99 message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].
2018-06-12 12:01:01,038 WARN PrepWorkItem [full_text_server_01_9260_IndexAgent-full_text_server_01.dbi-services.com-1-full_text_server_01.dbi-services.com-StatusUpdater][DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aaa9b message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].
2018-06-12 12:01:01,038 WARN PrepWorkItem [full_text_server_01_9260_IndexAgent-full_text_server_01.dbi-services.com-1-full_text_server_01.dbi-services.com-StatusUpdater][DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aaa9d message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].
2018-06-12 12:01:27,518 INFO ReindexBatch [Worker:Finalization Action:#6][DM_INDEX_AGENT_REINDEX_BATCH] Updating queue item 1b0f1234501327f0 with message= Incomplete btch. From a total of 45, 44 done, 0 filtered, 0 errors, and 8 warnings.
[xplore@full_text_server_01 logs]$

 

Listing only the messages for each of these warnings/errors:

[xplore@full_text_server_01 logs]$ echo; egrep -i "err|warn" Indexagent_*.log* \
                                   | sed 's,^[^:]*:,,' \
                                   | egrep "${time_regex}" \
                                   | egrep "[ (<][0-9a-z]{16}[>) ]" \
                                   | sed 's,^[^]]*],,' \
                                   | sort -u

[DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f12345007f40e message: DOCUMENT_WARNING CPS Warning [Corrupt file].
[DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aa9f6 message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].
[DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aaa97 message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].
[DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aaa98 message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].
[DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aaa99 message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].
[DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aaa9a message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].
[DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aaa9b message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].
[DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aaa9d message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].
[DM_INDEX_AGENT_REINDEX_BATCH] Updating queue item 1b0f1234501327f0 with message= Incomplete batch. From a total of 45, 44 done, 0 filtered, 0 errors, and 1 warnings.
[xplore@full_text_server_01 logs]$

 

Listing only the r_object_id (to resubmit them via the ids.txt for example):

[xplore@full_text_server_01 logs]$ echo; egrep -i "err|warn" Indexagent_*.log* \
                                   | sed 's,^[^:]*:,,' \
                                   | egrep "${time_regex}" \
                                   | egrep "[ (<][0-9a-z]{16}[>) ]" \
                                   | sed 's,.*[ (<]\([0-9a-z]\{16\}\)[>) ].*,\1,' \
                                   | sort -u \
                                   | grep -v "^1b"

090f12345007f40e
090f1234500aa9f6
090f1234500aaa97
090f1234500aaa98
090f1234500aaa99
090f1234500aaa9a
090f1234500aaa9b
090f1234500aaa9d
[xplore@full_text_server_01 logs]$

 

If you want to generate the iapi commands to resubmit them all:

[xplore@full_text_server_01 logs]$ echo; egrep -i "err|warn" Indexagent_*.log* \
                                   | sed 's,^[^:]*:,,' \
                                   | egrep "${time_regex}" \
                                   | egrep "[ (<][0-9a-z]{16}[>) ]" \
                                   | sed 's,.*[ (<]\([0-9a-z]\{16\}\)[>) ].*,\1,' \
                                   | sort -u \
                                   | grep -v "^1b" \
                                   | sed 's/.*/queue,c,&,dm_fulltext_index_user/'

queue,c,090f12345007f40e,dm_fulltext_index_user
queue,c,090f1234500aa9f6,dm_fulltext_index_user
queue,c,090f1234500aaa97,dm_fulltext_index_user
queue,c,090f1234500aaa98,dm_fulltext_index_user
queue,c,090f1234500aaa99,dm_fulltext_index_user
queue,c,090f1234500aaa9a,dm_fulltext_index_user
queue,c,090f1234500aaa9b,dm_fulltext_index_user
queue,c,090f1234500aaa9d,dm_fulltext_index_user
[xplore@full_text_server_01 logs]$
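
If you want to go one step further and actually execute these commands, you could append something like “> /tmp/requeue.api” to the pipeline above to store them in an API script, and then run that script with iapi from a host with a Documentum client installed. This is only a sketch: the docbase name, account, password and file path below are examples to adapt to your environment:

[dmadmin@content_server_01 ~]$ iapi DocBase1 -Udmadmin -Pxxxxxx -R/tmp/requeue.api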

 

Finally, to group the warnings/errors per types:

[xplore@full_text_server_01 logs]$ echo; IFS=$'\n'; \
                                   for type in `egrep -i "err|warn" Indexagent_*.log* \
                                     | sed 's,^[^:]*:,,' \
                                     | egrep "${time_regex}" \
                                     | egrep "[ (<][0-9a-z]{16}[>) ]" \
                                     | sed 's,^[^]]*],,' \
                                     | sort -u \
                                     | sed 's,.*\(\[[^\[]*\]\).*,\1,' \
                                     | sort -u`;
                                   do
                                     echo "  --  Listing warnings/errors with the following messages: ${type}";
                                     egrep -i "err|warn" Indexagent_*.log* \
                                       | sed 's,^[^:]*:,,' \
                                       | egrep "${time_regex}" \
                                       | egrep "[ (<][0-9a-z]{16}[>) ]" \
                                       | sed 's,^[^]]*],,' \
                                       | sort -u \
                                       | grep -F "${type}";
                                     echo;
                                   done

  --  Listing warnings/errors with the following messages: [Corrupt file]
[DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f12345007f40e message: DOCUMENT_WARNING CPS Warning [Corrupt file].

  --  Listing warnings/errors with the following messages: [DM_INDEX_AGENT_REINDEX_BATCH]
[DM_INDEX_AGENT_REINDEX_BATCH] Updating queue item 1b0f1234501327f0 with message= Incomplete batch. From a total of 45, 44 done, 0 filtered, 0 errors, and 1 warnings.

  --  Listing warnings/errors with the following messages: [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file]
[DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aa9f6 message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].
[DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aaa97 message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].
[DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aaa98 message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].
[DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aaa99 message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].
[DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aaa9a message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].
[DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aaa9b message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].
[DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aaa9d message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].

[xplore@full_text_server_01 logs]$
[xplore@full_text_server_01 logs]$ # Or to shorten a little bit the loop command:
[xplore@full_text_server_01 logs]$
[xplore@full_text_server_01 logs]$ command='egrep -i "err|warn" Indexagent_*.log* | sed 's,^[^:]*:,,'
                                   | egrep "${time_regex}"
                                   | egrep "[ (<][0-9a-z]{16}[>) ]"
                                   | sed 's,^[^]]*],,'
                                   | sort -u'
[xplore@full_text_server_01 logs]$
[xplore@full_text_server_01 logs]$ echo; IFS=$'\n'; \
                                   for type in `eval ${command} \
                                     | sed 's,.*\(\[[^\[]*\]\).*,\1,' \
                                     | sort -u`;
                                   do
                                     echo "  --  Listing warnings/errors with the following messages: ${type}";
                                     eval ${command} \
                                       | grep -F "${type}";
                                     echo;
                                   done

  --  Listing warnings/errors with the following messages: [Corrupt file]
[DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f12345007f40e message: DOCUMENT_WARNING CPS Warning [Corrupt file].

  --  Listing warnings/errors with the following messages: [DM_INDEX_AGENT_REINDEX_BATCH]
[DM_INDEX_AGENT_REINDEX_BATCH] Updating queue item 1b0f1234501327f0 with message= Incomplete batch. From a total of 45, 44 done, 0 filtered, 0 errors, and 1 warnings.

  --  Listing warnings/errors with the following messages: [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file]
[DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aa9f6 message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].
[DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aaa97 message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].
[DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aaa98 message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].
[DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aaa99 message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].
[DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aaa9a message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].
[DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aaa9b message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].
[DM_INDEX_AGENT_RECEIVED_FT_CALLBACK_WARN] Received warn callback: id: 090f1234500aaa9d message: DOCUMENT_WARNING CPS Warning [MIME-Type (application/vnd.openxmlformats-officedocument.wordprocessingml.), Unknown file format or corrupt file].

[xplore@full_text_server_01 logs]$

 

So the above was related to a very simple example where a full reindex took only a few minutes because it is a very small repository. But what about a full reindex that takes days because there are several millions of documents? Well the truth is that checking the logs might actually surprise you because it is usually more accurate than checking the Dsearch Admin. Yes, I said more accurate!

 

3. Accuracy of the Dsearch Admin vs the Logs

Let’s take another example with a repository containing a few TB of documents. A full re-index took 2.5 days to complete and in the commands below, I will check the status of the indexing for the 1st day: from 2018-09-19 07:00:00 UTC to 2018-09-20 06:59:59 UTC. Here is what the Dsearch Admin is giving you:

[Screenshot IA4: Document Processing Error Summary report for the first day of the re-index]

So based on this, you would expect 1 230 + 63 + 51 = 1 344 warnings/errors. So what about the logs then? I included below the DM_INDEX_AGENT_REINDEX_BATCH entries, which are the “1b” object_id (item_id) I was talking about earlier, but these aren’t document indexing errors, they are just batches:

[xplore@full_text_server_01 logs]$ time_regex="2018-09-19 0[7-9]|2018-09-19 [1-2][0-9]|2018-09-20 0[0-6]"
[xplore@full_text_server_01 logs]$ command='egrep -i "err|warn" Indexagent_*.log* | sed 's,^[^:]*:,,'
                                   | egrep "${time_regex}"
                                   | egrep "[ (<][0-9a-z]{16}[>) ]"
                                   | sed 's,^[^]]*],,'
                                   | sort -u'
[xplore@full_text_server_01 logs]$
[xplore@full_text_server_01 logs]$ echo; IFS=$'\n'; \
                                   for type in `eval ${command} \
                                     | sed 's,.*\(\[[^\[]*\]\).*,\1,' \
                                     | sort -u`;
                                   do
                                     echo "  --  Number of warnings/errors with the following messages: ${type}";
                                     eval ${command} \
                                       | grep -F "${type}" \
                                       | wc -l;
                                     echo;
                                   done

  --  Number of warnings/errors with the following messages: [Corrupt file]
51

  --  Number of warnings/errors with the following messages: [DM_INDEX_AGENT_REINDEX_BATCH]
293

  --  Number of warnings/errors with the following messages: [DM_STORAGE_E_BAD_TICKET]
7

  --  Number of warnings/errors with the following messages: [Password-protected or encrypted file]
63

  --  Number of warnings/errors with the following messages: [Unknown error during text extraction]
5

  --  Number of warnings/errors with the following messages: [Unknown error during text extraction(native code: 18, native msg: unknown error)]
1

  --  Number of warnings/errors with the following messages: [Unknown error during text extraction(native code: 257, native msg: handle is invalid)]
1053

  --  Number of warnings/errors with the following messages: [Unknown error during text extraction(native code: 30, native msg: out of memory)]
14

  --  Number of warnings/errors with the following messages: [Unknown error during text extraction(native code: 65534, native msg: unknown error)]
157

[xplore@full_text_server_01 logs]$

 

As you can see above, there is more granularity regarding the types of errors from the logs. Here are some key points in the comparison between the logs and the Dsearch Admin:

  1. In the Dsearch Admin, all messages that start with “Unknown error during text extraction” are considered as a single error type (N° 1023). Therefore, from the logs, you can add all of them up: 5 + 1 + 1 053 + 14 + 157 = 1 230, which is the same number that was mentioned in the Dsearch Admin. You cannot separate them in the Dsearch Admin on the Error Summary report; it is only on the Error Details report that you will see the full message and can then separate them, kind of…
  2. You can find exactly the same number of “Password-protected or encrypted file” (63) as well as “Corrupt file” (51) warnings from the logs and from the Dsearch Admin, so no differences here
  3. You can see 7 “DM_STORAGE_E_BAD_TICKET” warnings/errors from the logs but none from the Dsearch Admin… Why is that? That’s because the Dsearch Admin does not have any Error Code for them, so these errors aren’t shown!

So like I was saying at the beginning of this blog, using the Dsearch Admin is very easy but that’s not fun and you might actually miss some information, while checking the logs is fun and you are sure that you won’t miss anything (those 7 DM_STORAGE_E_BAD_TICKET errors for example)!

 

You could just as easily do the same thing in perl or awk; that’s just a question of preference… Anyway, you get the idea: working with the logs allows you to do pretty much whatever you want, but you will obviously need some linux/scripting knowledge, while working with the Dsearch Admin is simple and easy but you will have to work with what OTX gives you and with the restrictions that it has.

 

 

The post Documentum – Checking warnings&errors from an xPlore full re-index appeared first on Blog dbi services.

What's New In Oracle Utilities Application Framework V4 Whitepaper updated

Anthony Shorten - Thu, 2018-09-27 15:53

The What's New In Oracle Utilities Application Framework V4 whitepaper has been updated to reflect the latest changes implemented in Oracle Utilities Application Framework V4.3.0.6.0. The whitepaper has been updated with the latest information and is in the new whitepaper format to make it easier to read.

This is one of the last updates to this whitepaper as it was designed to help older Oracle Utilities Application Framework V2.x customers to understand the changes to the Oracle Utilities Application Framework since those releases. Given most of the information is already in the release notes and the vast majority of customers are in the process of migration or have migrated successfully, the whitepaper has limited use after the next service pack or major release.

The whitepaper is available for download from What's New In Oracle Utilities Application Framework V4 (Doc Id: 1177265.1) from My Oracle Support.

Column Group Catalog

Jonathan Lewis - Thu, 2018-09-27 11:16

I seem to have written a number of articles about column groups – the rather special, and most useful, variant on extended stats. To make it as easy as possible to find the right article I’ve decided to produce a little catalogue (catalog) of all the relevant articles, with a little note about the topic each article covers. Some of the articles will link to others in the list, and there are a few items in the list from other blogs. There are also a few items which are the titles of drafts which have been hanging around for the last few years.
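
For anyone meeting the feature for the first time, here is a generic illustration of how a column group might be created and gathered; the table and column names are placeholders, not tied to any particular article in the list:

select dbms_stats.create_extended_stats(user, 't1', '(col1,col2)') from dual;

begin
        dbms_stats.gather_table_stats(
                user, 't1',
                method_opt => 'for columns (col1,col2) size 254'
        );
end;
/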

 

Oracle Will Seek Affirmation of Cost Award in the United States Supreme Court

Oracle Press Releases - Thu, 2018-09-27 10:32
Press Release
Oracle Will Seek Affirmation of Cost Award in the United States Supreme Court

Redwood Shores, Calif.—Sep 27, 2018

Today’s United States Supreme Court grant of certiorari in the Rimini Street case is limited to a narrow issue regarding an award of more than $12 million in litigation costs that Rimini Street was ordered to pay to Oracle. This award came after a jury found that Rimini Street infringed 93 Oracle copyrights, and this costs appeal has zero impact on the resolved issue that Rimini Street infringed 93 copyrights and the jury award of compensatory damages.

In 2016, the trial court found that Oracle was entitled to an award of attorneys’ fees and costs “because of Rimini’s repeated instances of copyright infringement and its significant litigation misconduct.” While the Supreme Court will take up this narrow costs issue as a matter of law, the underlying conduct is not in question.

“We look forward to addressing this costs issue in our nation’s highest court, and we believe that the Court should reject the attempt by Rimini Street—a dishonest, serial infringer—to avoid fully compensating Oracle for Rimini’s misconduct,” said Dorian Daley, Oracle’s Executive Vice President and General Counsel.

Contact Info
Deborah Hellinger
Oracle Corporate Communications
+1 212.508.7935
deborah.hellinger@oracle.com
About Oracle

The Oracle Cloud offers complete SaaS application suites for ERP, HCM and CX, plus best-in-class database Platform as a Service (PaaS) and Infrastructure as a Service (IaaS) from data centers throughout the Americas, Europe and Asia. For more information about Oracle (NYSE:ORCL), please visit us at www.oracle.com.

Trademarks

Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.

Talk to a Press Contact

Deborah Hellinger

  • +1 212.508.7935

Renaming a RAC cluster

DBA Scripts and Articles - Thu, 2018-09-27 09:34

Introduction: Renaming an Oracle RAC cluster is not an easy thing; unfortunately for me I had to do this today because the name chosen for the newly installed cluster was wrong. Oracle does not provide a simple command to do this and you have to go through a deconfiguration/reconfiguration of the whole cluster. Changing the … Continue reading Renaming a RAC cluster

The post Renaming a RAC cluster appeared first on Oracle DBA Scripts and Articles (Montreal).

Categories: DBA Blogs

create JSON from fields

Tom Kyte - Thu, 2018-09-27 07:06
Hi, I have relation 'resources' in database with 2 fields, 'id' and 'data'. 'id' eg. 25 and 'data' eg. <code>{"href":null,"id":"25","publicIdentifier":null,"description":null,"category":null,"validFor":null,"name":null,"lifecycleState":null,"type":"R...
Categories: DBA Blogs

DBMS_JOB.SUBMIT doesn't do anything in Oracle 12c (12.2.0.1.0)

Tom Kyte - Thu, 2018-09-27 07:06
Hi guys, I have a problem when I try to start a stored procedure with DBMS_JOB.SUBMIT. I have the package "dafneMultithread": <code> CREATE OR REPLACE PACKAGE DAFNE.dafneMultithread AS PROCEDURE start_job(p_procedure_name IN VARCHAR2...
Categories: DBA Blogs
