Interfacing ODBC from Linux

Our billing software runs on Windows platforms and uses ODBC to store its data. Recently we called the makers of that software since we wanted to extract some Business Intelligence from our records.

It turned out that they where going to release a new program for that purpose. We asked to evaluate the beta version and to my surprise it just created an Microsoft Excel file with a dynamic table report.

So I worked on the needed SQL query to extract the records we wanted and exported the data into a CSV to be loaded in an Excel worksheet with a dynamic table report. It worked ok, however the sales and marketting people didn’t want to contact a technician every time they needed the reports to be updated.

The problem was that our application server is a Debian box and since we didn’t have to much time to expend on this, we looked for the easiest solution. I found ODBC Socket Server which seems unmaintained but the last version works ok on our Windows Server 2003 machine. It’s a service which acts as a proxy for ODBC databases in the local machine, so they can be accessed from other machines using a socket connection and a very simple XML grammar. Once everything was in place I wrote a simple PHP program to fetch the needed records (a few thousand rows). To my surprise the available PHP libraries to interact with ODBC Socket Server didn’t perform too well. They stored the full response in memory and then build up an array, since we need a few thousend rows, the amount of memory required was too high for our little Debian system.

The following code is a pretty simple set of classes to work with ODBC Socket Server. They fetch the data on demand, using a SAX type xml parser to discard the already processed records. It’s not a really thought of or tested solution, however I’m making it public since it’s far better than existing solutions.

<?php
/*
  The ODBCSock class is an utility object to operate with the ODBC Socket Server
  from http://odbcsock.sourceforge.net

  It's compatible with ODBC Socket Server's XML formats 0 and 2. In our tests
  xml format 2 is significally faster than the others.
  Should run on PHP4 and PHP5, however it's only been tested on PHP5

  This code is copyright (c) 2006 Netxus Foundries
  v1.0 28-May-2006 : Ivan -DrSlump- Montes <imontes@netxus.es>

  This program is free software; you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation; either version 2 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful, but
  WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
  or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
  for more details.
  You should have received a copy of the GNU General Public License along
  with this program; if not, write to the Free Software Foundation, Inc.,
  51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA
*/

class ODBCSock {
  var $handle = null;
  var $server = 'localhost';
  var $port = '9628';
  var $username = '';
  var $password = '';
  var $database = '';

  function ODBCSock( $server, $user, $passwd, $database = false )
  {
    $server = explode(':', $server);
    if (count($server) > 1)
      $this->port = array_pop( $server );
    else
      $this->port = 9628;
    $this->server = implode(':', $server);

    $this->handle = fsockopen( $this->server, $this->port, $errNo, $errStr, 30 );
    if ( ! $this->handle ) {
      return null;
    }

    $this->username = $user;
    $this->password = $passwd;

    if ($database !== false)
      $this->selectDB( $database );
  }


  function selectDB( $database )
  {
    $this->database = $database;
  }


  function & query( $query )
  {
    $pl = array();
    $pl[] = '<?xml version="1.0"?>';
    $pl[] = '<request>';
    $pl[] = '<connectionstring>DSN=' . HTMLSpecialChars($this->database) . ';UID=' . HTMLSpecialChars($this->username) . ';PWD=' . HTMLSpecialChars($this->password) . ';</connectionstring>';
    $pl[] = '<sql>' . HTMLSpecialChars( $query ) . '</sql>';
    $pl[] = '</request>';

    fwrite( $this->handle, implode("\r\n", $pl) );

    $rs = new ODBCSock_Result( $this );
    if ($rs->sax->error) return FALSE;
    else return $rs;
  }
}


class ODBCSock_Result {
  var $rows = array();
  var $conn;
  var $lastRow = 0;

  function ODBCSock_Result( &$conn )
  {
    $this->conn =& $conn;

    $this->sax = new ODBCSock_SAX( $this );
    $this->parser = xml_parser_create();
    xml_set_object( $this->parser, $this->sax );
    xml_parser_set_option( $this->parser, XML_OPTION_CASE_FOLDING, false );

    xml_set_element_handler( $this->parser, 'startElement', 'endElement' );
    xml_set_character_data_handler( $this->parser, 'content' );

    while ( $this->_read() ) {
      if ($this->sax->error || $this->sax->parsing)
        break;
    }
  }

  function _eof()
  {
          return feof($this->conn->handle);
  }

  function _read()
  {
          xml_parse( $this->parser, fread( $this->conn->handle, 4096 ), feof($this->conn->handle) );
  }

  function addRow( $row )
  {
          $this->rows[] = $row;
  }


  function fetch( $row = false )
  {
    if ($row === false) {
      $row = $this->lastRow;
      $this->lastRow++;
    } else {
      $this->lastRow = $row+1;
    }

    do {
      $this->_read();

      if (isset($this->rows[$row]))
        return $this->rows[$row];

    } while ( !$this->_eof() );
  }
}

class ODBCSock_SAX {

  var $error = false;
  var $parsing = false;

  var $curElement = '';
  var $curColumn = '';
  var $curColumnIdx = 0;
  var $curRowIdx = 0;

  var $rs = null;
  var $row;
  var $labels = array();

  function ODBCSock_SAX   ( &$rs )
  {
    $this->rs =& $rs;
  }

  function startElement( $parser, $name, $attrs )
  {
    $this->curElement = $name;

    if ( $name == 'result' ) {
      if ($attrs['state'] == 'success')
        $this->parsing = true;
      else
        $this->error = true;
    } else if ( $name == 'row' ) {
      $this->row = array();
      $this->curColumnIdx = 0;
    } else if ($name == 'column') {
      if ($attrs['name']) {
        if ($curRowIdx == 0)
                $this->labels[] = $attrs['name'];

        $this->curColumn = $attrs['name'];
      } else {
        $this->curColumn = $this->labels[$this->curColumnIdx];
      }

      $this->row[ $this->curColumn ] = '';
      $this->curColumnIdx++;
    }
  }

  function endElement( $parser, $name )
  {
    if ( $name == 'row' ) {
      $this->rs->addRow( $this->row );
      $this->curRowIdx++;
    }
  }

  function content( $parser, $data )
  {
    if ($this->curElement == 'column') {
      $this->row[ $this->curColumn ] .= $data;
    }
  }
}

The Binary Search algorithm

Long time since I last wrote here so I’ll take the opportunity to introduce (just joking) the Binary Search algorithm.

I was looking for a safe way to find the line of text where a user clicks the mouse. I needed this for the TextArea+ editor. Fortunately Mozilla supports the DOM3’s compareDocumentPosition() function, so I took advantatge of it. Since I’m using a UL with an LI child element for each line of text, I can say that UL.childNodes is an ordered array, which fits perfectly for the Binary Search algorithm.

The algorithm is really simple (I’m sure most of you have used it before), we just need to find the middle point between two points and compare its value with the one we’re looking for. We do this recursively until we either find the value or the right point becomes smaller than the left one, which means that the value is just not there.

Here comes a really simple example of this algorithm (without using recursive calls)

var left = 0;
var right = 100;

while (left <= right) {

    middle = floor( (right-left)/2 ) + left;
    if (haystack[mid] == needle) {
        alert('Found!');
        break;
    }
    if ( needle < haystack[middle] )
        right = middle-1;
    else
        left = middle+1;
}

The Binary Search algorithm performs at O(log n) time, but for TextArea+ I use a number of further optimizations. Since it’s likely that the user will click closely to where the cursor currently is, I just check the N precedding and following lines before. If this fails I check the current viewport and if still not found, then just do a full search. And this is just for the Mozilla code path! hell, how I hate cross-browser coding!

This post is meant to remind us how important is to learn already known and proved algorithms and that even when having a good algorithm, there is always room to optimize by taking advantge of our specific problem.

Programmatic reflection effect

Today Haripako pointed us to Thomas Fuchs (of script.aculo.us fame) DHTML image reflection effect

It uses 1px tall divs with an adjusted top offset for the containing image and the opacity property to perform it. Well, you just need to know some basic arithmetics to create this kind of effects, but I guess that if people find this kind of stuff so cool is just because most of them only do maths to check their wages :)

So here is some PHP to perform the same effect on the server side

<?php
$file = 'test.png';

// Configuration params
$mirrorSize = 1.35;   /* 1 .. 2 */
$startAlpha = 80;     /* 0 .. 127 */

// Adjust params to supported limits
$mirrorSize = max( 1, min($mirrorSize, 2) );
$startAlpha = max( 0, min($startAlpha, 127) );

// Load the source file
$imgSrc = ImageCreateFromPng( $file );

// Create the reflect portion
$imgRfl = ImageCreateTrueColor( ImagesX( $imgSrc ), floor(ImagesY( $imgSrc ) * ($mirrorSize-1)) );
ImageAlphaBlending($imgRfl, true);
ImageFill( $imgRfl, 0,0, ImageColorAllocate( $imgRfl, 255,255,255 ) );

$offset = ImagesY( $imgSrc ) - ImagesY( $imgRfl );

// In this loop we flip the source image and apply a fade out
for ($y = 0; $y < ImagesY( $imgRfl ); $y++)
{
    ImageCopy( $imgRfl, $imgSrc, 0,$y, 0,ImagesY( $imgSrc )-$y-1, ImagesX( $imgRfl ), 1 );
    $alpha = floor( ($y*(127-$startAlpha)) / ImagesY( $imgRfl ) );
    $color = ImageColorAllocateAlpha( $imgRfl, 255,255,255, 127-$startAlpha-$alpha );
    ImageFilledRectangle( $imgRfl, 0, $y, ImagesX( $imgRfl )-1, $y+1, $color );
}

// Create the image buffer which will be outputted
$imgDst = ImageCreateTrueColor( ImagesX( $imgSrc ), ImagesY( $imgSrc ) * $mirrorSize );
ImageAlphaBlending($imgDst, false);

// Paste into the output buffer the original image and the reflect portion
ImageCopy( $imgDst, $imgSrc, 0, 0, 0, 0, ImagesX( $imgSrc ), ImagesY( $imgSrc ) );
ImageCopy( $imgDst, $imgRfl, 0, ImagesY( $imgSrc ), 0, 0, ImagesX( $imgRfl ), ImagesY( $imgRfl ) );

ImageDestroy( $imgSrc );
ImageDestroy( $imgRfl );

// Output the image
header('Content-type: image/png');
ImageSaveAlpha( $imgDst, true );
ImagePng( $imgDst );

ImageDestroy( $imgDst );

I’ve tried to keep it very simple, so it’ll only work with PNG files. Moreover it’s hardcoded for white backgrounds, althought it can be easily changed.

If you use this be sure to cache the result, since the cpu cycles to process images is quite high.

Implementing a line based code editor

To my surprise I couldn’t find much info on the net about implementing text editors. So I had to figure it out on my own, however as often happens, while implementing it I found some info when looking for related stuff.

There seems to be two common ways to implement text editors, those which are line-based and those which are block-based. I started with the block-based approach, dividing the text buffer in tokens and storing the position and color attributes in each block. Soon I found that this approach gets pretty complex quite fast. So I turned the code into a line-based editor. It’s almost like the block based with the main difference that it makes an intermediate division based on lines of text. This allows to simplify the code at the expense of a bit of overhead.

For each line of text we need to store the actual text (or the coordinates to fetch it from a buffer), the offset relative to the start of the text, the parser state at the start of the line and block definitions for that line. Since we have broken down the text in lines, for each block we just need to store the offset relative to the start of line which contains it, the length of the block and the style of it. The following figure illustrates the whole concept.

Figure

To make all this work we just need a function which parses a single line of text. This function would take as arguments a string with the line of text and the parser context at the start of that line. It will return the blocks found and the parser context at the end of the line.

The code would roughly look like this:

txtLines = split('\n', text);
lines = new Array();
context = new Context();
offset = 0;
for (i=0; i < count(txtLines); i++) {
   lines[i].source = txtLines[i];
   lines[i].offset = offset;
   lines[i].state = state;
   lines[i].blocks = parseLine( txtLines[i], context );
   offset += strlen( txtLines[i] );
}

The advantatge of the line based approach is that it’s very easy to reparse the text to display any changes on the syntax highlighting. To do so we just need to parse each line from the modified one to the end until the parser context matches

context = line[i].context;
do {
   lines[i].blocks = parseLine( lines[i].text, context );
   i++;
} while ( context != line[i].context );

Most of the time, buffer changes will only affect a line and most lines will only be a few chars long, so we have a really optimized solution with only a few lines of code.

In overall, writing a fast text editor is quite simple if we only optimize for the common cases. In fact, I’m having more problems optimizing the display routines than the buffer updating ones.