MySQL 5.1 Partitioning and the Key Cache

User-defined partitioning looks like it is set to be a useful feature in MySQL. It allows you to distribute portions of individual tables across a file system according to rules which you can set largely as needed.

I was looking into it to get around a problem with running OPTIMIZE against large tables: optimizing needs more free disk space than the size of the table itself. Without that space, MySQL neither completes the operation nor fails with an error. It just sits there idle, holding a lock on the table (this is a reported bug).

By partitioning, I could split a large table into several pieces, and then optimize each of these separately and not need as much free space on the disk.
Partitioning can also make some queries run faster, because MySQL can tell which partition the data is sitting on and search only that one rather than the entire set of data.

However – tucked away at the bottom of one of the last manual pages is a very important limitation: Key caches are not supported for partitioned tables.
The key cache helps to minimize disk I/O by keeping table indexes in memory. With a partitioned table, each time a query is executed the index files are either served from the native file system buffering (provided by the operating system) or have to be read from disk. I’m not sure I want to try that on a table with several million rows.

Key caching for partitioned tables was fixed under bug #39637, but the fix is only available in MySQL 5.5, which isn’t yet suitable for production.

For now, I’m going to create a merge table based on a collection of small tables and optimize these separately when necessary.
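
As a rough illustration of the merge table approach, here is a small PHP sketch that builds the SQL involved: one CREATE TABLE for the MERGE table and one OPTIMIZE TABLE per underlying table. The table names (log_2009, log_2010, log_all), the column list and the helper functions are all made up for the example. It assumes the underlying tables are identically structured MyISAM tables, which is what the MERGE engine expects.

```php
<?php
// Sketch only: table names, columns and helper names are hypothetical.

// Wrap an identifier in backticks, doubling any backtick inside it.
function quoteName($name) {
    return '`' . str_replace('`', '``', $name) . '`';
}

// Build the DDL for a MERGE table over identically structured MyISAM tables.
function buildMergeTableSql($mergeName, $columnsSql, array $tables) {
    $union = implode(', ', array_map('quoteName', $tables));
    return sprintf(
        'CREATE TABLE %s (%s) ENGINE=MERGE UNION=(%s) INSERT_METHOD=LAST',
        quoteName($mergeName), $columnsSql, $union
    );
}

// Build one OPTIMIZE statement per underlying table, so each can be run
// separately and only needs free space for that one table.
function buildOptimizeSql(array $tables) {
    $statements = array();
    foreach ($tables as $table) {
        $statements[] = 'OPTIMIZE TABLE ' . quoteName($table);
    }
    return $statements;
}

$tables = array('log_2009', 'log_2010');
echo buildMergeTableSql('log_all', 'id INT NOT NULL, message CHAR(20), INDEX(id)', $tables), ";\n";
foreach (buildOptimizeSql($tables) as $sql) {
    echo $sql, ";\n";
}
```

The script only prints the statements, so you can check them before running them through a MySQL client.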

CakePHP paging and sorting on a Custom DataSource

The CakePHP documentation recommends using a custom DataSource to work with a web service API.
Their example of creating a Twitter DataSource doesn’t mention how to achieve paging and sorting.

Neil Crookes has a good example on his website, accessing Google Analytics data via REST and paging through it – however this involves modifying the cake core and overriding the pagination methods in the model.
The cake team do not look keen on this approach and rejected it when someone raised it as a bug.

The DataSource example by LoadSys includes paging, but sorting only works by specifying it in the controller, so it won’t work using the pagination helper in the view.

I’ve combined their approaches to work out how to allow paging and sorting in your DataSource without having to modify the other parts of your application, so it can work in just the same way as if you were using a database table.

  1. Your DataSource needs a calculate method. This is where the SQL to calculate the number of rows would normally be generated; instead, you can return a flag that tells your code to generate a count later on
  2. Add caching to the DataSource read method, because cake will fire it twice: the first time to get a count of the total available rows, the second time to request a page of the data. If you cache the data on the first request, the second request can just read from the cache
  3. Add code to the DataSource read method to pick up the calculate request and return a count of how many rows in total are available
  4. Make sure that the describe method exists and returns a meaningful schema – this is necessary for the sorting to work
  5. Add methods to carry out the pagination or sorting
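
To make the flow in these steps a bit more concrete, here is a stripped-down sketch that runs without CakePHP. SketchSource, fetchRemote and the sample data are hypothetical stand-ins: a real DataSource extends cake's DataSource class, receives the model as an argument and returns the count wrapped in a result array, and the cache here is just an in-memory array.

```php
<?php
// Sketch only: SketchSource, fetchRemote and the sample data are hypothetical
// stand-ins, not part of CakePHP.
class SketchSource {
    public $fetches = 0;        // how many times the "web service" was hit
    private $cache = array();   // stand-in for a real cache, keyed by URL hash

    // Step 1: return a flag instead of generating SQL.
    public function calculate($func) {
        return '__' . $func;
    }

    // Steps 2 and 3: cache the response, answer count requests,
    // otherwise return the requested page.
    public function read($url, array $queryData) {
        $key = md5($url);
        if (!isset($this->cache[$key])) {
            $this->cache[$key] = $this->fetchRemote($url);
        }
        $items = $this->cache[$key];

        if (isset($queryData['fields']) && $queryData['fields'] === '__count') {
            return count($items);
        }
        $limit = $queryData['limit'];
        $offset = $limit * ($queryData['page'] - 1);
        return array_slice($items, $offset, $limit);
    }

    // Pretend remote call that always returns five rows.
    private function fetchRemote($url) {
        $this->fetches++;
        return array('a', 'b', 'c', 'd', 'e');
    }
}

$source = new SketchSource();
$url = 'http://example.com/feed.json';
// First call: the count request, as the paginator would make it.
$total = $source->read($url, array('fields' => $source->calculate('count')));
// Second call: the page request, served from the cache.
$page = $source->read($url, array('limit' => 2, 'page' => 2));
echo $total, ' rows, page 2: ', implode(',', $page), ', fetches: ', $source->fetches, "\n";
```

Both calls use the same URL, so the second one is answered from the cache and the pretend web service is only hit once.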

Expanding on the Twitter example from the cake documentation, here is what you need to do in detail to enable paging and sorting.

  1. In the datasources folder, create twitter_source.php and paste the example code in
  2. In the models folder, create tweet.php and paste the example code in
  3. In config/database.php paste the datasource definition in, substituting your login and password
  4. In the controllers folder create twitter_controller.php and paste the following code, note that you aren’t doing anything differently in here than you would to access a database table:
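    class TwitterController extends AppController {
    	var $name = 'Twitter';
    	var $uses = array('Tweet');
    	function index() {
    		$this->paginate = array('limit' => 2);
    		// hand the current page of tweets to the view as $tweets
    		$this->set('tweets', $this->paginate('Tweet'));
    	}
    }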
  5. The view requires a little more code to add the pagination and sorting controls, this is the same sort of code that would be generated by the scaffolding:
    <p> <?php echo $paginator->counter(array( 'format' => __('Page %page% of %pages%, showing %current% records out of %count% total, starting on record %start%, ending on %end%', true) )); ?> </p>
    <table cellpadding="0" cellspacing="0">
      <tr>
        <th><?php echo $paginator->sort('id');?></th>
        <th><?php echo $paginator->sort('text');?></th>
        <th class="actions"><?php __('Actions');?></th>
      </tr>
      <?php $i = 0; foreach ($tweets as $item): $class = null; if ($i++ % 2 == 0) { $class = ' class="altrow"'; } ?>
      <tr<?php echo $class;?>>
        <td><?php echo($item['Tweet']['id']);?></td>
        <td><?php echo($item['Tweet']['text']);?></td>
        <td class="actions"><?php echo $html->link(__('View', true), array('action' => 'view', $item['Tweet']['id'])); ?></td>
      </tr>
      <?php endforeach; ?>
    </table>
    <div class="paging"> <?php echo $paginator->prev('<< '.__('previous', true), array(), null, array('class'=>'disabled'));?> | <?php echo $paginator->numbers();?> <?php echo $paginator->next(__('next', true).' >>', array(), null, array('class'=>'disabled'));?> </div>

Up until now, we have just been putting the example files into the correct locations and setting up the view. Now come the modifications to the DataSource.

Edit twitter_source.php and make the following changes:

  1. Add a calculate method:
    function calculate(&$model, $func, $params = array()) {
    	// return a flag (e.g. '__count') for the read method to pick up
    	return '__'.$func;
    }
  2. Replace line 61 of the read method:
    $response = json_decode($this->connection->get($url), true);

    With the following to enable caching of the response

    $cachePath = 'tweet_'.md5($url);
    // try the cache first; entries expire after a minute
    $response = cache($cachePath, null, '+1 minute');
    if ( !$response ) {
    	$response = $this->connection->get($url);
    	cache($cachePath, $response);
    }
    $response = json_decode($response, true);
  3. Add a method to get a single page from the returned data:
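    function __getPage($items = null, $queryData = array()) {
    		// no limit means no paging: return everything
    		if ( empty($queryData['limit']) ) {
    			return $items;
    		}
    		$limit = $queryData['limit'];
    		// default to the first page if none was requested
    		$page = empty($queryData['page']) ? 1 : $queryData['page'];
    		$offset = $limit * ($page-1);
    		return array_slice($items, $offset, $limit);
    }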
  4. Add a method to carry out sorting returned data:
    function __sortItems(&$model, $items, $order) {
    		if ( empty($order) || empty($order[0]) ) {
    			return $items;
    		}
    		$sorting = array();
    		foreach( $order as $orderItem ) {
    			if ( is_string($orderItem) ) {
    				$field = $orderItem;
    				$direction = 'asc';
    			}
    			else {
    				foreach( $orderItem as $field => $direction ) {
    					// just picks up the field name and direction
    					continue;
    				}
    			}
    			$field = str_replace($model->alias.'.', '', $field);
    			$values =  Set::extract($items, '{n}.'.$field);
    			// compare dates as timestamps rather than as strings
    			if ( in_array($field, array('lastBuildDate', 'pubDate')) ) {
    				foreach($values as $i => $value) {
    					$values[$i] = strtotime($value);
    				}
    			}
    			$sorting[] = $values;
    			switch(low($direction)) {
    				case 'asc':
    					$direction = SORT_ASC;
    					break;
    				case 'desc':
    					$direction = SORT_DESC;
    					break;
    				default:
    					trigger_error('Invalid sorting direction '. low($direction));
    			}
    			$sorting[] = $direction;
    		}
    		// the items go in last, by reference, so array_multisort reorders them
    		$sorting[] = &$items;
    		$sorting[] = $direction;
    		call_user_func_array('array_multisort', $sorting);
    		return $items;
    }
  5. Add the following to the read method, above the final line (No. 70, return $results) to call the paging method, call the sorting method and return the item count:
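    //return item count (before paging, so that it reflects the total)
    if ( Set::extract($queryData, 'fields') == '__count' ) {
    	return array(array($model->alias => array('count' => count($results))));
    }
    //sort the results, then cut out the requested page
    $results = $this->__sortItems($model, $results, $queryData['order']);
    $results = $this->__getPage($results, $queryData);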
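
If the sorting method looks a bit dense, the underlying idea is easier to see on its own. The following is a minimal sketch, outside CakePHP, of sorting rows by one field with array_multisort and then slicing out a page. sortRows and the sample rows are hypothetical and only cover a single sort field.

```php
<?php
// Sketch only: sortRows and the sample rows are hypothetical.

// Sort rows by one field; $direction is 'asc' or 'desc'.
function sortRows(array $rows, $field, $direction = 'asc') {
    // Pull the sort field out into its own array, one value per row.
    $column = array();
    foreach ($rows as $i => $row) {
        $column[$i] = $row[$field];
    }
    $flag = (strtolower($direction) === 'desc') ? SORT_DESC : SORT_ASC;
    // array_multisort sorts $column and reorders $rows to match.
    array_multisort($column, $flag, $rows);
    return $rows;
}

$rows = array(
    array('id' => 3, 'text' => 'third'),
    array('id' => 1, 'text' => 'first'),
    array('id' => 2, 'text' => 'second'),
);
$sorted = sortRows($rows, 'id', 'desc');
$page = array_slice($sorted, 0, 2);   // page 1 with a limit of 2
foreach ($page as $row) {
    echo $row['id'], ' ', $row['text'], "\n";
}
```

Sorting happens before slicing, so each page is a window onto the fully sorted data rather than a sorted copy of one unsorted page.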

That’s it. You can download the twitter_source and the view index to save a lot of cutting and pasting.

I’m sure there are many improvements to this that people can suggest – I’m new to CakePHP myself, but I hope you find it useful.

How to make Nacho Cheese

Here’s a very simple way to make some half decent Nacho Cheese using ingredients you may have at home.

Take a jar of salsa and some cheddar cheese, mix in a ratio of 2 parts cheese to 1 part salsa (e.g. 200g cheese to 100g salsa).

Microwave until the cheese has melted, remove and stir.

IUS MySQL – how to enable the Archive storage engine

IUS is a great way of getting more recent versions of PHP and MySQL onto a Red Hat Enterprise Linux (RHEL) or CentOS server than are normally available, without the trouble of having to compile them yourself.

Here is how to add the Archive storage engine to IUS MySQL. This applies to MySQL 5.1.41 but will probably be the same for subsequent versions.

  1. Install the Archive storage engine as a plugin:
    yum install mysql51-plugins-archive*
  2. Tell MySQL to include the plugin via a client program:
    INSTALL PLUGIN archive SONAME 'ha_archive.so';

That’s all there is to it.

The blackhole, example, federated and innodb (beta) plugins are also available – you can view the list here.

UPDATE 15/07/10

This isn’t required any more: as of version 5.1.48-2, the Archive storage engine is built in.

Eclipse PDT – code assist or PHP Manual not working

If you are new to Eclipse PDT and find that code assist or the PHP Manual (Shift+F2 or Open PHP Manual) are not working for a particular project, check that you have a file named .buildpath in the root of your project.

If there is nothing there, try creating a new project (File / New / PHP Project), then grab the file from there and copy it into your existing project.

Alternatively just paste in the following:
<?xml version="1.0" encoding="UTF-8"?>
<buildpath>
	<buildpathentry kind="src" path=""/>
	<buildpathentry kind="con" path="org.eclipse.php.core.LANGUAGE"/>
</buildpath>
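
If you have several projects to fix, a short script can drop the file in for you. This is only a sketch: ensureBuildpath is a made-up helper, not something that ships with Eclipse PDT, and it assumes the default contents above are what your project needs. It leaves any existing .buildpath alone.

```php
<?php
// Sketch only: ensureBuildpath is a hypothetical helper, not part of Eclipse PDT.

// Write a default .buildpath into $projectDir unless one already exists.
// Returns true if a file was created, false if one was already there
// (or if the file could not be written).
function ensureBuildpath($projectDir) {
    $file = rtrim($projectDir, '/\\') . DIRECTORY_SEPARATOR . '.buildpath';
    if (file_exists($file)) {
        return false;
    }
    $xml = '<?xml version="1.0" encoding="UTF-8"?>' . "\n"
         . "<buildpath>\n"
         . "\t" . '<buildpathentry kind="src" path=""/>' . "\n"
         . "\t" . '<buildpathentry kind="con" path="org.eclipse.php.core.LANGUAGE"/>' . "\n"
         . "</buildpath>\n";
    return file_put_contents($file, $xml) !== false;
}

// Only act when a project directory is given on the command line.
if (isset($argv[1])) {
    echo ensureBuildpath($argv[1]) ? "Created .buildpath\n" : "No file written\n";
}
```

Run it from the command line with the project directory as the only argument, then refresh the project in Eclipse.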