Storing Hierarchical Data Using Nested Set Model

Since I started my career in web development, I've had to deal with storing of Hierarchical data. For years I've been using the adjacent model (recursive structure). This is where you have a record in a table that relates to another record in the same table. It's very easy to work with when you're doing the logical design of the database and also when you're inserting into the table. If you know the depth of the relations than it's still fairly easy to work with. The problem comes when you have no idea of the depth and still need to extract and/or display this hierarchical relationship.

A few months ago, a colleague of mine introduced me to the nested set model. He showed me many tutorials and it wasn't an easy concept for me to grasp right away. He was also kind enough to give me a procedure that he had wrote to convert an adjacent model table into a nested set model table. I converted his MSSQL procedure into one for MySQL and it's the basis of what I'll be sharing with you today.

We'll be working with a problem that I myself have encountered, supervisor-subordinate relationship. Imagine that you will need to create a reporting tool that supervisors could see how well the employees directly underneath them are doing as well as all employees underneath those employees are doing.

This is a very server intensive task when you try to use the adjacency model. One of the most common ways that this would be accomplished is using multiple calls to the database from a server side language, such as PHP, and using alot of PHP logic to sort out all the incoming data. This gets expensive because of all of the calls to MySQL and PHP needing to sort it out. This can also cause you to need to send back more data than you actually need, assuming you need PHP to figure that out.

Below are the employees and to whom they report:

Bob Johnson
  - Tom Smith
    * Tony Jones
    * Sarah Jones
  * Kevin Jones

  - Scott Smith
    * Sam Harris
    * Kelly Harris
      $ Scott Vance

  * Laura Harris

  - Jane Smith
    * Kim Nelson

  - Tony Smith

We will be working with a database called nesteddb and assume that you have granted a user called nesteduser all privileges. You can change these in the scripts below as needed.

First we need to create the nesteddb database and the adjacent model table and populate the adjacent model table:


CREATE DATABASE IF NOT EXISTS nesteddb;
USE nesteddb;

DROP TABLE IF EXISTS `nesteddb`.`adjacentTable`;
CREATE TABLE `nesteddb`.`adjacentTable` (
`userId` int(11) NOT NULL auto_increment,
`supervisorId` int(11) default NULL,
`depth` int(11) default NULL,
`firstName` varchar(20) default NULL,
`lastName` varchar(20) default NULL,
PRIMARY KEY (`userId`),
INDEX(`supervisorId`)
) ENGINE=MyISAM AUTO_INCREMENT=14 DEFAULT CHARSET=latin1;

LOCK TABLES `adjacentTable` WRITE;
INSERT INTO `nesteddb`.`adjacentTable` VALUES (1,NULL,0,'Bob','Johnson'),
(2,1,1,'Tom','Smith'),
(3,1,1,'Scott','Smith'),
(4,1,1,'Jane','Smith'),
(5,1,1,'Tony','Smith'),
(6,4,2,'Kim','Nelson'),
(7,2,2,'Tony','Jones'),
(8,2,2,'Sarah','Jones'),
(9,2,2,'Kevin','Jones'),
(10,3,2,'Sam','Harris'),
(11,3,2,'Kelly','Harris'),
(12,3,2,'Laura','Harris'),
(13,11,3,'Scott','Vance');
UNLOCK TABLES;

Now it's time to take the adjacency table and create a nested set table. Below is the stored procedure in it's entirety.


DELIMITER $$

DROP PROCEDURE IF EXISTS `nesteddb`.`PROC_NESTED_SET` $$
CREATE DEFINER=`nesteduser`@`%` PROCEDURE `nesteddb`.`PROC_NESTED_SET`(IN db_name varchar(100), IN nested_table_name varchar(100))
BEGIN

DROP TABLE IF EXISTS nesteddb.var_list;

CREATE TEMPORARY TABLE nesteddb.var_list (
Node int not null,
Parent int null
);

INSERT INTO nesteddb.var_list (Node, Parent)
SELECT userId as Node, supervisorId as Parent
FROM nesteddb.adjacentTable;

SET @maxRight = 2 * (SELECT COUNT(*) FROM nesteddb.var_list);
SET @counter = 2;
SET @currentTop = 1;

DROP TABLE IF EXISTS nesteddb.nestedTable;
CREATE TABLE IF NOT EXISTS nesteddb.nestedTable (
StackTop int not null,
Node int not null,
Lft int NOT NULL,
Rgt int null,
index(Node)
);

INSERT INTO nesteddb.nestedTable (StackTop, Node, Lft, Rgt)
SELECT @currentTop as StackTop, Node, 1 as Lft, @maxRight as Rgt
FROM nesteddb.var_list
WHERE Parent IS NULL
OR Parent = 0;

DELETE FROM nesteddb.var_list WHERE Parent IS NULL OR Parent = 0;

WHILE @counter < @maxRight DO

IF EXISTS (
SELECT *
FROM nesteddb.var_list as l
WHERE EXISTS (SELECT * FROM nesteddb.nestedTable AS s WHERE s.StackTop = @currentTop AND s.Node = l.Parent)
) THEN
INSERT INTO nesteddb.nestedTable (StackTop, Node, Lft)
SELECT @currentTop + 1 as StackTop, MIN(l.Node) as Node, @counter as Lftt
FROM nesteddb.var_list as l
WHERE EXISTS (SELECT * FROM nesteddb.nestedTable as s WHERE s.StackTop = @currentTop AND s.Node = l.Parent);

DELETE FROM nesteddb.var_list
WHERE Node = (SELECT Node FROM nesteddb.nestedTable WHERE StackTop = @currentTop + 1);

SET @currentTop = @currentTop + 1;

ELSE

UPDATE nesteddb.nestedTable
SET Rgt = @counter, StackTop = -StackTop
WHERE StackTop = @currentTop;

SET @currentTop = @currentTop - 1;

END IF;

SET @counter = @counter + 1;
END WHILE;

END $$

DELIMITER ;