Friday, December 11, 2015

Converting Nested Set Data to Tree Object

In a project at work we decided to use the nested set model. It fit very well with our needs. We set up the database schema and also created some procedures to help with inserting and deleting nodes. The next step is now reading the data from a client.  However we want to read the data and make sure the objects are represented in a tree structure.  So how do we take tabular data and convert it into a tree?  Easy, recursion.  We can sort the data by the left value, which should give us the ability to step through using a depth-first approach.

I'm not going to go into what the nested set model is, I'm assuming that since you are here you know something about it.  If not then I recommend the de facto blog here: http://mikehillyer.com/articles/managing-hierarchical-data-in-mysql/

Data Structures

First step is we need to define a data structure for our nodes.  It will contain the id, name, and list of child nodes.  You'll notice I set the properties for left and right to internal.  This is because there is no value giving users access to these properties so they should only be accessible within the project.


public class Node
{
 private double category_id = 0;
 private string name = "";
 private int left;
 private int right;
 private List<Node> nodes = new List<Node>();

 public Node()
 {
 }

 public Node(double category_id, string name, int left, int right)
 {
  this.CategoryId = category_id;
  this.Name = name;
  this.Left = left;
  this.Right = right;
 }

 public double CategoryId
 {
  get { return category_id; }
  set { category_id = value; }
 }

 public string Name
 {
  get { return name; }
  set { name = value; }
 }

 internal int Left
 {
  get { return left; }
  set { left = value; }
 }

 internal int Right
 {
  get { return right; }
  set { right = value; }
 }

 public List<Node> Nodes
 {
  get { return nodes; }
  set { nodes = value; }
 }
}

Okay this looks good.

Public Method

Next we can create the public method to open a connection to the database, load the data into a list, then pass that list off to a recursive function called "BuildNodes()" that will generate the node hierarchy, which we will then return.  Notice we are sorting the results the by left value in the SQL command.


public Node GetNodes()
{
 Node root = null;

 try
 {
  List<Node> nodes = new List<Node>();

  // connection 
  using (MySqlConnection connection = new MySqlConnection(_connectionstring))
  {
   //Open connection
   connection.Open();

   //Create command
   using (MySqlCommand command = new MySqlCommand("SELECT * FROM nested_category ORDER BY lft", connection))
   {
    //Read the table row
    using (MySqlDataReader reader = command.ExecuteReader())
    {
     //Read row values
     Node node;
     while (reader.Read())
     {
      node = new Node(); 
      node.CategoryId = Convert.ToDouble(reader["id"]);
      node.Text = reader["text"];
      node.Left = Convert.ToInt(reader["lft"]);
      node.Right = Convert.ToInt(reader["rgt"]);

      nodes.Add(node);
     }
    }
   }
  }

  if (nodes.Count == 0)
   return null;

  //Build the node hierarchy
  int nodeIndex = 0;
  root = BuildNodes(nodes, ref nodeIndex);
 }
 catch (Exception ex)
 {
  throw new Exception(ex);
 }

 return root;
}

Recursive Method


The last step is the recursive method to step through the list and convert it to a tree structure.  The trick here is to track 3 values, the row index and then the left and right values of the node in that row index.  The flow is hierarchy
  1. If a leaf node (a node without children) then return the node.
  2. If an internal node (a node with children) then
    1. Continue to step through the list and add children while the child's right value is less than the node's right value.

private Node BuildNodes(List<Node> nodes, ref int nodeIndex)
{
 
 Node node = nodes[nodeIndex];
 if ((node.Right - node.Left) == 1)
 {
  //Leaf
  return node;
 }
 else
 {
  //Internal node     
  Node child;
  int childIndex = node.Left++;
  while (childIndex + 1 < node.Right)
  {
   nodeIndex++;
   child = BuildNodes(nodes, ref nodeIndex);
   if (child != null)
   {
    childIndex = child.Right++;
    node.Nodes.Add(child);
   }
  }    
 }

 return node;
}

This will build up the hierarchy converted from a list.  It assumes that the left and right values are continuous and there are no numbers missing.  If your list filters the tree and there are gaps in the left and right values then good luck and let me know what you find.

Wednesday, December 9, 2015

Inserting and Deleting In a Nested Set Model

Storing tree data in a database can be tricky. There are multiple approaches you can take depending on your project's needs.  For one of my projects I added and removed nodes so infrequently (like once or twice a year) but required thousands of requests per day.  The nested set model was the perfect fit and I got an excited for the chance to play with a new technique that was such a shift in thinking to adjacency lists.  I'm not going to go into what the nested set model is, I'm assuming that since you are here you know something about it.  If not then I recommend the de facto blog here: http://mikehillyer.com/articles/managing-hierarchical-data-in-mysql/

I find using procedures for these tasks to be much easier than requiring users to wrap their head around the nested set model or to try and translate the code to the language in your client.

 

Inserting

One item missing from the blog is the actual code, sure it gives you snippets but not the entire picture.  For example inserting nodes is different if you are dealing with a leaf or internal node.  Just a small oversight that I thought I would resolve by creating a procedure.  It takes as input the id of the parent node that we want to insert and we pass the values for the new node.  The procedure tests if the parent is a leaf node and branches accordingly.

DELIMITER;;
CREATE PROCEDURE `node_insert`(parent_category_id INT, name VARCHAR(255))
BEGIN
 DECLARE parent_left, parent_right INT DEFAULT 0;
 SELECT lft INTO parent_left FROM nested_category 
  WHERE category_id = parent_category_id;
 SELECT rgt INTO parent_right FROM nested_category 
  WHERE category_id = parent_category_id;

 IF parent_right = (parent_left + 1) THEN
  SELECT @my_left := lft FROM nested_category 
   WHERE id = parent_category_id;
  UPDATE nested_category SET rgt = rgt + 2 
   WHERE rgt > @my_left;
  UPDATE nested_category SET lft = lft + 2  
   WHERE lft > @my_left;
  INSERT INTO nested_category(`name`, `lft`, `rgt`) 
   VALUES(name, @my_left  + 1, @my_left + 2);
 ELSE
  SELECT @my_right := rgt FROM nested_category 
   WHERE id = parent_category_id;
  UPDATE nested_category SET rgt = rgt + 2 
   WHERE rgt >= @my_right;
  UPDATE nested_category SET lft = lft + 2  
   WHERE lft >= @my_right;
  INSERT INTO nested_category(`name`, `lft`, `rgt`) 
   VALUES(name, @my_right, @my_right + 1);
 END IF;
END ;;
DELIMITER ;

To call the procedure we will need the id of the parent node and values.  For example if we were to add a node for smartwatches to the PORTABLE_ELECTRONICS that has an id of 6 we can call the procedure like this:

CALL node_insert(6, 'Smartwatch');

 

Deleting

This one is pretty easy. We take as input the id of the node we want to delete.

CREATE PROCEDURE `node_delete`(category_id INT)
BEGIN
 SELECT @my_left := `lft`, @my_right := `rgt`, @my_width := `rgt` - `lft` + 1
  FROM `nested_category`
  WHERE `category_id` = category_id;

 DELETE FROM `nested_category` 
  WHERE `lft` BETWEEN @my_left AND @my_right;

 UPDATE `nested_category` SET `rgt` = `rgt` - @my_width  
  WHERE `rgt` > @my_right;
 UPDATE `nested_category` SET `lft` = `lft` - @my_width 
  WHERE `lft` > @my_right;
END

To call the procedure we just need the id of the node.  For example if we decided that CD players were now obsolete we can delete the node "CD PLAYERS" which has an id of 9.


CALL node_delete(9);