Thursday, May 30, 2013

libxml2 - process node in the tree.

Because the original code example which libxml2 demo tree1.c how to traverse the tree was using Recursive to process the relationship between the children node and the next node, but I think it is dangerous because we don't know when the stack overflow problem happen and it will happen sometime in the future.

the original code example

static void print_element_names(xmlNode * a_node)
{
 xmlNode *cur_node = NULL;

 for (cur_node = a_node; cur_node; cur_node = cur_node->next) {
   if (cur_node->type == XML_ELEMENT_NODE) {
     printf("node type: Element, name: %s\n", cur_node->name);
   }
   print_element_names(cur_node->children);
 }
}

So I checked the xmlNode structure and find out the parent node is the useful hint to traverse the xml node tree.
  1. Process the current node first.
  2. Check the node if there is any children node and yes, process the children node.
  3. Check the node if there is any next node and yes, process the next node.
  4. If there is no children and next node,
    1. check if there is any parent node and yes, go back to the parent node
    2. When in the parent node, because it should be processed in step 0, check if there is any next node.
    3. If there is a next node, repeat step 3.
    4. If there is no any next node, repeat step 4.
  5. Finally there is no next and parent node, the node should be the XML_DOCUMENT_NODE.
void do_indent(int indent)
{
  for (; indent; --indent) {
    putchar(' ');
    putchar(' ');
  }
}

int getNextNode( xmlNode **pCurr, int *deepth )
{
  xmlNode *curr = *pCurr;

  if ( curr->children!=NULL ) {
    /* Move to the children node.
     */
    curr = curr->children;
    *deepth += 1;
  } else if ( curr->next==NULL && curr->parent) {
    /* Go back to the parent node.
       then go to the next node of the parent node or
       go to the parent node of the parent node if there is no any next node behind.
       if there is no any parent node, it should be the XML_DOCUMENT_NODE.
     */
    curr = curr->parent;
    *deepth -= 1;
    while ( curr->next==NULL && curr->parent!=NULL ) {
      curr = curr->parent;
      *deepth -= 1;
    }
    if ( curr->next!=NULL ) {
      curr = curr->next;
    } else {
      /* It should be XML_DOCUMENT_NODE.
       */
      curr = NULL;
    }
  } else {
    curr = curr->next;
  }

  *pCurr = curr;

}

void process_node(xmlNode *pNode)
{
  xmlNode *curr=NULL;

  int indent=0;

  if ( !pNode ) {
    printf("No Node exist\n");
    return;
  }

  for ( curr=pNode ; curr ; getNextNode(&curr, &indent)) {

    if (curr->type == XML_ELEMENT_NODE) {
      do_indent(indent);
      printf("node type: Element, name: %s", curr->name);

      if ( curr->properties ) {
        process_attr(curr->properties);
      }

      printf("\n");
    } else if ( curr->type==XML_TEXT_NODE ) {
      do_indent(indent);
      printf("node type: Text, name: %s\n", curr->content);
    }
  }
}

No comments: