In general, a tree can be described by a string, using parentheses to demark subtrees. Here's the formula, as a string rewriting system:
TreeFormat :: S -> (AA). ; root with two children.
; final period to help a dumb lex'er
A -> (H) | (AA) ; child leaf | node with two children
H = 1 byte in hex ; content of leaf
characters other than 0-9, a-f, ( and ) are ignored
What this means is that you start with the symbol S,
replace it with the symbol string (AA)., and I intended
that (, ) and . are literal. Then you replace A with
either (H) or (AA), where ( and ) are literal. Then you
replace H with a hexidecimal representation of 1 byte, e.g.
3a.
Example:
+
/ \
23 +
/ \
1b 2c
is ((23)((1b)(2c))). by the sequence of
substitutions:
S -> (AA).
-> ((H)A).
-> ((23)A).
-> ((23)(AA)).
-> ((23)((H)A)).
-> ((23)((1b)A)).
-> ((23)((1b)(H))).
-> ((23)((1b)(2c))).
This whole business is an example of a Context Free Grammar. It
will be back to haunt you Junior or Senior year.
I threw a period on the end of the string to make it easy to input the characters without knowing anything about the format. Input characters to you get a period, then pass the string to a parsing routine. The parsing routine takes the string, looks symbol by symbol, reconstructing the tree. Here's the algorithm:
BUILDTREE
get '(', call BUILDTREE_AUX returns root, get ')', done
BUILDTREE_AUX (recursive)
(1) create tree node, set leaf false, content 0,
local workingOn flag to LEFT.
(2) if char. is a hex digit, set leaf true, update content
(3) if char. is '(', call BUILDTREE_AUX returns left/right
subtree according to whether workingOn is LEFT or RIGHT.
Set workingOne to RIGHT.
(4) if char. is ')', Return pointer to tree node to caller
(5) everything else, ignore.
Each HuffmanTrieNode has a leftChild, rightChild, a boolean flag
marking the node as a leaf or not, and a content location to
be filled in with data on the leaft nodes.
The other side of the story is given a tree, to derive the string describing the tree. This is classic recursuion. Very simple:
DESCRIBE_TREE:
write '(',
if leaf true write content,
else recurse on left s/tree then right s/tree,
write ')',
return.
Oops, don't forget the '.' at the end! I'd make a recursive
describe-tree and a non-recursive shell which adds the dot
at the end, and feeds the recursive version the tree root
as a starting node.
Good luck!