Update: Get source code for this project on CodePlex!  Note the DLR is still being developed and hence a “moving target”.  Some (or many) of the items discussed here may change :)

Part 1 of this post gave a quick overview of writing a custom “parser generator” for Excel formulas.  Here we’ll take a look at a way to bind that to the DLR.  You can try out a sample Silverlight 2.0 application here or by clicking the image below:

image

The goal here is to create a calculation engine that understands Excel-like formulas, allowing for cell dependencies, simple ranges and custom functions.  In the screenshot above, a few identifiers (D5 – D7) have been assigned static values, then included in an expression later on.

Global math functions

Excel provides many global math functions.  I implemented a small subset of these as static methods like so:

/// <summary>
/// Excel global function implementations
/// </summary>
/// <remarks>Include an optional CodeContext as the first parameter if required.</remarks>
public class ExcelHelpers
{
    public static double Exp( double d ) { return Math.Exp( d ); }
    public static decimal Abs( decimal d ) { return Math.Abs( d ); }
    public static double Pow( decimal x, decimal y ) { return Math.Pow( (double)x, (double)y ); }
    public static DateTime Today() { return DateTime.Today; }

    public static decimal Min( params decimal[] p ) { return p.Min(); }
    public static decimal Max( params decimal[] p ) { return p.Max(); }
    public static decimal Round( decimal x, int decimals ) { return Math.Round( x, decimals ); }

    public static double RoundDown( decimal number, int digits )
    {
        var x = Math.Pow( 10, digits ); // 1, 10, 100, 1000, 10000
        var y = (double)number * x;
        return Math.Round( Math.Floor( y ) / x, digits < 0 ? 0 : digits );
    }

    public static double RoundUp( decimal number, int digits )
    {
        var x = Math.Pow( 10, digits ); // 1, 10, 100, 1000, 10000
        var y = (double)number * x;
        bool isExact = Math.Floor( y ) == y;
        return Math.Round( Math.Floor( y + ( isExact ? 0 : 1 ) ) / x, digits < 0 ? 0 : digits );
    }

    public static bool And( params bool[] conditions ) { return conditions.All( i => i ); }
    public static bool Or( params bool[] conditions ) { return conditions.Any( i => i ); }

    public static string Left( string text, int count ) { return !string.IsNullOrEmpty( text ) ? text.Substring( 0, count ) : ""; }
    public static string Right( string text, int count ) { return !string.IsNullOrEmpty( text ) ? text.Substring( text.Length - count ) : ""; }

    public static string Mid( string text, int start, int characters )
    {
        if ( string.IsNullOrEmpty( text ) ) return "";

        var length = text.Length;
        var startPos = start < 0 ? 0 : start - 1; //Javascript is 0 based index, excel is 1 based.
        var returnLength = length - characters;

        return ( characters <= 0 || startPos > length || returnLength < 0 ) // Invalid bound, return blank string
            ? ""
            : text.Substring( startPos, characters );
    }

    public static object Sum( params decimal[] values )
    {
        return values.Sum();
    }

    public static object Search( string find, string withIn ) { return Search( find, withIn, 1 ); }

    public static object Search( string find, string withIn, int start )
    {
        if ( string.IsNullOrEmpty( find ) ) return null;

        string search = withIn.Replace( "~*", "*" ).Replace( "~?", "?" );

        var pos = search.IndexOf( find, start - 1, StringComparison.InvariantCultureIgnoreCase );

        return pos > -1 ? (int?)( pos + 1 ) : null;
    }

    public static string Dump( CodeContext context )
    {
        return Dump( context, false );
    }

    public static string Dump( CodeContext context, bool verbose )
    {
        var scope = context.Scope as WorksheetScope;
        var result = new System.Text.StringBuilder();

        foreach ( var cell in scope.Worksheet.Cells )
        {
            if ( result.Length > 0 ) result.AppendLine();

            var expression = verbose
                ? cell.Value.Formula.Expression.DebugView
                : cell.Value.Formula.ExpressionText;

            result.AppendFormat(
                "{0} = {1}{2}",
                cell.Key,
                cell.Value.Value,
                expression != null ? " [ = " + expression + " ]" : null );
        }

        return result.ToString();
    }

These are all pretty trivial, but notice the Sum method where we’re able to take an array of decimals and use LINQ to sum the values together (go LINQ! :) ). 

You can add any method overrides you like – later we’ll hand them to the DLR and let it figure out the best one to call.  If you try the calculator demo above, you should be able to try these out (Dump() and Dump(true) can be quite revealing :) ).

Building the expression tree

In Part 1 I described using TinyPG to create a parser.  Once you’ve defined your grammar (and fixed any errors) it generates some C# partial classes.  The only one you really need to think about is ParseTree.cs.  Here’s my customized part of the class (remember it’s a partial class, so no need to touch the TinyPG generated file):

public partial class ParseTree
{
    private Dictionary<Operators, ExcelOperationBinder> opBinders = new Dictionary<Operators, ExcelOperationBinder>();

    private ExcelContext context;
    private Dictionary<SymbolId, ParameterExpression> variables = new Dictionary<SymbolId, ParameterExpression>();

    public Expression GetExpression( ExcelContext context )
    {
        this.context = context;

        return GetNodeExpression( this );
    }

    private Expression GetNodeExpression( ParseNode node )
    {
        switch ( node.Token.Type )
        {
            case TokenType.Start:
                return GetStartExpression( node );

            case TokenType.Assignment:
                return GetAssignmentExpression( node );

            case TokenType.CompareExpr:
                return GetComparisonExpression( node );

            case TokenType.AddExpr:
                return GetAddExpression( node );

            case TokenType.MultExpr:
                return GetMultiplyExpression( node );

            case TokenType.Atom:
                return GetAtomExpression( node );

            case TokenType.Range:
                return GetRangeExpression( node );

            case TokenType.TRUE:
                return Expression.Constant( true );

            case TokenType.FALSE:
                return Expression.Constant( false );

            case TokenType.NULL:
                return Expression.Constant( null );

            case TokenType.STRING:
                return GetString( node );

            case TokenType.INTEGER:
                {
                    int iVal;
                    var value = node.Token.Text;

                    return int.TryParse( value, out iVal )
                        ? Expression.Constant( iVal )
                        : Expression.Constant( long.Parse( value ) );
                }

            case TokenType.NUMBER:
                return Expression.Constant( decimal.Parse( node.Token.Text ) );

            case TokenType.Identifier:
                return GetVariableExpression( node );

            case TokenType.Method:
                return GetMethodExpression( node );
        }

        return Expression.Constant( node.Token.Text );
    }

    private IEnumerable<Expression> GetNodeExpressions( ParseNode node )
    {
        var expression = GetNodeExpression( node );

        if ( expression.Type != typeof( Range ) ) return new[] { expression };

        var range = (Range)( (ConstantExpression)expression ).Value;

        return from s in range.GetSymbols() select GetVariableExpression( s );
    }

    private Expression GetStartExpression( ParseNode node )
    {
        return GetNodeExpression( node.Nodes[ 0 ] );
    }

    private Expression GetAssignmentExpression( ParseNode node )
    {
        return Expression.Assign(
            GetVariableExpression( node.Nodes[ 0 ].Token.Text ),
            Utils.Convert( GetNodeExpression( node.Nodes[ 2 ] ), typeof( object ) ) );
    }

    private Expression GetComparisonExpression( ParseNode node )
    {
        ExcelOperationBinder oper = null;
        Expression expr = null;

        foreach ( var n in node.Nodes )
        {
            if ( n.Token.Type == TokenType.COMPARE )
            {
                switch ( n.Token.Text )
                {
                    case "=":
                    case "==": oper = OpBinder( Operators.Equals ); break;
                    case ">": oper = OpBinder( Operators.GreaterThan ); break;
                    case ">=": oper = OpBinder( Operators.GreaterThanOrEqual ); break;
                    case "<": oper = OpBinder( Operators.LessThan ); break;
                    case "<=": oper = OpBinder( Operators.LessThanOrEqual ); break;
                    case "!=":
                    case "<>": oper = OpBinder( Operators.NotEquals ); break;
                }
            }
            else
            {
                var next = GetNodeExpression( n );

                expr = ( expr == null )
                    ? next
                    : Expression.Dynamic( oper, typeof( object ), expr, next );
            }
        }

        return expr;
    }

    private Expression GetAddExpression( ParseNode node )
    {
        ExcelOperationBinder oper = null;
        Expression expr = null;

        foreach ( var n in node.Nodes )
        {
            if ( n.Token.Type == TokenType.PLUSMINUS )
            {
                switch ( n.Token.Text )
                {
                    case "+": oper = OpBinder( Operators.Add ); break;
                    case "-": oper = OpBinder( Operators.Subtract ); break;
                    case "&": oper = OpBinder( Operators.Concatenate ); break;
                }
            }
            else
            {
                var next = GetNodeExpression( n );

                expr = ( expr == null || oper == null )
                    ? next
                    : Expression.Dynamic( oper, typeof( object ), expr, next );
            }
        }

        return expr;
    }

    private Expression GetMultiplyExpression( ParseNode node )
    {
        ExcelOperationBinder oper = null;
        Expression expr = null;

        foreach ( var n in node.Nodes )
        {
            if ( n.Token.Type == TokenType.MULTDIV )
            {
                switch ( n.Token.Text )
                {
                    case "*": oper = OpBinder( Operators.Multiply ); break;
                    case "/": oper = OpBinder( Operators.Divide ); break;
                }
            }
            else
            {
                var next = GetNodeExpression( n );

                expr = ( expr == null || oper == null )
                    ? next
                    : Expression.Dynamic( oper, typeof( object ), expr, next );
            }
        }

        return expr;
    }

    private Expression GetAtomExpression( ParseNode node )
    {
        foreach ( var n in node.Nodes )
        {
            if ( n.Token.Type != TokenType.BROPEN && n.Token.Type != TokenType.BRCLOSE ) return GetNodeExpression( n );
        }

        return null;
    }

    private Expression GetVariableExpression( ParseNode node )
    {
        return GetVariableExpression( node.Nodes.First().Token.Text );
    }

    private Expression GetVariableExpression( string id )
    {
        return GetVariableExpression( SymbolTable.StringToCaseInsensitiveId( id ) );
    }

    private Expression GetVariableExpression( SymbolId symbol )
    {
        return variables.ContainsKey( symbol )
            ? variables[ symbol ]
            : variables[ symbol ] = Expression.Variable( typeof( object ), symbol.ToString() );
    }

    private Expression GetRangeExpression( ParseNode node )
    {
        var cellIds = ( from n in node.Nodes
                        where n.Token.Type == TokenType.CELLID
                        select new CellId( n.Token.Text ) ).ToArray();

        return ( cellIds.Length > 1 )
            ? Expression.Constant( new Range( cellIds ) )
            : GetVariableExpression( cellIds.First().Symbol );
    }

    private Expression GetString( ParseNode node )
    {
        var text = node.Token.Text;

        return Expression.Constant( text.Substring( 1, text.Length - 2 ) );
    }

    private Expression GetMethodExpression( ParseNode node )
    {
        var parameters = ( from n in node.Nodes
                           where n.Token.Type == TokenType.Params
                           from p in n.Nodes
                           where p.Token.Type != TokenType.COMMA
                           from e in GetNodeExpressions( p )
                           select e ).ToArray();

        var name = node.Nodes.First().Token.Text;

        if ( name.ToLower() == "iif" ) return GetConditionExpression( parameters );

        return Expression.Dynamic(
            context.CreateCallBinder( name, true, new CallInfo( parameters.Length ) ),
            typeof( object ),
            new[] { Utils.CodeContext() }.Union( parameters ).ToArray() );
    }

    private Expression GetConditionExpression( Expression[] args )
    {
        return Expression.Condition(
            Utils.Convert( args[ 0 ], typeof( bool ) ),
            Utils.Convert( args[ 1 ], typeof( object ) ),
            Utils.Convert( args[ 2 ], typeof( object ) ) );
    }

    private ExcelOperationBinder OpBinder( Operators oper )
    {
        return opBinders.ContainsKey( oper )
            ? opBinders[ oper ]
            : opBinders[ oper ] = new ExcelOperationBinder(
                context.Binder as ExcelBinder,
                Utils.CodeContext(),
                oper );
    }
}

All it does is recursively walk the parse tree and emit a DLR expression tree (Abstract Syntax Tree).  It has some basic handling for binary operators (including Excel / VB’s ‘&’ concatenation operator).  For some operations (such as binary operators and method calls) it creates DLR DynamicExpression nodes, which hold “just enough” information to define the operation to be performed.

Binders

Before the DLR can make any sense of your DynamicExpressions you need to give it some guidance about how to perform your operations.  To do this you create custom “language binders”, whose job is to provide the DLR with rules defining how to perform operations on various data types.  For this project I created the following classes:

  • ExcelContext – Derives from LanguageContext and provides some compilation helper methods and overrides to create related binders.
  • ExcelBinder – Defines standard operations for various data types and which conversions are valid.
  • ExcelCallAction – Maps “method call” actions to static methods on a helper class (see ExcelHelpers).
  • ExcelConversionBinder – Allows conversions to be overridden; currently just a placeholder.
  • ExcelOperationBinder – Defines how binary operations (add, multiply, comparison operators etc) should behave, performing some basic type coercion.
Operations

You can control the implicit type coercions the DLR is allowed to perform by defining “extension types”.  For each data type, you can create a static class with helper methods to perform the low-level operations.  These methods are marked with the [SpecialName] and [ImplicitConversionMethod] attributes.  You can see some examples here (for this project I wanted to treat all non-integer numbers as decimals).

Other interesting bits

I defined a few classes to represent formulas, engines, cells and worksheets.  The idea is you create an engine and populate it with formulas, then use that engine to create one or more worksheets.  Everything is created / compiled on demand and invalidated appropriately when dependencies change:

  • Cell – Represents an instance of a Formula in a Worksheet.
  • Worksheet – Contains all Cells and an Engine reference.
  • Formula – Contains an ID, expression text, default value and cached expression tree.
  • Engine – Contains all Formulas.
  • Invoker – Caches dynamic delegates (see related post here).
Try it!

You can find the source for this project on CodePlex and you can try the Silverlight calculator test here :)  Don’t forget to look at the retirement planner example I posted too:

image

It’s built on an extended version of this project, able to load an Excel-like worksheet (including lookup tables and some scary long expressions).  As you drag the “retirement age” slider it’s recalculating several hundred table values (pretty quickly thanks to the DLR’s call site caching) and kicking off a tween-like animation (sometimes it’ll slow down as the axis scales etc; that’s just an animation issue rather than a DLR / engine problem :) ).

About these ads