CodeBetter.Com

Jeremy D. Miller -- The Shade Tree Developer

Under the hood and working with .Net, TDD, Software Design, and Agile Stuff

May 2006 - Posts

  • Achieve Better Results by following Jeremy's Third Law of TDD: Test Small Before Testing Big

    Getting back on track with TDD content.  One of the most important lessons the software development community has learned is that productivity follows from more frequent feedback cycles.  How fast can you go from writing a piece of code to knowing that code does what it's supposed to do?  Do the symbols on the UML diagram you've worked on all week translate to a design that works?  Does the code you just wrote work?  How often can you get working screens in front of the end users to get usability feedback?  How quickly can you get code into testing, and how fast can the testers validate that code against the desired functionality?  How fast can you identify and correct problems with your code/design/architecture?  All of these "how fast?" questions are directly related to the granularity of your testing and coding.

    Test Driven Development is an important and valuable tool to achieve rapid feedback cycles, but I've found that the benefits of TDD are only achieved if you religiously write and test software in small pieces.  That lesson is my third (but arguably the most important) law of TDD -- "Test Small Before Testing Big."  Putting it another way, "Code from the Bottom Up."  Don't be fooled by the name Test Driven Development; the benefits of TDD are felt throughout the software lifecycle.  It's not just extra overhead from writing more unit tests.  Truly adopting TDD is about:

    • Coding in an efficient manner, and sustaining that efficiency over time
    • Enabling continuous and adaptive design techniques
    • Designing an application that is resilient in the face of change
    • Faster removal of defects from code
    • Making an application easier and quicker to test

    TDD felt like more work to me at first, but over the last three or four years I've learned how to design and build code as a series of small steps that are validated every step along the way, and my results have improved considerably.  To soften the learning curve for people new to TDD, and to gain some lucidity with my own thinking on TDD, I am distilling the lessons (lumps) I've learned about TDD into Jeremy's Laws of TDD (not that the list is particularly original or new).

    1. Isolate the Ugly Stuff
    2. Push, Don't Pull
    3. Test Small before Testing Big (this post)
    4. Avoid a long tail
    5. Favor composition over inheritance
    6. Go declarative whenever possible
    7. Don't treat testing code like a second class citizen
    8. Isolate your unit tests, or suffer the consequences!
    9. The unit tests will break someday
    10. Unit tests shall be easy to setup

    And the overriding "Zeroth Law" is "If code is hard to test, change it."

    Case Study #1:  Code from the Bottom Up

    Last week my colleague and I coded a couple of related user stories that illustrate the advantages of following the "Code from the Bottom Up" law, both in the negative and positive. 

    In the first story we were taking a "manifest" message that contained a summary of an invoice in progress.  We needed to run a series of validations against the manifest, compare the manifest to existing data, then run the manifest data through a subset of our existing rules engine.  I noticed my colleague had that furrowed brow look we all get when the code isn't flowing out of our fingertips.  When I asked what was up, he told me he was having trouble getting started because he couldn't see how all the pieces fit together.  I gave the best advice I know for that situation -- "Don't care [about the whole].  What do you know how to do?"  In this case we could start by validating the requested sender and recipient of the invoice message.  We first built a class that would take in the sender and recipient IDs and return a list of possible problems (users don't exist, sender doesn't have a relationship to the recipient, etc.).  That class would take in an IDomainSource object that acts as a repository of User objects:

        /// <summary>
        /// Checks the validity of a sender/recipient id pair
        /// Uses the IDomainSource repository to locate User objects
        /// and their relationships to each other
        /// </summary>
        public class AddressBookValidator
        {
            private readonly IDomainSource _source;

            public AddressBookValidator(IDomainSource source)
            {
                _source = source;
            }
        }

    with a method for the validation like this:

    public EventMessage[] ValidateAll(string senderId, string receiverId)

    We went on to unit test this class by mocking the IDomainSource dependency and exercising all the permutations we could think of.  With that task completed, we moved on to the next validation against an existing domain object (a "Bundle").  We started by assuming we had already fetched the existing Bundle object and were comparing it to the Manifest message.  That led to a simple method that can be tested with pure state-based testing (we just needed to check the values of the EventMessage objects being returned):

    public static EventMessage[] ValidateManifestAgainstBundle(Manifest manifest, Bundle bundle)
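
    A state-based test of a method with this shape needs no mocks at all: build two objects in memory, call the method, and look at what comes back.  The sketch below is only an illustration.  The Total field, the EventMessage constructor, the ManifestRules class name, and the "totals must match" rule are all assumptions made up for the example, not the real rules from the project.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the real message and domain types
public class Manifest
{
    public decimal Total;
}

public class Bundle
{
    public decimal Total;
}

public class EventMessage
{
    public readonly string Text;

    public EventMessage(string text)
    {
        Text = text;
    }
}

public static class ManifestRules
{
    // Assumed rule for illustration: the manifest total must match the bundle total
    public static EventMessage[] ValidateManifestAgainstBundle(Manifest manifest, Bundle bundle)
    {
        List<EventMessage> messages = new List<EventMessage>();
        if (manifest.Total != bundle.Total)
        {
            messages.Add(new EventMessage("Manifest total does not match the bundle"));
        }

        return messages.ToArray();
    }
}

public static class Program
{
    public static void Main()
    {
        Manifest manifest = new Manifest();
        manifest.Total = 100m;

        Bundle bundle = new Bundle();
        bundle.Total = 250m;

        // Pure state-based check: no repository, no mocks, just inputs and outputs
        EventMessage[] messages = ManifestRules.ValidateManifestAgainstBundle(manifest, bundle);

        Console.WriteLine(messages.Length);   // 1
        Console.WriteLine(messages[0].Text);  // Manifest total does not match the bundle
    }
}
```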

    Once we had this method completely tested, we still had the issue of fetching the correct Bundle object to validate against.  We moved on to this method and mocked the IBundleRepository dependency in the unit tests.

            public string[] Validate(Manifest manifest)
            {
                Bundle bundle = _bundleRepository.FindBundle(manifest.ControlNumber);
                EventMessage[] messages = ValidateManifestAgainstBundle(manifest, bundle);

                return ResourceMessageProcessor.ToMessageArray(messages);
            }

    Now, we've dodged around the rules engine enough.  To activate the core of the rules engine we need to pass it data in the form of its own canonical data structure and an array of rule objects.  First we need to translate the Manifest object into the rules data structure.  That led us to build and test the method below:

    public InvoiceDataSet BuildInvoiceDataSet(Manifest manifest)

    Next we built a class that used the normal rules engine configuration service to retrieve all the rules for a specific receiver and select the subset of the rules that are useful in the new context.  Once we had those two pieces completed, activating the rules engine was simple.

    At this point we have little classes that execute all of the various validation rules, but we're still lacking the actual service endpoint.  Looking at our validation classes we noticed a basic pattern, so we lifted a common interface for all of the validation classes:

        public interface IManifestValidator
        {
            string[] Validate(Manifest manifest);
        }

    We knew the "contract" of the service entry point all along.  Now that we've built all of the underlying validation pieces, the rest of the service is simply an exercise in connecting the dots:

            public ManifestValidationResponse ValidateManifest(Manifest manifest)
            {
                ManifestValidationResponse response = new ManifestValidationResponse();
                response.ControlNumber = manifest.ControlNumber;

                // Simply loop through all of the manifest validators
                // and collect all of the validation messages
                foreach (IManifestValidator manifestValidator in _validators)
                {
                    string[] messages = manifestValidator.Validate(manifest);
                    response.AddMessages(messages);
                }

                return response;
            }

    We didn't completely understand what all of the pieces of the service method were going to be at the beginning of the coding session, yet we still managed to create working code in short order.  By focusing on little "worker" classes that performed small tasks, the aggregate structure fell into place.
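
    Because the service entry point depends only on IManifestValidator, it can be exercised with hand-rolled stub validators instead of the real ones.  This is a sketch under assumptions: the ManifestValidationService class name, its constructor, the StubValidator class, and the Messages list on the response are hypothetical names invented for the example; only IManifestValidator and the shape of ValidateManifest() come from the code above.

```csharp
using System;
using System.Collections.Generic;

public class Manifest
{
    public string ControlNumber;
}

public interface IManifestValidator
{
    string[] Validate(Manifest manifest);
}

// Hypothetical response type: it simply collects the messages
public class ManifestValidationResponse
{
    public string ControlNumber;
    public readonly List<string> Messages = new List<string>();

    public void AddMessages(string[] messages)
    {
        Messages.AddRange(messages);
    }
}

// Hypothetical name for the class that holds the service entry point
public class ManifestValidationService
{
    private readonly IManifestValidator[] _validators;

    public ManifestValidationService(params IManifestValidator[] validators)
    {
        _validators = validators;
    }

    public ManifestValidationResponse ValidateManifest(Manifest manifest)
    {
        ManifestValidationResponse response = new ManifestValidationResponse();
        response.ControlNumber = manifest.ControlNumber;

        foreach (IManifestValidator validator in _validators)
        {
            response.AddMessages(validator.Validate(manifest));
        }

        return response;
    }
}

// Stub that returns canned messages, so no real validation logic runs
public class StubValidator : IManifestValidator
{
    private readonly string[] _messages;

    public StubValidator(params string[] messages)
    {
        _messages = messages;
    }

    public string[] Validate(Manifest manifest)
    {
        return _messages;
    }
}

public static class Program
{
    public static void Main()
    {
        ManifestValidationService service = new ManifestValidationService(
            new StubValidator("bad sender"),
            new StubValidator(),
            new StubValidator("no bundle", "wrong total"));

        Manifest manifest = new Manifest();
        manifest.ControlNumber = "ABC123";

        ManifestValidationResponse response = service.ValidateManifest(manifest);

        Console.WriteLine(response.ControlNumber);   // ABC123
        Console.WriteLine(response.Messages.Count);  // 3
    }
}
```

    The only thing left to verify at this level is the looping and collecting, since each real validator already has its own tests.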

    In the second story we had to accept a new message representing an invoice submission.  We had to take this message and run a series of transformations and validations against the invoice data to create a human readable report of validation problems to correct the invoice, or to accept the invoice.  Most of the functionality already existed, but the validation and translation code was bound up into coarse-grained workflow classes.  There wasn't any way to exercise the logic we wanted without causing side effects.  Most of the effort in that story was a series of refactorings to extract the smaller pieces of code into separate classes that could be called independent of the larger workflow.  We had a safety net of coarse grained integration and regression tests, but no unit tests.  Because of the risk, we had to make the changes in very small steps while constantly running the sluggish end to end tests to make sure we didn't break anything.

    So what's the point of the second user story, and how does this experience relate to "test small?"  In the first story the rules engine had originally been coded with Test Driven Development, and pursuing testability directly led to the rules engine component being composed of little loosely coupled, cohesive classes.  We didn't have to change the rules engine code at all, even though we were using it in a completely new way.  All we had to do was recombine some of the existing objects within a different coordinator class.  The code in the second story had not been written with TDD or testability in mind, and it showed.  The TDD code followed the Open/Closed Principle as a byproduct of working "test first," while the non-TDD code required a lot of change to the existing code to create new functionality.  The TDD code was healthier code than the non-TDD code.

    Purposely Designing with Test Driven Development

    I know this is controversial and I didn't believe this initially either, but TDD is designing while coding.  Coding one task at a time helps me to discover the structure of the larger whole.  My vision of the whole is informed by the creation of the small pieces.  Using some terminology from Responsibility Driven Design, consciously divide up responsibilities by class stereotypes:

    • Service providers - classes that perform specific operations
    • Information Holders - classes that have, or provide, data to other classes
    • Coordinators - classes that coordinate the activities of other classes
    • Controllers - classes that control the application flow

    The key for doing emergent design is to focus on creating the service provider and information holder classes first that perform small tasks.  After these classes are built and tested the construction of the coordinator and controller classes often turns into a simple game of "connect the dots."  Get the business logic and the workflow decisions complete first.  Push off tasks like configuration and even data access closer to the end.  Let the needs of the business logic and workflow dictate the interface and design of the ancillary services.

    To make this approach work you need to consciously maximize "reversibility" throughout the system.  Notice how we started the user story above by working on the validation of a Manifest object against a Bundle object.  The initial class method isn't coupled to any particular message handler, data access mechanism, or configuration subsystem.  It simply takes in two objects and returns an array of EventMessage objects.  The loose coupling is just a side effect of coding for testability, but it enables us to use that class in a multitude of ways.  It isn't bound to a particular workflow.  If the requirements of the overall workflow change (and they will), or we determine a better overall design, we can change the workflow controllers and coordinators by rearranging the service providers and information holders.  You've also maximized the potential for reuse at a later time.  In the first user story we were able to take the smaller classes of the rules engine and combine them differently for all new functionality.  In the second story we had to spend a lot of energy extracting code out of the large workflow classes to create the new functionality.  Guess which story went faster?

    When I was in school I worked with my father building houses in the summer.  On one memorable occasion I watched a pair of plumbers looking glumly at a series of pipes sticking out of the slab concrete form.  It turned out that the plumbers had put the pipes in the wrong place before the concrete was poured around the pipes.  They ended up renting a giant two man concrete saw and cutting up the foundation to move the pipes on a blistering summer day (they missed the second time around too!).  Placing pipes that will have concrete poured around them is an irreversible decision; you simply can't get that decision wrong because there isn't going to be a second chance.

    Software doesn't have to be that way.  So what can you do to maximize reversibility in your systems?  You guessed it, write small cohesive classes that can be moved around and used in different contexts.  Somebody will comment that evolutionary design techniques are inefficient and you should just be doing more research upfront.  Maybe, but designing fluid code that can change covers a lot of potential scenarios.  Building code from an upfront design, especially a design that happens from the top down, can easily result in code that only works for the cases known upfront -- and the requirements will change, maybe not tomorrow or the next quarter, but they will change.

    Case Study #2:  Permutations

    The heart of the first big system I designed was a complex supply chain routing engine* that determined the best way to route requests from the factory lines for parts.  The engine first queried and correlated (with a full outer join no less) data from three different sets of database tables, then applied a complex series of business rules against the data to select the best part channel (part, inventory source, and factory line destination).  Adding to the complexity was a set of business rules that varied by region.  All told, there was a mountainous stack of possible permutations of input to check.  Needless to say, the testers struggled mightily with the engine because they only tested manually and with end to end blackbox tests.  The engine worked great and we found very few defects with it in testing, but because it sucked down such a disproportionate amount of testing resources other parts of the system didn't receive nearly the same level of testing and serious defects made it into production.  We worked inefficiently.

    So with the wisdom conferred upon me by hindsight, let's take a look at a cleaner way to test this functionality that won't drown the team in endless permutations.  The first and foremost thing to do is to solve the routing in two steps: first correlate the data from the database tables into an object structure, then select the best routing from this object structure.  I described the full model of the part sourcing as a bookshelf.  On the top shelf are all the books you know you'll enjoy.  The second and third shelves are books that aren't as good.  The full model works by dividing the PartSourceChannel objects into several shelves of decreasing ability to fulfill the request for the part.

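        public class PurchaseOrder{}
        public class SupplyChainSource{}

        /// <summary>
        /// Represents a valid source for a part - the "Books"
        /// </summary>
        public class PartSourceChannel
        {
            public string PartNumber;
            public string ChannelNumber;
            public PurchaseOrder PurchaseOrder;
            public long Inventory;
            public SupplyChainSource SupplyChainSource;
            public double AllocationPercentage;

            public PartSourceChannel()
            {
            }

            // Convenience constructor used by the unit tests
            public PartSourceChannel(long inventory)
            {
                Inventory = inventory;
            }
        }

        /// <summary>
        /// Collection of related PartSourceChannel's - the "Shelf"
        /// </summary>
        public class PartSourcing
        {
            public PartSourceChannel[] Channels = new PartSourceChannel[0];

            public PartSourceChannel SelectChannelWithMostInventory()
            {
                // Not written yet; the unit tests come first
                throw new System.NotImplementedException();
            }

            public PartSourceChannel SelectChannelByAllocation()
            {
                // Not written yet; the unit tests come first
                throw new System.NotImplementedException();
            }

            public bool HasChannels()
            {
                return Channels != null && Channels.Length > 0;
            }
        }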

    And the bookshelf itself is:

        /// <summary>
        /// Represents all of the possible PartSourceChannel's
        /// for a given part, factory line, and region - the "Bookshelf"
        /// </summary>
        public class PartSourceRouting
        {
            // Start with empty shelves so Route() never sees a null shelf
            public PartSourcing Shelf1 = new PartSourcing();
            public PartSourcing Shelf2 = new PartSourcing();
            public PartSourcing Shelf3 = new PartSourcing();

            // Select the PartSourceChannel
            public PartSourceChannel Route()
            {
                if (Shelf1.HasChannels())
                {
                    return Shelf1.SelectChannelByAllocation();
                }

                if (Shelf2.HasChannels())
                {
                    return Shelf2.SelectChannelWithMostInventory();
                }

                if (Shelf3.HasChannels())
                {
                    return Shelf3.SelectChannelByAllocation();
                }

                return null;
            }
        }

    My strong recommendation is to drive design from the behavior of the business objects and then work forward to the service point and backwards to the data store.  Starting small, assume you already have a PartSourcing shelf.  The algorithm to select a part source from a shelf came in two basic flavors: choose by allocation to a supply chain partner, or choose the PartSourceChannel with the most available inventory.  The first set of unit tests could start with the PartSourcing shelf class and test the SelectChannelWithMostInventory() and SelectChannelByAllocation() methods.  It's an easy place to start because you can simply create an array of PartSourceChannel objects, pass them into a PartSourcing object, and verify that PartSourcing returns the correct PartSourceChannel.

     

        [TestFixture]
        public class PartSourcingTester
        {
            [Test]
            public void SelectChannelWithMostInventoryWithMoreThanOneChannel()
            {
                PartSourceChannel channel1 = new PartSourceChannel(100);
                PartSourceChannel channel2 = new PartSourceChannel(400);
                PartSourceChannel channel3 = new PartSourceChannel(200);
                PartSourceChannel channel4 = new PartSourceChannel(300);

                PartSourcing sourcing = new PartSourcing();
                sourcing.Channels = new PartSourceChannel[]
                    {
                        channel1, channel2, channel3, channel4
                    };

                PartSourceChannel channel = sourcing.SelectChannelWithMostInventory();

                // channel2 was built with 400, the largest inventory of the four
                Assert.AreSame(channel2, channel, "Channel 2 has the most inventory");
            }
        }
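
    A minimal implementation of SelectChannelWithMostInventory() might look like the following.  It is a sketch under assumptions: PartSourceChannel is trimmed down to the Inventory field plus the one-argument constructor that the test calls, ties go to the first channel found, and an empty shelf returns null.  The real engine may well have handled those cases differently.

```csharp
using System;

// Trimmed-down channel: only the field this selection rule needs
public class PartSourceChannel
{
    public long Inventory;

    public PartSourceChannel(long inventory)
    {
        Inventory = inventory;
    }
}

public class PartSourcing
{
    public PartSourceChannel[] Channels = new PartSourceChannel[0];

    public bool HasChannels()
    {
        return Channels != null && Channels.Length > 0;
    }

    // Linear scan; the first channel wins a tie (an assumption)
    public PartSourceChannel SelectChannelWithMostInventory()
    {
        if (!HasChannels())
        {
            return null;
        }

        PartSourceChannel best = Channels[0];
        foreach (PartSourceChannel channel in Channels)
        {
            if (channel.Inventory > best.Inventory)
            {
                best = channel;
            }
        }

        return best;
    }
}

public static class Program
{
    public static void Main()
    {
        PartSourcing sourcing = new PartSourcing();
        sourcing.Channels = new PartSourceChannel[]
            {
                new PartSourceChannel(100), new PartSourceChannel(400),
                new PartSourceChannel(200), new PartSourceChannel(300)
            };

        Console.WriteLine(sourcing.SelectChannelWithMostInventory().Inventory);  // 400
    }
}
```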

    That was simple enough.  Once the unit tests for PartSourcing are complete you could move on to the PartSourceRouting.Route() method.  Following the "Push, Don't Pull" law, PartSourceRouting is completely ignorant of how or where the PartSourceChannel data is stored; it just processes the data it's given.  Now that we trust the PartSourcing class, we can start building the PartSourcing members of a PartSourceRouting class and then call Route() to verify the expected outcome.

        [TestFixture]
        public class PartSourceRoutingTester
        {
            [Test]
            public void RouteWithOnlyPartSourceChannelsOnShelf2()
            {
                // Build a PartSourceRouting in memory
                PartSourceChannel channel1 = new PartSourceChannel(100);
                PartSourceChannel channel2 = new PartSourceChannel(400);
                PartSourceChannel channel3 = new PartSourceChannel(200);

                PartSourcing sourcing = new PartSourcing();
                sourcing.Channels = new PartSourceChannel[]
                    {
                        channel1, channel2, channel3
                    };

                PartSourceRouting routing = new PartSourceRouting();
                routing.Shelf2 = sourcing;

                PartSourceChannel channel = routing.Route();
                Assert.AreSame(channel2, channel, "Channel 2 has the most inventory");
            }
        }

    So now that the entire "bookshelf" routing is working we can move onto the next task -- actually building the bookshelf.  We're still not ready to touch the database though.  Create a new PartSourceRoutingBuilder class that takes in an existing array of PartSourceChannel objects and creates a new PartSourceRouting with all of the PartSourceChannel objects on the proper shelves.

        public class PartSourceRoutingBuilder
        {
            private readonly PartSourceChannel[] _channels;
            private readonly MaterialRequest _request;

            public PartSourceRoutingBuilder(PartSourceChannel[] channels, MaterialRequest request)
            {
                _channels = channels;
                _request = request;
            }

            public PartSourceRouting Build()
            {
                // Not written yet; shelving a single channel is tested first
                throw new System.NotImplementedException();
            }
        }

    Just to write simpler tests first, I would test how to shelve a single PartSourceChannel before trying out the larger Build() method.

            // I'm using a static method here so there's no issue
            // with prior state of a PartSourceRoutingBuilder
            // instance
            public static void ShelvePartSourceChannel(
                PartSourceChannel channel,
                PartSourceRouting routing,
                MaterialRequest request)
            {
                // analyze the channel against the request and
                // put the channel on the proper shelf
            }

    Once ShelvePartSourceChannel() is unit tested with all the scenarios we can think of, we can move on to the Build() method that will delegate to ShelvePartSourceChannel().  If we unit test ShelvePartSourceChannel() thoroughly, we don't need to push nearly as many permutations of PartSourceChannel arrays through the larger Build() method, which reduces the complexity of testing.
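
    To show how little is left for Build() once the shelving decision is tested on its own, here is a rough sketch of the delegation.  The shelving rule is made up for the example (enough inventory to cover the request goes on shelf 1, some inventory on shelf 2, none on shelf 3), and the Quantity field on MaterialRequest and the Add() helper on PartSourcing are hypothetical additions; the real rules were considerably more involved.

```csharp
using System;

public class MaterialRequest
{
    public long Quantity;  // hypothetical field, used only by the made-up rule below
}

public class PartSourceChannel
{
    public long Inventory;

    public PartSourceChannel(long inventory)
    {
        Inventory = inventory;
    }
}

public class PartSourcing
{
    public PartSourceChannel[] Channels = new PartSourceChannel[0];

    // Hypothetical helper: grow the array by one and append the channel
    public void Add(PartSourceChannel channel)
    {
        Array.Resize(ref Channels, Channels.Length + 1);
        Channels[Channels.Length - 1] = channel;
    }
}

public class PartSourceRouting
{
    public PartSourcing Shelf1 = new PartSourcing();
    public PartSourcing Shelf2 = new PartSourcing();
    public PartSourcing Shelf3 = new PartSourcing();
}

public class PartSourceRoutingBuilder
{
    private readonly PartSourceChannel[] _channels;
    private readonly MaterialRequest _request;

    public PartSourceRoutingBuilder(PartSourceChannel[] channels, MaterialRequest request)
    {
        _channels = channels;
        _request = request;
    }

    // Build() only loops and delegates; the shelving decision is tested separately
    public PartSourceRouting Build()
    {
        PartSourceRouting routing = new PartSourceRouting();
        foreach (PartSourceChannel channel in _channels)
        {
            ShelvePartSourceChannel(channel, routing, _request);
        }

        return routing;
    }

    // Made-up rule: enough inventory -> shelf 1, some -> shelf 2, none -> shelf 3
    public static void ShelvePartSourceChannel(
        PartSourceChannel channel,
        PartSourceRouting routing,
        MaterialRequest request)
    {
        if (channel.Inventory >= request.Quantity)
        {
            routing.Shelf1.Add(channel);
        }
        else if (channel.Inventory > 0)
        {
            routing.Shelf2.Add(channel);
        }
        else
        {
            routing.Shelf3.Add(channel);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        MaterialRequest request = new MaterialRequest();
        request.Quantity = 150;

        PartSourceChannel[] channels = new PartSourceChannel[]
            {
                new PartSourceChannel(400), new PartSourceChannel(100), new PartSourceChannel(0)
            };

        PartSourceRouting routing = new PartSourceRoutingBuilder(channels, request).Build();

        Console.WriteLine(routing.Shelf1.Channels.Length);  // 1
        Console.WriteLine(routing.Shelf2.Channels.Length);  // 1
        Console.WriteLine(routing.Shelf3.Channels.Length);  // 1
    }
}
```

    With the rule covered by tests against ShelvePartSourceChannel(), a test or two of Build() with a small array is enough to show that every channel gets shelved.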

    Back to my team's struggle with testing the original routing engine.  In no small part due to the experience with the routing engine I'm a big, big believer in white-box testing.  Because of the way we've built the pieces of the routing engine here we can easily write a series of FitNesse fixtures that allow the testers to quickly define a list of PartSourceChannel's and check the routing selection for a given MaterialRequest without the database or any kind of web service or user interface being involved.  That should cut down the difficulty of writing automated tests for the routing engine to validate the business rules of the routing engine in isolation before we try to test the engine from service invocation to database.  We can go into the black box testing with confidence that the engine itself works first.

    So the business rules are verified, but we still need a service entry point and the database access to correlate the data from the database into the PartSourceChannel objects.  Because we know it's easier to test by building from the ground up, we'll code and test the correlation from the database first.  Following the Dependency Inversion Principle, we'll put this functionality behind an interface.

        public interface IPartSourcingDataService
        {
            PartSourceChannel[] FindRoutingOptions(MaterialRequest request);
        }

    Finally, we can move on to the service class that will be called by the rest of the application to access the routing logic.

        public class PartSourceRoutingEngine
        {
            private readonly IPartSourcingDataService _service;

            public PartSourceRoutingEngine(IPartSourcingDataService service)
            {
                _service = service;
            }

            public PartSourceChannel Route(MaterialRequest request)
            {
                PartSourceChannel[] channels = _service.FindRoutingOptions(request);

                // The builder needs the request as well as the channels
                PartSourceRoutingBuilder builder = new PartSourceRoutingBuilder(channels, request);
                PartSourceRouting routing = builder.Build();

                return routing.Route();
            }
        }

    That class was easy -- and that's a big point of following the "Code from the Bottom Up" rule.  The flow of the PartSourceRoutingEngine controller class falls out because all of the little pieces are already defined and built.  PartSourceRoutingEngine simply has to coordinate the actions of the existing IPartSourcingDataService, PartSourceRoutingBuilder, and PartSourceRouting classes.

    Controlling Testing Permutations

    I started this case study as an exercise in controlling permutations.  In the real world project the possible pathways through the routing looked something like this (feel free to mock my math here):

    1. 12 different combinations of table joins across the three tables * 3 sets of region specific rules for a total of 36 permutations
    2. 5 valid shelves * 0, 1, 2, or 3 channels per shelf = 4 ^ 5 = 1024

    I'm not sure how the 36 relates to the 1024, but suffice it to say the final answer is >10000.  Covering every permutation simply isn't feasible, but by focusing on completely testing the smaller steps first we might get something like:

    1. The 36 permutations of the data correlation
    2. Test the selection process of each shelf individually with 0, 1, 2, or 3 channels -- 4 * 5 = 20
    3. Maybe test the selection of the five shelves -- has channels or not = 2 ^ 5 = 32 (but I think you could get by with many fewer)

    That math gives you 88 tests.  Say you add half again as many tests that run end to end to prove that the pieces work together.  That adds up to roughly 130 tests, far fewer than the 10,000+ combinations from end-to-end.
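
    The back-of-the-envelope numbers are easy to recompute.  The snippet below just redoes the arithmetic and prints the exact totals behind the approximate figures in the text.  Treating the 36 data permutations and the 1,024 shelf combinations as independent, so that they multiply, is an assumption; the text leaves that relationship open.  The TestCounts class and its method names are made up for the example.

```csharp
using System;

public static class TestCounts
{
    // Small integer power helper to avoid floating point
    public static int Power(int value, int exponent)
    {
        int result = 1;
        for (int i = 0; i < exponent; i++)
        {
            result *= value;
        }

        return result;
    }

    // 12 join combinations * 3 regional rule sets
    public static int DataPermutations()
    {
        return 12 * 3;
    }

    // 0, 1, 2, or 3 channels on each of 5 shelves
    public static int ShelfCombinations()
    {
        return Power(4, 5);
    }

    // Assumes the two dimensions are independent and therefore multiply
    public static int EndToEndCombinations()
    {
        return DataPermutations() * ShelfCombinations();
    }

    // 36 correlation tests + (4 * 5) per-shelf tests + 2^5 shelf selection tests
    public static int SmallTests()
    {
        return DataPermutations() + 4 * 5 + Power(2, 5);
    }

    // "Half again as many" end to end tests on top of the small ones
    public static int SmallPlusEndToEndTests()
    {
        return SmallTests() + SmallTests() / 2;
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(TestCounts.EndToEndCombinations());    // 36864
        Console.WriteLine(TestCounts.SmallTests());              // 88
        Console.WriteLine(TestCounts.SmallPlusEndToEndTests());  // 132
    }
}
```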

    Easing the Testing Burden

    Quoting my esteemed tester colleague Jim Matthews --

    Testing takes a lot longer if all you can do is write end-to-end tests.  It's also easier to find and remove problems from smaller tests.

    When I wrote the routing engine I wasn't using Test Driven Development, much less Acceptance Test Driven Development in conjunction with the testers and analysts.  The testers had no other way to test than to load up the database with data, run the routing engine, and see if the results matched up with expectations.  Three years of development with Agile processes has given me a much greater appreciation for a holistic approach to software development.  Much of the coding process is simply removing defects -- compile time checks, code reviews, unit tests, acceptance tests, etc.  The faster you can purify the code by removing defects, the better the real productivity of the team.  My productivity on the routing engine in terms of coding alone was actually pretty good, but the testers' productivity was awful.  In terms of Lean Programming I made a point optimization that didn't help the whole process.  The lesson I've taken away from the routing engine is to write code that is easier for the testers to test -- and make sure that the testers are aware of the potential for smaller tests, but that's a post for a different day...

    Looking back, we could have made the testing of the routing engine much smoother if we had focused on testing the business rules in isolation from the database and the interface.  The testers got bogged down in the database setup and flat file creation just to set up a business rules scenario.  If they could have just started by saying "I have these PartSourceChannel's and this MaterialRequest, I expect this PartSourceChannel to be selected" we could have nailed down the business rules much faster with tests that were human readable.  Once we were confident in the implementation of the business rules we could have moved on to integration with the front end and the database.

    To write acceptance tests in a white box fashion, the system has to have seams, places where we can exercise the business rules in isolation without running the full application stack.  The good news is that you don't have to spend countless hours in front of a whiteboard devising seams for your testers.  Simply proceed in a "test small, before testing big" manner of constructing code and those seams will already be there in your code.

     

    Wrapping Up

    We've always known that software development works best when we can divide larger problems into smaller, more easily manageable pieces.  It's just the mechanism for determining the smaller pieces that's always difficult.  To get the full benefits of TDD it really must be used in combination with other practices like Continuous Integration and true Iterative Development.

    You might ask, why can't I achieve all of this with upfront design and just plain old unit testing? Maybe you can, at least some of the time, if your design skills are good and the requirements are stable.  I'd argue that over time the odds are in favor of adaptive techniques that give you more opportunities to make corrections along the way.  As far as just plain unit testing goes, TDD gets unit testing into play much earlier and more consistently.  If you build code without thinking about how you will test the code, you'll often find yourself with code that is hard to test.  Writing unit tests first forces you to write testable code and goes a long way toward better unit test coverage.

    Hopefully this post addressed the usage of TDD to solve bigger problems in pieces. I did cut some things out of this post for the sake of length, so there will be some follow up posts this week on TDD and Debugging, TDD and Flow, and using Mock objects to create smaller tests and avoid context switching.

  • Grab bag of follow up's for data access, persistence, query engines, and o/r mapping

    Just catching up on questions and comments from the Why I do not use Stored Procedures post yesterday.


    Sachin Rao asked my take about using an O/R mapper for reporting.  I'd say "it depends," but in a purely reporting system I would probably opt for the simple "get dataset, slap into datagrid, rinse, and repeat" strategy.  I don't like sproc's for reports that have a multitude of optional search criteria, but more on that below.


    Nick Parker (who owes me a post on StructureMap) asked about the OO query engine that we use and how it relates to NHibernate.  There's nothing special going on with this.  The class that we use as the innermost gateway to NHibernate has a method, IList Query(IObjectQuery query), that accepts an IObjectQuery object.  The implementations of IObjectQuery are just syntactic sugar over NHibernate's ICriteria objects.  Here's an example:

        /// <summary>
        /// Finds all the instances of a given type with
        /// a property value
        /// </summary>
        public class FindByPropertyQuery : IObjectQuery
        {
            private readonly Type _memberType;
            private readonly string _propertyName;
            private readonly object _propertyValue;

            public FindByPropertyQuery(
                Type memberType,
                string propertyName,
                object propertyValue)
            {
                _memberType = memberType;
                _propertyName = propertyName;
                _propertyValue = propertyValue;
            }

            public IList FindResults(ISession session)
            {
                ICriteria criteria = session.CreateCriteria(_memberType)
                    .Add(Expression.Eq(_propertyName, _propertyValue));
                return criteria.List();
            }
        }

    Come to think of it, this would make a good example of a fluent interface.  You can use these dinky little objects to make your code easier to read by making the logic of a query apparent without getting bogged down by the NHibernate machinery.  Sql where clauses are very often business logic, and hence need to be tested.  Many times it's easier to test that the middle tier creates and passes the correct queries by checking the state of these dinky little query objects than it would be to test the middle tier all the way through the backend.
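    To make the "check the state of the query object" idea concrete, here is a minimal sketch.  Everything in it is an assumption for illustration: IQueryGateway, RecordingGateway, CustomerFinder, and Customer are hypothetical names, the query class is a stripped-down stand-in for the one above with NHibernate removed, and the Equals() override is my addition so that two queries can be compared by value.

    ```csharp
    using System;
    using System.Collections;

    // Stand-in for the real interface, which takes an NHibernate ISession.
    public interface IObjectQuery { }

    // Hypothetical gateway interface matching the Query() method described above.
    public interface IQueryGateway
    {
        IList Query(IObjectQuery query);
    }

    // Same shape as the FindByPropertyQuery above, minus NHibernate,
    // plus value equality so tests can compare queries directly.
    public class FindByPropertyQuery : IObjectQuery
    {
        private readonly Type _memberType;
        private readonly string _propertyName;
        private readonly object _propertyValue;

        public FindByPropertyQuery(Type memberType, string propertyName, object propertyValue)
        {
            _memberType = memberType;
            _propertyName = propertyName;
            _propertyValue = propertyValue;
        }

        public override bool Equals(object obj)
        {
            FindByPropertyQuery other = obj as FindByPropertyQuery;
            if (other == null) return false;

            return _memberType == other._memberType
                && _propertyName == other._propertyName
                && object.Equals(_propertyValue, other._propertyValue);
        }

        public override int GetHashCode()
        {
            return _propertyName == null ? 0 : _propertyName.GetHashCode();
        }
    }

    // Test double: remembers the last query instead of hitting a database.
    public class RecordingGateway : IQueryGateway
    {
        public IObjectQuery LastQuery;

        public IList Query(IObjectQuery query)
        {
            LastQuery = query;
            return new ArrayList();
        }
    }

    public class Customer { }

    // Hypothetical middle tier class under test.
    public class CustomerFinder
    {
        private readonly IQueryGateway _gateway;

        public CustomerFinder(IQueryGateway gateway)
        {
            _gateway = gateway;
        }

        public IList FindByState(string stateCode)
        {
            return _gateway.Query(
                new FindByPropertyQuery(typeof(Customer), "State", stateCode));
        }
    }
    ```

    In an NUnit test you would hand a RecordingGateway to CustomerFinder, call FindByState("TX"), and then Assert.AreEqual(new FindByPropertyQuery(typeof(Customer), "State", "TX"), gateway.LastQuery).  No database, no NHibernate session, and the test runs entirely in memory.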


    When I mentioned the OO query engine I really just meant to put an Object Oriented structure around the creation of the sql for reports.  There are a handful of certainties in the career of any developer.  You will at some time write an Invoice class, an Order class, a dozen Address classes, and write a sql generator.  All of the "data sources" for our reports are implementations of this interface (partial definition):

        [PluginFamily]
        public interface IReaderSource
        {
            DataSet ExecuteDataSet();

            [IndexerName("Parameter")]
            object this[string parameterName]{get; set;}
        }

    The actual implementation could be a stored procedure, parameterized sql, or something else altogether.  As far as the reporting module is concerned everything is just an IReaderSource with a number of parameters.  Just take the options from the submitted query form and call myDataSource["State"] = queryView.StateCode.  For reports with a large number of optional query options we use an implementation of IReaderSource that consists of a select clause and an array of objects that model an optional piece of the where clause.

        [PluginFamily]
        public interface IQueryFilter : IParameter
        {
            bool IsActive();
            string GetWhereClause();
            void AttachParameters(IDbCommand command);
        }

    When the report is executed with any number of query options, the IReaderSource scans through its collection of possible IQueryFilter objects for the ones whose IsActive() returns true, and calls GetWhereClause() and AttachParameters() on each to build out the sql where clause and IDbCommand object.
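    The scanning step can be sketched roughly like this.  It is a simplified illustration, not our actual code: ISimpleFilter is a cut-down, hypothetical version of IQueryFilter that leaves out AttachParameters(), and EqualsFilter and WhereClauseBuilder are made-up names.

    ```csharp
    using System;
    using System.Text;

    // Hypothetical, cut-down version of IQueryFilter (no IDbCommand handling).
    public interface ISimpleFilter
    {
        bool IsActive();
        string GetWhereClause();
    }

    // One optional "column = @parameter" piece of a where clause.
    public class EqualsFilter : ISimpleFilter
    {
        private readonly string _column;
        private readonly string _parameterName;
        private object _value;

        public EqualsFilter(string column, string parameterName)
        {
            _column = column;
            _parameterName = parameterName;
        }

        public object Value
        {
            get { return _value; }
            set { _value = value; }
        }

        // A filter only takes part in the query once a value has been set.
        public bool IsActive()
        {
            return _value != null;
        }

        public string GetWhereClause()
        {
            return _column + " = @" + _parameterName;
        }
    }

    public class WhereClauseBuilder
    {
        // Appends the clause of every active filter, joined with AND.
        public static string Build(string selectAndFromClause, ISimpleFilter[] filters)
        {
            StringBuilder sql = new StringBuilder(selectAndFromClause);
            bool first = true;

            foreach (ISimpleFilter filter in filters)
            {
                if (!filter.IsActive()) continue;

                sql.Append(first ? " WHERE " : " AND ");
                sql.Append(filter.GetWhereClause());
                first = false;
            }

            return sql.ToString();
        }
    }
    ```

    Filters that never received a value simply drop out of the sql, which is the whole point for reports with a pile of optional search criteria.  The real thing would also attach a parameter to the IDbCommand for each active filter.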

    Unsurprisingly, we configure and construct these query graphs with StructureMap.  In this case we put the configuration for the queries in embedded resource files like this:

    <StructureMap.DataAccess.IReaderSource Type="TemplatedQuery" Key="GetMatters">
      <Property Name="selectAndFromClause"><![CDATA[
                        SELECT
                            {MatterTable}.matter_id,
                            {MatterTable}.matter_name
                        FROM
                            {MatterTable} INNER JOIN CrossRef ON {MatterTable}.vendor_id = CrossRef.ForeignRef
                    ]]></Property>
      <Property Name="filters">
        <Property Type="Parameterized">
          <Property Name="parameterName" Value="SenderId" />
          <Property Name="sqlSnippet" Value="CrossRef.SenderId = {Value}" />
        </Property>
        <Property Type="Parameterized">
          <Property Name="parameterName" Value="ReceiverId" />
          <Property Name="sqlSnippet" Value="CrossRef.ReceiverId = {Value}" />
        </Property>
        <Property Type="Templated">
          <Property Name="parameterName" Value="MatterId" />
          <Property Name="sqlSnippet" Value="matter_id like '%{Value}%'" />
        </Property>
        <Property Type="Templated">
          <Property Name="parameterName" Value="MatterName" />
          <Property Name="sqlSnippet" Value="matter_name like '%{Value}%'" />
        </Property>
      </Property>
    </StructureMap.DataAccess.IReaderSource>


    Jay R. Wren asked about the security model.  Just for fun, here's a strategy I've used before with some success (assuming you don't get stupid with overgeneralization).  If you use something like my IReaderSource interface that represents a "query" for a report, you could happily wrap the query objects in a decorator pattern class that governs security rules.  The security decorator might take a look at the IPrincipal object on the thread and do security assertions based on the allowed roles for the named query, or more powerfully, might transparently do something like this:

        public class SecurityDecorator : IReaderSource
        {
            private readonly IReaderSource _innerSource;
            private readonly string[] _roles;

            public SecurityDecorator(IReaderSource innerSource, string[] roles)
            {
                _innerSource = innerSource;
                _roles = roles;
            }

            public DataSet ExecuteDataSet()
            {
                IPrincipal principal = Thread.CurrentPrincipal;
                // if the authenticated principal doesn't belong
                // to one of the allowable roles, throw an exception

                // Set parameters on the inner query object
                // to filter based on the user's roles
                if (!principal.IsInRole("Internal"))
                {
                    _innerSource["ViewInternalIssues"] = false;
                }

                return _innerSource.ExecuteDataSet();
            }
        }

    As an aside, one of the most painful things I've ever witnessed was a system that pulled every single record into memory, then applied security filtering row by row.  That strategy sure bogs down when the table in question runs past 10,000 rows.


    Our own Karl Seguin said that he prefers to do O/R mapping manually.  Fair enough, especially for smaller domain models.  For us, NHibernate cuts down a lot of the work in modeling one-to-many and many-to-one relationships and the transparent lazy loading is awfully nice (and nullable types for us .Net 1.1 slowpokes).  If nothing else, it saves the tedious type coercion that comes with manual mapping.  Besides, the overhead of something like NHibernate goes down considerably when you move past the learning curve.
  • Why I do not use Stored Procedures

    I promised myself that I wouldn't ever make another post about stored procedures, but Eric's post on sproc's hit a few of my hot buttons on the subject.  Four years ago the pre-Agile, VB6/ASP coding me would have fervently agreed with Eric's pro-sproc stance, and I wrote PL/SQL by the bushelful, but today my answer to the sproc question is a firm "no thank you" or at least a "guilty until proven innocent."  Besides, I'm an Oracle guy in a SQL Server shop and I despise T-SQL.

    First, some common ground.  I think we can all agree that ad hoc SQL in code à la ASP circa 1998 is an abomination.  I don't have that much trouble with using sproc's for CRUD, but then again I think a domain model approach backed by O/R mapping is more efficient in terms of the all-important developer time, and the domain model approach leads to better code.  The newer parts of our applications can accommodate database changes by simply changing the NHibernate mappings.  No sproc's necessary.

    So to Eric's post, first this:

    So when I see people in the TDD crowd slamming stored procedures and views it greatly confuses me.

    That's an easy answer, Eric: stored procedures make TDD a slower, less productive process.  Business logic in stored procedures is more work to test than the corresponding logic in a domain model class.  Referential integrity will often force you to set up a lot of other data just to be able to insert the data you need for a test (unless you're working in a legacy database like ours without any foreign keys;)).  Stored procedures are inherently procedural in nature, so it's harder to create isolated tests for them and they are prone to code duplication (a duplicated "where" clause is duplicated code and duplication is bad).  Another consideration, and this matters a great deal in a sizable application, is that any automated test that hits the database is slower than a test that runs inside of the AppDomain.  Slow tests lead to longer feedback cycles.  Trust me on this one, a team will be much slower with a 20 minute CI build versus a 5 minute build time.

    I'll make an admission.  I bet I haven't written a full 100 lines of direct ADO.Net manipulation code in the last six months, yet we've rolled out a multitude of new functionality in that time frame.  I also write very little non-trivial SQL these days.  Between a little OO query engine we've developed and NHibernate, there's simply no need to spend time in the muck of data access.  Like Jeff Atwood said, stored procedures are the assembly language of the database.  I'll use sproc's when I have to have a performance gain, but until then they're nothing but premature optimization.

    I think the gap in thinking about sproc's is about how you see the role of the database in an application -- do you talk about the database in passing as a persistence mechanism, or is the .Net code merely a way to get data back and forth between the UI and the database?  The applications I build are primarily about business rules, not reporting.  The database to me is simply a persistence mechanism for our domain objects and messages en route.  Maintainability, which is almost synonymous with testability, is king.  A large, comprehensive body of automated tests at both the acceptance and unit test level is, in my opinion, the best way to accommodate changes to the code.

    Again from Eric's post, and this is where I got hot under the collar (it's not directed at you Eric) -

    Your DBA team can work completely independently of your programming team and do pretty much whatever they want and your application doesn't care!

    Expletive.  NO, NO, NO!!!!!  The DBA should most certainly NOT work independently of the programming team.  Stored procedures are code, and potentially destructive code at that.  If you change a stored procedure you *MUST* integrate and test the stored procedure against the application before it gets anywhere near production.  I've been burned by this one too many times.  The stored procedure code absolutely has to be built and versioned within your Continuous Integration build.  I'm sick of bugs caused by a sproc getting out of synch with the code.  The idea of a DBA, or a maintenance developer, putting a different version of a sproc directly into the database is a bad, bad, dangerous practice. 

    The extra configuration management burden is one of the reasons most Agile teams stay away from stored procedures.  If I'm relying on NHibernate mappings or parameterized SQL in the code, that SQL is going to be built and versioned with the C# code automatically.  My risk of having mismatched sproc and C# code is greatly reduced.  Our C# assemblies are signed with the CruiseControl.Net build number and it's simple to spot mismatches.  The stored procedures are, err, well, uh, I don't know if that's the same version that the code was tested against to be quite honest :(

    The absolute worst thing you can do, and it's horrifyingly common in the Microsoft development world, is to split related functionality between sproc's and middle tier code.  Grrrrrrrr.  You just make the code brittle and you increase the intellectual overhead of understanding a system.

     

     

  • The YAGNI Development Assistant

  • Austin Agile Lunch Group

    For anybody in the Austin area or passing through, the Austin Agile group has a biweekly lunch at the Central Market.  The topic is anything you want to talk about.  Lately the discussions have been about FitNesse and Selenium testing, tasking and agile estimation, and designing for testability.  The range of experience with Agile practices runs the full gamut, so everybody is welcome.

     

    The next meeting is tomorrow:

    Food and agile development talk at the Central Market Cafe at Lamar
    and 38th (http://tinyurl.com/brxsz). Upstairs in the gallery.

    Thurs, May 18, 11:30 AM.

  • Automated Web Testing with Selenium Driven by .Net

    The Agile development community has struggled for years with an array of solutions for automated testing of web applications.  NUnitASP is a good way to unit test server side ASP.Net code, especially now that it doesn't require XHTML compliant pages, but it can't handle client side scripting and AJAX is exploding in popularity.  Several tools have used COM (must die) to drive Internet Explorer (IE) with varying degrees of success.  My personal experience is that the IE COM API is too byzantine and flat out flaky.  Besides the flakiness, Firefox and other browsers are gaining in popularity so IE-only testing might not cut it anymore.

    Enter the Selenium project.  The developers of Selenium had the brilliant, but in retrospect painfully obvious, idea to use Javascript inside a browser to drive the web testing.  Presto, automated testing for web applications that can test client side Javascript and multiple browser engines.  In its original incarnation Selenium ran FIT style test tables inside a web browser.  Recently, Selenium Remote Control has been released to allow the core Javascript engine to be driven through APIs for .Net, Java, Ruby, or Python.  The FIT style table runner is perfectly functional, but for us it's very convenient to use the Selenium RC .Net wrapper.  So far I'm pleasantly surprised by how easy it is to use the .Net wrapper.

    Sample Test

    The API libraries communicate with the Selenium Server, a Java executable that can start and stop any supported browser and send testing commands to the browser.  The .Net callable wrapper works by sending HTTP messages to the Selenium Server that controls a browser.  The first step is to start up the Selenium Server from a command prompt.

    java -jar server\selenium-server.jar -interactive

    The next step was to create a simple HTML page with a textbox that changes values when a button is clicked:

    <html>
    <head>
    <script language=javascript src=prototype-1.4.0.js></script>
    <title>Selenium Target Page</title>
    <script id=clientEventHandlersJS language=javascript>
    <!--
     
    function button1_onclick() {
        $('text1').value = "Goodbye"; // Using the Prototype library
    }
     
    //-->
    </script>
    </head>
    <body>
     
    <form name="form1">
        <input id="button1" type=button onclick="return button1_onclick()"
     value="click me"/>
        <input id="text1" type=text value="Hello"/>
        <select testid="select1" >
            <option value="1">North</option>
            <option value="2" selected=true>West</option>
            <option value="3">South</option>
            <option value="4">East</option>
        </select>
    </form>
     
    </body>
    </html>

    Next I wrote a little NUnit test fixture class to run tests against the web page. The key object is the DefaultSelenium class that is created in the SetUp() method:

    /// <param name="serverHost">the host name on which the 
    /// Selenium Server resides</param>
    /// <param name="serverPort">the port on which the 
    /// Selenium Server is listening</param>
    /// <param name="browserString">the command string used 
    /// to launch the browser, e.g. "*firefox", "*iexplore"
    /// or "c:\\program files\\internet explorer\\iexplore.exe"</param>
    /// <param name="browserURL">the starting URL including 
    /// just a domain name.  We'll start the browser pointing at
    /// the Selenium resources on this URL,
    /// e.g. "http://www.google.com" would send the browser to 
    /// "http://www.google.com/selenium-server/SeleneseRunner.html"</param>
    public DefaultSelenium(String serverHost, int serverPort, 
    String browserString, String browserURL)
    {
      this.commandProcessor = new HttpCommandProcessor(serverHost, 
    serverPort, browserString, browserURL);
    }


    using System;
    using NUnit.Framework;
    using Selenium;
     
    namespace SeleniumTarget
    {
        [TestFixture]
        public class WebPageTester
        {
            DefaultSelenium selenium;
     
            [SetUp]
            public void SetUp()
            {
                // 4444 is the default port for the Selenium Server
                selenium = new DefaultSelenium("localhost", 4444, "*iexplore", "http://localhost");
                selenium.Start();
            }
     
            [TearDown]
            public void TearDown()
            {
                // Make sure the Selenium environment is cleaned up after each test
                selenium.Stop();
            }
     
     
            [Test]
            public void CheckTheTitle()
            {
                selenium.Open("http://localhost/SeleniumTarget/TestPage1.htm");
     
                Assert.AreEqual("Selenium Target Page",
                                selenium.GetTitle(),
                                "Check the title of the browser");
            }
     
     
            [Test]
            public void ClickButton1ChangesText1FromHelloToGoodbye()
            {
                selenium.Open("http://localhost/SeleniumTarget/TestPage1.htm");
                Assert.AreEqual("Hello",
                                selenium.GetValue("text1"),
                                "Initial Value");
     
                selenium.Click("button1");
     
                Assert.AreEqual("Goodbye",
                                selenium.GetValue("text1"),
                                "Value after clicking button1");
            }
     
            /// <summary>
            /// Check that the options of a <select></select> element are
            /// as expected.  Finds the <select> element by using an xpath expression
            /// </summary>
            [Test]
            public void Select1Values()
            {
                selenium.Open("http://localhost/SeleniumTarget/TestPage1.htm");
                string locator = "xpath=//select[@testid='select1']";
     
                string[] options = selenium.GetSelectOptions(locator);
     
                Assert.AreEqual(new string[] {"North", "West", "South", "East"},
                                options,
                                "Values in the select1 dropdown");
            }
        }
    }

    It's a trivial example, but it's a start. 

    What I don't know -

    • What's the best way to handle the Selenium Server process?  How do you guarantee it's up when you start your tests?  I'm thinking some kind of Windows service wrapper.
    • How do you integrate Selenium with CruiseControl.Net to get it into your Continuous Integration strategy?  I can't find much on the web about this yet.  You can run Selenium from NUnit for developer testing, but that isn't a very desirable answer from a tester's perspective.  For a couple of reasons, our thinking is to wrap the Selenium manipulation inside a FitNesse DoFixture.  We already know how to integrate FitNesse tests within a CC.Net build and it's convenient to run all the tests together.  We're also hoping that the DoFixture tests will be easier to understand and that we can hide some of the web page details behind the fixture.
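
    On the first question, one cheap partial measure is to at least detect a missing server before the tests start.  The sketch below is an assumption on my part, not part of Selenium: ServerProbe is a hypothetical helper that only checks whether something accepts TCP connections on a port.  It doesn't start or manage the server process, and it can't tell whether the listener really is the Selenium Server.

    ```csharp
    using System;
    using System.Net.Sockets;

    // Hypothetical helper, not part of the Selenium API.
    public class ServerProbe
    {
        // Returns true if something accepts TCP connections on host:port.
        public static bool IsListening(string host, int port)
        {
            try
            {
                using (TcpClient client = new TcpClient())
                {
                    client.Connect(host, port);
                    return true;
                }
            }
            catch (SocketException)
            {
                // Connection refused, host unreachable, and so on.
                return false;
            }
        }
    }
    ```

    Calling ServerProbe.IsListening("localhost", 4444) from a fixture-level setup method and failing with a clear message beats watching every test in the fixture die with a connection error.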

    Other Tools

    We're looking primarily at Selenium, but we're also considering WATIR (a Ruby-based tool) and Sahi (I don't know much about it, but it looks strong).  One of our colleagues is experimenting with a promising Ruby-based DSL, built on WATIR, for testing another web product.  I've also been playing with the Ruby driver for Selenium with an eye towards creating a testing DSL for our application.

  • If Martin says it's time for Ruby...

    ...it's time to give it a serious look.

    Martin Fowler wrote a piece on Ruby today called EvaluatingRuby.  I've been researching Ruby and Ruby on Rails pretty heavily the last couple of months and I'm impressed.  Now that I've gotten a new StructureMap release out of the way, I'm diving into a new Ruby on Rails/AJAX project on the side just to start learning. 

    It's more than just Rails too.  Take a look at Jay Fields' post on creating a Domain Specific Language in Ruby.  Fellow Austinite Bret Pettichord is doing some cool stuff for web testing with WATIR.

    Ruby in Visual Studio.Net

     

  • StructureMap goes 1.0! New release of the best Dependency Injection tool for .Net*

    * as voted on by me, the impartial developer of StructureMap who has never used any of the other equivalent tools.  If nothing else, StructureMap predates both Spring.Net and Castle.

    StructureMap is a Dependency Injection framework that can be used to improve the architectural qualities of an object oriented system by reducing the mechanical costs of good design techniques. StructureMap can enable looser coupling between classes and their dependencies, improve the testability of a class structure, and provide generic flexibility mechanisms. Used judiciously, StructureMap can greatly enhance the opportunities for code reuse by minimizing direct coupling between classes and configuration mechanisms.

    I made a new release of StructureMap this evening that incorporates 15 months of enhancements and refactorings that I've made to support our project work.  We've used StructureMap successfully as the configuration subsystem of a rules engine and to smooth out the deployment reliability and testability of a large legacy system.  The new functionality includes:

  • New terser Attribute Normalized Xml configuration style
  • The ability to include secondary configuration files
  • Set a default profile at the top level of configuration
  • StructureMapExplorer, a WinForms tool to explore and debug StructureMap configurations
  • New ancillary NAnt tasks, including functionality to create a file "manifest" to verify the contents of an application deployment
  • New configuration storage choices
  • New Instance lifecycle scoping options (PerRequest, Singleton, ThreadLocal, etc.)
  • The "TemplatedMementoSource" option for large instance graphs
  • Streamlined codebase with less coupling and greater test coverage.  Greatly improved diagnostics.
  • New methods on ObjectFactory
    • GetAllInstances() - returns all instances of a certain type
    • WhatDoIHave() - for runtime troubleshooting
    • GetInstance(Type, InstanceMemento)

    See the release notes at:

    http://StructureMap.sourceforge.net or download at:

    http://so