Thursday, October 4, 2012

Yield-Return Makes Batches Happy


      For those of you who have not yet met .net's yield-return sorcery, allow me to introduce you to the best friend foreach will ever have.

     Conventional foreach blocks just run through IEnumerables; no feedback- no user warm fuzzies for the progress bar. Even a parallel.foreach is like the cable guy- you call, and wait.

    When processing big data- especially in conjunction with DB persistence- it is critical to provide incremental feedback at runtime. 
    Thanks to a post by webdev_hb on TheCodeProject, I fell in love with this unappreciated tool- and used it to peek inside the ubiquitous  foreach block.

     Here's how.

The ForEach Block:

     Say you have a .net method that populates a DataTable. You want to commit these records to your database. Maybe you use a stored procedure; maybe you lock the table for a BCP. Either way- you're eating server resources, playing try-catch roulette, and your client is stuck in black-box purgatory until you can report back to the UI thread.
    In processing and committing large data sets, you can easily overflow local memory, nuke transaction logs, and cause other avoidable unpleasantness.

    So, you refactor your code to bite off a manageable number of records at a time; let's say 100k- that's a good start. You commit each batch with a method that looks something like this:

public void ToSQL(List<DataTable> dataTables)
{
  foreach (DataTable dtMyDataTable in dataTables)
  {
     // Call your method for committing a DataTable to the Database
     CommitDT(dtMyDataTable, _ConnectionString, _TableOrSpName);
  }
}

 

The Pathetic UI: 

      Great: now you've fixed all of the runtime issues by bulk-copying friendly blocks of data- you have a happy DB, and a happy app server. 
      Not Great: your UI is playing hold music, and your user has gone to lunch.

 

The Fix:

      Now, let's get feedback from the foreach block so the user has something to root for.

public IEnumerable<Double> ToSQL(List<DataTable> dataTables)
{
  foreach (DataTable dtMyDataTable in dataTables)
  {
      double dProgress = 
           (double)dtMyDataTable.Rows.Count / (double)_TotalRecordCount;

     // Call your method for committing a DataTable to the Database

     CommitDT(dtMyDataTable, _ConnectionString, _TableOrSpName);

     // Return a progress update after each iteration with yield return
     yield return dProgress;
  }
  
  // We're done processing, so let's tell the caller explicitly with yield break
  yield break;


 

The Result:

       Where you once had an arbitrarily long wait and a static UI, you now have feedback at runtime. You can use the Double return type used here to update a progress bar, or you can get creative and return the entire DataTable for more detailed feedback. While you're at it, mark a timestamp, or recover gracefully from an exception, or just go bother your coworkers with your new passion for yield-return.



                          




                   

Wednesday, April 20, 2011

Software development will soon be fungible

Ladies, Gentlemen, and fellow developers:


     In the software development industry, client expectations have been- if I may-
     'exponentially optimistic'.

     Fortunately, the power of software development tools has been ahead of market demands for half a decade, and still has plenty of momentum. Any application of moderate complexity in the Data, Logic, and Presentation tiers can now be implemented by any developer on the project team regardless of their core competency.
     The development learning curve has reached an all-time low, and software implementation from spec will continue to become even more accessible as increasingly powerful products enter the market.
     Don't panic- just keep up with the tools, get over your issues with IDEs, burn your FORTRAN books, and learn Ruby.


Oh, and hire that smart kid. You know- the awkward one that's friends with your nephew. You have a girlfriend, sure, but he's already forgotten more about high level programming than you're ever going to learn.