Race conditions in JavaScript

Given all the hype about JavaScript recently, it can come as a shock to realise that modern browsers still execute JavaScript on a single thread. For developers used to multi-threaded runtimes such as .NET and the JVM, this can lead to unfamiliar results.

For this example we will consider the HTML 5 Websockets API.

Note first of all that there is no connect() method on a WebSocket; the constructor itself initiates the connection.

Let us consider the following code. By adding a big loop we’ve essentially added a time delay between creating the WebSocket and wiring up the onopen event handler. Now the million-dollar question: does the event handler always get triggered when a connection is made, or does the delay mean that the onopen handler is sometimes not wired up by the time the WebSocket has connected?

var wsUri, testWebSocket;
wsUri = "ws://localhost:8080/mywebsocket";
testWebSocket = function(){
  var connection, i;
  connection = new WebSocket(wsUri);
  for (i = 0; i <= 1000000000; ++i) {
    /* simulate a delay */
  }
  console.log("wiring up event handler");
  connection.onopen = function(){
    console.log("opened");
  };
};
testWebSocket();

In the JVM/.NET world we would expect this to cause a race condition: if the WebSocket connected before our event handler had been wired up, the handler would never get triggered. In fact, though, this works fine in JavaScript and there is no race condition. Because the execution model is single-threaded, the open event cannot be dispatched until the currently executing script has finished, by which time the handler is always in place. Very much like a Dispatcher in WPF/Silverlight.

Of course you do still have to be very careful when using timers… this for example does indeed cause a race condition:

var wsUri, testWebSocket;
wsUri = "ws://localhost:8080/mywebsocket";
testWebSocket = function(){
  var connection, wireUp;
  connection = new WebSocket(wsUri);
  wireUp = function(){
    console.log("wiring up event handler");
    connection.onopen = function(){
      console.log("opened");
    };
  };
  window.setTimeout(wireUp, 5000);
};
testWebSocket();

Remember folks: Concurrency and asynchrony are not the same thing!

Personally I think this API design is a bit short-sighted… it’s going to make it very difficult to add real concurrency to JavaScript runtimes and browsers in the future without revamping the APIs and breaking existing apps.

See also:
http://ejohn.org/blog/how-javascript-timers-work/

Hosting Websharper/ASP.NET apps on Linux with Mono + nginx

F# + cheaper hosting = winning

One of the arguments often levelled against .NET web frameworks is that Windows hosting options are expensive compared to their Linux counterparts. Pricing aside, many people, myself included, also prefer the simplicity and flexibility of being able to quickly SSH into a box for administration rather than faffing around with remote desktop.

In the past Mono had something of a reputation for poor performance due to its primitive garbage collector. Mono 3.0, however, ships with a new garbage collector called SGen which is much better. The Xamarin guys are doing a great job and it now seems ready for primetime.

Having recently been experimenting with Websharper, and being a big proponent of F#, I was keen to see if I could have the best of both worlds. Would it be possible, I wondered, to use Mono to host a Websharper app on Linux?

My initial attempts at installing Mono and F# proved somewhat fruitless because the mainstream Debian packages are hopelessly out of date. Fortunately some bright spark has uploaded more recent ones onto Launchpad, which makes the process fairly straightforward.

Once that was done, the rest was easy enough. I just copied a compiled Websharper site across from my Windows machine, fired up fastcgi-mono-server4, configured nginx to proxy the requests and hey presto, the page popped up! The same process should also work just fine for ASP.NET sites.

One small caveat: I did run into a Websharper bug that was causing links to render incorrectly, but that wasn’t too difficult to resolve.

Being a fan of automation (i.e. incredibly lazy) I also created some Vagrant provisioning scripts. This means you can be up and running with Ubuntu 13.04 64-bit Server hosting a Websharper site in minutes!


What the scripts do

  1. Downloads and installs Mono 3.0.10 and F# 3.0.
  2. Adds an init.d script for fastcgi-mono-server4 (/etc/init.d/monoserve) – this also configures Mono to use the new SGen garbage collector.
  3. Sets up nginx to proxy requests to fastcgi-mono-server4.
  4. Hosts the sample Websharper app housed in /vagrant/www (this folder is shared between the guest VM and the host machine).

How to get started

  1. Install Virtualbox.
  2. Install Vagrant.
  3. Clone the provisioning scripts from my bitbucket account:
    git clone https://perfectshuffle@bitbucket.org/perfectshuffle/vagrant_raring_mono.git mono
  4. Launch the vagrant box:
    cd mono
    vagrant up
    
  5. Once everything has finished configuring, the box’s IP addresses are dumped to the console. Just point your browser at the eth1 IP address and you should see the site running!
  6. Replace the sample files in the /vagrant/www folder with your own website.

  7. Profit!

I’ve also tried running the scripts on some cloud hosting rather than inside vagrant and they work great.

Installing Monodevelop 3 with F# support on Ubuntu

After much experimentation and digging around on Google Groups (special thanks to Ibrahim Hadad) I have finally managed to get Monodevelop 3 and F# working together nicely on Ubuntu. These were the steps I took. Your mileage may vary. 🙂

(Update: Knocte has suggested a couple of modifications to simplify the process. These are now reflected below.)

1) sudo apt-get install mono-complete libgdiplus git autoconf libtool

2) Install monodevelop using the script from John Ruiz’ blog:
http://blog.johnruiz.com/2012/05/installing-monodevelop-3-on-ubuntu.html

3) Get F# source and compile:
git clone git://github.com/fsharp/fsharp
cd fsharp/
./autogen.sh --prefix=/usr
make
sudo make install

4) Run monodevelop. Go to Tools, Add-in Manager, Gallery and install the F# language binding.

5) Enjoy!

Monodevelop 3 with F# bindings

Winning!

Debugging Silverlight applications with WinDbg

To use WinDbg to examine a dump…

1) Make sure that the dump file is 32-bit if the application was running under the 32-bit Silverlight runtime. Process Explorer creates 64-bit dumps on 64-bit machines even for 32-bit applications, and these will not work in WinDbg. You can use the Sysinternals procdump tool to create a 32-bit dump: procdump -ma sllauncher.exe mydump.dmp

2) Make sure you are using the 32-bit version of WinDbg (for 32-bit dumps).

3) Configure symbols in WinDbg: .sympath SRV*c:\symbolcache*http://msdl.microsoft.com/download/symbols

4) Load the dump file in WinDbg (File, Open crash dump).

5) Load SOS and the CoreCLR for Silverlight. The .loadby command seems to be broken, so you’ll have to use .load and enter the complete paths:


.load C:\Program Files (x86)\Microsoft Silverlight\5.1.10411.0\sos.dll
.load C:\Program Files (x86)\Microsoft Silverlight\5.1.10411.0\coreclr.dll

If you’re using the 64 bit Silverlight runtime I believe you just need to use the 64bit WinDbg and load the dlls from C:\Program Files\Microsoft Silverlight\5.1.10411.0 instead.

You should be ready to go, for example:

0:000> !clrstack
OS Thread Id: 0x45dc (0)
Child SP IP Call Site
0014f3c0 03aa025f SilverlightApplication2.MainPage..ctor()
0014f3cc 03aa0215 SilverlightApplication2.App.Application_Startup(System.Object, System.Windows.StartupEventArgs)
0014f3e4 7b316fa3 MS.Internal.CoreInvokeHandler.InvokeEventHandler(UInt32, System.Delegate, System.Object, System.Object)
0014f410 7b2f5239 MS.Internal.JoltHelper.FireEvent(IntPtr, IntPtr, Int32, Int32, System.String, UInt32)
0014f460 7b390969 DomainNeutralILStubClass.IL_STUB_ReversePInvoke(Int32, Int32, Int32, Int32, IntPtr, Int32)
0014f510 02e017a7 [ContextTransitionFrame: 0014f510]

Sometimes, for example if your Silverlight application uses managed .NET COM components, WinDbg will try to load the wrong CLR debugging module. To quote from the deep dark depths of the WinDbg help file:

“To debug a managed application, the debugger must load a data access component (DAC) that corresponds to the CLR that the application has loaded. However, in some cases, the application loads more than one CLR. In that case, you can use the I parameter to specify which DAC the debugger should load.”

In this case the following two commands should sort things out:


.cordll -u
.cordll -I coreclr -lp "C:\Program Files (x86)\Microsoft Silverlight\5.1.10411.0"

This article may also be useful:

http://www.codeproject.com/Articles/331050/Assembly-Helps-Debug-NET-Applications

Derivatives of a polynomial in F#

Just because I was bored…

type Term = {Coefficient: int; Power: int}
type Polynomial = list<Term>

let derivative polynomial =
    polynomial
    |> List.map (fun term -> {Coefficient = term.Coefficient * term.Power; Power = term.Power - 1})
    |> List.filter (fun term -> term.Power >= 0)

let prettyPrint polynomial =
    let sortedTerms = polynomial |> List.sortBy (fun term -> -term.Power)

    let rec printTerms terms =
        match terms with
        | [] -> ""
        | [term] ->
            match term.Power with
            | 0 -> sprintf "%d" term.Coefficient
            | 1 -> sprintf "%dx" term.Coefficient
            | n -> sprintf "%dx^%d" term.Coefficient term.Power                           
        | h::t -> printTerms [h] + " + " + printTerms t

    printTerms sortedTerms

let sample = [{Coefficient=3; Power=2};{Coefficient=5; Power=1};{Coefficient=4;Power=0}]
let derived = sample |> derivative

printfn "Polynomial:\t%s" <| prettyPrint sample
printfn "Derivative:\t%s" <| prettyPrint derived
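For the sample polynomial above this should print:

Polynomial:	3x^2 + 5x + 4
Derivative:	6x + 5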

 

Town Crier 1.1 now available on NuGet

Town Crier can now be downloaded from NuGet 🙂

There is now also some built-in Markdown support (thanks to friism). This provides a convenient way to send HTML emails where possible, but with a human-readable plain-text alternative, whilst only writing one template:

var factory = new MergedEmailFactory(new TemplateParser());

var tokenValues = new Dictionary<string, string>
                      {
                          {"name", "Joe"},
                          {"userid", "123"}
                      };

MailMessage message = factory
    .WithTokenValues(tokenValues)
    .WithSubject("Test Subject")
    .WithMarkdownBodyFromFile(@"templates\sample-markdown.txt")
    .Create();

To install Town Crier into your project from the Visual Studio Package Console:
install-package towncrier

Temporary file helper class

Occasionally it’s necessary to output data into a temporary file, for example in order to pass data to an external program. I threw together this little helper class for such situations.

public class TemporaryFile : IDisposable
{
    public string FilePath { get; protected set; }

    public TemporaryFile()
    {
        FilePath = Path.GetTempFileName();
    }

    public void Dispose()
    {
        if (File.Exists(FilePath))
            File.Delete(FilePath);
    }
}

Use it like this:

using (var tempInputFile = new TemporaryFile())
{
   // Do stuff with tempInputFile.FilePath here...
}

// Dispose will be called at the end of the using statement and so the file will be deleted.
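For instance, the “pass data to an external program” scenario mentioned above might look something like this. A rough sketch – the tool name and input data are made up purely for illustration:

using System.Diagnostics;
using System.IO;

using (var tempInputFile = new TemporaryFile())
{
    // Write whatever the external program expects into the temporary file.
    File.WriteAllText(tempInputFile.FilePath, "some input data");

    // Hand the file to the (hypothetical) external tool and wait for it to finish.
    using (var process = Process.Start("sometool.exe", "\"" + tempInputFile.FilePath + "\""))
    {
        process.WaitForExit();
    }
} // The temporary file is deleted here.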

Using Microsoft Reactive Extensions to orchestrate time-bound data retrieval

Microsoft Reactive Extensions (usually referred to simply as Rx) is a library for orchestrating and synchronising asynchronous operations. It’s based on a beautiful mathematical duality between IEnumerable/IEnumerator and their new counterparts (included in .NET 4), IObservable/IObserver. Documentation is unfortunately somewhat scarce, and beyond the clichéd dictionary-suggest and drag-and-drop examples it’s quite hard to find sample code. As ever though, the best way to learn something is to try to use it to solve a real-world problem.

Essentially, you can think of observables as push-based collections. Instead of pulling data from an enumerable (e.g. with a for-each loop), the data is pushed at you by the observable.
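A minimal way to see the difference in code – Observable.Range here is just a stand-in for a real push-based source, and System.Reactive.Linq is the namespace used by current Rx releases (the early DevLabs builds organised things slightly differently):

using System;
using System.Linq;
using System.Reactive.Linq;

class PushVsPull
{
    static void Main()
    {
        // Pull-based: we ask the enumerable for each value in turn.
        foreach (var n in Enumerable.Range(1, 3))
            Console.WriteLine("Pulled " + n);

        // Push-based: we hand the observable a callback and it pushes each value at us.
        Observable.Range(1, 3).Subscribe(n => Console.WriteLine("Pushed " + n));
    }
}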

A little background

One of my company’s websites displays statistics on various pages. The queries are dynamic and numerous enough that it is not practical to pre-calculate and cache all the results every few minutes, so instead we operate on an 80/20 rule. That is, 80% of our website views occur on 20% of the pages (usually new content on the homepage, or content that is newly linked to from other popular sites). Therefore we cache the result of each database query in memcached for a few minutes, the cache key being a hashcode of the SQL query (that’s a simplification – we actually serialize the LINQ expression tree, but that’s for another blog post).

Sometimes uncached statistics take a while to retrieve, depending on database load and latency. Since our primary concern is total page load time and responsiveness, we simply abort the request and hide the statistics from the page if they are not retrieved within a fixed amount of time. The initial implementation of this simply aborted the thread when a certain timeout had elapsed. Unfortunately this solution has a big problem.
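For reference, the original approach looked roughly like this. This is a simplified sketch with made-up names rather than the actual production code, but it captures the idea:

using System;
using System.Threading;

static class NaiveStatistics
{
    // Stand-in for the real (potentially slow) database query.
    static object GetStatisticsFromDatabase()
    {
        Thread.Sleep(5000);
        return new object();
    }

    // Returns the statistics, or null if they could not be fetched in time.
    public static object GetStatisticsOrNull(TimeSpan timeout)
    {
        object result = null;
        var worker = new Thread(() => result = GetStatisticsFromDatabase());
        worker.Start();

        if (!worker.Join(timeout))
        {
            worker.Abort();  // Give up and hide the statistics for this page view...
            return null;     // ...but the result never reaches the cache (see below).
        }

        return result;
    }
}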

The death spiral

The trouble with aborting the thread is that if a database operation times out, the result never makes it into the cache. This means the next time the page is hit another cache miss occurs and the SQL database gets hit again. Since this query is identical to the first it will probably also time out. The database load keeps increasing because it is repeatedly being hit with the same query whilst the result is never cached.

The requirements in brief

The basic logic we need is therefore as follows:

  • Page view generates request for data.
  • Cache is checked for a specific key.
    • On cache timeout/error – cancel operation. Don’t hit SQL because if the cache is down it’s better to display the pages without the statistics and avoid hammering the SQL server.
    • On cache hit – return data.
    • On cache miss – request data from SQL.
      • On SQL timeout – hide the control but continue fetching the data in the background and place it into the cache when it finally returns.
      • On SQL success – return data, enter it into the cache.
      • On SQL error – abort, hide control.

The problem

Trying to write this logic using threading, locks and traditional synchronisation constructs is difficult, bug-prone, and results in horrific spaghetti-code.

Reactive Extensions to the rescue

Reactive Extensions provides us with a much nicer way to deal with these kinds of asynchronous operations.

To keep the example simple, I’ll use a console application instead of a web app, and simulate the cache and SQL database. I’ll also forget about using the SQL query as the cache key and use an entity id instead. In order to run this example you will need to have the Reactive Extensions assemblies installed, which can be downloaded from DevLabs.

The example makes use of a number of extension methods provided by Rx:

  • Defer – This defers an operation until an observable is subscribed to.
  • Return – This creates an observable that returns a single result.
  • Timeout – Causes an observable to signal a TimeoutException after a specified timeout. Note that although this means the subscription is disposed and no further results will be yielded, the underlying operation continues to run. This is useful when you have side effects that need to occur – in this case, placing the result of a long-running SQL query into the cache.
  • Catch – Specifies another observable sequence to continue with when an exception occurs.
  • Take – This is analogous to traditional LINQ. Remember though that unlike First() this does not cause execution of the query and so does not block.

It also makes use of the Subject class. This is a special class that acts as both an observer and an observable, allowing multiple subscriptions to a single stream of events. It may not strictly be necessary in this example, but I have found that introducing subjects helps to avoid the easy mistake of subscribing twice to an observable and causing two lots of side-effects to occur.
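A quick illustration of that dual role – note that Subject&lt;T&gt; lives in System.Reactive.Subjects in current Rx releases; the early DevLabs builds used different namespaces:

using System;
using System.Reactive.Subjects;

class SubjectDemo
{
    static void Main()
    {
        var subject = new Subject<string>();

        // As an observable: several subscribers can listen to the same stream.
        subject.Subscribe(x => Console.WriteLine("Subscriber A got " + x));
        subject.Subscribe(x => Console.WriteLine("Subscriber B got " + x));

        // As an observer: values pushed in are forwarded to every subscriber,
        // so the underlying source only needs to produce them once.
        subject.OnNext("result");
        subject.OnCompleted();
    }
}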

Without further ado, the code. You will need to add project references to System.CoreEx and System.Reactive.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

namespace CacheExample
{
    public class CacheMissException : ApplicationException
    {

    }

    // Represents the entity we are trying to retrieve from the cache
    // or database
    public class ResultEntity
    {
        public ResultEntity(string value)
        {
            Value = value;
        }

        public string Value { get; set; }
    }

    public interface IResultRepository
    {
        ResultEntity GetResultById(int id);
    }

    public class DatabaseRepository : IResultRepository
    {
        public ResultEntity GetResultById(int id)
        {
            Console.WriteLine("Retrieving results from database...");

            // Increment the following wait time to simulate a
            //database timeout.
            Thread.Sleep(150);

            // Note that this code is still executed even if the
            // observer is disposed.
            // This, conveniently, allows for "side-effects".
            // In this case we could put the result into the
            //cache so the next user gets a cache hit!
            Console.WriteLine("Retrieved result from database.");
            return new ResultEntity("Database Result");
        }
    }

    public class CacheRepository : IResultRepository
    {
        public ResultEntity GetResultById(int id)
        {
            Console.WriteLine("Retrieving result from cache...");

            //Increment the following value to simulate a cache timeout.
            Thread.Sleep(20);

            //Uncomment the next line to simulate a cache miss
            //throw new CacheMissException();

            Console.WriteLine("Retrieved result from cache!");
            return new ResultEntity("Cached Result");
        }
    }

    class Program
    {
        static readonly IResultRepository cacheRepository =
            new CacheRepository();
        static readonly IResultRepository databaseRepository =
            new DatabaseRepository();

        static void Main(string[] cmdLineParams)
        {
            int id = 123;
            var cacheTimeout = TimeSpan.FromMilliseconds(50);
            var databaseTimeout = TimeSpan.FromMilliseconds(200);

            var cacheObservable = Observable.Defer(
                        ()=>Observable.Return(
                              cacheRepository.GetResultById(id)));
            var databaseObservable = Observable.Defer(
                        ()=>Observable.Return(
                              databaseRepository.GetResultById(id)));

            // Try to retrieve the result from the cache, falling over
            // to the DB in case of cache miss.
            var cacheFailover = (cacheObservable
                .Timeout(cacheTimeout))
                .Catch<ResultEntity, CacheMissException>(
                    (x) =>
                        {
                        Console.WriteLine("Cache miss. Attempting to retrieve from database.");
                        return databaseObservable
                                 .Timeout(databaseTimeout);
                        }
                )
                .Catch<ResultEntity, TimeoutException>(
                    (x) =>
                    {
                        Console.WriteLine("Timed out retrieving result from cache. Giving up.");
                        return Observable.Empty<ResultEntity>();
                    }
                );

            var result = new Subject<ResultEntity>();
            result.Take(1).Subscribe(
                  x=> Console.WriteLine("SUCCESS: Result: " + x.Value),
                  x=> Console.WriteLine("FAILURE: Exception!"),
                  () => Console.WriteLine("Sequence finished."));

            cacheFailover.Subscribe(result);

            Console.WriteLine("Press any key to exit.");
            Console.ReadKey();
        }
    }
}

I would recommend playing around with the code. Experiment with adjusting the timeouts and uncommenting the lines with notes by them to see what happens in different scenarios. If you haven’t used Rx before, wrapping your head around observables can take a while. I would thoroughly recommend taking some time to watch the various Channel 9 videos.

Town Crier – An open-source e-mail templating engine for .NET

In medieval times, town criers were the primary means of making announcements to a community. Nowadays a man with a bell is a very imaginative – but not particularly practical – means of communication.

One common scenario, especially in the business world, is the need to send out an email to a large number of people. Of course a big anonymous email lacks the friendliness of the local loud-mouthed peasant and so we try to personalise the emails with individuals’ names etc.

I suspect most .NET developers have come across this problem at some point in their career. It generally leads to a lot of messy string concatenation and trying to manhandle System.Net.Mail.SmtpClient into doing what you want. With text-based emails this is ugly; when HTML is involved it becomes a world of pain.

Town Crier is a project I have been working on to simplify this scenario. The basic workflow for sending a templated e-mail is as follows:

  1. Create an email template.
    This can be either a plain-text or HTML file (or both). Tokens to be replaced are written like this: {%= customersname %}

    Sample email templates:
    Sample HTML e-mail template
    Sample text e-mail template

  2. Write some very simple code in the CLR language of your choice, in this case C#:
    var factory = new MergedEmailFactory(new TemplateParser());
    
    var tokenValues = new Dictionary<string, string>
                          {
                              {"name", "Joe Bloggs"},
                              {"age", "21"}
                          };
    
    MailMessage message = factory
        .WithTokenValues(tokenValues)
        .WithSubject("Test Subject")
        .WithHtmlBodyFromFile(@"templates\sample-email.html")
        .WithPlainTextBodyFromFile(@"templates\sample-email.txt")
        .Create();
    
    var from = new MailAddress("sender@test.com", "Automated Emailer");
    var to = new MailAddress("recipient@test.com", "Joe Bloggs");
    message.From = from;
    message.To.Add(to);
    
    var smtpClient = new SmtpClient();
    smtpClient.Send(message);
    

    Of course it’s then trivial to loop through rows in a database, populate the dictionary and perform a “mail-merge” programmatically (a rough sketch follows below).

    One final tip – there is a handy extension method included that allows you to save the message to a .eml file:

    message.Save(new FileStream(@"output.eml", FileMode.CreateNew));
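    As mentioned above, the mail-merge itself is then just a loop. A rough sketch, reusing the factory from step 2 – the recipient list is hard-coded here purely for illustration, whereas in reality you would read it from your database:

    var recipients = new[]
    {
        new { Email = "alice@example.com", Name = "Alice", Age = "32" },
        new { Email = "bob@example.com",   Name = "Bob",   Age = "45" }
    };

    var smtpClient = new SmtpClient();

    foreach (var recipient in recipients)
    {
        var tokenValues = new Dictionary<string, string>
                              {
                                  {"name", recipient.Name},
                                  {"age", recipient.Age}
                              };

        MailMessage message = factory
            .WithTokenValues(tokenValues)
            .WithSubject("Test Subject")
            .WithHtmlBodyFromFile(@"templates\sample-email.html")
            .WithPlainTextBodyFromFile(@"templates\sample-email.txt")
            .Create();

        message.From = new MailAddress("sender@test.com", "Automated Emailer");
        message.To.Add(new MailAddress(recipient.Email, recipient.Name));

        smtpClient.Send(message);
    }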

That’s pretty much it! It’s fairly basic but I’ve found it to be very useful. It’s also my first open-source project so please be nice!

I am releasing it under the GNU Lesser General Public License (LGPL). Go grab the sources at GitHub.


Implementing map-reduce in F#

Introduction

MapReduce is a programming model popularised by Google in which we take a set of tuples (key-value pairs), transform (map) them into an intermediate set of key-value pairs, and then perform some aggregation (reduce) operation on the intermediate values to obtain a result set. This is a useful way to express a problem because it yields an obvious “divide and conquer” structure for the computation that lends itself to parallel/distributed computing, making it fairly simple to perform computations on extremely large data sets.

It can be quite difficult to grok at first, so I decided to try implementing one of the examples from the MongoDB documentation in F# (if interested, see shell example 2). In this example, we have a list of people and the types of pet each of them has. We wish to calculate the total number of each animal.

The Code

Again, F# proves to be a remarkably succinct language in which to express problems; in this case the built-in syntactic sugar for tuples is a godsend!

UPDATE (25-May-2010) – Controlflow helpfully suggested that I could make my original code somewhat neater by using pattern matching to decompose tuples. I’ve updated the code below with these improvements.

#light

// Simple example of map-reduce  in F#
// Counts the total numbers of each animal

// Map function for our problem domain
let mapfunc (k,v) =
    v |> Seq.map (fun(pet) -> (pet, 1))

// Reduce function for our problem domain
let reducefunc (k,(vs:seq<int>)) =
    let count = vs |> Seq.sum
    k, Seq.ofList([count])

// Performs map-reduce operation on a given set of input tuples
let mapreduce map reduce (inputs:seq<_*_>) =
    let intermediates = inputs |> Seq.map map |> Seq.concat
    let groupings = intermediates |> Seq.groupBy fst |> Seq.map (fun(x,y) -> x, Seq.map snd y)
    let results = groupings |> Seq.map reduce
    results

// Run the example...
let alice = ("Alice",["Dog";"Cat"])
let bob = ("Bob",["Cat"])
let charlie = ("Charlie",["Mouse"; "Cat"; "Dog"])
let dennis = ("Dennis",[])

let people = [alice;bob;charlie;dennis]

let results = people |> mapreduce mapfunc reducefunc

for result in results do
    let animal = fst result
    let count = ((snd result) |> Seq.toArray).[0]
    printfn "%s : %s" animal (count.ToString())

printfn "Press any key to exit."

System.Console.ReadKey() |> ignore

This yields the expected results:

Dog : 2

Cat : 3

Mouse : 1

Exercise for the reader

Parallelise this implementation (for a single machine this should be trivial using the Parallel LINQ integration provided in the F# PowerPack).