I built something like this quite some time ago using groovy. I called the resulting Groovy map-reduce facade grool because it was pretty thin. The results were even more impressive for toy programs than what SMR achieved. For instance, here is a fully hadoop compatible map-reduce version of word count:
wc = mapReduce(
{key, line, out, report -> line.split().each {w -> out.collect(w,1)}},
{w, counts, out, report -> out.collect(w, counts.sum())}
)
In this example, wc is a function that can then be applied to various inputs as in
wc(["this is input text", "for the word-counter"])
or
wc(new File("/local/input/file")
or even
wc(functionUsingMapReduce(new HdfsFile("/existing/file")))
In spite of how pretty these tiny examples are, I eventually quit using it for day-to-day map-reduce coding.
The problem with things like grool that attempt to make writing map-reduce programs easier is that it doesn't really solve the problems that you face in writing map-reduce programs. It does make word-count much shorter (5-7 lines of code versus >200). It does involve a very cool hack so that the closures passed as arguments are available as map and reduce objects on the right machines. But I found in using grool for production code that it didn't solve the right problem.
The big problems involved in map-reduce programming are
What I would much rather have is something that provides much higher-level semantics such as those provided by Pig, but which provides for better integration into real programming languages. The better integration that I need goes two directions. First, I want to be able to, for instance, compute every date in a date range, use those dates to form input paths and then do a computation on those input paths. Pig can't do that except via a very ugly argument substitution hack.
A second kind of better integration is what Chris Wenzel has been pushing for some time with Cascading. I would like it if I could write programs in what might be called PL/Pig (by analogy with PL/SQL) that would then be tightly integrated into a Cascading flow. Ideally, the PL/Pig program would expose its execution plan to Cascading for global optimization and scheduling.