Saturday, 6 October 2012

Functional programming in Java?

After more toying with Scala on my spare time while hacking Java in office hours I miss out on using all these functional idioms in my Java programs. So with Java 8 coming closer I got interested in how the JDK team is planning on utilizing the new Java closure syntax in for example the collections libraries to support a more functional style of programming. While I have it fresh in memory, let me tell you some cool aspects of this!

Closures in Java 

Closures are anonymous functions also known as lambda expression. In languages where functions are first class members you could assign a lamba expression to a function variable. However, this is not entirely true for the Java implementation. In order not to mess up the type system to much, the engineers behind the implementation of the Lambda JSR 335 in OpenJDK has taken the approach to use interfaces containing one method to become special in the language. Such an interface is called a functional interface. An example of such an interface is ActionListener
public interface ActionListener { 
    void actionPerformed(ActionEvent e);
}
or Runnable or Comparator.
So in JDK 7 or earlier syntax, this is how you would declare for example an ActionListener
ActionListener l = new ActionListener() { 
  public void actionPerformed(ActionEvent e) { 
    doWork(e.getModifiers());
  }
};

However, with the new Java 8 syntax, you'd write it as
ActionListener l = (ActionEvent e) -> doWork(e.getModifiers());
So you get rid of all the unnecessary boilerplate code inherent in anonymous class creations so common in Java. More pleasing examples of usages of this syntax
Comparator<String> c = (s1, s2) -> s1.compareToIgnoreCase(s2);
FileFilter java = f -> f.getName().endsWith(".java");

There's a lot of interesting things going on here like how lexical scopes work, variable binding, type inference etcetera. Check out project lead Brian Goetz report on how these things works.

Java Collection libraries 

So how will this be used in the collection libraries which are probably the most used toolkit of the Java platform? First of all, the Collections team has decided not to restart by rewriting the collections libraries from scratch. While mentioning that this might be a candidate for future JDK versions, version 8 will instead provide an evolutionary step forward in the Collections by adding extension methods to existing interfaces (List, Set, Iterable) and retrofit existing classes with new interfaces such as Stream.

A major shift will be from the imperative style of external iteration to the more functional style of internal iteration. For example, the recommended idiom in Java 5+ to change a property on all objects in a collection uses 
for (Car c : cars) {
    c.setState(MOVING);
}
With closures you'll write this as
cars.forEach(c -> { c.setState(MOVING); });
What are the benefits of this idiom? You move the control flow to the library instead of the client code. In this way the library can decide on potentially use laziness, parallelism and out-of-order execution to improve performance which will be showed in later examples.

You can pipeline operations. In this example the filter operation uses a predicate to decide which objects in the collection to pipe to the final forEach clause.
cars.filter(c -> c.getWeight() > 2000)
      .forEach(c -> { c.setState(STOPPED); });
And to store results from computations using for example the map operation which operates on each value in the piped collection use something like
Set<Engine> smallEngines = cars.filter(c -> c.getMaxSpeed() < 100)
                .map(c -> c.getEngine())
                .into(new HashSet<>());
or to sum them. This is the well known functional idiom of map reduce.
int sum = cars.filter(c -> c.getState() == MOVING)
                .map(c -> c.getWeight())
                .sum();

So, all these operations will not create temporary new collections and pass on to the next operation. Instead they operate lazily and stream values between the control blocks. This implies good performance when for example searching for the first object that satisfies some condition. The upstream iterator in this example will not continue the iteration when getFirst() has found a match.
Car fastCar = cars.filter(c -> c.getSpeed() > 120).getFirst();
When used to these constructs, a lot of boilerplate code should be possible to remove. Here Brian Goetz shows an example of a method in java.lang.Class as today
 for (Method m : enclosingInfo.getEnclosingClass().getDeclaredMethods()) {
     if (m.getName().equals(enclosingInfo.getName()) ) {
         Class<?>[] candidateParamClasses = m.getParameterTypes();
         if (candidateParamClasses.length == parameterClasses.length) {
             boolean matches = true;
             for(int i = 0; i < candidateParamClasses.length; i++) {
                 if (!candidateParamClasses[i].equals(parameterClasses[i])) {
                     matches = false;
                     break;
                 }
             }

             if (matches) { // finally, check return type
                 if (m.getReturnType().equals(returnType) )
                     return m;
             }
         }
     }
 }

 throw new InternalError("Enclosing method not found");

and how it could be rewritten without all the temporary variables making it both more readable and less error prone.
Method matching =
  Arrays.asList(enclosingInfo.getEnclosingClass().getDeclaredMethods())
    .filter(m -> Objects.equals(m.getName(), enclosingInfo.getName())
    .filter(m ->  Arrays.equals(m.getParameterTypes(), parameterClasses))
    .filter(m -> Objects.equals(m.getReturnType(), returnType))
    .getFirst();
if (matching == null)
    throw new InternalError("Enclosing method not found");
return matching;

There's a lot of more cool features on the project site, but before ending you should see how easy parallel computation can become. By streaming the pipeline via parallel() the library will try to divide the pipeline stream of operations to all your cores. 
int sum = cars.parallel()
                .filter(c -> c.getState() == MOVING)
                .map(c -> c.getWeight())
                .sum();
Via the new interface Splittable you can also very easily use the Fork/Join framwork for divide and conquer tasks.

Timeplans

This looks awesome. But when can we use it? According to the milestone plan the public review version should be available in January 2013. The JDK 8 timeplan currently looks like

2012/7 Expert Group formation
2012/9 Early Draft Review
2013/1 Public Review
2013/6 Proposed Final Draft
2013/8 Final Release
http://openjdk.java.net/projects/jdk8/spec/

But you can download bleeding edge versions of the OpenJDK 8 binaries today and play around with the lambda language construct and the current implementations in the collections libraries as well as a lot of the other libraries. http://jdk8.java.net/download.html

Alternative functional libraries

If you can't wait there are libraries for functional programming in Java that will work with JDK5 or newer.

Guava

Googles Guava libraries have support for functional idioms. However without language support the code easily becomes messed up with boilerplate. On the other hand, there are a lot of good stuff in Guava like a richer set of collection constructs and easier use of immutable collections types. Here
Multiset<Integer> lengths = HashMultiset.create(
  FluentIterable.from(strings)
    .filter(new Predicate<String>() {
       public boolean apply(String string) {
         return CharMatcher.JAVA_UPPER_CASE.matchesAllOf(string);
       }
     })
    .transform(new Function<String, Integer>() {
       public Integer apply(String string) {
         return string.length();
       }
     }));

Lambdaj

Another interesting library is Lambdaj which uses static imported methods to hide give a nicer looking syntax. This is some typical Java code to sort a list of persons according to age
List<Person> sortedByAgePersons = new ArrayList<Person>(persons);
Collections.sort(sortedByAgePersons, new Comparator<Person>() {
        public int compare(Person p1, Person p2) {
           return Integer.valueOf(p1.getAge()).compareTo(p2.getAge());
        }
});
With Lambdaj, you could express this as
List<Person> sortedByAgePersons = sort(persons, on(Person.class).getAge());
Check out more features at http://code.google.com/p/lambdaj/wiki/LambdajFeatures.

FunctionalJava

FunctionalJava solves the syntax verbosity problem by using the Java 7 BGGA proposal syntax. This adds closures as a part of the language dialect. However, this requires a pass with a pre compiler to render compilable Java code.
This is an example of code that adds 42 to each element in the array.


  1. final Array<Integer> a = array(123);  
  2. final Array<Integer> b = a.map({int i => i + 42});  
  3. arrayShow(intShow).println(b); // {43,44,45}  

Noteworthy is that solution also heavily relies on static imports. Check out more example at http://functionaljava.org/examples/1.5/


My thoughts

In the end though, due to the lack of anonymous functions in Java today, the best choice to program in a functional way is probably to stick to Scala, Clojure, Groovy or another of the JVM languages that has inherent support for this style until Java 8. With what we got today in Java you can still use many of the functional concepts like preferring immutable data, avoiding side effects and more.

The above mentioned alternatives are just a few of what's out there. But common among them are either that you must rely on preprocessing some functional dialect of Java to produce compilable Java code or to use a verbose syntax like Guava. In my opinion these tradeoffs effect on readability, maintainability and possibly portability just isn't worth it.

By the way, Brian Goetz appears 20 mins into the Java One 2012 Technical keynote to show off some collection examples of Java 8. http://medianetwork.oracle.com/video/player/1871712019001