Sunday, 2 October 2016

Tricking Ubuntu to boot on a Sony Vaio Pro 13

This is just a reminder to myself for the next time I install or update Ubuntu on my laptop: I need to do the following after installation, otherwise Ubuntu won't boot. It assumes Secure Boot is disabled and the machine boots via UEFI.


  1. Install Ubuntu
  2. Use the boot disk to load a live session (Try Ubuntu)
  3. Mount the EFI partition
    sudo mount /dev/sda1 /mnt
    cd /mnt/EFI
  4. Create a Microsoft boot directory
    sudo mkdir -p Microsoft/Boot
  5. Copy the Ubuntu boot files into the Microsoft boot directory
    sudo cp -r ubuntu/* Microsoft/Boot/
  6. Create the boot file Microsoft wants (assumes not using Secure Boot)
    sudo cp Microsoft/Boot/grubx64.efi Microsoft/Boot/bootmgfw.efi
  7. Unmount partition
    cd
    sudo umount /mnt

Friday, 3 June 2016

Don't just assume arrays will be faster: how we implemented object pooling

Update (29/06/2016): As Mike Barker kindly pointed out, the array backed object pool was not pre-populating the array with objects so the benchmark was going through some extra logic. I've since updated the results and graphs with the array pre-populated, which hasn't greatly changed the results.

To ensure customers receive the best service possible at TransFICC, we need to make sure that our systems have robust latency and throughput figures. A key part of avoiding latency spikes (in Java, anyway) is keeping the Garbage Collector at bay. One way we do this is with the object pooling pattern, which allows us to re-use objects.

We weren't sure which data structure would best back our object pool implementation, so we decided to write a few benchmarks using various data structures: a stack, a deque-based stack, a queue, and an array. The goal of these benchmarks was to see whether any of the data structures allocate (e.g. with a LinkedList, a node would be created every time you added an object to the pool), and to measure their latency and throughput.
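For flavour, here is a minimal single-threaded sketch of one candidate: a stack-style pool backed by an ArrayDeque. The names and shape are illustrative only, not the actual benchmark code.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Minimal sketch of a stack-style object pool backed by an ArrayDeque.
// Single-threaded by design; names are illustrative, not the benchmark code.
final class ObjectPool<T>
{
    private final Deque<T> pool;
    private final Supplier<T> factory;

    ObjectPool(final Supplier<T> factory, final int size)
    {
        this.factory = factory;
        this.pool = new ArrayDeque<>(size); // pre-size so adds don't grow the backing array
        for (int i = 0; i < size; i++)
        {
            pool.addFirst(factory.get()); // pre-populate with pooled objects
        }
    }

    T remove()
    {
        final T pooled = pool.pollFirst(); // LIFO: reuse the most recently returned object
        return pooled != null ? pooled : factory.get(); // allocate only if the pool is empty
    }

    void add(final T object)
    {
        pool.addFirst(object);
    }
}
```

Because both remove and add work at the head of the deque, the remove-one-add-one scenario keeps handing back the same object, which is what makes the stack variant interesting to compare against a queue.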

The throughput results show that both deque-backed object pools perform at a similar level in both scenarios we test (removing and adding a single object, and removing and adding several objects), with the array-backed stack implementation achieving similar throughput in the remove-and-add-several scenario once the error is taken into account.

What is more interesting is the graphing of the latency figures.







Both object pools backed by a deque have very similar latency profiles, which surprised me: I expected the stack implementation to outperform on the remove-one-add-one strategy, as it would only ever be using a single object (the queue implementation will always take a different object from the front of the queue). The stack object pool does perform slightly better in the remove-several-add-several scenario at one data point, but the difference isn't significant compared to the queue implementation.

What I found most surprising is the high variance of the array-backed object pool. I assumed a simple array would yield the best throughput and latency numbers; however, this is not the case. I'm not sure whether it's down to my implementation of the array object pool, so I'll try another version if I have time.

In the end, based on our findings, we opted for an object pool using a stack backed by a deque. I should point out that all our object pools are accessed from a single thread, so they are not required to be thread-safe.

The benchmarks were run on a Fedora 23 box with a 6-core Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz and 32GB RAM.

Benchmark code can be found here.

Throughput


Benchmark                                                              Mode  Cnt   Score    Error   Units
StackObjectPoolUsingDequeBenchmark.removeAndAddOne                          thrpt   10  96.785 ±  0.076  ops/us
StackObjectPoolUsingDequeBenchmark.removeAndAddOne:·gc.alloc.rate           thrpt   10   0.002 ±  0.006  MB/sec
StackObjectPoolUsingDequeBenchmark.removeAndAddOne:·gc.alloc.rate.norm      thrpt   10  ≈ 10⁻⁵             B/op
StackObjectPoolUsingDequeBenchmark.removeAndAddOne:·gc.count                thrpt   10     ≈ 0           counts
QueueObjectPoolUsingDequeBenchmark.removeAndAddOne                          thrpt   10  96.558 ±  0.614  ops/us
QueueObjectPoolUsingDequeBenchmark.removeAndAddOne:·gc.alloc.rate           thrpt   10   0.002 ±  0.006  MB/sec
QueueObjectPoolUsingDequeBenchmark.removeAndAddOne:·gc.alloc.rate.norm      thrpt   10  ≈ 10⁻⁵             B/op
QueueObjectPoolUsingDequeBenchmark.removeAndAddOne:·gc.count                thrpt   10     ≈ 0           counts
StackObjectPoolUsingArrayBenchmark.removeAndAddOne                          thrpt   10  88.930 ±  0.065  ops/us
StackObjectPoolUsingArrayBenchmark.removeAndAddOne:·gc.alloc.rate           thrpt   10   0.002 ±  0.006  MB/sec
StackObjectPoolUsingArrayBenchmark.removeAndAddOne:·gc.alloc.rate.norm      thrpt   10  ≈ 10⁻⁵             B/op
StackObjectPoolUsingArrayBenchmark.removeAndAddOne:·gc.count                thrpt   10     ≈ 0           counts
StackObjectPoolBenchmark.removeAndAddOne                                  thrpt   10  16.276 ±  0.004  ops/us
StackObjectPoolBenchmark.removeAndAddOne:·gc.alloc.rate                     thrpt   10   0.002 ±  0.006  MB/sec
StackObjectPoolBenchmark.removeAndAddOne:·gc.alloc.rate.norm                thrpt   10  ≈ 10⁻⁴             B/op
StackObjectPoolBenchmark.removeAndAddOne:·gc.count                          thrpt   10     ≈ 0           counts


StackObjectPoolUsingDequeBenchmark.removeAndAddSeveral                      thrpt   10  38.524 ±  0.012  ops/us
StackObjectPoolUsingDequeBenchmark.removeAndAddSeveral:·gc.alloc.rate       thrpt   10   0.002 ±  0.006  MB/sec
StackObjectPoolUsingDequeBenchmark.removeAndAddSeveral:·gc.alloc.rate.norm  thrpt   10  ≈ 10⁻⁴             B/op
StackObjectPoolUsingDequeBenchmark.removeAndAddSeveral:·gc.count            thrpt   10     ≈ 0           counts
QueueObjectPoolUsingDequeBenchmark.removeAndAddSeveral                      thrpt   10  38.108 ±  0.255  ops/us
QueueObjectPoolUsingDequeBenchmark.removeAndAddSeveral:·gc.alloc.rate       thrpt   10   0.002 ±  0.006  MB/sec
QueueObjectPoolUsingDequeBenchmark.removeAndAddSeveral:·gc.alloc.rate.norm  thrpt   10  ≈ 10⁻⁴             B/op
QueueObjectPoolUsingDequeBenchmark.removeAndAddSeveral:·gc.count            thrpt   10     ≈ 0           counts
StackObjectPoolUsingArrayBenchmark.removeAndAddSeveral                      thrpt   10  39.241 ±  1.179  ops/us
StackObjectPoolUsingArrayBenchmark.removeAndAddSeveral:·gc.alloc.rate       thrpt   10   0.002 ±  0.006  MB/sec
StackObjectPoolUsingArrayBenchmark.removeAndAddSeveral:·gc.alloc.rate.norm  thrpt   10  ≈ 10⁻⁴             B/op
StackObjectPoolUsingArrayBenchmark.removeAndAddSeveral:·gc.count            thrpt   10     ≈ 0           counts
StackObjectPoolBenchmark.removeAndAddSeveral                                thrpt   10   5.479 ±  0.036  ops/us
StackObjectPoolBenchmark.removeAndAddSeveral:·gc.alloc.rate                 thrpt   10   0.002 ±  0.006  MB/sec
StackObjectPoolBenchmark.removeAndAddSeveral:·gc.alloc.rate.norm            thrpt   10  ≈ 10⁻³             B/op
StackObjectPoolBenchmark.removeAndAddSeveral:·gc.count                      thrpt   10     ≈ 0           counts


Latency

Remove One Add One Scenario

Percentile   StackObjectPoolUsingDequeBenchmark   QueueObjectPoolUsingDequeBenchmark   StackObjectPoolUsingArrayBenchmark   StackObjectPoolBenchmark

0                      0.558                                0.558                                   0.558                             0.558
50                     0.559                                0.559                                   0.559                             0.629
90                     0.629                                0.629                                   0.629                             0.629
95                     0.629                                0.629                                   0.629                             0.698
99                     0.629                                0.629                                   0.629                             0.699
99.9                   1.118                                1.118                                   1.118                             1.118
99.99                  10.88                                6.842                                   10.752                            10.964
99.999                 11.872                               11.792                                  11.872                            12.057
99.9999                17.991                               16.591                                  51.3                              43.027
100                    18.144                               17.92                                   58.56                              44.416
            
Mean                   0.588                                0.59                                    0.586                             0.628
Error (us/op)          0.001                                0.001                                   0.001                             0.001

Remove Several Add Several Scenario

Percentile    StackObjectPoolUsingDequeBenchmark    QueueObjectPoolUsingDequeBenchmark    StackObjectPoolUsingArrayBenchmark    StackObjectPoolBenchmark
0                          0.558                                0.558                                   0.558                              0.698
50                         0.628                                0.628                                   0.628                              0.768
90                         0.629                                0.629                                   0.629                              0.769
95                         0.629                                0.629                                   0.629                              0.769
99                         0.629                                0.629                                   0.629                              0.769
99.9                       1.118                                1.118                                   1.118                              1.258
99.99                      10.816                               6.6                                     10.816                             11.232
99.999                     12.006                               11.872                                  11.99                              12.312
99.9999                    14.389                               19.716                                  29.362                             67.966
100                        14.512                               22.4                                    35.712                             69.376
                
Mean                       0.603                                0.605                                   0.604                              0.746
Error (us/op)              0.001                                0.001                                   0.001                              0.001



Friday, 15 April 2016

So.....Gradle

My previous post mentioned that we were using Maven as our build system but looking at moving across to Buck. Well, there has been a slight change of plan and we've moved to Gradle. The main reason we opted for Gradle over Buck is its integration with IntelliJ, which was somewhat of a pain point when I was working at LMAX.

Migrating from Maven to Gradle was super easy, and has allowed us to simplify the various tests we had defined; Maven becomes very complicated if you try to do anything beyond its predefined build tasks, which was very annoying for us as we have multiple test definitions beyond unit and integration tests (dao, acceptance, benchmark, perf, compatibility...). This had resulted in some serious hackery (note: completely my fault).

As an example, this is how we ran DAO tests in the Maven world:


And this is how we run them with Gradle:

Wrapper script:

build.gradle:

Nice and simple!

Thursday, 14 April 2016

Keep it Simple - Using a bootstrap script to wrap your build system

Do you know what's annoying when you start at a new company (or do a clean checkout of some code)? Getting your development environment set up; especially when it comes to making sure you have the right software in place to be able to build said code!

I'm sure there are lots of ways to solve these problems, but a simple solution we use at TransFICC is a bootstrap script that wraps our build system. This is called every time you want to run something via the build system (in our case Maven [we haven't buck'd up]), and it ensures all build dependencies are present. I should note that we only use this for build dependencies, not runtime dependencies.

bootstrap.sh


Our build dependencies are Java, Node (well, npm), Buck, and Maven. This script assumes the dependencies live at http://thearts.internal.transficc.com/artifacts/ (e.g. the jdk is located at http://thearts.internal.transficc.com/artifacts/jdk/jdk1.8.0_74.tar.gz and the directory contained inside the tar.gz is jdk1.8.0_74). The checksums check for a change of dependencies rather than ensuring they are downloaded correctly.

This script could be used as follows.

transficc.sh


./transficc.sh compile

Tuesday, 21 January 2014

Why I use integration/acceptance tests

I recently posted about my annoyances of setting up integration tests in a separate directory structure than normal unit tests when using the Maven dependency management system. I briefly posted how I resolved this problem, but did not mention why I was writing integration tests.

I am relatively new to Javascript development, having done some web development in PHP many moons ago (professionally I am a server-side developer, working in a team that utilises TDD, continuous integration, etc). I was finding that as I developed the client side functionality, I was constantly breaking existing functionality as I couldn't be bothered going through and clicking on every interactive part of every webpage to ensure that everything still worked. This was becoming a real annoyance, with refactoring of the client side code becoming more risky as I didn't have any tests to ensure that functionality remained unchanged.

Changes to my APIs, defined by my Spring MVC controllers, were breaking the interaction between client and server without me even realising. I knew that the functionality on the client and server sides worked independently, but I was not testing their interaction. The same could be said of my database calls. I knew the individual components of the system were working as expected, but their interactions were not!

All of this was enough for me to start writing integration (acceptance) tests. To do this I used Selenium for testing the interaction on the actual website (UI tests), and wrote an API so that I could interact with the server side. The API part of the testing actually allowed me to do two things:


  1. I was able to ensure that pure API calls worked correctly (with verification and validation not just happening via the web page);
  2. I could set up the website environment using API calls when testing different parts of the UI with Selenium (every test had to register a new user - I didn't want to exercise the user registration page in every test, as this would make all my tests slower and add more scope than each test requires).


By writing a testing language that wrapped my API and Selenium UI calls, I was then able to define my business requirements in my acceptance tests. I find this a far better way of defining behaviour than comments, as those often become out of date without developers realising.
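The shape of such a testing language can be sketched roughly as below. ApiDriver, UiDriver, and every method name here are hypothetical stand-ins, not the post's actual code.

```java
// Hypothetical sketch of a testing language wrapping API and Selenium UI
// calls. All interface and method names here are illustrative stand-ins.
class AcceptanceDsl
{
    interface ApiDriver
    {
        void registerUser(String name, String password);
    }

    interface UiDriver
    {
        void login(String name, String password);
        boolean isLoggedInAs(String name);
    }

    private final ApiDriver api;
    private final UiDriver ui;

    AcceptanceDsl(final ApiDriver api, final UiDriver ui)
    {
        this.api = api;
        this.ui = ui;
    }

    // Set-up goes through the fast API path, not the registration page...
    void givenARegisteredUser(final String name, final String password)
    {
        api.registerUser(name, password);
    }

    // ...and only the behaviour under test goes through the UI.
    void whenTheUserLogsIn(final String name, final String password)
    {
        ui.login(name, password);
    }

    boolean thenTheUserIsLoggedInAs(final String name)
    {
        return ui.isLoggedInAs(name);
    }
}
```

A test written against this reads as a business requirement (given a registered user, when they log in, then they are logged in) while the drivers hide whether each step happens over HTTP or through the browser.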

Integration tests in a separate folder structure from unit tests in Maven

I have recently been working on introducing integration/acceptance tests into a stand-alone web application. It was around this time that I really started to regret using Maven as my dependency management tool. It had been fine up to the point where I wanted to introduce integration tests (note: the concept of acceptance tests does not exist in Maven) in a different folder structure from normal unit tests.

Maven instantly became very angry at me for wanting to do this, and insisted that if I wanted such a thing then I should do it in a separate module, as these tests shouldn't be specific to one module. This makes sense if you have a multi-module project, as such tests should be about end-to-end testing. However, many projects are never going to be multi-module, and creating a separate module just for integration testing feels like over-engineering. What I wanted to do simply goes against the Maven model.

After some serious faffing (and adding several new dependencies) I was able to get a solution where my tests run in Maven's integration-test phase, allowing Jetty to be started and stopped in the pre-integration-test and post-integration-test phases.

pom.xml


This allows me to have the following project structure.

/src/test/java/
/src/integration/java/
/src/main/java/

I really wanted to avoid using a separate plugin for integration tests but could not get any other solution to work. I hope this helps.

Tuesday, 17 September 2013

Multiple term search algorithm (naive)

The other week I was working on a search algorithm for a website I have been developing, which utilises Java 7, Spring MVC 3, Jetty, Bootstrap, jQuery et al. The algorithm has to search certain database tables and return a list of results in a semi-useful order.

What is semi-useful to the customer? For me, just being returned a list of results in alphabetical order isn't what I would deem "semi-useful". There is also the separate issue of how to deal with the search terms: take the entire contents of the search box as one search term, split on spaces/punctuation (e.g. commas), try to form multi-word search terms that would make sense, etc.

For the search-term issue, I cheated slightly: due to the nature of the website's data, I was able to state that search terms should be comma-separated. I know this isn't necessarily the best solution, but it provides slightly more flexibility than "everything in the search box is one search term".

Now the search-term conundrum is out of the way, how should I go about creating a suitable search algorithm? Initially I took the approach of producing something naive that iterates over the given search terms and goes poking the database for results.


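That naive first attempt might have looked something like this. SearchDao and ResultData are hypothetical stand-ins for the site's real types, not the original code:

```java
import java.util.Collection;
import java.util.LinkedList;
import java.util.List;

// Hypothetical sketch of the naive first attempt: iterate the search terms
// and query the database for each one. Stand-in types, not the site's code.
class NaiveSearch
{
    interface SearchDao
    {
        List<ResultData> findMatching(String term);
    }

    static class ResultData
    {
        final long id;
        final String title;

        ResultData(final long id, final String title)
        {
            this.id = id;
            this.title = title;
        }
    }

    static Collection<ResultData> search(final List<String> searchTerms, final SearchDao dao)
    {
        final Collection<ResultData> results = new LinkedList<>();
        for (final String term : searchTerms)
        {
            results.addAll(dao.findMatching(term)); // a record matching two terms appears twice
        }
        return results;
    }
}
```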
Now this was a little too naive, as duplicate objects were present when a record was returned for multiple terms. Changing the LinkedList to a Set resolved this (overriding both hashCode and equals to ensure each ResultData object is unique, and the hashCode is deterministic).

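With hypothetical stand-in types again, the de-duplicating fix might look like:

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of the de-duplicated version: equals and hashCode are overridden
// (deterministically, on the unique id) so a HashSet can drop duplicates.
// Stand-in type, not the site's real code.
class DedupedResultData
{
    final long id;
    final String title;

    DedupedResultData(final long id, final String title)
    {
        this.id = id;
        this.title = title;
    }

    @Override
    public boolean equals(final Object o)
    {
        return o instanceof DedupedResultData && ((DedupedResultData) o).id == id;
    }

    @Override
    public int hashCode()
    {
        return Long.hashCode(id); // deterministic: derived from the id, not object identity
    }
}
```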

Now for the more interesting part: returning the results in a semi-useful order. Giving each returned record a weighting based on how many search terms returned it seemed pretty suitable (the higher the weighting, the more likely the record is the right one for the customer).

Simply use a Map<ResultData, Integer>, incrementing the integer value for a ResultData key if the key already exists, or putting a new ResultData instance into the map with an integer value of one.

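A sketch of that shape, with the same stand-in types (the original search method apparently returned null at this stage, so here the helper returns the raw weight map instead, just so it can be inspected):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the weighted version with hypothetical stand-in types.
class WeightedSearch
{
    interface SearchDao
    {
        List<ResultData> findMatching(String term);
    }

    static class ResultData
    {
        final long id;
        final String title;

        ResultData(final long id, final String title)
        {
            this.id = id;
            this.title = title;
        }

        @Override
        public boolean equals(final Object o)
        {
            return o instanceof ResultData && ((ResultData) o).id == id;
        }

        @Override
        public int hashCode()
        {
            return Long.hashCode(id);
        }
    }

    static Map<ResultData, Integer> weigh(final List<String> searchTerms, final SearchDao dao)
    {
        final Map<ResultData, Integer> results = new HashMap<>();
        for (final String term : searchTerms) // nested loops: over the terms...
        {
            for (final ResultData record : dao.findMatching(term)) // ...and the database results
            {
                final Integer weight = results.get(record); // null if not seen before
                results.put(record, weight == null ? 1 : weight + 1);
            }
        }
        return results; // the ordering question is still unsolved at this point
    }
}
```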

There are a few things to note in the above implementation. First, results has been changed from a Collection to a Map (Map is part of the collections framework but does not extend Collection). Secondly, there is now a nested for-loop that iterates over the database results, checking whether each ResultData record has been returned previously (results.get(record) returns null if the record has not been seen before), and adding the record to the map with an updated weighting. And finally, the search method now returns null. Well, that doesn't sound right. A sorted map is sorted by key (a HashMap is not suitable here, as the ordering of its key-value pairs is non-deterministic), whilst we want ordering by descending value.

The next step would be to create another map with a bespoke comparator for ResultData that takes in the initial map of results. This comparator could look up the value for each key and so impose an order on the second map. This sounds quite horrible, and I really want to go have some tea, so I am vetoing this solution and instead going hunting for some other data types that may be useful to me.

After some quick searching, I decided that using Google Collections (now known as Guava) would be the way forward (and fun to try something new). Reviewing the API resulted in me discovering TreeMultimap, and rubbing my hands together rather happily.
Implementation of Multimap whose keys and values are ordered by their natural ordering or by supplied comparators. In all cases, this implementation uses Comparable.compareTo(T) or Comparator.compare(T, T) instead of Object.equals(java.lang.Object) to determine equivalence of instances.

Time to create a comparator for ResultData, it seems, and start playing with TreeMultimap.


The ResultDataComparator simply orders ResultData alphabetically by comparing titles (and, if they're equal, unique ids).

The new search algorithm is more interesting. It iterates over each record returned by the database and checks whether the record already exists on the value side of the map (note that it is now Multimap<Integer, ResultData> and not Map<ResultData, Integer>). If the record has already been added to the map, then we find its key (i.e. its weighting) by iterating over the key set, and add the record under a key one higher. If the record does not already exist in the map, we add it with a weighting of 1. Every entry in the map is ordered by key (descending order), and then by the ResultDataComparator. By calling sortedResults.values(), you receive the search results in descending weighted order (and subsequently in alphabetical order).

The way Multimap works, when you add a value under a key that is already present in the map, it chains that value onto the end of the values that already exist for that key. This allowed me to create weighting "buckets", where records are put into buckets based on how frequently they are returned by the database.
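Guava's TreeMultimap handles that chaining internally. As a dependency-free sketch of the same bucketing idea (not the Guava code the post uses), a TreeMap from weight to a sorted set of records gives the same descending-weight, then alphabetical, traversal. Types and names here are stand-ins:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Sketch of weight "buckets": keys are weights in descending order, and each
// bucket keeps its records in alphabetical order by title, then by id.
// Stand-in types; the post itself uses Guava's TreeMultimap for this.
class BucketedResults
{
    static class ResultData
    {
        final long id;
        final String title;

        ResultData(final long id, final String title)
        {
            this.id = id;
            this.title = title;
        }
    }

    static final Comparator<ResultData> BY_TITLE_THEN_ID =
        Comparator.comparing((ResultData r) -> r.title).thenComparingLong(r -> r.id);

    static List<ResultData> sortByWeight(final Map<ResultData, Integer> weights)
    {
        final TreeMap<Integer, TreeSet<ResultData>> buckets = new TreeMap<>(Comparator.reverseOrder());
        for (final Map.Entry<ResultData, Integer> entry : weights.entrySet())
        {
            buckets.computeIfAbsent(entry.getValue(), weight -> new TreeSet<>(BY_TITLE_THEN_ID))
                   .add(entry.getKey());
        }
        final List<ResultData> ordered = new ArrayList<>();
        for (final TreeSet<ResultData> bucket : buckets.values())
        {
            ordered.addAll(bucket); // highest weight first, then alphabetical within a bucket
        }
        return ordered;
    }
}
```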

I haven't performance tested this solution yet, but I suspect it will suffer the same pitfalls as many queueing strategies, where the majority of the elements sit under keys 1 and 2, making the checks on the value sets of those keys rather expensive.

This is only an initial implementation of a searching algorithm that provides results in a semi-useful order, and can no doubt use some more work!!