- Install Ubuntu
- Use the boot disk to load a live session (Try Ubuntu)
- Mount the EFI partition
sudo mount /dev/sda1 /mnt cd /mnt/EFI
- Create a Microsoft boot directory
sudo mkdir -p Microsoft/Boot
- Copy the Ubuntu boot files into the Microsoft boot directory
sudo cp -r ubuntu/* Microsoft/Boot/
- Create the boot file Microsoft wants (assumes not using Secure Boot)
sudo cp Microfsoft/Boot/grubx64.efi Microsoft/Boot/bootmgfw.efi
- Unmount partition
cd sudo umount /mnt
A blog by an overly energetic software developer, who has a knack for talking too much.
Sunday, 2 October 2016
Tricking Ubuntu to boot on a Sony Vaio Pro 13
This is just a reminder to myself for the next time I install / update Ubuntu on my laptop that I need to do the following after installation, otherwise Ubuntu won't boot. Assumes that Secure Boot is disabled and using UEFI boot.
Friday, 3 June 2016
Don't just assume arrays will be faster: how we implemented object pooling
Update (29/06/2016): As Mike Barker kindly pointed out, the array backed object pool was not pre-populating the array with objects so the benchmark was going through some extra logic. I've since updated the results and graphs with the array pre-populated, which hasn't greatly changed the results.
To ensure customers receive the best service possible at TransFICC we need to make sure that our systems have robust latency and throughput figures. A key part of avoiding latency spikes (in Java anyway) is to attempt to keep the Garbage Collector at bay. One way we attempt to do this is with the use of object pooling pattern to allow us to re-use objects.
We weren't sure what data structure would be best to back our object pool implementation so we decided to write a few benchmarks that utilise various data structures: stack, dequeue based stack, queue, and an array. The goal of these benchmarks was to see if any of the data structures allocate (e.g. if using a LinkedList, a node would be created everytime you added an object to the pool), and to see what the latency and throughput values would be.
The benchmark results for measuring throughput show that both of the deque backed object pools perform at a similar level for both of the scenarios we are testing (removing and adding a single object, removing and adding several objects), with our stack implementation having similar throughput for the removing and adding several objects scenario, taking into account the error.
What is more interesting is the graphing of the latency figures.
Both object pools that are backed by a deque have very similar latency profiles, which I was surprised at as I expected the the stack implementation to outperform on the remove one add one strategy as it would only be using a single object (the queue implementation will always be taking a different object from the front of the queue). The stack object pool does perform slightly better in the remove several add several scenario at one data point, but the difference isn't significant compared to the queue implementation.
What I found most surprising is the high variance in the array backed object pool. I assumed using a simple array would yield the best throughput and latency numbers, however this is not the case. I'm not sure if it's because of my implementation of the array object pool so will try another version if I have time.
In the end we opted to use an object pool using a stack backed by a deque based on our findings. I should point out that all our object pools are accessed from a single thread so are not required to be thread safe.
The benchmarks were run on a Fedora 23 box with 6 core Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz and 32GB RAM.
To ensure customers receive the best service possible at TransFICC we need to make sure that our systems have robust latency and throughput figures. A key part of avoiding latency spikes (in Java anyway) is to attempt to keep the Garbage Collector at bay. One way we attempt to do this is with the use of object pooling pattern to allow us to re-use objects.
We weren't sure what data structure would be best to back our object pool implementation so we decided to write a few benchmarks that utilise various data structures: stack, dequeue based stack, queue, and an array. The goal of these benchmarks was to see if any of the data structures allocate (e.g. if using a LinkedList, a node would be created everytime you added an object to the pool), and to see what the latency and throughput values would be.
The benchmark results for measuring throughput show that both of the deque backed object pools perform at a similar level for both of the scenarios we are testing (removing and adding a single object, removing and adding several objects), with our stack implementation having similar throughput for the removing and adding several objects scenario, taking into account the error.
What is more interesting is the graphing of the latency figures.
Both object pools that are backed by a deque have very similar latency profiles, which I was surprised at as I expected the the stack implementation to outperform on the remove one add one strategy as it would only be using a single object (the queue implementation will always be taking a different object from the front of the queue). The stack object pool does perform slightly better in the remove several add several scenario at one data point, but the difference isn't significant compared to the queue implementation.
What I found most surprising is the high variance in the array backed object pool. I assumed using a simple array would yield the best throughput and latency numbers, however this is not the case. I'm not sure if it's because of my implementation of the array object pool so will try another version if I have time.
In the end we opted to use an object pool using a stack backed by a deque based on our findings. I should point out that all our object pools are accessed from a single thread so are not required to be thread safe.
The benchmarks were run on a Fedora 23 box with 6 core Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz and 32GB RAM.
Benchmark code can be found here.
Throughput
Benchmark Mode Cnt Score Error Units StackObjectPoolUsingDequeBenchmark.removeAndAddOne thrpt 10 96.785 ± 0.076 ops/us StackObjectPoolUsingDequeBenchmark.removeAndAddOne:·gc.alloc.rate thrpt 10 0.002 ± 0.006 MB/sec StackObjectPoolUsingDequeBenchmark.removeAndAddOne:·gc.alloc.rate.norm thrpt 10 ≈ 10⁻⁵ B/op StackObjectPoolUsingDequeBenchmark.removeAndAddOne:·gc.count thrpt 10 ≈ 0 counts QueueObjectPoolUsingDequeBenchmark.removeAndAddOne thrpt 10 96.558 ± 0.614 ops/us QueueObjectPoolUsingDequeBenchmark.removeAndAddOne:·gc.alloc.rate thrpt 10 0.002 ± 0.006 MB/sec QueueObjectPoolUsingDequeBenchmark.removeAndAddOne:·gc.alloc.rate.norm thrpt 10 ≈ 10⁻⁵ B/op QueueObjectPoolUsingDequeBenchmark.removeAndAddOne:·gc.count thrpt 10 ≈ 0 counts StackObjectPoolUsingArrayBenchmark.removeAndAddOne thrpt 10 88.930 ± 0.065 ops/us StackObjectPoolUsingArrayBenchmark.removeAndAddOne:·gc.alloc.rate thrpt 10 0.002 ± 0.006 MB/sec StackObjectPoolUsingArrayBenchmark.removeAndAddOne:·gc.alloc.rate.norm thrpt 10 ≈ 10⁻⁵ B/op StackObjectPoolUsingArrayBenchmark.removeAndAddOne:·gc.count thrpt 10 ≈ 0 counts StackObjectPoolBenchmark.removeAndAddOne thrpt 10 16.276 ± 0.004 ops/us StackObjectPoolBenchmark.removeAndAddOne:·gc.alloc.rate thrpt 10 0.002 ± 0.006 MB/sec StackObjectPoolBenchmark.removeAndAddOne:·gc.alloc.rate.norm thrpt 10 ≈ 10⁻⁴ B/op StackObjectPoolBenchmark.removeAndAddOne:·gc.count thrpt 10 ≈ 0 counts StackObjectPoolUsingDequeBenchmark.removeAndAddSeveral thrpt 10 38.524 ± 0.012 ops/us StackObjectPoolUsingDequeBenchmark.removeAndAddSeveral:·gc.alloc.rate thrpt 10 0.002 ± 0.006 MB/sec StackObjectPoolUsingDequeBenchmark.removeAndAddSeveral:·gc.alloc.rate.norm thrpt 10 ≈ 10⁻⁴ B/op StackObjectPoolUsingDequeBenchmark.removeAndAddSeveral:·gc.count thrpt 10 ≈ 0 counts QueueObjectPoolUsingDequeBenchmark.removeAndAddSeveral thrpt 10 38.108 ± 0.255 ops/us QueueObjectPoolUsingDequeBenchmark.removeAndAddSeveral:·gc.alloc.rate thrpt 10 0.002 ± 0.006 MB/sec QueueObjectPoolUsingDequeBenchmark.removeAndAddSeveral:·gc.alloc.rate.norm thrpt 10 ≈ 10⁻⁴ B/op QueueObjectPoolUsingDequeBenchmark.removeAndAddSeveral:·gc.count thrpt 10 ≈ 0 counts StackObjectPoolUsingArrayBenchmark.removeAndAddSeveral thrpt 10 39.241 ± 1.179 ops/us StackObjectPoolUsingArrayBenchmark.removeAndAddSeveral:·gc.alloc.rate thrpt 10 0.002 ± 0.006 MB/sec StackObjectPoolUsingArrayBenchmark.removeAndAddSeveral:·gc.alloc.rate.norm thrpt 10 ≈ 10⁻⁴ B/op StackObjectPoolUsingArrayBenchmark.removeAndAddSeveral:·gc.count thrpt 10 ≈ 0 counts StackObjectPoolBenchmark.removeAndAddSeveral thrpt 10 5.479 ± 0.036 ops/us StackObjectPoolBenchmark.removeAndAddSeveral:·gc.alloc.rate thrpt 10 0.002 ± 0.006 MB/sec StackObjectPoolBenchmark.removeAndAddSeveral:·gc.alloc.rate.norm thrpt 10 ≈ 10⁻³ B/op StackObjectPoolBenchmark.removeAndAddSeveral:·gc.count thrpt 10 ≈ 0 counts
Latency
Remove One Add One Scenario
Percentile StackObjectPoolUsingDequeBenchmark QueueObjectPoolUsingDequeBenchmark StackObjectPoolUsingArrayBenchmark StackObjectPoolBenchmark 0 0.558 0.558 0.558 0.558 50 0.559 0.559 0.559 0.629 90 0.629 0.629 0.629 0.629 95 0.629 0.629 0.629 0.698 99 0.629 0.629 0.629 0.699 99.9 1.118 1.118 1.118 1.118 99.99 10.88 6.842 10.752 10.964 99.999 11.872 11.792 11.872 12.057 99.9999 17.991 16.591 51.3 43.027 100 18.144 17.92 58.56 44.416 Mean 0.588 0.59 0.586 0.628 Error (us/op) 0.001 0.001 0.001 0.001
Remove Several Add Several Scenario
Percentile StackObjectPoolUsingDequeBenchmark QueueObjectPoolUsingDequeBenchmark StackObjectPoolUsingArrayBenchmark StackObjectPoolBenchmark 0 0.558 0.558 0.558 0.698 50 0.628 0.628 0.628 0.768 90 0.629 0.629 0.629 0.769 95 0.629 0.629 0.629 0.769 99 0.629 0.629 0.629 0.769 99.9 1.118 1.118 1.118 1.258 99.99 10.816 6.6 10.816 11.232 99.999 12.006 11.872 11.99 12.312 99.9999 14.389 19.716 29.362 67.966 100 14.512 22.4 35.712 69.376 Mean 0.603 0.605 0.604 0.746 Error (us/op) 0.001 0.001 0.001 0.001
Labels:
Java,
Performance
Friday, 15 April 2016
So.....Gradle
My previous post mentioned that we are currently using Maven as our build system but looking at moving across to Buck. Well, there has been a slight change of plan and we've moved to using Gradle. The main reason we opted for Gradle over Buck is its integration with Intellij, which was some what of a pain point when I was working at LMAX.
Migrating from Gradle to Maven was super easy, and has allowed us to simplify the various tests we had defined; Maven becomes very complicated if you try and do anything beyond their defined build tasks, which was very annoying for us as we have multiple test definitions being unit tests and integration tests (dao, acceptance, benchmark, perf, compatibility.....). This had resulted in some serious hackery (note: completely my fault).
As an example, this is how we ran DAO tests in the Maven world:
And this is how we run them with Gradle:
Wrapper script:
build.gradle:
Nice and simple!
Migrating from Gradle to Maven was super easy, and has allowed us to simplify the various tests we had defined; Maven becomes very complicated if you try and do anything beyond their defined build tasks, which was very annoying for us as we have multiple test definitions being unit tests and integration tests (dao, acceptance, benchmark, perf, compatibility.....). This had resulted in some serious hackery (note: completely my fault).
As an example, this is how we ran DAO tests in the Maven world:
And this is how we run them with Gradle:
Wrapper script:
build.gradle:
Nice and simple!
Thursday, 14 April 2016
Keep it Simple - Using a bootstrap script to wrap your build system
Do you know what's annoying when you start at a new company (or do a clean checkout of some code)? Getting your development environment set up; especially when it comes to making sure you have the right software in place to be able to build said code!
I'm sure there are lots of ways to solve these problems, but a simple solution we use at TransFICC is to use a bootstrap script that wraps our build system. This is called everytime you want to run something via the build system (in our case Maven [we haven't buck'd up]), and ensures all build dependencies are present. I should note that we only use this for build dependencies, not run time dependencies.
bootstrap.sh
Our build dependencies are Java, Node (well, npm), Buck, and Maven. This script assumes the dependencies live at http://thearts.internal.transficc.com/artifacts/ (e.g. the jdk is located at http://thearts.internal.transficc.com/artifacts/jdk/jdk1.8.0_74.tar.gz and the directory contained inside the tar.gz is jdk1.8.0_74). The checksums check for a change of dependencies rather than ensuring they are downloaded correctly.
This script could be used as follows.
transficc.sh
I'm sure there are lots of ways to solve these problems, but a simple solution we use at TransFICC is to use a bootstrap script that wraps our build system. This is called everytime you want to run something via the build system (in our case Maven [we haven't buck'd up]), and ensures all build dependencies are present. I should note that we only use this for build dependencies, not run time dependencies.
bootstrap.sh
Our build dependencies are Java, Node (well, npm), Buck, and Maven. This script assumes the dependencies live at http://thearts.internal.transficc.com/artifacts/ (e.g. the jdk is located at http://thearts.internal.transficc.com/artifacts/jdk/jdk1.8.0_74.tar.gz and the directory contained inside the tar.gz is jdk1.8.0_74). The checksums check for a change of dependencies rather than ensuring they are downloaded correctly.
This script could be used as follows.
transficc.sh
./transficc.sh compile
Subscribe to:
Posts (Atom)