Splitting Compilation and Execution in v8go

V8 is Google’s open source high-performance JavaScript and WebAssembly engine, written in C++. The V8 engine orchestrates isolates where an isolate is equivalent to a sandbox for running JS code. Cloudflare has a great explanation so I won’t rehash it here.

v8go is a library written in Go and C++ that allows users to execute JavaScript from Go using V8 isolates. Using Cgo bindings means we can run JavaScript from Go at close to native performance.

Why would someone want to render JS from a Go server, you might be wondering. Go is a pretty phenomenal language for web servers: it scales well, and its concurrency features play a big role in that. There are other approaches to this; Deno, for example, was originally written in Go and then switched to Rust to maximize performance. At the end of the day, if you choose Go for this, you’ll probably end up using v8go, and you want the library to support you in making the best performance decisions.

As with any software, it’s important to evolve and to find ways to amortize and reduce the cost of repeated actions. One of the major changes I contributed to the v8go library had this goal in mind: splitting the compilation of code from its execution. This allows users to cache the compiled bits (in memory or elsewhere) and reuse them whenever the code needs to be executed, instead of recompiling every time. Here’s how and why I made those changes.

V1: Compile + Run Every Time

Let’s start with the initial API supported by v8go for compiling and running JavaScript from Go. You would first initialize a V8 Isolate and a V8 Context, and then use the context’s RunScript function to execute the provided code in that V8 Isolate. It would look like:

If you cared about the return value of the code, you could handle the returned V8 Value:

Under the hood, RunScript takes the provided JavaScript, compiles it into a V8 Script bound to the V8 Context, runs it, and returns the Go version of the resulting JS value, all inside v8go’s C++ layer.

Essentially, because of the way v8go’s RunScript is written, you pass your code in and get your result back, but each time you run the same code it has to be recompiled. V8 has some internal optimizations that keep a cache of the compiled data so it can be reused within the same isolate, but across isolates or across processes the code gets recompiled.

For large, complex pieces of code, it’s obvious that having to recompile each time is a pain we should avoid. Even if an individual piece doesn’t take particularly long, if you have many pieces that need to be executed every time (like polyfills), you’re unable to amortize the cost in this fashion. Folks try to work around this by reusing the same V8 Isolate for as much as they can and distinguishing different requests by their V8 Contexts, but this hits a separate issue in v8go, one that still stands today: a “memory leak”.

To investigate v8go performance, let’s start with a simple example. We can take a URL polyfill and measure how long it takes to compile and run:

On average, this takes ~3.0ms.

If we do this concurrently, as we would when handling different requests in a Go server, with each request getting its own V8 Isolate and Context, then that average bumps to ~6.0ms.

V2: Compile Then Run As Much As You Want

Now that we have a baseline of performance for the existing RunScript implementation, let’s experiment with the idea of splitting compilation and execution, and measure the performance benefits.

V8 has a few APIs that support this flow. The important one is that V8 can compile an unbound script in a V8 Isolate (unbound meaning the script is not bound to a V8 Context). The V8 ScriptCompiler can then create a “code cache” from the unbound script, and the cached data (the compiled code) can be extracted as bytes. To use those bytes, they can be passed back in as compile options to any new V8 Isolate.

In v8go, the API looks like:

Under the hood, the C++ changes are pretty straightforward.

With this change, we can do some performance tests.

So, for the case where everything runs sequentially, compiling once and reusing the cached data on every run:

This averages ~2.2ms.

In the concurrent case, we see a run time of ~3.0ms.

Every ms counts

In v8go, we kept the original ctx.RunScript API to enable one-off or occasional executions of a script. In these cases it’s fine to compile and run when you need to.

However, when your system is particularly sensitive to performance, every millisecond counts. This example only shows a decrease from 6ms to 3ms, but there are plenty of cases where larger script inputs or repeated usage of many common libraries could make this improvement quite significant. Hopefully you found this useful and you can find similar ways to amortize those costs in your own codebases because… Speed. Is. Everything.
