Saturday, January 2, 2016

[Quick Note] OrcJIT in LLVM

JIT (Just-in-Time) compilation has become more and more popular in recent years due to the rising demand for better performance in dynamically typed languages, JavaScript and Python to name a few.

However, LLVM, which was developed mainly for statically typed languages like C/C++, also joined the JIT battlefield a few years ago. Applying LLVM to dynamic compilation has some drawbacks: long-running optimizations increase latency, and its type system, originally designed for static rather than dynamic typing, is inadequate. That hasn't stopped the community from working on this subsystem, though, and more and more enhancements keep coming, such as native support for patch points and stack maps, which debuted in version 3.8 as experimental features.

There are mainly two kinds of compilation strategies: method-based and trace-based. The difference between them lies in the compilation scope they adopt. As the name suggests, the method-based approach takes functions or methods as the minimum compilation unit; Google's V8 JavaScript engine is a well-known example. Trace-based JITs use traces, which we can think of as a range of control flow, as the compilation unit; Firefox's TraceMonkey and LuaJIT take this approach. Which of them is better may require several academic papers to explain, and the debate still goes on today.

Back to LLVM: the main JIT implementation is called MCJIT. It adopts the popular MC framework within LLVM, which is essentially LLVM's own assembler, as its backend to gain more flexibility. MCJIT uses neither of the strategies mentioned above; instead, it uses an LLVM Module as the compilation unit. You can use the llvm::EngineBuilder class to build an llvm::ExecutionEngine, which is the main JIT interface, then either retrieve compiled symbols or run compiled functions directly through its rich APIs such as getPointerToFunction or runFunction.

There are several drawbacks to using a module as the compilation unit. In short, a module's scope is too big compared with a method or a trace, so the compilation procedure may take too much time and increase latency. That is what OrcJIT tries to fix.

Orc is shorthand for on-request compiling. As the name suggests, it lazily defers the compilation of each function within a module until that function is used. OrcJIT is also built from scratch rather than on top of MCJIT. Moreover, it introduces a concept called a Layer. A Layer is like an LLVM IR pass, but it handles a step of the compilation procedure rather than IR. It provides the flexibility to build your own JIT compilation flow. Nevertheless, OrcJIT still lacks a central layer manager similar to llvm::PassManager, and even a common layer interface! In other words, we can only construct our own compilation "stack" by passing a base layer instance as the first argument of another layer's constructor, and none of the layer classes inherit from a common parent class as an interface. For example, the lli program constructs its compilation stack as below:
OrcLazyJIT(std::unique_ptr<TargetMachine> TM,
           std::unique_ptr<CompileCallbackMgr> CCMgr,
           IndirectStubsManagerBuilder IndirectStubsMgrBuilder,
           bool InlineStubs)
    : TM(std::move(TM)), DL(this->TM->createDataLayout()),
      CCMgr(std::move(CCMgr)),
      ObjectLayer(),
      CompileLayer(ObjectLayer, orc::SimpleCompiler(*this->TM)),
      IRDumpLayer(CompileLayer, createDebugDumper()),
      CODLayer(IRDumpLayer, extractSingleFunction, *this->CCMgr,
               std::move(IndirectStubsMgrBuilder), InlineStubs),
      CXXRuntimeOverrides(
          [this](const std::string &S) { return mangle(S); }) {}

The compilation flow runs in this order: ObjectLayer, CompileLayer, IRDumpLayer, CODLayer, where CODLayer is an instance of CompileOnDemandLayer and is treated as the highest-level class in OrcJIT. We use the CODLayer to add a module via addModuleSet and invoke findSymbol to retrieve compiled function symbols. OrcJIT defers the compilation of each function until we call findSymbol.
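Usage of this stack can be sketched as follows. This is not self-contained code; it assumes the OrcLazyJIT class from lli above (whose addModule/findSymbol methods forward to the CODLayer), and the module M and its "main" function are hypothetical:

```cpp
// Nothing is compiled when the module is added: CODLayer only emits
// indirect stubs. The actual compilation of "main" happens when its
// symbol is first looked up and called.
OrcLazyJIT J(std::move(TM), std::move(CCMgr),
             std::move(IndirectStubsMgrBuilder), /*InlineStubs=*/true);

J.addModule(std::move(M));            // cheap: no codegen yet
auto Sym = J.findSymbol("main");      // triggers on-demand compilation
auto Main = reinterpret_cast<int (*)()>(Sym.getAddress());
int Ret = Main();                     // later calls reuse the compiled code
```

The payoff is that functions in M that are never reached are never compiled at all, which is exactly the latency win over MCJIT's whole-module approach.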

A pro of OrcJIT is the smaller compilation region, which brings the potential to add a profiler for detecting hot methods, a technique used in nearly every dynamic compilation framework nowadays. On the other hand, OrcJIT does not use the llvm::ExecutionEngine interface. Although it provides llvm::orc::OrcMCJITReplacement as a wrapper, the main author tends to use Orc's own interface, which has raised lots of questions and arguments on the mailing list ever since OrcJIT's debut. What's more, as previously mentioned, OrcJIT's layers still have no common interface or layer manager.

I was pretty excited when I saw OrcJIT, because a smaller compilation unit is exactly where trace-based JITs beat method-based ones. It looks like LLVM has gained another piece of armor on the battlefield of dynamic compilation.
