INTERNALS.md - platform/build/kati - Git at Google

 Kati internals
 ==============

 This is an informal document about internals of kati. This document is not meant
 to be a comprehensive document of kati or GNU make. This explains some random
 topics which other programmers may be interested in.

 Motivation
 ----------

 The motivation of kati was to speed up Android platform build. Especially, its
 incremental build time was the main focus. Android platform's build system is a
 very unique system. It provides a DSL, (ab)using Turing-completeness of GNU
 make. The DSL allows developers to write build rules in a descriptive way, but
 the downside is it's complicated and slow.

 When we say a build system is slow, we consider "null build" and "full
 build". Null build is a build which does nothing, because all output files are
 already up-to-date. Full build is a build which builds everything, because there
 were nothing which have been already built. Actual builds in daily development
 are somewhere between null build and full build. Most benchmarks below were done
 for null build.

 For Android with my fairly beefy workstation, null build took ~100 secs with GNU
 make. This means you needed to wait ~100 secs to see if there's a compile error
 when you changed a single C file. To be fair, things were not that bad. There
 are tools called mm/mmm. They allow developers to build an individual module. As
 they ignore dependencies between modules, they are fast. However, you need to be
 somewhat experienced to use them properly. You should know which modules will be
 affected by your change. It would be nicer if you can just type "make" whenever
 you change something.

 This is why we started this project. We decided to create a GNU make clone from
 scratch, but there were some other options. One option was to replace all
 Android.mk by files with a better format. There is actually a longer-term
 project for this. Kati was planned to be a short-term project. Another option
 was to hack GNU make instead of developing a clone. We didn't take this option
 because we thought the source code of GNU make is somewhat complicated due to
 historical reason. It's written in old-style C, has a lot of ifdefs for some
 unknown architectures, etc.

 Currently, kati's main mode is --ninja mode. Instead of executing build commands
 by itself, kati generates build.ninja file and
 [ninja](https://github.com/martine/ninja) actually runs commands. There were
 some back-and-forths before kati became the current form. Some experiments
 succeeded and some others failed. We even changed the language for kati. At
 first, we wrote kati in Go. We naively expected we can get enough performance
 with Go. I guessed at least one of the following statements are true: 1. GNU
 make is not very optimized for computation heavy Makefiles, 2. Go is fast for
 our purpose, or 3. we can come up with some optimization tricks for Android's
 build system. As for 3, some of such optimization succeeded but it's performance
 gain didn't cancel the slowness of Go.

 Go's performance would be somewhat interesting topic. I didn't study the
 performance difference in detail, but it seemed both our use of Go and Go
 language itself were making the Go version of kati slower. As for our fault, I
 think Go version has more unnecessary string allocations than C++ version
 has. As for Go itself, it seemed GC was the main show-stopper. For example,
 Android's build system defines about one million make variables, and buffers for
 them will be never freed. IIRC, this kind of allocation pattern isn't good for
 non-generational GC.

 Go version and test cases were written by ukai and me, and C++ rewrite was done
 mostly by me. The rest of this document is mostly about the C++ version.

 Overall architecture
 --------------------

 Kati consists of the following components:

 * Parser
 * Evaluator
 * Dependency builder
 * Executor
 * Ninja generator

 A Makefile has some statements which consist of zero or more expressions. There
 are two parsers and two evaluators - one for statements and the other for
 expressions.

 Most of users of GNU make may not care about the evaluator much. However, GNU
 make's evaluator is very powerful and is Turing-complete. For Android's null
 build, most time is spent in this phase. Other tasks, such as building
 dependency graphs and calling stat function for build targets, are not the
 bottleneck. This would be a very Android specific characteristics. Android's
 build system uses a lot of GNU make black magics.

 The evaluator outputs a list of build rules and a variable table. The dependency
 builder creates a dependency graph from the list of build rules. Note this step
 doesn't use the variable table.

 Then either executor or ninja generator will be used. Either way, kati runs its
 evaluator again for command lines. The variable table is used again for this
 step.

 We'll look at each components closely. GNU make is a somewhat different language
 from modern languages. Let's see.

 Parser for statements
 ---------------------

 I'm not 100% sure, but I think GNU make parses and evaluates Makefiles
 simultaneously, but kati has two phases for parsing and evaluation. The reason
 of this design is for performance. For Android build, kati (or GNU make) needs
 to read ~3k files ~50k times. The file which is read most often is read ~5k
 times. It's waste of time to parse such files again and again. Kati can re-use
 parsed results when it needs to evaluate a Makefile second time. If we stop
 caching the parsed results, kati will be two times slower for Android's
 build. Caching parsed statements is done in *file_cache.cc*.

 The statement parser is defined in *parser.cc*. In kati, there are four kinds of
 statements:

 * Rules
 * Assignments
 * Commands
 * Make directives

 Data structures for them are defined in *stmt.h*. Here are examples of these
 statements:

     VAR := yay!      # An assignment
     all:             # A rule
     	echo $(VAR)  # A command
     include xxx.mk   # A make directive (include)

 In addition to include directive, there are ifeq/ifneq/ifdef/ifndef directives
 and export/unexport directives. Also, kati internally uses "parse error
 statement". As GNU make doesn't show parse errors in branches which are not
 taken, we need to delay parse errors to evaluation time.

 ### Context dependent parser

 A tricky point of parsing make statements is that the parsing depends on the
 context of the evaluation. See the following Makefile chunk for example:

     $(VAR)
     	X=hoge echo $${X}

 You cannot tell whether the second line is a command or an assignment until
 *$(VAR)* is evaluated. If *$(VAR)* is a rule statement, the second line is a
 command and otherwise it's an assignment. If the previous line is

     VAR := target:

 the second line will turn out to be a command.

 For some reason, GNU make expands expressions before it decides the type of
 a statement only for rules. Storing assignments or directives in a variable
 won't work as assignments or directives. For example

     ASSIGN := A=B
     $(ASSIGN):

 doesn't assign "*B:*" to *A*, but defines a build rule whose target is *A=B*.

 Anyway, as a line starts with a tab character can be either a command statement
 or other statements depending on the evaluation result of the previous line,
 sometimes kati's parser cannot tell the statement type of a line. In this case,
 kati's parser speculatively creates a command statement object, keeping the
 original line. If it turns out the line is actually not a command statement,
 the evaluator re-runs the parser.

 ### Line concatenations and comments

 In most programming languages, line concatenations by a backslash character and
 comments are handled at a very early stage of a language
 implementation. However, GNU make changes the behavior for them depending on
 parse/eval context. For example, the following Makefile outputs "has space" and
 "hasnospace":

     VAR := has\
     space
     all:
     	echo $(VAR)
     	echo has\
     nospace

 GNU make usually inserts a whitespace between lines, but for command lines it
 doesn't. As we've seen in the previous subsection, sometimes kati cannot tell
 a line is a command statement or not. This means we should handle them after
 evaluating statements. Similar discussion applies for comments. GNU make usually
 trims characters after '#', but it does nothing for '#' in command lines.

 We have a bunch of comment/backslash related testcases in the testcase directory
 of kati's repository.

 Parser for expressions
 ----------------------

 A statement may have one or more expressions. The number of expressions in a
 statement depends on the statement's type. For example,

     A := $(X)

 This is an assignment statement, which has two expressions - *A* and
 *$(X)*. Types of expressions and their parser are defined in *expr.cc*. Like
 other programming languages, an expression is a tree of expressions. The type of
 a leaf expression is either literal, variable reference,
 [substitution references](http://www.gnu.org/software/make/manual/make.html#Substitution-Refs),
 or make functions.

 As written, backslashes and comments change their behavior depending on the
 context. Kati handles them in this phase. *ParseExprOpt* is the enum for the
 contexts.

 As a nature of old systems, GNU make is very permissive. For some reason, it
 allows some kind of unmatched pairs of parentheses. For example, GNU make
 doesn't think *$($(foo)* is an error - this is a reference to variable
 *$(foo*. If you have some experiences with parsers, you may wonder how one can
 implement a parser which allows such expressions. It seems GNU make
 intentionally allows this:

 http://git.savannah.gnu.org/cgit/make.git/tree/expand.c#n285

 No one won't use this feature intentionally. However, as GNU make allows this,
 some Makefiles have unmatched parentheses, so kati shouldn't raise an error for
 them, unfortunately.

 GNU make has a bunch of functions. Most users would use only simple ones such as
 *$(wildcard ...)* and *$(subst ...)*. There are also more complex functions such
 as *$(if ...)* and *$(call ...)*, which make GNU make Turing-complete. Make
 functions are defined in *func.cc*. Though *func.cc* is not short, the
 implementation is fairly simple. There is only one weirdness I remember around
 functions. GNU make slightly changes its parsing for *$(if ...)*, *$(and ...)*,
 and *$(or ...)*. See *trim_space* and *trim_right_space_1st* in *func.h* and how
 they are used in *expr.cc*.

 Evaluator for statements
 ------------------------

 Evaluator for statements are defined in *eval.cc*. As written, there are four
 kinds of statements:

 * Rules
 * Assignments
 * Commands
 * Make directives

 There is nothing tricky around commands and make directives. A rule statement
 have some forms and should be parsed after evaluating expression by the third
 parser. This will be discussed in the next section.

 Assignments in GNU make is tricky a bit. There are two kinds of variables in GNU
 make - simple variables and recursive variables. See the following code snippet:

     A = $(info world!)   # recursive
     B := $(info Hello,)  # simple
     $(A)
     $(B)

 This code outputs "Hello," and "world!", in this order. The evaluation of
 a recursive variable is delayed until the variable is referenced. So the first
 line, which is an assignment of a recursive variable, outputs nothing. The
 content of the variable *$(A)* will be *$(info world!)* after the first
 line. The assignment in the second line uses *:=* which means this is a simple
 variable assignment. For simple variables, the right hand side is evaluated
 immediately. So "Hello," will be output and the value of *$(B)* will be an empty
 string ($(info ...) returns an empty string). Then, "world!" will be shown when
 the third line is evaluated as *$(A)* is evaluated, and lastly the forth line
 does nothing, as *$(B)* is an empty string.

 There are two more kinds of assignments (i.e., *+=* and *?=*). These assignments
 keep the type of the original variable. Evaluation of them will be done
 immediately only when the left hand side of the assignment is already defined
 and is a simple variable.

 Parser for rules
 ----------------

 After evaluating a rule statement, kati needs to parse the evaluated result. A
 rule statement can actually be the following four things:

 * A rule
 * A [target specific variable](http://www.gnu.org/software/make/manual/make.html#Target_002dspecific)
 * An empty line
 * An error (there're non-whitespace characters without a colon)

 Parsing them is mostly done in *rule.cc*.

 ### Rules

 A rule is something like *all: hello.exe*. You should be familiar with it. There
 are several kinds of rules such as pattern rules, double colon rules, and order
 only dependencies, but they don't complicate the rule parser.

 A feature which complicates the parser is semicolon. You can write the first
 build command on the same line as the rule. For example,

     target:
     	echo hi!

 and

     target: ; echo hi!

 have the same meaning. This is tricky because kati shouldn't evaluate expressions
 in a command until the command is actually invoked. As a semicolon can appear as
 the result of expression evaluation, there are some corner cases. A tricky
 example:

     all: $(info foo) ; $(info bar)
     $(info baz)

 should output *foo*, *baz*, and then *bar*, in this order, but

     VAR := all: $(info foo) ; $(info bar)
     $(VAR)
     $(info baz)

 outputs *foo*, *bar*, and then *baz*.

 Again, for the command line after a semicolon, kati should also change how
 backslashes and comments are handled.

     target: has\
     space ; echo no\
     space

 The above example says *target* depends on two targets, *has* and *space*, and
 to build *target*, *echo nospace* should be executed.

 ### Target specific variables

 You may not familiar with target specific variables. This feature allows you to
 define variable which can be referenced only from commands in a specified
 target. See the following code:

     VAR := X
     target1: VAR := Y
     target1:
     	echo $(VAR)
     target2:
     	echo $(VAR)

 In this example, *target1* shows *Y* and *target2* shows *X*. I think this
 feature is somewhat similar to namespaces in other programming languages. If a
 target specific variable is specified for a non-leaf target, the variable will
 be used even in build commands of prerequisite targets.

 In general, I like GNU make, but this is the only GNU make's feature I don't
 like. See the following Makefile:

     hello: CFLAGS := -g
     hello: hello.o
     	gcc $(CFLAGS) $< -o $@
     hello.o: hello.c
     	gcc $(CFLAGS) -c $< -o $@

 If you run make for the target *hello*, *CFLAGS* is applied for both commands:

     $ make hello
     gcc -g -c hello.c -o hello.o
     gcc -g hello.o -o hello

 However, *CFLAGS* for *hello* won't be used when you build only *hello.o*:

     $ make hello.o
     gcc  -c hello.c -o hello.o

 Things could be even worse when two targets with different target specific
 variables depend on a same target. The build result will be inconsistent. I
 think there is no valid usage of this feature for non-leaf targets.

 Let's go back to the parsing. Like for semicolons, we need to delay the
 evaluation of the right hand side of the assignment for recursive variables. Its
 implementation is very similar to the one for semicolons, but the combination of
 the assignment and the semicolon makes parsing a bit trickier. An example:

     target1: ;X=Y echo $(X)  # A rule with a command
     target2: X=;Y echo $(X)  # A target specific variable

 Evaluator for expressions
 -------------------------

 Evaluation of expressions is done in *expr.cc*, *func.cc*, and
 *command.cc*. The amount of code for this step is fairly large especially
 because of the number of GNU make functions. However, their implementations are
 fairly straightforward.

 One tricky function is $(wildcard ...). It seems GNU make is doing some kind of
 optimization only for this function and $(wildcard ...) in commands seem to be
 evaluated before the evaluation phase for commands. Both C++ kati and Go kati
 are different from GNU make's behavior in different ways, but it seems this
 incompatibility is OK for Android build.

 There is an important optimization done for Android. Android's build system has
 a lot of $(shell find ...) calls to create a list of all .java/.mk files under a
 directory, and they are slow. For this, kati has a builtin emulator of GNU
 find. The find emulator traverses the directory tree and creates an in-memory
 directory tree. Then the find emulator returns results of find commands using
 the cached tree. For my environment, the find command emulator makes kati ~1.6x
 faster for AOSP.

 The implementations of some IO-related functions in commands are tricky in the
 ninja generation mode. This will be described later.

 Dependency builder
 ------------------

 Now we get a list of rules and a variable table. *dep.cc* builds a dependency
 graph using the list of rules. I think this step is what GNU make is supposed to
 do for normal users.

 This step is fairly complex like other components but there's nothing
 strange. There are three types of rules in GNU make:

 * explicit rule
 * implicit rule
 * suffix rule

 The following code shows the three types:

     all: foo.o
     foo.o:
     	echo explicit
     %.o:
     	echo implicit
     .c.o:
     	echo suffix

 In the above example, all of these three rules match the target *foo.o*. GNU
 make prioritizes explicit rules first. When there's no explicit rule for a
 target, it uses an implicit rule with longer pattern string. Suffix rules are
 used only when there are no explicit/implicit rules.

 Android has more than one thousand implicit rules and there are ten thousands of
 targets. It's too slow to do matching for them with a naive O(NM)
 algorithm. Kati uses a trie to speed up this step.

 Multiple rules without commands should be merged into the rule with a
 command. For example:

     foo.o: foo.h
     %.o: %.c
     	$(CC) -c $< -o $@

 *foo.o* depends not only on *foo.c*, but also on *foo.h*.

 Executor
 --------

 C++ kati's executor is fairly simple. This is defined in *exec.cc*. This is
 useful only for testing because this lacks some important features for a build
 system (e.g., parallel build).

 Expressions in commands are evaluated at this stage. When they are evaluated,
 target specific variables and some special variables (e.g., $< and $@) should be
 considered. *command.cc* is handling them. This file is used by both the
 executor and the ninja generator.

 Evaluation at this stage is tricky when both *+=* and target specific variables
 are involved. Here is an example code:

     all: test1 test2 test3 test4

     A:=X
     B=X
     X:=foo

     test1: A+=$(X)
     test1:
     	@echo $(A)  # X bar

     test2: B+=$(X)
     test2:
     	@echo $(B)  # X bar

     test3: A:=
     test3: A+=$(X)
     test3:
     	@echo $(A)  # foo

     test4: B=
     test4: B+=$(X)
     test4:
     	@echo $(B)  # bar

     X:=bar

 *$(A)* in *test3* is a simple variable. Though *$(A)* in the global scope is
 simple, *$(A)* in *test1* is a recursive variable. This means types of global
 variables don't affect types of target specific variables. However, The result
 of *test1* ("X bar") shows the value of a target specific variable is
 concatenated to the value of a global variable.

 Ninja generator
 ---------------

 *ninja.cc* generates a ninja file using the results of other components. This
 step is actually fairly complicated because kati needs to map GNU make's
 features to ninja's.

 A build rule in GNU make may have multiple commands, while ninja's has always a
 single command. To mitigate this, the ninja generator translates multiple
 commands into something like *(cmd1) && (cmd2) && ...*. Kati should also escape
 some special characters for ninja and shell.

 The tougher thing is $(shell ...) in commands. Current kati's implementation
 translates it into shell's $(...). This works for many cases. But this approach
 won't work when the result of $(shell ...) is passed to another make
 function. For example

     all:
     	echo $(if $(shell echo),FAIL,PASS)

 should output PASS, because the result of $(shell echo) is an empty string. GNU
 make and kati's executor mode output PASS correctly. However, kati's ninja
 generator emits a ninja file which shows FAIL.

 I wrote a few experimental patches for this issue, but they didn't
 work well. The current kati's implementation has an Android specific workaround
 for this. See *HasNoIoInShellScript* in *func.cc* for detail.

 Ninja regeneration
 ------------------

 C++ kati has --regen flag. If this flag is specified, kati checks if anything
 in your environment was changed after the previous run. If kati thinks it doesn't
 need to regenerate the ninja file, it finishes quickly. For Android, running
 kati takes ~30 secs at the first run but the second run takes only ~1 sec.

 Kati thinks it needs to regenerate the ninja file when one of the followings is
 changed:

 * The command line flags passed to kati
 * A timestamp of a Makefile used to generate the previous ninja file
 * An environment variable used while evaluating Makefiles
 * A result of $(wildcard ...)
 * A result of $(shell ...)

 Quickly doing the last check is not trivial. It takes ~18 secs to run all
 $(shell ...) in Android's build system due to the slowness of $(shell find
 ...). So, for find commands executed by kati's find emulator, kati stores the
 timestamps of traversed directories with the find command itself. For each find
 commands, kati checks the timestamps of them. If they are not changed, kati
 skips re-running the find command.

 Kati doesn't run $(shell date ...) and $(shell echo ...) during this check. The
 former always changes so there's no sense to re-run them. Android uses the
 latter to create a file and the result of them are empty strings. We don't want
 to update these files to get empty strings.

 TODO
 ----

 A big TODO is sub-makes invoked by $(MAKE). I wrote some experimental patches
 but nothing is ready to be used as of writing.