Introduction
!!! warning This auditors' handbook is frozen and obsolete; the Nim language manual alongside other Nim documentation, Status Nim style guide, Chronos guides, and Nim by Example supercede it.
The Nimbus Nim-Beacon-Chain (NBC) project is an implementation of the Ethereum 2 Beacon Chain specification in the Nim programming language.
The Auditors' Handbook aims to be provide a comprehensive introduction to:
- The Nim programming language, as used in the project.
- The NBC project.
- The dependencies of the project.
A particular focus will be given to features related to safety, correctness, error handling, testing, fuzzing, or inspecting Nim code.
One of the major highlights of Nim is that it compiles to C or C++ before compiling to native code. All techniques available to audit C code can be used to audit Nim.
The dependencies NBC rely on are detailed per audit phase in the build system and dependencies section.
The Nim Programming Language
!!! warning This auditors' handbook is frozen and obsolete; the Nim language manual alongside other Nim documentation, Status Nim style guide, Chronos guides, and Nim by Example supercede it.
The Nim programming language is a compiled language, with strong static typing.
The rest of the Handbook will assume that Nim-by-example was read.
Nim compilation process is in 2 phases, first lowering the Nim code to C, C++ or Javascript. Then for machine code, rely on the C/C++ compiler to produce the final code.
Nim can target any combination of C compiler, host OS and hardware architecture as long as the C compiler supports it.
Installing
Nim can be installed via:
- A Linux distribution package manager or Homebrew on MacOS
- Instructions at https://nim-lang.org/install.html
- Docker: https://hub.docker.com/r/nimlang/nim/
- Choosenim https://github.com/dom96/choosenim
Nim Vagrant https://github.com/status-im/nim-vagrant is unmaintained but might help setting up your own virtualized environment.
We target Nim 1.2.2 and should be compatible with the latest stable, Nim 1.2.4
Casings
Nim has unusual partial case insensitivity for identifiers. The rationales being:
- Preventing bugs when using
SDL_QUIT
instead ofSDL_Quit
. - Having consistent casing in a codebase even when relying on external dependencies with different casing.
The convention used in Nim-Beacon-Chain is:
snake_case
for fields and procedures names from the Ethereum specMACRO_CASE
for Ethereum spec constantsPascalCase
for all types (Ethereum or additional)camelCase
for our own additional codePascalCase
for our additional constants
In summary, we respect the Ethereum spec for Ethereum specified identifiers and use Nim NEP-1 for the rest.
Checking the C code
By default the intermediate C code produced by the Nim compiler is available at
$HOME/.nim/compiled_project_d
on UNIX systems$HOME/nimcache/compiled_project_d
on Windows
The suffix _d
indicates a debug build, the suffix _d
indicates a release build
Compiler options
At the time of writing, NBC targets Nim v1.2.2 compiler. The build system is at https://github.com/status-im/nimbus-build-system No patching is done at the moment on the Nim compiler, we use vanilla v1.2.2 upstream.
Nim compiler offers debug, release with -d:release
and danger with -d:danger
flag.
The debug and -d:release
build differ by, the verbosity of stacktraces and passing -O3
or equivalent to the C compiler.
Runtime checks (overflow, array bounds checks, nil checks, ...) are still included in -d:release
build. We also choose to have verbose stacktraces in NBC.
A danger build optimizes away all runtime checks and debugging help like stackframes. This might have a significant impact on performance as it may enable optimizations that were not possible like optimizing tail calls. This is not used in NBC.
References
-
Nim by example:
-
The Nim Manual, is a specification of how Nim should behave
https://nim-lang.org/docs/manual.html -
Nim tutorials
-
Nim for
- the C programmer: https://github.com/nim-lang/Nim/wiki/Nim-for-C-programmers
- Python programmer: https://github.com/nim-lang/Nim/wiki/Nim-for-Python-Programmers
Further resources are collected at:
Compiler configuration
- Compiler User Guide: https://nim-lang.org/docs/nimc.html
Style Guide
Nim Routines
!!! warning This auditors' handbook is frozen and obsolete; the Nim language manual alongside other Nim documentation, Status Nim style guide, Chronos guides, and Nim by Example supercede it.
Nim offers several kinds of "routines" that:
- do computation
- produce side-effect
- generate code
Those are:
proc
andfunc
method
converter
iterator
template
macro
proc
and func
proc
and func
are the most basic routines.
At the moment, Nim requires forward declaration of proc and func. Also it prevents circular dependencies, this means that a procedure is always coming from one of the imports.
Additionally, all dependencies are submodules and a proc can be found by greping
procname*
, the *
being the export marker.
The only exception being the standard library. Procedures from the standard library are all listed in "The Index".
Function call syntax
Nim provides flexible call syntax, the following are possible:
prof foo(a: int) =
discard
foo(a)
foo a
a.foo()
a.foo
Additionally this is also possible for strings:
let a = fromHex"0x12345" # Without spaces
Nim doesn't enforce namespacing by default but is an option
let a = byteutils.fromhex"0x12345"
Parameters
Mutable parameters must be tagged with var
TODO
Symbol resolution
If 2 procedures are visible in the same module (a module is a file) and have the same name the compiler will infer which to call from the arguments signatures. In case both are applicable, the compiler will throw an "ambiguous call" compile-time error.
Note that a procedure specialized to a concrete type has priority over a generic procedure, for example a procedure for int64 over a procedure for all number types.
func
and side effect analysis
In Nim a proc is considered to have side-effect if it accesses a global variable. Mutating a declared function parameter is not considered a side-effect as there is no access to a global variable. Printing to the standard output or reading the standard input is considered a sideeffect.
func
are syntactic sugar for proc
without sideeffects. In particular this means that func
behaviors are fully determined by their input parameters.
In the codebase, logging at the trace
level are not considered a sideeffect.
Additionally some logging statements and metrics statement may be in an explicit {.noSideEffect.}:
code-block.
Returning values
There are 3 syntaxes to return a value from a procedure:
- The return statement
- The implicit
result
variable - The "last statement as expression"
proc add1(x: int): int =
return x + 1
proc add2(x: int): int =
result = x + 2
proc add3(x: int): int =
x + 3
The main differences are:
return
allows early returns, in particular from a loop.result
offers Return Value Optimization and Copy Elision which is particularly valuable for array types.- Requires the last statement to be a valid expression.
This is particularly interesting for conditional return values
as forgetting to set the value in a branch will be a compile-time error,
for example:
Omittingproc select(ctl: bool, a, b: int): int = if ctl: echo "heavy processing" a else: echo "heavy processing" b
a
orb
will be a compiletime error, unlikeproc select(ctl: bool, a, b: int): int = if ctl: echo "heavy processing" return a else: echo "heavy processing" # Forgot to return b
proc select(ctl: bool, a, b: int): int = if ctl: echo "heavy processing" result = a else: echo "heavy processing" # Forgot to result = b
Due to the differences we prefer using the "last statement as expression" unless
- copying the type is expensive (SHA256 hash for example)
- or we need early returns
Ignoring return values
Unlike C, return values MUST be used or explicitly discarded.
Mutable return values
TODO
At a low-level
Argument passing
Nim passes arguments by value if they take less than 3*sizeof(pointer) (i.e. 24 bytes on 64-bit OS) and passes them by pointer with the C backend or reference with the C++ backend if they are bigger. Mutable arguments are always passed by pointer.
This behavior can be changed on a type-by-type bases by tagging them {.bycopy.} or {.byref.}. This is only used for interfacing with non-Nim code.
Stacktraces
With --stacktrace:on
, Nim create a stackframe on proc
entry and destroys it on exit. This is used for reporting stacktraces.
NBC is always compiled with --stacktraces:on
NBC uses libbacktrace to have less costly stacktraces.
Name in the C code or Assembly
proc
and func
are materialized in the produced C code with name-mangling appended at the end. For the purpose of building Nim libraries, the name can be controlled by:
{.exportc.}
so that the generated C name is the same as Nim- `{.exportc: "specific_name".} to generate a specific name
method
method
s are used for dynamic dispatch when an object has an inherited subtype only known at runtime.
method
are dispatched using a dispatch tree in the C code instead of a VTable.
There might be some cases where method
were used not for their intended purpose
converter
Converters are procedures that are implicitly called on a value to change its type.
For example with a fictional option type that automatically extracts the boxed type.
type Option[T] = object
case hasValue: bool
of true:
value: T
else:
discard
converter get[T](x: Option[T]): T =
x.value
let x = Option[int](hasValue: true, value: 1)
let y = Option[int](hasValue: true, value: 2)
let z = x + y
Even though the +
operator is not defined for Option[int]
it is defined for int
and Nim implicitly calls the converter.
converter
are seldom used in the codebase as we prefer explicit over implicit.
Note that in case an operation is defined on both the convertible and the converted type, the operation without conversion should be preferred however the compiler might throw an ambiguous call instead.
Iterators
Iterators are construct that transforms a for loop.
For example to iterate on a custom array collection
const MaxSize = 7
type SmallVec[T] = object
buffer*: array[MaxSize, T]
len*: int
iterator items*[T](a: SmallVec[T]): T =
for i in 0 ..< a.len:
yield a.data[i]
Now iterating becomes
for value in a.items():
echo a
A singly-linked list forward iterator could be implemented as
iterator items[T](head: ref T): ref T =
## Singly-linked list iterator
assert: not head.isNil
var cur = head
while true:
let next = cur.next
yield cur
cur = next
if cur.isNil:
break
a doubly-linked list backward iterator as
iterator backward[T](tail: ptr T): ptr T =
var cur = tail
while not cur.isNil:
let prev = cur.prev
yield cur
cur = prev
an iterator to unpack individual bits from a byte as:
iterator unpack(scalarByte: byte): bool =
yield bool((scalarByte and 0b10000000) shr 7)
yield bool((scalarByte and 0b01000000) shr 6)
yield bool((scalarByte and 0b00100000) shr 5)
yield bool((scalarByte and 0b00010000) shr 4)
yield bool((scalarByte and 0b00001000) shr 3)
yield bool((scalarByte and 0b00000100) shr 2)
yield bool((scalarByte and 0b00000010) shr 1)
yield bool( scalarByte and 0b00000001)
In all cases, the syntax to iterate on the collection remains:
for value in a.items():
echo a
for value in b.backward():
echo b
for bit in s.unpack():
echo s
The echo is inlined at "yield".
Iterators are not present in the produced C code, they are always inlined at the callsite.
Iterators are prone to code bloat, for example
iterator iterate[T](s: seq[T], backward: bool): T =
if backward:
for i in s.len-1 .. 0:
yield s[i]
else:
for i in 0 ..< s.len:
yield s[i]
for value in s.iterate(backward = false):
## Long-series of operations
echo value
The long series of operation will be duplicated.
items
and pairs
The items
and pairs
iterator are special cased and implictly call
if there is respectively one and two iteration variables hence:
for x in collection:
echo x
will automatically call the items
proc defined for the collection (or error)
for x, y in collection:
echo x
echo y
will automatically call the pairs
proc defined for the collection (or error)
fields
and fieldPairs
fields
and fieldsPairs
are iterator-like magic, that allow "iterating" on an object field. Note that those are unrolled at compile-time.
Closures and closure iterators
Will be covered in a dedicated section.
They are the backbone of Chronos, our async/await framework and also have a major potential for memory leaks.
template
template
s in Nim allows raw code substitution.
templates
are hygienic and typechecked unlike the C preprocessor.
Also they create their own scope unless tagged with the {.dirty.}
pragma.
A major issue with templates is that as they "copy-paste" code, it is very easy to misuse them and do a computation twice.
For instance
proc foo(): int =
echo "launch missile"
return 1
template doSomething(a: int) =
process(a)
log(a)
This would be transformed to:
process(foo())
log(foo())
and triggers the "launch missile" side-effect twice.
Another issue with templates is that they may not generate stacktraces properly as they are not materialized in the C code.
Symbol visibility and
TODO
macro
TODO
The do
notation
TODO
Operators and bit manipulation
!!! warning This auditors' handbook is frozen and obsolete; the Nim language manual alongside other Nim documentation, Status Nim style guide, Chronos guides, and Nim by Example supercede it.
Operators
A set of symbol and keywords can be used as infix operators Nim supports operator overloading.
Those symbols are:
= + - * / < >
@ $ ~ & % |
! ? ^ . : \
The keyword operators are
and or not xor shl shr div mod in notin is isnot of as
In-particular:
- bitwise and, or, flip, xor are denoted
and
,or
,not
,xor
instead of using a symbol - shift left and right are
shl
andshr
- division and modulo are
div
andmod
Implementation-defined behaviors
mod
is defined as the mathematical remainder, like C. With signed integers a mod b
has the same sign as a
shr
of a signed integer will not preserve the sign bit.
ashr
can be used for arithmetic right shift.
This distinction was introduced recently and may lead to subtle bugs,
also ashr
relies on the C compiler actually lowering >>
to SAR for signed integer.
This is specified for GCC and Clang (https://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html#Integers-implementation)
but implementation defined in general.
Operator precedence
Operator precedence is specified described in the manual:
https://nim-lang.org/docs/manual.html#syntax-precedence
Additional system operators
Nim system exports additional operators with a %
like +%
and -%
.
Those cast the signed integer operand to unsigned and cast the result back to signed.
This is intended to use with the ByteAddress type for pointer arithmetic
Bit manipulation
https://github.com/status-im/nim-stew/blob/master/stew/bitops2.nim
Closures and closures iterators
!!! warning This auditors' handbook is frozen and obsolete; the Nim language manual alongside other Nim documentation, Status Nim style guide, Chronos guides, and Nim by Example supercede it.
TODO
At a low-level
Closures and closures iterators are implemented via a pointer + an environment that stores the captured state necessary to execute the function.
The Nim compiler has a limited form of borrow checking and prevents capturing mutable variable or openArray (pointer+length pair).
It otherwise copies the capture variables in case of objects with value semantics or increment the reference count in case of ref object.
The Chronos library needs to generate a closure iterator for all async proc instantiated which may lead to long-lived ref objects and unreclaimable memory.
This may also extend to other resources like socket connections or file descriptors and a significant effort is underway to track memory usage and detect such scenarios:
- https://github.com/status-im/nim-libp2p/issues/145
- https://github.com/status-im/nim-libp2p/issues/207
Nim data types
!!! warning This auditors' handbook is frozen and obsolete; the Nim language manual alongside other Nim documentation, Status Nim style guide, Chronos guides, and Nim by Example supercede it.
Builtin types
Numerical types
For integers, Nim offers uint8
, uint16
, uint32
and uint64
types as well as int8
, int16
, int32
, int64
types.
Operations on signed integers are overflowed and underflowed checked unless the flag --overflowChecks:off
is used.
Nim also offers int
and uint
types, which have the same size as the platform word size, so 4 bytes when compiled for 32-bit OS
or 8 bytes for a 64-bit OS.
Conversion between int
and int32
or int64
must be explicit except for string literals.
Integer literals default to int
.
float32
and float64
maps to C float
and double
. float
is an alias to float64
whether on a 32-bit or 64-bit platform
Binary blobs
Nim has a specific byte
type which behaves like uint8.
It is the preferred type to represent binary blobs, i.e. we use seq[byte]
over string
, seq[char]
or seq[uint8]
for binary data.
Range types
Nim allows defining ranges of valid value which will be runtime checked everytime the value changes for example
Nim defines by default type Natural = range[0 .. high(int)]
. If the value of a Natural becomes less than 0
a RangeDefect
will be thrown.
This is valuable to catch / prevent underflows.
Sequences
Sequences are heap-allocated containers similar to C++ vectors.
They have value-semantics and are copied on assignments.
Sequences have a data pointer, reserved memory and current length
Strings
Sequences are heap-allocated containers.
They have value-semantics and are copied on assignments.
Strings have a data pointer, reserved memory and current length.
The data is terminated by \0
.
Nim strings automatically decays without copy to C strings in FFI calls
The representation is the same as a seq[byte]
or seq[char]
except for the terminating nul-byte,
including within the GC tracking data structures
Consequently, strings can be cast to seq[byte]
but inversely casting seq[byte]
to string
will lead to non nul-terminated string in C FFI and buffer overflows.
Record / structs
Nim has 3 kinds of record types
-
Value
type Foo = object field0: int field1: float
-
Reference
type Foo = ref object field0: int field1: float
-
Pointer
type Foo = ptr object field0: int field1: float
Value
A value object is allocated on the stack (unless if nested in heap-allocated types).
The equality check is structural.
Copy is done by copyMem (C memcpy)
Object variants do not have an equality operator set by default
Reference
A reference object is allocated on the heap and managed by the GC.
They are not thread-safe.
The equality check is by reference
Copy is done by copying the reference and increment the reference count
Pointer
A pointer object is manually managed, it can be on the heap (malloc) or on the stack (alloca)
Casting and low-level memory representation
!!! warning This auditors' handbook is frozen and obsolete; the Nim language manual alongside other Nim documentation, Status Nim style guide, Chronos guides, and Nim by Example supercede it.
Conversions
Casting to a signed integer will lead to a range check. Conversion to an unsigned integer even from a negative signed integer will NOT lead to a range check (https://github.com/nim-lang/RFCs/issues/175) https://nim-lang.org/docs/manual.html#statements-and-expressions-type-conversions
Casting integers
Unsigned integer casts behave like C.
Upcasting will lead to zero extension Downcasting will lead to truncation
Signed integer ranges will not be checked with a cast.
Casting to/from other types
Casting to or from any other types will be done through:
- union casts by default
- or C conversion if the type is a pointer.
In practice this means that the bit-pattern will be reinterpreted as the new type, similar to C++ reinterpret cast.
Nim memory management
!!! warning This auditors' handbook is frozen and obsolete; the Nim language manual alongside other Nim documentation, Status Nim style guide, Chronos guides, and Nim by Example supercede it.
Nim memory management is on a per-type basis.
Plain objects and char and numerical types are allocated on the stack.
Sequences and strings are allocated on the heap but have value semantics. They are copied on assignment
Ref types are allocated on the heap and have reference semantics, i.e. an unique instance can be held by multiple variables and only when all those variables go out-of-scope is the ref type discarded.
By default Nim uses a deferred reference counting GC. Additionally, if the type can lead to cycles, Nim will add "mark-and-sweep" passes to collect them.
Destructors
TODO
Nim allocator
Nim GCs are backed by a TLSF allocator which allows Nim to provide soft real-time guarantees if needed.
Analyzing memory leaks
Nim can be compiled with -d:useMalloc
to bypass the TLSF allocator and directly use malloc/free
References
Generics and Static types
!!! warning This auditors' handbook is frozen and obsolete; the Nim language manual alongside other Nim documentation, Status Nim style guide, Chronos guides, and Nim by Example supercede it.
Nim types can be parametrized by types (generics) or compile-time values (static)
For example
type
MySeq[T] = object
len, reserved: int
data: ptr UncheckedArray[T]
The generics can be restricted
type
MySeq[T: int32 or int64] = object
len, reserved: int
data: ptr UncheckedArray[T]
With static types
type
SmallSeq[MaxLen: static int, T] = object
len: int
data: array[MaxLen, T]
Arrays, openarrays, varargs
!!! warning This auditors' handbook is frozen and obsolete; the Nim language manual alongside other Nim documentation, Status Nim style guide, Chronos guides, and Nim by Example supercede it.
Arrays
TODO
Openarrays
Openarray are a parameter-only type that represent a (pointer, length) pair. In other languages they are also known as slices, ranges, views, spans.
The name openArray is inherited from Pascal, Oberon and Modula 2
Arrays and sequences are implictily converted to openArray.
The compiler has a limited form of escape analysis to prevent capturing openarrays in closures or returning them.
UncheckedArrays
TODO
Correctness, distinct, mutability, effects, exceptions
!!! warning This auditors' handbook is frozen and obsolete; the Nim language manual alongside other Nim documentation, Status Nim style guide, Chronos guides, and Nim by Example supercede it.
The Nim compiler provides several constraints that can be used to enforce proper usage of variables, types and error handling at compile-time.
One was already mentioned in previous paragraphs:
- Side-Effect analysis via using
func
or{.noSideEffect.}
(in the routines chapter)
Note that range types currently work at runtime only
Side-Effect
As mentioned in the routines chapter, using a func
or a proc tagged with {.noSideEffect.}
will prevent reading or writing to global variables (i.e. variables that are neither parameters or locals).
Note that side-effect analysis cannot analyse raw emitted C code
Additionally, allocating a sequence or a string, even if they technically access a global memory allocator, is not considered a side-effect.
The compiler will ignore statements in a {.noSIdeEffect.}
block for the purpose of side-effect analysis. At the moment this is only used for trace and some debug logs, as writing to stdout/stderr is considered writing to a global variables and so a side-effect.
not nil
The compiler exposes a "not nil" annotation for reference and pointer types. This enforces that parameters are proven always initialized in tagged:
- procedures
- types
This is not used in the codebase as a more powerful prover is required for our application.
Currently, the compiler warns when it cannot prove that a result reference is not nil.
distinct
types
A distinct type is a type that has the same representation as a base type at a low-level but cannot be used in its stead.
type Miles = distinct float32
type Meters = distinct float32
Procedures accepting:
float32
will not acceptMiles
orMeters
Miles
will not acceptfloat32
orMeters
Meters
will not acceptfloat32
orMiles
distinct
type can reuse the base type procedures and fields via the borrow annotation as described in the manual (https://nim-lang.org/docs/manual.html#types-distinct-type)
Enforcing exception handling
The codebase uses a mix of Result and Exceptions for error handling with option types and bool in some cases.
As an effort to sanitize error handling and ensure that all exception paths are handled, we use the effect tracking system the following way:
proc foo() {.raises: [].} =
discard
The procedure above will refuse to compile if its body can throw an unhandled exception.
proc foo() {.raises: [ValueError].} =
discard
The procedure above will refuse to compile if its body can throw an exception besides a ValueError.
In particular Nim distinguishes between Defects, which are non-recoverable, and Exceptions, which we should recover from.
For our purposes, we allow all procedures to throw a Defect (for example an assertion broken), this is done by adding {.push raises:[Defect]}
at the start of a file
{.push raises:[Defect]}
proc foo1() =
discard
proc foo2() =
discard
{.pop.}
Mutability
Only variables declared with var
and var
parameters are mutable.
Note that mutability analysis is not deep for reference types or pointer types. You can always mutate through a pointer.
Future
Significant improvements are in-progress planned to improve Nim safety:
lent
return values and owned reference, a limited form of borrow checker: https://nim-lang.org/docs/destructors.html#lent-type- Z3 theorem prover for bound checks and nil checks at compile-time: https://nim-lang.org/docs/drnim.html
- write-tracking to prevent deep immutability: https://github.com/nim-lang/RFCs/issues/234 (https://nim-lang.org/araq/writetracking.html)
however it is too early to use them in a production codebase.
Debugging Nim
!!! warning This auditors' handbook is frozen and obsolete; the Nim language manual alongside other Nim documentation, Status Nim style guide, Chronos guides, and Nim by Example supercede it.
Reference article: https://nim-lang.org/blog/2017/10/02/documenting-profiling-and-debugging-nim-code.html
GDB / LLDB
Nim can be instrumented with sourcemaps:
- by passing
--debugger:native
to the compiler, so that the stacktraces in gdb shows the Nim source code - by passing
--passC:"-g"
to the compiler, so that the stacktraces in gdb shows the C source code
Sanitizers & Valgrind
LLVM and GCC sanitizers can be used with
nim c --cc:clang -r -d:release --debugger:native \
--passC:"-fsanitize=address" --passL:"-fsanitize=address" \
--outdir:build target_application.nim
Note on deactivating Nim memory allocator:
As mentioned in the memory management section, Nim has
- a garbage collector, by default deferred reference counting + cycle detection via mark-and-sweep if the types can have cycles (and is not tagged {.acyclic.})
- an allocator based on TLSF
Instead of Nim custom allocators, the sys malloc/free can be used by passing
-d:useMalloc
in the command-line
Some GC/versions might not properly accept the flag, this is a Nim bug and we can patch upstream and our own fork in that case
Fuzzers
TODO
Internal fuzzing
We are currently adding fuzzing to our repositories via libFuzzer and AFL.
- Fuzzing abstraction: https://github.com/status-im/nim-testutils
- Network lib fuzzing: https://github.com/status-im/nim-eth/tree/4f533eb5/tests/fuzzing
- Eth2 core/Validator core fuzzing: https://github.com/status-im/nim-blscurve/tree/271a5738/tests/fuzzing
External "Consensus" fuzzing
Sigma Prima is fuzzing all Ethereum 2 clients on the spec (Eth2 core/Validator core)
We provide them with a simple C API
That we implement:
And compile the nim code as a shared or static library.
On usage, the only extra limitation compared to a C library is the need to call NimMain()
before calling any Nim function to initialize the Nim runtime.
Inspecting the generated intermediate code and objects
The intermediate C code and object files are stored in
$HOME/.cache/nim/compiled_target_d
or$HOME/.cache/nim/compiled_target_r
on UNIX (d for debug, r for release)$HOME/nimcache/compiled_target_d
or$HOME/.cache/nim/compiled_target_r
on Windows
The cache directory can be set with the --nimcache=<targetdir>
compiler flag.
Repositories vendoring nimbus-build-system
do not use the default setting
but generate directly in the (gitignored) nimcache
folder of the repo.
Foreign language interop
!!! warning This auditors' handbook is frozen and obsolete; the Nim language manual alongside other Nim documentation, Status Nim style guide, Chronos guides, and Nim by Example supercede it.
Wrapping C
Using shared library
Example: secp256k1
https://github.com/status-im/nim-secp256k1/blob/master/secp256k1_abi.nim
Compiling directly the C files
Example: MIRACL Core
https://github.com/status-im/nim-blscurve/blob/master/blscurve/miracl/miracl.nim
Wrapping C++
Beyond the syntax for wrapping C, Nim offers a flexible syntax for wrapping C++, for example for vectors:
type
CppVector* {.importcpp"std::vector", header: "<vector>", byref.} [T] = object
proc newCppVector*[T](): CppVector[T] {.importcpp: "std::vector<'*0>()", header: "<vector>", constructor.}
proc newCppVector*[T](size: int): CppVector[T] {.importcpp: "std::vector<'*0>(#)", header: "<vector>", constructor.}
proc len*(v: CppVector): int {.importcpp: "#.size()", header: "<vector>".}
proc add*[T](v: var CppVector[T], elem: T){.importcpp: "#.push_back(#)", header: "<vector>".}
proc `[]`*[T](v: CppVector[T], idx: int): T{.importcpp: "#[#]", header: "<vector>".}
proc `[]`*[T](v: var CppVector[T], idx: int): var T{.importcpp: "#[#]", header: "<vector>".}
proc `[]=`*[T](v: var CppVector[T], idx: int, value: T) {.importcpp: "#[#]=#", header: "<vector>".}
Example: ttmath
https://github.com/status-im/nim-ttmath/blob/master/src/ttmath.nim
Exporting
See "Fuzzing" chapter for exporting a C API for fuzzing
References
- Nim manual:
- Nim backend integration: https://nim-lang.org/docs/backends.html
Nim threat model
!!! warning This auditors' handbook is frozen and obsolete; the Nim language manual alongside other Nim documentation, Status Nim style guide, Chronos guides, and Nim by Example supercede it.
Nim and its standard library are not the focus of the audits.
In particular the codebase intentionally limits reliance on the standard library
so that it is not tied to Nim release schedule, instead of the standard library
we use stew
most of the time: https://github.com/status-im/nim-stew.
Nim standard library is implemented here:
We target Nim v1.2.2
Nim FAQ
Nimbus Beacon Chain
!!! warning This auditors' handbook is frozen and obsolete; the Nim language manual alongside other Nim documentation, Status Nim style guide, Chronos guides, and Nim by Example supercede it.
https://github.com/status-im/nimbus-eth2
Nimbus Beacon Chain (NBC) is an implementation of an Ethereum 2 client.
Audit scope
Network Core (leveraging the libp2p framework)
Sub-topic |
---|
Discovery Protocol (discv5) |
Publish/Subscribe protocol |
Eth2 Request/Response protocol |
SSZ - (De)serialization & tree hashing |
Wire encryption |
ETH2 Specification core
Sub-topic |
---|
State transition logic |
Signature verification |
Epoch finalisation and justification |
Reward processing |
Eth1 data processing |
Fork choice logic |
Block processing and production |
Attestation processing and production |
Block synchronization |
Peer pool management |
Validator core and user experience
Sub-topic |
---|
Block/attestation signing |
Slash-prevention mechanisms |
RPC API |
Accounts management & key storage |
Command Line Interface (CLI) |
High-level view of the stack
https://miro.com/app/board/o9J_kvfytDI=/
Diagram
TODO
Specifications
We target v1.0.1 phase0 of https://github.com/ethereum/consensus-specs
The p2p-interface specs in particular describe the subset of libp2p spec that are used to implement Ethereum 2
Resources
Ethereum 2.0 Ask Me Anything:
Nimbus build system & dependencies
!!! warning This auditors' handbook is frozen and obsolete; the Nim language manual alongside other Nim documentation, Status Nim style guide, Chronos guides, and Nim by Example supercede it.
Build system
NBC repo
nimbus-eth2
uses a build system based on makefiles.
Usage is described in the README.
In particular a env.sh
scripts is provided that setups the environment variable
expected.
Libraries
Library repositories uses the official package manager called nimble
for building and testing.
In particular, we use by convention the following command to run the test suite via nimble task system:
nimble test
The details are implemented in <package name>.nimble
at the root of the repository.
The continuous integration setup for Travis, Azure Pipelines and/or Appveyor are also setting up a Nim environment from scratch.
Package dependencies
NBC repo
For the nimbus-eth2
application, all dependencies are vendored
in the vendor
folder.
With regards to the audit scope we have the following dependencies:
All
- Async
- asynctools
- nim-chronos
- Logging
- jswebsockets
- karax
- nim-chronicles
- nim-chronicles-tail (devtool, out-of-audit-scope)
- nim-libbacktrace
- websocket.nim
- nim-protobuf-serialization
- nim-rocksdb
- CLI interface
- nim-confutils
- nim-prompt
- nim-unicodedb
- Metrics
- nim-metrics (the server endpoint is out-of-audit-scope)
- Helpers
- nim-stew
- nim-testutils (fuzzing)
- nimYAML (should be test only)
Network core
- Cryptography
- nim-bearssl
- nim-secp256k1
- nimcrypto
- Networking & Discovery
- nim-eth
- nim-libp2p
- nim-nat-traversal
- Serialization
- nim-faststreams
- nim-serialization
- nim-snappy
- BigInt
- nim-stint (Note: one of the main use is 256-bit bloomfilter, a dedicated library should be used instead)
ETH2 Specification core
- Cryptography
- nim-blscurve
- Database
- nim-sqlite3-abi
- Eth1 monitoring
- nim-web3
- Build system
- nimbus-build-system
Validator core
- Cryptography
- nim-blscurve
- RPC
- news
- nim-http-utils
- nim-json-rpc
- nim-json-serialization
NBC Threat model
!!! warning This auditors' handbook is frozen and obsolete; the Nim language manual alongside other Nim documentation, Status Nim style guide, Chronos guides, and Nim by Example supercede it.
NBC primarily targets resource restricted devices like Raspberry Pi or smartphones to desktop computers.
We focus on:
- remote attacks
This includes but not limited to:
- denial-of-services
- retrieving a secret key
- via software defect or misuse of libraries (cryptography, networking)
- or poor key management practices or UI or documentation confusing the user
- collusion of peers (eclipse, surround voting, ...)
- loss of funds
In particular, we do not defend against a malicious cloud provider which would offer a “Rent-a-Raspberry-Pi” service for validators and would subsequently engage in attacks against those validators, the implest being, taking the validator offline when it's their duty time.
Sensitive data:
- Signing key: each validator has a secret signing key used to sign attestation and blocks with 32 ETH at stake (about 7500 USD at the current rate) Each beacon node instance can have dozens to hundreds of validators attached. Leaking the signing key would allow an attacker to double-vote with the key, leading to slashing and ultimately ejecting the validator once its stake reaches below 16 ETH. The signing key must be present in memory as a validator may have signing duties every 6 seconds
- Withdrawal key: a withdrawal key allows withdrawing the stake. A withdrawal key is unnecessary during day-to-day operation and can be stored in cold storage, possibly hardware wallet
- IP address: if a validator IP address leaks, they might be subject to target DoS attacks taking their node offline and making them miss their duties. In a regular setting, the network will have "relayer"/"listener" nodes and validator nodes that shouldn't be distinguishable.
- Random number generation:
- Strong recommendations to generate the withdrawal key offline.
- The RNG is used in particular to select which peer to connect to from a pool of candidates.
Cryptographic libraries
A review in in progress to select BLS signature backend.
See https://notes.status.im/nim-bls-curve-backends#Threat-model
Resources
Surround vote detection (not in audit scope): https://github.com/protolambda/eth2-surround