Nim Routines

!!! warning This auditors' handbook is frozen and obsolete; the Nim language manual alongside other Nim documentation, Status Nim style guide, Chronos guides, and Nim by Example supercede it.

Nim offers several kinds of "routines" that:

  • do computation
  • produce side-effect
  • generate code

Those are:

  • proc and func
  • method
  • converter
  • iterator
  • template
  • macro

proc and func

proc and func are the most basic routines.

At the moment, Nim requires forward declaration of proc and func. Also it prevents circular dependencies, this means that a procedure is always coming from one of the imports.

Additionally, all dependencies are submodules and a proc can be found by greping procname*, the * being the export marker.

The only exception being the standard library. Procedures from the standard library are all listed in "The Index".

Function call syntax

Nim provides flexible call syntax, the following are possible:

prof foo(a: int) =
  discard

foo(a)
foo a
a.foo()
a.foo

Additionally this is also possible for strings:

let a = fromHex"0x12345" # Without spaces

Nim doesn't enforce namespacing by default but is an option

let a = byteutils.fromhex"0x12345"

Parameters

Mutable parameters must be tagged with var

TODO

Symbol resolution

If 2 procedures are visible in the same module (a module is a file) and have the same name the compiler will infer which to call from the arguments signatures. In case both are applicable, the compiler will throw an "ambiguous call" compile-time error.

Note that a procedure specialized to a concrete type has priority over a generic procedure, for example a procedure for int64 over a procedure for all number types.

func and side effect analysis

In Nim a proc is considered to have side-effect if it accesses a global variable. Mutating a declared function parameter is not considered a side-effect as there is no access to a global variable. Printing to the standard output or reading the standard input is considered a sideeffect.

func are syntactic sugar for proc without sideeffects. In particular this means that func behaviors are fully determined by their input parameters.

In the codebase, logging at the trace level are not considered a sideeffect.

Additionally some logging statements and metrics statement may be in an explicit {.noSideEffect.}: code-block.

Returning values

There are 3 syntaxes to return a value from a procedure:

  1. The return statement
  2. The implicit result variable
  3. The "last statement as expression"
proc add1(x: int): int =
  return x + 1

proc add2(x: int): int =
  result = x + 2

proc add3(x: int): int =
  x + 3

The main differences are:

  1. return allows early returns, in particular from a loop.
  2. result offers Return Value Optimization and Copy Elision which is particularly valuable for array types.
  3. Requires the last statement to be a valid expression. This is particularly interesting for conditional return values as forgetting to set the value in a branch will be a compile-time error, for example:
    proc select(ctl: bool, a, b: int): int =
      if ctl:
        echo "heavy processing"
        a
      else:
        echo "heavy processing"
        b
    
    Omitting a or b will be a compiletime error, unlike
    proc select(ctl: bool, a, b: int): int =
      if ctl:
        echo "heavy processing"
        return a
      else:
        echo "heavy processing"
        # Forgot to return b
    
    proc select(ctl: bool, a, b: int): int =
      if ctl:
        echo "heavy processing"
        result = a
      else:
        echo "heavy processing"
        # Forgot to result = b
    

Due to the differences we prefer using the "last statement as expression" unless

  • copying the type is expensive (SHA256 hash for example)
  • or we need early returns

Ignoring return values

Unlike C, return values MUST be used or explicitly discarded.

Mutable return values

TODO

At a low-level

Argument passing

Nim passes arguments by value if they take less than 3*sizeof(pointer) (i.e. 24 bytes on 64-bit OS) and passes them by pointer with the C backend or reference with the C++ backend if they are bigger. Mutable arguments are always passed by pointer.

This behavior can be changed on a type-by-type bases by tagging them {.bycopy.} or {.byref.}. This is only used for interfacing with non-Nim code.

Stacktraces

With --stacktrace:on, Nim create a stackframe on proc entry and destroys it on exit. This is used for reporting stacktraces.

NBC is always compiled with --stacktraces:on

NBC uses libbacktrace to have less costly stacktraces.

Name in the C code or Assembly

proc and func are materialized in the produced C code with name-mangling appended at the end. For the purpose of building Nim libraries, the name can be controlled by:

  • {.exportc.} so that the generated C name is the same as Nim
  • `{.exportc: "specific_name".} to generate a specific name

method

methods are used for dynamic dispatch when an object has an inherited subtype only known at runtime.

method are dispatched using a dispatch tree in the C code instead of a VTable.

There might be some cases where method were used not for their intended purpose

converter

Converters are procedures that are implicitly called on a value to change its type.

For example with a fictional option type that automatically extracts the boxed type.

type Option[T] = object
  case hasValue: bool
  of true:
    value: T
  else:
    discard

converter get[T](x: Option[T]): T =
  x.value

let x = Option[int](hasValue: true, value: 1)
let y = Option[int](hasValue: true, value: 2)

let z = x + y

Even though the + operator is not defined for Option[int] it is defined for int and Nim implicitly calls the converter.

converter are seldom used in the codebase as we prefer explicit over implicit.

Note that in case an operation is defined on both the convertible and the converted type, the operation without conversion should be preferred however the compiler might throw an ambiguous call instead.

Iterators

Iterators are construct that transforms a for loop.

For example to iterate on a custom array collection

const MaxSize = 7

type SmallVec[T] = object
    buffer*: array[MaxSize, T]
    len*: int

iterator items*[T](a: SmallVec[T]): T =
  for i in 0 ..< a.len:
    yield a.data[i]

Now iterating becomes

for value in a.items():
  echo a

A singly-linked list forward iterator could be implemented as

iterator items[T](head: ref T): ref T =
  ## Singly-linked list iterator
  assert: not head.isNil
  var cur = head
  while true:
    let next = cur.next
    yield cur
    cur = next
    if cur.isNil:
      break

a doubly-linked list backward iterator as

iterator backward[T](tail: ptr T): ptr T =
  var cur = tail
  while not cur.isNil:
    let prev = cur.prev
    yield cur
    cur = prev

an iterator to unpack individual bits from a byte as:

iterator unpack(scalarByte: byte): bool =
  yield bool((scalarByte and 0b10000000) shr 7)
  yield bool((scalarByte and 0b01000000) shr 6)
  yield bool((scalarByte and 0b00100000) shr 5)
  yield bool((scalarByte and 0b00010000) shr 4)
  yield bool((scalarByte and 0b00001000) shr 3)
  yield bool((scalarByte and 0b00000100) shr 2)
  yield bool((scalarByte and 0b00000010) shr 1)
  yield bool( scalarByte and 0b00000001)

In all cases, the syntax to iterate on the collection remains:

for value in a.items():
  echo a

for value in b.backward():
  echo b

for bit in s.unpack():
  echo s

The echo is inlined at "yield".

Iterators are not present in the produced C code, they are always inlined at the callsite.

Iterators are prone to code bloat, for example

iterator iterate[T](s: seq[T], backward: bool): T =
  if backward:
    for i in s.len-1 .. 0:
      yield s[i]
  else:
    for i in 0 ..< s.len:
      yield s[i]

for value in s.iterate(backward = false):
  ## Long-series of operations
  echo value

The long series of operation will be duplicated.

items and pairs

The items and pairs iterator are special cased and implictly call if there is respectively one and two iteration variables hence:

for x in collection:
  echo x

will automatically call the items proc defined for the collection (or error)

for x, y in collection:
  echo x
  echo y

will automatically call the pairs proc defined for the collection (or error)

fields and fieldPairs

fields and fieldsPairs are iterator-like magic, that allow "iterating" on an object field. Note that those are unrolled at compile-time.

Closures and closure iterators

Will be covered in a dedicated section.

They are the backbone of Chronos, our async/await framework and also have a major potential for memory leaks.

template

templates in Nim allows raw code substitution.

templates are hygienic and typechecked unlike the C preprocessor. Also they create their own scope unless tagged with the {.dirty.} pragma.

A major issue with templates is that as they "copy-paste" code, it is very easy to misuse them and do a computation twice.

For instance

proc foo(): int =
  echo "launch missile"
  return 1

template doSomething(a: int) =
  process(a)
  log(a)

This would be transformed to:

process(foo())
log(foo())

and triggers the "launch missile" side-effect twice.

Another issue with templates is that they may not generate stacktraces properly as they are not materialized in the C code.

Symbol visibility and

TODO

macro

TODO

The do notation

TODO