Accelerators and Deep Learning

I have been reading a lot about developing accelerator systems mainly for deep learning training and inference applications. Among the existing ones, I have found the following most interesting:

  1. Intel’s Knights Mill (new-generation Xeon Phi): The individual cores on this many-core chip are smaller, on the premise that inner loops fit in the L1 instruction cache. As a result, per-core performance is lower, but each socket holds more cores, which suits compute-intensive applications like deep networks. Its design is said to sit midway between a server CPU and a hardware accelerator.
  2. NVIDIA’s Tesla V100 Chip: The 640 Tensor Cores on this chip are designed to accelerate Matrix Multiply-Accumulate (MMA) operations, a core operation in deep learning.
  3. Google’s Tensor Processing Unit (TPU): Another MMA design, built around a 256×256 array of 8-bit multiply-accumulate units.
  4. ThinCI’s Chip: This chip incorporates small processors and a thread scheduler analogous to a CPU with execution units and an instruction scheduler. The new feature here is that the processors can stream data to each other instead of having to load/store from RAM each time a computation is needed.
  5. Data-Flow Engines: One of the coolest accelerator types. The main design is focused on how to map data graphs onto the data flow processing nodes to maximize computation speed and minimize IPC overhead and synchronizations.
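
To make the MMA operation from items 2 and 3 concrete, here is a minimal scalar sketch of D = A×B + C with 8-bit inputs and 32-bit accumulators. It only illustrates the arithmetic and says nothing about how Tensor Cores or the TPU actually schedule the work. The names Matrix, mma and mma_demo, and the tiny tile size N = 2, are assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t N = 2;  // illustrative tile size; real hardware tiles are much larger

template <typename T>
using Matrix = std::array<std::array<T, N>, N>;

// Computes D = A * B + C, accumulating the 8-bit products in 32 bits
// so the running sums do not overflow the narrow input type.
Matrix<std::int32_t> mma(const Matrix<std::int8_t>& a,
                         const Matrix<std::int8_t>& b,
                         const Matrix<std::int32_t>& c) {
    Matrix<std::int32_t> d = c;  // start from the accumulator input
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j)
            for (std::size_t k = 0; k < N; ++k)
                d[i][j] += static_cast<std::int32_t>(a[i][k]) * b[k][j];
    return d;
}

// Small usage check with hand-computed values.
void mma_demo() {
    Matrix<std::int8_t> a = {{{1, 2}, {3, 4}}};
    Matrix<std::int8_t> b = {{{5, 6}, {7, 8}}};
    Matrix<std::int32_t> c = {{{1, 1}, {1, 1}}};
    Matrix<std::int32_t> d = mma(a, b, c);
    assert(d[0][0] == 20 && d[0][1] == 23);
    assert(d[1][0] == 44 && d[1][1] == 51);
}
```

An accelerator performs the three nested loops in parallel hardware instead of one multiply at a time.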

Basic C++ Concepts (Q&A)

Q\When to use “*” and “&”?

A\ “*” is used for declaring a pointer. “&” is used for declaring a reference, which must be bound to an object when it is declared. A pointer differs from a reference in that the pointer is its own variable, with its own memory address, whose value is the address of what it points to.

“*” is also used for dereferencing pointers (see next Q/A) and “&” is the address-of operator.
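
The uses of “*” and “&” described above can be put side by side in a short sketch. The function name pointer_vs_reference is illustrative only.

```cpp
#include <cassert>

// Returns the final value of i after writing to it through a pointer and a reference.
int pointer_vs_reference() {
    int i = 3;
    int j = 10;

    int *p = &i;   // '*' declares a pointer; '&' takes the address of i
    int &r = i;    // '&' declares a reference, which must be initialized here

    *p = 4;        // '*' dereferences p: writes to i
    r = 5;         // writes to i through the alias
    assert(i == 5);

    assert(&r == &i);  // taking the address of a reference yields the object's address
    assert(static_cast<void*>(&p) != static_cast<void*>(&i));  // the pointer is a separate object

    p = &j;        // a pointer can be re-pointed...
    r = j;         // ...but this copies j's value into i; r still refers to i
    assert(&r == &i);
    return i;      // 10
}
```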

Q\When to use “->” and “.”?

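A\ pointer->at(i) is equivalent to (*pointer).at(i): “->” is the structure dereferencing (member of the object pointed to) operation. For containers such as std::vector, (*pointer)[i] reaches the same element, but operator[] does no bounds checking while at() throws on an out-of-range index.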

“.” is used for structure referencing (member of object) operations.
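
A small sketch of both operators on a std::vector, which is an assumption chosen because it provides at(). The function names are illustrative.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

int arrow_vs_dot() {
    std::vector<int> v = {10, 20, 30};
    std::vector<int> *pv = &v;

    int a = v.at(1);       // '.'  : member access on an object
    int b = pv->at(1);     // '->' : member access through a pointer
    int c = (*pv).at(1);   // the same thing spelled out: dereference, then '.'
    int d = (*pv)[1];      // same element, but operator[] skips the bounds check
    assert(a == b && b == c && c == d);
    return a;              // 20
}

// at() reports an out-of-range index by throwing std::out_of_range.
bool at_is_bounds_checked() {
    std::vector<int> v = {10, 20, 30};
    try {
        (void)v.at(3);     // index 3 is past the end of a 3-element vector
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```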

Q\What is the difference between Pointers, Objects and References? What’s the difference between pass-by-reference and pass-by-value?

A\ A Pointer is a variable that holds the address of the object it points to once it is initialized. When declared as a local variable, the pointer itself is allocated on the stack (automatic allocation), regardless of where the object it points to lives.

An Object declared as a local variable is allocated on the stack. A Reference is a synonym (alias) for an existing object: it is bound to that object when it is declared and stays bound to it until the reference goes out of scope.

int i = 3;

int *p = &i;

int &ref = i; //a reference behaves like a constant pointer (int *const ref), not like a pointer to constant

//ref has to stay bound to i; it cannot be reassigned to refer to another variable

MyObject x; //x is a stack variable of type MyObject

MyObject *y; //y is a pointer stack variable that holds addresses of MyObject type variables

MyObject *z = new MyObject(); //z is a pointer on the stack that holds the address of a MyObject instance allocated on the heap (dynamic allocation)

MyObject *a = &x; //a is a pointer on the stack and contains the address of x (points to variable x)

int func(MyObject *obj, MyObject object, int &a);

//obj is passed by pointer: the pointer itself is copied, and the copy points to the caller's object.

//object is passed by value: the entire source object is copied onto the stack.

//a is passed by reference (declared with “&”): nothing is copied, and a is an alias for the caller's variable.
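
The three parameter-passing styles can be compared in a runnable sketch. MyObject is given a single hypothetical member, value, so the effect on the caller's object is observable. The function names are illustrative.

```cpp
#include <cassert>

struct MyObject {
    int value = 0;   // hypothetical member, added so changes are visible
};

// The pointer is copied; the caller's object is modified through it.
void by_pointer(MyObject *obj) { if (obj) obj->value = 1; }

// The whole object is copied; only the local copy is modified.
void by_value(MyObject object) { object.value = 2; }

// Nothing is copied; obj is an alias for the caller's object.
void by_reference(MyObject &obj) { obj.value = 3; }

int passing_demo() {
    MyObject x;
    by_value(x);
    assert(x.value == 0);   // unchanged: the function worked on a copy
    by_pointer(&x);
    assert(x.value == 1);
    by_reference(x);
    assert(x.value == 3);
    return x.value;
}
```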

Q\What is a const reference? How to use the const keyword?

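A\ A const reference is used to pass an argument to a function without copying it, with the guarantee that the function will not modify the argument (it is read-only inside the function).

More generally, the const keyword marks things that must not be modified: variables whose value is fixed after initialization, what a pointer or reference refers to, and member functions that promise not to change the object they are called on.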
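
A minimal sketch of a const reference parameter together with a const member function. Counter and the function names are hypothetical, and the commented-out lines mark what the compiler would reject.

```cpp
#include <cassert>

class Counter {
public:
    void increment() { ++count_; }
    int get() const { return count_; }   // const member function: does not modify *this
private:
    int count_ = 0;
};

// Const reference parameter: no copy is made, and only const members may be called.
int read_only(const Counter &c) {
    // c.increment();   // would not compile: increment() is not a const member
    return c.get();
}

int const_reference_demo() {
    Counter c;
    c.increment();
    c.increment();
    const int limit = 10;   // const variable: fixed after initialization
    // limit = 11;          // would not compile
    assert(read_only(c) == 2);
    return read_only(c) + limit;   // 12
}
```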

Q\What is the difference between “const char *p = &a” and “char *const p = &a”?

A\ The first is a pointer to constant: the value pointed to cannot be changed through p, i.e. p cannot change the value of a. The pointer p itself can still be reassigned, set to NULL, and used in pointer arithmetic.

The second is a constant pointer: similar to a reference, the value of a can change through p, but p itself is constant and cannot be made to point anywhere else.
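
Both declarations side by side, as a sketch. The commented-out lines are the ones a compiler would reject, and the function name is illustrative.

```cpp
#include <cassert>

char const_pointer_demo() {
    char a = 'a';
    char b = 'b';

    const char *p = &a;   // pointer to constant char
    // *p = 'x';          // would not compile: cannot write through p
    p = &b;               // fine: p itself can be re-pointed
    assert(*p == 'b');

    char *const q = &a;   // constant pointer to char
    *q = 'x';             // fine: a can be changed through q
    // q = &b;            // would not compile: q cannot be re-pointed
    assert(q == &a);

    return a;             // 'x'
}
```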

Q\What is the rule of three (five)?

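A\ If a class manages a resource and needs a user-defined version of any one of the following members, it most likely needs all of them. The rule of three covers the destructor and the two copy operations; the rule of five (C++11) adds the two move operations: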

  1. A Copy Constructor
  2. A Move Constructor
  3. A Destructor
  4. A Copy Assignment Operator
  5. A Move Assignment Operator
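
As a sketch of the five members on a class that owns a heap array: IntBuffer is a hypothetical class, and the copy-and-swap style used for copy assignment is one common choice among several.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

class IntBuffer {
public:
    explicit IntBuffer(std::size_t n) : size_(n), data_(new int[n]()) {}

    ~IntBuffer() { delete[] data_; }                        // destructor

    IntBuffer(const IntBuffer &other)                       // copy constructor
        : size_(other.size_), data_(new int[other.size_]) {
        std::copy(other.data_, other.data_ + size_, data_);
    }

    IntBuffer &operator=(const IntBuffer &other) {          // copy assignment
        if (this != &other) {
            IntBuffer tmp(other);   // copy first, so a failed allocation leaves *this intact
            swap(tmp);              // tmp's destructor frees the old array
        }
        return *this;
    }

    IntBuffer(IntBuffer &&other) noexcept                   // move constructor
        : size_(other.size_), data_(other.data_) {
        other.size_ = 0;
        other.data_ = nullptr;      // leave the source empty but destructible
    }

    IntBuffer &operator=(IntBuffer &&other) noexcept {      // move assignment
        if (this != &other) {
            delete[] data_;
            size_ = other.size_;
            data_ = other.data_;
            other.size_ = 0;
            other.data_ = nullptr;
        }
        return *this;
    }

    void swap(IntBuffer &other) noexcept {
        std::swap(size_, other.size_);
        std::swap(data_, other.data_);
    }

    std::size_t size() const { return size_; }
    int &operator[](std::size_t i) { return data_[i]; }
    const int *data() const { return data_; }

private:
    std::size_t size_;
    int *data_;
};

bool rule_of_five_demo() {
    IntBuffer a(3);
    a[0] = 7;
    IntBuffer b(a);              // copy constructor: b gets its own array
    b[0] = 8;
    assert(a[0] == 7 && a.data() != b.data());

    const int *old = a.data();
    IntBuffer c(std::move(a));   // move constructor: c takes over a's array
    assert(c.data() == old && a.size() == 0);

    IntBuffer d(1);
    d = b;                       // copy assignment
    d = std::move(c);            // move assignment
    return d[0] == 7 && d.size() == 3 && c.size() == 0;
}
```

A class whose members already manage themselves (for example a std::vector member instead of a raw int*) usually needs none of the five.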

Q\What does the static keyword do?

A\ Depends on the context:

  1. Static variables defined at namespace scope have internal linkage: they are visible only inside the translation unit (.cpp) that defines them and can’t be accessed from other units. They live for the whole run of the program.
  2. Static variables defined in function scope are initialized once and keep their memory location and value across calls to that function.
  3. Static variables defined as class members have a single memory location that is shared across all instances of the class.
  4. Static functions defined as class members belong to the class rather than to an instance: they have no this pointer, can be called without an object, and can directly access only static members.

In general, static refers to either internal linkage or static storage duration. Non-static, non-const global variables differ from static ones in that they have external linkage.
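
The four contexts can be sketched in one file. All names here (file_local, next_id, Widget) are hypothetical.

```cpp
#include <cassert>

static int file_local = 42;   // namespace scope: internal linkage, hidden from other .cpp files

int next_id() {
    static int id = 0;        // function scope: initialized once, value kept between calls
    return ++id;
}

class Widget {
public:
    Widget() { ++live_count; }
    ~Widget() { --live_count; }
    static int count() { return live_count; }   // static member function: no 'this' pointer
private:
    static int live_count;    // static data member: one variable shared by all instances
};

int Widget::live_count = 0;   // definition of the static data member

int static_demo() {
    int first = next_id();
    int second = next_id();
    assert(second == first + 1);        // the counter survived between the two calls

    int during = 0;
    {
        Widget a, b;
        during = Widget::count();       // called on the class, not on an object
    }
    assert(during == 2 && Widget::count() == 0);
    return file_local + (second - first);   // 43
}
```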

Q\What is the importance of a copy constructor and how to write one?

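A\ A copy constructor is important when class instances need deep copying, for example when the class owns heap memory through a raw pointer and the default member-wise copy would leave two objects sharing it. A copy constructor takes a const reference to an instance of the same class as its argument.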
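
A minimal sketch of writing one, which also counts when it runs. Box and its copies counter are hypothetical and exist only for illustration; copy assignment is deleted to keep the sketch short.

```cpp
#include <cassert>

class Box {
public:
    explicit Box(int v) : value_(new int(v)) {}

    // Copy constructor: takes a const reference to the source and allocates
    // a fresh int instead of copying the pointer (deep copy).
    Box(const Box &other) : value_(new int(*other.value_)) { ++copies; }

    Box &operator=(const Box &) = delete;   // left out of this sketch
    ~Box() { delete value_; }

    int get() const { return *value_; }
    void set(int v) { *value_ = v; }

    static int copies;   // counts copy-constructor calls, for illustration only
private:
    int *value_;
};

int Box::copies = 0;

int read_by_value(Box b) { return b.get(); }             // argument is copy-constructed
int read_by_reference(const Box &b) { return b.get(); }  // no copy

int copy_constructor_demo() {
    const int before = Box::copies;
    Box a(1);
    Box b = a;        // initialization from another Box calls the copy constructor
    b.set(2);         // changes b's int only
    assert(a.get() == 1 && b.get() == 2);

    int x = read_by_value(a);
    int y = read_by_reference(a);
    assert(x == 1 && y == 1);
    return Box::copies - before;   // 2: one for b, one for the by-value argument
}
```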

Q\How to use the STL vector container and its thrust library equivalents with host/device programming?

A\ First, unlike a C array, the name of a std::vector is NOT a pointer to its first element; use data() or &v[0] to get one.

With the standard vector container, we can create a dynamically allocated array whose size is a non-const (run-time) value, using a syntax close to that of a static array declaration. As with static arrays, deletion is automatic when the vector goes out of scope.

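size_t size = 10;

int sarray[10]; //static array: the size must be a compile-time constant

int *darray = new int[size]; //dynamic array: run-time size, manual deletion

delete [] darray;

std::vector<int> array(size); //vector: run-time size, automatic deletion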

The push_back() function:

  • adds its argument to the end of the array.
  • allocates memory in chunks, so elements can be added without a new allocation until the current capacity is exhausted.
  • then reallocates a larger block and moves (or copies) the existing elements into the new space, which is why copy and move constructors matter here.
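
The growth behaviour above can be observed through capacity(). This is a sketch with illustrative function names; the exact growth factor is implementation-defined, so only the number of capacity changes is counted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how often push_back changed the capacity while appending n ints.
std::size_t count_reallocations(std::size_t n) {
    std::vector<int> v;
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) {   // a larger block was allocated
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    return reallocations;
}

// reserve() allocates up front, so the appends below never reallocate
// and the pointer returned by data() stays valid.
bool reserve_keeps_storage(std::size_t n) {
    std::vector<int> v;
    v.reserve(n);
    const int *first = v.data();   // explicit pointer to the element storage
    for (std::size_t i = 0; i < n; ++i) v.push_back(static_cast<int>(i));
    return v.data() == first && v.size() == n;
}
```

Calling count_reallocations(1000) typically returns a small number compared with 1000, since each reallocation grows the capacity by more than one element.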

Q\What are smart pointers?

A\ Since C++11, smart pointers (std::unique_ptr and std::shared_ptr) wrap heap allocations so that they behave like stack allocations, i.e. the memory is freed automatically when the owning smart pointer goes out of scope. The smart pointer itself is declared on the stack (as a local or automatic variable) and takes ownership of the raw pointer it is initialized with.
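
A short sketch of automatic deletion on scope exit. Tracked is a hypothetical type that counts live instances so the cleanup is observable.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Tracked {
    static int alive;        // number of Tracked objects currently alive
    Tracked() { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

int smart_pointer_demo() {
    {
        std::unique_ptr<Tracked> u(new Tracked());    // sole owner of the heap object
        assert(Tracked::alive == 1);
        std::unique_ptr<Tracked> u2 = std::move(u);   // ownership can be moved, not copied
        assert(u == nullptr && u2 != nullptr);

        std::shared_ptr<Tracked> s1 = std::make_shared<Tracked>();
        std::shared_ptr<Tracked> s2 = s1;             // shared ownership, reference counted
        assert(s1.use_count() == 2);
        assert(Tracked::alive == 2);
    }   // scope exit: both heap objects are deleted, with no explicit delete
    return Tracked::alive;   // 0
}
```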

Q\When to use return by value and return by reference?

A\ Use return by value when:

  • returning variables that were declared inside the function, or by-value function arguments that were modified inside it.

Use return by address when:

  • returning a built-in array or pointer (dynamically allocated memory)
  • returning function parameters that were passed by address

Never return the address of a non-static variable that was local to the function: the variable is destroyed when the function returns, so the pointer dangles.

Use return by reference when:

  • returning a struct or class object that outlives the function call (i.e. it is not destroyed when the function returns)
  • returning a reference parameter
  • returning an element of an array that was passed into the function

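int & returnByReference(); //whatever is returned has to outlive the call (e.g. a static, a member or a reference parameter)

int returnByValue();

int value = returnByReference(); //Correct: the referenced value is copied

int &ref = returnByValue(); //Not correct: a non-const reference cannot bind to a temporary

const int &cref = returnByValue(); //Correct: the const reference extends the temporary's lifetime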
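
The cases above can be collected into a runnable sketch. The function names are illustrative, and the commented-out lines mark code that is wrong or would not compile.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

int make_value() { int local = 7; return local; }   // by value: the local is copied out

// By reference: v belongs to the caller, so the element outlives the call.
int &element_at(std::vector<int> &v, std::size_t i) { return v[i]; }

// By reference to a static: the variable also outlives the call.
int &counter() {
    static int count = 0;
    return count;
}

// int &dangling() { int local = 7; return local; }   // wrong: local is destroyed at return

int return_demo() {
    std::vector<int> v = {1, 2, 3};
    element_at(v, 1) = 20;           // the returned reference writes into v
    assert(v[1] == 20);

    int copy = element_at(v, 1);     // a reference return copied into a value
    copy += 1;                       // does not affect v
    assert(copy == 21 && v[1] == 20);

    counter() = 5;                   // writes to the static through the reference
    assert(counter() == 5);

    const int &kept = make_value();  // const reference extends the temporary's lifetime
    // int &bad = make_value();      // would not compile: non-const reference to a temporary
    return v[1] + kept;              // 27
}
```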