[TVM] Basics

import logging
import sys

import numpy as np
import tvm

from tvm import autotvm  # autotvm is TVM's auto-tuning module

Basic Flow

TVM has placeholders and operations, which makes it look much like other declarative deep learning frameworks. On top of that there is create_schedule, which creates a schedule for the computation.

define_knob + split

The schedule is created in two stages. (1) Get the config object and define the search space. With five candidates per knob, the space has 5×5 = 25 points.

# get the config object
cfg = autotvm.get_config()

# define search space
cfg.define_knob("tile_y", [1, 2, 4, 8, 16])
cfg.define_knob("tile_x", [1, 2, 4, 8, 16])

(2) Schedule according to an entity in the space.

# yo, yi = s[C].split(y, cfg['tile_y'].val)
# xo, xi = s[C].split(x, cfg['tile_x'].val)
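To see where the 25 comes from, here is a minimal sketch in plain Python (no TVM needed) that enumerates the same two candidate lists. The names tile_y_candidates, tile_x_candidates, and search_space are made up for illustration and are not part of the autotvm API.

```python
from itertools import product

# Same candidate lists as the two define_knob calls above.
tile_y_candidates = [1, 2, 4, 8, 16]
tile_x_candidates = [1, 2, 4, 8, 16]

# Every (tile_y, tile_x) pair is one point in the search space.
search_space = list(product(tile_y_candidates, tile_x_candidates))

print(len(search_space))  # 25
```

Each tuple corresponds to one entity that the two split calls could be scheduled with.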

define_split + apply

Another (better) method is to (1) get the config object and define a split knob, which enumerates all the possible ways to split an axis and constructs the space from them. For an axis of length 32, the candidates run from (1, 32) to (32, 1), 6 in total.

# get the config object
cfg = autotvm.get_config()

# define search space --> each entry in cfg is SplitEntity
cfg.define_split("tile_y", y, num_outputs=2)
cfg.define_split("tile_x", x, num_outputs=2)

(2) Schedule according to an entity in the space.

# yo, yi = cfg["tile_y"].apply(s, C, y)
# xo, xi = cfg["tile_x"].apply(s, C, x)
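The count of 6 can be checked with a small stand-alone sketch. It assumes an axis of length 32 (which is what the (1,32) to (32,1) range suggests) and simply lists every factor pair. The helper two_way_splits is hypothetical and only mimics the idea of a two-output split space, not how autotvm stores it internally.

```python
def two_way_splits(extent):
    """Return all (outer, inner) pairs with outer * inner == extent."""
    return [(f, extent // f) for f in range(1, extent + 1) if extent % f == 0]


splits = two_way_splits(32)
print(splits)       # [(1, 32), (2, 16), (4, 8), (8, 4), (16, 2), (32, 1)]
print(len(splits))  # 6
```

With two such knobs (tile_y and tile_x) on axes of this length, the combined space would have 6×6 = 36 points.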


AutoTVM provides four tuners: RandomTuner, GridSearchTuner, GATuner (genetic algorithm), and XGBTuner.

GridSearch Tuner

next_batch() uses the tuner’s counter as an index to fetch the next config and appends it to ret. The configs in ret are then tested one by one.
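A rough sketch of that counter logic, written from the description above rather than copied from TVM. The class name GridSearchSketch is hypothetical, and it returns plain indices where the real tuner would return config objects.

```python
class GridSearchSketch:
    """Walks the search space in index order, one batch at a time."""

    def __init__(self, space_size):
        self.space_size = space_size
        self.counter = 0  # index of the next config to hand out

    def has_next(self):
        return self.counter < self.space_size

    def next_batch(self, batch_size):
        ret = []
        for _ in range(batch_size):
            if self.counter >= self.space_size:
                break  # search space exhausted
            ret.append(self.counter)
            self.counter += 1
        return ret


tuner = GridSearchSketch(space_size=5)
print(tuner.next_batch(3))  # [0, 1, 2]
print(tuner.next_batch(3))  # [3, 4]
```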

Random Tuner

next_batch() picks a random index and never visits the same config twice: each new random index is checked against the entries in the visited set.
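The same idea as a minimal sketch, again based only on the description above. RandomSketch and its seed argument are hypothetical, and indices stand in for configs.

```python
import random


class RandomSketch:
    """Hands out random indices without ever repeating one."""

    def __init__(self, space_size, seed=None):
        self.space_size = space_size
        self.visited = set()
        self.rng = random.Random(seed)

    def next_batch(self, batch_size):
        ret = []
        # Stop early once every index in the space has been visited.
        while len(ret) < batch_size and len(self.visited) < self.space_size:
            index = self.rng.randrange(self.space_size)
            if index in self.visited:
                continue  # already tried, draw again
            self.visited.add(index)
            ret.append(index)
        return ret


tuner = RandomSketch(space_size=25, seed=0)
first = tuner.next_batch(10)
second = tuner.next_batch(20)  # only 15 configs are left
print(len(first), len(second))
```

Rejection sampling like this gets slow when the space is almost exhausted, since most draws hit the visited set.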

GA Tuner

point2knob converts an index into a knob vector
knob2point converts a knob vector back into an index
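The two conversions amount to mixed-radix encoding and decoding. Below is a minimal sketch of how they can be written, assuming dims holds the number of candidates per knob and the first knob varies fastest. It is an illustration of the idea, not a copy of TVM's source.

```python
def point2knob(point, dims):
    """Convert a flat index into a knob vector (one entry per knob)."""
    knob = []
    for dim in dims:
        knob.append(point % dim)
        point //= dim
    return knob


def knob2point(knob, dims):
    """Convert a knob vector back into its flat index."""
    point = 0
    stride = 1
    for value, dim in zip(knob, dims):
        point += value * stride
        stride *= dim
    return point


dims = [5, 5]  # e.g. tile_y and tile_x with five candidates each
print(point2knob(7, dims))       # [2, 1]
print(knob2point([2, 1], dims))  # 7
```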

At initialization, it builds a list of pop_size genes.

Each next_batch() call picks batch_size genes, which are run as experiments using measure_batch() to get their times as output.

After every batch, the tuner goes through update(). This just appends scores until the whole population of pop_size genes has been tested. Once all pop_size genes have been tested, it picks the best elite_num genes from the ones measured, using np.argpartition. Out of them, it (1) samples two genes using their scores as probabilities, (2) mixes the two together into a tmp_gene, repeating until there are pop_size of them, and (3) mutates some dimensions of the knob.
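Here is a rough sketch of one generation step as described above. It follows these notes rather than TVM's source, so details may differ. It uses only the standard library, so heapq.nlargest stands in for np.argpartition, and the function name next_generation and the mutation_prob parameter are assumptions made for illustration. Higher scores are treated as better.

```python
import heapq
import random


def next_generation(genes, scores, dims, elite_num=3, mutation_prob=0.1, rng=None):
    """Build a new population from measured genes (higher score = better)."""
    rng = rng or random.Random()
    pop_size = len(genes)

    # Keep the elite_num best genes (heapq.nlargest replaces np.argpartition).
    elite_idx = heapq.nlargest(elite_num, range(pop_size), key=lambda i: scores[i])
    elites = [genes[i] for i in elite_idx]
    weights = [scores[i] + 1e-8 for i in elite_idx]  # avoid all-zero weights

    children = []
    for _ in range(pop_size):
        # (1) sample two parents with probability proportional to score
        #     (with replacement, so both parents may be the same gene)
        p1, p2 = rng.choices(elites, weights=weights, k=2)
        # (2) single-point crossover
        point = rng.randrange(len(dims))
        child = p1[:point] + p2[point:]
        # (3) mutate each dimension with a small probability
        child = [rng.randrange(dim) if rng.random() < mutation_prob else value
                 for value, dim in zip(child, dims)]
        children.append(child)
    return elites, children


rng = random.Random(0)
dims = [5, 5, 3]
genes = [[rng.randrange(d) for d in dims] for _ in range(8)]
scores = [float(i) for i in range(8)]  # made-up scores for the demo
elites, children = next_generation(genes, scores, dims, rng=rng)
print(len(elites), len(children))  # 3 8
```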

As a result, it only traverses configs that are likely to be successful, so there is only a small probability of hitting an invalid one…

XGBoost Tuner

CostModel: predicts the speed of a config…
ModelOptimizer: finds optimal points of a cost model…
ModelBasedTuner: fits a cost model and uses an optimizer to find the maximums…


XGBTuner’s feature_type option selects how a config is encoded for the cost model:

itervar: use features extracted from IterVar (default)
knob: use the flattened ConfigEntity directly
curve: use sampled curve features

The details of the encoding are implemented in C++ in src/autotvm/touch_extractor.cc

Transfer Learning

Transfer learning is implemented by reading the past logs… It reads task.name, which has a form such as “topi_nn_conv2d”. Reading from a file is implemented in python/tvm/autotvm/record.py.

Invalid Configurations

MeasureResult(costs=(InstantiationError('Skipped because of invalid gpu kernel',),)

LocalBuilder -> default_build_func(measure_input, tmp_dir, **kwargs) -> _build_func_common(measure_input, **kwargs) -> gpu_verify_pass(**check_gpu), where check_gpu is the first keyword argument in **kwargs

LocalBuilder’s build calls self.executor.submit(self.build_func, inp, self.tmp_dir, **self.build_kwargs)

The verification itself ends up in src/pass/verify_gpu_code.cc


On the Firefly board… use screen or tmux to run each of the following commands…

python3 -m tvm.exec.rpc_tracker
python3 -m tvm.exec.rpc_server --tracker=[HOST_IP]:9190 --key=rk3399
