Setting Up PyCharm for TVM


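Build TVM successfully first so that the library files are generated.

The environment variables set in bashrc must be configured the same way in PyCharm, as described below.

Setting PYTHONPATH

Preferences -> Project Interpreter

Python Console settings

Under Console, check "Always show debug console".

Edit Configurations

Run -> Edit Configurations

If an NVCC-related error occurs, add /usr/local/cuda/bin to PATH as shown below.
Run which nvcc to find the installed location; the path above is the default.
The system PATH cannot be edited directly, so under "Use environment variables" at the top, add a PATH entry, paste in the full existing PATH value, and append the nvcc path at the end.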

[Screenshot 2021-03-09, 5:34:53 PM]
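
The PATH edit described above can be sketched in Python. This is only an illustration, assuming a colon-separated POSIX PATH; append_to_path is a hypothetical helper, not part of PyCharm or TVM. It builds the value to paste into the run configuration and checks whether nvcc can be found through it.

```python
import os
import shutil


def append_to_path(path_value, extra_dir, sep=os.pathsep):
    """Return path_value with extra_dir appended, unless it is already listed."""
    entries = [entry for entry in path_value.split(sep) if entry]
    if extra_dir not in entries:
        entries.append(extra_dir)
    return sep.join(entries)


if __name__ == "__main__":
    # /usr/local/cuda/bin is the default location mentioned in the text;
    # confirm the real one with `which nvcc`.
    new_path = append_to_path(os.environ.get("PATH", ""), "/usr/local/cuda/bin")
    print("PATH=" + new_path)
    # shutil.which searches the given path string for the executable.
    print("nvcc found at:", shutil.which("nvcc", path=new_path))
```

The printed PATH value is what goes into the PATH entry of the run configuration.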

When configured correctly, the changed path shows up in the XML under the .idea directory, as shown below.

[Screenshot 2021-03-09, 5:53:23 PM]


VTA on FPGA Board


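Prerequisites

TVM has been built with LLVM enabled and all path settings are complete.

Setting up the TVM VTA execution environment on a PYNQ board, and the results

Installing SSHFS on OSX

  • Mount Remote File Systems

Install SSHFS with brew install sshfs, then download and install FUSE.

Once it is installed and the file system is mounted, data is synced automatically between the host and the target.

  • sshfs xilinx@192.168.0.3:/home/xilinx pynq-z1-tvm

Build steps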

git clone --recursive https://github.com/dmlc/tvm
ssh xilinx@192.168.0.3

xilinx@pynq:~/tvm$ mkdir build
xilinx@pynq:~/tvm$ cp cmake/config.cmake build/
xilinx@pynq:~/tvm$ cp vta/config/pynq_sample.json build/vta_config.json
xilinx@pynq:~/tvm$ cd ./build/
xilinx@pynq:~/tvm/build$ cmake ..
-- The C compiler identification is GNU 7.3.0
-- The CXX compiler identification is GNU 7.3.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test SUPPORT_CXX11
-- Performing Test SUPPORT_CXX11 - Success
-- Build with RPC support...
-- Build with Graph runtime support...
-- Use VTA config /home/xilinx/tvm/build/vta_config.json
-- Build VTA runtime with target: pynq
-- Build with contrib.hybriddump
-- Configuring done
-- Generating done
-- Build files have been written to: /home/xilinx/tvm/build

make runtime vta -j2

Scanning dependencies of target tvm_runtime
[  0%] Building CXX object CMakeFiles/tvm_runtime.dir/src/runtime/builtin_fp16.cc.o
[  0%] Building CXX object CMakeFiles/tvm_runtime.dir/src/runtime/c_dsl_api.cc.o
[ 14%] Building CXX object CMakeFiles/tvm_runtime.dir/src/runtime/c_runtime_api.cc.o
...
...
[100%] Built target tvm_runtime
Scanning dependencies of target runtime
[100%] Built target runtime
Scanning dependencies of target vta
[  0%] Building CXX object CMakeFiles/vta.dir/vta/src/device_api.cc.o
[ 50%] Building CXX object CMakeFiles/vta.dir/vta/src/runtime.cc.o
[ 50%] Building CXX object CMakeFiles/vta.dir/vta/src/pynq/pynq_driver.cc.o
[100%] Linking CXX shared library libvta.so
[100%] Built target vta

Run an RPC server

xilinx@pynq:~/tvm$ sudo ./apps/pynq_rpc/start_rpc_server.sh
[sudo] password for xilinx:
INFO:RPCServer:bind to 0.0.0.0:9091

Terminate this RPC server with Ctrl+c

Setting host environment variables

# On the Host-side
export VTA_PYNQ_RPC_HOST=192.168.2.99
export VTA_PYNQ_RPC_PORT=9091
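
Before running the tests, it can help to confirm that these two variables are set and that the RPC server actually answers. The following is a minimal sketch using only the Python standard library; rpc_endpoint and is_reachable are hypothetical helper names, not part of TVM. The host value should be your board's address (the ssh and sshfs commands earlier in this post use 192.168.0.3).

```python
import os
import socket


def rpc_endpoint(env=os.environ):
    """Read the RPC host and port from the VTA_PYNQ_RPC_* variables."""
    host = env["VTA_PYNQ_RPC_HOST"]
    port = int(env["VTA_PYNQ_RPC_PORT"])
    return host, port


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    try:
        host, port = rpc_endpoint()
    except KeyError as missing:
        print("not set:", missing)
    else:
        state = "reachable" if is_reachable(host, port) else "unreachable"
        print(host, port, state)
```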

To set the target to pynq on the host PC, simply copy the JSON config file.

 # On the Host-side
 cd <tvm root>
 cp vta/config/pynq_sample.json vta_config.json
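
To check that the copied file really selects the pynq target, the JSON can be read back. This is a sketch under one assumption: that the config file stores the target under a top-level "TARGET" key. parse_vta_target is a hypothetical helper.

```python
import json
import os


def parse_vta_target(config_text):
    """Return the target named in the text of a VTA config file."""
    config = json.loads(config_text)
    # Assumption: the backend is selected by the top-level "TARGET" key.
    return config["TARGET"]


if __name__ == "__main__":
    path = "vta_config.json"  # run from <tvm root>
    if os.path.exists(path):
        with open(path) as f:
            print("target:", parse_vta_target(f.read()))
    else:
        print(path, "not found")
```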

Run test_program_rpc.py to program the FPGA.

  • Programs the VTA bitstream
  • Builds the VTA runtime

A pre-compiled bitstream matching vta_config.json is downloaded from the VTA bitstream repository.

If you modify vta_config.json, the VTA runtime generation step, which takes about 30 seconds, runs again.

python <tvm root>/vta/tests/python/pynq/test_program_rpc.py

Results

Below is the output of test_benchmark_topi_conv2d.py, a simple benchmark example, as seen on the host and on the target.

Host

 (tvm-vta) ➜  tvm git:(master) python ./vta/tests/python/integration/test_benchmark_topi_conv2d.py
 key=resnet-cfg[1]
 Conv2DWorkload(batch=1, height=56, width=56, in_filter=64, out_filter=64, hkernel=3, wkernel=3, hpad=1, wpad=1, hstride=1, wstride=1)
 ----- CONV2D CPU End-to-End Test-------
     Time cost = 0.254753 sec/op, 0.90759 GOPS
 key=resnet-cfg[2]
 Conv2DWorkload(batch=1, height=56, width=56, in_filter=64, out_filter=64, hkernel=1, wkernel=1, hpad=0, wpad=0, hstride=1, wstride=1)
 ----- CONV2D CPU End-to-End Test-------
     Time cost = 0.0364432 sec/op, 0.704936 GOPS
 key=resnet-cfg[3]
 Conv2DWorkload(batch=1, height=56, width=56, in_filter=64, out_filter=128, hkernel=3, wkernel=3, hpad=1, wpad=1, hstride=2, wstride=2)
 ----- CONV2D CPU End-to-End Test-------
     Time cost = 0.12858 sec/op, 0.899091 GOPS
 key=resnet-cfg[4]
 Conv2DWorkload(batch=1, height=56, width=56, in_filter=64, out_filter=128, hkernel=1, wkernel=1, hpad=0, wpad=0, hstride=2, wstride=2)
 ----- CONV2D CPU End-to-End Test-------
     Time cost = 0.0159981 sec/op, 0.802913 GOPS
 key=resnet-cfg[5]
 Conv2DWorkload(batch=1, height=28, width=28, in_filter=128, out_filter=128, hkernel=3, wkernel=3, hpad=1, wpad=1, hstride=1, wstride=1)
 ----- CONV2D CPU End-to-End Test-------
     Time cost = 0.25949 sec/op, 0.891023 GOPS
 key=resnet-cfg[6]
 Conv2DWorkload(batch=1, height=28, width=28, in_filter=128, out_filter=256, hkernel=3, wkernel=3, hpad=1, wpad=1, hstride=2, wstride=2)
 ----- CONV2D CPU End-to-End Test-------
     Time cost = 0.131113 sec/op, 0.881722 GOPS
 key=resnet-cfg[7]
 Conv2DWorkload(batch=1, height=28, width=28, in_filter=128, out_filter=256, hkernel=1, wkernel=1, hpad=0, wpad=0, hstride=2, wstride=2)
 ----- CONV2D CPU End-to-End Test-------
     Time cost = 0.0139933 sec/op, 0.917941 GOPS
 key=resnet-cfg[8]
 Conv2DWorkload(batch=1, height=14, width=14, in_filter=256, out_filter=256, hkernel=3, wkernel=3, hpad=1, wpad=1, hstride=1, wstride=1)
 ----- CONV2D CPU End-to-End Test-------
     Time cost = 0.265993 sec/op, 0.869237 GOPS
 key=resnet-cfg[9]
 Conv2DWorkload(batch=1, height=14, width=14, in_filter=256, out_filter=512, hkernel=3, wkernel=3, hpad=1, wpad=1, hstride=2, wstride=2)
 ----- CONV2D CPU End-to-End Test-------
     Time cost = 0.13347 sec/op, 0.866153 GOPS
 key=resnet-cfg[10]
 Conv2DWorkload(batch=1, height=14, width=14, in_filter=256, out_filter=512, hkernel=1, wkernel=1, hpad=0, wpad=0, hstride=2, wstride=2)
 ----- CONV2D CPU End-to-End Test-------
     Time cost = 0.0184653 sec/op, 0.695634 GOPS
 key=resnet-cfg[11]
 Conv2DWorkload(batch=1, height=7, width=7, in_filter=512, out_filter=512, hkernel=3, wkernel=3, hpad=1, wpad=1, hstride=1, wstride=1)
 ----- CONV2D CPU End-to-End Test-------
     Time cost = 0.435112 sec/op, 0.531383 GOPS
 key=resnet-cfg[0]
 Conv2DWorkload(batch=1, height=224, width=224, in_filter=16, out_filter=64, hkernel=7, wkernel=7, hpad=3, wpad=3, hstride=2, wstride=2)
 ----- CONV2D End-to-End Test-------
     Time cost = 0.101999 sec/op, 12.3414 GOPS
 key=resnet-cfg[1]
 Conv2DWorkload(batch=1, height=56, width=56, in_filter=64, out_filter=64, hkernel=3, wkernel=3, hpad=1, wpad=1, hstride=1, wstride=1)
 ----- CONV2D End-to-End Test-------
     Time cost = 0.0229889 sec/op, 10.0575 GOPS
 key=resnet-cfg[2]
 Conv2DWorkload(batch=1, height=56, width=56, in_filter=64, out_filter=64, hkernel=1, wkernel=1, hpad=0, wpad=0, hstride=1, wstride=1)
 ----- CONV2D End-to-End Test-------
     Time cost = 0.0194093 sec/op, 1.3236 GOPS
 key=resnet-cfg[3]
 Conv2DWorkload(batch=1, height=56, width=56, in_filter=64, out_filter=128, hkernel=3, wkernel=3, hpad=1, wpad=1, hstride=2, wstride=2)
 ----- CONV2D End-to-End Test-------
     Time cost = 0.00972201 sec/op, 11.8911 GOPS
 key=resnet-cfg[4]
 Conv2DWorkload(batch=1, height=56, width=56, in_filter=64, out_filter=128, hkernel=1, wkernel=1, hpad=0, wpad=0, hstride=2, wstride=2)
 ----- CONV2D End-to-End Test-------
     Time cost = 0.00962549 sec/op, 1.33448 GOPS
 key=resnet-cfg[5]
 Conv2DWorkload(batch=1, height=28, width=28, in_filter=128, out_filter=128, hkernel=3, wkernel=3, hpad=1, wpad=1, hstride=1, wstride=1)
 ----- CONV2D End-to-End Test-------
     Time cost = 0.0136985 sec/op, 16.8786 GOPS
  key=resnet-cfg[6]
  Conv2DWorkload(batch=1, height=28, width=28, in_filter=128, out_filter=256, hkernel=3, wkernel=3, hpad=1, wpad=1, hstride=2, wstride=2)
  ----- CONV2D End-to-End Test-------
      Time cost = 0.011236 sec/op, 10.2889 GOPS
  key=resnet-cfg[7]
  Conv2DWorkload(batch=1, height=28, width=28, in_filter=128, out_filter=256, hkernel=1, wkernel=1, hpad=0, wpad=0, hstride=2, wstride=2)
  ----- CONV2D End-to-End Test-------
      Time cost = 0.00486118 sec/op, 2.64238 GOPS
  key=resnet-cfg[8]
  Conv2DWorkload(batch=1, height=14, width=14, in_filter=256, out_filter=256, hkernel=3, wkernel=3, hpad=1, wpad=1, hstride=1, wstride=1)
  ----- CONV2D End-to-End Test-------
      Time cost = 0.0140004 sec/op, 16.5147 GOPS
  key=resnet-cfg[9]
  Conv2DWorkload(batch=1, height=14, width=14, in_filter=256, out_filter=512, hkernel=3, wkernel=3, hpad=1, wpad=1, hstride=2, wstride=2)
  ----- CONV2D End-to-End Test-------
      Time cost = 0.0111904 sec/op, 10.3308 GOPS
  key=resnet-cfg[10]
  Conv2DWorkload(batch=1, height=14, width=14, in_filter=256, out_filter=512, hkernel=1, wkernel=1, hpad=0, wpad=0, hstride=2, wstride=2)
  ----- CONV2D End-to-End Test-------
      Time cost = 0.00519472 sec/op, 2.47272 GOPS
  key=resnet-cfg[11]
  Conv2DWorkload(batch=1, height=7, width=7, in_filter=512, out_filter=512, hkernel=3, wkernel=3, hpad=1, wpad=1, hstride=1, wstride=1)
  ----- CONV2D End-to-End Test-------
      Time cost = 0.0104386 sec/op, 22.1496 GOPS
  Save memoize result to .pkl_memoize_py3/vta.tests.test_benchmark_topi.conv2d.verify_nhwc.get_ref_data.pkl
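
The GOPS figures in the log can be reproduced from the workload shape and the measured time. The sketch below assumes each multiply-accumulate counts as two operations, which matches the logged numbers to within rounding; conv2d_gops is a hypothetical helper, not the function the benchmark itself uses.

```python
def conv2d_gops(height, width, in_filter, out_filter, hkernel, wkernel,
                hpad, wpad, hstride, wstride, seconds, batch=1):
    """GOPS of one conv2d run, counting each multiply-accumulate as 2 ops."""
    out_h = (height + 2 * hpad - hkernel) // hstride + 1
    out_w = (width + 2 * wpad - wkernel) // wstride + 1
    ops = 2 * batch * out_h * out_w * hkernel * wkernel * in_filter * out_filter
    return ops / seconds / 1e9


# resnet-cfg[1] from the log above: the CPU run, then the VTA run.
cpu = conv2d_gops(56, 56, 64, 64, 3, 3, 1, 1, 1, 1, seconds=0.254753)
vta = conv2d_gops(56, 56, 64, 64, 3, 3, 1, 1, 1, 1, seconds=0.0229889)
print(f"CPU {cpu:.3f} GOPS, VTA {vta:.3f} GOPS, speedup {vta / cpu:.1f}x")
```

Because the workload is identical in both runs, the speedup is simply the ratio of the two time costs.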

target RPC server

INFO:RPCServer:Finish serving ('192.168.0.2', 51718)
INFO:RPCServer:connection from ('192.168.0.2', 51733)
INFO:root:Program FPGA with 1x16x16_8bx8b_15_15_18_17_100MHz_8ns_v0_0_0.bit
INFO:RPCServer:Finish serving ('192.168.0.2', 51733)
INFO:RPCServer:connection from ('192.168.0.2', 51737)
INFO:root:Skip reconfig_runtime due to same config.
INFO:RPCServer:Finish serving ('192.168.0.2', 51737)
INFO:RPCServer:connection from ('192.168.0.2', 51738)
INFO:root:Loading VTA library: /home/xilinx/tvm/vta/python/vta/../../../build/libvta.so
INFO:RPCServer:load_module /tmp/tmpmsdnt4p7/conv2d.o
INFO:root:Loading VTA library: /home/xilinx/tvm/vta/python/vta/../../../build/libvta.so
INFO:RPCServer:load_module /tmp/tmpmsdnt4p7/conv2d.o
INFO:root:Loading VTA library: /home/xilinx/tvm/vta/python/vta/../../../build/libvta.so
INFO:RPCServer:load_module /tmp/tmpmsdnt4p7/conv2d.o
INFO:root:Loading VTA library: /home/xilinx/tvm/vta/python/vta/../../../build/libvta.so
INFO:RPCServer:load_module /tmp/tmpmsdnt4p7/conv2d.o
INFO:root:Loading VTA library: /home/xilinx/tvm/vta/python/vta/../../../build/libvta.so
INFO:RPCServer:load_module /tmp/tmpmsdnt4p7/conv2d.o
INFO:root:Loading VTA library: /home/xilinx/tvm/vta/python/vta/../../../build/libvta.so
INFO:RPCServer:load_module /tmp/tmpmsdnt4p7/conv2d.o
INFO:root:Loading VTA library: /home/xilinx/tvm/vta/python/vta/../../../build/libvta.so
INFO:RPCServer:load_module /tmp/tmpmsdnt4p7/conv2d.o
INFO:root:Loading VTA library: /home/xilinx/tvm/vta/python/vta/../../../build/libvta.so
INFO:RPCServer:load_module /tmp/tmpmsdnt4p7/conv2d.o
INFO:root:Loading VTA library: /home/xilinx/tvm/vta/python/vta/../../../build/libvta.so
INFO:RPCServer:load_module /tmp/tmpmsdnt4p7/conv2d.o
INFO:root:Loading VTA library: /home/xilinx/tvm/vta/python/vta/../../../build/libvta.so
INFO:RPCServer:load_module /tmp/tmpmsdnt4p7/conv2d.o
INFO:root:Loading VTA library: /home/xilinx/tvm/vta/python/vta/../../../build/libvta.so
INFO:RPCServer:load_module /tmp/tmpmsdnt4p7/conv2d.o
Initialize VTACommandHandle...
Close VTACommandhandle...
INFO:RPCServer:Finish serving ('192.168.0.2', 51738)
INFO:RPCServer:connection from ('192.168.0.2', 51740)
INFO:root:Loading VTA library: /home/xilinx/tvm/vta/python/vta/../../../build/libvta.so
INFO:RPCServer:load_module /tmp/tmpvu58ss03/conv2d.o
Initialize VTACommandHandle...
INFO:root:Loading VTA library: /home/xilinx/tvm/vta/python/vta/../../../build/libvta.so
INFO:RPCServer:load_module /tmp/tmpvu58ss03/conv2d.o
INFO:root:Loading VTA library: /home/xilinx/tvm/vta/python/vta/../../../build/libvta.so
INFO:RPCServer:load_module /tmp/tmpvu58ss03/conv2d.o
INFO:root:Loading VTA library: /home/xilinx/tvm/vta/python/vta/../../../build/libvta.so
INFO:RPCServer:load_module /tmp/tmpvu58ss03/conv2d.o
INFO:root:Loading VTA library: /home/xilinx/tvm/vta/python/vta/../../../build/libvta.so
INFO:RPCServer:load_module /tmp/tmpvu58ss03/conv2d.o
INFO:root:Loading VTA library: /home/xilinx/tvm/vta/python/vta/../../../build/libvta.so
INFO:RPCServer:load_module /tmp/tmpvu58ss03/conv2d.o
INFO:root:Loading VTA library: /home/xilinx/tvm/vta/python/vta/../../../build/libvta.so
INFO:RPCServer:load_module /tmp/tmpvu58ss03/conv2d.o
INFO:root:Loading VTA library: /home/xilinx/tvm/vta/python/vta/../../../build/libvta.so
INFO:RPCServer:load_module /tmp/tmpvu58ss03/conv2d.o
INFO:root:Loading VTA library: /home/xilinx/tvm/vta/python/vta/../../../build/libvta.so
INFO:RPCServer:load_module /tmp/tmpvu58ss03/conv2d.o
INFO:root:Loading VTA library: /home/xilinx/tvm/vta/python/vta/../../../build/libvta.so
INFO:RPCServer:load_module /tmp/tmpvu58ss03/conv2d.o
INFO:root:Loading VTA library: /home/xilinx/tvm/vta/python/vta/../../../build/libvta.so
INFO:RPCServer:load_module /tmp/tmpvu58ss03/conv2d.o
INFO:root:Loading VTA library: /home/xilinx/tvm/vta/python/vta/../../../build/libvta.so
INFO:RPCServer:load_module /tmp/tmpvu58ss03/conv2d.o
Close VTACommandhandle...
INFO:RPCServer:Finish serving ('192.168.0.2', 51740)

VTA FPGA Toolchain Installation

If you want to modify the synthesized design that runs on the FPGA itself, you need to install the Xilinx IDE; follow the procedure in the document below.
https://docs.tvm.ai/vta/install.html#vta-fpga-toolchain-installation

Troubleshooting

Cannot find the files: libtvm.dylib

python3 vta/tests/python/pynq/test_program_rpc.py
Error message

Traceback (most recent call last):
  File "vta/tests/python/pynq/test_program_rpc.py", line 2, in <module>
    import tvm
  File "/Users/jeminlee/development/pynq-z1-tvm/tvm/python/tvm/__init__.py", line 5, in <module>
    from . import tensor
  File "/Users/jeminlee/development/pynq-z1-tvm/tvm/python/tvm/tensor.py", line 4, in <module>
    from ._ffi.node import NodeBase, NodeGeneric, register_node, convert_to_node
  File "/Users/jeminlee/development/pynq-z1-tvm/tvm/python/tvm/_ffi/node.py", line 8, in <module>
    from .node_generic import NodeGeneric, convert_to_node, const
  File "/Users/jeminlee/development/pynq-z1-tvm/tvm/python/tvm/_ffi/node_generic.py", line 7, in <module>
    from .base import string_types
  File "/Users/jeminlee/development/pynq-z1-tvm/tvm/python/tvm/_ffi/base.py", line 48, in <module>
    _LIB, _LIB_NAME = _load_lib()
  File "/Users/jeminlee/development/pynq-z1-tvm/tvm/python/tvm/_ffi/base.py", line 39, in _load_lib
    lib_path = libinfo.find_lib_path()
  File "/Users/jeminlee/development/pynq-z1-tvm/tvm/python/tvm/_ffi/libinfo.py", line 93, in find_lib_path
    raise RuntimeError(message)
RuntimeError: Cannot find the files.
List of candidates:
/Users/jeminlee/development/pynq-z1-tvm/tvm/python/tvm/libtvm.dylib
/Users/jeminlee/development/pynq-z1-tvm/tvm/build/libtvm.dylib
/Users/jeminlee/development/pynq-z1-tvm/tvm/build/Release/libtvm.dylib
/Users/jeminlee/development/pynq-z1-tvm/tvm/lib/libtvm.dylib
/Users/jeminlee/development/pynq-z1-tvm/libtvm.dylib
/Users/jeminlee/development/pynq-z1-tvm/tvm/python/tvm/libtvm_runtime.dylib
/Users/jeminlee/development/pynq-z1-tvm/tvm/build/libtvm_runtime.dylib
/Users/jeminlee/development/pynq-z1-tvm/tvm/build/Release/libtvm_runtime.dylib
/Users/jeminlee/development/pynq-z1-tvm/tvm/lib/libtvm_runtime.dylib
/Users/jeminlee/development/pynq-z1-tvm/libtvm_runtime.dylib

Approaches tried
Installed Python 3.7 with conda and ran the script again
Rebuilt TVM on the host with llvm-6 and ran the script again

no module named vta.testing

Error message
python ./vta/tests/python/integration/test_benchmark_topi_conv2d.py
Traceback (most recent call last):
  File "./vta/tests/python/integration/test_benchmark_topi_conv2d.py", line 10, in <module>
    import vta.testing
ModuleNotFoundError: No module named 'vta.testing'

Solution
Run the following in the shell, then run the script again.
export PYTHONPATH=/Users/jeminlee/development/pynq-z1-tvm/tvm/vta/python:${PYTHONPATH}

The environment variable had been set incorrectly.

  • Before: /Users/jeminlee/development/pynq-z1-tvm/tvm/vta/tests/python
  • After: /Users/jeminlee/development/pynq-z1-tvm/tvm/vta/python
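
A quick way to catch this kind of mistake is to check which PYTHONPATH entries actually contain the package being imported. This is a minimal sketch; provides_package and check_pythonpath are hypothetical helpers that only look for a matching directory and do not import anything.

```python
import os


def provides_package(entry, package):
    """Return True if the directory `entry` contains the dotted `package`."""
    return os.path.isdir(os.path.join(entry, *package.split(".")))


def check_pythonpath(pythonpath, package, sep=os.pathsep):
    """Map each PYTHONPATH entry to whether it provides `package`."""
    return {entry: provides_package(entry, package)
            for entry in pythonpath.split(sep) if entry}


if __name__ == "__main__":
    report = check_pythonpath(os.environ.get("PYTHONPATH", ""), "vta.testing")
    for entry, ok in report.items():
        print("OK     " if ok else "missing", entry)
```

With the incorrect setting, no entry would be reported as OK for vta.testing.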

References

How to set up minicom
https://www.nengo.ai/nengo-pynq/connect.html



PYNQ: Python productivity on ZYNQ


Overview

The PYNQ-Z1 board is designed to be used with PYNQ, a new open-source framework that enables embedded programmers to exploit the capabilities of Xilinx Zynq All Programmable SoCs (APSoCs) without having to design programmable logic circuits.

  • ZYNQ XC7Z020-1CLG400C

    • 650MHz dual-core Cortex-A9 processor
    • DDR3 memory controller with 8 DMA channels and 4 High Performance AXI3 Slave ports
    • High-bandwidth peripheral controllers: 1G Ethernet, USB 2.0, SDIO
    • Low-bandwidth peripheral controller: SPI, UART, CAN, I2C
    • Programmable from JTAG, Quad-SPI flash, and microSD card
    • Programmable logic equivalent to Artix-7 FPGA
      • 13,300 logic slices, each with four 6-input LUTs and 8 flip-flops
      • 630 KB of fast block RAM
      • 4 clock management tiles, each with a phase-locked loop (PLL) and mixed-mode clock manager (MMCM)
      • 220 DSP slices
      • On-chip analog-to-digital converter (XADC)
  • 512 MB DDR3

  • Wide range of USB, Ethernet, Video and Audio connectivity
  • Arduino shield and Pmod connectors for adding-on hardware devices
  • Programmable from JTAG, Quad-SPI flash, and microSD card

  • Key FPGA Specifications
    [Screenshot 2019-03-14, 9:54:17 AM]

[Screenshot 2019-03-19, 9:24:28 AM]

Specifications of other, more expensive FPGA boards
ZYNQ-ZCU104 (about 1,000,000 KRW)

  • Logic slices 504,000
  • memory 38MB
  • DSP slices 1,728

ZYNQ-ZCU104 (about 10,000,000 KRW)

  • Logic slices 930,000
  • memory 60.5MB
  • DSP slices 4,272

Download the board image
http://www.pynq.io/board

Board comparison
[Screenshot 2019-03-11, 8:07:32 PM]

Flashing the disk image to a microSD card with the dd command

Official documentation
https://pynq.readthedocs.io/en/v1.3/17_appendix.html

Check and unmount (use the disk number that diskutil list reports)
diskutil list
diskutil unmountDisk /dev/disk

Flashing
sudo dd bs=1m if=pynq_z1_v2.4.img of=/dev/rdisk4

Progress can be checked along the way with Ctrl+T (the SIGINFO signal).

load: 7.21  cmd: dd 6329 uninterruptible 0.00u 0.00s
30+0 records in
29+0 records out
30408704 bytes transferred in 3.649711 secs (8331812 bytes/sec)
load: 7.21  cmd: dd 6329 uninterruptible 0.00u 0.01s
36+0 records in
35+0 records out
36700160 bytes transferred in 4.396652 secs (8347297 bytes/sec)
load: 5.83  cmd: dd 6329 uninterruptible 0.00u 0.29s
1033+0 records in
1032+0 records out
1082130432 bytes transferred in 128.452542 secs (8424360 bytes/sec)
load: 6.31  cmd: dd 6329 uninterruptible 0.00u 0.53s
1885+0 records in
1884+0 records out
1975517184 bytes transferred in 235.902680 secs (8374289 bytes/sec)
5401+1 records in
5401+1 records out
5664169984 bytes transferred in 679.525499 secs (8335478 bytes/sec)
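
Each progress report carries enough information to estimate the time left. The arithmetic is sketched below with values taken from the log above; remaining_seconds is a hypothetical helper, and the total size is taken from the final line of the log (in practice it would be the size of the image file).

```python
def remaining_seconds(total_bytes, transferred_bytes, bytes_per_sec):
    """Estimate the time left for a copy running at a steady rate."""
    return (total_bytes - transferred_bytes) / bytes_per_sec


# Final size, then the byte count and rate from the fourth progress report.
eta = remaining_seconds(5664169984, 1975517184, 8374289)
print(f"about {eta / 60:.1f} minutes left")
```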

If the flash succeeded, the partition layout changes as shown below.
diskutil list

/dev/disk4 (internal, physical):
  #:                       TYPE NAME                    SIZE       IDENTIFIER
  0:     FDisk_partition_scheme                        *7.8 GB     disk4
  1:                 DOS_FAT_32                         7.8 GB     disk4s1
/dev/disk4 (internal, physical):
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:     FDisk_partition_scheme                        *7.8 GB     disk4
   1:             Windows_FAT_32 NO NAME                 104.9 MB   disk4s1
   2:                      Linux                         5.6 GB     disk4s2

Getting Started with PYNQ support

OS version information after installation

xilinx@pynq:~$ lsb_release -a
No LSB modules are available.
Distributor ID:    pynqlinux
Description:    PYNQ Linux, based on Ubuntu 18.04
Release:    v2.4
Codename:    provo

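Basic connection information

Official site: https://pynq.readthedocs.io/en/latest/getting_started.html

The IP address depends on your router.

Default account / password: xilinx / xilinx

Jupyter Notebook is available.

Accessing the Samba shared file system (OSX)

To connect to the share, in Finder choose Go -> Connect to Server and enter the URL.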

  • mac/linux: smb://pynq/xilinx

[Screenshot 2019-03-13, 2:57:44 PM]

Reference sites

  • pynq.io
  • pynq.readthedocs.org
  • github.com/Xilinx/pynq
  • digilentinc.com/pynq
  • pynq.io/support



How to Install TVM


Use the --recursive option so that all submodules are downloaded as well.

git clone --recursive https://github.com/dmlc/tvm

Build the Shared Library

The goal here is to build the shared libraries.
The shared library file names differ by operating system.

Linux: libtvm.so, libtvm_topi.so
OSX: libtvm.dylib, libtvm_topi.dylib
Windows: libtvm.dll, libtvm_topi.dll
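
The mapping above can be written as a small lookup, which is handy when a script needs to check that the build produced the expected files. This is only a sketch; tvm_library_names is a hypothetical helper and the platform strings are the values Python reports in sys.platform.

```python
import sys


def tvm_library_names(platform=sys.platform):
    """Return the shared library file names listed above for a platform."""
    if platform.startswith("linux"):
        ext = "so"
    elif platform == "darwin":
        ext = "dylib"
    elif platform.startswith("win"):
        ext = "dll"
    else:
        raise ValueError(f"unsupported platform: {platform}")
    return [f"libtvm.{ext}", f"libtvm_topi.{ext}"]


if __name__ == "__main__":
    print(tvm_library_names("darwin"))
```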

sudo apt-get update
sudo apt-get install -y python python-dev python-setuptools gcc libtinfo-dev zlib1g-dev build-essential cmake

Minimum requirements before installation

  • C++11 (g++ 4.8 or higher)
  • CMake 3.5 or higher

    cmake --version

  • Install LLVM (build)
  • If you use CUDA or OpenCL, LLVM does not have to be installed.
  • If you want to use the NNVM compiler, LLVM must be installed.

Build library configuration

  • Edit the config.cmake file to select the libraries used for the build.
  • On macOS, if an error occurs, adding -lc++abi to LDFLAGS in Xcode may fix it.
  • To use the CUDA backend, change set(USE_CUDA OFF) to set(USE_CUDA ON).

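LLVM dependency
Installing through apt gives a version lower than 4.0, so to use LLVM you have to build it yourself.

Because the build takes a long time, download a pre-built release instead.

After unpacking it, set set(USE_LLVM /path/to/your/llvm/bin/llvm-config) in build/config.cmake.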
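
The config.cmake changes for CUDA and LLVM can also be scripted instead of edited by hand. The following is a minimal sketch that rewrites set(NAME ...) lines in the file's text; set_cmake_option is a hypothetical helper, and the sample text stands in for the real build/config.cmake.

```python
import re


def set_cmake_option(text, name, value):
    """Rewrite `set(NAME ...)` in config.cmake text to `set(NAME value)`."""
    pattern = re.compile(
        r"^(\s*set\(\s*" + re.escape(name) + r"\s+)[^)]*\)", re.MULTILINE)
    # A function replacement avoids backslash handling in `value`.
    new_text, count = pattern.subn(lambda m: m.group(1) + value + ")", text)
    if count == 0:
        raise KeyError(f"{name} not found")
    return new_text


# Stand-in for the contents of build/config.cmake.
sample = "set(USE_CUDA OFF)\nset(USE_LLVM OFF)\n"
sample = set_cmake_option(sample, "USE_CUDA", "ON")
sample = set_cmake_option(sample, "USE_LLVM", "/path/to/your/llvm/bin/llvm-config")
print(sample)
```

Raising an error when the option is missing avoids a silent no-op if the option name is mistyped.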

On the 1080 PC

default mode
-> without LLVM

Build log

jemin@jemin-desktop:~/Users/jemin/tvm$ make -j4
-- The C compiler identification is GNU 5.4.0
-- The CXX compiler identification is GNU 5.4.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test SUPPORT_CXX11
-- Performing Test SUPPORT_CXX11 - Success
-- Build with RPC support...
-- Build with Graph runtime support...
-- Build VTA runtime with target: sim
-- Configuring done
-- Generating done
-- Build files have been written to: /home/jemin/Users/jemin/tvm/build
...
...
...
[ 99%] Building CXX object CMakeFiles/nnvm_compiler.dir/nnvm/src/top/vision/yolo/reorg.cc.o
[100%] Linking CXX shared library libnnvm_compiler.so
make[3]: Leaving directory '/home/jemin/Users/jemin/tvm/build'
[100%] Built target nnvm_compiler
make[2]: Leaving directory '/home/jemin/Users/jemin/tvm/build'
make[1]: Leaving directory '/home/jemin/Users/jemin/tvm/build'

LLVM ON

-- Build with RPC support...
-- Build with Graph runtime support...
-- Build VTA runtime with target: sim
-- Use llvm-config=/home/jemin/Users/jemin/clang+llvm-7.0.1-x86_64-linux-gnu-ubuntu-16.04/bin/llvm-config
-- /home/jemin/Users/jemin/clang+llvm-7.0.1-x86_64-linux-gnu-ubuntu-16.04/include
-- Found LLVM_INCLUDE_DIRS=/home/jemin/Users/jemin/clang+llvm-7.0.1-x86_64-linux-gnu-ubuntu-16.04/include
-- Found LLVM_DEFINITIONS= -DNDEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
-- Found TVM_LLVM_VERSION=70
-- Build with LLVM
-- Set TVM_LLVM_VERSION=70
-- Configuring done
-- Generating done
-- Build files have been written to: /home/jemin/Users/jemin/tvm/build

CUDA ON on Xavier

nvidia@jetson-0423718017159:~/tvm$ make -j4
-- The C compiler identification is GNU 7.3.0
-- The CXX compiler identification is GNU 7.3.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test SUPPORT_CXX11
-- Performing Test SUPPORT_CXX11 - Success
-- Build with RPC support...
-- Build with Graph runtime support...
-- Build VTA runtime with target: sim
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Found CUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-10.0
-- Found CUDA_CUDA_LIBRARY=/usr/local/cuda-10.0/lib64/stubs/libcuda.so
-- Found CUDA_CUDART_LIBRARY=/usr/local/cuda-10.0/lib64/libcudart.so
-- Found CUDA_NVRTC_LIBRARY=/usr/local/cuda-10.0/lib64/libnvrtc.so
-- Found CUDA_CUDNN_LIBRARY=/usr/lib/aarch64-linux-gnu/libcudnn.so
-- Found CUDA_CUBLAS_LIBRARY=/usr/local/cuda-10.0/lib64/libcublas.so
-- Build with CUDA support
-- Configuring done
-- Generating done
....
....
[ 99%] Building CXX object CMakeFiles/nnvm_compiler.dir/nnvm/src/top/vision/yolo/reorg.cc.o
[100%] Linking CXX shared library libnnvm_compiler.so
make[3]: Leaving directory /home/nvidia/tvm/build
[100%] Built target nnvm_compiler
make[2]: Leaving directory /home/nvidia/tvm/build
make[1]: Leaving directory /home/nvidia/tvm/build

Installing on OSX

Download a pre-built version of LLVM.
Version 7.0: http://releases.llvm.org/7.0.0/clang+llvm-7.0.0-x86_64-apple-darwin.tar.xz
In the unpacked directory, check that bin/llvm-config runs correctly.

mkdir build
cp cmake/config.cmake build
cd build
cmake ..
-- The C compiler identification is AppleClang 10.0.0.10001044
-- The CXX compiler identification is AppleClang 10.0.0.10001044
-- Check for working C compiler: /Library/Developer/CommandLineTools/usr/bin/cc
-- Check for working C compiler: /Library/Developer/CommandLineTools/usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /Library/Developer/CommandLineTools/usr/bin/c++
-- Check for working CXX compiler: /Library/Developer/CommandLineTools/usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test SUPPORT_CXX11
-- Performing Test SUPPORT_CXX11 - Success
-- Build with RPC support...
-- Build with Graph runtime support...
-- Build VTA runtime with target: sim
-- Use llvm-config=/Users/jeminlee/development/llvm/clang+llvm-7.0.0-x86_64-apple-darwin/bin/llvm-config
-- /Users/jeminlee/development/llvm/clang+llvm-7.0.0-x86_64-apple-darwin/include
-- Found LLVM_INCLUDE_DIRS=/Users/jeminlee/development/llvm/clang+llvm-7.0.0-x86_64-apple-darwin/include
-- Found LLVM_DEFINITIONS= -DNDEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
-- Found TVM_LLVM_VERSION=70
-- Build with LLVM
-- Set TVM_LLVM_VERSION=70
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/jeminlee/development/tvm/build

make -j8
Scanning dependencies of target vta
Scanning dependencies of target tvm_runtime
Scanning dependencies of target tvm
[  0%] Building CXX object CMakeFiles/vta.dir/vta/src/device_api.cc.o
[  1%] Building CXX object CMakeFiles/vta.dir/vta/src/sim/sim_driver.cc.o
...
...
[ 99%] Building CXX object CMakeFiles/nnvm_compiler.dir/nnvm/src/top/vision/ssd/mutibox_op.cc.o
[100%] Building CXX object CMakeFiles/nnvm_compiler.dir/nnvm/src/top/vision/yolo/region.cc.o
[100%] Building CXX object CMakeFiles/nnvm_compiler.dir/nnvm/src/top/vision/yolo/reorg.cc.o
[100%] Linking CXX shared library libtvm_topi.dylib
[100%] Built target tvm_topi
[100%] Linking CXX shared library libnnvm_compiler.dylib
[100%] Built target nnvm_compiler

Note that to enable NNVM with LLVM, edit config.cmake in the build directory and add set(USE_LLVM <path>).

Python Package Installation

There are two ways to install the Python package located in tvm/python.

Method 1 (for developers)

  • export TVM_HOME=/path/to/tvm
  • export PYTHONPATH=$TVM_HOME/python:$TVM_HOME/topi/python:$TVM_HOME/nnvm/python:${PYTHONPATH}
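
The same value can be built programmatically, for example to paste into an IDE's environment settings rather than a shell profile. This is a sketch assuming POSIX-style paths and a colon separator; tvm_pythonpath is a hypothetical helper.

```python
def tvm_pythonpath(tvm_home, existing=""):
    """Build the PYTHONPATH value from Method 1 for a given TVM_HOME."""
    entries = [
        tvm_home + "/python",
        tvm_home + "/topi/python",
        tvm_home + "/nnvm/python",
    ]
    if existing:
        # Keep whatever was already on PYTHONPATH at the end.
        entries.append(existing)
    return ":".join(entries)


print(tvm_pythonpath("/path/to/tvm"))
```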

Method 2

Python dependencies

  • Necessary dependencies
    • pip install --user numpy decorator
  • RPC Tracker
    • pip install --user tornado
  • Auto-tuning module
    • pip install --user tornado psutil xgboost
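
To see which of these dependencies are still missing from the current interpreter, the standard library's importlib can look each one up without importing it. This is a minimal sketch; missing_modules is a hypothetical helper and the grouping mirrors the list above.

```python
import importlib.util


def missing_modules(names):
    """Return the names that cannot be found by the current interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


groups = {
    "necessary": ["numpy", "decorator"],
    "rpc tracker": ["tornado"],
    "auto-tuning": ["tornado", "psutil", "xgboost"],
}
for group, names in groups.items():
    print(group, "missing:", missing_modules(names) or "none")
```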

Troubleshooting

tensor error
Switch to Python 3.x.

Model compilation error
TVM is optimized for LLVM 6.0.1, so do not use a 7.x release.

