After installing TensorFlow, I needed to install PyTorch to run HanLP. This post focuses on the PyTorch installation, where I ran into many difficulties, especially with LAPACK. (English version translated by GPT-3.5.)
Description
The entire installation is done in Docker (CentOS 7) without Conda, using Python 3.8.
GCC-10.2 is used for compilation. I want to use the HanLP Python package with PyTorch on my local machine, for learning and out of interest. I recorded the installation process, including every error I encountered. Please read the whole post before following along, so that you can avoid the same issues.
The CentOS 7 image is the official one from Docker Hub.
Create Docker Container
Create the Docker Container
Create a new container from the official CentOS 7 image, then enter it to install the necessary dependencies. Inside the container, set the root password and enable sshd:
    [root@67bce55d5a71 /]# passwd root
    Changing password for user root.
    New password:
    BAD PASSWORD: The password is shorter than 8 characters
    Retype new password:
    passwd: all authentication tokens updated successfully.
    [root@67bce55d5a71 /]# systemctl start sshd && systemctl enable sshd
    [root@67bce55d5a71 /]#
Compile GCC-10.2 (Takes about 32 minutes)
I did not test with the default GCC 4.8.5, because I ran into many GCC-related problems when compiling TensorFlow in the past. I also wanted to avoid GLIBCXX version incompatibilities.
Download and Compile GCC-10.2
Pick a mirror from the list of GCC mirror sites and navigate to the releases/gcc-10.2.0/ directory. I chose a mirror in Japan and downloaded gcc-10.2.0.tar.gz.
    wget -c http://ftp.tsukuba.wide.ad.jp/software/gcc/releases/gcc-10.2.0/gcc-10.2.0.tar.gz
    tar -zxvf gcc-10.2.0.tar.gz
    cd gcc-10.2.0
    ./configure --prefix=/usr/local/gcc-10.2
    make -j7
    make install
    [root@67bce55d5a71 download]# tar -zxvf gcc-10.2.0.tar.gz
    .....
    gcc-10.2.0/.gitattributes
    gcc-10.2.0/.dir-locals.el
    [root@67bce55d5a71 download]# cd gcc-10.2.0
    [root@67bce55d5a71 gcc-10.2.0]# ./configure --prefix=/usr/local/gcc-10.2
    checking build system type... aarch64-unknown-linux-gnu
    checking host system type... aarch64-unknown-linux-gnu
    checking target system type... aarch64-unknown-linux-gnu
    checking for a BSD-compatible install... /usr/bin/install -c
    .....
    checking whether to enable maintainer-specific portions of Makefiles... no
    configure: creating ./config.status
    config.status: creating Makefile
    [root@67bce55d5a71 gcc-10.2.0]# make -j7 && make install
    ....
    See any operating system documentation about shared libraries for
    more information, such as the ld(1) and ld.so(8) manual pages.
    ----------------------------------------------------------------------
    make[4]: Nothing to be done for `install-data-am'.
    make[4]: Leaving directory `/root/download/gcc-10.2.0/aarch64-unknown-linux-gnu/libatomic'
    make[3]: Leaving directory `/root/download/gcc-10.2.0/aarch64-unknown-linux-gnu/libatomic'
    make[2]: Leaving directory `/root/download/gcc-10.2.0/aarch64-unknown-linux-gnu/libatomic'
    make[1]: Leaving directory `/root/download/gcc-10.2.0'
    [root@67bce55d5a71 gcc-10.2.0]#
    [root@67bce55d5a71 ~]# gcc --version
    gcc (GCC) 10.2.0
    Copyright (C) 2020 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions. There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Compile Python 3.8.8
    wget -c https://www.python.org/ftp/python/3.8.8/Python-3.8.8.tgz
    tar -zxvf Python-3.8.8.tgz
    cd Python-3.8.8
Execute the Compilation
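The --enable-optimizations parameter turns on profile-guided optimization: the build runs part of Python's test suite as a profiling workload and then recompiles the interpreter using that profile. The build takes longer, but the resulting interpreter executes Python code faster.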
    ./configure --with-ssl-default-suites=openssl --enable-optimizations
    make -j7
    make install
Console output omitted as there were no errors.
Test Python Availability
    [root@67bce55d5a71 Python-3.8.8]# python3
    Python 3.8.8 (default, Mar 11 2021, 08:11:33)
    [GCC 10.2.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>>
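To confirm that the new interpreter really was built with the options passed to ./configure, you can ask it for its recorded configure arguments. The following is a minimal sketch using the standard sysconfig module; the helper names configure_args and built_with are my own, and CONFIG_ARGS is only expected to be populated on Unix-style builds like this one.

```python
import sysconfig


def configure_args():
    """Return the ./configure arguments recorded when this interpreter was built."""
    # CONFIG_ARGS comes from the build's Makefile; it can be missing on
    # non-Unix builds, so fall back to an empty string.
    return sysconfig.get_config_var("CONFIG_ARGS") or ""


def built_with(flag):
    """Report whether a given configure flag appears in the recorded arguments."""
    return flag in configure_args()


if __name__ == "__main__":
    print("configure args:", configure_args())
    print("PGO enabled:", built_with("--enable-optimizations"))
```

Run it with the freshly installed python3; if the second line prints False, the interpreter on your PATH is probably not the one you just compiled.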
Compile CMake 3.19.6
    wget -c https://github.com/Kitware/CMake/releases/download/v3.19.6/cmake-3.19.6.tar.gz
    tar -zxvf cmake-3.19.6.tar.gz
    cd cmake-3.19.6
Execute the Compilation
The first step, ./configure --no-qt-gui, is slow: it takes about 7 minutes. I recommend creating the symbolic link for the new libstdc++ beforehand (see the console error below), so that you do not have to run this step twice.
    ./configure --no-qt-gui
    gmake -j7
    make install
Console error
    /root/download/cmake-3.19.6/Bootstrap.cmk/cmake: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by /root/download/cmake-3.19.6/Bootstrap.cmk/cmake)
    /root/download/cmake-3.19.6/Bootstrap.cmk/cmake: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by /root/download/cmake-3.19.6/Bootstrap.cmk/cmake)
    ---------------------------------------------
    Error when bootstrapping CMake:
    Problem while running initial CMake
    ---------------------------------------------
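This error occurs because gcc-10.2 is installed under /usr/local/gcc-10.2, while the freshly built bootstrap cmake binary loads the system's older /lib64/libstdc++.so.6, which does not provide the GLIBCXX_3.4.20 and GLIBCXX_3.4.21 symbol versions.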
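To see the mismatch for yourself, you can list the GLIBCXX versions each libstdc++ file provides. The sketch below scans the library bytes for version strings. The helper names are my own, and the second path is an assumption about where a GCC built with --prefix=/usr/local/gcc-10.2 places its libstdc++; adjust it if your layout differs.

```python
import re
from pathlib import Path

# Matches version tags such as GLIBCXX_3.4.21 embedded in the shared library.
_GLIBCXX = re.compile(rb"GLIBCXX_(\d+(?:\.\d+)+)")


def glibcxx_versions(lib_path):
    """Return the sorted GLIBCXX versions (as integer tuples) found in a file."""
    data = Path(lib_path).read_bytes()
    found = {
        tuple(int(part) for part in match.group(1).decode().split("."))
        for match in _GLIBCXX.finditer(data)
    }
    return sorted(found)


def provides(lib_path, wanted):
    """Check whether the library provides a version such as '3.4.21'."""
    wanted_tuple = tuple(int(part) for part in wanted.split("."))
    return wanted_tuple in glibcxx_versions(lib_path)


if __name__ == "__main__":
    candidates = (
        "/lib64/libstdc++.so.6",                     # system library from the error
        "/usr/local/gcc-10.2/lib64/libstdc++.so.6",  # assumed GCC 10.2 location
    )
    for lib in candidates:
        if not Path(lib).exists():
            print(lib, "not found")
            continue
        versions = glibcxx_versions(lib)
        newest = ".".join(map(str, versions[-1])) if versions else "none"
        print(lib, "newest GLIBCXX:", newest, "has 3.4.21:", provides(lib, "3.4.21"))
```

If only the GCC 10.2 copy reports 3.4.21, the dynamic loader has to be pointed at that copy. A symbolic link or LD_LIBRARY_PATH are the usual ways to do that, though which one fits your setup is a judgment call.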
    ....
    -- Checking for curses support
    -- Checking for curses support - Failed
    -- Looking for elf.h
    -- Looking for elf.h - found
    -- Looking for a Fortran compiler
    -- Looking for a Fortran compiler - /usr/local/gcc-10.2/bin/gfortran
    -- Performing Test run_pic_test
    -- Performing Test run_pic_test - Success
    -- Performing Test run_inlines_hidden_test
    -- Performing Test run_inlines_hidden_test - Success
    -- Configuring done
    -- Generating done
    -- Build files have been written to: /root/download/cmake-3.19.6
    ---------------------------------------------
    CMake has bootstrapped. Now run gmake.
    [root@67bce55d5a71 cmake-3.19.6]# date
This time it was successful. Proceed to run gmake -j7 && make install.
Install PyTorch (Takes about 28 minutes)
Before installing PyTorch, clone its source code. The clone takes a long time, so I recommend using a VPN to speed up the download. For instructions, refer to the official GitHub repository: GitHub - pytorch/pytorch at v1.6.0
Clone PyTorch
This step recursively downloads many source packages: 34 repositories in total (the main one plus 33 submodules), about 1.03 GB.
    git clone -b v1.6.0 --recursive https://github.com/pytorch/pytorch.git
    cd pytorch
    # if you are updating an existing checkout
    git submodule sync
    git submodule update --init --recursive
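With this many submodules, an unstable connection can leave some of them uninitialised without it being obvious. The sketch below parses the output of git submodule status --recursive, where each line starts with a one-character flag: a space means the submodule is checked out as recorded, - means it is not initialised, + means it is at a different commit, and U means it has merge conflicts. The function names are my own.

```python
import subprocess


def parse_submodule_status(text):
    """Split `git submodule status` output into (ok, problems) lists of (flag, path)."""
    ok, problems = [], []
    for line in text.splitlines():
        if not line.strip():
            continue
        flag = line[0]              # ' ', '-', '+' or 'U'
        fields = line[1:].split()   # [sha, path, optional "(describe)"]
        path = fields[1] if len(fields) > 1 else "?"
        (ok if flag == " " else problems).append((flag, path))
    return ok, problems


def submodule_status(repo_dir="."):
    """Run git in repo_dir and classify every submodule."""
    result = subprocess.run(
        ["git", "submodule", "status", "--recursive"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_submodule_status(result.stdout)
```

Calling submodule_status("pytorch") after the clone should return an empty problems list; if it does not, running git submodule update --init --recursive again is the natural next step.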
A red error message appeared. On closer inspection, it was caused by a network problem: as the log shows, a download from github.com timed out. I connected to a VPN and tried again by re-running the second line of code.
    -- Build files have been written to: /tmp/pip-install-p6yq4172/ninja_2beb9619a1d947568581b78204e58ab9/_skbuild/linux-aarch64-3.8/cmake-build
    Scanning dependencies of target download_ninja_source
    [ 10%] Creating directories for 'download_ninja_source'
    [ 20%] Performing download step (download, verify and extract) for 'download_ninja_source'
    -- Downloading...
       dst='/tmp/pip-install-p6yq4172/ninja_2beb9619a1d947568581b78204e58ab9/_skbuild/linux-aarch64-3.8/cmake-build/v1.10.0.gfb670.kitware.jobserver-1.tar.gz'
       timeout='none'
       inactivity timeout='none'
    -- Using src='https://github.com/kitware/ninja/archive/v1.10.0.gfb670.kitware.jobserver-1.tar.gz'
    CMake Error at _skbuild/linux-aarch64-3.8/cmake-build/download_ninja_source-prefix/src/download_ninja_source-stamp/download-download_ninja_source.cmake:170 (message):
      Each download failed!
      error: downloading 'https://github.com/kitware/ninja/archive/v1.10.0.gfb670.kitware.jobserver-1.tar.gz' failed
      status_code: 28
      status_string: "Timeout was reached"
      log:
      --- LOG BEGIN ---
      Trying 13.250.177.223:443...
      connect to 13.250.177.223 port 443 failed: Connection timed out
      Failed to connect to github.com port 443: Connection timed out
After re-running, it was successful.
    Building wheels for collected packages: ninja
      Building wheel for ninja (PEP 517) ... done
      Created wheel for ninja: filename=ninja-1.10.0.post2-cp38-cp38-linux_aarch64.whl size=112136 sha256=b65f8597c88b6c58577c534e254700267877c4cb0896a12c9f80fc83d2041a50
      Stored in directory: /root/.cache/pip/wheels/75/4e/92/8e0a2f0960c17371491b56a359066f9bfb43e69544a96f1881
    Successfully built ninja
    Installing collected packages: urllib3, pycparser, idna, chardet, certifi, typing-extensions, six, requests, pyyaml, numpy, ninja, future, dataclasses, Cython, cffi
    Successfully installed Cython-0.29.22 certifi-2020.12.5 cffi-1.14.5 chardet-4.0.0 dataclasses-0.6 future-0.18.2 idna-2.10 ninja-1.10.0.post2 numpy-1.20.1 pycparser-2.20 pyyaml-5.4.1 requests-2.25.1 six-1.15.0 typing-extensions-3.7.4.3 urllib3-1.26.3
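Since the fix for these timeouts was simply to run the command again, a small retry wrapper can save some babysitting on a flaky connection. This is only a sketch: run_with_retry is a hypothetical helper of mine, and the attempt count and delay are arbitrary defaults.

```python
import subprocess
import time


def run_with_retry(cmd, attempts=3, delay=5.0, runner=subprocess.run, sleep=time.sleep):
    """Run cmd until it exits with status 0; return the number of attempts used.

    runner and sleep are injectable so the logic can be exercised without
    touching the network.
    """
    for attempt in range(1, attempts + 1):
        result = runner(cmd)
        if result.returncode == 0:
            return attempt
        if attempt < attempts:
            # Wait a little longer after each failure (linear back-off).
            sleep(delay * attempt)
    raise RuntimeError(f"{cmd!r} still failing after {attempts} attempts")
```

For example, run_with_retry(["git", "submodule", "update", "--init", "--recursive"]) repeats the submodule download up to three times before giving up.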
Build the Source Code
If you also need to run HanLP models, first run yum install lapack64-devel lapack-devel to install the LAPACK development packages. Otherwise PyTorch is built without LAPACK, and you will later hit the error "LAPACK library not found in compilation" when using HanLP.
    python3 setup.py install
During the configure process, the following output indicates whether LAPACK was found.
    .....
    -- Performing Test CXX_HAS_AVX_3
    -- Performing Test CXX_HAS_AVX_3 - Failed
    -- Performing Test CXX_HAS_AVX2_1
    -- Performing Test CXX_HAS_AVX2_1 - Failed
    -- Performing Test CXX_HAS_AVX2_2
    -- Performing Test CXX_HAS_AVX2_2 - Failed
    -- Performing Test CXX_HAS_AVX2_3
    -- Performing Test CXX_HAS_AVX2_3 - Failed
    -- Looking for cheev_
    -- Looking for cheev_ - found
    -- Found a library with LAPACK API (generic)    <----------------------- this is the line
    disabling CUDA because NOT USE_CUDA is set
    -- USE_CUDNN is set to 0. Compiling without cuDNN support
    disabling ROCM because NOT USE_ROCM is set
    -- MIOpen not found. Compiling without MIOpen support
    disabling MKLDNN because USE_MKLDNN is not set
    -- Looking for clock_gettime in rt
    -- Looking for clock_gettime in rt - found
    -- Looking for mmap
    -- Looking for mmap - found
    -- Looking for shm_open
    -- Looking for shm_open - found
    -- Looking for shm_unlink
    .....
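The LAPACK line is easy to miss in a long configure log, so it can be worth checking for it mechanically before waiting through the whole build. The sketch below searches saved build output for the "Found a library with LAPACK API" line quoted above. The function name lapack_flavour and the log file name build.log are my own assumptions.

```python
import re

# The wording is taken from the configure output quoted in this post.
_LAPACK_LINE = re.compile(r"Found a library with LAPACK API\s*\(([^)]*)\)")


def lapack_flavour(log_text):
    """Return the LAPACK flavour reported by configure (e.g. 'generic'), or None."""
    match = _LAPACK_LINE.search(log_text)
    return match.group(1) if match else None


if __name__ == "__main__":
    import sys

    # Usage: python3 check_lapack.py build.log   (both names are hypothetical)
    if len(sys.argv) > 1:
        with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
            flavour = lapack_flavour(handle.read())
        print("LAPACK:", flavour if flavour is not None else "not reported as found")
```

One way to capture the log is python3 setup.py install 2>&1 | tee build.log. A result of None only means the line was not seen, which is a good reason to check that lapack-devel is installed before continuing.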
Of course, there was an error in the middle. The error message was:
Then, run the compilation command python3 setup.py install. The compilation takes about 28 minutes, and there were no errors after making the modifications mentioned above.
Test
Run python3 and enter import torch. If there are no import errors, the installation is complete.
Console Output
    [root@67bce55d5a71 download]# python3
    Python 3.8.8 (default, Mar 11 2021, 08:11:33)
    [GCC 10.2.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import torch
    >>>
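A clean import does not yet prove that LAPACK made it into the build. A slightly stronger check is to run one operation that, as far as I know, goes through LAPACK on the CPU in this PyTorch version, such as torch.inverse. The sketch below is my own suggestion rather than part of the original procedure; torch_smoke_test is a hypothetical helper name.

```python
import importlib.util


def torch_smoke_test():
    """Return a short report about the installed torch, or None if it is absent."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch

    matrix = torch.eye(3) * 2.0
    # On a CPU build without LAPACK this call is expected to raise a
    # RuntimeError mentioning "LAPACK library not found in compilation".
    inverse = torch.inverse(matrix)
    ok = bool(torch.allclose(inverse, torch.eye(3) * 0.5))
    return {"version": torch.__version__, "inverse_ok": ok}


if __name__ == "__main__":
    print(torch_smoke_test())
```

Run it from a directory outside the PyTorch source tree, so that the installed package is the one being tested.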
By the way, remember to cd ../ out of the PyTorch source directory. If you run import torch from inside the source tree, Python finds the local torch/ source directory before the installed package, and you will encounter the following error:
    [root@67bce55d5a71 pytorch]# python3
    Python 3.8.8 (default, Mar 11 2021, 08:11:33)
    [GCC 10.2.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import torch
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/root/download/pytorch/torch/__init__.py", line 335, in <module>
        from .random import set_rng_state, get_rng_state, manual_seed, initial_seed, seed
      File "/root/download/pytorch/torch/random.py", line 4, in <module>
        from torch._C import default_generator
    ImportError: cannot import name 'default_generator' from 'torch._C' (unknown location)
    >>>
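The traceback shows torch being loaded from /root/download/pytorch/torch, that is, from the current directory rather than from site-packages. To check for this kind of shadowing before importing, you can ask Python where it would load a module from. A minimal sketch, with shadowed_by_cwd as a hypothetical helper name:

```python
import importlib.util
import os


def shadowed_by_cwd(module_name):
    """Return True if importing module_name would load it from under the current directory."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or not spec.origin or spec.origin in ("built-in", "frozen"):
        return False
    cwd = os.path.realpath(os.getcwd())
    origin = os.path.realpath(spec.origin)
    # The module is shadowed when its file lives somewhere below the cwd.
    return os.path.commonpath([cwd, origin]) == cwd


if __name__ == "__main__":
    print("torch shadowed by current directory:", shadowed_by_cwd("torch"))
```

Inside the source checkout this is expected to print True; after cd ../ it should print False, and import torch then picks up the installed package.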
Export the whl File
To make future installations easier, you can export a whl file. This requires the wheel package, which you can install with pip: pip install wheel
    python3 setup.py bdist_wheel
There were no errors, and the exported package will be located in dist/torch-1.6.0a0+b31f58d-cp38-cp38-linux_aarch64.whl
    [root@67bce55d5a71 pytorch]# python3 setup.py bdist_wheel
    Building wheel torch-1.6.0a0+b31f58d
    -- Building version 1.6.0a0+b31f58d
    cmake --build . --target install --config Release -- -j 8
    [0/1] Install the project...
    -- Install configuration: "Release"
    running bdist_wheel
    running build
    running build_py
    copying torch/version.py -> build/lib.linux-aarch64-3.8/torch
    ....
    copying caffe2/proto/metanet_pb2.py -> build/lib.linux-aarch64-3.8/caffe2/proto
    running build_ext
    -- Building with NumPy bindings
    -- Not using cuDNN
    -- Not using CUDA
    -- Not using MKLDNN
    -- Not using NCCL
    -- Building with distributed package
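The wheel's file name records which interpreter and platform it was built for, which matters when you reuse it later: this one only fits CPython 3.8 on aarch64 Linux. The sketch below splits a wheel file name into its standard fields (distribution, version, optional build tag, Python tag, ABI tag, platform tag). parse_wheel_name is a hypothetical helper, not part of pip or wheel.

```python
def parse_wheel_name(filename):
    """Split a wheel file name into its name, version and compatibility tags."""
    stem = filename[: -len(".whl")] if filename.endswith(".whl") else filename
    parts = stem.split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"not a valid wheel file name: {filename!r}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "build": parts[2] if len(parts) == 6 else None,  # optional build tag
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


if __name__ == "__main__":
    # File name as produced in this post.
    print(parse_wheel_name("torch-1.6.0a0+b31f58d-cp38-cp38-linux_aarch64.whl"))
```

On another machine with a matching Python version and architecture, the wheel can be installed by passing its path to pip install, which skips the long compilation.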