Summary of the deep learning framework Torch7

1. Trying a first CNN in Torch. The code is as follows:


-- We now have 5 steps left to do in training our first torch neural network
-- 1. Load and normalize data
-- 2. Define Neural Network
-- 3. Define Loss function
-- 4. Train network on training data
-- 5. Test network on test data.

-- 1. Load and normalize data
require 'paths'
require 'image';
if (not paths.filep("")) then   -- archive file name not preserved in these notes
    os.execute('wget -c')       -- download URL not preserved in these notes
end
trainset = torch.load('cifar10-train.t7')
testset = torch.load('cifar10-test.t7')
classes = {'airplane', 'automobile', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck'}

print(trainset)
print(#trainset.data)
itorch.image(trainset.data[100]) -- display the 100-th image in dataset
print(classes[trainset.label[100]])

-- ignore setmetatable for now, it is a feature beyond the scope of this tutorial.
-- It sets the index operator
setmetatable(trainset,
    {__index = function(t, i)
                   return {t.data[i], t.label[i]}
               end}
);
trainset.data = trainset.data:double() -- convert the data from a ByteTensor to a DoubleTensor.

function trainset:size()
    return self.data:size(1)
end

print(trainset:size())
itorch.image(trainset[33][1])

redChannel = trainset.data[{ {}, {1}, {}, {} }] -- this picks {all images, 1st channel, all vertical pixels, all horizontal pixels}
print(#redChannel)

mean = {}
stdv = {}
for i = 1, 3 do -- over each image channel
    mean[i] = trainset.data[{ {}, {i}, {}, {} }]:mean() -- mean estimation
    print('Channel ' .. i .. ' , Mean: ' .. mean[i])
    trainset.data[{ {}, {i}, {}, {} }]:add(-mean[i]) -- mean subtraction

    stdv[i] = trainset.data[{ {}, {i}, {}, {} }]:std() -- std estimation
    print('Channel ' .. i .. ' , Standard Deviation: ' .. stdv[i])
    trainset.data[{ {}, {i}, {}, {} }]:div(stdv[i]) -- std scaling
end

-- 2. Define Neural Network
net = nn.Sequential()
net:add(nn.SpatialConvolution(3, 6, 5, 5)) -- 3 input image channels, 6 output channels, 5x5 convolution kernel
net:add(nn.ReLU()) -- non-linearity
net:add(nn.SpatialMaxPooling(2,2,2,2)) -- A max-pooling operation that looks at 2x2 windows and finds the max.
net:add(nn.SpatialConvolution(6, 16, 5, 5))
net:add(nn.ReLU()) -- non-linearity
net:add(nn.SpatialMaxPooling(2,2,2,2)) -- second pooling step; it is what yields the 16x5x5 output used below
net:add(nn.View(16*5*5)) -- reshapes from a 3D tensor of 16x5x5 into 1D tensor of 16*5*5
net:add(nn.Linear(16*5*5, 120)) -- fully connected layer (matrix multiplication between input and weights)
net:add(nn.ReLU()) -- non-linearity
net:add(nn.Linear(120, 84))
net:add(nn.ReLU()) -- non-linearity
net:add(nn.Linear(84, 10)) -- 10 is the number of outputs of the network (in this case, the 10 CIFAR-10 classes)
net:add(nn.LogSoftMax()) -- converts the output to a log-probability. Useful for classification problems

-- 3. Let us define the Loss function
criterion = nn.ClassNLLCriterion()

-- 4. Train the neural network
trainer = nn.StochasticGradient(net, criterion)
trainer.learningRate = 0.001
trainer.maxIteration = 5 -- just do 5 epochs of training.
trainer:train(trainset)

-- 5. Test the network, print accuracy
itorch.image(testset.data[100])

testset.data = testset.data:double() -- convert from Byte tensor to Double tensor
for i = 1, 3 do -- over each image channel
    testset.data[{ {}, {i}, {}, {} }]:add(-mean[i]) -- mean subtraction
    testset.data[{ {}, {i}, {}, {} }]:div(stdv[i]) -- std scaling
end

-- for fun, print the mean and standard-deviation of example-100
horse = testset.data[100]
print(horse:mean(), horse:std())

print(classes[testset.label[100]])
predicted = net:forward(testset.data[100])
-- the output of the network is Log-Probabilities. To convert them to probabilities, you have to take e^x
print(predicted:exp())

for i = 1, predicted:size(1) do
    print(classes[i], predicted[i])
end

-- test the accuracy
correct = 0
for i = 1, 10000 do
    local groundtruth = testset.label[i]
    local prediction = net:forward(testset.data[i])
    local confidences, indices = torch.sort(prediction, true) -- true means sort in descending order
    if groundtruth == indices[1] then
        correct = correct + 1
    end
end
print(correct, 100*correct/10000 .. ' % ')

-- accuracy per class
class_performance = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
for i = 1, 10000 do
    local groundtruth = testset.label[i]
    local prediction = net:forward(testset.data[i])
    local confidences, indices = torch.sort(prediction, true) -- true means sort in descending order
    if groundtruth == indices[1] then
        class_performance[groundtruth] = class_performance[groundtruth] + 1
    end
end

for i = 1, #classes do
    print(classes[i], 100*class_performance[i]/1000 .. ' %')
end

-- move the network, the criterion and the training data to the GPU, then train again
require 'cunn';
net = net:cuda()
criterion = criterion:cuda()
trainset.data = trainset.data:cuda()
trainset.label = trainset.label:cuda()

trainer = nn.StochasticGradient(net, criterion)
trainer.learningRate = 0.001
trainer.maxIteration = 5 -- just do 5 epochs of training.
trainer:train(trainset)


    Running it, however, produced the following error:


/home/wangxiao/torch/install/bin/luajit: ./train_network.lua:26: attempt to index global 'itorch' (a nil value)
stack traceback:
./train_network.lua:26: in main chunk
[C]: in function 'dofile'
...xiao/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00406670
wangxiao@AHU:~/Documents/Lua test examples$

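    The main problem is itorch: the global 'itorch' is only defined inside an iTorch notebook, so it is nil when the script is run with th. In addition, require 'nn' has to be added, otherwise nn is not recognized either.

  For now I simply commented out all the lines that use itorch.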
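
Commenting out the itorch lines by hand gets tedious on longer scripts. A minimal shell sketch of the same idea follows; the helper name comment_itorch and the demo file are made up for illustration, and it assumes every itorch call sits on a line of its own (a multi-line statement would need manual editing).

```shell
# Hypothetical helper: print a copy of Lua file $1 in which every line that
# mentions itorch is turned into a Lua comment. The input file is not modified.
comment_itorch() {
    sed '/itorch/ s/^/-- /' "$1"
}

# Demo on a throwaway file (a stand-in for train_network.lua).
demo=$(mktemp)
printf '%s\n' 'print(trainset)' 'itorch.image(trainset.data[100])' > "$demo"
comment_itorch "$demo"
rm -f "$demo"
```

Redirect the output to a new file (for example comment_itorch train_network.lua > train_network_noitorch.lua) and run that copy with th.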

2.  'libcudnn (R5) not found in library path.

wangxiao@AHU:~/Downloads/wide-residual-networks-master$ th ./train_Single_Multilabel_Image_Classification.lua
/home/wangxiao/torch/install/bin/luajit: /home/wangxiao/torch/install/share/lua/5.1/trepl/init.lua:384: /home/wangxiao/torch/install/share/lua/5.1/trepl/init.lua:384: /home/wangxiao/torch/install/share/lua/5.1/cudnn/ffi.lua:1600: 'libcudnn (R5) not found in library path.
Please install CuDNN from
Then make sure files named as or libcudnn.5.dylib are placed in your library load path (for example /usr/local/lib , or manually add a path to LD_LIBRARY_PATH)

stack traceback:
[C]: in function 'error'
/home/wangxiao/torch/install/share/lua/5.1/trepl/init.lua:384: in function 'require'
./train_Single_Multilabel_Image_Classification.lua:8: in main chunk
[C]: in function 'dofile'
...xiao/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00406670



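  I downloaded cudnn-7.5-linux-x64-v5.0-ga.tgz again and reconfigured it, but the same error still appeared. So what was wrong? I found the following in a blog post:

Pitfall 4: the error 'libcudnn not found in library path' may appear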


Please install CuDNN from
Then make sure files named as or libcudnn.5.dylib are placed in your library load path (for example /usr/local/lib , or manually add a path to LD_LIBRARY_PATH)


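  1. sudo gedit /etc/ : this creates a new conf file; any name will do.
  2. Add the library path from above: /usr/local/cuda-7.5/lib64/
  3. I also added /usr/local/cuda-7.5/include/, although that is probably unnecessary.
  4. Save the file, then run sudo ldconfig to refresh the cache. (A warning that a file is not a symbolic link may appear; it can be ignored.)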


th neural_style.lua -gpu 0 -backend cudnn



Comment: I tried this approach and it really did work.
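
Before editing system files, it can help to check whether the cuDNN directory is already visible to the loader. The sketch below is only an illustration: in_path_list is a hypothetical helper, and the directory is the one used in the steps above, which may differ on other machines.

```shell
# Hypothetical helper: succeed if directory $1 is an entry of the
# colon-separated list $2 (the format used by LD_LIBRARY_PATH).
in_path_list() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

CUDNN_DIR=/usr/local/cuda-7.5/lib64   # directory from the steps above
if in_path_list "$CUDNN_DIR" "${LD_LIBRARY_PATH:-}"; then
    echo "$CUDNN_DIR is on LD_LIBRARY_PATH"
else
    echo "$CUDNN_DIR is not on LD_LIBRARY_PATH"
fi

# List the cuDNN libraries known to the dynamic linker cache (prints nothing if there are none).
ldconfig -p 2>/dev/null | grep libcudnn || true
```

If the last command prints nothing after running sudo ldconfig, the conf file probably points at the wrong directory.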

  3. Loading an image with gm raised an error, and the error persisted even after installing the package:



Load library:

gm = require 'graphicsmagick'

First, we provide two high-level functions to load/save directly into/from tensors:

img = gm.load('/path/to/image.png' [, type])    -- type = 'float' (default) | 'double' | 'byte'
gm.save('/path/to/image.jpg' [,quality])        -- quality = 0 to 100 (for jpegs only)

The following provide a more controlled flow for loading/saving jpegs.

Create an image, from a file:

image = gm.Image('/path/to/image.png')
-- or
image = gm.Image()
image:load('/path/to/image.png')

Sadly this still failed, so I switched to loading the image with image.load() instead:
--To load as byte tensor for rgb imagefile
local img = image.load(imagefile,3,'byte')

  4. Saving a txt file in Torch:
  -- save opt
  file = torch.DiskFile(paths.concat(opt.checkpoints_dir,, 'opt.txt'), 'w')

  5. Creating a new folder in Torch:
  opt.modelPath = opt.modelDir .. opt.modelName
  if not paths.dirp(opt.modelPath) then
    paths.mkdir(opt.modelPath) -- create the folder only if it does not exist yet
  end

  6. Saving an image to a folder in Torch/Lua:
  Use the image package. First install it: luarocks install image
  Then load it: require 'image'
  After that it can be used like this:
  image.save('./saved_pos_neg_image/candidate_' .. tostring(i) .. tostring(j) .. '.png', pos_patch, 1, 32, 32)

  7. module 'bit' not found:No LuaRocks module found for bit

wangxiao@AHU:/media/wangxiao/724eaeef-e688-4b09-9cc9-dfaca44079b2/fast-neural-style-master$ th ./train.lua
/home/wangxiao/torch/install/bin/lua: /home/wangxiao/torch/install/share/lua/5.2/trepl/init.lua:389: /home/wangxiao/torch/install/share/lua/5.2/trepl/init.lua:389: /home/wangxiao/torch/install/share/lua/5.2/trepl/init.lua:389: module 'bit' not found:No LuaRocks module found for bit
no field package.preload['bit']
no file '/home/wangxiao/.luarocks/share/lua/5.2/bit.lua'
no file '/home/wangxiao/.luarocks/share/lua/5.2/bit/init.lua'
no file '/home/wangxiao/torch/install/share/lua/5.2/bit.lua'
no file '/home/wangxiao/torch/install/share/lua/5.2/bit/init.lua'
no file '/home/wangxiao/.luarocks/share/lua/5.1/bit.lua'
no file '/home/wangxiao/.luarocks/share/lua/5.1/bit/init.lua'
no file '/home/wangxiao/torch/install/share/lua/5.1/bit.lua'
no file '/home/wangxiao/torch/install/share/lua/5.1/bit/init.lua'
no file './bit.lua'
no file '/home/wangxiao/torch/install/share/luajit-2.1.0-beta1/bit.lua'
no file '/usr/local/share/lua/5.1/bit.lua'
no file '/usr/local/share/lua/5.1/bit/init.lua'
no file '/home/wangxiao/.luarocks/lib/lua/5.2/'
no file '/home/wangxiao/torch/install/lib/lua/5.2/'
no file '/home/wangxiao/torch/install/lib/'
no file '/home/wangxiao/.luarocks/lib/lua/5.1/'
no file '/home/wangxiao/torch/install/lib/lua/5.1/'
no file './'
no file '/usr/local/lib/lua/5.1/'
no file '/usr/local/lib/lua/5.1/'
stack traceback:
[C]: in function 'error'
/home/wangxiao/torch/install/share/lua/5.2/trepl/init.lua:389: in function 'require'
./train.lua:5: in main chunk
[C]: in function 'dofile'
...xiao/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: in ?

Fix: run this in a terminal: luarocks install luabitop

8.  HDF5Group:read() - no such child 'media' for [HDF5Group 33554432 /]

/home/wangxiao/torch/install/bin/lua: /home/wangxiao/torch/install/share/lua/5.2/hdf5/group.lua:312: HDF5Group:read() - no such child 'media' for [HDF5Group 33554432 /]
stack traceback:
[C]: in function 'error'
/home/wangxiao/torch/install/share/lua/5.2/hdf5/group.lua:312: in function </home/wangxiao/torch/install/share/lua/5.2/hdf5/group.lua:302>
(...tail calls...)
./fast_neural_style/DataLoader.lua:44: in function '__init'
/home/wangxiao/torch/install/share/lua/5.2/torch/init.lua:91: in function </home/wangxiao/torch/install/share/lua/5.2/torch/init.lua:87>
[C]: in function 'DataLoader'
./train.lua:138: in function 'main'
./train.lua:327: in main chunk
[C]: in function 'dofile'
...xiao/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: in ?

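I ran into this annoying problem while training the style-transfer code and was stuck on it for several days, wondering what was wrong with hdf5.

  It turned out that my dataset path was set incorrectly. It should have been CoCo/train/image/

  but I had only given CoCo/train/ ...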
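
A quick check of the dataset directory before building the HDF5 file would have caught this earlier. The following is a sketch under assumptions: has_images is a hypothetical helper, the images are assumed to be .jpg or .png files sitting directly in the directory, and the demo builds a throwaway tree that mirrors the layout above.

```shell
# Hypothetical helper: succeed only if directory $1 exists and directly
# contains at least one .jpg or .png file.
has_images() {
    [ -d "$1" ] || return 1
    [ -n "$(find "$1" -maxdepth 1 -type f \( -name '*.jpg' -o -name '*.png' \) | head -n 1)" ]
}

# Demo with a throwaway tree shaped like CoCo/train/image/.
root=$(mktemp -d)
mkdir -p "$root/CoCo/train/image"
touch "$root/CoCo/train/image/0001.jpg"
for dir in "$root/CoCo/train" "$root/CoCo/train/image"; do
    if has_images "$dir"; then
        echo "ok:    ${dir#"$root"/}"
    else
        echo "empty: ${dir#"$root"/}"
    fi
done
rm -rf "$root"
```

Here CoCo/train is reported as empty because the images live one level deeper, which is exactly the mistake described above.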

  9. How do I choose which GPU Torch code runs on? And how do I run on two cards at the same time?



  Running export CUDA_VISIBLE_DEVICES=0 in the terminal before starting the script makes the code run on GPU-0.
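
The variable can be set for the whole shell session or for a single command. In this sketch, sh -c stands in for the real th invocation so that the example runs without Torch; the device indices are only examples.

```shell
# Restrict every later command in this shell to GPU 0.
export CUDA_VISIBLE_DEVICES=0
echo "visible: $CUDA_VISIBLE_DEVICES"

# Restrict one command only; `sh -c ...` is a stand-in for `th your_script.lua`.
CUDA_VISIBLE_DEVICES=0,1 sh -c 'echo "visible to child: $CUDA_VISIBLE_DEVICES"'
```

Listing two devices only makes both visible to the process. Whether the script actually uses both depends on the code itself, for example through cutorch.setDevice or nn.DataParallelTable from cunn.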


  10. When loading the pre-trained VGG model, I got the following errors:

    warning: module 'data [type 5]' not found
    nn supports no groups!
    warning: module 'conv2 [type 4]' not found
    nn supports no groups!
    warning: module 'conv4 [type 4]' not found
    nn supports no groups!
    warning: module 'conv5 [type 4]' not found


 wangxiao@AHU:~/Downloads/multi-modal-visual-tracking$ qlua ./train_match_function_alexNet_version_2017_02_28.lua
using cudnn
Successfully loaded ./feature_transfer/AlexNet_files/bvlc_alexnet.caffemodel
warning: module 'data [type 5]' not found
nn supports no groups!
warning: module 'conv2 [type 4]' not found
nn supports no groups!
warning: module 'conv4 [type 4]' not found
nn supports no groups!
warning: module 'conv5 [type 4]' not found
nn.Sequential {
[input -> () -> () -> () -> output]
(): nn.SplitTable
(): nn.ParallelTable {
|`-> (): nn.Sequential {
| [input -> () -> () -> () -> () -> () -> () -> () -> () -> () -> () -> () -> () -> () -> () -> () -> () -> () -> () -> output]
| (): nn.SpatialConvolution( -> , 11x11, ,)
| (): nn.ReLU
| (): nn.SpatialCrossMapLRN
| (): nn.SpatialMaxPooling(3x3, ,)
| (): nn.ReLU
| (): nn.SpatialCrossMapLRN
| (): nn.SpatialMaxPooling(3x3, ,)
| (): nn.SpatialConvolution( -> , 3x3, ,, ,)
| (): nn.ReLU
| (): nn.ReLU
| (): nn.ReLU
| (): nn.SpatialMaxPooling(3x3, ,)
| (): nn.View(-)
| (): nn.Linear( -> )
| (): nn.ReLU
| (): nn.Dropout(0.500000)
| (): nn.Linear( -> )
| (): nn.ReLU
| }
`-> (): nn.Sequential {
[input -> () -> () -> () -> () -> () -> () -> () -> () -> () -> () -> () -> () -> () -> () -> () -> () -> () -> () -> output]
(): nn.SpatialConvolution( -> , 11x11, ,)
(): nn.ReLU
(): nn.SpatialCrossMapLRN
(): nn.SpatialMaxPooling(3x3, ,)
(): nn.ReLU
(): nn.SpatialCrossMapLRN
(): nn.SpatialMaxPooling(3x3, ,)
(): nn.SpatialConvolution( -> , 3x3, ,, ,)
(): nn.ReLU
(): nn.ReLU
(): nn.ReLU
(): nn.SpatialMaxPooling(3x3, ,)
(): nn.View(-)
(): nn.Linear( -> )
(): nn.ReLU
(): nn.Dropout(0.500000)
(): nn.Linear( -> )
(): nn.ReLU
... -> output
(): nn.PairwiseDistance
================= AlextNet based Siamese Search for Visual Tracking ========================
==>> The Benchmark Contain: videos ...
deal with video / video name: BlurFace ... please waiting ...
the num of gt bbox:
the num of video frames:
========>>>> Begin to track video name: nil-th frame, please waiting ...
==>> pos_proposal_list:
==>> neg_proposal_list:
qlua: /home/wangxiao/torch/install/share/lua/5.1/nn/Container.lua::
In module of nn.Sequential:
In module of nn.ParallelTable:
In module of nn.Sequential:
/home/wangxiao/torch/install/share/lua/5.1/nn/THNN.lua:: Need input of dimension and input.size[] == but got input to be of shape: [ x x ] at /tmp/luarocks_cunn-scm--/cunn/lib/THCUNN/generic/
stack traceback:
[C]: in function 'v'
/home/wangxiao/torch/install/share/lua/5.1/nn/THNN.lua:: in function 'SpatialConvolutionMM_updateOutput' in function <>
[C]: in function 'xpcall'
/home/wangxiao/torch/install/share/lua/5.1/nn/Container.lua:: in function 'rethrowErrors'
...e/wangxiao/torch/install/share/lua/5.1/nn/Sequential.lua:: in function <...e/wangxiao/torch/install/share/lua/5.1/nn/Sequential.lua:>
[C]: in function 'xpcall'
/home/wangxiao/torch/install/share/lua/5.1/nn/Container.lua:: in function 'rethrowErrors'
...angxiao/torch/install/share/lua/5.1/nn/ParallelTable.lua:: in function <...angxiao/torch/install/share/lua/5.1/nn/ParallelTable.lua:>
[C]: in function 'xpcall'
/home/wangxiao/torch/install/share/lua/5.1/nn/Container.lua:: in function 'rethrowErrors'
...e/wangxiao/torch/install/share/lua/5.1/nn/Sequential.lua:: in function 'forward'
./train_match_function_alexNet_version_2017_02_28.lua:: in function 'opfunc'
/home/wangxiao/torch/install/share/lua/5.1/optim/adam.lua:: in function 'optim'
./train_match_function_alexNet_version_2017_02_28.lua:: in main chunk

WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
[C]: at 0x7f86014df9c0
[C]: in function 'error'
/home/wangxiao/torch/install/share/lua/5.1/nn/Container.lua:: in function 'rethrowErrors'
...e/wangxiao/torch/install/share/lua/5.1/nn/Sequential.lua:: in function 'forward'
./train_match_function_alexNet_version_2017_02_28.lua:: in function 'opfunc'
/home/wangxiao/torch/install/share/lua/5.1/optim/adam.lua:: in function 'optim'
./train_match_function_alexNet_version_2017_02_28.lua:: in main chunk

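  As the log above shows, the 'nn' backend does not support grouped convolutions ('nn supports no groups!'), so conv2, conv4 and conv5 are apparently skipped when the model is loaded, and the forward pass then fails on a wrongly shaped input. Changing the backend from 'nn' to 'cudnn' fixes it.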

  11. both (null) and torch.FloatTensor have no less-than operator

    qlua: ./test_MM_tracker_VGG_.lua:254: both (null) and torch.FloatTensor have no less-than operator
    stack traceback:
    [C]: at 0x7f628816e9c0
    [C]: in function '__lt'
    ./test_MM_tracker_VGG_.lua:254: in main chunk


  The cause is that predictValue is a torch.FloatTensor, not a plain number, so it cannot be compared with the < operator. Index it to get a single element, for example inside a for loop: predictValue -->> predictValue[i] .



  12. cuda runtime error (2) : out of memory

========>>>> Begin to track the 6-th and the video name is ILSVRC2015_train_00109004 , please waiting ...
THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-707/cutorch/lib/THC/generic/ line=66 error=2 : out of memory
qlua: cuda runtime error (2) : out of memory at /tmp/luarocks_cutorch-scm-1-707/cutorch/lib/THC/generic/
stack traceback:
[C]: at 0x7fa20a8f99c0
[C]: at 0x7fa1dddfbee0
[C]: in function 'Tensor'
./train_match_function_VGG_version_2017_03_02.lua:377: in main chunk

The GPU simply ran out of memory. Reducing the batch size may help; it worked for me.

13. luarocks install class has no effect; Torch still reports the error: No Module named "class".

  ==>> Install the package with sudo in a terminal (sudo luarocks install class).

  ==>> After that it works.

14. How to install OpenCV 3.1 on Ubuntu 14.04?

  As described at:

  1. First, make sure Torch itself is installed successfully.

  2. Then follow the steps from the blog:

sudo apt-get install build-essential
sudo apt-get install cmake git libgtk2.0-dev pkg-config libavcodec-dev libavformat-dev libswscale-dev
Unpack the archive: unzip opencv-3.1.0
cd ~/opencv-3.1.0
mkdir build
cd build
cmake -D CMAKE_BUILD_TYPE=Release -D CMAKE_INSTALL_PREFIX=/usr/local ..
sudo make -j24 
sudo make install -j24  
sudo /bin/bash -c 'echo "/usr/local/lib" > /etc/'
sudo ldconfig
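During installation, the download of ippicv_linux_20151201.tgz may fail.
In that case, download the file manually and put it into opencv-3.1.0/3rdparty/ippicv/downloads/linux-808b791a6eac9ed78d32a7666804320e, replacing the existing file if there is one. The installation can then complete.
luarocks install cv

The OpenCV bindings for Torch are now installed.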

However, you may run into errors such as:

cudalegacy/src/graphcuts.cpp:120:54: error: ‘NppiGraphcutState’ has not been declared    (solution taken from:

In that case, one file needs to be changed.

Find graphcuts.cpp in the OpenCV 3.1 source tree and replace the first line below with the second:

#if !defined (HAVE_CUDA) || defined (CUDA_DISABLER) 
#if !defined (HAVE_CUDA) || defined (CUDA_DISABLER) || (CUDART_VERSION >= 8000) 

Then build again and it should work. The extra condition disables this code when building against CUDA 8.0 (CUDART_VERSION >= 8000), which is what lets OpenCV 3.1 compile there.
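
The one-line change can also be applied with sed. This is a sketch, not the blog's method: patch_graphcuts is a hypothetical helper, and the demo patches a throwaway file instead of the real graphcuts.cpp (which, judging by the error message, lives under cudalegacy/src/ in the OpenCV source tree).

```shell
# Hypothetical helper: rewrite the preprocessor guard in file $1 in place.
# Only a line that consists of exactly the old guard is changed, so running
# the helper twice is harmless.
patch_graphcuts() {
    sed 's/^#if !defined (HAVE_CUDA) || defined (CUDA_DISABLER)[[:space:]]*$/#if !defined (HAVE_CUDA) || defined (CUDA_DISABLER) || (CUDART_VERSION >= 8000)/' "$1" > "$1.tmp" \
        && mv "$1.tmp" "$1"
}

# Demo on a throwaway file containing the original guard line.
demo=$(mktemp)
printf '%s\n' '#if !defined (HAVE_CUDA) || defined (CUDA_DISABLER)' > "$demo"
patch_graphcuts "$demo"
cat "$demo"
rm -f "$demo"
```

For the real build, call patch_graphcuts with the path to graphcuts.cpp and rerun cmake and make.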

15. Installing torch-hdf5
sudo apt-get install libhdf5-serial-dev hdf5-tools
git clone
cd torch-hdf5
sudo luarocks make hdf5--.rockspec LIBHDF5_LIBDIR="/usr/lib/x86_64-linux-gnu/"

17. Installing iTorch

git clone
mkdir build-zeromq
cd build-zeromq
cmake ..
make && make install
Once that is installed, run: luarocks install itorch
Afterwards, luarocks list shows whether the installation succeeded.


