https://blog.csdn.net/nockinonheavensdoor/article/details/82320580

Let's look at a few simple examples first:

    import torch
    import torch.autograd as autograd
    import torch.nn as nn
    import torch.nn.functional as F
    import torch.optim as optim

    torch.manual_seed(1)
torch.tensor turns a Python list into a tensor:
    # Create a 3D tensor of size 2x2x2.
    T_data = [[[1., 2.], [3., 4.]],
              [[5., 6.], [7., 8.]]]
    T = torch.tensor(T_data)
    print(T)
Set requires_grad=True to enable automatic differentiation:
    # Computation Graphs and Automatic Differentiation
    x = torch.tensor([1., 2., 3], requires_grad=True)
    y = torch.tensor([4., 5., 6], requires_grad=True)
    z = x + y
    print(z)
    print(z.grad_fn)

Output:

    tensor([ 5., 7., 9.])
    <AddBackward1 object at 0x00000247781E0BE0>
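A small follow-up sketch (my addition, not in the original post): calling backward() on a scalar derived from z fills in the .grad fields of the leaf tensors.

    # Hedged sketch: reduce z to a scalar, then backpropagate.
    s = z.sum()
    s.backward()
    print(x.grad)  # expected tensor([1., 1., 1.]), since ds/dx_i = 1
    print(y.grad)  # expected tensor([1., 1., 1.])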
The detach() method gives you the value of z, but the detached result can no longer be differentiated through.
    new_z = z.detach()
    print(new_z.grad_fn)

Output:

    None
OK, now for the main part.

Translation with a Sequence to Sequence Network and Attention

    from __future__ import unicode_literals, print_function, division
    from io import open
    import unicodedata
    import string
    import re
    import random

    import torch
    import torch.nn as nn
    from torch import optim
    import torch.nn.functional as F

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

Preparing the data:

    SOS_token = 0
    EOS_token = 1

    class lang:
        def __init__(self, name):
            self.name = name
            self.word2index = {}
            self.word2count = {}
            self.index2word = {0: 'SOS', 1: 'EOS'}
            self.n_words = 2  # Count SOS and EOS

        def addSentence(self, sentence):
            for word in sentence.split():
                self.addWord(word)

        def addWord(self, word):
            if word not in self.word2index:
                self.word2index[word] = self.n_words
                self.word2count[word] = 1
                self.index2word[self.n_words] = word
                self.n_words += 1
            else:
                self.word2count[word] += 1
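A quick usage sketch (my addition; the name 'demo' is illustrative): feed one sentence into a lang instance and inspect the counters.

    demo = lang('demo')
    demo.addSentence('hello world hello')
    print(demo.n_words)     # 4 -> SOS, EOS, 'hello', 'world'
    print(demo.word2count)  # {'hello': 2, 'world': 1}
    print(demo.word2index)  # {'hello': 2, 'world': 3}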
Convert Unicode characters to ASCII, lower-case everything, and strip most non-letter characters:
    # Turn a Unicode string to plain ASCII, thanks to
    # http://stackoverflow.com/a/518232/2809427
    def unicodeToAscii(s):
        return ''.join(
            c for c in unicodedata.normalize('NFD', s)
            if unicodedata.category(c) != 'Mn'
        )

    # Lowercase, trim, remove non-letter characters
    # re.sub(pattern, repl, string, count=0, flags=0)
    def normalizeString(s):
        s = unicodeToAscii(s.lower().strip())
        # (re) captures a group; \1 refers back to group 1.
        # Note: the official tutorial uses r" \1" here to put a space before .!? --
        # as written this substitution is a no-op, which is why the word counts
        # printed below differ from the official tutorial's.
        s = re.sub(r"([.!?])", r"\1", s)
        # [^...] matches any character NOT listed: [^abc] matches anything except a, b, c.
        s = re.sub(r"[^a-zA-Z.!?]+", r" ", s)
        return s
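A small check (my addition; the example strings are made up): accents are stripped, text is lower-cased, and runs of non-letter characters collapse to a single space.

    print(unicodeToAscii('Ça marche très bien.'))   # 'Ca marche tres bien.'
    print(normalizeString('Je suis déjà parti !'))  # roughly 'je suis deja parti !'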

Continuing:

    # The data files go from English to the other language; the reverse flag swaps each pair.
    def readlangs(lang1, lang2, reverse=False):
        print("Reading lines...")
        # Read the file and split into lines
        lines = open('data/%s-%s.txt' % (lang1, lang2), encoding='utf-8').\
            read().strip().split('\n')
        # Split every line into pairs and normalize
        pairs = [[normalizeString(s) for s in l.split('\t')] for l in lines]
        # Reverse pairs, make lang instances
        if reverse:
            pairs = [list(reversed(p)) for p in pairs]
            input_lang = lang(lang2)
            output_lang = lang(lang1)
        else:
            input_lang = lang(lang1)
            output_lang = lang(lang2)
        return input_lang, output_lang, pairs

Filter the pairs down to a subset of the samples:


    MAX_LENGTH = 10

    eng_prefixes = (
        "i am ", "i m ",
        "he is", "he s ",
        "she is", "she s",
        "you are", "you re ",
        "we are", "we re ",
        "they are", "they re "
    )

    def filterPair(p):
        return len(p[0].split(' ')) < MAX_LENGTH and \
            len(p[1].split(' ')) < MAX_LENGTH and \
            p[1].startswith(eng_prefixes)

    def filterPairs(pairs):
        return [pair for pair in pairs if filterPair(pair)]
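A tiny illustration (my addition, with made-up pairs): a pair passes only if both sides are short enough and the English side starts with one of the prefixes.

    print(filterPair(['je suis fatigue.', 'i am tired.']))  # True
    print(filterPair(['il fait froid.', 'it is cold.']))    # False, no prefix matches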
The full process for preparing the data is:

    • Read text file and split into lines, split lines into pairs
    • Normalize text, filter by length and content
    • Make word lists from sentences in pairs
    def prepareData(lang1, lang2, reverse=False):
        input_lang, output_lang, pairs = readlangs(lang1, lang2, reverse)
        print("Read %s sentence pairs " % len(pairs))
        pairs = filterPairs(pairs)
        print("Trimmed to %s sentence pairs " % len(pairs))
        print("Counting words...")
        for pair in pairs:
            input_lang.addSentence(pair[0])
            output_lang.addSentence(pair[1])
        print("Counted word:")
        print(input_lang.name, input_lang.n_words)
        print(output_lang.name, output_lang.n_words)
        return input_lang, output_lang, pairs

    input_lang, output_lang, pairs = prepareData('eng', 'fra', True)
    print(random.choice(pairs))

Output:

    Reading lines...
    Read 135842 sentence pairs
    Trimmed to 11739 sentence pairs
    Counting words...
    Counted word:
    fra 5911
    eng 3965
    ['elle chante les dernieres chansons populaires.', 'she is singing the latest popular songs.']

The Seq2Seq Model

  • It allows the input and output sentences to differ in length and word order.

The Encoder:

    # Encoder
    class EncoderRNN(nn.Module):
        def __init__(self, input_size, hidden_size):
            super(EncoderRNN, self).__init__()
            self.hidden_size = hidden_size
            # size of the embedding matrix W
            self.embedding = nn.Embedding(input_size, hidden_size)
            # size of the GRU cell
            self.gru = nn.GRU(hidden_size, hidden_size)

        def forward(self, input, hidden):
            # flatten the embedding into shape (1, 1, hidden_size)
            embedded = self.embedding(input).view(1, 1, -1)
            print("embedded shape:", embedded.shape)  # debug print, discussed below
            output = embedded
            output, hidden = self.gru(output, hidden)
            return output, hidden

        # initialize the hidden state with zeros
        def initHidden(self):
            # shape: (num_layers=1, batch=1, hidden_size)
            return torch.zeros(1, 1, self.hidden_size, device=device)

In self.gru = nn.GRU(hidden_size, hidden_size), hidden_size is set to 256 later on.

The result of print("embedded shape:", embedded.shape) is:
embedded shape: torch.Size([1, 1, 256])

So the first argument passed to self.gru(output, hidden) has shape [1, 1, 256]; the embedding is squeezed into this (seq_len, batch, input_size) layout by .view(1, 1, -1).
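A standalone shape check (my addition, assuming hidden_size = 256 as used later): nn.GRU expects an input of shape (seq_len, batch, input_size) and a hidden state of shape (num_layers, batch, hidden_size), which is exactly what the (1, 1, 256) views above provide.

    gru = nn.GRU(256, 256)
    inp = torch.zeros(1, 1, 256)  # one time step, batch size 1
    h0 = torch.zeros(1, 1, 256)   # one layer, batch size 1
    out, h1 = gru(inp, h0)
    print(out.shape, h1.shape)    # torch.Size([1, 1, 256]) torch.Size([1, 1, 256])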


nn.GRU source: (the original post included a screenshot of the nn.GRU implementation here)


The Decoder:

  • A simplified seq2seq decoder: it uses only the encoder's final output, called the context vector.
  • The context vector is used as the decoder's initial hidden state.
    class DecoderRNN(nn.Module):
        def __init__(self, hidden_size, output_size):
            super(DecoderRNN, self).__init__()
            self.hidden_size = hidden_size
            self.embedding = nn.Embedding(output_size, hidden_size)
            self.gru = nn.GRU(hidden_size, hidden_size)
            self.out = nn.Linear(hidden_size, output_size)
            self.softmax = nn.LogSoftmax(dim=1)

        def forward(self, input, hidden):
            output = self.embedding(input).view(1, 1, -1)
            # relu on the (1, 1, hidden_size) embedding
            output = F.relu(output)
            output, hidden = self.gru(output, hidden)
            # output[0] is a matrix of shape (1, hidden_size)
            output = self.softmax(self.out(output[0]))
            return output, hidden

        def initHidden(self):
            return torch.zeros(1, 1, self.hidden_size, device=device)

Attention Decoder:

  • Drawback of the simple decoder: the whole sentence is squeezed into a single vector, so information is easily lost, and translating a word may require reaching back over a long distance. It also ignores the rough alignment between source and target; for example, the first translated word most likely corresponds to information near the start of the source sentence.
  • The encoder outputs are multiplied by attention weights; these weights are produced by a small network (attn here), which takes the decoder's input and hidden state as its inputs.
  • Because the training data contains sentences of all lengths, to actually create and train this layer we have to choose a maximum sentence length (the input length, for the encoder outputs).
    class AttnDecoderRNN(nn.Module):
        def __init__(self, hidden_size, output_size,
                     dropout_p=0.1, max_length=MAX_LENGTH):
            super(AttnDecoderRNN, self).__init__()
            self.hidden_size = hidden_size
            self.output_size = output_size
            self.dropout_p = dropout_p
            self.max_length = max_length
            self.embedding = nn.Embedding(self.output_size, self.hidden_size)
            self.attn = nn.Linear(self.hidden_size * 2, self.max_length)
            self.attn_combine = nn.Linear(self.hidden_size * 2, self.hidden_size)
            self.dropout = nn.Dropout(self.dropout_p)
            # GRU input size and hidden size are both hidden_size; one layer by default
            self.gru = nn.GRU(self.hidden_size, self.hidden_size)
            self.out = nn.Linear(self.hidden_size, self.output_size)

        def forward(self, input, hidden, encoder_outputs):
            embedded = self.embedding(input).view(1, 1, -1)
            embedded = self.dropout(embedded)
            attn_weights = F.softmax(
                self.attn(torch.cat((embedded[0], hidden[0]), 1)), dim=1)
            # unsqueeze: add one extra dimension on the given axis
            attn_applied = torch.bmm(attn_weights.unsqueeze(0),
                                     encoder_outputs.unsqueeze(0))
            output = torch.cat((embedded[0], attn_applied[0]), 1)
            output = self.attn_combine(output).unsqueeze(0)
            output = F.relu(output)
            output, hidden = self.gru(output, hidden)
            # print("output shape:", output.shape)
            # print("output[0]:", output[0])
            output = F.log_softmax(self.out(output[0]), dim=1)
            return output, hidden, attn_weights

        def initHidden(self):
            return torch.zeros(1, 1, self.hidden_size, device=device)
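A hedged sanity check of the forward shapes (my addition; the output size 3000 is an arbitrary stand-in for output_lang.n_words, and it assumes MAX_LENGTH = 10 and hidden_size = 256 as elsewhere in this post):

    dec = AttnDecoderRNN(256, 3000).to(device)
    dec_input = torch.tensor([[SOS_token]], device=device)
    dec_hidden = dec.initHidden()
    enc_outputs = torch.zeros(10, 256, device=device)
    out, hidden, attn = dec(dec_input, dec_hidden, enc_outputs)
    print(out.shape, attn.shape)  # torch.Size([1, 3000]) torch.Size([1, 10])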

Continuing with data preparation:

    def indexesFromSentence(lang, sentence):
        return [lang.word2index[word] for word in sentence.split(' ')]

    def tensorFromSentence(lang, sentence):
        indexes = indexesFromSentence(lang, sentence)
        indexes.append(EOS_token)
        return torch.tensor(indexes, dtype=torch.long, device=device).view(-1, 1)

    def tensorsFromPair(pair):
        input_tensor = tensorFromSentence(input_lang, pair[0])
        target_tensor = tensorFromSentence(output_lang, pair[1])
        return (input_tensor, target_tensor)
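For example (my addition; the exact shapes depend on the sentence drawn), one training pair becomes two column tensors of word indices, each terminated by EOS_token:

    sample = tensorsFromPair(random.choice(pairs))
    print(sample[0].shape, sample[1].shape)  # e.g. torch.Size([5, 1]) torch.Size([6, 1])
    print(sample[1][-1])                     # tensor([1]) -> EOS_token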

Training the model

  • The decoder's first input is the SOS token, and the encoder's final hidden state is used as the decoder's initial hidden state.
  • "Teacher forcing" means feeding the ground-truth target as the next input, rather than the decoder's own guess.
    teacher_forcing_ratio = 0.5

    def train(input_tensor, output_tensor, encoder, decoder, encoder_optimizer,
              decoder_optimizer, criterion, max_length=MAX_LENGTH):
        # the hidden size lives inside the encoder; use it here to initialize the hidden state
        encoder_hidden = encoder.initHidden()
        encoder_optimizer.zero_grad()
        decoder_optimizer.zero_grad()
        # the size of the first dimension is the sequence length
        input_length = input_tensor.size(0)
        output_length = output_tensor.size(0)
        encoder_outputs = torch.zeros(max_length, encoder.hidden_size, device=device)
        loss = 0
        for ei in range(input_length):
            encoder_output, encoder_hidden = encoder(input_tensor[ei], encoder_hidden)
            # [0, 0] picks the single (hidden_size,) vector out of the (1, 1, hidden_size) output
            encoder_outputs[ei] = encoder_output[0, 0]
            if ei == 0:
                print("encoder_output[0, 0] shape: ", encoder_outputs[ei].shape)  # debug print
        decoder_input = torch.tensor([[SOS_token]], device=device)
        decoder_hidden = encoder_hidden
        # randomly decide whether to use teacher forcing for this example
        use_teacher_forcing = True if random.random() < teacher_forcing_ratio else False
        if use_teacher_forcing:
            # Teacher forcing: feed the target as the next input
            for di in range(output_length):
                decoder_output, decoder_hidden, decoder_attention = decoder(
                    decoder_input, decoder_hidden, encoder_outputs)
                loss = loss + criterion(decoder_output, output_tensor[di])
                decoder_input = output_tensor[di]  # Teacher forcing
        else:
            # Without teacher forcing: use the decoder's own prediction as the next input
            for di in range(output_length):
                decoder_output, decoder_hidden, decoder_attention = decoder(
                    decoder_input, decoder_hidden, encoder_outputs)
                topv, topi = decoder_output.topk(1)
                decoder_input = topi.squeeze().detach()  # detach from history as input
                loss = loss + criterion(decoder_output, output_tensor[di])
                if decoder_input.item() == EOS_token:
                    break
        loss.backward()
        encoder_optimizer.step()
        decoder_optimizer.step()
        return loss.item() / output_length

OK, the models are ready. Next, some timing helpers:

    import time
    import math

    def asMinutes(s):
        m = math.floor(s / 60)
        s -= m * 60
        return '%dm %ds' % (m, s)

    def timeSince(since, percent):
        now = time.time()
        s = now - since
        es = s / (percent)
        rs = es - s
        return '%s (- %s)' % (asMinutes(s), asMinutes(rs))
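A quick usage sketch (my addition): timeSince reports the elapsed time and an estimate of the time remaining, given the fraction of iterations completed.

    start = time.time()
    time.sleep(1)
    print(timeSince(start, 0.25))  # roughly '0m 1s (- 0m 3s)'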

The training loop:

    def trainIters(encoder, decoder, n_iters, print_every=1000, plot_every=100, learning_rate=0.01):
        start = time.time()
        plot_losses = []
        print_loss_total = 0  # Reset every print_every
        plot_loss_total = 0   # Reset every plot_every
        encoder_optimizer = optim.SGD(encoder.parameters(), lr=learning_rate)
        decoder_optimizer = optim.SGD(decoder.parameters(), lr=learning_rate)
        training_pairs = [tensorsFromPair(random.choice(pairs))
                          for i in range(n_iters)]
        criterion = nn.NLLLoss()
        for iter in range(1, n_iters + 1):
            training_pair = training_pairs[iter - 1]
            input_tensor = training_pair[0]
            target_tensor = training_pair[1]
            loss = train(input_tensor, target_tensor, encoder,
                         decoder, encoder_optimizer, decoder_optimizer, criterion)
            print_loss_total = loss + print_loss_total
            plot_loss_total = loss + plot_loss_total
            if iter % print_every == 0:
                print_loss_avg = print_loss_total / print_every
                print_loss_total = 0
                print('%s (%d %d%%) %.4f' % (timeSince(start, iter / n_iters),
                                             iter, iter / n_iters * 100, print_loss_avg))
            if iter % plot_every == 0:
                plot_loss_avg = plot_loss_total / plot_every
                plot_losses.append(plot_loss_avg)
                plot_loss_total = 0
        showPlot(plot_losses)

The plotting code:

    import matplotlib.pyplot as plt
    plt.switch_backend('agg')
    import matplotlib.ticker as ticker
    import numpy as np

    def showPlot(points):
        plt.figure()
        fig, ax = plt.subplots()
        # this locator puts ticks at regular intervals
        loc = ticker.MultipleLocator(base=0.2)
        ax.yaxis.set_major_locator(loc)
        plt.plot(points)
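One caveat (my note, not from the original post): with the 'agg' backend the figure is never shown on screen, so plt.plot alone produces nothing visible; saving the figure is the usual way to look at the curve, e.g.:

    plt.savefig('loss_curve.png')  # hypothetical filename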

The evaluation code:

    def evaluate(encoder, decoder, sentence, max_length=MAX_LENGTH):
        with torch.no_grad():
            input_tensor = tensorFromSentence(input_lang, sentence)
            input_length = input_tensor.size()[0]
            encoder_hidden = encoder.initHidden()
            encoder_outputs = torch.zeros(max_length, encoder.hidden_size, device=device)
            for ei in range(input_length):
                encoder_output, encoder_hidden = encoder(input_tensor[ei],
                                                         encoder_hidden)
                encoder_outputs[ei] += encoder_output[0, 0]
            decoder_input = torch.tensor([[SOS_token]], device=device)  # SOS
            decoder_hidden = encoder_hidden
            decoded_words = []
            decoder_attentions = torch.zeros(max_length, max_length)
            for di in range(max_length):
                decoder_output, decoder_hidden, decoder_attention = decoder(
                    decoder_input, decoder_hidden, encoder_outputs)
                decoder_attentions[di] = decoder_attention.data
                topv, topi = decoder_output.data.topk(1)
                if topi.item() == EOS_token:
                    decoded_words.append('<EOS>')
                    break
                else:
                    decoded_words.append(output_lang.index2word[topi.item()])
                decoder_input = topi.squeeze().detach()
            return decoded_words, decoder_attentions[:di + 1]
    def evaluateRandomly(encoder, decoder, n=10):
        for i in range(n):
            pair = random.choice(pairs)
            print('>', pair[0])
            print('=', pair[1])
            output_words, attentions = evaluate(encoder, decoder, pair[0])
            output_sentence = ' '.join(output_words)
            print('<', output_sentence)
            print('')

The final step:

    hidden_size = 256
    encoder1 = EncoderRNN(input_lang.n_words, hidden_size).to(device)
    attn_decoder1 = AttnDecoderRNN(hidden_size, output_lang.n_words, dropout_p=0.1).to(device)
    trainIters(encoder1, attn_decoder1, 75000, print_every=5000)
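After training finishes, you can spot-check a few translations with the helper defined above (my addition; the output varies from run to run):

    evaluateRandomly(encoder1, attn_decoder1)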
