TensorFlow Tutorial

  • Initialize variables
  • Start your own session
  • Train algorithms
  • Implement a Neural Network

1. Exploring the Tensorflow Library

To start, you will import the library:

import math
import numpy as np
import h5py
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.python.framework import ops
from tf_utils import load_dataset, random_mini_batches, convert_to_one_hot, predict

%matplotlib inline
np.random.seed(1)

Now that you have imported the library, we will walk you through its different applications. You will start with an example, where we compute for you the loss of one training example.

\[loss = \mathcal{L}(\hat{y}, y) = (\hat y^{(i)} - y^{(i)})^2 \tag{1}
\]
y_hat = tf.constant(36, name='y_hat')            # Define y_hat constant. Set to 36.
y = tf.constant(39, name='y')                    # Define y. Set to 39
loss = tf.Variable((y - y_hat)**2, name='loss')  # Create a variable for the loss

init = tf.global_variables_initializer()         # When init is run later (session.run(init)),
                                                 # the loss variable will be initialized and ready to be computed
with tf.Session() as session:                    # Create a session and print the output
    session.run(init)                            # Initializes the variables
    print(session.run(loss))                     # Prints the loss

Writing and running programs in TensorFlow has the following steps:

  1. Create Tensors (variables) that are not yet executed/evaluated.
  2. Write operations between those Tensors.
  3. Initialize your Tensors.
  4. Create a Session.
  5. Run the Session. This will run the operations you'd written above.
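
For instance, here is a minimal sketch of these five steps (the constants a and b and the product c are purely illustrative names, not part of the assignment):

a = tf.constant(2)
b = tf.constant(10)
c = tf.multiply(a, b)        # c is a Tensor in the graph, not yet evaluated
print(c)                     # prints a Tensor object, not the number 20

with tf.Session() as sess:   # constants need no initializer, so we go straight to a session
    print(sess.run(c))       # running the graph evaluates c and prints 20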

1.1 - Linear function

Let's start this programming exercise by computing the following equation: \(Y = WX + b\), where \(W\) and \(X\) are random matrices and \(b\) is a random vector.

Exercise: Compute \(WX + b\) where \(W, X\), and \(b\) are drawn from a random normal distribution. W is of shape (4, 3), X is (3,1) and b is (4,1). As an example, here is how you would define a constant X that has shape (3,1):

X = tf.constant(np.random.randn(3,1), name = "X")

You might find the following functions helpful:

  • tf.matmul(..., ...) to do a matrix multiplication
  • tf.add(..., ...) to do an addition
  • np.random.randn(...) to initialize randomly
# GRADED FUNCTION: linear_function

def linear_function():
    """
    Implements a linear function:
        Initializes W to be a random tensor of shape (4,3)
        Initializes X to be a random tensor of shape (3,1)
        Initializes b to be a random tensor of shape (4,1)

    Returns:
    result -- runs the session for Y = WX + b
    """
    np.random.seed(1)

    ### START CODE HERE ### (4 lines of code)
    W = tf.constant(np.random.randn(4, 3), name='W')
    X = tf.constant(np.random.randn(3, 1), name='X')
    b = tf.constant(np.random.randn(4, 1), name='b')
    Y = tf.add(tf.matmul(W, X), b)
    ### END CODE HERE ###

    # Create the session using tf.Session() and run it with sess.run(...) on the variable you want to calculate
    ### START CODE HERE ###
    sess = tf.Session()
    result = sess.run(Y)
    ### END CODE HERE ###

    # close the session
    sess.close()

    return result

1.2 - Computing the sigmoid

Great! You just implemented a linear function. Tensorflow offers a variety of commonly used neural network functions like tf.sigmoid and tf.softmax. For this exercise, let's compute the sigmoid function of an input.

You will do this exercise using a placeholder variable x. When running the session, you should use the feed dictionary to pass in the input z. In this exercise, you will have to (i) create a placeholder x, (ii) define the operations needed to compute the sigmoid using tf.sigmoid, and then (iii) run the session.

**Exercise**: Implement the sigmoid function below. You should use the following:

  • tf.placeholder(tf.float32, name = "...")
  • tf.sigmoid(...)
  • sess.run(..., feed_dict = {x: z})

Note that there are two typical ways to create and use sessions in tensorflow:

Method 1:

sess = tf.Session()
# Run the variables initialization (if needed), run the operations
result = sess.run(..., feed_dict = {...})
sess.close() # Close the session

Method 2:

with tf.Session() as sess:
    # run the variables initialization (if needed), run the operations
    result = sess.run(..., feed_dict = {...})
    # This takes care of closing the session for you :)
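
A minimal sketch of such a sigmoid function, combining the placeholder, tf.sigmoid, and a Method-2 session (one possible way to fill in the graded function, not necessarily the official solution), is:

def sigmoid(z):
    # (i) create a placeholder x for the input
    x = tf.placeholder(tf.float32, name="x")

    # (ii) define the operation that computes the sigmoid of x
    sigmoid = tf.sigmoid(x)

    # (iii) run the session, feeding the value z into the placeholder x
    with tf.Session() as sess:
        result = sess.run(sigmoid, feed_dict={x: z})

    return result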
  1. print ("sigmoid(0) = " + str(sigmoid(0)))
  2. print ("sigmoid(12) = " + str(sigmoid(12)))

Output:

sigmoid(0) = 0.5

sigmoid(12) = 0.9999942

**To summarize, you now know how to**:
1. Create placeholders
2. Specify the computation graph corresponding to operations you want to compute
3. Create the session
4. Run the session, using a feed dictionary if necessary to specify placeholder variables' values.

1.3 - Computing the Cost

You can also use a built-in function to compute the cost of your neural network. So instead of needing to write code to compute this as a function of \(a^{[2](i)}\) and \(y^{(i)}\) for i=1...m:

\[J = - \frac{1}{m} \sum_{i = 1}^m \large ( \small y^{(i)} \log a^{ [2] (i)} + (1-y^{(i)})\log (1-a^{ [2] (i)} )\large )\small\tag{2}
\]

you can do it in one line of code in tensorflow!

Exercise: Implement the cross entropy loss. The function you will use is:

  • tf.nn.sigmoid_cross_entropy_with_logits(logits = ..., labels = ...)

Your code should input z, compute the sigmoid (to get a) and then compute the cross entropy cost \(J\). All this can be done using one call to tf.nn.sigmoid_cross_entropy_with_logits, which computes

\[- \frac{1}{m} \sum_{i = 1}^m \large ( \small y^{(i)} \log \sigma(z^{[2](i)}) + (1-y^{(i)})\log (1-\sigma(z^{[2](i)}))\large )\small\tag{2}
\]
# GRADED FUNCTION: cost

def cost(logits, labels):
    """
    Computes the cost using the sigmoid cross entropy

    Arguments:
    logits -- vector containing z, output of the last linear unit (before the final sigmoid activation)
    labels -- vector of labels y (1 or 0)

    Note: What we've been calling "z" and "y" in this class are respectively called "logits" and "labels"
    in the TensorFlow documentation. So logits will feed into z, and labels into y.

    Returns:
    cost -- runs the session of the cost (formula (2))
    """
    ### START CODE HERE ###
    z = tf.placeholder(tf.float32, name='z')
    y = tf.placeholder(tf.float32, name='y')

    # Use the loss function (approx. 1 line)
    cost = tf.nn.sigmoid_cross_entropy_with_logits(logits=z, labels=y)

    # Create a session (approx. 1 line). See method 1 above.
    sess = tf.Session()

    # Run the session (approx. 1 line).
    cost = sess.run(cost, feed_dict={z: logits, y: labels})

    # Close the session (approx. 1 line). See method 1 above.
    sess.close()
    ### END CODE HERE ###

    return cost
logits = sigmoid(np.array([0.2,0.4,0.7,0.9]))
cost = cost(logits, np.array([0,0,1,1]))
print ("cost = " + str(cost))

cost = [ 1.00538719 1.03664088 0.41385433 0.39956614]

1.4 - Using One Hot encodings

Many times in deep learning you will have a y vector with numbers ranging from 0 to C-1, where C is the number of classes. If C is for example 4, then you might have the following y vector which you will need to convert as follows:
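
For example, with C = 4 classes the label vector

y = [1 2 3 0 2 1]

would be converted to the matrix

[[0 0 0 1 0 0]
 [1 0 0 0 0 1]
 [0 1 0 0 1 0]
 [0 0 1 0 0 0]]

where column j holds the one-hot encoding of label y[j] (a 1 in row y[j], zeros elsewhere).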

This is called a "one hot" encoding, because in the converted representation exactly one element of each column is "hot" (meaning set to 1). To do this conversion in numpy, you might have to write a few lines of code. In tensorflow, you can use one line of code:

  • tf.one_hot(labels, depth, axis)

Exercise: Implement the function below to take one vector of labels and the total number of classes \(C\), and return the one hot encoding. Use tf.one_hot() to do this.

# GRADED FUNCTION: one_hot_matrix

def one_hot_matrix(labels, C):
    """
    Creates a matrix where the i-th row corresponds to the ith class number and the jth column
    corresponds to the jth training example. So if example j has label i, then entry (i,j)
    will be 1.

    Arguments:
    labels -- vector containing the labels
    C -- number of classes, the depth of the one hot dimension

    Returns:
    one_hot -- one hot matrix
    """
    ### START CODE HERE ###
    # Create a tf.constant equal to C (depth), name it 'C'. (approx. 1 line)
    C = tf.constant(C, name='C')

    # Use tf.one_hot, be careful with the axis (approx. 1 line)
    one_hot_matrix = tf.one_hot(indices=labels, depth=C, axis=0)

    # Create the session (approx. 1 line)
    sess = tf.Session()

    # Run the session (approx. 1 line)
    one_hot = sess.run(one_hot_matrix)

    # Close the session (approx. 1 line). See method 1 above.
    sess.close()
    ### END CODE HERE ###

    return one_hot

Output:

labels = np.array([1,2,3,0,2,1])
one_hot = one_hot_matrix(labels, C = 4)
print ("one_hot = " + str(one_hot))

1.5 - Initialize with zeros and ones

Now you will learn how to initialize a vector of zeros and ones. The function you will be calling is tf.ones(). To initialize with zeros you could use tf.zeros() instead. These functions take in a shape and return an array of dimension shape full of zeros and ones respectively.

Exercise: Implement the function below to take in a shape and return an array of ones with that shape.

  • tf.ones(shape)
# GRADED FUNCTION: ones

def ones(shape):
    """
    Creates an array of ones of dimension shape

    Arguments:
    shape -- shape of the array you want to create

    Returns:
    ones -- array containing only ones
    """
    ### START CODE HERE ###
    # Create "ones" tensor using tf.ones(...). (approx. 1 line)
    ones = tf.ones(shape)

    # Create the session (approx. 1 line)
    sess = tf.Session()

    # Run the session to compute 'ones' (approx. 1 line)
    ones = sess.run(ones)

    # Close the session (approx. 1 line). See method 1 above.
    sess.close()
    ### END CODE HERE ###

    return ones

Test:

  1. print ("ones = " + str(ones([3])))

2 - Building your first neural network in tensorflow

In this part of the assignment you will build a neural network using tensorflow. Remember that there are two parts to implementing a tensorflow model:

  • Create the computation graph
  • Run the graph

Let's delve into the problem you'd like to solve!

2.0 - Problem statement: SIGNS Dataset

Hand-sign digit recognition

  • Training set: 1080 pictures (64 by 64 pixels) of signs representing numbers from 0 to 5 (180 pictures per number).
  • Test set: 120 pictures (64 by 64 pixels) of signs representing numbers from 0 to 5 (20 pictures per number).

Note that this is a subset of the SIGNS dataset. The complete dataset contains many more signs.

Here are examples for each number, and an explanation of how we represent the labels. These are the original pictures, before we lowered the image resolution to 64 by 64 pixels.

Figure 1: SIGNS dataset

Run the following code to load the dataset.

# Loading the dataset
X_train_orig, Y_train_orig, X_test_orig, Y_test_orig, classes = load_dataset()

Change the index below and run the cell to visualize some examples in the dataset.

# Example of a picture
index = 0
plt.imshow(X_train_orig[index])
print ("y = " + str(np.squeeze(Y_train_orig[:, index])))

y = 4

As usual, flatten the image data and normalize it by dividing by 255. In addition, you need to convert each numeric label to a one-hot vector, as shown in Figure 1.

# Flatten the training and test images
X_train_flatten = X_train_orig.reshape(X_train_orig.shape[0], -1).T
X_test_flatten = X_test_orig.reshape(X_test_orig.shape[0], -1).T

# Normalize image vectors
X_train = X_train_flatten/255.
X_test = X_test_flatten/255.

# Convert training and test labels to one hot matrices
Y_train = convert_to_one_hot(Y_train_orig, 6)
Y_test = convert_to_one_hot(Y_test_orig, 6)

print ("number of training examples = " + str(X_train.shape[1]))
print ("number of test examples = " + str(X_test.shape[1]))
print ("X_train shape: " + str(X_train.shape))
print ("Y_train shape: " + str(Y_train.shape))
print ("X_test shape: " + str(X_test.shape))
print ("Y_test shape: " + str(Y_test.shape))

number of training examples = 1080

number of test examples = 120

X_train shape: (12288, 1080)

Y_train shape: (6, 1080)

X_test shape: (12288, 120)

Y_test shape: (6, 120)

Note that 12288 comes from \(64 \times 64 \times 3\). Each image is square, 64 by 64 pixels, and 3 is for the RGB colors. Please make sure all these shapes make sense to you before continuing.

Your goal is to build an algorithm capable of recognizing a sign with high accuracy. To do so, you are going to build a tensorflow model that is almost the same as one you have previously built in numpy for cat recognition (but now using a softmax output).

The model is LINEAR -> RELU -> LINEAR -> RELU -> LINEAR -> SOFTMAX (the SOFTMAX output layer is used because the network must output multiple classes).

2.1 - Create placeholders

Your first task is to create placeholders for X and Y. This will allow you to later pass your training data in when you run your session.

Exercise: Implement the function below to create the placeholders in tensorflow.

# GRADED FUNCTION: create_placeholders

def create_placeholders(n_x, n_y):
    """
    Creates the placeholders for the tensorflow session.

    Arguments:
    n_x -- scalar, size of an image vector (num_px * num_px * 3 = 64 * 64 * 3 = 12288)
    n_y -- scalar, number of classes (from 0 to 5, so -> 6)

    Returns:
    X -- placeholder for the data input, of shape [n_x, None] and dtype "float"
    Y -- placeholder for the input labels, of shape [n_y, None] and dtype "float"

    Tips:
    - You will use None because it lets us be flexible on the number of examples for the placeholders.
      In fact, the number of examples during test/train is different.
    """
    ### START CODE HERE ### (approx. 2 lines)
    X = tf.placeholder(tf.float32, shape=[n_x, None], name='X')
    Y = tf.placeholder(tf.float32, shape=[n_y, None], name='Y')
    ### END CODE HERE ###

    return X, Y
X, Y = create_placeholders(12288, 6)
print ("X = " + str(X))
print ("Y = " + str(Y))

X = Tensor("X_1:0", shape=(12288, ?), dtype=float32)

Y = Tensor("Y:0", shape=(6, ?), dtype=float32)

2.2 - Initializing the parameters

Your second task is to initialize the parameters in tensorflow.

Exercise: Initialize the parameters in tensorflow. Use Xavier initialization for the weights and zero initialization for the biases. As an example, to help you, for W1 and b1 you could use:

W1 = tf.get_variable("W1", [25,12288], initializer = tf.contrib.layers.xavier_initializer(seed = 1))
b1 = tf.get_variable("b1", [25,1], initializer = tf.zeros_initializer())

Please use seed = 1 to make sure your results match ours.

# GRADED FUNCTION: initialize_parameters

def initialize_parameters():
    """
    Initializes parameters to build a neural network with tensorflow. The shapes are:
        W1 : [25, 12288]
        b1 : [25, 1]
        W2 : [12, 25]
        b2 : [12, 1]
        W3 : [6, 12]
        b3 : [6, 1]

    Returns:
    parameters -- a dictionary of tensors containing W1, b1, W2, b2, W3, b3
    """
    tf.set_random_seed(1)   # so that your "random" numbers match ours

    ### START CODE HERE ### (approx. 6 lines of code)
    W1 = tf.get_variable("W1", [25, 12288], initializer=tf.contrib.layers.xavier_initializer(seed=1))
    b1 = tf.get_variable("b1", [25, 1], initializer=tf.zeros_initializer())
    W2 = tf.get_variable("W2", [12, 25], initializer=tf.contrib.layers.xavier_initializer(seed=1))
    b2 = tf.get_variable("b2", [12, 1], initializer=tf.zeros_initializer())
    W3 = tf.get_variable("W3", [6, 12], initializer=tf.contrib.layers.xavier_initializer(seed=1))
    b3 = tf.get_variable("b3", [6, 1], initializer=tf.zeros_initializer())
    ### END CODE HERE ###

    parameters = {"W1": W1,
                  "b1": b1,
                  "W2": W2,
                  "b2": b2,
                  "W3": W3,
                  "b3": b3}

    return parameters
tf.reset_default_graph()
with tf.Session() as sess:
    parameters = initialize_parameters()
    print("W1 = " + str(parameters["W1"]))
    print("b1 = " + str(parameters["b1"]))
    print("W2 = " + str(parameters["W2"]))
    print("b2 = " + str(parameters["b2"]))

W1 = <tf.Variable 'W1:0' shape=(25, 12288) dtype=float32_ref>

b1 = <tf.Variable 'b1:0' shape=(25, 1) dtype=float32_ref>

W2 = <tf.Variable 'W2:0' shape=(12, 25) dtype=float32_ref>

b2 = <tf.Variable 'b2:0' shape=(12, 1) dtype=float32_ref>

2.3 - Forward propagation in tensorflow

You will now implement the forward propagation module in tensorflow. The function will take in a dictionary of parameters and complete the forward pass. The functions you will be using are:

  • tf.add(...,...) to do an addition
  • tf.matmul(...,...) to do a matrix multiplication
  • tf.nn.relu(...) to apply the ReLU activation

Question: Implement the forward pass of the neural network. We commented for you the numpy equivalents so that you can compare the tensorflow implementation to numpy. It is important to note that the forward propagation stops at z3. The reason is that in tensorflow the last linear layer output is given as input to the function computing the loss. Therefore, you don't need a3!

# GRADED FUNCTION: forward_propagation

def forward_propagation(X, parameters):
    """
    Implements the forward propagation for the model: LINEAR -> RELU -> LINEAR -> RELU -> LINEAR -> SOFTMAX

    Arguments:
    X -- input dataset placeholder, of shape (input size, number of examples)
    parameters -- python dictionary containing your parameters "W1", "b1", "W2", "b2", "W3", "b3"
                  the shapes are given in initialize_parameters

    Returns:
    Z3 -- the output of the last LINEAR unit
    """
    # Retrieve the parameters from the dictionary "parameters"
    W1 = parameters['W1']
    b1 = parameters['b1']
    W2 = parameters['W2']
    b2 = parameters['b2']
    W3 = parameters['W3']
    b3 = parameters['b3']

    ### START CODE HERE ### (approx. 5 lines)   # Numpy Equivalents:
    Z1 = tf.add(tf.matmul(W1, X), b1)           # Z1 = np.dot(W1, X) + b1
    A1 = tf.nn.relu(Z1)                         # A1 = relu(Z1)
    Z2 = tf.add(tf.matmul(W2, A1), b2)          # Z2 = np.dot(W2, A1) + b2
    A2 = tf.nn.relu(Z2)                         # A2 = relu(Z2)
    Z3 = tf.add(tf.matmul(W3, A2), b3)          # Z3 = np.dot(W3, A2) + b3
    ### END CODE HERE ###

    return Z3
tf.reset_default_graph()
with tf.Session() as sess:
    X, Y = create_placeholders(12288, 6)
    parameters = initialize_parameters()
    Z3 = forward_propagation(X, parameters)
    print("Z3 = " + str(Z3))

2.4 - Compute cost

As seen before, it is very easy to compute the cost using:

tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits = ..., labels = ...))

Question: Implement the cost function below.

  • It is important to know that the "logits" and "labels" inputs of tf.nn.softmax_cross_entropy_with_logits are expected to be of shape (number of examples, num_classes). We have thus transposed Z3 and Y for you.
  • Besides, tf.reduce_mean basically does the summation over the examples.
# GRADED FUNCTION: compute_cost

def compute_cost(Z3, Y):
    """
    Computes the cost

    Arguments:
    Z3 -- output of forward propagation (output of the last LINEAR unit), of shape (6, number of examples)
    Y -- "true" labels vector placeholder, same shape as Z3

    Returns:
    cost - Tensor of the cost function
    """
    # to fit the tensorflow requirement for tf.nn.softmax_cross_entropy_with_logits(...,...)
    logits = tf.transpose(Z3)   # transpose
    labels = tf.transpose(Y)

    ### START CODE HERE ### (1 line of code)
    # tf.reduce_mean computes the mean of a tensor along the given axis;
    # here it averages the softmax cross-entropy loss over the examples.
    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits,
                                                                  labels=labels))
    ### END CODE HERE ###

    return cost
tf.reset_default_graph()
with tf.Session() as sess:
    X, Y = create_placeholders(12288, 6)
    parameters = initialize_parameters()
    Z3 = forward_propagation(X, parameters)
    cost = compute_cost(Z3, Y)
    print("cost = " + str(cost))

cost = Tensor("Mean:0", shape=(), dtype=float32)

2.5 - Backward propagation & parameter updates

All of the backpropagation and the parameter updates are taken care of in one line of code.

After you compute the cost function, you will create an "optimizer" object: you choose an optimization method and a learning rate, and the optimizer minimizes the cost.

For instance, for gradient descent the optimizer would be:

optimizer = tf.train.GradientDescentOptimizer(learning_rate = learning_rate).minimize(cost)

To make the optimization you would do:

_ , c = sess.run([optimizer, cost], feed_dict={X: minibatch_X, Y: minibatch_Y})

This computes the backpropagation by passing through the tensorflow graph in reverse order, from cost to inputs.

Note: when coding, we often use _ as a "throwaway" variable to store values that we won't need to use later. Here, _ takes on the evaluated value of optimizer, which we don't need (and c takes the value of the cost variable).

2.6 - Building the model

Now, you will bring it all together!

Exercise: Implement the model. You will be calling the functions you had previously implemented.

def model(X_train, Y_train, X_test, Y_test, learning_rate = 0.0001,
          num_epochs = 1500, minibatch_size = 32, print_cost = True):
    """
    Implements a three-layer tensorflow neural network: LINEAR->RELU->LINEAR->RELU->LINEAR->SOFTMAX.

    Arguments:
    X_train -- training set, of shape (input size = 12288, number of training examples = 1080)
    Y_train -- training labels, of shape (output size = 6, number of training examples = 1080)
    X_test -- test set, of shape (input size = 12288, number of test examples = 120)
    Y_test -- test labels, of shape (output size = 6, number of test examples = 120)
    learning_rate -- learning rate of the optimization
    num_epochs -- number of epochs of the optimization loop
    minibatch_size -- size of a minibatch
    print_cost -- True to print the cost every 100 epochs

    Returns:
    parameters -- parameters learnt by the model. They can then be used to predict.
    """
    ops.reset_default_graph()    # to be able to rerun the model without overwriting tf variables
    tf.set_random_seed(1)        # to keep consistent results
    seed = 3                     # to keep consistent results
    (n_x, m) = X_train.shape     # (n_x: input size, m : number of examples in the train set)
    n_y = Y_train.shape[0]       # n_y : output size
    costs = []                   # To keep track of the cost

    # Create Placeholders of shape (n_x, n_y)
    ### START CODE HERE ### (1 line)
    X, Y = create_placeholders(n_x, n_y)
    ### END CODE HERE ###

    # Initialize parameters
    ### START CODE HERE ### (1 line)
    parameters = initialize_parameters()
    ### END CODE HERE ###

    # Forward propagation: Build the forward propagation in the tensorflow graph
    ### START CODE HERE ### (1 line)
    Z3 = forward_propagation(X, parameters)
    ### END CODE HERE ###

    # Cost function: Add cost function to tensorflow graph
    ### START CODE HERE ### (1 line)
    cost = compute_cost(Z3, Y)
    ### END CODE HERE ###

    # Backpropagation: Define the tensorflow optimizer. Use an AdamOptimizer.
    ### START CODE HERE ### (1 line)
    optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
    ### END CODE HERE ###

    # Initialize all the variables
    init = tf.global_variables_initializer()

    # Start the session to compute the tensorflow graph
    with tf.Session() as sess:

        # Run the initialization
        sess.run(init)

        # Do the training loop
        for epoch in range(num_epochs):

            epoch_cost = 0.                            # Defines a cost related to an epoch
            num_minibatches = int(m / minibatch_size)  # number of minibatches of size minibatch_size in the train set
            seed = seed + 1
            minibatches = random_mini_batches(X_train, Y_train, minibatch_size, seed)

            for minibatch in minibatches:

                # Select a minibatch
                (minibatch_X, minibatch_Y) = minibatch

                # IMPORTANT: The line that runs the graph on a minibatch.
                # Run the session to execute the "optimizer" and the "cost"; the feed_dict should contain a minibatch for (X,Y).
                ### START CODE HERE ### (1 line)
                _, minibatch_cost = sess.run([optimizer, cost],
                                             feed_dict={X: minibatch_X,
                                                        Y: minibatch_Y})
                ### END CODE HERE ###

                epoch_cost += minibatch_cost / num_minibatches

            # Print the cost every epoch
            if print_cost == True and epoch % 100 == 0:
                print ("Cost after epoch %i: %f" % (epoch, epoch_cost))
            if print_cost == True and epoch % 5 == 0:
                costs.append(epoch_cost)

        # plot the cost
        plt.plot(np.squeeze(costs))
        plt.ylabel('cost')
        plt.xlabel('iterations (per tens)')
        plt.title("Learning rate =" + str(learning_rate))
        plt.show()

        # lets save the parameters in a variable
        parameters = sess.run(parameters)
        print ("Parameters have been trained!")

        # Calculate the correct predictions
        correct_prediction = tf.equal(tf.argmax(Z3), tf.argmax(Y))

        # Calculate accuracy on the test set
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

        print ("Train Accuracy:", accuracy.eval({X: X_train, Y: Y_train}))
        print ("Test Accuracy:", accuracy.eval({X: X_test, Y: Y_test}))

        return parameters
parameters = model(X_train, Y_train, X_test, Y_test)

Cost after epoch 0: 1.855702

Cost after epoch 100: 1.016458

Cost after epoch 200: 0.733102

Cost after epoch 300: 0.572939

Cost after epoch 400: 0.468774

Cost after epoch 500: 0.381021

Cost after epoch 600: 0.313827

Cost after epoch 700: 0.254280

Cost after epoch 800: 0.203799

Cost after epoch 900: 0.166512

Cost after epoch 1000: 0.140937

Cost after epoch 1100: 0.107750

Cost after epoch 1200: 0.086299

Cost after epoch 1300: 0.060949

Cost after epoch 1400: 0.050934

Parameters have been trained!

Train Accuracy: 0.999074

Test Accuracy: 0.725

2.7 - Test with your own image

# import scipy
# from PIL import Image
# from scipy import ndimage
import imageio
from skimage.transform import resize

## START CODE HERE ## (PUT YOUR IMAGE NAME)
my_image = "thumbs_up.jpg"
## END CODE HERE ##

# We preprocess your image to fit your algorithm.
fname = "images/" + my_image
# image = np.array(ndimage.imread(fname, flatten=False))
# my_image = scipy.misc.imresize(image, size=(64,64)).reshape((1, 64*64*3)).T
image = np.array(imageio.imread(fname))   # read the image as an array; the commented-out original fails because scipy removed that function
# print(image.shape)

# reshape the image into a (num_px * num_px * 3, 1) vector
my_image = resize(image, output_shape=(64, 64)).reshape((1, 64 * 64 * 3)).T
# print(my_image)

my_image_prediction = predict(my_image, parameters)

plt.imshow(image)
print("Your algorithm predicts: y = " + str(np.squeeze(my_image_prediction)))
