Introduction to Neural Networks and Deep Learning (Part 2 — Convolutional Neural Networks, Basic Language Modeling)

When:
April 19, 2025 @ 8:30 am – 12:30 pm (Eastern Time)
Where:
Zoom

Registration Fees:

Member Early Rate (by April 4): $115.00

Member Rate (after April 4): $130.00

Non-Member Early Rate (by April 4): $135.00

Non-Member Rate (after April 4): $150.00

The decision to run or cancel the course will be made by Friday, April 11, 2025.

Series Overview: Neural networks and deep learning currently provide the best solutions to many problems in image recognition, speech recognition, natural language processing, and generative AI.

The Part 1 class and this Part 2 class together teach many of the core concepts behind neural networks and deep learning, as well as basic language modeling.

The planned Part 3 class (to be confirmed) will teach a simple Generative Pre-trained Transformer (GPT), based on the seminal paper “Attention Is All You Need” and OpenAI’s GPT-2/GPT-3.

In the first section of this Part 2 class, we again teach a computer to recognize handwritten digits, this time introducing the convolutional neural network. Convolutional neural networks are predominantly used in computer vision applications, such as recognizing objects in images.
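
To give a concrete feel for these ideas before class, here is a minimal PyTorch sketch of a convolutional network for 28x28 digit images. The layer sizes are illustrative assumptions of ours, not the class's exact architecture, and this is not the Section 1 demo code (which uses the Theano-based code from the book):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleCNN(nn.Module):
    """Tiny convolutional network for 28x28 digit images (illustrative sizes)."""

    def __init__(self):
        super().__init__()
        # One convolutional layer: 20 feature maps, each produced by sliding a
        # 5x5 local receptive field over the image with shared weights and bias.
        self.conv = nn.Conv2d(in_channels=1, out_channels=20, kernel_size=5)
        self.fc = nn.Linear(20 * 12 * 12, 10)  # 10 classes, one per digit

    def forward(self, x):
        x = F.relu(self.conv(x))            # feature maps: (N, 20, 24, 24)
        x = F.max_pool2d(x, kernel_size=2)  # pooling condenses to (N, 20, 12, 12)
        return self.fc(x.flatten(1))        # scores for digits 0-9

logits = SimpleCNN()(torch.randn(1, 1, 28, 28))  # one fake grayscale image
print(logits.shape)  # torch.Size([1, 10])
```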

The second section of the Part 2 class introduces basic language modeling and the simple generation of new text from previously learned text, in this case, baby names.
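
As a taste of what generating text from previously learned text means, the toy sketch below learns character-to-character transition counts from a handful of names and samples a new name from them. This bigram approach, and the tiny hard-coded name list, are our illustration only; the class builds a much richer multi-layer model:

```python
import torch

# Toy bigram name generator (much simpler than the model built in class).
names = ["emma", "olivia", "ava", "isabella", "sophia"]  # assumed tiny dataset
chars = ["."] + sorted(set("".join(names)))  # '.' marks start/end of a name
stoi = {c: i for i, c in enumerate(chars)}

# "Training": count how often each character follows each other character.
counts = torch.zeros(len(chars), len(chars))
for name in names:
    seq = "." + name + "."
    for a, b in zip(seq, seq[1:]):
        counts[stoi[a], stoi[b]] += 1

probs = counts / counts.sum(dim=1, keepdim=True)  # rows become probabilities

# Generation: sample one character at a time until the end marker appears.
g = torch.Generator().manual_seed(0)
ix, out = 0, []
while True:
    ix = torch.multinomial(probs[ix], num_samples=1, generator=g).item()
    if ix == 0:
        break
    out.append(chars[ix])
print("".join(out))  # prints a short made-up name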

You don’t need to be a professional programmer to follow along. The demo code provided is in Python and should be easy to understand with just a little effort.

Reference:

  • Book: Neural Networks and Deep Learning by Michael Nielsen, http://neuralnetworksanddeeplearning.com
  • Video Course: Neural Networks: Zero to Hero by Andrej Karpathy, an OpenAI co-founder, https://karpathy.ai/zero-to-hero.html

Benefits of attending this Part 2 class of the series:

  • Build upon the core principles behind neural networks and deep learning in the Part 1 class to learn about convolutional neural networks.
  • See a simple Python program that solves a concrete problem: teaching a computer to recognize a handwritten digit.
  • Improve the results by incorporating more of the core ideas behind neural networks and deep learning.
  • Understand basic language modeling.
  • Implement a simple language model that generates baby names from existing names.
  • Get introduced to the popular PyTorch library.
  • Run straightforward Python demo code examples.

Just as for the Part 1 class, for the first section of the Part 2 class, the demo Python program (updated from the version provided in the book) can be downloaded from the speaker’s GitHub account. The demo program runs in a Docker container on your Mac, Windows, or Linux personal computer; we will provide instructions for doing that in advance of the class.

The second section of the Part 2 class is based on a Colab-hosted Jupyter Notebook running Python; the link to the file will be shared in advance of the class.

Part 2 class Background and Content: This is a live, instructor-led introductory course on Neural Networks and Deep Learning, planned as a three-part series of classes.

Similar to the Part 1 class, which is a prerequisite, this Part 2 class is also complete by itself. It comprises two sections: Section 1 covers convolutional neural networks, and Section 2 covers basic language modeling. This class will in turn be a prerequisite for the planned Part 3 class (to be confirmed), which introduces a simple Generative Pre-trained Transformer (GPT).

The class material for Section 1 of Part 2 is mostly from the same highly regarded and free online book used for the Part 1 class: Neural Networks and Deep Learning by Michael Nielsen. We add some additional material, such as an introduction to the residual (or skip) connection in a residual block, which is commonly adopted in many types of deep neural networks.
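
The residual (skip) connection itself is only one extra addition in code. Below is a generic PyTorch sketch of a residual block, assuming the block preserves the channel count and spatial size so the input can be added back directly; it is our illustration, not taken from the book:

```python
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    """Generic residual block: output = relu(F(x) + x).

    Assumes the convolutions preserve the channel count and spatial size,
    so the skip connection can be a plain element-wise addition.
    """

    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        out = F.relu(self.conv1(x))
        out = self.conv2(out)
        return F.relu(out + x)  # skip connection: add the input back in
```

The addition gives gradients a direct path backward through the network, which is why residual blocks make very deep networks easier to train.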

The class material for Section 2 of Part 2 is from the sixth video, Building makemore Part 5: Building a WaveNet, of the above-referenced and truly amazing video course series by OpenAI co-founder Andrej Karpathy.

Part 2 class Outline:

Section 1: Convolutional Neural Networks

  • Simple (Python) Network to classify a handwritten digit
    • Local receptive fields
    • Feature map: Shared weights, bias
    • Pooling
  • Demo code using Theano library for learning only
    • Automatic gradient/backprop calculation
    • Weight initialization
  • Quick introduction to PyTorch library
  • AlexNet: Example of a Convolutional Neural Network architecture
  • Residual or Skip connection

Section 2: Basic Language Modeling

  • Simple language model (generate baby names from existing names)
    • Vocabulary (character-level)
    • Block or context length: the number of tokens (characters) considered in predicting the next one
    • Datasets for training, validation, test
    • Multi-layer neural network (a minimal end-to-end sketch follows this outline)
      • Embedding layer, Flatten layer, Linear layer, BatchNorm1d layer, Tanh activation
      • Improve Flatten layer with a hierarchical architecture
      • PyTorch’s cross_entropy function to compute the loss
      • Automatic gradient calculation with PyTorch’s loss.backward method
      • Stochastic Gradient Descent to learn/update parameters
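
As a preview of how these pieces fit together, here is a minimal end-to-end sketch in the spirit of Section 2. It is a simplified, assumption-level example of ours, not the class notebook: the name list is a tiny hard-coded stand-in, there is no BatchNorm1d layer or hierarchical flattening, and it uses full-batch rather than stochastic minibatch updates.

```python
import torch
import torch.nn.functional as F

words = ["emma", "olivia", "ava", "isabella", "sophia"]  # assumed tiny dataset

# Character-level vocabulary; index 0 is '.', the start/end marker.
chars = sorted(set("".join(words)))
stoi = {ch: i + 1 for i, ch in enumerate(chars)}
stoi["."] = 0
vocab_size = len(stoi)

# Build (context, next-character) training pairs with a fixed context length.
block_size = 3  # number of characters used to predict the next one
X, Y = [], []
for w in words:
    context = [0] * block_size
    for ch in w + ".":
        X.append(context)
        Y.append(stoi[ch])
        context = context[1:] + [stoi[ch]]
X, Y = torch.tensor(X), torch.tensor(Y)

# Parameters: embedding table, then two linear layers with a tanh in between.
g = torch.Generator().manual_seed(42)
emb_dim, hidden = 8, 64
C  = torch.randn((vocab_size, emb_dim), generator=g)
W1 = torch.randn((block_size * emb_dim, hidden), generator=g) * 0.1
b1 = torch.zeros(hidden)
W2 = torch.randn((hidden, vocab_size), generator=g) * 0.1
b2 = torch.zeros(vocab_size)
params = [C, W1, b1, W2, b2]
for p in params:
    p.requires_grad = True

for step in range(200):
    emb = C[X].view(X.shape[0], -1)      # embedding layer, then flatten
    h = torch.tanh(emb @ W1 + b1)        # linear layer + tanh activation
    logits = h @ W2 + b2                 # linear layer to vocabulary scores
    loss = F.cross_entropy(logits, Y)    # PyTorch computes the loss
    for p in params:
        p.grad = None
    loss.backward()                      # automatic gradient calculation
    for p in params:
        p.data += -0.1 * p.grad          # gradient-descent parameter update
```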

Part 2 class Prerequisites: The material in the Part 1 class, which requires some basic familiarity with multivariable calculus and matrix algebra, but nothing advanced; and basic familiarity with Python or a similar programming language.

Speaker: CL Kim works in Software Engineering at CarGurus, Inc.

He has graduate degrees in Business Administration and in Computer and Information Science from the University of Pennsylvania. He previously taught for a few years the well-rated IEEE Boston Section class on introduction to the Android Platform and API.