Machine Learning (Fall 2024)

Administrative Matters

Instructor: Lin ZHANG

TA: Linfei LI, cslinfeili@tongji.edu.cn

 

Office: RM418L, Jishi Building, Jiading Campus

Lin Zhang et al., Computer Vision: Principles, Algorithms, and Practice (计算机视觉:原理算法与实践, in Chinese) (updated on Oct. 06, 2024)

Lecture Slides

 

Slides

Reading Materials


Introduction

 


AdaBoost and Cascade Structure

1. P. Viola and M.J. Jones, Robust real-time face detection, IJCV' 04

2. Y. Freund and R.E. Schapire, A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting, Journal of Computer and System Sciences, 1997
 


Principal Component Analysis

1. M. Turk and A. Pentland, Eigenfaces for recognition, Journal of Cognitive Neuroscience' 91

2. PCADemo: a MATLAB program used in our lectures to demonstrate the basic concepts of PCA
3. FaceRecByEigenFace: a MATLAB demo I wrote to illustrate face recognition with the eigenface approach. It is simple and straightforward. (Run "training.m" first.)

Sparse Representation based Classification

1. J. Wright et al., Robust face recognition via sparse representation, IEEE PAMI' 09
2. L. Zhang et al., Sparse representation or collaborative representation: which helps face recognition?, ICCV' 11
3. Implementations of several l1-minimization solvers, provided by Allen Yang (EECS, Berkeley)
4. CRC_RLS: a MATLAB demo program implementing the CRC_RLS-based face recognition method.

5. Basis pursuit denoising. In applied mathematics and statistics, the l1-minimization problem is known as basis pursuit denoising (BPDN).

6. David L. Donoho and Yaakov Tsaig, Fast Solution of l1-norm Minimization Problems When the Solution May be Sparse, 2006

Linear Models

 

Neural Networks and CNN

1. K. He et al., Deep Residual Learning for Image Recognition, CVPR, 2016

2. G. Huang et al., Densely Connected Convolutional Networks, CVPR, 2017

3. J. Redmon et al., YOLO9000: Better, faster, stronger, CVPR, 2017

4. N. Ma et al., ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design, ECCV, 2018

5. J. Redmon et al., YOLOv3: An Incremental Improvement, arXiv, 2018
6. Github for YOLOv4, https://github.com/AlexeyAB/darknet
7. Ultralytics YOLOv8, https://ultralytics.com/yolov8
8. J.R. Terven et al., A comprehensive review of YOLO: From YOLOv1 to YOLOv8 and beyond, arXiv:2304.0050, 2023.
9. The evolution of typical convolutional neural network architectures (in Chinese)

Fundamentals for Convex Optimization

1. "Part I: Theory" of the book "S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, 2004".

Support Vector Machines

1.  A. Kowalczyk, Support Vector Machines Succinctly, Syncfusion, 2017

2. Demos for SVM, https://github.com/csLinZhang/CVBook/tree/main/chapter-14-SVM

Transformer-based Object Detection

1. A. Vaswani et al., Attention is all you need, NeurIPS, 2017.

2. A. Dosovitskiy et al., An image is worth 16x16 words: Transformers for image recognition at scale, ICLR, 2021.

3. Z. Liu et al., Swin transformer: Hierarchical vision transformer using shifted windows, ICCV, 2021.

4. N. Carion et al., End-to-end object detection with transformers, ECCV, 2020.

5. Y. Zhao et al., DETRs beat YOLOs on real-time object detection, CVPR, 2024.

6. H. Zhang et al., VarifocalNet: An IoU-aware dense object detector, CVPR, 2021.

Visual Perception Practices in Autonomous Driving

1.  Xuan Shao, Lin Zhang* et al., "MOFISSLAM: A multi-object semantic SLAM system with front-view, inertial and surround-view sensors for indoor parking", IEEE Trans. Circuits and Systems for Video Technology, vol. 32, no. 7, pp. 4788-4803, 2022.

2.  Tianjun Zhang, Nlong Zhao, Ying Shen, Xuan Shao, Lin Zhang*, and Yicong Zhou, "ROECS: A robust semi-direct pipeline towards online extrinsics correction of the surround-view system", in: Proc. ACM MM, pp. 3153-3161, 2021.

3.  Lin Zhang et al., "Vision-based parking-slot detection: A DCNN-based approach and a large-scale benchmark dataset", IEEE Trans. Image Processing, vol. 27, no. 11, pp. 5350-5364, 2018.

 

Assignments

 

Notes:

1. Compress all files into a single .rar file named with your name and student ID.

2. For the programming assignments, please make sure your program runs successfully on the TA's machine.

3. All the documents you hand in, including comments in the source code, should be in English.

4. Please send your solutions to the TA (Linfei Li, cslinfeili@tongji.edu.cn) and confirm with him that he has received your email.

 

1. Assignment 1. (Due: Oct. 20, 2024)

2. Assignment 2. (Due: Nov. 26, 2024) supermarket.mp4

3. Assignment 3. (Due: Dec. 31, 2024) mySVM.m, test video for speed-bump and person detection

 

Created on: Aug. 30, 2024

Last updated on: Oct. 5, 2024