AI Security (CSED490H)

A practical red-teaming course covering modern AI systems, attack methods, and optimization tools for AI security research.

Objective

As AI advances and is deployed at scale, safety and security concerns are emerging rapidly. In this course, we learn the art of attacking AI systems, together with the core concepts and tools of modern AI.

In particular, we study two core axes: victim models (e.g., LLMs, VLAs, and Agentic AI) and attack methods (e.g., adversarial examples and jailbreaking), along with optimization tools such as gradient descent, policy optimization, and prompt tuning with LoRA.
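As a small taste of the gradient-based attacks studied early in the course, the sketch below crafts an FGSM-style adversarial example against a toy logistic-regression "victim". This is purely illustrative and not course material; the model weights and inputs are made up for the example.

```python
# Illustrative FGSM-style adversarial example against a toy linear classifier.
# All weights and inputs here are invented for demonstration purposes.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy "victim" model: logistic regression with fixed weights.
w = np.array([1.5, -2.0, 0.5])
b = 0.1

def predict(x):
    """Probability that x belongs to the positive class."""
    return sigmoid(w @ x + b)

def fgsm(x, y, eps):
    """Perturb x by eps in the sign of the loss gradient.
    For the logistic loss, d(loss)/dx = (p - y) * w."""
    p = predict(x)
    grad = (p - y) * w
    return x + eps * np.sign(grad)

x = np.array([0.2, -0.4, 1.0])
y = 1.0                        # true label of x
x_adv = fgsm(x, y, eps=0.3)    # adversarial input near x
print(predict(x), predict(x_adv))  # confidence in the true class drops
```

The same one-step idea extends to neural networks by replacing the closed-form gradient with one computed via backpropagation; iterating the step (PGD) typically yields stronger attacks.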

By the end of this course, students will have a strong understanding of current AI model families, a broad range of AI red-teaming methods, and the practical AI tooling required for security research and engineering.

Course Snapshot

Credits: 3-0-3
Time / Location: Thu 14:00–15:15 / Building 2 - 109
Prerequisite: Artificial Intelligence
Assessment: 80% Projects · 20% Participation

Course Staff

Instructor

Prof. Sangdon Park

Assistant Professor
Graduate School of Artificial Intelligence (GSAI)
Department of Computer Science and Engineering (CSE)
POSTECH

Teaching Assistant

Sechan Lee

Teaching Assistant for CSED490H. Supporting course operations, student discussions, and technical guidance on assignments and project milestones.

Email: chan1031@postech.ac.kr

Schedule

Week Topics
1  Introduction to AI Security
   Course Logistics
2  Models: Neural Networks / SGD
   Inference-time Attacks: Adversarial Examples / Adversarial Patches / Transfer Attacks
3  Models: Transformer / LLMs / LRMs / LoRA / RAG
4  Student Presentation and Discussion on HW 1
5  Inference-time Attacks for LLMs: Basics
6  Models: Diffusion Models / VLMs / VLAs
   Optimization: Optimization for Whitebox Victim Models -- Prompt Tuning Methods
   Optimization: Optimization for Blackbox Victim Models -- Zeroth-Order Optimization
7  Optimization: Optimization for Blackbox Victim Models -- Policy Optimization
   Inference-time Attacks for GenAI: Advanced
8  Models: Agentic AI / Tool-calling Agents
   Introduction to OpenClaw
9  Student Presentation and Discussion on HW 2
10 Inference-time Attacks for Agentic AI: Current Trends
11 Training-set Attacks: Membership Inference Attacks
   Training-set Attacks: Data Poisoning Attacks
12 Model Attacks: Model Extraction Attacks
13 Final Remarks: Overview of Defense Methods
14 Student Presentation and Discussion on Final Projects
15 Student Presentation and Discussion on Final Projects