AI Security (CSED490H)
A practical red-teaming course covering modern AI systems, attack methods, and optimization tools for AI security research.
Objective
As AI systems advance and are deployed at scale, safety and security concerns are growing rapidly. In this course, we learn the art of attacking AI systems, along with the core concepts and tools of modern AI.
In particular, we study two core axes: victim models (e.g., LLMs, VLAs, and agentic AI) and attack methods (e.g., adversarial examples and jailbreaking), together with optimization tools such as gradient descent, policy optimization, and prompt tuning with LoRA.
By the end of this class, students will have a strong understanding of current AI model families, a broad range of AI red-teaming methods, and the practical AI tooling required for security research and engineering.
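To give a flavor of the gradient-based attacks covered in the course, here is a minimal FGSM-style adversarial-example sketch against a toy logistic-regression "victim model". All weights, inputs, and the model itself are illustrative assumptions, not course material.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy victim model: p(y=1 | x) = sigmoid(w.x + b)  (weights chosen arbitrarily)
w = np.array([2.0, -1.0])
b = 0.0

x = np.array([1.0, 1.0])  # clean input, confidently classified as y=1
y = 1.0                   # true label

# Gradient of the cross-entropy loss w.r.t. the *input* x:
# dL/dx = (sigmoid(w.x + b) - y) * w
grad_x = (sigmoid(w @ x + b) - y) * w

# FGSM step: perturb the input in the direction that increases the loss
eps = 0.5
x_adv = x + eps * np.sign(grad_x)

p_clean = sigmoid(w @ x + b)   # ~0.73 on the clean input
p_adv = sigmoid(w @ x_adv + b) # ~0.38 on the perturbed input
print(f"clean p(y=1) = {p_clean:.3f}, adversarial p(y=1) = {p_adv:.3f}")
```

With a small, bounded perturbation, the model's confidence in the correct label drops enough to flip the prediction. The same idea, scaled up with automatic differentiation, underlies the adversarial-example attacks discussed in weeks 2 and onward.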
Course Snapshot
Course Staff
Instructor
Prof. Sangdon Park
Assistant Professor
Graduate School of Artificial Intelligence (GSAI)
Department of Computer Science and Engineering (CSE)
POSTECH
Teaching Assistant
Sechan Lee
Teaching Assistant for CSED490H. Supports course operations and student discussions, and provides technical guidance on assignments and project milestones.
Email: chan1031@postech.ac.kr
Schedule
| Week | Topics |
|---|---|
| 1 | Introduction to AI Security; Course Logistics |
| 2 | Preliminary: Neural Networks / SGD; Inference-time Attacks: Adversarial Examples / Adversarial Patches / Transfer Attacks |
| 3 | Preliminary: Transformers / LLMs / LCMs / LRMs; Preliminary: RAG |
| 4 | Student Presentation and Discussion on HW 1 |
| 5 | Preliminary: Diffusion Models; Preliminary: Vision-Language-Action Models |
| 6 | Preliminary: Optimization for Whitebox Victim Models -- Prompt Tuning Methods (e.g., LoRA); Preliminary: Optimization for Blackbox Victim Models -- Zeroth-Order Optimization |
| 7 | Preliminary: Optimization for Blackbox Victim Models -- RL / Policy Optimization; Inference-time Attacks: Prompt Leaking, Prompt Injection, Jailbreaking |
| 8 | Preliminary: Agentic AI / Tool-calling Agents; Inference-time Attacks: Current Trends in Red Teaming |
| 9 | Student Presentation and Discussion on HW 2 |
| 10 | Introduction to OpenClaw |
| 11 | Training-set Attacks: Membership Inference Attacks; Training-set Attacks: Data Poisoning Attacks |
| 12 | Model Attacks: Model Extraction Attacks |
| 13 | Final Remarks: Overview of Defense Methods |
| 14 | Student Presentation and Discussion on Final Projects |
| 15 | Student Presentation and Discussion on Final Projects |