Futa (Kai) Waseda

PhD Student in Information Science and Technology

The University of Tokyo

Biography

I’m a PhD student in Information Science and Technology at The University of Tokyo.

My research interests are the robustness and reliability of deep learning models.

I have worked on various topics in this field, including post-hoc calibration (ICML'23), understanding adversarial attacks (WACV'23), adversarial defenses for computer vision (ICIP'24, ICLR'25), adversarial defenses for vision-language models (MIRU'25 Oral, ACMMM'25), protecting IP in deep learning models (ACL'25 Main), and more.

I am actively looking for research positions in industry or academia starting in 2026. If my research aligns with your interests, I welcome the opportunity to discuss potential collaborations. Please feel free to contact me.

Interests

Deep Learning
Computer Vision
Vision-Language Models
Robustness, Reliability

Education

BEng in Systems Innovation, 2020
The University of Tokyo
MS in Informatics, 2023
The University of Tokyo

News

2025.07: One 1st-authored paper accepted at ACMMM 2025.
2025.06: Reviewed 5 papers for NeurIPS 2025.
2025.05: Reviewed 5 papers for ICML 2025.
2025.05: One 1st-authored paper accepted at ACL 2025 Main.
2025.01: One 1st-authored paper accepted at ICLR 2025.
2024.08 - (current): Research internship at SB Intuitions.
2024.09: One 1st-authored paper accepted at ICIP 2024.
2024.05: One 1st-authored paper accepted at MIRU 2024 Oral.
2024.03: Reviewed 5 papers for ICML 2024.
2024.02 - (current): Research internship at CyberAgent AI Lab.
2023.11: Reviewed 2 papers for ICLR 2024.
2023.10: Reviewed 1 paper for IEEE TIFS 2024.
2023.09: Selected for research fellowship JST DC2.
2023.08 - 2023.12: Research internship at NEC Japan.
2023.07: Reviewed 1 paper for NeurIPS 2023.
2023.07 - 2024.03: Received a research grant of 1 million yen from AIP Challenge Program, JST.
2023.04: One 1st-authored paper accepted at ICML 2023.
2023.04: Started PhD at The University of Tokyo.
2023.03: Selected for research fellowship JST SPRING GX.
2022.10: One 1st-authored paper accepted at WACV 2023.

Skills

Python

Machine Learning/Deep Learning

Teamwork

Experience

Research Internship

SB Intuitions

Aug 2024 – Present Tokyo, Japan.

Research keywords:

Deep Learning
Reliable AI

Research Internship

CyberAgent AI Lab

Feb 2024 – Present Tokyo, Japan.

Research keywords:

Deep Learning
Vision-Language Models
Adversarial Robustness

Research Internship

NEC Corporation

Aug 2023 – Dec 2023 Tokyo, Japan.

Research keywords:

Deep Learning
Computer Vision
Adversarial Robustness
Parameter-Efficient Training

Exchange Student

Technical University of Munich

Apr 2021 – Mar 2022 Munich, Germany.

Conducted research, supervised by Christian Tomani.

Research Assistant

National Institute of Informatics

May 2020 – Present Tokyo, Japan.

Research keywords:

Deep Learning
Computer Vision
Adversarial Robustness

Machine Learning Engineer

Ollo inc.

May 2020 – Present Tokyo, Japan

Responsibilities include:

Researcher
Data Scientist
Software Engineer

Master's Student

The University of Tokyo

Apr 2020 – Mar 2023 Tokyo, Japan

Supervised by Prof. Isao Echizen.

Recent Publications

See all publications.

Futa (Kai) Waseda, Saku Sugawara, Isao Echizen

July 2025 ACMMM 2025

[ACMMM'25] Quality Text, Robust Vision: The Role of Language in Enhancing Visual Robustness of Vision-Language Models

Defending pre-trained vision-language models (VLMs), such as CLIP, against adversarial attacks is crucial, as these models are widely used in diverse zero-shot tasks, including image classification. However, existing adversarial training (AT) methods for robust fine-tuning largely overlook the role of language in enhancing visual robustness. Specifically, (1) supervised AT methods rely on short texts (e.g., class labels) to generate adversarial perturbations, leading to overfitting to object classes in the training data, and (2) unsupervised AT avoids this overfitting but remains suboptimal against practical text-guided adversarial attacks due to its lack of semantic guidance. To address these limitations, we propose Quality Text-guided Adversarial Fine-Tuning (QT-AFT), which leverages high-quality captions during training to guide adversarial examples away from diverse semantics present in images. This enables the visual encoder to robustly recognize a broader range of image features even under adversarial noise, thereby enhancing robustness across diverse downstream tasks. QT-AFT overcomes the key weaknesses of prior methods – overfitting in supervised AT and lack of semantic awareness in unsupervised AT – achieving state-of-the-art zero-shot adversarial robustness and clean accuracy, evaluated across 16 zero-shot datasets. Furthermore, our comprehensive study uncovers several key insights into the role of language in enhancing vision robustness; for example, describing object properties in addition to object names further enhances zero-shot robustness. Our findings point to an urgent direction for future work – centering high-quality linguistic supervision in robust visual representation learning.

PDF

Shojiro Yamabe, Futa (Kai) Waseda, Tsubasa Takahashi, Koki Wataoka

May 2025 ACL 2025 Main

[ACL'25 Main] MergePrint: Merge-Resistant Fingerprints for Robust Black-box Ownership Verification of Large Language Models

Protecting the intellectual property of Large Language Models (LLMs) has become increasingly critical due to the high cost of training. Model merging, which integrates multiple expert models into a single multi-task model, introduces a novel risk of unauthorized use of LLMs due to its efficient merging process. While fingerprinting techniques have been proposed for verifying model ownership, their resistance to model merging remains unexplored. To address this gap, we propose a novel fingerprinting method, MergePrint, which embeds robust fingerprints capable of surviving model merging. MergePrint enables black-box ownership verification, where owners only need to check if a model produces target outputs for specific fingerprint inputs, without accessing model weights or intermediate outputs. By optimizing against a pseudo-merged model that simulates merged behavior, MergePrint ensures fingerprints that remain detectable after merging. Additionally, to minimize performance degradation, we pre-optimize the fingerprint inputs. MergePrint pioneers a practical solution for black-box ownership verification, protecting LLMs from misappropriation via merging, while also excelling in resistance to broader model theft threats.

PDF

Futa (Kai) Waseda, Ching-Chun Chang, Isao Echizen

May 2024 ICLR 2025

[ICLR'25] Rethinking Invariance Regularization in Adversarial Training to Improve Robustness-Accuracy Trade-off

Although adversarial training has been the state-of-the-art approach to defend against adversarial examples (AEs), it suffers from a robustness-accuracy trade-off, where high robustness is achieved at the cost of clean accuracy. In this work, we leverage invariance regularization on latent representations to learn discriminative yet adversarially invariant representations, aiming to mitigate this trade-off. We analyze two key issues in representation learning with invariance regularization: (1) a gradient conflict between invariance loss and classification objectives, leading to suboptimal convergence, and (2) the mixture distribution problem arising from diverged distributions of clean and adversarial inputs. To address these issues, we propose Asymmetrically Representation-regularized Adversarial Training (AR-AT), which incorporates asymmetric invariance loss with stop-gradient operation and a predictor to improve the convergence, and a split-BatchNorm (BN) structure to resolve the mixture distribution problem. Our method significantly improves the robustness-accuracy trade-off by learning adversarially invariant representations without sacrificing discriminative ability. Furthermore, we discuss the relevance of our findings to knowledge-distillation-based defense methods, contributing to a deeper understanding of their relative successes.

PDF

Lukas Strack, Futa (Kai) Waseda, Huy H. Nguyen, Yinqiang Zheng, Isao Echizen

September 2023 ICIP 2024

[ICIP'24] Defending Against Physical Adversarial Patch Attacks on Infrared Human Detection

Infrared detection is an emerging technique for safety-critical tasks owing to its remarkable anti-interference capability. However, recent studies have revealed that it is vulnerable to physically-realizable adversarial patches, posing risks in its real-world applications. To address this problem, we are the first to investigate defense strategies against adversarial patch attacks on infrared detection, especially human detection. We have devised a straightforward defense strategy, patch-based occlusion-aware detection (POD), which efficiently augments training samples with random patches and subsequently detects them. POD not only robustly detects people but also identifies adversarial patch locations. Surprisingly, while being extremely computationally efficient, POD easily generalizes to state-of-the-art adversarial patch attacks that are unseen during training. Furthermore, POD improves detection precision even in a clean (i.e., no-patch) situation due to the data augmentation effect. Evaluation demonstrated that POD is robust to adversarial patches of various shapes and sizes. The effectiveness of our baseline approach is shown to be a viable defense mechanism for real-world infrared human detection systems, paving the way for exploring future research directions.

PDF

Christian Tomani, Futa (Kai) Waseda, Yuesong Shen, Daniel Cremers

February 2023 ICML 2023

[ICML'23] Beyond In-Domain Scenarios: Robust Density-Aware Calibration

Calibrating deep learning models to yield uncertainty-aware predictions is crucial as deep neural networks get increasingly deployed in safety-critical applications. While existing post-hoc calibration methods achieve impressive results on in-domain test datasets, they are limited by their inability to yield reliable uncertainty estimates in domain-shift and out-of-domain (OOD) scenarios. We aim to bridge this gap by proposing DAC, an accuracy-preserving as well as Density-Aware Calibration method based on k-nearest-neighbors (KNN). In contrast to existing post-hoc methods, we utilize hidden layers of classifiers as a source for uncertainty-related information and study their importance. We show that DAC is a generic method that can readily be combined with state-of-the-art post-hoc methods. DAC boosts the robustness of calibration performance in domain-shift and OOD, while maintaining excellent in-domain predictive uncertainty estimates. We demonstrate that DAC leads to consistently better calibration across a large number of model architectures, datasets, and metrics. Additionally, we show that DAC improves calibration substantially on recent large-scale neural networks pre-trained on vast amounts of data.

PDF

Futa (Kai) Waseda, Sosuke Nishikawa, Trung-Nghia Le, Huy H. Nguyen, Isao Echizen

February 2023 WACV 2023

[WACV'23] Closer Look at the Transferability of Adversarial Examples: How They Fool Different Models Differently

Deep neural networks are vulnerable to adversarial examples (AEs), which have adversarial transferability: AEs generated for the source model can mislead another (target) model’s predictions. However, the transferability has not been understood in terms of to which class target model’s predictions were misled (i.e., class-aware transferability). In this paper, we differentiate the cases in which a target model predicts the same wrong class as the source model (“same mistake”) or a different wrong class (“different mistake”) to analyze and provide an explanation of the mechanism. We find that (1) AEs tend to cause same mistakes, which correlates with “non-targeted transferability”; however, (2) different mistakes occur even between similar models, regardless of the perturbation size. Furthermore, we present evidence that the difference between same mistakes and different mistakes can be explained by non-robust features, predictive but human-uninterpretable patterns: different mistakes occur when non-robust features in AEs are used differently by models. Non-robust features can thus provide consistent explanations for the class-aware transferability of AEs.

PDF Poster

Futa (Kai) Waseda, Kenji Tanaka

June 2020 EEEIC 2020

[EEEIC'20] Bidding Agent for Electric Vehicles in Peer-to-Peer Electricity Trading Market considering uncertainty

As solar power generation spreads, net power demand fluctuates sharply between day and night. The peer-to-peer (P2P) electricity market is expected to be a solution, with storage-battery users playing an important role. Against this background, widespread EVs are expected to participate in the P2P market and make use of their battery storage. However, previous research conducted only simulations and effect verification under ideal conditions, and proposed no EV bidding agent that works in real situations. In this paper, we therefore propose a complete system for a robust automatic EV bidding agent that works in real situations, and conduct case studies based on actual EV driving data. The results show that even when EVs are driven irregularly, the proposed bidding agent delivers benefits for EV users and levels power demand throughout the day.

PDF Slides

Tsubasa Sakai, Maiko Kamada, Kento Hori, Futa (Kai) Waseda

March 2019 DEIM 2019

[DEIM'19] Multifaceted Understanding of Restaurant Customers Using Consumer Surveys (original title: 消費者アンケートを活用した飲食店顧客の多面的理解)

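Understanding consumers' tastes and preferences through big-data analysis, and using that understanding to acquire customers efficiently, is widely attempted. Information obtained by analyzing basic customer attributes such as age and occupation, or by analyzing purchase histories, is effective. Given the diversity of consumer preferences, however, consumers' purchasing psychology needs to be understood from more varied perspectives. In this study, through the task of predicting "frequency of restaurant visits" from multifaceted survey data of about 2,000 items covering consumers' basic information, inner traits, values, and behavior, we extracted the sets of features characteristic of the consumers who use each service. As a result, we were able to grasp consumer preferences from new perspectives that restaurant purchase data alone could not provide, and to identify the consumer segments characteristic of each restaurant. These results offer important implications for how consumer information is collected in marketing.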

Conference Link PDF

Awards

Won the special prize at The Analytics Hackathon 2019, hosted by SAS.

SAS Institute Japan Jun 2019

In the contest, participants were given data and asked to build a highly accurate machine learning system. (article url: https://enterprisezine.jp/article/detail/12209?p=2)

Won the first prize at the MDS Data Science Contest 2018.

Mathematics and Informatics Center Nov 2018

In the contest, participants were given big data and asked to freely perform a value-generating analysis. Our group won first prize and went on to submit a paper; see the publications section.

Accomplishments

Summer School for Deep Generative Models 2020

Matsuo Lab. Aug 2020 – Sep 2020

Studied deep generative models, from the basics to the state of the art.

Chair for Global Consumer Intelligence (GCI 2018)

Matsuo Lab. Oct 2018

Learned how to make use of big data with machine learning techniques.
