Welcome to my homepage! I am Futa Waseda, a PhD student in Information Science and Technology at The University of Tokyo.
My research focuses on the robustness and reliability of deep learning models. I am particularly fascinated by computer vision and vision-language multi-modal learning, aiming to understand and mitigate real-world risks.
Explore my work and join me in the journey to make AI more reliable and create a better future!
BEng in Systems Innovation, 2020
The University of Tokyo
MS in Informatics, 2023
The University of Tokyo
Research keywords:
Responsibilities include:
Although adversarial training has been the state-of-the-art approach to defend against adversarial examples (AEs), it suffers from a robustness-accuracy trade-off, where high robustness is achieved at the cost of clean accuracy. In this work, we leverage invariance regularization on latent representations to learn discriminative yet adversarially invariant representations, aiming to mitigate this trade-off. We analyze two key issues in representation learning with invariance regularization: (1) a gradient conflict between invariance loss and classification objectives, leading to suboptimal convergence, and (2) the mixture distribution problem arising from diverged distributions of clean and adversarial inputs. To address these issues, we propose Asymmetrically Representation-regularized Adversarial Training (AR-AT), which incorporates asymmetric invariance loss with stop-gradient operation and a predictor to improve the convergence, and a split-BatchNorm (BN) structure to resolve the mixture distribution problem. Our method significantly improves the robustness-accuracy trade-off by learning adversarially invariant representations without sacrificing discriminative ability. Furthermore, we discuss the relevance of our findings to knowledge-distillation-based defense methods, contributing to a deeper understanding of their relative successes.
Infrared detection is an emerging technique for safety-critical tasks owing to its remarkable anti-interference capability. However, recent studies have revealed that it is vulnerable to physically-realizable adversarial patches, posing risks in its real-world applications. To address this problem, we are the first to investigate defense strategies against adversarial patch attacks on infrared detection, especially human detection. We have devised a straightforward defense strategy, patch-based occlusion-aware detection (POD), which efficiently augments training samples with random patches and subsequently detects them. POD not only robustly detects people but also identifies adversarial patch locations. Surprisingly, while being extremely computationally efficient, POD easily generalizes to state-of-the-art adversarial patch attacks that are unseen during training. Furthermore, POD improves detection precision even in a clean (i.e., no-patch) situation due to the data augmentation effect. Evaluation demonstrated that POD is robust to adversarial patches of various shapes and sizes. These results suggest that our baseline approach is a viable defense mechanism for real-world infrared human detection systems, paving the way for future research.
Calibrating deep learning models to yield uncertainty-aware predictions is crucial as deep neural networks get increasingly deployed in safety-critical applications. While existing post-hoc calibration methods achieve impressive results on in-domain test datasets, they are limited by their inability to yield reliable uncertainty estimates in domain-shift and out-of-domain (OOD) scenarios. We aim to bridge this gap by proposing DAC, an accuracy-preserving as well as Density-Aware Calibration method based on k-nearest-neighbors (KNN). In contrast to existing post-hoc methods, we utilize hidden layers of classifiers as a source for uncertainty-related information and study their importance. We show that DAC is a generic method that can readily be combined with state-of-the-art post-hoc methods. DAC boosts the robustness of calibration performance in domain-shift and OOD, while maintaining excellent in-domain predictive uncertainty estimates. We demonstrate that DAC leads to consistently better calibration across a large number of model architectures, datasets, and metrics. Additionally, we show that DAC improves calibration substantially on recent large-scale neural networks pre-trained on vast amounts of data.
Deep neural networks are vulnerable to adversarial examples (AEs), which have adversarial transferability: AEs generated for the source model can mislead another (target) model’s predictions. However, the transferability has not been understood in terms of which class the target model’s predictions are misled to (i.e., class-aware transferability). In this paper, we differentiate the cases in which a target model predicts the same wrong class as the source model (“same mistake”) or a different wrong class (“different mistake”) to analyze and provide an explanation of the mechanism. We find that (1) AEs tend to cause same mistakes, which correlates with “non-targeted transferability”; however, (2) different mistakes occur even between similar models, regardless of the perturbation size. Furthermore, we present evidence that the difference between same mistakes and different mistakes can be explained by non-robust features, predictive but human-uninterpretable patterns: different mistakes occur when non-robust features in AEs are used differently by models. Non-robust features can thus provide consistent explanations for the class-aware transferability of AEs.
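The "same mistake" versus "different mistake" distinction can be made concrete with a small sketch. This is only an illustration of the definitions in the abstract, not the paper's evaluation code: the function names, the outcome labels, and the handling of AEs that fail to fool the source model are all assumptions made here.

```python
from collections import Counter


def classify_transfer_outcome(true_label, source_pred, target_pred):
    """Label one adversarial example by its effect on the target model.

    The category names are hypothetical labels chosen for this sketch.
    """
    if source_pred == true_label:
        # The AE did not even fool the model it was generated for.
        return "source_not_fooled"
    if target_pred == true_label:
        # The target model still predicts the correct class.
        return "not_transferred"
    if target_pred == source_pred:
        # Both models are misled to the same wrong class.
        return "same_mistake"
    # Both models are wrong, but they disagree on the wrong class.
    return "different_mistake"


def summarize(true_labels, source_preds, target_preds):
    """Count outcomes over a batch of adversarial examples."""
    outcomes = (
        classify_transfer_outcome(t, s, g)
        for t, s, g in zip(true_labels, source_preds, target_preds)
    )
    return dict(Counter(outcomes))
```

Given per-example labels and predictions from both models, `summarize` returns a count for each outcome, which is enough to compare how often same mistakes and different mistakes occur for a given pair of models.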
As solar power generation spreads, net power demand fluctuates sharply between day and night. The P2P (peer-to-peer) electricity market is expected to be a solution in which storage battery users play an important role. Against this background, widely deployed EVs are expected to participate in the P2P market and make use of their battery storage. However, previous research has only conducted simulations and effect verification under ideal conditions, and no EV bidding agent that works in real situations has been proposed. Therefore, this paper proposes a complete system for a robust automatic EV bidding agent that works in real situations, and conducts case studies based on actual EV driving data. The results show that even when EVs are driven irregularly, the proposed EV bidding agent delivers benefits for EV users and levels power demand throughout the day.
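Understanding consumers' tastes and preferences through big data analysis, and thereby acquiring customers efficiently, is widely attempted. Information obtained by analyzing basic customer attributes such as age and occupation, along with purchase histories, is effective; however, given the diversity of consumer preferences, consumer purchasing psychology needs to be understood from more varied perspectives. In this study, through the task of predicting "frequency of restaurant visits" from multifaceted questionnaire data of roughly 2,000 items covering consumers' basic information, inner characteristics, values, and behavior, we extracted the sets of features characteristic of the consumers who use each service. As a result, we were able to understand consumer preferences from new perspectives that restaurant purchase data alone could not have provided, and to identify the consumer segments characteristic of each restaurant. These results offer important implications for considering how to collect consumer information in marketing.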
Built a model that generates human-like tweet text from an image using an encoder-decoder model. An application of image captioning techniques.
My first Twitter bot app. It learns Japanese from its followers by fitting the retrieved data to a Markov model.
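For readers unfamiliar with the idea, here is a minimal sketch of a first-order Markov text model. It is not the bot's actual code: the function names and the `<s>` / `</s>` boundary markers are made up for illustration, and it assumes whitespace-separated tokens, whereas real Japanese text would first need a word segmenter.

```python
import random
from collections import defaultdict

START, END = "<s>", "</s>"  # hypothetical sentence-boundary markers


def build_chain(sentences):
    """Map each token to the list of tokens observed right after it."""
    chain = defaultdict(list)
    for sentence in sentences:
        tokens = [START] + sentence.split() + [END]
        for current, following in zip(tokens, tokens[1:]):
            chain[current].append(following)
    return chain


def generate(chain, rng, max_words=20):
    """Random-walk the chain from START until END or max_words tokens."""
    word = START
    output = []
    for _ in range(max_words):
        # Duplicates in the list make frequent transitions more likely.
        word = rng.choice(chain[word])
        if word == END:
            break
        output.append(word)
    return " ".join(output)
```

Calling `generate(build_chain(texts), random.Random())` on a non-empty list of collected texts produces a new sentence whose adjacent word pairs all appeared somewhere in the training data.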
A demo web application I made at school. In this app, you can clip places you want to visit in the future and find the shortest route through the chosen spots. I was responsible for the front end, using HTML, CSS, and JavaScript.
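As a rough illustration of the route-finding idea, the sketch below brute-forces every visiting order for a handful of spots. It is not the app's implementation: it assumes a fixed starting point, planar (x, y) coordinates with straight-line distances, and a one-way trip that does not return to the start; the function name is made up.

```python
from itertools import permutations
from math import dist  # Euclidean distance, Python 3.8+


def shortest_route(start, spots):
    """Try every visiting order and return (best order, total length).

    Brute force is O(n!), so this is only practical for a few spots.
    """
    best_order, best_length = (), float("inf")
    for order in permutations(spots):
        length = 0.0
        previous = start
        for spot in order:
            length += dist(previous, spot)
            previous = spot
        if length < best_length:
            best_order, best_length = order, length
    return list(best_order), best_length
```

For example, `shortest_route((0, 0), [(2, 0), (1, 0), (3, 0)])` visits the spots in the order (1, 0), (2, 0), (3, 0) for a total length of 3.0. A real map application would use road distances or travel times instead of straight lines.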