Koya Sakamoto

I'm a PhD student at the University of Tokyo, supervised by Prof. Yusuke Iwasawa and Prof. Yutaka Matsuo. I received my master's degree from Kyoto University, where I was fortunate to be supervised by Prof. Shin Ishii and Dr. Motoaki Kawanabe. Before that, I completed my bachelor's degree, also at Kyoto University, advised by Prof. Hidetoshi Shimodaira. My research connects language, vision, and 3D perception, including language-goal aerial navigation, embodied question answering, and city-scale 3D language fields. I'm broadly interested in agents that perceive, reason about, and act in the physical world.

koya DOT sakamoto AT weblab DOT t DOT u-tokyo DOT ac DOT jp  /  CV  /  Scholar  /  GitHub  /  Twitter  /  LinkedIn

Research

E3VS-Bench: A Benchmark for Viewpoint-Dependent Active Perception in 3D Gaussian Splatting Scenes
Koya Sakamoto, Taiki Miyanishi, Daichi Azuma, Shuhei Kurita, Shu Morikuni, Naoya Chiba, Motoaki Kawanabe, Yusuke Iwasawa, Yutaka Matsuo
arXiv preprint, 2026
arXiv  /  project  /  code
GeoProg3D: Compositional Visual Reasoning for City-Scale 3D Language Fields
Shunsuke Yasuki, Taiki Miyanishi, Nakamasa Inoue, Shuhei Kurita, Koya Sakamoto, Daichi Azuma, Masato Taki, Yutaka Matsuo
IEEE/CVF International Conference on Computer Vision (ICCV), 2025
arXiv  /  project  /  code
CityNav: Language-Goal Aerial Navigation Dataset with Geographic Information
Jungdae Lee, Taiki Miyanishi, Shuhei Kurita, Koya Sakamoto, Daichi Azuma, Yutaka Matsuo, Nakamasa Inoue
IEEE/CVF International Conference on Computer Vision (ICCV), 2025
arXiv  /  project  /  code
AIRoA MoMa Dataset: A Large-Scale Hierarchical Dataset for Mobile Manipulation
Ryosuke Takanami, Petr Khrapchenkov, Shu Morikuni, Jumpei Arima, Yuta Takaba, Shunsuke Maeda, Takuya Okubo, Genki Sano, Satoshi Sekioka, Aoi Kadoya, Motonari Kambara, Naoya Nishiura, Haruto Suzuki, Takanori Yoshimoto, Koya Sakamoto, Shinnosuke Ono, Hu Yang, Daichi Yashima, Aoi Horo, Tomohiro Motoda, Kensuke Chiyoma, Hiroshi Ito, Koki Fukuda, Akihito Goto, Kazumi Morinaga, Yuya Ikeda, Riko Kawada, Masaki Yoshikawa, Norio Kosuge, Yuki Noguchi, Kei Ota, Tatsuya Matsushima, Yusuke Iwasawa, Yutaka Matsuo, Tetsuya Ogata
arXiv preprint, 2025
arXiv  /  dataset
Map-based Modular Approach for Zero-shot Embodied Question Answering
Koya Sakamoto, Daichi Azuma, Taiki Miyanishi, Shuhei Kurita, Motoaki Kawanabe
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024
arXiv  /  project  /  code
Answerability Fields: Answerable Location Estimation via Diffusion Models
Daichi Azuma, Taiki Miyanishi, Shuhei Kurita, Koya Sakamoto, Motoaki Kawanabe
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024
arXiv

Adapted from Jon Barron's source code.