MARQA: Bridging Human and Autonomous Ship Operations Through Large Language Model-Based Navigational Reasoning

Korea Advanced Institute of Science and Technology (KAIST)
T-ITS review

*Corresponding Author

Abstract

In maritime navigation, understanding the intentions of other vessels and safely avoiding collisions are critical. Avoidance maneuvers are guided by the Convention on the International Regulations for Preventing Collisions at Sea (COLREG). However, when imminent danger or unusual circumstances make strict adherence to COLREG impractical, vessels may negotiate mutual agreements by radio. Because conventional human-operated ships and autonomous ships make decisions differently, communication gaps arise between them. To address this issue, this study develops MARQA (Maritime Autonomous Reasoning for Question Answering), an algorithm that uses large language models (LLMs) to enable navigational question-and-answer interactions between vessels. The algorithm uses a Turning Circle-based Control Barrier Function (TC-CBF) to explore possible scenarios and represents each one in a vector space that accounts for COLREG compliance, navigational safety, and efficiency. It then selects the optimal scenario through LLM reasoning and produces text outputs that facilitate agreements with manned ships. We propose the first navigational question-answering benchmark dataset for autonomous ships and verify the algorithm's performance on it. The proposed algorithm achieves an answer accuracy of 98.55% across various scenarios and question types. It also produces answers that follow the flow of context in multi-turn conversations.

Flowchart of the proposed algorithm MARQA.

Experiment Results

Methodology

Scenario Generation

Graphical representation of the TC-CBF. TC-CBF generates an obstacle-avoiding trajectory by selecting the ego ship's turning circle direction.
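The turning-circle idea can be illustrated with a small geometric sketch. This page does not give the actual TC-CBF formulation, so everything below is an assumption made for illustration: a planar frame with heading measured counterclockwise from the +x axis, a fixed turning radius, a circular obstacle, and a barrier value defined as the clearance between the obstacle and each candidate turning circle. The function names are hypothetical.

```python
import math


def turning_circle_centers(x, y, psi, radius):
    """Centers of the port (left) and starboard (right) turning circles.

    Assumed convention: psi is in radians, counterclockwise from the +x axis.
    """
    port = (x - radius * math.sin(psi), y + radius * math.cos(psi))
    starboard = (x + radius * math.sin(psi), y - radius * math.cos(psi))
    return port, starboard


def barrier_value(center, radius, obstacle_xy, obstacle_radius):
    """Assumed barrier: h >= 0 when the obstacle disc lies outside the circle."""
    dx = obstacle_xy[0] - center[0]
    dy = obstacle_xy[1] - center[1]
    return math.hypot(dx, dy) - (radius + obstacle_radius)


def turning_circle_barriers(x, y, psi, radius, obstacle_xy, obstacle_radius):
    """Barrier value for each turning direction of the ego ship."""
    port, starboard = turning_circle_centers(x, y, psi, radius)
    return {
        "port": barrier_value(port, radius, obstacle_xy, obstacle_radius),
        "starboard": barrier_value(starboard, radius, obstacle_xy, obstacle_radius),
    }


# Ego ship at the origin heading along +x; obstacle off the port side.
barriers = turning_circle_barriers(0.0, 0.0, 0.0, 100.0, (0.0, 150.0), 20.0)
feasible = [side for side, h in barriers.items() if h >= 0.0]
```

In this example the port circle overlaps the obstacle (negative barrier value) while the starboard circle stays clear, so only a starboard turn remains a candidate. A full controller would enforce such a constraint continuously along the trajectory rather than at a single instant.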

Top K Scenario Selection

Representation into vector space. The results generated from Scenario Generation are stored in a vector space with three components: COLREG Score, Safety Score, and Efficiency Score. Subsequently, the most promising K scenarios are selected.
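As a minimal sketch of this step, each scenario can be stored as a three-component score vector and ranked. How MARQA combines the three scores is not stated on this page, so the weighted sum, the default weights, and the names `Scenario` and `select_top_k` are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """One candidate maneuver with its three scores (higher is better)."""
    name: str
    colreg: float
    safety: float
    efficiency: float


def select_top_k(scenarios, k, weights=(1.0, 1.0, 1.0)):
    """Rank scenarios by an assumed weighted sum of scores and keep the best k."""
    w_colreg, w_safety, w_efficiency = weights

    def combined(s):
        return (w_colreg * s.colreg
                + w_safety * s.safety
                + w_efficiency * s.efficiency)

    return sorted(scenarios, key=combined, reverse=True)[:k]


candidates = [
    Scenario("starboard turn", colreg=0.9, safety=0.8, efficiency=0.5),
    Scenario("port turn", colreg=0.2, safety=0.9, efficiency=0.9),
    Scenario("keep course", colreg=0.5, safety=0.5, efficiency=0.5),
]
top_two = select_top_k(candidates, k=2)
```

With equal weights, the hypothetical starboard turn ranks first and the port turn second. Raising the COLREG weight would push non-compliant maneuvers further down the list.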

Answer Generation

Input and Output Prompts. The input prompt includes the Top K scenarios, previous dialogue, environmental data, and question from target ship. The LLM infers the optimal and suboptimal scenarios and generates the response after evaluating COLREG compliance.
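The input prompt described above could be assembled roughly as follows. The section labels, the wording of the final instruction, and the function name `build_prompt` are hypothetical; the actual prompt template used by MARQA is not shown on this page.

```python
def build_prompt(top_k_scenarios, dialogue, environment, question):
    """Assemble an input prompt from the four ingredients named in the caption.

    top_k_scenarios: list of scenario descriptions, best first
    dialogue: list of (speaker, utterance) pairs from earlier turns
    environment: dict of environmental data
    question: the latest question from the target ship
    """
    lines = ["Candidate scenarios (best first):"]
    for rank, scenario in enumerate(top_k_scenarios, start=1):
        lines.append(f"{rank}. {scenario}")

    lines.append("Previous dialogue:")
    lines.extend(f"{speaker}: {utterance}" for speaker, utterance in dialogue)

    lines.append("Environment:")
    lines.extend(f"{key}: {value}" for key, value in environment.items())

    lines.append(f"Question from target ship: {question}")
    lines.append(
        "Choose the optimal and a suboptimal scenario, "
        "evaluate COLREG compliance, then answer the question."
    )
    return "\n".join(lines)


prompt = build_prompt(
    top_k_scenarios=["starboard turn", "port turn"],
    dialogue=[("Target ship", "What is your intention?")],
    environment={"visibility": "good"},
    question="Can we pass starboard to starboard?",
)
```

Keeping the previous dialogue in the prompt is what lets the model answer follow-up questions in multi-turn conversations consistently with earlier turns.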

BibTeX

@article{Shin2025MARQA,
  title={MARQA: Bridging Human and Autonomous Ship Operations Through Large Language Model-Based Navigational Reasoning},
  author={Shin, Yeongha and Lee, Changyu and Kim, Jinwhan},
  journal={},
  year={2025},
}