AquaVLM: A Domain-Specific Vision–Language Model for Structured Understanding of Oceanarium Scenes
Abstract: Vision-Language Models (VLMs) have advanced cross-modal understanding and generation, yet their domain adaptability remains limited. To address the lack of high-quality captions for fish ...
Abstract: Large language models (LLMs) have received considerable attention recently due to their outstanding comprehension and reasoning capabilities, leading to great progress in many fields. The ...