Search Results: glm-v

Found 3 Skills

glmv-grounding

A skill that uses GLM-V native grounding capabilities for coordinate conversion, bounding-box visualization, and more. GLM-V native grounding can locate any target specified by the prompt in an image and output relative coordinates normalized to 0-1000 based on image size. Coordinate formats include 2D bounding box (default), 2D points, and 3D bounding box. GLM-V also supports spatiotemporal localization and tracking of multiple prompt-specified targets in videos, outputting 2D bounding boxes per second.
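The coordinate conversion mentioned above can be sketched as follows. This is a minimal illustration, not the skill's own script: the function name `to_pixel_box` and the `[x1, y1, x2, y2]` corner ordering are assumptions, and only the 0-1000 normalization comes from the description.

```python
from typing import Sequence, Tuple

# The description states that coordinates are normalized to 0-1000.
NORM_SCALE = 1000


def to_pixel_box(
    box: Sequence[float], image_width: int, image_height: int
) -> Tuple[int, int, int, int]:
    """Convert a 0-1000 normalized 2D box to pixel coordinates.

    Assumes the box is ordered [x1, y1, x2, y2] (hypothetical ordering;
    check the model output format before relying on it).
    """
    x1, y1, x2, y2 = box

    def scale(value: float, size: int) -> int:
        # Rescale to the image dimension, then clamp to the image bounds.
        pixel = round(value / NORM_SCALE * size)
        return max(0, min(size, pixel))

    return (
        scale(x1, image_width),
        scale(y1, image_height),
        scale(x2, image_width),
        scale(y2, image_height),
    )


print(to_pixel_box([100, 250, 500, 750], 1920, 1080))  # (192, 270, 960, 810)
```

Clamping keeps slightly out-of-range model outputs inside the image, which is convenient when drawing boxes.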

🇺🇸 | English | Translated

6 scripts | Attention

AI & Machine Learning | zai-org/glm-skills

glmv-doc-based-writing

Write text content based on the given document(s) and requirements, using the ZhiPu GLM-V multimodal model. Read and comprehend one or more documents (PDF/DOCX), then write the content in Markdown format according to the specified requirements. Use when the user wants to draft a paper, article, essay, report, review, post, brief, proposal, plan, etc.

🇺🇸 | English | Translated

1 script | Checked

AI & Machine Learning | thincher/awsome_skills

glm-understand-image

Perform image understanding and analysis using GLM Vision MCP. Trigger conditions: (1) the user requests image analysis, image understanding, or a description of image content; (2) objects, text, or scenes in an image need to be identified; (3) the task calls for GLM's visual understanding capabilities.

🇨🇳 | Chinese | Translated