Search Results: llm-inference

Found 30 Skills

hugging-face-evaluation

Add and manage evaluation results in Hugging Face model cards. Supports extracting eval tables from README content, importing scores from Artificial Analysis API, and running custom model evaluations with vLLM/lighteval. Works with the model-index metadata format.

8 scripts/Checked

Code Quality | yonatangross/orchestkit

performance

Performance optimization patterns covering Core Web Vitals, React render optimization, lazy loading, image optimization, backend profiling, and LLM inference. Use when improving page speed, debugging slow renders, optimizing bundles, reducing image payload, profiling the backend, or deploying LLMs efficiently.

7 scripts/Attention

AI & Machine Learning | parcadei/continuous-claud...

agentica-server

Agentica server and Claude proxy setup: architecture, startup sequence, and debugging.

AI & Machine Learning | team-telnyx/skills

telnyx-ai-inference-python

Access Telnyx LLM inference APIs, embeddings, and AI analytics for call insights and summaries. This skill provides Python SDK examples.

AI & Machine Learning | ruvnet/ruflo

llm-config

Configure RuVLLM local inference with model selection, MicroLoRA fine-tuning, and SONA adaptation

AI & Machine Learning | scientiacapital/skills

groq-inference

Fast LLM inference with Groq API - chat, vision, audio STT/TTS, tool use. Use when: groq, fast inference, low latency, whisper, PlayAI TTS, Llama, vision API, tool calling, voice agents, real-time AI.

1 scripts/Checked

AI & Machine Learning | pluginagentmarketplace/cu...

model-deployment

LLM deployment strategies including vLLM, TGI, and cloud inference endpoints.

1 scripts/Checked

AI & Machine Learning | aradotso/trending-skills

deepseek-ocr

Expert skill for using DeepSeek-OCR, a vision-language model for optical character recognition with context optical compression supporting documents, PDFs, and images.

AI & Machine Learning | team-telnyx/skills

telnyx-ai-inference-curl

Access Telnyx LLM inference APIs, embeddings, and AI analytics for call insights and summaries. This skill provides REST API (curl) examples.

AI & Machine Learning | databricks/databricks-age...

databricks-model-serving

Manage Databricks Model Serving endpoints via CLI. Use when asked to create, configure, query, or manage model serving endpoints for LLM inference, custom models, or external models.

AI & Machine Learning | ascend-ai-coding/awesome-...

vllm-ascend

vLLM Ascend plugin for LLM inference serving on Huawei Ascend NPUs. Use for offline batch inference, API server deployment, quantized inference (with msmodelslim-quantized models), tensor/pipeline parallelism for distributed serving, and OpenAI-compatible API endpoints. Supports Qwen, DeepSeek, GLM, and LLaMA models with Ascend-optimized kernels.

3 scripts/Attention

AI & Machine Learning | hkuds/cli-anything

cli-anything-ollama

Command-line interface for Ollama: local LLM inference and model management via the Ollama REST API. Designed for AI agents and power users who need to manage models, generate text, chat, and create embeddings without a GUI.
