Local AI | PRO | PRO REQUIRED | FC-AI-004
Model Eval Claw
model-eval-claw
Evaluates and benchmarks LLM performance with test suites, comparative metrics, and quality analysis.
Model Eval Claw evaluates and benchmarks LLM performance using structured test suites, comparative metrics, and quality analysis. It designs evaluation frameworks that measure accuracy, safety, latency, and cost across models, helping teams choose the right model for their use case.
COMPATIBLE WITH
OpenClaw, Hermes, Claude Code, Codex, +4 more
OpenClaw is the default target. Cursor example below.
When to Use
- Run local models with private workflows
- Tune inference for local hardware
- Choose effective GGUF variants
- Benchmark practical latency and quality
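The last item above, benchmarking practical latency, can be sketched as a small timing harness. This is a minimal illustration and not part of the Model Eval Claw package: `benchmark_latency` and `fake_model` are hypothetical names, and `fake_model` merely stands in for whatever call reaches your local model.

```python
import math
import statistics
import time
from typing import Callable, Dict, List


def benchmark_latency(
    generate: Callable[[str], str],
    prompts: List[str],
    warmup: int = 1,
) -> Dict[str, float]:
    """Time each call to `generate` and summarize latency in seconds."""
    if not prompts:
        raise ValueError("at least one prompt is required")

    # Warm-up calls are not timed, so one-time load costs do not skew results.
    for prompt in prompts[:warmup]:
        generate(prompt)

    samples = []
    for prompt in prompts:
        start = time.perf_counter()
        generate(prompt)
        samples.append(time.perf_counter() - start)

    ordered = sorted(samples)
    # Nearest-rank 95th percentile.
    p95_index = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return {
        "runs": len(samples),
        "mean_s": statistics.mean(samples),
        "median_s": statistics.median(samples),
        "p95_s": ordered[p95_index],
    }


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a local model call."""
    time.sleep(0.001)
    return prompt.upper()


if __name__ == "__main__":
    print(benchmark_latency(fake_model, ["hello", "world", "test"]))
```

Reporting the median and a high percentile alongside the mean helps, because a few slow generations can dominate the average on local hardware.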
Compatible Frameworks
8 TOOLS
Quality Gates
- Framework covers all evaluation dimensions
- Representative test suite
- Consistent scoring rubrics
- Fair, controlled comparisons
- Safety evaluation included
5 GATES DEFINED
Expected Outputs
- evaluation frameworks
- test suites
- scoring rubrics
- comparison matrices
- benchmark reports
- model recommendations
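To make one of these outputs concrete, a comparison matrix can be reduced to a weighted ranking. The sketch below is an assumption about how such a matrix might be scored, not the format the agent actually emits: the metric names, the min-max normalization, the equal weights, and the two sets of model results are all invented for illustration.

```python
from typing import Dict, List, Tuple

# Assumed metric set; True means a higher raw value is better.
HIGHER_IS_BETTER = {
    "accuracy": True,
    "safety": True,
    "latency_s": False,
    "cost_usd": False,
}

Results = Dict[str, Dict[str, float]]


def normalize(results: Results) -> Results:
    """Min-max scale every metric to [0, 1], where 1 is always the best."""
    normalized: Results = {model: {} for model in results}
    for metric, higher in HIGHER_IS_BETTER.items():
        values = [results[model][metric] for model in results]
        lo, hi = min(values), max(values)
        for model in results:
            if hi == lo:
                score = 1.0  # All models tie on this metric.
            else:
                score = (results[model][metric] - lo) / (hi - lo)
                if not higher:
                    score = 1.0 - score
            normalized[model][metric] = score
    return normalized


def rank(results: Results, weights: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return (model, weighted score) pairs, best first."""
    normalized = normalize(results)
    totals = {
        model: sum(weights[metric] * value for metric, value in scores.items())
        for model, scores in normalized.items()
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Invented numbers, for illustration only.
example_results = {
    "model-a": {"accuracy": 0.9, "safety": 0.8, "latency_s": 2.0, "cost_usd": 0.010},
    "model-b": {"accuracy": 0.8, "safety": 0.9, "latency_s": 1.0, "cost_usd": 0.002},
}
equal_weights = {metric: 0.25 for metric in HIGHER_IS_BETTER}

if __name__ == "__main__":
    for model, score in rank(example_results, equal_weights):
        print(f"{model}: {score:.2f}")
```

Min-max normalization only expresses relative standing among the models compared, so the weights should reflect the use case. A latency-sensitive application would weight `latency_s` more heavily than a batch job would.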
Native exports per tool
- OpenClaw (10 files): openclaw/AGENTS.md, openclaw/SOUL.md, openclaw/TOOLS.md, +7 more
- Hermes (5 files): hermes/skills/flickclaw/model-eval-claw/SKILL.md, hermes/skills/flickclaw/model-eval-claw/references/workflow.md, hermes/skills/flickclaw/model-eval-claw/references/quality-gates.md, +2 more
- Claude Code (6 files): claude-code/CLAUDE.md, claude-code/.claude/skills/model-eval-claw/SKILL.md, claude-code/.claude/skills/model-eval-claw/references/workflow.md, +3 more
- Codex (5 files): codex/AGENTS.md, codex/.flickclaw/agents/model-eval-claw/codex.md, codex/.flickclaw/agents/model-eval-claw/workflow.md, +2 more
- Cursor (3 files): cursor/.cursor/rules/flickclaw-model-eval-claw.mdc, cursor/.cursor/rules/flickclaw-model-eval-claw-workflow.mdc, cursor/.cursor/rules/flickclaw-model-eval-claw-quality-gates.mdc
- Windsurf (3 files): windsurf/.windsurf/rules/flickclaw-model-eval-claw.md, windsurf/.windsurf/rules/flickclaw-model-eval-claw-workflow.md, windsurf/.windsurf/rules/flickclaw-model-eval-claw-quality-gates.md
- Aider (3 files): aider/CONVENTIONS.md, aider/aider.md, aider/.aider.conf.yml
- Ollama (4 files): ollama/Modelfile, ollama/system-prompt.md, ollama/template.md, +1 more
Use in Your Tool
Primary command uses OpenClaw by default. Secondary example targets Cursor.
OpenClaw (default, recommended)
npm exec --yes @flickclaw/cli@latest -- install model-eval-claw
Cursor (secondary)
npm exec --yes @flickclaw/cli@latest -- install model-eval-claw --target cursor
Supported AI Agent Frameworks
Example Prompt
Build a complete plan and deliverable package for this agent's role in a production workflow.