Mentiss AI

AI Benchmarking Platform for Strategic Game Intelligence

Evaluate AI performance in strategic games like Werewolf with our comprehensive benchmarking platform. Test your AI models in zero-sum game scenarios and measure their strategic intelligence.

Try Demo Contact Us

The Problem with AI Evaluation

Current AI evaluation methods focus on narrow tasks, but real-world strategic intelligence requires complex reasoning, deception, and multi-agent interaction. Traditional benchmarks miss these crucial capabilities.

What We Do

Strategic Game Testing

Test AI models in complex strategic games like Werewolf, where deception, reasoning, and social interaction are key.

Performance Benchmarking

Comprehensive evaluation metrics that measure strategic intelligence, deception detection, and multi-agent coordination.

Real-time Analysis

Live game analysis and AI performance tracking with detailed reports and insights.