Questions & Answers

Frequently Asked Questions

Common questions from institutional teams

You can. Here's what you lose. One model asked to "give four perspectives" is one brain roleplaying. Our four independent models have genuinely different training data, different biases, and different blind spots. Push back on any single LLM and it caves. Ours don't see each other's output until cross-examination. When all four flag the same risk independently, that signal means something. When one model lists five risks, you have no idea which ones matter.

A single model with four prompts is one set of weights playing four roles. You get one polished perspective, not genuine dissent. Prompt libraries also drift. Every analyst tweaks them, none of the runs are comparable, and the audit trail is whatever someone remembered to paste. Conclavik orchestrates four frontier models from four different labs, surfaces where they disagree, and keeps a structured record of every challenge round. Different problem, different tool.

Yes. Conclavik works across cadences and document types: M&A diligence memos and CIMs, quarterly portfolio reviews, monthly thesis updates, weekly research stress-tests. The format adapts to what you upload, from a 200-page deal memo to a one-page thesis. The multi-model debate stays the same.

Seven to twelve minutes for a full multi-model second opinion. That covers four independent analyses, then five rounds of structured cross-examination. Submit the night before and have a stress-tested report ready for your next committee, partner meeting, or board readout. We optimised for depth, not speed. If you wanted fast surface-level answers, you'd use ChatGPT.

Conclavik runs four frontier AI models per analysis, one from each leading provider. The current panel is Anthropic, OpenAI, Google, and xAI. We track public model leaderboards continuously and select the strongest model available from each provider's lineup. If a new provider produces a stronger entry, we swap. The principle is provider diversity (different training corpora produce different blind spots) plus current best quality, not commitment to any specific vendor. The value isn't access to the four models, which is available to anyone. The value is that you don't manage four vendor relationships, four APIs, four version-tracking processes, and the orchestration code that makes them debate.

Enterprise-grade. AES-256 encryption at rest, TLS in transit, EU-hosted infrastructure (Germany). Each analysis is cryptographically isolated: no cross-client data access, no query aggregation, no model fine-tuning on your inputs. Ephemeral mode available: your question is processed and immediately purged, with only a cryptographic attestation retained. Full details on our Security page.

By default, encrypted analysis records are retained for 90 days so you can access your reports. After 90 days, they're automatically purged. You can enable ephemeral mode for any analysis: zero retention, cryptographic proof only. You can also request immediate deletion of any record under GDPR Article 17.
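
For teams that want to reason about the retention rule concretely, here is a minimal sketch of the 90-day default and the ephemeral option. The function names are hypothetical and chosen for illustration; this is not Conclavik's implementation or API.

```python
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 90  # default retention window stated in the answer above


def purge_time(created_at, ephemeral=False):
    """Return the moment a record becomes eligible for purging.

    Assumption: ephemeral mode is modelled as zero retention, so the
    record is eligible for purging as soon as it has been created.
    """
    if ephemeral:
        return created_at
    return created_at + timedelta(days=RETENTION_DAYS)


def is_expired(created_at, now, ephemeral=False):
    """True once the record has reached its purge time."""
    return now >= purge_time(created_at, ephemeral)


created = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(purge_time(created).date())  # 2025-04-01
```

A deletion request under GDPR Article 17 would sit outside this schedule, since it removes a record on demand rather than on a timer.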

Yes, on request. Standard deployment processes data in EU regions. For enterprise and regulated clients, we arrange custom data residency in your preferred region, dedicated cloud tenants, or self-hosted deployment on your own infrastructure. Specifics agreed during onboarding.

Yes. The structured debate architecture works across any domain: strategy, legal, technical, medical, policy. The system auto-detects your question's domain and adapts scenario labels, framing, and disclaimers accordingly. A strategy question gets Challenging/Realistic/Optimistic scenarios. A finance question gets Bear/Base/Bull with market data. Same rigour, different vocabulary.
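
As a rough illustration of "same rigour, different vocabulary", the label switch can be pictured as a lookup keyed on the detected domain. Only the two mappings named above are included; the function name is hypothetical, and how the real system detects domains or labels other domains is not described here.

```python
# Scenario labels per domain, taken from the two examples in the answer above.
SCENARIO_LABELS = {
    "strategy": ("Challenging", "Realistic", "Optimistic"),
    "finance": ("Bear", "Base", "Bull"),
}


def scenario_labels(domain):
    """Return the three scenario labels for a domain, or None if unspecified.

    Assumption: the domain has already been detected and arrives as a
    plain string. Domains not listed in the source return None rather
    than a guessed set of labels.
    """
    return SCENARIO_LABELS.get(domain.strip().lower())


print(scenario_labels("Finance"))  # ('Bear', 'Base', 'Bull')
```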

Three things you can't replicate in different tabs. First: structured cross-examination. Each model must defend its position against specific objections from the other three, not just respond to a generic prompt. Second: steelman testing. The system forces models to strengthen opposing arguments before attacking them. Third: quantified agreement. You get actual scores showing where conviction is real (4/4 flagged the same risk) versus where it's fragile (2-2 split on a key assumption). The process is what creates the signal, not the models.
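
To make "quantified agreement" concrete, here is a minimal sketch of how independent risk flags could be tallied into scores such as 4/4. The model names, risk labels, and function name are hypothetical, and the actual scoring method may differ.

```python
from collections import Counter


def agreement_scores(flags_by_model):
    """Count how many models independently flagged each risk.

    flags_by_model maps a model name to the list of risks it raised.
    Each model counts at most once per risk.
    """
    total = len(flags_by_model)
    counts = Counter(
        risk for flags in flags_by_model.values() for risk in set(flags)
    )
    return {risk: f"{n}/{total}" for risk, n in counts.items()}


# Hypothetical panel output, for illustration only.
flags = {
    "model_a": ["customer concentration", "fx exposure"],
    "model_b": ["customer concentration"],
    "model_c": ["customer concentration", "fx exposure"],
    "model_d": ["customer concentration"],
}
scores = agreement_scores(flags)
print(scores["customer concentration"])  # 4/4
print(scores["fx exposure"])             # 2/4
```

In this toy case the first risk has unanimous support while the second is raised by only half the panel, which is the kind of contrast the scores are meant to surface.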

Ready to see it in action?

Submit your first question and see four independent models stress-test your next memo, model, or thesis.
