Platinum Medical QA Training Data
Section 01
Swarm & Bee produces CoVe-verified platinum medical question-answer pairs for machine learning model training, fine-tuning, and evaluation. Every pair passes through our Chain-of-Verification pipeline, where each factual claim is independently checked by a 235-billion-parameter model.
Our datasets cover 14 medical specialties, from cardiology and radiology to pharmacology and emergency medicine. They are available as individual specialty packs, bundles, or the complete vault.
| Verifier Model | 235B |
| LLM Calls / Claim | 3 |
| Raw Reject Rate | 79% |
| Specialties | 14 |
| Platinum Pairs | 10K+ |
| Disclaimer Rate | 94%+ |
Section 02
Every product ships as a ZIP containing the following files:
| [product].jsonl | Full dataset — all pairs for this product |
| splits/train.jsonl | Training set (80% of pairs) |
| splits/val.jsonl | Validation set (10% of pairs) |
| splits/test.jsonl | Test set (10% of pairs) |
| PRODUCT_OM.txt | Product-specific offering memorandum |
| LICENSE.txt | Commercial data license |
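As a quick sanity check after download, the split proportions and the separation between splits can be inspected with a few lines of Python. This is a minimal sketch, not an official tool: it assumes the `splits/` files sit at the root of the ZIP exactly as named in the table above, and that every record carries the `id` field. The function names `read_jsonl` and `check_splits` are hypothetical.

```python
import json
import zipfile

# Assumed member paths inside the product ZIP (taken from the file table).
SPLIT_FILES = {
    "train": "splits/train.jsonl",
    "val": "splits/val.jsonl",
    "test": "splits/test.jsonl",
}


def read_jsonl(archive, member):
    """Return the parsed JSON objects from one JSONL member of the archive."""
    with archive.open(member) as fh:
        return [json.loads(line) for line in fh if line.strip()]


def check_splits(zip_path):
    """Return each split's share of the pairs and any ids shared between splits."""
    with zipfile.ZipFile(zip_path) as archive:
        splits = {name: read_jsonl(archive, member)
                  for name, member in SPLIT_FILES.items()}

    total = sum(len(rows) for rows in splits.values())
    if total == 0:
        raise ValueError("no pairs found in any split")

    fractions = {name: len(rows) / total for name, rows in splits.items()}

    # An id appearing in more than one split would leak evaluation data.
    ids = {name: {row["id"] for row in rows} for name, rows in splits.items()}
    overlap = ((ids["train"] & ids["val"])
               | (ids["train"] & ids["test"])
               | (ids["val"] & ids["test"]))
    return fractions, overlap
```

Calling `check_splits("pack.zip")` on a downloaded archive (the filename is illustrative) should give fractions near 0.8, 0.1, and 0.1 and an empty overlap set if the splits follow the layout described above.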
Each pair is a single JSON object with the following fields:
| id | Unique identifier (hash-based, deterministic) |
| question | Clinical scenario or medical question |
| answer | Evidence-based response with clinical disclaimers |
| specialty | Medical specialty classification |
| source | Generation pipeline identifier |
| cove_status | Verification result (PASS or REWRITE) |
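The schema above can be checked record by record before training. The following is a sketch under stated assumptions: it treats all six fields as non-empty strings (the table does not specify types) and accepts only the two `cove_status` values listed. The helper name `validate_pair` is hypothetical.

```python
import json

# Field names come from the schema table; the string-type requirement is an assumption.
REQUIRED_FIELDS = ("id", "question", "answer", "specialty", "source", "cove_status")
ALLOWED_STATUSES = {"PASS", "REWRITE"}


def validate_pair(line):
    """Parse one JSONL line and return a list of problems (empty if none found)."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(record, dict):
        return ["record is not a JSON object"]

    problems = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {field}")

    # Only report the status value when it is present but outside the listed set.
    status = record.get("cove_status")
    if isinstance(status, str) and status.strip() and status not in ALLOWED_STATUSES:
        problems.append(f"unexpected cove_status: {status}")
    return problems
```

Running this over every line of `[product].jsonl` and collecting the non-empty results gives a simple report of malformed records, if any.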
Section 03
Every pair passes through a six-stage pipeline from medical literature to verified training data:
01 Harvest: 10+ medical sources
02 Generate: Specialty QA models
03 Audit: Multi-stage quality gate
04 Verify: CoVe 235B fact-check
05 Rewrite: 235B corrects errors
06 Deliver: Platinum vault only
Sources: PubMed, PMC full-text, FDA drug labels, ClinicalTrials.gov, Semantic Scholar, medRxiv, Europe PMC, HuggingFace medical datasets, GitHub medical repositories, Reddit medical communities.
Section 04
Chain-of-Verification (CoVe) is a verification framework from Meta AI research, adapted here as a three-step process for medical fact-checking. For every claim in every answer, three independent LLM calls determine accuracy.
Step 1 — Plan
Extract all factual claims from the answer. For each claim, generate a verification question that can independently confirm or refute it. Drug dosages, contraindications, guidelines, and clinical protocols each become separate verification targets.
Step 2 — Execute
Route each verification question to the 235B-parameter model (Qwen3-235B). The verifier has no access to the original answer and responds from its own knowledge alone. This independence is critical for detecting hallucinated content.
Step 3 — Compare
Compare the 235B verification response with the original claim. Three outcomes: PASS (claim verified correct), FLAG (claim questionable — sent to 235B for rewrite with verified facts), FAIL (claim incorrect — permanently rejected).
A single failed claim downgrades the entire pair. FLAG pairs are rewritten by the 235B model using only independently verified facts. The result is platinum-grade training data where every factual claim has been independently verified.
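The control flow of the three steps can be sketched in Python. This is an illustration only, not the production pipeline: the `plan`, `execute`, and `compare` callables stand in for the LLM calls and are supplied by the caller, the names `ClaimResult`, `verify_answer`, and `pair_status` are hypothetical, and the `"REJECT"` label is invented for this example. Treating any FAIL as rejection of the whole pair is one reading of "a single failed claim downgrades the entire pair".

```python
from dataclasses import dataclass

PASS, FLAG, FAIL = "PASS", "FLAG", "FAIL"


@dataclass
class ClaimResult:
    claim: str
    question: str
    evidence: str
    outcome: str


def verify_answer(answer, plan, execute, compare):
    """Run Plan, Execute, Compare over every claim extracted from one answer."""
    results = []
    # Step 1 (Plan): extract claims and one verification question per claim.
    for claim, question in plan(answer):
        # Step 2 (Execute): the verifier receives only the question, never the answer.
        evidence = execute(question)
        # Step 3 (Compare): expected to return PASS, FLAG, or FAIL.
        outcome = compare(claim, evidence)
        results.append(ClaimResult(claim, question, evidence, outcome))
    return results


def pair_status(results):
    """Collapse per-claim outcomes into one pair-level decision (assumed rule)."""
    outcomes = {r.outcome for r in results}
    if FAIL in outcomes:
        return "REJECT"   # hypothetical label: pair is dropped
    if FLAG in outcomes:
        return "REWRITE"  # pair goes to the rewrite stage
    return PASS
```

Keeping `execute` as a function of the question alone mirrors the independence requirement in Step 2: nothing from the original answer can reach the verifier through that call.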
Section 05
Compatible with any HuggingFace-based training framework. Recommended configurations for medical fine-tuning:
Recommended Base Models
LoRA Configuration (7B Models)
Training Configuration
Hardware Requirements
Expected Results (~500 Platinum Pairs, 7B Base)
Section 06
Cardiology — Acute Coronary Syndrome
Q: A 62-year-old male presents with substernal chest pressure radiating to the left arm for 45 minutes. ECG shows ST-elevation in leads II, III, and aVF. Troponin I is 2.4 ng/mL. BP 138/82, HR 88. What is the immediate management approach, and what are the key pharmacotherapy decisions in the first 24 hours?
Section 07
Upon purchase, you receive a non-exclusive, non-transferable, perpetual license:
Enterprise licenses permit use by up to 10 individuals within a single organization. Full terms at data.swarmandbee.com/terms.html.
Section 08
This dataset is intended for machine learning research and model training purposes only. The medical content within is not intended as clinical advice, diagnostic guidance, or treatment recommendations.
While every pair has been verified through our CoVe pipeline, no dataset is guaranteed to be 100% error-free. Users are responsible for validating the outputs of any models trained on this data before deploying them in clinical or healthcare settings.
Swarm & Bee is not a healthcare provider and does not provide medical advice. Our products are data assets for AI/ML development.