mini_imo is a complete, minimal implementation of an IMO-style mathematics evaluation pipeline, featuring:
- 🔢 Math Solver (LLM-based or your tiny Transformer)
- 🧩 Short-Answer AutoGrader with equivalence checking
- 📚 Proof AutoGrader using IMO-style rubric
- 🚀 End-to-End Evaluation Script
- 🧪 Ready-to-run sample benchmark
- 🔬 Optional: tiny GPT math model (PyTorch)
This project mimics modern math-evaluation pipelines used in LLM reasoning research.
## 🧩 Short-Answer AutoGrader
- Extracts the final answer from the model's solution
- Checks algebraic / numeric equivalence (sketched below)
- Strict grading (Correct / Incorrect)
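A minimal sketch of how such an equivalence check can work, using `sympy` for symbolic comparison. The function names and the `Answer:`-line extraction heuristic are illustrative assumptions, not mini_imo's actual API:

```python
import re
import sympy
from sympy.parsing.sympy_parser import parse_expr

def extract_final_answer(solution: str) -> str:
    """Grab the last 'Answer: ...' line from a solution (illustrative heuristic)."""
    matches = re.findall(r"[Aa]nswer:\s*(.+)", solution)
    return matches[-1].strip() if matches else solution.strip().splitlines()[-1]

def is_equivalent(predicted: str, reference: str) -> bool:
    """Return True if the two answers are algebraically/numerically equal."""
    try:
        diff = sympy.simplify(parse_expr(predicted) - parse_expr(reference))
        return diff == 0
    except (sympy.SympifyError, SyntaxError, TypeError):
        # Fall back to strict string comparison for answers sympy cannot parse
        return predicted.strip() == reference.strip()
```

The string-comparison fallback keeps the grader usable for answers (tuples, text) that a symbolic parser rejects.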
## 📚 Proof AutoGrader
- Four-level rubric: Incorrect / Partial / Almost / Correct
- Scoring mapped to {0, 1, 6, 7} (sketched below)
- Judges correctness & completeness
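A sketch of the rubric-to-score mapping; the dictionary and function names are assumptions for illustration:

```python
# Rubric levels mapped to IMO-style point values.
# Identifiers here are illustrative, not mini_imo's actual names.
RUBRIC_SCORES = {
    "Incorrect": 0,  # no meaningful progress
    "Partial": 1,    # nontrivial partial progress
    "Almost": 6,     # essentially complete proof with a minor gap
    "Correct": 7,    # complete and correct proof
}

def score_proof(level: str) -> int:
    """Map a graded rubric level to its {0, 1, 6, 7} point value."""
    return RUBRIC_SCORES[level]
```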
## 🔢 Math Solver
- GPT-style LLM via the OpenAI API (sketched below)
- Or your own `mini_gpt_math.py` model
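A minimal sketch of the LLM-backed solver path, assuming the `openai>=1.0` Python client; the model name and prompt wording are placeholders, not the project's defaults:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def solve(problem: str, model: str = "gpt-4o-mini") -> str:
    """Ask the LLM for a step-by-step solution ending in a final answer line."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system",
             "content": "Solve the math problem step by step. "
                        "End with a line 'Answer: <final answer>'."},
            {"role": "user", "content": problem},
        ],
    )
    return response.choices[0].message.content
```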
## 🚀 Evaluation Script
- Reads a JSONL benchmark file
- Solves → Grades → Produces a CSV report (sketched below)
- Summary: accuracy & proof score
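A minimal sketch of the end-to-end loop for the short-answer path, reusing the helpers from the sketches above; the JSONL field names (`problem`, `answer`) and CSV columns are assumptions:

```python
import csv
import json

def evaluate(benchmark_path: str, report_path: str) -> None:
    """Solve each benchmark problem, grade it, and write a CSV report."""
    with open(benchmark_path) as f:
        problems = [json.loads(line) for line in f]
    rows, correct = [], 0
    for item in problems:
        prediction = solve(item["problem"])            # solver sketch above
        predicted = extract_final_answer(prediction)
        ok = is_equivalent(predicted, item["answer"])  # grader sketch above
        correct += ok
        rows.append({"problem": item["problem"], "predicted": predicted,
                     "reference": item["answer"], "correct": ok})
    with open(report_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=rows[0].keys())
        writer.writeheader()
        writer.writerows(rows)
    # Proof-graded items would analogously average their {0, 1, 6, 7} scores.
    print(f"Accuracy: {correct / len(problems):.1%}")
```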