RepoAudit

Reproducibility auditing for ML GitHub repositories using AST-based static analysis and AI semantic checks, with CI integration via GitHub Actions.

Description

RepoAudit is an automated machine-learning repository auditor that scans GitHub projects and generates a 0–100 reproducibility score. It combines Python AST analysis with LLM-based README validation to detect issues in experiment determinism, environment setup, dataset handling, and documentation.

Live website: https://repo-audit.vercel.app

Features

Automated Repo Analysis:
Clones public GitHub repositories and evaluates reproducibility across six categories: environment setup, determinism, dataset usage, semantic documentation alignment, execution entry points, and README completeness.
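
As a rough illustration of how six category results could roll up into a single 0–100 score, here is a minimal sketch. The category keys, the integer weights, and the overall_score helper are all assumptions made for this example; the source does not state how RepoAudit actually weights its categories.

```python
# Hypothetical integer weights (percent of the final score); they sum to 100.
# These values are illustrative assumptions, not RepoAudit's real weighting.
CATEGORY_WEIGHTS = {
    "environment_setup": 20,
    "determinism": 20,
    "dataset_usage": 15,
    "semantic_alignment": 15,
    "entry_points": 15,
    "readme_completeness": 15,
}


def overall_score(category_scores):
    """Combine per-category scores (each 0-100) into one 0-100 integer."""
    total = 0
    for name, weight in CATEGORY_WEIGHTS.items():
        value = category_scores.get(name, 0)   # a missing category scores 0
        value = max(0, min(100, value))        # clamp out-of-range inputs
        total += weight * value
    return round(total / sum(CATEGORY_WEIGHTS.values()))


if __name__ == "__main__":
    example = dict.fromkeys(CATEGORY_WEIGHTS, 80)
    example["environment_setup"] = 100
    example["determinism"] = 50
    print(overall_score(example))
```

A weighted mean keeps the result on the same 0–100 scale as the inputs, so a per-category breakdown and the headline score stay directly comparable.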

Determinism Detection:
Python AST inspection identifies missing seeds for PyTorch, NumPy, TensorFlow, and Python's built-in random module, highlighting experiments that may produce non-reproducible results.
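
The idea behind this check can be sketched with the standard-library ast module: collect the libraries a file imports, collect the calls it makes, and report any library that is imported but never seeded. The SEED_CALLS table and the missing_seeds function below are hypothetical names for this sketch, and the real analyzer may recognise more seeding APIs (for example NumPy generators) than the four listed here.

```python
import ast

# Assumed mapping: library -> the seeding call this sketch looks for.
SEED_CALLS = {
    "torch": "torch.manual_seed",
    "numpy": "numpy.random.seed",
    "tensorflow": "tensorflow.random.set_seed",
    "random": "random.seed",
}


def dotted_name(node):
    """Return 'a.b.c' for a chain of Attribute/Name nodes, else None."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def missing_seeds(source):
    """List libraries that are imported in `source` but never seeded."""
    tree = ast.parse(source)
    aliases = {}  # local name -> fully qualified name
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for a in node.names:
                if a.asname:
                    aliases[a.asname] = a.name          # import numpy as np
                else:
                    root = a.name.split(".")[0]
                    aliases[root] = root                # import torch
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            for a in node.names:                        # from random import seed
                aliases[a.asname or a.name] = f"{node.module}.{a.name}"

    imported = {full.split(".")[0] for full in aliases.values()}

    called = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name:
                head, _, rest = name.partition(".")
                full = aliases.get(head, head)          # resolve np -> numpy
                called.add(f"{full}.{rest}" if rest else full)

    return sorted(lib for lib, seed in SEED_CALLS.items()
                  if lib in imported and seed not in called)
```

Resolving import aliases before matching means that np.random.seed(0) and numpy.random.seed(0) are treated as the same call. The sketch only checks that a seed call exists somewhere in the file, not that it runs before the first random operation.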

Dataset Path Validation:
Detects hardcoded local paths and checks for dataset documentation or download instructions.
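
One plausible way to spot hardcoded local paths is to walk the AST for string literals and match them against a path heuristic. The LOCAL_PATH pattern and find_hardcoded_paths function below are assumptions for illustration; the set of path prefixes is a guess, not the list RepoAudit uses.

```python
import ast
import re

# Heuristic (assumption): absolute paths under common user/data roots,
# home-relative paths, or Windows drive paths look machine-specific.
LOCAL_PATH = re.compile(
    r"^(?:/(?:home|Users|mnt|data|media|scratch)/|~/|[A-Za-z]:[\\/])"
)


def find_hardcoded_paths(source):
    """Return sorted (line, value) pairs for string literals that look local."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Constant) and isinstance(node.value, str):
            if LOCAL_PATH.match(node.value):
                hits.append((node.lineno, node.value))
    return sorted(hits)


if __name__ == "__main__":
    code = 'train = open("/home/alice/data/train.csv")\n'
    for line, value in find_hardcoded_paths(code):
        print(f"line {line}: {value}")
```

Working on parsed string literals rather than raw text avoids false positives from comments, and relative paths such as data/test.csv are left alone because they can resolve inside the repository.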

README–Code Consistency Checks:
An LLM audit compares README instructions with the repository structure, flagging missing scripts or mismatched setup steps.
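
The LLM comparison itself is not reproduced here. As a much simpler stand-in for one part of it, the sketch below uses a regular expression to pull script names out of python commands in a README and reports those that do not exist in the repository. SCRIPT_REF and missing_scripts are hypothetical names, and this rule-based approach is an assumption for illustration, not the project's method.

```python
import re

# Matches commands such as "python train.py" or "python3 scripts/eval.py".
SCRIPT_REF = re.compile(r"\bpython3?\s+([\w./-]+\.py)\b")


def missing_scripts(readme_text, repo_files):
    """Return README-referenced scripts that are absent from repo_files."""
    missing = []
    for path in SCRIPT_REF.findall(readme_text):
        if path.startswith("./"):
            path = path[2:]                 # "./train.py" -> "train.py"
        if path not in repo_files and path not in missing:
            missing.append(path)
    return missing


if __name__ == "__main__":
    readme = "Run `python train.py --epochs 10`, then `python3 scripts/eval.py`."
    print(missing_scripts(readme, {"train.py", "README.md"}))
```

A pattern match like this only catches literal script references; mismatched setup steps described in prose are the kind of issue that needs the semantic check.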

Dependency & Environment Checks:
Verifies reproducible environments through requirements.txt, environment.yml, or Dockerfiles and identifies unpinned dependencies.
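
For requirements.txt, an unpinned-dependency check might look like the sketch below. It treats a requirement as pinned only when it uses an exact == (or ===) version without wildcards; that definition, along with the PINNED pattern and unpinned_requirements function, is an assumption for this example. Comment handling is simplified and URL or editable installs are skipped.

```python
import re

# A line counts as pinned if it reads "name[extras]==version", optionally
# followed by an environment marker after ";". Wildcards such as 1.4.* do not.
PINNED = re.compile(
    r"^[A-Za-z0-9._-]+(\[[^\]]*\])?\s*===?\s*[^\s;,*]+\s*(;.*)?$"
)


def unpinned_requirements(text):
    """Return requirement lines that lack an exact version pin."""
    unpinned = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()     # drop comments (simplified)
        if not line or line.startswith("-"):    # skip blanks and pip options
            continue
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned


if __name__ == "__main__":
    reqs = "numpy==1.26.4\ntorch>=2.0\npandas\n"
    print(unpinned_requirements(reqs))
```

Range specifiers such as >= are reported because they let the resolved version drift between installs, which is what undermines reproducibility.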

Score History Tracking:
Stores audits per commit and visualises reproducibility trends across repository updates.

GitHub Action Integration:
Runs RepoAudit in CI pipelines, posts PR reports, and optionally fails builds below a reproducibility threshold.
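
The optional build failure can be pictured as a small gate that turns a score into a process exit code. The ci_gate function and its threshold parameter are hypothetical; the actual Action's inputs and messages are not documented in this README.

```python
import sys


def ci_gate(score, threshold=None):
    """Return (exit_code, message) for a CI step.

    threshold=None means report-only mode: the step never fails.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be in the range 0-100")
    if threshold is not None and score < threshold:
        return 1, f"Reproducibility score {score} is below the threshold {threshold}"
    return 0, f"Reproducibility score {score}"


if __name__ == "__main__":
    code, message = ci_gate(62, threshold=70)
    print(message)
    sys.exit(code)      # a non-zero exit code marks the CI step as failed
```

Keeping the threshold optional lets a project start in report-only mode and enforce a minimum score later.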

Commit-Based Caching:
Uses commit hashes to reuse previous audit results and avoid redundant analysis.
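
Because a commit hash identifies an exact snapshot of the code, it can serve as a cache key for audit results. The sketch below uses an in-memory dictionary as a stand-in for Redis; AuditCache, audit_with_cache, and the audit:owner/repo:sha key format are assumptions for illustration.

```python
import json


class AuditCache:
    """In-memory stand-in for a Redis-style key-value store."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def key(repo, commit_sha):
        # GitHub repository names are case-insensitive, so normalise them.
        return f"audit:{repo.lower()}:{commit_sha}"

    def get(self, repo, commit_sha):
        raw = self._store.get(self.key(repo, commit_sha))
        return json.loads(raw) if raw is not None else None

    def put(self, repo, commit_sha, result):
        # Store JSON text, as one would in a string-valued cache.
        self._store[self.key(repo, commit_sha)] = json.dumps(result)


def audit_with_cache(cache, repo, commit_sha, run_audit):
    """Return (result, was_cached); run_audit is called only on a miss."""
    cached = cache.get(repo, commit_sha)
    if cached is not None:
        return cached, True
    result = run_audit(repo, commit_sha)
    cache.put(repo, commit_sha, result)
    return result, False
```

A new commit produces a new key, so stale results are never served for changed code and no explicit invalidation is needed.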

Architecture

Distributed audit pipeline with asynchronous repository analysis.

Frontend (Next.js):
Dashboard for submitting repos and visualising scores, category breakdowns, and history charts.

API Layer (FastAPI):
Handles audit requests, status polling, and result retrieval.

Worker System (Celery + Redis):
Executes repository scans asynchronously and manages queued analysis jobs.

Analysis Engine (Python):
Clones repositories and performs AST-based checks, dataset path analysis, dependency inspection, import graph tracing, and LLM semantic auditing.
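
Import graph tracing can be approximated by parsing each module's imports and walking outward from an entry point. In the sketch below, the sources mapping (module name to source text), local_imports, and reachable_modules are hypothetical; relative imports are ignored for brevity, and the real engine may resolve packages differently.

```python
import ast


def local_imports(source, known_modules):
    """Return the known local modules that `source` imports."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [a.name for a in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names = [node.module]
        else:
            continue
        for name in names:
            # Match the longest known prefix: "pkg.utils.io" -> "pkg.utils".
            parts = name.split(".")
            for i in range(len(parts), 0, -1):
                candidate = ".".join(parts[:i])
                if candidate in known_modules:
                    found.add(candidate)
                    break
    return found


def reachable_modules(sources, entry):
    """Depth-first walk of the import graph starting at `entry`."""
    seen, stack = set(), [entry]
    while stack:
        module = stack.pop()
        if module in seen or module not in sources:
            continue
        seen.add(module)
        stack.extend(local_imports(sources[module], sources.keys()))
    return seen
```

Restricting the graph to modules found in the repository keeps third-party imports out of the traversal, so the result shows which local files an entry point actually depends on.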

Storage (Supabase Postgres):
Stores repository metadata, audit results, and historical scores.

Caching (Upstash Redis):
Commit-hash cache for fast reuse of previous analysis results.

Deployment:
Frontend on Vercel; backend and workers on Render, running in Docker containers.
