MCGrad: fix calibration of your ML model in subgroups

Hi r/datascience

We’re open-sourcing MCGrad, a Python package for multicalibration that was developed and deployed in production at Meta. This work will also be presented at KDD 2026.

The Problem: A model can be globally calibrated yet significantly miscalibrated within identifiable subgroups or feature intersections (e.g., "users in region X on mobile devices"). Multicalibration aims to ensure reliability across such subpopulations.
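To make the problem concrete, here is a minimal toy sketch in plain Python. The data, the group names, and the helper functions are all made up for illustration, and the "gap" is just the difference between mean prediction and observed positive rate, which is a much cruder measure than a proper calibration metric.

```python
from collections import defaultdict

def calibration_gap(preds, labels):
    """Absolute difference between mean predicted probability and observed positive rate."""
    return abs(sum(preds) / len(preds) - sum(labels) / len(labels))

def gap_by_group(preds, labels, groups):
    """Compute the same gap separately for each subgroup."""
    buckets = defaultdict(lambda: ([], []))
    for p, y, g in zip(preds, labels, groups):
        buckets[g][0].append(p)
        buckets[g][1].append(y)
    return {g: calibration_gap(ps, ys) for g, (ps, ys) in buckets.items()}

# Hypothetical toy data: 10 mobile users (3 positives) and 10 desktop users (7 positives).
groups = ["mobile"] * 10 + ["desktop"] * 10
labels = [1] * 3 + [0] * 7 + [1] * 7 + [0] * 3
preds = [0.5] * 20  # a model that predicts 0.5 for everyone

overall_gap = calibration_gap(preds, labels)      # no gap over the whole population
group_gaps = gap_by_group(preds, labels, groups)  # about 0.2 in each subgroup

print(overall_gap, group_gaps)
```

The overall numbers look fine because the over-prediction on one group cancels the under-prediction on the other, which is exactly the failure mode multicalibration targets.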

The Solution: MCGrad reformulates multicalibration using gradient-boosted decision trees. At each step, a lightweight booster learns to predict the residual miscalibration of the base model given the features, automatically identifying and correcting miscalibrated regions. The method scales to large datasets and uses early stopping to preserve predictive performance. See our tutorial for a live demo.
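For intuition only, the idea of boosting on residual miscalibration can be sketched in pure Python. This is not the MCGrad API or its implementation: the weak learner below is a one-feature indicator "stump" standing in for a real decision tree, and every function name, parameter, and data point is a hypothetical chosen for the illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    return math.log(p / (1.0 - p))

def log_loss(logits, labels):
    eps = 1e-12
    total = 0.0
    for z, y in zip(logits, labels):
        p = min(max(sigmoid(z), eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(labels)

def fit_stump(rows, logits, labels):
    """Pick the (feature, value) indicator with the largest second-order gain."""
    best = None
    for j in range(len(rows[0])):
        for v in set(r[j] for r in rows):
            g = h = 0.0
            for r, z, y in zip(rows, logits, labels):
                if r[j] == v:
                    p = sigmoid(z)
                    g += y - p        # residual: negative gradient of log loss w.r.t. the logit
                    h += p * (1 - p)  # second derivative of log loss w.r.t. the logit
            if h > 0 and (best is None or g * g / h > best[0]):
                best = (g * g / h, j, v, g / h)
    _, j, v, step = best
    return j, v, step

def apply_stump(stump, rows, logits, lr):
    j, v, step = stump
    return [z + lr * step if r[j] == v else z for r, z in zip(rows, logits)]

def boost_residuals(train, valid, rounds=50, lr=0.5, patience=3):
    """Start from the base model's logits and add small corrections, with early stopping."""
    rows, base_preds, labels = train
    v_rows, v_preds, v_labels = valid
    z = [logit(p) for p in base_preds]
    vz = [logit(p) for p in v_preds]
    stumps, best_len = [], 0
    best_loss = log_loss(vz, v_labels)
    for _ in range(rounds):
        stump = fit_stump(rows, z, labels)
        z = apply_stump(stump, rows, z, lr)
        vz = apply_stump(stump, v_rows, vz, lr)
        stumps.append(stump)
        loss = log_loss(vz, v_labels)
        if loss < best_loss - 1e-9:
            best_loss, best_len = loss, len(stumps)
        elif len(stumps) - best_len >= patience:
            break  # validation loss stopped improving
    return stumps[:best_len]

def predict(stumps, rows, base_preds, lr=0.5):
    z = [logit(p) for p in base_preds]
    for s in stumps:
        z = apply_stump(s, rows, z, lr)
    return [sigmoid(x) for x in z]

# Hypothetical toy data: the base model predicts 0.5 for everyone.
rows = [("mobile",)] * 10 + [("desktop",)] * 10
labels = [1] * 3 + [0] * 7 + [1] * 7 + [0] * 3
base_preds = [0.5] * 20

# Toy shortcut: the same data is reused for validation; real use needs a held-out set.
stumps = boost_residuals((rows, base_preds, labels), (rows, base_preds, labels))
corrected = predict(stumps, rows, base_preds)
print(round(corrected[0], 3), round(corrected[-1], 3))
```

On this toy data the corrected predictions move toward the observed rate in each group (roughly 0.3 for mobile and 0.7 for desktop). A real gradient-boosting library would replace the stump with full trees, which can also pick up intersections of features.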

Key Results: Across 100+ production models at Meta, MCGrad improved log loss and PRAUC on 88% of them while substantially reducing subgroup calibration error.

Links:

Install via pip install mcgrad or via conda. Happy to answer questions or discuss details.

submitted by /u/TaXxER
