Data Resources
Datasets, APIs & Packages for Sports Analytics
A curated starting point for finding sports data. Whether you’re building a class project or exploring a research question, these sources cover the major North American sports leagues and beyond.
Public Data & Reference Sites
Free, browser-accessible statistics and historical data — no API key required.
- Sports Reference — parent site for the sport-specific reference sites listed below
- Baseball Reference — comprehensive MLB statistics, historical records, and WAR calculations
- Pro Football Reference — NFL statistics, draft history, and advanced metrics
- Basketball Reference — NBA/WNBA statistics, shot logs, and play-by-play data
- Hockey Reference — NHL statistics and historical records
- FBref — soccer statistics across major global leagues (advanced data supplied by Opta since 2022; previously StatsBomb)
- Sports Reference College Sports — college football, basketball, and baseball data
- SportsDataVerse — college football, basketball, and baseball data
APIs & Data Providers
Programmatic access to sports data, ranging from free tiers to professional subscriptions.
- Sportradar — official data partner for NFL, NBA, NHL, MLB, and more; academic access available
- StatsBomb — high-resolution soccer event data; free open data available for select competitions
- Opta / Stats Perform — event-level data across soccer, American football, basketball, and cricket
- SportsDataIO — multi-sport API with a free developer tier
- The Sports DB — open, community-built sports database with a free API
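As an illustration of what programmatic access looks like, here is a minimal sketch that reads the competition index from StatsBomb's free open-data repository on GitHub using only the Python standard library. The URL and the `competition_name` field reflect the repository layout as commonly documented and should be checked against the current repository; the function names `fetch_competitions` and `seasons_per_competition` are hypothetical helpers introduced for this example.

```python
import json
from collections import Counter
from urllib.request import urlopen

# Assumed location of the index file in the public statsbomb/open-data repo.
COMPETITIONS_URL = (
    "https://raw.githubusercontent.com/statsbomb/open-data/"
    "master/data/competitions.json"
)


def fetch_competitions(url: str = COMPETITIONS_URL) -> list[dict]:
    """Download the competition index (requires network access)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def seasons_per_competition(records: list[dict]) -> Counter:
    """Count how many season entries exist for each competition name.

    Assumes each record is one competition-season pair carrying a
    "competition_name" key.
    """
    return Counter(record["competition_name"] for record in records)
```

Calling `seasons_per_competition(fetch_competitions())` should give a quick view of which competitions have free data before you commit to a project topic. Commercial providers such as Sportradar or SportsDataIO generally require an API key and have their own request formats, so this pattern does not carry over unchanged.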
R Packages
Install via install.packages("<package>").
- nflfastR — play-by-play NFL data with expected points and win probability models
- nflreadr — fast loading of nflverse data including rosters, contracts, and combine results
- baseballr — scraping tools for Baseball Reference, FanGraphs, and Statcast (Baseball Savant)
- hoopR — NBA and men’s college basketball play-by-play via ESPN and the NBA Stats API
- wehoop — WNBA and women’s college basketball data
- worldfootballR — soccer data from FBref, Transfermarkt, and Understat
- fastRhockey — NHL and PHF play-by-play data
- sportyR (also available in Python) — draws to-scale playing surfaces via ggplot2
- MSUthemes (also available in Python) — colour palettes and themes for Michigan State University (MSU), with colour support for all Big Ten Conference institutions
Python Packages
Install via pip install <package>.
- nflreadpy — Python interface to nflverse data (mirrors nflreadr)
- pybaseball — Statcast, FanGraphs, and Baseball Reference scraping
- soccerdata — unified API for FBref, FotMob, WhoScored, and more
- sportypy (also available in R) — draws to-scale playing surfaces
- msuthemes-py (also available in R) — colour palettes and themes for Michigan State University (MSU), with colour support for all Big Ten Conference institutions
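To show how one of these packages is typically used, the sketch below wraps pybaseball's `statcast` function, which accepts `start_dt` and `end_dt` as `YYYY-MM-DD` strings. The helpers `statcast_window` and `load_pitches` are hypothetical names introduced for this example, and the import is deferred so the date logic can be tried without the package installed.

```python
from datetime import date


def statcast_window(start: date, end: date) -> dict[str, str]:
    """Build keyword arguments for pybaseball.statcast as ISO date strings."""
    if end < start:
        raise ValueError("end must not precede start")
    return {"start_dt": start.isoformat(), "end_dt": end.isoformat()}


def load_pitches(start: date, end: date):
    """Fetch pitch-level Statcast data for a date range.

    Requires `pip install pybaseball` and network access; the result is a
    pandas DataFrame with one row per pitch.
    """
    from pybaseball import statcast  # deferred: optional dependency

    return statcast(**statcast_window(start, end))
```

For example, `load_pitches(date(2023, 7, 1), date(2023, 7, 2))` requests two days of pitches. Keeping the window short is a sensible default for a class project, since wide date ranges can take a long time to download.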
General Data Repositories
Broader repositories that include sports datasets alongside other domains.
- Kaggle Datasets — search “sports” for community-shared datasets across many sports
- Harvard Dataverse — research data deposits, many accompanying published studies, including sports science work
- GitHub — many researchers publish cleaned datasets and scraping scripts publicly; search for “sports analytics data”