Welcome folks,

I’m Saiem Gilani, one of the authors of wehoop, and I hope to give the community a high-quality resource for accessing women’s basketball data for statistical analysis, basketball research, and more. I am excited to show you some of what you can do with this edition of the package.

Installing R and RStudio

  1. Head to https://cran.r-project.org
  2. Select the appropriate link for your operating system (Windows, Mac OS X, or Linux)
  • Windows - Select base and download the most recent version
  • Mac OS X - Select Latest Release, but check to make sure your OS is the correct version. Look through Binaries for Legacy OS X Systems if you are on an older release
  • Linux - Select the appropriate distro and follow the installation instructions
  3. Head to Posit.co
  4. Follow the associated download and installation instructions for RStudio.
  5. Start poring over the RStudio IDE Cheatsheet. An IDE is an integrated development environment.
  6. For Windows users: I recommend you install Rtools. This is not an R package! It is “a collection of resources for building packages for R under Microsoft Windows, or for building R itself”. Go to https://cran.r-project.org/bin/windows/Rtools/ and follow the directions for installation.

Install wehoop

# Install pacman if needed, then use it to install (when missing) and load wehoop and the helper packages:
if (!requireNamespace('pacman', quietly = TRUE)){
  install.packages('pacman')
}

pacman::p_load(wehoop, dplyr, glue, tictoc, progressr)
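If you would rather not depend on pacman, the check it performs can be approximated in base R. This is only a sketch of the idea, not what pacman does internally: it reports which of the packages used in this guide are missing and leaves the actual installation as a commented-out step.

```r
# Packages used throughout this guide
needed <- c("wehoop", "dplyr", "glue", "tictoc", "progressr")

# TRUE for each package that can be loaded from the current library paths
is_installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- needed[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install from CRAN
} else {
  message("All packages are available.")
}
```

After installing, attach the packages with `library()` as usual.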

Quick Start

WNBA full play-by-play seasons (2002-2023) ~ 30-60 seconds

tictoc::tic()
progressr::with_progress({
  wnba_pbp <- wehoop::load_wnba_pbp()
})
tictoc::toc()
## 1.137 sec elapsed
glue::glue("{nrow(wnba_pbp)} rows of WNBA play-by-play data from {length(unique(wnba_pbp$game_id))} games.")
## 102191 rows of WNBA play-by-play data from 262 games.
dplyr::glimpse(wnba_pbp)
## Rows: 102,191
## Columns: 62
## $ game_play_number                <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,…
## $ id                              <dbl> 4015785744, 4015785747, 4015785749, 40…
## $ sequence_number                 <int> 4, 7, 9, 10, 11, 13, 14, 15, 16, 17, 1…
## $ type_id                         <int> 615, 110, 90, 120, 92, 92, 155, 143, 1…
## $ type_text                       <chr> "Jumpball", "Driving Layup Shot", "Out…
## $ text                            <chr> "A'ja Wilson vs. Jonquel Jones (Alysha…
## $ away_score                      <int> 0, 2, 2, 4, 4, 4, 4, 4, 4, 4, 4, 4, 7,…
## $ home_score                      <int> 0, 0, 0, 0, 3, 3, 3, 3, 3, 3, 3, 3, 3,…
## $ period_number                   <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ period_display_value            <chr> "1st Quarter", "1st Quarter", "1st Qua…
## $ clock_display_value             <chr> "10:00", "9:38", "9:17", "8:54", "8:40…
## $ scoring_play                    <lgl> FALSE, TRUE, FALSE, TRUE, TRUE, FALSE,…
## $ score_value                     <int> 0, 2, 0, 2, 3, 0, 0, 0, 0, 0, 0, 0, 3,…
## $ team_id                         <int> 17, 17, 9, 17, 9, 17, 9, 9, 17, 17, 9,…
## $ athlete_id_1                    <int> 3149391, 3149391, 981, 924, 2593770, 3…
## $ athlete_id_2                    <int> 2999101, 3888043, NA, NA, 981, NA, NA,…
## $ athlete_id_3                    <int> 924, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ wallclock                       <chr> "2023-10-19T00:09:06Z", "2023-10-19T00…
## $ shooting_play                   <lgl> FALSE, TRUE, FALSE, TRUE, TRUE, TRUE, …
## $ coordinate_x_raw                <dbl> -214748340, 23, 13, 24, 2, 40, 40, 27,…
## $ coordinate_y_raw                <dbl> -214748365, 3, 2, 3, 0, 21, 21, 2, 2, …
## $ game_id                         <int> 401578574, 401578574, 401578574, 40157…
## $ season                          <int> 2023, 2023, 2023, 2023, 2023, 2023, 20…
## $ season_type                     <int> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,…
## $ home_team_id                    <int> 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,…
## $ home_team_name                  <chr> "New York", "New York", "New York", "N…
## $ home_team_mascot                <chr> "Liberty", "Liberty", "Liberty", "Libe…
## $ home_team_abbrev                <chr> "NY", "NY", "NY", "NY", "NY", "NY", "N…
## $ home_team_name_alt              <chr> "New York", "New York", "New York", "N…
## $ away_team_id                    <int> 17, 17, 17, 17, 17, 17, 17, 17, 17, 17…
## $ away_team_name                  <chr> "Las Vegas", "Las Vegas", "Las Vegas",…
## $ away_team_mascot                <chr> "Aces", "Aces", "Aces", "Aces", "Aces"…
## $ away_team_abbrev                <chr> "LV", "LV", "LV", "LV", "LV", "LV", "L…
## $ away_team_name_alt              <chr> "Las Vegas", "Las Vegas", "Las Vegas",…
## $ game_spread                     <dbl> 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5…
## $ home_favorite                   <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TR…
## $ game_spread_available           <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FAL…
## $ home_team_spread                <dbl> 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5…
## $ qtr                             <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ time                            <chr> "10:00", "9:38", "9:17", "8:54", "8:40…
## $ clock_minutes                   <int> 10, 9, 9, 8, 8, 8, 8, 8, 8, 8, 7, 7, 7…
## $ clock_seconds                   <dbl> 0, 38, 17, 54, 40, 26, 25, 19, 18, 13,…
## $ home_timeout_called             <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FAL…
## $ away_timeout_called             <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FAL…
## $ half                            <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ game_half                       <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ lead_qtr                        <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ lead_half                       <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ start_quarter_seconds_remaining <dbl> 600, 578, 557, 534, 520, 506, 505, 499…
## $ start_half_seconds_remaining    <dbl> 1200, 1178, 1157, 1134, 1120, 1106, 11…
## $ start_game_seconds_remaining    <dbl> 2400, 2378, 2357, 2334, 2320, 2306, 23…
## $ end_quarter_seconds_remaining   <dbl> 600, 557, 534, 520, 506, 505, 499, 498…
## $ end_half_seconds_remaining      <dbl> 1200, 1157, 1134, 1120, 1106, 1105, 10…
## $ end_game_seconds_remaining      <dbl> 2400, 2357, 2334, 2320, 2306, 2305, 22…
## $ period                          <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ lag_qtr                         <int> NA, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ lag_half                        <int> NA, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ coordinate_x                    <dbl> -214748406.75, -38.75, 39.75, -38.75, …
## $ coordinate_y                    <dbl> -214748365, -2, 12, -1, 23, 15, -15, -…
## $ game_date                       <date> 2023-10-18, 2023-10-18, 2023-10-18, 2…
## $ game_date_time                  <dttm> 2023-10-18 20:00:00, 2023-10-18 20:00…
## $ type_abbreviation               <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
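Two things stand out in the glimpse above: the first row carries very large negative coordinates (for example -214748340), and scoring is recorded per play in `scoring_play` and `score_value`. The sketch below works on a small mock data frame that copies the first five rows of a few columns, so it runs without downloading anything. Treating the large negative values as missing-location placeholders is an assumption, as is the cutoff used to detect them.

```r
# Mock rows copied from the first five plays shown in the glimpse output
pbp <- data.frame(
  team_id          = c(17L, 17L, 9L, 17L, 9L),
  shooting_play    = c(FALSE, TRUE, FALSE, TRUE, TRUE),
  scoring_play     = c(FALSE, TRUE, FALSE, TRUE, TRUE),
  score_value      = c(0L, 2L, 0L, 2L, 3L),
  coordinate_x_raw = c(-214748340, 23, 13, 24, 2),
  coordinate_y_raw = c(-214748365, 3, 2, 3, 0)
)

# Assumption: values this far below zero are placeholders, not court locations
sentinel_cutoff <- -1e6
pbp$coordinate_x_raw[pbp$coordinate_x_raw < sentinel_cutoff] <- NA
pbp$coordinate_y_raw[pbp$coordinate_y_raw < sentinel_cutoff] <- NA

# Total points per team, counting scoring plays only
scored <- pbp[pbp$scoring_play, ]
points_by_team <- tapply(scored$score_value, scored$team_id, sum)
points_by_team
```

The same steps should apply to the real `wnba_pbp` object, since it has columns with these names.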

WNBA full team box score seasons (2003-2023) ~ 5-30 seconds

tictoc::tic()
progressr::with_progress({
  wnba_team_box <- wehoop::load_wnba_team_box()
})

tictoc::toc()
## 0.514 sec elapsed
glue::glue("{nrow(wnba_team_box)} rows of WNBA team boxscore data from {length(unique(wnba_team_box$game_id))} games.")
## 524 rows of WNBA team boxscore data from 262 games.
dplyr::glimpse(wnba_team_box)
## Rows: 524
## Columns: 57
## $ game_id                           <int> 401578574, 401578574, 401578573, 401…
## $ season                            <int> 2023, 2023, 2023, 2023, 2023, 2023, …
## $ season_type                       <int> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, …
## $ game_date                         <date> 2023-10-18, 2023-10-18, 2023-10-15,…
## $ game_date_time                    <dttm> 2023-10-18 20:00:00, 2023-10-18 20:…
## $ team_id                           <int> 17, 9, 17, 9, 9, 17, 9, 17, 9, 18, 1…
## $ team_uid                          <chr> "s:40~l:59~t:17", "s:40~l:59~t:9", "…
## $ team_slug                         <chr> "las-vegas-aces", "new-york-liberty"…
## $ team_location                     <chr> "Las Vegas", "New York", "Las Vegas"…
## $ team_name                         <chr> "Aces", "Liberty", "Aces", "Liberty"…
## $ team_abbreviation                 <chr> "LV", "NY", "LV", "NY", "NY", "LV", …
## $ team_display_name                 <chr> "Las Vegas Aces", "New York Liberty"…
## $ team_short_display_name           <chr> "Aces", "Liberty", "Aces", "Liberty"…
## $ team_color                        <chr> "ce1141", "86cebc", "ce1141", "86ceb…
## $ team_alternate_color              <chr> "b4975a", "000000", "b4975a", "00000…
## $ team_logo                         <chr> "https://a.espncdn.com/i/teamlogos/w…
## $ team_home_away                    <chr> "away", "home", "away", "home", "awa…
## $ team_score                        <int> 70, 69, 73, 87, 76, 104, 82, 99, 87,…
## $ team_winner                       <lgl> TRUE, FALSE, FALSE, TRUE, FALSE, TRU…
## $ assists                           <int> 18, 19, 13, 28, 19, 31, 17, 21, 22, …
## $ blocks                            <int> 1, 5, 0, 8, 4, 1, 3, 5, 6, 6, 4, 4, …
## $ defensive_rebounds                <int> 32, 30, 26, 31, 22, 32, 24, 30, 31, …
## $ fast_break_points                 <chr> "4", "17", "2", "12", "3", "11", "11…
## $ field_goal_pct                    <dbl> 41.8, 36.1, 33.3, 52.4, 36.1, 52.9, …
## $ field_goals_made                  <int> 28, 26, 23, 33, 26, 37, 32, 35, 28, …
## $ field_goals_attempted             <int> 67, 72, 69, 63, 72, 70, 69, 64, 68, …
## $ flagrant_fouls                    <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ fouls                             <int> 14, 12, 14, 16, 15, 18, 18, 16, 11, …
## $ free_throw_pct                    <dbl> 81.8, 66.7, 87.0, 50.0, 72.7, 100.0,…
## $ free_throws_made                  <int> 9, 8, 20, 8, 16, 17, 9, 20, 21, 3, 1…
## $ free_throws_attempted             <int> 11, 12, 23, 16, 22, 17, 13, 23, 25, …
## $ largest_lead                      <chr> "7", "12", "1", "17", "0", "32", "8"…
## $ offensive_rebounds                <int> 6, 9, 8, 4, 13, 8, 6, 4, 11, 7, 5, 1…
## $ points_in_paint                   <chr> "44", "24", "30", "34", "30", "32", …
## $ steals                            <int> 7, 5, 7, 4, 4, 7, 6, 7, 5, 7, 6, 9, …
## $ team_turnovers                    <int> 2, 0, 2, 0, 0, 2, 0, 0, 1, 0, 1, 1, …
## $ technical_fouls                   <int> 0, 0, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, …
## $ three_point_field_goal_pct        <dbl> 23.8, 34.6, 31.8, 43.3, 22.9, 44.8, …
## $ three_point_field_goals_made      <int> 5, 9, 7, 13, 8, 13, 9, 9, 10, 11, 6,…
## $ three_point_field_goals_attempted <int> 21, 26, 22, 30, 35, 29, 29, 22, 22, …
## $ total_rebounds                    <int> 38, 39, 34, 35, 35, 40, 30, 34, 42, …
## $ total_technical_fouls             <int> 0, 0, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, …
## $ total_turnovers                   <int> 15, 13, 13, 11, 10, 11, 11, 11, 13, …
## $ turnover_points                   <chr> "14", "15", "14", "13", "13", "14", …
## $ turnovers                         <int> 13, 13, 11, 11, 10, 9, 11, 11, 12, 1…
## $ opponent_team_id                  <int> 9, 17, 9, 17, 17, 9, 17, 9, 18, 9, 3…
## $ opponent_team_uid                 <chr> "s:40~l:59~t:9", "s:40~l:59~t:17", "…
## $ opponent_team_slug                <chr> "new-york-liberty", "las-vegas-aces"…
## $ opponent_team_location            <chr> "New York", "Las Vegas", "New York",…
## $ opponent_team_name                <chr> "Liberty", "Aces", "Liberty", "Aces"…
## $ opponent_team_abbreviation        <chr> "NY", "LV", "NY", "LV", "LV", "NY", …
## $ opponent_team_display_name        <chr> "New York Liberty", "Las Vegas Aces"…
## $ opponent_team_short_display_name  <chr> "Liberty", "Aces", "Liberty", "Aces"…
## $ opponent_team_color               <chr> "86cebc", "ce1141", "86cebc", "ce114…
## $ opponent_team_alternate_color     <chr> "000000", "b4975a", "000000", "b4975…
## $ opponent_team_logo                <chr> "https://a.espncdn.com/i/teamlogos/w…
## $ opponent_team_score               <int> 69, 70, 87, 73, 104, 76, 99, 82, 84,…
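Note that a few counting stats in the glimpse above (`fast_break_points`, `points_in_paint`, among others) are typed `<chr>` rather than `<int>`. A minimal sketch of converting them and deriving a scoring margin follows, using a mock data frame built from the first four rows shown above. The `margin` column is a name introduced here for illustration.

```r
# Mock rows copied from the first four team box score rows shown above
team_box <- data.frame(
  team_abbreviation   = c("LV", "NY", "LV", "NY"),
  team_score          = c(70L, 69L, 73L, 87L),
  opponent_team_score = c(69L, 70L, 87L, 73L),
  team_winner         = c(TRUE, FALSE, FALSE, TRUE),
  fast_break_points   = c("4", "17", "2", "12"),
  points_in_paint     = c("44", "24", "30", "34"),
  stringsAsFactors    = FALSE
)

# Convert character stat columns to integer; non-numeric strings would become NA
chr_stat_cols <- c("fast_break_points", "points_in_paint")
team_box[chr_stat_cols] <- lapply(team_box[chr_stat_cols], as.integer)

# Scoring margin from each team's point of view (hypothetical column name)
team_box$margin <- team_box$team_score - team_box$opponent_team_score
team_box[, c("team_abbreviation", "margin", "team_winner")]
```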

WNBA full player box score seasons (2002-2023) ~ 5-30 seconds

tictoc::tic()
progressr::with_progress({
  wnba_player_box <- wehoop::load_wnba_player_box()
})
tictoc::toc()
## 0.35 sec elapsed
length(unique(wnba_player_box$game_id))
## [1] 262
nrow(wnba_player_box)
## [1] 5796
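The game count and row count are checked the same way for every dataset in this guide, so the pair of calls can be wrapped in a small helper. `summarise_load()` is a hypothetical name, not a wehoop function, and the demo data frame is made up for illustration.

```r
# Hypothetical helper: number of distinct games and total rows in a loaded table
summarise_load <- function(df) {
  stopifnot(is.data.frame(df), "game_id" %in% names(df))
  c(games = length(unique(df$game_id)), rows = nrow(df))
}

# Tiny stand-in for a loaded table: three rows covering two games
demo <- data.frame(game_id = c(401578574L, 401578574L, 401578573L))
summarise_load(demo)
```

Calling `summarise_load(wnba_player_box)` on the object loaded above should report the same two numbers as the separate calls.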

Women’s college basketball full play-by-play seasons (2004-2024) ~ 45-90 seconds

tictoc::tic()
progressr::with_progress({
  wbb_pbp <- wehoop::load_wbb_pbp()
})
tictoc::toc()
## 2.244 sec elapsed
length(unique(wbb_pbp$game_id))
## [1] 959
nrow(wbb_pbp)
## [1] 328507

Women’s college basketball full team box score seasons (2006-2024) ~ 5-30 seconds

tictoc::tic()
progressr::with_progress({
  wbb_team_box <- wehoop::load_wbb_team_box()
})
tictoc::toc()
## 0.424 sec elapsed
length(unique(wbb_team_box$game_id))
## [1] 997
nrow(wbb_team_box)
## [1] 1994

Women’s college basketball full player box score seasons (2006-2024) ~ 5-30 seconds

tictoc::tic()
progressr::with_progress({
  wbb_player_box <- wehoop::load_wbb_player_box()
})
tictoc::toc()
## 0.57 sec elapsed
length(unique(wbb_player_box$game_id))
## [1] 999
nrow(wbb_player_box)
## [1] 28488
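Every call above relies on the loader's default, which, judging by the game counts in the output, covers only the most recent season. To pull several seasons, pass a vector of years to the `seasons` argument. The sketch below builds and validates that vector. The wrapper name `load_wnba_pbp_range()` is hypothetical, the 2002 lower bound is taken from the heading earlier in this guide, and the download itself is left commented out because it needs an internet connection.

```r
# Build a validated vector of season years.
# Assumption: 2002 is the earliest WNBA play-by-play season, per the heading above.
wnba_season_range <- function(first, last) {
  stopifnot(first >= 2002, last >= first)
  seq.int(first, last)
}

# Hypothetical wrapper (not part of wehoop) around the package's loader
load_wnba_pbp_range <- function(first, last) {
  wehoop::load_wnba_pbp(seasons = wnba_season_range(first, last))
}

seasons <- wnba_season_range(2021, 2023)
seasons

# Uncomment to download three seasons of play-by-play data:
# wnba_pbp_3yr <- load_wnba_pbp_range(2021, 2023)
# table(wnba_pbp_3yr$season)
```

The other loaders in this guide appear to accept the same `seasons` argument, so the pattern should carry over to the box score and college datasets.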