Getting Started with wehoop
Saiem Gilani
Source: vignettes/getting-started-wehoop.Rmd
Welcome folks,
I’m Saiem Gilani, one of the authors of wehoop,
and I hope to give the community a high-quality resource for accessing
women’s basketball data for statistical analysis, basketball research,
and more. I am excited to show you some of what you can do with this
edition of the package.
Installing R and RStudio
- Head to https://cran.r-project.org
- Select the appropriate link for your operating system (Windows, Mac OS X, or Linux)
  - Windows - Select base and download the most recent version
  - Mac OS X - Select Latest Release, but check to make sure your OS is the correct version. Look through Binaries for Legacy OS X Systems if you are on an older release
  - Linux - Select the appropriate distro and follow the installation instructions
- Head to Posit.co
- Follow the associated download and installation instructions for RStudio.
- Start looking over the RStudio IDE Cheatsheet. (IDE stands for integrated development environment.)
- For Windows users: I recommend you install Rtools. This is not an R package! It is “a collection of resources for building packages for R under Microsoft Windows, or for building R itself”. Go to https://cran.r-project.org/bin/windows/Rtools/ and follow the directions for installation.
Install wehoop
# Install pacman if it is missing, then use it to install (if needed) and load the packages:
if (!requireNamespace('pacman', quietly = TRUE)){
  install.packages('pacman')
}
pacman::p_load(wehoop, dplyr, glue, tictoc, progressr)
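If you would rather not rely on pacman, a base R check along these lines can confirm that the packages used in this vignette are available before you continue. This is only a sketch: the names `needed`, `is_installed`, and `missing_pkgs` are illustrative, and the commented-out install step assumes the packages are available from your configured repositories.

```r
# Packages used in this vignette (the same list passed to pacman::p_load above)
needed <- c("wehoop", "dplyr", "glue", "tictoc", "progressr")

# requireNamespace() returns TRUE when a package can be loaded, without attaching it
is_installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- needed[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # Uncomment to install them (assumes they are in your configured repositories):
  # install.packages(missing_pkgs)
} else {
  message("All packages are installed.")
}
```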
Quick Start
WNBA full play-by-play seasons (2002-2023) ~ 30-60 seconds
tictoc::tic()
progressr::with_progress({
  wnba_pbp <- wehoop::load_wnba_pbp()
})
tictoc::toc()
## 1.137 sec elapsed
## 13.91 sec elapsed
glue::glue("{nrow(wnba_pbp)} rows of WNBA play-by-play data from {length(unique(wnba_pbp$game_id))} games.")
## 102191 rows of WNBA play-by-play data from 262 games.
## 1782985 rows of WNBA play-by-play data from 4674 games.
dplyr::glimpse(wnba_pbp)
## Rows: 102,191
## Columns: 62
## $ game_play_number <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,…
## $ id <dbl> 4015785744, 4015785747, 4015785749, 40…
## $ sequence_number <int> 4, 7, 9, 10, 11, 13, 14, 15, 16, 17, 1…
## $ type_id <int> 615, 110, 90, 120, 92, 92, 155, 143, 1…
## $ type_text <chr> "Jumpball", "Driving Layup Shot", "Out…
## $ text <chr> "A'ja Wilson vs. Jonquel Jones (Alysha…
## $ away_score <int> 0, 2, 2, 4, 4, 4, 4, 4, 4, 4, 4, 4, 7,…
## $ home_score <int> 0, 0, 0, 0, 3, 3, 3, 3, 3, 3, 3, 3, 3,…
## $ period_number <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ period_display_value <chr> "1st Quarter", "1st Quarter", "1st Qua…
## $ clock_display_value <chr> "10:00", "9:38", "9:17", "8:54", "8:40…
## $ scoring_play <lgl> FALSE, TRUE, FALSE, TRUE, TRUE, FALSE,…
## $ score_value <int> 0, 2, 0, 2, 3, 0, 0, 0, 0, 0, 0, 0, 3,…
## $ team_id <int> 17, 17, 9, 17, 9, 17, 9, 9, 17, 17, 9,…
## $ athlete_id_1 <int> 3149391, 3149391, 981, 924, 2593770, 3…
## $ athlete_id_2 <int> 2999101, 3888043, NA, NA, 981, NA, NA,…
## $ athlete_id_3 <int> 924, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ wallclock <chr> "2023-10-19T00:09:06Z", "2023-10-19T00…
## $ shooting_play <lgl> FALSE, TRUE, FALSE, TRUE, TRUE, TRUE, …
## $ coordinate_x_raw <dbl> -214748340, 23, 13, 24, 2, 40, 40, 27,…
## $ coordinate_y_raw <dbl> -214748365, 3, 2, 3, 0, 21, 21, 2, 2, …
## $ game_id <int> 401578574, 401578574, 401578574, 40157…
## $ season <int> 2023, 2023, 2023, 2023, 2023, 2023, 20…
## $ season_type <int> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,…
## $ home_team_id <int> 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,…
## $ home_team_name <chr> "New York", "New York", "New York", "N…
## $ home_team_mascot <chr> "Liberty", "Liberty", "Liberty", "Libe…
## $ home_team_abbrev <chr> "NY", "NY", "NY", "NY", "NY", "NY", "N…
## $ home_team_name_alt <chr> "New York", "New York", "New York", "N…
## $ away_team_id <int> 17, 17, 17, 17, 17, 17, 17, 17, 17, 17…
## $ away_team_name <chr> "Las Vegas", "Las Vegas", "Las Vegas",…
## $ away_team_mascot <chr> "Aces", "Aces", "Aces", "Aces", "Aces"…
## $ away_team_abbrev <chr> "LV", "LV", "LV", "LV", "LV", "LV", "L…
## $ away_team_name_alt <chr> "Las Vegas", "Las Vegas", "Las Vegas",…
## $ game_spread <dbl> 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5…
## $ home_favorite <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TR…
## $ game_spread_available <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FAL…
## $ home_team_spread <dbl> 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5…
## $ qtr <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ time <chr> "10:00", "9:38", "9:17", "8:54", "8:40…
## $ clock_minutes <int> 10, 9, 9, 8, 8, 8, 8, 8, 8, 8, 7, 7, 7…
## $ clock_seconds <dbl> 0, 38, 17, 54, 40, 26, 25, 19, 18, 13,…
## $ home_timeout_called <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FAL…
## $ away_timeout_called <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FAL…
## $ half <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ game_half <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ lead_qtr <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ lead_half <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ start_quarter_seconds_remaining <dbl> 600, 578, 557, 534, 520, 506, 505, 499…
## $ start_half_seconds_remaining <dbl> 1200, 1178, 1157, 1134, 1120, 1106, 11…
## $ start_game_seconds_remaining <dbl> 2400, 2378, 2357, 2334, 2320, 2306, 23…
## $ end_quarter_seconds_remaining <dbl> 600, 557, 534, 520, 506, 505, 499, 498…
## $ end_half_seconds_remaining <dbl> 1200, 1157, 1134, 1120, 1106, 1105, 10…
## $ end_game_seconds_remaining <dbl> 2400, 2357, 2334, 2320, 2306, 2305, 22…
## $ period <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ lag_qtr <int> NA, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ lag_half <int> NA, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ coordinate_x <dbl> -214748406.75, -38.75, 39.75, -38.75, …
## $ coordinate_y <dbl> -214748365, -2, 12, -1, 23, 15, -15, -…
## $ game_date <date> 2023-10-18, 2023-10-18, 2023-10-18, 2…
## $ game_date_time <dttm> 2023-10-18 20:00:00, 2023-10-18 20:00…
## $ type_abbreviation <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
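To get a feel for how the clock and coordinate columns relate to each other, here is a small sketch on a toy data frame whose values are copied from the first five plays in the glimpse output above, so it runs without downloading anything. Two things are assumptions on my part rather than documented behavior: that regulation play is four 10-minute quarters (which matches the 2400 starting value of `start_game_seconds_remaining`), and that the very large negative coordinates are placeholders for plays with no court location. The names `pbp_toy`, `quarter_secs`, `game_secs`, and `coordinate_x_clean` are illustrative.

```r
# Toy rows copied from the first five plays in the glimpse() output above
pbp_toy <- data.frame(
  qtr              = c(1L, 1L, 1L, 1L, 1L),
  clock_minutes    = c(10L, 9L, 9L, 8L, 8L),
  clock_seconds    = c(0, 38, 17, 54, 40),
  scoring_play     = c(FALSE, TRUE, FALSE, TRUE, TRUE),
  score_value      = c(0L, 2L, 0L, 2L, 3L),
  coordinate_x_raw = c(-214748340, 23, 13, 24, 2)
)

# Seconds left in the quarter, rebuilt from the clock columns
quarter_secs <- pbp_toy$clock_minutes * 60 + pbp_toy$clock_seconds

# Assumption: regulation only, four 600-second quarters (overtime is not handled)
game_secs <- (4 - pbp_toy$qtr) * 600 + quarter_secs

# Assumption: huge negative coordinates mean "no location", so treat them as missing
pbp_toy$coordinate_x_clean <- ifelse(
  pbp_toy$coordinate_x_raw < -1e6, NA_real_, pbp_toy$coordinate_x_raw
)

# Total points scored on the scoring plays in this toy sample
points_scored <- sum(pbp_toy$score_value[pbp_toy$scoring_play])
```

On these five rows, `quarter_secs` and `game_secs` reproduce the `start_quarter_seconds_remaining` and `start_game_seconds_remaining` values printed above, which is a quick way to check your reading of the columns before applying the same logic to the full `wnba_pbp` data.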
WNBA full team box score seasons (2003-2023) ~ 5-30 seconds
tictoc::tic()
progressr::with_progress({
  wnba_team_box <- wehoop::load_wnba_team_box()
})
tictoc::toc()
## 0.514 sec elapsed
glue::glue("{nrow(wnba_team_box)} rows of WNBA team boxscore data from {length(unique(wnba_team_box$game_id))} games.")
## 524 rows of WNBA team boxscore data from 262 games.
dplyr::glimpse(wnba_team_box)
## Rows: 524
## Columns: 57
## $ game_id <int> 401578574, 401578574, 401578573, 401…
## $ season <int> 2023, 2023, 2023, 2023, 2023, 2023, …
## $ season_type <int> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, …
## $ game_date <date> 2023-10-18, 2023-10-18, 2023-10-15,…
## $ game_date_time <dttm> 2023-10-18 20:00:00, 2023-10-18 20:…
## $ team_id <int> 17, 9, 17, 9, 9, 17, 9, 17, 9, 18, 1…
## $ team_uid <chr> "s:40~l:59~t:17", "s:40~l:59~t:9", "…
## $ team_slug <chr> "las-vegas-aces", "new-york-liberty"…
## $ team_location <chr> "Las Vegas", "New York", "Las Vegas"…
## $ team_name <chr> "Aces", "Liberty", "Aces", "Liberty"…
## $ team_abbreviation <chr> "LV", "NY", "LV", "NY", "NY", "LV", …
## $ team_display_name <chr> "Las Vegas Aces", "New York Liberty"…
## $ team_short_display_name <chr> "Aces", "Liberty", "Aces", "Liberty"…
## $ team_color <chr> "ce1141", "86cebc", "ce1141", "86ceb…
## $ team_alternate_color <chr> "b4975a", "000000", "b4975a", "00000…
## $ team_logo <chr> "https://a.espncdn.com/i/teamlogos/w…
## $ team_home_away <chr> "away", "home", "away", "home", "awa…
## $ team_score <int> 70, 69, 73, 87, 76, 104, 82, 99, 87,…
## $ team_winner <lgl> TRUE, FALSE, FALSE, TRUE, FALSE, TRU…
## $ assists <int> 18, 19, 13, 28, 19, 31, 17, 21, 22, …
## $ blocks <int> 1, 5, 0, 8, 4, 1, 3, 5, 6, 6, 4, 4, …
## $ defensive_rebounds <int> 32, 30, 26, 31, 22, 32, 24, 30, 31, …
## $ fast_break_points <chr> "4", "17", "2", "12", "3", "11", "11…
## $ field_goal_pct <dbl> 41.8, 36.1, 33.3, 52.4, 36.1, 52.9, …
## $ field_goals_made <int> 28, 26, 23, 33, 26, 37, 32, 35, 28, …
## $ field_goals_attempted <int> 67, 72, 69, 63, 72, 70, 69, 64, 68, …
## $ flagrant_fouls <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ fouls <int> 14, 12, 14, 16, 15, 18, 18, 16, 11, …
## $ free_throw_pct <dbl> 81.8, 66.7, 87.0, 50.0, 72.7, 100.0,…
## $ free_throws_made <int> 9, 8, 20, 8, 16, 17, 9, 20, 21, 3, 1…
## $ free_throws_attempted <int> 11, 12, 23, 16, 22, 17, 13, 23, 25, …
## $ largest_lead <chr> "7", "12", "1", "17", "0", "32", "8"…
## $ offensive_rebounds <int> 6, 9, 8, 4, 13, 8, 6, 4, 11, 7, 5, 1…
## $ points_in_paint <chr> "44", "24", "30", "34", "30", "32", …
## $ steals <int> 7, 5, 7, 4, 4, 7, 6, 7, 5, 7, 6, 9, …
## $ team_turnovers <int> 2, 0, 2, 0, 0, 2, 0, 0, 1, 0, 1, 1, …
## $ technical_fouls <int> 0, 0, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, …
## $ three_point_field_goal_pct <dbl> 23.8, 34.6, 31.8, 43.3, 22.9, 44.8, …
## $ three_point_field_goals_made <int> 5, 9, 7, 13, 8, 13, 9, 9, 10, 11, 6,…
## $ three_point_field_goals_attempted <int> 21, 26, 22, 30, 35, 29, 29, 22, 22, …
## $ total_rebounds <int> 38, 39, 34, 35, 35, 40, 30, 34, 42, …
## $ total_technical_fouls <int> 0, 0, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, …
## $ total_turnovers <int> 15, 13, 13, 11, 10, 11, 11, 11, 13, …
## $ turnover_points <chr> "14", "15", "14", "13", "13", "14", …
## $ turnovers <int> 13, 13, 11, 11, 10, 9, 11, 11, 12, 1…
## $ opponent_team_id <int> 9, 17, 9, 17, 17, 9, 17, 9, 18, 9, 3…
## $ opponent_team_uid <chr> "s:40~l:59~t:9", "s:40~l:59~t:17", "…
## $ opponent_team_slug <chr> "new-york-liberty", "las-vegas-aces"…
## $ opponent_team_location <chr> "New York", "Las Vegas", "New York",…
## $ opponent_team_name <chr> "Liberty", "Aces", "Liberty", "Aces"…
## $ opponent_team_abbreviation <chr> "NY", "LV", "NY", "LV", "LV", "NY", …
## $ opponent_team_display_name <chr> "New York Liberty", "Las Vegas Aces"…
## $ opponent_team_short_display_name <chr> "Liberty", "Aces", "Liberty", "Aces"…
## $ opponent_team_color <chr> "86cebc", "ce1141", "86cebc", "ce114…
## $ opponent_team_alternate_color <chr> "000000", "b4975a", "000000", "b4975…
## $ opponent_team_logo <chr> "https://a.espncdn.com/i/teamlogos/w…
## $ opponent_team_score <int> 69, 70, 87, 73, 104, 76, 99, 82, 84,…
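Note that a few columns in the output above, such as `fast_break_points` and `points_in_paint`, arrive as character (`<chr>`) rather than integer. The sketch below shows one way to convert them and to derive a point margin, using a toy data frame built from the first two rows of the glimpse output so that it runs offline. The names `team_box_toy`, `chr_stats`, `point_margin`, and `fg_pct_check` are illustrative and not part of wehoop.

```r
# Toy rows copied from the first two rows of the glimpse() output above
team_box_toy <- data.frame(
  team_abbreviation     = c("LV", "NY"),
  team_score            = c(70L, 69L),
  opponent_team_score   = c(69L, 70L),
  field_goals_made      = c(28L, 26L),
  field_goals_attempted = c(67L, 72L),
  fast_break_points     = c("4", "17"),
  points_in_paint       = c("44", "24"),
  stringsAsFactors      = FALSE
)

# Convert the character-typed counting stats to integers
chr_stats <- c("fast_break_points", "points_in_paint")
team_box_toy[chr_stats] <- lapply(team_box_toy[chr_stats], as.integer)

# Point margin from each team's perspective
team_box_toy$point_margin <-
  team_box_toy$team_score - team_box_toy$opponent_team_score

# Recompute field goal percentage to compare against the field_goal_pct column
team_box_toy$fg_pct_check <- round(
  100 * team_box_toy$field_goals_made / team_box_toy$field_goals_attempted, 1
)
```

The recomputed percentages (41.8 and 36.1) agree with the `field_goal_pct` values shown above, which suggests that column is made divided by attempted, times 100, rounded to one decimal place.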
WNBA full player box score seasons (2002-2023) ~ 5-30 seconds
tictoc::tic()
progressr::with_progress({
  wnba_player_box <- wehoop::load_wnba_player_box()
})
tictoc::toc()
## 0.35 sec elapsed
length(unique(wnba_player_box$game_id))
## [1] 262
nrow(wnba_player_box)
## [1] 5796
Women’s college basketball full play-by-play seasons (2004-2024) ~ 45-90 seconds
tictoc::tic()
progressr::with_progress({
  wbb_pbp <- wehoop::load_wbb_pbp()
})
tictoc::toc()
## 2.244 sec elapsed
length(unique(wbb_pbp$game_id))
## [1] 959
nrow(wbb_pbp)
## [1] 328507
Women’s college basketball full team box score seasons (2006-2024) ~ 5-30 seconds
tictoc::tic()
progressr::with_progress({
  wbb_team_box <- wehoop::load_wbb_team_box()
})
tictoc::toc()
## 0.424 sec elapsed
length(unique(wbb_team_box$game_id))
## [1] 997
nrow(wbb_team_box)
## [1] 1994
Women’s college basketball full player box score seasons (2006-2024) ~ 5-30 seconds
tictoc::tic()
progressr::with_progress({
  wbb_player_box <- wehoop::load_wbb_player_box()
})
tictoc::toc()
## 0.57 sec elapsed
length(unique(wbb_player_box$game_id))
## [1] 999
nrow(wbb_player_box)
## [1] 28488
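The loader calls shown above use their defaults, which, judging by the row and game counts, return a single recent season. To pull the multi-season ranges named in the headings, my understanding is that the loaders accept a `seasons` argument taking a vector of years. The sketch below wraps that in a small helper; `seasons_to_load` and `load_wnba_pbp_range` are hypothetical names, and the `seasons` argument should be confirmed against the package reference before you rely on it. The larger figures in the WNBA play-by-play output earlier (1782985 rows from 4674 games) are consistent with a multi-season load of this kind, although the source does not show the call that produced them.

```r
# Season range taken from the WNBA play-by-play heading (2002-2023)
seasons_to_load <- 2002:2023

# Hypothetical helper: load WNBA play-by-play for a vector of seasons.
# Assumption: load_wnba_pbp() accepts a `seasons` argument.
load_wnba_pbp_range <- function(seasons) {
  progressr::with_progress({
    wehoop::load_wnba_pbp(seasons = seasons)
  })
}

# Not run here because it downloads every requested season:
# wnba_pbp_all <- load_wnba_pbp_range(seasons_to_load)
# nrow(wnba_pbp_all)
```

The other loaders (`load_wnba_team_box()`, `load_wbb_pbp()`, and so on) are assumed to follow the same pattern, so the helper can be adapted by swapping the function it calls.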