Helper that loads one or more seasons of WNBA play-by-play data from the data repo, either into memory or, using arguments forwarded through the dots (...), into a database.

Usage

load_wnba_pbp(
  seasons = most_recent_wnba_season(),
  ...,
  dbConnection = NULL,
  tablename = NULL
)

Arguments

seasons

A vector of 4-digit years associated with given WNBA seasons. (Min: 2002)

...

Additional arguments passed to an underlying function that writes the season data into a database (used by update_wnba_db()).

dbConnection

A DBIConnection object, as returned by DBI::dbConnect()

tablename

The name of the play by play data table within the database
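
The database path described by dbConnection and tablename can be sketched as follows. This is an illustration under stated assumptions, not the package's own workflow: load_pbp_into_sqlite() is a hypothetical helper, the SQLite backend (RSQLite) and the table name "wnba_pbp" are arbitrary choices, and the call needs network access to the data repo.

```r
# Sketch only: assumes the wehoop, DBI, and RSQLite packages are installed.
# load_pbp_into_sqlite() is a hypothetical helper, not part of wehoop.
load_pbp_into_sqlite <- function(seasons, db_path, tablename = "wnba_pbp") {
  # Open a DBIConnection, as the dbConnection argument expects.
  con <- DBI::dbConnect(RSQLite::SQLite(), db_path)
  # Always close the connection, even if the download fails.
  on.exit(DBI::dbDisconnect(con), add = TRUE)

  # Supplying both dbConnection and tablename asks for the data to be
  # written to the database rather than returned in memory.
  wehoop::load_wnba_pbp(
    seasons = seasons,
    dbConnection = con,
    tablename = tablename
  )

  # Report which tables now exist in the database file.
  DBI::dbListTables(con)
}

# Example call (downloads data, so it is left commented out):
# load_pbp_into_sqlite(2021:2022, "wnba.sqlite")
```

Wrapping the connection in a function with on.exit() keeps the connection from leaking if the load is interrupted.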

Value

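A data frame. The table below documents 42 columns; the example output at the end of this page (wehoop 3.0.0) shows 64 columns, so the returned data may not match this table exactly.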

| col_name | description |
|---|---|
| shooting_play | Logical value (TRUE/FALSE) indicating whether the play was a shooting play |
| sequence_number | Intended to represent a shot-possession; examine the last two digits to see whether multiple events occur within the same shot-possession. A shot-possession is roughly any sequence of plays up to a shot or a change in possession (and probably events such as technical fouls). As soon as a shot goes up, a new sequence starts, even if the shooting team retains possession via an offensive or dead-ball rebound. The first portion of the number is usually time-related (i.e. a numeric representation of when the sequence started, roughly in seconds remaining in the period) |
| period_display_value | Long form of period (1st Quarter, 2nd Quarter, OT, etc.) |
| period_number | The numeric period of play in the game |
| home_score | Home score at the time of the play |
| coordinate_x | The entire scale is a rectangle of size 25x47, intended as a half-court representation of the basketball court (i.e. the side of the offense), with each coordinate unit representing a foot. The basket appears to be represented roughly as the point (25, 0). That does not match the physical court, where the basket overhangs the floor: the backboard is aligned 48 inches from the baseline, and the center of the hoop is roughly 11 inches from there. This is an idiosyncrasy of either sensor placement or software and data entry. Use your best judgement when making charts; some translations will likely be helpful. |
| coordinate_y | See coordinate_x |
| scoring_play | Logical value (TRUE/FALSE) indicating whether the play was a play on which the offense scored |
| clock_display_value | Time left within the period |
| team_id | Unique team identification number for the offensive team |
| type_id | Unique play type identification number |
| type_text | Play type text description |
| away_score | Away score at the time of the play |
| id | Unique play identification number |
| text | Text description of the play |
| score_value | The points value of the shot taken |
| participants_0_athlete_id | Unique player identification number |
| participants_1_athlete_id | Unique player identification number |
| participants_2_athlete_id | Unique player identification number |
| type_abbreviation | Play type abbreviation |
| season | Season of the game |
| season_type | Season type of the game: 1 is pre-season, 2 is regular season, 3 is post-season, 4 is off-season |
| away_team_id | Unique away team identification number |
| away_team_name | Away team name |
| away_team_mascot | Away team mascot |
| away_team_abbrev | Text abbreviation for the away team |
| away_team_name_alt | Alternate versions of the away team abbreviation |
| home_team_id | Unique home team identification number |
| home_team_name | Home team name |
| home_team_mascot | Home team mascot |
| home_team_abbrev | Text abbreviation for the home team |
| home_team_name_alt | Alternate versions of the home team abbreviation |
| home_team_spread | The game spread with respect to the home team |
| game_spread | Game spread in (-X Team) format. Almost no games have one; I would recommend not trusting any of these three spread-related columns |
| home_favorite | Logical (TRUE/FALSE) indicating whether the home team is favored |
| clock_minutes | Clock minutes split from seconds for developer convenience |
| clock_seconds | Clock seconds split from minutes for developer convenience |
| half | Half of the game |
| lag_half | A lag column on the half |
| lead_half | A lead column on the half |
| game_play_number | Game play number |
| game_id | Unique identifier for the game event |
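
The coordinate and clock columns usually need a small transformation before charting. The sketch below uses a made-up three-row data frame with column names from the table above. It assumes the basket sits at roughly (25, 0), as the coordinate_x description suggests, and that the clock columns may arrive as character values. The derived names x_from_basket and seconds_left are illustrative only.

```r
# Toy rows using column names from the table above; all values are made up.
pbp <- data.frame(
  shooting_play = c(TRUE, FALSE, TRUE),
  coordinate_x  = c(25, 10, 40),
  coordinate_y  = c(0, 5, 22),
  clock_minutes = c("9", "8", "0"),
  clock_seconds = c("45", "30", "07"),
  stringsAsFactors = FALSE
)

# Keep shooting plays only.
shots <- pbp[pbp$shooting_play, ]

# Shift x so the assumed basket location (25, 0) becomes the origin.
shots$x_from_basket <- shots$coordinate_x - 25

# Seconds remaining in the period; as.numeric() handles character clocks.
shots$seconds_left <- as.numeric(shots$clock_minutes) * 60 +
  as.numeric(shots$clock_seconds)

print(shots[, c("x_from_basket", "coordinate_y", "seconds_left")])
```

Whether a further translation along y is needed depends on how the chart's court outline is drawn, so that step is left out here.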

Examples

# \donttest{
  try(load_wnba_pbp())
#> ──────────────────────────────────────────────────────────────── wehoop 3.0.0 ──
#> # A tibble: 59,829 × 64
#>    game_play_number        id sequence_number type_id type_text text  away_score
#>               <int>     <dbl>           <int>   <int> <chr>     <chr>      <int>
#>  1                1   4.02e 9               4     615 Jumpball  Nata…          0
#>  2                2   4.02e 9               8     128 Driving … Oliv…          0
#>  3                3   4.02e 9               9     155 Defensiv… Jess…          0
#>  4                4   4.02e10              10     119 Driving … Awak…          0
#>  5                5   4.02e10              14     132 Step Bac… Nia …          3
#>  6                6   4.02e10              16      90 Out of B… Paig…          3
#>  7                7   4.02e10              17      92 Jump Shot Kayl…          6
#>  8                8   4.02e10              20      92 Jump Shot Awak…          6
#>  9                9   4.02e10              21     155 Defensiv… Oliv…          6
#> 10               10   4.02e10              22      44 Shooting… Paig…          6
#> # ℹ 59,819 more rows
#> # ℹ 57 more variables: home_score <int>, period_number <int>,
#> #   period_display_value <chr>, clock_display_value <chr>, scoring_play <lgl>,
#> #   score_value <int>, team_id <int>, athlete_id_1 <int>, athlete_id_2 <int>,
#> #   athlete_id_3 <int>, wallclock <chr>, shooting_play <lgl>,
#> #   coordinate_x_raw <dbl>, coordinate_y_raw <dbl>, points_attempted <int>,
#> #   short_description <chr>, game_id <int>, season <int>, season_type <int>, …
# }
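
The output above lists column names such as athlete_id_1 and coordinate_x_raw that do not appear in the Value table, so it can be worth checking for expected columns before relying on them. A minimal sketch, using a hypothetical helper missing_columns() and a toy data frame in place of the real download:

```r
# Hypothetical helper (not part of wehoop): returns the expected column
# names that are absent from a data frame.
missing_columns <- function(data, expected) {
  setdiff(expected, names(data))
}

# Toy stand-in for a loaded play-by-play tibble; the names are taken from
# the example output above, the values are made up.
pbp <- data.frame(game_id = 1L, athlete_id_1 = 10L, coordinate_x_raw = 25)

# Names taken from the Value table that this script would depend on.
needed <- c("game_id", "participants_0_athlete_id", "coordinate_x")

absent <- missing_columns(pbp, needed)
print(absent)
```

With real data, replace the toy data frame with the result of load_wnba_pbp().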