Skip to contents

helper that loads multiple seasons from the data repo either into memory or writes it into a db using some forwarded arguments in the dots

Usage

load_wbb_pbp(
  seasons = most_recent_wbb_season(),
  ...,
  dbConnection = NULL,
  tablename = NULL
)

Arguments

seasons

A vector of 4-digit years associated with given women's college basketball seasons. (Min: 2004)

...

Additional arguments passed to an underlying function that writes the season data into a database (used by update_wbb_db()).

dbConnection

A DBIConnection object, as returned by DBI::dbConnect()

tablename

The name of the play by play data table within the database

Value

A dataframe with 55 columns:

col_namedescription
shooting_playLogical value (TRUE/FALSE) indicating whether the play was a shooting play
sequence_numberSequence number is supposed to represent a shot-possession, examine the last two numbers to see if there are multiple events that occur within the same shot-possession. A shot-possession is basically any sequence of plays until there is a shot, change in possession, and probably things like technical fouls and the like. So as soon as a shot goes up, a new sequence starts regardless, even if the shooting team retains possession via offensive or deadball rebound. The first portion of the number is usually time related (i.e. the numeric representation of when the sequence started, from a seconds remaining in the period perspective or so)
period_display_valueLong form of period (1st quarter, 2nd Quarter, OT, etc.)
period_numberThe numeric period of play in the game
home_scoreHome score at the time of the play
scoring_playLogical value (TRUE/FALSE) indicating whether the play was a play on which the offense scored
clock_display_valueTime left within the period
team_idUnique team identification number for the offensive team
type_idUnique play type identifcation number
type_textPlay type text description
away_scoreAway score at the time of the play
idUnique play identifcation number
textText description of the play
score_valueThe points value of the shot taken
participants_0_athlete_idUnique player identification number
participants_1_athlete_idUnique player identification number
seasonSeason of the game
season_typeSeason type of the game, 1 is pre-season, 2 is regular season, 3 is post-season, 4 is off-season
away_team_idUnique away team identification number
away_team_nameAway team name
away_team_mascotAway team mascot
away_team_abbrevText abbreviation for the away team
away_team_name_altAlternate versions of the away team abbreviation
home_team_idUnique home team identification number
home_team_namehome team name
home_team_mascothome team mascot
home_team_abbrevText abbreviation for the home team
home_team_name_altAlternate versions of the home team abbreviation
home_team_spreadThe game spread with respect to the home team
game_spreadGame spread in (-X Team) format
home_favoriteLogical (TRUE/FALSE) indicating whether the home team is favored
game_spread_availableLogical (TRUE/FALSE) indicating whether the spread was available from ESPN. Basically, I would just not recommend using any of the spread information, I think I defaulted a lot of them to -2.5 for the home team. Most games probably do not have spread information. This column should really be listed first
game_idUnique identifier for the game event
qtrQuarter of the game
timeTime left within the period
clock_minutesClock minutes split from seconds for developer convenience
clock_secondsClock seconds split from minutes for developer convenience
halfHalf of the game
game_halfHalf of the game
lag_qtrA lag column on the quarter
lead_qtrA lead column on the quarter
lag_game_halfA lag column on the half
lead_game_halfA lead column on the half
start_quarter_seconds_remainingQuarter seconds remaining at the start of the play (these are more or less code artifacts from other sports, but may eventually be used more seriously)
start_half_seconds_remainingGame half seconds remaining at the start of the play (these are more or less code artifacts from other sports, but may eventually be used more seriously)
start_game_seconds_remainingGame seconds remaining at the start of the play (”')
game_play_numberGame play number
end_quarter_seconds_remainingQuarter seconds remaining at the end of the play (”')
end_half_seconds_remainingGame half seconds remaining at the end of the play (”')
end_game_seconds_remainingGame seconds remaining at the end of the play (”')
periodPeriod of the game
coordinate_xThe entire scale is a rectangle of size 25x47, intended as a half-court representation of the basketball court (i.e. on the side of the offense), with each coordinate unit representing a foot. It appears that the basket is roughly represented as the (25, 0) point. This is a nonsensical definition when considering that the basket overhangs the court, with the backboard aligned 48 inches from the baseline, then the center of the hoop being roughly 11 inches from there. This is an idiosyncracy of either sensor placement or software and data entry. Use your best judgement in making your charts, I think you will find that making some translations will be helpful.
coordinate_y
weekApparently there are weeks
media_idWhere did you come from

Examples

# \donttest{
  try(load_wbb_pbp())
#> ──────────────────────────────────────────────────────────────── wehoop 3.0.0 ──
#> # A tibble: 2,824,090 × 62
#>    game_play_number      id sequence_number type_id type_text   text  away_score
#>               <int>   <dbl>           <int>   <int> <chr>       <chr>      <int>
#>  1                1 4.02e17       120415188     615 Jumpball    "Sta…          0
#>  2                2 4.02e17       120415189     615 Jumpball    "Jum…          0
#>  3                3 4.02e17       120415190     615 Jumpball    "Jum…          0
#>  4                4 4.02e17       120415191     558 JumpShot    "Lau…          0
#>  5                5 4.02e17       120415193     558 JumpShot    "Mad…          0
#>  6                6 4.02e17       120415194     586 Offensive … "Tes…          0
#>  7                7 4.02e17       120415195     572 LayUpShot   "Tes…          0
#>  8                8 4.02e17       120415196     586 Offensive … "Rav…          0
#>  9                9 4.02e17       120415197     572 LayUpShot   "Mad…          2
#> 10               10 4.02e17       120415199     598 Lost Ball … "Gab…          2
#> # ℹ 2,824,080 more rows
#> # ℹ 55 more variables: home_score <int>, period_number <int>,
#> #   period_display_value <chr>, clock_display_value <chr>, scoring_play <lgl>,
#> #   score_value <int>, wallclock <chr>, shooting_play <lgl>,
#> #   coordinate_x_raw <dbl>, coordinate_y_raw <dbl>, points_attempted <int>,
#> #   short_description <chr>, game_id <int>, season <int>, season_type <int>,
#> #   home_team_id <int>, home_team_name <chr>, home_team_mascot <chr>, …
# }