Tuesday, May 24, 2016

GP Charlotte 2016: Saturday (The GP)

Well. That was a weekend, wasn't it? I'm still not quite awake. I mean, I'm awake, but I feel a bit like a zombie.

I do, however, want to get this post out there while the conversations about the GP are still fresh. There's a pretty substantial knowledge gap around tournament software--you know, the thing at the heart of the long delays and ever-shifting information that players in Charlotte experienced.

I was the scorekeeping lead for Swiss side events (which got blown up by the influx of 700+ players with Infinite Challenge badges). Near the end of Round 4, when it looked like the initial problem was fixed, I was sent to the GP to help Kristin and Patrick (the GP scorekeepers) get through the backlog of result slips for the round. I chatted with Kristin on Sunday. My understanding of what happened on the GP is based mostly on that conversation and my own experiences scorekeeping large events, but it does shed some light on some important questions that came up, like why backups weren't an option and why tiebreakers carried over to Day 2 despite previous announcements to the contrary.

Before I get into those things, though, there's some important context. I'm not writing this as a defense of or as an excuse for the things that happened in Charlotte. As much as we (we being everyone on stage--outside admins and SCG staff) tried to make the best out of a miserable situation, I understand that it was exactly that: a miserable situation. It's not the event experience that we wanted to create for the attendees of the Grand Prix, and that's a pretty awful feeling to come home with. But it happened. Instead, I'm writing this to (hopefully) shed some light on the circumstances that led to it all.

The Unholy Trinity: DCI-R, WER, and WLTR

There are three different pieces of software that have been developed and used for Magic tournaments. Most people are most familiar with WER, or Wizards Event Reporter. This is the software that's used at your Friendly Local Game Store. DCI-R (the R stands for Reporter) was previously used for large events like GPs and SCG Tour events, but it was replaced last year with WLTR, also known as Walter or Wizards Large Tournament Reporter.

WER

When WER came out, the goal was for it to be used to run all events, but it doesn't scale very well. The results entry interface in WER is kind of awful (read: inefficient) for scorekeepers who don't use hotkeys, which is about half of us. Instead of switching over, GPs continued to use DCI-R until last year, when the switch to WLTR happened.

WER is also missing some important functionality for large events--for example, you can't edit a player's DCI number. That may not seem important, but the sheer magnitude of events the size of GP Charlotte means that EVERYTHING that can happen does happen, and that includes players signing up with the wrong DCI number.

DCI-R

When DCI-R was developed, no one imagined that a Magic tournament would ever have more than 2,000 players. It was an absurd thought, and that was the cap they set on the number of players in a single event. If you remember the (recent) days of GP Day 1 splits, this is one of the primary reasons. The other was that DCI-R didn't support multi-point data entry, which means that you could only have one scorekeeper, in front of one computer, working on a given event. The number of people who can effectively scorekeep an event that size is pretty limited. When events started to get bigger, with 2,000+ players being the norm instead of a crazy exception, this model clearly wasn't sustainable.

There's one other thing that's important to understand about DCI-R. Going around the software's interface was really easy. The tournament data was stored in a series of .dat files, and as long as you knew what each column of each file meant, it was easy to manipulate the files directly if you couldn't do something through the interface. An easy example of this is the starting table number of events--999 was the highest number you could set (because you wouldn't need more than that for 2,000 players, right?). However, if you saved and closed the tournament, you could change the starting table number in the event information file to whatever you wanted and everything worked just fine.

This is the reason that DCI-R worked (reasonably) well for large events--there were workarounds for the restrictions that existed as a result of the player cap.

But, DCI-R doesn't interact with the organized play database. You can't use it to sanction or upload tournament data, and it doesn't connect to the player database to verify DCI numbers and byes.

WLTR

WLTR is WotC's solution to the problems with DCI-R and WER for giant tournaments. It does a lot of cool things, like let you enroll more than 2,000 players in one tournament. It also lets multiple scorekeepers at multiple computers work on the same event, which is insanely important as attendance creeps up. It has a lot of the features that large events need that are missing from WER, which is awesome.

When using the multi-entry option, one computer is the "main" computer and the rest are "slaves" (yeah, I know...we have the best lingo for these things). Bad Things happen when you try to advance the event (seat for the player meeting, pair rounds, perform a cut) from one of the slave computers, but you can do basically anything else, like entering results and dropping players.

However, WLTR stores tournament data differently. Editing it directly isn't as easy as editing the .dat files that DCI-R used, and pinpointing problems can be a little bit trickier (read: a lot). It's also still new. My first experience with it was last year at GP Atlanta, which was one of the first GPs that used it exclusively.

So, What Happened on Saturday?

When Round 4 of the GP was paired, players noticed that they didn't have the correct number of match points--no one had more than three. Obviously, this is a problem after the third round, when tons of players had two- and three-round byes, not to mention the players who, you know, won matches of Magic: The Gathering in the first three rounds.

From what I understand, the rounds that were paired randomly weren't entirely paired randomly--they were paired based on these incorrect numbers of match points. Since two rounds of match points were obliterated by the tournament software, that means that instead of being paired in their brackets, players were paired up to two brackets up or down. Players at 3-0 were paired against players at 3-0, 2-1, or 1-2.

WLTR has what's called the player card. It shows each player's opponent from each round, the table they played at, and the results of the round--it's a great quick reference for troubleshooting when a player says that their results were reported incorrectly. Despite the match points being wrong for most of the players in the event, the information on the player cards was correct. This means that the results were all still there, WLTR just wasn't calculating match points for the second and third rounds, and it was pairing based on these faulty match point totals.
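
To make that failure mode concrete, here is a minimal sketch of how match points could be totaled from player-card results, and what the totals look like if only Round 1 is counted. The data layout and every name in it (`POINTS`, `match_points`, `player_cards`) are hypothetical; this is not WLTR's actual code or storage format.

```python
# Hypothetical illustration only; not WLTR's real data model.
POINTS = {"win": 3, "draw": 1, "loss": 0, "bye": 3}


def match_points(results, counted_rounds=None):
    """Sum match points over a player's per-round results.

    results: dict mapping round number -> "win" | "draw" | "loss" | "bye"
    counted_rounds: rounds included in the total (None means every round)
    """
    return sum(
        POINTS[outcome]
        for rnd, outcome in results.items()
        if counted_rounds is None or rnd in counted_rounds
    )


# Player cards after three rounds: the results themselves are intact.
player_cards = {
    "three_byes": {1: "bye", 2: "bye", 3: "bye"},
    "went_2_1": {1: "win", 2: "loss", 3: "win"},
    "went_1_2": {1: "win", 2: "loss", 3: "loss"},
}

# Totals when every round is counted: 9, 6, and 3 points.
correct = {name: match_points(card) for name, card in player_cards.items()}

# Simulated bug: Rounds 2 and 3 are skipped, so everyone sits at 3 points.
faulty = {
    name: match_points(card, counted_rounds={1})
    for name, card in player_cards.items()
}
```

With the faulty totals, all three players land in the same bracket, which is how a 3-0 player could end up paired against a 1-2 player even though each player card still shows the right results.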

What caused it?

I'm not really sure. This is Kristin's theory:

There was a player in the event who wasn't correctly assigned byes, despite having a Sleep in Special. At the beginning of Round 3, he showed up to the stage because he couldn't find his name on the pairings.

Stuff like this happens all the time. Patrick (who was on the "slave" computer) re-added him to the event and fixed his results from the first two rounds. WLTR crashed. This didn't raise any red flags at the time because WLTR crashes, well, a lot. You just re-open it, and everything is Fine. Most of the time. That seemed to be the case this time.

Except, then the round finished, Kristin tried to pair Round 4, and everything went to shit. There was some confusion about whether the pairings were correct because the player cards had accurate match results.

Reverting to the back-up from the end of Round 3 and re-pairing Round 4 didn't fix anything because the trigger for the problem happened, effectively, rounds earlier. Moreover, the data (match results from the previous rounds) was still there, it was just unusable. Re-creating the event from Round 1 would have taken hours--in addition to re-entering all those results, each round would have to be manually paired in the software, and there's no quick or easy way to do that in WLTR.

The folks on the main stage got on the phone with WotC and sent copies of the tournament file to them to see if they could figure out what the problem was.

Eventually, there was an announcement that these pairings, and the pairings for the rest of the day, were going to be random. Anything else was going to take way too much time. Round 4 started with random pairings because that was the only way to start Round 4.

So what about after that?

At some point during Round 4, it seemed like the issue had been resolved. There were about 12 minutes left in the round when I was sent to the main stage to help Kristin and Patrick hammer through the result slips from the round so they could get Round 5 paired ASAP.

Even though it looked like things were going to be okay, there were some players who were already understandably upset--they had lost to players they shouldn't have been paired against, or their tiebreakers were adversely impacted by a huge pair down. The discussions had already started about what to do for these players.

Pete and Jared decided to let players drop for a free Infinite Challenge badge, and they added some events to the side events schedule. By this point, it was late in the day, and most of the Challenge events had fired; these events gave the players who were dropping more options for the rest of the day.

There were about 700 drops, and processing them took a huge amount of time. Round 5 was paired, and it looked like the match point issue had been solved.

But it wasn't. When Round 6 was paired, the same thing happened again.

The Ever-Changing Announcements

I've been reading some of the Reddit threads and FB posts in my local playgroups, and the thing people seem to be most frustrated about is that the information they were given was constantly changing. That's true. Random or Swiss pairings? Ten minutes until the next round or twenty or an hour?

It sucks. No, really. It's awful.

Communicating in these kinds of situations is difficult. You want to get information out there so everyone knows what's going on, but it's hard to estimate how long it will take to actually get everything done.

The biggest hold-up on getting the Infinite Challenge Badges in the hands of the players who wanted to drop from the GP was the lack of label printers for the badge bar codes. We had more people on stage--Jeff was my Swiss scorekeeping buddy, and I took his events to free him up to help with the customer service line--but there wasn't a label printer for him. Instead, he handed out prizes to players finishing their Challenge events, traded playmats for vouchers, and did pretty much anything else he could that wasn't print an Infinite Challenge Badge bar code. Kristin and Patrick were also slammed trying to get all those players dropped from the GP event before Round 5.

The flip-flop on whether the rounds were going to be paired Swiss-style or randomly was based on the changing understanding of the tournament status that the event staff had--was the problem fixed, was it not fixed? After Round 4, it seemed like everything was back to normal. A few rounds later, that was clearly not the case. Going into Round 7, things were back to normal. For real this time.

Day 2 Tiebreakers

At several points on Saturday, the GP stage announced that tiebreakers from Day 1 wouldn't carry over. This was good and bad--it meant that the players who had been paired down in the two rounds that were paired randomly wouldn't be punished for the impact that had on their breakers, but it also meant that the players who went undefeated on Day 1 wouldn't be rewarded for that if they did not-so-well on Day 2.

Regardless of the pros and cons, both of which are many and varied, it was the announcement that was made.

Here's the thing: it was impossible to wipe tiebreakers from Day 1 with WLTR.

Wait, what? Isn't that what used to happen on Day 2 of giant GPs by default?

Yes. But, remember what I said about DCI-R and the workarounds for events with more than 2,000 players? The fresh Day 2 tiebreakers were a side effect of one of those solutions: Day 2 was run as a completely separate event. Everyone was awarded a bye for the first round of Day 2, and they were assigned a number of match points for that bye equal to the number of points they earned on Day 1. This is just one of those crazy things that you could force DCI-R to do.
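
Here is a rough sketch of why that trick wiped tiebreakers. It models Day 2 as a brand-new event in which each player starts with a "bye" worth their Day 1 points and no recorded opponents. The `Entry` class and both functions are made up for illustration and say nothing about how DCI-R actually stored this.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One player's state in the (hypothetical) Day 2 event."""
    match_points: int = 0
    opponents: list = field(default_factory=list)


def seed_day2(day1_points):
    """Start a fresh event; the Round 1 'bye' carries the Day 1 point total."""
    return {name: Entry(match_points=pts) for name, pts in day1_points.items()}


def record_result(event, winner, loser):
    """Record a Day 2 match: 3 points to the winner, opponents noted for both."""
    event[winner].match_points += 3
    event[winner].opponents.append(loser)
    event[loser].opponents.append(winner)


day2 = seed_day2({"alice": 27, "bob": 21})
# At this point both players have their Day 1 points but no opponents at all,
# so any opponent-based tiebreaker starts from scratch.
record_result(day2, "alice", "bob")
```

Because tiebreakers are calculated from a player's opponents, a player whose only Day 1 history is a single bye has nothing from Day 1 to feed into them.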

But WLTR can't do it. Why would you need to anymore? No event will ever have to split again, so there's no reason to need to create a new event and carry those points over. WLTR assigns 3 match points for a win, 1 for a draw, and 0 for a loss.

There's sort of a way to do it. You'd have to re-create every round from Day 1 and assign each player a bye or loss (rather than pairing them and entering a result) to get them to the correct number of match points. This way, they aren't associated with any opponents, so there are no tiebreakers to calculate. This has two problems: it would have taken literally forever, and there's no way to assign a player a draw.
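
The draw problem is just arithmetic, and a short sketch shows it. The function names and the nine-round Day 1 are assumptions for the example: if the only tools are byes (3 points) and losses (0 points), every reachable total is a multiple of 3.

```python
def record_points(wins, draws):
    """Match points for a record: 3 per win, 1 per draw, 0 per loss."""
    return 3 * wins + draws


def reachable_with_byes_and_losses(points, rounds):
    """True if `points` can be built from `rounds` results worth 3 or 0 each."""
    return points % 3 == 0 and 0 <= points <= 3 * rounds


ROUNDS = 9  # assumed number of Day 1 rounds, for illustration only

no_draw = record_points(wins=7, draws=0)   # 21 points
one_draw = record_points(wins=7, draws=1)  # 22 points

can_rebuild_no_draw = reachable_with_byes_and_losses(no_draw, ROUNDS)
can_rebuild_one_draw = reachable_with_byes_and_losses(one_draw, ROUNDS)
```

A 7-2 player at 21 points can be rebuilt as seven byes and two losses, but a 7-1-1 player at 22 points can't be rebuilt at all.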

On Tournament Software in General

Many of the responses to this issue, and the issues that have come up in the past as a result of tournament software, have included comments about developing "real" tournament software.

None of these programs are perfect by any stretch of the imagination. Far from it. DCI-R seemed to have fewer problems not because it actually had fewer, but because it had been around so long that the people using it knew how to work around them (like, you know, splitting GPs and editing files directly).

One of the big problems with developing these kinds of programs is that the people writing specs and code aren't the people using them. They don't know what needs to be possible, like changing DCI numbers--if you've never scorekept an event, it's easy to imagine that you would never, ever need to change a player's DCI number. WLTR was developed with input from GP-level scorekeepers, and it does some pretty exciting things. However, the attitude toward many of the issues that have cropped up has been "that's probably a one-time thing" or "why would that be a problem?"

It mostly works. It's good enough, most of the time. But the times when it's not, these are the kinds of things that happen.

23 comments:

  1. I have a different theory that I'd like to share. It seems that the network dropped at the same time that a slave computer was trying to update the master computer database (and by database I mean json file because why the hell not). The slave computer crashed because it could not connect to the master. The master ended up with null data in the match result field for a player. Poor null handling caused the standings calculation to fail and incorrectly assign match points to players. That continued to cause problems all day until the problem in the json was identified and corrected.

    1. Does WLTR store data in json in files like DCI-R stored data in .dat files?

    2. Are Wizards going to let the people who know what they are doing help with the development of WLTR to help prevent this in the future?

    3. Oh, wow, Flats. That assessment seems really accurate.

      It's a little odd that the program would crash when it couldn't connect, though. I would've expected them to catch that scenario, at least.

    4. Is there any chance I could get my hands on a sample of the WLTR json? I understand if that's not possible. I'm mostly curious. I'm also thinking it might be parsable back and forth with DCI-R's .dat formatting. Could get all those tricks back.

    5. Given the general level of QA we see here, I'm not at all surprised by that. One of the easiest things to forget when you're half-assing test cases is that other people's infrastructure isn't going to be as high-quality or reliable as yours.

    6. I don't think it matters how the null value made it there.

      The code that generates the standings/pairings should gracefully handle null values. That's pretty baseline programming skills.

    7. > and by database I mean json file because why the hell not).

      Jesus. Even a SQLite backend would be a dramatic improvement. At least that's ACID compliant.

    8. Flats' theory is pretty close, from what I understand. Essentially, the guy who got added in Round 3, with two byes, was the trigger event. We got the right people involved: Wizards' techs, and Nick Fang (SK at GP LA), and the JSON was changed, which resulted in Round 5 being "normal".
      However, there was more to it than initially discovered (I don't have details), and that's what got fixed during Round 6, allowing the rest of the event to proceed fairly quickly and smoothly.

      Huge kudos to everyone that dove in, willingly and eagerly, to overcome this software/data glitch. It's never fun when your already stressful day is turned upside down and inside out!

  2. So glad you wrote this. Wasn't there (really glad now, haha, since I would likely have been judging if so), but I would like to clarify something above:

    Tiebreakers are carried over to Day 2 now? And if they are, was there an announcement about it?

    1. They should be. In the past, they only didn't carry over when the GP had to split, and that announcement was usually made on site when it was the case.

      TL;DR--Tiebreakers carrying over between days should always happen, but it couldn't in the past for some giant events because of software/staff limitations.

    2. Except if you go back far enough that it wasn't the case, but people don't tend to care about history that old.

  3. Your guess about what happened jibes with what Scott Marshall told me in an interview on Sunday. There was a bizarre data corruption from the late-added player with byes that no one foresaw. Nicely explained!

  4. Your second-to-last paragraph nicely describes one of the main problems with professional software development. Assigning some judges, or maybe having some judges who are software developers themselves, as consultants for the development of any eventual new software would be nice.

    1. There's a surprising amount of overlap in the Professional Software Development and the Magic Judge categories...

    2. They do consult on features, and quite a few scorekeepers provided input on WLTR. It's not the result of not asking--I think it's more that the developers don't have an intimate understanding of why some things are important.

  5. SCG and WOTC should've refunded everyone's money and restarted the tournament. There is no excuse for such mismanagement. This is supposed to be a PRO level event. Would the WSOP ever do anything as half assed as this? NO. WOTC needs to understand that due to this I can't take them seriously at all. I've never seen a business this big look so unprofessional.

    1. "Everyone here's your money, go home." or "Everyone here's your money, now here's a free tournament starting 8 hours late." are both bad ideas!

    2. I'm also more confident that WSOP would have just told the player who was incorrectly registered to go home. Maybe we should ask less of the software and more of the players?

  6. So, you've got three systems in use, none of which can handle 2000+ individual entries of players for an event, particularly when you ask said database to sort/join winners/losers, or deal with drops in a timely manner?

    This is more of a programming/testing issue with whoever produced the software than it is an issue with SCG/WOTC. When testing, you ALWAYS do min/max testing (i.e. what happens if 10k people show up for this event?). If that wasn't done, bad on the software dev(s).

    Also, make sure your GUI can handle at least basic situations (wrong first name/last name/DCI number/etc), so that issues can be sorted out. If the database blew up because two players had the DCI number of '234567', that's on the development team for not handling errors properly.

    (Disclaimer: I am a professional Software Analyst. I hate crappy software.)

    1. There's nothing like a money-on-line tournament to bring out edge-cases.

  7. It is my understanding that Starcitygames has its own software that to my knowledge is less prone to these types of issues. Can anyone verify the stability of SCG's program? I've also heard WOTC is unwilling to use or allow SCG to use said software for GPs. Is this also true? Anyone know why?
