Module: AG0982A - Creative Research

This blog documents my 3rd year research project at Abertay University. The focus of my research is on video game progression, tutorial design, and how to teach the player. My vision statement could be stated as such:

A game often needs to gradually introduce its mechanics and skills to the player. This needs to be done at such a pace that the player is neither anxious nor bored, and needs to be clear without sacrificing challenge. How can this balance be achieved? To investigate this, I've created a simple puzzle game, and released it to a sample of players. I can use data from their feedback to improve my game.

This issue caught my interest when I noticed that many games do a superb job of gradually teaching a player how to master a complicated system (such as Portal), while many other - often more complicated - games lack comfortable and effective tutoring (such as Crusader Kings II), forcing players to resort to online wikis and YouTube guides.

Wednesday, 20 April 2016

Final Thoughts on Comparing Iterations, and Overall Conclusion

Iteration 1 was created, to the best of my ability, to follow a smooth rise in difficulty, thus achieving flow. Iteration 2 was made in response to data from Iter-1, and was very much a stab in the dark; it opposed some of my initial ideas on how to design level progression, generally moving faster than I thought would be appropriate for many players.

Comparing data from both iterations, I can draw some conclusions about what constitutes appropriate progression design.

Hand-Holding Is Dangerous
One of the obvious features of Iter-1's experience graph is a drop in enjoyment and difficulty between Levels 1 and 2. These levels were reserved to explain - very slowly - two basic interactions in Circle Puzzle to new players: movement, and locked pieces. This was a precaution against players not understanding how to play the game, done under the idea that only one new concept should be introduced at a time. But spending two whole levels to introduce two simple ideas was clearly boring. Meanwhile, when the two levels were combined into the first level of Iter-2, no significant drop in enjoyment was seen, and difficulty did not rise at all.

At the same time, Iter-2's Level 2 seems to have increased in difficulty too quickly. Many players struggled on this level more so than on the following Level 3. I think this is because Iter-2's Level 2 was added as a more complex single-gem puzzle, as I was unable to design any simpler ones. But, clearly, the opening levels of the game demand more attention.

Making Tutorial Stages Fun May Be Impossible
I was disappointed to see lower enjoyment ratings for tutorial levels (Levels 1-2 in Iter-1, Level 1 in Iter-2), but I wasn't surprised. Though a consistently enjoyed game would have been preferable, it's hard to imagine how you can maintain the player's interest while teaching them the mundane functions of the game. The fun in a puzzle game is solving a difficult problem. In tutorial levels, it was important to avoid distracting the learning player, so any difficult problems were avoided. Instead, Levels 1-2 in Iter-1 and Level 1 in Iter-2 were mostly brief tests that required the player to learn the basic mechanics.

It would be interesting, however, to see if more puzzle-like challenges could be worked into these segments of the game, without disturbing the player's learning process.

More Challenge Sometimes Means Less Completion
Completion rates for both iterations can be seen below:

Where most players were able to complete all of Iter-1, only half could finish Iter-2. If the game were commercially released, and assuming that players who didn't finish the game on their first try would never go back to it, half of the game's players wouldn't experience all of its content. Does this indicate a shortcoming, or simply a more challenging game?

Unexpected Reception
One of the main things I've learned from the development of this game is that levels were rarely received as I expected them to be. In Iter-1, I assumed that Level 5 would definitely be perceived as the most difficult. Instead, it was rated to be roughly as difficult as Levels 3 and 4. Level 2 was considered to be easier than Level 1, despite my assumption that the added mechanic and very minor puzzle-solving element would demand more effort from players. This suggests that I need to take more care with how I order my game's progression, and utilize playtests to check for these mistakes.

After practicing the difficult process of structuring a game to achieve flow, I'm more aware of what I need to do to make an engaging game, but I wouldn't say that I've accomplished flow. The game was not phenomenally well-rated; average ratings were between 3 and 4 out of 5 (where 1 was "Not At All" and 5 was "A Lot"). It's not bad, but the game can always be improved.

Though I did see some enjoyment in the playtesters I was watching directly - and though these moments of enjoyment did occur in the circumstances I'd designed for, i.e. on the triumph of beating a puzzle - there's always room for improvement. In the case of Circle Puzzle, I suspect that work could be done in areas besides progression/tutorials. For example, many players progressed through the game without actually understanding how to approach each puzzle. I would ask a victorious player if they understood the significance of move-ordering; they'd indicate that they did not, and that their moves had mostly been guesswork. Additionally, the game's art, UI, and kinaesthetics were quite poor (not being the focus of my work), and I felt that the lack of interactive feedback/game-feel harmed the game's enjoyment.

I look forward to employing flow, pacing, and skill-gates in my future games. Furthermore, the language I've learned to refer to these concepts will serve well in communicating with other game designers.

Tuesday, 19 April 2016

Presentation 3 Feedback, and Traditional vs. Ongoing Tutorials

Today I gave my final presentation on my research. This presentation summarized the issue I chose to approach, what I produced in response to it, and the data I gathered.

In my feedback, it was pointed out that my research has largely ended up focusing on difficulty progression (how difficulty changes throughout the game) rather than tutorials (the early stages of the game that focus on teaching the player), and it was suggested that I could change my vision statement in response to this outcome. I think this is a good point, but I'm not sure I entirely agree.

Earlier video games often kept their tutorials in carefully defined sections of the game - at the start, or whenever the player needs to learn a new control. Though usually effective at teaching the player, these tutorials can be notoriously boring, and detract from the player's enjoyment/immersion. Many games - in particular those that heavily involve learning throughout the game's progression, such as Portal - take a more seamless approach. In Portal, there is no clear line between an initial tutorial and the game that follows. Instead, each level could be said to deliver elements of a tutorial, and each level is very much some kind of puzzle. Thus, no part of Portal trades out actual gameplay in favour of tutoring the player, yet the game teaches players how to move, use portals, and solve momentum puzzles. It may seem that the game lacks a tutorial, but in fact, most of Portal is the tutorial. It approaches difficulty progression and information delivery all at once, step by step.

This style of ongoing tutorial isn't always possible or appropriate, but it is seeing wider use in modern games. Examples include Company of Heroes 2, Skyrim, and Batman: Arkham Asylum, in which elements are introduced gradually and seamlessly, without violating the player's immersion. Even after the initial 'tutorial' section is completed, new challenges are gradually handed to the player, such that the pattern of teach-test-repeat doesn't disappear for at least a few hours. Even when more obvious tutoring has ceased - where the game no longer instructs the player through explicit dialogue or suggestion - subtle learning/teaching processes are underway. Long after a player has been introduced to Skyrim's combat, exploring, and looting, the game will still place them against increasingly difficult enemies, requiring the development of new strategies which - in the best of cases - are suggested to the player through a level's design.

This is all to say that, to a degree, difficulty progression can be considered an extension of tutorial design. I've been aware of this from the start of my research, and wouldn't say that my intentions have really changed, which is why I'm hesitant to change my vision statement. Furthermore, if we view tutorials solely as the process of explicitly teaching a player the basic functions of the game - controls and objectives - there doesn't seem to be much left to study. Games work best when the player is taught covertly, without breaking immersion, and there are several impressive tricks in wide use to accomplish this. I'm sure there's more to learn in this particular area, but the richer questions seem to lie in how teaching continues throughout a game's progression.

However, it may be worthwhile to stop describing my chosen area of study as 'tutorial design', where 'progression design' might be more fitting.

Sunday, 17 April 2016

No Correlation Between Difficulty and Enjoyment

While gathering the results of my research, I noticed an apparent proportionality between difficulty and enjoyment in Iteration 1's experience graph:

Difficulty and enjoyment seem to rise and fall together through levels 1, 2, and 3.

If there is a correlation in Iteration 2, it seems to be less clear:

Intrigued by this, I tried creating scatter graphs of individual players' enjoyment ratings vs difficulty ratings. If there were a strong positive correlation between them, I'd see an upwards-rightwards trend in the points.

However, no such trend is immediately apparent. Though a slight positive trend can be seen in Iteration 1's scatter graph - where an empty space exists around areas with high difficulty and low enjoyment, or vice versa - this trend is still challenged by a few outliers (a rating of 1 difficulty but 3 enjoyment to the far left, or 4 difficulty but 2 enjoyment to the right). Iteration 2's graph is even harder to make sense of, with ratings (likely by chance) not landing in the 3,3 centre of the scatter cluster, forming a ring shape. Here, it seems, it would be just as easy to draw a positive trend between difficulty and enjoyment as a negative one.
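Rather than eyeballing the scatter graphs, the strength of any trend can be quantified with a correlation coefficient. A minimal sketch, using made-up (difficulty, enjoyment) pairs in place of my actual survey data:

```python
# Pearson correlation between difficulty and enjoyment ratings.
# The rating pairs below are illustrative stand-ins, NOT my real survey data.

def pearson_r(xs, ys):
    """Pearson correlation coefficient: +1 = perfect positive trend,
    -1 = perfect negative trend, near 0 = no linear relationship."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# One (difficulty, enjoyment) pair per player per level, on the 1-5 scale.
difficulty = [1, 2, 3, 3, 4, 4, 5, 2]
enjoyment  = [3, 2, 3, 4, 2, 4, 4, 3]

r = pearson_r(difficulty, enjoyment)
print(f"r = {r:.2f}")
```

Running the same calculation separately on the per-player ratings and on the per-level averages would make the "individual vs collective" comparison below precise.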

The experience curve values were based on averages, so I've also tried to create a scatter graph using those values.

Here, we can see a very clear correlation. Enjoyment increases with difficulty. Why does this trend appear when considering my players collectively, but less so individually? What does this mean? Is it even a valid observation?

Apparently, individual players show very little correlation between difficulty and enjoyment. A player may consider an easy game to be very fun, and a difficult game to be not so fun. But, when these results are compiled into an average, a clear correlation emerges, such that enjoyment tends to increase with difficulty on average, between all players.

I'm wary of this being due to an unforeseen statistical quirk - similar to how rolling two dice tends to produce a sum of 7 more than any other number. I don't know much about statistics, and wouldn't know how to identify this kind of mistake. Alternatively, maybe my sample of testers is just too small to draw any useful conclusions.
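The dice example, at least, is easy to verify by enumeration: of the 36 equally likely outcomes, the sum 7 occurs more often than any other.

```python
from collections import Counter

# Enumerate every outcome of rolling two six-sided dice and count the sums.
sums = Counter(a + b for a in range(1, 7) for b in range(1, 7))

most_common_sum, count = sums.most_common(1)[0]
print(most_common_sum, count)  # 7 occurs in 6 of the 36 outcomes
```

The point of the analogy: an apparent pattern in aggregated numbers can be a property of the aggregation itself, not of the thing being measured.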

If the correlation is valid, however, then maybe the graph of Average Enjoyment vs Difficulty Ratings averages out individual deviations to reveal a genuine underlying relationship. In this case, the trend would be showing what I suspected: that more difficult challenges are more thoroughly enjoyed by the player, and that too much hand-holding damages the game's experience.

But, unless I can confirm this concept, I'm not in a position to draw this conclusion.

Finding Design Problems with OAP

A few weeks ago, I set to work on integrating my classmate's Unity analytics tool (Oliver's Analytics Package, or OAP) into my Circle Puzzle game. The tool works by tracking specified objects in a game level - in my case, the puzzle pieces with gems on them - uploading that data to a database, and downloading it for replaying at will. This allows me to see how players have played my level.

Though OAP did have some barriers to its usability, I was able to use it to make inferences about how well designed my game was. For example, using OAP, I determined that 3/8 of my players were solving Iteration 2's 3rd level (a puzzle I'll now call Simple Primer) in a way that removed nearly all of its difficulty. This occurred because, as their first move, players moved a piece that I didn't expect them to move. The challenging part of this puzzle is working out which order the red and green gems need to be deployed in, but solving the puzzle in this unexpected way destroys that challenge.
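Once replay data like OAP's is exported, checking for this kind of shortcut is simple. A sketch, assuming a hypothetical per-session record of which pieces were moved, in order (the piece names and data layout are my invention, not OAP's):

```python
# Count sessions whose FIRST move touches a piece the design doesn't expect.
# The session data below is a made-up stand-in for exported replay logs;
# "moves" lists piece names in the order the player moved them.

UNEXPECTED_FIRST_PIECES = {"outer_ring_piece"}  # hypothetical piece name

sessions = [
    {"player": "P1", "moves": ["outer_ring_piece", "red_gem", "green_gem"]},
    {"player": "P2", "moves": ["red_gem", "green_gem"]},
    {"player": "P3", "moves": ["outer_ring_piece", "green_gem"]},
]

shortcut_players = [
    s["player"] for s in sessions
    if s["moves"] and s["moves"][0] in UNEXPECTED_FIRST_PIECES
]
print(shortcut_players)  # players who bypassed the intended ordering
```

In practice, I did the equivalent of this by watching OAP's playbacks by eye.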

(Simple Primer - Iteration 2 Level 3)

I looked up the survey ratings for these three corresponding players, to see if there were any major differences in how they rated this level ...

Average enjoyment of Simple Primer (out of 5): 3.375
P1 enjoyment: 4
P2 enjoyment: 4
P3 enjoyment: 2
Average of P1-3: 3.333

Average difficulty of Simple Primer (out of 5): 2.75
P1 difficulty: 2
P2 difficulty: 4
P3 difficulty: 1
Average of P1-3: 2.333
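The subgroup figures above are straightforward averages of the three players' raw ratings; spelling out the arithmetic:

```python
# Cross-checking the quoted subgroup averages against the raw ratings
# of the three players who found the unintended solution.
p_enjoyment = [4, 4, 2]   # P1, P2, P3 enjoyment ratings
p_difficulty = [2, 4, 1]  # P1, P2, P3 difficulty ratings

avg_enjoyment = sum(p_enjoyment) / len(p_enjoyment)
avg_difficulty = sum(p_difficulty) / len(p_difficulty)

print(round(avg_enjoyment, 3))   # 3.333
print(round(avg_difficulty, 3))  # 2.333
```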

So neither average enjoyment nor average difficulty rating dropped below the overall average by a significant amount, indicating that this accidental strategy did not cause any lapse in the players' enjoyment, when compared to the ratings of other players.

Why was this the case? I would expect players who found an easy solution to one of the puzzles to rate it as less difficult, and less enjoyable, than those who didn't.

Saturday, 16 April 2016

Comparing Data from Iterations 1 and 2

With the second load of data through from my survey, I'm ready to start analyzing my findings to inform my research.

I created two versions of the same game. In the first, I ordered the levels to the best of my ability, designing for a steady progression of difficulty that would gradually introduce the game's mechanics, and the strategies that players would need to overcome them. I then sent this version of the game out to playtesters, who provided feedback on each level in the form of survey ratings. Of particular interest to me were the 'difficulty' and 'fun' ratings of each level.

The second version was made in response to these results. I found that players weren't enjoying the first two levels at all, and that many were rating the second one as less difficult than the first. I also found that the third level was rated as the most enjoyable, and the fourth as the most difficult. Ideally, the game would have been enjoyed throughout its duration, and the final level would have been the most difficult.

Iteration 1 Experience Graph

For the sake of research, I decided to make drastic changes to the game, creating significantly more difficult levels and introducing them more quickly. This opposed my original views on progression design - I had been led to believe that a game needs to ensure that players are very comfortable with its progression long before introducing any challenge. But I wanted to push the limits of what I thought was acceptable game design, and see the change in results.

Below are the results from my second iteration.

Iteration 2 Experience Graph

Achieving a "Tense and Release" Oscillation
Keep in mind that I was aiming to generate a single 'crest' of pace oscillation: introducing a new type of challenge, building up, and reaching a climax. If the game were longer, I'd be able to induce multiple oscillations, and the overall difficulty curve might look like this:

Ideally, the difficulty curve would show a steady rise. But, as can be seen in my two experience graphs, this was not quite the case.
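For concreteness, the kind of curve I have in mind - a steady upward trend with a "tense and release" oscillation layered on top - is easy to sketch numerically. The slope, amplitude, and period below are illustrative guesses, not values derived from my data:

```python
import math

def ideal_difficulty(level, slope=0.5, amplitude=0.8, period=5):
    """Idealised difficulty at a given level: a rising trend plus a
    periodic tension/release oscillation. All parameters illustrative."""
    trend = slope * level
    oscillation = amplitude * math.sin(2 * math.pi * level / period)
    return trend + oscillation

# Difficulty values for a hypothetical 10-level game.
curve = [round(ideal_difficulty(n), 2) for n in range(1, 11)]
print(curve)
```

Each period of the sine term corresponds to one teach-build-climax crest; my five-level iterations only had room for one.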

It's unclear which, if either, of the two iterations comes close to accomplishing this curve. It seems clear, however, that Iteration 2's Level 3 is easier than Level 2.

(Iteration 2: Level 2)
(Iteration 2: Level 3)

I placed the two puzzles in this order because:
  • Level 2 elaborates on puzzles using 1 colour of gem.
  • Level 3 introduces a puzzle with two gems.
Clearly, this was a misguided decision. It may be best to introduce complexities one at a time, such that 2-gem puzzles aren't touched until the player has mastered 1-gem puzzles. But, for whatever reason, Level 2 was perceived as much more difficult than Level 3. With this in mind, I'd consider swapping the two puzzles around. But I'd be more inclined to scrap both puzzles altogether; Level 2 is too difficult for a 2nd puzzle, and calls for strategies that have not yet been developed in the player, while Level 3 introduces two-gem puzzles too easily. That said, since it may be best to introduce 2-gem puzzles very carefully, the transition to these puzzles may merit a pacing-oscillation of their own, such that:
  • Levels 1 - 5 are entirely about single-gem puzzles.
  • Levels 6 - 10 introduce multiple-gems, where Level 6 is comparatively simple.
This did occur to me during development, but I decided against it for two reasons:
  1. Creating interesting puzzles with just one gem is actually quite difficult. Generally, it's easy to find the solution to these puzzles without much effort, since you don't need to worry about which order the gems need to move in.
  2. Creating 10 puzzles was beyond my time limit.
Apparent Difficulty
Notice, also, that Iteration 1's (now referred to as Iter-1) Level 5 is the same puzzle as Iteration 2's (Iter-2) Level 4:

... which I'll now refer to as the "Complex Primer" puzzle. Iter-1's difficulty rating for this level averages 3.2; Iter-2's averages 4.

Meanwhile, enjoyment ratings for this level are 3.2 for Iter-1, 3.75 for Iter-2.

Complex Primer was both apparently more difficult (averaging at 4 - "Difficult" - on the scale) and better enjoyed in Iter-2 than Iter-1, despite being the same puzzle.

Why is this? I can think of two possible reasons. The first is that Complex Primer - though one of the more difficult puzzles - might not be difficult enough to stand as the finale puzzle following 4 other puzzles, each raising the player's skill and understanding of the game. In Iter-1, by the time the player reached Complex Primer, it may have been below their capabilities, thus steering them too close to the "Boredom" section of the Flow graph. Levels 1 - 4 - introducing concepts of order, movement, patterns, and interface - may have prepared the player too well for the final trial, making it less of a challenge and ruining the oscillation's crest. In Iter-2, meanwhile, it was only the 4th puzzle, and remained challenging to the slightly less practised player.

Alternatively, it could be that Complex Primer is judged in the context of its previous puzzle. In Iter-1, this was Level 4 (hereby called "Intersection Primer"):

Intersection Primer was rated as the most difficult puzzle of Iter-1. I still can't explain why this is the case. The puzzle only reiterates what was taught in Iter-1's 3rd puzzle (the concept of order-priming), and I assumed it would actually be quite boring for players. Its rating, however, is only 0.25 above those of the puzzles before and after it.

Nonetheless, here is a puzzle perceived as quite difficult by the average player. The following puzzle - Complex Primer - is consequently rated as slightly easier (again, only by 0.25 out of 5). Similarly, in Iter-2, Complex Primer follows a puzzle with a lower rating; this puzzle being Level 3 in Iter-2's experience graph. Are these ratings partially informed by comparison with the previous level?

The bottom line is that difficulty is not just the product of a level's design in isolation. Difficulty is perceived in the context of the level's order in the game, and the player's experience. A level plucked from the third section of a game - complete with new mechanics and complex puzzles - and placed immediately after the game's interface tutorial will be perceived as unfairly difficult. Left in its correct place, however, any players reaching it will be experienced enough to describe it as "only slightly difficult". The same level can be perceived quite differently, due to changes in level order.

Final Tweaks with OAP Implementation

Over the course of my usage, my classmate Oliver Smith has been making a few tweaks to the Oliver's Analytics Package, which I'm using to gather data on how players play my game.

Some of the changes have been driven by my feedback. For example, my game has 5 levels, each of which needed its own data-gathering game object with settings that had to be changed constantly. I told Oliver that having all of these objects work from a single game object with one set of settings would be an enormous improvement. He managed to get this functionality working with an XML script to store my upload settings. As a result, OAP can now be controlled from just one game object (in my first level):

This might mean that I can now attach the TrackingScript.cs component, which marks the puzzle pieces that I actually want to track (i.e. my gem pieces), onto the prefab, but I still don't trust Unity's prefab handling, so I'll stick to having TrackingScript.cs attached to each instance separately, since it's only about 8 objects.

Oliver's Analytics Package also assigns a random name to each player as an identifier. This is used to differentiate data on the database, and to identify play sessions when replaying. However, this causes a problem. Players will play the game - giving me a set of movement data - and then fill out a form - giving me some feedback. It's important that I can link both of these sets of data between play sessions. For example, if I know that one player said that they kept getting stuck on Level 4, I need to be able to find their corresponding play session to take a look at where they were getting stuck. This session, of course, has a randomly generated name.

I can link a session's ID to a survey, however, by using pre-filled surveys. Using Unity's Application.OpenURL() function, I can open my survey's URL, which includes a query parameter that pre-fills the "Player ID" field in the survey. If I pass the randomly generated PlayerID into this parameter, I can find a player's survey answers and link them to their movement data.
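The link itself is just the survey URL with the ID appended as a query parameter. A sketch of the string-building step (the base URL and field key are placeholders, not my real survey's identifiers; in the game, the resulting string is what gets passed to Application.OpenURL()):

```python
from urllib.parse import urlencode

# Build a pre-filled survey link for a given player ID.
# "https://example.com/survey" and "PlayerID" are placeholder values;
# a real form would use its own URL and pre-fill field key.

def prefilled_survey_url(player_id,
                         base_url="https://example.com/survey",
                         field_key="PlayerID"):
    # urlencode handles escaping, so IDs with odd characters stay valid.
    return f"{base_url}?{urlencode({field_key: player_id})}"

url = prefilled_survey_url("a1b2c3")
print(url)
```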

Implementing OAP: Using My Classmate's Analytics Tool

When I started this project, I was hoping to put some focus on using analytics and user metrics to gather solid feedback on my puzzle game. This would include scripts to track how long it takes a player to complete a level, when certain events occur, and possibly how they move their puzzle pieces. I haven't had time for this, but a friend and classmate of mine - Oliver Smith - has made it the focus of his research. The blog explaining his research is available here.

Oliver's Unity package - which he calls Oliver's Analytics Package (OAP) - contains functionality for recording the movement of an object, storing it in an SQL server, and retrieving that data in the Unity editor for viewing and playback. You can watch an object move around the scene as it did when the player was controlling it. This kind of functionality is very useful for game designers, as it allows them to review how players play their games at a glance. Similar tools have been in the arsenal of Bungie Studios and other shooter developers for about a decade now, as seen in their famous heatmaps:

OAP can't compile heatmaps (yet) but it can show me an exact playback of how my players solve my levels. This allows me to review common strategies, common misunderstandings, and any outlying annoyances. It may be very useful for a puzzle game, such as the one I've developed. Additionally, the feedback I gather on how easy it is to use the tool should be useful for Oliver's research.

I can't use the tool without implementing it. Since OAP is custom-made and still in an early stage of development, integrating it into your game isn't a straightforward task. Oliver provided me with a ReadMe file, and communicated with me as I set the tool up to gather data from my game. The process ended up taking about a day of work.

Three scripts are at play when using OAP:
- AnConnectScript.cs: The central hub for OAP features. This is where networking settings are applied, and where playback of player movement can be accessed.
- TrackerInfo.cs: This just keeps track of the UserID (UID) and SessionID (SID) for database storage purposes.
- TrackingScript.cs: This is attached to whichever objects the developer wants to track when players are testing the game. In a platformer, this would be the character, but in my case this is the gem puzzle pieces. It gets all of its settings from AnConnectScript.cs.
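As I understand the division of labour above, each tracked object ultimately produces a stream of timestamped position samples keyed by UID and SID. A sketch of my mental model of that record shape - a guess, not OAP's actual schema:

```python
from dataclasses import dataclass

# My guess at the shape of the data OAP records per tracked object: each
# sample ties a position at a timestamp to a user/session pair. This is a
# mental model of the tool, not its actual database schema.

@dataclass
class TrackedSample:
    uid: str          # UserID, as held by TrackerInfo.cs
    sid: str          # SessionID for this playthrough
    object_name: str  # e.g. a gem puzzle piece
    time: float       # seconds since level start
    x: float
    y: float

def replay_order(samples):
    """Sort samples into playback order, as an editor-side viewer would."""
    return sorted(samples, key=lambda s: s.time)

samples = [
    TrackedSample("u1", "s1", "red_gem", 2.0, 1.0, 0.5),
    TrackedSample("u1", "s1", "red_gem", 0.5, 0.0, 0.0),
]
print([s.time for s in replay_order(samples)])  # samples in time order
```

Filtering such samples by SID is what lets a single play session be replayed on its own.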

A few minor usability issues plague Oliver's analytics tool. For example, a line of code needs to be commented or uncommented, depending on which version of Unity you're using. And rather than just having to click one button to download and view data, you have to click several. These, however, are minor inconveniences.

But a major flaw of OAP is how it doesn't play nice with Unity's prefabs. A prefab is a blueprint-like game object stored in Unity's assets, which can be copied into the scene at any time. You use prefabs for any kind of object that needs to be copied in and out of the scene: enemies, pickups, puzzle pieces. Sometimes you want changing a value in the parent prefab to immediately change the values in all of its in-game copies, such as the damage of an enemy. Sometimes you want each individual copy to keep its own values; maybe each enemy needs its own hairstyle. Unity is sometimes pretty good at guessing when you want values to change in unison, and when you want them set individually. With Oliver's tool, and specifically with the AnConnectScript.cs object, Unity is terrible at this.

As a result, where prefabs would normally be a blessing, here I can't use them. Instead, I need to place a copy of AnConnectScript.cs in each of my game levels, and whenever I want to change a setting, I have to apply the change to each of them separately. With six levels, this is error-prone but doable. With more, it'd start to become a nightmare. So my main request to Oliver has been to find a way around this.

Nonetheless, OAP has been successfully integrated. Next time I send out an iteration of my game for playtesters, I'll be able to review their movement patterns.

(Above: The trails represent the movement of the puzzle pieces recorded from a previous session, where red is the start and blue is the end. The blue gems are what I'm currently using to show where the gem slices were during playback.)