Difficulty balance in custom 'Quake' maps

When it comes to the architecturally superb “Func Map Jam 9 - Contract Revoked”, I played it on the “Normal” difficulty level, and from this standpoint I would call some combat setups simply unfair. On the other hand, we are all aware that within an episode or a campaign there is a certain escalation of difficulty, as well as of the protagonist’s power; things build up to a final fight, which should be difficult. In a jam, every map can be a final-fight map; nobody is saying otherwise.

Jams, by nature, tend to be somewhat random. With “Func Map Jam 9 - Contract Revoked”, the artistic spectrum was very coherent, but the jam nature of the event showed in the realm mentioned: the difficulty level was all over the place, as you said.

Some of the blame could perhaps be put on a lack of experience and an overt fascination with the ‘Quoth’ novelties, leading mappers to underestimate the impact the mod would have on the difficulty scale - but you decide whether this argument is all that convincing, since we are already speaking of a second ‘Quoth’ iteration.

For the difficulty level measure, I think it should be something of a mathematical nature; certain sums of relevant parts need to balance out. Understandably, this is not a bulletproof solution - there are also environmental factors and other elements of a more qualitative than quantitative nature, although maybe even these could be given an estimated value. Anyhow, I do not really know how to approach the problem, where to begin, or what to suggest. It takes experienced mappers for that kind of job, and I do not really believe many people would want to restrict themselves this way. The other question is: is it worth the human energy it would take to carry out?

For the general project outline, we would need to gather all relevant parts that influence combat or otherwise matter in combat, and assume that a map divides into sections, with the balance measured for each section independently. I see a map section as a space where the protagonist may roam openly - even if “openly” means just back and forth - possibly to replenish resources, even in the midst of ongoing combat. Therefore, if a combat setup shuts off all escape routes, it counts as a section. Some larger sections could understandably be broken down into sub-sections for strategic resource allocation.

Speculatively speaking, there is no other way but to simply start with a draft - even a silly one - in order to see what it tells us and whether it makes sense against experience, and then perhaps ask in what direction the alterations need to go next.

Yeah, I was thinking about this, that some kind of objective valuation would be helpful, so here’s a simple draft that makes use of weighted averages. We’ll construct a rough formula that takes into consideration some key quantitative elements that play into the difficulty level of a map, disregarding qualitative elements such as architecture and strategic monster placement:

Monster Count (the more the monsters, the higher the difficulty):

M = ΣwᵢMᵢ,

where the values Mᵢ represent the number of monsters of different types and wᵢ are their weights (the tougher the monster, the higher the weight);

Ammo Count (the less ammo, the higher the difficulty):

A = Σw’ᵢAᵢ,

where the values Aᵢ represent how much of each ammo type there is on the map and w’ᵢ are their weights (Cells > Rockets > Nails > Shells), provided of course that if there is ammo of a certain type on a map then at least one of the corresponding weapons is also present;

Health Count (the less health, the higher the difficulty):

H = Σhᵢ,

where hᵢ are the individual health boxes found on the map, measured in hp (health points);

Armor Count (the less armor, the higher the difficulty):

P = Σpᵢ,

where pᵢ are the individual armors found on the map, measured in armor points;

Distance Between Monsters (the smaller the distance between monsters, the higher the difficulty), weighted with their toughness:

D = Σwᵢwⱼd(i, j) / Cₙ² = 2Σwᵢwⱼd(i, j) / (n(n-1)),

where n is the number of monsters on the map, the sum runs over all pairs i < j, Cₙ² = n(n-1)/2 is the number of such pairs, and d(i, j) is the distance between monsters i and j if they have different coordinates; if i≠j but the monsters share the same coordinates, i.e. one is set to teleport into the other’s spot, d(i, j) is instead the distance between one of the monsters and its trigger, or the player at the time of its teleportation;

Then our formula for the skill level of the map would be:

S = M / (D(A+H+P)).

Computing this for the original ID maps would allow us to assign numerical values to the Easy, Normal, and Hard categories, and weigh any custom map against those.
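
A minimal Python sketch of the draft formula above, just to make the arithmetic concrete. The weights and the monster and ammo names are made-up placeholders rather than tuned values, the special case for teleporting monsters in d(i, j) is left out, and S is read here as M / (D·(A+H+P)).

```python
from itertools import combinations
from math import dist

# Hypothetical weights, chosen only for illustration.
MONSTER_WEIGHTS = {"grunt": 1.0, "ogre": 3.0, "shambler": 8.0}
AMMO_WEIGHTS = {"shells": 1.0, "nails": 1.5, "rockets": 3.0, "cells": 4.0}


def monster_score(monsters):
    """M: weighted monster count; monsters is a list of (type, (x, y, z))."""
    return sum(MONSTER_WEIGHTS[kind] for kind, _ in monsters)


def ammo_score(ammo):
    """A: weighted ammo count; ammo maps an ammo type to an amount."""
    return sum(AMMO_WEIGHTS[kind] * amount for kind, amount in ammo.items())


def mean_weighted_distance(monsters):
    """D: sum of w_i * w_j * d(i, j) over all pairs, divided by C(n, 2)."""
    pairs = list(combinations(monsters, 2))
    if not pairs:
        raise ValueError("need at least two monsters")
    total = sum(
        MONSTER_WEIGHTS[a] * MONSTER_WEIGHTS[b] * dist(pos_a, pos_b)
        for (a, pos_a), (b, pos_b) in pairs
    )
    return total / len(pairs)


def skill(monsters, ammo, health, armor):
    """S = M / (D * (A + H + P)); health and armor are lists of point values."""
    m = monster_score(monsters)
    d = mean_weighted_distance(monsters)
    resources = ammo_score(ammo) + sum(health) + sum(armor)
    return m / (d * resources)
```

For example, `skill([("grunt", (0, 0, 0)), ("ogre", (3, 4, 0))], {"shells": 20}, [25, 15], [100])` evaluates one tiny encounter. The absolute number means little on its own; it only becomes useful when compared against the same computation on reference maps.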

Notes:

• it might be worthwhile to convert ammo and armor to hp and compute it that way; monsters could also be converted that way by taking their weights to match their health, however, there’s a difference between killing an enforcer and killing a spawn with four shotgun shots;

• in order to compute the distance between monsters one would of course have to have access to their 3D coordinates;

• the formula could be tweaked such that S = 1 means Normal.

Correction: in the formula above for D, the expression wᵢwⱼ should be replaced with (1-wᵢ)(1-wⱼ), provided that the weights are normalized, or generally with (w-wᵢ)(w-wⱼ), where w = Σwᵢ, in order for the difficulty in the final formula to increase with the toughness of the monsters. Alternatively, one could keep wᵢwⱼ, have the inverse of the distance as defined above be in the formula for D instead, and then take

S = M•D / (A+H+P).

There are many ways to go about it.
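
The inverse-distance alternative could be sketched like this in Python. The function names and the `weights` mapping are assumptions for illustration, and the sketch assumes every d(i, j) is strictly positive (the thread handles coincident monsters through the trigger distance, which is not modeled here).

```python
from itertools import combinations
from math import dist


def mean_weighted_proximity(monsters, weights):
    """Mean over all monster pairs of w_i * w_j / d(i, j).

    monsters is a list of (type, (x, y, z)); weights maps type -> weight.
    """
    pairs = list(combinations(monsters, 2))
    if not pairs:
        raise ValueError("need at least two monsters")
    total = sum(
        weights[a] * weights[b] / dist(pos_a, pos_b)
        for (a, pos_a), (b, pos_b) in pairs
    )
    return total / len(pairs)


def skill_inverse(m, proximity, resources):
    """S = M * D / (A + H + P), with D taken as the mean weighted proximity."""
    return m * proximity / resources
```

With this variant, tougher monsters and shorter distances both push S upward, which is the behaviour the correction is after.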

@‘I Like Quake’, it looks very impressive.

Perhaps in some future ‘Trenchbroom’ beta, sections - as I proposed in the opening post - could be made a mapping utility, and the program would compute the difficulty balance value for the mapper automatically, based on the formulations you proposed, within the scope of each section. Unless you propose that the difficulty balance always be computed for an entire map, which could be the default if no sections are chosen.

I would also suggest that the size of a section be a variable itself - the smaller the section, the higher its difficulty evaluation. I am uncertain, though, whether one could assume the same for the size of an entire map.

It is reasonable to refer to the ‘id Software’ maps as the ideal, but I do think there are more maps out there on ‘Quaddicted’ that have good difficulty balance. In my opinion, “https://www.quaddicted.com/reviews/rubicon2.html” is one such set.

Perhaps you know some more?

@triple_agent

That’s what I was thinking, incorporate it into Trenchbroom or some other tool, so as the mapper is mapping they can see the difficulty level vary in real time. Anyway, that formula needs a lot of work to actually be implemented. I only sketched something off the top of my head.

The problem that you’re talking about, with sections, I’ve noticed it too. One can have, for example’s sake, a map composed of two areas far away from each other, each packed with monsters. How would that play into a valuation? One could valuate both areas individually and then average out the results or evaluate globally directly. The first approach would be more adequate if the space between the two areas can’t be used while fighting (for example to retreat or duck) as the player would more or less be locked in the respective areas, while the second approach would be more adequate if the space between the areas can be used while fighting because it would take into consideration the distance between the two areas in computing the distance between the monsters therein. The problem is: how does one codify what an area of the map is? One would have to use a graph or something, for example with the monsters as vertices, two vertices being connected if the line from one monster to the other doesn’t intersect a wall or some unnavigable barrier. Then, an area of the map would translate to a connected component of the graph. But then we’d be getting into qualitative aspects.
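
A minimal sketch of the graph idea above, in Python: monsters are vertices, an edge means an unobstructed line between two monsters, and each connected component is treated as one area. The `visible` predicate is a hypothetical stand-in; a real one would have to test the line against the map geometry.

```python
def areas(n, visible):
    """Group monster indices 0..n-1 into connected components.

    visible(i, j) should return True when nothing blocks the line
    between monsters i and j.
    """
    seen = set()
    components = []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = []
        while stack:
            i = stack.pop()
            component.append(i)
            for j in range(n):
                if j not in seen and visible(i, j):
                    seen.add(j)
                    stack.append(j)
        components.append(sorted(component))
    return components
```

Each returned component could then be valuated on its own, which corresponds to the first approach described above.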

The only reason I referred to the original ID maps is that their difficulty levels are taken as the baseline. However, here too we need some differentiation because the episodes are not equally difficult; E4 is more difficult than E1, for example.

I guess one could have an absolute valuation, based on the layout of the map alone, and a relative valuation, where the skill level is computed relative to that of a set sample of maps (e.g. the ID pack).
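
One possible reading of the relative valuation, sketched in Python: divide a map's absolute score by the mean absolute score of a reference sample, so that a value of 1 means "as hard as the sample on average". The function name and the choice of the mean as the reference are assumptions.

```python
from statistics import mean


def relative_skill(absolute, baseline):
    """Absolute skill divided by the mean absolute skill of the baseline maps."""
    reference = mean(baseline)
    if reference <= 0:
        raise ValueError("baseline scores must have a positive mean")
    return absolute / reference
```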

A lot can be done here.

Referencing the ‘id Software’ maps is understandable and mandatory in difficulty level evaluation, but I would rather lean towards the more lenient episodes and maps than those requiring more game practice. Therefore, perhaps the early episodes of the original ‘Quake’ should serve as a better reference than the later ones, speaking of the “Normal” difficulty level expectation.

The case is complicated enough just the way it is. Overdoing it is not going to help. Aiming to build a system that negates the existence of a human factor is self-defeating. The difficulty level estimation algorithm is a guideline for the mapper, not a rule. We all know that. The mapper may still decide to step beyond what the guideline says; ultimately, it is only a tool in his toolbox.

As for the sections, they do not have to connect - they are just areas where the mapmaking tool, such as “Trenchbroom”, would calculate the difficulty balance evaluation. Some sections can be empty of enemies, and if there are resources within them - placed for logistical reasons, to minimize long backtracking - their difficulty balance would be lower than easy. There is no need for such an evaluation. Evaluation is needed only where there is a risk of overdoing or underdoing something sensitive.

Too many elements taken at once - especially elements that do not directly interact - can blur or even undermine the adequacy of the final result. A global average is an abstraction. True bottlenecks and problems mostly occur locally, which is why I propose the sections tool. The sections, the way I see it, are simply “ghost” areas marked out by a mapper to request a difficulty balance evaluation. Making the size of the section a variable is meant to stand in for the qualitative nature of the setup but, as you said, this is slippery ground - the mapper would have to follow very strict design principles for it to make sense.

@triple_agent

Well, absolute valuations don’t depend on any maps for reference and relative valuations can be done with respect to any maps or map sets, so, ID reference, even though understandable, is not mandatory.

A valuation tool would just be an indicator, showing a mapper how difficult their map is, or sections of their map, yes. It wouldn’t limit the mapper in any way. Now if a mapper wanted to make a map with a certain difficulty they might choose to operate within certain constraints, but that would be by their own choice. They might even choose to design different areas of the map to different difficulty levels and use the tool to evaluate them.

But if one wants to design a valuation tool that assigns a single number to a map to represent its overall difficulty level, and if one chooses the qualitative approach, then one needs to implement a function that combines the valuations obtained for the different map sectors into a single number, and in computing this number the relationship between the parcels is relevant. The above was an example where I showed how whether the areas are connected or not affects the overall skill level. And this is what the skill level of a map is: a number that reflects the difficulty of the map in its entirety. If one divides it into parcels, one would need to combine the results into a single number, and factoring in the relationship between the parcels, even though optional, would increase the faithfulness of the indicator. If, to take your example, there is an area on the map with goodies but no monsters, that would decrease the overall difficulty, because the player can restock and replenish their health, and even more so if the player can use that area for cover. So, it should be taken into consideration in the qualitative approach to computing the skill level of the map. However, there is also the cruder quantitative approach, which doesn’t take into consideration the relationship between parcels of the map. The example I provided only takes into consideration ammo, health, armor, monsters and the average distance between them (a related measure would be, I guess, “monster density”: the (weighted) sum of monsters divided by the navigable area / volume of the map).

So, a global tool wouldn’t conflict with a local tool. It’d be the same tool, applied by mappers locally (as a design aid), and globally (as a skill level indicator, a number that would go with the map).

For the global measure of map difficulty, what about a case where the majority of a map follows the “id” baseline standard, but suddenly there is a bottleneck with an “id++” setup? Should the map be labeled as “id”, “id+” or “id++”? In my opinion, the map difficulty label should always follow the section with the highest value in this regard - not the average. Consistency is another matter.

I am rather against a “mechanical” estimation of global map difficulty level, even though it would look neat on ‘Quaddicted’ to be able to search maps by their “objective” gameplay difficulty. In the end, this is how it should work, but it does open up a major field for complaints.

I would also suggest implementing a variable for resource density, measuring the distance between collectible items. I assume that the more spread out the collectibles are, the harder the difficulty level, and the more clumped together they are, the easier.

Well, the global measure is an average. Let’s take a numerical example, since a formula would output numbers. Labels would come later.
So, let’s say for simplicity’s sake that a map has two areas, A1 with skill 3, and A2 with skill 7 (I’m going with random numbers here). Abstracting any other factors, the skill of the map would be 5. A skill 7 would correspond to both areas having skill 7, or maybe one having skill 5 and the other skill 9, etc… Taking the section with the highest skill as the map’s skill would not reflect its overall skill, just like taking the maximum of a function over an interval does not reflect its mean value over that interval. However, the author could add in the accompanying .txt file that the map contains sections of this or that skill level. There can also be a feature to the tool to compute the maximum skill level of the map (the skill level of the hardest part of the map). So, the map could come accompanied by two numbers, the mean, and the max — which shows how much the difficulty spikes on the map. That would give a more nuanced description of the map difficulty-wise.
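
The two-number summary described above is easy to sketch in Python. This assumes every section counts equally toward the mean; weighting sections by size or play time would be a separate design decision.

```python
from statistics import mean


def summarize(section_skills):
    """Return the mean and the maximum skill over a map's sections."""
    if not section_skills:
        raise ValueError("need at least one section")
    return {"mean": mean(section_skills), "max": max(section_skills)}
```

For the example with areas of skill 3 and 7, `summarize([3, 7])` gives a mean of 5 and a max of 7, so the spike is visible next to the average.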

Well, it’s not “mechanical”, it’s just mathematical. For a more nuanced description, more numbers have to be used (e.g. the mean together with the max). One could even have a tool that graphs the difficulty level on each progression route. There’s really no limit to the detail of mathematical description other than one’s mathematical knowledge and one’s ability to model.

Yeah, well a lot can be done here. But my impression is that generally the more spread the items are, the easier the difficulty is (and vice versa), because one has access more readily to spread items, and one can only collect so many items at a time.

I would be fine with the map designer giving a rough estimate of the difficulty. Markie’s recent Reliquary mod does that very upfront - the skill levels are renamed to Normal, Hard and Very Hard, and the readme file explicitly states that the mod is somewhat about difficult gameplay.

Also I just noticed there’s a tag on some entries in the Quaddicted map list: “hard”. :)

While it may sound like a good idea to try to estimate a map’s difficulty mathematically, there are various subjective factors that also come into play and cannot be “counted” by a formula. Take Markie’s own video on difficulty balancing (https://www.youtube.com/watch?v=s9bleQCTdTo), for example - things like enemy and item placement, encounter setup according to map geometry, etc. You can pretty much make a map more difficult without changing either the enemy or the item count.

Some other thoughts about that:

  • If the map has an area with a high monster density, but gives the player a Quad or Pentagram, that section will obviously be easier. All Alkaline and EoE maps, for example, include at least one “Quad run” per map, so a measure based purely on monster amount/density would skew that rating in the wrong direction. And even if you try to account for the presence of those powerups, there’s no way to tell, without playing the map, whether a powerup will actually be used in the more populated area, be freely given to the player, or be hidden inside a secret;
  • An apparently high monster density can be masked by the possibility that some of the monsters will only be spawned/released over time, after certain map events, instead of being fought all at the same time - say, you have an area you need to pass through more than once, and when you go back to it after getting a key, for example, more monsters have come in. Again, that cannot be determined without actually playing the map;
  • Another (more specific) problem: various mods have some kind of “monster spawner” for enemy hordes, which means that the monsters actually spawned in-game won’t always correspond to individual map entities. And each mod (hell, I’d say even each map) does this its own way code-wise, so trying to “guess” that engine-side would be mostly fruitless;
  • And a problem that has been brought up already: higher-HP monsters don’t always mean tougher fights. That’s a thing even in vanilla - a Hell Knight, for example, has only about 17% less health than a Fiend (250 vs. 300 HP), and I think it’s obvious their difference in lethality is a lot more than that. And even if one creates hand-tuned difficulty ratings for each vanilla monster, how do you factor that in for mods with custom enemies?

That’s the hard truth about difficulty balancing and trying to measure it. Unless someone comes up with some “Quake-playing AI” (which would be pretty difficult, if not downright impossible), I don’t think a rating system like that would be of much use.

Following on from bmFbr’s point, I’d like to make some further considerations for why one couldn’t even begin to write a mathematical formula to effectively evaluate the difficulty of a Quake map and why, if it were possible, the game wouldn’t be anywhere near as interesting to map for. I’d also like to suggest a better solution to the core problem: players having no good way of determining a map’s difficulty before playing it.

Among a mapper’s tools to balance difficulty are, in addition to the things you have identified:

  • Traps and environmental hazards such as spike shooters, liquids, crushers, pushers and moving platforms used in any number of ways;
  • Monster jumps and other ways of surprising the player;
  • Enemy placement, both in terms of the angle they attack the player from and their relationship to available cover;
  • Item placement relative to enemies (are you given an item before an encounter or do you have to push forward and gain ground to get it);
  • Timing of events such as monsters spawning, doors opening and so on;
  • Powerups, as bmFbr mentioned, can change an encounter from borderline impossible to easy;
  • The availability of secrets: their difficulty is not only basically impossible to analyse, but it will also vary between players as maps will require a mixture of platforming, timing, observation and other puzzles to access all of the secrets.

Mappers who I would consider especially good at encounter design (Skacky, Mazu, fairweather, Juzley etc) not only know how to use them, but make them collectively and perhaps even individually more important for balance than any of the points you are analysing, and none of them can be effectively accounted for with a formula. What makes Quake mapping interesting to me is that there are so many permutations of different factors to be used in creative ways that every map has the potential to be entirely unique in ways going far beyond the types and numbers of enemies and resources used. In other words, it defies mathematical analysis by its very nature.

In the image below I’ve constructed two encounters:
https://cdn.discordapp.com/attachments/854618691094577192/870809345418477578/unknown.png
The left has fewer enemies, spaced further apart, and the same items available, so according to the formula should be easier, yet the right would be a fairly easy encounter and the left would be basically impossible without further resources. While I have designed this specifically to illustrate my point, it isn’t an edge case and you see encounters of both types, though less extreme, all the time.

Finally, my actual suggestion would be to leave it up to the players to rate a map’s difficulty. Much like how Quaddicted has a 1-5 user rating system for quality, there could also be one for difficulty. People often cry for more granularity in map rating, so this could be a good opportunity to completely revamp the system. It might suffer from standards for difficulty changing over time, but I’d argue that this is less of a problem for difficulty rating than for quality.

@bmFbr

Great video. Yes, things like enemy and item placement, encounter setups, etc., can make the map more difficult without changing enemy or ammo count, but those would be taken into consideration by a qualitative formula. The sketch for the formula above (which was made for illustration purposes) only takes into consideration some quantitative aspects, as specified. And those factors (enemy, item placement, etc.) are not subjective, they are merely qualitative, meaning that they need more than numbers (i.e. mathematical structures, like graphs or matrices) to be modeled/represented. A subjective factor would be the player’s (subject’s) skill level / experience, which would affect how they perceive the difficulty.

  • if one is to seriously craft a formula, one would naturally start easy and build their way up. For example, first they would consider only maps with no power-ups, no secrets, no traps, and no teleporting monsters, and would work a formula for those, then they would successively factor in extra elements;

  • well, the question is, ¿how well does a quantitative formula approximate the skill level?; for example, in the video linked above, there clearly is a correlation between monster count / toughness and difficulty: the higher the difficulty, the tougher and more numerous the monsters. And these things do correlate, however roughly: the higher the monster count / toughness, the more you have to move around to avoid / inflict damage.

  • well, to get map data one would have to read the map file; surveying a map via an engine is not efficient and can’t provide all the data; I’m not a mapper, so I don’t know details, but if the engine can read it, interpret it, and render it into a playable level, then a program can be written that takes the same data, but instead of rendering it into a level, plugs it into a formula.

  • yeah, that’s why I suggested weights for monsters; if one would want to convert monsters to hp, they would need to take into consideration monster weights; as I said, there’s a difference between killing an enforcer and a spawn with 4 shotgun shots;
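
As a sketch of the "read the map file and plug it into a formula" step: a Quake .map file is plain text in which each entity is a brace-delimited block of quoted key/value pairs, with brushes nested one level deeper. The Python below collects only the key/value pairs and counts entities whose classname starts with `monster_`. It is a simplification under stated assumptions: it relies on braces sitting on their own lines, it ignores spawnflags (so monsters restricted to certain skill levels are counted regardless), and it cannot see monsters created by mod-specific spawners.

```python
import re
from collections import Counter

# Matches a line such as: "classname" "monster_ogre"
KEY_VALUE = re.compile(r'^\s*"([^"]*)"\s+"([^"]*)"')


def parse_entities(map_text):
    """Return one dict of key/value pairs per entity in a .map file's text."""
    entities = []
    depth = 0
    current = None
    for line in map_text.splitlines():
        stripped = line.strip()
        if stripped == "{":
            depth += 1
            if depth == 1:
                current = {}  # a new entity starts
        elif stripped == "}":
            if depth == 1 and current is not None:
                entities.append(current)
                current = None
            depth -= 1
        elif depth == 1:
            match = KEY_VALUE.match(line)
            if match:
                current[match.group(1)] = match.group(2)
        # Lines at depth 2 are brush faces and are skipped here.
    return entities


def count_monsters(entities):
    """Count entities per classname, keeping only monster_* classes."""
    return Counter(
        entity["classname"]
        for entity in entities
        if entity.get("classname", "").startswith("monster_")
    )
```

The `origin` value of each entity holds its coordinates as a string of three numbers, which is what a distance term would need.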

So, ¿would a mathematically modeled rating formula be useful? Well, given that there clearly is a correlation between monster/ammo count/type/density and difficulty, I think that a well-crafted quantitative formula would be useful as a first approximation for minimalistic maps and as a basis to build upon and factor in other important elements of maps (secrets, traps, power-ups, etc.). True, there is a lot of room to arrange those elements so as to affect the difficulty level, and maps may even be engineered to throw the formula way off the rating as experienced by the player, but generally speaking, for most maps, it should work as a rough approximation. We could even test that: design a formula and a program to read off from the map file the elements that go into the formula (monster count, ammo count, etc.), have it compute the difficulty for a large sample of maps, and compare that to how difficult the maps actually feel.
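
One way to run the comparison suggested above is a rank correlation between the computed scores and player-reported difficulty for the same maps. The Python below is a sketch using Spearman's formula for data without ties; using Spearman at all, and the sample numbers in the usage note, are assumptions for illustration rather than measurements.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(xs, ys):
    """Spearman rank correlation for two equally long, tie-free sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n * n - 1))
```

A call such as `spearman([0.2, 0.5, 0.9, 1.4], [1, 2, 4, 3])` compares made-up formula outputs with made-up perceived rankings for four maps. A value near 1 would suggest the formula orders maps much as players do; a value near 0 would suggest it does not.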

But, ¿what about elements like entity placement, triggers, geometry, routes, etc.? Well, if they can be programmed, then they can be modelled, and if the quantitative approach proves insufficient, then they ought to be factored in. The question is, ¿to what degree of detail do we need to model them?; because, if factoring in elements beyond a certain level of detail only affects the formula from, let’s say, its second decimal, it might be deemed not worth the bother. This takes place in physics and mechanics all the time: using a simpler but less precise formula because it’s a good enough approximation. There even is a field in mathematics, Numerical Analysis, that deals with finding formulae as simple as possible for complex quantities (like integrals) given a desired degree of approximation.

It’s not impossible to make Quake-playing AI, if you’re talking about true AI: neural network, self-improving AI, and that’s because, one doesn’t have to program in behavior (like one does with Quake monsters or bots); true AI behavior improves through repeated interaction with its environment. In this case, it would learn on its own how to play the game. But even if AI were to play maps and rate them, it would still do that by taking in data and processing it, i.e. executing an algorithm (read “computing a formula”, not limiting formula to basic arithmetic operations).

@h4724

I should stress that by formula I don’t mean a quantity expressed using only basic arithmetical operations. You can replace “formula” with “algorithm”: it can contain advanced mathematical operations over complex mathematical structures taken in several steps using conditionals.

A rating algorithm wouldn’t exclude manual rating.

By the way, thank you, guys, for the kick-ass maps you roll out.

Actually, a difficulty rating option on Quaddicted would be most useful, to me even more than the rating of quality, because I’m not picky, but I would like to know beforehand what I’m getting myself into difficulty-wise, so as to know how to approach the map: ¿do I go easy-breezy?, ¿or do I take my time and strategize?

Same here. Like I said, SOME releases do have a “hard” tag. However, some don’t, like the first SMEJ - I’d put “hard” there on the merits of the first level alone.

@Gila

Yeah, when I first tried I couldn’t finish the first level. After many quick saves I got to the puzzles but I couldn’t figure them out, so I gave up. Months later I tried again; surprisingly, it went much better: I got to the puzzles, figured them out, and was able to fully appreciate their ingenuity. I got up to the last level and I just couldn’t finish it no matter how many quick saves. Third time’s the charm ¿maybe?

This is the thing we should start with. Have people be able to rate the maps for difficulty balance reception. It could provide vital feedback not only for the players, but also for developers.

The difference between what ‘h4724’ proposes here and what this thread primarily strives to achieve is that the suggestion of ‘h4724’ is doable in a matter of days - depending on how the forces behind the ‘Quaddicted’ portal would take to the idea, or to the notion of anything new in the system - while anything else is an intellectual sport until someone actually sits down and does the thing.

Still, a good sport is a good sport.

@triple_agent

Exactly.

And there’s a famous experiment in which people in a crowd are tasked with estimating the weight of an object in front of them and when the average is taken it turns out to be highly accurate — this turns out to be due to the fact that the people over-guessing cancel out with the people under-guessing. So, a difficulty rating option should, empirically, give quite accurate results, if enough people rate.

This is somewhat similar to how, instead of using complex mathematical models to approximately compute the volume of an object, you can just submerge it in water; that is, use empirical means.

@‘I Like Quake’, our belief in mathematical patterns more or less equals our belief in grammar, but we should not forget that both are just means to an end. Grammar without content tells us exactly nothing - but content without grammar is cosmic chaos. Everything emerges from struggle, and the struggle is a way of finding balance. The goal is balance - proper balance.

For the difficulty balance rating itself, what I propose is a common “5 star” evaluation for each difficulty level a map offers, simply telling how proper it feels from the player’s standpoint. If the player played only one difficulty level, the rating should ideally be given only for that one difficulty. The more difficulties played, the more ratings to give. It should be transparent which difficulty level is being evaluated.
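
A small Python sketch of how such per-difficulty ratings might be aggregated. The data shape, a list of (skill name, stars) votes, is an assumption and not anything Quaddicted currently exposes.

```python
from collections import defaultdict
from statistics import mean


def ratings_by_skill(votes):
    """Group 1-5 star votes by skill level; return (mean stars, vote count)."""
    grouped = defaultdict(list)
    for skill_name, stars in votes:
        if not 1 <= stars <= 5:
            raise ValueError("stars must be between 1 and 5")
        grouped[skill_name].append(stars)
    return {name: (mean(stars), len(stars)) for name, stars in grouped.items()}
```

Keeping the vote count next to each mean makes it visible which difficulty levels were actually played by enough people for the average to mean something.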