Is the difficulty of debugging complex games non-linear?

Considering AC: Unity had been in development for 1500 years with a team of ten thousand people (OK, slight exaggeration), are we right to demand completely bug-free experiences that run at 1080p / 60fps? The testing and debugging of a piece of software has to be balanced against the costs as well, and if the increase in potential bugs and glitches is non-linear or even exponential relative to the increase in complexity, then will glitchy AAA titles that need subsequent patching become the norm rather than the exception?
 
That's not what I meant. I mean, will the difficulty of rooting out all the bugs in a game go up exponentially with games written from the ground up to take advantage of the new hardware?
 
The difficulty of stopping work on a game (a.k.a. the deadline) rises exponentially with the amount of money (or time, or people) believed to be available.
The answer to your question lies in the consequences of acting upon this belief.
 
When many craftsmen work semi-cooperatively and you don't have solid documentation, it's usually hard to get things to work as intended, that's for sure.
Making software, and games in particular, is not at an industrial stage, with processes, testing protocols and everything else in place, but rather at the craft stage, with many craftsmen only seemingly cooperating...

Most games companies don't have peer review, they don't have unit testing, they don't have documentation (and when they do have some, it's desperately bad, outdated and generally useless), and they don't have software architects/designers; just a lot of (rather less than more) skilled individuals doing what they've been told, with little interaction with people who might know better (and usually little knowledge of algorithms).

Granted I haven't worked everywhere, and there are notable exceptions.

Games companies DO NOT even TRY to make bug-free games!

(And why would they, since people still purchase their software and consider it acceptable to have bugs anyway?)
 
Most games companies don't have peer review, they don't have unit testing, [...] and they don't have software architects/designers [...]

Would you mind elaborating? My experience is limited to a sub-sector (game engines) of the field, yet I saw it differently. Don't count startups (<50-100 people); they always trade quality for speed.
 
Would you mind elaborating? My experience is limited to a sub-sector (game engines) of the field, yet I saw it differently. Don't count startups (<50-100 people); they always trade quality for speed.
A lot of game companies are <100 people and have existed for several years, so they're not startups.
If you worked at a company selling game engines (or other middleware), you necessarily had better standards. (Or the company would have gone bankrupt.)
That said, you weren't making games ;)
 
Bugs do increase exponentially with software complexity, and you cannot begin to understand the complexity (and misery) of software development without a decent amount of first-hand knowledge. Every new component can potentially interact with every component already there, so the number of possible interactions, and therefore of failure modes, grows much faster than the amount of code. There are so many interacting parts (libs and APIs), developed by people you may have no control over, and there's the issue of never knowing whether a bug is your fault or theirs.

Two examples:
1: At university I had a simple program with two loops. It wasn't completing. I put a printf statement between the loops to find which loop was failing. The printf didn't show, so I knew it was the first loop that was wrong, but I couldn't find the problem. No one could. It turns out the problem was that the system buffered the output and wouldn't actually display the statement until another statement was printed. Totally unnatural behaviour and something you cannot intuitively work with; you just have to learn the systems inside and out.
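
A minimal sketch of that kind of trap, in C (not the original assignment, just an illustration of how buffered output hides a diagnostic print):

```c
#include <stdio.h>

int main(void)
{
    /* ... first loop runs here and actually finishes fine ... */

    /* Diagnostic print with no newline: on a line-buffered terminal the
     * text just sits in the stdout buffer, nothing appears on screen,
     * so it looks as if the program never got past the first loop. */
    printf("first loop done");

    /* ... second loop hangs here, so the buffered text is never shown ... */
    for (;;)
        ;

    /* The debugging fix: end the message with '\n' and/or call
     * fflush(stdout) immediately after the printf. */
    return 0;
}
```

Nothing in the source hints that the message is being held back; you only find out by knowing how the I/O buffering works.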

2: My game used to connect to Google Play services leaderboards just fine. I updated the library I was using and it stopped working. But at the same time, Google updated their library. So which library was to blame? The debug log showed, in big red letters, "Google Play Resources cannot be found," suggesting something wrong with the settings or compilation. Lots of struggles later, nothing. And by lots, I mean weeks of going round in circles accomplishing nothing. The more things you change to try and fix one bug, the more chance you have of breaking something else. Then I ended up chasing every last bit of info, having learnt something from that uni experience of yesteryear, and looked up the resources error message. It's a bug in Google's API, reporting the resources are missing when they're not. It's a lie! So the information one's using to logically try to solve the issue is useless, and you have to find other insights.

Then there's the fact that tutorials only ever cover simple examples and are a completely useless basis for anything more complicated, and the non-existent or outdated documentation where you have to work everything out yourself, including methods/functions that don't work the way they're described to work.

Ultimately, software development deals with unreliable information, and that in my mind makes it far more difficult than brain surgery or rocket science. Those work with constants, whereas computing works with ever-changing states, rules and misinformation. "Rocket science" came up in the atomic era as something cool, futuristic and difficult, but it's an outdated analogy now.

However, in answer to the OP's question of whether games should be bug-free: they should work well. It's a fundamental requirement of the concept of a sale and a fair transaction that the product is fit for purpose. If a company cannot make the game they want, within the budget they have, without enough bugs to sink it, they shouldn't make it and should pick an easier target. If they want to experiment with bug-ridden games, they need to finance/sell them as open betas / Early Access type products. It's not good enough to say, "the games people want cost too much. We'll have to release them broken and then pay for bug fixes with the sales revenue." If the business can't deliver low-bug releases at that scope, the games should be scaled down to what the market will support.
 
A lot will hang on the engine. I know little about games development but have some experience with a software platform called VBS2 by Bohemia Interactive in Australia. Anybody who's been involved in military or first-responder simulators will know Bohemia's products, which are very powerful and extensible - sorry, this is sounding like an advert!

My team has done a fair amount of work on this platform and it's a joy to use. The engine and the tools are the key. AC:U is running on AnvilNext and I imagine a lot of code was completely rewritten, which is a tremendous opportunity to lose track of hacky fixes from old versions and introduce new issues.
 
Back to the OP's question:
Game companies are here for profit, not art; art is almost a byproduct to them. They'll produce the cheapest games they can, at the highest price they can, to maximise their profits, which means the quality will be as low as the market will allow.
(As they have no reason to do otherwise; but of course they will adapt to the market and change the quality to match its requirements.)
So I expect games to continue to come with day-1 patches and further patching during their first months.

[The problem is that since the games are good and selling well, game companies believe they are doing a good job; hell, they even get compared to the movie industry! The problem is that although the games are good, the engineering behind them is not, but game companies have no reason to believe anyone telling them they are no good at engineering, since they make games that work and are very profitable...]
[And why would they improve anyway, since they make huge profits? It would be expensive to introduce processes & ISO standards, since they'd have to train all their people, and they also fear it would impair creativity. Engineering != creativity. Game designers dream, engineers make it a reality (with trade-offs sometimes), but rare are the people who acknowledge it.]
 
@Shifty Geezer

Reminds me of this story I read on Quora.

It's kind of painful to re-live this one. As a programmer, you learn to blame your code first, second, and third... and somewhere around 10,000th you blame the compiler. Well down the list after that, you blame the hardware.

This is my hardware bug story.

Among other things, I wrote the memory card (load/save) code for Crash Bandicoot. For a swaggering game coder, this is like a walk in the park; I expected it would take a few days. I ended up debugging that code for 6 weeks. I did other stuff during that time, but I kept coming back to this bug -- a few hours every few days. It was agonizing.

The symptom was that you'd go to save your progress and it would access the memory card, and almost all the time, it worked normally... But every once in a while the write or read would time out... for no obvious reason. A short write would often corrupt the memory card. The player would go to save, and not only would we not save, we'd wipe their memory card. D'Oh.

After a while, our producer at Sony, Connie Booth, began to panic. We obviously couldn't ship the game with that bug, and after six weeks I still had no clue what the problem was. Via Connie we put the word out to other PS1 devs -- had anybody seen anything like this? Nope. Absolutely nobody had any problems with the memory card system.

About the only thing you can do when you run out of ideas debugging is divide and conquer: keep removing more and more of the errant program's code until you're left with something relatively small that still exhibits the problem. You keep carving parts away until the only stuff left is where the bug is.

The challenge with this in the context of, say, a video game is that it's very hard to remove pieces. How do you still run the game if you remove the code that simulates gravity in the game? Or renders the characters?

What you have to do is replace entire modules with stubs that pretend to do the real thing, but actually do something completely trivial that can't be buggy. You have to write new scaffolding code just to keep things working at all. It is a slow, painful process.
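
In code terms, the stubbing step looks something like this (the names are hypothetical, just to show the shape of the technique, not actual Crash code):

```c
/* stubs.c -- replace real modules with trivial stand-ins so the program
 * still builds and runs while the suspect code is isolated.
 * Everything here is illustrative, not the actual game code. */

typedef struct { float x, y, z; } vec3;

/* Real physics removed: the gravity "simulation" now does nothing,
 * so it cannot possibly be the source of the bug. */
void physics_update(vec3 *position, float dt)
{
    (void)position;
    (void)dt;
}

/* Real renderer removed: pretend every frame was drawn successfully. */
int render_frame(void)
{
    return 0;
}

/* Real load/save menu removed: pretend the player just asked to save. */
int player_requested_save(void)
{
    return 1;
}
```

Every time a real module is swapped for a stub and the failure still happens, that module is exonerated, and what remains keeps shrinking until only the guilty code is left.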

Long story short: I did this. I kept removing more and more hunks of code until I ended up, pretty much, with nothing but the startup code -- just the code that set up the system to run the game, initialized the rendering hardware, etc. Of course, I couldn't put up the load/save menu at that point because I'd stubbed out all the graphics code. But I could pretend the user used the (invisible) load/save screen and asked to save, then write to the card.

I ultimately ended up with a pretty small amount of code that exhibited the problem -- but still randomly! Most of the time, it would work, but every once in a while, it would fail. Almost all of the actual Crash code had been removed, but it still happened. This was really baffling: the code that remained wasn't really doing anything.

At some moment -- it was probably 3am -- a thought entered my mind. Reading and writing (I/O) involves precise timing. Whether you're dealing with a hard drive, a compact flash card, a Bluetooth transmitter -- whatever -- the low-level code that reads and writes has to do so according to a clock.

The clock lets the hardware device -- which isn't directly connected to the CPU -- stay in sync with the code the CPU is running. The clock determines the Baud Rate -- the rate at which data is sent from one side to the other. If the timing gets messed up, the hardware or the software -- or both -- get confused. This is really, really bad, and usually results in data corruption.
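
As a generic illustration of that relationship (not PS1-specific; real hardware has its own registers and divisors), a serial link typically derives its baud rate by dividing a reference clock:

```c
#include <stdio.h>

int main(void)
{
    /* Purely illustrative numbers: a peripheral divides a reference
     * clock by an integer divisor to get its bit rate.  If the clock
     * feeding the divider isn't what the code assumed, both sides end
     * up sampling bits at the wrong moments and data gets corrupted. */
    const unsigned long clock_hz    = 4000000UL;  /* assumed reference clock */
    const unsigned long target_baud = 250000UL;   /* assumed link speed */
    const unsigned long divisor     = clock_hz / target_baud;

    printf("divisor = %lu -> actual baud = %lu\n",
           divisor, clock_hz / divisor);
    return 0;
}
```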

What if something in our setup code was messing up the timing somehow? I looked again at the code in the test program for timing-related stuff, and noticed that we set the programmable timer on the PS1 to 1kHz (1000 ticks/second). This is relatively fast; it was running at something like 100Hz in its default state when the PS1 started up. Most games, therefore, would have this timer running at 100Hz.

Andy, the lead (and only other) developer on the game, set the timer to 1kHz so that the motion calculations in Crash would be more accurate. Andy likes overkill, and if we were going to simulate gravity, we ought to do it as high-precision as possible!

But what if increasing this timer somehow interfered with the overall timing of the program, and therefore with the clock used to set the baud rate for the memory card?

I commented the timer code out. I couldn't make the error happen again. But this didn't mean it was fixed; the problem only happened randomly. What if I was just getting lucky?

As more days went on, I kept playing with my test program. The bug never happened again. I went back to the full Crash code base, and modified the load/save code to reset the programmable timer to its default setting (100 Hz) before accessing the memory card, then put it back to 1kHz afterwards. We never saw the read/write problems again.
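
The workaround, sketched in C with hypothetical function names (set_programmable_timer_hz and memcard_write stand in for whatever the real engine/SDK calls were):

```c
#include <stdio.h>

/* Hypothetical stand-ins for the real engine/SDK calls. */
static void set_programmable_timer_hz(unsigned hz)
{
    printf("timer -> %u Hz\n", hz);   /* real code would program the hardware timer */
}

static int memcard_write(const void *data, unsigned size)
{
    (void)data;
    printf("writing %u bytes to memory card\n", size);
    return 0;                         /* 0 = success in this sketch */
}

/* The fix described above: run the memory card I/O at the console's
 * default timer rate, then restore the fast rate the physics wants. */
static int save_game(const void *save_data, unsigned size)
{
    set_programmable_timer_hz(100);   /* back to the default before touching the card */
    int result = memcard_write(save_data, size);
    set_programmable_timer_hz(1000);  /* restore 1 kHz afterwards */
    return result;
}

int main(void)
{
    char save_data[128] = {0};
    return save_game(save_data, sizeof save_data);
}
```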

But why?

I returned repeatedly to the test program, trying to detect some pattern to the errors that occurred when the timer was set to 1kHz. Eventually, I noticed that the errors happened when someone was playing with the PS1 controller. Since I would rarely do this myself -- why would I play with the controller when testing the load/save code? -- I hadn't noticed it. But one day one of the artists was waiting for me to finish testing -- I'm sure I was cursing at the time -- and he was nervously fiddling with the controller. It failed. "Wait, what? Hey, do that again!"

Once I had the insight that the two things were correlated, it was easy to reproduce: start writing to memory card, wiggle controller, corrupt memory card. Sure looked like a hardware bug to me.

I went back to Connie and told her what I'd found. She relayed this to one of the hardware engineers who had designed the PS1. "Impossible," she was told. "This cannot be a hardware problem." I told her to ask if I could speak with him.

He called me and, in his broken English and my (extremely) broken Japanese, we argued. I finally said, "just let me send you a 30-line test program that makes it happen when you wiggle the controller." He relented. This would be a waste of time, he assured me, and he was extremely busy with a new project, but he would oblige because we were a very important developer for Sony. I cleaned up my little test program and sent it over.

The next evening (we were in LA and he was in Tokyo, so it was evening for me when he came in the next day) he called me and sheepishly apologized. It was a hardware problem.

I've never been totally clear on what the exact problem was, but my impression from what I heard back from Sony HQ was that setting the programmable timer to a sufficiently high clock rate would interfere with things on the motherboard near the timer crystal. One of these things was the baud rate controller for the memory card, which also set the baud rate for the controllers. I'm not a hardware guy, so I'm pretty fuzzy on the details.

But the gist of it was that crosstalk between individual parts on the motherboard, and the combination of sending data over both the controller port and the memory card port while running the timer at 1kHz would cause bits to get dropped... and the data lost... and the card corrupted.

This is the only time in my entire programming life that I've debugged a problem caused by quantum mechanics.
 
He called me and, in his broken English and my (extremely) broken Japanese, we argued. I finally said, "just let me send you a 30-line test program that makes it happen when you wiggle the controller." He relented. This would be a waste of time, he assured me, and he was extremely busy with a new project, but he would oblige because we were a very important developer for Sony. I cleaned up my little test program and sent it over.
Now imagine you're a developer who isn't that important. "Sorry, we won't help you, you're wrong and we're too busy." The project is dead, there's nothing you can do about it. Tough luck.

That is, in my mind, the very real prospect of development. I had two years of work wiped out, just as it was completed, when an update to Adobe Flash that I plugged into didn't work with my code and I was the only person on the planet affected by the bug. The development teams involved wouldn't help me. The only option was a complete rewrite from scratch (as I couldn't afford to hire an expert to produce a fix, which may even have been impossible). The project was abandoned.

For modern developers on middleware, you're at the mercy of the middleware vendors. If you find a bug and it's low priority, you have to find a workaround and make changes yourself. So you can't comfortably engineer with a sense of trust that what you design will be implementable. You read the manual, make the test programs, everything's looking good - oh look, a physics bug that isn't going to be patched for a year. :rolleyes: If you're a big house or a Serious Coder, you can do a lot yourself, but you still have dependencies. There can't be many people out there writing their own physics from scratch, their own UIs, their own audio drivers, etc. And of course the man-hours to make an engine with great tools like Unity or Unreal Engine are the same for your in-house project as they are for those middleware vendors who are doing it full time, so that's a mammoth undertaking.
 
One big question would be: are devs less concerned with finding and resolving bugs now that they can patch their games after release? Would they use the release like some free, big-scale beta test and just wait for feedback from players? Games like BF4 took almost a year after release to be "finished".
 
Now imagine you're a developer who isn't that important. "Sorry, we won't help you, you're wrong and we're too busy." The project is dead, there's nothing you can do about it. Tough luck.

That is, in my mind, the very real prospect of development. I had two years of work wiped out, just as it was completed, when an update to Adobe Flash that I plugged into didn't work with my code and I was the only person on the planet affected by the bug. The development teams involved wouldn't help me. The only option was a complete rewrite from scratch (as I couldn't afford to hire an expert to produce a fix, which may even have been impossible). The project was abandoned.

For modern developers on middleware, you're at the mercy of the middleware vendors. If you find a bug and it's low priority, you have to find a workaround and make changes yourself. So you can't comfortably engineer with a sense of trust that what you design will be implementable. You read the manual, make the test programs, everything's looking good - oh look, a physics bug that isn't going to be patched for a year. :rolleyes: If you're a big house or a Serious Coder, you can do a lot yourself, but you still have dependencies. There can't be many people out there writing their own physics from scratch, their own UIs, their own audio drivers, etc. And of course the man-hours to make an engine with great tools like Unity or Unreal Engine are the same for your in-house project as they are for those middleware vendors who are doing it full time, so that's a mammoth undertaking.
I would have been on the verge of committing suicide.
 
One big question would be: are devs less concerned with finding and resolving bugs now that they can patch their games after release? Would they use the release like some free, big-scale beta test and just wait for feedback from players? Games like BF4 took almost a year after release to be "finished".

It's probably not the devs, more commercial rules… not many companies are OK with delaying a costly marketing plan, and even fewer are OK with allowing a safe, generous development schedule… And this goes for almost all industries; look at the auto industry and the number of recalls… We're in a world where immediate profit is the rule: don't care about the future, we'll all die anyway… Punks rule the economy!!! :mad:
 
Yes, I should have said publishers instead of developers. Games are way more complex now and less scripted, increasing the chances of bugs like we see in the latest AC.
 
Considering AC: Unity had been in development for 1500 years with a team of ten thousand people (OK, slight exaggeration), are we right to demand completely bug-free experiences that run at 1080p / 60fps? The testing and debugging of a piece of software has to be balanced against the costs as well, and if the increase in potential bugs and glitches is non-linear or even exponential relative to the increase in complexity, then will glitchy AAA titles that need subsequent patching become the norm rather than the exception?

Absolutely, there is no reason not to demand and expect a game to be properly tested and bug-free with good performance. However, this doesn't mean you're entitled to anything. As the consumer, you have to be willing to speak with your money. So if you don't like the fact that publishers stick to these unreasonable deadlines at the expense of quality, then you need to become an informed consumer before making the purchase. If reports are showing a half-baked game, and you truly believe that is unacceptable, then you need to stick to your principles and not buy it. People are not entitled to a good experience; that's where people get it wrong. They are entitled to purchase the product as it is presented. If you buy a crappy product, the manufacturer/developer is not obligated to do anything if they choose not to. The product that is on the shelf for $60 is what they are offering; the consumer then has to make a decision: is that product worth my money or not?

As for the developers and the complexity and quality control, of course it has gotten much more labor-intensive to stay on top of this. I remember Shin'en speaking about how much easier it is for them to stay in complete control of their code, because there are only a few people working on their projects. When you have hundreds of people writing code, and they don't even work in the same facility, it's going to be even tougher to keep things clean. Ultimately publishers will have to make decisions: do they want to make these massive games that need an army of workers to develop? I guess it depends on what sells; as long as sub-par gaming experiences continue to sell, publishers will assume that this is what the consumer wants and continue to walk the same path.
 
Most games companies don't have peer review, they don't have unit testing, they don't have documentation (and when they do have some, it's desperately bad, outdated and generally useless), and they don't have software architects/designers
We do have code reviews. All code submitted needs to be reviewed by someone. Mostly we use pair reviews (one coder reviews another's code), but for some big core functionality we do peer reviews in larger groups (in a meeting room). This kind of review session might happen, for example, when a big core API refactoring is ready in its own branch. We go through the changes and plan how to integrate it into the main development branch. When you have 50+ coders working on the same code base, you need to ensure that the build never breaks. We also have a big QA/QC team working daily, and shelve-testing facilities for risky submits that need bigger regression tests.

We have automated unit tests on all our core libraries. UI code and rendering code (not including GPGPU code such as sorting, and low-level framework / engine code) often doesn't have full test coverage. Testing code that provides a visual interpretation / interface for human beings is hard to automate. To ensure that these things work properly, we have lots of added debug functionality (that can be enabled without recompiling the project, allowing quick debugging whenever needed).
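
For anyone wondering what an automated unit test on a core library boils down to, here's a toy, assert-based example (the function and names are invented for illustration, not our actual code):

```c
#include <assert.h>
#include <string.h>

/* Imaginary core-library routine under test: copy a string into a
 * fixed-size buffer, truncating and always NUL-terminating. */
static size_t clamp_copy(char *dst, size_t dst_size, const char *src)
{
    size_t n = strlen(src);
    if (n >= dst_size)
        n = dst_size - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}

/* Unit test run automatically as part of the build; if the routine
 * regresses, the test pass fails before anything ships. */
static void test_clamp_copy_truncates(void)
{
    char buf[4];
    size_t n = clamp_copy(buf, sizeof buf, "abcdef");
    assert(n == 3);
    assert(strcmp(buf, "abc") == 0);
}

int main(void)
{
    test_clamp_copy_truncates();
    return 0;
}
```

Rendering and UI code is much harder to cover this way, which is why the runtime debug toggles mentioned above matter so much.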

We do have software architects. But it's not as clearly separated a role as it is in many software development companies. It would be impossible to maintain any code quality (or productivity) in big multi-studio collaborations without someone actively looking at the code quality.
 