C++ Question Very strange behaviour / memory???

Mr Martian

Orbinaut/Addon dev/Donator
Addon Developer
Donator
Joined
Jun 6, 2012
Messages
279
Reaction score
62
Points
28
Location
Sydney, Australia, Earth, Sol
Website
www.orbithangar.com
Hi All,

I have encountered a very strange proplem, twice now, in totally different projects. I have no idea what this is, if it is something wrong with my code, or bug of somesort. The only thing I can do is describe it and see if anyone might know what is causing this.

Say I have a header file, with a VESSEL class and some variables and methods. I have all those declared in my header. I have only ever encountered this while loading a simulation state (and calling clbkLoadStateEx): one of those variables whill just seem to load the value from it's neighbor (down one line). I am not sure if this is bad coding on my part, something to do with the static .lib that I have made and linked with, or some other issue.

EXAMPLE:

In a header file I may have this:
C++:
double boxes[12];
int X, Y;

void DoSomething ();
bool loaded[2];
bool ison;
int lvl;


Seemingly for no reason, when the values of these members are loaded via clbkLoadStateEx, I might have 'ison' load the value for loaded[1]. When I encounter this issue, it will always be relevant to that member's position in the header file. The ONLY solution I have been able to find, is to place (in this example case) a dummy member, for example int i; between 'bool loaded[2];' and 'bool ison;' in the header file. This corrects the issue, as it almost seems to be a problem with that particular line. I have searched everywhere for an explination for this but have never seen anything like it. Could this be the result of a memory leak? I even performed diagnostics on my system's memory in case of a bad address, but no issues. It is only an issue when loading via clbkLoadStateEx.

I have encountered this twice now, in totally seperate projects.

Sorry if I have not explained myself well, but this is really hard to explain... it makes no sense to me at all...


Thanks all.
 

dbeachy1

O-F Administrator
Administrator
Orbiter Contributor
Addon Developer
Donator
Beta Tester
Joined
Jan 14, 2008
Messages
9,176
Reaction score
1,517
Points
203
Location
VA
Website
alteaaerospace.com
Preferred Pronouns
He/Him
The underlying issue is that there is a bug somewhere in your C++ code that initializes / writes to those values: some code is writing outside the bounds of the variables defined there. One very effective way to track down memory overwrite bugs like this is to enable Visual Studio's heap checks for debug builds (and be sure to only build and run debug builds during development, not release builds, until you reach final pre-release testing). You can check out this post for full details for how to enable Visual Studio's memory heap runtime checks and run your code under the debugger. :)
 

Mr Martian

Orbinaut/Addon dev/Donator
Addon Developer
Donator
Joined
Jun 6, 2012
Messages
279
Reaction score
62
Points
28
Location
Sydney, Australia, Earth, Sol
Website
www.orbithangar.com
The underlying issue is that there is a bug somewhere in your C++ code that initializes / writes to those values: some code is writing outside the bounds of the variables defined there. One very effective way to track down memory overwrite bugs like this is to enable Visual Studio's heap checks for debug builds (and be sure to only build and run debug builds during development, not release builds, until you reach final pre-release testing). You can check out this post for full details for how to enable Visual Studio's memory heap runtime checks and run your code under the debugger. :)
Hi dbeachy1, thanks for the reply!

Just wondering the best way to go about debugging, as I have actually never tried this before. I have set the debugging command to Orbiter.exe but get this message when I try to run the debugger:
1657642115800.png
Are there any other steps I need to take to set this up properly? I always assumed just set the debugging command to orbiter.exe under Local Windows Debugger and good to go. Do these settings look right to you?

1657642288855.png

Many thanks!
 

dbeachy1

O-F Administrator
Administrator
Orbiter Contributor
Addon Developer
Donator
Beta Tester
Joined
Jan 14, 2008
Messages
9,176
Reaction score
1,517
Points
203
Location
VA
Website
alteaaerospace.com
Preferred Pronouns
He/Him
There is a link in the post I linked that details that:

dbeachy1 said:
If it helps, here are details on how to run Orbiter under the Visual Studio debugger to debug / put breakpoints in / step through your add-on's code: https://www.orbiter-forum.com/showthread.php?p=163799

The post talks about VS 2008 and 2010, but the process is the same in newer VS versions as well.
 

Mr Martian

Orbinaut/Addon dev/Donator
Addon Developer
Donator
Joined
Jun 6, 2012
Messages
279
Reaction score
62
Points
28
Location
Sydney, Australia, Earth, Sol
Website
www.orbithangar.com
Hi Again @dbeachy1 sorry just saw the other link in the thread you shared! seem to have it running now. straight away I have got some access violations for writing locations! I have never been so excited to find bugs in my code! you have just showed me a whole new world! Thank you heaps :)
 

N_Molson

Addon Developer
Addon Developer
Donator
Joined
Mar 5, 2010
Messages
8,965
Reaction score
2,792
Points
203
Location
Toulouse
@dbeachy1 : can we make a sticky somewhere where we compile posts like the one you linked ? Maybe some kind of Q/A ? I think it it could be very helpful.
 

kuddel

Donator
Donator
Joined
Apr 1, 2008
Messages
1,917
Reaction score
412
Points
83
@Mr Martian : Having an "off-by-one" error is something every C++ coder makes at least once ;)
From your description I would guess, that somewhere in your code a line states loaded[2] = whatever; wich looks harmless, but isn't.
Those heap-checks (or code-checks in general) are a tool everyone should use! Independent of how experienced one is, those off-by-one errors do happen.
 

Mr Martian

Orbinaut/Addon dev/Donator
Addon Developer
Donator
Joined
Jun 6, 2012
Messages
279
Reaction score
62
Points
28
Location
Sydney, Australia, Earth, Sol
Website
www.orbithangar.com
@Mr Martian : Having an "off-by-one" error is something every C++ coder makes at least once ;)
From your description I would guess, that somewhere in your code a line states loaded[2] = whatever; wich looks harmless, but isn't.
Those heap-checks (or code-checks in general) are a tool everyone should use! Independent of how experienced one is, those off-by-one errors do happen.
Hi kuddel, This is what I thought too, however this is just getting more and more bizzare.

I have been trying to debug this for a total of about 15 hours over the past 2 days... I have rebuilt the project about 4 times, and the issue only occurs when I load some variables via clbkLoadStateEx. I have been stepping through the dubugger step-by-step looking for any strange behaviour and there will be absolutely nothing. I will then add one variable into clbkLoadStateEx and the behaviour is very odd. my members will stay the same, there will be seemingly lo overwriting other values, but then suddenly later down the code at a seemingly random point, I will get an access violation for every proceeding step after that. For some reason I don't have a memory window option, so it is impossible for me to actually see what location is causing this.... I think I will probnably give up on this project soon. I have never seen anything like this, and seems impossible to debug.
 

Mr Martian

Orbinaut/Addon dev/Donator
Addon Developer
Donator
Joined
Jun 6, 2012
Messages
279
Reaction score
62
Points
28
Location
Sydney, Australia, Earth, Sol
Website
www.orbithangar.com
Well I have discovered something. This error only occurs when loading private members from my vessel class. loading members of the .lib class I am linking to is fine... Would anyonme have any idea why this might be? I am just at a total loss?
 

dbeachy1

O-F Administrator
Administrator
Orbiter Contributor
Addon Developer
Donator
Beta Tester
Joined
Jan 14, 2008
Messages
9,176
Reaction score
1,517
Points
203
Location
VA
Website
alteaaerospace.com
Preferred Pronouns
He/Him
Since you can reproduce the problem consistently, I suggest a standard debugging approach here:
  1. Set a breakpoint at the start of the callback where you know the memory overwrite occurs (clbkLoadStateEx, in this case).
  2. When you hit the breakpoint, add the variable the you know gets corrupted to the watch window.
  3. Now single-step through the code in that method, watching the contents of that variable in the watch window. This will let you see exactly which line in the method is overwriting that variable unexpectedly.
As a side note, since you're probably using sscanf in clbkLoadStateEx to parse the text in the scenario, I would look closely at each scanf line as you step over it: if either the format string or the pointer(s) being passed as arguments to the sscanf call are wrong or out-of-sync with each other at all, it can overwrite memory outside the bounds of the object addresses being passed. You will be able to see exactly which line is causing that error by single-stepping through the method as detailed in the steps above.

There is even a way to set a "data breakpoint" on a variable's first 4-byte (for 32-bit code) or 8-byte (for 64-bit code) memory address range so that the debugger will pop up anytime that memory gets written to; this uses debug registers in the CPU itself to send a signal to the debugger when a specific memory address range is written to, so the performance impact of the debugger "watching" for changes to that memory is minimal. This can be very useful for memory overwrite bugs that are hard to reproduce -- you shouldn't need that here, though, since you can easily reproduce the bug. However, see https://docs.microsoft.com/en-us/vi...2#BKMK_set_a_data_breakpoint_native_cplusplus for details.
 

Mr Martian

Orbinaut/Addon dev/Donator
Addon Developer
Donator
Joined
Jun 6, 2012
Messages
279
Reaction score
62
Points
28
Location
Sydney, Australia, Earth, Sol
Website
www.orbithangar.com
Since you can reproduce the problem consistently, I suggest a standard debugging approach here:
  1. Set a breakpoint at the start of the callback where you know the memory overwrite occurs (clbkLoadStateEx, in this case).
  2. When you hit the breakpoint, add the variable the you know gets corrupted to the watch window.
  3. Now single-step through the code in that method, watching the contents of that variable in the watch window. This will let you see exactly which line in the method is overwriting that variable unexpectedly.
As a side note, since you're probably using sscanf in clbkLoadStateEx to parse the text in the scenario, I would look closely at each scanf line as you step over it: if either the format string or the pointer(s) being passed as arguments to the sscanf call are wrong or out-of-sync with each other at all, it can overwrite memory outside the bounds of the object addresses being passed. You will be able to see exactly which line is causing that error by single-stepping through the method as detailed in the steps above.

There is even a way to set a "data breakpoint" on a variable's first 4-byte (for 32-bit code) or 8-byte (for 64-bit code) memory address range so that the debugger will pop up anytime that memory gets written to; this uses debug registers in the CPU itself to send a signal to the debugger when a specific memory address range is written to, so the performance impact of the debugger "watching" for changes to that memory is minimal. This can be very useful for memory overwrite bugs that are hard to reproduce -- you shouldn't need that here, though, since you can easily reproduce the bug. However, see https://docs.microsoft.com/en-us/vi...2#BKMK_set_a_data_breakpoint_native_cplusplus for details.
Thank you dbeachy! The data breakpoinbt was really useful actually. The issue seems to be using the sscanf function to write to a bool. I had this:
sscanf (line+5, "%i", &sys.lights);
changed sys.lights to an int and nothing seems to be getting overwritten now. I assume this has something to do with the 2 byte size difference between bools and ints, but I don't remeber having any issue with this previously...
 

dbeachy1

O-F Administrator
Administrator
Orbiter Contributor
Addon Developer
Donator
Beta Tester
Joined
Jan 14, 2008
Messages
9,176
Reaction score
1,517
Points
203
Location
VA
Website
alteaaerospace.com
Preferred Pronouns
He/Him
I assume this has something to do with the 2 byte size difference between bools and ints, but I don't remeber having any issue with this previously...
Grats on tracking down that bug! (y)BTW, it's actually a three-byte difference between bools (one byte) and ints (four bytes). :) So doing a sscanf of an int (via "%i") into a bool variable (which only reserves one byte of memory) will overwrite 3 bytes beyond the end of the bool, which will smash (overwrite) the first three bytes of whatever variable happens to be using those three bytes. ?

As for why bugs like that sometimes appear to work, it all depends on what happens to reside next to the above bool in memory. If that memory is not used by anything important, the bug won't be noticed.
 

Mr Martian

Orbinaut/Addon dev/Donator
Addon Developer
Donator
Joined
Jun 6, 2012
Messages
279
Reaction score
62
Points
28
Location
Sydney, Australia, Earth, Sol
Website
www.orbithangar.com
Thank you @dbeachy1! Wow That is amazing, I can't believe I have just been that lucky all this time... it makes total sense, but I always just assumed sscanf converts whatever it reads when it tries to pass the value to the given reference... I'll be sure to avoid that in the future!

Thanks again heaps for all the advice on debugging, feel like I'be been driving blind all these years!
 

kuddel

Donator
Donator
Joined
Apr 1, 2008
Messages
1,917
Reaction score
412
Points
83
scanf functions can only deduce the size of the given parameters by the format-string! Nowadays you can make the compiler/preprocessor warn about possible missalignments between format-string-qualifier and given parameter. I highly recommend turning on ALL warnings and treat them as errors!
A type-save alternative to scanf functions would be to use std::io for that, but it's not all gold there too, as the code pretty easily gets very "noisy"
C++:
  std::istringstream iss(inputString); // e.g. "DUMMY 1 0 1e3"
  std::string word;
  int number;
  bool flag;
  int real;
  iss >> word >> number >> flag >> real;
 
Top