Orbiter-Forum  

issueid=20 06-10-2008 12:57 AM
O-F Administrator
CTD zooming with high-res Florida textures

I am able to reproduce this consistently:

1. Install Beta 080609 over clean Orbiter 060929.
2. Install latest Earth textures.
3. Install Florida pack by copying it to C:\OrbiterBetaClean directory and running it. Merge finishes successfully.
4. Load up the default scenario Delta-glider -> DG-S Ready for Takeoff.
5. Hold PAGE-UP to zoom out to about 100 megameters. Often it will CTD before getting there.
6. If you reach 100 megameters, hold PAGE-DOWN to zoom in again until you are within 100 meters of the ship.
7. Repeat steps 5-6 a few times if necessary.

It will consistently CTD somewhere in steps 5 or 6 as you are zooming and tiles are loading.

I also tested this (i.e., ran steps 5-7) after step 1 but before step 2 and it did not CTD. Then I tested it after step 2 but before step 3 and it did CTD one time, but I could not reproduce it reliably. However, after merging the Florida textures in step 3 it CTDs consistently while zooming for me, albeit at different distances. It appears to happen at random as the system is loading tiles rapidly while zooming in or out. According to the crash info it is inside DDRAW.dll.
Issue Details
Project: ORBITER: 2010-P1 and newer
Status: Fixed
Priority: 3
Affected Version: 080609
Fixed Version: (none)
Users able to reproduce bug: 0
Users unable to reproduce bug: 3
Assigned Users: (none)
Tags: (none)

06-10-2008 02:11 AM
Addon Developer
 
I can't reproduce this one, but I did have a problem like that with a generic orbiter damage module in 2006 edition with patch 1. I would zoom out, then upon zooming back in, Orbiter would crash.
06-10-2008 02:42 AM
Beta Tester
 
I can't reproduce it either.
06-10-2008 04:04 AM
O-F Administrator
 
Hmm...that is odd. It might be framerate-related; under Vista x64 without VistaBoost (which was not installed) I am getting about 57 FPS in a 1280x1024 window. I can reproduce it every time, albeit at different altitudes. My guess is that it is a race condition of some sort: are the tiles loaded in a separate thread? I am running on a quad-core (Q6700) @ 3.00 GHz. Video is 8800GTX. I have seen race condition bugs show up much sooner on multi-core systems, so that's why I'm wondering if tile loading is in a separate thread.
06-10-2008 04:40 AM
Orbiter Founder
 
Yes, the tiles are loaded in a separate thread. Can you reproduce the problem with orbiter_ng and the DX7 client? If so, you may be able to debug it with the DX7 sources. The relevant code is in the TileMgr.cpp file (method TileBuffer::LoadTileAsync, called from TileManager::ProcessTile, which does the recursive quad-tree processing).
LoadTileAsync adds requests to the tile queue. The function TileBuffer::LoadTile_ThreadProc, which is running on a separate thread, scans the request queue and loads the tiles. The synchronisation is done via hQueueMutex. It is quite possible that there is a race condition somewhere, but it may be difficult for me to debug it since I can't reproduce the problem. You are in a better position for that.
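For readers following along without the DX7 sources, here is a minimal sketch of the kind of mutex-guarded request queue described above. It is not the TileMgr.cpp code: TileRequest, TileQueue, kMaxQueue and the duplicate check are assumptions made for illustration, and std::mutex stands in for the Win32 hQueueMutex.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical stand-in for Orbiter's QUEUEDESC: just enough to identify a tile.
struct TileRequest {
    int level;  // quad-tree resolution level
    int index;  // tile index within that level
};

// Fixed-size ring buffer guarded by one mutex (the role hQueueMutex plays above).
class TileQueue {
public:
    static constexpr std::size_t kMaxQueue = 8;  // assumed capacity

    // Render-thread side. Returns false if the queue is full or the same tile
    // is already waiting (assumed behaviour), so a tile is never queued twice.
    bool request(const TileRequest& r) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (count_ == kMaxQueue) return false;
        for (std::size_t i = 0; i < count_; ++i) {
            const TileRequest& q = slots_[(out_ + i) % kMaxQueue];
            if (q.level == r.level && q.index == r.index) return false;
        }
        slots_[(out_ + count_) % kMaxQueue] = r;
        ++count_;
        assert(count_ <= kMaxQueue);
        return true;
    }

    // Loader-thread side: copy the oldest entry without removing it.
    bool peek(TileRequest& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (count_ == 0) return false;
        out = slots_[out_];
        return true;
    }

    // Loader-thread side: drop the oldest entry once its tile is loaded.
    void pop() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (count_ == 0) return;
        out_ = (out_ + 1) % kMaxQueue;
        --count_;
    }

    std::size_t pending() {
        std::lock_guard<std::mutex> lock(mutex_);
        return count_;
    }

private:
    std::mutex mutex_;
    std::array<TileRequest, kMaxQueue> slots_{};
    std::size_t out_ = 0;    // index of the oldest entry (like queue_out)
    std::size_t count_ = 0;  // number of waiting entries (like nqueue)
};
```

Every access to the indices goes through the same mutex, so the queue bookkeeping itself cannot race. Anything the loader does with the copied entry outside the lock is not covered by it.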

Just for confirmation: does the problem disappear if you preload the tiles (in the planet render parameters under the Extra tab)?
06-10-2008 03:00 PM
O-F Administrator
 
Sure, I'll be happy to debug it here. I will also try preloading all the tiles. I'll post my findings here.
06-10-2008 08:17 PM
O-F Administrator
 
More testing here shows that it does not CTD when tiles are pre-loaded, so that's good news. Also, if I wrap the entire block that follows the 'Sleep(50)' call in LoadTile_ThreadProc in a WaitForSingleObject/ReleaseMutex pair, the crash does not occur. So it definitely looks like a race condition, with the problem being contention for some shared resource inside the first part of the 'if (load) {' block, which is not protected by the mutex. However, I haven't figured out what the problem is yet.

I do have a question: the LoadTile_ThreadProc loop locks the queue, copies the queue entry to a static buffer, unlocks the queue, and then operates on the static copy of the queued entry (qd):

Code:
  Sleep (50);
  WaitForSingleObject (hQueueMutex, INFINITE);
  if (load = (nqueue > 0)) {
   memcpy (&qd, loadqueue+queue_out, sizeof(QUEUEDESC));
  }
  ReleaseMutex (hQueueMutex);
This makes sense. However, for some reason the queue slot isn't marked as free until the end of the loop after all processing is complete:

Code:
  nqueue--;
  queue_out = (queue_out+1) % MAXQUEUE;
  ReleaseMutex (hQueueMutex);
Is there some reason why the queue entry isn't freed right away in the first mutex block right after its contents are copied to the static 'qd' buffer for processing? We want to free the queue entry as soon as possible, correct? The reason I ask is because I want to be sure I understand the code.

EDIT: from studying this some more, I see that it does not remove the queue entry until after the tile is fully loaded so that the same tile is not unnecessarily and repeatedly re-queued before the load finishes. That brings up my next question, though: can qd simply be a pointer to the existing queue entry since we don't free it until the tile is fully loaded? That way we could avoid the memcpy call.

EDIT #2: Scratch that last question. After some more testing with that, I see now why we have to latch the queued request to a static buffer: on shutdown, the queue memory may be freed by the other thread before we are done with our loop. So the code still looks good as far as I can tell. I'll debug it some more here...

In any case, from what I can tell the code is OK since it appears that only the LoadTile_ThreadProc thread writes to nqueue and queue_out. All I know for sure right now is that the unprotected part of the code in the bRunThread loop is the problem. I'll investigate some more tonight.
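To make the latch / load / retire pattern discussed above concrete, here is a rough sketch in portable C++. It is not the Orbiter code: Request, TileLoader and the 1 ms idle sleep are assumptions, std::mutex and std::thread stand in for the Win32 primitives, and the "load" is a placeholder.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

struct Request { int tile_id; };  // hypothetical stand-in for QUEUEDESC

class TileLoader {
public:
    TileLoader() { worker_ = std::thread([this] { run(); }); }

    // Stop and join the worker before any member is destroyed, so the thread
    // can never touch freed queue memory during shutdown.
    ~TileLoader() {
        running_ = false;
        worker_.join();
    }
    TileLoader(const TileLoader&) = delete;
    TileLoader& operator=(const TileLoader&) = delete;

    void request(int tile_id) {
        std::lock_guard<std::mutex> lock(mutex_);
        queue_.push_back(Request{tile_id});
    }
    std::vector<int> loaded() {
        std::lock_guard<std::mutex> lock(mutex_);
        return loaded_;
    }
    std::size_t pending() {
        std::lock_guard<std::mutex> lock(mutex_);
        return queue_.size();
    }

private:
    void run() {
        while (running_) {
            Request latched{};
            bool have = false;
            {   // 1. Latch: copy the oldest entry under the lock, leave it queued.
                std::lock_guard<std::mutex> lock(mutex_);
                if (!queue_.empty()) { latched = queue_.front(); have = true; }
            }
            if (!have) {
                std::this_thread::sleep_for(std::chrono::milliseconds(1));
                continue;
            }
            // 2. Load outside the lock, touching only the private copy.
            const int result = latched.tile_id;  // placeholder for the file read
            {   // 3. Retire: only now free the entry and publish the result.
                std::lock_guard<std::mutex> lock(mutex_);
                assert(!queue_.empty());  // only this thread removes entries
                queue_.pop_front();
                loaded_.push_back(result);
            }
        }
    }

    std::mutex mutex_;
    std::deque<Request> queue_;
    std::vector<int> loaded_;
    std::atomic<bool> running_{true};
    std::thread worker_;
};
```

Step 2 is the part that runs unprotected, which is where the contention in this issue would have to be. Joining the worker in the destructor is one way to handle the shutdown case from EDIT #2; the latched copy is another.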
06-11-2008 02:30 AM
O-F Administrator
 
After more debugging, I have narrowed the problem down to the gc->GetTexMgr()->LoadTexture calls in LoadTile_ThreadProc. If I lock around those calls, the CTD does not occur. If I do not lock around those calls, I get a CTD. Here are the changes in LoadTile_ThreadProc that fix the CTD for me:

Code:
WaitForSingleObject (hQueueMutex, INFINITE); // {DEB} must lock here to fix the CTD
if (gc->GetTexMgr()->LoadTexture (fname, ofs, &tex, flag) != S_OK)
   tex = NULL;
ReleaseMutex (hQueueMutex); // {DEB}
Code:
WaitForSingleObject (hQueueMutex, INFINITE); // {DEB} must lock here to fix the CTD
if (gc->GetTexMgr()->LoadTexture (fname, ofs, &mask) != S_OK)
   mask = NULL;
ReleaseMutex (hQueueMutex); // {DEB}
I believe the problem is that some of the DirectX calls invoked in the LoadTexture call stack are not thread-safe, so the DirectX library can crash if two threads access the same internal DirectX data simultaneously. Unfortunately I have no idea which DirectX calls are the problem, and I see a fair number of DirectX calls happening inside the LoadTexture call stack. For example, I see this call in TextureManager::ReadDDSSurface:

Code:
if (FAILED (hr = pDD->CreateSurface( pddsd, ppddsDXT, NULL))) {
   //LOGOUT_DDERR(hr);
   goto LFail;
}
If the DirectX CreateSurface method uses some shared resources used by DirectX calls in the other thread, that could cause a CTD. The same is true for any other DirectX calls that are invoked in the LoadTexture call stack.

The bad news is that unless we can figure out which DirectX calls need to be locked via the mutex (and then to lock around those calls), we will have to lock around the entire LoadTexture call, which would negate the performance advantage of loading in a separate thread in the first place. Does anybody have any ideas?
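One possible middle ground, sketched below in portable C++, is to give the non-thread-safe API its own mutex instead of reusing the queue mutex, so that queueing never waits on a texture load. Everything here is hypothetical: FakeTextureManager only imitates an unsynchronised library, and GuardedTextureManager and load_concurrently are illustration names, not anything in the D3D7 client.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a texture manager that is NOT thread-safe:
// load() updates shared state with no internal locking.
class FakeTextureManager {
public:
    bool load(const std::string& fname) {
        if (fname.empty()) return false;
        ++surfaces_created_;  // unsynchronised read-modify-write
        return true;
    }
    long surfaces_created() const { return surfaces_created_; }

private:
    long surfaces_created_ = 0;
};

// Serialises every call into the manager through one dedicated mutex, kept
// separate from the queue mutex.
class GuardedTextureManager {
public:
    bool load(const std::string& fname) {
        std::lock_guard<std::mutex> lock(api_mutex_);
        return inner_.load(fname);
    }
    long surfaces_created() {
        std::lock_guard<std::mutex> lock(api_mutex_);
        return inner_.surfaces_created();
    }

private:
    std::mutex api_mutex_;
    FakeTextureManager inner_;
};

// Hammer the guarded manager from several threads; returns the final count.
long load_concurrently(int threads, int loads_per_thread) {
    assert(threads > 0 && loads_per_thread >= 0);
    GuardedTextureManager mgr;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&mgr, loads_per_thread] {
            for (int i = 0; i < loads_per_thread; ++i) mgr.load("tile.dds");
        });
    }
    for (std::thread& th : pool) th.join();
    return mgr.surfaces_created();
}
```

This only helps if every caller, including the render thread, goes through the same mutex. It also still serialises the API calls themselves, so the gain over locking the whole LoadTexture call depends on how much of the load is file I/O rather than DirectX work.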
06-11-2008 03:08 AM
Orbiter Founder
 
Oh no. It never occurred to me that DirectX may not be thread-safe. Looking at the documentation, there appears to be a flag DDSCL_MULTITHREADED that can be set for the cooperative level. I'll try to compile a test version with this for you to try (probably not before tomorrow, it's getting a bit late). Of course this could slow down the overall DirectX performance, so it might not be an ideal solution.
06-11-2008 05:21 AM
O-F Administrator
 
Thanks, I'm looking forward to trying out the new build. We could also run some quick-and-dirty benchmarks between the two versions and see how much of a performance hit, if any, DDSCL_MULTITHREADED causes. Hopefully it will be minimal since it should only involve some extra EnterCriticalSection/LeaveCriticalSection calls inside the DirectX layer, and those calls do not add much overhead; as I recall, each call to those methods is only about 30-50 clock cycles (unless they have to block, of course -- but in that case, blocking is a good thing!)
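As a rough way to sanity-check the lock overhead estimate above, something like the following micro-benchmark could be used. It is only a sketch: it times an uncontended std::mutex rather than the critical sections inside DirectX, and average_lock_cost_ns is a made-up helper name.

```cpp
#include <cassert>
#include <chrono>
#include <mutex>

// Average cost in nanoseconds of one uncontended lock/unlock pair.
double average_lock_cost_ns(int iterations) {
    assert(iterations > 0);
    std::mutex m;
    volatile long guarded = 0;  // volatile so the loop body is not optimised away
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        std::lock_guard<std::mutex> lock(m);
        guarded = guarded + 1;
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / iterations;
}
```

Calling average_lock_cost_ns(1000000) gives a per-lock figure for the machine it runs on. The number says nothing about contended locks, so comparing frame rates between the two builds is still the more meaningful test.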

From looking at the code it appears that texture loading is a fairly expensive operation, plus it will occur quite frequently, so IMHO using a worker thread to load textures is well worth a small performance hit if that's what it takes to make effective use of multiple cores. For multi-core systems, having a threaded texture loader should make the main thread run noticeably smoother even when loading large textures on-the-fly.
06-12-2008 03:25 AM
Orbiter Founder
 
I have now added the DDSCL_MULTITHREADED flag to the D3D7 CVS sources. You can also get an inline graphics version of orbiter.exe which sets this flag here:

http://orbiter-forum.com/project.php?issueid=21
(the orbiter.zip thread attachment; I didn't want to attach the same file to this thread as well).

A very quick test seems to show negligible impact of the flag on performance. If it really solves the problem, it's definitely worth it.
06-12-2008 06:42 AM
O-F Administrator
 
That did the trick! I tested both the new D3D7 client and the new Orbiter EXE from the other thread. Both work like a charm! Great job fixing this, Martin!
06-12-2008 10:05 AM
Orbiter Founder
 
Great news! Note that I forgot to add the MULTITHREADED flag to the fullscreen mode flags yesterday (I really shouldn't try fixing bugs after 4am). This is now fixed for the D3D7 CVS sources, and will be fixed for the next update of the orbiter.exe inline graphics.

If the latest D3D7 client works for you in fullscreen mode, I guess we can close this issue.
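One way to keep the windowed and fullscreen flag sets from drifting apart is to build both in a single helper. The sketch below is an illustration only and is not taken from D3D7Frame.cpp: the coop:: constants are placeholder bit values with made-up names, and a real build would combine the DDSCL_* constants from ddraw.h (typically DDSCL_NORMAL for windowed mode, DDSCL_EXCLUSIVE | DDSCL_FULLSCREEN for fullscreen) and pass the result to SetCooperativeLevel.

```cpp
#include <cassert>
#include <cstdint>

// Placeholder bit values for illustration only. A real build would use the
// DDSCL_* constants from ddraw.h instead of these hypothetical names.
namespace coop {
constexpr std::uint32_t kNormal        = 1u << 0;
constexpr std::uint32_t kExclusive     = 1u << 1;
constexpr std::uint32_t kFullscreen    = 1u << 2;
constexpr std::uint32_t kMultithreaded = 1u << 3;
}  // namespace coop

// Single source of truth for the cooperative-level flags: the thread-safety
// bit is OR-ed in once, after the mode-specific bits, so neither mode can
// miss it.
constexpr std::uint32_t cooperative_flags(bool fullscreen) {
    return (fullscreen ? (coop::kExclusive | coop::kFullscreen) : coop::kNormal)
           | coop::kMultithreaded;
}

// Compile-time check that both modes carry the thread-safety bit.
static_assert((cooperative_flags(true) & coop::kMultithreaded) != 0,
              "fullscreen flags must include the multithreaded bit");
static_assert((cooperative_flags(false) & coop::kMultithreaded) != 0,
              "windowed flags must include the multithreaded bit");
```

With the static_assert lines in place, dropping the bit from either mode fails at compile time rather than showing up as a CTD later.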
06-12-2008 03:37 PM
O-F Administrator
 
I was testing in a 1280x1024 window yesterday; I will test it again today in fullscreen mode.

EDIT: OK, I updated my D3D7Client source here, but I found and fixed a typo in the DDSCL_MULTITHREADED fullscreen fix in D3D7Frame.cpp. You will want to update your source from CVS again to get the updated D3D7Frame.cpp and verify that the typo isn't present in the Orbiter embedded DX7 client as well.

After the fix, dynamic texture loading works perfectly in fullscreen mode now as well. It looks good!
06-12-2008 04:17 PM
Orbiter Founder
 
Quote:
Originally Posted by dbeachy1
 EDIT: OK, I updated my D3D7Client source here, but I found and fixed a typo in the DDSCL_MULTITHREADED fullscreen fix in D3D7Frame.cpp. You will want to update your source from CVS again to get the updated D3D7Frame.cpp and verify that the typo isn't present in the Orbiter embedded DX7 client as well.
Ah, thanks for the catch. If there's any time worse for coding than late at night it's early in the morning ...

All times are GMT. The time now is 06:28 AM.
