New Release D3D9Client Development

Fix for Sudden Failure of Stereoscopic 3D with NVidia for any D3D9Client release !!

The NVIDIA API provides access to its Stereoscopic 3D features. One has to download the required NVIDIA libraries and compile D3D9 with them. As it is not a feature used by a majority of users, for development purposes it is easier to just compile without the NVIDIA libraries that would enable the 3D Stereoscopic features.

The version you downloaded wasn't compiled with the NVIDIA libraries and doesn't support 3D stereoscopic output. To my knowledge there is no recent version that has it.
Got your point ... thank you, SolarLiner .. but there's a different reason - the 3D stereo failure I suddenly suffered from is not caused by the non-NVAPI-D3D9Client compilation variant but the root cause is a weird DirectX Direct3D registry entry ... and nVidia 3D stereo does work with the standard out-of-the-box non-NVAPI-D3D9Client ... plz see below.

So here's the story:

At first I studied the CodePlex Source Code "howto", and compiled a differently tagged D3D9-2016R1 version from the source trunk with linked 32-bit NVidia R375 API lib / headers using Visual Studio 2017 + 2010 VC++/ DirectX SDK. The D3D9ClientLog.html shows that the NVidia API is now initialized; this and the Orbiter.log files indicate no problems, and Orbiter-ng.exe also behaves correctly ... BUT: still no 3D stereo :facepalm:.
Now things were getting really strange:
When I just renamed the installation folder, without any other change, stereo 3D worked again. Renaming it back made stereo 3D defunct again. Now, as we know: the installation folder name or location should not have any effect on Orbiter's function, right !?!

So next thing I did in the root cause analysis was to re-establish the original D3D9Client.dll 2016R1 onto my otherwise unchanged Orbiter2016 installation with the original folder name and defunct 3D stereo. Guess what: as soon as I just renamed the installation folder, 3D stereo worked despite the "Not compiled with nVidia API" log message from the original client !! Same happened for my second also 3D-stereo-broken installation on the Beta versions: fixed by folder rename.

Next step was to analyze nVidia and/or DirectX influence.
Adding orbiter.exe, orbiter-ng.exe, and/or Modules/Server/orbiter.exe in the nVidia Control Panel section for "Program Settings" did not have any effect on 3D stereo functioning or being defunct. Also nVidia registry entries did not have an impact.

Then I finally found the "smoking gun" ... Direct3D

There are entries in the registry done by DirectX for 3D operation at [HKEY_USERS\<...my user SID...>\SOFTWARE\Microsoft\Direct3D\Shims\MaximizedWindowedMode]

In my installation these were:
"E:\\Programs\\Orbiter2016BetaR65+D3D9-Beta25_2-for-Rev65\\modules\\server\\orbiter.exe" = dword:00000001
"E:\\Programs\\Orbiter2016\\modules\\server\\orbiter.exe" = dword:00000001
Deleting these entries made the defunct stereo 3D work again even with the original folder names being kept !!
I do not have any idea what action causes these entries to be made, but if they are created, by whatever cause, they render stereo 3D defunct. Maybe we have to look at possible impacts from D3D9Client's "advanced video" tab, but this is just a wild guess.

So in a nutshell:
A) The original D3D9Client.dll 2016R1 on Orbiter2016 works for NVidia stereo 3D even if not compiled with the nVidia API. The same is true for D3D9-Beta25_2-for-Rev65 on Orbiter2016BetaR65.

B) If for whatever reason the 3D stereo function goes dead, a simple rename of the installation folder for Orbiter, or deletion of the before-mentioned registry entry/entries below [HKEY_USERS\<...user SID...>\SOFTWARE\Microsoft\Direct3D\Shims\MaximizedWindowedMode] fixes it.
As usual: be extra careful when fiddling around in the registry, do an export of the entries and double-check for the correct HKU location before deleting. If in doubt, or inexperienced, better leave the registry as is and rename the Orbiter installation folder instead of risking bricking your complete Windows installation ... it's your choice, and I don't take any responsibility for application of this registry hack, or any side-effects from it.

If unsure about your user SID: just run
"wmic useraccount get sid,name"
in a CMD window to list all SIDs, including that of the account you're logged in with, and find the correct HKU entry.
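The registry steps from B) can be sketched as a small script that only builds the command lines (backup first, then one delete per entry) so they can be read and double-checked before anything is run. This is a minimal sketch, not a tested fix: `reg export` and `reg delete` are the standard Windows command-line tools, but the function names, the backup file name and the SID `S-1-5-21-EXAMPLE` are placeholders made up for illustration. The value names are the executable paths quoted above.

```python
# Sketch only: builds command lines as text, it does not touch the registry.
SHIM_SUBKEY = r"SOFTWARE\Microsoft\Direct3D\Shims\MaximizedWindowedMode"


def shim_key(user_sid):
    """Full HKU path of the MaximizedWindowedMode shim key for one user SID."""
    return "HKU\\" + user_sid + "\\" + SHIM_SUBKEY


def backup_command(user_sid, backup_file):
    """Command that exports the whole shim key to a .reg file as a backup."""
    return 'reg export "{}" "{}" /y'.format(shim_key(user_sid), backup_file)


def delete_commands(user_sid, exe_paths):
    """One 'reg delete' per value; each value is named after an exe path."""
    key = shim_key(user_sid)
    return ['reg delete "{}" /v "{}" /f'.format(key, path) for path in exe_paths]


# Placeholder SID (hypothetical); use the one reported for your own account.
sid = "S-1-5-21-EXAMPLE"
entries = [
    r"E:\Programs\Orbiter2016BetaR65+D3D9-Beta25_2-for-Rev65\modules\server\orbiter.exe",
    r"E:\Programs\Orbiter2016\modules\server\orbiter.exe",
]

print(backup_command(sid, "shims_backup.reg"))
for command in delete_commands(sid, entries):
    print(command)
```

Printing the commands instead of executing them keeps the manual double-check of the HKU location in the loop.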


C) I am still puzzled as to why the nVidia 3D stereo works with original D3D9Client.dll despite the "not compiled with nVidia API" log message. My re-compiled dll with the _NVAPI_H define and linked nVidia API lib/headers also works but I do not even see a performance or functional impact for the 3D stereo aside from the "nVidia API initialized" message in the log.

Attached plz find the logs and the compiled NVidiaAPI-enabled D3D9Client.dll ("2016R1+ for Orbiter2016") for the Modules/Plugin folder plus the VirusTotal report (for whatever reason it's flagged 1/55, as the original 2016R1 dll also does .. probably some too nosy heuristic test by 1 of the 55 applied scanners ?!?).
Dear Fellow Orbinauts: Plz note that this attached dll is NOT INTENDED for the user community, and has no authorization or endorsement for public usage by the OVP developers.

Cheers - Rob
 

Night-lights shouldn't have a blue dirt anymore.

Hello Jarmonik, :cheers:
with the latest revision, the night-light colors on the tiles appear staggered, and the yellow has disappeared.....

D3D9ClientBeta25.1-forBETA r64(r795)
[screenshots: 29zzvtM.jpg, LMoFroT.jpg]

D3D9ClientBeta25_2-forRev65
[screenshots: td8DdGL.jpg, VKZ38mM.jpg]


Is it just my problem?
:facepalm:
 
Say, I have two somewhat technical questions.

The first, is it possible to write custom shaders for a vessel for D3D9client?

And the second, how does D3D9 client handle draw calls of objects that are composed of several meshes (like bases, or vessels that consist of multiple meshes). Is there one pass per material in the entire object, or is there a pass per material per mesh?
 
The first, is it possible to write custom shaders for a vessel for D3D9client?

You can replace the Vessel.fx with your own but other than that, no.
Technically it would be possible to specify a name of a custom shader in a material section of a mesh file. But currently the mesh files are read by the Orbiter and there's not much we can do about it.

And the second, how does D3D9 client handle draw calls of objects that are composed of several meshes (like bases, or vessels that consist of multiple meshes). Is there one pass per material in the entire object, or is there a pass per material per mesh?

We can't change the order in which the groups are rendered. All attempts to improve the rendering efficiency have failed. So, there is one draw call for each group in each mesh in each object.

To optimize the rendering, mesh groups should be pre-ordered in the mesh file by material and texture to minimize material and texture changes. For very complex constructions, this optimization can raise the frame rate by about 33%.
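The pre-ordering idea can be illustrated with a toy example. This is only a sketch under assumptions: the group records, field names and helper functions below are hypothetical and are not the mesh file format or any D3D9Client API. It counts how often the material or texture would have to be switched when groups are drawn in file order, before and after sorting.

```python
def state_changes(groups):
    """Count material and texture switches when drawing groups in list order."""
    changes = 0
    previous = None
    for group in groups:
        if previous is not None:
            changes += group["material"] != previous["material"]
            changes += group["texture"] != previous["texture"]
        previous = group
    return changes


def preorder(groups):
    """Return the groups sorted by material index, then by texture index."""
    return sorted(groups, key=lambda g: (g["material"], g["texture"]))


# Hypothetical mesh with four groups that alternate between two materials.
groups = [
    {"name": "hull_a", "material": 0, "texture": 1},
    {"name": "window", "material": 1, "texture": 0},
    {"name": "hull_b", "material": 0, "texture": 1},
    {"name": "antenna", "material": 1, "texture": 0},
]

print("switches in file order:", state_changes(groups))
print("switches after sorting:", state_changes(preorder(groups)))
```

One draw call per group remains either way; only the number of state switches between calls goes down. Reordering also changes the group indices, so anything that refers to groups by index would have to be updated to match.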
 
Guys, you seem to have a performance issue in connection with multiple meshes on the same vessel.

I did some benchmarks to come up with reasonable guidelines for texture sizes for IMS2, and noticed something entirely different.

I spawned a benchmark scenario with about 60 docked together vessels. Simple geometry, relatively high texture density, no normal/spec maps.
D3D9 client performed admirably at between 130 and 135 fps with black space in the background.

Then I integrated them, which means IMS takes all the vessels, and puts all their meshes into a single vessel. The result should be fairly similar, as these are modules without any logic.
The framerate dropped by more than half to around 60 to 65 fps!

My first thought was that I had a bottleneck somewhere in the IMS2 core that I hadn't realised so far.
But I did the same test in the inline client, with a result of around 50 to 55 fps unintegrated (each mesh an individual vessel), and around 60 fps with all meshes in one vessel. The slight improvement was expected, as from this point on Orbiter has to manage the state of only a single vessel as opposed to 60.

The backchecking with the inline client clearly shows that the performance drop is not related to IMS2.

My best guess would be that you have some bug in the algorithm that draws the individual meshes in a vessel. It almost seems as if it does double the amount of rendering passes for meshes that are in the same vessel as opposed to individual vessels with a single mesh each.
 
Guys, you seem to have a performance issue in connection with multiple meshes on the same vessel.

I did some benchmarks to come up with reasonable guidelines for texture sizes for IMS2, and noticed something entirely different.......

That doesn't really make any sense. Your results would indicate that running the inner loop alone is less efficient than running the outer loop plus inner loop together. Review of the code doesn't reveal anything suspicious. There is a small chance that it could be related to dynamic insertion and deletion of meshes, causing them to be rendered multiple times.

Could you check the statistics panel "L-Ctrl" + "L-Shift" + "C"? Especially pay attention to "Meshes Rendered" and to the scene timings/composition at the bottom of the screen. Is there a shift in "Scene Drawing" and "Non-client specific tasks" ?
 
Your results would indicate that running the inner loop alone is less efficient than running the outer loop plus inner loop together.

Not exactly, no. It would indicate that the inner loop has a different growth rate than the outer loop.

Could you check the statistics panel "L-Ctrl" + "L-Shift" + "C"? Especially pay attention to "Meshes Rendered" and to the scene timings/composition at the bottom of the screen. Is there a shift in "Scene Drawing" and "Non-client specific tasks" ?

Will do this evening.
 
Just cos it's pretty !
Thank you so much for all your work on this.
JMW:thumbup:
 

Attachments: canaria.jpg
I did a direct comparison of the D3D9 output between the scenarios, find attached screenshots below. Nothing seems wrong with the drawing. In fact, as expected, in the integrated scenario almost everything is equal or less. The only value that is higher is the number of meshes, which has increased by 2. I think I spawned in some other meshes between the scenarios to look at some things, but they are not rendered at the moment in any case. The number of rendered meshes is equal.
But the waiting time is almost doubled, which would correspond very nicely to the drop in framerate. Now, I don't exactly know what this value is about, as probably nobody went and said "hey, let's put some waiting time in here", at least not in a single-threaded environment. Still, that seems the only thing that is really off in this data for somebody that doesn't know the inner workings of the client too well. Apart from the framerate counter, of course... :shifty:

Here's the screenshots complete with FPS counter, left is integrated (slower), right is docked (faster).
 

Attachments: D3D9_IMS2_Integrated.jpg, D3D9_IMS2_docked.jpg
I did a direct comparison of the D3D9 output between the scenarios, find attached screenshots below......

So, you have 62 vessels with 1 mesh per vessel running much faster than one vessel with 62 meshes in it. The latter case should be a little bit faster, but the screenshots show it to be much slower. In fact the GPU load has almost doubled. This doesn't make any sense. I guess it might be a good idea to disable the vessel rendering in Scene.cpp and see if the difference can be reproduced. That test should confirm whether the problem originates from the vessel rendering or from some place else.

Also, I have never seen a waiting time as long as the one shown in your screenshots. DirectX 9 is fully multi-threaded. The main thread issues draw calls, which are placed in DirectX's command queue and then executed by DirectX worker threads. A draw call returns almost immediately, since it does nothing more than put a call in the queue. The "green" section covers the scene-related mathematics and putting the draw calls in DirectX's command queue; with that, the client's part of the frame is completed.

To buy as much time as possible for the worker threads to process the data from the queue, Orbiter's physics and vessel avionics are processed next. That is the "blue" section.

After there is nothing more to do for the main thread, it is placed on hold to wait for the DirectX worker threads to complete their work and display/present the frame. This is the "red" section.

This order of operations (green-blue-red) is different from the one used by the inline engine (green-red-blue). You can toggle the default behavior by setting "PresentLocation" to "0" in D3D9Client.cfg. Does it have any effect ?
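Why the order matters can be shown with a toy timing model. This is a rough sketch under stated assumptions, not how the client actually measures anything: the function name, its parameters and the sample numbers are made up, and the model simply assumes the queued work needs `gpu` milliseconds counted from the end of the green section.

```python
def frame_time(green, blue, gpu, present_after_physics=True):
    """Toy model of one frame; returns (frame_ms, wait_ms).

    green: main-thread time to build the scene and queue the draw calls
    blue:  main-thread time for physics and avionics
    gpu:   time the queued work still needs, counted from the end of green
    """
    if present_after_physics:
        # green-blue-red: the queued work runs while physics is processed,
        # so only the part not hidden behind "blue" shows up as waiting.
        wait = max(0.0, gpu - blue)
    else:
        # green-red-blue: the main thread waits for the full remaining work
        # before it even starts on physics.
        wait = gpu
    return green + blue + wait, wait


# Made-up sample numbers in milliseconds.
overlapped = frame_time(2.0, 3.0, 5.0, present_after_physics=True)
serial = frame_time(2.0, 3.0, 5.0, present_after_physics=False)
print("green-blue-red (frame, wait):", overlapped)
print("green-red-blue (frame, wait):", serial)
```

In this model the red section shrinks by however much work fits behind the blue section, which is the point of presenting the frame last.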

Could you also check that:
- post-processing is disabled.
- reflections are disabled. (this would be a severe problem with 60 vessels near the camera)
- custom cameras are disabled.
- Frame-rate limiter is disabled.
 
After there is nothing more to do for the main thread, it is placed on hold to wait for the DirectX worker threads to complete their work and display/present the frame. This is the "red" section.

Ah. So the waiting time is the main thread blocking to wait for the GPU to finish rendering. Right, things make a bit more sense now.

This order of operations (green-blue-red) is different from the one used by the inline engine (green-red-blue). You can toggle the default behavior by setting "PresentLocation" to "0" in D3D9Client.cfg. Does it have any effect ?

Waiting time increased by an average of about 1.5 milliseconds. Since the order was changed for optimisation purposes, that sounds like expected behavior.

Could you also check that:
- post-processing is disabled.
- reflections are disabled. (this would be a severe problem with 60 vessels near the camera)
- custom cameras are disabled.
- Frame-rate limiter is disabled.

Check on all of those (reflections are set to "planet only").

I guess it might be good idea to disable a vessel rendering from Scene.cpp and see if the difference can be reproduced. That test should confirm if the problem originates from the vessel rendering or from some place else.

Ok, I'll check out the code. Could you tell me from which to which line I should comment out the code, just so we're sure I'm doing the right test?

---------- Post added at 01:48 PM ---------- Previous post was at 12:50 PM ----------

Whoops, getting a bunch of build errors. The compiler can't find declarations for all the MESHM_* structs in Mesh.cpp. They're not in Mesh.h. Is my repo corrupted? SVN said it checked out revision 820, is that the right one?

Not getting any missing header errors, so technically those declarations should be somewhere...
 
Ok, I'll check out the code. Could you tell me from which to which line I should comment out the code, just so we're sure I'm doing the right test?

Whoops, getting a bunch of build errors. The compiler can't find declarations for all the MESHM_* structs in Mesh.cpp. They're not in Mesh.h. Is my repo corrupted? SVN said it checked out revision 820, is that the right one?

Not getting any missing header errors, so technically those declarations should be somewhere...

Those declarations are located in gcConst.h, which is located in Orbitersdk/include/. Currently only the VS2015 project files are up-to-date.

820 is the right/latest revision and the lines that should be commented out are 1198-1217.
 
Right, the headers from my orbitersdk were overriding the headers from the orbitersdk in D3D9 client, just had to switch priorities around.

Test performed, performance of both scenarios is practically identical, plus minus the odd millisecond here and there.

However, I also performed some other tests, and the results now point in a not-so-closely d3d9 related direction. D3D9 seems to have something to do with it, but it's certainly not an algorithmic problem as I first supposed.

In short, I made tests comparing performance of both scenarios in inline client and D3D9client, as well as with stock IntelHD gpu vs. NVidia Geforce 840M. Shouldn't even be a competition, right? HA! Take a look at these measurements:

Measurements taken with Nvidia (in fps) (prior measurements were all taken with this card):
Inline client: Merged 57-60; Docked 50-54;
D3D9client: Merged 60-66; Docked 130-132;

Measurements taken with IntelHD:
Inline client: Merged 54-55; Docked 52-53;
D3D9client: Merged 99-100; Docked 99-100;

In other words, the drop only occurs if I run D3D9client on the Nvidia Geforce. I am at a complete loss what might be going on here at the moment, but I don't think we'll find it in D3D9clients code, or at least not there alone.
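To put the numbers side by side, the fps ranges above can be turned into midpoints and per-frame times. This is just arithmetic on the posted measurements; taking the midpoint of each range is an assumption made for the comparison, and the helper names are made up.

```python
# fps ranges exactly as posted: (low, high) for the merged and docked scenario.
measurements = {
    ("Nvidia", "Inline"): {"merged": (57, 60), "docked": (50, 54)},
    ("Nvidia", "D3D9"): {"merged": (60, 66), "docked": (130, 132)},
    ("IntelHD", "Inline"): {"merged": (54, 55), "docked": (52, 53)},
    ("IntelHD", "D3D9"): {"merged": (99, 100), "docked": (99, 100)},
}


def midpoint(fps_range):
    """Midpoint of a (low, high) fps range."""
    return (fps_range[0] + fps_range[1]) / 2


def frame_ms(fps):
    """Average time per frame in milliseconds for a given frame rate."""
    return 1000.0 / fps


def merged_to_docked_ratio(key):
    """Merged fps divided by docked fps; below 1 means merged is slower."""
    entry = measurements[key]
    return midpoint(entry["merged"]) / midpoint(entry["docked"])


for key, entry in measurements.items():
    merged = midpoint(entry["merged"])
    docked = midpoint(entry["docked"])
    extra = frame_ms(merged) - frame_ms(docked)
    print("{:8s} {:7s} ratio {:.2f}, merged costs {:+.1f} ms per frame".format(
        key[0], key[1], merged_to_docked_ratio(key), extra))
```

Only the Nvidia/D3D9 row comes out with a ratio far below 1; the other three rows stay close to or above 1.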
 
First thing I would check here is to make sure your nVidia driver settings aren't overriding anything set in the D3D9 client, then make sure you're using the latest drivers. One would expect the 840M to be out-performing the IntelHD, which it does in the Docked configuration under D3D9 client. My gut tells me it has to be something driver or configuration related.
 
The problem is, I can't think of a configuration that would have such a specific effect, except maybe for thread optimisation, and for that I already tested all valid values.


PACKAGE HAS BEEN UPDATED. Tested this time, let's hope I'm not wasting more of other people's time :shifty:

I think I might need a little help here to assess the scope of the behavior, so here's a call to all (well, as many as possible, anyways...)

Attached below is a zipfile with the IMS2 binary, the two test-scenarios, and the Aries second stage from 4throcks constellation add-on which I used for testing (1 mesh, 2 dds files). If you have the constellation add-on installed, don't worry, it won't destroy it.

Install the whole thing by unzipping into the orbiter folder, and run both benchmark scenarios (activate the fps counter in the modules tab first!).

Points of interest:
- The framerate when running benchmark docked and benchmark integrated.
- The model of the video card used for the test.

Of special interest is whether the behavior crops up on non-NVidia gpus, and whether it crops up on all Nvidia gpus.

new package:
http://orbiter-forum.com/attachment.php?attachmentid=15037&d=1485591888
 
Install the whole thing by unzipping into the orbiter folder, and run both benchmark scenarios (activate the fps counter in the modules tab first!).

That package appears to be no-go:


000000.000: IMS2:[!WARNING!] font created for non-existing styleset "default". Creating styleset implicitly
000000.000: IMS2:[!WARNING!] IMS2\BENCHMARK\ARIES_BENCHMARK.cfg: No valid module function defined!
000000.000: >>> ERROR: No vessel class configuration file found for:
============================ ERROR: ===========================
IMS2\HABITAT\BN200_BIG_NODE
[Vessel::OpenConfigFile | .\Vessel.cpp | 243]
===============================================================
000000.000: >>> TERMINATING <<<

I also moved the ARIES_BENCHMARK.cfg from VESSELS/IMS2/BENCHMARK\ to IMS2\BENCHMARK\ to address some other errors.
 
000000.000: >>> ERROR: No vessel class configuration file found for:
============================ ERROR: ===========================
IMS2\HABITAT\BN200_BIG_NODE
[Vessel::OpenConfigFile | .\Vessel.cpp | 243]
===============================================================

Whoops, completely forgot that that node is in there too. :facepalm:

I also moved the ARIES_BENCHMARK.cfg from VESSELS/IMS2/BENCHMARK\ to IMS2\BENCHMARK\ to address some other errors.

That shouldn't have been necessary though. I'll see what's wrong there.

---------- Post added at 08:12 PM ---------- Previous post was at 07:46 PM ----------

Right, new package. There really shouldn't be a need to move the Benchmark folder, it should work fine as it is. Unless you changed your default vessel folder location, that is, since the path is relative to that.
The only thing that was missing was the node holding all the other meshes together.

Don't worry about the two IMS2 warnings that will pop up, they're normal for this setup (no logic in modules). I do hope this works now. I've tested with what should be a clean install, but can't be 100% sure without redownloading orbiter to try it out... :shifty:

http://orbiter-forum.com/attachment.php?attachmentid=15037&d=1485591888
 
Darn, looks like attachments get deleted if you just upload them and link to them without explicitly attaching them to a Post. Kinda makes sense, since this isn't supposed to be generic webspace...

Attached it properly this time.
 


Darn, looks like attachments get deleted if you just upload them and link to them without explicitly attaching them to a Post. Kinda makes sense, since this isn't supposed to be generic webspace...

Ok, Thanks.

I analyzed the scenarios and I can confirm the performance drop in the integrated scenario. The docked scenario runs at a 3 times higher frame rate with ATI hardware.

The reason for the low performance in the integrated scenario is that the meshes are converted to dynamic meshes due to a call to EditMeshGroup(); this does not happen in the docked scenario. Dynamic meshes are slower to render but faster to read/modify/write.
 