DirectX version of GDI's GetPixel?

flf

Newcomer
Hi Folks,

While a GDI GetPixel(x,y) is fast enough to test pixels of GDI applications, it certainly bogs down to a crawl any time your target is DirectX. I've been reading through the DirectX SDK docs looking for any clue as to whether one application can sample the screen output of another application. So far, I'm not seeing anything that looks promising.

I'd appreciate any clues to point me in the right direction, or a simple declaration of, "Sorry, but DirectX is encapsulated to only be aware of, and able to interact with, its own application/surface context."

However, I'm guessing I just don't know where to look.

Specifics, in case you want to know: What I'm shooting for is an application that samples pixels from a different application (old MMO game client) and makes simple decisions as a result of that data. Really, I'm just trying to see if I can make a little program that automatically heals people if their life gets too low. It's that simple. The game app can be either windowed or full screen.

I've already experimented with AutoIt (a multi-use scripting environment that will make compiled exes) but ran up against the fact that its GetPixelColor function (which does a GetDC followed by a GetPixel under the hood) has a 50 to 100 millisecond latency for that one call. I used an algorithm that converges in six calls to read a life bar, but that's still 300ms to read one life bar, where a group can consist of up to eight life bars! Obviously the time to resolve the values is going to be a problem.

My next experiment will be to write my own mini-app to test using GDI GetPixel where every call doesn't have to include a GetDC call. If that offers reasonable call latencies I'll probably work with it, but I'd love to use this as a springboard to start using DirectX if there's a solution that works.
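A minimal sketch of that experiment, assuming a C# app that calls the Win32 functions through P/Invoke: the screen DC is acquired once, used for a batch of GetPixel calls, then released. The class and method names (ScreenSampler, Sample, SplitColorRef) are made up for illustration; GetDC, GetPixel and ReleaseDC are the real Win32 entry points, and GetPixel returns a COLORREF laid out as 0x00BBGGRR.

```csharp
using System;
using System.Runtime.InteropServices;

static class ScreenSampler
{
    [DllImport("user32.dll")]
    static extern IntPtr GetDC(IntPtr hWnd);

    [DllImport("user32.dll")]
    static extern int ReleaseDC(IntPtr hWnd, IntPtr hDC);

    [DllImport("gdi32.dll")]
    static extern uint GetPixel(IntPtr hdc, int x, int y);

    // Value GetPixel returns when the point lies outside the clipping region.
    public const uint CLR_INVALID = 0xFFFFFFFF;

    // Splits a COLORREF (0x00BBGGRR) into red, green and blue components.
    public static void SplitColorRef(uint colorRef, out byte r, out byte g, out byte b)
    {
        r = (byte)(colorRef & 0xFF);
        g = (byte)((colorRef >> 8) & 0xFF);
        b = (byte)((colorRef >> 16) & 0xFF);
    }

    // Reads several screen pixels through a single GetDC/ReleaseDC pair.
    public static uint[] Sample(int[] xs, int[] ys)
    {
        var result = new uint[xs.Length];
        IntPtr hdc = GetDC(IntPtr.Zero); // IntPtr.Zero asks for the whole-screen DC
        if (hdc == IntPtr.Zero)
            throw new InvalidOperationException("GetDC failed");
        try
        {
            for (int i = 0; i < xs.Length; i++)
                result[i] = GetPixel(hdc, xs[i], ys[i]);
        }
        finally
        {
            ReleaseDC(IntPtr.Zero, hdc);
        }
        return result;
    }
}
```

Whether this removes most of the per-call latency depends on how much of it was the GetDC call and how much is GetPixel itself, so it is worth timing both variants.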
 
After a day of coding, it looks like a C# custom coded app using the GDI GetPixel function will be usable. Speeds are actually perfectly acceptable given the response times needed.

Now I just need to figure out the technicalities of attaching to another thread's keyboard handlers and injecting keystrokes appropriately.
 
"Sorry, but DirectX is ecapsulated to only be aware of, and able to interact with, its own application/surface context."
That would be an accurate enough summary.

You can access pixel data directly using Lock() calls on surfaces, but that won't really work from an external application unless you hook the d3d9.dll file - possible, but non-trivial and quite a bit of work on your part.

Have you considered GetDIBits instead of GetPixel? Often a bit quicker if you don't mind taking ownership of the raw data.

Now I just need to figure out the technicalities of attaching to another thread's keyboard handlers and injecting keystrokes appropriately.
DirectInput can be hooked or you can inject strokes into the Win32 message queue using SendMessage.

hth
Jack
 
I'm not even sure if this whole thing is technically possible. If the game application is
  • running full screen,
  • is using AA, and
  • the hardware does the down filter "on the fly" in the DAC feed
then there may not be any "pixels" that can be sampled.
 
Have you considered GetDIBits instead of GetPixel? Often a bit quicker if you don't mind taking ownership of the raw data.

I looked at GetDIBits, and wondered to myself how fast it is by comparison if you know the exact location of the pixels you wish to sample. For example, the life bars I wish to read are vertical, and 100 pixels tall (I custom designed a UI piece to be the exact size I wanted) and I read them by starting in the center and recursively sampling plus or minus (depending upon pixel value) half of the previous offset (which starts at 25) such that my method of reading a lifebar converges in seven calls.
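The convergence described above can be sketched as an ordinary bisection. This is a standard binary search rather than the exact offset-halving scheme from the post, and the names (LifeBar, Estimate, isFilled) are hypothetical. It assumes the bar is filled from its base upward, so that every pixel below the fill level reads as "filled" and every pixel above it does not.

```csharp
using System;

static class LifeBar
{
    // Returns how many of the barHeight pixels are filled.
    // isFilled(i) reports whether the pixel i steps up from the base is lit;
    // samples receives the number of pixel reads that were needed.
    public static int Estimate(Func<int, bool> isFilled, out int samples, int barHeight = 100)
    {
        int lo = 0;          // the fill level is known to be >= lo
        int hi = barHeight;  // ... and <= hi
        samples = 0;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2; // first probe lands on the center pixel
            samples++;
            if (isFilled(mid))
                lo = mid + 1;  // level is above the probed pixel
            else
                hi = mid;      // level is at or below the probed pixel
        }
        return lo;
    }
}
```

For a 100-pixel bar this takes six or seven reads depending on where the fill level sits. In a real program isFilled would wrap a pixel read plus a color comparison; for a dry run something like `LifeBar.Estimate(i => i < 37, out samples)` returns 37.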

Let's see if I can estimate the total number of GetPixel calls per polling cycle:

5 to verify that UI panel is where it's supposed to be and fully visible
2 to verify that another panel is properly located and visible
7*4 to read local player's status bars
2*8 to check presence and status of group members
7*{# of group members up to 8} to read life bars of group members (and self) when grouped
1 to read lagmeter color
2 to check presence/color of target
1 to check weapon status (in combat/not in combat - weapon put away)

Maximum samples in a normal pass (where both UI panels are in their proper location):
111 calls to GetPixel.
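The tally above, written out so the group size can be varied. The names are made up for illustration and the per-item counts are taken straight from the list.

```csharp
static class PollBudget
{
    // Number of GetPixel calls in one polling pass for a given group size (0 to 8).
    public static int CallsPerCycle(int groupMembers)
    {
        const int samplesPerBar = 7;

        int calls = 0;
        calls += 5;                            // first UI panel in place and visible
        calls += 2;                            // second panel in place and visible
        calls += samplesPerBar * 4;            // local player's four status bars
        calls += 2 * 8;                        // presence/status of eight group slots
        calls += samplesPerBar * groupMembers; // life bars of group members
        calls += 1;                            // lagmeter color
        calls += 2;                            // target presence/color
        calls += 1;                            // weapon status
        return calls;
    }
}
```

CallsPerCycle(8) gives 111 and CallsPerCycle(0) gives 55. At the 50 to 100 ms per call measured with AutoIt, 111 calls would take roughly 5.5 to 11 seconds, which is why the per-call cost matters so much here.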

I suppose that since GetDIBits would be able to snapshot just the areas needed, that it might be faster. Although it works with complete scan lines, so I would at a minimum be dealing with a structure that is 100 x 1024 pixels, and then reading pixel data from there.

For now I'll continue with GetPixel, as I'm more interested in solving some of the other problems, but I'll be writing my pixel parsing functions to be able to deal with using DIBit targets as well... at least in theory. Whether or not I code up classes to use it depends upon the final performance of my little application.


DirectInput can be hooked or you can inject strokes into the Win32 message queue using SendMessage.

I've already been able to successfully AttachThreadInput from my process to the game's process... so today will be experimentally trying to synthesize some keystrokes using SendInput.

Thanks for the info, folks.
 
Correction: After mucking about with calling old User32 DLLs I found out that you can just use three or four that are required for window management and focus and then use System.Windows.Forms.SendKeys.Send(string) to push keypresses over to the game client with minimal fuss.

However, I must add that these appear to be keystrokes, which limits being able to hold down keys and release them after an arbitrary time. Fortunately, this isn't an issue for what I want to do.

Sometimes the hardest part is digging up the info on the correct libraries to use.
 
Amended Correction: SendKeys.Send doesn't work with applications that hook DirectX keyboard input themselves. This means I'm back to working with SendInput as the method for injecting keystrokes into another application.
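For reference, a rough sketch of calling SendInput from C# through P/Invoke. Unlike SendKeys, it sends key-down and key-up as separate events, so a key can be held for an arbitrary time. The wrapper names (KeyInjector, MakeKey, HoldKey) are hypothetical; the struct layouts follow the Win32 INPUT, KEYBDINPUT and MOUSEINPUT definitions. MOUSEINPUT is included only so the union has the size Windows expects. Input goes to whichever window has focus, and there is no guarantee that a game reading the keyboard through DirectInput will react to it.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;
using System.Threading;

static class KeyInjector
{
    const uint INPUT_KEYBOARD = 1;
    const uint KEYEVENTF_KEYUP = 0x0002;

    [StructLayout(LayoutKind.Sequential)]
    public struct MOUSEINPUT
    {
        public int dx, dy;
        public uint mouseData, dwFlags, time;
        public IntPtr dwExtraInfo;
    }

    [StructLayout(LayoutKind.Sequential)]
    public struct KEYBDINPUT
    {
        public ushort wVk, wScan;
        public uint dwFlags, time;
        public IntPtr dwExtraInfo;
    }

    [StructLayout(LayoutKind.Explicit)]
    public struct InputUnion
    {
        [FieldOffset(0)] public MOUSEINPUT mi;
        [FieldOffset(0)] public KEYBDINPUT ki;
    }

    [StructLayout(LayoutKind.Sequential)]
    public struct INPUT
    {
        public uint type;
        public InputUnion U;
    }

    [DllImport("user32.dll", SetLastError = true)]
    static extern uint SendInput(uint nInputs, INPUT[] pInputs, int cbSize);

    // Builds a keyboard event for a virtual-key code (e.g. 0x41 for 'A').
    public static INPUT MakeKey(ushort virtualKey, bool keyUp)
    {
        var input = new INPUT { type = INPUT_KEYBOARD };
        input.U.ki.wVk = virtualKey;
        input.U.ki.dwFlags = keyUp ? KEYEVENTF_KEYUP : 0;
        return input;
    }

    static void Send(INPUT input)
    {
        var batch = new[] { input };
        if (SendInput(1, batch, Marshal.SizeOf(typeof(INPUT))) != 1)
            throw new Win32Exception(Marshal.GetLastWin32Error());
    }

    // Presses a key, keeps it down for holdMs milliseconds, then releases it.
    public static void HoldKey(ushort virtualKey, int holdMs)
    {
        Send(MakeKey(virtualKey, false));
        Thread.Sleep(holdMs);
        Send(MakeKey(virtualKey, true));
    }
}
```

The usual stumbling block is the cbSize argument: SendInput rejects the call if the marshalled INPUT size is wrong, which is why the union carries the larger MOUSEINPUT member even though only the keyboard part is used.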

Why is it that whenever I look to my 'Ineptitude' Demotivator that it actually propels me on to learning to do things right even more stringently than I had originally planned?
 
I'm not even sure if this whole thing is technically possible. If the game application is
  • running full screen,
  • is using AA, and
  • the hardware does the down filter "on the fly" in the DAC feed
then there may not be any "pixels" that can be sampled.
That's a very good point, but part of the reason why hooking into this via GDI is so slow is that it forces the GPU to resolve the image in a way that GDI can access it. Likewise for GetFrontBufferData(): the documentation states it's purposefully slow, presumably due to a forced resolve/conversion...

Let's see if I can estimate the total number of GetPixel calls per polling cycle:

5 to verify that UI panel is where it's supposed to be and fully visible
2 to verify that another panel is properly located and visible
7*4 to read local player's status bars
2*8 to check presence and status of group members
7*{# of group members up to 8} to read life bars of group members (and self) when grouped
1 to read lagmeter color
2 to check presence/color of target
1 to check weapon status (in combat/not in combat - weapon put away)

Maximum samples in a normal pass (where both UI panels are in their proper location):
111 calls to GetPixel.

I suppose that since GetDIBits would be able to snapshot just the areas needed, that it might be faster. Although it works with complete scan lines, so I would at a minimum be dealing with a structure that is 100 x 1024 pixels, and then reading pixel data from there.
The important detail is the overhead of GetPixel versus GetDIBits. Making 111 calls to GetPixel means 111 locks and unlocks (or whatever GDI may have to do), and I'd imagine the time taken to do this will make the actual data read/returned inconsequential. Alternatively, grabbing a larger block of data with only one lock/unlock via GetDIBits might be a huge amount faster.

For example, just locking a Direct3D surface is the painful part - whether you touch 1 pixel or 1000 is rarely the important factor.

SendKeys.Send doesn't work with applications that hook DirectX keyboard input themselves
DirectInput can effectively 'steal' the device from Windows, so any Win32 route may well ultimately be pointless. But this is a whole different topic to the original one ;)

Why is it that whenever I look to my 'Ineptitude' Demotivator that it actually propels me on to learning to do things right even more stringently than I had originally planned?
Because you've obviously cracked the "enjoy doing it poorly" part :D??

hth
Jack
 
Thanks, Jack, I'll keep this in mind moving forward. As it stands, I've started over several times now... and changed languages at least three times. As inexcusable as it sounds, I'm going with VB 2008 for now. Should I ever get serious about a performance app, it'll be VC++ all the way. At least in starting over my design has gotten cleaned up on each iteration.

I was not able to solve the keyboard problem myself, as both SendKeys and SendInput would work with a "normal" app (e.g. Notepad) but would not activate keys in a way that the game likes. Strangely enough, if you pop the game into "chat" mode by pressing the "/" key, it will then take input from both SendKeys and SendInput. I suppose they're hooking at a lower level to make it as difficult as possible to interfere with the key stream.

In the end I decided to hook externals from AutoItX3.dll, which allowed me to cut out an enormous amount of window and keyboard code. Works like a charm, although I know it's not exactly learning the hard internals that would probably help me later. Then again, the need to inject keystrokes into another application, especially a DirectX app, is not exactly something you'd normally do, so I don't feel too bad about not conquering this particular problem on my own.
 
As inexcusable as it sounds, I'm going with VB 2008 for now. Should I ever get serious about a performance app, it'll be VC++ all the way.
Not at all inexcusable, in fact it's more inexcusable that you (and most others) assume that VB.NET must always be slower than C++ (or whichever two languages you want to pick) :smile:

Maybe if you were to write the most perfect and optimal VB.NET code and the most perfect and optimal C++ code there would be a notable difference, but the mistake is to assume that all code is perfect. Yes, silly as it might sound, a "C++ is faster than VB" claim assumes that you can write good quality code in both languages, and whilst most good developers can write functional code in multiple languages, you need to be pretty experienced in a language before you can write genuinely high quality code. I don't find it too hard to write inefficient and slow code in C++ ;)

I used to write real-time 3D games using VB6 - drop out to type libraries to interface with C/C++ only APIs or offload bulk maths processing to an SSE/MMX optimized C DLL if necessary... The real win was in the architecture - optimization at the conceptual and algorithmic level almost always wins over instruction-level optimization...

Anyway, I'll get off my soap box now :cool:

Jack
 
You are, indeed, correct. Actually, the largest stumbling block with using VB for this type of thing is that the DirectX headers are all C++, and most example code for dealing with graphics problems is likewise C++. Figuring out how to convert the code, which seems to invariably include pointers, is a rather large time sink.

However, with VB it is so simple for me to instantiate Collections of any object type, and that is so convenient... I know there are analogues in C++, but they certainly aren't as syntactically straightforward.
 
I've made good progress... in fact, far more than I originally stated as my goal here. I'm still messing with a single player, but within a week I should have a functional multiplayer program that correctly reads screen values and responds accordingly.

However, performance is a bit problematic. I've come to the conclusion that I'm going to have to toss GetPixel and use DDB/DIB. In fact, I'm pretty interested to see how much the cycle time of the main loop falls (and/or is much more stable) once I've implemented the changes. Given some additional ideas bouncing around in my head, I'm going to shoot for simply snapping the entire 1024x768 client area of the play window and then work from there.

I had to finally bust out my Charles Petzold "Programming Windows, 5th Edition" and see what he had to say. Strangely enough, he had about one sentence to say. However... a little googling led me to the discovery that Petzold wrote a small screen capture utility using the DDB/DIB functions. A little further searching yielded up the original zip package for PC Magazine that he did, which includes the source files for the utility. I must say that Petzold writes nice, clean, small programs with good documentation.

I think from here I'll be able to proceed fairly rapidly... if only I didn't have to convert from C++ to VB along the way.

I'm tempted... so tempted that I think I will... read up on compiling functions into DLLs. Frankly, I'd rather use the clean code as-is without having to refit it to another language. I'd be having to call DLLs anyhow, so what's another layer in the chain?

Anyone have any warnings for me here? Is shoveling 786432 pixels into my program 5 times a second potentially problematic? Especially if I am calling the function that returns the bitmap data as an external DLL?
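As a back-of-the-envelope check on the data volume in that question, assuming 32 bits (4 bytes) per pixel, which the post does not state:

```csharp
static class CaptureBudget
{
    // Raw bytes moved per second for full captures at the given size and rate.
    public static long BytesPerSecond(int width, int height, int bytesPerPixel, int capturesPerSecond)
    {
        return (long)width * height * bytesPerPixel * capturesPerSecond;
    }
}
```

BytesPerSecond(1024, 768, 4, 5) comes to 15,728,640 bytes, i.e. 3 MiB per capture and 15 MiB per second. That only covers the size of the buffers; the cost of getting the pixels off the screen in the first place is a separate matter and has to be measured.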
 
An update: I solved the problem by using the Graphics and Bitmap objects within .NET. (Okay, I hate dot-net's name... utterly idiotic, and impossible to use in a regular sentence without looking all screwed up.) After finding a lucid example it was extremely simple to use the Graphics.CopyFromScreen method to dump the desktop to a Bitmap object, where you can now make very fast GetPixel calls. All in all, a much better solution than a direct GetPixel call to a DC.
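A sketch of the capture step, with one optional refinement. Bitmap.GetPixel is generally reported to get slow when called a very large number of times, so this version locks the bitmap once and copies it into a byte array that can be indexed directly. The names (ScreenSnapshot, Capture, PixelAt) are hypothetical, and the 32bpp ARGB format is an assumption; in that format each pixel is stored as the bytes B, G, R, A.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;

static class ScreenSnapshot
{
    // Copies a screen rectangle into a managed byte array.
    // stride receives the number of bytes per row.
    public static byte[] Capture(Rectangle area, out int stride)
    {
        using (var bmp = new Bitmap(area.Width, area.Height, PixelFormat.Format32bppArgb))
        {
            using (var g = Graphics.FromImage(bmp))
            {
                g.CopyFromScreen(area.Location, Point.Empty, area.Size);
            }

            BitmapData data = bmp.LockBits(
                new Rectangle(0, 0, bmp.Width, bmp.Height),
                ImageLockMode.ReadOnly,
                PixelFormat.Format32bppArgb);
            try
            {
                stride = data.Stride;
                var pixels = new byte[stride * data.Height];
                Marshal.Copy(data.Scan0, pixels, 0, pixels.Length);
                return pixels;
            }
            finally
            {
                bmp.UnlockBits(data);
            }
        }
    }

    // Reads one pixel from a buffer produced by Capture.
    public static Color PixelAt(byte[] pixels, int stride, int x, int y)
    {
        int i = y * stride + x * 4; // 4 bytes per pixel: B, G, R, A
        return Color.FromArgb(pixels[i + 3], pixels[i + 2], pixels[i + 1], pixels[i]);
    }
}
```

If a hundred or so reads per pass through Bitmap.GetPixel are already fast enough, the LockBits step is not needed. The using blocks matter either way, since Bitmap and Graphics both hold native GDI+ resources.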

By using a 200ms trigger, I am dumping the whole screen to a bitmap ~5 times a second. (Not exactly, because I stop the timer while I'm doing some operations... but it's certainly fast enough.) The entire loop has settled down to sub-100ms times for *all* processing, with usual times being closer to 30ms. It turns out that the DLL I'm using to insert keystrokes is probably responsible for the big swings in cycle times, but I'm not going to worry about that too much as it's getting the job done.
 
I just found this thread via google. I have a similar app, and yes I'm reading pixels from a game as well!

I am also continuously looping and do a CopyFromScreen, then sample 20 different pixel shades via GetPixel. My problem is that after a few thousand iterations, I get an "Out of memory" error and the application dies.

It is my understanding after lots of research that some of the PixelFormat types for the Graphics are bugged. Have you encountered this issue? Would you mind sharing the way you structured your continuous loop??

Thanks in advance.
 
Here is my code block:

Code:
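        // Needs: using System; using System.Drawing; using System.Threading;
        //        using System.Windows.Forms;
        // app, press and keystr are assumed to be fields of the enclosing class;
        // ActiveApplTitle() and Log.TraceOut() are defined elsewhere.
        private void Do()
        {
            // Create the capture buffer once; the using blocks release the
            // native GDI+ resources if the loop is ever left via an exception.
            using (Bitmap b = new Bitmap(3, 15, System.Drawing.Imaging.PixelFormat.Format32bppArgb))
            using (Graphics g = Graphics.FromImage(b))
            {
                int iterations = 0;

                while (true)
                {
                    try
                    {
                        iterations++;

                        if (iterations % 100 == 0)
                        {
                            Log.TraceOut(string.Format("{0} objects", iterations));
                        }

                        app = ActiveApplTitle();

                        if (app.Contains("Game I am cheating at"))
                        {
                            // Copy the top-left 3x15 block of the screen into b.
                            g.CopyFromScreen(0, 0, 0, 0, b.Size);

                            press = false;

                            int col = 0;
                            int row = 0;

                            //loop through the 30 pixels, read color, build keystr

                            if (press)
                            {
                                SendKeys.SendWait(keystr);
                                Thread.Sleep(800); //pause after keypress
                            }
                        }
                    }
                    catch (Exception ex)
                    {
                        Log.TraceOut("ERROR : " + ex.Message);
                    }

                    Thread.Sleep(200);
                }
            }
        }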
 