Renderscript blog post

Discussion in 'GPGPU Technology & Programming' started by codedivine, Feb 1, 2013.

  1. codedivine

    Regular

    Joined:
    Jan 22, 2009
    Messages:
    271
    Likes Received:
    0
  2. french toast

    Veteran

    Joined:
    Jan 5, 2012
    Messages:
    1,667
    Likes Received:
    9
    Location:
    Leicestershire - England
    I've just stumbled across this API and its features. Very interesting: it looks to be a mobile competitor to OpenCL, using all of the SoC's processing resources to complete certain tasks.

    The puzzling thing is that it's been around since Ice Cream Sandwich (4.0), yet I've never seen much written about it. Strange, because once I did some digging and watched the Google I/O session it seems extremely beneficial.

    Anyway, here are a couple of resources I quickly found whilst researching, including some graphs of the performance increases between Android versions on the same hardware (Nexus 4) and also the massive performance gains seen when using Renderscript on the Nexus 10.
    The Exynos 5250 gets quite a performance bump when using the Mali-T604 to assist the Cortex-A15s in certain benchmarks/tasks.

    http://android-developers.blogspot.co.uk/2013/01/evolution-of-renderscript-performance.html?m=1

    https://developers.google.com/events/io/sessions/331954522
     
  3. french toast

  4. Tim Murray

    Tim Murray the Windom Earle of mobile SOCs
    Veteran

    Joined:
    May 25, 2003
    Messages:
    3,278
    Likes Received:
    66
    Location:
    Mountain View, CA
    so I guess I can say some stuff about this, since I'm largely responsible for the direction of the RS programming model on Android in 4.3 and forward.

    scatter is added in 4.3 (I actually had it implemented back in December or January, it's frustrated me to no end that I've had to wait this long to talk about it).

    I can't really comment specifically about NDK bindings, except that I've heard this request a lot and I sympathize with it.

    the biggest reason why people have been really hesitant to look at RS is the lack of backwards compatibility to ICS or GB, which is solved in the very near future with an upcoming SDK update that ships the RS compatibility library (which lets you run precompiled CPU binaries on older devices and use native RS on newer devices).

    points 2, 3 and 4 in the blog post are more of a feature than a regression; RS has very different goals than CUDA (CUDA wants peak GPU perf, RS wants performance portability with good perf), and the execution model is changed accordingly. I can go into excruciating detail if people want, but it's basically that the more control app developers have over scheduling, the more the resulting code will be tied to a single platform performance-wise. (also, SoCs are way more complicated than you think and have a lot of different places to run code, so specific device dispatch isn't good)
     
  5. codedivine

    Well that is all well and good Tim, but I also want to ask: Why not ALSO provide OpenCL? I don't see why Google is actively pushing against OpenCL when Renderscript and OpenCL can peacefully coexist. Why not offer the choice to developers who want it? It will immediately solve your NDK problems as well. You can continue to develop Renderscript according to your own vision and goals, while at the same time offer OpenCL. I don't see why this can't be done.

    For example, Nexus 10 and Nexus 4 firmware earlier supported OpenCL but that has been removed in 4.3 which was no doubt done at Google's behest. I do not understand this active opposition of an open, royalty free API. Isn't Android supposed to embrace openness and choice?

    I also noticed that your carefully worded reply does not mention OpenCL at all.
     
  6. Tim Murray

    doing so isn't really beneficial to anyone. OCL uses exactly the same execution model as CUDA; that model does not provide performance portability because it never made the slightest attempt to do so. on Android, there are too many architectures (just for CPUs: A7, A9, A15, Krait, upcoming ARM parts, Intel parts, MIPS parts. multiply that number by 3 for GPUs, and the GPUs are extremely divergent architecturally and functionally) for someone to actually be able to write OCL kernels that perform well universally. what would happen is somebody will write an app, test on a few devices, get decent performance there, and almost certainly have an unacceptable experience everywhere else. this would be bad for app developers, bad for users, and bad for Android as a whole. I want people to use all of the available performance on these architectures, not get burned by the sheer number of different permutations the first time they try and swear off it forever.

    I doubt any of the true OCL believers will be convinced by that, but that's the reason. OCL is fine as a common language to target specific architectures, but the execution model doesn't work when you're writing a kernel to target some architecture you've probably never seen with an arbitrary selection of processors and capabilities.

    I'm also more and more convinced that you don't want specific-device dispatch in mobile, as the number of possible processors increases (CPU, GPU, ISP, DSPs? other stuff?) and dataflow becomes more important (eg, what's the best way to carry out a specific sequence of computations on camera input to encoder output on an architecture you've never seen and may have processors you wouldn't expect?). this also ignores all the good stuff you can do if you're part of the OS and not a tacked-on library (system wide scheduling across multiple processors, stuff like that).
     
  7. Dade

    Newcomer

    Joined:
    Dec 20, 2009
    Messages:
    206
    Likes Received:
    20
    Tim, first of all, thank you for sharing your thoughts.

    So do you claim a RenderScript kernel will perform well universally? I'm a bit skeptical :wink:

    In any case, RenderScript really has to deliver something that OpenCL cannot, or it is just "yet another GPU computing API" (TM).

    The other open question is why not improve OpenCL instead of introducing "yet another GPU computing API" (TM)? For instance, the concept of an embedded profile already exists in OpenCL, and it could have been extended to fix some of the issues you are describing.

    Did I say how annoying it is to have "yet another GPU computing API" (TM)? :wink:
     
  8. willardjuice

    willardjuice super willyjuice
    Moderator Veteran Alpha Subscriber

    Joined:
    May 14, 2005
    Messages:
    1,373
    Likes Received:
    242
    Location:
    NY
    How do you define "well"? RS doesn't aim for peak performance. It aims for decent performance on specialized tasks across a broad range of hardware. I don't think that's an unreasonable goal as long as the execution model remains rigid.

    Why is it Google's job to fix OpenCL? I suspect it's a lot cheaper/quicker rolling your own API than getting involved with a bureaucratic mess (Khronos). But let's pretend that OpenCL isn't broken; RS and OpenCL simply don't share the same goals. Do you think current OpenCL members want to sacrifice performance for portability? How do you propose Google and OpenCL members rectify these incompatibilities (outside of creating another version of OpenCL)? I know they are similar, but that doesn't mean we should try to shoehorn APIs into solving problems they weren't designed to solve.

    On an orthogonal note, just because something is open doesn't make it good. It's unreasonable to expect Google to support all open platforms, especially when they don't align with Google's goals. That's not to say I'm against OpenCL in Android, but I certainly understand why Google is hesitant to add support.
     
  9. BlueByLiquid

    Newcomer

    Joined:
    Jul 31, 2013
    Messages:
    6
    Likes Received:
    0
    What I don't expect is for Google to go out of their way to prevent users from using OpenCL. I couldn't care less if Google wants to play with RSC like they did for graphics for a couple of years and then kill it off, but there is no reason to disable OpenCL (other than fear that it will get more use than their own library). Google has decided to go against what developers want and removed something because they can't compete with it. With 4.3 on the Nexus 10 they couldn't remove the OpenCL driver because it is needed for RSC on ARM platforms, so they disabled the Clang front-end kernel compiler to prevent anyone else from using it.

    So apparently OpenCL is completely unfeasible for a mobile platform yet Google's own hardware vendors decided OpenCL was the best choice to develop the RSC driver on. That makes lots of sense.

    Don't you know we are developing on performance- and memory-constrained platforms? We need all the performance we can get, and low power use is crucial to our customers. Don't you know that we will very often write a vastly different algorithm for the CPU versus the GPU versus a DSP? Automatic compilers might be able to do optimizations for specific hardware, but they don't know how to choose different algorithms, so RSC's inability to do this is ridiculous.

    So, Google, as a developer: please protect me from myself. Please go out of your way to protect me from simply having another option if I ever choose to optimize more than your compiler can provide. Protect me from deciding I might want to do in-depth optimization on the top 5-6 mobile architectures, and give me something that does marginal to no acceleration on every platform, which I can't predict because you won't let me have the control I need. Protect my customers from the power savings and the performance benefits of optimization. Thank you, Tim. Thank you, Google.
     
  10. codedivine

    Tim, I see and partially agree with your vision. I understand that mobile architectures are complicated. There are considerations such as dynamic power distribution, shared memory bandwidth, etc. that one had not seen on desktop (before, say, Ivy Bridge). I agree that trying to write 6 different codepaths for 6 different architectures is hard. But I do not agree that Renderscript solves the issues either. At present, all it does (compared to, say, CUDA) is prevent the programmer from specifying certain parameters.

    Think about it this way. You can divide the programming tools into two categories:

    a) Close-to-metal. OpenCL is fairly close to metal, exposing individual devices, the memory hierarchy and thread dispatch mechanics.

    Even OpenCL, however, leaves some room for optimization by the driver. For example, you can leave out the thread group size and let the driver choose a suitable one. Vectorization may also be performed by the driver (for example, Intel's driver does this on Ivy Bridge). It is also possible to write OpenCL drivers that automatically use local memory (on-chip shared memory in CUDA parlance) if the programmer does not. But let us ignore even OpenCL's compiler optimization possibilities and think of it as close to metal.

    b) Middleware solutions that offer higher level programming languages. This typically includes a supposedly smart compiler and some kind of a scheduler+runtime. Renderscript falls in this category. My current research area happens to be this exact field, and I am a big proponent of the need of more productive languages in parallel computing so I sympathize with your goals.

    But I think Renderscript is essentially just one particular middleware solution. Renderscript compiler and drivers will have one particular set of heuristics. You said, how can the developer code for an architecture he has not seen before? I think as a compiler writer, I often face the opposite issue: How can you ensure that your compiler+driver has the right heuristics for algorithms and use-cases you have never seen?

    You said yourself that mobile architectures are more complicated than typical workstations, and people still haven't solved building good middleware for workstations. A compiler that attempts to automatically schedule computations on the right hardware needs to have at least some performance model for the application and an idea of how that will map to the architecture. This is very hard (and completely unsolved) on "simple" desktop architectures. How do you think this will be done automatically by a driver on mobile, where things are more complicated by your own admission? We have been trying to solve some of the same problems as you in our lab and I think years of work need to be done.

    There is also the issue of over-optimizing for a particular architecture that you said can happen with OpenCL. This can certainly happen, but I think it can very well happen with Renderscript too. Just to give an example: let's say the Renderscript driver on my machine always happens to choose the CPU. It is entirely possible to write an algorithm that performs wonderfully on my particular device's CPU and ship that, while the algorithm itself may perform very badly on other devices where the driver happens to choose the GPU. CPU algorithms are not always suitable for GPUs, for example, so having the same source code for both is more disastrous than trying to run code optimized for one CPU on another. No matter how smart the compiler is, it cannot replace the algorithm with another one.
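
    To make the last point concrete, here is a minimal sketch (not from the original post; the function names are made up for illustration) of one problem, a running sum, written two ways. The first is the loop one would naturally write for a CPU. The second is a Hillis-Steele style scan, used here as a typical data-parallel formulation: it does more additions in total, but every addition within a pass is independent of the others. A compiler can vectorize or reorder either version, but turning one into the other is a change of algorithm.

```python
def running_sum_sequential(values):
    """CPU-style inclusive prefix sum: a single serial pass over the data."""
    out = []
    total = 0
    for v in values:
        total += v
        out.append(total)
    return out


def running_sum_data_parallel(values):
    """Hillis-Steele style inclusive scan.

    Each pass reads only the previous pass's buffer, so on a GPU every
    element of a pass could be handled by its own thread. Here the passes
    are simulated on the CPU with list comprehensions.
    """
    data = list(values)
    stride = 1
    while stride < len(data):
        data = [
            data[i] + data[i - stride] if i >= stride else data[i]
            for i in range(len(data))
        ]
        stride *= 2
    return data


sample = [3, 1, 4, 1, 5, 9, 2, 6]
assert running_sum_sequential(sample) == running_sum_data_parallel(sample)
```

    Both functions return the same list; which one is faster depends entirely on whether the hardware executing it is serial or wide, which is exactly the choice the source code has already baked in.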

    I think a better approach is to let people build different middleware for different types of applications. Let a thousand middlewares bloom, and more particularly, let a thousand domain-specific tools bloom. This cannot be done on top of Renderscript. Building middleware on top of another level of middleware with an undocumented (and potentially ever-changing) set of optimizations is a bad, bad idea, but it can potentially be done on top of lower-level interfaces (like OpenCL). For example, let game engine programmers decide where/how they want to run the physics code. Let people build domain-specific tools like Halide and let them choose how/when to optimize for which architecture. Let people build their own dynamic schedulers (like StarPU) and experiment with what scheduling algorithm suits them best. I understand Googlers are smart, but so are people at Unity or MIT or many other places.

    I am not saying Renderscript is bad. I think Renderscript is a good idea that tries to tackle a real issue, and I think you should continue developing it further. But it is not, and can never be, a good solution for everyone, and forcing people to use only this tool will limit the exploration of alternative technical solutions. This is why choice is important. Limiting the choice to a single middleware solution, which happens to choose one particular set of parameters in this vast unexplored design space of middleware solutions, will be bad for everyone in the long run.
     
  11. Rodéric

    Rodéric a.k.a. Ingenu
    Moderator Veteran

    Joined:
    Feb 6, 2002
    Messages:
    3,986
    Likes Received:
    847
    Location:
    Planet Earth.
    Anything that goes in the same direction as Chapel (http://chapel.cray.com) can only be good IMO.
    We have enough hard-to-use, not-so-well-designed, low-level APIs already; we need higher-level, well-thought-out/abstracted languages.
    Solve problems instead of reinventing the wheel every time.
     
  12. BlueByLiquid

    Actually, OpenCL was the only low-level library we had access to on mobile, and Google has taken that away. This is an issue of Google blocking APIs, not creating new ones.
     
  13. codedivine

    Exactly. The issue isn't Renderscript. It is totally fine for Google to propose and implement their own technical solutions. The issue is, why block other technical solutions? OpenCL has its issues, but Renderscript itself has many weaknesses. I dispute Tim's claim that somehow Renderscript solves issues he himself raised. Those are hard problems and Renderscript will end up being good in some cases, and very bad in other cases. There is no free lunch, no universally applicable good heuristics for scheduling and "Sufficiently smart compiler" (the guiding philosophy behind Renderscript) is a meme that has failed again and again in computer science.

    There is a place for solutions like Renderscript and there is a place for solutions like OpenCL. One does not replace the other.

    Blocking OpenCL will mean people will not be able to design and build their own Chapel for example, or optimize their own game engines the way they want because Renderscript is not a useful target for other middleware providers and programming language designers.

    To me, the move to block OpenCL is similar to Apple's move (which was later repealed) to block applications not written in Objective-C from the App Store.
     
  14. willardjuice

    Because there's potential that OpenCL will cause more harm than good on Android?

    I don't think anyone is claiming RS will automatically provide a 9001% performance increase on all architectures, but hopefully it 1) will actually work on all architectures and 2) won't cause a 9001% performance decrease on any architecture. You're living in some bizarro world if you think OpenCL, in its current form, could remotely accomplish that (it can't even do 1) well on desktop...).

    No one is against low level languages, just poorly implemented ones.
     
  15. codedivine

    Actually, I don't see how 2 is not an issue with Renderscript as well. There is no way for the Renderscript runtime to make good decisions in every case. There will definitely be cases where Renderscript will have huge performance regressions (9001%). There is no way for a compiler to know where best to optimize and schedule computations correctly in every case.

    Bad compiler optimizations and scheduling can easily lead to a 10x reduction in performance on GPUs, no matter what the API. There is no silver bullet. It is not clear to me that Renderscript will work "well" everywhere. Not because there is anything wrong with Renderscript, but because it is a hard problem, and if you think the magic compiler and runtime will solve it for you properly, you are living in a bizarro world.

    The only cases where compilers work very reliably are generally domain-specific languages (DSLs), and Renderscript ain't one, nor is it amenable to similar amounts of compiler heroics. Renderscript is itself general enough that compiler and runtime heuristics will work fine for some cases but definitely fail in many others, and fail very badly at that, leading to 9001% performance regressions.

    And let's be honest here, it is not even about OpenCL per se. I would be happy if there was an alternate low-level solution for Android, but there isn't. Let us say a Tegra 5 part is chosen to be in a Nexus device: would Google ban CUDA too? What about GL 4.3 compute shaders, which will likely be added to GLES at some point? Will Google ban those too, given that compute shaders have the exact same programming model and optimization "problems" as OpenCL? Should Microsoft also ban DirectCompute and C++ AMP on Windows and Windows Phone? Hey, why don't we also abandon C and the NDK while we are at it?

    About OpenCL: it works just fine for what it is supposed to be. It is just a portable assembler as far as I am concerned, and it has the performance and the issues that a portable assembler would imply. Sure, the same OpenCL kernel does not work well everywhere, but that's because the hardware IS diverse and it IS difficult. It is not unsolvable though. I have myself written OpenCL libraries that rival vendor-optimized libraries on at least 5 different architectures, so I don't see why that's undoable. It is hard, but that is today's reality and no amount of hiding behind the compiler will solve that.

    Programmers are not THAT stupid either. Many times they will know very well how to write good code. It is not THAT hard to write, say, an optimized kernel in OpenCL for architectures I know well and a general Renderscript kernel for everything else.
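
    A minimal sketch of that split, with entirely hypothetical names (the device strings, `saxpy_generic`, `saxpy_tuned_for_known_gpu` and `pick_kernel` are illustrations, not any real Android, OpenCL or Renderscript API): keep hand-tuned variants for the few devices that have been profiled and fall back to one portable version everywhere else.

```python
def saxpy_generic(a, xs, ys):
    """Portable fallback: stands in for the general Renderscript kernel."""
    return [a * x + y for x, y in zip(xs, ys)]


def saxpy_tuned_for_known_gpu(a, xs, ys):
    """Stands in for a hand-tuned low-level kernel.

    It walks the input in blocks of four, the way a vec4-oriented kernel
    might; the result must match the generic version exactly.
    """
    out = []
    for start in range(0, len(xs), 4):
        block_x = xs[start:start + 4]
        block_y = ys[start:start + 4]
        out.extend(a * x + y for x, y in zip(block_x, block_y))
    return out


# Hypothetical identifiers for architectures the developer has profiled.
TUNED_KERNELS = {
    "known-gpu-a": saxpy_tuned_for_known_gpu,
}


def pick_kernel(device_name):
    """Return the tuned kernel for a known device, else the generic one."""
    return TUNED_KERNELS.get(device_name, saxpy_generic)


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [10.0, 20.0, 30.0, 40.0, 50.0]
assert pick_kernel("known-gpu-a")(2.0, xs, ys) == pick_kernel("never-seen")(2.0, xs, ys)
```

    The lookup only works if the application is allowed to know which device it is running on and to ship a low-level kernel for it, which is the capability being argued for here.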

    Preventing people from using lower level APIs may prevent some (hypothetical) bad experiences, which is not guaranteed with Renderscript either, but Renderscript will also not help you build the best possible experiences.

    Anyway, I don't have any skin in this game. I have used everything: OpenCL, CUDA, AMP, CAL, whatever comes out. I have built languages and compilers with various high-level programming models myself. I am not paid by anyone, nor do I own any stocks. I will use Renderscript where it makes sense. I am just a grad student who has lived, breathed and built GPGPU compilers, runtimes and schedulers for 6 years now, and who is frustrated because blocking lower-level solutions is a bad technical move for everyone. If lower-level solutions continue to be banned on Android, I will just say "whatever" and move on with my life I guess.

    I have utmost respect for Tim personally and I know I am nowhere as experienced as many of you. I also respect what Renderscript team is trying to do and would love to help where possible. Someone needs to explore more productive solutions and I am glad RS is doing so but please also realize that it is not a silver bullet and other technical solutions should be allowed and let the market choose. If OpenCL or whatever other API is terrible, it will die its natural death. Let the solutions compete and let the best technical solution win.

    Anyway time to get back to my thesis. Should be my last post on this topic here.
     
  16. BlueByLiquid

    I'm not sure if this is a statement or if you are posing a question. If it is a statement, I'm not sure how any developer would believe this. By that logic, why don't we just remove the ability to write C code for Android because it creates code that isn't JIT compiled and requires additional tools and support? Let's remove OpenGL because we have this thing called Renderscript Graphics (oh wait, that worked out well).

    Who thinks OpenCL is poorly implemented? Apparently ARM thinks it is good enough to run RSC on top of, and Google has forced them to rewrite it. It isn't perfect, but it is fine for current purposes. It works on NVIDIA, AMD and Intel GPUs. I have minor issues we have to fix, but we have tons of code working fine. You will always have to do minor tweaks for any low-level language. Domain-specific languages can be written on top of OpenCL (just as Renderscript has been on the N10). Also, OpenCL 2.0 will add even more features that will be very helpful.

    RSC will never provide what OpenCL (or any other low-level parallel language) can, as Google isn't trying to target performance, just portability. No one wanting to write OpenCL will be happy writing RSC; as Google has mentioned, they are different approaches, and obviously RSC targets developers who want some performance but don't really want to do any optimization.
     
  17. willardjuice

    But that's the point of the rigid structure (what you call limitations :smile: ). I added the qualifier "hopefully" because, as you said, one can't guarantee that 100% of the time there won't be a 9001% performance drop. The idea though is to greatly limit the cases/occurrences where performance drops off a cliff. It's a trade-off.

    I completely disagree. For me, it's absolutely about OpenCL. I'm not against low level languages and would love to see one in Android. However, I don't think Google should spend X resources on OpenCL just because at the moment there's no better alternative. It's a waste and will only cause further fragmentation.

    Let's be fair now, that's clearly not what I said or meant. In fact, I think compute shaders will eventually have great success/support on Android. I don't agree that CUDA/DC/etc. suffer from the same problems OpenCL has (lack of robustness).

    Debatable. :grin:

    I'm with you 100% of the way. However, that doesn't mean I expect Google to support every solution. I expect them to support the winner (and the jury is still out on that one). :wink:

    Good luck on your thesis!

    Except those APIs aren't poorly implemented.

    So bizarro world it is! :razz: There's a reason why Nvidia doesn't feel compelled to update their OpenCL drivers to 1.2 and it's not technical. And before you say "but CUDA!", it wouldn't matter if OpenCL was actually competitive. Nvidia would have no choice but to bow to market pressures and add 1.2 support. What do you attribute the lack of pressure to?

    I'll say this again just to be perfectly clear: I'm not against low level languages, just poorly implemented ones.
     
  18. BlueByLiquid

    It seems that we have lots of disagreements and are not going to persuade each other one way or another through debate. :) I am curious whether you have tried OpenCL for any extended period of time. As I said, it's not a dream, but it helps me reach near-peak performance on a myriad of hardware. Most developers I know who have given OpenCL a chance find that it can do the same for them. Again, I looked at RSC, realized it was not a high-performance API, but sat down and tried to write a number of image processing tools. It made many of them impossible to write at all, and others had horrible performance. At the end of the day I want to use the hardware resources I have and I want good performance, and OpenCL is the best API for that on the mobile and desktop platforms (yes, I use CUDA too all the time; it has pluses and minuses, but I like both). You can say I live in a bizarro world, but in fact I use the APIs that are there and I support the ones that help me do my job better. OpenCL does that, and now Google has spent time and money to take that away and has given me no alternative.
     
  19. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,379
    I think it makes sense to only support 1 language instead of 2.

    Once you make that decision, it's very attractive to choose one over which you have 100% control. OpenCL is not that language.
     
  20. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,924
    Likes Received:
    5,288
    Location:
    Helsinki, Finland
    My opinion as a professional graphics programmer (10 years of experience).

    In order to improve the performance/efficiency of various graphics algorithms, I need the following things from a mobile compute API:

    1. Seamless integration with the graphics API: The compute API needs to be able to directly read and write (without any translation or data movement) all graphics API buffers (vertex / texture / render target / depth buffer / etc). The compute jobs need to be seamlessly appended to the same ring buffer as graphics draw calls to ensure proper synchronization with the GPU and the lowest possible latency (the compute API can process data that is generated by the graphics pipeline during the same frame, and feed data forward that is later output to screen by the GPU in the same frame). There should be zero waiting/stalls (between GPU<->CPU) when mixing the traditional graphics API and compute to render graphics. DirectCompute (and OpenGL 4.3) handles this very well (full resource sharing and GPU-driven compute and draw calls: DispatchIndirect / DrawInstancedIndirect).

    2. Cooperative work using multiple GPU threads: Threads that cooperate must have some kind of synchronization mechanism and some kind of shared work memory (for efficient communication and data transfer). The compute API needs to allow me to program algorithms that are much more efficient than the traditional brute-force processing by pixel shaders. Mobile devices have neither the excess performance nor the battery life to do the huge amount of excess data processing and movement that could easily be avoided by having an efficient compute API for graphics processing.

    3. Easy to learn and understand: As a graphics programmer I am familiar with the GPU execution model (and instruction sets for pixel and vertex shaders). If the compute API has a similar instruction set and familiar syntax, it will be easier for me to learn, and to port all our old inefficient pixel-shader-based algorithms to the new compute API. This is just a nice-to-have feature (the first two are deal breakers for any graphics programmer).
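
    As an illustration of point 2, here is a minimal sketch of the cooperation pattern it describes: a group of threads reducing values through shared memory, with a barrier between steps. This is a CPU simulation, not GPU code. Python threads stand in for GPU threads, a plain list stands in for on-chip shared memory, and `workgroup_sum` is a made-up name rather than part of any compute API.

```python
import threading


def workgroup_sum(values):
    """Sum `values` with one thread per element via a shared-memory tree reduction.

    The number of active threads halves on each step, and a barrier keeps
    all threads in lockstep between steps. Assumes len(values) is a
    non-zero power of two.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("group size must be a non-zero power of two")

    shared = list(values)            # stands in for on-chip shared memory
    barrier = threading.Barrier(n)   # stands in for a work-group barrier

    def thread_body(local_id):
        stride = n // 2
        while stride > 0:
            if local_id < stride:
                # Writes land in [0, stride); reads come from [stride, 2 * stride).
                shared[local_id] += shared[local_id + stride]
            barrier.wait()           # every thread finishes this step first
            stride //= 2

    threads = [threading.Thread(target=thread_body, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared[0]


assert workgroup_sum([1, 2, 3, 4, 5, 6, 7, 8]) == 36
```

    A pixel shader has neither the shared list nor the barrier, so each output pixel would have to re-read every input it depends on; that redundant work is what the shared-memory version avoids.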

    A good compute API allows graphics programmers to boost the performance of many graphics processing steps by at least 2x-4x (compared to existing brute-force pixel/vertex shaders). It also allows implementation of new algorithms that are not possible with existing pixel/vertex shaders alone (as data cannot be properly shared between threads). We definitely need an API like this for mobile devices, as mobile devices are much more power and performance constrained than PCs and consoles. Improved efficiency should be the top priority for mobile graphics rendering; after all, graphics rendering drains more battery than any other processing on mobile devices.

    Mobile Kepler will be released next year, and it will have full support for OpenGL 4.3. OpenGL 4.3 has its own compute API that is fully integrated with the graphics API and is very close to DirectCompute in feature set (including indirect draw calls and dispatch). So far this seems to be the first API on mobile devices that allows us graphics programmers to achieve performance/efficiency improvements similar to those we are using in our modern PC/console engines. I just hope that we will have a similar API that is compatible with OpenGL ES 3.0 hardware, since it will take a while until all mobile chips support full (desktop) OpenGL.
     