In case you wonder: the BRP7 app version 0.17 (released last night) is meant to improve validation with the dominant Windows app version. So far it looks quite promising (2000 results, only 6 invalid).
is it a similar code change to whatever is in the v0.16 Linux-CUDA102 app? I've been testing that one out and it also seems to have very good validation.
what did you tweak in the new version to improve validation, if you don't mind me asking?
A difference seems to occur pretty early in the computation, already in the data preparation that's done on the CPU. This difference then propagates through the computation to a difference in the result such that the validator (occasionally, data dependent) can't match the two results. What I did was to use the exact same compiler version (gcc-7.3) for the Linux app that was used to compile the Windows version.
On Linux the (shared) CUDA libraries provided by NVidia are bound relatively closely to the gcc version, so I'm not completely free in choosing the version of gcc I compile the app with. This is the reason why we don't provide a CUDA 5.5 version for Linux (in fact we did build and publish one, but validation was terrible). Performing the same change on the current Linux CUDA version might not be impossible, but it would take some more time, experimenting with OS (libc), gcc and CUDA versions.
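(Purely to illustrate the mechanism described here, not the project's actual code or validator: the toy C program below introduces a tiny difference early on, simply by summing the same numbers in a different order, which stands in for "same source, different compiler". The final results then disagree in the last bits, so a bit-exact comparison fails while a relative-tolerance comparison, which is roughly what a BOINC-style validator does, still accepts the pair most of the time. All names and the tolerance are made up.)

    /* Toy illustration only (not Einstein@Home code): a small difference
     * introduced early in a computation propagates to the final result.
     * A bit-exact comparison then fails, while a relative-tolerance
     * comparison usually still accepts the pair. */
    #include <stdio.h>
    #include <math.h>

    #define N 1000000

    /* Two mathematically identical reductions that differ only in
     * evaluation order, standing in for "same source, different build". */
    static double sum_forward(const double *x, int n) {
        double s = 0.0;
        for (int i = 0; i < n; i++) s += x[i];
        return s;
    }

    static double sum_backward(const double *x, int n) {
        double s = 0.0;
        for (int i = n - 1; i >= 0; i--) s += x[i];
        return s;
    }

    int main(void) {
        static double x[N];
        for (int i = 0; i < N; i++)
            x[i] = 1.0 / (1.0 + i);   /* values of very different magnitude */

        double a = sum_forward(x, N);
        double b = sum_backward(x, N);

        printf("a = %.17g\nb = %.17g\n", a, b);
        printf("bit-exact match : %s\n", (a == b) ? "yes" : "no");

        double tol = 1e-8;            /* made-up tolerance */
        double rel = fabs(a - b) / fmax(fabs(a), fabs(b));
        printf("within tolerance: %s (rel. diff %.3g)\n",
               rel <= tol ? "yes" : "no", rel);
        return 0;
    }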
thanks, that's very useful information. do you think that 7.3 and 7.5 are close enough that this would be OK, or did you need an exact match? I maintain an old build environment with Ubuntu 18.04 that uses gcc 7.5 and still builds the app fine even with the latest cuda 12.2.2. wondering if it's worthwhile to attempt to manually downgrade it to 7.3.
the version of Linux-cuda you have up (10.2) in beta actually seems to validate really well in my short test of 200 tasks. only 1 invalid so far. small sample size, but i have a good feeling about it. you might not need to build another one. give it some more time in beta and check the validation stats on it; if it looks good you can remove it from beta, i think.
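(Side note, in case it helps with that experiment: nvcc lets you point it at a specific host compiler with -ccbin, so pinning the build to a particular gcc doesn't necessarily require changing the system default. A rough sketch only; the paths and file names below are placeholders, and it assumes a gcc-7.3 toolchain has been installed somewhere, e.g. built from source.)

    # placeholders: /opt/gcc-7.3 and kernel.cu are made-up names
    nvcc -ccbin /opt/gcc-7.3/bin/g++ -O2 -o brp7_test kernel.cu
    # nvcc checks the host compiler version; a gcc it doesn't officially
    # support may be rejected unless you pass the override flag that
    # newer CUDA releases provide (--allow-unsupported-compiler)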
There is now a BRP7 app version for Intel GPUs (Beta test). Please give it a try.
could you also build one for intel GPU on linux please?
Is there a reliable driver for Linux for such GPUs by now? Last time I checked (years ago I admit) there wasn't, the Mesa platform was producing nothing but errors.
The app version itself doesn't need to be built anew; it's the same binary that's used for opencl-ati. The plan class settings require a bit of fiddling, though, so I want to see how that works with the Windows Intel GPU version first.
If you really want to try that out for yourself right away, use the AMD binary with an app_info.xml.
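(For anyone who wants to try this before an official Linux Intel GPU version exists: below is a rough app_info.xml skeleton for running the stock OpenCL binary under the anonymous platform. The app name, file name, version number and plan class are guesses/placeholders - copy the real values from the client_state.xml of a host that already received the stock files - and a real setup will usually need additional <file_info>/<file_ref> entries for the auxiliary files the app ships with.)

    <app_info>
      <app>
        <name>einsteinbinary_BRP7</name>   <!-- guess: copy the real app name from client_state.xml -->
      </app>
      <file_info>
        <name>einsteinbinary_BRP7_0.17_x86_64-pc-linux-gnu__opencl-ati</name>  <!-- placeholder file name -->
        <executable/>
      </file_info>
      <app_version>
        <app_name>einsteinbinary_BRP7</app_name>
        <version_num>17</version_num>              <!-- placeholder -->
        <plan_class>opencl-intel_gpu</plan_class>  <!-- placeholder plan class -->
        <avg_ncpus>1</avg_ncpus>
        <coproc>
          <type>intel_gpu</type>
          <count>1</count>
        </coproc>
        <file_ref>
          <file_name>einsteinbinary_BRP7_0.17_x86_64-pc-linux-gnu__opencl-ati</file_name>
          <main_program/>
        </file_ref>
      </app_version>
    </app_info>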
Ian&Steve C. wrote: I maintain an old build environment with Ubuntu 18.04 that uses gcc 7.5 and still builds the app fine even with the latest cuda 12.2.2.
Yep, that's what our Linux CUDA app was built on, too. I thought that gcc version was reasonably close to gcc-7.3, but I thought the same of gcc-7.1, which our Linux version 0.16 was built with, and that still caused problems in validation.
yes, there are Intel GPU Linux drivers that work well. I have them on my laptop: https://einsteinathome.org/host/12901081
though the scheduler does not recognize my device. it seems hard-coded to only look for "HD [something]" in the device name for Intel. so in the past, when testing Intel GPU tasks for BRP4 on Linux, I had to spoof the name to something the scheduler recognized so that it would send me the binary and tasks, then I used an app_info to run the stock binary under anonymous platform so that I could remove the device name spoofing.
perhaps that's something else you can look into fixing? (the intel name recognition on the scheduler) when you have free time of course ;) no rush.
interesting that the binary is the same between opencl_intel and opencl_ati. I can try to snag the ATI binary and form an app_info to try it out. no rush on an official update. I'm just curious
An early observation: the RTX 40xx generation still has a high number of invalids for BRP7 tasks. We saw this behavior with the FGRPB1 tasks and were hoping that it was something about those particular tasks causing the issues, but it appears to be something larger. I am waiting to see results from other users that have RTX 40xx GPUs, but I would assume they will be roughly the same. Of course, it is a small sample size and more time/tasks will tell the story better, but it is already not looking great.
Here are the hosts:
Host 1
Host 2