Running the identical Docker container on minikube with the `--driver=none` option works perfectly. It seems that we need nested virtualisation, but alas we cannot set that in GKE:
> Issue opened 03:14 PM · 26 Apr 2022 UTC · labels: enhancement, size/s, service/gke
### Community Note
* Please vote on this issue by adding a 👍 [reaction](https://blog.github.com/2016-03-10-add-reactions-to-pull-requests-issues-and-comments/) to the original issue to help the community and maintainers prioritize this request
* Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
* If you are interested in working on this issue or have submitted a pull request, please leave a comment. If the issue is assigned to the "modular-magician" user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If the issue is assigned to a user, that user is claiming responsibility for the issue. If the issue is assigned to "hashibot", a community member has claimed the issue already.
### Description
It would be helpful to be able to specify `threadsPerCore` in the advanced machine features section of the `node_config` block for GKE node pools / clusters.
This would allow users to disable hyper-threading where applicable.
### New or Affected Resource(s)
node_config block in https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/container_cluster and https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/container_node_pool
### Potential Terraform Configuration
The block should look identical to the advanced machine features configuration on VMs, see: https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_instance#nested_advanced_machine_features (minus nested virtualization, which isn't supported on GKE).
That gives:
```tf
advanced_machine_features {
  threads_per_core = 1
}
```
### References
No current open issue that I could find
Is there something else that is causing this error?
```
2022-06-09 05:45:34,828 ERROR (render-files): __main__: Render error: 2022-06-09 05:45:34 [382ms] [Error] [carb.windowing-glfw.plugin] GLFW initialization failed.
2022-06-09 05:45:34,828 ERROR (render-files): __main__: Render error: 2022-06-09 05:45:34 [382ms] [Error] [carb] Failed to startup plugin carb.windowing-glfw.plugin (interfaces: [carb::windowing::IGLContext v0.1],[carb::windowing::IWindowing v1.1]) (impl: carb.windowing-glfw.plugin)
2022-06-09 05:45:34,828 ERROR (render-files): __main__: Render error: 2022-06-09 05:45:34 [666ms] [Fatal] [carb.crashreporter-breakpad.plugin] libcarb.events.plugin.so!carbOnPluginStartup
2022-06-09 05:45:34,828 ERROR (render-files): __main__: Render error: 2022-06-09 05:45:34 [668ms] [Fatal] [carb.crashreporter-breakpad.plugin] libcarb.tasking.plugin.so!std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (carb::tasking::Scheduler::*)(unsigned int, int, carb::cpp20::latch*), carb::tasking::Scheduler*, unsigned int, int, carb::cpp20::latch*> > >::_M_run()
2022-06-09 05:45:34,828 ERROR (render-files): __main__: Render error: 2022-06-09 05:45:34 [670ms] [Fatal] [carb.crashreporter-breakpad.plugin] libpthread.so.0!funlockfile
2022-06-09 05:45:34,829 ERROR (render-files): __main__: Render error: 2022-06-09 05:45:34 [672ms] [Fatal] [carb.crashreporter-breakpad.plugin] libc.so.6!explicit_bzero
2022-06-09 05:45:34,829 ERROR (render-files): __main__: Render error: 2022-06-09 05:45:34 [672ms] [Fatal] [carb.crashreporter-breakpad.plugin] libGLX_nvidia.so.0!vk_icdGetInstanceProcAddr
```
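The `GLFW initialization failed` line usually means the container cannot see a display or a working GPU driver stack. A quick in-pod sanity check might look like the following; the pod name is a placeholder, and it assumes the image ships `nvidia-smi` and the Vulkan loader:

```shell
# Is the NVIDIA driver visible inside the container?
kubectl exec -it <render-pod> -- nvidia-smi

# Is a Vulkan ICD (installable client driver) registered?
kubectl exec -it <render-pod> -- ls /usr/share/vulkan/icd.d/
```

If either command fails, the problem is the driver/kernel setup on the node rather than the rendering application itself.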
It seems the only workaround is this: we did not need nested virtualization after all, we just needed to make sure Kubernetes has the right kernel drivers so the containers can access the GPU (Vulkan 1.2 on NVIDIA driver 470, with a CUDA 11.4 UBI 8 base image).
Specifically: shut down all pods using the GPU, uninstall the Helm chart for the GPU daemonset, and then apply:

```
kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/container-engine-accelerators/master/nvidia-driver-installer/cos/daemonset-preloaded-latest.yaml
```
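The steps above can be sketched end to end as follows; the pod label and Helm release name are assumptions and will differ per cluster:

```shell
# 1. Stop workloads currently holding the GPU (assumed label for GPU pods).
kubectl delete pod -l app=render-worker

# 2. Remove the Helm-managed GPU daemonset (assumed release name/namespace).
helm uninstall nvidia-device-plugin -n kube-system

# 3. Install the GKE preloaded NVIDIA driver-installer daemonset instead.
kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/container-engine-accelerators/master/nvidia-driver-installer/cos/daemonset-preloaded-latest.yaml
```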