NVIDIA/Troubleshooting
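Corrupted screen: "six screens" problem
For some users with a GeForce GT 100M, the screen is corrupted after X starts: it is divided into six small screens, with the resolution limited to 640x480. The same problem has more recently been reported with the Quadro 2000 and high-resolution displays.
To solve this problem, enable the NoTotalSizeCheck mode validation option in the Device section:
Section "Device"
    ...
    Option "ModeValidation" "NoTotalSizeCheck"
    ...
EndSection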
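'/dev/nvidia0' input/output error
This error can occur for several different reasons. The most commonly suggested fix is to check group and file permissions, but in almost every case that is not the problem. The NVIDIA documentation does not explain in detail how to correct it, but a few approaches have worked for some people. The problem can be an IRQ conflict with another device, or bad IRQ routing by either the kernel or the BIOS.
The first thing to try is to remove other video devices, such as video capture cards, and see if the problem goes away. If there are too many video processors in the same system, the kernel can fail to start them because of memory allocation problems with the video controller. This can happen even with a single video processor, in particular on systems with little video memory. In that case, find out the amount of video memory in your system (e.g. with lspci -v) and pass an allocation parameter to the kernel. For example, on a 32-bit kernel you could set:
vmalloc=384M
If you are running a 64-bit kernel, a driver defect can cause the NVIDIA module to fail to initialize when the IOMMU is on. Turning it off in the BIOS has been confirmed to work for some users [1] (see User:Clickthem#nvidia module).
Another thing to try is to change the BIOS IRQ routing from Operating system controlled to BIOS controlled, or the other way around. The former can be set with the kernel parameter:
pci=biosirq
The noacpi kernel parameter has also been suggested as a solution, but it should be used with caution because it disables ACPI completely, and some hardware is easily damaged by overheating.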
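Troubleshooting common crashes
- Try disabling RenderAccel in xorg.conf.
- If Xorg outputs an error about "conflicting memory type" or "failed to allocate primary buffer: out of memory", or crashes with a "Signal 11" error while using the nvidia-96xx drivers, add nopat to your kernel parameters.
- If the NVIDIA compiler complains that the current GCC version differs from the one used to compile the kernel, add the following to /etc/profile:
export IGNORE_CC_MISMATCH=1
- If full-screen applications freeze or crash, try enabling the Display Compositing and Direct fullscreen rendering options in your desktop environment's settings.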
Bad performance after driver upgrade
If the new driver gives a lower FPS than the old one, check whether direct rendering is enabled (glxinfo is included in the mesa-utils package):
$ glxinfo | grep direct
If the command prints:
direct rendering: No
you may need to downgrade the driver and reboot.
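As a rough illustration of the check above, the following Python sketch scans glxinfo output for the direct rendering line. The function name direct_rendering_enabled is hypothetical and not part of any NVIDIA or Mesa tooling; the sketch only assumes the "direct rendering: Yes/No" line format shown above.

```python
def direct_rendering_enabled(glxinfo_output: str) -> bool:
    """Return True if the glxinfo text reports 'direct rendering: Yes'.

    Hypothetical helper: it only looks for the 'direct rendering' line
    and treats a missing line as 'not enabled'.
    """
    for line in glxinfo_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "direct rendering":
            return value.strip().lower().startswith("yes")
    return False
```

The text to pass in would be the captured output of the glxinfo command, for example saved to a file first.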
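Avoid screen tearing
Tearing can be avoided by forcing a full composition pipeline, regardless of the compositor you are using. To test whether this option works, run:
$ nvidia-settings --assign CurrentMetaMode="nvidia-auto-select +0+0 { ForceFullCompositionPipeline = On }"
Or click the Advanced button in the X Server Display Configuration menu option, select either Force Composition Pipeline or Force Full Composition Pipeline, and click Apply.
To make the setting persistent, add it to the "Screen" section of your Xorg configuration file. When making this change, triple buffering (the TripleBuffer option) should be enabled and AllowIndirectGLXProtocol should be disabled in the driver configuration as well. See the example configuration below: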
/etc/X11/xorg.conf.d/20-nvidia.conf
Section "Device"
    Identifier "NVIDIA Card"
    Driver     "nvidia"
    VendorName "NVIDIA Corporation"
    BoardName  "GeForce GTX 1050 Ti"
EndSection

Section "Screen"
    Identifier "Screen0"
    Device     "Device0"
    Monitor    "Monitor0"
    Option     "ForceFullCompositionPipeline" "on"
    Option     "AllowIndirectGLXProtocol" "off"
    Option     "TripleBuffer" "on"
EndSection
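If you do not have an Xorg configuration file, you can create one for your current hardware with nvidia-xconfig (see NVIDIA#Automatic configuration) and move it from /etc/X11/xorg.conf to the preferred location /etc/X11/xorg.conf.d/20-nvidia.conf.
Many of the configuration options in the 20-nvidia.conf file produced by nvidia-xconfig are set automatically by the driver and are not needed. To enable the composition pipeline, only the "Screen" section, with its Identifier and Option lines, is necessary; the other sections can be removed from the file.
Multi-monitor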
For multi-monitor setups, you need to specify ForceCompositionPipeline=On for each display. For example:
$ nvidia-settings --assign CurrentMetaMode="DP-2: nvidia-auto-select +0+0 {ForceCompositionPipeline=On}, DP-4: nvidia-auto-select +3840+0 {ForceCompositionPipeline=On}"
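If you do not do this, the nvidia-settings command will disable your other displays.
The following command can be used to get the current screen names and offsets: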
$ nvidia-settings --query CurrentMetaMode
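The assign command above is for two 3840x2160 monitors connected to DP-2 and DP-4. Read the correct CurrentMetaMode for your setup by exporting xorg.conf, and append ForceCompositionPipeline to each of your displays. Setting ForceCompositionPipeline only affects the targeted display.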
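To make the per-display rule concrete, here is a minimal Python sketch that assembles a MetaMode string with ForceCompositionPipeline=On attached to every display. The helper name build_metamode is hypothetical, the output names and offsets are simply the ones from the example above, and offsets are assumed to be non-negative.

```python
def build_metamode(displays):
    """Build a CurrentMetaMode value from (output, mode, x, y) tuples.

    Hypothetical helper: every display gets ForceCompositionPipeline=On,
    because omitting a display from the MetaMode would disable it.
    """
    parts = [
        f"{name}: {mode} +{x}+{y} {{ForceCompositionPipeline=On}}"
        for name, mode, x, y in displays
    ]
    return ", ".join(parts)


# Values taken from the two-monitor example in the text.
metamode = build_metamode([
    ("DP-2", "nvidia-auto-select", 0, 0),
    ("DP-4", "nvidia-auto-select", 3840, 0),
])

# Argument list that could be handed to subprocess.run() to apply it.
command = ["nvidia-settings", "--assign", f"CurrentMetaMode={metamode}"]
```

Building the argument as a list avoids shell quoting problems with the braces and spaces in the MetaMode string.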
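The display to sync to can be configured in ~/.nvidia-settings-rc, for example with 0/XVideoSyncToDisplayID=, or by installing the nvidia-settings package and using the graphical configuration options.
Modprobe Error: "Could not insert 'nvidia': No such device" on linux >=4.8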
When trying to use the discrete card, you may get the following error on linux 4.8:
$ modprobe nvidia -vv
modprobe: INFO: custom logging function 0x409c10 registered
modprobe: INFO: Failed to insert module '/lib/modules/4.8.6-1-ARCH/extramodules/nvidia.ko.gz': No such device
modprobe: ERROR: could not insert 'nvidia': No such device
modprobe: INFO: context 0x24481e0 released
insmod /lib/modules/4.8.6-1-ARCH/extramodules/nvidia.ko.gz
# dmesg
...
NVRM: The NVIDIA GPU 0000:01:00.0 (PCI ID: 10de:139b)
NVRM: installed in this system is not supported by the 370.28
NVRM: NVIDIA Linux driver release. Please see 'Appendix
NVRM: A - Supported NVIDIA GPU Products' in this release's
NVRM: README, available on the Linux driver download page
NVRM: at www.nvidia.com.
...
This problem is caused by a bad commit related to PCIe power management in the Linux kernel (as documented in an NVIDIA DevTalk thread).
The workaround is to add pcie_port_pm=off to your kernel parameters. Note that this disables PCIe power management for all devices.
Screen corruption after resuming from suspend or hibernate
See NVIDIA/Tips and tricks#Preserve video memory after suspend.
When using the GDM display manager, the corruption after suspend is fixed as of driver version 515.43.04 [2].
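CPU spikes with 400 series cards
If you are experiencing intermittent CPU spikes with a 400 series card, it may be caused by PowerMizer constantly changing the GPU's clock frequency. Switch PowerMizer's setting from Adaptive to Performance by adding the following to the Device section of your Xorg configuration: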
Option "RegistryDwords" "PowerMizerEnable=0x1; PerfLevelSrc=0x3322; PowerMizerDefaultAC=0x1"
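Laptops: X hangs on login/out
If Xorg hangs on login and logout while using the legacy NVIDIA drivers (typically with the screen split into a black half and a white/gray half), but logging in is still possible via Ctrl+Alt+Backspace (or whatever the "kill X" key binding is), try adding this to /etc/modprobe.d/modprobe.conf: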
options nvidia NVreg_Mobile=1
Some users have reported that the following configuration also works, but in testing it can cause a significant performance drop:
options nvidia NVreg_DeviceFileUID=0 NVreg_DeviceFileGID=33 NVreg_DeviceFileMode=0660 NVreg_SoftEDIDs=0 NVreg_Mobile=1
Note that the value of the NVreg_Mobile parameter depends on the laptop vendor:
- 1 - Dell laptops
- 2 - non-Compal Toshiba laptops
- 3 - all other laptops
- 4 - Compal Toshiba laptops
- 5 - Gateway laptops
See NVIDIA Driver's README: Appendix K for more information.
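The vendor table above can be read as a simple lookup. The following Python sketch is only an illustration: the dictionary keys and the function name modprobe_line are hypothetical labels, while the numeric values are the ones listed above.

```python
# Values come from the list above; the string keys are hypothetical labels.
NVREG_MOBILE = {
    "dell": 1,
    "toshiba": 2,         # non-Compal Toshiba laptops
    "other": 3,           # all other laptops
    "toshiba-compal": 4,  # Compal Toshiba laptops
    "gateway": 5,
}


def modprobe_line(vendor: str) -> str:
    """Return the modprobe options line for a vendor label.

    Unknown labels fall back to the 'other' value (3).
    """
    value = NVREG_MOBILE.get(vendor.lower(), NVREG_MOBILE["other"])
    return f"options nvidia NVreg_Mobile={value}"
```

For instance, modprobe_line("Dell") yields the same line as the modprobe.conf example earlier in this section.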
Screen(s) found, but none have a usable configuration
Sometimes NVIDIA and X have trouble finding the active screen. If your graphics card has multiple outputs, try plugging your monitor into the other ones. On a laptop, it may be because your graphics card has a VGA/TV output. Xorg.0.log will provide more information.
Another thing to try is adding an invalid "ConnectedMonitor" option to Section "Device" to force Xorg to throw an error and show you how to correct it. See the NVIDIA driver README for more about the ConnectedMonitor setting.
After restarting X, check Xorg.0.log for the valid CRT-x, DFP-x and TV-x values.
nvidia-xconfig --query-gpu-info may also be helpful.
Blackscreen at X startup / Machine poweroff at X shutdown
If you have installed an update of NVIDIA and your screen stays black after launching Xorg, or if shutting down Xorg causes a machine poweroff, try the workarounds below:
- Prepend "xrandr --auto" to your xinitrc.
- Use the rcutree.rcu_idle_gp_delay=1 kernel parameter.
- You can also try adding the nvidia module directly to your mkinitcpio.conf.
- If the screen still stays black with both the rcutree.rcu_idle_gp_delay=1 kernel parameter and the nvidia module directly in mkinitcpio.conf, try reinstalling the nvidia and nvidia-utils packages, in that order, and finally reload the driver:
# modprobe nvidia
Backlight is not turning off on some occasions
By default, DPMS should turn off the backlight after the configured timeouts or when xset is run. However, probably due to a bug in the proprietary NVIDIA drivers, the result is a blank screen with no power saving whatsoever. To work around it until the bug has been fixed, you can use vbetool as root.
Install the vbetool package.
Turn off your screen on demand; pressing any key turns the backlight on again:
vbetool dpms off && read -n1; vbetool dpms on
Alternatively, xrandr can disable and re-enable monitor outputs without requiring root:
xrandr --output DP-1 --off; read -n1; xrandr --output DP-1 --auto
Driver 415: HardDPMS
Proprietary driver 415 includes a new feature called HardDPMS. Some users report that it solves the issues with suspending monitors connected over DisplayPort.
It is reported to become the default in a future driver version, but for now the HardDPMS option can be set in the Device or Screen sections. For example:
/etc/X11/xorg.conf.d/20-nvidia.conf
Section "Device"
    ...
    Option "HardDPMS" "true"
    ...
EndSection

Section "Screen"
    ...
    Option "HardDPMS" "true"
    ...
EndSection
HardDPMS will trigger on screensaver settings like BlankTime. The following ServerFlags section will set your monitor(s) to suspend after 10 minutes of inactivity:
/etc/X11/xorg.conf.d/20-nvidia.conf
Section "ServerFlags"
    Option "BlankTime" "10"
EndSection
Xorg fails to load or Red Screen of Death
If you get a red screen and use GRUB, disable the GRUB framebuffer by editing /etc/default/grub and uncommenting GRUB_TERMINAL_OUTPUT=console. For more information, see GRUB/Tips and tricks#Disable framebuffer.
Black screen on systems with integrated GPU
If you have a system with an integrated GPU (e.g. Intel HD 4000, VIA VX820 Chrome 9 or AMD Cezanne) and have installed the nvidia package, you may experience a black screen on boot, when changing virtual terminal, or when exiting an X session. This may be caused by a conflict between the graphics modules, and is solved by blacklisting the relevant GPU modules. Create the file /etc/modprobe.d/blacklist.conf and prevent the relevant modules from loading on boot:
/etc/modprobe.d/blacklist.conf
install i915 /usr/bin/false
install intel_agp /usr/bin/false
install viafb /usr/bin/false
install radeon /usr/bin/false
install amdgpu /usr/bin/false
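No audio over HDMI
Sometimes the NVIDIA HDMI audio devices are not shown when running aplay -l. On some new machines, the audio chip on the NVIDIA GPU may be disabled at boot. You need to reload the NVIDIA device with audio enabled. Make sure that the GPU is powered on (e.g. on a laptop with Bumblebee) and that X is not running on it, because this will cause a reset: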
# setpci -s 01:00.0 0x488.l=0x2000000:0x2000000
# rmmod nvidia-drm nvidia-modeset nvidia
# echo 1 > /sys/bus/pci/devices/0000:01:00.0/remove
# echo 1 > /sys/bus/pci/devices/0000:00:01.0/rescan
# modprobe nvidia-drm
# xinit -- -retro
If your TTY is running on the NVIDIA GPU, put these commands in a script, so that the screen disappearing partway through does not matter.
X fails with "no screens found" when using Multiple GPUs
In situations where you might have multiple GPUs on a system and X fails to start with:
[ 76.633] (EE) No devices detected.
[ 76.633] Fatal server error:
[ 76.633] no screens found
then you need to add your discrete card's BusID to your X configuration. This can happen on systems with an Intel CPU and an integrated GPU or if you have more than one NVIDIA card connected. Find your BusID:
# lspci | grep -E "VGA|3D controller"
00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor Graphics Controller (rev 09)
01:00.0 VGA compatible controller: NVIDIA Corporation GK107 [GeForce GTX 650] (rev a1)
08:00.0 3D controller: NVIDIA Corporation GM108GLM [Quadro K620M / Quadro M500M] (rev a2)
Then add the BusID to the card's Device section in your X configuration. For example:
/etc/X11/xorg.conf.d/10-nvidia.conf
Section "Device"
    Identifier "Device0"
    Driver     "nvidia"
    VendorName "NVIDIA Corporation"
    BusID      "PCI:1:0:0"
EndSection
In the example above, 01:00.0 is stripped down to 1:0:0, but some conversions are more complicated. lspci output is in hexadecimal, whereas BusIDs in configuration files are decimal. This means that any part of the BusID greater than 9 must be converted to decimal. For example, 5e:00.0 from lspci becomes PCI:94:0:0.
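The hexadecimal-to-decimal conversion described above can be sketched in a few lines of Python. The function name lspci_to_busid is hypothetical; the sketch assumes the usual bus:device.function address printed by lspci, optionally prefixed with a four-digit PCI domain, which is ignored here.

```python
import re


def lspci_to_busid(address: str) -> str:
    """Convert an lspci address such as '5e:00.0' to an Xorg BusID string.

    Each field printed by lspci is hexadecimal; Xorg expects decimal.
    """
    match = re.fullmatch(
        r"(?:[0-9a-fA-F]{4}:)?([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])",
        address,
    )
    if match is None:
        raise ValueError(f"not an lspci address: {address!r}")
    bus, device, function = (int(part, 16) for part in match.groups())
    return f"PCI:{bus}:{device}:{function}"
```

Calling lspci_to_busid("5e:00.0") reproduces the PCI:94:0:0 value from the example, and "01:00.0" gives PCI:1:0:0.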
Xorg fails during boot, but otherwise starts fine
On very fast booting systems, systemd may attempt to start the display manager before the NVIDIA driver has fully initialized. You will see a message like the following in your logs only when Xorg runs during boot.
/var/log/Xorg.0.log
[ 1.807] (EE) NVIDIA(0): Failed to initialize the NVIDIA kernel module. Please see the
[ 1.807] (EE) NVIDIA(0): system's kernel log for additional error messages and
[ 1.808] (EE) NVIDIA(0): consult the NVIDIA README for details.
[ 1.808] (EE) NVIDIA(0): *** Aborting ***
In this case you will need to establish an ordering dependency from the display manager to the DRI device. First create device units for DRI devices by creating a new udev rules file.
/etc/udev/rules.d/99-systemd-dri-devices.rules
ACTION=="add", KERNEL=="card*", SUBSYSTEM=="drm", TAG+="systemd"
Then create dependencies from the display manager to the device(s).
/etc/systemd/system/display-manager.service.d/10-wait-for-dri-devices.conf
[Unit]
Wants=dev-dri-card0.device
After=dev-dri-card0.device
If you have additional cards needed for the desktop, list them in Wants and After, separated by spaces.
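For several cards, the drop-in contents follow mechanically from the card names. The Python sketch below shows the pattern; the function name dri_dropin is hypothetical, and it assumes the cards appear as /dev/dri/cardN so that the matching device units are named dev-dri-cardN.device, as in the example above.

```python
def dri_dropin(cards):
    """Return drop-in text that orders the display manager after DRI cards.

    cards: iterable of DRM card names such as 'card0' (hypothetical input).
    """
    units = " ".join(f"dev-dri-{card}.device" for card in cards)
    return f"[Unit]\nWants={units}\nAfter={units}\n"


# Example: two cards listed on the same Wants= and After= lines.
dropin_text = dri_dropin(["card0", "card1"])
```

The resulting text would go into the 10-wait-for-dri-devices.conf drop-in shown above.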
xrandr BadMatch
If you are trying to configure a WQHD monitor such as the DELL U2515H using xrandr, and xrandr --addmode gives you the error X Error of failed request: BadMatch, it might be because the proprietary NVIDIA driver clips the maximum pixel clock frequency of the HDMI output to 225 MHz or lower. To set the monitor to its maximum resolution, you have to install the nouveau drivers. You can force nouveau to use a specific pixel clock frequency by setting nouveau.hdmimhz=297 (or 330) in your kernel parameters.
Alternatively, it may be that your monitor's EDID is incorrect. See #Override EDID.
Another reason could be that, by default, current NVIDIA drivers only allow modes explicitly reported by the EDID, but sometimes refresh rates and/or resolutions are desired that the monitor does not report (the EDID information is correct; the current NVIDIA drivers are just too restrictive).
If this happens, you may want to add an option to xorg.conf to allow non-EDID modes:
Section "Device"
    Identifier "Device0"
    Driver     "nvidia"
    VendorName "NVIDIA Corporation"
    ...
    Option     "ModeValidation" "AllowNonEdidModes"
    ...
EndSection
This can be set per output. See the NVIDIA driver README (Appendix B. X Config Options) for more information.
Override EDID
See Kernel mode setting#Forcing modes and EDID, Xrandr#Troubleshooting and Qnix QX2710#Fixing X11 with Nvidia.
Overclocking with nvidia-settings GUI not working
A workaround is to use the nvidia-settings CLI to query and set certain variables after enabling overclocking (as explained in NVIDIA/Tips and tricks#Enabling overclocking; see nvidia-settings(1) for more information).
Example to query all variables:
nvidia-settings -q all
Example to set PowerMizerMode to prefer performance mode:
nvidia-settings -a "[gpu:0]/GPUPowerMizerMode=1"
Example to set fan speed to fixed 21%:
nvidia-settings -a "[gpu:0]/GPUFanControlState=1" -a "[fan:0]/GPUTargetFanSpeed=21"
Example to set multiple variables at once (overclock GPU by 50MHz, overclock video memory by 50MHz, increase GPU voltage by 100mV):
nvidia-settings -a GPUGraphicsClockOffsetAllPerformanceLevels=50 -a GPUMemoryTransferRateOffsetAllPerformanceLevels=50 -a GPUOverVoltageOffset=100
Overclocking not working with Unknown Error
If you are running Xorg as a non-root user and trying to overclock your NVIDIA GPU, you will get an error similar to this one:
$ nvidia-settings -a "[gpu:0]/GPUGraphicsClockOffset[3]=10"
ERROR: Error assigning value 10 to attribute 'GPUGraphicsClockOffset' (trinity-zero:1[gpu:0]) as specified in assignment '[gpu:0]/GPUGraphicsClockOffset[3]=10' (Unknown Error).
To avoid this issue, Xorg has to be run as the root user. See Xorg#Rootless Xorg for details.
System will not boot after driver was installed
If after installing the NVIDIA driver your system becomes stuck before reaching the display manager, try to disable kernel mode setting.
X fails with "Failing initialization of X screen"
If /var/log/Xorg.0.log says that the X server fails to initialize the screen:
(EE) NVIDIA(G0): GPU screens are not yet supported by the NVIDIA driver
(EE) NVIDIA(G0): Failing initialization of X screen
and nvidia-smi says No running processes found, first reinstall the latest nvidia-utils package. Then copy /usr/share/X11/xorg.conf.d/10-nvidia-drm-outputclass.conf to /etc/X11/xorg.conf.d/10-nvidia-drm-outputclass.conf, edit the copy to add the line Option "PrimaryGPU" "yes", and restart the computer. This should fix the problem.
System does not return from suspend
What you see in the log:
kernel: nvidia-modeset: ERROR: GPU:0: Failed detecting connected display devices
kernel: nvidia-modeset: ERROR: GPU:0: Failed detecting connected display devices
kernel: nvidia-modeset: WARNING: GPU:0: Failure processing EDID for display device DELL U2412M (DP-0).
kernel: nvidia-modeset: WARNING: GPU:0: Unable to read EDID for display device DELL U2412M (DP-0)
kernel: nvidia-modeset: ERROR: GPU:0: Failure reading maximum pixel clock value for display device DELL U2412M (DP-0).
A possible solution based on [3]:
Run this command to get the version string:
# strings /sys/firmware/acpi/tables/DSDT | grep -i 'windows ' | sort | tail -1
Add the acpi_osi=! "acpi_osi=version" kernel parameters to your boot loader configuration, replacing version with the string returned by the command above.
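The two steps above (pick the last "Windows ..." string, then wrap it in the kernel parameter) can be sketched in Python as follows. The function name acpi_osi_parameter is hypothetical, and the input is assumed to be the lines printed by strings on the DSDT table; Python's sorted() orders by code point, which may differ from the locale-aware sort command for unusual strings.

```python
def acpi_osi_parameter(dsdt_strings):
    """Build the acpi_osi kernel parameters from DSDT strings.

    Mirrors: grep -i 'windows ' | sort | tail -1 (approximately).
    """
    windows = sorted(
        s.strip() for s in dsdt_strings if "windows " in s.lower()
    )
    if not windows:
        raise ValueError("no 'Windows ...' string found in DSDT")
    return f'acpi_osi=! "acpi_osi={windows[-1]}"'
```

With the sample input ["Linux", "Windows 2009", "Windows 2015", "Windows 2012"], the function selects "Windows 2015" as the version string.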
Vulkan error on applications start
When running an application that requires Vulkan acceleration, if you get this error:
Vulkan call failed: -4
try deleting the ~/.nv or ~/.cache/nvidia directory.
Extreme lag on Xorg
A common issue with Mutter is that animations, video playback and gaming cause extreme desktop lag on Xorg.
See NVIDIA/Tips and tricks#Preserve video memory after suspend.
This should resolve the issue. If it does not, you are most likely out of luck, although one way to mitigate it is to add these options:
/etc/environment
CLUTTER_DEFAULT_FPS=YOUR_MAIN_DISPLAY_REFRESHRATE
__GL_SYNC_DISPLAY_DEVICE=YOUR_MAIN_DISPLAY_OUTPUT_NAME
then turn Sync to VBlank and Allow flipping off within NVIDIA Settings, and configure NVIDIA Settings to launch on startup with the --load-config-only flag.
This will still result in laggy desktop behavior, in particular on a second (or third) monitor, but it should be much better.