There is a little trick that can help you get some extra hashrate out of your Nvidia GPU when mining Ethereum or another memory-dependent algorithm, though it will most likely not help with many of the other crypto algorithms. By default, when running Compute applications, the GPU does not go to the highest Power State of the card, meaning that you might not be able to squeeze the maximum performance out of the video card even before overclocking anything. Thanks to the Nvidia System Management Interface (nvidia-smi) command line utility you are able to force the GPU to work in the P0 state (the highest power state) instead of topping out at P2 when running a Compute application such as crypto mining software. Do note that this lower maximum power state applies only to Compute applications, so the trick is not needed for gaming, nor should it affect gaming performance, where the GPU should go up to the P0 power state if the conditions allow it.
The nvidia-smi utility is part of the video drivers and you can find it installed in the folder “C:\Program Files\NVIDIA Corporation\NVSMI\” on Windows, so you need to run the command line (cmd) and navigate to that folder in order to be able to issue commands. What you should start with is running the following command to check the current P-state of your GPU(s):
nvidia-smi -q -d PERFORMANCE
Do note that the P-state changes dynamically, so you need to be running Ethminer or another application when you issue the above command to see the power state active under load, otherwise you might see a lower power state being active if the GPU is idle.
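If you prefer to check the power state from a script instead of reading the full report, the following is a minimal Python sketch of one way to do it. It assumes nvidia-smi is reachable on the PATH (otherwise pass the full path to the executable in the NVSMI folder) and it uses the utility's CSV query mode rather than the "-q -d PERFORMANCE" report. The function names parse_pstates and read_pstates are our own and are not part of any Nvidia tool.

```python
import subprocess


def parse_pstates(csv_text):
    """Parse 'index, name, pstate' CSV lines into a list of dicts."""
    gpus = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue  # skip blank lines
        parts = [part.strip() for part in line.split(",")]
        # First field is the GPU index, last is the P-state,
        # everything in between is the card name.
        gpus.append({
            "index": int(parts[0]),
            "name": ", ".join(parts[1:-1]),
            "pstate": parts[-1],
        })
    return gpus


def read_pstates(nvidia_smi="nvidia-smi"):
    """Ask nvidia-smi for the current P-state of every GPU (run it under load)."""
    result = subprocess.run(
        [nvidia_smi, "--query-gpu=index,name,pstate", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    return parse_pstates(result.stdout)
```

Calling read_pstates() while the miner is running should return one entry per card, so a GPU that is still capped for Compute work would show up with "pstate" equal to "P2".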
After you verify the maximum power state that your Nvidia GPUs use when executing Compute applications, such as ones that rely on OpenCL or CUDA, you need to check the maximum frequencies that the video card supports in the P0 power state. You can do so with the following command (make sure you are still in the NVSMI folder):
nvidia-smi -q -d SUPPORTED_CLOCKS | more
The above will list all of the supported frequencies in the different power states that your video card can use, but there is no need to check the complete list. All we need to note are the frequencies at the top of the list for the Memory and the Graphics. In this example we are using a GTX 970 video card from Gigabyte, and the values we need are 3505 MHz for the VRAM and 1455 MHz for the GPU. We’ll need these frequencies for the next step.
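Picking the top pair out of a long list can also be done with a few lines of Python. The sketch below assumes the report uses the usual layout of a "Memory : N MHz" line followed by indented "Graphics : N MHz" lines. The helper name top_supported_clocks is hypothetical, and you would feed it the saved output of the command above.

```python
import re

# Matches lines such as "    Memory      : 3505 MHz" or "  Graphics  : 1455 MHz"
CLOCK_LINE = re.compile(r"\s*(Memory|Graphics)\s*:\s*(\d+)\s*MHz")


def top_supported_clocks(report):
    """Return (memory_mhz, graphics_mhz) for the highest memory clock and the
    highest graphics clock listed under it, or None if no clocks are listed."""
    best = None
    current_memory = None
    for line in report.splitlines():
        match = CLOCK_LINE.match(line)
        if not match:
            continue
        kind, mhz = match.group(1), int(match.group(2))
        if kind == "Memory":
            current_memory = mhz
        elif current_memory is not None:
            candidate = (current_memory, mhz)
            # Tuples compare memory first, then graphics
            if best is None or candidate > best:
                best = candidate
    return best
```

The function returns None when the report contains no clock lines at all, for example when the driver only prints N/A for the supported clocks.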
What we are going to do next is to force the video card to use the maximum performance operating frequencies by going to the P0 power state. In order to do that we need to run the following command:
nvidia-smi -ac 3505,1455
Note that the above command will apply the settings to all GPUs that you have in your system. Normally that should not be a problem for most mining rigs, as they usually have a number of cards of the same model, but there are cases when this is not true, so you might need to check the individual settings for different video cards and apply the correct parameters for each of them separately. To do so you just need to add the card ID to the command line, so that the particular option will be executed only for the specified video card. This is done by adding the “-i” option followed by the ID of the card, as in the following commands for the first (0) and second (1) GPU:
nvidia-smi -i 0 -ac 3505,1455
nvidia-smi -i 1 -ac 3505,1392
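For rigs with several different cards, the per-GPU commands can be generated from a small table instead of being typed by hand. This is only a sketch: the dictionary layout and the function names build_ac_commands and apply_clocks are our own, the clock values are the ones from the two example commands above, and nvidia-smi is assumed to be on the PATH.

```python
import subprocess


def build_ac_commands(clocks_by_gpu, nvidia_smi="nvidia-smi"):
    """Build one 'nvidia-smi -i ID -ac MEM,GFX' argument list per GPU.

    clocks_by_gpu maps a GPU ID to a (memory_mhz, graphics_mhz) tuple.
    """
    commands = []
    for gpu_id in sorted(clocks_by_gpu):
        memory_mhz, graphics_mhz = clocks_by_gpu[gpu_id]
        commands.append(
            [nvidia_smi, "-i", str(gpu_id), "-ac", f"{memory_mhz},{graphics_mhz}"]
        )
    return commands


def apply_clocks(clocks_by_gpu, nvidia_smi="nvidia-smi"):
    """Run the generated commands; raises CalledProcessError if one fails."""
    for command in build_ac_commands(clocks_by_gpu, nvidia_smi):
        subprocess.run(command, check=True)
```

For example, apply_clocks({0: (3505, 1455), 1: (3505, 1392)}) would issue the same two commands shown above, one per card.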
The question that undoubtedly comes now is how much we have increased the performance for mining Ethereum by following the instructions above. It is pretty easy to check by running a benchmark with Ethminer, or by just running the miner, noting the new increased frequencies that you should now have and comparing them to the ones you previously had. On the Gigabyte GTX 970 WF3OC video card used in this guide we normally get about 17.31 MH/s mining Ethereum when the GPU is maxing out at the P2 power state; when we force it to go to the P0 power state the hashrate increases to about 19.98 MH/s. So this is a nice improvement in performance that comes at a cost of just about 10W higher power usage from the video card. Do note, however, that while this works for Ethereum due to the heavy usage of video memory in the mining process, it may not bring a performance improvement in many other commonly used mining algorithms.
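To put the two hashrate figures into perspective, here is the relative gain worked out in a couple of lines of Python. Only the two numbers measured above are used; the helper name percent_gain is our own.

```python
def percent_gain(before, after):
    """Relative improvement of 'after' over 'before', in percent."""
    return (after - before) / before * 100.0


# Hashrates in MH/s for the GTX 970: P2 power state vs. forced P0 power state
gain = percent_gain(17.31, 19.98)
print(f"Hashrate gain: {gain:.1f}%")
```

This works out to an increase of about 15.4% in hashrate on the test card.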
19 Responses to How to Squeeze Some Extra Performance Mining Ethereum on Nvidia
deggie
February 26th, 2016 at 22:07
or you can just overclock like a muthafuxxor to much the same effect. Eth is very kind power wise.
thorthehammer
February 27th, 2016 at 04:07
No change to my Win7 running two Zotac 970s
admin
February 27th, 2016 at 11:43
This trick will work if you have not overclocked your GPUs, if you overclock them then they may be working at higher frequencies than the ones set for the P0 power state.
lmaonade80
March 2nd, 2016 at 02:53
This trick did not work for me, furthermore, the power stats reverted back to 2 after restart….
admin
March 2nd, 2016 at 10:25
The change is not permanent, you need to reapply it after a restart.
Luke
March 23rd, 2016 at 19:01
Any else get “failed to initialize NVML” when they try to run this?
Justin Ramos
May 6th, 2016 at 21:17
Worked great for me. Thanks!
titsmcgee
September 27th, 2016 at 06:46
awesome thank you!
gtx970 SCC evga pulling 19 – 22mh/s in windows 10!
setx GPU_FORCE_64BIT_PTR 0
setx GPU_MAX_HEAP_SIZE 100
setx GPU_USE_SYNC_OBJECTS 1
setx GPU_MAX_ALLOC_PERCENT 110
"C:\Program Files\NVIDIA Corporation\NVSMI\nvidia-smi" -acp UNRESTRICTED
"C:\Program Files\NVIDIA Corporation\NVSMI\nvidia-smi" -ac 3505,1455
ethminer.exe -F http://us-east1.nanopool.org:8888/0x341045E1c3d7B72764a2F90D6d09fc38Bf7e0067 -G --farm-recheck 200 --cl-local-work 256 --cl-global-work 8192
Leon
June 30th, 2017 at 02:04
How to make auto startable command for nvidia-smi after system restarting? Thanks
Palerider
October 12th, 2017 at 04:07
How would you force an Nvidia GTX 1070 SC into P0 state? The nvidia-smi does not support this card and the card stays in P2 state, anyone help?
Dmitriano
December 2nd, 2017 at 23:10
I have Windows 10 with GeForce® GTX 1060 and NVIDIA driver version 388.13. I tried “nvidia-smi -q -d SUPPORTED_CLOCKS”, but it shows only N/A and does not show the supported clocks, see the screenshot: https://developernote.com/wp-content/uploads/2017/11/nvidia-smi-shows-p2-state.png . What does it mean?
admin
December 2nd, 2017 at 23:20
This does not seem to work on Pascal GPUs such as 1060, 1070, 1080…
Daniel K
December 10th, 2017 at 15:30
I managed to force my Asus GTX 1070 Dual OC cards into performance state P0 using Nvidia Profile Inspector. Just google it and download.
See screenshot:
https://imgur.com/fj4AcTL
The only actual difference I have noticed is that the memory clock is then set to 4004 Mhz (highest default clock) instead of 3800 Mhz in p2 state. So make sure to set your card(s) to default clock or at least downclock before applying p0 state, or you will end up with +400 Mhz extra on the memory…
Daniel K
December 10th, 2017 at 15:33
And here is the result/proof:
https://imgur.com/iTQnsno
Johngo
December 13th, 2017 at 10:34
Which CUDA version are you using? 9.1 or 6.5? Did you need Visual Studio? If so which version and is there a free workaround? I’m understanding CUDA 8 needs VS15.
JJ
January 20th, 2018 at 10:17
Thanks, Daniel K. It really worked!
On my 1080, I went from ~23 to ~25 Mh/s
Lilith
March 2nd, 2018 at 00:06
I have a gtx 980ti and I am mining musicoin, this only improved from 18.300MH to 18.600MH :(
My best improvement went from 3.500MH to 18.300MH after going to the Nvidia Control Panel > 3D Settings > Manage 3D Settings > Optimise for Compute Performance = ON
Regards,
admin
March 2nd, 2018 at 15:20
The “Optimize for Compute Performance” option is only available on 2nd gen Nvidia Maxwell GPUs like the GTX 980, so if you have one of these you should give it a try and see if it can help improve mining performance.
Petri Raatikainen
August 5th, 2018 at 15:00
Here is a code for a small program to keep the card at P2.
Then you can overclock without worrying the card going to P0 and crash.
Run in background (Linux) : ./keepP2 device=0 &
Search setiathome forums for a download link.
— snip —
/*
Keep P2 state permanently
This utility helps overclocked systems not to enter P0 state and crash.
usage: keepP2 [device=N]
Contact petri33 @ setiathome
*/
// Header names reconstructed from the calls used below;
// helper_cuda.h ships with the NVIDIA CUDA Samples.
#include <stdio.h>        // printf
#include <stdlib.h>       // EXIT_SUCCESS
#include <unistd.h>       // usleep
#include <cuda_runtime.h>
#include <helper_cuda.h>  // findCudaDevice, checkCudaErrors

// a variable in GPU memory
__device__ int i;

__global__ void myKernel(int val)
{
    i = 0; // write zero to i
}

int main(int argc, char **argv)
{
    int devID;
    cudaDeviceProp props;

    // This will pick selected or the best possible CUDA capable device
    devID = findCudaDevice(argc, (const char **)argv);

    // Get GPU information
    checkCudaErrors(cudaGetDevice(&devID));
    checkCudaErrors(cudaGetDeviceProperties(&props, devID));
    printf("Device %d: \"%s\" with Compute %d.%d capability\n", devID, props.name, props.major, props.minor);
    printf("Keep in P2 state enabled.\nCreated 2018 by petri33 @ setiathome\n\n");

    // minimal kernel configuration: one block with one thread
    dim3 dimGrid(1);
    dim3 dimBlock(1);
    unsigned int microseconds = 100000; // 0.1 seconds

    for (;;)
    {
        // run 10 times a second, negligible performance hit
        myKernel<<<dimGrid, dimBlock>>>(0);
        usleep(microseconds);
    }

    return EXIT_SUCCESS; // not reached, the loop runs until the process is killed
}
— snip —