Template:NVIDIA Jetson Orin Nano Super Mode performance using Generative AI-Template1

From RidgeRun Developer Wiki

./bandwidthTest --mode=shmoo --device=all
[CUDA Bandwidth Test] - Starting...

!!!!!Cumulative Bandwidth to be computed from all the devices !!!!!!

Running on...

 Device 0: Orin
 Shmoo Mode

.................................................................................
 Host to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(GB/s)
   1000				0.8
   2000				0.4
   3000				0.7
   4000				0.9
   5000				1.1
   6000				1.3
   7000				1.5
   8000				1.7
   9000				1.9
   10000			2.0
   11000			2.2
   12000			2.4
   13000			2.5
   14000			2.8
   15000			2.9
   16000			3.1
   17000			3.3
   18000			3.4
   19000			3.6
   20000			3.7
   22000			4.1
   24000			4.4
   26000			8.0
   28000			8.8
   30000			9.4
   32000			9.6
   34000			10.5
   36000			11.1
   38000			11.9
   40000			12.1
   42000			12.8
   44000			13.6
   46000			14.0
   48000			14.7
   50000			14.8
   60000			16.5
   70000			17.2
   80000			18.2
   90000			19.1
   100000			19.9
   200000			23.6
   300000			25.1
   400000			25.8
   500000			22.8
   600000			23.7
   700000			24.4
   800000			24.7
   900000			25.2
   1000000			25.3
   2000000			27.0
   3000000			27.6
   4000000			27.8
   5000000			28.0
   6000000			28.1
   7000000			28.1
   8000000			28.3
   9000000			28.3
   10000000			28.3
   11000000			28.3
   12000000			28.4
   13000000			28.4
   14000000			28.4
   15000000			28.4
   16000000			28.5
   18000000			28.5
   20000000			28.6
   22000000			28.6
   24000000			28.6
   26000000			28.6
   28000000			28.6
   30000000			28.6
   32000000			28.6
   36000000			28.6
   40000000			28.6
   44000000			28.6
   48000000			28.6
   52000000			28.7
   56000000			28.7
   60000000			28.7
   64000000			28.7
   68000000			28.7

.................................................................................
 Device to Host Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(GB/s)
   1000				0.3
   2000				0.7
   3000				1.0
   4000				1.3
   5000				1.7
   6000				1.9
   7000				2.3
   8000				2.7
   9000				3.0
   10000			3.4
   11000			3.6
   12000			4.1
   13000			4.4
   14000			4.7
   15000			5.0
   16000			5.4
   17000			5.8
   18000			6.1
   19000			6.6
   20000			6.9
   22000			7.4
   24000			8.4
   26000			9.0
   28000			9.8
   30000			10.5
   32000			11.3
   34000			11.8
   36000			12.5
   38000			13.4
   40000			13.9
   42000			14.7
   44000			15.5
   46000			16.2
   48000			17.0
   50000			18.0
   60000			19.4
   70000			20.3
   80000			21.7
   90000			22.5
   100000			23.1
   200000			25.7
   300000			26.6
   400000			27.0
   500000			23.4
   600000			24.7
   700000			24.7
   800000			25.7
   900000			25.4
   1000000			26.9
   2000000			26.6
   3000000			27.7
   4000000			28.0
   5000000			28.1
   6000000			28.3
   7000000			28.2
   8000000			28.4
   9000000			28.4
   10000000			28.4
   11000000			28.4
   12000000			28.4
   13000000			28.5
   14000000			28.4
   15000000			28.5
   16000000			28.5
   18000000			28.5
   20000000			28.6
   22000000			28.6
   24000000			28.6
   26000000			28.6
   28000000			28.6
   30000000			28.6
   32000000			28.6
   36000000			28.6
   40000000			28.6
   44000000			28.7
   48000000			28.7
   52000000			28.6
   56000000			28.6
   60000000			28.7
   64000000			28.7
   68000000			28.6

.................................................................................
 Device to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(GB/s)
   1000				0.5
   2000				0.8
   3000				1.1
   4000				1.4
   5000				1.8
   6000				2.1
   7000				2.4
   8000				2.8
   9000				3.2
   10000			3.5
   11000			3.9
   12000			4.3
   13000			4.6
   14000			5.0
   15000			5.4
   16000			5.7
   17000			6.1
   18000			6.3
   19000			6.6
   20000			7.1
   22000			7.9
   24000			8.5
   26000			9.1
   28000			9.8
   30000			10.6
   32000			11.3
   34000			11.9
   36000			12.5
   38000			13.3
   40000			14.0
   42000			14.6
   44000			15.2
   46000			16.2
   48000			16.8
   50000			17.1
   60000			20.2
   70000			15.8
   80000			17.8
   90000			20.3
   100000			22.4
   200000			38.8
   300000			52.5
   400000			69.0
   500000			71.8
   600000			83.6
   700000			86.3
   800000			90.7
   900000			97.3
   1000000			101.9
   2000000			63.2
   3000000			69.6
   4000000			72.3
   5000000			74.8
   6000000			76.0
   7000000			77.7
   8000000			78.2
   9000000			79.3
   10000000			79.9
   11000000			80.1
   12000000			80.5
   13000000			80.7
   14000000			80.4
   15000000			81.1
   16000000			81.6
   18000000			82.1
   20000000			82.1
   22000000			82.8
   24000000			82.5
   26000000			82.9
   28000000			83.1
   30000000			82.8
   32000000			83.5
   36000000			83.5
   40000000			83.6
   44000000			83.7
   48000000			83.6
   52000000			83.9
   56000000			84.0
   60000000			84.1
   64000000			84.5
   68000000			84.1

Result = PASS

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
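
The shmoo tables above are long, so a short script can help pull out the headline numbers. The following is a minimal sketch, assuming the log has been saved to a text file in exactly the layout shown above. The function names `parse_shmoo` and `summarize` and the file name `bandwidth_shmoo.log` are hypothetical and not part of the CUDA samples. It reports, for each direction, the peak bandwidth and the bandwidth at the largest transfer size.

```python
import re

# Section headers as they appear in the log, e.g.
# " Host to Device Bandwidth, 1 Device(s)"
SECTION_RE = re.compile(
    r"^\s*(Host to Device|Device to Host|Device to Device) Bandwidth"
)
# Data rows: transfer size in bytes, whitespace, bandwidth in GB/s
ROW_RE = re.compile(r"^\s*(\d+)\s+(\d+(?:\.\d+)?)\s*$")


def parse_shmoo(text):
    """Return {section name: [(transfer_size_bytes, bandwidth_gb_s), ...]}."""
    results = {}
    current = None
    for line in text.splitlines():
        header = SECTION_RE.match(line)
        if header:
            current = header.group(1)
            results[current] = []
            continue
        row = ROW_RE.match(line)
        if row and current is not None:
            results[current].append((int(row.group(1)), float(row.group(2))))
    return results


def summarize(rows):
    """Return (peak_size, peak_bandwidth, bandwidth_at_largest_size)."""
    peak_size, peak_bw = max(rows, key=lambda r: r[1])
    return peak_size, peak_bw, rows[-1][1]


if __name__ == "__main__":
    # Hypothetical file name: save the log above under this name first.
    with open("bandwidth_shmoo.log") as f:
        sections = parse_shmoo(f.read())
    for name, rows in sections.items():
        size, peak, last = summarize(rows)
        print(f"{name}: peak {peak} GB/s at {size} bytes, "
              f"{last} GB/s at the largest transfer")
```

Run against the log above, this kind of summary makes it easier to see that the Host to Device and Device to Host transfers level off near 28.6 to 28.7 GB/s, while the Device to Device transfers peak at 101.9 GB/s for 1000000-byte transfers and then settle around 84 GB/s for the largest sizes. As the NOTE in the log says, these figures come from a CUDA sample and should be treated as indicative rather than as a rigorous benchmark.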