NVIDIA V100 TENSOR CORE GPU

                  The First Tensor Core GPU


                  Welcome to the Era of AI.

                  Finding the insights hidden in oceans of data can transform entire industries, from personalized cancer therapy to helping virtual personal assistants converse naturally and predicting the next big hurricane. 


                  NVIDIA® V100 Tensor Core is the most advanced data center GPU ever built to accelerate AI, high performance computing (HPC), data science, and graphics. It’s powered by NVIDIA Volta architecture, comes in 16 and 32GB configurations, and offers the performance of up to 32 CPUs in a single GPU. Data scientists, researchers, and engineers can now spend less time optimizing memory usage and more time designing the next AI breakthrough.

                  Run AI and HPC workloads in a virtual environment for better security and manageability using NVIDIA Virtual Compute Server (vComputeServer) software.

                  32X Faster Training Throughput than a CPU

                  ResNet-50 training, dataset: ImageNet2012, BS=256 | NVIDIA V100 comparison: NVIDIA DGX-2™ server, 1x V100 SXM3-32GB, MXNet 1.5.1, container=19.11-py3, mixed precision, throughput: 1,525 images/sec | Intel comparison: Supermicro SYS-1029GQ-TRT, 1 socket Intel Gold 6240@2.6GHz/3.9GHz Turbo, TensorFlow 0.18, FP32 (only precision available), throughput: 48 images/sec
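The headline multiple can be checked directly from the two throughput figures in the footnote above. The following is a minimal sketch of that arithmetic; the function name `speedup` and the constant names are illustrative and do not come from the source.

```python
def speedup(gpu_throughput: float, cpu_throughput: float) -> float:
    """Return how many times higher the GPU throughput is than the CPU throughput."""
    if cpu_throughput <= 0:
        raise ValueError("cpu_throughput must be positive")
    return gpu_throughput / cpu_throughput


# Figures taken from the ResNet-50 training footnote (images/sec).
V100_IMAGES_PER_SEC = 1525.0
CPU_IMAGES_PER_SEC = 48.0

training_speedup = speedup(V100_IMAGES_PER_SEC, CPU_IMAGES_PER_SEC)
print(f"Training speedup: {training_speedup:.1f}x")
```

The ratio comes to a little under 32, which the page rounds to 32X. Note that the footnote compares a mixed-precision GPU run against an FP32 CPU run, so the multiple reflects both the hardware and the precision mode.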

                  AI Training

                  From recognizing speech to training virtual personal assistants and teaching autonomous cars to drive, data scientists are taking on increasingly complex challenges with AI. Solving these kinds of problems requires training deep learning models that are exponentially growing in complexity, in a practical amount of time.

                  With 640 Tensor Cores, V100 is the world’s first GPU to break the 100 teraFLOPS (TFLOPS) barrier of deep learning performance. The next generation of NVIDIA NVLink™ connects multiple V100 GPUs at up to 300 GB/s to create the world’s most powerful computing servers. AI models that would consume weeks of computing resources on previous systems can now be trained in a few days. With this dramatic reduction in training time, a whole new world of problems will now be solvable with AI.
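As a rough illustration of where a figure above 100 TFLOPS can come from, peak Tensor Core throughput can be estimated as cores × operations per clock × clock rate. Only the Tensor Core count (640) comes from the text above. The per-core rate of 64 fused multiply-adds per clock and the boost clock of about 1.53 GHz are assumptions made for this sketch, not figures stated on this page.

```python
TENSOR_CORES = 640             # from the text above
FMA_PER_CORE_PER_CLOCK = 64    # assumption: one 4x4x4 matrix multiply-accumulate per clock
FLOPS_PER_FMA = 2              # a fused multiply-add counts as two floating-point operations
BOOST_CLOCK_HZ = 1.53e9        # assumption: approximate boost clock of the NVLink variant


def peak_tensor_tflops(cores: int, fma_per_clock: int, clock_hz: float) -> float:
    """Estimate theoretical peak Tensor Core throughput in teraFLOPS."""
    return cores * fma_per_clock * FLOPS_PER_FMA * clock_hz / 1e12


peak = peak_tensor_tflops(TENSOR_CORES, FMA_PER_CORE_PER_CLOCK, BOOST_CLOCK_HZ)
print(f"Estimated peak: {peak:.1f} TFLOPS")
```

Under these assumptions the estimate lands close to the 125 teraFLOPS deep learning figure listed for V100 for NVLink in the specifications. This is a theoretical peak; sustained throughput on real models is lower.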

                  SEE HOW YOU CAN ACCELERATE YOUR AI MODELS WITH MIXED PRECISION ON TENSOR CORES

                  24X Higher Inference Throughput than a CPU Server

                  BERT Base fine-tuning inference, dataset: SQuADv1.1, BS=1, sequence length=128 | NVIDIA V100 comparison: Supermicro SYS-4029GP-TRT, 1x V100-PCIE-16GB, pre-release container, mixed precision, NVIDIA TensorRT™ 6.0, throughput: 557 sentences/sec | Intel comparison: 1 socket Intel Gold 6240@2.6GHz/3.9GHz Turbo, FP32 (only precision available), OpenVINO MKL-DNN v0.18, throughput: 23.5 sentences/sec
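The two throughput figures in the footnote above give both the headline multiple and a rough average time per sentence. The sketch below works through that arithmetic. The helper name `ms_per_item` is illustrative, and the per-sentence time assumes requests are processed strictly one at a time at batch size 1, which the footnote does not state explicitly.

```python
def ms_per_item(throughput_per_sec: float) -> float:
    """Convert a throughput (items/sec) into an average time per item in milliseconds."""
    if throughput_per_sec <= 0:
        raise ValueError("throughput_per_sec must be positive")
    return 1000.0 / throughput_per_sec


# Figures taken from the BERT Base inference footnote (sentences/sec).
V100_SENTENCES_PER_SEC = 557.0
CPU_SENTENCES_PER_SEC = 23.5

inference_speedup = V100_SENTENCES_PER_SEC / CPU_SENTENCES_PER_SEC
print(f"Inference speedup: {inference_speedup:.1f}x")
print(f"V100: {ms_per_item(V100_SENTENCES_PER_SEC):.2f} ms per sentence")
print(f"CPU:  {ms_per_item(CPU_SENTENCES_PER_SEC):.2f} ms per sentence")
```

The ratio is just under 24, matching the 24X heading for this benchmark.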

                  AI Inference

                  To connect us with the most relevant information, services, and products, hyperscale companies have started to tap into AI. However, keeping up with user demand is a daunting challenge. For example, the world’s largest hyperscale company recently estimated that they would need to double their data center capacity if every user spent just three minutes a day using their speech recognition service. 

                  V100 is engineered to provide maximum performance in existing hyperscale server racks. With AI at its core, V100 GPU delivers 47X higher inference performance than a CPU server. This giant leap in throughput and efficiency will make the scale-out of AI services practical.

                  One V100 Server Node Replaces Up to 135 CPU-Only Server Nodes

                  Application (Dataset): MILC (APEX Medium) and Chroma (szscl21_24_128) | CPU Server: Dual-Socket Intel Xeon Platinum 8280 (Cascade Lake)

                  High Performance Computing (HPC)

                  HPC is a fundamental pillar of modern science. From predicting weather to discovering drugs to finding new energy sources, researchers use large computing systems to simulate and predict our world. AI extends traditional HPC by allowing researchers to analyze large volumes of data for rapid insights where simulation alone cannot fully predict the real world.

                  V100 is engineered for the convergence of AI and HPC. It offers a platform for HPC systems to excel at both computational science for scientific simulation and data science for finding insights in data. By pairing NVIDIA CUDA cores and Tensor Cores within a unified architecture, a single server with V100 GPUs can replace hundreds of commodity CPU-only servers for both traditional HPC and AI workloads. Every researcher and engineer can now afford an AI supercomputer to tackle their most challenging work.

                  DATA CENTER GPUs


                  NVIDIA V100 FOR NVLINK

                  Ultimate performance for deep learning.


                  NVIDIA V100 FOR PCIe

                  Highest versatility for all workloads.

                  NVIDIA V100 Specifications

                   

                  Specification                        V100 for NVLink    V100 for PCIe      V100S for PCIe

                  PERFORMANCE (with NVIDIA GPU Boost)
                  Double-Precision                     7.8 teraFLOPS      7 teraFLOPS        8.2 teraFLOPS
                  Single-Precision                     15.7 teraFLOPS     14 teraFLOPS       16.4 teraFLOPS
                  Deep Learning                        125 teraFLOPS      112 teraFLOPS      130 teraFLOPS

                  INTERCONNECT BANDWIDTH (Bi-Directional)
                  Interconnect                         NVLink             PCIe               PCIe
                  Bandwidth                            300 GB/s           32 GB/s            32 GB/s

                  MEMORY (CoWoS Stacked HBM2)
                  Capacity                             32/16 GB HBM2      32/16 GB HBM2      32 GB HBM2
                  Bandwidth                            900 GB/s           900 GB/s           1134 GB/s

                  POWER
                  Max Consumption                      300 watts          250 watts          250 watts
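To put the bandwidth figures in the specifications in perspective, the sketch below estimates how long it would take to move a fixed amount of data at each listed rate. The 16 GB payload is a hypothetical value chosen for illustration, and the calculation treats the listed peak figures as if they were fully achievable, which real transfers will not reach (the interconnect figures are also bi-directional totals).

```python
def transfer_seconds(size_gb: float, bandwidth_gb_per_s: float) -> float:
    """Idealized time to move size_gb of data at a given peak bandwidth."""
    if bandwidth_gb_per_s <= 0:
        raise ValueError("bandwidth_gb_per_s must be positive")
    return size_gb / bandwidth_gb_per_s


PAYLOAD_GB = 16.0        # hypothetical payload, e.g. the contents of a 16 GB card

# Peak figures from the specifications (GB/s).
NVLINK_GB_PER_S = 300.0
PCIE_GB_PER_S = 32.0
HBM2_GB_PER_S = 900.0

nvlink_time = transfer_seconds(PAYLOAD_GB, NVLINK_GB_PER_S)
pcie_time = transfer_seconds(PAYLOAD_GB, PCIE_GB_PER_S)
hbm2_time = transfer_seconds(PAYLOAD_GB, HBM2_GB_PER_S)

print(f"NVLink: {nvlink_time * 1000:.1f} ms")
print(f"PCIe:   {pcie_time * 1000:.1f} ms")
print(f"HBM2:   {hbm2_time * 1000:.1f} ms")
print(f"NVLink vs PCIe: {pcie_time / nvlink_time:.1f}x faster")
```

The point of the comparison is the ratio rather than the absolute times: at the listed peaks, NVLink moves the same data roughly nine times faster than PCIe, while on-package HBM2 is faster again.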

                  Take a Free Test Drive

                  The World's Fastest GPU Accelerators for HPC and Deep Learning.

                  WHERE TO BUY

                  Find an NVIDIA Accelerated Computing Partner through our NVIDIA Partner Network (NPN).