System Requirements
Mode: Batch, Realtime
Deployments: Virtual Appliance
The Speechmatics Virtual Appliance operates on a hypervisor host system. For this version of the Appliance, the following hypervisors are supported:
- VMware ESXi v7.0 and greater
- VMware Workstation v16.0 and greater*
- AWS EC2
- Proxmox VE v8.0 and greater
For the Virtual Appliance to operate as required, the host must meet the requirements and have the resources available as defined below.
*VMware Workstation does not currently support PCI passthrough, which is a requirement for running GPU transcription.
Virtual Appliance System Requirements
Batch Virtual Appliance
The minimum specifications for the Speechmatics Batch Virtual Appliance are:
CPU transcription
- 2 vCPUs
- 8GB RAM
- Up to 170GB hard disk space
For each concurrent input stream using the Standard model, the Appliance requires additional resources based on the requested features:
Scaling Mode | CPU (vCPUs) | Memory (GB) |
---|---|---|
simple | 1.0 | 1.35 - 1.95 |
adaptive | 4.0 | 3.13 - 4.52 |
If you are using the Enhanced model, each concurrent input stream requires:
Scaling Mode | CPU (vCPUs) | Memory (GB) |
---|---|---|
simple | 1.0 | 1.80 - 2.00 |
adaptive | 4.0 | 4.17 - 4.64 |
GPU transcription
- 8 vCPUs
- 32GB RAM
- Up to 170GB hard disk space
These specifications include requirements to run at least one worker using either our Standard or Enhanced model.
For each concurrent input stream using the Standard model, the Appliance requires additional resources based on the requested features:
Scaling Mode | CPU (vCPUs) | Memory (GB) |
---|---|---|
simple | 0.6 - 0.9 | 0.95 - 1.30 |
adaptive | 1.2 - 1.8 | 1.71 - 2.34 |
If you are using the Enhanced model, each concurrent input stream requires:
Scaling Mode | CPU (vCPUs) | Memory (GB) |
---|---|---|
simple | 0.6 - 1.0 | 2.50 - 3.70 |
adaptive | 1.2 - 2.0 | 4.50 - 6.60 |
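The per-stream figures in the tables above can be turned into a rough VM size. The following is a minimal sketch, not an official sizing tool: the function name `estimate_batch_vm` and the dictionary layout are illustrative, it takes the upper end of each range, and it assumes (conservatively) that every concurrent stream is added on top of the minimum specification.

```python
import math

# Minimum specifications (vCPUs, GB RAM) taken from this page.
BASELINE = {"cpu": (2, 8), "gpu": (8, 32)}

# Per-stream (vCPUs, GB RAM), using the upper end of each range in the tables.
PER_STREAM = {
    ("cpu", "standard", "simple"): (1.0, 1.95),
    ("cpu", "standard", "adaptive"): (4.0, 4.52),
    ("cpu", "enhanced", "simple"): (1.0, 2.00),
    ("cpu", "enhanced", "adaptive"): (4.0, 4.64),
    ("gpu", "standard", "simple"): (0.9, 1.30),
    ("gpu", "standard", "adaptive"): (1.8, 2.34),
    ("gpu", "enhanced", "simple"): (1.0, 3.70),
    ("gpu", "enhanced", "adaptive"): (2.0, 6.60),
}


def estimate_batch_vm(hardware, model, scaling_mode, streams):
    """Return (vCPUs, GB RAM), rounded up, for `streams` concurrent streams.

    Assumption: each stream is counted in addition to the minimum
    specification, so the result is an upper-bound estimate.
    """
    if streams < 0:
        raise ValueError("streams must be zero or greater")
    base_cpu, base_mem = BASELINE[hardware]
    cpu, mem = PER_STREAM[(hardware, model, scaling_mode)]
    # round() guards against floating-point noise before rounding up.
    total_cpu = math.ceil(round(base_cpu + streams * cpu, 6))
    total_mem = math.ceil(round(base_mem + streams * mem, 6))
    return total_cpu, total_mem
```

For example, under these assumptions `estimate_batch_vm("cpu", "standard", "simple", 4)` gives 6 vCPUs and 16 GB of RAM.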
The recommended specifications for the Speechmatics Batch Virtual Appliance are:
- 16 vCPUs
- 64GB RAM
These specifications provide a good tradeoff of cost, performance and stability. It is possible to get increased throughput by choosing a larger machine and running more jobs in parallel. However, depending on the use case, it may be better to run several Appliances of the recommended size.
Realtime Virtual Appliance
CPU transcription
The Realtime Virtual Appliance is in early access, and CPU transcription is not currently supported.
GPU transcription
The minimum specifications for the Speechmatics Realtime Virtual Appliance are:
- 4 vCPUs
- 16GB RAM
- Up to 100GB hard disk space
These specifications include requirements to run at least one worker using either our Standard or Enhanced model.
For each additional input stream, the Appliance will require up to 0.2 vCPUs and 200MB of memory.
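As a worked example of the per-stream figures above, the sketch below estimates the VM size for a given number of concurrent Realtime streams. It is an illustration only: the function name is hypothetical, it assumes the minimum specification already covers the first stream, it treats 200MB as roughly 0.2GB, and it does not model GPU capacity, which may be the real limit.

```python
import math

MIN_VCPUS, MIN_RAM_GB = 4, 16           # minimum Realtime GPU specification
EXTRA_VCPUS, EXTRA_RAM_GB = 0.2, 0.2    # "up to" figures per additional stream


def realtime_requirements(streams):
    """Return (vCPUs, GB RAM), rounded up, for `streams` concurrent streams.

    Assumption: the first stream is covered by the minimum specification.
    """
    if streams < 1:
        raise ValueError("streams must be at least 1")
    extra = streams - 1
    # round() guards against floating-point noise before rounding up.
    vcpus = math.ceil(round(MIN_VCPUS + extra * EXTRA_VCPUS, 6))
    ram_gb = math.ceil(round(MIN_RAM_GB + extra * EXTRA_RAM_GB, 6))
    return vcpus, ram_gb
```

Under these assumptions, 11 concurrent streams come out at 6 vCPUs and 18GB of RAM.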
The recommended specifications for the Speechmatics Realtime Virtual Appliance are:
- 8 vCPUs
- 32GB RAM
- Up to 100GB hard disk space
The above allows maximum GPU throughput with either the Enhanced or the Standard model.
Important Message on IOPS
Heavy usage of the Appliance at scale can sometimes result in very high IO usage. If this is the case, we recommend increasing the maximum IOPS supported by your hardware to a value between 8,000 and 12,000. This is not necessary in all circumstances, but may result in better performance if you are running more than 10 concurrent workers. You should first upgrade the root disk. If you have an even greater number of workers, you may also need to upgrade the jobs disk.
Increasing the IOPS will also increase the cost of resource usage. If you use AWS, we recommend setting the volume type to gp3 in all cases, as it costs less and performs better than gp2.
Low IOPS can also result in longer startup times for the Appliance. Where the IOPS is too low, the first few transcription requests on a fresh VM may fail; see troubleshooting.
How to change the maximum IOPS supported by your hardware is documented here for AWS, here for Microsoft Azure, and here for VMware. You may need to do this if:
- You are using the maximum number of workers supported by that Appliance size, or close to it
- The jobs being processed are all long files, and diarization is requested
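For AWS, one possible way to apply the gp3 and IOPS recommendations above is the EC2 `ModifyVolume` operation, exposed in the boto3 SDK as `modify_volume`. The sketch below is an assumption about how you might script this, not the documented Speechmatics procedure: the helper names are hypothetical, boto3 is a third-party package that must be installed and configured with credentials, and the range check simply mirrors the 8,000 to 12,000 range suggested on this page.

```python
def gp3_iops_request(volume_id, iops=8000):
    """Build keyword arguments for an EC2 ModifyVolume call.

    The 8,000 to 12,000 check mirrors this page's recommendation;
    it is not an AWS limit.
    """
    if not 8000 <= iops <= 12000:
        raise ValueError("iops should be between 8,000 and 12,000")
    return {"VolumeId": volume_id, "VolumeType": "gp3", "Iops": iops}


def apply_request(request):
    """Send the request to AWS. Requires boto3 and valid AWS credentials."""
    import boto3  # imported here so the builder works without the SDK

    return boto3.client("ec2").modify_volume(**request)
```

Keeping the request builder separate from the call makes it possible to review the parameters before anything is changed, for example `apply_request(gp3_iops_request("vol-0123456789abcdef0", 10000))`, where the volume ID is a placeholder for the Appliance's root or jobs disk.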
Host Requirements
CPU
Standard Operating Point
The host machine must meet the following requirements:
- A processor with at least a Broadwell class microarchitecture or newer, with AVX2 instruction support
- You should check that your hypervisor is configured to allow VM access to the AVX2 instruction
Enhanced Operating Point
- The host machine should have a processor with at least a Cascade Lake class microarchitecture or newer, with AVX512-VNNI instruction support. This will greatly improve transcription processing speed. The AVX2 instruction is also required
- You should check that your hypervisor is configured to allow VM access to the AVX2 and AVX512-VNNI instructions
- Examples of this among popular hosting providers include the Microsoft Azure DS_v4 class, and the Amazon M5n EC2 server class
- If you are using VMware with the Enhanced model and encounter performance issues, we recommend allocating dedicated memory and/or processors to the Appliance. How to apply dedicated processors in VMware is documented here; setting memory is documented here
- If you encounter performance issues when running the Enhanced model, disabling hyperthreading can also improve transcription speed. How to do so when running on Amazon Web Services is shown here, and for Microsoft Azure please see here
GPU
- For running with a GPU, the requirements are the same as for the GPU Container.
AVX Flags
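The hardware you run the Appliance on must support Advanced Vector Extensions (AVX). To see which AVX flags are supported by the CPU of your host system, you can run the following query via the Management API of the Appliance. Here ADMIN_PASSWORD and APPLIANCE_HOST are shell variables that you set to the admin password and the Appliance hostname (avoid the name PWD, which the shell reserves for the current working directory):
curl -L -u "admin:${ADMIN_PASSWORD}" -X 'GET' "https://${APPLIANCE_HOST}/v2/management/cpuinfo"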
You will receive information about the host CPU. Supported AVX flags will be returned as flags in the Management API response. An example is below:
{
  "usage_percentage": 2.5,
  "architecture": "X86_64",
  "model_name": "Intel(R) Xeon(R) Silver 4116 CPU @ 2.10GHz",
  "cpus": "2",
  "vendor": "GenuineIntel",
  "hyperthreading": false,
  "flags": "3dnowprefetch abm adx aes apic arat arch_capabilities arch_perfmon avx avx2 avx512_vnni bmi1 bmi2 clflush cmov constant_tsc cpuid cpuid_fault cx16 cx8 de f16c flush_l1d fma fpu fsgsbase fxsr hypervisor ibpb ibrs invpcid invpcid_single lahf_lm lm mca mce md_clear mmx movbe msr mtrr nonstop_tsc nopl nx pae pat pcid pclmulqdq pdpe1gb pge pni popcnt pse pse36 pti rdrand rdseed rdtscp sep smap smep ss ssbd sse sse2 sse4_1 sse4_2 ssse3 stibp syscall tsc tsc_adjust tsc_deadline_timer tsc_reliable vme x2apic xsave xsaveopt xtopology"
}
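The `flags` string in a response like the one above can be checked programmatically against the instruction sets named in the Host Requirements. The sketch below is a minimal illustration: the function name and the `EXPECTED_FLAGS` mapping are hypothetical, and the flag spellings `avx2` and `avx512_vnni` are taken from the example response.

```python
import json

# Flags named on this page: AVX2 for the Standard operating point,
# AVX2 plus AVX512-VNNI for the Enhanced operating point.
EXPECTED_FLAGS = {
    "standard": {"avx2"},
    "enhanced": {"avx2", "avx512_vnni"},
}


def missing_flags(cpuinfo_json, operating_point):
    """Return a sorted list of expected CPU flags absent from the response."""
    info = json.loads(cpuinfo_json)
    # The response carries the flags as one space-separated string.
    present = set(info["flags"].split())
    return sorted(EXPECTED_FLAGS[operating_point] - present)
```

An empty list means every expected flag is visible to the VM. A non-empty list suggests that either the host CPU or the hypervisor configuration is not exposing those instructions.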
Useful Links
The Virtual Appliance System Requirements section above lists the minimum Virtual Appliance VM (guest) specifications. The host machine must have enough resources (processor, memory and storage) to run the hypervisor, the guest VMs you intend to host on it, plus any other processes you expect to run on it. Follow vendor guidelines for other host requirements and the installation process.
For VMware, the document Performance Best Practices for VMware vSphere 7.0 contains a comprehensive overview of hardware considerations and recommendations on how to optimize your host platform. See https://www.vmware.com/support.html for up-to-date technical information on VMware.
For Amazon EC2, the following link explains how to set up a VM using an AWS S3 bucket to store the OVA file: https://docs.aws.amazon.com/vm-import/latest/userguide/vmimport-image-import.html.
Proxmox VE docs can be found at: https://pve.proxmox.com/pve-docs/