FMA3
E648231
FMA3 is an x86 instruction set extension that provides fused multiply-add operations to improve floating-point performance and efficiency in modern processors.
Statements (46)
| Predicate | Object |
|---|---|
| instanceOf | fused multiply-add instruction set; x86 instruction set extension |
| architecture | x86 |
| benefits | digital signal processing workloads; graphics and multimedia workloads; machine learning workloads; scientific computing workloads |
| category | SIMD floating-point extension |
| compatibleWith | AVX; AVX2 |
| cpuidLeaf | EAX=1 |
| cpuidRegisterBit | ECX bit 12 |
| dataType | double-precision floating-point; single-precision floating-point |
| design | 3-operand destructive form (the destination register is also one of the source operands) |
| detectedBy | CPUID instruction |
| differsFrom | FMA4 by using 3-operand encoding instead of 4-operand |
| effectOnPower | can improve energy efficiency per floating-point operation |
| enables | higher throughput fused multiply-add pipelines |
| featureFlag | FMA |
| firstSupportedBy | AMD Piledriver microarchitecture; Intel Haswell microarchitecture; Intel Xeon E5 v3 series |
| fullName | Fused Multiply-Add 3-operand |
| improves | floating-point efficiency; floating-point performance |
| introducedBy | Intel |
| operationCount | 3-operand form; the computation involves four values (one destination, two multiplicands, one addend), with the destination overwriting one of the three inputs |
| optimizationTarget | compiler auto-vectorization; hand-tuned assembly code |
| primaryOperation | fused multiply-add; fused multiply-subtract |
| reduces | instruction count for multiply-add sequences; rounding errors compared to separate multiply and add |
| registerType | XMM registers; YMM registers |
| relatedTo | FMA4 |
| requires | support in both CPU and software toolchain |
| roundingBehavior | single rounding for multiply-add sequence |
| standardizedIn | Intel 64 and IA-32 Architectures Software Developer’s Manual |
| status | widely supported in modern x86-64 processors |
| usedIn | high-performance computing applications; optimized math libraries; vectorized numerical kernels |
| usesEncoding | VEX prefix |
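The detection entries in the table (CPUID instruction, leaf EAX=1, ECX bit 12) can be sketched in C as follows. This assumes a GCC- or Clang-compatible compiler, whose `<cpuid.h>` header provides `__get_cpuid`; other toolchains expose CPUID differently. The helper names `ecx_has_fma` and `cpu_reports_fma` are hypothetical. The sketch only reads the feature flag. Because FMA3 instructions are VEX-encoded and can operate on YMM registers, a complete check would likely also confirm that the operating system has enabled AVX register state.

```c
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>  /* GCC/Clang helper header (an assumption of this sketch) */
#endif

/* Pure bit test: the FMA feature flag is ECX bit 12 of CPUID leaf 1. */
int ecx_has_fma(uint32_t ecx) {
    return (int)((ecx >> 12) & 1u);
}

/* Returns 1 if the CPU reports the FMA feature flag, otherwise 0.
   On non-x86 targets this always returns 0. */
int cpu_reports_fma(void) {
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    /* __get_cpuid returns 0 when the requested leaf is unavailable. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    return ecx_has_fma(ecx);
#else
    return 0;
#endif
}
```

Keeping the bit test separate from the CPUID query makes the flag logic checkable on any machine, regardless of whether the host CPU supports FMA3.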
Referenced by (2)
Full triples — surface form annotated when it differs from this entity's canonical label.