Intel Broadwell

Intel i7-6900K (Broadwell), 4.0 GHz, 14 nm. RAM: 32 GB (type unknown).

Intel E5-2699 v4 (Broadwell), 3.6 GHz (Turbo Boost), 14 nm. RAM: 256 GB, PC4-2133.

Note: L2 cache latency may be 11 cycles in some cases, but a dependency-chain workload shows 12 cycles.
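
A dependency-chain workload is commonly built as a pointer chase: the address of each load comes from the result of the previous one, so loads cannot overlap and the time per step approximates load-to-use latency. The source does not show its benchmark code, so the following is only a minimal sketch of that access pattern; the names `make_cycle` and `chase` are hypothetical. In Python the interpreter overhead is far larger than any cache latency, so this illustrates the structure of the test and will not reproduce the cycle counts in the tables.

```python
import random
import time


def make_cycle(n, seed=0):
    """Build a permutation of range(n) that is one single cycle (Sattolo's algorithm)."""
    rng = random.Random(seed)
    nxt = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # j < i, which guarantees a single cycle
        nxt[i], nxt[j] = nxt[j], nxt[i]
    return nxt


def chase(nxt, steps, start=0):
    """Follow the chain: every index read depends on the previous read."""
    p = start
    for _ in range(steps):
        p = nxt[p]
    return p


if __name__ == "__main__":
    n = 1 << 16  # number of elements in the chain (assumed value)
    nxt = make_cycle(n)
    t0 = time.perf_counter()
    chase(nxt, n)
    elapsed = time.perf_counter() - t0
    print(f"{elapsed / n * 1e9:.1f} ns per step (interpreter overhead included)")
```

A random single cycle is used so that every element is visited once per lap and a hardware prefetcher cannot guess the next address. A native implementation would vary the working-set size to walk through the rows of the tables.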

1 GB pages

1 GB pages (E5-2699 v4, 3.6 GHz)

  Size        Latency       Increase   Description

  32 K     4                           
  64 K     8                       4   + 8 (L2)        
 128 K    10                       2   
 256 K    11                       1
 512 K    40                      29   + 53 (L3)
   1 M    53                      13
   2 M    59                       6
   4 M    62                       3
   8 M    63                       1   
  16 M    64                       1   
  32 M    65                       1   
  64 M    65 + 13 ns           13 ns
 128 M    65 + 45 ns           32 ns   + 75 ns (RAM)
 256 M    65 + 60 ns           15 ns
 512 M    65 + 68 ns            8 ns
1024 M    65 + 72 ns            4 ns
   1 G    65 + 74 ns            2 ns
   2 G    65 + 75 ns            1 ns
   4 G    65 + 75 ns      
   8 G    69 + 75 ns       4           + 8 (L1 TLB miss)
  16 G    71 + 75 ns       2
  32 G    76 + 75 ns       5           + 8 (L2 TLB miss)
  64 G    79 + 75 ns       3
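
Entries such as "65 + 75 ns" mix two units. Assuming the bare number counts core clock cycles and the "ns" part is a fixed time added on top, a row can be converted to a single figure in nanoseconds. The sketch below does this for a few E5-2699 v4 rows at the 3.6 GHz clock given in the heading; the function name `total_ns` is illustrative and not from the source.

```python
def total_ns(cycles: float, extra_ns: float, clock_ghz: float) -> float:
    """Convert 'cycles + ns' to nanoseconds (one cycle lasts 1 / clock_ghz ns)."""
    return cycles / clock_ghz + extra_ns


CLOCK_GHZ = 3.6  # E5-2699 v4 clock from the table heading

# (size, cycles, extra ns) copied from the E5-2699 v4 table
ROWS = [
    ("32 K", 4, 0),
    ("256 K", 11, 0),
    ("32 M", 65, 0),
    ("4 G", 65, 75),
    ("64 G", 79, 75),
]

for size, cycles, extra in ROWS:
    print(f"{size:>6}: {total_ns(cycles, extra, CLOCK_GHZ):6.2f} ns")
```

Under these assumptions the 4 G row (65 cycles + 75 ns) works out to roughly 93 ns in total.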

1 GB pages (i7-6900K, 4.0 GHz)

  Size        Latency       Increase   Description

  32 K     4                           
  64 K     8                       4   + 8 (L2)        
 128 K    10                       2   
 256 K    11                       1
 512 K    37                      26   + 47 (L3)
   1 M    48                      11
   2 M    54                       6
   4 M    56                       2
   8 M    58                       2   
  16 M    59                       1   
  32 M    59 + 15 ns           15 ns   + 46 ns (RAM)
  64 M    59 + 31 ns           16 ns
 128 M    59 + 39 ns            8 ns   
 256 M    59 + 42 ns            3 ns
 512 M    59 + 44 ns            2 ns
1024 M    59 + 45 ns            1 ns

2 MB pages mode (i7-6900K, 4.0 GHz)

  Size        Latency       Increase   Description

  32 K     4                           
  64 K     8                       4   + 8 (L2)        
 128 K    10                       2   
 256 K    11                       1
 512 K    37                      26   + 47 (L3)
   1 M    48                      11
   2 M    54                       6
   4 M    56                       2
   8 M    58                       2   
  16 M    59                       1   
  32 M    59 + 15 ns           15 ns   + 46 ns (RAM)
  64 M    59 + 31 ns           16 ns
 128 M    63 + 39 ns       4 +  8 ns   + 8 (TLB miss)
 256 M    65 + 42 ns       2 +  3 ns
 512 M    66 + 44 ns       1 +  2 ns
1024 M    67 + 45 ns       1 +  1 ns

4 KB pages mode (i7-6900K, 4.0 GHz)

  Size        Latency       Increase   Description

  32 K     4                           
  64 K     8                       4   + 8 (L2)        
 128 K    10                       2   
 256 K    11                       1
 512 K    41                      30   + 47 (L3) +8 (L1 TLB miss)
   1 M    54                      13
   2 M    61                       7   
   4 M    64                       3
   8 M    69 +                     5   + 14 (L2 TLB miss)
  16 M    80 +                    11   
  32 M    92 + 15 ns      12 + 15 ns   + 46 ns (RAM)
  64 M   104 + 31 ns      12 + 16 ns
 128 M   119 + 39 ns      15 +  8 ns   +  ? (PDE cache miss) + ? (Page walk to L3)
 256 M   133 + 42 ns      14 +  3 ns
 512 M   137 + 44 ns       4 +  2 ns
1024 M   140 + 45 ns       3 +  1 ns
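
The 4 KB table and the 1 GB table for the i7-6900K list the same nanosecond parts at every size. Assuming the 1 GB run is essentially free of TLB misses at these sizes, the difference between the cycle parts gives a rough estimate of the address-translation overhead with 4 KB pages. The sketch below computes that difference from the values in the two tables; the variable and function names are illustrative.

```python
SIZES = ["512 K", "1 M", "2 M", "4 M", "8 M", "16 M",
         "32 M", "64 M", "128 M", "256 M", "512 M", "1024 M"]

# Cycle parts only, copied from the i7-6900K tables (the ns parts are identical).
CYCLES_1GB = [37, 48, 54, 56, 58, 59, 59, 59, 59, 59, 59, 59]
CYCLES_4KB = [41, 54, 61, 64, 69, 80, 92, 104, 119, 133, 137, 140]

CLOCK_GHZ = 4.0  # i7-6900K clock from the table headings


def translation_overhead():
    """Return {size: extra cycles seen with 4 KB pages relative to 1 GB pages}."""
    return {s: small - huge for s, huge, small in zip(SIZES, CYCLES_1GB, CYCLES_4KB)}


for size, cycles in translation_overhead().items():
    print(f"{size:>7}: +{cycles:3d} cycles  (~{cycles / CLOCK_GHZ:5.2f} ns)")
```

At 1024 M the difference is 81 cycles, about 20 ns at 4.0 GHz, which is the same order as the listed RAM latency itself.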

MISC (i7-6900K, 4.0 GHz)

7-Zip Benchmark, E5-2699 v4

Notes:

7z b : MIPS values are normalized against an Intel Core 2 CPU.

7z b -mm=* : MIPS and Effectiveness values are normalized against an AMD K8 CPU.
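
The "R/U" column in the logs below appears to be the rating scaled to 100% CPU usage, so that Rating ≈ R/U × Usage / 100. This is a reading of the column names checked against the numbers, not a statement taken from the source. A small sketch that checks it against the four compression rows of the 2-thread `7z b` run; `rating_from_ru` is an illustrative name.

```python
def rating_from_ru(ru_mips: float, usage_percent: float) -> float:
    """Estimate the Rating column from R/U and Usage (assumed relation)."""
    return ru_mips * usage_percent / 100.0


# (usage %, R/U MIPS, Rating MIPS) from the 2-thread compression rows, dict 22..25
ROWS = [
    (148, 4519, 6707),
    (155, 4449, 6884),
    (164, 4305, 7049),
    (169, 4340, 7351),
]

for usage, ru, rating in ROWS:
    estimate = rating_from_ru(ru, usage)
    error = abs(estimate - rating) / rating
    print(f"usage {usage}%: estimated {estimate:7.0f}, reported {rating}, off by {error:.2%}")
```

The match is approximate because the printed columns are rounded. The "Tot:" line does not follow this per-row relation; in these logs it equals the mean of the compression and decompression "Avr:" values.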


## THP on, Turbo Boost on
# gcc-6 -O3, numactl -m0 -C0,44

7-Zip (a) [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C.UTF-8,Utf16=on,HugeFiles=on,64 bits,2 CPUs Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz (406F1),ASM,AES-NI)

Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz (406F1)
CPU Freq:  3508  3581  3586  3591  3580  3576  3585  3585  3582

RAM size:  257701 MB,  # CPU hardware threads:   2
RAM usage:    435 MB,  # Benchmark threads:      1

                       Compressing  |                  Decompressing
Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
         KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS

22:       5835   100   5677   5677  |      44735   100   3820   3820
23:       5566   100   5672   5672  |      44498   100   3852   3852
24:       4945   100   5317   5317  |      44198   100   3880   3880
25:       4471   100   5106   5106  |      43871   100   3905   3905
----------------------------------  | ------------------------------
Avr:             100   5443   5443  |              100   3864   3864
Tot:             100   4654   4653


RAM usage:    441 MB,  # Benchmark threads:      2

                       Compressing  |                  Decompressing
Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
         KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS

22:       6894   148   4519   6707  |      58948   200   2517   5033
23:       6756   155   4449   6884  |      58694   200   2541   5081
24:       6556   164   4305   7049  |      58221   200   2556   5111
25:       6438   169   4340   7351  |      58055   200   2585   5167
----------------------------------  | ------------------------------
Avr:             159   4403   6998  |              200   2550   5098
Tot:             180   3476   6048


# gcc-6 -O3, numactl -m0 -N0


RAM usage:   9707 MB,  # Benchmark threads:     44

                       Compressing  |                  Decompressing
Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
         KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS

22:      90304  3418   2571  87848  |    1039236  4375   2026  88615
23:      90950  3634   2550  92667  |    1029140  4380   2033  89056
24:      94059  3878   2608 101133  |     994296  4296   2032  87272
25:      93612  3945   2709 106883  |     976973  4283   2030  86944
----------------------------------  | ------------------------------
Avr:            3719   2610  97133  |             4333   2030  87972
Tot:            4026   2320  92552



# gcc-6 -O3, numactl -m0 -C0,44

7-Zip (a) [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C.UTF-8,Utf16=on,HugeFiles=on,64 bits,2 CPUs Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz (406F1),ASM,AES-NI)

Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz (406F1)
CPU Freq:  3573  3575  3594  3594  3593  3573  3585  3585  3582

RAM size:  257701 MB,  # CPU hardware threads:   2
RAM usage:    225 MB,  # Benchmark threads:      1


Method           Speed Usage    R/U Rating   E/U Effec
                 KiB/s     %   MIPS   MIPS     %     %

CPU                      100   3583   3583
CPU                      100   3584   3584
CPU                      100   3584   3584   100   100

LZMA:x1          16521   100   6040   6040   169   169
                 43898   100   3574   3574   100   100
LZMA:x5:mt1       4942   100   6174   6174   172   172
                 44286   100   3736   3735   104   104
LZMA:x5:mt2       6560   164   5011   8196   140   229
                 44300   100   3737   3737   104   104
Deflate:x1       46307   100   5880   5880   164   164
                163226   100   5072   5071   142   142
Deflate:x5       14020   100   5398   5398   151   151
                163634   100   5080   5080   142   142
Deflate:x7        6088   100   6746   6746   188   188
                164837   100   5115   5115   143   143
Deflate64:x5     12676   100   5478   5478   153   153
                163695   100   5120   5120   143   143
BZip2:x1          6796   100   4106   4106   115   115
                 35339   100   3831   3831   107   107
BZip2:x5          5935   100   4953   4953   138   138
                 27375   100   5373   5373   150   150
BZip2:x5:mt2      8136   196   3467   6790    97   189
                 37263   157   4651   7313   130   204
BZip2:x7          1932   100   5006   5006   140   140
                 27544   100   5402   5401   151   151
PPMD:x1           5164   100   5342   5342   149   149
                  4925   100   5800   5800   162   162
PPMD:x5           3984   100   6753   6753   188   188
                  3877   100   7267   7267   203   203
Delta:4        1150161   100   7067   7067   197   197
               1153741   100   7089   7089   198   198
BCJ            1926523   100   7891   7891   220   220
               1927397   100   7895   7895   220   220
AES256CBC:1     255908   100   6289   6289   175   175
                251665   100   6185   6185   173   173
AES256CBC:2     546318   100   4476   4475   125   125
               2957922   100   6058   6058   169   169
CRC32:1         433485   100   3156   3156    88    88
CRC32:4        1384223   100   3090   3090    86    86
CRC32:8        2797027   100   3793   3793   106   106
CRC64          1257086   100   2575   2575    72    72
SHA256          215597   100   4398   4398   123   123
SHA1            537316   100   5029   5029   140   140
BLAKE2sp        347677   100   7649   7649   213   213

CPU                      100   3584   3583
------------------------------------------------------
Tot:                     109   4906   5340   137   149


RAM usage:    450 MB,  # Benchmark threads:      2


Method           Speed Usage    R/U Rating   E/U Effec
                 KiB/s     %   MIPS   MIPS     %     %

CPU                      197   3336   6586
CPU                      197   3336   6586
CPU                      197   3335   6584   101   200

LZMA:x1          22125   200   4048   8088   123   246
                 60388   200   2462   4918    75   149
LZMA:x5:mt1       6177   200   3866   7718   117   234
                 60749   200   2565   5123    78   156
LZMA:x5:mt2       6567   200   4110   8205   125   249
                 60736   200   2565   5122    78   156
Deflate:x1       57616   199   3667   7316   111   222
                224691   200   3491   6981   106   212
Deflate:x5       18419   200   3550   7092   108   215
                226644   200   3521   7036   107   214
Deflate:x7        7810   200   4327   8654   131   263
                226932   200   3521   7042   107   214
Deflate64:x5     17146   200   3712   7409   113   225
                225738   200   3537   7062   107   214
BZip2:x1          9140   200   2763   5522    84   168
                 43470   200   2356   4712    72   143
BZip2:x5          8161   200   3409   6812   104   207
                 38428   200   3773   7542   115   229
BZip2:x5:mt2      8121   200   3391   6778   103   206
                 38480   200   3783   7553   115   229
BZip2:x7          2499   200   3240   6476    98   197
                 38516   200   3778   7553   115   229
PPMD:x1           6666   200   3450   6895   105   209
                  6117   200   3610   7204   110   219
PPMD:x5           5321   200   4516   9019   137   274
                  4901   200   4599   9186   140   279
Delta:4         961498   200   2957   5907    90   179
                965199   200   2966   5930    90   180
BCJ            2189993   200   4488   8970   136   272
               2189839   200   4489   8970   136   272
AES256CBC:1     277897   200   3417   6830   104   207
                272728   200   3351   6703   102   204
AES256CBC:2    1070154   200   4384   8767   133   266
               3662861   198   3790   7502   115   228
CRC32:1         847270   199   3106   6168    94   187
CRC32:4        2567146   196   2917   5730    89   174
CRC32:8        4435331   199   3020   6014    92   183
CRC64          2351631   200   2412   4816    73   146
SHA256          186392   200   1902   3802    58   115
SHA1            506192   200   2370   4738    72   144
BLAKE2sp        395829   199   4384   8708   133   265

CPU                      197   3335   6586
------------------------------------------------------
Tot:                     200   3343   6675   102   203


# gcc-6 -O3, numactl -m0 -N0


RAM usage:   9911 MB,  # Benchmark threads:     44


Method           Speed Usage    R/U Rating   E/U Effec
                 KiB/s     %   MIPS   MIPS     %     %

CPU                     4343   2597 112779
CPU                     4344   2596 112768
CPU                     4344   2596 112751   101  4400

LZMA:x1         372477  4281   3181 136166   124  5314
               1008688  4380   1876  82151    73  3206
LZMA:x5:mt1      88971  4324   2571 111150   100  4338
                993740  4345   1929  83801    75  3270
LZMA:x5:mt2      88261  4245   2598 110263   101  4303
                996445  4367   1924  84029    75  3279
Deflate:x1      990890  4282   2938 125819   115  4910
               3731935  4365   2657 115958   104  4525
Deflate:x5      316150  4317   2820 121725   110  4750
               3736465  4368   2656 116003   104  4527
Deflate:x7      129009  4350   3286 142939   128  5578
               3770240  4362   2682 117002   105  4566
Deflate64:x5    289565  4239   2952 125130   115  4883
               3747665  4349   2696 117239   105  4575
BZip2:x1        154893  4278   2187  93581    85  3652
                755385  4259   1923  81888    75  3196
BZip2:x5        104881  4255   2057  87530    80  3416
                515652  4146   2441 101210    95  3950
BZip2:x5:mt2    107039  4234   2110  89331    82  3486
                537785  4314   2447 105554    95  4119
BZip2:x7         39781  4311   2392 103064    93  4022
                520218  4156   2455 102017    96  3981
PPMD:x1         110605  4371   2617 114393   102  4464
                 99817  4364   2693 117545   105  4587
PPMD:x5          74388  4350   2898 126068   113  4920
                 68852  4349   2967 129025   116  5035
Delta:4       15778390  4377   2215  96942    86  3783
              15601756  4343   2207  95857    86  3741
BCJ           36444997  4351   3431 149279   134  5825
              36515401  4367   3426 149567   134  5837
AES256CBC:1    4566909  4391   2556 112236   100  4380
               4505531  4393   2521 110728    98  4321
AES256CBC:2   18313350  4386   3421 150023   133  5854
              61064148  4343   2880 125059   112  4880
CRC32:1       14501531  4367   2418 105571    94  4120
CRC32:4       43953675  4320   2271  98105    89  3828
CRC32:8       73568210  4380   2278  99758    89  3893
CRC64         40255920  4394   1876  82444    73  3217
SHA256         3103108  4394   1441  63303    56  2470
SHA1           8255731  4391   1760  77274    69  3016
BLAKE2sp       6346804  4329   3226 139630   126  5449

CPU                     4343   2597 112767
------------------------------------------------------
Tot:                    4320   2391 103205    93  4027
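
The "Tot:" ratings of the three `7z b` runs can be turned into a rough scaling figure. The sketch below divides each total by the 1-thread total and then by the thread count. The runs use different numactl bindings and the clock under full load may differ from the single-thread clock, so this is only an estimate. The source does not say whether logical CPUs 0 and 44 share a physical core. The names used are illustrative.

```python
# threads -> "Tot:" rating (MIPS) from the 7z b runs on the E5-2699 v4
TOTALS = {1: 4653, 2: 6048, 44: 92552}


def speedup(threads: int) -> float:
    """Rating relative to the 1-thread run."""
    return TOTALS[threads] / TOTALS[1]


def efficiency(threads: int) -> float:
    """Speedup per benchmark thread (1.0 would be perfect linear scaling)."""
    return speedup(threads) / threads


for t in sorted(TOTALS):
    print(f"{t:2d} threads: speedup {speedup(t):6.2f}, efficiency {efficiency(t):.2f}")
```

With these totals the 44-thread run reaches about 19.9 times the 1-thread rating, or about 0.45 per thread.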

Links

Broadwell at Wikipedia