Locara

Mac Hardware Lineup — Every Variant, RAM, Bandwidth, Model-Size Fit

What this is: A reference inventory of every Mac model variant from the late-Intel era (2015) through M5 (early 2026), keyed on the numbers that actually decide what local LLMs can run on it: memory capacity, memory bandwidth, memory bus width, and GPU/Neural-Engine compute class.

Why it matters: Locara apps declare device-class requirements in their manifest; the runtime computes the user’s tier and matches. To do that honestly we need a per-SKU table: not just “M3 Max” but “M3 Max 14-core / 384-bit / 300 GB/s” vs. “M3 Max 16-core / 512-bit / 400 GB/s”. The numbers also let us publish an honest “expected tok/s” alongside each device card.

Most relevant to Locara: Pairs with chip-fundamentals.md (why bandwidth is the LLM number) and modern-chip-landscape.md (cross-vendor 2026 snapshot). Also pairs with llm-memory-math.md for the formulas that turn “X GB / Y GB/s” into “model Z at Q4 runs at N tok/s”.

Caveat: Apple-published memory bandwidth is peak theoretical, derived from LPDDR transfer rate × bus width. Real workloads typically achieve 70–85% of peak. The “Realistic local model” column assumes leaving ~25% of RAM for the OS and other apps, and uses Q4_K_M weights at 4–8K context (the chat-app default). Longer contexts raise memory use and lower throughput; see llm-memory-math.md.
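The arithmetic behind this caveat can be sketched in a few lines of Python. This is a minimal illustration, not Locara code: the function names are hypothetical, and the 70–85% utilization band and the 25% OS reserve are simply the figures quoted above.

```python
def peak_bandwidth_gbs(transfer_rate_mts, bus_width_bits):
    """Peak theoretical bandwidth in GB/s: transfers per second times bytes per transfer."""
    bytes_per_transfer = bus_width_bits / 8
    return transfer_rate_mts * 1e6 * bytes_per_transfer / 1e9


def sustained_range_gbs(peak_gbs, low=0.70, high=0.85):
    """Band of bandwidth a real workload might see, using the 70-85% rule of thumb."""
    return (peak_gbs * low, peak_gbs * high)


def usable_ram_gb(total_ram_gb, os_reserve=0.25):
    """RAM left for model weights and KV cache after reserving a share for the OS and apps."""
    return total_ram_gb * (1 - os_reserve)


if __name__ == "__main__":
    # LPDDR4X-4266 on a 128-bit bus (M1) and LPDDR5X-8533 on a 256-bit bus (M4 Pro)
    for name, rate, bus in [("M1", 4266, 128), ("M4 Pro", 8533, 256)]:
        peak = peak_bandwidth_gbs(rate, bus)
        print(name, round(peak, 1), sustained_range_gbs(peak), usable_ram_gb(16))
```

Plugging in the transfer rates and bus widths from the per-chip sections below reproduces the published figures to within rounding, which is a useful sanity check when adding a new SKU.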


Quick reference: device classes for local LLMs

| Tier | Bandwidth | RAM | Examples | Realistic local model (Q4_K_M, ≤8K context) |
| --- | --- | --- | --- | --- |
| 0: Sub-baseline | ≤60 GB/s | ≤16 GB | Late-Intel MBA / MBP 13” | 1–3B Q4 only; mostly historical |
| 1: Mobile baseline | ~68–120 GB/s | 8–32 GB | M1/M2/M3/M4 base (MBA, mini, base MBP, iMac) | 7B Q4 comfortably; 13B Q4 at the upper RAM tiers |
| 2: Pro mobile | 150–273 GB/s | 16–64 GB | M1/M2/M3/M4 Pro | 13B Q4 comfortably; 30–34B Q4 at 36+ GB |
| 3: Max mobile/desktop | 300–546 GB/s | 32–128 GB | M1/M2/M3/M4 Max (MBP 14”/16”, Mac Studio Max) | 70B Q4 at 64+ GB; Mixtral 8x7B Q4 at 36+ GB |
| 4: Ultra desktop | ~800 GB/s | 64–256 GB | M1/M2/M3 Ultra (Mac Studio, Mac Pro) | 70B FP16 / 100B+ Q4 / Mixtral 8x22B Q4 |
| 5: Frontier-local | ~800 GB/s | 512 GB | Mac Studio M3 Ultra 512 GB | Llama 3.1 405B Q4 (~250 GB) / DeepSeek-V3/R1 Q4 (~340 GB) |
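
One way the runtime could map a machine onto these tiers is a simple threshold function. The sketch below is an assumption-laden illustration: the function name is hypothetical, and the cut points (60, 150, 300, 700 GB/s and 512 GB) are placed in the gaps between the table's bands rather than taken from any Locara specification.

```python
def device_tier(bandwidth_gbs, ram_gb):
    """Map (peak bandwidth, RAM) onto the tier numbers in the table above.

    The cut points are assumptions chosen to fall between the table's
    bandwidth bands; they are not an official Locara definition.
    """
    if bandwidth_gbs <= 60:
        return 0  # late-Intel class
    if bandwidth_gbs < 150:
        return 1  # base M-series
    if bandwidth_gbs < 300:
        return 2  # Pro
    if bandwidth_gbs < 700:
        return 3  # Max
    # Ultra-class bandwidth: RAM capacity separates tier 4 from tier 5
    return 5 if ram_gb >= 512 else 4


if __name__ == "__main__":
    print(device_tier(68.25, 16))   # base M1
    print(device_tier(546, 128))    # full M4 Max
    print(device_tier(800, 512))    # M3 Ultra 512 GB
```

Keying the tier primarily on bandwidth and only secondarily on RAM reflects the table itself, where tiers 4 and 5 share a bandwidth figure and differ only in capacity.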

The same table inverted by chip family, with detail, is below.


Late-Intel era (2015 — 2020)

The defining traits of late-Intel Macs for LLM use:

  • Memory bandwidth was bottlenecked by 128-bit DDR3/LPDDR3/LPDDR4X buses at 25–60 GB/s — an order of magnitude below Apple Silicon.
  • AMD Radeon Pro / Vega discrete GPUs (15” MBP, iMac, Mac Pro, iMac Pro) had separate VRAM and no Metal Performance Shaders Graph path that current local LLM stacks target. ROCm is not available on macOS, and MLX targets Apple Silicon only.
  • Three late-Intel Macs reach 128 GB of RAM or more: the iMac 27” Mid-2020 (up to 128 GB SO-DIMM), the iMac Pro (up to 256 GB DDR4 ECC), and the Mac Pro 2019 (up to 1.5 TB DDR4 ECC). None has a usable accelerated LLM path; CPU-only inference on the Mac Pro runs at single-digit tok/s on 70B.

MacBook Air (Intel)

| Year | Chip | Cores | GPU | RAM (max) | DRAM | Bandwidth (GB/s) | Bus | Base price |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2015 (11” / 13”) | Core i5/i7 Broadwell | 2C/4T | Intel HD 6000 | 4–8 GB | LPDDR3-1600 | ~25.6 | 128-bit | $899 / $999 |
| 2017 (13”) | Core i5-5350U | 2C/4T | Intel HD 6000 | 8 GB | LPDDR3-1600 | ~25.6 | 128-bit | $999 |
| 2018 (Retina 13”) | Core i5-8210Y (Amber Lake-Y) | 2C/4T | Intel UHD 617 | 8–16 GB | LPDDR3-2133 | ~34.1 | 128-bit | $1,199 |
| 2019 (Retina 13”) | Core i5-8210Y | 2C/4T | Intel UHD 617 | 8–16 GB | LPDDR3-2133 | ~34.1 | 128-bit | $1,099 |
| 2020 (Retina 13”) | Core i3/i5/i7 Ice Lake | 2–4C | Iris Plus G4/G7 | 8–16 GB | LPDDR4X-3733 | ~59.7 | 128-bit | $999 |

LLM viability: Sub-3B Q4, slowly. Mostly historical interest.

MacBook Pro (Intel)

| Year | Chip | Cores | GPU | RAM (max) | DRAM | Bandwidth (GB/s) | Base price |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2015 (13” Retina) | Core i5/i7 Broadwell | 2C/4T | Iris 6100 | 16 GB | LPDDR3-1866 | ~29.8 | $1,299 |
| 2015 (15” Retina) | Core i7 Haswell (4770HQ/4870HQ/4980HQ) | 4C/8T | Iris Pro 5200 + opt. AMD R9 M370X/M390X (2 GB GDDR5) | 16 GB | DDR3L-1600 | ~25.6 | $1,999 |
| 2016/2017 (13”/15” Touch Bar) | Core i5/i7 Skylake/Kaby Lake | 2–4C | Iris 540/550 / 6700HQ + AMD Radeon Pro 450–560 | 16 GB | LPDDR3-2133 / DDR4-2400 | ~34–38 | $1,799 |
| 2018 (13”/15”) | Core i5/i7/i9 Coffee Lake | 4C/8T / 6C/12T | Iris Plus 655 + AMD Radeon Pro 555X–Vega 20 (HBM2) | 16–32 GB | LPDDR3 / DDR4-2400 | ~34–38 | $1,799 |
| 2019 (13”/15”) | Core i5/i7 Kaby Lake-R / Coffee Lake | 4C/8T / 6C/12T | Iris Plus 645 + Radeon Pro 555X/560X/Vega 16/20 | 16–32 GB | LPDDR3-2133 / DDR4-2400 | ~34–38 | $1,299 |
| 2019 (16”) | Core i7-9750H / i9-9880H / i9-9980HK Coffee Lake Refresh | 6C/12T / 8C/16T | Radeon Pro 5300M / 5500M (4–8 GB GDDR6) / 5600M (8 GB HBM2) | 16–64 GB | DDR4-2666 | ~42.7 | $2,399 |
| 2020 (13”) | Core i5/i7 Ice Lake | 4C/8T | Iris Plus G7 | 16–32 GB | LPDDR4X-3733 | ~59.7 | $1,299 |

LLM viability: The 16” 2019 MBP with 64 GB and a Radeon Pro 5600M is the only late-Intel laptop you might consider for ~13B Q4 on CPU. Real-world tok/s is roughly 1/3–1/4 of an M1 Max for the same model. The 5600M’s HBM2 is fast on paper but has no MLX path and no usable llama.cpp Metal acceleration to rival Apple Silicon.

Mac mini / iMac / iMac Pro / Mac Pro (Intel)

| Model | Year(s) | Chip | Cores | GPU | RAM (max) | DRAM | Bandwidth (GB/s) | Bus |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mac mini “late 2014” | 2014–2018 | Core i5/i7 Haswell | 2C/4T | Iris 5100 | 16 GB | DDR3L-1600 | ~25.6 | 128-bit |
| Mac mini 2018 | 2018–2023 | Core i3/i5/i7 Coffee Lake | 4–6C | UHD 630 | 8–64 GB SO-DIMM (user-upgradeable) | DDR4-2666 | ~42.7 | 128-bit |
| iMac 21.5” 4K | 2015/2017 | Core i5/i7 Skylake/Kaby Lake | 2–4C | Iris Pro 6200 / Radeon Pro 555/560 | 8–32 GB | DDR3 / DDR4-2400 | ~30–38 | 128-bit |
| iMac 27” 5K | 2015 | Core i5/i7 Skylake | 4C | R9 M380–M395X | 8–32 GB SO-DIMM | DDR3-1867 | ~29.9 | 128-bit |
| iMac 27” 5K | 2017 | Core i5/i7 Kaby Lake | 4C | Radeon Pro 570/575/580 (4–8 GB GDDR5) | 8–64 GB SO-DIMM | DDR4-2400 | ~38.4 | 128-bit |
| iMac 27” 5K | 2019 | Core i5/i9 Coffee Lake | 6–8C | Radeon Pro 570X–Vega 48 (HBM2 8 GB) | 8–64 GB SO-DIMM | DDR4-2666 | ~42.7 | 128-bit |
| iMac 27” 5K | 2020 | Core i5/i7/i9 Comet Lake (up to 10C) | 6–10C | Radeon Pro 5300/5500XT/5700XT (4–16 GB GDDR6) | 8–128 GB SO-DIMM | DDR4-2666 | ~42.7 | 128-bit |
| iMac Pro | 2017–2021 | Xeon W-2140B–W-2191B Skylake-W | 8–18C | Radeon Pro Vega 56–64X (8–16 GB HBM2) | 32–256 GB ECC | DDR4-2666 ECC | ~85.3 | 256-bit (4-ch) |
| Mac Pro “trash can” | 2013–2019 | Xeon E5 Ivy Bridge-EP | 4–12C | dual FirePro D300/D500/D700 | 12–64 GB | DDR3-1866 ECC | ~59.7 | 256-bit (4-ch) |
| Mac Pro “cheese grater” | 2019–2023 | Xeon W-3223–W-3275M Cascade Lake | 8–28C | Radeon Pro 580X / Vega II / W5700X / W6800X / W6900X | 32 GB–1.5 TB ECC RDIMM | DDR4-2933 ECC | ~140.8 | 384-bit (6-ch) |

LLM viability: Largely irrelevant for accelerated inference, but two anomalies worth knowing:

  • The Mac mini 2018 with 64 GB user-upgraded DDR4 is the cheapest Intel Mac that can hold a 30B Q4 model in RAM. Speed is awful (CPU-only inference, ~40 GB/s).
  • The Mac Pro 2019 with up to 1.5 TB is the only Intel Mac that exceeds today’s M3 Ultra (512 GB) in raw capacity; the 2020 iMac 27” tops out at 128 GB SO-DIMM. Either way, their bandwidth and accelerator stories make them losing propositions for LLMs.

M1 family (Nov 2020 — early 2023)

Every Apple Silicon Mac has soldered LPDDR unified memory on the SoC package. CPU, GPU, and Neural Engine address the same DRAM at full bandwidth — no PCIe, no copies, no separate VRAM. This is the structural property that makes Apple Silicon disproportionately good at LLM inference.

M1 — TSMC N5

  • CPU: 4P (Firestorm) + 4E (Icestorm) = 8 cores
  • GPU: 7- or 8-core Apple GPU
  • Neural Engine: 16 cores (~11 TOPS)
  • Memory: LPDDR4X-4266, 128-bit bus → ~68.25 GB/s
  • RAM options: 8 / 16 GB

| Product | Released | Discontinued | RAM | Base price |
| --- | --- | --- | --- | --- |
| MacBook Air M1 | Nov 2020 | Mar 2024 | 8 / 16 GB | $999 |
| MacBook Pro 13” M1 | Nov 2020 | Oct 2022 | 8 / 16 GB | $1,299 |
| Mac mini M1 | Nov 2020 | Jan 2023 | 8 / 16 GB | $699 |
| iMac 24” M1 | Apr 2021 | Oct 2023 | 8 / 16 GB | $1,299 (4-port) |

LLM viability: 7B Q4 (~4.5 GB weights) is the sweet spot on 16 GB. 8 GB is squeezed (3B Q4 comfortable). Expect ~15–20 tok/s on 7B Q4_K_M.

M1 Pro — TSMC N5

  • CPU: 6P+2E (binned, 8C) or 8P+2E (full, 10C), using the same Firestorm/Icestorm cores as M1
  • GPU: 14- or 16-core
  • Neural Engine: 16 cores
  • Memory: LPDDR5-6400, 256-bit bus → ~200 GB/s
  • RAM options: 16 / 32 GB

M1 Max — TSMC N5

  • CPU: 8P+2E (10 cores)
  • GPU: 24- or 32-core
  • Memory: LPDDR5-6400, 512-bit bus → ~400 GB/s
  • RAM options: 32 / 64 GB

| Product | Released | Discontinued | Chip options | RAM | Base price |
| --- | --- | --- | --- | --- | --- |
| MacBook Pro 14” (2021) | Oct 2021 | Jan 2023 | M1 Pro 8C/14C-GPU or 10C/16C; M1 Max 24C/32C-GPU | 16 / 32 / 64 GB | $1,999 |
| MacBook Pro 16” (2021) | Oct 2021 | Jan 2023 | M1 Pro 10C/16C; M1 Max 24C/32C-GPU | 16 / 32 / 64 GB | $2,499 |
| Mac Studio M1 Max | Mar 2022 | Jun 2023 | M1 Max 24C/32C-GPU | 32 / 64 GB | $1,999 |

M1 Ultra — TSMC N5, two M1 Max dies via UltraFusion

  • CPU: 16P+4E (20 cores)
  • GPU: 48- or 64-core
  • Neural Engine: 32 cores
  • Memory: LPDDR5-6400, 1024-bit bus → ~800 GB/s
  • RAM options: 64 / 128 GB

| Product | Released | Discontinued | Chip options | RAM | Base price |
| --- | --- | --- | --- | --- | --- |
| Mac Studio M1 Ultra | Mar 2022 | Jun 2023 | 48C or 64C-GPU | 64 / 128 GB | $3,999 |

LLM viability of the M1 family:

  • M1 Pro 32 GB: 13B Q4 at ~20–25 tok/s.
  • M1 Max 64 GB: 70B Q4 (tight at ~42 GB weights + KV cache + OS); ~7–9 tok/s.
  • M1 Ultra 128 GB: 70B Q4 comfortably at ~12–14 tok/s; Mixtral 8x7B Q4 with room.
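
The "tight" versus "comfortable" judgments above come down to a RAM budget. A minimal sketch of that check follows, assuming the ~25% OS reserve from the caveat at the top of this note; the function name and the 2 GB default KV-cache allowance are hypothetical placeholders, not measured values.

```python
def fits_in_ram(total_ram_gb, weights_gb, kv_cache_gb=2.0, os_reserve=0.25):
    """Return (fits, headroom_gb) for a model on a machine.

    os_reserve is the share of RAM left to the OS and other apps.
    kv_cache_gb is a placeholder allowance; the real value depends on
    the model architecture and context length.
    """
    budget_gb = total_ram_gb * (1 - os_reserve)
    needed_gb = weights_gb + kv_cache_gb
    return needed_gb <= budget_gb, budget_gb - needed_gb


if __name__ == "__main__":
    # 70B Q4 (~42 GB of weights) on M1 Max 64 GB and M1 Ultra 128 GB
    print(fits_in_ram(64, 42))
    print(fits_in_ram(128, 42))
```

With these inputs the 64 GB machine clears the bar with only a few GB of headroom, while the 128 GB machine has tens of GB to spare, which matches the "tight" and "comfortably" wording in the list above.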

M2 family (June 2022 — Oct 2024)

M2 — TSMC N5P

  • CPU: 4P+4E (8 cores)
  • GPU: 8- or 10-core
  • Memory: LPDDR5-6400, 128-bit → ~100 GB/s (up from M1’s 68 GB/s)
  • RAM options: 8 / 16 / 24 GB (24 GB tier was new for base)

| Product | Released | RAM | Base price |
| --- | --- | --- | --- |
| MacBook Air M2 13” | Jul 2022 | 8 / 16 / 24 GB | $1,199 |
| MacBook Pro 13” M2 | Jun 2022 | 8 / 16 / 24 GB | $1,299 |
| Mac mini M2 | Jan 2023 | 8 / 16 / 24 GB | $599 |
| MacBook Air 15” M2 | Jun 2023 | 8 / 16 / 24 GB | $1,299 |

M2 Pro — TSMC N5P

  • CPU: 6P+4E (10C) or 8P+4E (12C)
  • GPU: 16- or 19-core
  • Memory: LPDDR5-6400, 256-bit → ~200 GB/s (same as M1 Pro)
  • RAM options: 16 / 32 GB

M2 Max — TSMC N5P

  • CPU: 8P+4E (12C) — added 2 E-cores vs M1 Max
  • GPU: 30- or 38-core
  • Memory: LPDDR5-6400, 512-bit → ~400 GB/s (same as M1 Max)
  • RAM options: 32 / 64 / 96 GB (96 GB was new)

| Product | Released | Chip options | RAM | Base price |
| --- | --- | --- | --- | --- |
| MacBook Pro 14” (2023) | Jan 2023 | M2 Pro 10C/16C or 12C/19C; M2 Max 30C/38C-GPU | 16 / 32 / 64 / 96 GB | $1,999 |
| MacBook Pro 16” (2023) | Jan 2023 | M2 Pro 12C/19C; M2 Max 30C/38C-GPU | 16 / 32 / 64 / 96 GB | $2,499 |
| Mac mini M2 Pro | Jan 2023 | 10C/16C or 12C/19C | 16 / 32 GB | $1,299 |
| Mac Studio M2 Max | Jun 2023 | M2 Max 30C or 38C-GPU | 32 / 64 / 96 GB | $1,999 |

M2 Ultra — TSMC N5P, two M2 Max via UltraFusion

  • CPU: 16P+8E (24C)
  • GPU: 60- or 76-core
  • Neural Engine: 32 cores
  • Memory: LPDDR5-6400, 1024-bit → ~800 GB/s
  • RAM options: 64 / 128 / 192 GB (192 GB was the headline)

| Product | Released | Chip options | RAM | Base price |
| --- | --- | --- | --- | --- |
| Mac Studio M2 Ultra | Jun 2023 | 60C or 76C-GPU | 64 / 128 / 192 GB | $3,999 |
| Mac Pro M2 Ultra | Jun 2023 | 60C or 76C-GPU | 64 / 128 / 192 GB | $6,999 |

Mac Pro M2 Ultra is essentially a Mac Studio in a tower. Same chip, same RAM cap, same bandwidth. The only differentiator is PCIe expansion, and PCIe-attached GPUs cannot share unified memory — they have no MLX path and limited llama.cpp Metal path. For LLM work the Mac Pro M2 Ultra is a strict downgrade in value vs. the Mac Studio M2 Ultra.

LLM viability of the M2 family:

  • M2 base 24 GB: 13B Q4 just barely; 7B Q4 comfortably at ~22 tok/s.
  • M2 Pro 32 GB: 13B Q4 at ~30 tok/s; 30B Q4 squeezed.
  • M2 Max 96 GB: 70B Q4 at ~10–13 tok/s; Mixtral 8x7B Q4 comfortably.
  • M2 Ultra 192 GB: 70B FP16 (140 GB) at ~5–7 tok/s; Mixtral 8x22B Q4 (~80 GB) comfortably.
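
The weight sizes quoted in these lists follow from parameter count times bits per weight. The sketch below shows that arithmetic; the function name is hypothetical, and the ~4.85 bits per weight used for Q4_K_M is an assumed effective average, chosen here because it reproduces the ~42 GB figure this note uses for 70B.

```python
# Assumed effective bits per weight. FP16 is exact; the Q4_K_M value is an
# approximation picked to match the ~42 GB figure for 70B quoted in this note.
FP16_BPW = 16
Q4_K_M_BPW = 4.85


def weights_gb(params_billions, bits_per_weight):
    """Approximate weight footprint in GB: parameters x bits per weight / 8 bits per byte."""
    return params_billions * bits_per_weight / 8


if __name__ == "__main__":
    print(weights_gb(70, FP16_BPW))      # 70B at FP16
    print(weights_gb(70, Q4_K_M_BPW))    # 70B at Q4_K_M
    print(weights_gb(405, Q4_K_M_BPW))   # Llama 3.1 405B at Q4_K_M
```

This covers weights only. KV cache and runtime overhead come on top, which is why a model whose weights nominally fit can still be "squeezed" in practice.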

M3 family (Oct 2023 — Oct 2024; M3 Ultra arrived March 2025)

M3 — TSMC N3B (first-gen 3 nm)

  • CPU: 4P+4E (8C)
  • GPU: 8- or 10-core — first Apple GPU with hardware ray tracing and mesh shading
  • Memory: LPDDR5-6400, 128-bit → ~100 GB/s (unchanged from M2)
  • RAM options: 8 / 16 / 24 GB

| Product | Released | RAM | Base price |
| --- | --- | --- | --- |
| MacBook Pro 14” M3 | Nov 2023 | 8 / 16 / 24 GB | $1,599 |
| iMac 24” M3 | Nov 2023 | 8 / 16 / 24 GB | $1,299 |
| MacBook Air 13”/15” M3 | Mar 2024 | 8 / 16 / 24 GB | $1,099 / $1,299 |

M3 Pro — TSMC N3B (the controversial one)

  • CPU: 5P+6E (11C) or 6P+6E (12C) — Apple shifted toward more efficiency cores
  • GPU: 14- or 18-core
  • Memory: LPDDR5-6400, 192-bit → ~150 GB/s, down from M2 Pro’s 200 GB/s
  • RAM options: 18 / 36 GB — unusual numbers reflecting the narrower 192-bit bus

The M3 Pro is a regression for LLM users. A narrower memory bus (192-bit vs M2 Pro’s 256-bit) plus same LPDDR5 clock means ~25% less bandwidth. Documented extensively by Chips and Cheese (“Apple’s M3 Pro: A Step Sideways”, Nov 2023), Vadim Yuryev (MaxTech), and r/LocalLLaMA community measurements. The Apple-side rationale (per supply-chain reporting) was N3B yield economics — cutting bus width saves die area. M4 Pro restored and then improved the bus.

M3 Max — TSMC N3B (two distinct bandwidth tiers under one name)

The M3 Max shipped in two memory configurations, depending on which CPU bin you got:

| M3 Max variant | CPU | GPU | RAM options | Bandwidth | Bus |
| --- | --- | --- | --- | --- | --- |
| Binned (14-core) | 10P+4E | 30-core | 36 / 96 GB | ~300 GB/s | 384-bit |
| Full (16-core) | 12P+4E | 40-core | 48 / 64 / 128 GB | ~400 GB/s | 512-bit |

Same chip name, different memory subsystems. Buyers who ordered 36 GB or 96 GB automatically got the binned (slower) variant; 48 GB, 64 GB, or 128 GB orders got the full variant.

| Product | Released | Chip options | RAM | Base price |
| --- | --- | --- | --- | --- |
| MacBook Pro 14” M3 Pro/Max | Nov 2023 | Pro 11C/14C-GPU or 12C/18C-GPU; Max 14C/30C-GPU or 16C/40C-GPU | 18 / 36 / 48 / 64 / 96 / 128 GB | $1,999 |
| MacBook Pro 16” M3 Pro/Max | Nov 2023 | Same | 18 / 36 / 48 / 64 / 96 / 128 GB | $2,499 |

M3 Ultra — TSMC N3B, two M3 Max via UltraFusion (March 2025)

Apple skipped an M3 Ultra in the original M3 lineup and released it only in March 2025 — after M4 had already shipped in iPads and Macs. The headline was the 512 GB unified memory option, exclusive to M3 Ultra; M4 has no Ultra tier as of this writing.

  • CPU: 24P+8E (32C) — two full M3 Max dies
  • GPU: 60- or 80-core
  • Neural Engine: 32 cores
  • Memory: LPDDR5-6400, 1024-bit → ~800 GB/s
  • RAM options: 96 / 256 / 512 GB

| Product | Released | Chip options | RAM | Base price |
| --- | --- | --- | --- | --- |
| Mac Studio M3 Ultra | Mar 2025 | 60C-GPU (96/256 GB) or 80C-GPU (96/256/512 GB) | 96 / 256 / 512 GB | $3,999 / $5,499 (80C base) / ~$9,500+ (512 GB) |

The 512 GB Mac Studio M3 Ultra is the highest-RAM consumer machine ever sold by any vendor. No PC platform, including AMD Strix Halo (128 GB cap) or any single-GPU rig (RTX 5090 caps at 32 GB GDDR7), comes close at consumer pricing. Reportedly runs Llama 3.1 405B Q4 (~250 GB) at ~2 tok/s and DeepSeek-V3/R1 Q4 (~340 GB) at usable interactive speeds for a single user.

LLM viability of the M3 family:

  • M3 base 24 GB: 7B Q4 at ~20–25 tok/s; 13B Q4 tight.
  • M3 Pro 36 GB: regressed vs M2 Pro on bandwidth-bound generation; ~20–25 tok/s on 13B Q4, where M2 Pro hits 25–30.
  • M3 Max 14C/36 GB: ~50–60 tok/s on 7B Q4; 30B Q4 fits with room.
  • M3 Max 16C/128 GB: 70B Q4 at ~10–14 tok/s.
  • M3 Ultra 256 GB: 70B FP16 at ~7–9 tok/s; Mixtral 8x22B Q4 with massive headroom.
  • M3 Ultra 512 GB: only consumer device that runs Llama 3.1 405B Q4 or DeepSeek-V3 Q4 in unified memory.

M4 family (May 2024 iPad Pro; Oct 2024 Macs)

The M4 broke from N3B and uses TSMC N3E — a more mature 3 nm variant. Memory moved to LPDDR5X (8533 MT/s vs M1–M3’s LPDDR5 at 6400), which is the headline bandwidth driver.

M4 — TSMC N3E

  • CPU: 4P+6E (10C) — added 2 E-cores
  • GPU: 8- or 10-core (Dynamic Caching + hardware RT/mesh shading carried over from M3)
  • Neural Engine: 16 cores (uplifted throughput, Apple quotes ~38 TOPS)
  • Memory: LPDDR5X, 128-bit → ~120 GB/s
  • RAM options: 16 / 24 / 32 GB (8 GB base finally retired)

| Product | Released | RAM | Base price |
| --- | --- | --- | --- |
| iPad Pro M4 | May 2024 | 8 / 16 GB | (not listed) |
| MacBook Pro 14” M4 | Oct 2024 | 16 / 24 / 32 GB | $1,599 |
| iMac 24” M4 | Oct 2024 | 16 / 24 / 32 GB | $1,299 |
| Mac mini M4 | Oct 2024 | 16 / 24 / 32 GB | $599 |
| MacBook Air 13”/15” M4 | Mar 2025 | 16 / 24 / 32 GB | $999 / $1,199 |

M4 Pro — TSMC N3E (memory bandwidth restored, then some)

After the M3 Pro controversy, Apple widened the bus and moved to LPDDR5X:

  • CPU: 8P+4E (12C) or 10P+4E (14C)
  • GPU: 16- or 20-core
  • Memory: LPDDR5X-8533, 256-bit → ~273 GB/s (up from M3 Pro’s 150 GB/s, also up from M2 Pro’s 200 GB/s)
  • RAM options: 24 / 48 / 64 GB

M4 Max — TSMC N3E (two bandwidth tiers again)

| M4 Max variant | CPU | GPU | RAM options | Bandwidth | Bus |
| --- | --- | --- | --- | --- | --- |
| Binned (14-core) | 10P+4E | 32-core | 36 GB only | ~410 GB/s | 384-bit |
| Full (16-core) | 12P+4E | 40-core | 48 / 64 / 128 GB | ~546 GB/s | 512-bit |

The full M4 Max at 546 GB/s is the largest absolute generational bandwidth jump in M-series history: +146 GB/s, or ~37%, over the full M3 Max.
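
Because both Max generations split on the CPU bin, a lookup keyed on chip name plus performance-core count is enough to tell the variants apart. The sketch below encodes the two variant tables from this note; the dictionary and function names are hypothetical.

```python
# (chip name, performance-core count) -> memory subsystem, taken from the
# M3 Max and M4 Max variant tables in this note.
MAX_VARIANTS = {
    ("M3 Max", 10): {"bus_bits": 384, "bandwidth_gbs": 300},
    ("M3 Max", 12): {"bus_bits": 512, "bandwidth_gbs": 400},
    ("M4 Max", 10): {"bus_bits": 384, "bandwidth_gbs": 410},
    ("M4 Max", 12): {"bus_bits": 512, "bandwidth_gbs": 546},
}


def max_variant(chip, p_cores):
    """Return the memory subsystem for a Max-tier chip, or None if the pair is unknown."""
    return MAX_VARIANTS.get((chip, p_cores))


if __name__ == "__main__":
    binned = max_variant("M4 Max", 10)
    full = max_variant("M4 Max", 12)
    print(binned, full)
```

Returning None for an unknown pair, rather than guessing, lets the caller fall back to a coarser tier instead of publishing a wrong bandwidth figure.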

| Product | Released | Chip options | RAM | Base price |
| --- | --- | --- | --- | --- |
| MacBook Pro 14” M4 Pro/Max | Oct 2024 | Pro 12C/16C-GPU or 14C/20C; Max 14C/32C-GPU or 16C/40C | 24 / 36 / 48 / 64 / 128 GB | $1,999 |
| MacBook Pro 16” M4 Pro/Max | Oct 2024 | Same | 24 / 48 / 64 / 128 GB | $2,499 |
| Mac mini M4 Pro | Oct 2024 | Pro 12C/16C or 14C/20C | 24 / 48 / 64 GB | $1,399 |
| Mac Studio M4 Max | Mar 2025 | Max 14C/32C-GPU or 16C/40C | 36 / 48 / 64 / 128 GB | $1,999 |

M4 Ultra — does not ship as of early 2026

When Apple refreshed Mac Studio in March 2025, they paired the M4 Max with the M3 Ultra — a deliberate split-generation product. Per Bloomberg’s Mark Gurman and multiple supply-chain leaks, the M4 Max die does not include the UltraFusion interconnect needed to fuse two dies. Apple appears to have decided early in M4’s design that the Ultra tier would be skipped; the 512 GB slot is held by M3 Ultra until M5 Ultra (if any) arrives.

Note: the existing modern-chip-landscape.md refers to an “M4 Ultra in Mac Studio” — that’s incorrect. The Mac Studio Ultra ships M3 Ultra, not M4 Ultra. Worth correcting next time that note is revised.

LLM viability of the M4 family:

  • M4 base 32 GB: 13B Q4 comfortably; 7B Q4 at ~25–30 tok/s.
  • M4 Pro 64 GB: 30B Q4 comfortably; 70B Q4 squeezed; ~35–40 tok/s on 13B Q4.
  • M4 Max 128 GB: 70B Q4 at ~14–18 tok/s (best mobile-Mac numbers).

M5 family (announced/launched late 2025 — early 2026)

Confidence: medium for shipped parts, low for unannounced.

M5 — TSMC N3P

The marquee M5 architectural change: neural accelerators inside each GPU core — Apple’s name for matrix-multiply hardware embedded in the GPU itself, conceptually similar to NVIDIA Tensor Cores. This means MLX and llama.cpp Metal-backend inference (already GPU-routed) gets a substantial uplift on prompt-processing without the Neural Engine’s operator constraints.

  • CPU: 4P+6E (refined cores)
  • GPU: new-architecture with embedded neural accelerators
  • Memory: LPDDR5X-9600, 128-bit → ~150 GB/s
  • RAM options: 16 / 24 / 32 GB

| Product | Released | RAM | Notes |
| --- | --- | --- | --- |
| MacBook Pro 14” M5 | Oct 2025 | 16 / 24 / 32 GB | |
| iPad Pro M5 | Oct 2025 | 12 / 16 GB | |
| MacBook Air M5 | expected Spring 2026 | 16 / 24 / 32 GB | |

M5 Pro / Max / Ultra — not confirmed at time of writing

Rumored for spring/summer 2026 with continued bandwidth uplift. Treat any specific numbers as speculation. If an M5 Ultra arrives with LPDDR5X-9600 on the same 1024-bit bus, that is an automatic ~50% bandwidth jump over M3 Ultra, to ~1.2 TB/s, and the first Ultra-tier bandwidth movement since the M1 Ultra.

Early M5 LLM impact: community benchmarks suggest ~2–4× prompt-processing throughput vs M4, far more than the bandwidth gain alone would explain (the GPU-embedded matmul units carry the load). Decode, which is bandwidth-bound, sees a more modest ~25% uplift, matching the bandwidth gain from ~120 to ~150 GB/s.


Notable surprises and learnings

1. The M3 Pro bandwidth regression was a real product mistake. 150 GB/s vs M2 Pro’s 200 GB/s. The unusual 18 / 36 GB capacities exist because of the 192-bit bus (six DRAM channels vs four wider ones). M4 Pro’s 273 GB/s on 256-bit LPDDR5X is the correction. If a user has an M3 Pro, expect them to be measurably slower than the equivalent M2 Pro on bandwidth-bound LLM decode — surprising but real.

2. Both M3 Max and M4 Max ship in two bandwidth tiers under one name. The 14-core CPU bin (36 GB, plus 96 GB on M3 Max) is the binned die with a 384-bit bus and lower bandwidth. The 16-core bin (48, 64, or 128 GB) is the full die with a 512-bit bus. The bandwidth delta (300 vs 400 GB/s on M3, 410 vs 546 GB/s on M4) is material for LLM inference. Locara device cards should distinguish these explicitly.

3. Ultra-tier bandwidth has been flat at ~800 GB/s since the M1 Ultra. M1, M2, and M3 Ultra all sit at ~800 GB/s on 1024-bit LPDDR5-6400, and M4 has no Ultra at all. Apple has not moved the Ultra tier to LPDDR5X, so its bandwidth comes only from die fusion, not from memory speed. An M5 Ultra with LPDDR5X-9600 on the same bus would be the first real Ultra-tier bandwidth jump.

4. M3 Ultra 512 GB is the consumer-hardware capacity ceiling. No other shipping consumer machine — AMD Strix Halo (128 GB cap), any single-GPU rig (RTX 5090 caps at 32 GB GDDR7) — comes close at consumer pricing. The next viable option above 512 GB is a workstation with multiple GPUs or a server platform, both 5–20× the cost. This is genuinely a one-of-a-kind product as of early 2026.

5. The Intel Mac Pro 2019 still has a higher RAM ceiling than any Apple Silicon Mac. Intel Mac Pro: 1.5 TB DDR4 ECC, ~140 GB/s. M3 Ultra: 512 GB LPDDR5, ~800 GB/s. Apple Silicon traded raw capacity for ~6× bandwidth and unified addressing, which is the structurally correct call for what matters in inference.

6. Apple’s published bandwidth is peak theoretical. Real LLM inference typically achieves 70–85% of peak. Vadim Yuryev (MaxTech) and the now-defunct AnandTech both consistently noted this in their reviews. Use the published numbers as upper bounds.

7. The Neural Engine has been remarkably static through M4. 16 cores from M1 through M4 (32 on Ultras). Quoted throughput for the whole engine has improved (~11 → ~38 TOPS), but local-LLM stacks (MLX, llama.cpp Metal) bypass the Neural Engine entirely and target the GPU, because the Neural Engine has tight operator-support constraints. M5’s GPU-embedded neural accelerators are the first real architectural change that local LLM stacks can use directly.

8. There is no Apple Silicon Mac with user-upgradeable RAM. Every M-series Mac has soldered LPDDR. The Intel-era Mac mini 2018, iMac 27” 2020, and Mac Pro 2019 are the last user-upgradeable-RAM Macs. Users who under-buy RAM at purchase time are stuck for the life of the machine — this is the single most common Locara-relevant deployment mistake. App manifests need to fail loud and early when the user’s machine is undersized.

9. The Mac Pro M2 Ultra is effectively obsolete for LLM work. Same chip, same RAM cap (192 GB), same bandwidth as the Mac Studio M2 Ultra; PCIe expansion can’t host LLM-relevant GPUs (no MLX, weak Metal path). A Mac Studio M2 Ultra at $3,999 strictly dominates a Mac Pro M2 Ultra at $6,999 for local AI.

10. iPad Pro M4 has more memory bandwidth than every Intel Mac except the 2019 Mac Pro. Its ~120 GB/s trails the Intel Mac Pro’s ~140 GB/s on paper, but with unified memory and MLX support an iPad Pro M4 with 16 GB runs 7–13B Q4 models faster than any pre-2020 Intel Mac, in a tablet. Same chip as MacBook Air M4 16 GB. The mobile-class compute ceiling has moved.
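
Several of the points above (the M3 Pro regression, the peak-versus-real gap) rest on the same bandwidth-bound model of decode: for a dense model at batch size 1, each generated token reads roughly all the weights once. A minimal sketch under that assumption follows; the function name is hypothetical, the 0.75 default is the midpoint rule of thumb from this note, and the 8 GB weight size in the example is an illustrative placeholder. Mixture-of-experts models read only the active experts per token, so this simple form does not apply to them directly.

```python
def decode_tok_per_s(peak_bandwidth_gbs, weights_gb, utilization=0.75):
    """Rough bandwidth-bound decode estimate for a dense model at batch size 1.

    Assumes every generated token streams all weights from memory once and
    that the workload sustains `utilization` of peak bandwidth.
    """
    return peak_bandwidth_gbs * utilization / weights_gb


if __name__ == "__main__":
    weights = 8.0  # placeholder weight size in GB for the comparison
    m2_pro = decode_tok_per_s(200, weights)
    m3_pro = decode_tok_per_s(150, weights)
    # The ratio is independent of the model: it is just the bandwidth ratio.
    print(m3_pro / m2_pro)
```

The absolute output depends heavily on the real weight size and the utilization a given stack achieves, so treat it as a rough guide rather than a substitute for measured tok/s. The ratio between two chips is the more robust part of the estimate.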


Specific learnings for Locara

  1. Manifest device-class targeting needs both RAM and bandwidth. A 96 GB M2 Max (~400 GB/s) and a 96 GB M3 Max-14C (~300 GB/s) have very different LLM performance even though they share a RAM tier. The manifest schema should accept either a coarse tier (“Tier 3”) or a specific (RAM, bandwidth) pair, with the runtime computing the user’s actual numbers via sysctl hw.memsize and a chip-keyed bandwidth lookup table.

  2. Distinguish M3 Max 14C from M3 Max 16C, and M4 Max 14C from M4 Max 16C. Same chip name, different bandwidth. Locara’s device detection should read sysctl machdep.cpu.brand_string and sysctl hw.perflevel0.physicalcpu to disambiguate, then cross-reference with this note’s table.

  3. The M3 Pro regression is a real datapoint for the manifest. A 36 GB M3 Pro is slower on bandwidth-bound LLM decode than a 32 GB M2 Pro despite more RAM. App authors should be steered toward setting a bandwidth floor, not just a RAM floor, if their app does long-form generation.

  4. Soldered RAM means hardware tier is a permanent property of the user, not a runtime variable. This is unlike a PC where the user can add RAM. Locara should remember the user’s machine tier as a persistent profile attribute and warn at install time, not at runtime, when an app exceeds that tier.

  5. The Mac Studio Ultra is the LLM-first hardware product Apple ships. Locara’s “what’s the most ambitious app you can ship?” question is bounded by what Mac Studio M3 Ultra 512 GB can run — DeepSeek-V3-Q4 (~340 GB) is the current ceiling. Apps targeting “the most demanding user” should be built knowing this is the platform.

  6. Mac mini M4 is the right value-tier deployment target. $599 base, 16 GB RAM, ~120 GB/s, runs 7B Q4 well. Locara’s reference apps and onboarding flow should be tuned to “the Mac mini M4 user,” because that’s the price/capability sweet spot for new local-AI adopters.

  7. MacBook Air remains the volume-weighted target. Anyone buying a Mac for casual use buys an Air. M1 16 GB, M2 16/24 GB, M3 16/24 GB, M4 16/24/32 GB Airs collectively dominate the install base. Apps targeting “the median Locara user” must work well at 16 GB / ~100–120 GB/s — that’s the binding constraint for v1 reference apps.

  8. Don’t trust the “GB/s” Apple publishes as a real performance number. Multiply by ~0.75 utilization for honest tok/s estimates. The full formula is in llm-memory-math.md.

  9. Treat Intel Macs as out-of-scope for v1. Even the iMac Pro and Mac Pro 2019 are 1/3 to 1/5 the speed of an entry M-series Mac on LLM inference, with no MLX path. Locara v1 should refuse to install on Intel Macs with a clear “your Mac doesn’t have the unified-memory architecture this app requires” message.

  10. A bandwidth-keyed model picker is the right manifest primitive. Given the user’s measured bandwidth, the runtime can publish “Llama 3 8B Q4 at expected ~X tok/s on your Mac” as the install-time gate. Honesty about expected performance is the LSB of trust.
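
Items 1–3 above can be sketched together: read the three sysctl keys named in this note, look up bandwidth by chip and performance-core count, then test a manifest's RAM and bandwidth floors. Everything below is a hypothetical illustration rather than Locara's implementation. The table is a small excerpt of this note's figures, the brand string is assumed to read like "Apple M3 Max", and the sysctl reader is injectable so the logic can be exercised off-device.

```python
import subprocess


def read_sysctl(key):
    """Read one sysctl value as text. Only works on macOS."""
    result = subprocess.run(
        ["sysctl", "-n", key], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


# Hypothetical excerpt of a chip-keyed table: (brand string, P-core count) -> peak GB/s.
# The brand-string format ("Apple M3 Max") is an assumption.
BANDWIDTH_GBS = {
    ("Apple M2 Pro", 6): 200, ("Apple M2 Pro", 8): 200,
    ("Apple M3 Pro", 5): 150, ("Apple M3 Pro", 6): 150,
    ("Apple M3 Max", 10): 300, ("Apple M3 Max", 12): 400,
    ("Apple M4 Max", 10): 410, ("Apple M4 Max", 12): 546,
}


def detect_device(read=read_sysctl):
    """Build a device profile from sysctl values; `read` can be swapped out for testing."""
    chip = read("machdep.cpu.brand_string")
    p_cores = int(read("hw.perflevel0.physicalcpu"))
    ram_gb = int(read("hw.memsize")) / 2**30  # hw.memsize reports bytes
    return {
        "chip": chip,
        "p_cores": p_cores,
        "ram_gb": ram_gb,
        "bandwidth_gbs": BANDWIDTH_GBS.get((chip, p_cores)),  # None if unknown
    }


def meets_requirement(device, min_ram_gb, min_bandwidth_gbs):
    """True only if the device clears both floors; unknown bandwidth fails closed."""
    bandwidth = device["bandwidth_gbs"]
    return (
        device["ram_gb"] >= min_ram_gb
        and bandwidth is not None
        and bandwidth >= min_bandwidth_gbs
    )


if __name__ == "__main__":
    # Stand-in for a 36 GB M3 Pro, so the sketch runs without calling sysctl.
    fake = {
        "machdep.cpu.brand_string": "Apple M3 Pro",
        "hw.perflevel0.physicalcpu": "6",
        "hw.memsize": str(36 * 2**30),
    }
    device = detect_device(fake.__getitem__)
    print(device, meets_requirement(device, 32, 200))
```

Failing closed on an unknown chip is a deliberate choice in this sketch: it matches the "fail loud and early" stance elsewhere in this note, at the cost of needing the table updated for each new SKU.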


References

Apple primary sources (tech specs and announcements):

  • Apple tech specs archive (every model): https://support.apple.com/specs/
  • Mac Studio: https://www.apple.com/mac-studio/specs/
  • MacBook Pro: https://www.apple.com/macbook-pro/specs/
  • MacBook Air: https://www.apple.com/macbook-air/specs/
  • Mac mini: https://www.apple.com/mac-mini/specs/
  • iMac: https://www.apple.com/imac/specs/
  • Apple Newsroom (launch press releases with confirmed dates and prices): https://www.apple.com/newsroom/
  • M3 Ultra announcement (Mar 2025): https://www.apple.com/newsroom/2025/03/apple-reveals-m3-ultra-taking-apple-silicon-to-a-new-extreme/

Microarchitecture deep dives:

  • AnandTech (Andrei Frumusanu, Ryan Smith) — M1 / M1 Pro / M1 Max die analyses (Oct 2021) and A14/A15/A16 Firestorm-Avalanche-Everest core work. Site ceased active publication August 2024; archives still at anandtech.com.
  • Chips and Cheese (https://chipsandcheese.com) — “Apple’s M3 Pro: A Step Sideways” (Nov 2023) documented the M3 Pro bandwidth regression with measured numbers. Multiple M2 Max / M4 Max deep dives.
  • SemiAnalysis (Dylan Patel) — TSMC N3B vs N3E yield economics, Apple’s process-node transition timing.
  • TechInsights — die-shot analyses for each M-series generation.
  • Hot Chips conference papers — Apple has presented some M-series details (e.g., M1 at Hot Chips 33).

Reviews and measured performance:

  • The Verge (Nilay Patel, Monica Chin) — reviews for every major Mac launch since 2015.
  • Ars Technica (Andrew Cunningham, Samuel Axon) — particularly strong on Mac mini and Mac Studio with real-workload focus.
  • Notebookcheck (https://notebookcheck.net) — benchmark database for every Mac with comparable scores.
  • MaxTech / Vadim Yuryev (YouTube) — consistent Geekbench Memory and llama.cpp benchmarks on every new Mac.
  • MKBHD / Marques Brownlee (YouTube) — spec breakdowns and side-by-side reviews.
  • AlexZiskind (YouTube) — the most rigorous Mac LLM benchmarking, with thermal and sustained-vs-burst breakdowns.

LLM-specific Mac benchmarks:

  • r/LocalLLaMA (https://reddit.com/r/LocalLLaMA) — single best aggregator for community-measured tok/s per Mac SKU. Search for specific chip names and quants.
  • llama.cpp issue tracker — Apple Silicon performance discussions and bandwidth-vs-tok/s charts: https://github.com/ggerganov/llama.cpp/issues
  • MLX repo + issues — Apple’s official LLM stack: https://github.com/ml-explore/mlx
  • Awni Hannun (MLX lead) — Twitter @awnihannun, blog posts on MLX benchmarks.
  • Simon Willison (simonwillison.net) — practical Mac LLM write-ups with measurements on M2 Max and M3 Ultra.

WWDC sessions:

  • WWDC 2020 “Explore the new system architecture of Apple silicon Macs” — first official UMA description.
  • WWDC 2023 / 2024 / 2025 Metal and ML sessions — MLX, Neural Engine, M5 GPU neural accelerators.

Source caveats:

  • “M4 Ultra exists” — does not as of early 2026. The Mac Studio (Mar 2025) pairs M4 Max with M3 Ultra. Earlier modern-chip-landscape.md reference to M4 Ultra in Mac Studio is incorrect.
  • M5 Pro / Max / Ultra specs — speculative; not shipping as of writing.
  • LPDDR5X clock for M4 — sources differ on whether base M4 uses 7500 or 8533 MT/s; Apple’s published 120 GB/s suggests a lower clock on base than on Pro/Max tiers.
  • Mac Pro M3 Ultra — has not shipped; Mac Pro remains on M2 Ultra as of writing.