Windows Templates
The Windows lines (Server 2025 / 2022 Datacenter Desktop Experience, Windows 11 Enterprise) build on the same pipeline as Linux but with a different install and provisioning path. This page documents that path and the gotchas that were expensive to find.
Build path
- Licensed media (manual). Windows ISOs are uploaded by hand to `vsanDatastore` under stable names (`iso/windows/.../windows-server-2025.iso`, etc.) so monthly media refreshes overwrite the same path without a config change. Configs live in `ci/config/windows-*.pkrvars.hcl`.
- GVLK product keys. Public KMS-client keys are committed in the configs (`vm_inst_os_key_*`). Activation happens against a KMS host at clone time, not during the build.
- `autounattend.xml` is rendered onto a `cidata` CD; the VMware Tools ISO comes from the ESXi product locker (`[] /vmimages/tools-isoimages/windows.iso`), so no staging is needed.
- WinRM 5985 is the Ansible connection; `win_updates` applies Security + Critical updates.
- vTPM. `windows-desktop-11` ships `vm_vtpm = true` → the VM gets a Virtual TPM, which requires a vCenter Native Key Provider (present; check with `govc kms.ls`).
- `--only` filter. Datacenter would otherwise build the `dexp` and `core` sources together; the matrix `only` field restricts each build to the Desktop-Experience source.
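The two manual prerequisites above (uploaded media and a key provider for vTPM lines) can be checked before a build starts. The following is a minimal sketch, not part of the repo: the function names are hypothetical, it assumes `govc` is on `PATH` with its connection environment already set, and it assumes `govc datastore.ls` exits non-zero for a missing path and that any non-empty `govc kms.ls` output means a provider is registered. The ISO path is passed in by the caller.

```shell
# Preflight sketch for a Windows build (bash). All function names are
# illustrative; nothing here exists in the repo.

# Succeeds if the given path exists on vsanDatastore.
# Assumption: govc datastore.ls fails when the path is missing.
iso_present() {
  local iso_path="$1"
  govc datastore.ls -ds vsanDatastore "$iso_path" >/dev/null 2>&1
}

# Succeeds if `govc kms.ls` prints anything at all.
# Assumption: non-empty output means a key provider is registered.
key_provider_present() {
  local out
  out="$(govc kms.ls 2>/dev/null)" || return 1
  [ -n "$out" ]
}

# Usage: preflight ISO_PATH [NEEDS_VTPM]
# The key-provider check only runs when NEEDS_VTPM is "true".
preflight() {
  local iso_path="$1" needs_vtpm="${2:-false}"
  iso_present "$iso_path" || { echo "missing ISO: $iso_path" >&2; return 1; }
  if [ "$needs_vtpm" = "true" ]; then
    key_provider_present || { echo "no key provider found" >&2; return 1; }
  fi
  echo "preflight ok"
}
```

For a line that sets `vm_vtpm = true`, this would be called as `preflight <iso path> true`; other lines can omit the second argument.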
Gotchas (in the order they bite)
These are all fixed in-repo now; the notes explain why, so a new Windows line doesn't re-discover them.
- `vm_inst_os_eval = false` is mandatory for licensed builds. The autounattend only writes the `<ProductKey>` (the GVLK) when this is `false`. It defaults to `true`, which drops the key, and Server 2025's redesigned Setup then stalls on the "Choose a licensing method" (Azure pay-as-you-go) screen forever, surfacing as "timeout waiting for IP". Set it in each `ci/config/windows-*.pkrvars.hcl`.
- The runner image must be pinned to an immutable digest. pywinrm is pip-installed into Ansible's venv as a post-`mise install` Docker layer. With a mutable `:latest` tag and no `imagePullPolicy`, ARC nodes can run a stale cached image whose venv lacks pywinrm → Ansible fails with "No module named 'winrm'". The image is digest-pinned in the talos-cluster helmrelease.
- `ansible_shell_type: cmd` (set in `ansible/windows-playbook.yml`). ansible-core 2.21 + pywinrm 0.5 default the WinRM shell to `powershell`, which mangles the `-EncodedCommand` payload ("not properly encoded") and fails at Gathering Facts. Ansible itself warns to use `cmd`.
- `win_updates` must survive the update reboot. The reboot drops WinRM (Connection refused) and the async task's WS-Man shell goes stale (InvalidSelectors). Mitigations (shared playbook + `base` role): raise WinRM timeouts (`operation_timeout_sec=120` / `read_timeout_sec=150`), a pre-`win_reboot` to clear pending reboots, and `reboot_timeout=3600` on `win_updates`.
- `ip_wait_timeout` headroom. Windows reports its IP only after install → first-logon VMware Tools, well past Linux's 20 min. The windows configs override `common_ip_wait_timeout = "60m"`.
- `ansible-galaxy` can hit a transient "Network is unreachable". Collections are baked into the image, so this is a one-off blip; retry. If it recurs, disable the provisioner's redundant galaxy re-download.
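Two of the gotchas above are plain config settings, so a new Windows line could be linted for them before it ever reaches a build. This is a sketch only: the function names are hypothetical, and it assumes each setting sits on its own `name = value` line in the pkrvars file, optionally followed by a `#` comment. It checks the two values named above, `vm_inst_os_eval = false` and `common_ip_wait_timeout = "60m"`.

```shell
# Lint sketch for Windows pkrvars files (bash). Function names are
# illustrative; nothing here exists in the repo.

# Succeeds if FILE assigns NAME exactly VALUE (surrounding whitespace ignored).
has_setting() {
  local file="$1" name="$2" value="$3"
  grep -Eq "^[[:space:]]*${name}[[:space:]]*=[[:space:]]*${value}[[:space:]]*(#.*)?\$" "$file"
}

# Checks every windows-*.pkrvars.hcl under DIR (default: ci/config).
# Prints one line per problem and returns 1 if any file fails.
lint_windows_configs() {
  local dir="${1:-ci/config}" rc=0 f
  for f in "$dir"/windows-*.pkrvars.hcl; do
    [ -e "$f" ] || continue
    has_setting "$f" vm_inst_os_eval false ||
      { echo "$f: vm_inst_os_eval must be false"; rc=1; }
    has_setting "$f" common_ip_wait_timeout '"60m"' ||
      { echo "$f: common_ip_wait_timeout should be \"60m\""; rc=1; }
  done
  return "$rc"
}
```

Run from the repo root as `lint_windows_configs`, or pass a directory explicitly. A non-zero exit with the offending file names is enough to fail a CI step early, long before the "timeout waiting for IP" symptom would show up.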
Diagnosing a Windows build
The console screenshot is the fastest signal — it reveals a stuck Setup screen that the run log mis-reports as an IP timeout:
```shell
# admin creds from the machine-local config/vsphere.pkrvars.hcl
govc vm.console -capture /tmp/shot.png windows-server-2025-datacenter-dexp-main-build
govc vm.info -json windows-server-2025-datacenter-dexp-main-build \
  | jq '.virtualMachines[0].guest | {ip: .ipAddress, tools: .toolsRunningStatus}'
```
A healthy run reaches a booted desktop/lock screen with a guest IP and
guestToolsRunning ~7–8 min in, then spends the bulk of its time in
win_updates.
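The manual check above can also be wrapped in a polling loop that waits for a guest IP and captures the console only if none appears. This is a sketch under assumptions: the function names, the 1200-second default timeout, and the 30-second interval are illustrative choices, not values from the repo. It requires `govc` and `jq`, and it reads the same `.virtualMachines[0].guest.ipAddress` field as the command above.

```shell
# Polling sketch (bash). Function names and default timings are
# illustrative; tune them per line.

# Prints the guest IP reported by VMware Tools, or nothing if none yet.
guest_ip() {
  govc vm.info -json "$1" | jq -r '.virtualMachines[0].guest.ipAddress // empty'
}

# Polls until the VM reports an IP; on timeout, tries to save a console
# screenshot so a stuck Setup screen is visible.
# Usage: wait_for_guest_ip VM [TIMEOUT_SEC] [INTERVAL_SEC] [SHOT_PATH]
wait_for_guest_ip() {
  local vm="$1" timeout="${2:-1200}" interval="${3:-30}" shot="${4:-/tmp/shot.png}"
  local waited=0 ip
  while [ "$waited" -le "$timeout" ]; do
    ip="$(guest_ip "$vm" 2>/dev/null)"
    if [ -n "$ip" ]; then
      echo "$ip"
      return 0
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "no guest IP after ${timeout}s" >&2
  govc vm.console -capture "$shot" "$vm" && echo "console screenshot: $shot" >&2
  return 1
}
```

For example, `wait_for_guest_ip windows-server-2025-datacenter-dexp-main-build` prints the IP once Tools reports it, or leaves a screenshot at `/tmp/shot.png` to inspect if the wait times out.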