This host stayed in the provisioning state indefinitely, waiting for the userdata done job to mark it as done. The job, in turn, kept erroring for hours.
The cycle only ended when a user manually terminated the spawnhost from the UI.
1) Why does the context keep getting cancelled after 15 seconds?
2) Why did it happen specifically with this host?
3) Should we terminate the host ourselves after some number of attempts or a duration?