Features/SleepAndHibernate


Support Sleep (S3) and Hibernate (S4) modes. (S3 and S4 are ACPI definitions)

In Sleep, the guest lowers the power mode of all devices and then enters a lower energy state until a wakeup event. In Hibernate, it additionally writes the entire contents of memory to non-volatile storage (hard disk) and then shuts down. On wakeup from sleep, the devices are put back into the state they were in before sleep. On startup after hibernate, the memory is reread from disk and the devices are reset the same way as after sleep.

TODO

DrvAssertMode disable:
 call worker to:
  flush command ring
  update all areas
  *think* this is possible to do with existing N_SURFACES * QXL_IO_UPDATE_AREA but we choose
   to add a new QXL_IO_ABOUT_TO_SLEEP (rename) to reduce vmexits
 *think* about client references without too large a timeout
 *think* destroy primary (we do this today) is that correct?
 copy all surfaces from device to driver
 destroy all surfaces (wait?)
 Create a GDI primary surface

For each Drv command (rop.c, driver.c):
 *think* global per pdev flag or per surface flag.
 if on system memory punt operation
 otherwise do operation as usual
  no allocations on device memory if punting. (so punt at start of function)
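
The punt-at-start rule above can be sketched as follows. This is only an illustration, assuming a per-surface flag (the per-pdev alternative is still an open question above); SurfaceSketch, drv_op_sketch and build_device_command are hypothetical names, not code from rop.c or driver.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-surface record; not the driver's real SurfaceInfo. */
typedef struct {
    bool on_system_memory;  /* true once the surface was moved off the device */
} SurfaceSketch;

typedef enum { OP_PUNTED, OP_DONE_ON_DEVICE } OpResult;

/* Counts device-side allocations so the "no allocation when punting"
 * rule can be checked. */
static int device_allocs = 0;

/* Stand-in for building a QXL command in device memory. */
static void build_device_command(void)
{
    device_allocs++;
}

/* Shape of a hooked Drv* entry point: the punt test comes first, before
 * anything is allocated on the device. */
static OpResult drv_op_sketch(const SurfaceSketch *dst)
{
    assert(dst != NULL);
    if (dst->on_system_memory) {
        /* The real driver would forward to the matching Eng* function here. */
        return OP_PUNTED;
    }
    build_device_command();
    return OP_DONE_ON_DEVICE;
}
```

Because the check is the first statement, a punted call never reaches build_device_command, which is the property the note asks for.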

DrvAssertMode enable:
 copy all surfaces from device to driver
 create all surfaces with their images
  use DRAW opaque (add operation?)
 *think* if we need to clear the surface in the guest or host? see data_is_valid
 *think* is it good to have a create surface that triggers also an image send from server

Things we are not sure how they work:
 Is SetPowerState(1) called before DrvAssertMode(TRUE) or not:
  if before, all is well.
  if not, need to ensure:
   DrvAssertMode|power==4 => Nop
   SetPowerState(1)|assert==FALSE => calls DrvAssertMode(TRUE)
    can't work due to concurrency

Short TODO:
 cmd.exe alt-enter - first got a wrong screen size at the client (tiny width)
 MoveSurfacesToVideoRam not working?
 Is DrvDeleteDeviceBitmap called for system memory bitmaps (those we did EngModifySurface to
 in MoveSurfacesToRam)?
 Create tester for resolution changes
 Undo log change to global - log is per device, should be in Res[i] (i==device id)

Lifetime of a device surface:
 qxldd created surface
  DrvCreateDeviceBitmap created when PDev is enabled.
   calls GetFreeSurface to allocate surface_id
   calls CreateDeviceBitmap to actually allocate via EngCreateDeviceBitmap (also creates another bitmap, stored in surface_info->draw_area, to do operations on, using the same memory).
  DrvAssertMode disable:
   we move the surface to gdi managed by calling EngModifySurface
  ... <verify the rest> ...
  DrvDeleteDeviceBitmap
   called for any non modified surface (regular path)
   is it called for a modified surface (which has zero hooks and is not opaque)?


 gdi created surface going through qxldd:
  we don't hook this, we punt - so we will never see it deleted. Can be changed later (to
  add hooks). This happens if we are disabled when DrvCreateDeviceBitmap is called
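
As an illustration of the surface_id allocation step in the lifetime above, here is a minimal free-list sketch. It is not the actual GetFreeSurface code: SurfaceIdPool, pool_init, pool_get and pool_put are hypothetical names, the pool size is arbitrary, and reserving id 0 for the primary surface is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define N_SURFACES_SKETCH 8      /* illustrative; the real limit comes from the device */
#define NO_SURFACE UINT32_MAX    /* returned when no id is free */

typedef struct {
    uint32_t next_free[N_SURFACES_SKETCH]; /* singly linked free list by index */
    uint32_t head;                         /* first free id, or NO_SURFACE */
} SurfaceIdPool;

/* Put ids 1..N-1 on the free list; id 0 is kept out (assumed primary). */
static void pool_init(SurfaceIdPool *p)
{
    p->next_free[0] = NO_SURFACE;
    for (uint32_t i = 1; i < N_SURFACES_SKETCH; i++)
        p->next_free[i] = (i + 1 < N_SURFACES_SKETCH) ? i + 1 : NO_SURFACE;
    p->head = 1;
}

/* Pop a free id, or return NO_SURFACE when the pool is exhausted. */
static uint32_t pool_get(SurfaceIdPool *p)
{
    uint32_t id = p->head;
    if (id != NO_SURFACE)
        p->head = p->next_free[id];
    return id;
}

/* Return an id to the pool, e.g. when the device bitmap is deleted. */
static void pool_put(SurfaceIdPool *p, uint32_t id)
{
    assert(id > 0 && id < N_SURFACES_SKETCH);
    p->next_free[id] = p->head;
    p->head = id;
}
```

A surface that was moved to GDI with EngModifySurface would only give its id back if the delete hook is still called for it, which is the open question noted above.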

Things to Check
 resolution changes
  make sure we get punted operations (add prints)
 log off / log in
 windows 7 + windows xp
 multiple monitors
 multiple vcpus
 migration during sleep
 audio channel

SetPowerState 4 (Sleep)
 Architecture decision (bring on the list with the patchset):
  Ensure all QXLOutput callbacks are called before sleep.
  Alternative:
   since we are resetting vram anyway, there is no need to call callbacks that only
   affect the vram. Currently the only callback that does something else is FreeQuicImage which
   also adds the image to the cache lru. But we may add more in the future, and this is a hard
   assumption to enforce.
 Debugging:
  For Testing: memset(0) all devram and vram.
   will check that qxl reset (via acpi) is correct
   will check that red_worker isn't referencing something it shouldn't (it will segfault hopefully)
  Ensure all PDevs are disabled.
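
The two debugging checks above could look roughly like this. It is a sketch over plain buffers: wipe_region, region_is_zero, all_pdevs_disabled and PDevSketch are hypothetical names, and a real driver would operate on the mapped devram and vram BARs rather than on an ordinary array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the driver's PDev; only the flag matters here. */
typedef struct {
    bool enabled;
} PDevSketch;

/* Zero a region that stands in for a mapped BAR (devram or vram). */
static void wipe_region(uint8_t *base, size_t len)
{
    assert(base != NULL);
    memset(base, 0, len);
}

/* True when every byte is zero, i.e. nothing wrote to the region after the wipe. */
static bool region_is_zero(const uint8_t *base, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (base[i] != 0)
            return false;
    }
    return true;
}

/* True when no PDev in the array is still enabled. */
static bool all_pdevs_disabled(const PDevSketch *pdevs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (pdevs[i].enabled)
            return false;
    }
    return true;
}
```

Wiping before sleep makes a stale reference in red_worker likely to fail loudly instead of reading leftover data.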

Windows 2000 Device driver assumptions: (to be verified)
 Only one PDev is enabled at a time
  corollary: red_worker sees surfaces from only one pdev.
  corollary 2: when we destroy all surfaces, we are ok since they are for a specific pdev.
   (otherwise we are destroying too many surfaces)
 SetPowerState is _not_ concurrent with:
  * any operation on an enabled PDev
  * 


(old version) SetPowerState 4 (Sleep)
 ensure client and red_worker are in updated state:
  flush command and cursor rings
  update all surfaces (already done by assertmode - do again?)
  remove all references to glz dictionary
 For Testing: memset(0) all devram and vram.
  will check that qxl reset (via acpi) is correct
  will check that red_worker isn't referencing something it shouldn't (it will segfault hopefully)
 Ensure all PDevs are disabled.

(after device reset per acpi request - qxl ram is initialized, including all rings)
SetPowerState 1 (Return to Normal)
 Init mspaces, Init cache (anything on the device)
 Debugging:
  Check that no pdev is enabled yet.
  TODO: find some msdn reference that says this is so.

Other questions:
 Why do we use pci bars and not guest ram?

Changing resolution flow:
 Pdev 1 (which was enabled, let's assume just a fresh DrvEnablePDEV+DrvEnableSurface)
 Pdev 2 is the new one
  DrvAssertMode(#1, FALSE)
  DrvEnableSurface(#2)
  Followed by operations to #2..
  Do we see operations to #1?

Next month:
 PUNT_IF_ENABLED
  change to per surface flag (surface_info->copy right now) ?
  is it called?
 SurfaceInfo
  we added a *copy, can be removed and reuse base_mem?
  ditto stride
 PDev
  enable, surf_enable - leave one?

Next Year:
 3dlabs/perm2 has a nice feature: for each DrvCopyBits/DrvBitBlt it checks whether there is free vram to copy the surface into, and if so does it. We can do the same (especially the other way around: moving surfaces to system memory if no vram is left).
        // However, we are still interested in seeing DrvCopyBits
        // and DrvBitBlt calls involving this surface, because
        // in those calls we take the opportunity to see if it's
        // worth putting the device-bitmap back into video memory
        // (if some room has been freed up).
        //
        if ( EngModifySurface(psurf->hsurf,
                              psurf->ppdev->hdevEng,
                              HOOK_COPYBITS | HOOK_BITBLT,
                              0,                    // It's system-memory
                              (DHSURF)psurf,
                              pvScan0,
                              lDelta,
                              NULL))
        {

General TODO:
 We punt a lot of functions without even considering spice'ing them (the whole path, qxl_dev
 and client protocol). Why?
  DrvLineTo, DrvPlgBlt, DrvStrokeAndFillPath, (see display/driver.c)

DDK Examples Notes:
 3dlabs/perm2/disp/heap.c:bDownload - copies from SM to VM
 bDemote - copies from VM to SM

PCI memory vs. Guest RAM:
 On the face of it it is better to let the guest manage all our memory, since pci is allocated upfront and so wasteful.
 However, for canvases, and actually also for a lot of our commands, we require the memory to be contiguous for the server as well as the guest. And such an allocation may fail.
 Suggestion forward: Move all surfaces (vram) to guest allocated, since those can fail safely.
 Leave framebuffer and devram on pci. io ram (QXLOutput managed) may be a second candidate for guest memory.
 (This is from a talk with Izik, 31.5.2011)
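
To illustrate why surface allocations "can fail safely": the creation path only has to report failure and leave its bookkeeping untouched, and the caller then falls back to a bitmap in system memory (as far as we understand, returning 0 from DrvCreateDeviceBitmap makes GDI manage the bitmap itself). The sketch below uses a toy pool; VramPoolSketch, surface_try_alloc and surface_free are hypothetical names, and a real allocator would also need a contiguous range, not just enough free bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy accounting for a fixed-size surface memory pool. */
typedef struct {
    size_t capacity;  /* total bytes available for surfaces */
    size_t used;      /* bytes currently reserved */
} VramPoolSketch;

/* Reserve room for a surface. On failure nothing changes, so the caller can
 * simply decline to create a device surface. */
static bool surface_try_alloc(VramPoolSketch *pool, size_t size)
{
    assert(pool != NULL && pool->used <= pool->capacity);
    if (size > pool->capacity - pool->used)
        return false;
    pool->used += size;
    return true;
}

/* Give the reservation back when the surface is destroyed. */
static void surface_free(VramPoolSketch *pool, size_t size)
{
    assert(pool != NULL && size <= pool->used);
    pool->used -= size;
}
```

The framebuffer and devram have no comparable fallback, which matches the suggestion to leave them on pci.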


Notes

Support for the QXL device requires deleting from the PCI bar all the resources that are no longer in use by the driver, but may be in use by red_worker.

Resources that we should back up and restore on wakeup:

  • surfaces (copied from the vram bar to ram)

Resources that are purged:

  • All qxl commands (including surfaces and cursor)
  • Pending resources on the Release ring

Windows (windows 2000 graphics driver model) flow is:

Open Questions:

  • what is the relationship between the miniport and the display driver?
    • During SetPowerState(4) can calls be made to dd?
      • Seems to be negative, the rest follows this assumption
    • After SetPowerState(4) and before Call to ACPI can calls be made to dd?
    • After ACPI return from sleep and before SetPowerState(1) can calls be made?
      • Seems to be affirmative, the rest follows this assumption

General Change:

  • After SetPowerState(4)
    • copy all surfaces to local ram
    • Any operation (done after we return from SetPowerState(4)) will:
      • check if the surface it is aiming at is off device.
      • if so block on an event fired by SetPowerState(1)

Sleep: (initiated by the guest)

  • display:
    • T0: DrvAssertMode - the order of this and SetPowerState is experimentally T0 < T1, but could not find documentation to support this.
  • miniport:
    • T1: SetPowerState 4 (VideoPowerOff) (For hibernate: 5 - VideoPowerHibernate)
      • Clear release ring
      • QXL_IO_ABOUT_TO_SLEEP
      • device: Call worker to (1) render all commands (2) release all references to device resources. The worker will:
        • flush cursor and display command rings
        • update all surfaces
        • destroy all surfaces
        • empty GLZ dictionary
        • flush commands to client (after timeout remove them, possibly replace with surface images and last cursor)
      • Clear release ring again
      • assert command ring and cursor ring both empty