acrn-hypervisor

mirror of https://github.com/projectacrn/acrn-hypervisor.git synced 2025-11-19 01:53:17 +00:00

Author	SHA1	Message	Date
Gao, Shiqing	f398b9c29e	release: fix the compilation error in release mode Commit `512c98fd7 hv: trace: show cpu usage of vms in pcpu sharing case` causes the compilation error in release mode: hypervisor/common/schedule.c:190: undefined reference to `TRACE_16STR' This patch fixes this issue. Tracked-On: #861 Signed-off-by: Gao, Shiqing <shiqing.gao@intel.com>	2024-07-03 11:26:01 +08:00
YuanXin-Intel	e4b1584577	Change Service VM to supervisor role 1. Enable Service VM to power off or restart the whole platform even when RTVM is running. 2. Allow Service VM to stop the RTVM using the acrnctl tool with option "stop -f". 3. Add 'Service VM supervisor role enabled' option in ACRN configurator Tracked-On: #8618 Signed-off-by: YuanXin-Intel <xin.yuan@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com> Reviewed-by: Jian Jun Chen <jian.jun.chen@intel.com>	2024-06-28 13:35:07 +08:00
nacui	512c98fd79	hv: trace: show cpu usage of vms in pcpu sharing case To maximize the cpu utilization, core 0 is usually shared by service vm and guest vm. But there are no statistics to show the cpu occupation of each vm. This patch provides cpu usage statistics for users. To calculate them, a new trace event is added and marked in the scheduling context switch, accompanied by a new python script to analyze the data from acrntrace output. Tracked-On: #8621 Signed-off-by: nacui <na.cui@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com> Reviewed-by: Haiwei Li <haiwei.li@intel.com>	2024-06-28 12:55:23 +08:00
Haiwei Li	3d6ca845e2	hv: s3: add timer support When resuming from s3, Service VM OS will hang because the timer interrupt on BSP is not triggered. Hypervisor won't update the physical timer because there are expired timers on the pcpu timer list. Add suspend and resume ops for modules that use timers. This patch is just for Service VM OS. Support for User VM will be added in the future. Tracked-On: #8623 Signed-off-by: Haiwei Li <haiwei.li@intel.com>	2024-06-27 11:26:09 +08:00
Haiwei Li	5283c147ef	hv: pci: Add guest cfg header access handling of type 1 device When guests resume from s3, an error occurs in the guest: `pcieport 0000:00:1c.0: refused to change power state from D0 to D3hot` A PCI bridge (type 1 device) will access the configuration space header, but ACRN does not currently support this. So add handling support. Tracked-On: #8623 Signed-off-by: Haiwei Li <haiwei.li@intel.com>	2024-06-27 11:26:09 +08:00
Haiwei Li	2cd0edaf9c	hv: pci: restore bus and memory/IO info after reset After some kind of reset, such as s3, a pci bridge tries to restore the bus and memory/IO info (from 0x18 to 0x32, except for Secondary Latency Timer 0x1b) to resume device state. This patch restores this info in the hypervisor. Tracked-On: #8623 Signed-off-by: Haiwei Li <haiwei.li@intel.com>	2024-06-27 11:26:09 +08:00
Haiwei Li	81935737ff	hv: s3: reset vm after resume Now only BSP is reset. After Service VM OS resumes from s3, APs' apic_base_msr are incorrect with x2apic bit en. To avoid incorrect states, do `reset_vm` after resume. Tracked-On: #8623 Signed-off-by: Haiwei Li <haiwei.li@intel.com>	2024-06-27 11:26:09 +08:00
Haiwei Li	9c139681f2	hv: s3: hwp: enable hwp after resume from s3 After Service OS resumes from s3, an error occurs: [3649827us][cpu=1][idle1][sev=2][seq=1749]:= Unhandled exception: 13 (General Protection) [3658622us][cpu=1][idle1][sev=2][seq=1750]: Host Registers: [3664881us][cpu=1][idle1][sev=2][seq=1751]:= Vector=0x000000000000000D RIP=0x000000000040F9F0 [3674213us][cpu=1][idle1][sev=2][seq=1752]:= RAX=0x0000000080003801 RBX=0x0000000001800800 RCX=0x0000000000000774 [3685787us][cpu=1][idle1][sev=2][seq=1753]:= RDX=0x0000000000000000 RDI=0x0000000000000080 RSI=0x0000000000000000 [3697371us][cpu=1][idle1][sev=2][seq=1754]:= RSP=0x0000000000616C18 RBP=0x0000000000616C38 RBX=0x0000000001800800 [3708947us][cpu=1][idle1][sev=2][seq=1755]:= R8=0x0000000000000038 R9=0x0000000000000001 R10=0x00000000000003F8 [3720539us][cpu=1][idle1][sev=2][seq=1756]:= R11=0x000000000000000D R12=0x0000000000458245 R13=0x0000000000000000 [3732114us][cpu=1][idle1][sev=2][seq=1757]:= RFLAGS=0x0000000000010202 R14=0x0000000000000000 R15=0x0000000000000000 [3743699us][cpu=1][idle1][sev=2][seq=1758]:= ERRCODE=0x0000000000000000 CS=0x0000000000000008 SS=0x0000000000000010 [3755305us][cpu=1][idle1][sev=2][seq=1759]:= CR2=0x0000000000000000 The error occurs in `msr_write(MSR_IA32_HWP_REQUEST, reg)`, when HWP is not available. This patch initializes HWP after resume. Tracked-On: #8623 Signed-off-by: Haiwei Li <haiwei.li@intel.com>	2024-06-27 11:26:09 +08:00
Haiwei Li	cdfd35ed3d	hv: s3: enable lapic earlier After Service VM OS resumes from s3, BSP starts APs asynchronously, followed by IPIs to APs to resume tsc. This process takes place in function `host_enter_s3`. However, the APs' lapics are not yet ready to accept IPIs, so BSP fails to resume tsc. So enable lapic earlier to make sure that APs are ready. Tracked-On: #8623 Signed-off-by: Haiwei Li <haiwei.li@intel.com>	2024-06-27 11:26:09 +08:00
Qiang Zhang	29137b9e9c	doc: update doc for vUART to hypervisor console switch key Things changed since following commit (`c623e1112` debug: vuart: add guest break key support). Tracked-On: #8583 Signed-off-by: Qiang Zhang <qiang4.zhang@intel.com>	2024-06-25 11:07:21 +08:00
Jiaqing Zhao	53825c5cac	e820: properly reserve memory for multiboot modules In the current implementation, if there are multiple contiguous 4k-aligned modules, 0-sized e820 entries will be created between these regions. And for non-4k-aligned modules, when two of them are located in one page, the second memory range will not be reserved as it was not in one e820 entry after the first is reserved, making it vulnerable. This patch fixes it by marking the exact memory range of multiboot modules as unusable first, then shrinking the e820 entries to page boundary. If the module crosses multiple e820 entries, possibly due to a buggy bootloader, hypervisor will panic immediately to prevent modules getting corrupted. Tracked-On: #8617 Signed-off-by: Jiaqing Zhao <jiaqing.zhao@linux.intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2024-06-20 09:10:27 +08:00
Haiwei Li	b31fcd3519	hv: cpuid: fix hybrid related cpuid error Some cpuids will return invalid values on hybrid platform because of the error in the pointer arithmetic. Add `(void *)` before `cpu_cpuids.leaves`. Leaf 0x14 is used to report Intel Processor Trace Enumeration and varies between P-cores and E-cores on hybrid platform. So add it to `hybrid_leaves`. Tracked-On: #8608 Fixes: `59a8cc4c2` ("hv: cpuid: make leaf 0x4 per-cpu in hybrid architecture") Signed-off-by: Haiwei Li <haiwei.li@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2024-06-19 17:07:10 +08:00
andi6	46a860bf04	hv: fix using cpuid does not clear the upper 32-bit registers. In HV, cpuid uses the lower 32 bits of the rax\rbx\rcx\rdx registers to pass parameters, but the software does not clear the upper 32 bits. If the guest uses 64-bit variables to pass parameters to cpuid, it will read rax\rbx\rcx\rdx rather than eax\ebx\ecx\edx, so the previous values of the upper 32 bits will affect the guest. Tracked-On: #8605 Reviewed-by: Junjie Mao <junjie.mao@intel.com> Signed-off-by: andi6 <andi6@xiaomi.com>	2024-06-19 15:35:26 +08:00
Jian Jun Chen	74bc2f7cfb	hv: asyncio: support data match of the same addr Virtio legacy device (ver < 1.0) uses a single PIO for all virtqueues. Notifications from different virtqueues are implemented by writing virtqueue index to the PIO. Writing different values to the same addr needs to be mapped to different eventfds by asyncio. This is called data match feature of asyncio. v3 -> v4: * Update the definition of `struct asyncio_desc` Use `struct acrn_asyncio_info` inside it, instead of defining the duplicated fields. * Update `add_asyncio` to use `memcpy_s` rather than assigning all the fields using 5 assignment statements. * Update `asyncio_is_conflict` for coding style 120-character line is sufficient to write all conditions. * Update the checks related to `wildcard` Because we require every conditional clause to have a Boolean type in the coding guideline. v2 -> v3: No change v1 -> v2: No change Tracked-On: #8612 Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com> Signed-off-by: Shiqing Gao <shiqing.gao@intel.com> Acked-by: Wang, Yu1 <yu1.wang@intel.com>	2024-06-05 15:23:33 +08:00
Jiaqing Zhao	91e0612e88	hv: dm: refine create/destroy functions The create function of hv-emulated device must check the return value of vpci_init_vdev() as it returns NULL pointer on failure, and that function should be called atomically. Also, the destroy function should deinit the vpci devices created to prevent resource leak. Tracked-On: #8590 Signed-off-by: Jiaqing Zhao <jiaqing.zhao@linux.intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2024-06-04 09:38:34 +08:00
Jiaqing Zhao	626e2f1d17	hv: vpci: clear vdev structure on device deassign In devicemodel, a passthrough device is deassigned and then assigned to guest on guest reboot. Each time hypervisor allocates a new pci_vdev structure to keep its info. As it was stored in a statically-allocated array, it will eventually use up all slots, resulting in both a resource leak and out-of-bounds access. Fix it by clearing the corresponding vdev structure on device deassign, thus a bitmap is introduced to track the usage, replacing the existing array count. Tracked-On: #8590 Signed-off-by: Jiaqing Zhao <jiaqing.zhao@linux.intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2024-06-04 09:38:34 +08:00
Haiwei Li	b885d02396	hv: cpuid: add several leaf to per-cpu list in hybrid architecture P-cores and E-cores accessing leaf 0x2U/0x14U/0x16U/0x18U/0x1A/0x1C/0x80000006U will have different information in hybrid architecture. So add them to per-cpu list in hybrid architecture and directly return the physical value. Note: 0x14U is hidden and returns 0. Tracked-On: #8608 Signed-off-by: Haiwei Li <haiwei.li@intel.com>	2024-05-28 11:02:56 +08:00
Haiwei Li	d6fe8b0892	hv: cpuid: make leaf 0x6 per-cpu in hybrid architecture Leaf 0x6 returns thermal and power management information. In hybrid architecture, P-cores and E-cores have different information. Add leaf 0x6 to per-cpu list in hybrid architecture and handle specific cpuid access. Tracked-On: #8608 Signed-off-by: Haiwei Li <haiwei.li@intel.com>	2024-05-28 11:02:56 +08:00
Haiwei Li	59a8cc4c28	hv: cpuid: make leaf 0x4 per-cpu in hybrid architecture Leaf 0x4 returns deterministic cache parameters for each level. In hybrid architecture, P-cores and E-cores have different cache information. Add leaf 0x4 to per-cpu list in hybrid architecture and handle specific cpuid access. Tracked-On: #8608 Signed-off-by: Haiwei Li <haiwei.li@intel.com>	2024-05-28 11:02:56 +08:00
Haiwei Li	f7506424e4	hv: cpuid: refactor per-cpu leaves definition CPUID returns processor identification and feature information. Different pcpus may return different infos. That is, the info is per-cpu. In hybrid architecture, per-cpu leaf is different from the previous. So introduce a struct percpu_cpuids to indicate the per-cpu leaf. struct percpu_cpuids will consist of two parts: generic percpu leaves and hybrid related percpu leaves. This patch is just to add generic percpu leaves. Tracked-On: #8608 Signed-off-by: Haiwei Li <haiwei.li@intel.com>	2024-05-28 11:02:56 +08:00
Xin Zhang	7edf800f16	Expose CPUID leaf 0x1f to guest with patched x2APIC ID CPUID leaf 1f is the preferred superset of leaf 0b. Currently ACRN exposes leaf 0b but leaves leaf 1f empty, so the two leaves mismatch, and applications that follow the SDM check 1f first. Tracked-On: #8608 Signed-off-by: Xin Zhang <xin.x.zhang@intel.com>	2024-05-28 11:02:56 +08:00
Zhangwei6	ddfcb8c3fc	hv: enable thermal lvt interrupt This patch can fetch the thermal lvt irq and propagate it to VM. At this stage we support the case that there is only one VM governing thermal. And we pass the hardware thermal irq to this VM. First, we register the handler for thermal lvt interrupt, its irq vector is THERMAL_VECTOR and the handler is thermal_irq_handler(). Then, when a thermal irq occurs, it flags the SOFTIRQ_THERMAL bit of softirq_pending. This bit triggers the thermal_softirq() function, which injects the virtual thermal irq to the VM. Tracked-On: #8595 Signed-off-by: Zhangwei6 <wei6.zhang@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2024-05-16 09:40:32 +08:00
Zhangwei6	78243c3f49	hv: expose thermal MSRs to VM. In this phase, we only use one VM to control thermal. So we make thermal MSRs readable and writable by this VM. This VM is flagged with GUEST_FLAG_VTM, and can read/write thermal MSRs. VMs not flagged with GUEST_FLAG_VTM can only read these thermal MSRs to get the current status. Tracked-On: #8595 Signed-off-by: Zhangwei6 <wei6.zhang@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2024-05-16 09:40:32 +08:00
Zhang Chen	946a927dcb	hv: sched: Fix scheduler priority issue Fix build issue. Tracked-On: #8586 Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2024-05-08 14:52:23 +08:00
Qiang Zhang	629808c767	debug: vuart: fix interrupt ID for data receiving When RX FIFO is not empty and Receive Data Available interrupt is enabled, vUART should report a Receive Data Available (IIR_RXRDY) in IIR instead of a Timeout Interrupt Pending (IIR_RXTOUT). Tracked-On: #8583 Signed-off-by: Qiang Zhang <qiang4.zhang@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2024-04-25 15:00:09 +08:00
Qiang Zhang	c623e11125	debug: vuart: add guest break key support The break key (key value 0x0) was used as switch key from guest serial to hv console and guest serial could not receive break key. This blocked some guest debugging features like KGDB/KDB, sysrq, etc. This patch leverages escape sequence "<escape> + <break>" to send break to guest and "<escape> + e" to switch from guest serial to hv console. Tracked-On: #8583 Signed-off-by: Qiang Zhang <qiang4.zhang@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2024-04-25 15:00:09 +08:00
Yi Sun	e0d03b27d0	hv: return error code for default case in hcall_vm_intr_monitor In hcall_vm_intr_monitor(), the default case for intr_hdr->cmd is an invalid case, so it should return an error code. But it returns success code 0 in the current code. Tracked-On: #8580 Reviewed-by: Fei Li <fei1.li@intel.com> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>	2024-04-23 15:58:36 +08:00
Yonghua Huang	7bcd9d783e	hv: refine set_fs_base() function Leave canary of stack protector untouched on pCPU if it has been initialized, instead of generating a new one. Tracked-On: #8577 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com> Reviewed-by: Fei Li <fei1.li@intel.com>	2024-04-23 11:00:43 +08:00
Yonghua Huang	d5d21fdc1b	hv: fix potential NULL pointer dereferrence in ivshmem.c secure coding fix. Tracked-On: #8566 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2024-04-10 12:20:54 +08:00
Yonghua Huang	ddfe218747	hv: fill region ID to hv-land ivshmem PCI config space 1) region ID shall be configured by user via config tool. 2) region ID is programmed to "Subsystem ID" of PCI config space. 3) "Subsystem Vendor ID" is hard-coded as 0x8086 Tracked-On: #8566 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2024-03-28 14:34:38 +08:00
Yonghua Huang	a7a6732580	config_tools: support IVSHMEM devices region ID configuration This patch adds ivshmem region ID configuration support when users configure ACRN IVSHMEM devices via the ACRN config tool. This ID provides VMs with a stable identification of multiple shared memory regions. Also add logic to generate launch script with region ID configured as below: `add_virtual_device 8 ivshmem hv:/shm_region_0,256,1` Tracked-On: #8566 Signed-off-by: Kunhui-Li <kunhuix.li@intel.com> Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2024-03-28 14:34:38 +08:00
Jiaqing Zhao	f6bb15c85c	hv: mmu: intiialize ppt_page_pool.bitmap in allocate_ppt_pages() ppt_page_pool.bitmap should be zero-initialized. Also fixes the wrong indentation in allocate_ppt_pages(). Tracked-On: #8559 Reviewed-by: Junjie Mao <junjie.mao@intel.com> Signed-off-by: Jiaqing Zhao <jiaqing.zhao@linux.intel.com>	2024-03-25 09:57:08 +08:00
Jiaqing Zhao	997bdc4843	hv: configure hv console default output in scenario file Add a new option CONSOLE_VM in scenario to set the default vm to be outputted in hv console, when it is not set, acrn console will be used (current behavior). This is intended for debugging vm boot issues. Tracked-On: #8518 Signed-off-by: Jiaqing Zhao <jiaqing.zhao@linux.intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2024-03-25 09:52:30 +08:00
Wu Zhou	a052cda8e4	hv: console reads all input chars in one poll For uart console, some control keys are defined as byte sequences, such as: * up arrow - 0x1b/0x5b/0x41 * F8 - 0x1b/0x5b/0x31/0x39/0x7e Currently hv console only reads one char per poll. When guest vuart console is active, those byte sequences may not be sent to guest vuart in time due to the poll interval. Thus control keys such as up/down cannot be used in shell or vim. The solution is to read all input chars in one poll, so that control keys can be received by guest OS properly. Tracked-On: #8564 Signed-off-by: Wu Zhou <wu.zhou@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2024-03-12 15:26:58 +08:00
Wu Zhou	925e3d95b4	hv: add max_len for sbuf_put param sbuf_put copies sbuf->ele_size of data, and puts into ring. Currently this function assumes that data size from caller is no less than sbuf->ele_size. But as sbuf->ele_size is usually setup by some sources outside of the HV (e.g., the service VM), it is not meant to be trusted. So caller should provide the max length of the data for safety reason. sbuf_put() will return UINT32_MAX if max_len of data is less than element size. Additionally, a helper function sbuf_put_many() is added for putting multiple entries. Tracked-On: #8547 Signed-off-by: Wu Zhou <wu.zhou@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2024-02-20 11:52:02 +08:00
Wu Zhou	29b3d03ac7	hv: vm_event: send event on triple fault handler In the triple fault handler, post-launched VMs are instantly turned off. Now a vm event is generated simultaneously, so that developers can capture the event and decide what to do with it (e.g., logging and populating diagnostics, or powering off the VM). Tracked-On: #8547 Signed-off-by: Wu Zhou <wu.zhou@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2024-02-01 17:01:31 +08:00
Wu Zhou	ab63fe1a92	hv: vm_event: send RTC change event in hv vRTC This patch adds support for HV vrtc vm_event. RTC change event is sent upon each date/time reg write. Those events will be handled in DM. DM will try to emit an RTC change event (to Libvirt) based on its strategy. Only post-launched VMs are supported. The DM event handler has already implemented the rtc change event. Those events will be processed the same way as vrtc events from DM vrtc. Tracked-On: #8547 Signed-off-by: Wu Zhou <wu.zhou@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2024-02-01 17:01:31 +08:00
Wu Zhou	262a48f346	dm: vm_event: add support for RTC change event When a guest OS performs an RTC change action, we wish this event to be captured by developers, so that they can decide what to do with it (e.g., whether to change physical RTC). There are some facts that make the RTC change event a bit complicated: - There are 7 RTC date/time regs (year, month…). They can only be updated one by one. - RTC time is not reliable before date/time update is finished. - Guests can update RTC date/time regs in any order. - Guests may update RTC date/time regs while the RTC is either halted or not halted. A single date/time update event is not reliable. We have to wait for the guest to finish the update process. So the DM's event handler sets up a timer, and waits for some time (1 second). If no more changes happen before the timer expires, we can conclude that the RTC change has been done. Then the rtc change event is emitted. This event handler logic can be used to process HV vrtc time change events too. Tracked-On: #8547 Signed-off-by: Wu Zhou <wu.zhou@intel.com> Reviewed-by: Jian Jun Chen <jian.jun.chen@intel.com>	2024-02-01 17:01:31 +08:00
Wu Zhou	d9ccf1ccb2	dm: vm_event: create vm_event thread This patch creates a thread for vm_event delivery. The thread uses epoll to poll event notifications, then reads out the msg data queued in sbuf. An event handler is called upon successful receipt. Both HV and DM event sources share the same process. Also vm_event tx API for DM event source is added in this patch. Tracked-On: #8547 Signed-off-by: Wu Zhou <wu.zhou@intel.com> Reviewed-by: Jian Jun Chen <jian.jun.chen@intel.com>	2024-02-01 17:01:31 +08:00
Wu Zhou	581ec58fbb	hv: vm_event: create vm_event support This patch creates vm_event support in HV, including: 1. Create vm_event data type. 2. Add vm_event sbuf and its initializer. The sbuf will be allocated by DM in Service VM. Its page address will then be shared with HV through hypercall. 3. Add an API to send the HV generated event. Tracked-On: #8547 Signed-off-by: Wu Zhou <wu.zhou@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2024-02-01 17:01:31 +08:00
Muhammad Qasim Abdul Majeed	3be3b394ad	hypervisor: Fix spelling and grammar mistakes. Tracked-On: #8533 Signed-off-by: Muhammad Qasim Abdul Majeed <qasim.majeed20@gmail.com>	2023-10-24 11:10:47 +08:00
Muhammad Qasim Abdul Majeed	ce96ef6bae	hypervisor: Fix spelling and grammar mistakes. Tracked-On: #8533 Signed-off-by: Muhammad Qasim Abdul Majeed <qasim.majeed20@gmail.com>	2023-10-23 16:45:28 +08:00
Wu Zhou	bbe8e254cf	hv: support multi function ivshmem device Currently ivshmem device can only be configured as single function device (bdf.f = 0) on bus 0. This greatly limits the number of ivshmem devices we can create. This patch is to enable multiple function bit in HEADER_TYPE config register, so that we can create many more ivshmem devices by using different function numbers on one bus:dev. The multi function device bit is to be set on ivshmem devices whose function number equals 0. PCI spec describes it as: ‘When Set, indicates that the Device may contain multiple Functions, but not necessarily.’ So if this dev is the only one on the bus:dev, it is still OK. Tracked-On: #8520 Signed-off-by: Wu Zhou <wu.zhou@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2023-09-27 16:46:20 +08:00
Qiang Zhang	79b91b339b	hv: sched: add four parameters for BVT scheduler Per the BVT (Borrowed Virtual Time) scheduler design, the following per-thread parameters are required to tune scheduling behaviour. - weight The time sharing of a thread on CPU. - warp Boost value of virtual time of a thread (time borrowed from future) to reduce Effective Virtual Time to prioritize the thread. - warp_limit Max warp time in one warp. - unwarp_period Min unwarp time after a warp. As of now, only weight is in use to tune virtual time ratio of VCPU threads from different VMs. Other parameters are for future extension. Tracked-On: #8500 Reviewed-by: Junjie Mao <junjie.mao@intel.com> Signed-off-by: Qiang Zhang <qiang4.zhang@intel.com>	2023-09-18 16:26:05 +08:00
Qiang Zhang	04a4f31d28	config: add four per-vm bvt parameters Add four per-vm bvt parameters as the initial bvt parameter values for vCPU threads. - bvt_weight The time sharing of a thread on CPU. - bvt_warp_value Boost value of virtual time of a thread (time borrowed from future) to reduce Effective Virtual Time to prioritize the thread. - bvt_warp_limit Max warp time in one warp. - bvt_unwarp_period Min unwarp time after a warp. Tracked-On: #8500 Reviewed-by: Junjie Mao <junjie.mao@intel.com> Signed-off-by: Qiang Zhang <qiang4.zhang@intel.com>	2023-09-18 16:26:05 +08:00
Qiang Zhang	6a1d91c740	hv: sched: Add sched_params struct for thread parameters Abstract out scheduler config data for vCPU threads and other hypervisor threads into a sched_params structure, which is used to initialize per-thread scheduler private data. The sched_params for vCPU threads come from vm_config generated by config tools, while other hypervisor threads need to provide them explicitly. Tracked-On: #8500 Signed-off-by: Qiang Zhang <qiang4.zhang@intel.com>	2023-09-18 16:26:05 +08:00
Qiang Zhang	c000a3f70b	hv: add clamp macro for convenience Add clamp macro to clamp a value within a range. Tracked-On: #8500 Reviewed-by: Junjie Mao <junjie.mao@intel.com> Signed-off-by: Qiang Zhang <qiang4.zhang@intel.com>	2023-09-18 16:26:05 +08:00
Wu Zhou	9a6e940849	hv: signal_event after make_request make_request sets the request bit, and signal_event wakes the vcpu thread. If signal_event comes first, the target vCPU has a chance to sleep again before processing the request bit. Tracked-On: #8507 Signed-off-by: Wu Zhou <wu.zhou@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2023-09-15 11:52:40 +08:00
Wu Zhou	064be1e3e6	hv: support halt in hv idle When all vCPU threads on one pCPU are put to sleep (e.g., when all guests execute HLT), hv would schedule to idle thread. Currently the idle thread executes PAUSE which does not enter any c-state and consumes a lot of power. This patch is to support HLT in the idle thread. When we switch to HLT, we have to make sure events that would wake a vCPU must also be able to wake the pCPU. Those events are either generated by local interrupt or issued by other pCPUs followed by an ipi kick. Each of them has an interrupt involved, so they are also able to wake the halted pCPU. Except when the pCPU has just scheduled to idle thread but not yet halted, interrupts could be missed. sleep-------schedule to idle------IRQ ON---HLT--(kick missed) ^ wake---kick\| This area should be protected. This is done by a safe halt mechanism leveraging STI instruction’s delay effect (same as Linux). vCPUs with lapic_pt or hv with CONFIG_KEEP_IRQ_DISABLED=y do not allow interrupts in root mode, so they could never wake from HLT (INIT kick does not wake HLT in root mode either). They should continue using PAUSE in idle. Tracked-On: #8507 Signed-off-by: Wu Zhou <wu.zhou@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2023-09-15 11:52:40 +08:00
Wu Zhou	64d999e703	hv: switch to dynamic timer in bvt scheduler When bvt scheduler picks up a thread to run, it sets up a counter ‘run_countdown’ to determine how many ticks it should remain running. Then the timer will decrease run_countdown by 1 on every 1000Hz tick interrupt, until it reaches 0. The tick interrupt consumes a lot of power during idle (if we are using HLT in idle thread). This patch is to switch the 1000 Hz timer to a dynamic one, which only interrupts when run_countdown expires. Tracked-On: #8507 Signed-off-by: Wu Zhou <wu.zhou@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2023-09-13 08:30:27 +08:00
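
Commit 925e3d95b4 (hv: add max_len for sbuf_put param) describes a length check against an untrusted element size. The following is a minimal sketch of that idea only, not ACRN's code: `struct demo_sbuf`, `demo_sbuf_put()`, the fixed capacity constants, and the return values other than `UINT32_MAX` (element size on success, 0 when full) are assumptions made for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified ring buffer; not ACRN's real shared buffer layout. */
#define DEMO_SBUF_ELE_NUM  4U   /* number of slots in the ring */
#define DEMO_SBUF_MAX_ELE  32U  /* largest element size the backing store can hold */

struct demo_sbuf {
	uint32_t ele_size;  /* set up by an untrusted party, e.g. the Service VM */
	uint32_t tail;      /* index of the next slot to write */
	uint32_t count;     /* number of occupied slots */
	uint8_t data[DEMO_SBUF_ELE_NUM * DEMO_SBUF_MAX_ELE];
};

/*
 * Copy one element from src into the ring.
 * max_len is the number of valid bytes the caller guarantees behind src.
 * Returns UINT32_MAX if max_len is smaller than one element (or ele_size
 * is unusable), 0 if the ring is full, and ele_size on success.
 */
static uint32_t demo_sbuf_put(struct demo_sbuf *sbuf, const uint8_t *src, uint32_t max_len)
{
	uint32_t ret;

	if ((sbuf->ele_size == 0U) || (sbuf->ele_size > DEMO_SBUF_MAX_ELE) ||
	    (max_len < sbuf->ele_size)) {
		/* Never read past the caller's buffer or write past our own. */
		ret = UINT32_MAX;
	} else if (sbuf->count == DEMO_SBUF_ELE_NUM) {
		ret = 0U;
	} else {
		(void)memcpy(&sbuf->data[sbuf->tail * sbuf->ele_size], src, sbuf->ele_size);
		sbuf->tail = (sbuf->tail + 1U) % DEMO_SBUF_ELE_NUM;
		sbuf->count++;
		ret = sbuf->ele_size;
	}

	return ret;
}
```

The point of the extra parameter is that the copy length comes from `ele_size`, which the hypervisor does not control, so the caller has to state how many bytes it can actually supply.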

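
Commit 629808c767 (debug: vuart: fix interrupt ID for data receiving) states which interrupt ID the vUART should report for pending receive data. A rough sketch of that rule is shown below. The register bit values are the standard 16550 ones; the function name `demo_vuart_rx_intr_id()` and its reduced scope (receive path only, ignoring line status and transmitter interrupts) are assumptions for illustration and do not mirror the hypervisor's actual function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard 16550 UART bit definitions. */
#define IER_ERXRDY  0x01U  /* IER: Received Data Available interrupt enable */
#define IIR_NOPEND  0x01U  /* IIR: no interrupt pending */
#define IIR_RXRDY   0x04U  /* IIR: Received Data Available */
#define IIR_RXTOUT  0x0CU  /* IIR: character timeout (not reported for this case) */

/*
 * Hypothetical helper: pick the IIR value for the receive path only.
 * With data in the RX FIFO and the Received Data Available interrupt
 * enabled, report IIR_RXRDY rather than IIR_RXTOUT.
 */
static uint8_t demo_vuart_rx_intr_id(uint8_t ier, bool rx_fifo_empty)
{
	uint8_t iir = IIR_NOPEND;

	if (!rx_fifo_empty && ((ier & IER_ERXRDY) != 0U)) {
		iir = IIR_RXRDY;
	}

	return iir;
}
```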

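
Commit 46a860bf04 (hv: fix using cpuid does not clear the upper 32-bit registers) concerns stale upper halves of the guest's 64-bit registers after an emulated CPUID. On real hardware the instruction clears the upper 32 bits of RAX/RBX/RCX/RDX, so an emulation would be expected to do the same. The sketch below shows one way to express that; `struct demo_guest_regs` and `demo_set_cpuid_result()` are hypothetical names, not the hypervisor's own.

```c
#include <stdint.h>

/* Hypothetical container for the four guest registers CPUID writes. */
struct demo_guest_regs {
	uint64_t rax;
	uint64_t rbx;
	uint64_t rcx;
	uint64_t rdx;
};

/*
 * Store emulated CPUID results into the guest's 64-bit registers.
 * Assigning the zero-extended 32-bit value to the full register
 * discards whatever the guest previously had in bits 63:32.
 */
static void demo_set_cpuid_result(struct demo_guest_regs *regs,
		uint32_t eax, uint32_t ebx, uint32_t ecx, uint32_t edx)
{
	regs->rax = (uint64_t)eax;
	regs->rbx = (uint64_t)ebx;
	regs->rcx = (uint64_t)ecx;
	regs->rdx = (uint64_t)edx;
}
```

Writing only the low half (for example through a 32-bit pointer into the register save area) would leave the old upper bits in place, which is the behavior the commit message describes as affecting the guest.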
3573 Commits