Discussion:
[Resend PATCH V5 0/10] x86/KVM/Hyper-v: Add HV ept tlb range flush hypercall support in KVM
l***@gmail.com
2018-12-06 13:21:03 UTC
Permalink
From: Lan Tianyu <***@microsoft.com>

For nested memory virtualization, Hyper-v doesn't set write-protect
L1 hypervisor EPT page directory and page table node to track changes
while it relies on guest to tell it changes via HvFlushGuestAddressLlist
hypercall. HvFlushGuestAddressLlist hypercall provides a way to flush
EPT page table with ranges which are specified by L1 hypervisor.

If L1 hypervisor uses INVEPT or HvFlushGuestAddressSpace hypercall to
flush EPT tlb, Hyper-V will invalidate associated EPT shadow page table
and sync L1's EPT table when next EPT page fault is triggered.
HvFlushGuestAddressLlist hypercall helps to avoid such redundant EPT
page fault and synchronization of shadow page table.

This patchset is based on the Patch "KVM/VMX: Check ept_pointer before
flushing ept tlb"(https://marc.info/?l=kvm&m=154408169705686&w=2).

Change since v4:
1) Split flush address and flush list patches. This patchset only contains
flush address patches. Will post flush list patches later.
2) Expose function hyperv_fill_flush_guest_mapping_list()
out of hyperv file
3) Adjust parameter of hyperv_flush_guest_mapping_range()
4) Reorder patchset and move Hyper-V and VMX changes ahead.

Change since v3:
1) Remove code of updating "tlbs_dirty" in kvm_flush_remote_tlbs_with_range()
2) Remove directly tlb flush in the kvm_handle_hva_range()
3) Move tlb flush in kvm_set_pte_rmapp() to kvm_mmu_notifier_change_pte()
4) Combine Vitaly's "don't pass EPT configuration info to
vmx_hv_remote_flush_tlb()" fix

Change since v2:
1) Fix comment in the kvm_flush_remote_tlbs_with_range()
2) Move HV_MAX_FLUSH_PAGES and HV_MAX_FLUSH_REP_COUNT to
hyperv-tlfs.h.
3) Calculate HV_MAX_FLUSH_REP_COUNT in the macro definition
4) Use HV_MAX_FLUSH_REP_COUNT to define length of gpa_list in
struct hv_guest_mapping_flush_list.

Change since v1:
1) Convert "end_gfn" of struct kvm_tlb_range to "pages" in order
to avoid confusion as to whether "end_gfn" is inclusive or exlusive.
2) Add hyperv tlb range struct and replace kvm tlb range struct
with new struct in order to avoid using kvm struct in the hyperv
code directly.



Lan Tianyu (10):
KVM: Add tlb_remote_flush_with_range callback in kvm_x86_ops
x86/hyper-v: Add HvFlushGuestAddressList hypercall support
x86/Hyper-v: Add trace in the
hyperv_nested_flush_guest_mapping_range()
KVM/VMX: Add hv tlb range flush support
KVM/MMU: Add tlb flush with range helper function
KVM: Replace old tlb flush function with new one to flush a specified
range.
KVM: Make kvm_set_spte_hva() return int
KVM/MMU: Move tlb flush in kvm_set_pte_rmapp() to
kvm_mmu_notifier_change_pte()
KVM/MMU: Flush tlb directly in the kvm_set_pte_rmapp()
KVM/MMU: Flush tlb directly in the kvm_zap_gfn_range()

arch/arm/include/asm/kvm_host.h | 2 +-
arch/arm64/include/asm/kvm_host.h | 2 +-
arch/mips/include/asm/kvm_host.h | 2 +-
arch/mips/kvm/mmu.c | 3 +-
arch/powerpc/include/asm/kvm_host.h | 2 +-
arch/powerpc/kvm/book3s.c | 3 +-
arch/powerpc/kvm/e500_mmu_host.c | 3 +-
arch/x86/hyperv/nested.c | 80 +++++++++++++++++++++++++++++++
arch/x86/include/asm/hyperv-tlfs.h | 32 +++++++++++++
arch/x86/include/asm/kvm_host.h | 9 +++-
arch/x86/include/asm/mshyperv.h | 15 ++++++
arch/x86/include/asm/trace/hyperv.h | 14 ++++++
arch/x86/kvm/mmu.c | 96 +++++++++++++++++++++++++++++--------
arch/x86/kvm/paging_tmpl.h | 3 +-
arch/x86/kvm/vmx.c | 63 +++++++++++++++++-------
virt/kvm/arm/mmu.c | 6 ++-
virt/kvm/kvm_main.c | 5 +-
17 files changed, 292 insertions(+), 48 deletions(-)
--
2.14.4
l***@gmail.com
2018-12-06 13:21:04 UTC
Permalink
From: Lan Tianyu <***@microsoft.com>

Add flush range call back in the kvm_x86_ops and platform can use it
to register its associated function. The parameter "kvm_tlb_range"
accepts a single range and flush list which contains a list of ranges.

Signed-off-by: Lan Tianyu <***@microsoft.com>
---
Change since v1:
Change "end_gfn" to "pages" to aviod confusion as to whether
"end_gfn" is inclusive or exlusive.
---
arch/x86/include/asm/kvm_host.h | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index fbda5a917c5b..fc7513ecfc13 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -439,6 +439,11 @@ struct kvm_mmu {
u64 pdptrs[4]; /* pae */
};

+struct kvm_tlb_range {
+ u64 start_gfn;
+ u64 pages;
+};
+
enum pmc_type {
KVM_PMC_GP = 0,
KVM_PMC_FIXED,
@@ -1042,6 +1047,8 @@ struct kvm_x86_ops {

void (*tlb_flush)(struct kvm_vcpu *vcpu, bool invalidate_gpa);
int (*tlb_remote_flush)(struct kvm *kvm);
+ int (*tlb_remote_flush_with_range)(struct kvm *kvm,
+ struct kvm_tlb_range *range);

/*
* Flush any TLB entries associated with the given GVA.
--
2.14.4
l***@gmail.com
2018-12-06 13:21:05 UTC
Permalink
From: Lan Tianyu <***@microsoft.com>

Hyper-V provides HvFlushGuestAddressList() hypercall to flush EPT tlb
with specified ranges. This patch is to add the hypercall support.

Reviewed-by: Michael Kelley <***@microsoft.com>
Signed-off-by: Lan Tianyu <***@microsoft.com>
---
Change sincd v4:
- Expose function hyperv_fill_flush_guest_mapping_list()
out of hyperv file
- Adjust parameter of hyperv_flush_guest_mapping_range()

Change since v2:
Fix some coding style issues
- Move HV_MAX_FLUSH_PAGES and HV_MAX_FLUSH_REP_COUNT to
hyperv-tlfs.h.
- Calculate HV_MAX_FLUSH_REP_COUNT in the macro definition
- Use HV_MAX_FLUSH_REP_COUNT to define length of gpa_list in
struct hv_guest_mapping_flush_list.

Change since v1:
Add hyperv tlb flush struct to avoid use kvm tlb flush struct
in the hyperv file.
---
arch/x86/hyperv/nested.c | 79 ++++++++++++++++++++++++++++++++++++++
arch/x86/include/asm/hyperv-tlfs.h | 32 +++++++++++++++
arch/x86/include/asm/mshyperv.h | 15 ++++++++
3 files changed, 126 insertions(+)

diff --git a/arch/x86/hyperv/nested.c b/arch/x86/hyperv/nested.c
index b8e60cc50461..3d0f31e46954 100644
--- a/arch/x86/hyperv/nested.c
+++ b/arch/x86/hyperv/nested.c
@@ -7,6 +7,7 @@
*
* Author : Lan Tianyu <***@microsoft.com>
*/
+#define pr_fmt(fmt) "Hyper-V: " fmt


#include <linux/types.h>
@@ -54,3 +55,81 @@ int hyperv_flush_guest_mapping(u64 as)
return ret;
}
EXPORT_SYMBOL_GPL(hyperv_flush_guest_mapping);
+
+int hyperv_fill_flush_guest_mapping_list(
+ struct hv_guest_mapping_flush_list *flush,
+ u64 start_gfn, u64 pages)
+{
+ u64 cur = start_gfn;
+ u64 additional_pages;
+ int gpa_n = 0;
+
+ do {
+ /*
+ * If flush requests exceed max flush count, go back to
+ * flush tlbs without range.
+ */
+ if (gpa_n >= HV_MAX_FLUSH_REP_COUNT)
+ return -ENOSPC;
+
+ additional_pages = min_t(u64, pages, HV_MAX_FLUSH_PAGES) - 1;
+
+ flush->gpa_list[gpa_n].page.additional_pages = additional_pages;
+ flush->gpa_list[gpa_n].page.largepage = false;
+ flush->gpa_list[gpa_n].page.basepfn = cur;
+
+ pages -= additional_pages + 1;
+ cur += additional_pages + 1;
+ gpa_n++;
+ } while (pages > 0);
+
+ return gpa_n;
+}
+EXPORT_SYMBOL_GPL(hyperv_fill_flush_guest_mapping_list);
+
+int hyperv_flush_guest_mapping_range(u64 as,
+ hyperv_fill_flush_list_func fill_flush_list_func, void *data)
+{
+ struct hv_guest_mapping_flush_list **flush_pcpu;
+ struct hv_guest_mapping_flush_list *flush;
+ u64 status = 0;
+ unsigned long flags;
+ int ret = -ENOTSUPP;
+ int gpa_n = 0;
+
+ if (!hv_hypercall_pg || !fill_flush_list_func)
+ goto fault;
+
+ local_irq_save(flags);
+
+ flush_pcpu = (struct hv_guest_mapping_flush_list **)
+ this_cpu_ptr(hyperv_pcpu_input_arg);
+
+ flush = *flush_pcpu;
+ if (unlikely(!flush)) {
+ local_irq_restore(flags);
+ goto fault;
+ }
+
+ flush->address_space = as;
+ flush->flags = 0;
+
+ gpa_n = fill_flush_list_func(flush, data);
+ if (gpa_n < 0) {
+ local_irq_restore(flags);
+ goto fault;
+ }
+
+ status = hv_do_rep_hypercall(HVCALL_FLUSH_GUEST_PHYSICAL_ADDRESS_LIST,
+ gpa_n, 0, flush, NULL);
+
+ local_irq_restore(flags);
+
+ if (!(status & HV_HYPERCALL_RESULT_MASK))
+ ret = 0;
+ else
+ ret = status;
+fault:
+ return ret;
+}
+EXPORT_SYMBOL_GPL(hyperv_flush_guest_mapping_range);
diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index 4139f7650fe5..405a378e1c62 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -10,6 +10,7 @@
#define _ASM_X86_HYPERV_TLFS_H

#include <linux/types.h>
+#include <asm/page.h>

/*
* The below CPUID leaves are present if VersionAndFeatures.HypervisorPresent
@@ -358,6 +359,7 @@ struct hv_tsc_emulation_status {
#define HVCALL_POST_MESSAGE 0x005c
#define HVCALL_SIGNAL_EVENT 0x005d
#define HVCALL_FLUSH_GUEST_PHYSICAL_ADDRESS_SPACE 0x00af
+#define HVCALL_FLUSH_GUEST_PHYSICAL_ADDRESS_LIST 0x00b0

#define HV_X64_MSR_VP_ASSIST_PAGE_ENABLE 0x00000001
#define HV_X64_MSR_VP_ASSIST_PAGE_ADDRESS_SHIFT 12
@@ -757,6 +759,36 @@ struct hv_guest_mapping_flush {
u64 flags;
};

+/*
+ * HV_MAX_FLUSH_PAGES = "additional_pages" + 1. It's limited
+ * by the bitwidth of "additional_pages" in union hv_gpa_page_range.
+ */
+#define HV_MAX_FLUSH_PAGES (2048)
+
+/* HvFlushGuestPhysicalAddressList hypercall */
+union hv_gpa_page_range {
+ u64 address_space;
+ struct {
+ u64 additional_pages:11;
+ u64 largepage:1;
+ u64 basepfn:52;
+ } page;
+};
+
+/*
+ * All input flush parameters should be in single page. The max flush
+ * count is equal with how many entries of union hv_gpa_page_range can
+ * be populated into the input parameter page.
+ */
+#define HV_MAX_FLUSH_REP_COUNT (PAGE_SIZE - 2 * sizeof(u64) / \
+ sizeof(union hv_gpa_page_range))
+
+struct hv_guest_mapping_flush_list {
+ u64 address_space;
+ u64 flags;
+ union hv_gpa_page_range gpa_list[HV_MAX_FLUSH_REP_COUNT];
+};
+
/* HvFlushVirtualAddressSpace, HvFlushVirtualAddressList hypercalls */
struct hv_tlb_flush {
u64 address_space;
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 1d0a7778e163..7da712f65398 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -22,6 +22,11 @@ struct ms_hyperv_info {

extern struct ms_hyperv_info ms_hyperv;

+
+typedef int (*hyperv_fill_flush_list_func)(
+ struct hv_guest_mapping_flush_list *flush,
+ void *data);
+
/*
* Generate the guest ID.
*/
@@ -348,6 +353,11 @@ void set_hv_tscchange_cb(void (*cb)(void));
void clear_hv_tscchange_cb(void);
void hyperv_stop_tsc_emulation(void);
int hyperv_flush_guest_mapping(u64 as);
+int hyperv_flush_guest_mapping_range(u64 as,
+ hyperv_fill_flush_list_func fill_func, void *data);
+int hyperv_fill_flush_guest_mapping_list(
+ struct hv_guest_mapping_flush_list *flush,
+ u64 start_gfn, u64 end_gfn);

#ifdef CONFIG_X86_64
void hv_apic_init(void);
@@ -370,6 +380,11 @@ static inline struct hv_vp_assist_page *hv_get_vp_assist_page(unsigned int cpu)
return NULL;
}
static inline int hyperv_flush_guest_mapping(u64 as) { return -1; }
+static inline int hyperv_flush_guest_mapping_range(u64 as,
+ hyperv_fill_flush_list_func fill_func, void *data);
+{
+ return -1;
+}
#endif /* CONFIG_HYPERV */

#ifdef CONFIG_HYPERV_TSCPAGE
--
2.14.4
l***@gmail.com
2018-12-06 13:21:06 UTC
Permalink
From: Lan Tianyu <***@microsoft.com>

This patch is to trace log in the hyperv_nested_flush_
guest_mapping_range().

Signed-off-by: Lan Tianyu <***@microsoft.com>
---
arch/x86/hyperv/nested.c | 1 +
arch/x86/include/asm/trace/hyperv.h | 14 ++++++++++++++
2 files changed, 15 insertions(+)

diff --git a/arch/x86/hyperv/nested.c b/arch/x86/hyperv/nested.c
index 3d0f31e46954..dd0a843f766d 100644
--- a/arch/x86/hyperv/nested.c
+++ b/arch/x86/hyperv/nested.c
@@ -130,6 +130,7 @@ int hyperv_flush_guest_mapping_range(u64 as,
else
ret = status;
fault:
+ trace_hyperv_nested_flush_guest_mapping_range(as, ret);
return ret;
}
EXPORT_SYMBOL_GPL(hyperv_flush_guest_mapping_range);
diff --git a/arch/x86/include/asm/trace/hyperv.h b/arch/x86/include/asm/trace/hyperv.h
index 2e6245a023ef..ace464f09681 100644
--- a/arch/x86/include/asm/trace/hyperv.h
+++ b/arch/x86/include/asm/trace/hyperv.h
@@ -42,6 +42,20 @@ TRACE_EVENT(hyperv_nested_flush_guest_mapping,
TP_printk("address space %llx ret %d", __entry->as, __entry->ret)
);

+TRACE_EVENT(hyperv_nested_flush_guest_mapping_range,
+ TP_PROTO(u64 as, int ret),
+ TP_ARGS(as, ret),
+
+ TP_STRUCT__entry(
+ __field(u64, as)
+ __field(int, ret)
+ ),
+ TP_fast_assign(__entry->as = as;
+ __entry->ret = ret;
+ ),
+ TP_printk("address space %llx ret %d", __entry->as, __entry->ret)
+ );
+
TRACE_EVENT(hyperv_send_ipi_mask,
TP_PROTO(const struct cpumask *cpus,
int vector),
--
2.14.4
Loading...