
Commit 4ceb5db

Author: Linus Torvalds
Fix get_user_pages() race for write access

There's no real guarantee that handle_mm_fault() will always be able to break a COW situation - if an update from another thread ends up modifying the page table some way, handle_mm_fault() may end up requiring us to re-try the operation.

That's normally fine, but get_user_pages() ended up re-trying it as a read, and thus a write access could in theory end up losing the dirty bit or be done on a page that had not been properly COW'ed.

This makes get_user_pages() always retry write accesses as write accesses by making "follow_page()" require that a writable follow has the dirty bit set. That simplifies the code and solves the race: if the COW break fails for some reason, we'll just loop around and try again.

Signed-off-by: Linus Torvalds <[email protected]>

1 parent 8d894c4 commit 4ceb5db

1 file changed

Lines changed: 4 additions & 17 deletions

mm/memory.c

@@ -811,18 +811,15 @@ static struct page *__follow_page(struct mm_struct *mm, unsigned long address,
 	pte = *ptep;
 	pte_unmap(ptep);
 	if (pte_present(pte)) {
-		if (write && !pte_write(pte))
+		if (write && !pte_dirty(pte))
 			goto out;
 		if (read && !pte_read(pte))
 			goto out;
 		pfn = pte_pfn(pte);
 		if (pfn_valid(pfn)) {
 			page = pfn_to_page(pfn);
-			if (accessed) {
-				if (write && !pte_dirty(pte) &&!PageDirty(page))
-					set_page_dirty(page);
+			if (accessed)
 				mark_page_accessed(page);
-			}
 			return page;
 		}
 	}
@@ -941,19 +938,17 @@ int get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
 	spin_lock(&mm->page_table_lock);
 	do {
 		struct page *page;
-		int lookup_write = write;
 
 		cond_resched_lock(&mm->page_table_lock);
-		while (!(page = follow_page(mm, start, lookup_write))) {
+		while (!(page = follow_page(mm, start, write))) {
 			/*
 			 * Shortcut for anonymous pages. We don't want
 			 * to force the creation of pages tables for
 			 * insanely big anonymously mapped areas that
 			 * nobody touched so far. This is important
 			 * for doing a core dump for these mappings.
 			 */
-			if (!lookup_write &&
-			    untouched_anonymous_page(mm,vma,start)) {
+			if (!write && untouched_anonymous_page(mm,vma,start)) {
 				page = ZERO_PAGE(start);
 				break;
 			}
@@ -972,14 +967,6 @@ int get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
 			default:
 				BUG();
 			}
-			/*
-			 * Now that we have performed a write fault
-			 * and surely no longer have a shared page we
-			 * shouldn't write, we shouldn't ignore an
-			 * unwritable page in the page table if
-			 * we are forcing write access.
-			 */
-			lookup_write = write && !force;
 			spin_lock(&mm->page_table_lock);
 		}
 		if (pages) {
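
The race described in the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: `struct toy_pte`, `toy_handle_fault()`, the `races` counter, and every other `toy_` name are hypothetical stand-ins invented for illustration. The "old" variant models only the forced-write path, where the removed line `lookup_write = write && !force` turned the retry into a read lookup.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a page table entry (hypothetical, not the kernel's pte_t). */
struct toy_pte {
    bool present;
    bool writable;
    bool dirty;
    bool cow_shared;    /* still points at the shared, pre-COW page */
};

/* Old check: a write lookup only required a writable entry. */
static bool toy_follow_page_old(const struct toy_pte *pte, bool write)
{
    return pte->present && !(write && !pte->writable);
}

/* Patched check: a write lookup requires the dirty bit. */
static bool toy_follow_page_new(const struct toy_pte *pte, bool write)
{
    return pte->present && !(write && !pte->dirty);
}

/*
 * Hypothetical fault handler. While *races > 0 it pretends another thread
 * changed the page table, so the COW break did not happen on this call.
 */
static void toy_handle_fault(struct toy_pte *pte, bool write, int *races)
{
    pte->present = true;
    if (!write)
        return;
    if (*races > 0) {
        (*races)--;             /* lost the race: caller must retry */
        return;
    }
    pte->cow_shared = false;    /* COW broken: private copy installed */
    pte->writable = true;
    pte->dirty = true;
}

/* Old loop, forced-write case: after one fault the retry is a read lookup. */
static int toy_get_page_old_forced(struct toy_pte *pte, bool write, int races)
{
    bool lookup_write = write;
    int faults = 0;

    while (!toy_follow_page_old(pte, lookup_write)) {
        toy_handle_fault(pte, write, &races);
        faults++;
        lookup_write = false;   /* models write && !force with force set */
    }
    return faults;
}

/* Patched loop: every retry uses the caller's original write flag. */
static int toy_get_page_fixed(struct toy_pte *pte, bool write, int races)
{
    int faults = 0;

    while (!toy_follow_page_new(pte, write)) {
        toy_handle_fault(pte, write, &races);
        faults++;
    }
    assert(!write || pte->dirty);   /* a write lookup never returns a clean entry */
    return faults;
}
```

Starting from a present, shared, clean entry with one simulated lost race, the old forced loop in this model returns after a single fault with `cow_shared` still set, while the patched loop takes a second fault and returns only once the entry is private and dirty.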
