Ayende @ Rahien

Oren Eini commented on Challenge: Giving file system developer ulcer

Tue, 04 Feb 2025 09:57:18 GMT

Kuba, yes, the second one should have the `FILE_FLAG_NO_BUFFERING` flag. The OS does make the effects of unbuffered writes visible, but not immediately, so you can "see" the holes. I got burned by that a long while ago, see: https://ayende.com/blog/164577/is-select-broken-memory-mapped-files-with-unbufferred-writes-race-condition

Note that I don't actually care about file metadata: I ensure that the writes won't change the file size (the only metadata I care about). And on NTFS, changes to file system structures are journaled.

Kuba commented on Challenge: Giving file system developer ulcer

Mon, 03 Feb 2025 13:52:58 GMT

I think you've missed the `FILE_FLAG_NO_BUFFERING` in the second `CreateFile`. The documentation of `MapViewOfFile` calls out a single exception to its coherence guarantee. From: https://learn.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-mapviewoffile

> With one important exception, file views derived from any file mapping object that is backed by the same file are coherent or identical at a specific time. Coherency is guaranteed for views within a process and for views that are mapped by different processes.
>
> The exception is related to remote files. Although MapViewOfFile works with remote files, it does not keep them coherent. For example, if two computers both map a file as writable, and both change the same page, each computer only sees its own writes to the page. When the data gets updated on the disk, it is not merged.

The OS will make the effects of unbuffered `WriteFile` visible to the memory mapped file. It also seems that you still won't be able to avoid calling `FlushFileBuffers`. From: https://learn.microsoft.com/en-us/windows/win32/fileio/file-caching

> When caching is disabled, all read and write operations directly access the physical disk. However, the file metadata may still be cached. To flush the metadata to disk, use the FlushFileBuffers function.

Oren Eini commented on NTFS has an emergency stash of disk space

Mon, 03 Feb 2025 09:13:53 GMT

Kuba, That is a very interesting find, thank you. I'm really happy that it _is_ the case, there is an emergency stash :-)

Kuba commented on NTFS has an emergency stash of disk space

Fri, 31 Jan 2025 12:50:18 GMT

From: https://learn.microsoft.com/en-us/previous-versions/windows/it-pro/windows-server-2003/cc781134(v=ws.10)

> To prevent the MFT from becoming fragmented, NTFS reserves 12.5 percent of volume by default for exclusive use of the MFT. This space, known as the MFT zone, is not used to store data unless the remainder of the volume becomes full.

Oren Eini commented on Answer: What does this code do?

Thu, 23 Jan 2025 08:48:03 GMT

Dmitry, yes, but this is _known_ to be a 64-bit system, since we are allocating 3GB (not possible in 32 bits). Note that I'm not freeing the buffer since I close the process, so the OS will handle that.

Dmitry commented on Answer: What does this code do?

Thu, 23 Jan 2025 05:55:57 GMT

That's why it is a good idea to compare the return value of `write` to the buffer size. `size_t` can be 32-bit on some systems, which can be checked with `sizeof(size_t)`. Also, the buffer is not freed when the returned value is an error number.
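A minimal sketch of the check described above, assuming a POSIX system. `write` may accept fewer bytes than requested, so the caller compares the return value with what is still pending and loops until everything has been written. `write_all` is a hypothetical helper name, not something from the post or from the C standard library.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until the whole buffer has
 * been accepted or a real error occurs.
 * Returns 0 on success, -1 on failure (errno describes the failure). */
int write_all(int fd, const void *buf, size_t len)
{
    const unsigned char *p = buf;

    assert(buf != NULL || len == 0);
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;      /* interrupted before writing anything: retry */
            return -1;         /* genuine error, errno set by write() */
        }
        if (n == 0) {
            errno = EIO;       /* no progress: bail out instead of spinning */
            return -1;
        }
        p += (size_t)n;        /* short write: skip what was already accepted */
        len -= (size_t)n;
    }
    return 0;
}
```

A caller would use `write_all(fd, buffer, size)` in place of a bare `write` and treat any non-zero result as a failure, releasing the buffer on both paths.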

Pyth0n commented on Challenge: What does this code do?

Wed, 22 Jan 2025 07:44:35 GMT

There's a reason we miss `write_exact` and `read_exact` from the standard library. In fact, I am surprised the chunk is actually *that* big...

Mo Vakili commented on The memory leak in ConcurrentQueue

Fri, 07 Feb 2025 10:32:10 GMT

Thanks for the detailed post! I was a bit confused because the code sample doesn’t directly show multiple threads calling FlushUntil, and the article mentions TryTake whereas the code uses TryDequeue. Now I see that the underlying reason for the memory leak is how ConcurrentQueue handles TryPeek together.

Oren Eini commented on Performance discovery: IOPS vs. IOPS

Mon, 13 Jan 2025 10:50:42 GMT

kpvleeuwen, I agree, unfortunately I don't have anyone to ask off the top of my head. Thanks, I fixed the headers.

kpvleeuwen commented on Performance discovery: IOPS vs. IOPS

Mon, 13 Jan 2025 09:03:38 GMT

It could be interesting to reach out to storage firmware developers and ask what write patterns they think would be useful experiments. Btw, table headers are misaligned on all the tables where the top right column has no header.

Oren Eini commented on Aggregating trees with RavenDB

Sun, 12 Jan 2025 10:35:19 GMT

Nicholas, you _can_ use the spread operator in RavenDB, which is what I assume you meant, like this: `let results = [{ Scope: id(doc), ...hours }];` The fault lies with me, since I'm used to old school ways :-)

Nicholas Paldino commented on Aggregating trees with RavenDB

Tue, 07 Jan 2025 20:05:48 GMT

Unrelated, but the `Object.assign` call can be very annoying. Any chance that the JS engine was updated to V8 and we can have support of current JS?

Dennis commented on Sometimes it's the hardware

Tue, 31 Dec 2024 20:30:08 GMT

Gotta love planned obsolescence.

Stephen Cleary commented on Isn't it ironic: Money isn't transactional

Tue, 24 Dec 2024 14:45:06 GMT

If you're interested in the financial side of technology, I can *highly* recommend Bits About Money. I've been on his mailing list for about a year and have greatly enjoyed it! E.g., [The Long Shadow of Checks](https://www.bitsaboutmoney.com/archive/the-long-shadow-of-checks/)

Dennis commented on Fun with bugs: Advanced Dictionary API

Mon, 02 Dec 2024 00:22:04 GMT

And thus you have figured out why those optimization accessors are in CollectionMarshal and not native to the Dictionary class.

Andrew J Said commented on Fun with bugs: Advanced Dictionary API

Fri, 15 Nov 2024 10:09:44 GMT

Interesting post. This bug must have been really tough to spot if it was in a much more complex method obfuscated by many other things going on! In this case I'd guess the solution would be to set the capacity to 32 ahead of time, or to just use the regular indexer, depending on the use-case? On a separate note, a while back I provided what I think is an elegant solution to "Challenge: Efficient snapshotable state" in the comments of that post. I am not sure if you had seen it but if not I'd appreciate your thoughts. Thanks either way.

Erik commented on Querying over the current time in RavenDB

Wed, 23 Oct 2024 11:55:45 GMT

> That works beautifully, of course, until the next day. What happens then? Well, we’ll need to schedule an update to the config/current-date document to correct the date.
>
> The downside is that we need to set up a cron job to make it happen, but that isn’t too big a task, I think.

Maybe it's just me, but this still sounds tricky when working on a SaaS product that has customers all round the globe in different time zones.

Itay Sagui commented on Querying over the current time in RavenDB

Wed, 23 Oct 2024 06:48:44 GMT

Why not provide the auto-updating document feature as part of the engine? Either having a meta document, or having some internal scheduler that updates a subset of documents (at a low enough frequency) can solve this without putting the burden on the developers.

peter commented on Querying over the current time in RavenDB

Mon, 14 Oct 2024 13:08:24 GMT

Huh, very interesting rule to disallow expensive queries. I like that philosophy. This type of query is not unusual for us in our RDBMS, though we would limit it only to the relevant subset using an index. Nevertheless, the query would still require hitting every record, which, yes, is very expensive. We just accept living with long-running queries.

Garcha Sprgchma commented on Querying over the current time in RavenDB

Sat, 12 Oct 2024 03:38:33 GMT

Is there a more efficient way to handle such a scenario? For example, create two indexes:

- an index for deceased people ordered by calculated age
- an index for living-and-kicking people ordered by birth year

Then, for a sorted query, the engine does a merge-join between the two indexes. But before each comparison in the merge-join, the engine transforms the second index's value with `current_year-birth_year`. Such a modification is possible because it does not change the order of the index. Does this make sense?
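A rough sketch of the merge step proposed above, assuming both indexes are plain sorted arrays. One detail worth noting: `current_year - birth_year` is monotonic but decreasing, so the birth-year index has to be walked from the latest year backward for the ages to come out ascending. `merge_ages` and its parameters are hypothetical names for illustration, not RavenDB APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Merge two pre-sorted inputs into `out` as ages in ascending order.
 *   deceased_age:      stored ages, sorted ascending
 *   living_birth_year: birth years, sorted DESCENDING (youngest first),
 *                      so the computed ages are ascending as well
 * `out` must have room for n_dead + n_living entries.
 * Returns the number of entries written. */
size_t merge_ages(const int *deceased_age, size_t n_dead,
                  const int *living_birth_year, size_t n_living,
                  int current_year, int *out)
{
    size_t i = 0, j = 0, k = 0;

    assert(out != NULL);
    while (i < n_dead && j < n_living) {
        /* Transform the second index's value only at comparison time. */
        int living_age = current_year - living_birth_year[j];
        if (deceased_age[i] <= living_age) {
            out[k++] = deceased_age[i++];
        } else {
            out[k++] = living_age;
            j++;
        }
    }
    while (i < n_dead)
        out[k++] = deceased_age[i++];
    while (j < n_living)
        out[k++] = current_year - living_birth_year[j++];
    return k;
}
```

Nothing stored in either index changes when the year rolls over; only the `current_year` argument does.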

Adam commented on Debugging the Linux kernel using awesome psychic powers

Sun, 22 Sep 2024 08:35:18 GMT

```
int fd = open("test.file", O_CREAT | O_WRONLY, 0644);
lseek(fd, 128 * 1024 * 1024 - 1, SEEK_SET); // 128MB file
write(fd, "", 1);
// fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, 16 * 1024 * 1024, 32 * 1024 * 1024);
close(fd);
```

This creates a sparse file, where all your data is already a hole except the very last page. `fallocate()` would have done nothing.

`ls -lsh test.file`

> 4,0K -rw-r--r-- 1 adam adam 128M sep 22 01:17 test.file

File uses 4,0K but has a size of 128M.

I don't think all filesystems support hole punching, so you might want to check that on each CI agent.
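The same observation can be made from code rather than from `ls`. The sketch below assumes Linux, where `st_blocks` is counted in 512-byte units, and recreates the file as in the snippet above before comparing its apparent size with the space actually allocated. `make_sparse_file` and `file_usage` are hypothetical helper names, and the `O_TRUNC` flag is an addition so that reruns start from an empty file.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create `path` and extend it to `size` bytes by writing a single byte
 * at the last offset. Returns 0 on success, -1 on failure. */
int make_sparse_file(const char *path, off_t size)
{
    assert(size > 0);
    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    int ok = lseek(fd, size - 1, SEEK_SET) == size - 1
          && write(fd, "", 1) == 1;
    if (close(fd) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Report the apparent size and the bytes actually allocated on disk.
 * Assumes st_blocks uses 512-byte units, as it does on Linux. */
int file_usage(const char *path, off_t *apparent, off_t *allocated)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    *apparent = st.st_size;
    *allocated = (off_t)st.st_blocks * 512;
    return 0;
}
```

On a filesystem with sparse-file support, `allocated` should come back far smaller than `apparent`. If the two are close, the filesystem on that agent probably stores the file densely and hole punching is unlikely to help there.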

Judah Gabriel Himango commented on Seeing the results of Corax in production

Mon, 09 Sep 2024 16:48:12 GMT

This looks fantastic. Is it possible to migrate a Lucene index to Corax? I didn't see it on first glance in the Studio.

Gomez12 commented on Seeing the results of Corax in production

Sun, 08 Sep 2024 21:26:13 GMT

Do you know if Lucene has had any great performance updates? RavenDB uses 3.0, which is about 13 years old, while they are currently on version 9. This is not meant to dismiss your achievements with Corax, as Lucene is still considered the de facto standard. It is just that people usually refer to Java Lucene; .NET Lucene died a long time ago.

Paul Hatcher commented on Caching documents in RavenDB: The good, the bad and the ugly

Mon, 26 Aug 2024 12:22:40 GMT

I prefer the other version of the quote: "There are only two hard problems in computer science: cache invalidation, naming things and off-by-one errors", Leon Bambrick. Regards, Paul

Ian Cross commented on Caching documents in RavenDB: The good, the bad and the ugly

Thu, 15 Aug 2024 14:14:16 GMT

Hi Oren,

Thanks for the article. Our expectation is that RavenDB invalidates the Aggressive Cache only for the one item that changes. We use Aggressive Cache for metadata that hardly ever changes for a given tenant (categories, currencies, drop-down lists, custom field definitions, etc.).

The thing we're looking for could just work for `Load<>` without anything fancy. No need for queries. The server would keep track of the Ids that we load in this special way, and tell the client(s) if those items change (invalidating just that item) so that the next time we ask for it, the client goes to the server to get that item.

An important point to note is that this needs to work in a load-balanced way. Multiple application servers will be watching the same metadata document in the database and they all need to be told if it's changed. Our metadata rarely changes (most of it never changes once configured) but we have 5000 different metadata documents across 50+ collections, so it's not just a matter of enabling a Data Subscription for one collection.

For the short term, we are resorting to the 'DoNotTrackChanges' option and then clearing the RavenDB cache when we know any item of metadata has changed. However, this is a sledgehammer because it clears the entire cache. We will also need to use the ChangesAPI to 'broadcast' the change to the metadata item to other application servers. In the short term it would be great to be able to clear just one item from the cache rather than the entire cache.

Cheers, Ian

kvleeuwen commented on Optimizing old code: StreamBitArray refactoring

Thu, 29 Aug 2024 07:10:02 GMT

I don't see the new FirstSetBit C# code, just the old C# and new assembly?