Ayende @ Rahienhttps://www.ayende.com/blog/Ayende @ RahienCopyright (C) Ayende Rahien 2004 - 2021 (c) 202460RavenDB and Two Factor Authentication<p style="text-align:left;">RavenDB is typically accessed directly by your application, using an X509 certificate for authentication. The same applies when you are connecting to RavenDB as a user. </p><p style="text-align:left;">Many organizations require that user authentication rely on multiple factors rather than a single one (such as a password or a certificate). RavenDB now supports the ability to define Two Factor Authentication for access.</p><p style="text-align:left;">Here is what this looks like in the RavenDB Studio:</p><p style="text-align:left;"><img src="" style="float: right"/></p><p style="text-align:left;">You are able to generate a certificate as well as register the Authenticator code in your device. </p><p style="text-align:left;">When using the associated certificate, you’ll <em>not</em> be able to access RavenDB. Instead, you’ll get an error message saying that you need to complete the Two Factor Authentication process. Here is what <em>that</em> looks like:</p><p style="text-align:left;"><img src="" style="float: right"/></p><p style="text-align:left;">Once you complete the two factor authentication process, you can select how long we’ll allow access with the given certificate and whether to allow access only from the current browser window (because you are accessing it directly) or from any client (because you want to access RavenDB from another device or via code).</p><p style="text-align:left;">Once the session duration expires, you’ll need to provide the authentication code again, of course. </p><p style="text-align:left;">This feature is meant specifically for certificates that are used by people directly. It is not meant for APIs or programmatic access. 
Those should either have a manual step to allow the certificate or utilize a secrets manager that can have additional steps and validations based on your actual requirements.</p><p style="text-align:left;">You can read more about this feature in <span style="text-decoration:underline;"><a style="color:inherit;" href="https://github.com/ravendb/ravendb/discussions/18122">the feature announcement</a></span>.</p>
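<p style="text-align:left;">As a rough illustration of what the authenticator code is, here is a minimal sketch that assumes it is a standard time-based one-time password (TOTP, RFC 6238) using HMAC-SHA1, 6 digits, and a 30-second step. That is how common authenticator apps work, but this post does not spell out RavenDB's exact parameters, so treat them as assumptions. The <em>Totp</em> function and its arguments are made up for this example and are not part of RavenDB's API.</p><hr/><pre class='line-numbers language-csharp'><code class='line-numbers language-csharp'>using System;
using System.Buffers.Binary;
using System.Security.Cryptography;
using System.Text;

// Assumption: standard TOTP (RFC 6238) with HMAC-SHA1, 6 digits, 30-second step.
// Totp is a hypothetical helper for illustration, not a RavenDB API.
static string Totp(byte[] secret, long unixSeconds, int digits = 6, int stepSeconds = 30)
{
    // The moving factor is the number of whole time steps since the Unix epoch,
    // encoded as an 8-byte big-endian counter.
    var counter = new byte[8];
    BinaryPrimitives.WriteInt64BigEndian(counter, unixSeconds / stepSeconds);

    using var hmac = new HMACSHA1(secret);
    byte[] hash = hmac.ComputeHash(counter);

    // Dynamic truncation (RFC 4226): the low nibble of the last byte picks a 4-byte window.
    int offset = hash[^1] &amp; 0x0F;
    int binary = ((hash[offset] &amp; 0x7F) &lt;&lt; 24)
               | (hash[offset + 1] &lt;&lt; 16)
               | (hash[offset + 2] &lt;&lt; 8)
               | hash[offset + 3];

    // Keep only the requested number of decimal digits, left-padded with zeros.
    int code = binary % (int)Math.Pow(10, digits);
    return code.ToString().PadLeft(digits, '0');
}

// RFC 6238 test secret (ASCII "12345678901234567890") at 59 seconds after the epoch.
var secret = Encoding.ASCII.GetBytes("12345678901234567890");
Console.WriteLine(Totp(secret, 59)); // 287082
</code></pre><p style="text-align:left;">Authenticator apps usually receive the shared secret as a Base32 string (typically via a QR code), so in practice you would decode it to bytes before calling something like this. The server computes the same value from the same secret and the current time, which is why the code you type in is only valid for a short window.</p>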
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/prism/9000.0.1/themes/prism.min.css" integrity="sha512-/mZ1FHPkg6EKcxo0fKXF51ak6Cr2ocgDi5ytaTBjsQZIH/RNs6GF6+oId/vPe3eJB836T36nXwVh/WBl/cWT4w==" crossorigin="anonymous" referrerpolicy="no-referrer" />https://www.ayende.com/blog/200769-B/ravendb-and-two-factor-authentication?Key=ea0dc59c-4896-4dee-ad3f-c28873ad14a0https://www.ayende.com/blog/200769-B/ravendb-and-two-factor-authentication?Key=ea0dc59c-4896-4dee-ad3f-c28873ad14a0Wed, 06 Mar 2024 12:00:00 GMTRavenDB Cloud Global Status vs. Product Status<p style="text-align: left;">One of the interesting components of RavenDB Cloud is status reporting. It turns out that when you offer X as a Service, people <em>really</em> care about your operational status.</p>
<p style="text-align: left;">For RavenDB Cloud, we have <span style="text-decoration: underline;"><a style="color: inherit;" href="https://status.ravendb.net/">https://status.ravendb.net/</a></span>, which will give you some insights into the overall health of the system. Here are some details from the status page:</p>
<p style="text-align: left;"> </p>
<p style="text-align: left;">The interesting thing about this page is that it shows <em>global</em> status, indicating issues affecting large swaths of users. For instance, Azure having issues in a whole region in the image above is a great example of one such scenario. Regular maintenance, which we carry out over the span of days, is something that we report, but you’ll usually never notice (due to the High Availability features of RavenDB).</p>
<p style="text-align: left;">It gets more complicated when we start talking about individual instances. There are many scenarios where the overall system health is great, but a particular database may suffer. The easiest example is if you run out of disk space. That affects that particular instance only.</p>
<p style="text-align: left;">For that scenario, we are reporting Production Monitoring Alerts within the RavenDB Cloud portal. Here is what this looks like:</p>
<p style="text-align: left;"> </p>
<p style="text-align: left;"> </p>
<p style="text-align: left;">As you can see, we report specific problems on those instances, bringing them to your attention. This is needed because, for the most part, RavenDB itself handles those sorts of things via High Availability, which means that even if there are issues, you’re likely not to feel them for a while.</p>
<p style="text-align: left;">Resilience at the cluster level means that even pretty severe problems are papered over and the system moves on. But there is only so much limping that you can do. If you are running at the bare edge of capacity, eventually you’ll trip over the line.</p>
<p style="text-align: left;">Those Production Monitoring Alerts allow you to detect and act upon those issues when they happen, not when they bring down production.</p>
<p style="text-align: left;">This aligns with our vision for RavenDB, the kind of system where you don’t need to have a full-time babysitter monitoring the system. Instead, if there is a problem that the database cannot solve on its own, it will explicitly notify you, in advance.</p>
<p style="text-align: left;">That leads to a system that is far healthier all around and means that you can focus on building your system, rather than managing database minutiae.</p>https://www.ayende.com/blog/200737-B/ravendb-cloud-global-status-vs-product-status?Key=00cf8c60-ab4f-401e-a133-76ecc0c9b243https://www.ayende.com/blog/200737-B/ravendb-cloud-global-status-vs-product-status?Key=00cf8c60-ab4f-401e-a133-76ecc0c9b243Mon, 04 Mar 2024 12:00:00 GMTCode review & Time Travel<p style="text-align:left;">A not insignificant part of my job is to go over code. Today I want to discuss how we approach code reviews at RavenDB, not from a process perspective but from an operational one. I have been a developer for nearly 25 years now, and I’ve come to realize that when I’m doing a code review I’m actually looking at the code from three separate perspectives.</p><p style="text-align:left;">The first, and most obvious one, is when I’m actually looking for problems in the code - ensuring that I can understand what is going on, confirming the flow makes sense, etc. This involves looking at the code <em>as it is right now</em>. </p><p style="text-align:left;">I’m going to be showing snippets of code reviews here. You are not actually expected to follow the <em>code</em>, only the concepts that we talk about here.</p><p style="text-align:left;">Here is a classic code review comment:</p><p style="text-align:left;"><img src="" style="float: right"/></p><p style="text-align:left;">There is some duplicated code that we need to manage. 
Another comment that I liked is this one, pointing out a potential optimization in the code:</p><p style="text-align:left;"><img src="" style="float: right"/></p><p style="text-align:left;">If we define the code using the <em>static</em> keyword, we’ll avoid delegate allocation and save some memory, yay!</p><p style="text-align:left;">It gets more interesting when the code is correct and proper, but may do something weird in some cases, such as in this one:</p><p style="text-align:left;"><img src="" style="float: right"/></p><p style="text-align:left;">I really love it when I run into those because they allow me to actually explore the problem thoroughly. Here is an even better example; this isn’t about a problem in the code, but a discussion of its impact. </p><p style="text-align:left;"><img src="" style="float: right"/></p><p style="text-align:left;">RavenDB has been around for over 15 years, and being able to go back and look at those conversations in a decade or so is invaluable to understanding what is going on. It also ensures that we can share current knowledge a lot more easily.</p><p style="text-align:left;">Speaking of long-running projects, take a look at the following comment:</p><p style="text-align:left;"><img src="" style="float: right"/></p><p style="text-align:left;">Here we need to provide some context to explain. The <em>_caseInsensitive</em> variable here is a concurrent dictionary, and the change is a pretty simple optimization to avoid the annoying KeyValuePair overload. Except… this code is there intentionally; we use it to ensure that the removal operation will only succeed if <em>both</em> the key and the value match. There was an old bug that happened when we removed blindly and the end result was that an updated value was removed. </p><p style="text-align:left;">In this case, we look at the code change from a historical perspective and realize that a modification would reintroduce old (bad) behavior. 
We added a comment to explain that in detail in the code (and there already was a test to catch it if this happens again). </p><p style="text-align:left;">By far, the most important and critical part of doing code reviews, in my opinion, is not focusing on what <em>is</em> or what <em>was</em>, but on what <em>will</em> be. In other words, when I’m looking at a piece of code, I’m considering not only what it is doing right now, but also what we’ll be doing with it in the future. </p><p style="text-align:left;">Here is a simple example of what I mean, showing a change to a perfectly fine piece of code:</p><p style="text-align:left;"><img src="" style="float: right"/></p><p style="text-align:left;">The problem is that the if statement will call <em>InitializeCmd</em>(), but we previously called it <em>using a different condition</em>. We are essentially testing for the same thing using two different methods, and while currently we end up with the same situation, in the future we need to be aware that this may change. </p><p style="text-align:left;">I believe one of the major shifts in my thinking about code reviews came about because I mostly work on RavenDB, and we have kept the project running over a long period of time. Focusing on making sure that we have a sustainable and maintainable code base over the long haul is <em>important</em>, especially because you need to experience those benefits over time to really appreciate looking at codebase changes from a historical perspective.</p>
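<p style="text-align:left;">To make the <em>_caseInsensitive</em> discussion above concrete, here is a minimal sketch of the difference between the two removal styles on a <em>ConcurrentDictionary</em>. The dictionary name, keys, and values are made up for the example, since the actual RavenDB code is not shown in this post. It assumes .NET 5 or later, where the <em>TryRemove(KeyValuePair)</em> overload is available.</p><hr/><pre class='line-numbers language-csharp'><code class='line-numbers language-csharp'>using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical stand-in for a case-insensitive cache; not the actual RavenDB field.
var cache = new ConcurrentDictionary&lt;string, string&gt;(StringComparer.OrdinalIgnoreCase);
cache["users/1"] = "v1";

// Someone takes a snapshot of the current value and plans to evict it later.
var stale = cache["users/1"];

// Meanwhile, the entry is updated (in real code, by another thread).
cache["users/1"] = "v2";

// Conditional removal: succeeds only if the key still maps to the value we saw.
bool removedStale = cache.TryRemove(new KeyValuePair&lt;string, string&gt;("users/1", stale));
Console.WriteLine(removedStale);     // False, the updated value survives
Console.WriteLine(cache["users/1"]); // v2

// Blind removal: drops whatever is stored under the key, including the update.
bool removedBlind = cache.TryRemove("users/1", out var dropped);
Console.WriteLine($"{removedBlind} {dropped}"); // True v2
</code></pre><p style="text-align:left;">The blind version is shorter, which is exactly why it looks like a harmless cleanup in a review. The conditional version is the one that protects an updated value from being removed by someone holding an old snapshot.</p>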
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/prism/9000.0.1/themes/prism.min.css" integrity="sha512-/mZ1FHPkg6EKcxo0fKXF51ak6Cr2ocgDi5ytaTBjsQZIH/RNs6GF6+oId/vPe3eJB836T36nXwVh/WBl/cWT4w==" crossorigin="anonymous" referrerpolicy="no-referrer" />https://www.ayende.com/blog/200577-B/code-review-time-travel?Key=3ec75656-3dca-4436-91a4-85f16cbae154https://www.ayende.com/blog/200577-B/code-review-time-travel?Key=3ec75656-3dca-4436-91a4-85f16cbae154Fri, 19 Jan 2024 12:00:00 GMTMeta Blog: Blogging ergonomics in 2024<p style="text-align:left;">I've been writing this blog since 2004. That means I have been doing this for twenty years, which is <em>frankly </em>unbelievable to me. The actual date is sometime in April, so I’ll probably do a summary post then about that. </p><p style="text-align:left;">What I want to talk about today is a different aspect. The mechanism and processes I use to write blog posts. A large part of the reason I write blog posts is that it helps me understand and organize my own thoughts. And in order to do that effectively, I have found that I need very little friction in the blogging process. </p><p style="text-align:left;">About a decade ago, Google Reader was shut down, and I’m still <em>very</em> bitter about that. It effectively killed a significant portion of the blogging audience and made the ergonomics of reading blogs a lot harder. That also led people to use walled gardens to communicate with others, instead of the decentralized network and feed aggregators. A side effect of that decision is that blogging tools have stopped being a viable thing people spend time or money on.</p><p style="text-align:left;">At the time, I was using Windows Live Writer, which was a high-quality editor and had a rich plugin system. Microsoft discontinued it at some point, it became an open-source project, and even that died. 
The website is no longer functional and even in terms of <span style="text-decoration:underline;"><a style="color:inherit;" href="https://github.com/OpenLiveWriter/OpenLiveWriter">the GitHub project</a></span>, the last commit was 5 years ago.</p><p style="text-align:left;">I’m still using Open Live Writer to write the majority of my blog posts, but given there are no longer any plugins, even something as simple as embedding code in my posts has become an… annoyance. That kills the ergonomics of blogging for me.</p><p style="text-align:left;">Not a problem, this is Open Source, and I can do that myself. Except… I really don’t have the time to spend on something ancillary like that. I would happily pay (a reasonable amount) for a blogging client, but I’m going to assume that I’m not part of a large enough group that there is a market for this. </p><p style="text-align:left;">Taking the code snippets example, I can go into the code, figure out what is going on there, and add a “code snippet” feature. I estimate that would take several days. Alternatively, I can place the code as a GitHub gist and embed it in the page. It is annoying, but far quicker than going to the trouble of figuring that out. </p><p style="text-align:left;">Another issue that bugs me (pun intended) is a problem with copy/paste of images, where taking screenshots using the Snipping Tool doesn’t paste into Writer. I need to first paste them into Paint, then into Writer. In this case, I assume that Writer doesn’t recognize the clipboard format or something similar. </p><p style="text-align:left;">Finally, it turns out that I’m not writing blog posts in the same manner as I used to. It got to the point where I asked people to review my posts before making them public. It turns out that no matter how many times it is corrected, my brain seems unable to discern when to write “whether” or “whatever”, for example. At this point I gave up updating <em>that</em> piece of software 🙂. 
Even the use of emojis doesn’t work properly (Open Live Writer mostly predates a lot of them and breaks the HTML in a weird fashion 🤷).</p><p style="text-align:left;">In other words, there are several problems in my current workflow, and it has finally reached the point where I need to do something about it. The last requirement, by the way, is the most onerous. Consider the workflow of getting the following fixes to a blog post:</p><ul><li>and we run => and we ran</li><li>we spend => we spent</li></ul><p style="text-align:left;">Where is my collaborative editing and the ability to suggest changes with good UX? Improving the ergonomics for the blog has just expanded in scope massively. Now it is a full-fledged publishing platform with modern sensibilities. It’s 2024, features like proper spelling and grammar corrections should absolutely be there, no? And what about AI integration? It turns out that predicting text makes the writing process more efficient. Here is what this may look like:</p><p style="text-align:left;"><img src="" style="float: right"/></p><p style="text-align:left;"><img src="" style="float: right"/></p><p style="text-align:left;">At this stage, this isn’t just a few minor fixes. I should mention that for the past decade and a half or so, I stopped considering myself as someone who can do UI in any meaningful manner. I find that the &lt;table/&gt; tag, which used to be my old reliable method, is not recommended now, for some reason.</p><p style="text-align:left;">This… kind of sucks. I want to upgrade my process by a couple of decades, but I don’t want to pay the price for that. If only there was an easier way to do that.</p><p style="text-align:left;">I started using Google Docs to edit my blog posts, then pasting them into Live Writer or directly to the blog (using a Rich Text Box with an editor from… a decade ago). I had to check the source code for this, by the way. The entire experience is decidedly Developer UX. 
Then I had a thought: I already have a pretty good process of writing the blog posts in Google Docs, right? It handles rich text editing and management much better than the editor in the blog. There are also options for things like proper workflows. For example, someone can go over my drafts and make comments or suggestions.</p><p style="text-align:left;">The only thing that I need is to put both of those together. I have to admit that I spent quite some time just trying to figure out how to <em>get</em> the document from Google Docs using code. The authentication hurdles are… significant to someone who isn’t aware of how it all plugs together. Once I got that done, I got my publishing platform with modern features. Here is what the end result looks like:</p><p style="text-align:left;"><hr/><pre class='line-numbers language-javascript'><code class='line-numbers language-javascript'><span class="token keyword">public</span> <span class="token keyword">class</span> <span class="token class-name">PublishingPlatform</span>
<span class="token punctuation">{</span>
<span class="token keyword">private</span> readonly DocsService GoogleDocs<span class="token punctuation">;</span>
<span class="token keyword">private</span> readonly DriveService GoogleDrive<span class="token punctuation">;</span>
<span class="token keyword">private</span> readonly Client _blogClient<span class="token punctuation">;</span>
<span class="token keyword">public</span> <span class="token function">PublishingPlatform</span><span class="token punctuation">(</span><span class="token parameter">string googleConfigPath<span class="token punctuation">,</span> string blogUser<span class="token punctuation">,</span> string blogPassword</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
<span class="token keyword">var</span> blogInfo <span class="token operator">=</span> <span class="token keyword">new</span> <span class="token class-name">MetaWeblogClient<span class="token punctuation">.</span>BlogConnectionInfo</span><span class="token punctuation">(</span>
<span class="token string">"https://ayende.com/blog"</span><span class="token punctuation">,</span>
<span class="token string">"https://ayende.com/blog/Services/MetaWeblogAPI.ashx"</span><span class="token punctuation">,</span>
<span class="token string">"ayende.com"</span><span class="token punctuation">,</span> blogUser<span class="token punctuation">,</span> blogPassword<span class="token punctuation">)</span><span class="token punctuation">;</span>
_blogClient <span class="token operator">=</span> <span class="token keyword">new</span> <span class="token class-name">MetaWeblogClient<span class="token punctuation">.</span>Client</span><span class="token punctuation">(</span>blogInfo<span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token keyword">var</span> initializer <span class="token operator">=</span> <span class="token keyword">new</span> <span class="token class-name">BaseClientService<span class="token punctuation">.</span>Initializer</span>
<span class="token punctuation">{</span>
HttpClientInitializer <span class="token operator">=</span> GoogleWebAuthorizationBroker<span class="token punctuation">.</span><span class="token function">AuthorizeAsync</span><span class="token punctuation">(</span>
GoogleClientSecrets<span class="token punctuation">.</span><span class="token function">FromFile</span><span class="token punctuation">(</span>googleConfigPath<span class="token punctuation">)</span><span class="token punctuation">.</span>Secrets<span class="token punctuation">,</span>
<span class="token keyword">new</span><span class="token punctuation">[</span><span class="token punctuation">]</span> <span class="token punctuation">{</span> DocsService<span class="token punctuation">.</span>Scope<span class="token punctuation">.</span>Documents<span class="token punctuation">,</span> DriveService<span class="token punctuation">.</span>Scope<span class="token punctuation">.</span>DriveReadonly <span class="token punctuation">}</span><span class="token punctuation">,</span>
<span class="token string">"user"</span><span class="token punctuation">,</span> CancellationToken<span class="token punctuation">.</span>None<span class="token punctuation">,</span>
<span class="token keyword">new</span> <span class="token class-name">FileDataStore</span><span class="token punctuation">(</span><span class="token string">"blog.ayende.com"</span><span class="token punctuation">)</span>
<span class="token punctuation">)</span><span class="token punctuation">.</span>Result
<span class="token punctuation">}</span><span class="token punctuation">;</span>
GoogleDocs <span class="token operator">=</span> <span class="token keyword">new</span> <span class="token class-name">DocsService</span><span class="token punctuation">(</span>initializer<span class="token punctuation">)</span><span class="token punctuation">;</span>
GoogleDrive <span class="token operator">=</span> <span class="token keyword">new</span> <span class="token class-name">DriveService</span><span class="token punctuation">(</span>initializer<span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
<span class="token keyword">public</span> <span class="token keyword">void</span> <span class="token function">Publish</span><span class="token punctuation">(</span><span class="token parameter">string documentId</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
using <span class="token keyword">var</span> file <span class="token operator">=</span> GoogleDrive<span class="token punctuation">.</span>Files<span class="token punctuation">.</span><span class="token function">Export</span><span class="token punctuation">(</span>documentId<span class="token punctuation">,</span> <span class="token string">"application/zip"</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ExecuteAsStream</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token keyword">var</span> zip <span class="token operator">=</span> <span class="token keyword">new</span> <span class="token class-name">ZipArchive</span><span class="token punctuation">(</span>file<span class="token punctuation">,</span> ZipArchiveMode<span class="token punctuation">.</span>Read<span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token keyword">var</span> doc <span class="token operator">=</span> GoogleDocs<span class="token punctuation">.</span>Documents<span class="token punctuation">.</span><span class="token function">Get</span><span class="token punctuation">(</span>documentId<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">Execute</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token keyword">var</span> title <span class="token operator">=</span> doc<span class="token punctuation">.</span>Title<span class="token punctuation">;</span>
<span class="token keyword">var</span> htmlFile <span class="token operator">=</span> zip<span class="token punctuation">.</span>Entries<span class="token punctuation">.</span><span class="token function">First</span><span class="token punctuation">(</span><span class="token parameter">e</span> <span class="token operator">=></span> Path<span class="token punctuation">.</span><span class="token function">GetExtension</span><span class="token punctuation">(</span>e<span class="token punctuation">.</span>Name<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ToLower</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">==</span> <span class="token string">".html"</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
using <span class="token keyword">var</span> stream <span class="token operator">=</span> htmlFile<span class="token punctuation">.</span><span class="token function">Open</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token keyword">var</span> htmlDoc <span class="token operator">=</span> <span class="token keyword">new</span> <span class="token class-name">HtmlDocument</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
htmlDoc<span class="token punctuation">.</span><span class="token function">Load</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token keyword">var</span> body <span class="token operator">=</span> htmlDoc<span class="token punctuation">.</span>DocumentNode<span class="token punctuation">.</span><span class="token function">SelectSingleNode</span><span class="token punctuation">(</span><span class="token string">"//body"</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token keyword">var</span> <span class="token punctuation">(</span>postId<span class="token punctuation">,</span> tags<span class="token punctuation">)</span> <span class="token operator">=</span> <span class="token function">ReadPostIdAndTags</span><span class="token punctuation">(</span>body<span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token function">UpdateLinks</span><span class="token punctuation">(</span>body<span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token function">StripCodeHeader</span><span class="token punctuation">(</span>body<span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token function">UploadImages</span><span class="token punctuation">(</span>zip<span class="token punctuation">,</span> body<span class="token punctuation">,</span> <span class="token function">GenerateSlug</span><span class="token punctuation">(</span>title<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
string post <span class="token operator">=</span> <span class="token function">GetPostContents</span><span class="token punctuation">(</span>htmlDoc<span class="token punctuation">,</span> body<span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token keyword">if</span> <span class="token punctuation">(</span>postId <span class="token operator">!=</span> <span class="token keyword">null</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
_blogClient<span class="token punctuation">.</span><span class="token function">EditPost</span><span class="token punctuation">(</span>postId<span class="token punctuation">,</span> title<span class="token punctuation">,</span> post<span class="token punctuation">,</span> tags<span class="token punctuation">,</span> <span class="token boolean">true</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token keyword">return</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
postId <span class="token operator">=</span> _blogClient<span class="token punctuation">.</span><span class="token function">NewPost</span><span class="token punctuation">(</span>title<span class="token punctuation">,</span> post<span class="token punctuation">,</span> tags<span class="token punctuation">,</span> <span class="token boolean">true</span><span class="token punctuation">,</span> <span class="token keyword">null</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token keyword">var</span> update <span class="token operator">=</span> <span class="token keyword">new</span> <span class="token class-name">BatchUpdateDocumentRequest</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
update<span class="token punctuation">.</span>Requests <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token keyword">new</span> <span class="token class-name">Request</span>
<span class="token punctuation">{</span>
InsertText <span class="token operator">=</span> <span class="token keyword">new</span> <span class="token class-name">InsertTextRequest</span>
<span class="token punctuation">{</span>
Text <span class="token operator">=</span> $<span class="token string">"PostId: {postId}\r\n"</span><span class="token punctuation">,</span>
Location <span class="token operator">=</span> <span class="token keyword">new</span> <span class="token class-name">Location</span>
<span class="token punctuation">{</span>
Index <span class="token operator">=</span> <span class="token number">1</span><span class="token punctuation">,</span>
<span class="token punctuation">}</span>
<span class="token punctuation">}</span><span class="token punctuation">,</span>
<span class="token punctuation">}</span><span class="token punctuation">]</span><span class="token punctuation">;</span>
GoogleDocs<span class="token punctuation">.</span>Documents<span class="token punctuation">.</span><span class="token function">BatchUpdate</span><span class="token punctuation">(</span>update<span class="token punctuation">,</span> documentId<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">Execute</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
<span class="token keyword">private</span> <span class="token keyword">void</span> <span class="token function">StripCodeHeader</span><span class="token punctuation">(</span><span class="token parameter">HtmlNode body</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
<span class="token function">foreach</span><span class="token punctuation">(</span><span class="token keyword">var</span> remove <span class="token keyword">in</span> body<span class="token punctuation">.</span><span class="token function">SelectNodes</span><span class="token punctuation">(</span><span class="token string">"//span[text()='&#60419;']"</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ToArray</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
remove<span class="token punctuation">.</span><span class="token function">Remove</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
<span class="token function">foreach</span> <span class="token punctuation">(</span><span class="token keyword">var</span> remove <span class="token keyword">in</span> body<span class="token punctuation">.</span><span class="token function">SelectNodes</span><span class="token punctuation">(</span><span class="token string">"//span[text()='&#60418;']"</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ToArray</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
remove<span class="token punctuation">.</span><span class="token function">Remove</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
<span class="token punctuation">}</span>
<span class="token keyword">private</span> <span class="token keyword">static</span> string <span class="token function">GetPostContents</span><span class="token punctuation">(</span><span class="token parameter">HtmlDocument htmlDoc<span class="token punctuation">,</span> HtmlNode body</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
<span class="token comment">// we use the @scope element to ensure that the document style doesn't "leak" outside</span>
<span class="token keyword">var</span> style <span class="token operator">=</span> htmlDoc<span class="token punctuation">.</span>DocumentNode<span class="token punctuation">.</span><span class="token function">SelectSingleNode</span><span class="token punctuation">(</span><span class="token string">"//head/style[@type='text/css']"</span><span class="token punctuation">)</span><span class="token punctuation">.</span>InnerText<span class="token punctuation">;</span>
<span class="token keyword">var</span> post <span class="token operator">=</span> <span class="token string">"<style>@scope {"</span> <span class="token operator">+</span> style <span class="token operator">+</span> <span class="token string">"}</style> "</span> <span class="token operator">+</span> body<span class="token punctuation">.</span>InnerHtml<span class="token punctuation">;</span>
<span class="token keyword">return</span> post<span class="token punctuation">;</span>
<span class="token punctuation">}</span>
<span class="token keyword">private</span> <span class="token keyword">static</span> <span class="token keyword">void</span> <span class="token function">UpdateLinks</span><span class="token punctuation">(</span><span class="token parameter">HtmlNode body</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
<span class="token comment">// Google Docs put a redirect like: https://www.google.com/url?q=ACTUAL_URL</span>
<span class="token function">foreach</span> <span class="token punctuation">(</span><span class="token keyword">var</span> link <span class="token keyword">in</span> body<span class="token punctuation">.</span><span class="token function">SelectNodes</span><span class="token punctuation">(</span><span class="token string">"//a[@href]"</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ToArray</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
<span class="token keyword">var</span> href <span class="token operator">=</span> <span class="token keyword">new</span> <span class="token class-name">Uri</span><span class="token punctuation">(</span>link<span class="token punctuation">.</span>Attributes<span class="token punctuation">[</span><span class="token string">"href"</span><span class="token punctuation">]</span><span class="token punctuation">.</span>Value<span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token keyword">var</span> url <span class="token operator">=</span> HttpUtility<span class="token punctuation">.</span><span class="token function">ParseQueryString</span><span class="token punctuation">(</span>href<span class="token punctuation">.</span>Query<span class="token punctuation">)</span><span class="token punctuation">[</span><span class="token string">"q"</span><span class="token punctuation">]</span><span class="token punctuation">;</span>
<span class="token keyword">if</span> <span class="token punctuation">(</span>url <span class="token operator">!=</span> <span class="token keyword">null</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
link<span class="token punctuation">.</span>Attributes<span class="token punctuation">[</span><span class="token string">"href"</span><span class="token punctuation">]</span><span class="token punctuation">.</span>Value <span class="token operator">=</span> url<span class="token punctuation">;</span>
<span class="token punctuation">}</span>
<span class="token punctuation">}</span>
<span class="token punctuation">}</span>
<span class="token keyword">private</span> <span class="token keyword">static</span> <span class="token punctuation">(</span>string<span class="token operator">?</span> postId<span class="token punctuation">,</span> List<span class="token operator"><</span>string<span class="token operator">></span> tags<span class="token punctuation">)</span> <span class="token function">ReadPostIdAndTags</span><span class="token punctuation">(</span><span class="token parameter">HtmlNode body</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
string<span class="token operator">?</span> postId <span class="token operator">=</span> <span class="token keyword">null</span><span class="token punctuation">;</span>
<span class="token keyword">var</span> tags <span class="token operator">=</span> <span class="token keyword">new</span> <span class="token class-name">List</span><span class="token operator"><</span>string<span class="token operator">></span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token function">foreach</span> <span class="token punctuation">(</span><span class="token keyword">var</span> span <span class="token keyword">in</span> body<span class="token punctuation">.</span><span class="token function">SelectNodes</span><span class="token punctuation">(</span><span class="token string">"//span"</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
<span class="token keyword">var</span> text <span class="token operator">=</span> span<span class="token punctuation">.</span>InnerText<span class="token punctuation">.</span><span class="token function">Trim</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token keyword">const</span> string TagsPrefix <span class="token operator">=</span> <span class="token string">"Tags:"</span><span class="token punctuation">;</span>
<span class="token keyword">const</span> string PostIdPrefix <span class="token operator">=</span> <span class="token string">"PostId:"</span><span class="token punctuation">;</span>
<span class="token keyword">if</span> <span class="token punctuation">(</span>text<span class="token punctuation">.</span><span class="token function">StartsWith</span><span class="token punctuation">(</span>TagsPrefix<span class="token punctuation">,</span> StringComparison<span class="token punctuation">.</span>OrdinalIgnoreCase<span class="token punctuation">)</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
tags<span class="token punctuation">.</span><span class="token function">AddRange</span><span class="token punctuation">(</span>text<span class="token punctuation">.</span><span class="token function">Substring</span><span class="token punctuation">(</span>TagsPrefix<span class="token punctuation">.</span>Length<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">Split</span><span class="token punctuation">(</span><span class="token string">","</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token function">RemoveElement</span><span class="token punctuation">(</span>span<span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
<span class="token keyword">else</span> <span class="token keyword">if</span> <span class="token punctuation">(</span>text<span class="token punctuation">.</span><span class="token function">StartsWith</span><span class="token punctuation">(</span>PostIdPrefix<span class="token punctuation">,</span> StringComparison<span class="token punctuation">.</span>OrdinalIgnoreCase<span class="token punctuation">)</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
postId <span class="token operator">=</span> text<span class="token punctuation">.</span><span class="token function">Substring</span><span class="token punctuation">(</span>PostIdPrefix<span class="token punctuation">.</span>Length<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">Trim</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token function">RemoveElement</span><span class="token punctuation">(</span>span<span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
<span class="token punctuation">}</span>
<span class="token comment">// after we removed post id & tags, trim the empty lines</span>
<span class="token keyword">while</span> <span class="token punctuation">(</span>body<span class="token punctuation">.</span>FirstChild<span class="token operator">?.</span>InnerText<span class="token punctuation">.</span><span class="token function">Trim</span><span class="token punctuation">(</span><span class="token punctuation">)</span> is <span class="token string">"&nbsp;"</span> or <span class="token string">""</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
body<span class="token punctuation">.</span><span class="token function">RemoveChild</span><span class="token punctuation">(</span>body<span class="token punctuation">.</span>FirstChild<span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
<span class="token keyword">return</span> <span class="token punctuation">(</span>postId<span class="token punctuation">,</span> tags<span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
<span class="token keyword">private</span> <span class="token keyword">static</span> <span class="token keyword">void</span> <span class="token function">RemoveElement</span><span class="token punctuation">(</span><span class="token parameter">HtmlNode element</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
<span class="token keyword">do</span>
<span class="token punctuation">{</span>
<span class="token keyword">var</span> parent <span class="token operator">=</span> element<span class="token punctuation">.</span>ParentNode<span class="token punctuation">;</span>
parent<span class="token punctuation">.</span><span class="token function">RemoveChild</span><span class="token punctuation">(</span>element<span class="token punctuation">)</span><span class="token punctuation">;</span>
element <span class="token operator">=</span> parent<span class="token punctuation">;</span>
<span class="token punctuation">}</span> <span class="token keyword">while</span> <span class="token punctuation">(</span>element<span class="token operator">?.</span>ChildNodes<span class="token operator">?.</span>Count <span class="token operator">==</span> <span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
<span class="token keyword">private</span> <span class="token keyword">void</span> <span class="token function">UploadImages</span><span class="token punctuation">(</span><span class="token parameter">ZipArchive zip<span class="token punctuation">,</span> HtmlNode body<span class="token punctuation">,</span> string slug</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
<span class="token keyword">var</span> mapping <span class="token operator">=</span> <span class="token keyword">new</span> <span class="token class-name">Dictionary</span><span class="token operator"><</span>string<span class="token punctuation">,</span> string<span class="token operator">></span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token function">foreach</span> <span class="token punctuation">(</span><span class="token keyword">var</span> image <span class="token keyword">in</span> zip<span class="token punctuation">.</span>Entries<span class="token punctuation">.</span><span class="token function">Where</span><span class="token punctuation">(</span><span class="token parameter">x</span> <span class="token operator">=></span> Path<span class="token punctuation">.</span><span class="token function">GetDirectoryName</span><span class="token punctuation">(</span>x<span class="token punctuation">.</span>FullName<span class="token punctuation">)</span> <span class="token operator">==</span> <span class="token string">"images"</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
<span class="token keyword">var</span> type <span class="token operator">=</span> Path<span class="token punctuation">.</span><span class="token function">GetExtension</span><span class="token punctuation">(</span>image<span class="token punctuation">.</span>Name<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ToLower</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">switch</span>
<span class="token punctuation">{</span>
<span class="token string">".png"</span> <span class="token operator">=></span> <span class="token string">"image/png"</span><span class="token punctuation">,</span>
<span class="token string">".jpg"</span> or <span class="token string">".jpeg"</span> <span class="token operator">=></span> <span class="token string">"image/jpeg"</span><span class="token punctuation">,</span>
<span class="token parameter">_</span> <span class="token operator">=></span> <span class="token string">"application/octet-stream"</span>
<span class="token punctuation">}</span><span class="token punctuation">;</span>
using <span class="token keyword">var</span> contents <span class="token operator">=</span> image<span class="token punctuation">.</span><span class="token function">Open</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token keyword">var</span> ms <span class="token operator">=</span> <span class="token keyword">new</span> <span class="token class-name">MemoryStream</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
contents<span class="token punctuation">.</span><span class="token function">CopyTo</span><span class="token punctuation">(</span>ms<span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token keyword">var</span> bytes <span class="token operator">=</span> ms<span class="token punctuation">.</span><span class="token function">ToArray</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token keyword">var</span> result <span class="token operator">=</span> _blogClient<span class="token punctuation">.</span><span class="token function">NewMediaObject</span><span class="token punctuation">(</span>slug <span class="token operator">+</span> <span class="token string">"/"</span> <span class="token operator">+</span> Path<span class="token punctuation">.</span><span class="token function">GetFileName</span><span class="token punctuation">(</span>image<span class="token punctuation">.</span>Name<span class="token punctuation">)</span><span class="token punctuation">,</span> type<span class="token punctuation">,</span> bytes<span class="token punctuation">)</span><span class="token punctuation">;</span>
mapping<span class="token punctuation">[</span>image<span class="token punctuation">.</span>FullName<span class="token punctuation">]</span> <span class="token operator">=</span> <span class="token keyword">new</span> <span class="token class-name">UriBuilder</span> <span class="token punctuation">{</span> Path <span class="token operator">=</span> result<span class="token punctuation">.</span><span class="token constant">URL</span> <span class="token punctuation">}</span><span class="token punctuation">.</span>Uri<span class="token punctuation">.</span>AbsolutePath<span class="token punctuation">;</span>
<span class="token punctuation">}</span>
<span class="token function">foreach</span> <span class="token punctuation">(</span><span class="token keyword">var</span> img <span class="token keyword">in</span> body<span class="token punctuation">.</span><span class="token function">SelectNodes</span><span class="token punctuation">(</span><span class="token string">"//img[@src]"</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ToArray</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
<span class="token keyword">if</span> <span class="token punctuation">(</span>mapping<span class="token punctuation">.</span><span class="token function">TryGetValue</span><span class="token punctuation">(</span>img<span class="token punctuation">.</span>Attributes<span class="token punctuation">[</span><span class="token string">"src"</span><span class="token punctuation">]</span><span class="token punctuation">.</span>Value<span class="token punctuation">,</span> out <span class="token keyword">var</span> path<span class="token punctuation">)</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
img<span class="token punctuation">.</span>Attributes<span class="token punctuation">[</span><span class="token string">"src"</span><span class="token punctuation">]</span><span class="token punctuation">.</span>Value <span class="token operator">=</span> path<span class="token punctuation">;</span>
<span class="token punctuation">}</span>
<span class="token punctuation">}</span>
<span class="token punctuation">}</span>
<span class="token keyword">private</span> <span class="token keyword">static</span> string <span class="token function">GenerateSlug</span><span class="token punctuation">(</span><span class="token parameter">string title</span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
<span class="token keyword">var</span> slug <span class="token operator">=</span> title<span class="token punctuation">.</span><span class="token function">Replace</span><span class="token punctuation">(</span><span class="token string">" "</span><span class="token punctuation">,</span> <span class="token string">""</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token function">foreach</span> <span class="token punctuation">(</span><span class="token parameter"><span class="token keyword">var</span> ch <span class="token keyword">in</span> Path<span class="token punctuation">.</span><span class="token function">GetInvalidFileNameChars</span><span class="token punctuation">(</span><span class="token punctuation">)</span></span><span class="token punctuation">)</span>
<span class="token punctuation">{</span>
slug <span class="token operator">=</span> slug<span class="token punctuation">.</span><span class="token function">Replace</span><span class="token punctuation">(</span>ch<span class="token punctuation">,</span> <span class="token string">'-'</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
<span class="token keyword">return</span> slug<span class="token punctuation">;</span>
<span class="token punctuation">}</span>
<span class="token punctuation">}</span></code></pre><hr/></p><p style="text-align:left;">You’ll probably not appreciate this, but the fact that I can just push code like that into the document and get it with proper formatting easily is a major lifestyle improvement from my point of view. </p><p style="text-align:left;">The code works with the document in two ways. First, in the Document DOM (which is quite complex), it extracts the title of the blog post and afterward updates it with the document ID. But the core of this code is to extract the document as a zip file, grab everything from there, and push that to the blog. I do some editing for the HTML to get everything set up properly, mostly editing the links and uploading the images. There is also some stuff happening with CSS scopes that I frankly don’t understand. I <em>think</em> I got it right, which is fine for now.</p><p style="text-align:left;">This cost me a couple of evenings, and it was <em>fun</em>. Nothing earth-shattering, I’ll admit. But it’s the first time in a while that I actually wrote a piece of code that was immediately useful. My blogging queue is rather full, and I hope that with this new process it will be easier to push the ideas out of my head and to the blog.</p><p style="text-align:left;">And with that, it is now 01:26 AM, and I’m going to call it a night 🙂.</p><p style="text-align:left;">And as a final thought, I had just made several changes to the post after publication, and it went <em>smoothly</em>. I think that I like it.</p>
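<p style="text-align:left;">To make the link editing step a bit more concrete, here is a minimal standalone sketch of unwrapping a Google Docs redirect link. The helper name <em>UnwrapGoogleRedirect</em> and the sample URLs are made up for illustration, and the handling of relative links is my own assumption; the code in the post simply calls <em>new Uri(...)</em> on every href.</p>

```csharp
using System;
using System.Web;

// Hypothetical helper (not part of the original code): returns the real target
// of a Google redirect link, or the href unchanged when there is no "q" parameter.
static string UnwrapGoogleRedirect(string href)
{
    // Assumption: hrefs that do not parse as absolute URIs are left untouched.
    if (!Uri.TryCreate(href, UriKind.Absolute, out var uri))
        return href;

    // ParseQueryString URL-decodes the values, so "q" comes back as the original URL.
    var target = HttpUtility.ParseQueryString(uri.Query)["q"];
    return target ?? href;
}

// Sample inputs, invented for this sketch.
Console.WriteLine(UnwrapGoogleRedirect(
    "https://www.google.com/url?q=https://ravendb.net/docs&sa=D"));
Console.WriteLine(UnwrapGoogleRedirect("https://ayende.com/blog"));
```

<p style="text-align:left;">The first call should print the inner URL from the <em>q</em> parameter, and the second should print the link unchanged, which matches the <em>url != null</em> check in <em>UpdateLinks</em>.</p>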
<p><link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/prism/9000.0.1/themes/prism.min.css" integrity="sha512-/mZ1FHPkg6EKcxo0fKXF51ak6Cr2ocgDi5ytaTBjsQZIH/RNs6GF6+oId/vPe3eJB836T36nXwVh/WBl/cWT4w==" crossorigin="anonymous" referrerpolicy="no-referrer" /></p>
https://www.ayende.com/blog/200521-B/meta-blog-blogging-ergonomics-in-2024?Key=f8115b05-cf22-4270-bec0-9fe3831947eehttps://www.ayende.com/blog/200521-B/meta-blog-blogging-ergonomics-in-2024?Key=f8115b05-cf22-4270-bec0-9fe3831947eeWed, 17 Jan 2024 12:00:00 GMTI was working on the 2024 budget numbers, and…<p><a href="https://ayende.com/blog/Images/Open-Live-Writer/I-was-working-on-the-2024-budget-numbers_123A5/Designer.jpg"><img style="border: 0px currentcolor; float: right; display: inline; background-image: none;" title="Designer" src="https://ayende.com/blog/Images/Open-Live-Writer/I-was-working-on-the-2024-budget-numbers_123A5/Designer_thumb.jpg" alt="Designer" width="240" height="240" align="right" border="0" /></a></p>
<p>Today I had a meeting to go over the 2024 budget, and we ran into one of the most important line items: our coffee budget.</p>
<p>You know the old adage that coders turn coffee into code, right?</p>
<p>It is certainly true in our case: in 2023 we spent a large 5-figure sum on coffee alone. And 2024 is shaping up to be even more expensive.</p>
<p>Happy new year!</p>https://www.ayende.com/blog/200385-A/i-was-working-on-the-2024-budget-numbers-and?Key=95522f4c-9eaa-451a-afbe-ff43b00c5613https://www.ayende.com/blog/200385-A/i-was-working-on-the-2024-budget-numbers-and?Key=95522f4c-9eaa-451a-afbe-ff43b00c5613Sun, 31 Dec 2023 12:00:00 GMTLearning over the holidays: Yet Another Bug Tracker sample app<p><a href="https://ayende.com/blog/Images/Open-Live-Writer/f9e68b110bd8_C834/1_2.jpg"><img width="227" height="360" title="1" align="right" style="border: 0px currentcolor; border-image: none; float: right; display: inline; background-image: none;" alt="1" src="https://ayende.com/blog/Images/Open-Live-Writer/f9e68b110bd8_C834/1_thumb.jpg" border="0"></a><p>If you are reading this blog, I assume that you are a like-minded person. My idea of relaxation is to sit and write code, hopefully on something that I’m not familiar with. I have <a href="https://ayende.com/blog/posts/series/196449-A/badly-implementing-encryption">many such blog post series</a> covering topics I care about. It’s my idea of meditation.<p>For the end of 2023, I thought that we could do something similar but on a broader scale. A while ago <a href="https://alex-klaus.com/">Alex Klaus</a> wrote a <a href="https://ravendb.net/yabt-series">walkthrough</a> on how to build a complete application from scratch using modern best practices (and RavenDB). We refreshed the code and made it widely available, offering you something fun, educational, and productive to engage with.<p>The system is a bug tracker (allowing us to focus on the architecture rather than domain concerns), and you can play with <a href="https://yabt.ravendb.net/">a deployed version live</a>. 
The <a href="https://github.com/ravendb/samples-yabt">code is available</a> under the MIT license, and we’ll be very happy to receive any suggested improvements.<p>Topics that are covered:<ol><li><p><a href="https://ravendb.net/articles/building-application-with-net-core-and-ravendb-nosql-database">Building an enterprise application with the .NET and RavenDB</a></p><li><p><a href="https://ravendb.net/articles/nosql-data-model-through-ddd-prism">Non-Relational Data Modeling Through Domain Driven Design Prism</a></p><li><p><a href="https://ravendb.net/articles/hidden-side-of-document-ids-in-ravendb">Hidden side of document IDs in RavenDB</a></p><li><p><a href="https://ravendb.net/articles/dynamic-fields-for-indexing">Dynamic Fields for Indexing</a></p><li><p><a href="https://ravendb.net/articles/entity-relationships-in-nosql">Entity Relationships in non-relational database (one-to-many, many-to-many)</a></p><li><p><a href="https://ravendb.net/articles/nosql-multi-tenant-database">Multi-tenant database in NoSQL</a></p><li><p><a href="https://ravendb.net/articles/database-integration-testing-the-secret-recipe">Database Integration Testing – The Secret Recipe</a></p></li></ol><p>As usual, I would love any feedback you have to offer.</p>https://www.ayende.com/blog/200353-B/learning-over-the-holidays-yet-another-bug-tracker-sample-app?Key=b129183e-6820-4c9f-8d4e-135536d35adehttps://www.ayende.com/blog/200353-B/learning-over-the-holidays-yet-another-bug-tracker-sample-app?Key=b129183e-6820-4c9f-8d4e-135536d35adeFri, 22 Dec 2023 12:00:00 GMTThe role of GitHub in paying for Open Source Software<p>I have been doing Open Source work for just under twenty years at this point. I have been paying my mortgage from Open Source software for about 15. I’m stating that to explain that I have spent quite a lot of time struggling with the inherent tension between having an Open Source project and getting paid.</p>
<p>I <a href="https://ayende.com/blog/posts/series/192417-A/open-source-money">wrote about it</a> a <a href="https://ayende.com/blog/posts/series/186113-A/making-money-from-open-source-software">few times in the past</a>. It is not a trivial problem, and the core of the issue is <em>not</em> something that you can easily solve with technical means. I ran into this fascinating thread on Twitter over the weekend:</p>
<blockquote class="twitter-tweet">
<p dir="ltr" lang="en">you just described licensing. As you missed 1 important aspect: if an org isn't obligated to pay, they won't. So you need a form of making them pay by giving them a token they paid which then makes them able to use your software. Any other form will fail.</p>
— Frans Bouma (@FransBouma) <a href="https://twitter.com/FransBouma/status/1689946604030050304?ref_src=twsrc%5Etfw">August 11, 2023</a></blockquote>
<p><script src="https://platform.twitter.com/widgets.js" async="" charset="utf-8"></script></p>
<p>And another part of that is here:</p>
<blockquote class="twitter-tweet">
<p dir="ltr" lang="en">The thing is, businesses spend significant amounts of money on software licenses, whether on-prem or as-a-service.<br /><br />They understand and accept this, as do their shareholders and investors. It is a cost of doing business.<br /><br />Donations? Not so much.</p>
— Udi Dahan (@UdiDahan) <a href="https://twitter.com/UdiDahan/status/1690034306997993473?ref_src=twsrc%5Etfw">August 11, 2023</a></blockquote>
<p><script src="https://platform.twitter.com/widgets.js" async="" charset="utf-8"></script></p>
<p>I’m quoting the most relevant pieces, but the idea is pretty simple.</p>
<p>Donations don’t work, period. They don’t work not because companies are evil or developers don’t want to pay for Open Source. They don’t work because it takes a huge amount of effort to actually get paid.</p>
<p>If you are an independent developer, your purchasing process goes something like this:</p>
<ol>
<li>I would like to use this thing</li>
<li>I need to pay for that</li>
<li>The price matches the value I’m getting</li>
<li>Where is my credit card…</li>
<li>Paid!</li>
</ol>
<p>Did you note step 2? The part about <em>needing</em> to pay?</p>
<p>If you don’t have that step, what will happen? Same scenario, an independent developer:</p>
<ol>
<li>I would like to use this thing</li>
<li>I use this thing</li>
<li>It would be great to pay something to show my appreciation</li>
<li>Where did I put the credit card? Oh, it’s down the hall… I’ll get to that later (never).</li>
</ol>
<p>That is the best-case scenario, where the thought of donating actually crossed your mind. In all likelihood, the process is more like this:</p>
<ol>
<li>I would like to use this thing</li>
<li>I use this thing</li>
<li>Ticket closed, what is the next one… ?</li>
</ol>
<p>Now, what happens if you are <em>not</em> an independent developer? Let’s say that you are a contract worker for a company. You need to talk to your contact person, and they will need to get purchasing approval. Depending on the amount, that may require escalating upward a few levels, etc.</p>
<p>Let’s say that the amount is under $100, so basically within the budgetary discretion of the first manager you run into. They would still need to know what they are paying for and what they are getting out of it (they need to justify the expense). If this is a donation, welcome to the beauty of tax codes in multiple jurisdictions and what <em>counts</em> as such. If this is <em>not</em> a donation, what do they get? That means that you now have to do a meeting, potentially multiple ones. Present your case, open a new supplier at the company, etc.</p>
<p>The cost of all of those is high, both in time and money. Or… you can just <em>nuget add-package</em> and move on.</p>
<p>In the case of RavenDB, it is Open Source software (with a license to match, and the code is freely available), but we treat it as a commercial project for all intents and purposes. If you want to install RavenDB, you’ll get a popup saying you need a license, directing you to a page where you can see how much we would like to get, what you get in return, etc. That means that from a commercial perspective, we are on familiar ground for companies. They are <em>used</em> to paying for software, and there isn’t an option to just move on to the next task.</p>
<p>There is another really important consideration here. In the ideal Open Source donation model, money just shows up in your account. In the commercial world, there is a <em>huge</em> amount of work that is required to get things done. That is when you have a model where “the software does not <em>work</em> without a purchase”. To give some context, 22% is Sales & Marketing and they spent around 21.8 <em>billion</em> in 2022 on Sales & Marketing. That is literally billions being <em>spent </em>to make sales.</p>
<p>If you want to make money, you are going to invest in sales, sales strategy, etc. I’m ignoring marketing here because if you are expected to make money from Open Source, you likely already have a project well-known enough to at least get started.</p>
<p>That means that you need to figure out what you are charging for, how you get customers, etc. In the case of RavenDB, we use the per-core model, which is a good indication of how much use the user is getting from RavenDB. LLBLGen Pro, on the other hand, charges per seat. Particular’s NServiceBus uses a per endpoint / number of messages a day model.</p>
<p>There is no one model that fits all. And you need to be able to tailor your pricing model to how your users think about your software.</p>
<p>So pricing strategy, creating a proper incentive to purchase (hard limit, usually) and some sales organization to actually drive all of that are absolutely required.</p>
<p>Notice what is missing here? GitHub. It simply has no role at all up to this point. So why the title of this post?</p>
<p>There is one <em>really big</em> problem with getting paid that GitHub can solve for Open Source (and in general, I guess).</p>
<p>The whole process of actually getting paid is absolutely atrocious. In the best case, you need to be registered as a supplier at the customer, fill out various forms (no, we don’t use child labor or slaves, indeed), figure out all sorts of <em>weird </em>rules (German tax authority requires special dispensation, and let’s not talk about getting paid from India, etc). Welcome to Anti Money Laundering rules and GDPR compliance with Know Your Customer and SOC 2 regulations. The last sentence is basically nonsense words, but I understand that if you chant it long enough, you get money in the end.</p>
<p>What GitHub can do is be a payment pipe. Since presumably your organization is already set up with them in place, you can get them to do the invoicing, collecting the payment, etc. And in the end, you get the money.</p>
<p>That sounds <em>exactly</em> like GitHub Sponsorships, right? Except that in this case, this is not a donation. This is a flat-out simple transaction, with GitHub as the medium. The idea is that you <em>have</em> a limit, which you enforce, on your usage, and GitHub is how you are paid. The ability to do it in this fashion <em>may </em>make things easier, but I would assume that there are about three books worth of regulations and EULAs to go through to make it actually successful.</p>
<p>Yet, as far as I’m concerned, that <em>is</em> really the only important role that we have for GitHub here.</p>
<p>That is <em>not</em> a small thing, mind. But it isn’t a magic bullet.</p>https://www.ayende.com/blog/199937-C/the-role-of-github-in-paying-for-open-source-software?Key=13c9495d-6552-4c62-a26b-4f70c13c70cfhttps://www.ayende.com/blog/199937-C/the-role-of-github-in-paying-for-open-source-software?Key=13c9495d-6552-4c62-a26b-4f70c13c70cfMon, 14 Aug 2023 12:00:00 GMTDeploying RavenDB with Helm Chart<p><a href="https://helm.sh/">Helm</a> is the package manager for Kubernetes. It allows you to deploy applications and systems to a Kubernetes cluster easily, safely and in a reproducible manner.</p>
<p>We provide you with a chart so you can use <a href="https://artifacthub.io/packages/helm/ravendb-cluster/ravendb-cluster">Helm to deploy RavenDB clusters</a>.</p>
<p>You can visit <a href="https://github.com/ravendb/helm-charts/tree/master/charts/ravendb-cluster">this link for a full discussion</a> on how to do so.</p>https://www.ayende.com/blog/199269-A/deploying-ravendb-with-helm-chart?Key=75c4e431-594c-44b3-b9d1-9477d16cbdbfhttps://www.ayende.com/blog/199269-A/deploying-ravendb-with-helm-chart?Key=75c4e431-594c-44b3-b9d1-9477d16cbdbfWed, 19 Apr 2023 12:00:00 GMTTricks of the trade: Figuring out progress of a large upload<p>I found myself today needing to upload a file to S3, and the file is a few hundred GB in size. I executed the appropriate command, like so:</p>
<blockquote>
<pre>aws s3api put-object --bucket twitter-2020-rvn-dump --key mydb.backup --body ./mydb.backup</pre>
</blockquote>
<p>But then I realized that this is uploading a few <em>hundred</em> GB file to S3, which may take a while. The command doesn’t have any progress information, so I had no way to figure out where it is at.</p>
<p>I decided to poke around and see what I could find. First, I ran this command:</p>
<blockquote>
<pre>ps -aux | grep s3api</pre>
</blockquote>
<p>This gave me the PID of the upload process in question.</p>
<p>Then I checked the file descriptors for this process, like so:</p>
<blockquote>
<pre>$ ls -alh /proc/84957/fd
total 0
dr-x------ 2 ubuntu ubuntu 0 Mar 30 08:10 .
dr-xr-xr-x 9 ubuntu ubuntu 0 Mar 30 08:00 ..
lrwx------ 1 ubuntu ubuntu 64 Mar 30 08:10 0 -> /dev/pts/8
lrwx------ 1 ubuntu ubuntu 64 Mar 30 08:10 1 -> /dev/pts/8
lrwx------ 1 ubuntu ubuntu 64 Mar 30 08:10 2 -> /dev/pts/8
lr-x------ 1 ubuntu ubuntu 64 Mar 30 08:10 3 -> /backups/mydb.backup</pre>
</blockquote>
<p>As you can see, file descriptor #3 is the one that we care about. We can then ask for more details:</p>
<blockquote>
<pre>$ cat /proc/84957/fdinfo/3
pos: 140551127040
flags: 02400000
mnt_id: 96
ino: 57409538</pre>
</blockquote>
<p>In other words, the process is currently at ~130GB of the file, or thereabouts.</p>
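<p>The manual steps above can be automated. The following is a minimal sketch in Python, assuming a Linux machine and a process that reads the file sequentially through a single descriptor. The function names are made up for illustration and are not part of the AWS CLI.</p>

```python
import os
import re
from typing import Tuple


def parse_fdinfo_pos(fdinfo_text: str) -> int:
    """Return the byte offset from the 'pos:' line of /proc/<pid>/fdinfo/<fd>."""
    match = re.search(r"^pos:\s*(\d+)\s*$", fdinfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("no 'pos:' line found in fdinfo text")
    return int(match.group(1))


def upload_progress(pid: int, fd: int) -> Tuple[int, int]:
    """Return (current_offset, file_size) for an open descriptor (Linux only)."""
    with open(f"/proc/{pid}/fdinfo/{fd}") as f:
        pos = parse_fdinfo_pos(f.read())
    # /proc/<pid>/fd/<fd> is a symlink to the open file; os.stat follows it.
    total = os.stat(f"/proc/{pid}/fd/{fd}").st_size
    return pos, total


def format_progress(pos: int, total: int) -> str:
    """Render the offset and size in GiB together with a percentage."""
    gib = 1024 ** 3
    percent = 100.0 * pos / total if total else 0.0
    return f"{pos / gib:.1f} GiB of {total / gib:.1f} GiB ({percent:.1f}%)"
```

<p>Calling <em>format_progress(*upload_progress(84957, 3))</em> in a loop would give a rough progress report. The offset only says how far the process has read into the file, so it may run somewhat ahead of what has actually reached S3.</p>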
<p>It’s not ideal, but it does give me some idea about where we are at. It is a nice demonstration of the ability to poke into the insides of a running system to figure out what is going on.</p>https://www.ayende.com/blog/199233-B/tricks-of-the-trade-figuring-out-progress-of-a-large-upload?Key=32b2be39-87f4-4b87-b727-ee1a64aaf8e3https://www.ayende.com/blog/199233-B/tricks-of-the-trade-figuring-out-progress-of-a-large-upload?Key=32b2be39-87f4-4b87-b727-ee1a64aaf8e3Fri, 31 Mar 2023 12:00:00 GMTOn AI, GPT and the future of developers<p>When I started using GitHub Copilot, I was quite amazed at how good it was. Sessions using ChatGPT can be jaw dropping in terms of the generated content.</p>
<p>The immediate reaction from many people is to consider what the impact of that would be on the humans who currently fill those roles. Surely, if we can get a machine to do the task of a human, we can all benefit (except for the person made redundant, I guess).</p>
<p>I had a long discussion on the topic recently and I think that it is a good topic for a blog post, given the current interest in the subject matter.</p>
<p>The history of replacing manual labor with automated machines goes back as far as you’d like to stretch it. I wouldn’t go back to the horse & plow, but certainly the Luddites and their arguments about the impact of machinery on the populace will sound familiar to anyone today.</p>
<p>The standard answer is that some professions will go away, but new ones will pop up, instead. The classic example is the ice salesman. That used to be a function, a guy on a horse-drawn carriage that would sell you ice to keep your food cold. You can assume that this profession is no longer relevant, of course.</p>
<p>The difference here is that we now have computer programs and AI taking over what was classically thought impossible. You can ask Dall-E or Stable Diffusion for an image and in a few seconds, you’ll have a beautiful render that may actually match what you requested.</p>
<p>You can start writing code with GitHub Copilot and it will predict what you <em>want</em> to do to an extent that is absolutely awe-inspiring.</p>
<p>So what is the role of the human in all of this? If I can ask ChatGPT or Copilot to write me an email validation function, what do I need a developer for?</p>
<p>Here is ChatGPT’s output:</p>
<blockquote>
<p><a href="https://ayende.com/blog/Images/Open-Live-Writer/On-AI-GPT-and-the_7BCC/image_2.png"><img style="border: 0px currentcolor; display: inline; background-image: none;" title="image" src="https://ayende.com/blog/Images/Open-Live-Writer/On-AI-GPT-and-the_7BCC/image_thumb.png" alt="image" width="1000" height="884" border="0" /></a></p>
</blockquote>
<p>And here is Copilot’s output:</p>
<blockquote>
<p><a href="https://ayende.com/blog/Images/Open-Live-Writer/On-AI-GPT-and-the_7BCC/image_4.png"><img style="margin: 0px; border: 0px currentcolor; display: inline; background-image: none;" title="image" src="https://ayende.com/blog/Images/Open-Live-Writer/On-AI-GPT-and-the_7BCC/image_thumb_1.png" alt="image" width="798" height="457" border="0" /></a></p>
</blockquote>
<p>I would rate the MailAddress version better, since I know that you can’t actually manage emails via Regex. I tried to take this further and ask ChatGPT about the Regex, and got:</p>
<blockquote>
<p><a href="https://ayende.com/blog/Images/Open-Live-Writer/On-AI-GPT-and-the_7BCC/image_6.png"><img style="border: 0px currentcolor; display: inline; background-image: none;" title="image" src="https://ayende.com/blog/Images/Open-Live-Writer/On-AI-GPT-and-the_7BCC/image_thumb_2.png" alt="image" width="993" height="503" border="0" /></a></p>
</blockquote>
<p>ChatGPT is confused, and the answer doesn’t make any sort of sense.</p>
<p>Most of the time spent on “research” for this post was waiting for ChatGPT to actually produce a result, but this post isn’t about nitpicking, actually.</p>
<p>The whole premise around “machines will make us redundant” is that the sole role of a developer is taking a low-level requirement such as email validation and producing the code to match.</p>
<p>Writing such low-hanging fruit is not your job. For that matter, a <em>function</em> is not your job. Nor is writing code a significant portion of that. A developer needs to be able to build the system architecture and design the interaction between components and the overall system.</p>
<p>They need to make sure that the system is performant, meets the non-functional requirements, etc. A developer would spend a lot more time <em>reading</em> code than writing it.</p>
<p>Here is a more realistic example of using ChatGPT, <a href="https://sharegpt.com/c/WkZlz35">asking it to write to a file using a write-ahead log</a>. I am both amazed by the quality of the answer and find myself unable to use even a bit of the code in there. The scary thing is that this code <em>looks</em> correct at a glance. It is wrong, dangerously so, but you’ll need to be a subject matter expert to know that. In this case, this doesn’t meet the requirements, the provided solution has security issues and doesn’t actually work.</p>
<p>On the other hand, I asked it about password hashing and <a href="https://sharegpt.com/c/kLMCnRx">I would give this answer a good mark</a>.</p>
<p>I believe it will get better over time, but the overall context matters. We have a <em>lot</em> of experience in trying <a href="https://ayende.com/blog/2365/and-the-secretary-will-write-the-order-dispatching-logic">to get the secretary to write code</a>. There have been <a href="https://ayende.com/blog/4575/lightswitch-the-return-of-the-secretary">many tools trying to do that</a>, going all the way back to CASE in the 80s.</p>
<p>There used to be a profession called: “computer”, where you could hire a person to compute math for you. Pocket calculators didn’t invalidate them, and Excel didn’t make them redundant. They are now called accountants or data scientists, instead. And use the new tools (admittedly, calling calculators or Excel new feels very strange) to boost up their productivity enormously.</p>
<p>Developing with something like Copilot is a far easier task, since I can usually just tab complete a lot of the routine details. But having a tool to do some part of the job doesn’t mean that there is no work to be done. It means that a developer can speed up the routine bits and get to grips faster / more easily with the other challenges they have, such as figuring out why the system doesn’t do what it needs to, improving existing behavior, etc.</p>
<p>Here is a great way to use ChatGPT <a href="https://sharegpt.com/c/wnDDySj">as part of your work</a>, ask it to optimize a function. For this scenario, it did a great job. For more complex scenarios? There is too much context to express.</p>
<p>My final conclusion is that this is a really awesome tool to assist you. It can have a massive impact on productivity, especially for people working in an area that they aren’t familiar with. The downside is that sometimes it will generate junk. Then again, sometimes real people do that as well.</p>
<p>The next few years are going to be really interesting, since it provides a whole new level of capability for the industry at large, but I don’t think that it would shake the reality on the ground.</p>https://www.ayende.com/blog/198945-B/on-ai-gpt-and-the-future-of-developers?Key=d1074284-3d01-495b-a2f2-4c5f43713db6https://www.ayende.com/blog/198945-B/on-ai-gpt-and-the-future-of-developers?Key=d1074284-3d01-495b-a2f2-4c5f43713db6Thu, 02 Feb 2023 12:00:00 GMTRavenDB 6.0: Sharding webinar<p><a href="https://us02web.zoom.us/webinar/register/7116707641067/WN_APOLYWCxRviNG-nCkA5FEA"><img style="border: 0px currentcolor; float: right; display: inline; background-image: none;" title="image" src="https://ayende.com/blog/Images/Open-Live-Writer/23927057dc4f_C5C5/image_3.png" alt="image" width="388" height="333" align="right" border="0" /></a>This Wednesday I’m going to be doing a <a href="https://us02web.zoom.us/webinar/register/7116707641067/WN_APOLYWCxRviNG-nCkA5FEA">webinar about RavenDB & Sharding</a>. This is going to be the flagship feature for RavenDB 6.0 and I’m really excited to be talking about it in public <em>finally</em>.</p>
<p>Sharding involves splitting your data into multiple nodes. Similar to having different volumes of a single encyclopedia.</p>
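<p>To make the encyclopedia analogy concrete, here is a minimal sketch of hash-based routing in Python. This is a generic illustration of the idea and is not how RavenDB assigns documents to shards. The function names and the choice of SHA-256 are assumptions made for the example.</p>

```python
import hashlib


def shard_for(document_id: str, shard_count: int) -> int:
    """Map a document id to a shard number in [0, shard_count)."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    # A stable hash, so every node computes the same answer for the same id.
    digest = hashlib.sha256(document_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def split_into_shards(document_ids, shard_count: int):
    """Group document ids by the shard that would hold them."""
    shards = {n: [] for n in range(shard_count)}
    for doc_id in document_ids:
        shards[shard_for(doc_id, shard_count)].append(doc_id)
    return shards
```

<p>Each shard then holds only its own “volume” of the data, and a lookup by id goes straight to the shard computed by <em>shard_for</em>. A plain modulo like this forces most documents to move when the shard count changes, which is one of the reasons real implementations are more involved.</p>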
<p>RavenDB’s sharding implementation is something that we have spent the past three or four <em>years</em> working on. That has been quite a saga to get it out. The primary issue is that we want to achieve two competing goals:</p>
<ul>
<li>Allow you to scale the amount of data you have to near infinite levels.</li>
<li>Ensure that RavenDB remains simple to use and operate.</li>
</ul>
<p>The first goal is actually fairly easy and straightforward. It is the second part that made things complicated. After a lot of work, I believe that we have a really good solution at hand.</p>
<p>In <a href="https://us02web.zoom.us/webinar/register/7116707641067/WN_APOLYWCxRviNG-nCkA5FEA">the webinar</a>, I’m going to be presenting how RavenDB 6.0 implements sharding, the behavior of the system at scale, and all the details you need to know about how it works under the cover.</p>
<p>I’m <em>really</em> excited to finally be able to show off the great work of the team! Join me, it’s going to be really interesting.</p>https://www.ayende.com/blog/198785-B/ravendb-6-0-sharding-webinar?Key=4490c862-db91-4048-bd5e-00aabdeeadc7https://www.ayende.com/blog/198785-B/ravendb-6-0-sharding-webinar?Key=4490c862-db91-4048-bd5e-00aabdeeadc7Mon, 09 Jan 2023 12:00:00 GMTFundamental knowledge<p>I’ve been calling myself a professional software developer for just over 20 years at this point. In the past few years, I have gotten into teaching university courses in the Computer Science curriculum. I have recently had the experience of supporting a non-techie as they went through a(n intense) coding bootcamp (aiming at full stack / front end roles). I’m also building a distributed database engine and all the associated software.</p>
<p>I list all of those details because I want to make an observation about the distinction between fundamental and transient knowledge.</p>
<p>My first thought is that there is <em>so much </em>to learn. Comparing the structure of C# today to what it was when I learned it (pre-beta days, IIRC), it is a <em>very</em> different language. I had literally decades to adjust to some of those changes, but someone that is just getting started needs to grasp everything all at once. When I learned JavaScript you still had browsers in the market that didn’t recognize it, so you had to do the “//<!—” trick to get things to work (don’t ask!).</p>
<p>This goes far beyond mere syntax and familiarity with language constructs. The overall environment is also critically important. One of the basic tasks that I give in class is something similar to: “Write a network service that would serve as a remote dictionary for key/value operations”. Most students have a hard time grasping details such as IP vs. host, TCP ports, how to read from the network, error handling, etc. Adding a relatively simple requirement (make it secure from eavesdroppers) will take it entirely out of their capabilities.</p>
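<p>For reference, a bare-bones version of such a service could look like the sketch below, written in Python using only the standard library. The line-based protocol (<em>GET key</em>, <em>SET key value</em>, <em>DEL key</em>), the class names and the default port are all invented for this example and are not taken from an actual assignment.</p>

```python
import socketserver
import threading


class KeyValueStore:
    """Thread-safe dictionary driven by one-line text commands."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def execute(self, line: str) -> str:
        parts = line.strip().split(" ", 2)
        command = parts[0].upper()
        with self._lock:
            if command == "GET" and len(parts) == 2:
                value = self._data.get(parts[1])
                return "NOT_FOUND" if value is None else "VALUE " + value
            if command == "SET" and len(parts) == 3:
                self._data[parts[1]] = parts[2]
                return "OK"
            if command == "DEL" and len(parts) == 2:
                removed = self._data.pop(parts[1], None)
                return "NOT_FOUND" if removed is None else "OK"
        return "ERROR bad command"


class KeyValueHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # One command per line; the loop ends when the client disconnects.
        for raw in self.rfile:
            line = raw.decode("utf-8", errors="replace")
            if not line.strip():
                continue
            reply = self.server.store.execute(line)
            self.wfile.write((reply + "\n").encode("utf-8"))


def serve(host: str = "127.0.0.1", port: int = 9000) -> None:
    """Run the service until interrupted (blocks the calling thread)."""
    with socketserver.ThreadingTCPServer((host, port), KeyValueHandler) as server:
        server.store = KeyValueStore()
        server.serve_forever()
```

<p>Even this small version has to deal with ports, framing of messages over a stream and concurrent clients. It sends everything in plain text, so the “secure from eavesdroppers” requirement would still need something like TLS on top.</p>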
<p>Even taking a “simple” problem, such as building a CRUD website is fraught with many important details that aren’t really visible. Responsive design, mobile friendly, state management and user experience, to name a few. Add requirements such as accessibility and you are setting the bar too high to reach.</p>
<p>I intentionally chose the examples of accessibility and security, because those are “invisible” requirements. It is easy to miss them if you don’t know that they should be there.</p>
<p>My first website was a PHP page that I pushed to the server using FTP and updated live in “production”. I was exposed to all the details about DNS and IPs, understood exactly that the server side was just a machine in a closet, and had very low levels of abstractions. (Naturally, the solution had <em>no </em>security or any other –ities). However, that knowledge from those early experiments has served me very well for decades. Same for details such as how TCP works or the basics of operating system design.</p>
<p>Good familiarity with the basic data structures (heap, stack, tree, list, set, map, queue) has paid for itself many times over. The amount of time that I spent learning WinForms… still usable and widely applicable even in other platforms and environments. WPF or jQuery? Not so much.</p>
<p>Learning patterns paid many dividends and was applicable on a wide range of applications and topics.</p>
<p>I looked into the topics that are being taught (both for bootcamps and universities) and I understand why in many cases, those are being skipped. You can actually <em>be</em> a front end developer without understanding much (if at all) about networks. And the breadth of details you need to know is immense.</p>
<p>My own tendency is to look at the low level stuff, and given that I work on a database engine, that is obviously quite useful. What I have found, however, is that whenever I dug deep into a topic, I found ways to utilize that knowledge at a later point in time. Sometimes I was able to solve a problem in a way that would be utterly inconceivable to me previously. I’m not just talking about being able to immediately apply new knowledge to a problem. If that were the case, I would attribute that to wanting to use the new thing I just learned.</p>
<p>However, I’m talking about scenarios where months or years later I ran into a problem, and was then able to find the right solution given what was then totally useless knowledge.</p>
<p>In short, I understand that chasing the 0.23-alpha-stage-2.3.1-dev updates on the left-pad package is <em>important, </em>but I found that spending time deep in the stack has a great cumulative effect.</p>
<p><a href="https://www.joelonsoftware.com/2002/11/11/the-law-of-leaky-abstractions/">Joel Spolsky wrote about leaky abstractions</a>, that was <strong>20 years ago</strong>. I remember reading that blog post and <em>grokking</em> that. And it is true, being able to dig one or two layers down from where you usually live has a huge amount of leverage on your capabilities.</p>https://www.ayende.com/blog/198593-A/fundamental-knowledge?Key=4c32a950-455e-418e-bae4-5110b84357e0https://www.ayende.com/blog/198593-A/fundamental-knowledge?Key=4c32a950-455e-418e-bae4-5110b84357e0Tue, 29 Nov 2022 12:00:00 GMTBeating FizzBuzz for detecting qualified candidates<p>FizzBuzz is a well known test to show that you can program. To be rather more exact, it is a simple test that does not tell you if you can program well, but if you cannot do FizzBuzz, you cannot program. This is a fail only kind of metric. We need this thing because sadly, we see people that fail FizzBuzz coming to interviews.</p>
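<p>For readers who haven’t seen it, the usual formulation of FizzBuzz is to print the numbers 1 to 100, replacing multiples of 3 with “Fizz”, multiples of 5 with “Buzz” and multiples of both with “FizzBuzz”. A minimal sketch in Python (the function names are just for illustration):</p>

```python
def fizzbuzz(n: int) -> str:
    """Return the FizzBuzz word for n, or n itself as a string."""
    if n % 15 == 0:  # divisible by both 3 and 5
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)


def fizzbuzz_lines(limit: int = 100):
    """Return the FizzBuzz sequence for 1..limit as a list of strings."""
    return [fizzbuzz(n) for n in range(1, limit + 1)]


if __name__ == "__main__":
    print("\n".join(fizzbuzz_lines()))
```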
<p>I have another test, which I feel is simpler than FizzBuzz, which can significantly reduce the field of candidates. I show them this code and ask them to analyze what is going on here:</p>
<blockquote>
<script src="https://gist.github.com/ayende/c41e7870bfb84d041829e804e05faa0f.js"></script>
</blockquote>
<p>Acceptable answers include puking, taking a few moments to breathe into a paper bag and mild to moderate <a href="https://ayende.com/blog/183713-C/toddlers-cursing-and-preparing-ahead-of-time">professional swearing</a>.</p>
<p>This is something that I actually ran into (about 15 years ago, in the WebForms days) and I have used it ever since. That is a great way to measure just how much a candidate knows about the environment in which they operate.</p>https://www.ayende.com/blog/195905-C/beating-fizzbuzz-for-detecting-qualified-candidates?Key=ad59485b-6453-45c7-a0f7-96c853aa964bhttps://www.ayende.com/blog/195905-C/beating-fizzbuzz-for-detecting-qualified-candidates?Key=ad59485b-6453-45c7-a0f7-96c853aa964bFri, 31 Dec 2021 12:00:00 GMT“Work well under pressure” is a safety valve, not SOP<p><a href="https://ayende.com/blog/Images/Open-Live-Writer/Work-well-under-pressure_AD81/image_2.png"><img style="border: 0px currentcolor; float: right; display: inline; background-image: none;" title="image" src="https://ayende.com/blog/Images/Open-Live-Writer/Work-well-under-pressure_AD81/image_thumb.png" alt="image" width="212" height="408" align="right" border="0" /></a>The phrase “work well under pressure” is something that I consider to be a red flag in a professional environment. My company builds a database that is used as the backend of business critical systems. If something breaks, there is a <em>need</em> to fix it. It costs money (sometimes a <em>lot</em> of money) for every minute of downtime.</p>
<p>Under such a scenario, I absolutely want the people handling the issue to remain calm, collected and analytical. In such a case, being able to work well under pressure is a huge benefit.</p>
<p>That is not how this term is typically used, however. You’ll usually hear this phrase used to describe the <em>usual working environment</em>. For example, working under time pressure to deliver certain functionality. That sort of pressure is toxic over time.</p>
<p>Excess stress is a well known contributor to health issues (mental and physical ones). It will cause you to make mistakes and it adds friction all around.</p>
<p>From my perspective, the ability to work well under pressure is an absolutely important quality, which should be <em>hoarded. </em>You may need to utilize this ability in order to deal with a blocking customer issue, but you should be careful not to spend it on non-critical stuff.</p>
<p>And by definition, <em>most things </em>are not critical. If everything is critical, you have a different problem.</p>
<p>That means that part of the task of the manager is to identify the places where pressure is applied and remove that. In the context of software, that may be delaying a release date or removing features to reduce the amount of work.</p>
<p>When working with technology, the most valuable asset you have is the people and the knowledge they have. And one of the easiest ways to lose that is to burn the candle at both ends. You get more light, sure, but you also get no candle.</p>https://www.ayende.com/blog/195841-B/work-well-under-pressure-is-a-safety-valve-not-sop?Key=011bd589-bf2f-4c81-b2c9-d5a062320469https://www.ayende.com/blog/195841-B/work-well-under-pressure-is-a-safety-valve-not-sop?Key=011bd589-bf2f-4c81-b2c9-d5a062320469Tue, 21 Dec 2021 12:00:00 GMTRavenDB 5.3 New Features: Studio & Query improvements<p>I like to think about myself as a database guy. My go to joke about building user interfaces is that a <table> is all I need for layout (it’s not a joke). About a decade ago I just gave up on trying to follow what is going on in the frontend land and accepted that I’ll reside in the backend from here on after.</p>
<p>Being ignorant of the ways you’ll write a modern frontend doesn’t affect the fact that I like to <em>use</em> a good user interface. I have seriously mixed feelings about the importance of RavenDB Studio to the project. On the one hand, I <em>care</em> that it is easy to use, obvious and functional. I <em>love</em> that it is beautiful and will generally make your life easier. And at the same time, I abhor the fact that it has such an impact on people’s decisions. I mean, the backend of RavenDB is absolutely beautiful, from a technical perspective. But everyone always talks about the studio.</p>
<p>Leaving aside my mini rant, we spend quite a lot of time and effort on the studio and the User Experience in general. This release is not an exception and we have a couple of major new updates to the studio.</p>
<p>One of the most common things you’ll do in the studio is run queries. In this release we have done a complete revamp of the automatic code completion for the client-side RQL queries written in the studio.<br />The new code assistance is available when writing any query in the Query view, Patch view, and in the Subscription Query. That was actually quite interesting, from a computer science perspective. We have a <a href="https://github.com/ravendb/ravendb/blob/v5.3/src/Raven.Studio/languageService/grammar/BaseRqlParser.g4">formal grammar</a> for RQL now, which means that we can provide a much better experience for query editing. For example, take a look:</p>
<p><a href="https://ayende.com/blog/Images/Open-Live-Writer/RavenDB-5.3-New_C6EA/image_6.png"><img style="margin: 0px; border: 0px currentcolor; display: inline; background-image: none;" title="image" src="https://ayende.com/blog/Images/Open-Live-Writer/RavenDB-5.3-New_C6EA/image_thumb_2.png" alt="image" width="735" height="329" border="0" /></a></p>
<p>Full code completion assistance and better error handling directly at the studio makes it easier to work with RavenDB for both developers and operations.</p>
<p>The second feature is the Identities page:</p>
<p><a href="https://ayende.com/blog/Images/Open-Live-Writer/RavenDB-5.3-New_C6EA/image_2.png"><img style="margin: 0px; border: 0px currentcolor; display: inline; background-image: none;" title="image" src="https://ayende.com/blog/Images/Open-Live-Writer/RavenDB-5.3-New_C6EA/image_thumb.png" alt="image" width="300" height="241" border="0" /></a></p>
<p>Identities have been a feature in RavenDB for a <em>long</em> time, and somehow they have never been front and center. Maybe the discoverability of the feature suffered? You can now create and edit identities directly in the studio, not just through the API.</p>
<p><a href="https://ayende.com/blog/Images/Open-Live-Writer/Black-Friday-Surge_F0FA/image_2.png"><img style="float: right; display: inline; background-image: none;" title="image" src="https://ayende.com/blog/Images/Open-Live-Writer/Black-Friday-Surge_F0FA/image_thumb.png" alt="image" width="821" height="257" align="right" border="0" /></a>Next week is Black Friday, which has reached global phenomenon status. It is a fun day for shoppers, and a <em>nervous wreck</em> for IT admins everywhere. It is not uncommon to see traffic double or triple, and the actual load (processing more heavyweight requests) can go up an order of magnitude. Preparing for Black Friday can be a harrowing issue since you have a narrow window of opportunity and it is hard to know exactly where the stress points are.</p>
<p>This year, I decided to make your life easier, and RavenDB is offering a Black Friday Surge to all our customers. No, we aren’t offering you 50% off and everything must go. What we do instead is try to be of <em>help</em>.</p>
<p>This Black Friday (and Cyber Monday as well), we are offering all our customers double what they paid for. When running RavenDB on premise, if you purchased a RavenDB license for a 12-core cluster (running on 3 nodes of 4 cores each), we’ll offer you 30 days of double the core count. In other words, you can scale your system to be twice as powerful, and it won’t cost you a cent.</p>
<p>On the cloud, as well, we will provide users with credits to upgrade their clusters to the next level up (doubling their power) for a full week during the next 30 days. Again, there is no extra cost here.</p>
<p>You can <a href="https://ravendb.net/promos/black-friday-2021">register for the Surge here</a> to request the upgrade and you’ll get twice as much power to handle the increased load.</p>
<p>Enjoy the power up!</p>https://www.ayende.com/blog/195489-B/ravendb-and-the-black-friday-surge?Key=0c58bda1-4215-4ca0-aa64-cb2bc98063b5https://www.ayende.com/blog/195489-B/ravendb-and-the-black-friday-surge?Key=0c58bda1-4215-4ca0-aa64-cb2bc98063b5Fri, 19 Nov 2021 12:00:00 GMTRavenDB 5.3 New Features: Incremental time series & implementing lambda based accounting<p>Everyone is on the cloud these days, and one of the things that I keep seeing pushed is the notion of usage based billing. Basically, the idea that you are paying for what you use.</p>
<p>Let’s assume that we are building a software as a service where users can submit an image and you’ll do some computation on that. The actual details aren’t relevant. What matters is that your pricing model is based around how much time processing each image takes and how much memory is used. You are running this on many machines and need to figure out how to do billing at the end of the month. It turns out that this can be quite a challenge. With incremental time series, a lot of the details around that just go away.</p>
<p>Here is how you can implement this:</p>
<blockquote>
<script src="https://gist.github.com/ayende/bcd2a3b2195e8bd9682e0ba521b3e9f3.js"></script>
</blockquote>
<p>You count the required memory as well as the actual runtime and record that in an incremental time series. We are also storing the details in a separate document for that particular run in the same transaction (if the user cares about that level of detail). The interesting bit about how this can be used is that the data is now immediately available for the user to see how much they are going to be billed.</p>
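<p>To illustrate the accounting model without the RavenDB client, here is a small in-memory stand-in written in Python. It is a sketch under stated assumptions: the <em>UsageLedger</em> class and its methods are hypothetical, and only the behavior (values recorded at the same timestamp add up instead of overwriting each other) mirrors the description above.</p>

```python
from collections import defaultdict
from datetime import datetime, timezone


class UsageLedger:
    """In-memory stand-in for an incremental time series of usage data."""

    def __init__(self):
        # (user_id, timestamp) -> [memory_mb, runtime_ms]
        self._points = defaultdict(lambda: [0, 0])

    def increment(self, user_id, timestamp, memory_mb, runtime_ms):
        """Add usage to whatever is already recorded at that timestamp."""
        point = self._points[(user_id, timestamp)]
        point[0] += memory_mb
        point[1] += runtime_ms

    def monthly_total(self, user_id, year, month):
        """Sum one user's usage over a calendar month."""
        memory_mb = runtime_ms = 0
        for (uid, ts), (mem, run) in self._points.items():
            if uid == user_id and ts.year == year and ts.month == month:
                memory_mb += mem
                runtime_ms += run
        return memory_mb, runtime_ms


if __name__ == "__main__":
    ledger = UsageLedger()
    now = datetime.now(timezone.utc)
    ledger.increment("users/1", now, memory_mb=256, runtime_ms=120)
    print(ledger.monthly_total("users/1", now.year, now.month))
```

<p>Because increments are additive, two workers reporting usage for the same user at the same instant do not need to coordinate, and the running total can be read at any time during the month.</p>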
<p>Typically, a lot of time is spent in figuring out how to record those details efficiently and then how to query and aggregate those. We tested time series in RavenDB to billions of data points, and the internal format lends itself very well to aggregated queries.</p>
<p>Now you can take the code above, run it on 100s of machines, and it will all end up giving you the proper result in the end.</p>https://www.ayende.com/blog/195268-C/ravendb-5-3-new-features-incremental-time-series-implementing-lambda-based-accounting?Key=e0d2f60a-35fa-4091-8f96-5b5a80b69107https://www.ayende.com/blog/195268-C/ravendb-5-3-new-features-incremental-time-series-implementing-lambda-based-accounting?Key=e0d2f60a-35fa-4091-8f96-5b5a80b69107Mon, 15 Nov 2021 12:00:00 GMTRavenDB 5.3 New Features: Incremental time series<p><a href="https://ayende.com/blog/Images/Open-Live-Writer/RavenDB-5.3-Features_12D89/image_4.png"><img style="float: right; display: inline; background-image: none;" title="image" src="https://ayende.com/blog/Images/Open-Live-Writer/RavenDB-5.3-Features_12D89/image_thumb_1.png" alt="image" width="493" height="256" align="right" border="0" /></a>In RavenDB 5.0 we had a major new feature, native time series support. Using this feature, you can store values over time, query and aggregate them, store them efficiently, produce rollups, etc.</p>
<p>The classic example for time series data in RavenDB is when you have data coming from sensors, for example a Fitbit monitoring heart rate or a stock exchange feed giving you stock values. You don’t care about a particular value, you care about the value <em>over time</em>. It turns out that there are quite a lot of use cases for this kind of data. We have seen a major pickup in IoT related fields in particular.</p>
<p>However, the API we provided for users to insert data into a time series had a limitation. Have a look:</p>
<blockquote>
<script src="https://gist.github.com/ayende/7a1b9c103ef263040cc52e4f16dd7636.js"></script>
</blockquote>
<p>The API gives you the ability to record a value (or a set of values) at a particular point in time, with an optional tag for additional meaning. What is the problem with this API, then?</p>
<p>Well, it works great if you are processing data from a single source (the stock exchange feed, or a medical device), but it fails to do its job if you need to record multiple values for the same timestamp.</p>
<p>Huh? What does that even <em>mean</em>? If we are storing a value per timestamp, obviously there should be <em>a</em> value for that timestamp. How can there be multiple values? Note that here I’m not talking about something like location (with latitude and longitude coordinates); those are covered by storing an array of values on the same timestamp.</p>
<p>The issue happens when you have the need to record multiple <em>different</em> values at the same timestamp. Typical time series are things like Heartrate, Location, StockPrice, etc. Having multiple values for the same thing at the same time frame doesn’t really work. In the Location time series, if I’m both <em>here</em> and <em>there, </em>you can expect trouble (if only because the paradox cops will show up). A stock may have different prices at the same time <em>in different exchanges</em>, sure, but that is not the same value, by its very nature.</p>
<p>There is a common scenario where this will happen: when what I’m recording is not the full value, but <em>part</em> of that value. The classic example for that is tracking page views. Let’s say that I want to know how many people are looking at this blog post. I cannot use the <em>Append()</em> API for that purpose. Each individual operation is going to belong to a particular timestamp. What happens if I have two views on this post at the <em>exact same millisecond</em>? For that matter, what happens in the more “interesting” case of having writes to the same millisecond on <em>two different nodes</em> in the cluster?</p>
<p>With time series as we envisioned them for the 5.0 release, that wasn’t an issue: a time series had <em>a</em> value at a particular timestamp. But a scenario such as tracking views, or any scenario where we want to record partial data and have RavenDB take care of everything else, isn’t served well by this model.</p>
<p>Note that RavenDB already has the notion of distributed counters, they are intended specifically for doing such things. It is trivial in RavenDB to implement a counter that would track the overall views on a post. It will also handle concurrency, distributing data between nodes, everything that needs to be handled. So why can’t I use that?</p>
<p>It turns out that I typically want to know more than just the total number of views on the post, I want to know <em>when</em> they happened. Counters are only a partial answer for that.</p>
<p>That is why incremental time series were created. They are here to marry the ability of time series to track a value over time with the ability of distributed counters to aggregate information concurrently and in a safe, distributed manner. Here is the new API for incremental time series:</p>
<blockquote>
<script src="https://gist.github.com/ayende/39f9def81621107f2311f308203ce2ed.js"></script>
</blockquote>
<p>The changes are apparent at the API level. <em>Increment</em>() is not setting the value, it is incrementing it by a delta. So two increments on the same timestamp will give you the right result. Note that we don’t have a way to tag the entry any longer. That is no longer meaningful, because a single timestamp may have multiple different values. The method is called increment, but note that you can also pass negative values, if you want to reduce the amount.</p>
<p>You can see in the image on the right what this looks like in the studio. An incremental time series is one that has the “INC:” prefix in the name. Such a time series accepts <em>only</em> increment operations; it will reject attempts to append values to it. In the same sense, a non-incremental time series will not allow you to increment a value, only append new entries. We wanted to have a strong separation between the two time series modes because mixing them up resulted in a huge mess of edge cases that are really hard to solve.</p>
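<p>The separation between the two modes can be pictured as a simple naming rule. The following Python sketch is an illustration only, not RavenDB code: <em>is_incremental</em> and <em>check_operation</em> are hypothetical helpers, and the exact, case-sensitive prefix match is an assumption of the sketch.</p>

```python
INCREMENTAL_PREFIX = "INC:"


def is_incremental(series_name):
    # Per the post, incremental time series carry the "INC:" prefix in the name.
    # Matching the prefix exactly (case-sensitively) is an assumption here.
    return series_name.startswith(INCREMENTAL_PREFIX)


def check_operation(series_name, operation):
    """Raise ValueError when the operation does not match the series mode."""
    if operation not in ("append", "increment"):
        raise ValueError(f"unknown operation: {operation}")
    expected = "increment" if is_incremental(series_name) else "append"
    if operation != expected:
        raise ValueError(f"{series_name!r} only accepts {expected} operations")


check_operation("INC:Views", "increment")   # allowed
check_operation("Heartrate", "append")      # allowed
```

<p>Calling <em>check_operation("INC:Views", "append")</em> or <em>check_operation("Heartrate", "increment")</em> raises an error in this sketch, mirroring the rule that the two modes never mix.</p>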
<p>I probably should explain the terminology here, because it reflects an important distinction:</p>
<ul>
<li>Append – add a new timestamp and the value(s) for that time. This appends a new entry to the <em>time series</em>. Appending an entry at a time that is already in the time series will overwrite the existing entry.</li>
<li>Increment – add a new timestamp and its values. If there is already a value for that time in the time series, we’ll add the new value and the existing value together, writing their sum as the new value.
<ul>
<li>That isn’t actually how it works internally, but that is the conceptual model.</li>
</ul>
</li>
</ul>
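<p>The difference between the two operations can be sketched in a few lines of Python. This models only the conceptual behavior described above (it is explicitly not how RavenDB stores things internally), and <em>TimeSeriesModel</em> is a hypothetical name used for illustration.</p>

```python
class TimeSeriesModel:
    """Conceptual model: timestamp -> value (not RavenDB's internal format)."""

    def __init__(self):
        self.entries = {}

    def append(self, timestamp, value):
        # Append overwrites whatever was stored at that timestamp.
        self.entries[timestamp] = value

    def increment(self, timestamp, delta):
        # Increment adds the delta to the existing value (or to zero).
        self.entries[timestamp] = self.entries.get(timestamp, 0) + delta


regular = TimeSeriesModel()
regular.append(1000, 1)
regular.append(1000, 1)       # the second view at the same millisecond is lost

views = TimeSeriesModel()
views.increment(1000, 1)
views.increment(1000, 1)      # both views are counted
views.increment(2000, 5)
views.increment(2000, -2)     # negative deltas reduce the amount
```

<p>After running this, the regular series holds a single view at timestamp 1000, while the incremental model holds two views at 1000 and three at 2000.</p>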
<p>Aside from using increment to set the values, incremental time series behave just like any other time series. You can query over them, aggregate, index, etc. You can also create rollups, apply retention policies, and do everything else that you can do with a time series. Note that a rolled-up incremental time series is a normal time series, not an incremental one: the special behavior of incremental time series does not extend to its rolled-up versions.</p>
<p>Here is a full example of how you can use this feature:</p>
<blockquote>
<script src="https://gist.github.com/ayende/367ef1701ec30404dac82709af1d7168.js"></script>
</blockquote>
<p>As usual, this is transactional with any other operation you may want to do, so you can increment a time series alongside uploading an attachment and modifying a document, as a single atomic transaction.</p>
<p>And now we can ask about view counts on an hourly basis for the last week, like so:</p>
<blockquote>
<script src="https://gist.github.com/ayende/808eac0c91cd5aeb812b8312954eee08.js"></script>
</blockquote>
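<p>For readers who want to see what “hourly for the last week” means in terms of the data, here is a small Python sketch of the aggregation itself. It is not the RavenDB query; <em>views_per_hour</em> and the sample entries are hypothetical, and the real work is done server side by the query above.</p>

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def views_per_hour(entries, now, days=7):
    """Sum (timestamp, count) entries into hourly buckets for the last `days` days."""
    cutoff = now - timedelta(days=days)
    buckets = defaultdict(int)
    for timestamp, count in entries:
        if timestamp >= cutoff:
            # Truncate the timestamp to the start of its hour.
            hour = timestamp.replace(minute=0, second=0, microsecond=0)
            buckets[hour] += count
    return dict(sorted(buckets.items()))


now = datetime(2021, 11, 12, 12, 0, tzinfo=timezone.utc)
entries = [
    (now - timedelta(minutes=5), 2),
    (now - timedelta(minutes=20), 3),
    (now - timedelta(hours=2), 1),
    (now - timedelta(days=10), 7),   # older than a week, ignored
]
hourly = views_per_hour(entries, now)
```

<p>The two entries from the 11:00 hour collapse into a single bucket of five views, the 10:00 entry stays on its own, and the ten-day-old entry is dropped.</p>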
<p>This feature is going to be available in all editions of RavenDB 5.3, expected for release in mid November. I have <em>so many </em>ideas about what you can use this for <img class="wlEmoticon wlEmoticon-smile" src="https://ayende.com/blog/Images/Open-Live-Writer/RavenDB-5.3-Features_12D89/wlEmoticon-smile_2.png" alt="Smile" />.</p>https://www.ayende.com/blog/195267-C/ravendb-5-3-new-features-incremental-time-series?Key=69dfe678-d0a1-4788-ae05-1cf0e0b77078https://www.ayende.com/blog/195267-C/ravendb-5-3-new-features-incremental-time-series?Key=69dfe678-d0a1-4788-ae05-1cf0e0b77078Fri, 12 Nov 2021 12:00:00 GMTRavenDB 5.3 New Features: Concurrent Subscriptions & Serial operations<p>Almost as soon as we introduced <a href="https://ayende.com/blog/195265-C/ravendb-5-3-features-concurrent-subscriptions?key=39012fb5d555413d8fd050479b6c6322">concurrent subscriptions</a>, we ran into a serious problem in their use. The desire was to do things in a serial fashion. That was quite infuriating, because we spent so much time working on making things concurrent, and now we had to deal with making them serial again? What the hell?</p>
<p>Before I dive any further, it will probably be for the best if I explained a bit more about the context of this very strange feature request.</p>
<p>Consider a system where the subscription is used to process commands, which may have relationships between one another. For example, consider the following commands (all of them belonging to the same “Commands” collection):</p>
<ul>
<li>EmployeePayroll – commands/40-A</li>
<li>EmployeeBankAccountChange – commands/34-A</li>
<li>EmployeeContractUpdate – commands/49-C</li>
</ul>
<p>For each one of those commands (and many more), we want to run some logic. Some of this requires us to touch third party services, which means that we are likely to be slow or stalled in some cases. That is the exact case for using concurrent subscriptions.</p>
<p>The developers quickly jumped on the new system, setting the mode of the subscription as concurrent and running multiple workers. Things worked, latency was down and everyone was happy. Everyone, that is, except for George. The problem was George had gotten married recently. Well, that wasn’t the actual problem. George is <em>happily </em>married. The problem is that George and his wife have a new joint bank account. George let the HR department know about the new bank account in advance, which resulted in the <em>EmployeeBankAccountChange</em> command being generated. Then payroll day hit, and we have an <em>EmployeePayroll</em> command as well.</p>
<p>This is where things started to get iffy. In terms of timing, the <em>EmployeeBankAccountChange</em> happened before the <em>EmployeePayroll</em> command. When the subscription was running in serial mode, it was guaranteed to always process the commands in the order that they were modified. That meant that handling things like changing the bank account and actually paying had a very natural order. If you made the change before payroll, it got processed beforehand; otherwise, it was processed afterward.</p>
<p>With concurrent subscriptions, this is no longer the situation. We are still working <em>roughly</em> in the order of modification, but we are no longer guaranteeing it. And it is possible to process documents out of order.</p>
<p>RavenDB’s concurrent subscriptions will ensure that you’ll not have to worry about concurrent processing of a single document, but in this case, there are different documents, so they can be processed concurrently. An <em>EmployeeBankAccountChange</em> may take a long time (verifying accounts, etc) while <em>EmployeePayroll</em> is just adding a line to an ACH file, so it is very likely that we’ll process the payroll before the account change. And that makes George very sad. Let’s see how we can avoid depressing the newlywed.</p>
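<p>George’s problem is easy to reproduce on paper. The Python sketch below is a deliberately simplified simulation, not RavenDB behavior: it hands out one command at a time (the real server sends batches), and <em>completion_order</em> and the durations are made up for illustration.</p>

```python
import heapq


def completion_order(commands, workers):
    """Simulate `workers` parallel workers taking commands in modification order.

    `commands` is a list of (name, duration) pairs in modification order.
    Returns the command names in the order in which they finish.
    """
    free_at = [(0, index) for index in range(workers)]
    heapq.heapify(free_at)
    finished = []
    for position, (name, duration) in enumerate(commands):
        start, worker = heapq.heappop(free_at)   # earliest available worker
        end = start + duration
        heapq.heappush(free_at, (end, worker))
        finished.append((end, position, name))
    return [name for _, _, name in sorted(finished)]


commands = [("EmployeeBankAccountChange", 30), ("EmployeePayroll", 1)]
serial = completion_order(commands, workers=1)
concurrent = completion_order(commands, workers=2)
```

<p>With a single worker the bank account change completes first. With two workers the quick payroll command finishes long before the slow account change, which is exactly the ordering we want to avoid.</p>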
<p>One option is to make use of another RavenDB feature, the compare exchange support. This allows you to use strongly consistent, cluster-wide, values which are suitable for distributed locks. I looked into what it will take to build this and quailed in fear. I don’t want to let things become this complicated.</p>
<p>The key issue here is that we want both concurrency and serial work. An interesting observation is that there is a <em>scope</em> for such things. Commands on the same employee should run in the same order they were issued, commands on <em>different </em>employees are free to run in whatever order they like. How can we make this work without diving head first into complexity the likes of which will keep you up at night?</p>
<p>For the most part, we can assume that concurrent operations for the same employee are rare. Even when we have multiple commands for the same employee, we can expect that there won’t be <em>many</em> of them. Given that, we can change the way we model the commands themselves. Instead of creating a document per command, we’ll have a document <em>per employee</em>.</p>
<p>Where before we had this model:</p>
<blockquote>
<script src="https://gist.github.com/ayende/6e60596a8cf3e31d50d8fd93e42226a4.js"></script>
</blockquote>
<p>We’ll now have the following model:</p>
<blockquote>
<script src="https://gist.github.com/ayende/c1852c8d66874dc5c36d3186595cc830.js"></script>
</blockquote>
<p>What does this give us? We now have a <em>commands/employees/1-A</em> document for the first employee, and all operations on the employee are handled as a single unit, guaranteed by the concurrent subscription. Let’s explore further how that works, okay?</p>
<p>With the previous model, to register a command, we just need to call:</p>
<blockquote>
<script src="https://gist.github.com/ayende/5e52a1f9c5de6ff6af870cdd754288c5.js"></script>
</blockquote>
<p>All the commands were using the Commands collection, so the subscription looked like this:</p>
<blockquote>
<pre><span style="color: #0000ff;">from </span>Commands</pre>
</blockquote>
<p>And if we process this concurrently, we may process the commands for the same employee at the same time, leading to sadness in the household of George. Instead, with the new model/modeling, we can use the patching API to handle this. Here is what this looks like:</p>
<blockquote>
<script src="https://gist.github.com/ayende/85b6d15b5d7960c3529ab4bc7456edf0.js"></script>
</blockquote>
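<p>In conceptual terms, enqueueing a command now means “append to the employee’s commands document, creating that document if it is missing”. Here is a minimal Python sketch of that idea over a plain dictionary. It is not the RavenDB patching API, and the <em>Commands</em> field name and <em>enqueue_command</em> helper are assumptions made for illustration.</p>

```python
def enqueue_command(documents, employee_id, command):
    """Add a command to the employee's commands document, creating it if missing."""
    doc_id = f"commands/{employee_id}"
    # Create the document on first use, otherwise reuse the existing one.
    doc = documents.setdefault(doc_id, {"Commands": []})
    doc["Commands"].append(command)
    return doc_id


documents = {}
enqueue_command(documents, "employees/1-A", {"Type": "EmployeeBankAccountChange"})
enqueue_command(documents, "employees/1-A", {"Type": "EmployeePayroll"})
enqueue_command(documents, "employees/2-A", {"Type": "EmployeePayroll"})
```

<p>Both commands for the first employee end up in the same document, in the order in which they were enqueued, while the second employee gets a document of their own.</p>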
<p>The idea in this case is that all commands for the same employee use the same document. If there isn’t already such a value, we’ll create a new instance, otherwise, we’ll apply the patch script and add to it. The end result is that we can have multiple concurrent operations and they will all be added to the same document in order of execution. However, so far this has nothing to do with concurrent subscriptions. What do we do from here? Here is what the subscription worker looks like after these changes:</p>
<blockquote>
<script src="https://gist.github.com/ayende/16ee8ad407f4c890675b3280d1577c9e.js"></script>
</blockquote>
<p>The idea is that when we enqueue a command, we register it in the document specifically for the employee (the scope for serial work in a concurrent subscription), and when we process the commands in the subscription worker, we patch out all the commands that we already executed.</p>
<p>This behavior will guarantee that we can process commands serially within a concurrent worker. All commands for the same employee will be processed serially in the order they were submitted, while different employees will be processed concurrently. We even support adding additional commands to the employee document while the worker is <em>processing </em>commands; we’ll simply handle them in the next batch after the employee commands are all done.</p>
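<p>The worker side of this can be sketched in the same spirit. The Python below is an in-memory illustration, not the actual subscription worker: <em>process_employee_document</em> is a hypothetical helper, and it assumes new commands are always added at the end of the <em>Commands</em> list.</p>

```python
def process_employee_document(doc, handle):
    """Run the commands currently in the document in order, then remove them."""
    pending = list(doc["Commands"])       # snapshot of what this batch saw
    for command in pending:
        handle(command)
    # "Patch out" only the commands we executed; anything enqueued while we
    # were working stays in the document for the next batch.
    del doc["Commands"][:len(pending)]
    return len(pending)


doc = {"Commands": [{"Type": "EmployeeBankAccountChange"}, {"Type": "EmployeePayroll"}]}
executed = []


def handle(command):
    executed.append(command["Type"])
    if command["Type"] == "EmployeeBankAccountChange":
        # Simulate a new command arriving while the worker is still busy.
        doc["Commands"].append({"Type": "EmployeeContractUpdate"})


count = process_employee_document(doc, handle)
```

<p>The two original commands run in order, and the contract update that arrived in the middle is left in the document, waiting for the next batch.</p>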
<p>One thing that I’m not discussing here is what to do in case we have concurrent modifications on the commands document in multiple nodes. That would generate a conflict, and RavenDB defaults to selecting the latest version. You can configure RavenDB to resolve this properly, <a href="https://ravendb.net/learn/inside-ravendb-book/reader/4.0/6-ravendb-clusters#conflicts">I talk about this at length here</a>.</p>
<p>Aside from leaning on the new concurrent subscriptions feature, everything else that we used in this post to solve the problem is a long standing feature of RavenDB. Both conceptually and in practice, this gives us a great deal of simplicity in handling a non trivial issue.</p>
<p>As usual, I would very much welcome your feedback.</p>https://www.ayende.com/blog/195266-C/ravendb-5-3-new-features-concurrent-subscriptions-serial-operations?Key=020aa66a-fae1-4077-9179-9c730610e4e6https://www.ayende.com/blog/195266-C/ravendb-5-3-new-features-concurrent-subscriptions-serial-operations?Key=020aa66a-fae1-4077-9179-9c730610e4e6Thu, 11 Nov 2021 12:00:00 GMTRavenDB 5.3 New Features: Concurrent subscriptions<p><a href="https://ayende.com/blog/Images/Open-Live-Writer/Rave.3-Features-Concurrent-subscriptions_FF11/image_2.png"><img style="float: right; display: inline; background-image: none;" title="image" src="https://ayende.com/blog/Images/Open-Live-Writer/Rave.3-Features-Concurrent-subscriptions_FF11/image_thumb.png" alt="image" width="575" height="353" align="right" border="0" /></a>RavenDB supports a dedicated batch processing mode, using the notion of subscriptions. A subscription is simply a way to register a query with the database and have the database send the subscriber the documents that match the query.</p>
<p>The previous sentence is taken directly from the Inside RavenDB book, and it is a good intro for the topic. A subscription is a way to process documents that match a query. A good example might be to run various business processes as a result of data changes. Let’s assume that we have a bank, and a new customer was registered. We need to run plenty of such processes (Know Your Customer, Anti Money Laundering, Credit Score, in-house estimation, credit limits & authorization, etc).</p>
<p>A typical subscription query would then be:</p>
<blockquote>
<pre><span style="color: #0000ff;">from</span> Customers <span style="color: #0000ff;">where</span> Onboarded = <span style="color: #0000ff;">false</span></pre>
</blockquote>
<p>And then we can register to that subscription. At this point, the database will start sending us all the customers that haven’t been onboarded yet. This is a <em>persistent</em> query, so restarts and failures are handled properly. And the key aspect is that RavenDB will <em>push</em> the matching documents to the subscription worker. RavenDB will handle batching of the results, ensure that we can process humongous amounts of data safely and easily and in general remove a lot of hassle from backend processing.</p>
<p>Up until RavenDB 5.3, however, a subscription was defined to be a singleton. In other words, at any given point, only a single subscription worker could be running. That is enforced by the server and helps make it much easier to reason about processing documents. One comment that we got is that this is great if the processing that we are doing is internal, but if there is the need to make a remote call to a potentially slow service, that can be an issue.</p>
<p>For example, consider the following worker code:</p>
<blockquote>
<script src="https://gist.github.com/ayende/c09e2c788d2e097f5f87e0dba26f7535.js"></script>
</blockquote>
<p>What happens when the <em>CheckCreditScore()</em> is slow? We are halting processing for <em>everything</em>. In some cases, it is only particular customers that are slow, and we absolutely want to process them in parallel. However, RavenDB did not allow that.</p>
<p>In RavenDB 5.3, we are bringing concurrent subscriptions to the table. When you create the subscription worker, you can define it with a Concurrent mode, like so:</p>
<blockquote>
<script src="https://gist.github.com/ayende/40a0dc74e523eb6b12b0639c45f9c224.js"></script>
</blockquote>
<p>When you have done that, RavenDB will allow multiple concurrent workers to run at the same time, processing batches in parallel. That means that a single slow customer will not halt your entire processing pipeline.</p>
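<p>A bit of arithmetic shows why this matters. The following Python sketch is a simulation with made-up batch durations, not a measurement of RavenDB, and <em>total_processing_time</em> is a hypothetical helper. It computes how long it takes to finish all batches when every free worker simply takes the next batch.</p>

```python
import heapq


def total_processing_time(batch_durations, workers):
    """Time until all batches are done when each free worker takes the next batch."""
    free_at = [0] * workers
    heapq.heapify(free_at)
    finish = 0
    for duration in batch_durations:
        start = heapq.heappop(free_at)    # the worker that frees up first
        end = start + duration
        heapq.heappush(free_at, end)
        finish = max(finish, end)
    return finish


# One batch contains a slow customer (20 time units), the rest are quick.
durations = [1, 1, 20, 1, 1, 1]
serial_time = total_processing_time(durations, workers=1)
two_workers_time = total_processing_time(durations, workers=2)
```

<p>In this sketch the overall time is still bounded by the slow batch, but the quick batches no longer queue up behind it: the second worker finishes all of them while the first one is still waiting on the slow customer.</p>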
<p>In general, I would like you to think about this flag as just removing a limitation. Previously we blocked you from an operation, and now you can run freely. However…</p>
<p>We didn’t decide to limit your capabilities just because we like doing that. One of the <em>key</em> aspects of subscriptions is that they offer <em>reliable</em> processing of documents. If an exception has been thrown when processing a batch, RavenDB will resend the batch to the worker again, until processing is successful. If we handed a batch of documents to process to a worker, and that worker crashed without letting us know, we need to make sure that the <em>next</em> client to connect will start processing from the last acknowledged batch.</p>
<p>It turns out that adding concurrency and the ability for workers to work completely independently of one another makes such promises a lot harder to implement.</p>
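<p>The reliability promise boils down to “the same batch keeps coming back until it is acknowledged”. Here is a minimal Python sketch of that loop. It is an illustration only: <em>deliver_until_acknowledged</em> is hypothetical, and the <em>max_attempts</em> cap exists just so the sketch always terminates, it is not something the post describes.</p>

```python
def deliver_until_acknowledged(batch, process, max_attempts=5):
    """Resend the same batch until `process` succeeds; return the attempt count."""
    for attempt in range(1, max_attempts + 1):
        try:
            process(batch)
            return attempt            # success counts as the acknowledgment
        except Exception:
            # Processing failed: the batch was not acknowledged, so send it again.
            continue
    raise RuntimeError("batch was never acknowledged")


deliveries = []


def flaky_process(batch):
    deliveries.append(list(batch))
    if len(deliveries) in (1, 2):
        raise TimeoutError("third party service timed out")


attempts = deliver_until_acknowledged(["customers/1-A"], flaky_process)
```

<p>The worker fails twice and succeeds on the third delivery, and every delivery carries the very same batch.</p>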
<p>There is also another aspect that we have to consider. When we have just a single worker, certain concurrency issues never happen, but when we allow you to run concurrently, we have to deal with them.</p>
<p>Consider the subscription above, running on two workers. We handed a new customer document to Worker A, which started processing it. While Worker A is processing the document, that document has changed. That means that it needs to be processed again by the subscription. We have Worker B available and ready, but if we allow such a scenario, we risk getting a race between the workers, working on the same document.</p>
<p>We could punt that to the user and ask them to ensure that this is something that they handle, but that isn’t the philosophy of RavenDB. Instead, we have implemented the following behavior for concurrent subscriptions:</p>
<p>When the server sends a batch of documents to a worker, that worker “checks them out”. Until that worker signals the server that the batch has been either processed or failed, we’ll not send those documents out to other workers, even if they have been modified. Once a batch is acknowledged as processed, we’ll scan all the documents in that batch and see if we need to schedule them for the <em>next</em> batch, because they were missed while they were checked out.</p>
<p>That means that from the perspective of the user, they can write code knowing that only a single subscription worker will run on a given document at a time. This is a very powerful promise and can significantly simplify the complexity of building your systems. A single worker that is stalling will not prevent the other workers from making progress. There aren’t any timeouts to deal with. If you have a process that may take a long time, as long as the worker is alive and functioning (maintaining the TCP connection to the server), the server will consider the documents that the worker is processing as checked out.</p>
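<p>The check-out rule can be modeled in a few lines. The Python below is a conceptual sketch of the behavior described above, not the server implementation. <em>CheckoutTracker</em> and its methods are hypothetical names, and only the “processed” acknowledgment is modeled.</p>

```python
class CheckoutTracker:
    """Tracks which documents are checked out by which worker (conceptual)."""

    def __init__(self):
        self.pending = []        # document ids waiting to be sent, in order
        self.checked_out = {}    # document id -> worker name
        self.missed = set()      # ids modified while they were checked out

    def document_changed(self, doc_id):
        if doc_id in self.checked_out:
            self.missed.add(doc_id)        # do not hand it to another worker now
        elif doc_id not in self.pending:
            self.pending.append(doc_id)

    def next_batch(self, worker, size):
        batch, self.pending = self.pending[:size], self.pending[size:]
        for doc_id in batch:
            self.checked_out[doc_id] = worker
        return batch

    def acknowledge(self, worker, batch):
        for doc_id in batch:
            if self.checked_out.get(doc_id) != worker:
                raise ValueError(f"{doc_id} is not checked out by {worker}")
            del self.checked_out[doc_id]
            if doc_id in self.missed:
                self.missed.discard(doc_id)
                self.pending.append(doc_id)   # reschedule for a later batch


tracker = CheckoutTracker()
tracker.document_changed("customers/1-A")
tracker.document_changed("customers/2-A")
first = tracker.next_batch("worker-a", size=1)
tracker.document_changed("customers/1-A")        # modified while checked out
second = tracker.next_batch("worker-b", size=5)  # must not include customers/1-A
tracker.acknowledge("worker-a", first)
third = tracker.next_batch("worker-a", size=5)   # now it is rescheduled
```

<p>Worker B never sees <em>customers/1-A</em> while Worker A holds it. Only after Worker A acknowledges its batch does the modified document show up again.</p>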
<p>Concurrent subscriptions require you to opt in, using the <em>Concurrent</em> flag. All workers for a subscription must agree to run in concurrent mode. This is to ensure that there aren’t any workers that expect a purely serial work model. If you aren’t setting this flag, you’ll keep getting the usual serial behavior of subscriptions. We require opting in to this behavior because we violate an important guarantee of the subscription, that you’ll process the documents in the order in which they were modified. This is now no longer the case, obviously.</p>
<p>The first worker to connect to a subscription will determine if it will run in concurrent mode or serial mode. Any new worker trying to run on that subscription needs to be concurrent (if the first one was concurrent) and no concurrent worker can join a subscription that has a serial worker active. This is a <em>transient setting</em>, it is important to note. When the last worker is shut down, the subscription state is reset, and then you can connect a worker for the first time again (which will then be able to set the mode of the subscription).</p>
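<p>Here is a small Python sketch of the mode rules as a state machine. It is an illustration under simplifying assumptions: <em>SubscriptionConnections</em> is a hypothetical name, and the sketch simply rejects a second serial worker, which is one way to model “only a single worker” and is not a description of how the server handles that case.</p>

```python
class SubscriptionConnections:
    """Models how the first worker fixes the mode until the last one leaves."""

    def __init__(self):
        self.mode = None        # None until the first worker connects
        self.workers = set()

    def connect(self, worker, concurrent):
        requested = "concurrent" if concurrent else "serial"
        if self.mode is None:
            self.mode = requested                 # the first worker decides
        elif self.mode != requested:
            raise RuntimeError(f"subscription is running in {self.mode} mode")
        elif self.mode == "serial":
            # Simplification of this sketch: a serial subscription has one worker.
            raise RuntimeError("a serial worker is already active")
        self.workers.add(worker)

    def disconnect(self, worker):
        self.workers.discard(worker)
        if not self.workers:
            self.mode = None    # transient setting: reset with the last worker


subscription = SubscriptionConnections()
subscription.connect("worker-a", concurrent=True)
subscription.connect("worker-b", concurrent=True)
try:
    subscription.connect("worker-c", concurrent=False)
    rejected = False
except RuntimeError:
    rejected = True
subscription.disconnect("worker-a")
subscription.disconnect("worker-b")
subscription.connect("worker-c", concurrent=False)   # fresh start, serial now
```

<p>The serial worker is turned away while concurrent workers are active, but once they have all disconnected it can connect and set the mode itself.</p>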
<p>You can see in the benchmark image on the right the impact of adding concurrent workers when there is a non trivial processing time. It is important to note that the concurrent part of the concurrent subscriptions is the fact that the <em>workers</em> are running in parallel. We are still sending batches of documents for each worker independently and then waiting for confirmation. If you have no significant processing time for a batch, you’ll not see a significant improvement in processing time (the server side cost of processing the documents, sending the batch, etc is related to the total number of documents, and won’t be impacted).</p>
<p>Concurrent subscriptions are available in RavenDB 5.3 (due to be released by mid November) and will be available in the Professional and Enterprise editions of RavenDB.</p>https://www.ayende.com/blog/195265-C/ravendb-5-3-new-features-concurrent-subscriptions?Key=39012fb5-d555-413d-8fd0-50479b6c6322https://www.ayende.com/blog/195265-C/ravendb-5-3-new-features-concurrent-subscriptions?Key=39012fb5-d555-413d-8fd0-50479b6c6322Wed, 10 Nov 2021 12:00:00 GMT