Uberobert 2015-11-13T21:41:06-08:00 http://www.uberobert.com Robert Birnie [email protected] Don't Use Homebrew for Installing NPM - use NVM 2015-11-13T00:00:00-08:00 http://www.uberobert.com/dont-use-homebrew-for-npm <p>As a user on this <a href="https://github.com/Homebrew/homebrew/issues/22408">homebrew issue</a> put it, like a savant: "package managers managing package managers rarely works out well."</p> <p>First clean up your homebrew crap:</p> <div class="highlight"><pre><code class="bash">brew uninstall --force node
rm -rf ~/.npm
rm -rf ~/.node
</code></pre></div> <p>Then install NVM, which manages node and NPM for you, with the <a href="https://github.com/creationix/nvm#install-script">NVM install script</a>:</p> <div class="highlight"><pre><code class="bash">curl -o- https://raw.githubusercontent.com/creationix/nvm/v0.29.0/install.sh | bash
</code></pre></div> <p>The initial version installed with homebrew is OK, but updating node after installing via homebrew is impossible due to how homebrew handles linking. Using NVM makes upgrades and managing versions simpler.</p> Cassandra Freezes on CentOS 6.6 and Haswell Processors 2015-10-21T00:00:00-07:00 http://www.uberobert.com/rhel-6-6-and-haswell-processors-cassandra <p>This is an FYI and a warning: be very careful with Haswell processors on RHEL/CentOS 6.6. There is a <a href="https://groups.google.com/forum/#!topic/mechanical-sympathy/QbmpZxp6C64">futex wait()</a> bug that can cause waiting processes to never resume. A good description is on <a href="http://www.infoq.com/news/2015/05/redhat-futex">InfoQ</a>.</p> <blockquote><p>“The impact of this kernel bug is very simple: user processes can deadlock and hang in seemingly impossible situations. A futex wait call (and anything using a futex wait) can stay blocked forever, even though it had been properly woken up by someone. Thread.park() in Java may stay parked. Etc. If you are lucky you may also find soft lockup messages in your dmesg logs. 
If you are not that lucky (like us, for example), you'll spend a couple of months of someone's time trying to find the fault in your code, when there is nothing there to find.”</p></blockquote> <p>I recently saw this with Dell R630's and cassandra. A thread dump shows the threads in a BLOCKED state and the stack trace shows them as parked.</p> <div class="highlight"><pre><code class="yaml"><span class="l-Scalar-Plain">Thread 104823</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">(state = BLOCKED)</span> <span class="l-Scalar-Plain">- sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)</span> <span class="l-Scalar-Plain">- java.util.concurrent.locks.LockSupport.parkNanos(java.lang.Object, long) @bci=20, line=226 (Compiled frame)</span> <span class="l-Scalar-Plain">- java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(long) @bci=68, line=2082 (Compiled frame)</span> <span class="l-Scalar-Plain">- java.util.concurrent.LinkedBlockingQueue.poll(long, java.util.concurrent.TimeUnit) @bci=62, line=467 (Compiled frame)</span> <span class="l-Scalar-Plain">- java.util.concurrent.ThreadPoolExecutor.getTask() @bci=141, line=1068 (Compiled frame)</span> <span class="l-Scalar-Plain">- java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=26, line=1130 (Compiled frame)</span> <span class="l-Scalar-Plain">- java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=615 (Interpreted frame)</span> <span class="l-Scalar-Plain">- java.lang.Thread.run() @bci=11, line=745 (Interpreted frame)</span> </code></pre></div> <p>Cassandra logs go completely blank and CPU utilization will stay constant at some level (sometimes high / sometimes none). 
Interestingly you can revive the process with <code>kill -STOP &lt;jvm_pid&gt;</code> followed by <code>kill -CONT &lt;jvm_pid&gt;</code> (which is much faster than a service restart).</p> <p>To fix this, update CentOS 6.6 to the newest kernel in the updates repository, version <code>2.6.32-504.30.3.el6.x86_64</code>.</p> <p>Big thanks to Adam Hattrell, Simon Ashley and <a href="https://twitter.com/flightc">Erick Ramirez</a> from Datastax for helping to figure this out.</p> Impact of Latency on Performance Testing 2015-08-06T00:00:00-07:00 http://www.uberobert.com/latency-and-performance-testing <p>Something not often mentioned or tested is the impact of latency in the wild on the operation and scalability of a website. The vast majority of load tests are run from a local load source, such as JMeter in the same availability zone. In this case the latency is incredibly low, probably sub-millisecond. In the real world your application will never see this kind of latency again; it will be anywhere from 50 to 500ms depending on the global mix of traffic you receive. This can kill the performance of your application in surprising ways.</p> <p>The time Apache spends waiting for a response on low-latency requests is going to be small, which allows your servers to handle a much larger volume of traffic spread over a much smaller number of threads. This is further amplified if your application is handling a lot of small quick requests, say a web API. In the lab, a server might be able to handle thousands of requests per second with only 30-100 threads busy at any given time. Using such a small number of threads is stellar for performance, because the box needs much less application concurrency. A change in latency from 1ms to 200ms makes the network overhead of each transaction 200 times longer by definition; if your application has a 1:1 ratio of thread to transaction, the concurrency needed to carry that overhead grows by the same factor. 
This could most obviously lead to the box running out of threads or memory in production before it reaches the performance levels seen during testing.</p> <p>Latency issues could also highlight any bottlenecks in your code where the application blocks while waiting on other threads. You could see this in your performance graphs by comparing context switching and system CPU usage between QA and production, as waiting on other threads often shows up at the kernel level.</p> <h2>What to do</h2> <p>So finally, what can we do about this? Load test from over the internet! You should mimic production latency in your performance testing environment. This ensures that you not only test the raw performance of your application but also stress it at production-like concurrency levels. To do this you should generate the load for your tests remotely in some form of cloud, like AWS. This raises a big question though: where do you generate the load from? If your average visitors are fairly close by geographically, you don't need to test from that far away. But if you have a truly global customer base you may want to generate load from the other side of the Atlantic. To decide where, you really need a good average of your production latency, which is fairly hard to measure (I'm not about to ping every IP in my apache access log, haha). Luckily we can get this number in a roundabout way through testing!</p> <p>If using Apache HTTPD, the first step is to enable <a href="http://httpd.apache.org/docs/2.2/mod/mod_status.html">Apache server-status</a>. If you want to see what this looks like, <a href="http://httpd.apache.org">httpd.apache.org</a> has <a href="http://httpd.apache.org/server-status">server-status publicly enabled</a>, kudos to them. Next test your app in QA: fire up enough threads to mirror the requests per second your production site sees, then measure the number of active threads ("requests currently being processed" in <code>server-status</code>) in both QA and production. 
Using these numbers you can compute the average latency you see on your production site like so:</p> <div class="highlight"><pre><code class="ruby"><span class="n">production_latency</span> <span class="o">=</span> <span class="n">local_latency</span> <span class="o">*</span> <span class="n">production_threads</span> <span class="o">/</span> <span class="n">local_threads</span> </code></pre></div> <p>To increase the accuracy of your measurement, increase the latency in your test environment, possibly by generating the load from a nearby AWS location. You will still know the latency, but it won't be so close to 0; the difference between .08ms and .07ms is pretty significant in the final number while being hard to measure accurately...</p> <p>Once you are armed with a production latency number, peruse different cloud providers and find one whose latency to your test site is near, or somewhat larger than, what you see in production. Then when you run tests you are also testing at application concurrency numbers similar to what is experienced in production!</p> <p>Any comments, questions, concerns, or areas where I'm wrong that you'd like to troll?</p> Generate a Jekyll Sitemap 2015-07-15T00:00:00-07:00 http://www.uberobert.com/generate-a-jekyll-sitemap <p>I just set up my blog on Google's webmaster tools and saw that they wanted a sitemap. This led to the question of "how do I make one of those?" Luckily I found <a href="http://davidensinger.com/2013/11/building-a-better-sitemap-xml-with-jekyll/">David Ensinger's blog post on building a sitemap!</a>. 
This is a direct paste of his code which I found to work exactly as required, be sure to check out his blog.</p> <p>All that is required is putting the sitemap code in the root directory of your blog, and then on any pages you want to customize adding (if you don't add this the page will get the defaults:</p> <div class="highlight"><pre><code class="yaml"><span class="l-Scalar-Plain">sitemap</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">lastmod</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">2014-01-23</span> <span class="l-Scalar-Plain">priority</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">0.7</span> <span class="l-Scalar-Plain">changefreq</span><span class="p-Indicator">:</span> <span class="s">&#39;monthly&#39;</span> <span class="l-Scalar-Plain">exclude</span><span class="p-Indicator">:</span> <span class="s">&#39;yes&#39;</span> </code></pre></div> <div class="highlight"><pre><code class="xml">--- layout: null sitemap: exclude: &#39;yes&#39; --- <span class="cp">&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;</span> <span class="nt">&lt;urlset</span> <span class="na">xmlns=</span><span class="s">&quot;http://www.sitemaps.org/schemas/sitemap/0.9&quot;</span><span class="nt">&gt;</span> {% for post in site.posts %} {% unless post.published == false %} <span class="nt">&lt;url&gt;</span> <span class="nt">&lt;loc&gt;</span>{{ site.url }}{{ post.url }}<span class="nt">&lt;/loc&gt;</span> {% if post.sitemap.lastmod %} <span class="nt">&lt;lastmod&gt;</span>{{ post.sitemap.lastmod | date: &quot;%Y-%m-%d&quot; }}<span class="nt">&lt;/lastmod&gt;</span> {% elsif post.date %} <span class="nt">&lt;lastmod&gt;</span>{{ post.date | date_to_xmlschema }}<span class="nt">&lt;/lastmod&gt;</span> {% else %} <span class="nt">&lt;lastmod&gt;</span>{{ site.time | date_to_xmlschema }}<span class="nt">&lt;/lastmod&gt;</span> {% endif %} {% if post.sitemap.changefreq %} <span 
class="nt">&lt;changefreq&gt;</span>{{ post.sitemap.changefreq }}<span class="nt">&lt;/changefreq&gt;</span> {% else %} <span class="nt">&lt;changefreq&gt;</span>monthly<span class="nt">&lt;/changefreq&gt;</span> {% endif %} {% if post.sitemap.priority %} <span class="nt">&lt;priority&gt;</span>{{ post.sitemap.priority }}<span class="nt">&lt;/priority&gt;</span> {% else %} <span class="nt">&lt;priority&gt;</span>0.5<span class="nt">&lt;/priority&gt;</span> {% endif %} <span class="nt">&lt;/url&gt;</span> {% endunless %} {% endfor %} {% for page in site.pages %} {% unless page.sitemap.exclude == &quot;yes&quot; %} <span class="nt">&lt;url&gt;</span> <span class="nt">&lt;loc&gt;</span>{{ site.url }}{{ page.url | remove: &quot;index.html&quot; }}<span class="nt">&lt;/loc&gt;</span> {% if page.sitemap.lastmod %} <span class="nt">&lt;lastmod&gt;</span>{{ page.sitemap.lastmod | date: &quot;%Y-%m-%d&quot; }}<span class="nt">&lt;/lastmod&gt;</span> {% elsif page.date %} <span class="nt">&lt;lastmod&gt;</span>{{ page.date | date_to_xmlschema }}<span class="nt">&lt;/lastmod&gt;</span> {% else %} <span class="nt">&lt;lastmod&gt;</span>{{ site.time | date_to_xmlschema }}<span class="nt">&lt;/lastmod&gt;</span> {% endif %} {% if page.sitemap.changefreq %} <span class="nt">&lt;changefreq&gt;</span>{{ page.sitemap.changefreq }}<span class="nt">&lt;/changefreq&gt;</span> {% else %} <span class="nt">&lt;changefreq&gt;</span>monthly<span class="nt">&lt;/changefreq&gt;</span> {% endif %} {% if page.sitemap.priority %} <span class="nt">&lt;priority&gt;</span>{{ page.sitemap.priority }}<span class="nt">&lt;/priority&gt;</span> {% else %} <span class="nt">&lt;priority&gt;</span>0.3<span class="nt">&lt;/priority&gt;</span> {% endif %} <span class="nt">&lt;/url&gt;</span> {% endunless %} {% endfor %} <span class="nt">&lt;/urlset&gt;</span> </code></pre></div> <p>Cheers David!</p> Create Puppet Facts with Package Versions 2015-06-19T00:00:00-07:00 
http://www.uberobert.com/puppet-facts-with-package-versions <p>With <a href="https://puppetlabs.com/">puppet</a>, sometimes it is necessary to make case statements around what version of a package is installed rather than having puppet dictate what version is installed. Say "add x line to sshd config for x version." To do this we need to have fact with what package version is already installed on the system, here's a quick script to do that in bulk. To deploy, add this as a <a href="https://docs.puppetlabs.com/guides/plugins_in_modules.html">plugin in a module</a>.</p> <div class="highlight"><pre><code class="ruby"><span class="c1">#!/usr/bin/ruby</span> <span class="nb">require</span> <span class="s1">&#39;facter&#39;</span> <span class="n">packages</span> <span class="o">=</span> <span class="s1">&#39;(^bash |^glibc.x86_64 |^httpd |^jdk |^jre |^mod_ssl |^openssl |^php |^sshd )&#39;</span> <span class="n">version</span> <span class="o">=</span> <span class="ss">Facter</span><span class="p">:</span><span class="ss">:Util</span><span class="o">::</span><span class="no">Resolution</span><span class="o">.</span><span class="n">exec</span><span class="p">(</span><span class="s2">&quot;rpm -qa --queryformat &#39;[%{NAME} %{VERSION}-%{RELEASE}</span><span class="se">\n</span><span class="s2">]&#39; | egrep &#39;</span><span class="si">#{</span><span class="n">packages</span><span class="si">}</span><span class="s2">&#39;&quot;</span><span class="p">)</span> <span class="n">version</span><span class="o">.</span><span class="n">each_line</span> <span class="k">do</span> <span class="o">|</span><span class="n">package</span><span class="o">|</span> <span class="no">Facter</span><span class="o">.</span><span class="n">add</span><span class="p">(</span><span class="s2">&quot;package_</span><span class="si">#{</span><span class="n">package</span><span class="o">.</span><span class="n">split</span><span class="o">[</span><span class="mi">0</span><span class="o">]</span><span 
class="si">}</span><span class="s2">&quot;</span><span class="o">.</span><span class="n">gsub</span><span class="p">(</span><span class="s1">&#39;-&#39;</span><span class="p">,</span><span class="s1">&#39;_&#39;</span><span class="p">))</span> <span class="k">do</span> <span class="n">setcode</span> <span class="k">do</span> <span class="s2">&quot;</span><span class="si">#{</span><span class="n">package</span><span class="o">.</span><span class="n">split</span><span class="o">[</span><span class="mi">1</span><span class="o">]</span><span class="si">}</span><span class="s2">&quot;</span> <span class="k">end</span> <span class="k">end</span> <span class="k">end</span> </code></pre></div> <p>In this case we've got a regex with all the packages we care about, I know you could make this an array and loop over it but the performance is better to call the rpm command just once and then loop over the output. Technically you <em>could</em> do all packages but that seems pointless to me and would also be a ton of packages. 
Then the other trick was changing '-' to '_' because dashes aren't allowed in fact names but are quite common in package names.</p> <p>Then in our manifests we can make case statements or if blocks on these new fancy facts.</p> <div class="highlight"><pre><code class="ruby"><span class="k">if</span> <span class="vg">$package_foo</span> <span class="o">=~</span> <span class="sr">/3\.2\.7/</span> <span class="p">{</span> <span class="n">file</span> <span class="p">{</span> <span class="s1">&#39;/usr/local/foo/conf/config.yaml&#39;</span> <span class="p">:</span> <span class="k">ensure</span> <span class="o">=&gt;</span> <span class="s1">&#39;file&#39;</span><span class="p">,</span> <span class="n">source</span> <span class="o">=&gt;</span> <span class="s1">&#39;puppet:///modules/foo/3.2.7/usr/local/foo/conf/config.yaml&#39;</span><span class="p">,</span> <span class="p">}</span> <span class="p">}</span> <span class="k">else</span> <span class="p">{</span> <span class="n">notify</span> <span class="p">{</span> <span class="s1">&#39;foo_unknown&#39;</span> <span class="p">:</span> <span class="n">message</span> <span class="o">=&gt;</span> <span class="s2">&quot;I don&#39;t know what version of &#39;foo&#39; to configure, this run is foobar!&quot;</span><span class="p">,</span> <span class="p">}</span> <span class="p">}</span> </code></pre></div> <p>Let me know if you find this helpful, horrible, or both.</p> Bandwidth Required for Cassandra Hinted Handoff 2015-04-28T00:00:00-07:00 http://www.uberobert.com/bandwidth-cassandra-hinted-handoff <p>For capacity planning and stability reasons it is important to be able to estimate how Cassandra will act in a 
multi-dc environment during adverse networking conditions. This includes capacity planning for WAN bandwidth and ensuring that after a network partition you stream the hints off fast enough that they don't expire (TTL off), but not so fast that you saturate your WAN and crash the cluster. In this post we'll go into how to compute the streaming speeds. Then in the next post we will look at fine-tuning the number of threads needed to stream data safely after a network partition.</p> <h2>Regular WAN Bandwidth</h2> <p>First we need some regular baseline numbers on WAN bandwidth requirements. Just from watching the normal running of a single-DC cluster you can figure out the WAN bandwidth necessary for day-to-day operations. Rough, back-of-the-envelope estimates for WAN bandwidth can be had by disabling thrift/native_transport on a node and measuring its internal network communication. At this point the node will just be handling internal Cassandra reads/writes; divide this number by your replication factor (because writes are sent just once over the WAN) for a high-water mark of WAN bandwidth during a normal day. In practice this estimate will be higher than actual WAN traffic because the WAN is only seeing writes and not reads.</p> <h2>WAN Bandwidth and Hints</h2> <p>Next comes the much harder part: what bandwidth will be required after a temporary network partition? After the network is restored you'll have both regular traffic and hint traffic.</p> <p>The primary settings for hints are in cassandra.yaml and look like this:</p> <div class="highlight"><pre><code class="yaml"><span class="c1"># See http://wiki.apache.org/cassandra/HintedHandoff</span> <span class="l-Scalar-Plain">hinted_handoff_enabled</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">true</span> <span class="c1"># this defines the maximum amount of time a dead host will have hints</span> <span class="c1"># generated. 
After it has been dead this long, new hints for it will not be</span> <span class="c1"># created until it has been seen alive and gone down again.</span> <span class="l-Scalar-Plain">max_hint_window_in_ms</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">10800000</span> <span class="c1"># 3 hours</span> <span class="c1"># Maximum throttle in KBs per second, per delivery thread. This will be</span> <span class="c1"># reduced proportionally to the number of nodes in the cluster. (If there</span> <span class="c1"># are two nodes in the cluster, each delivery thread will use the maximum</span> <span class="c1"># rate; if there are three, each will throttle to half of the maximum,</span> <span class="c1"># since we expect two nodes to be delivering hints simultaneously.)</span> <span class="l-Scalar-Plain">hinted_handoff_throttle_in_kb</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">1024</span> <span class="c1"># Number of threads with which to deliver hints;</span> <span class="c1"># Consider increasing this number when you have multi-dc deployments, since</span> <span class="c1"># cross-dc handoff tends to be slower</span> <span class="l-Scalar-Plain">max_hints_delivery_threads</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">2</span> </code></pre></div> <p>The notes for <code>max_hints_delivery_threads</code> of "Consider increasing this number when you have multi-dc deployments" is less than helpful and there is currently no good documentation on the web to tune this value. 
So lets dig into the source and see how this all works so we can plan for it with multiple datacenters.</p> <h2>Into the Source</h2> <p>First off when cassandra starts it creates a ThreadPool of matching <code>max_hints_delivery_threads</code> (the getMaxHintsThread method)</p> <div class="highlight"><pre><code class="java"><span class="n">java</span><span class="o">/</span><span class="n">org</span><span class="o">/</span><span class="n">apache</span><span class="o">/</span><span class="n">cassandra</span><span class="o">/</span><span class="n">db</span><span class="o">/</span><span class="n">HintedHandOffManager</span><span class="o">.</span><span class="na">java</span><span class="o">:</span> <span class="mi">104</span> <span class="kd">private</span> <span class="kd">final</span> <span class="n">JMXEnabledScheduledThreadPoolExecutor</span> <span class="n">executor</span> <span class="o">=</span> <span class="mi">105</span> <span class="k">new</span> <span class="n">JMXEnabledScheduledThreadPoolExecutor</span><span class="o">(</span> <span class="mi">106</span> <span class="n">DatabaseDescriptor</span><span class="o">.</span><span class="na">getMaxHintsThread</span><span class="o">(),</span> <span class="mi">107</span> <span class="k">new</span> <span class="n">NamedThreadFactory</span><span class="o">(</span><span class="s">&quot;HintedHandoff&quot;</span><span class="o">,</span> <span class="n">Thread</span><span class="o">.</span><span class="na">MIN_PRIORITY</span><span class="o">),</span> <span class="mi">108</span> <span class="s">&quot;internal&quot;</span><span class="o">);</span> </code></pre></div> <p>Next when a node is seen via gossip, it will schedule a hint transfer</p> <div class="highlight"><pre><code class="java"><span class="n">cassandra</span><span class="o">/</span><span class="n">src</span><span class="o">/</span><span class="n">java</span><span class="o">/</span><span class="n">org</span><span class="o">/</span><span 
class="n">apache</span><span class="o">/</span><span class="n">cassandra</span><span class="o">/</span><span class="n">service</span><span class="o">/</span><span class="n">StorageService</span><span class="o">.</span><span class="na">java</span><span class="o">:</span> <span class="mi">1963</span> <span class="kd">public</span> <span class="kt">void</span> <span class="n">onAlive</span><span class="o">(</span><span class="n">InetAddress</span> <span class="n">endpoint</span><span class="o">,</span> <span class="n">EndpointState</span> <span class="n">state</span><span class="o">)</span> <span class="mi">1964</span> <span class="o">{</span> <span class="o">...</span> <span class="mi">1969</span> <span class="n">HintedHandOffManager</span><span class="o">.</span><span class="na">instance</span><span class="o">.</span><span class="na">scheduleHintDelivery</span><span class="o">(</span><span class="n">endpoint</span><span class="o">,</span> <span class="kc">true</span><span class="o">);</span> <span class="mi">1970</span> <span class="k">for</span> <span class="o">(</span><span class="n">IEndpointLifecycleSubscriber</span> <span class="n">subscriber</span> <span class="o">:</span> <span class="n">lifecycleSubscribers</span><span class="o">)</span> <span class="mi">1971</span> <span class="n">subscriber</span><span class="o">.</span><span class="na">onUp</span><span class="o">(</span><span class="n">endpoint</span><span class="o">);</span> </code></pre></div> <p>Which creates an executor instance to handle the handoff.</p> <div class="highlight"><pre><code class="java"><span class="n">cassandra</span><span class="o">/</span><span class="n">src</span><span class="o">/</span><span class="n">java</span><span class="o">/</span><span class="n">org</span><span class="o">/</span><span class="n">apache</span><span class="o">/</span><span class="n">cassandra</span><span class="o">/</span><span class="n">db</span><span class="o">/</span><span 
class="n">HintedHandOffManager</span><span class="o">.</span><span class="na">java</span><span class="o">:</span> <span class="mi">528</span> <span class="kd">public</span> <span class="kt">void</span> <span class="n">scheduleHintDelivery</span><span class="o">(</span><span class="kd">final</span> <span class="n">InetAddress</span> <span class="n">to</span><span class="o">,</span> <span class="kd">final</span> <span class="kt">boolean</span> <span class="n">precompact</span><span class="o">)</span> <span class="mi">529</span> <span class="o">{</span> <span class="mi">530</span> <span class="c1">// We should not deliver hints to the same host in 2 different threads</span> <span class="mi">531</span> <span class="k">if</span> <span class="o">(!</span><span class="n">queuedDeliveries</span><span class="o">.</span><span class="na">add</span><span class="o">(</span><span class="n">to</span><span class="o">))</span> <span class="mi">532</span> <span class="k">return</span><span class="o">;</span> <span class="mi">533</span> <span class="mi">534</span> <span class="n">logger</span><span class="o">.</span><span class="na">debug</span><span class="o">(</span><span class="s">&quot;Scheduling delivery of Hints to {}&quot;</span><span class="o">,</span> <span class="n">to</span><span class="o">);</span> <span class="mi">535</span> <span class="mi">536</span> <span class="n">executor</span><span class="o">.</span><span class="na">execute</span><span class="o">(</span><span class="k">new</span> <span class="n">Runnable</span><span class="o">()</span> </code></pre></div> <p>And finally, here's the code which divides our <code>hinted_handoff_throttle_in_kb</code> by the cluster size</p> <div class="highlight"><pre><code class="java"><span class="n">cassandra</span><span class="o">/</span><span class="n">src</span><span class="o">/</span><span class="n">java</span><span class="o">/</span><span class="n">org</span><span class="o">/</span><span class="n">apache</span><span 
class="o">/</span><span class="n">cassandra</span><span class="o">/</span><span class="n">db</span><span class="o">/</span><span class="n">HintedHandOffManager</span><span class="o">.</span><span class="na">java</span><span class="o">:</span> <span class="mi">359</span> <span class="c1">// rate limit is in bytes per second. Uses Double.MAX_VALUE if disabled (set to 0 in cassandra.yaml).</span> <span class="mi">360</span> <span class="c1">// max rate is scaled by the number of nodes in the cluster (CASSANDRA-5272).</span> <span class="mi">361</span> <span class="kt">int</span> <span class="n">throttleInKB</span> <span class="o">=</span> <span class="n">DatabaseDescriptor</span><span class="o">.</span><span class="na">getHintedHandoffThrottleInKB</span><span class="o">()</span> <span class="mi">362</span> <span class="o">/</span> <span class="o">(</span><span class="n">StorageService</span><span class="o">.</span><span class="na">instance</span><span class="o">.</span><span class="na">getTokenMetadata</span><span class="o">().</span><span class="na">getAllEndpoints</span><span class="o">().</span><span class="na">size</span><span class="o">()</span> <span class="o">-</span> <span class="mi">1</span><span class="o">);</span> </code></pre></div> <h2>The Math for Hints</h2> <p>So let's put all this together. Say you have a cluster with nodes split among 2 DCs, DC1 and DC2. DC2 goes down for a time, then returns to service.</p> <p>Your maximum outbound hint streaming speed per node is computed by</p> <p><code>max_streaming_speed_per_node = ( hinted_handoff_throttle_in_kb / ( node_count - 1 ) ) * max_hints_delivery_threads</code></p> <p>But because Cassandra only allows one outbound hint thread per remote node, the maximum inbound hint streaming per node will still be <code>hinted_handoff_throttle_in_kb</code>. 
This is important because you can then safely increase <code>max_hints_delivery_threads</code> without worrying about overwhelming a single node.</p> <p>Then in the case of a network partition, we'd expect streaming to be queued for the entire DC that dropped off the network. So expected WAN bandwidth usage would be</p> <p><code>max_wan_hint_speed = max_streaming_speed_per_node * DC1_node_count</code></p> <p>The next post looks at taking these numbers and figuring out how long it'll take hints to replay based on different <code>max_hints_delivery_threads</code> settings.</p> Cassandra gc_grace_seconds of 0 Disables Hinted Handoff 2015-01-29T00:00:00-08:00 http://www.uberobert.com/cassandra_gc_grace_disables_hinted_handoff <p>This is just a quick FYI post as I don't see this documented on the web elsewhere. As of this writing, in all versions of Cassandra a <code>gc_grace_seconds</code> setting of 0 will disable hinted handoff. Basically, to avoid an edge case that could cause data to reappear in a cluster (detailed in Jira <a href="https://issues.apache.org/jira/browse/CASSANDRA-5314">CASSANDRA-5314</a>), hints are stored with a TTL equal to the smallest <code>gc_grace_seconds</code> of the column families in the mutation. 
A gc_grace_seconds setting of 0 will cause hints to TTL instantly and they will never be streamed off when a node comes back up.</p> <p>Here's the code line:</p> <div class="highlight"><pre><code class="java"><span class="n">cassandra</span><span class="o">/</span><span class="n">src</span><span class="o">/</span><span class="n">java</span><span class="o">/</span><span class="n">org</span><span class="o">/</span><span class="n">apache</span><span class="o">/</span><span class="n">cassandra</span><span class="o">/</span><span class="n">db</span><span class="o">/</span><span class="n">RowMutation</span><span class="o">.</span><span class="na">java</span><span class="o">:</span> <span class="mi">124</span> <span class="cm">/*</span> <span class="cm"> 125 * determine the TTL for the hint RowMutation</span> <span class="cm"> 126 * this is set at the smallest GCGraceSeconds for any of the CFs in the RM</span> <span class="cm"> 127 * this ensures that deletes aren&#39;t &quot;undone&quot; by delivery of an old hint</span> <span class="cm"> 128 */</span> <span class="mi">129</span> <span class="kd">public</span> <span class="kt">int</span> <span class="n">calculateHintTTL</span><span class="o">()</span> <span class="mi">130</span> <span class="o">{</span> <span class="mi">131</span> <span class="kt">int</span> <span class="n">ttl</span> <span class="o">=</span> <span class="n">Integer</span><span class="o">.</span><span class="na">MAX_VALUE</span><span class="o">;</span> <span class="mi">132</span> <span class="k">for</span> <span class="o">(</span><span class="n">ColumnFamily</span> <span class="n">cf</span> <span class="o">:</span> <span class="n">getColumnFamilies</span><span class="o">())</span> <span class="mi">133</span> <span class="n">ttl</span> <span class="o">=</span> <span class="n">Math</span><span class="o">.</span><span class="na">min</span><span class="o">(</span><span class="n">ttl</span><span class="o">,</span> <span class="n">cf</span><span 
class="o">.</span><span class="na">metadata</span><span class="o">().</span><span class="na">getGcGraceSeconds</span><span class="o">());</span> <span class="mi">134</span> <span class="k">return</span> <span class="n">ttl</span><span class="o">;</span> <span class="mi">135</span> <span class="o">}</span> </code></pre></div> <p>And the resulting log lines, with every handoff delivering 0 rows:</p> <div class="highlight"><pre><code class="yaml"><span class="l-Scalar-Plain">INFO 03:00:48,578 Finished hinted handoff of 0 rows to endpoint /10.0.0.58</span>
<span class="l-Scalar-Plain">INFO 03:00:48,584 Finished hinted handoff of 0 rows to endpoint /10.0.0.59</span>
<span class="l-Scalar-Plain">INFO 03:00:48,589 Finished hinted handoff of 0 rows to endpoint /10.0.0.37</span>
<span class="l-Scalar-Plain">INFO 03:00:48,594 Finished hinted handoff of 0 rows to endpoint /10.0.0.36</span>
<span class="l-Scalar-Plain">INFO 03:00:48,599 Finished hinted handoff of 0 rows to endpoint /10.0.0.39</span>
<span class="l-Scalar-Plain">INFO 03:00:48,604 Finished hinted handoff of 0 rows to endpoint /10.0.0.38</span>
<span class="l-Scalar-Plain">INFO 03:00:48,608 Finished hinted handoff of 0 rows to endpoint /10.0.0.33</span>
<span class="l-Scalar-Plain">INFO 03:00:48,613 Finished hinted handoff of 0 rows to endpoint /10.0.0.32</span>
<span class="l-Scalar-Plain">INFO 03:00:48,617 Finished hinted handoff of 0 rows to endpoint /10.0.0.35</span>
<span class="l-Scalar-Plain">INFO 03:00:48,622 Finished hinted handoff of 0 rows to endpoint /10.0.0.34</span>
<span class="l-Scalar-Plain">INFO 03:00:48,627 Finished hinted handoff of 0 rows to endpoint /10.0.0.45</span>
</code></pre></div> <p>In a single DC, having no hints isn't a huge issue: if you are using <code>QUORUM</code> for reads you'd end up fixing the missed writes, even if consistency is somewhat compromised. In multi-DC with <code>LOCAL_QUORUM</code> this is a killer, though: that data will never come across the WAN without a full repair. 
Yikes!</p> How to Remove Files from an RPM with FPM 2014-12-30T00:00:00-08:00 http://www.uberobert.com/remove-files-from-rpm-with-rpm2cpio-fpm <p>Sometimes it's handy to be able to unpack and rebuild an RPM to remove specific files. In this example I'll rebuild the RPM for <a href="http://www.datastax.com/what-we-offer/products-services/datastax-opscenter">Datastax Opscenter's</a> agent without the init files. The tools used to do this are rpm2cpio and <a href="https://github.com/jordansissel/fpm">FPM</a>. I manage services with <a href="http://cr.yp.to/daemontools.html">daemontools</a>, so having stuff start under init is a real issue. I could just install from a tarball, but I prefer to keep everything under rpm for simplicity. Plus there is a lot of useful stuff that the rpm does on install that I don't want to reproduce manually.</p> <p>Just as a fair warning, this isn't a perfect process and is quite hacky. It won't let you rebuild from source or change the source of compiled code; go find the source rpm for that. This just works for removing files, and even that's not quite guaranteed. Test your work before you push to prod! haha.</p> <p>So first we need to unpack our rpm. For this we can use rpm2cpio.</p> <div class="highlight"><pre><code class="bash">mkdir datastax-agent-5.0.1
<span class="nb">cd</span> !<span class="err">$</span>
rpm2cpio ../datastax-agent-5.0.1-1.noarch.rpm | cpio -idmv
</code></pre></div> <p>This should extract our rpm's files into our current directory. After this, go ahead and make any adjustments you want. I'm removing <code>./etc/init.d</code> (note that when removing files you shouldn't leave empty directories behind; you don't want your rpm to own <code>/etc/init.d/</code>). Next you'll need to pull out all the install scripts used in the original rpm: <code>rpm -qp --scripts datastax-agent-5.0.1-1.noarch.rpm</code>. 
Copy your scripts out to your text editor and go through them. Break them into their separate sections for <code>%pre</code>, <code>%post</code>, etc. (separated by lines that look like <code>preinstall scriptlet (using /bin/sh):</code>) and ensure they aren't using the files you removed. In my case I removed references to the services and chkconfig.</p> <p>Next rebuild your rpm with <a href="https://github.com/jordansissel/fpm">FPM</a>.</p> <div class="highlight"><pre><code class="bash">gem install fpm
fpm -s dir -t rpm --url http://www.datastax.com --description <span class="s2">&quot;Datastax-agent sans init file&quot;</span> -m <span class="s2">&quot;[email protected]&quot;</span> --rpm-user opscenter-agent --rpm-group opscenter-agent -n mommas-datastax-agent -e -v 5.0.1 --iteration 1 --prefix<span class="o">=</span>/ .
</code></pre></div> <p>When I rebuild them I usually try to keep the versioning the same but rename the rpm to something unique so others know I mucked with it. If you are going to keep the same name, I'd recommend adding an epoch (if it doesn't already have one). This makes your custom rpm appear newer than the default ones while still letting you keep their versioning. The <code>-e</code> flag makes FPM open the generated spec file in your editor before building. Once it does, go ahead and copy in your updated scripts from the last step. Save and close the file and FPM should build you a shiny new RPM.</p> <p>And that's it!</p> Getting Rails Server --debug with SCL Ruby 1.9.3 2014-09-23T00:00:00-07:00 http://www.uberobert.com/rails-server-debug-with-scl-ruby193 <p>The documentation on getting <code>rails server --debug</code> working with the <a href="https://fedorahosted.org/SoftwareCollections/">Software Collections (SCL)</a> version of Ruby is a little weak. So here's how to do it. 
If you installed <a href="http://mirror.centos.org/centos/6/SCL/x86_64/">SCL ruby193</a> you'll probably get this error when you try to start the debugger:</p> <div class="highlight"><pre><code class="bash">/usr/share/foreman<span class="nv">$ </span>rails server --debug
<span class="o">=</span>&gt; Booting <span class="nv">WEBrick</span>
<span class="o">=</span>&gt; Rails 3.2.8 application starting in development on http://0.0.0.0:3000
<span class="o">=</span>&gt; Call with -d to <span class="nv">detach</span>
<span class="o">=</span>&gt; Ctrl-C to shutdown server
You need to install ruby-debug to run the server in debugging mode. With gems, use <span class="s1">&#39;gem install ruby-debug&#39;</span>
Exiting
</code></pre></div> <p>The great part, though, is that ruby-debug doesn't exist for Ruby 1.9.3. The gem you want is 'debugger', and the 'debugger' gem isn't packaged up with SCL ruby193. To get it, first set up the <a href="http://mirror.centos.org/centos/6/SCL/x86_64/v8314/">v8314 SCL repo</a>, which is required for ruby193-ruby-devel. 
Then install these:</p> <div class="highlight"><pre><code class="bash">yum install ruby193-ruby-devel gcc
scl <span class="nb">enable </span>ruby193 bash
gem install debugger
</code></pre></div> <p>After installing those, the debugger should start working:</p> <div class="highlight"><pre><code class="bash">/usr/share/foreman<span class="nv">$ RAILS_ENV</span><span class="o">=</span>development ruby193-rails server --debug
<span class="o">=</span>&gt; Booting <span class="nv">WEBrick</span>
<span class="o">=</span>&gt; Rails 3.2.8 application starting in development on http://0.0.0.0:3000
<span class="o">=</span>&gt; Call with -d to <span class="nv">detach</span>
<span class="o">=</span>&gt; Ctrl-C to shutdown <span class="nv">server</span>
<span class="o">=</span>&gt; Debugger enabled
<span class="o">[</span>2014-09-23 17:33:55<span class="o">]</span> INFO WEBrick 1.3.1
<span class="o">[</span>2014-09-23 17:33:55<span class="o">]</span> INFO ruby 1.9.3 <span class="o">(</span>2013-11-22<span class="o">)</span> <span class="o">[</span>x86_64-linux<span class="o">]</span>
<span class="o">[</span>2014-09-23 17:33:55<span class="o">]</span> INFO WEBrick::HTTPServer#start: <span class="nv">pid</span><span class="o">=</span>26604 <span class="nv">port</span><span class="o">=</span>3000
</code></pre></div> <p>Great!</p> Foreman NoMethodError undefined method 'size' for nil:NilClass 2014-09-22T00:00:00-07:00 http://www.uberobert.com/foreman_nomethod_error_undefined_method_size_for_nil_nilclass <p>I spent quite a while in the last few days trying to figure this error out, and since there were no blogs or info on it online I felt compelled to write one. The basic setup is RHEL 6.5 with either Foreman 1.5 or Foreman 1.6. 
I'm using the Foreman <a href="http://yum.theforeman.org/releases/latest/el6/x86_64/">RPMs</a> and the <a href="http://mirror.centos.org/centos/6/SCL/x86_64/">CentOS SCL repo</a> for Ruby193.</p> <p>Here's the error:</p> <div class="highlight"><pre><code class="bash">Started GET <span class="s2">&quot;/&quot;</span> <span class="k">for </span>10.100.128.63 at 2014-09-19 21:03:08 +0000
NoMethodError <span class="o">(</span>undefined method <span class="sb">`</span>size<span class="err">&#39;</span> <span class="k">for </span>nil:NilClass<span class="o">)</span>:
</code></pre></div> <p>To fix this, try clearing your browser cache and cookies, or use a private browsing session. This error was infuriating to figure out because of the very limited stack trace. So I ran the site under webrick which still only gave me this little stack trace:</p> <div class="highlight"><pre><code class="bash">Started GET <span class="s2">&quot;/&quot;</span> <span class="k">for </span>10.100.1.25 at 2014-09-19 22:47:22 +0000
NoMethodError <span class="o">(</span>undefined method <span class="sb">`</span>size<span class="s1">&#39; for nil:NilClass):</span>
<span class="s1"> /usr/lib/ruby/gems/1.8/gems/passenger-4.0.18/lib/phusion_passenger/rack/thread_handler_extension.rb:77:in `process_request&#39;</span>
 /usr/lib/ruby/gems/1.8/gems/passenger-4.0.18/lib/phusion_passenger/request_handler/thread_handler.rb:140:in <span class="sb">`</span>accept_and_process_next_request<span class="s1">&#39;</span>
<span class="s1"> /usr/lib/ruby/gems/1.8/gems/passenger-4.0.18/lib/phusion_passenger/request_handler/thread_handler.rb:108:in `main_loop&#39;</span>
 /usr/lib/ruby/gems/1.8/gems/passenger-4.0.18/lib/phusion_passenger/request_handler.rb:441:in <span class="sb">`</span>block <span class="o">(</span>3 levels<span class="o">)</span> in start_threads<span class="err">&#39;</span>
</code></pre></div> <p>Eventually, though, I got the site to run in development mode with <code>--debug</code> (exceedingly difficult), and the full browser-side 
stack trace had the real clue.</p> <div class="highlight"><pre><code class="bash">NoMethodError undefined method <span class="sb">`</span>size<span class="s1">&#39; for nil:NilClass</span> <span class="s1">Rails.root: /usr/share/foreman</span> <span class="s1">Application Full Trace</span> <span class="s1">rack (1.4.1) lib/rack/utils.rb:457:in `[]=&#39;</span> rack <span class="o">(</span>1.4.1<span class="o">)</span> lib/rack/utils.rb:76:in <span class="sb">`</span>block in parse_query<span class="s1">&#39;</span> <span class="s1">rack (1.4.1) lib/rack/utils.rb:66:in `each&#39;</span> rack <span class="o">(</span>1.4.1<span class="o">)</span> lib/rack/utils.rb:66:in <span class="sb">`</span>parse_query<span class="s1">&#39;</span> <span class="s1">rack (1.4.1) lib/rack/request.rb:263:in `cookies&#39;</span> rack <span class="o">(</span>1.4.1<span class="o">)</span> lib/rack/session/abstract/id.rb:254:in <span class="sb">`</span>extract_session_id<span class="s1">&#39;</span> <span class="s1">actionpack (3.2.8) lib/action_dispatch/middleware/session/abstract_store.rb:51:in `block in extract_session_id&#39;</span> actionpack <span class="o">(</span>3.2.8<span class="o">)</span> lib/action_dispatch/middleware/session/abstract_store.rb:55:in <span class="sb">`</span>stale_session_check!<span class="s1">&#39;</span> <span class="s1">actionpack (3.2.8) lib/action_dispatch/middleware/session/abstract_store.rb:51:in `extract_session_id&#39;</span> rack <span class="o">(</span>1.4.1<span class="o">)</span> lib/rack/session/abstract/id.rb:43:in <span class="sb">`</span>load_session_id!<span class="s1">&#39;</span> <span class="s1">rack (1.4.1) lib/rack/session/abstract/id.rb:32:in `[]&#39;</span> rack <span class="o">(</span>1.4.1<span class="o">)</span> lib/rack/session/abstract/id.rb:262:in <span class="sb">`</span>current_session_id<span class="s1">&#39;</span> <span class="s1">rack (1.4.1) lib/rack/session/abstract/id.rb:268:in `session_exists?&#39;</span> rack <span 
class="o">(</span>1.4.1<span class="o">)</span> lib/rack/session/abstract/id.rb:107:in <span class="sb">`</span>exists?<span class="s1">&#39;</span> <span class="s1">rack (1.4.1) lib/rack/session/abstract/id.rb:122:in `load_for_read!&#39;</span> rack <span class="o">(</span>1.4.1<span class="o">)</span> lib/rack/session/abstract/id.rb:64:in <span class="sb">`</span>has_key?<span class="s1">&#39;</span> <span class="s1">actionpack (3.2.8) lib/action_dispatch/middleware/flash.rb:258:in `ensure in call&#39;</span> actionpack <span class="o">(</span>3.2.8<span class="o">)</span> lib/action_dispatch/middleware/flash.rb:259:in <span class="sb">`</span>call<span class="s1">&#39;</span> <span class="s1">rack (1.4.1) lib/rack/session/abstract/id.rb:205:in `context&#39;</span> rack <span class="o">(</span>1.4.1<span class="o">)</span> lib/rack/session/abstract/id.rb:200:in <span class="sb">`</span>call<span class="s1">&#39;</span> <span class="s1">actionpack (3.2.8) lib/action_dispatch/middleware/cookies.rb:339:in `call&#39;</span> activerecord <span class="o">(</span>3.2.8<span class="o">)</span> lib/active_record/query_cache.rb:64:in <span class="sb">`</span>call<span class="s1">&#39;</span> <span class="s1">activerecord (3.2.8) lib/active_record/connection_adapters/abstract/connection_pool.rb:473:in `call&#39;</span> actionpack <span class="o">(</span>3.2.8<span class="o">)</span> lib/action_dispatch/middleware/callbacks.rb:28:in <span class="sb">`</span>block in call<span class="s1">&#39;</span> <span class="s1">activesupport (3.2.8) lib/active_support/callbacks.rb:405:in `_run__3062187567662466012__call__827136857413921735__callbacks&#39;</span> activesupport <span class="o">(</span>3.2.8<span class="o">)</span> lib/active_support/callbacks.rb:405:in <span class="sb">`</span>__run_callback<span class="s1">&#39;</span> <span class="s1">activesupport (3.2.8) lib/active_support/callbacks.rb:385:in `_run_call_callbacks&#39;</span> activesupport <span 
class="o">(</span>3.2.8<span class="o">)</span> lib/active_support/callbacks.rb:81:in <span class="sb">`</span>run_callbacks<span class="s1">&#39;</span> <span class="s1">actionpack (3.2.8) lib/action_dispatch/middleware/callbacks.rb:27:in `call&#39;</span> actionpack <span class="o">(</span>3.2.8<span class="o">)</span> lib/action_dispatch/middleware/reloader.rb:65:in <span class="sb">`</span>call<span class="s1">&#39;</span> <span class="s1">actionpack (3.2.8) lib/action_dispatch/middleware/remote_ip.rb:31:in `call&#39;</span> actionpack <span class="o">(</span>3.2.8<span class="o">)</span> lib/action_dispatch/middleware/debug_exceptions.rb:16:in <span class="sb">`</span>call<span class="s1">&#39;</span> <span class="s1">actionpack (3.2.8) lib/action_dispatch/middleware/show_exceptions.rb:56:in `call&#39;</span> railties <span class="o">(</span>3.2.8<span class="o">)</span> lib/rails/rack/logger.rb:26:in <span class="sb">`</span>call_app<span class="s1">&#39;</span> <span class="s1">railties (3.2.8) lib/rails/rack/logger.rb:16:in `call&#39;</span> actionpack <span class="o">(</span>3.2.8<span class="o">)</span> lib/action_dispatch/middleware/request_id.rb:22:in <span class="sb">`</span>call<span class="s1">&#39;</span> <span class="s1">rack (1.4.1) lib/rack/methodoverride.rb:21:in `call&#39;</span> rack <span class="o">(</span>1.4.1<span class="o">)</span> lib/rack/runtime.rb:17:in <span class="sb">`</span>call<span class="s1">&#39;</span> <span class="s1">activesupport (3.2.8) lib/active_support/cache/strategy/local_cache.rb:72:in `call&#39;</span> rack <span class="o">(</span>1.4.1<span class="o">)</span> lib/rack/lock.rb:15:in <span class="sb">`</span>call<span class="s1">&#39;</span> <span class="s1">actionpack (3.2.8) lib/action_dispatch/middleware/static.rb:62:in `call&#39;</span> railties <span class="o">(</span>3.2.8<span class="o">)</span> lib/rails/engine.rb:479:in <span class="sb">`</span>call<span class="s1">&#39;</span> <span class="s1">railties 
(3.2.8) lib/rails/application.rb:223:in `call&#39;</span> railties <span class="o">(</span>3.2.8<span class="o">)</span> lib/rails/railtie/configurable.rb:30:in <span class="sb">`</span>method_missing<span class="s1">&#39;</span> <span class="s1">rack (1.4.1) lib/rack/builder.rb:134:in `call&#39;</span> rack <span class="o">(</span>1.4.1<span class="o">)</span> lib/rack/urlmap.rb:64:in <span class="sb">`</span>block in call<span class="s1">&#39;</span> <span class="s1">rack (1.4.1) lib/rack/urlmap.rb:49:in `each&#39;</span> rack <span class="o">(</span>1.4.1<span class="o">)</span> lib/rack/urlmap.rb:49:in <span class="sb">`</span>call<span class="s1">&#39;</span> <span class="s1">rack (1.4.1) lib/rack/content_length.rb:14:in `call&#39;</span> railties <span class="o">(</span>3.2.8<span class="o">)</span> lib/rails/rack/debugger.rb:20:in <span class="sb">`</span>call<span class="s1">&#39;</span> <span class="s1">railties (3.2.8) lib/rails/rack/log_tailer.rb:17:in `call&#39;</span> rack <span class="o">(</span>1.4.1<span class="o">)</span> lib/rack/handler/webrick.rb:59:in <span class="sb">`</span>service<span class="s1">&#39;</span> <span class="s1">/opt/rh/ruby193/root/usr/share/ruby/webrick/httpserver.rb:138:in `service&#39;</span> /opt/rh/ruby193/root/usr/share/ruby/webrick/httpserver.rb:94:in <span class="sb">`</span>run<span class="s1">&#39;</span> <span class="s1">/opt/rh/ruby193/root/usr/share/ruby/webrick/server.rb:191:in `block in start_thread&#39;</span> </code></pre></div> <p>Notice that the top of the trace is where the session middleware looks for cookies (the <code>cookies</code> frame in <code>rack/request.rb</code> calling <code>parse_query</code>). That's what finally tipped me off that it was a browser-side cookie issue. Who'd think that a browser-side issue would give such a small, unhelpful stack trace...</p>
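<p>As a rough sketch of how a bad cookie could produce this exact error: the snippet below is plain Ruby, not Rack's code. It assumes the parser splits the Cookie header on semicolons and adds up key sizes as it stores each pair, which is only what the <code>parse_query</code> and <code>[]=</code> frames in the trace suggest. <code>ConstrainedParams</code>, <code>parse_cookie_header</code> and <code>try_parse</code> are made-up names, and the doubled semicolon is just one hypothetical way to end up with a nil key.</p>

```ruby
# Simplified stand-in for a size-limited params hash.
# Hypothetical illustration only; this is not Rack's actual source.
class ConstrainedParams
  def initialize(limit = 65_536)
    @limit  = limit
    @size   = 0
    @params = {}
  end

  def []=(key, value)
    # Calling .size on a nil key raises NoMethodError.
    @size += key.size unless @params.key?(key)
    raise RangeError, "exceeded available parameter key space" if @size > @limit
    @params[key] = value
  end

  def to_h
    @params
  end
end

# Split a Cookie header into name/value pairs (assumed semicolon separated).
def parse_cookie_header(header)
  params = ConstrainedParams.new
  header.split(/; */).each do |pair|
    key, value = pair.split("=", 2) # an empty pair leaves key as nil
    params[key] = value
  end
  params.to_h
end

# Return either the parsed cookies or the error, as a string.
def try_parse(header)
  parse_cookie_header(header).inspect
rescue NoMethodError => e
  "#{e.class}: #{e.message}"
end

puts try_parse("_session_id=abc123; theme=dark")  # well-formed header
puts try_parse("_session_id=abc123;; theme=dark") # empty pair between the semicolons
```

<p>The first header parses normally; the second hits the nil key and reports a NoMethodError about <code>size</code>, the same shape of failure as in the trace. A private browsing session sends no stale cookies at all, which fits with why that made the error go away.</p>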