Uberobert

Don't Use Homebrew for Installing NPM - use NVM

2015-11-13T00:00:00-08:00

As a user on this Homebrew issue utters like a savant: "package managers managing package managers rarely works out well."

First clean up your homebrew crap:

brew uninstall --force node
rm -rf ~/.npm
rm -rf ~/.node

Then install NVM with its install script, which you can then use to install Node (and NPM along with it):

curl -o- https://raw.githubusercontent.com/creationix/nvm/v0.29.0/install.sh | bash

The initial version installed with Homebrew is OK, but updating Node after installing via Homebrew is impossible due to how Homebrew handles linking. Using NVM makes upgrades and managing versions simpler.

Cassandra Freezes on CentOS 6.6 and Haswell Processors

2015-10-21T00:00:00-07:00

This is an FYI and a warning: be very careful with Haswell processors on RHEL/CentOS 6.6. There is a futex wait() bug that can cause waiting processes to never resume again. A good description is on InfoQ.

“The impact of this kernel bug is very simple: user processes can deadlock and hang in seemingly impossible situations. A futex wait call (and anything using a futex wait) can stay blocked forever, even though it had been properly woken up by someone. Thread.park() in Java may stay parked. Etc. If you are lucky you may also find soft lockup messages in your dmesg logs. If you are not that lucky (like us, for example), you'll spend a couple of months of someone's time trying to find the fault in your code, when there is nothing there to find.”

I recently saw this with Dell R630s and Cassandra. A thread dump shows the threads in a BLOCKED state, and the stack trace shows them as parked.

Thread 104823: (state = BLOCKED)
 - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
 - java.util.concurrent.locks.LockSupport.parkNanos(java.lang.Object, long) @bci=20, line=226 (Compiled frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(long) @bci=68, line=2082 (Compiled frame)
 - java.util.concurrent.LinkedBlockingQueue.poll(long, java.util.concurrent.TimeUnit) @bci=62, line=467 (Compiled frame)
 - java.util.concurrent.ThreadPoolExecutor.getTask() @bci=141, line=1068 (Compiled frame)
 - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=26, line=1130 (Compiled frame)
 - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=615 (Interpreted frame)
 - java.lang.Thread.run() @bci=11, line=745 (Interpreted frame)

Cassandra logs go completely blank and CPU utilization will stay constant at some level (sometimes high, sometimes none). Interestingly, you can revive the process with a kill -STOP <jvm_pid> followed by a kill -CONT <jvm_pid> (which is much faster than a service restart).
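If you want to script that stop/continue nudge, here is a minimal Python sketch of the same two signals. The function name and the pause length are my own assumptions, not something from the original troubleshooting, and it only works on POSIX systems.

```python
import os
import signal
import time


def pause_and_resume(pid: int, pause_seconds: float = 1.0) -> None:
    """Send SIGSTOP and then SIGCONT to a process (POSIX only)."""
    os.kill(pid, signal.SIGSTOP)   # same as: kill -STOP <pid>
    time.sleep(pause_seconds)      # assumed pause; the post does not specify one
    os.kill(pid, signal.SIGCONT)   # same as: kill -CONT <pid>
```

Call it with the JVM's pid, for example pause_and_resume(12345). This only works around a hung process; the kernel update is still the actual fix.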

To fix this, update CentOS 6.6 to the newest kernel in the updates repository, version 2.6.32-504.30.3.el6.x86_64.

Big thanks to Adam Hattrell, Simon Ashley and Erick Ramirez from Datastax for the help to figure this out.

Impact of Latency on Performance Testing

2015-08-06T00:00:00-07:00

Something not often mentioned or tested is the impact of real-world latency on the operation and scalability of a website. The vast majority of load tests are run from a local load source, such as JMeter in the same availability zone. In this case the latency is incredibly low, probably sub-millisecond. In the real world your application will never see this kind of latency again; it will be anywhere from 50 to 500ms depending on the global mix of traffic you receive. This can kill the performance of your application in surprising ways.

The time Apache spends waiting for a response on low-latency requests is going to be small, which allows your servers to handle a much larger volume of traffic spread over a much lower number of threads. This is further amplified if your application is handling a lot of small quick requests, say a web API. In the lab, a server might be able to handle thousands of requests per second with only 30-100 threads at any given time. Using such a small number of threads is stellar for performance, since the box needs much less application concurrency. A change in latency from 1ms to 200ms will cause the network overhead of each transaction to take 200 times longer by definition; if your application has a 1:1 ratio of threads to transactions, the concurrency tied up in that overhead grows 200-fold as well. This could most obviously lead to the box running out of threads or memory in production before it reaches the performance levels seen during testing.
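To put rough numbers on that, here is a small sketch using Little's law (requests in flight = arrival rate x time per request). Little's law is my framing rather than something the measurements above used, and the 1000 requests per second and 30 ms of server-side work are made-up values for illustration.

```python
def required_concurrency(requests_per_second: float, ms_per_request: float) -> float:
    """Average number of requests in flight (Little's law: L = lambda * W)."""
    return requests_per_second * ms_per_request / 1000.0


# Hypothetical workload: 1000 req/s with 30 ms of server-side work per request.
SERVER_MS = 30
lab_threads = required_concurrency(1000, SERVER_MS + 1)     # 1 ms network latency
prod_threads = required_concurrency(1000, SERVER_MS + 200)  # 200 ms network latency

print(f"lab: {lab_threads:.0f} busy threads, production: {prod_threads:.0f}")
```

With a thread held for the whole request, the same traffic that needs about 31 busy threads in the lab needs about 230 at 200 ms of latency.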

Latency issues could also highlight any bottlenecks in your code where the application blocks while waiting on other threads. You could see this in your performance graphs by comparing context switching and system CPU usage between QA and production, as waiting on other threads often shows up at the kernel level.

What to do

So finally, what can we do about this? Load test from over the internet! You should mimic production latency in your performance testing environment; this will ensure that you not only test the raw performance of your application but also stress it at production-like concurrency levels. To do this you should generate the load for your tests remotely in some form of cloud, like AWS. This raises a big question though: where do you generate the load from? If your average visitors are fairly close by geographically, you don't need to test from that far away. But if you have a truly global customer base you may want to generate load from the other side of the Atlantic. To decide where, you really need a good average of your production latency, which is fairly hard to measure (I'm not about to ping every IP in my Apache access log, haha). Luckily we can get this number in a roundabout way through testing!

If using Apache HTTPD, the first step is to enable Apache server-status. If you want to see what this looks like, httpd.apache.org has server-status enabled by default, kudos to them. Next, test your app in QA: fire up enough threads to mirror the requests per second your production site sees, then measure the number of active threads ("requests currently being processed" in server-status) in both QA and production. Using this you can compute the average latency you see on your production site like so:

production_latency = local_latency * production_threads / local_threads
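As a worked example of that formula, here is a minimal sketch. The function name and the sample numbers (2 ms of latency in QA, 6 busy threads in QA, 300 busy threads in production at the same request rate) are hypothetical.

```python
def estimate_production_latency_ms(local_latency_ms: float,
                                   production_threads: float,
                                   local_threads: float) -> float:
    """production_latency = local_latency * production_threads / local_threads"""
    if local_threads <= 0:
        raise ValueError("local_threads must be positive")
    return local_latency_ms * production_threads / local_threads


# Hypothetical measurements taken at the same requests per second.
print(estimate_production_latency_ms(2.0, 300, 6))
```

With those inputs the estimate comes out at 100 ms, which is the number you would then try to match when picking a place to generate load from.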

To increase the accuracy of your measurement, increase the latency in your test environment, possibly by generating the load from a nearby AWS location. You will still know the latency, but it won't be so close to 0; the difference between .08ms and .07ms is pretty significant in the final number while being hard to measure accurately...

So then, when you are armed with a production latency number, peruse different cloud providers and find one whose latency to your test site is near or somewhat larger than what you see in production. Then when you run tests you can also test at application concurrency numbers similar to what is experienced in production!

Any comments, questions, concerns, or areas where I'm wrong that you'd like to troll?

Generate a Jekyll Sitemap

2015-07-15T00:00:00-07:00

I just set up my blog on Google's Webmaster Tools and saw that they wanted a sitemap. This led to the question of "how do I make one of those?" Luckily I found David Singer's blog post on building a sitemap! This is a direct paste of his code, which I found to work exactly as required; be sure to check out his blog.

All that is required is putting the sitemap code in the root directory of your blog, and then adding the following to the front matter of any pages you want to customize (if you don't add this, the page will get the defaults):

sitemap:
  lastmod: 2014-01-23
  priority: 0.7
  changefreq: 'monthly'
  exclude: 'yes'

---
layout: null
sitemap:
  exclude: 'yes'
---
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  {% for post in site.posts %}
    {% unless post.published == false %}
    <url>
      <loc>{{ site.url }}{{ post.url }}</loc>
      {% if post.sitemap.lastmod %}
        <lastmod>{{ post.sitemap.lastmod | date: "%Y-%m-%d" }}</lastmod>
      {% elsif post.date %}
        <lastmod>{{ post.date | date_to_xmlschema }}</lastmod>
      {% else %}
        <lastmod>{{ site.time | date_to_xmlschema }}</lastmod>
      {% endif %}
      {% if post.sitemap.changefreq %}
        <changefreq>{{ post.sitemap.changefreq }}</changefreq>
      {% else %}
        <changefreq>monthly</changefreq>
      {% endif %}
      {% if post.sitemap.priority %}
        <priority>{{ post.sitemap.priority }}</priority>
      {% else %}
        <priority>0.5</priority>
      {% endif %}
    </url>
    {% endunless %}
  {% endfor %}
  {% for page in site.pages %}
    {% unless page.sitemap.exclude == "yes" %}
    <url>
      <loc>{{ site.url }}{{ page.url | remove: "index.html" }}</loc>
      {% if page.sitemap.lastmod %}
        <lastmod>{{ page.sitemap.lastmod | date: "%Y-%m-%d" }}</lastmod>
      {% elsif page.date %}
        <lastmod>{{ page.date | date_to_xmlschema }}</lastmod>
      {% else %}
        <lastmod>{{ site.time | date_to_xmlschema }}</lastmod>
      {% endif %}
      {% if page.sitemap.changefreq %}
        <changefreq>{{ page.sitemap.changefreq }}</changefreq>
      {% else %}
        <changefreq>monthly</changefreq>
      {% endif %}
      {% if page.sitemap.priority %}
        <priority>{{ page.sitemap.priority }}</priority>
      {% else %}
        <priority>0.3</priority>
      {% endif %}
    </url>
    {% endunless %}
  {% endfor %}
</urlset>

Cheers David!

Create Puppet Facts with Package Versions

2015-06-19T00:00:00-07:00

With Puppet, sometimes it is necessary to write case statements around what version of a package is installed rather than having Puppet dictate what version is installed. Say "add x line to sshd config for x version." To do this we need a fact holding the package version already installed on the system; here's a quick script to do that in bulk. To deploy, add this as a plugin in a module.

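#!/usr/bin/ruby
require 'facter'

# Anchored regex of the packages we care about. rpm prints "NAME VERSION-RELEASE",
# so match on the bare package name (no architecture suffix) followed by a space.
# Note: on RHEL/CentOS, sshd ships in the openssh-server package.
packages = '(^bash |^glibc |^httpd |^jdk |^jre |^mod_ssl |^openssl |^php |^openssh-server )'
# Query rpm once and filter the output, rather than calling rpm once per package.
version = Facter::Util::Resolution.exec("rpm -qa --queryformat '[%{NAME} %{VERSION}-%{RELEASE}\n]' | grep -E '#{packages}'")

# exec can return nil (for example when the command cannot be run), so fall back to an empty string.
(version || '').each_line do |line|
  name, installed = line.split
  # Dashes aren't allowed in fact names, so swap them for underscores.
  Facter.add("package_#{name}".gsub('-', '_')) do
    setcode do
      installed
    end
  end
end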

In this case we've got a regex with all the packages we care about. I know you could make this an array and loop over it, but performance is better if you call the rpm command just once and then loop over the output. Technically you could do all packages, but that seems pointless to me and would also be a ton of packages. Then the other trick was changing '-' to '_', because dashes aren't allowed in fact names but are quite common in package names.
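To make the name mangling concrete, here is a small Python sketch of the same transformation applied to canned rpm output. It is only an illustration of what fact names come out, not a replacement for the Facter plugin, and the package versions in the sample are made up.

```python
def facts_from_rpm_output(rpm_output: str) -> dict:
    """Map 'NAME VERSION-RELEASE' lines to package_<name> facts."""
    facts = {}
    for line in rpm_output.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        name, version = parts[0], parts[1]
        # Dashes are not allowed in fact names, so replace them with underscores.
        facts["package_" + name.replace("-", "_")] = version
    return facts


# Hypothetical sample of what the rpm query might print.
sample = "bash 4.1.2-29.el6\nmod_ssl 2.2.15-39.el6\nopenssh-server 5.3p1-104.el6\n"
for fact, value in sorted(facts_from_rpm_output(sample).items()):
    print(fact, "=>", value)
```

Here openssh-server ends up as the fact package_openssh_server, while names without dashes pass through unchanged.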

Then in our manifests we can make case statements or if blocks on these new fancy facts.

if $package_foo =~ /^3\.2\.7/ {
    file { '/usr/local/foo/conf/config.yaml' :
        ensure  => 'file',
        source  => 'puppet:///modules/foo/3.2.7/usr/local/foo/conf/config.yaml',
    }
} else {
    notify { 'foo_unknown' :
        message => "I don't know what version of 'foo' to configure, this run is foobar!",
    }
}

Let me know if you find this helpful, horrible, or both.

Bandwidth Required for Cassandra Hinted Handoff

2015-04-28T00:00:00-07:00

For capacity planning and stability reasons it is important to be able to estimate how Cassandra will act in a multi-DC environment during adverse networking conditions. This includes capacity planning for WAN bandwidth and ensuring that after a network partition you can stream the hints off fast enough that they don't expire (TTL off), but not so fast that you saturate your WAN and crash the cluster. In this post we'll go into how to compute the streaming speeds. Then in the next post we will look at fine-tuning the number of threads needed to stream data safely after a network partition.

Regular WAN Bandwidth

First we need some baseline numbers on WAN bandwidth requirements. Just from watching the average running of a single-DC cluster you can figure out the WAN bandwidth necessary for day-to-day operations. Rough, back-of-the-envelope estimates for WAN bandwidth can be had by disabling thrift/native_transport on a node and measuring internal network communication. At this point the node will just be handling internal Cassandra reads/writes; divide this number by your replication factor (because writes are sent just once over the WAN) for a high-water mark of WAN bandwidth during a normal day. In practice this estimate will be higher than actual WAN traffic because the WAN is only seeing writes and not reads.
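As a quick sketch of that back-of-the-envelope estimate: the function name, the units, and the sample figures (90 Mbit/s of internal traffic, replication factor 3) are assumptions for illustration only.

```python
def baseline_wan_estimate(internal_traffic_mbps: float, replication_factor: int) -> float:
    """High-water mark for normal-day WAN bandwidth: internal traffic / RF."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return internal_traffic_mbps / replication_factor


# Hypothetical measurement with client transports disabled on the node.
print(baseline_wan_estimate(90.0, 3))
```

Treat the result as an upper bound, since the measured internal traffic also includes reads that never cross the WAN.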

WAN Bandwidth and Hints

Next comes the much harder part: what network bandwidth will be required after a temporary network partition? After the network is restored you'll have both regular traffic and hint traffic.

The primary settings for hints are in cassandra.yaml and look like this:

# See http://wiki.apache.org/cassandra/HintedHandoff
hinted_handoff_enabled: true
# this defines the maximum amount of time a dead host will have hints
# generated.  After it has been dead this long, new hints for it will not be
# created until it has been seen alive and gone down again.
max_hint_window_in_ms: 10800000 # 3 hours
# Maximum throttle in KBs per second, per delivery thread.  This will be
# reduced proportionally to the number of nodes in the cluster.  (If there
# are two nodes in the cluster, each delivery thread will use the maximum
# rate; if there are three, each will throttle to half of the maximum,
# since we expect two nodes to be delivering hints simultaneously.)
hinted_handoff_throttle_in_kb: 1024
# Number of threads with which to deliver hints;
# Consider increasing this number when you have multi-dc deployments, since
# cross-dc handoff tends to be slower
max_hints_delivery_threads: 2

The note for max_hints_delivery_threads, "Consider increasing this number when you have multi-dc deployments", is less than helpful, and there is currently no good documentation on the web for tuning this value. So let's dig into the source and see how this all works so we can plan for it with multiple datacenters.

Into the Source

First off, when Cassandra starts it creates a thread pool sized to match max_hints_delivery_threads (the getMaxHintsThread method):

java/org/apache/cassandra/db/HintedHandOffManager.java:
private final JMXEnabledScheduledThreadPoolExecutor executor =
    new JMXEnabledScheduledThreadPoolExecutor(
        DatabaseDescriptor.getMaxHintsThread(),
        new NamedThreadFactory("HintedHandoff", Thread.MIN_PRIORITY),
        "internal");

Next, when a node is seen alive via gossip, Cassandra schedules a hint delivery to it:

cassandra/src/java/org/apache/cassandra/service/StorageService.java:
public void onAlive(InetAddress endpoint, EndpointState state)
{
    ...
        HintedHandOffManager.instance.scheduleHintDelivery(endpoint, true);
        for (IEndpointLifecycleSubscriber subscriber : lifecycleSubscribers)
            subscriber.onUp(endpoint);

Which hands a task to the executor to handle the handoff:

cassandra/src/java/org/apache/cassandra/db/HintedHandOffManager.java:
public void scheduleHintDelivery(final InetAddress to, final boolean precompact)
{
    // We should not deliver hints to the same host in 2 different threads
    if (!queuedDeliveries.add(to))
        return;

    logger.debug("Scheduling delivery of Hints to {}", to);

    executor.execute(new Runnable()

And finally, here's the code which divides our hinted_handoff_throttle_in_kb by the number of other nodes in the cluster (cluster size minus one):

cassandra/src/java/org/apache/cassandra/db/HintedHandOffManager.java:
// rate limit is in bytes per second. Uses Double.MAX_VALUE if disabled (set to 0 in cassandra.yaml).
// max rate is scaled by the number of nodes in the cluster (CASSANDRA-5272).
int throttleInKB = DatabaseDescriptor.getHintedHandoffThrottleInKB()
                   / (StorageService.instance.getTokenMetadata().getAllEndpoints().size() - 1);
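To see what that division does to the configured value, here is a small Python sketch of the same arithmetic. It mirrors only the integer division in the excerpt, and the function name is mine.

```python
def per_thread_throttle_kb(hinted_handoff_throttle_in_kb: int, cluster_size: int) -> int:
    """Per-delivery-thread throttle: configured value / (cluster size - 1)."""
    if cluster_size < 2:
        raise ValueError("need at least two nodes for hints to be delivered")
    # Integer division, like the Java int arithmetic in the excerpt.
    return hinted_handoff_throttle_in_kb // (cluster_size - 1)


for nodes in (2, 3, 12):
    print(nodes, "nodes ->", per_thread_throttle_kb(1024, nodes), "KB/s per thread")
```

With the default of 1024 this gives 1024 KB/s for two nodes and 512 KB/s for three, which matches the comment in cassandra.yaml, and only 93 KB/s per thread in a twelve-node cluster.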

The Math for Hints

So let's put all this together. Say you have a cluster with nodes split among 2 DCs, DC1 and DC2. DC2 goes down for a time, then returns to service.

Your maximum outbound hint streaming speed per node is computed by

max_streaming_speed_per_node = ( hinted_handoff_throttle_in_kb / ( node_count - 1 ) ) * max_hints_delivery_threads

But because Cassandra only allows one outbound hint thread per remote node, the maximum inbound hint streaming per node will still be hinted_handoff_throttle_in_kb. This is important because you can then safely increase max_hints_delivery_threads without worrying about overwhelming a single node.

Then in the case of a network partition, we'd expect streaming to be queued for the entire DC that went off the web. So expected WAN bandwidth usage would be

max_wan_hint_speed = max_streaming_speed_per_node * DC1_node_count
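Here is a worked example of both formulas as a minimal sketch. I'm assuming node_count means every node in the cluster (which is what getAllEndpoints() suggests), and the twelve-node, two-DC layout is hypothetical.

```python
def max_streaming_speed_per_node_kb(throttle_kb: int, node_count: int,
                                    delivery_threads: int) -> int:
    """Outbound hint speed of one node, in KB/s."""
    return (throttle_kb // (node_count - 1)) * delivery_threads


def max_wan_hint_speed_kb(throttle_kb: int, node_count: int,
                          delivery_threads: int, dc1_node_count: int) -> int:
    """Total hint traffic the surviving DC can push over the WAN, in KB/s."""
    per_node = max_streaming_speed_per_node_kb(throttle_kb, node_count, delivery_threads)
    return per_node * dc1_node_count


# Hypothetical cluster: 12 nodes total, 6 in DC1, cassandra.yaml defaults.
print(max_streaming_speed_per_node_kb(1024, 12, 2))   # per-node outbound KB/s
print(max_wan_hint_speed_kb(1024, 12, 2, 6))          # WAN total KB/s
```

For that layout each DC1 node can send 186 KB/s of hints, so the WAN sees at most 1116 KB/s of hint traffic on top of regular writes.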

Next post looks at taking these numbers and figuring out how long it'll take hints to replay based on different max_hints_delivery_threads settings.

Cassandra gc_grace_seconds of 0 Disables Hinted Handoff

2015-01-29T00:00:00-08:00

This is just a quick FYI post as I don't see this documented on the web elsewhere. As of now, in all versions of Cassandra a gc_grace_seconds setting of 0 will disable hinted handoff. Basically, to avoid an edge case that could cause data to reappear in a cluster (detailed in Jira CASSANDRA-5314), hints are stored with a TTL equal to the smallest gc_grace_seconds of the column families in the mutation. A gc_grace_seconds setting of 0 will cause hints to TTL instantly, and they will never be streamed off when a node comes back up.

Here's the code line:

cassandra/src/java/org/apache/cassandra/db/RowMutation.java:
/*
 * determine the TTL for the hint RowMutation
 * this is set at the smallest GCGraceSeconds for any of the CFs in the RM
 * this ensures that deletes aren't "undone" by delivery of an old hint
 */
public int calculateHintTTL()
{
    int ttl = Integer.MAX_VALUE;
    for (ColumnFamily cf : getColumnFamilies())
        ttl = Math.min(ttl, cf.metadata().getGcGraceSeconds());
    return ttl;
}
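The same logic as a tiny Python sketch, to show why a single column family with gc_grace_seconds of 0 is enough to zero out the hint TTL. The list of values is a stand-in for the column families touched by one mutation, and the numbers are examples.

```python
INT_MAX = 2**31 - 1  # Java's Integer.MAX_VALUE


def calculate_hint_ttl(gc_grace_seconds_per_cf: list) -> int:
    """Hint TTL is the smallest gc_grace_seconds among the mutation's CFs."""
    ttl = INT_MAX
    for gc_grace_seconds in gc_grace_seconds_per_cf:
        ttl = min(ttl, gc_grace_seconds)
    return ttl


print(calculate_hint_ttl([864000, 3600]))  # smallest value wins
print(calculate_hint_ttl([864000, 0]))     # one CF at 0 gives the hint a TTL of 0
```

A hint written with a TTL of 0 has already expired by the time delivery runs, which is why handoff reports 0 rows.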

And the log lines:

INFO 03:00:48,578 Finished hinted handoff of 0 rows to endpoint /10.0.0.58
INFO 03:00:48,584 Finished hinted handoff of 0 rows to endpoint /10.0.0.59
INFO 03:00:48,589 Finished hinted handoff of 0 rows to endpoint /10.0.0.37
INFO 03:00:48,594 Finished hinted handoff of 0 rows to endpoint /10.0.0.36
INFO 03:00:48,599 Finished hinted handoff of 0 rows to endpoint /10.0.0.39
INFO 03:00:48,604 Finished hinted handoff of 0 rows to endpoint /10.0.0.38
INFO 03:00:48,608 Finished hinted handoff of 0 rows to endpoint /10.0.0.33
INFO 03:00:48,613 Finished hinted handoff of 0 rows to endpoint /10.0.0.32
INFO 03:00:48,617 Finished hinted handoff of 0 rows to endpoint /10.0.0.35
INFO 03:00:48,622 Finished hinted handoff of 0 rows to endpoint /10.0.0.34
INFO 03:00:48,627 Finished hinted handoff of 0 rows to endpoint /10.0.0.45

In a single DC, having no hints isn't a huge issue: if you are using QUORUM for reads you'd end up fixing the missed writes, even if consistency is compromised some. In multi-DC with LOCAL_QUORUM this is a killer though; that data will never come across the WAN without a full repair. Yikes!

How to Remove Files from an RPM with FPM

2014-12-30T00:00:00-08:00

Sometimes it's handy to be able to unpack and rebuild an RPM to remove specific files. In this example I'll rebuild the RPM for Datastax Opscenter's agent without the init files. The tools used to do this are rpm2cpio and FPM. I manage services with daemontools, so having stuff start under init is a real issue. I could just install from the tarball, but I prefer to keep everything under rpm for simplicity. Plus there is a lot of useful stuff that the rpm does on install that I don't want to reproduce manually.

Just as a fair warning, this isn't a perfect process and is quite hacky. It won't let you rebuild from source or change the source of compiled code, go find the source rpm for that. This just works for removing files and even that's not quite guaranteed. Test your work before you push to prod! haha.

So first we need to unpack our rpm. For this we can use rpm2cpio.

mkdir datastax-agent-5.0.1
cd !$
rpm2cpio ../datastax-agent-5.0.1-1.noarch.rpm | cpio -idmv

This should extract our rpm's files into our current directory. After this go ahead and make any adjustments you want; I'm removing ./etc/init.d (note: when removing files don't leave empty directories behind, you don't want your rpm to own /etc/init.d/). Next you'll need to pull out all the install scripts used in the original rpm: rpm -qp --scripts datastax-agent-5.0.1-1.noarch.rpm. Copy the scripts into your text editor and go through them: break them into their separate sections for %pre, %post, etc. (separated by lines that look like "preinstall scriptlet (using /bin/sh):") and ensure they aren't using the files you removed. In my case I removed references to the services and chkconfig.
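If you'd rather not split the scriptlets by hand, here is a rough Python sketch that breaks saved rpm -qp --scripts output into sections using those header lines. The sample text is invented for illustration, and the regex only handles the "scriptlet (using ...)" header form mentioned above.

```python
import re

# Matches headers such as "preinstall scriptlet (using /bin/sh):"
HEADER = re.compile(r"^(\w+) scriptlet \(using ([^)]+)\):\s*$")


def split_scriptlets(scripts_output: str) -> dict:
    """Split `rpm -qp --scripts` output into {section name: script body}."""
    sections = {}
    current = None
    for line in scripts_output.splitlines():
        match = HEADER.match(line)
        if match:
            current = match.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(body).strip() for name, body in sections.items()}


# Invented sample; real output comes from the rpm command in the text.
sample = """preinstall scriptlet (using /bin/sh):
echo "preparing"
postinstall scriptlet (using /bin/sh):
chkconfig --add example-agent
echo "done"
"""
sections = split_scriptlets(sample)
# Flag sections that still reference what was removed from the package.
needs_edit = [name for name, body in sections.items()
              if "chkconfig" in body or "/etc/init.d" in body]
print(needs_edit)
```

The flagged sections are the ones to clean up before pasting the scripts into the spec file.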

Next rebuild your rpm with FPM.

gem install fpm
fpm -s dir -t rpm --url http://www.datastax.com --description "Datastax-agent sans init file" -m "[email protected]" --rpm-user opscenter-agent --rpm-group opscenter-agent -n mommas-datastax-agent -e -v 5.0.1 --iteration 1 --prefix=/ .

When I rebuild RPMs I usually try to keep the versioning the same but rename the rpm to something unique so others know I mucked with it. If you are going to keep the same name, I'd recommend adding an epoch (if it doesn't already have one). This makes your custom rpm appear newer than the default ones while still letting you keep their versioning. Once FPM opens up your new spec file (that's what the -e flag is for), go ahead and copy in your updated scripts from the last step. Save and close the file and FPM should build you a shiny new RPM.

And that's it!

Getting Rails Server --debug with SCL Ruby 1.9.3

2014-09-23T00:00:00-07:00

The documentation on getting rails server --debug working with the Software Collections (SCL) version of Ruby is a little weak. So here's how to do it. If you installed SCL ruby193 you'll probably get this error when you try to start the debugger:

/usr/share/foreman$ rails server --debug
=> Booting WEBrick
=> Rails 3.2.8 application starting in development on http://0.0.0.0:3000
=> Call with -d to detach
=> Ctrl-C to shutdown server
You need to install ruby-debug to run the server in debugging mode. With gems, use 'gem install ruby-debug'
Exiting

The great part though is that ruby-debug doesn't exist for Ruby 1.9.3. The gem you want is 'debugger', and the 'debugger' gem isn't packaged up with SCL ruby193. To get it, first set up the v8314 SCL repo, which is required for ruby193-ruby-devel. Then install these:

yum install ruby193-ruby-devel gcc
scl enable ruby193 bash
gem install debugger

After those are installed, the debugger should start working:

/usr/share/foreman$ RAILS_ENV=development ruby193-rails server --debug
=> Booting WEBrick
=> Rails 3.2.8 application starting in development on http://0.0.0.0:3000
=> Call with -d to detach
=> Ctrl-C to shutdown server
=> Debugger enabled
[2014-09-23 17:33:55] INFO  WEBrick 1.3.1
[2014-09-23 17:33:55] INFO  ruby 1.9.3 (2013-11-22) [x86_64-linux]
[2014-09-23 17:33:55] INFO  WEBrick::HTTPServer#start: pid=26604 port=3000

Great!

Foreman NoMethodError undefined method 'size' for nil:NilClass

2014-09-22T00:00:00-07:00

I spent quite a while in the last few days trying to figure this error out, and since there were no blog posts or info on it online I felt compelled to write one. Basic setup is RHEL 6.5 with either Foreman 1.5 or Foreman 1.6. I'm using the Foreman RPMs and the CentOS SCL repo for Ruby193.

Here's the error:

Started GET "/" for 10.100.128.63 at 2014-09-19 21:03:08 +0000

NoMethodError (undefined method `size' for nil:NilClass):

To fix this, try clearing your browser cache/cookies or use a private browsing session. This error was infuriating to figure out because of the very limited stack trace. So I ran the site under webrick, which still only gave me this little stack trace:

Started GET "/" for 10.100.1.25 at 2014-09-19 22:47:22 +0000

NoMethodError (undefined method `size' for nil:NilClass):
  /usr/lib/ruby/gems/1.8/gems/passenger-4.0.18/lib/phusion_passenger/rack/thread_handler_extension.rb:77:in `process_request'
  /usr/lib/ruby/gems/1.8/gems/passenger-4.0.18/lib/phusion_passenger/request_handler/thread_handler.rb:140:in `accept_and_process_next_request'
  /usr/lib/ruby/gems/1.8/gems/passenger-4.0.18/lib/phusion_passenger/request_handler/thread_handler.rb:108:in `main_loop'
  /usr/lib/ruby/gems/1.8/gems/passenger-4.0.18/lib/phusion_passenger/request_handler.rb:441:in `block (3 levels) in start_threads'

Eventually, though, I got the site to run in development mode with --debug (exceedingly difficult), and the browser-side full stack trace had the real clue.

NoMethodError

undefined method `size' for nil:NilClass
Rails.root: /usr/share/foreman

Application Full Trace

rack (1.4.1) lib/rack/utils.rb:457:in `[]='
rack (1.4.1) lib/rack/utils.rb:76:in `block in parse_query'
rack (1.4.1) lib/rack/utils.rb:66:in `each'
rack (1.4.1) lib/rack/utils.rb:66:in `parse_query'
rack (1.4.1) lib/rack/request.rb:263:in `cookies'
rack (1.4.1) lib/rack/session/abstract/id.rb:254:in `extract_session_id'
actionpack (3.2.8) lib/action_dispatch/middleware/session/abstract_store.rb:51:in `block in extract_session_id'
actionpack (3.2.8) lib/action_dispatch/middleware/session/abstract_store.rb:55:in `stale_session_check!'
actionpack (3.2.8) lib/action_dispatch/middleware/session/abstract_store.rb:51:in `extract_session_id'
rack (1.4.1) lib/rack/session/abstract/id.rb:43:in `load_session_id!'
rack (1.4.1) lib/rack/session/abstract/id.rb:32:in `[]'
rack (1.4.1) lib/rack/session/abstract/id.rb:262:in `current_session_id'
rack (1.4.1) lib/rack/session/abstract/id.rb:268:in `session_exists?'
rack (1.4.1) lib/rack/session/abstract/id.rb:107:in `exists?'
rack (1.4.1) lib/rack/session/abstract/id.rb:122:in `load_for_read!'
rack (1.4.1) lib/rack/session/abstract/id.rb:64:in `has_key?'
actionpack (3.2.8) lib/action_dispatch/middleware/flash.rb:258:in `ensure in call'
actionpack (3.2.8) lib/action_dispatch/middleware/flash.rb:259:in `call'
rack (1.4.1) lib/rack/session/abstract/id.rb:205:in `context'
rack (1.4.1) lib/rack/session/abstract/id.rb:200:in `call'
actionpack (3.2.8) lib/action_dispatch/middleware/cookies.rb:339:in `call'
activerecord (3.2.8) lib/active_record/query_cache.rb:64:in `call'
activerecord (3.2.8) lib/active_record/connection_adapters/abstract/connection_pool.rb:473:in `call'
actionpack (3.2.8) lib/action_dispatch/middleware/callbacks.rb:28:in `block in call'
activesupport (3.2.8) lib/active_support/callbacks.rb:405:in `_run__3062187567662466012__call__827136857413921735__callbacks'
activesupport (3.2.8) lib/active_support/callbacks.rb:405:in `__run_callback'
activesupport (3.2.8) lib/active_support/callbacks.rb:385:in `_run_call_callbacks'
activesupport (3.2.8) lib/active_support/callbacks.rb:81:in `run_callbacks'
actionpack (3.2.8) lib/action_dispatch/middleware/callbacks.rb:27:in `call'
actionpack (3.2.8) lib/action_dispatch/middleware/reloader.rb:65:in `call'
actionpack (3.2.8) lib/action_dispatch/middleware/remote_ip.rb:31:in `call'
actionpack (3.2.8) lib/action_dispatch/middleware/debug_exceptions.rb:16:in `call'
actionpack (3.2.8) lib/action_dispatch/middleware/show_exceptions.rb:56:in `call'
railties (3.2.8) lib/rails/rack/logger.rb:26:in `call_app'
railties (3.2.8) lib/rails/rack/logger.rb:16:in `call'
actionpack (3.2.8) lib/action_dispatch/middleware/request_id.rb:22:in `call'
rack (1.4.1) lib/rack/methodoverride.rb:21:in `call'
rack (1.4.1) lib/rack/runtime.rb:17:in `call'
activesupport (3.2.8) lib/active_support/cache/strategy/local_cache.rb:72:in `call'
rack (1.4.1) lib/rack/lock.rb:15:in `call'
actionpack (3.2.8) lib/action_dispatch/middleware/static.rb:62:in `call'
railties (3.2.8) lib/rails/engine.rb:479:in `call'
railties (3.2.8) lib/rails/application.rb:223:in `call'
railties (3.2.8) lib/rails/railtie/configurable.rb:30:in `method_missing'
rack (1.4.1) lib/rack/builder.rb:134:in `call'
rack (1.4.1) lib/rack/urlmap.rb:64:in `block in call'
rack (1.4.1) lib/rack/urlmap.rb:49:in `each'
rack (1.4.1) lib/rack/urlmap.rb:49:in `call'
rack (1.4.1) lib/rack/content_length.rb:14:in `call'
railties (3.2.8) lib/rails/rack/debugger.rb:20:in `call'
railties (3.2.8) lib/rails/rack/log_tailer.rb:17:in `call'
rack (1.4.1) lib/rack/handler/webrick.rb:59:in `service'
/opt/rh/ruby193/root/usr/share/ruby/webrick/httpserver.rb:138:in `service'
/opt/rh/ruby193/root/usr/share/ruby/webrick/httpserver.rb:94:in `run'
/opt/rh/ruby193/root/usr/share/ruby/webrick/server.rb:191:in `block in start_thread'

Notice that the trace comes from where the middleware looks for cookies. That's what finally tipped me off that it was a browser-side cookie issue. Who'd think that a browser-side issue would give such a small, unhelpful stack trace...