It's been a month and a half since
the first part of this series. Why the long delay? I've been busy with other things. I implemented
inheritance, various
compiler optimizations, and many other things. In the last couple of weeks I've been working on the web framework again, tying up some loose ends and porting more existing web applications over (namely, the pastebin and planet factor).
In this entry I will talk about session management. Session management was one of the first things I implemented in the new framework when I started working on it, but recently I gave the code an overhaul.
Session management
The basic idea behind session management is that while HTTP is a stateless protocol, we can simulate state by sending a token to the client -- either in the form of a hidden element on the page, or a cookie, which the client sends back to the server with a later request. This token is associated with an object on the server and the object holds state between requests.
Another approach for session management is to store state entirely on the client; instead of sending the client a session ID identifying an object on the server, you send the session data itself to the client. Traditionally this approach has only been used for user preferences and such where security is immaterial, but it can even be used for more sensitive data by encrypting it with a private key only known to the server. The client receives an opaque blob of binary data which cannot be inspected or tampered with (unless the public key encryption algorithm being used is compromised).
Currently Factor's session manager does not support client-side sessions, but it will soon, using
Doug Coleman's public-key encryption code. Server-side sessions are supported, however.
The session manager uses two main strategies to pass state to the client:
- For GET and HEAD requests, a cookie is used. The cookie's value is a randomly-generated session ID.
- For POST requests, the form must define a hidden field with the session ID. The value of the cookie is ignored to thwart cross-site scripting attacks.
The idea is to strike a balance between security and convenience; we don't want to add a session ID to every link and start a new session if the user navigates to the site by directly entering a URL, but on the other hand we don't want potentially destructive POST requests to be accepted unless they were sent by a form generated from within the session itself.
In Factor, a session is simply a hashtable where values can be stored. Keys are known as "session variables" and values can be read and written with the
sget
and
sset
words, there's also a
schange
combinator which applies a quotation is applied to an existing session variable to yield a new value. This all entirely analogous to the
get
/
set
/
change
words for dynamic variables.
Session namespaces are serialized and stored in a database using Doug's
db.tuples
O/R mapper. I originally supported pluggable "session storage" backends, with database storage and in-memory storage as the two options, however I decided to simplify the code and hardcode database storage. This has the side-effect that you'll need to set up a database to use the session management feature, however SQLite presents a lightweight option which requires no configuration, so I don't think this is a big deal at all.
I will show a small example of a 'counter' web application, much like the
counter example for the Seaside framework.
We start off with a vocabulary search path:
USING: math kernel accessors http.server http.server.actions
http.server.sessions http.server.templating.fhtml locals ;
IN: webapps.counter
Now, we define a symbol used to key a session variable:
SYMBOL: count
Next, we define a pair of actions which increment the counter value, using the
schange
combinator. The
display
slot of an action contains code to be executed upon a GET request; it is expected to output a response object. In our case, the word outputs an action which applies the quotation to the current counter value; the action outputs a response which redirects back to the main page:
:: <counter-action> ( quot -- action )
<action> [
count quot schange
"" f <standard-redirect>
] >>display ;
The action to decrement the counter is entirely analogous:
: <dec-action> ( -- action )
<action> [ count [ 1- ] schange f ] >>display ;
Note that this word
constructs actions, instead of invoking them. This approach is more flexible than the old "furnace" web framework, where actions were mapped directly to word execution, because it allows one to write "higher-order actions" parametrized by values more easily.
Here is the default action; it displays the counter value using a template:
: <counter-action> ( -- action )
<action> [
"resource:extra/webapps/counter/counter.fhtml" <fhtml>
] >>display ;
Finally we put everything together in a dispatcher:
: <counter-app> ( -- responder )
counter-app new-dispatcher
[ 1+ ] <counter-action> "inc" add-responder
[ 1- ] <counter-action> "dec" add-responder
<display-action> "" add-responder
<sessions> ;
We create a dispatcher, add instances of our actions to it, and wrap the whole thing in a session manager.
Now, the template:
<% USING: io math.parser http.server.sessions webapps.counter ; %>
<html>
<body>
<h1><% count sget number>string write %></h1>
<a href="inc">++</a>
<a href="dec">--</a>
</body>
</html>
Finally, once we have all the parts, we can create the counter responder and start the HTTP server:
<counter-app> "test.db" sqlite-db <db-persistence> main-responder set
8888 httpd
Note that here we wrap the counter responder in another layer of indirection, this time for database persistence; while the counter web app doesn't use persistence the session manager does, and we chose to use SQLite since it requires no configuration or external services.
Navigating over to http://localhost:8888/ should now display the counter app, and clicking the increment and decrement links should have an effect on the displayed value. Sessions persist between server restarts and time out after 20 minutes of inactivity by default. Looking at your web browser's cookie manager will show that a
factorsessid
cookie has been set.
As an aside, the Seaside version uses continuations to maintain state. The Factor version explicitly maintains state. Even though I ported
Chris Double's modal web framework over to the new HTTP server, I'm avoiding continuations in favor of explicit state for now. I am building up a form component framework with validation, easy persistence, and user authentication without resorting to continuations, and I plan on building a state-machine model with a page flow DSL, much like
jBPM, to handle more complex multi-page flows such as shopping carts. While this will result in more work for me, I believe the benefits include transparent support for load-balancing and fail-over, readable URLs, and ultimately, simpler and more reusable web application code because page flow can be decoupled from logic and expressed in a custom DSL intended for that purpose.
The Seaside version is also somewhat shorter; it is easy to express with idiomatic Seaside (transparent session management, presentation logic mixed in with web app code). I will add better abstractions to make up for some of the difference, and for larger applications there should be no difference in code size; in fact since the scope of Factor's framework is wider than Seaside (it covers persistence, authentication and validation, and soon, versioning of persistent entities) you might even need less code to accomplish the same thing.
Virtual hosting
The other topic I promised to cover last time was virtual hosting. Virtual hosting is done with dispatchers, much like nested directory structure is. You create a virtual host dispatcher with
<vhost-dispatcher>
and add responders for various virtual hosts using
add-responder
; the
>>default
slot can be used to set the default virtual host. The key difference between the new approach and the old HTTP server virtual hosting implementation, which relied on a global hashtable mapping virtual host names to responders, is flexibility; the virtual host dispatcher does not necessarily have to be your top-level responder.
For example, the
<boilerplate>
responder gives you a way of enforcing a common look and feel across a set of web apps, by adding common headers and footers to every page. While I will describe boilerplate responders and the template system in more detail in a later post, for now here is an example:
<vhost-dispatcher>
<online-store> "store.acme.com" add-responder
<support-site> "support.acme.com" add-responder
<main-site> "acme.com" add-responder
<boilerplate>
"acme-site.xml" >>template
acme-db <db-persistence>
<sessions> main-responder set
Here, all virtual hosts share the same session management, database persistence, and common theme, and the virtual host dispatch only happens after the request filters through the mentioned layers of functionality. This would not be possible with the old HTTP server without duplicating code.
Cookies
Finally I promised to talk about cookies. The session management support is great but sometimes you just want to get and set cookies directly. This can be done by reading the
cookies
slot of the request object, and writing the
cookies
slot of the response object. The slot contains a sequence of
cookie
objects, which are parsed and unparsed from their HTTP representation for you. A cookie object contains a series of slots, such as name, value, expiration date (as a Factor timestamp object), max-age (as a Factor duration object), path, and host. While the expiration date is deprecated as of HTTP/1.1, most sites still use it in favor of max-age because older browsers don't support max-age. Factor's HTTP server sets the date header on each response so that expiration dates can work correctly.
Here is an example of using the HTTP client (which shares the cookie code with the server) to look at Google's ridiculously long-lived cookies:
( scratchpad ) "http://www.google.com" http-get-stream drop cookies>> first describe
cookie instance
"delegate" f
"name" "pref"
"value" "ID=c0f4c074cd87502e:TM=1209466656:LM=1209466656:S=_6gGEKtuTgP..."
"path" "/"
"domain" ".google.com"
"expires" T{ timestamp f 2010 4 29 10 57 36 ~duration~ }
"max-age" f
"http-only" f