3. Running code on ZeroCloud

The examples below show how to execute code using plain curl commands on the command line, as well as from Python scripts using the requests module.

3.1. Setup: Getting an auth token

The first thing you need to do is get an auth token and find the storage URL for your account in Swift. For convenience, you can get this information simply by running zpm auth:

$ zpm auth
Auth token: PKIZ_Zrz_Qa5NJm44FWeF7Wp...
Storage URL: http://127.0.0.1:8080/v1/AUTH_7fbcd8784f8843a180cf187bbb12e49c

Setting a couple of environment variables with these values will make commands more concise and convenient to execute:

$ export OS_AUTH_TOKEN=PKIZ_Zrz_Qa5NJm44FWeF7Wp...
$ export OS_STORAGE_URL=http://127.0.0.1:8080/v1/AUTH_7fbcd8784f8843a180cf187bbb12e49c
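
If you script this setup step, the two values can be pulled out of the zpm auth output. The following is a minimal sketch, assuming the output keeps the "Auth token:" and "Storage URL:" line format shown above. The helper name parse_zpm_auth and the sample values are hypothetical and not part of zpm.

```python
def parse_zpm_auth(output):
    """Extract the token and storage URL from `zpm auth` output.

    Assumes the "Auth token: ..." / "Storage URL: ..." line format shown
    above; this helper is illustrative and not part of zpm.
    """
    labels = {'Auth token': 'OS_AUTH_TOKEN', 'Storage URL': 'OS_STORAGE_URL'}
    values = {}
    for line in output.splitlines():
        # Split on the first ": " only, so the colons in the URL survive.
        label, sep, value = line.partition(': ')
        if sep and label.strip() in labels:
            values[labels[label.strip()]] = value.strip()
    missing = sorted(set(labels.values()) - set(values))
    if missing:
        raise ValueError('missing in zpm auth output: %s' % ', '.join(missing))
    return values


# Made-up sample output; substitute the real output of `zpm auth`.
sample = (
    "Auth token: PKIZ_example_token\n"
    "Storage URL: http://127.0.0.1:8080/v1/AUTH_example\n"
)
for name, value in sorted(parse_zpm_auth(sample).items()):
    # Print shell export lines that can be pasted into a terminal.
    print('export %s=%s' % (name, value))
```

The helper raises ValueError when either line is absent, so a failed authentication is noticed before any request is sent.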

3.2. POST a Python script

This is the simplest way to execute code on ZeroCloud.

First, write the following code into a file called example:

#!file://python2.7:python
import sys
print("Hello from ZeroVM!")
print("sys.platform is '%s'" % sys.platform)

Execute it using curl:

$ curl -i -X POST -H "X-Auth-Token: $OS_AUTH_TOKEN" \
  -H "X-Zerovm-Execute: 1.0" -H "Content-Type: application/python" \
  --data-binary @example $OS_STORAGE_URL

Using a Python script:

import os
import requests

storage_url = os.environ.get('OS_STORAGE_URL')
headers = {
    'X-Zerovm-Execute': '1.0',  # header values must be strings
    'X-Auth-Token': os.environ.get('OS_AUTH_TOKEN'),
    'Content-Type': 'application/python',
}

with open('example') as fp:
    response = requests.post(storage_url,
                             data=fp.read(),
                             headers=headers)
    print(response.content)

You can write and execute any Python code in this way, using any of the modules in the standard library.
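
The request headers used above can also be assembled by a small helper that fails early when the auth token is missing. This is only a sketch: zerocloud_headers is a hypothetical name, not a ZeroCloud or requests API, and it assumes the OS_AUTH_TOKEN variable from section 3.1.

```python
import os


def zerocloud_headers(content_type, environ=None):
    """Build the headers for a ZeroCloud execute POST.

    `zerocloud_headers` is a hypothetical helper, not a ZeroCloud API.
    It reads OS_AUTH_TOKEN, the environment variable set in section 3.1.
    """
    environ = os.environ if environ is None else environ
    token = environ.get('OS_AUTH_TOKEN')
    if not token:
        raise RuntimeError('OS_AUTH_TOKEN is not set; run `zpm auth` first')
    return {
        # Header values are passed as strings, as in the curl command.
        'X-Zerovm-Execute': '1.0',
        'X-Auth-Token': token,
        'Content-Type': content_type,
    }


# Example with an explicit mapping instead of the real environment.
headers = zerocloud_headers('application/python',
                            {'OS_AUTH_TOKEN': 'PKIZ_example_token'})
print(headers['Content-Type'])
```

The returned dictionary can be passed as the headers argument of requests.post, with the Content-Type chosen to match the body being sent.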

3.3. POST a ZeroVM image

Another way to execute code on ZeroCloud is to create a specially constructed tarball (a “ZeroVM image”) and POST it directly to ZeroCloud. A “ZeroVM image” is a tarball with at minimum a boot/system.map file. The boot/system.map, or job description, contains runtime execution information which tells ZeroCloud what to execute.

This is useful if your code consists of multiple source files (not just a single script): you can pack everything into a single file and execute it. This method is also useful if you want to execute something just once, because the application is thrown away after ZeroCloud has executed it.

In this example, we’ll do just that. Create the following files:

mymath.py:

def add(a, b):
    return a + b

main.py:

import mymath
a = 5
b = 6
the_sum = mymath.add(a, b)
print("%s + %s = %s" % (a, b, the_sum))

Create a boot directory, then create the boot/system.map file inside it:

[{
    "name": "example",
    "exec": {
        "path": "file://python2.7:python",
        "args": "main.py"
    },
    "devices": [
        {"name": "python2.7"},
        {"name": "stdout"}
    ]
}]

Create the ZeroVM image:

$ tar cf example.tar boot/system.map main.py mymath.py
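
The same archive layout can be produced from Python with the standard tarfile module, which is convenient when the image is generated by a script. This is a sketch under assumptions: build_image is a hypothetical helper, the main.py content is abbreviated, and the result has not been verified against ZeroCloud beyond containing the same member names as the tar command above.

```python
import io
import json
import tarfile

# Job description, as in boot/system.map above.
SYSTEM_MAP = [{
    "name": "example",
    "exec": {"path": "file://python2.7:python", "args": "main.py"},
    "devices": [{"name": "python2.7"}, {"name": "stdout"}],
}]


def build_image(sources):
    """Return the bytes of a tar archive with boot/system.map and sources.

    `sources` maps archive member names to file contents (bytes).
    """
    members = {
        'boot/system.map': json.dumps(SYSTEM_MAP, indent=4).encode('utf-8'),
    }
    members.update(sources)
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode='w') as tar:
        for name in sorted(members):
            info = tarfile.TarInfo(name=name)
            info.size = len(members[name])
            tar.addfile(info, io.BytesIO(members[name]))
    return buf.getvalue()


# Abbreviated stand-ins for the two source files created above.
image = build_image({
    'main.py': b'import mymath\nprint(mymath.add(5, 6))\n',
    'mymath.py': b'def add(a, b):\n    return a + b\n',
})
with tarfile.open(fileobj=io.BytesIO(image)) as tar:
    print(tar.getnames())
```

The returned bytes can be written to example.tar or sent directly as the request body with Content-Type application/x-tar.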

Execute the ZeroVM image directly on ZeroCloud using curl:

$ curl -i -X POST -H "Content-Type: application/x-tar" \
  -H "X-Auth-Token: $OS_AUTH_TOKEN" -H "X-Zerovm-Execute: 1.0" \
  --data-binary @example.tar $OS_STORAGE_URL

Using a Python script:

import os
import requests

storage_url = os.environ.get('OS_STORAGE_URL')
headers = {
    'X-Zerovm-Execute': '1.0',  # header values must be strings
    'X-Auth-Token': os.environ.get('OS_AUTH_TOKEN'),
    'Content-Type': 'application/x-tar',
}

with open('example.tar', 'rb') as fp:  # the tarball is binary data
    response = requests.post(storage_url,
                             data=fp.read(),
                             headers=headers)
    print(response.content)

3.4. POST a job description to a ZeroVM application

This method is useful if you want to execute the same application multiple times, for example, to run an application to process multiple different files.

In this example, we will upload a packaged application into Swift and then subsequently POST job descriptions to execute the application. This can be done multiple times, and with different arguments. We’ll use this to build a small application. Create a directory sampleapp and in it, create the following files:

main.py:

import csv
with open('/dev/input') as fp:
    reader = csv.reader(fp)

    for id, name, email, balance in reader:
        print('%(name)s: %(balance)s' % dict(name=name, balance=balance))

Create an example.tar containing the Python script:

$ tar cf example.tar main.py

Create a container for the application:

$ swift post example

Upload the image into Swift:

$ swift upload example example.tar

Now we need to create a couple of files for the application to read and process.

data1.csv:

id,name,email,balance
1,Alice,alice@example.com,1000
2,Bob,bob@example.com,-500

data2.csv:

id,name,email,balance
3,David,david@example.com,15000
4,Erin,erin@example.com,25000

Upload the data files into Swift:

$ swift upload example data1.csv data2.csv
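
To preview what main.py will print for such a file, the same loop can be run locally on an in-memory copy of the data. This is a sketch for a workstation Python 3 interpreter, not for ZeroVM; summarize is a hypothetical name and the email addresses are placeholders.

```python
import csv
import io

# In-memory copy of data1.csv; the email addresses are placeholders.
DATA1 = u"""id,name,email,balance
1,Alice,alice@example.com,1000
2,Bob,bob@example.com,-500
"""


def summarize(fp):
    """Apply the loop from main.py to any file-like object."""
    lines = []
    for id_, name, email, balance in csv.reader(fp):
        lines.append('%(name)s: %(balance)s' % dict(name=name, balance=balance))
    return lines


for line in summarize(io.StringIO(DATA1)):
    print(line)
```

Because csv.reader does not treat the first row specially, the header row passes through the loop too, so the first line printed is "name: balance". Skipping it would require consuming the first row before the loop.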

Next, create the job description, job.json:

[{
    "name": "example",
    "exec": {
        "path": "file://python2.7:python",
        "args": "main.py"
    },
    "devices": [
        {"name": "python2.7"},
        {"name": "stdout"},
        {"name": "input", "path": "swift://~/example/data1.csv"},
        {"name": "image", "path": "swift://~/example/example.tar"}
    ]
}]

Execute it using curl:

$ curl -i -X POST -H "Content-Type: application/json" \
  -H "X-Auth-Token: $OS_AUTH_TOKEN" -H "X-Zerovm-Execute: 1.0" \
  --data-binary @job.json $OS_STORAGE_URL

Execute it using a Python script:

import os
import requests

storage_url = os.environ.get('OS_STORAGE_URL')
headers = {
    'X-Zerovm-Execute': '1.0',  # header values must be strings
    'X-Auth-Token': os.environ.get('OS_AUTH_TOKEN'),
    'Content-Type': 'application/json',
}

with open('job.json') as fp:
    response = requests.post(storage_url,
                             data=fp.read(),
                             headers=headers)
    print(response.content)

You can process a different input file by simply changing the job.json and re-running the application (using curl or the Python script above). For example, change this line

{"name": "input", "path": "swift://~/example/data1.csv"},

to this:

{"name": "input", "path": "swift://~/example/data2.csv"},

Your job.json file should now look like this:

[{
    "name": "example",
    "exec": {
        "path": "file://python2.7:python",
        "args": "main.py"
    },
    "devices": [
        {"name": "python2.7"},
        {"name": "stdout"},
        {"name": "input", "path": "swift://~/example/data2.csv"},
        {"name": "image", "path": "swift://~/example/example.tar"}
    ]
}]

Try running that and see the difference in the output:

$ curl -i -X POST -H "Content-Type: application/json" \
  -H "X-Auth-Token: $OS_AUTH_TOKEN" -H "X-Zerovm-Execute: 1.0" \
  --data-binary @job.json $OS_STORAGE_URL
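
Since only the input device path changes between runs, the job description can also be generated instead of edited by hand. The sketch below assumes the container and object names used in this section; make_job is a hypothetical helper.

```python
import json


def make_job(input_object, container='example'):
    """Return the job description from above for one input object.

    `make_job` is a hypothetical helper; only the "input" device path
    changes between runs.
    """
    return [{
        "name": "example",
        "exec": {"path": "file://python2.7:python", "args": "main.py"},
        "devices": [
            {"name": "python2.7"},
            {"name": "stdout"},
            {"name": "input",
             "path": "swift://~/%s/%s" % (container, input_object)},
            {"name": "image",
             "path": "swift://~/%s/example.tar" % container},
        ],
    }]


for data_file in ['data1.csv', 'data2.csv']:
    # `body` is what would be sent as the request data, with
    # Content-Type: application/json.
    body = json.dumps(make_job(data_file))
    print(make_job(data_file)[0]['devices'][2]['path'])
```

Each generated body can be POSTed exactly like the contents of job.json.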

3.5. Run a ZeroVM application with an object GET

It is possible to attach applications to particular types of objects and run that application when the object is retrieved (using a GET request) from Swift.

In this example, we’ll write an application which processes JSON file objects and returns a pretty-printed version of the contents. The idea here is that we take some raw JSON data and make it more human-readable.

Create the following files in a new directory sampleapp2:

data.json:

{"type": "GeometryCollection", "geometries": [{ "type": "Point", "coordinates": [100.0, 0.0]}, {"type": "LineString", "coordinates": [[101.0, 0.0], [102.0, 1.0]]}]}

prettyprint.py:

import json
import pprint

with open('/dev/input') as fp:
    data = json.load(fp)
    print(pprint.pformat(data))

config:

[{
    "name": "prettyprint",
    "exec": {
        "path": "file://python2.7:python",
        "args": "prettyprint.py"
    },
    "devices": [
        {"name": "python2.7"},
        {"name": "stdout"},
        {"name": "input", "path": "{.object_path}"},
        {"name": "image", "path": "swift://~/example/prettyprint.tar"}
    ]
}]

Upload the test data:

$ swift post example  # creates the container, if it doesn't exist already
$ swift upload example data.json

Bundle and upload the application:

$ tar cf prettyprint.tar prettyprint.py
$ swift upload example prettyprint.tar

Upload the configuration to a .zvm container:

$ swift post .zvm  # creates the container, if it doesn't exist already
$ swift upload .zvm config --object-name=application/json/config

Now submit a GET request for the file, and it will be processed by the prettyprint application. Setting the X-Zerovm-Execute header to open/1.0 is required to make this work. (Without this header you’ll just get the raw file, unprocessed.)

Using curl:

$ curl -i -X GET $OS_STORAGE_URL/example/data.json \
  -H "X-Zerovm-Execute: open/1.0" -H "X-Auth-Token: $OS_AUTH_TOKEN"

Using a Python script:

import os
import requests

storage_url = os.environ.get('OS_STORAGE_URL')
headers = {
    'X-Zerovm-Execute': 'open/1.0',
    'X-Auth-Token': os.environ.get('OS_AUTH_TOKEN'),
}

response = requests.get(storage_url + '/example/data.json',
                        headers=headers)
print(response.content)
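
To get a feel for what the processed response should resemble, the formatting step of prettyprint.py can be run locally on the same data. This is a sketch for a workstation interpreter; prettify is a hypothetical name, and the exact layout of the output depends on the Python version, so none is shown here.

```python
import io
import json
import pprint

# The contents of data.json, split across lines for readability.
RAW = (
    u'{"type": "GeometryCollection", "geometries": ['
    u'{"type": "Point", "coordinates": [100.0, 0.0]}, '
    u'{"type": "LineString", "coordinates": [[101.0, 0.0], [102.0, 1.0]]}]}'
)


def prettify(fp):
    """Same steps as prettyprint.py, for any file-like object."""
    return pprint.pformat(json.load(fp))


print(prettify(io.StringIO(RAW)))
```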

3.6. MapReduce application

This example is a parallel wordcount application, constructed to utilize the MapReduce features of ZeroCloud.

3.6.1. Create the project directory

Create a directory for the project files. For example:

$ mkdir ~/mapreduce

Then change into that directory:

$ cd ~/mapreduce

3.6.2. Create Swift containers

We need to create two containers in Swift: one to hold our application data, and one to hold the application itself.

$ swift post mapreduce-data
$ swift post mapreduce-app

3.6.3. Upload sample data

Create a couple of text files and upload them into the mapreduce-data container. You can use the samples below, or any text you like.

mrdata1.txt:

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Ut diam sapien,
dictum eleifend erat in, luctus pellentesque est. Aliquam diam est,
tincidunt ac bibendum non, vehicula ut enim. Sed vitae mi orci. Nam
scelerisque diam ut orci iaculis dictum. Fusce consectetur consectetur
risus ut porttitor. In accumsan mi ut velit venenatis tincidunt. Duis id
dapibus velit, nec semper odio.  Quisque auctor massa vitae vulputate
venenatis. Pellentesque velit eros, pretium in hendrerit a, viverra vitae
neque. Vivamus mattis vehicula lectus vel fringilla. Curabitur sem urna,
condimentum nec lectus non, tristique elementum sapien. Quisque luctus
ultrices ante sed dignissim. Integer non commodo enim, quis semper diam.

mrdata2.txt:

Curabitur pulvinar diam eros, eget varius justo hendrerit sed. Maecenas
hendrerit aliquam libero id mollis. Donec semper sapien tellus, sed
elementum dolor ornare eu. Vestibulum lacinia mauris quis ipsum porta, ut
lobortis sapien consectetur. Sed quis pretium justo, mattis aliquet nisl.
Donec vitae elementum lectus. Morbi fringilla augue non elit pulvinar, non
fermentum quam eleifend. Integer ac sodales lorem, a iaculis sapien.
Phasellus vel sodales lorem. Integer consequat varius mi in pretium.
Aliquam iaculis viverra vestibulum. Ut ut arcu sed orci malesuada pulvinar
sit amet sed felis. Nullam eget laoreet urna. Sed eu dapibus quam. Nulla
facilisi. Aenean non ornare lorem.

mrdata3.txt:

Vivamus lacinia tempor massa at molestie. Aenean non erat leo. Curabitur
magna diam, ultrices quis eros quis, ornare vehicula turpis. Donec
imperdiet et mi id vestibulum. Nullam tincidunt interdum tincidunt. Nullam
eleifend vel mauris in bibendum. Maecenas molestie est ac rhoncus
elementum. Duis imperdiet hendrerit congue. Quisque facilisis neque a
semper egestas. Vestibulum nec lacus diam.  Nam vitae volutpat lacus.
Donec sodales dui est, ac malesuada arcu sodales vitae.

Upload the files:

$ swift upload mapreduce-data mrdata1.txt
$ swift upload mapreduce-data mrdata2.txt
$ swift upload mapreduce-data mrdata3.txt

3.6.4. Add zapp.yaml

Add a ZeroVM application configuration template:

$ zpm new

This will create a zapp.yaml file in the current directory. Open zapp.yaml in your favorite text editor.

First, give the application a name by changing the name field to mapreduce (the bundle created in section 3.6.6 is called mapreduce.zapp).

Next, change the execution section

execution:
  groups:
    - name: ""
      path: file://python2.7:python
      args: ""
      devices:
      - name: python2.7
      - name: stdout

to look like this:

execution:
  groups:
    - name: "map"
      path: file://python2.7:python
      args: "mapper.py"
      devices:
      - name: python2.7
      - name: stdout
      - name: input
        path: "swift://~/mapreduce-data/*.txt"
      connect: ["reduce"]
    - name: "reduce"
      path: file://python2.7:python
      args: "reducer.py"
      devices:
      - name: python2.7
      - name: stdout

Note

The connect directive enables communication from the first execution group to the second. This creates a data pipeline where the results from the map execution, run on each text file in the mapreduce-data container, can be piped to the reduce part and combined into a single result.

We also need to update the bundling section

bundling: []

to include two Python source code files (which we will create below):

bundling: ["mapper.py", "reducer.py"]

3.6.5. The code

Now let’s write the code that will do our MapReduce wordcount.

mapper.py:

import os

# Word count:
with open('/dev/input') as fp:
    data = fp.read()

with open('/dev/out/reduce', 'a') as fp:
    path_info = os.environ['LOCAL_PATH_INFO']

    # Split off the swift prefix
    # Just show the container/file
    shorter = '/'.join(path_info.split('/')[2:])
    # Pipe the output to the reducer:
    print >>fp, '%d %s' % (len(data.split()), shorter)

reducer.py:

import os
import math

inp_dir = '/dev/in'

total = 0
max_count = 0

data = []

for inp_file in os.listdir(inp_dir):
    with open(os.path.join(inp_dir, inp_file)) as fp:
        for line in fp:
            count, filename = line.split()
            count = int(count)
            if count > max_count:
                max_count = count
            data.append((count, filename))
            total += count

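# Column width: digits in the largest count plus one leading space.
# max(..., 1) avoids a math domain error when every input file is empty.
fmt = '%%%sd %%s' % (int(math.log10(max(max_count, 1))) + 2)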

for count, filename in data:
    print fmt % (count, filename)
print fmt % (total, 'total')
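
The logic of the two scripts can be exercised locally, without ZeroVM, by replacing the device files with plain function arguments. This is a sketch under assumptions: map_count and reduce_counts are hypothetical names, the LOCAL_PATH_INFO values are made up (an account component followed by container and object), and a guard for empty input is included in the width calculation.

```python
import math


def map_count(text, path_info):
    """Mapper step: one 'count container/object' line per input."""
    # Drop the leading account component, as mapper.py does.
    shorter = '/'.join(path_info.split('/')[2:])
    return '%d %s' % (len(text.split()), shorter)


def reduce_counts(lines):
    """Reducer step: right-align the counts and append the total."""
    data = []
    for line in lines:
        count, filename = line.split()
        data.append((int(count), filename))
    total = sum(count for count, _ in data)
    max_count = max([count for count, _ in data] or [0])
    # Width: digits in the largest per-file count plus one leading space.
    width = int(math.log10(max(max_count, 1))) + 2
    fmt = '%%%dd %%s' % width
    return [fmt % row for row in data + [(total, 'total')]]


# Hypothetical LOCAL_PATH_INFO values; the account name is made up.
inputs = {
    '/AUTH_example/mapreduce-data/a.txt': 'one two three',
    '/AUTH_example/mapreduce-data/b.txt': 'four five',
}
mapped = [map_count(text, path) for path, text in sorted(inputs.items())]
for line in reduce_counts(mapped):
    print(line)
```

Here the mapper output plays the role of the lines written to /dev/out/reduce, and the list passed to reduce_counts stands in for the files the reducer finds under /dev/in.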

3.6.6. Bundle, deploy, and execute

Bundle:

$ zpm bundle
created mapreduce.zapp

Deploy:

$ zpm deploy mapreduce-app mapreduce.zapp

Execute:

$ zpm execute mapreduce.zapp --container mapreduce-app
 104 mapreduce-data/mrdata1.txt
 101 mapreduce-data/mrdata2.txt
  69 mapreduce-data/mrdata3.txt
 274 total