OpenSocial in the Cloud
(OpenSocial API v0.8)
Jason Cooper and Lane LiaBraaten, Google Developer Programs
September 2008
Intro
Some OpenSocial apps can be written entirely with client-side JavaScript and HTML, leveraging the container to serve the page and store application data. In this case, the app can scale effortlessly because the only request hitting your server is for the gadget specification which is typically cached by the container anyway.
However, there are lots of reasons to consider using your own server:
- Allows you to write code in the programming language of your choice.
- Puts you in control of how much application data you can store.
- Lets you combine data from users on multiple social networks.
- Enables interaction with the OpenSocial REST API.
Setting up an OpenSocial app that uses a third-party server is fairly simple. There are a few gotchas and caveats, but the real issues come up when your app becomes successful, serving millions of users and handling thousands of requests per second. Apps can grow especially fast on social networks, so before you launch your next social app, you should think about how to scale up quickly if your app takes off.
Unfortunately, scaling is a complex problem that's hard to solve quickly and expensive to implement. Luckily, there are several companies that provide cloud computing resources—places you can store data or run processes on virtual machines. These computing solutions manage huge infrastructures so you can focus on your applications and let the "cloud" handle all the requests and data at scale.
This tutorial focuses on a simple photo-sharing app that uses a third-party server to host photos and associated metadata. If this app is going to host millions of images and support many requests per second, we won't be able to run it on a single dedicated host. We'll break the app down and analyze the interactions between the OpenSocial App and the back end server. Then we'll implement the app in the cloud, first using Google App Engine, then leveraging Amazon's S3 data storage service. Finally, we'll look at several ways to reduce the amount of network traffic the app generates, making the app faster and less expensive to host in the cloud.
The Photo Pier App
Photo Pier is a photo-sharing app where users can upload photos and tag them with keywords. Users can also view the photos that their friends have uploaded and select their favorite photos to display on their profile.
To enable this functionality, we'll need to store some data for each photo: a reference to the user who uploaded the photo (referred to hereafter as its owner), a unique name, and a collection of string-based tags or labels that the owner has set for the photo.
Software Requirements
You'll need a few resources to complete this tutorial:
- Python 2.5
- Google App Engine SDK and account
- OpenSocial in the Cloud resource bundle
- Amazon S3 account
Moving to the Cloud
If this app grows to serve millions of users and photos, shared hosting or even a dedicated server won't have the bandwidth or CPU cycles to handle all of the requests. We could invest in more servers and network infrastructure, shard the database, and load-balance requests, but that takes time, money, and expertise. If you'd rather work on the new features of the app, it's time to move into the cloud.
It's important to focus on the interactions between the app and your server when designing an application that will run in the cloud. If we standardize the communication protocol and data format, we can easily change the server side implementation without modifying the OpenSocial app.
Secure Communication from the app to the server
Before we look at the specific requests and responses flowing between app and server, let's cover how OpenSocial enables this communication to be secure. With a standard makeRequest call, anyone can send a request to our server with the appropriate parameters and tag images inappropriately or even upload images into another user's profile. This could lead to an embarrassing situation for your users and even bigger trouble for you and your app. Luckily, OpenSocial provides a mechanism for preventing this type of malicious behavior.
You can configure the makeRequest method to digitally sign the requests your app makes to your server using OAuth's algorithm for parameter signing. This means that when your server receives a request, it can verify that the request came from your application hosted in a specific container. To implement this, the calls to makeRequest in the OpenSocial app spec XML specify that the request should be signed, and the code that handles requests on the server side verifies that a signature is included and valid.
The change in the app is pretty small—just add a parameter that tells the container to use SIGNED authorization:
var params = {};
params[gadgets.io.RequestParameters.CONTENT_TYPE] = gadgets.io.ContentType.JSON;
params[gadgets.io.RequestParameters.AUTHORIZATION] = gadgets.io.AuthorizationType.SIGNED;
var url = buildUrl(this.request_base_url,
                   ['photos'], [new Date().getTime()]);
gadgets.io.makeRequest(url,
                       bind(this.closeFetchOwnerPhotos(callback), this),
                       params);
When our server receives a request, we can verify that it came from our application by checking that the digital signature was signed by a valid container and that the application ID is correct. You can find code to do this in the opensocial-resources wiki.
Note that when signing a request, the container will add several request parameters including the ID of the container as well as the ID of the person that is using the app. The Photo Pier back-end leverages these parameters to know which user to associate each photo with.
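To make the signing step more concrete, here is a minimal sketch of how the text that gets signed (the OAuth 1.0 "signature base string") is assembled from the HTTP method, the URL without its query string, and the sorted request parameters. It is written for a current Python 3 interpreter rather than the Python 2.5 runtime used later in this tutorial, and the helper names percent_encode and signature_base_string, as well as the example URL and parameter values, are our own illustrations, not part of any library.

```python
from urllib.parse import quote


def percent_encode(value):
    """Percent-encode per OAuth 1.0: only letters, digits, '-', '.', '_', '~' stay literal."""
    return quote(str(value), safe='~')


def signature_base_string(method, url, params):
    """Return METHOD&url&params, the text that the container signs."""
    # The signature itself is never part of the signed text.
    pairs = sorted(
        (percent_encode(key), percent_encode(value))
        for key, value in params.items()
        if key != 'oauth_signature'
    )
    normalized = '&'.join('%s=%s' % pair for pair in pairs)
    return '&'.join([method.upper(), percent_encode(url), percent_encode(normalized)])


# Hypothetical request: the URL is passed without its query string,
# and the query parameters are passed separately.
example = signature_base_string(
    'get',
    'http://example.com/photos',
    {'opensocial_owner_id': '0149', 'arg0': '1220000000', 'oauth_signature': 'ignored'},
)
print(example)
```

Because the parameters are sorted and encoded the same way on both sides, the server can rebuild exactly the same string the container signed, which is what makes the later comparison possible.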
Interactions between the app and the server
Based on the functionality described above, the following actions will result in the client-side app making a request to the server:
- Upload a new photo
- Fetch images that belong to the current user
- Fetch images that belong to the current user's friends
- Add a tag to a photo
- Fetch all tags that the current user has added
Uploading a new photo
To upload an image, the app creates a form that, when submitted, sends an HTTP POST request with the binary data and several additional parameters (don't worry, the form submission handles the encoding for you—all you have to do is specify an end-point).
http://<base_url>/photo
Note: <base_url> is used throughout this tutorial to denote the URL prefix, which is a constant. All request URLs begin with this constant; only the end-point and the query string parameters change.
When the server receives this request, it stores the image file and some metadata (such as the ID of the owner) in the datastore. An HTML response is expected with the text "Photo added" and an <img> tag configured to display the newly uploaded photo.
Fetching the current user's photos
When the user first loads the canvas page, the app needs to send a request to the server to get the list of photos to display. The app uses the gadgets.io.makeRequest function to send an HTTP GET request to the following URL:
http://<base_url>/photos?arg0=<TIMESTAMP>
When the server receives this request, it uses the oauth_consumer_key and opensocial_owner_id request parameters to locate the photos previously uploaded by the current user. For each photo, the server will return a URL and a list of tags for the photo.
The response to this request is a JSON string in the following format:
{"resultsSet":[ { "url":"http://foo", "tags":["Aruba", "snorkeling"] },
                { "url":"http://bar", "tags":["wedding", "cake"] } ]
}
Fetching photos for a user's friends
When the user clicks on the "Friends' Photos" tab, the app should display the photos that each of the user's friends has uploaded. Since our server isn't storing any relationship data, the app will need to send us a list of user IDs so we can fetch the appropriate photos. The fetchFriendPhotos method uses the makeRequest method to send an HTTP POST request with the list of IDs included as post data. This POST request will go to the following URL:
http://<base_url>/photos?arg0=<TIMESTAMP>
Notice that this is the same URL as above. The server will treat this request differently because the HTTP method is POST instead of GET. The post data in the request will be in the following format:
people=01495306580392390900,14088281537290874435
When our server receives the request it will parse this data and fetch the photos for each of the IDs provided by the request. The response is a JSON string in the following format:
{"resultsCollection":[
  { "name" : "01495306580392390900",
    "photos" : [ { "url":"http://foo", "tags":["Aruba", "snorkeling", "fish"] },
                 { "url":"http://bar", "tags":["snorkeling", "shipwreck"] } ] },
{ "name" : "14088281537290874435",
"photos" : [ { "url":"http://baz", "tags":["food", "pasta", "linguini"] },
{ "url":"http://raz", "tags":["food", "dessert", "apple pie"] } ] } ]
}
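The request and response formats above can be tied together with a short sketch. It is written for Python 3 with only the standard library, and it uses a hypothetical in-memory dictionary (PHOTOS_BY_OWNER) in place of a real datastore; the function name friend_photos_response is also our own. Leaving out friends who have no photos is a choice made in this sketch.

```python
import json
from urllib.parse import parse_qs

# Hypothetical in-memory stand-in for the datastore: owner ID -> photo list.
PHOTOS_BY_OWNER = {
    '01495306580392390900': [
        {'url': 'http://foo', 'tags': ['Aruba', 'snorkeling']},
    ],
    '14088281537290874435': [],
}


def friend_photos_response(post_body):
    """Build the resultsCollection JSON for a 'people=id1,id2' POST body."""
    fields = parse_qs(post_body)  # also decodes %2C back into ','
    people_ids = fields.get('people', [''])[0].split(',')
    collection = []
    for person_id in people_ids:
        photos = PHOTOS_BY_OWNER.get(person_id, [])
        if photos:  # friends without photos are left out
            collection.append({'name': person_id, 'photos': photos})
    return json.dumps({'resultsCollection': collection})


body = 'people=01495306580392390900%2C14088281537290874435'
response = json.loads(friend_photos_response(body))
print(response)
```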
Adding a tag to a photo
To add a text tag to a photo, the app sends the text along with the extended name of the photo (this includes the owner's OpenSocial ID and the photo name set during upload) to the server. The app uses the gadgets.io.makeRequest function to send an HTTP POST request to the following URL:
http://<base_url>/photo/<EXTENDED_PHOTO_NAME>
The actual text of the tag is sent as post data. When the server receives this request, it parses the extended photo name and uses the components to locate the stored photo to be tagged. The tag is then associated with the photo in the datastore. If the tagging is successful, the plain-text response should read "Tag added!".
Fetching tags added by a user
Finally, we'll add a request for returning all tags that a user has added. This set of tags can be shown in a drop-down list to enable the user to tag related photos more easily. Once again, the application uses a gadgets.io.makeRequest function call to send a GET request, this time to the URL below.
http://<base_url>/tags
The response is a stringified JSON object that looks like this:
{"tags":["Aruba", "snorkeling", "fish", "shipwreck"]}
Google App Engine
At this point, you need to have all the software requirements installed on your development machine.
The App Engine project will contain several files and a few third-party libraries detailed below:
- app.yaml - contains metadata about the application.
- cloud.py - contains request handlers and application logic.
- main.py - contains an initialization routine and maps URI end-points to Python classes.
- modules.py - contains all front-end code used to render the UI and source the necessary JavaScript files.
- cloud.js - a set of client-side event handlers that send asynchronous requests to the App Engine back-end and display the response
- simplejson - a Python library for encoding/decoding JSON.
- prototype-1.6.0.2.js - a JavaScript library used to send asynchronous requests and parse JSON response.
- pycrypto - a general-purpose Python cryptography toolkit
- oauth.py - a Python library used for signature validation
The app.yaml file for our application is pretty simple:
application: datastore
version: 1
runtime: python
api_version: 1

handlers:
- url: /scripts
  static_dir: scripts
- url: /modules/.*
  script: modules.py
- url: /.*
  script: main.py
This file can be used to specify multiple handler scripts or locations for static content as we have done above. See Configuring an App for more details on this file.
Let's start coding our application. First we need to define handler classes for the various URLs that our app will be receiving requests on. The OpenSocial in the Cloud resource bundle contains the following skeleton of request handlers in cloud.py:
# cloud.py
import sys
sys.path.append('lib')

import re
import cgi
import urllib
import simplejson
from google.appengine.ext import webapp
from google.appengine.ext import db
from math import floor
from time import time

# Signature validation required libraries import
import hashlib
import oauth
from Crypto.PublicKey import RSA
from Crypto.Util import number

# Local port; change if another process is running on 8080
PORT = '8080'

class RootHandler(webapp.RequestHandler):
    def get(self):
        self.response.out.write("RootHandler received a GET request")

class TagsHandler(webapp.RequestHandler):
    def get(self):
        if not _isValidSignature(self):
            self.response.out.write('SIGNATURE INVALID')
            return
        self.response.out.write("TagsHandler received a GET request")

class PhotosHandler(webapp.RequestHandler):
    def get(self):
        if not _isValidSignature(self):
            self.response.out.write('SIGNATURE INVALID')
            return
        self.response.out.write("PhotosHandler received a GET request")

    def post(self):
        if not _isValidSignature(self):
            self.response.out.write('SIGNATURE INVALID')
            return
        self.response.out.write("PhotosHandler received a POST request")

class PhotoHandler(webapp.RequestHandler):
    def get(self):
        self.response.out.write("PhotoHandler received a GET request")

    def post(self):
        if not _isValidSignature(self):
            self.response.out.write('SIGNATURE INVALID')
            return
        self.response.out.write("PhotoHandler received a POST request")

def _isValidSignature(self):
    # Code lab hack:
    # If the container is 'appengine' (e.g. app is running on localhost), return True
    if self.request.get('oauth_consumer_key') == 'appengine':
        return True

    # Construct a RSA.pubkey object
    exponent = 65537
    public_key_str = """0x\
00b1e057678343866db89d7dec2518\
99261bf2f5e0d95f5d868f81d600c9\
a101c9e6da20606290228308551ed3\
acf9921421dcd01ef1de35dd3275cd\
4983c7be0be325ce8dfc3af6860f7a\
b0bf32742cd9fb2fcd1cd1756bbc40\
0b743f73acefb45d26694caf4f26b9\
765b9f65665245524de957e8c547c3\
58781fdfb68ec056d1"""
    public_key_long = long(public_key_str, 16)
    public_key = RSA.construct((public_key_long, exponent))

    # Rebuild the message hash locally
    oauth_request = oauth.OAuthRequest(http_method=self.request.method,
                                       http_url=self.request.url,
                                       parameters=self.request.params.mixed())
    message = '&'.join((oauth.escape(oauth_request.get_normalized_http_method()),
                        oauth.escape(oauth_request.get_normalized_http_url()),
                        oauth.escape(oauth_request.get_normalized_parameters()),))
    local_hash = hashlib.sha1(message).digest()

    # Apply the public key to the signature from the remote host
    sig = urllib.unquote(self.request.params.mixed()["oauth_signature"]).decode('base64')
    remote_hash = public_key.encrypt(sig, '')[0][-20:]

    # Verify that the locally-built value matches the value from the remote server.
    return local_hash==remote_hash
To keep this code nice and orderly, we place the end-point-to-class mapping in a separate file called main.py. The contents of this file should look like the following:
# main.py
import wsgiref.handlers
from google.appengine.ext import webapp
import cloud

def main():
    application = webapp.WSGIApplication([('/', cloud.RootHandler),
                                          ('/tags', cloud.TagsHandler),
                                          ('/photos', cloud.PhotosHandler),
                                          ('/photo/.*', cloud.PhotoHandler),
                                          ('/photo', cloud.PhotoHandler)],
                                         debug=True)
    # Seed the datastore on first run. _initializeDatastore is not shown in
    # this article; we assume it is defined in the resource bundle's main.py.
    user = cloud.User.get_by_key_name(''.join(['appengine', '00000000000000000000']))
    if not user:
        _initializeDatastore()
    # Start application
    wsgiref.handlers.CGIHandler().run(application)

if __name__ == "__main__":
    main()
Before moving forward, pause for a moment and look at the _isValidSignature function provided above. In production environments that support digitally signing requests, this snippet could be run to verify the authenticity of a request: it rebuilds the signed text from the request parameters in accordance with the OAuth specification, hashes it, and compares that digest with the one recovered from the signature sent by the container. If they are the same, the request is known to be genuine; otherwise, it is spoofed and your application should stop processing it immediately. However, since we'll be running this example locally and not in a production environment, we've added a small section at the top of the routine that simply returns True if the container is 'appengine' (our fake container for this sample).
Be sure to remove this section if you deploy this application to a social network that does generate digital signatures for its requests.
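The "apply the public key, then compare the last 20 bytes" step in _isValidSignature can be illustrated with plain integer arithmetic. The sketch below is for Python 3.8 or later and uses a toy key built from two well-known Mersenne primes plus a simplified padding scheme; both are assumptions made for illustration and are not how a container generates its real key or pads its signatures. The sign function exists only so the example has something to verify.

```python
import hashlib

# Toy key built from two well-known Mersenne primes. Illustration only:
# real keys use random primes and full PKCS #1 padding.
P = 2**127 - 1
Q = 2**521 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def sign(message):
    """Container side: pad the SHA-1 digest and apply the private key."""
    digest = hashlib.sha1(message).digest()
    padded = b'\x01' + b'\xff' * 8 + b'\x00' + digest  # simplified padding
    return pow(int.from_bytes(padded, 'big'), D, N)


def is_valid(message, signature):
    """Server side: apply the public key and compare the trailing 20 bytes."""
    recovered = pow(signature, E, N).to_bytes((N.bit_length() + 7) // 8, 'big')
    return recovered[-20:] == hashlib.sha1(message).digest()


base_string = b'GET&http%3A%2F%2Fexample.com%2Fphotos&arg0%3D1'
signature = sign(base_string)
print(is_valid(base_string, signature))
print(is_valid(b'tampered', signature))
```

The server never needs the private exponent: it only raises the signature to the public exponent and checks that the SHA-1 digest it computed locally appears at the end of the result.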
Now that we've got a simple app, we'll test it with the development web server. If you haven't already, download the SDK and uncompress it. From the google_appengine directory, run './dev_appserver.py <your_app_directory>'. Verify that you can access your app from a browser (the default URL will be http://localhost:8080/).
Data Model
Google App Engine uses an object model datastore instead of a relational database. This means you just need to define the data elements that your app will use as Python objects that inherit from the db.Model class.
The Photo Pier app will keep track of two types of objects in the datastore: users and photos.
class User(db.Model):
    container = db.StringProperty()    # the container this user came from
    containerId = db.StringProperty()  # the ID provided by the container for this user

class Photo(db.Model):
    name = db.StringProperty()         # a unique ID for the photo
    content = db.BlobProperty()        # the binary data of the image
    contentType = db.StringProperty()  # the type of image (e.g. .jpg, .gif, etc.)
    user = db.ReferenceProperty(User)  # a reference to the user that uploaded this image (like a foreign key)
    tags = db.StringListProperty()     # a list of tags for this photo
As you can see, Google App Engine supports many data types in the datastore. For a complete list, see the Types and Property Classes documentation.
Uploading an image
The following implementation of the PhotoHandler class defines a post method that will be invoked any time the app gets an HTTP POST request to the /photo end-point (as defined in the webapp.WSGIApplication constructor above). The post method first checks the container and personId parameters against the datastore to see if the user exists, and creates the user if not. Then the photo's binary data is read and stored as a blob in the datastore.
class PhotoHandler(webapp.RequestHandler):
    def get(self):
        self.response.out.write("PhotoHandler received a GET request")

    def post(self):
        form = cgi.FieldStorage()
        fileItem = form['file']
        personId = form.getfirst('personId')
        container = form.getfirst('container')
        self.response.headers['Content-Type'] = 'text/html'
        photo = createPhoto(container, personId, fileItem)
        if photo:
            self.response.out.write('Photo added.')
            self.response.out.write(''.join(['<img src="http://localhost:',
                                             PORT,
                                             '/photo/',
                                             container,
                                             ':',
                                             personId,
                                             ':',
                                             photo.name,
                                             '" width="50"/><br/>']))

def getUser(container, personId):
    # containerId is the property name declared on the User model
    user = User.get_or_insert(''.join([container, personId]), container=container, containerId=personId)
    return user

def createPhoto(container, personId, fileItem):
    user = getUser(container, personId)
    name = ''.join([str(int(time())), fileItem.filename])
    key = ''.join([container, personId, '_', name])
    photo = Photo(key_name=key)
    photo.user = user
    photo.name = name
    photo.content = db.Blob(fileItem.file.read())
    photo.contentType = fileItem.type
    photo.put()
    return photo
Notice the use of the getUser function above, which is called from createPhoto. It looks in App Engine's datastore for a user with the given container and ID. If a match is found, it is returned to the caller. Otherwise, a new User instance is created in the datastore and returned. App Engine makes this very easy with the Model class's get_or_insert method.
Note that this technique for uploading photos is not secure. Any server could send an HTTP POST request to our server with the appropriate parameters and upload a photo as any user in the system. Although it's outside the scope of this article, we could provide a mechanism for our OpenSocial app to request a one-time-use token that it would include in the request to upload a photo.
Fetching an individual's photo list
This implementation of the PhotosHandler class handles any HTTP GET requests with the get method. Again, it uses the parameters from the signed request to identify the user and fetches their photos via the getPhotos method. The photo information is written to the self.response.out object as a JSON string using the simplejson library's dumps method as demonstrated below.
class PhotosHandler(webapp.RequestHandler):
    def get(self):
        if not _isValidSignature(self):
            self.response.out.write('SIGNATURE INVALID')
            return
        personId = self.request.get('opensocial_owner_id')
        container = self.request.get('oauth_consumer_key')
        photos = getPhotos(container, personId)
        self.response.headers['Content-Type'] = 'text/plain'
        self.response.out.write(simplejson.dumps({'resultsSet': photos}))
The get method calls the getPhotos function, which in turn calls getPhotosForUser to query the datastore for all photos uploaded by the specified person. The relevant information from each returned Photo object is placed in a Python dictionary (a.k.a. an associative array or a map), and the resulting list of dictionaries can be "stringified" by the simplejson library.
def getPhotos(container, personId):
    retArray = []
    photos = getPhotosForUser(container, personId)
    if photos:
        for objt in photos:
            photo = {}
            photo['url'] = ''.join(['http://localhost:', PORT, '/photo/', container, ':', personId, ':', objt.name])
            photo['tags'] = objt.tags
            retArray.append(photo)
    return retArray

def getPhotosForUser(container, personId):
    user = getUser(container, personId)
    return db.GqlQuery("SELECT * FROM Photo WHERE user = :1", user)
Note how the getPhotosForUser function uses db.GqlQuery to fetch a collection of Photo objects from the datastore.
Fetching the photo lists of multiple individuals
Since the server doesn't store any relationship data, the PhotosHandler class checks the post data of the request for a list of IDs from the container. It then calls the getPhotos function for each ID and returns the aggregate result as a JSON string. This response differs slightly from the response returned for a single individual's photo list: instead of a single array being returned in the response, multiple arrays may be returned, one per person. See the sample response several sections up.
class PhotosHandler(webapp.RequestHandler):
    def post(self):
        form = cgi.FieldStorage()
        peopleIds = urllib.unquote(form.getfirst('people')).split(',')
        if not _isValidSignature(self):
            self.response.out.write('SIGNATURE INVALID')
            return
        personId = self.request.get('opensocial_owner_id')
        container = self.request.get('oauth_consumer_key')
        photoSetCollection = []
        for id in peopleIds:
            photoSet = {}
            photoSet['name'] = id
            photoSet['photos'] = getPhotos(container, id)
            if len(photoSet['photos']) > 0:
                photoSetCollection.append(photoSet)
        self.response.headers['Content-Type'] = 'text/plain'
        self.response.out.write(simplejson.dumps({'resultsCollection': photoSetCollection}))
Note that the post data is URL-encoded in the request so the post method uses urllib.unquote before splitting the comma-separated list of person IDs.
Adding a tag to a photo
To add a tag to a photo, the PhotoHandler class queries the datastore for a Photo with the given name and owner (the name and owner being passed into the request via the URL—i.e. .../photo/<CONTAINER>:<ID>:<PHOTO_NAME>). This format makes it fairly easy to find the appropriate Photo object in the datastore. Once found, the Photo object is updated with the new tag.
You may recall that PhotoHandler handles another type of POST request—photo uploading. We can distinguish between the types of POST requests by inspecting the parameters passed along with the request. An upload request, which is sent from a form, will have a 'file' member. If this member is present, we can proceed with the upload. Otherwise, we process the request as a tag post.
class PhotoHandler(webapp.RequestHandler):
    def post(self):
        form = cgi.FieldStorage()
        if 'file' not in form:
            # No file attached, so treat the request as a tag post
            if not _isValidSignature(self):
                self.response.out.write('SIGNATURE INVALID')
                return
            personId = self.request.get('opensocial_owner_id')
            container = self.request.get('oauth_consumer_key')
            textTag = None
            if form.getfirst('text'):
                textTag = urllib.unquote(form.getfirst('text'))
            match = re.search(r'^http://.*/photo/([\w\.]*?):([\w\.]*?):([\w\.]*)', urllib.unquote(self.request.uri))
            if match:
                photo = getPhoto(match.group(1), match.group(2), match.group(3))
                if photo:
                    self.response.headers['Content-Type'] = 'text/plain'
                    if textTag:
                        addTextTagToPhoto(photo, textTag)
                        self.response.out.write('Tag added!')
        else:
            # See file upload code above
            pass

def addTextTagToPhoto(photo, textTag):
    photo.tags.append(textTag)
    photo.put()
Notice that we've added a new conditional block at the very beginning of the method. If the form object does have a 'file' key, we know that the request came from the upload form and we store the post data in the datastore as a Photo object. Otherwise, we parse the URI to get the photo name and owner credentials, retrieve the photo, and call the addTextTagToPhoto function to update the Photo object in the datastore.
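As an alternative to the regular expression, the extended photo name can be split on its colons. The sketch below is for Python 3 and is only an illustration: the function name parse_extended_photo_name and the example URLs are our own. Unlike the character class used in the regular expression above, this version also accepts photo names that contain hyphens or spaces.

```python
from urllib.parse import unquote, urlparse


def parse_extended_photo_name(url):
    """Return (container, owner_id, photo_name) from a /photo/<name> URL, or None."""
    path = unquote(urlparse(url).path)
    prefix = '/photo/'
    if not path.startswith(prefix):
        return None
    # Split on the first two colons only, so the photo name may contain ':'.
    parts = path[len(prefix):].split(':', 2)
    if len(parts) != 3 or not all(parts):
        return None
    return tuple(parts)


parsed = parse_extended_photo_name(
    'http://localhost:8080/photo/appengine:00000000000000000000:1220000000beach.jpg'
)
print(parsed)
```

Returning None for anything that does not have all three components lets the handler reject malformed URLs before it touches the datastore.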
Fetching an individual's tags
This request is fairly straightforward. When the request's end-point is /tags, the TagsHandler class kicks in and, with the help of the getTagsForUser function, fetches the user's photos, collecting all tags in a list. It then removes the duplicates from this list and returns the resulting list as a stringified JSON object using the simplejson library.
class TagsHandler(webapp.RequestHandler):
    def get(self):
        if not _isValidSignature(self):
            self.response.out.write('SIGNATURE INVALID')
            return
        personId = self.request.get('opensocial_owner_id')
        container = self.request.get('oauth_consumer_key')
        tags = getTagsForUser(container, personId)
        self.response.headers['Content-Type'] = 'text/plain'
        self.response.out.write(simplejson.dumps({'tags': list(set(tags))}))

def getTagsForUser(container, personId):
    tags = []
    photos = getPhotosForUser(container, personId)
    if photos:
        for objt in photos:
            for tag in objt.tags:
                tags.append(tag)
    return tags
list(set(tags)) is a convenient and efficient way to remove duplicates from the original list.
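One caveat is that a set does not remember insertion order, so the tags may come back in a different order than they were added. If the drop-down list should show tags in first-used order, a small helper like the following could be used instead. This is a sketch for Python 3, and the name unique_in_order is hypothetical.

```python
def unique_in_order(tags):
    """Drop duplicate tags while keeping first-seen order."""
    seen = set()
    ordered = []
    for tag in tags:
        if tag not in seen:
            seen.add(tag)
            ordered.append(tag)
    return ordered


tags = unique_in_order(['fish', 'Aruba', 'fish', 'shipwreck', 'Aruba'])
print(tags)
```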
Publishing the app
Up to this point, we've been using the development app server, but in order for an OpenSocial container like orkut or MySpace to access your Google App Engine application, the app needs to be hosted publicly. From the My Applications page, create a new application—you probably want to use something generic, like username-dev, since you can only create 3 apps with App Engine currently. Now update the app.yaml file to include this application name.
From the google_appengine directory, run './appcfg.py update <your_app_directory>' to publish your app. Make sure you can access the application at http://your_app_name.appspot.com/ from your browser.
Amazon S3
S3 is a web service provided by Amazon for file/data storage. Conveniently for our needs, it can store any file between one byte and five gigabytes in size and, because it is a cloud service, it is for practical purposes highly scalable. You can upload as many files as you'd like as fast as you'd like. Of course, this comes at a cost, but it's still significantly cheaper (not to mention far more convenient) than renting out space in a data center.
S3 is a REST-based service, meaning that you can use it in any HTTP-aware development environment. Cooler still, there are many open source client libraries available that make it a cinch to interact with S3 in whichever language you're most comfortable with. This article demonstrates Python only, but you can easily port this sample code to PHP, Java, Ruby, etc. and be reasonably confident that an S3 client library is available for the language you choose. In the following section, we will use one such library to transform our sample above: rather than storing image binaries and text tags in the datastore, we will store the actual files and metadata in S3.
In order to continue through this section, you will need to register as an S3 developer so that you can substitute your personal access and secret keys, which are needed by the library. Registration is easy enough. Once you have your keys, add the following to the top of cloud.py:
# Amazon AWS S3 import
import S3

# Amazon AWS parameters
AWS_ACCESS_KEY_ID = '<YOUR_ACCESS_KEY>'
AWS_SECRET_ACCESS_KEY = '<YOUR_SECRET_KEY>'
BUCKET_NAME = ''.join([AWS_ACCESS_KEY_ID.lower(), '.cloud'])
Very few of the Python classes defined above need to change. Instead, we will re-implement the helper functions to post and fetch data from S3 instead of the datastore.
Let's start with the createPhoto function, which, as you may recall, was used to post the image binary to the App Engine datastore. We'll reimplement it here to upload to S3 instead using the S3 library that we imported above.
def createPhoto(container, personId, fileItem):
    user = getUser(container, personId)
    name = ''.join([str(int(floor(time()))), fileItem.filename])
    key = ''.join([container, personId, '_', name])
    headers = {
        'x-amz-acl':'public-read',
        'Content-Type': fileItem.type,
        'x-amz-meta-tags': ''
    }
    conn = S3.AWSAuthConnection(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, True)
    conn.put(BUCKET_NAME, key, fileItem.file.read(), headers)
    return {'name': name}
getUser doesn't change since we want to continue to manage user information in App Engine's datastore. Notice that a new dictionary object is defined with the headers that we want to set (the last being the header that we'll eventually use to store tags with the image file). After that, it's just a matter of opening a connection to the S3 service by calling the library's AWSAuthConnection constructor and using it to "put" a new file, effectively uploading it to the indicated bucket.
Next, we want to be able to get the photo information out of S3. We'll modify getPhotosForUser to do this:
def getPhotosForUser(container, personId):
    retArray = []
    conn = S3.AWSAuthConnection(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, True)
    listResponse = conn.list_bucket(BUCKET_NAME, {'prefix': ''.join([container, personId, '_'])})
    if listResponse.entries:
        for entry in listResponse.entries:
            match = re.search(r'^.*?_(.*)$', entry.key)
            if match:
                getResponse = conn.get(BUCKET_NAME, entry.key)
                if getResponse.http_response.status_code==200:
                    photo = {
                        'name': match.group(1),
                        'tags': []
                    }
                    if 'tags' in getResponse.object.metadata:
                        tags = getResponse.object.metadata['tags']
                        # An empty header means no tags; ''.split('|') would give ['']
                        if tags:
                            photo['tags'] = tags.split('|')
                    retArray.append(photo)
    return retArray
After opening up a connection to the service, this code fetches all objects in the bucket that belong to the user (because we're prefixing the container and ID to the file name before we upload, we're able to easily query the service for a given user's photos by asking it to return only those that match a given prefix, just as we have here). Once all of the photos are available, a request is issued for each in order to get the content-type and metadata (tags) associated with each individually. This information is put into a list and returned.
addTextTagToPhoto becomes a little larger since it has to retrieve the photo from the service, set the appropriate header, and then "put" the object back. The code for this is printed below.
def addTextTagToPhoto(photo, textTag):
    headers = {
        'Content-Type': photo['contentType']
    }
    if photo['tags'] != '':
        headers['x-amz-meta-tags'] = ''.join([photo['tags'], '|', textTag])
    else:
        headers['x-amz-meta-tags'] = textTag
    conn = S3.AWSAuthConnection(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, True)
    conn.put(BUCKET_NAME, photo['key'], photo['content'], headers)
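Since the whole tag list travels in a single x-amz-meta-tags header joined with '|', it can help to isolate the encoding in two small helpers. The sketch below is for Python 3 and makes no S3 calls; decode_tags and append_tag are hypothetical names, and rejecting tags that contain the separator is an extra safeguard we are assuming here, not something the code above does.

```python
TAG_SEPARATOR = '|'


def decode_tags(header_value):
    """Turn the x-amz-meta-tags value into a list; '' means no tags."""
    return header_value.split(TAG_SEPARATOR) if header_value else []


def append_tag(header_value, new_tag):
    """Return the header value with new_tag added at the end."""
    if TAG_SEPARATOR in new_tag:
        raise ValueError('tag must not contain %r' % TAG_SEPARATOR)
    return TAG_SEPARATOR.join(decode_tags(header_value) + [new_tag])


value = append_tag('', 'Aruba')
value = append_tag(value, 'snorkeling')
print(value)
print(decode_tags(value))
```

Treating the empty string as "no tags" matters because ''.split('|') returns a list containing one empty string rather than an empty list.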
The only other function that needs to be changed substantially is getPhoto which fetches the data from S3 and returns the content to the browser (after specifying the appropriate content-type, of course).
def getPhoto(container, personId, photoName):
    key = ''.join([container, personId, '_', photoName])
    conn = S3.AWSAuthConnection(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, True)
    response = conn.get(BUCKET_NAME, key)
    photo = None
    if response.http_response.status_code==200:
        if 'content-type' in response.http_response.headers:
            photo = {
                'contentType': response.http_response.headers['content-type'],
                'content': response.http_response.content,
                'name': photoName,
                'key': key
            }
            photo['tags'] = response.object.metadata.get('tags', '')
    return photo
Once you replace the functions above with these and make a few additional changes, chiefly changing object attribute references such as photo.name to dictionary lookups such as photo['name'], you've completed the transition from a Google App Engine storage back end to S3.
Optimizations
A common misconception when coding in the cloud is that storage space, CPU cycles, and bandwidth are unlimited. While the cloud hosting provider can, in theory, provide all the resources your app needs, hosting in the cloud ain't free so these resources are limited by your budget. Luckily, OpenSocial provides several mechanisms to cache images and data that will reduce the load on your server.
Using getProxyUrl
Consider the amount of traffic required to render the "Friends' Photos" tab. Assuming the average user has 10 friends with the app, and each has uploaded 20 photos (at 500KB each), rendering this page will request 100MB from your server. If this app gets popular and this tab gets 10,000 views a day, you're looking at 1TB of traffic, just for this one tab!
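The estimate above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the constants are the figures from the paragraph, the function name is made up for illustration, and decimal units (1 MB = 1,000 KB, 1 TB = 1,000,000 MB) are assumed.

```python
# Figures taken from the scenario described above.
FRIENDS_WITH_APP = 10
PHOTOS_PER_FRIEND = 20
PHOTO_SIZE_KB = 500
VIEWS_PER_DAY = 10000


def friends_tab_traffic():
    """Return (MB served per view, TB served per day) for the tab."""
    kb_per_view = FRIENDS_WITH_APP * PHOTOS_PER_FRIEND * PHOTO_SIZE_KB
    mb_per_view = kb_per_view / 1000.0
    tb_per_day = mb_per_view * VIEWS_PER_DAY / 1000.0 / 1000.0
    return mb_per_view, tb_per_day


mb_per_view, tb_per_day = friends_tab_traffic()
print('%.0f MB per view, %.1f TB per day' % (mb_per_view, tb_per_day))
```

Changing any one constant shows how quickly the daily total moves; halving the photo size, for example, halves the bandwidth bill for this tab.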
The gadgets infrastructure is designed to aggressively cache data to reduce the load on your server, but you have to tell it what to cache. This is as simple as using the gadgets.io.getProxyUrl method to fetch the URL of the cached image and using that URL in the HTML of your app.
function showImage() {
  var imgUrl = 'http://www.example.com/i_heart_apis_sm.png';
  // Ask the container for a proxied, cacheable URL for the image.
  var cachedUrl = gadgets.io.getProxyUrl(imgUrl);
  var html = ['<img src="', cachedUrl, '">'];
  document.getElementById('dom_handle').innerHTML = html.join('');
}
showImage();
This will greatly reduce the bandwidth you use to serve images, because the majority of requests will go to the cached URLs. To get the most benefit from caching, be sure to set the cache control headers appropriately for your content. For more information on caching, see the OpenSocial Latency Combat Field Manual.
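As a rough illustration of what setting cache control headers might look like on the server side, the sketch below builds a pair of standard HTTP caching headers for an image response. The cache_headers helper and the one-day lifetime are assumptions made for this example, not part of the app's source; how you attach the headers to a response depends on the framework you use.

```python
import time
from email.utils import formatdate


def cache_headers(max_age_seconds, now=None):
    """Build HTTP headers that let proxies cache a response.

    max_age_seconds: how long the content may be reused.
    now: current time as a Unix timestamp (defaults to time.time()).
    """
    if now is None:
        now = time.time()
    return {
        'Cache-Control': 'public, max-age=%d' % max_age_seconds,
        # Expires carries the same deadline as an HTTP date.
        'Expires': formatdate(now + max_age_seconds, usegmt=True),
    }


# Example: allow photos to be cached for one day (an arbitrary choice).
headers = cache_headers(24 * 60 * 60)
print(headers['Cache-Control'])
```

Photos that never change after upload can usually tolerate a long lifetime, while frequently updated content calls for a shorter one.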
Caching data to render the profile page
Profile pages make up the lion's share of application renders on OpenSocial container sites. If Photo Pier begins to get popular, say 100,000 users each with about 10 profile views a day, the app will be sending a million requests per day (over 11 requests per second) just to get the URLs of photos to display on the profile.
One technique for reducing traffic to your server is to use OpenSocial's Persistence API to store the data you need to render the profile view. Then your app doesn't need to contact your server at all to render the profile view.
In the case of Photo Pier, we're requesting a list of image URLs to include in the profile slideshow. Rather than storing this data in our database, we can store it in the Persistence API. When a user selects the photos to show in their slideshow, we can store this information in the container:
updateFavoritesData: function(value, photoUrl) {
  var req = opensocial.newDataRequest();
  if (value == true) {
    // The photo was selected: add it to the slideshow set.
    this.profilePhotoSet.push(photoUrl);
  } else {
    // The photo was deselected: remove it if it is present.
    var index = this.profilePhotoSet.indexOf(photoUrl);
    if (index != -1) {
      this.profilePhotoSet.splice(index, 1);
    }
  }
  // Store the whole set as a JSON string in the viewer's app data.
  req.add(req.newUpdatePersonAppDataRequest(opensocial.IdSpec.PersonId.VIEWER,
      'favoritePhotos',
      gadgets.json.stringify(this.profilePhotoSet)));
  req.send();
},
Then when we render the profile view, we just request this piece of data from the container, not our server.
fetchOpenSocialData: function() {
  var req = opensocial.newDataRequest();
  var ownerIdSpec = opensocial.newIdSpec({'userId':'OWNER', 'groupId':'SELF'});
  req.add(req.newFetchPersonRequest(opensocial.IdSpec.PersonId.OWNER), 'owner');
  req.add(req.newFetchPersonAppDataRequest(ownerIdSpec, 'favoritePhotos'), 'profilePhotoUrls');
  // Wrap the callback so that `this` still refers to the app object.
  var self = this;
  req.send(function(resp) { self.closeFetchOpenSocialData(resp); });
},
closeFetchOpenSocialData: function(resp) {
  var ownerResp = resp.get('owner');
  var photoUrlsResp = resp.get('profilePhotoUrls');
  if (!ownerResp.hadError() && !photoUrlsResp.hadError()) {
    // App data is keyed by person ID, so look up the owner's entry.
    this.owner = ownerResp.getData();
    var ownerData = photoUrlsResp.getData()[this.owner.getId()];
    if (ownerData) {
      this.profilePhotoSet = gadgets.json.parse(gadgets.util.unescapeString(ownerData['favoritePhotos']));
      /* Now that we've fetched the photo URLs, render them */
    }
  }
},
In addition to reducing traffic to your server, this technique is also fast: requesting data from the Persistence API is much quicker than making the round trip to your server.
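The profile code calls gadgets.util.unescapeString before gadgets.json.parse, which suggests that the stored JSON string comes back from the container HTML-escaped. The sketch below simulates that round trip in Python to show why the order of operations matters. It is an illustration only: store and load are hypothetical stand-ins for the container's behavior, and html.escape is assumed to approximate the escaping a container applies.

```python
import html
import json


def store(photo_urls):
    """Serialize the list to JSON, then escape it as a container might."""
    return html.escape(json.dumps(photo_urls))


def load(stored):
    """Undo the escaping first, then parse the JSON."""
    return json.loads(html.unescape(stored))


urls = ['http://www.example.com/a.png', 'http://www.example.com/b.png']
stored = store(urls)
restored = load(stored)

# The escaped form is not valid JSON because its quotes were replaced.
try:
    json.loads(stored)
    parsed_without_unescape = True
except ValueError:
    parsed_without_unescape = False

print(restored == urls, parsed_without_unescape)
```

Parsing the escaped string directly fails, while unescaping first recovers the original list of URLs.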
Resources
As you start coding your app in the cloud, you'll no doubt have some questions. Here are some resources to get you started.
- Source code
- Reference
- Articles
- Ask questions
