Google Cloud 101: Cloud Functions

You’ve got some data in your project, but it’s not much use to you if you can’t do anything with it. Fortunately, you’ve got a vast array of tools at your disposal to do something with your data. The tool you choose for the job really depends on what you want to do with your data. Unfortunately, figuring out what tool you need isn’t always as straightforward as it might seem. In this post, we’ll look at one simple option for processing your data: Cloud Functions. We’ll start with an overview of how cloud functions work, what you can use them for, and then we’ll roll up our sleeves and get stuck in to a hands-on scenario.

Prerequisites

Before we start, you’ll need to be up and running with a Google Cloud project if you want to follow along. You can sign up for a year-long free trial that gives you $300 of free credit to explore Google Cloud Platform with. You’ll need to provide a payment method for this, but you won’t be charged unless you explicitly upgrade your account. I’ll be using cloud storage to store data as part of the hands-on scenario, but you won’t need any prior knowledge for this. If you want to learn more about cloud storage, you can find a more detailed post here. I’ll also be using Python to implement a cloud function in this article. Again, you won’t need prior knowledge as I’ll provide the necessary code, but I strongly recommend you familiarise yourself with a programming language if you want to implement your own solutions. I find Python easiest to work with, but you can also use Node.js or Go.

What are Cloud Functions?

Quite simply, cloud functions are just that: functions. Small blocks of code that typically serve a single purpose. If you happen to have prior programming experience with python, node or go, you’ll probably find that developing a cloud function is a familiar experience, barring a few nuances. In most cases, you’ll want your cloud function to execute its task in response to an event of some kind. This is where cloud functions really excel. You can configure a cloud function to execute in response to events from a range of sources including (but not limited to) cloud storage, cloud pub/sub and firestore. You can also respond to http requests, so if the existing service palette doesn’t suit you, you can easily roll your own events.

In days gone by, if you wanted to deploy an application you first needed to put the relevant infrastructure in place. This would usually include setting up a bare-metal or virtual server, installing an operating system, installing the various dependencies for your application and then installing your application. In cloud-based environments, this traditional approach is rapidly giving way to serverless offerings. What this means is that the service provider handles setting up the infrastructure for you, freeing you up to focus on developing your application. Cloud functions fall under the bracket of functions as a service (FaaS), a category of serverless cloud computing services. You don’t need to worry about setting up any infrastructure when you’re ready to deploy your function: all you need to do is click a button and let Google handle the rest.

When developing a cloud function, you should ensure your logic is stateless. The reason for this is down to how the service handles scaling. For example, when your function is triggered by a request for the first time, a new instance of your function will be created. This instance will be allocated some CPU and memory resource for execution and can only process one request at a time. If, while your first request is processing, another request is made, a new instance will be created, with its own distinct set of resources. This means that cloud functions can’t share state (such as variables) with each other. While it’s possible that instances might sometimes be reused to improve performance by reducing the need for cold starts, you should not assume that this will be the case.

If you need to track state between functions then you can use a storage service such as cloud storage or firestore. If you find yourself in this situation, consider whether your use case might be better served by another service. If your function needs to run lots of instances concurrently, trying to track state this way could cause you some major headaches.

The upside to this stateless approach is that you get a service that has almost unbounded scalability and can rapidly scale up and down to meet demand. This behaviour isn’t always desirable though, as in some scenarios you may find yourself hitting quota limits or overwhelming another service that doesn’t have the same degree of scalability. To avoid this, you can set a cap on the maximum number of concurrent instances your cloud function can instantiate before you deploy your function, and then modify it over time as necessary.

When Should I Use Cloud Functions?

Cloud functions are a good option for your event-driven, lightweight and stateless processing needs. They are commonly used to chain cloud services together to create more capable systems than would otherwise be possible. You should note that each invocation of your function can live for a maximum of 9 minutes (540 seconds), so you’ll need to be sure that your logic can complete within that time. If you need more time than this, you may want to consider another compute service such as cloud run. Some example use cases for cloud functions include:

Real-time Data Processing – In scenarios such as fraud/anomaly detection and healthcare, having immediate access to data can be critical. The event-driven nature of cloud functions makes them ideal for processing data in near real-time. For example, you could configure a cloud function to trigger every time new data arrives in a cloud storage bucket or pub/sub subscription that immediately loads your data into a BigQuery table, making it available for analysis within seconds of creation.
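As a rough sketch of the cloud storage to BigQuery example, the function below assumes a storage-triggered Python function that receives an event dictionary containing the bucket and object name. The table ID my_dataset.messages is a placeholder, the function names are my own, and I’m assuming the new object holds newline-delimited JSON and that google-cloud-bigquery is listed in requirements.txt.

```python
def gcs_uri(event):
    """Build the gs:// URI for the object described by a storage event."""
    return "gs://{}/{}".format(event["bucket"], event["name"])


def load_new_object(event, context):
    """Load a newly created JSON object into a BigQuery table."""
    # Imported here so the helper above works without the library installed;
    # in a deployed function this import would normally sit at module level.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
        autodetect=True,  # let BigQuery infer the schema from the data
    )
    # "my_dataset.messages" is a placeholder; swap in your own table ID.
    load_job = client.load_table_from_uri(
        gcs_uri(event), "my_dataset.messages", job_config=job_config
    )
    load_job.result()  # block until the load job has finished
```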

APIs & Websites – Since you can respond to HTTP requests with cloud functions, you can use them to serve up resources such as webpages to end users, or as a medium for interacting with other services. Naturally, this means you can use cloud functions to create APIs that you expose to authorised users or make available for public use. Be aware that the latter option might expose you to malicious users and very high costs, so be sure to take appropriate security measures and consider using tools like cloud endpoints to control usage of your APIs.
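For a feel of what serving a webpage looks like, here is a small sketch of an HTTP function that returns HTML. The function name greet and the name query parameter are made up for this example. It relies on the fact that an HTTP function can return a (body, status, headers) tuple in the same way a Flask view can.

```python
import html


def greet(request):
    """Hypothetical HTTP function that returns a tiny HTML page."""
    # request.args holds the query string parameters, e.g. ?name=Ann
    name = request.args.get("name", "world")
    # Escape user input before placing it in HTML to avoid script injection.
    body = "<h1>Hello, {}!</h1>".format(html.escape(name))
    return body, 200, {"Content-Type": "text/html; charset=utf-8"}
```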

Webhooks – You can expose an HTTP cloud function URL to other applications and services that can then call the function in response to certain events. You can push things like alerts and data from one service to another in this manner. This approach enables you to create custom triggers to help chain services together that don’t already have pre-defined triggers, allowing them to communicate in an automated way.

The cost of running a cloud function depends on a few factors, such as the number of invocations, the length of time your function takes and the amount of data you need to transfer over the network. At the time of writing, you get 2 million free invocations of cloud functions per month, free network ingress and free network egress to services that are based in the same region as your cloud function. Invocations outside the free allocation are charged at $0.40 per million and network egress at $0.12 per GB. CPU and memory usage are billed in GHz-seconds and GB-seconds respectively, so the cost of these resources depends on how much you allocate and how long they are active. You can find out more about cloud function pricing here.
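To see how the invocation and egress charges add up, here is a small back-of-the-envelope calculator using only the figures quoted above. The function names are my own, the prices are those at the time of writing, and the sketch deliberately leaves out CPU and memory charges.

```python
FREE_INVOCATIONS = 2_000_000    # free invocations per month
PRICE_PER_MILLION = 0.40        # USD per million invocations after that
EGRESS_PRICE_PER_GB = 0.12      # USD per GB of billable network egress


def invocation_cost(invocations):
    """Monthly invocation charge in USD, after the free allocation."""
    billable = max(0, invocations - FREE_INVOCATIONS)
    return round(billable / 1_000_000 * PRICE_PER_MILLION, 2)


def egress_cost(billable_gb):
    """Charge in USD for egress that isn't covered by the free allowance."""
    return round(billable_gb * EGRESS_PRICE_PER_GB, 2)
```

For example, 5 million invocations in a month leaves 3 million billable, which comes to $1.20, and 10 GB of billable egress adds another $1.20.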

Cloud Functions In Action

The best way to get started with cloud functions is to build one. Since we’re in the blogging business, let’s say we allow users to post arbitrary messages to our website. Our hypothetical blog is incredibly popular (wink, wink) and we couldn’t possibly read through all this feedback, so we want to capture all the messages we get and load them into Google Cloud Platform for sentiment analysis. To do this, we’ll need to implement a webhook that can be called each time someone hits the send message button. Our webhook will need to trigger in response to an HTTP POST request. This request will provide the user name of the sender and the content of their message as a JSON object. The webhook will then inspect the request and, if it is in the correct format, store the request body as a cloud storage object, ready for analysis.

Setting up Storage

Before we start implementing our cloud function, we’ll need to set up a cloud storage bucket so that we can start collecting some data. Head over to the cloud console and ready the project of your choice. Open the navigation menu and open up the cloud storage browser. In the storage browser, hit the create bucket button and set up a regional bucket:

  • In the name your bucket section, give your bucket a unique name
  • In the choose where to store your data section, choose the region location type and a location of your choice. Make a note of this region as you will need it when you create the cloud function.
  • Leave the remaining sections unchanged, then click the create button.

Configuring Your Cloud Function

Next, we need to create a cloud function to serve as our webhook. In the cloud console, open your navigation menu and browse to the cloud functions page. From here, you’ll need to click the create function button to open the create function form. You may also need to enable the cloud functions API if you haven’t done so. If you aren’t prompted to do this, you can navigate to the APIs & Services dashboard from the navigation menu and enable it from there.

With the create function form open, we’ll first need to give our function a name. Let’s go for something descriptive and call it message_webhook. Next, we’ll need to choose how much memory we want to allocate to our function. Though it’s not stated in the form, allocating more memory will also allocate more CPU resource. This will result in higher costs than running your function with fewer resources over the same period of time, so you should try to allocate only the resources you need. Figuring out what resources you need might take some trial and error, but this will save you from over-allocating and reduce your costs in the long run. For our purposes a small 128MiB function will suffice. The next setting is the cloud function trigger. We want our webhook to be triggered by HTTP requests, so we can leave this unchanged. Finally, we’ll leave the allow unauthenticated invocations option unchecked, so that we can control who uses our function.


Programming Your Cloud Function

Now we need to provide some logic for our webhook. There are a number of options we can use for doing this, such as using the provided editor, providing a zip archive of code, or getting our code from a cloud source repository. Since our use case is nice and simple, we can just go ahead and use the code editor provided by the form. Be aware though that this isn’t a great option for developing a production ready function. You’ll probably want to organise your function code across multiple files and adhere to good coding practices such as SOLID and DRY, particularly as your code becomes more complex. You won’t be able to do this with the inline editor so choose another option in this scenario.

For the function runtime, we’ll choose python 3.7. If you feel comfortable with a different language however, feel free to rewrite the code below in your chosen language. At this point, the editor should provide you with two files: main.py and requirements.txt. In main.py we’ll replace the provided function template with the logic for our webhook. Copy the code below and paste it into the main.py file in the editor, swapping the <your-bucket-name-here> placeholder with the name of your own bucket:

import json
import uuid

import flask
from google.cloud import storage


def store_message(request):
    """A webhook for capturing app user messages as cloud storage objects.

    Args:
        request (flask.Request): HTTP request object.
    Returns:
        (flask.Response): 'OK' if the POST request contained a payload we
            could store. 'Bad Request' otherwise.
    """
    # silent=True returns None instead of raising if the body isn't valid JSON.
    request_json = request.get_json(silent=True)

    # Accept only POST requests whose body is a JSON object with exactly
    # the keys 'user' and 'message'.
    if (request.method == 'POST'
            and isinstance(request_json, dict)
            and set(request_json) == {'user', 'message'}):
        client = storage.Client()
        bucket = client.get_bucket('<your-bucket-name-here>')
        # A random UUID gives each message its own uniquely named object.
        blob = bucket.blob('message_{}.json'.format(uuid.uuid4()))
        blob.upload_from_string(
            json.dumps(request_json),
            content_type='application/json'
        )
        return flask.Response(response='OK', status=200)
    else:
        return flask.Response(response='Bad Request', status=400)

Our code defines a single function named store_message that we want to invoke in response to HTTP POST requests. In the function body, we check that the incoming request has the information we want and the correct type of request was used. If these conditions are met, we upload the request data to our cloud storage bucket, ready for analysis. In the function to execute text box, we need to tell cloud functions what function we want to use as our entry point. We only have one function, so type ‘store_message’.
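If you want to experiment with the payload check on its own before deploying anything, it can be pulled out into a small helper. This is just a sketch: the name is_valid_message is my own and isn’t part of the function we deploy. It accepts a payload only if it is a JSON object with exactly the user and message keys.

```python
REQUIRED_KEYS = {"user", "message"}


def is_valid_message(payload):
    """Return True only for a dict with exactly the keys we expect."""
    return isinstance(payload, dict) and set(payload) == REQUIRED_KEYS
```

Keeping the check separate like this also makes it easy to try out locally without needing a cloud storage bucket.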

You’ll notice that our code defines some imports outside the function that allow us to make use of some external code libraries. Cloud functions allows you to define code in this global scope. This code will run each time a new instance of your function is created, but you should note that it will not run each time your function is invoked, as instances may be reused. You should therefore only use this scope for initialisation purposes and ensure that any logic here is idempotent. To finish with our source code section, we need to add a line to our requirements file, so that we can tell cloud functions that we require the Python google-cloud-storage library, which isn’t pre-installed in the execution environment. Add the following line to requirements.txt:

google-cloud-storage==1.28.0

This will tell cloud functions to install the library for us, so that our function has access to it when it is invoked.


Deploying Your Cloud Function

To complete our function, we need to expand the environment variables, networking, timeouts and more section. You’ll see some advanced settings that we won’t be modifying here. All we need to do is set the region of our cloud function to the region that we deployed our bucket in. Putting our function and bucket in the same region will allow us to avoid network egress charges and help keep costs down. Scroll down to the bottom of the form and click the deploy button. It might take a few minutes for your function to deploy. You should find yourself on the function browser page, with a green tick icon next to your function if the deployment was successful.

We can test our webhook by hitting it with an HTTP POST request carrying a simple JSON payload. Remember that when we created our webhook, we decided not to make it publicly available. This means we will have to submit an authentication header with our request, so that the webhook can check that we have permission to call it. One way to do this is to use the curl command from the cloud shell terminal and pass your identity token with the gcloud auth print-identity-token command. Click the cloud shell icon in the top toolbar of the cloud console. Once the terminal has loaded, enter the following command, replacing the <your-cloud-function-url> placeholder with the URL for your cloud function. You can find the URL under the trigger tab of the function details page after clicking on your function in the function browser:

curl -H "Content-Type: application/json" -H "Authorization: bearer $(gcloud auth print-identity-token)" --data '{"user":"johndoe", "message":"hello"}' <your-cloud-function-url>

If all went well, we should have captured the payload in our cloud storage bucket. Use the navigation menu to open the storage browser and open the bucket you created to store your messages. You should see a new object in your bucket with a small json object as its content. Feel free to run the curl command a few more times and modify the username and message data. You can verify each successful request by checking for new objects in your bucket.
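If you’d rather send your test requests from Python than from curl, something along these lines should work from the cloud shell. Treat it as a sketch: the function names are my own, it uses only the standard library, and it assumes the gcloud command is available on your path to fetch the identity token.

```python
import json
import subprocess
import urllib.request


def identity_token():
    """Fetch an identity token by calling the gcloud CLI."""
    result = subprocess.run(
        ["gcloud", "auth", "print-identity-token"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def build_request(url, token, user, message):
    """Build (but don't send) an authenticated POST with a JSON payload."""
    payload = json.dumps({"user": user, "message": message}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer {}".format(token),
        },
        method="POST",
    )


def send_message(url, user, message):
    """Send one message to the webhook and return the HTTP status code."""
    request = build_request(url, identity_token(), user, message)
    with urllib.request.urlopen(request) as response:
        return response.status
```

Calling send_message with your function URL, a user name and a message should return 200 and leave a new object in your bucket. Note that urlopen raises an HTTPError rather than returning a status if the webhook rejects the request.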


Summary

You should now have a grasp of the basic concepts of cloud function development. Let’s finish with a recap of the key points:

  • Cloud functions can be used for lightweight, stateless, event-driven compute tasks.
  • You can develop a cloud function with python, node.js or go.
  • Cloud functions are categorised as a functions as a service offering. This means they are serverless and you don’t need to manage any infrastructure.
  • You can trigger a cloud function in response to events from services such as cloud storage, cloud pub/sub and cloud firestore. You can also invoke a cloud function in response to HTTP requests.
  • Cloud functions have almost unbounded scaling. If necessary, you can cap the number of concurrent instances of a cloud function during deployment.
  • When scaling, instances of your cloud function may be reused to avoid cold starts and improve performance.
  • You can avoid network egress charges by deploying cloud functions to the same region as the destination of any network traffic sent by your function.
  • You can provide a zip archive or code repository of source code for your function during deployment. You should avoid using the inline editor for anything other than the most basic use cases.
