Local testing and Google Cloud Functions

Back in 2017, I wrote about local testing and AWS Lambda. At some point, I will update that post with details on how to use AWS SAM to invoke automated local tests. In this article, I will pivot away from AWS to talk about Google’s equivalent service, Cloud Functions, and again will focus on local testing.

I am using the Python runtime, which makes use of Flask to handle incoming requests. If you are already familiar with Flask patterns, you will find a lot to like about Google Cloud Functions. And as before, I am using Python 3’s unittest to discover and run my tests.

The following is an attempt to document a problem I encountered, and the solution that I settled on. There are likely other/better ways. If you have a suggestion, please leave me a comment!

Here’s where I started.

My project structure:

  • gcf-testing-demo
      • count
        • __init__.py
        • main.py
        • counter.py
      • tests
        • test_count.py

main.py

import os
import json
from counter import Count

def document_count(request):

    headers = {
        'Content-Type': 'application/json'
    }

    try:
        request_json = request.get_json()
        document = request_json['document']
        c = Count()
        count = c.tok_count(document)
        response_body = {}
        response_body['document_count'] = count
        response = (json.dumps(response_body), 200, headers)

    except Exception as error:
        # can't parse the JSON, or the 'document' key is missing
        response_body = {}
        response_body['message'] = str(error)  # exceptions are not JSON serializable as-is
        response = (json.dumps(response_body), 400, headers)

    return response

counter.py

class Count:

    def tok_count(self, mystring):
        # Return the number of whitespace-delimited tokens in mystring
        tokens = mystring.split()
        return len(tokens)

test_count.py

from count.main import document_count
import unittest
import json
from unittest.mock import Mock

class MyTests(unittest.TestCase):
    def test_count(self):
        data = {"document": "This is a test document"}
        count = 5
        request = Mock(get_json=Mock(return_value=data), args=data)
        response = document_count(request)[0]
        self.assertEqual(json.loads(response)['document_count'], count)

if __name__ == '__main__':
    unittest.main()

This function takes an input:

{"document":"this is a test document"}

And returns a token count of the input ‘document’:

{"document_count":5}

A few details worth highlighting:

From main.py:

request_json = request.get_json()

This is an example of how Google is reusing familiar Flask patterns. The get_json method will pull any JSON out of the incoming Flask request object, and we can then pick out properties directly, e.g. request_json['document'].
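If you want to see get_json in action locally, outside of GCF, Flask can fabricate a real request object for you. Here is a minimal sketch, assuming Flask is installed in your local environment (the Cloud Functions Python runtime already bundles it):

import json
import flask

app = flask.Flask(__name__)

payload = json.dumps({"document": "this is a test document"})

# test_request_context builds a genuine Flask request carrying our JSON body
with app.test_request_context(data=payload, content_type='application/json'):
    request_json = flask.request.get_json()
    print(request_json['document'])   # -> this is a test document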

Also, from main.py:

response = (json.dumps(response_body), 200, headers)

Each Google Cloud Function is essentially an API method, and we must provide not just the response body but also the HTTP status code and any headers. We return all three as a tuple, and GCF will proxy this to the caller appropriately.
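That tuple is exactly what an ordinary Flask view can return, which is another sign that GCF is leaning on Flask conventions. A throwaway sketch in plain Flask (not GCF), just to illustrate:

import json
import flask

app = flask.Flask(__name__)

@app.route('/ping')
def ping():
    # Flask accepts the same (body, status code, headers) tuple
    return (json.dumps({'pong': True}), 200, {'Content-Type': 'application/json'})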

The code for multiple Google Cloud Functions can be maintained in a single main.py. You will see how this looks when we deploy this function below. We can also import external modules, as we have done here with counter.py, but to enable this functionality we must include an __init__.py in our function directory.

Note that you could also keep your modules in a subdirectory, provided that subdirectory also has an __init__.py. In either case, the __init__.py can be empty.
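For instance, a single main.py could hold two HTTP entry points side by side, each deployed by name. A hypothetical sketch (word_lengths is not part of this demo):

import json

def document_count(request):
    ...  # the function shown above

def word_lengths(request):
    # Hypothetical second entry point in the same main.py; it would be deployed
    # separately, e.g. gcloud beta functions deploy word_lengths ...
    request_json = request.get_json()
    lengths = [len(tok) for tok in request_json['document'].split()]
    return (json.dumps({'word_lengths': lengths}), 200,
            {'Content-Type': 'application/json'})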

In test_count.py you will find a single test which ensures that the document counter returns the correct value for some input. I am using the unittest.mock library to construct a Mock object equivalent to the request object our Google Cloud Function expects. Note the get_json method, which returns the contents of my test document.

request = Mock(get_json=Mock(return_value=data), args=data)
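The same mocking trick could also cover the failure branch. A sketch of an extra test case (not part of the demo run below), assuming the 400 path in main.py:

    def test_missing_document(self):
        # No 'document' key, so document_count should hit its except branch
        request = Mock(get_json=Mock(return_value={}), args={})
        body, status, headers = document_count(request)
        self.assertEqual(status, 400)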

Okay, things look good. Let’s run our test:

python3 -m unittest discover

Which results in the error:

ERROR: tests.test_count (unittest.loader._FailedTest)
----------------------------------------------------------------------
ImportError: Failed to import test module: tests.test_count
Traceback (most recent call last):
  File "/usr/lib/python3.5/unittest/loader.py", line 428, in _find_test_path
    module = self._get_module_from_name(name)
  File "/usr/lib/python3.5/unittest/loader.py", line 369, in _get_module_from_name
    __import__(name)
  File "/home/dfox/gcf-testing-demo/tests/test_count.py", line 1, in 
    from count.main import document_count
  File "/home/dfox/gcf-testing-demo/count/main.py", line 3, in 
    from counter import Count
ImportError: No module named 'counter'
----------------------------------------------------------------------
Ran 1 test in 0.000s
FAILED (errors=1)

So, what’s happening? When I invoke my test script, it searches in local and system paths for counter.py, and finds…nothing! I have a few options at this point:

  • I could manually update my PYTHONPATH (or tweak sys.path) to ensure that my module directory is included (see the sketch after this list).
  • I could move my test script into my module directory.
  • Or I can switch from using relative to absolute paths in my module code.
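For reference, the first option (or its in-code equivalent, tweaking sys.path) could look something like this at the top of tests/test_count.py. This is only a sketch, not what I ended up doing:

import os
import sys

# Make modules inside count/ importable as top-level names, so that
# "from counter import Count" inside main.py resolves during tests.
sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..', 'count'))

from count.main import document_count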

Let’s try this last option, as it seems like the solution that will be easiest on any future testers/developers. Update main.py as follows:

from count.counter import Count

When we run our unittest again, we should get back a successful report:

Ran 1 test in 0.001s
OK

Great! Let’s deploy our function. I’m using the gcloud command-line utility. More info on setting this up here. I’m also using the beta client. Change into your module directory and execute the following (note that the name ‘document_count’ refers to the function in main.py, not to main.py itself):

gcloud beta functions deploy document_count --runtime python37 --trigger-http

Which will return

ERROR: (gcloud.beta.functions.deploy) OperationError: code=3, message=Function failed on loading user code. Error message: Code in file main.py can't be loaded.
Did you list all required modules in requirements.txt?
Detailed stack trace: Traceback (most recent call last):
  File "/env/local/lib/python3.7/site-packages/google/cloud/functions/worker.py", line 256, in check_or_load_user_function
    _function_handler.load_user_function()
  File "/env/local/lib/python3.7/site-packages/google/cloud/functions/worker.py", line 166, in load_user_function
    spec.loader.exec_module(main)
  File "", line 728, in exec_module
  File "", line 219, in _call_with_frames_removed
  File "/user_code/main.py", line 3, in 
    from count.counter import Count
ModuleNotFoundError: No module named 'count'

Oh boy. Looks like my genius plan to use absolute paths will not fly with our Google Cloud Function deployment. Which makes sense, as the resulting function has no knowledge of our local project structure. It is dynamically building a package out of main.py, any imported modules, and any dependencies we may have listed in our requirements.txt.

At this point, I was a bit unsure of how to proceed. I decided to take a look at some of Google’s own sample Python applications. Here is a sample application for a Slack ‘slash command’. Let’s peek at the project structure:

  • slack/
    • README.md
    • config.json
    • main.py
    • main_test.py
    • requirements.txt

It is not a like-for-like example, since there are no modules being imported into main.py, but notice that the test script is simply in the same directory as main.py. This was an approach I had considered and discarded, simply because I was concerned about mingling my tests with function code. But if it is good enough for Google, who am I to argue?

So, let’s restructure things:

  • gcf-testing-demo
      • count
        • __init__.py
        • main.py
        • counter.py
        • test_count.py

And then switch back to relative imports in main.py:

from counter import Count

I also need to update the import in test_count.py:

from main import document_count

Now, I should be able to cd into count/ and execute my test:

python3 -m unittest discover
Ran 1 test in 0.000s
OK

Next, I will confirm that I can deploy this function as a GCF:

gcloud beta functions deploy document_count --runtime python37 --trigger-http
Deploying function (may take a while - up to 2 minutes)...done.
availableMemoryMb: 256
entryPoint: document_count
httpsTrigger:
  url: ###
labels:
  deployment-tool: cli-gcloud
name: ###
runtime: python37
serviceAccountEmail: ###
sourceUploadUrl: ###
status: ACTIVE
timeout: 60s
updateTime: '2019-01-23T15:06:23Z'
versionId: '1'

It worked!

Let’s log into the console, browse the Cloud Functions resource and view our function. If I look at the ‘Source’ tab I’ll see main.py, counter.py (this is good!), and also test_count.py (this is less good). This is why I did not want to mingle my tests with my code:

[screenshot: tests_in_source]

Fortunately, Google provides a way to filter out files we don’t wish to incorporate into our Cloud Function package. We need to create a .gcloudignore file (equivalent to the .gitignore file you may be familiar with) and add it to the module directory. I only need one line to filter out my tests, but I may as well also filter out __pycache__, *.pyc, and .gcloudignore itself:

.gcloudignore:

*.pyc
__pycache__/
test_*.py
.gcloudignore

After redeploying the function, the source code looks much cleaner:

[screenshot: filtered_source]

Now, I can finally run a live test against the deployed function:

[screenshot: live_test]

Success!
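For reference, an equivalent live check could be scripted with something like the following. The URL is a placeholder; substitute the httpsTrigger url from your deploy output (and this assumes the requests library is installed locally):

import requests

# Placeholder URL -- use the httpsTrigger url returned by gcloud
url = 'https://REGION-PROJECT_ID.cloudfunctions.net/document_count'

resp = requests.post(url, json={'document': 'this is a test document'})
print(resp.status_code)   # expect 200
print(resp.json())        # expect {'document_count': 5}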
