iangilham.com


code azure azure-functions

I’ve recently been working with Azure Functions, Microsoft’s functions-as-a-service (FaaS) platform. I’m already quite familiar with AWS Lambda, so my perspective is coloured by that experience.

First, the good. Function Apps are pretty flexible. You can host a single entry point per app if you want, but it is often more cost-effective to host several functions within the same app and share some resources. I’m using JavaScript, so the whole app shares a package.json, a node_modules directory and so on. The app also has its own Storage Account, which can host blobs (like S3), queues (like SQS) and tables (similar to DynamoDB, but they work differently). These resources are accessible from any function within the app, so it’s easy to send a message down a queue to model event-driven workflows, or to perform multiple actions on the same files or database records.

The runtime of a function app environment is versioned separately from the language and language version. There is a list of supported languages and versions for each version of the runtime.
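For reference, and as a sketch rather than gospel: on the hosted service the runtime version is normally pinned with the FUNCTIONS_EXTENSION_VERSION application setting and the language with FUNCTIONS_WORKER_RUNTIME. In an ARM template’s appSettings the relevant entries look something like this:

"appSettings": [
  { "name": "FUNCTIONS_EXTENSION_VERSION", "value": "~3" },
  { "name": "FUNCTIONS_WORKER_RUNTIME", "value": "node" }
]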

A Function App is connected to an App Service Plan, which defines the virtual machine billing model. There are options for consumption plans on Windows and Linux or you can reserve an auto-scaling pool of virtual machines. It’s fairly flexible but quite awkward to configure for Linux. The key when using Terraform or ARM is to make sure the instance is reserved, as that seems to mean “use Linux”.

The inputs and outputs of functions can be bound to HTTP requests/responses or tied directly to other Azure event sources, such as blobs, queues and tables. This saves you writing the code to send a message, write a row to a table or perform any number of other common integrations. It gives Functions rather more flexibility than AWS Lambda, which only lets you bind the input event source.
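As a minimal sketch of what that looks like in JavaScript (the binding names and queue name here are illustrative, not from a real app), an HTTP-triggered function can enqueue a message simply by assigning to an output binding declared in function.json:

// function.json (abridged)
{
  "bindings": [
    { "type": "httpTrigger", "direction": "in", "name": "req", "authLevel": "function", "methods": ["post"] },
    { "type": "http", "direction": "out", "name": "res" },
    { "type": "queue", "direction": "out", "name": "outputMessage", "queueName": "orders", "connection": "AzureWebJobsStorage" }
  ]
}

// index.js
module.exports = async function (context, req) {
  // whatever is assigned to the output binding is written to the queue by the runtime
  context.bindings.outputMessage = req.body
  context.res = { status: 202 }
}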

The documentation for Linux apps is largely missing and often useless. You have to dig pretty deep to discover that the setting for choosing the Node.js version only works on Windows instances. For Linux there is a hidden setting called LinuxFxVersion in a different place (Site Config) that has no UI in the Azure Portal. Fortunately Terraform has a way to set this undocumented property. It appears in ARM as well, but I haven’t found a way to edit it from the Azure Portal.
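For what it’s worth, in an ARM template LinuxFxVersion lives under siteConfig on the Microsoft.Web/sites resource, alongside the reserved flag mentioned above. A trimmed-down fragment looks roughly like this (the NODE|12 value is illustrative and depends on the runtime and Node.js version you target):

{
  "type": "Microsoft.Web/sites",
  "kind": "functionapp,linux",
  "properties": {
    "reserved": true,
    "siteConfig": {
      "linuxFxVersion": "NODE|12"
    }
  }
}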

I’ve seen various weird crashes when deploying and restarting function apps. There are at least 3 different undocumented things that can go wrong, leaving only a cryptic error message. Searching for these issues usually fails to yield any relevant results. I’ve seen more strange errors with version 3 of the functions app runtime than with version 2.

Another frustration is the Azure CLI. The best case when deploying a function app is the message Operation failed with status: 200 Operation completed successfully. As amusing as that is, more often the tool simply fails with an HTTP 400 Bad Request and no explanation. Since the CLI hits the API about 8 times during a deployment, there is no way to know how far it got or what state it left your app in. This completely breaks any kind of deployment automation. In production, you’ll have to pay somebody to hand-hold the build and make sure changes make it out to your customers.


code aws aws-sns

AWS Simple Notification Service (SNS) is a great tool for broadcasting messages to a topic. Subscribers receive all messages by default, or they can use a filter on the message’s attributes to receive only a subset of the messages sent to the topic.

Message attributes have a few possible data types. Unfortunately the documentation for the JavaScript SDK is pretty bad at the time of writing. It’s fairly obvious how to set an attribute of type String, but it says nothing about how to set one of type String.Array. Fortunately, I guessed correctly when I gave it a try.

const AWS = require('aws-sdk')
const config = require('./config')

AWS.config.region = 'eu-west-1' // or your region

const sns = new AWS.SNS()

const notificationGroups = [
  'GroupA',
  'GroupB'
]

async function sendMessage(message, errorCode) {
  const params = {
    Message: message,
    Subject: 'Something happened',
    TopicArn: config.sns.arn,
    MessageAttributes: {
      errorCode: {
        DataType: 'String',
        StringValue: `${errorCode}`
      },
      group: {
        DataType: 'String.Array',
        StringValue: JSON.stringify(notificationGroups)
      }
    }
  }

  await sns.publish(params).promise()
}

The trick is to call JSON.stringify(someArray) and stuff it into the StringValue key in the MessageAttribute object.
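On the receiving side, a filter policy treats a String.Array attribute as a set and matches if any element appears in the allowed list. A sketch of subscribing a queue so it only receives messages sent to GroupA (the queue ARN below is a placeholder):

const subscribeParams = {
  TopicArn: config.sns.arn,
  Protocol: 'sqs',
  Endpoint: 'arn:aws:sqs:eu-west-1:123456789012:group-a-queue', // placeholder ARN
  Attributes: {
    // matches when the message's group attribute contains 'GroupA'
    FilterPolicy: JSON.stringify({ group: ['GroupA'] })
  }
}

await sns.subscribe(subscribeParams).promise() // inside an async function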


code bash

It’s funny how easy it is to overlook an obvious solution to a trivial problem and keep doing things the slow way for years at a time.

While writing a short bash script today, it occurred to me that I’ve been handling requirements poorly for years. I’ve previously used long chains of if ! $(which COMMAND) &>/dev/null to declare and enforce requirements, but it never occurred to me to wrap the check in a simple function. This is what popped out of my head today:

#!/bin/bash
# Declare requirements in bash scripts

set -e

function requires() {
    if ! command -v "$1" &>/dev/null; then
        echo "Requires $1"
        exit 1
    fi
}

requires "jq"
requires "curl"
# etc.

# ... rest of script

This makes it easy to declare simple command requirements without repeating the basic if not found then fail logic over and over. In hindsight it should have been obvious to wrap this stuff in a function years ago, but at least I’ve caught up now.

The requires function can of course be placed in a common file for inclusion into multiple scripts. I’ve shown it inline for simplicity.

This snippet is also available as a Gist.


code aws aws-lambda aws-s3

As the web continues to evolve, the selection of HTTP response headers a site is expected to support grows. Some of the security-related headers are now considered mandatory by various online publishers, including Google and Mozilla. If, like me, you run your site from an S3 bucket of static files, it’s not easy to add the recommended headers and improve your site’s security scores in tools like Mozilla’s Observatory.

CloudFront and Lambda@Edge can fix that. Others have detailed the process more fully than I can but beware that some features may have changed since the older posts were written. I suggest following the article linked previously if you want to implement this for your site. I’ve listed some of the gotchas that slowed me down below.

Adding the right permissions to the Lambda IAM role

While creating the Lambda you will be asked to assign it a role. Make sure you add the Basic Edge Lambda template to the role you create to allow the function trigger to be created. I missed this step the first time and it took me a few tries to figure it out.
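In practice the important part of that template is the role’s trust relationship: Lambda@Edge needs both lambda.amazonaws.com and edgelambda.amazonaws.com to be able to assume the role. The trust policy ends up looking something like this:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": ["lambda.amazonaws.com", "edgelambda.amazonaws.com"]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}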

Beware of replication

When you deploy the Lambda and attach the trigger, CloudFront will create replicas of the Lambda in various regions. If you then update your function, publish a new version and redeploy, it will create more replicas. The replicas of older versions are not automatically deleted and cannot be deleted by the user at this time so they will pollute your account with potentially large numbers of unused Lambda functions. Hopefully Amazon will fix this issue at some point.

Choose your trigger wisely

CloudFront supports 4 triggers: origin-request, origin-response, viewer-request and viewer-response. The restrictions on run time and memory size differ between triggers, so pay attention to the delays your functions introduce into the traffic flow.

The viewer triggers are run outside of CloudFront’s caching layer so viewer-request is triggered before the request hits the cache and viewer-response is triggered after the response has exited the cache.

origin-response is triggered after CloudFront has found the resource but before it writes the response to its own cache. That means you can add headers and cache the result, reducing Lambda invocations and delays and keeping the cost down.

Headers and their values

I’ve configured CloudFront to redirect HTTP requests to my site to HTTPS, but browsers like a header to make that even clearer. HSTS (Strict-Transport-Security) does this job. Beyond that, there’s a range of headers that help the browser mitigate the risks of XSS (cross-site scripting) vulnerabilities. These are well documented by Mozilla so I won’t rehash them here.

The interesting one is the CSP (Content-Security-Policy) header. There are a few versions of the syntax supported by different browsers and getting it right is a little tricky. The excellent CSP Evaluator by Google is very helpful for testing wordings of the CSP. Tuning the policy to work properly with Google Analytics and allow inline stylesheets, while disabling most other avenues of attack, took a few attempts to get right, but I’m happy with what I ended up with.

I disabled everything by default with default-src 'none' then added the permissions I needed for my site. I use a very small inline CSS stylesheet, so I enabled that with style-src 'self' 'unsafe-inline'. I don’t make much use of images at present, but if I do I’ll access them over HTTPS, so I enabled that with img-src 'self' https:. Opening up just the needed permissions for scripts was a bit more difficult, but the CSP Evaluator helped a great deal. It recommends a strict-dynamic policy for browsers that support it. I only use one script on my site, for Google Analytics, so I had to extract the contents (including all whitespace) from the <script> tag, hash them with SHA256, encode the hash with Base64 and add the result directly to the policy. CSP Evaluator also recommends some fall-back options for browsers that do not yet support strict-dynamic, so I end up with script-src 'strict-dynamic' 'sha256-my_script_hash' 'unsafe-inline' https:, where my_script_hash is the Base64-encoded SHA256 hash of the contents of my script. The complete example is in the code below.
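Computing that hash is a one-liner in Node. A small sketch, assuming the inline script’s contents have been copied verbatim into a file called inline-script.js (the file name is just for illustration):

'use strict';

const crypto = require('crypto');
const fs = require('fs');

// the exact contents of the <script> tag, whitespace included
const script = fs.readFileSync('inline-script.js', 'utf8');

// Base64-encoded SHA256 hash, ready to drop into the CSP as 'sha256-<hash>'
const hash = crypto.createHash('sha256').update(script, 'utf8').digest('base64');
console.log(`'sha256-${hash}'`);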

Lambda code template

My basic Lambda template for adding custom headers to all HTTP responses on my site is shared below.

'use strict';

exports.handler = (event, context, callback) => {
    function add(h, k, v) {
        h[k.toLowerCase()] = [
            {
                key: k,
                value: v
            }
        ];
    }

    const response = event.Records[0].cf.response;
    const headers = response.headers;

    // hsts header at 2 years
    // Strict-Transport-Security: max-age=63072000; includeSubDomains; preload
    add(headers, "Strict-Transport-Security", "max-age=63072000; includeSubDomains; preload");

    // Reduce XSS risks
    add(headers, "X-Content-Type-Options", "nosniff");
    add(headers, "X-XSS-Protection", "1; mode=block");
    add(headers, "X-Frame-Options", "DENY");
    // TODO: fill in value of the sha256 hash
    const csp = "default-src 'none'" +
        "; frame-ancestors 'none'" +
        "; base-uri 'none'" +
        "; style-src 'self' 'unsafe-inline'" +
        "; img-src 'self' https:" +
        "; script-src 'strict-dynamic' 'sha256-my_script_hash' 'unsafe-inline' https:"
    add(headers, "Content-Security-Policy", csp);

    console.log('Response headers added');

    callback(null, response);
};


cmake cpack rpmbuild

I’ve been trying to analyse a core dump generated by a C++ application when it seg-faulted. I use CMake 3 to build it and create an RPM with CPack. This application is currently built in debug mode using -DCMAKE_BUILD_TYPE=Debug on the command line that invokes CMake.

The generated binaries have all their debug symbols as expected but the binaries in the RPM package do not. After some searching, I learned that rpmbuild strips binaries by default on some distributions. This makes analysing a core dump way harder than it needs to be so I found a way to turn this feature off using CPack. The trick is to set a variable in the CMakeLists.txt file:

# prevent rpmbuild from stripping binaries when in debug mode
if(CMAKE_BUILD_TYPE STREQUAL "Debug")
  set(CPACK_RPM_SPEC_INSTALL_POST "/bin/true")
endif()

Now my debug packages retain their debug info after installation so it’s possible to get a lot more information out of gdb when looking at a core dump.

This is documented in a roundabout way online, but it took me a while to figure it out so I thought I’d write it up.