iangilham.com

cmake cpack rpmbuild

I’ve been trying to analyse a core dump generated by a C++ application when it seg-faulted. I use CMake 3 to build it and create an RPM with CPack. This application is currently built in debug mode using -DCMAKE_BUILD_TYPE=Debug on the command line that invokes CMake.
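For context, the configure/build/package steps look roughly like this (the out-of-source build directory and the use of the default Makefiles generator are assumptions for the sake of the example, not anything specific to my project):

# configure a debug build in a separate directory
mkdir -p build && cd build
cmake -DCMAKE_BUILD_TYPE=Debug ..
# compile, then produce the RPM with CPack
make
cpack -G RPM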

The generated binaries have all their debug symbols as expected but the binaries in the RPM package do not. After some searching, I learned that rpmbuild strips binaries by default on some distributions. This makes analysing a core dump way harder than it needs to be so I found a way to turn this feature off using CPack. The trick is to set a variable in the CMakeLists.txt file:

# prevent rpmbuild from stripping binaries when in debug mode
if(CMAKE_BUILD_TYPE STREQUAL "Debug")
  set(CPACK_RPM_SPEC_INSTALL_POST "/bin/true")
endif()

Now my debug packages retain their debug info after installation so it’s possible to get a lot more information out of gdb when looking at a core dump.
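As a rough illustration (the binary and core file paths here are placeholders, not real ones), analysing a crash then becomes a matter of:

gdb /opt/myapp/bin/myapp /path/to/core
(gdb) bt full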

This is documented in a roundabout way online, but it took me a while to figure it out so I thought I’d write it up.

core-dump linux

Since Systemd took over as the main init system in Red Hat Enterprise Linux and derivatives like CentOS, it has become more difficult to get a core dump out of a daemon application. The traditional approach of running ulimit -c unlimited before executing the binary works when running the application from the command line but does nothing for a daemon managed by Systemd’s unit files.

There is a lot of misleading information online about how to solve this so I thought I’d add a correct solution to the mix in the hope that it’s helpful.

The suggestions I found online include editing /etc/security/limits.conf, adding LimitCORE=infinity to the unit file, and messing around with /etc/systemd/coredump.conf. None of these methods work without customising the kernel configuration first.

Systemd is not configured to handle core dumps by default on CentOS (and by extension RHEL). The kernel’s default behaviour is to write a file named core in the process’s working directory, which for daemons is often the root directory.

The obvious problem here is that the daemon probably doesn’t have write access to the root directory (if running as a non-root user). It is possible to change the working directory with the Systemd unit directive WorkingDirectory=/var/run/XXX. This is typically used with RuntimeDirectory=XXX, which creates and manages the lifecycle of /run/XXX (/var/run is a symlink to /run). Unfortunately, we can’t write the core file to a RuntimeDirectory because it gets deleted when the application terminates.
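For reference, the directives in question look like this in a unit file (the daemon name is made up), though as noted this doesn’t solve the problem because the directory is cleaned up when the service stops:

[Service]
ExecStart=/usr/local/bin/mydaemon
RuntimeDirectory=mydaemon
WorkingDirectory=/var/run/mydaemon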

The simplest solution I found is to overwrite the kernel core_pattern setting. This can be edited at runtime by echoing a new value into /proc/sys/kernel/core_pattern:

echo /tmp/core-%e-sig%s-user%u-group%g-pid%p-time%t > /proc/sys/kernel/core_pattern

This forces the kernel to write all core files to /tmp using the filename pattern specified. The core(5) manpage has more information on the available format specifiers.

This change will be lost when the machine reboots. To make it persist across reboots, add the setting to /etc/sysctl.conf or to a file in /etc/sysctl.d/:

kernel.core_pattern=/tmp/core-%e-sig%s-user%u-group%g-pid%p-time%t

Our solution at work was to write a script that creates a file in /etc/sysctl.d/ at machine image creation time, so that the config is always there when we roll out to different environments (int, test, live, etc.).
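A minimal sketch of that kind of script, assuming it runs as root at image build time and that the file name 50-coredump.conf is ours to choose:

#!/bin/sh
# persist the core_pattern override so it survives reboots
echo 'kernel.core_pattern=/tmp/core-%e-sig%s-user%u-group%g-pid%p-time%t' > /etc/sysctl.d/50-coredump.conf
# apply all sysctl configuration immediately, without waiting for a reboot
sysctl --system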

It should go without saying that there is no particular reason to use /tmp. The output can be redirected to any location the process has permission to write to. A network share may be more appropriate in some cases.

There may be another solution using systemd-coredump, but it is not part of this release of CentOS (7.2) and not in the yum repository at this time.

code azure azure-functions azure-storage

I’ve written previously about how Amazon’s AWS Lambda feels like the future. I had a little time today to play with Azure Functions, Microsoft’s new competing service.

As a preview, Azure Functions isn’t quite polished yet, but the basic building blocks are in place. You create a Function App in the Azure Portal, much like you would create a Web or Mobile App, then add a few Functions in the Portal blade.

Since the service is built on the mature WebJobs product, there is already strong language support. Azure Functions can be created with specialised C#, NodeJS, PHP, Python, PowerShell, Windows Batch scripts (I know), or any standard Windows executable. The Portal UI leaves a lot to be desired for now. I didn’t find any obvious way of using different languages or packaging up apps and dependencies, but did manage to get it working. The documentation implies that there are many more features than those exposed by the Portal blade.

For the sake of a quick test, I created a small function to connect to the Azure Storage account and create an empty Table. I had a little trouble setting up the connection string to the storage account but found a solution using magic environment variables. This may seem normal to regular .NET and/or Azure developers but it was new to me. The documentation implied that I needed to use a CloudConfigurationManager from a NuGet package, but I failed to figure out how to pull in the dependency and resolve the namespace. The solution is to copy a connection string from the Azure Storage account and paste it into the Function App Service Settings blade’s Application Settings, under “Connection Strings”. This effectively makes the string available to the Function in an environment variable at runtime. If our connection string is named StorageConnectionString and the type is “Custom”, then the value will be available in the environment variable CUSTOMCONNSTR_StorageConnectionString.

Here’s the code I ended up with as shown in the editor in the Azure Portal.

#r "Microsoft.WindowsAzure.Storage"

using System;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Table;

public static void Run(object trigger)
{
    var storage = CloudStorageAccount.Parse(
        Environment.GetEnvironmentVariable(
            "CUSTOMCONNSTR_StorageConnectionString"));
    var tableClient = storage.CreateCloudTableClient();
    var membersTable = tableClient
        .GetTableReference("MyTable");

    membersTable.CreateIfNotExists();
}

This file is a little weird because it is a csx file. There is some documentation in the Azure Functions Developer Reference but it’s essentially C# with embedded assembly references (#r "Microsoft.WindowsAzure.Storage") and a top-level public static void Run method. Any other classes etc. you need can be included inline in the file.

With the code in place, you have to switch to the ‘integration’ tab to enable a trigger so that the function can be run. I did the minimum needed to test the function: I selected a manual trigger, changed the name to “trigger”, then switched back to the code view and hit the “Run” button at the bottom of the Function App blade. It took a few goes to get the code right, but once it ran cleanly I confirmed the result by opening the storage account blade in the Azure Portal and looking at the list of Tables in the account. Sure enough, we have our shiny new table.

While creating a table programmatically is about the simplest use case I could think of for testing a Function, I must admit that it feels a bit forced to have to write code for this at all. I prefer Amazon’s model of providing Console-based UIs for most of the core functionality, like creating tables and adding records. There are Storage Explorer tools for this on Azure, but it would be nice to be able to manage everything from the Portal.

There’s a lot more ground to cover here: Function triggering, integration with other services, creating API endpoints and deployment automation are all considerations. I look forward to seeing how Azure Functions evolve in the future.

aws aws-lambda aws-sns aws-cloudformation

Amazon Web Services (AWS) provides many building blocks you can use to create just about anything in the world of web-connected services. A lot of the skill in using this toolkit is in figuring out how to make the various services work together. CloudFormation is critical to this effort, as it lets you write a config file that can be used to automate the creation of all the infrastructure you need to deliver your product. The big win from using CloudFormation is reproducibility. It lets you define your whole infrastructure requirement, put it in version control, and reproduce the same setup in a production environment without manually recreating everything.

CloudFormation is not without its problems, however. The syntax is overly verbose and awkward to author. Some of the services it can create cannot be edited afterwards, and it is easy to get into circular-reference hell with more complex configurations involving interactions between multiple products. There are reams of documentation but most of it is useless noise without meaningful explanation. The emerging best practices are to work in another format (like YAML) and convert it to CloudFormation’s preferred format mechanically, or to pay a third party for their cloud automation tools.

I’ve been struggling to get an SNS topic to trigger a Lambda this week. This is really easy using the AWS console because it automates many of the tricky and barely documented steps required to get it working. Wiring it up in CloudFormation is another hell entirely. You need a few different pieces in the right order to make this work so I’ll go through each one in turn. Each of the below can be pasted into the Resources section of a CloudFormation template.

For completeness, I’ve included a basic template skeleton here.

{
  "Description": "blah blah",
  "Parameters": { ... },
  "Outputs": { ... },
  "Resources": { ... }
}

Lambda Execution Role

Before you can build a Lambda Function, you need to create some permissions for it to assume at runtime. Here I present a fairly minimal role suitable for a basic Lambda Function with no external integration points. Additional permissions (e.g. reading from an S3 Bucket) can be added to the list of Statements in the PolicyDocument. This part is actually documented reasonably well.

{
  "ExecutionRole": {
    "Type": "AWS::IAM::Role",
    "Properties": {
      "Path": "/",
      "Policies": [
        {
          "PolicyName": "CloudwatchLogs",
          "PolicyDocument": {
            "Statement": [
              {
                "Action": [
                  "logs:CreateLogGroup",
                  "logs:CreateLogStream",
                  "logs:GetLogEvents",
                  "logs:PutLogEvents"
                ],
                "Resource": [ "arn:aws:logs:*:*:*" ],
                "Effect": "Allow"
              }
            ]
          }
        }
      ],
      "AssumeRolePolicyDocument": {
        "Statement": [
          {
            "Action": [ "sts:AssumeRole" ],
            "Effect": "Allow",
            "Principal": {
              "Service": [ "lambda.amazonaws.com" ]
            }
          }
        ]
      }
    }
  }
}

Lambda Function

The Lambda Function itself is quite easy to set up. I’ve hard-coded placeholders for the S3 Bucket and path to the zip file containing the code. It doesn’t matter what the code is for the purpose of this article.

{
  "Lambda": {
    "Type": "AWS::Lambda::Function",
    "Properties": {
      "Code": {
        "S3Bucket": "my-personal-bucket",
        "S3Key": "lambdas/test/my-lambda.zip" }
      },
      "Description": "Some Lambda Function",
      "MemorySize": 128,
      "Handler": {"Ref": "LambdaHandler"},
      "Role": {
        "Fn::GetAtt": [ "ExecutionRole", "Arn" ]
      },
      "Timeout": 5,
      "Runtime": "python2.7"
    },
    "DependsOn": [
      "ExecutionRole"
    ]
  }
}
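Note that the Handler above is a Ref to a LambdaHandler parameter, which lives in the Parameters section I’ve elided. Purely as an illustration (the default value is a guess at a typical Python handler name, not taken from a real template), its declaration might look something like this:

{
  "LambdaHandler": {
    "Type": "String",
    "Description": "Handler in module.function form (illustrative default)",
    "Default": "lambda_function.handler"
  }
}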

SNS Topic

The SNS Topic is very easy to create but it cannot be modified using CloudFormation after it has been created. You also have to create all the subscriptions at the same time, so if you use CloudFormation for reproducibility, you can never change the subscriptions of a running event pipeline that relies on SNS. This is a major drawback of SNS and CloudFormation and should be considered with care before you rely too heavily on this set of tools.

The unmodifiable nature of SNS Topics created this way won’t be a problem if you’re creating subscriptions via an API at runtime, but it limits how flexible SNS can be for some workflows, like fanning out to SQS Queues or triggering Lambda Functions.

{
  "Topic": {
    "Type": "AWS::SNS::Topic",
    "Properties": {
      "Subscription": [
        {
          "Endpoint": {
            "Fn::GetAtt": [ "Lambda", "Arn" ]
          },
          "Protocol": "lambda"
        }
      ]
    },
    "DependsOn": [ "Lambda" ]
  }
}

Permission for the Topic to invoke the Lambda

Unfortunately, creating all the pieces isn’t enough. We still need to grant our SNS topic permission to invoke the Lambda Function directly. This is really important and the documentation is almost completely useless so I’m putting a working example here.

It is absolutely critical to have the SourceArn property refer to the Topic we created earlier. Without this, the Lambda Console will give you inexplicable errors, while the SNS Console claims that everything is correct and working. It can be very frustrating trying to get this right.

{
  "LambdaInvokePermission": {
    "Type": "AWS::Lambda::Permission",
    "Properties": {
      "Action": "lambda:InvokeFunction",
      "Principal": "sns.amazonaws.com",
      "SourceArn": { "Ref": "Topic" },
      "FunctionName": {
        "Fn::GetAtt": [ "Lambda", "Arn" ]
      }
    }
  }
}

Update: An observant reader spotted a bug in the above CloudFormation. I originally had a reference to the SourceAccount in the Properties block:

  "SourceAccount": { "Ref": "AWS::AccountId" },

This was incorrect. It works after removing that line.

Conclusion

That should be enough to get it all working. There are some serious holes in the capabilities of CloudFormation for working with SNS, and the permission model is a poorly documented mess. The only thing I haven’t covered is how to get the Lambda itself to run, but that’s a topic for another day.

code aws aws-lambda event-driven

Every now and then we find a technology that feels like the future, letting you achieve what you want to without getting in the way. Automatic and dual-clutch (semi-automatic) transmissions fall into this category, and some programmers occasionally get the bug from the latest experimental programming language or when rediscovering functional programming.

I’m getting that next-big-thing feeling from Amazon’s AWS Lambda service. I’ve been building enterprise data munging applications in Java and C++ for years and have done the usual little glue scripts in Python, Ruby, Bash, CMD just like most other software developers. As we all know, the code is the fun part that takes very little time compared to the brain-numbing effort of messing around with operating systems, deployment, scaling considerations, TLS (SSL) termination, fiddling with CORS settings etc.

When you move to cloud-rented servers, the configuration only gets worse, as the dev team has to micromanage details traditionally dealt with by operations staff running a data centre. At this level of abstraction you need to take care of monitoring, availability and fault-tolerance while working out how to scale up and down with the load. And you always end up paying to leave at least one idle server sitting there waiting for something to do.

The beauty of Lambda is that it does most of the boring infrastructure stuff for you magically. You don’t have to think about how many hosts might be running and how to scale them up and down. You don’t have to pay for a running host when the application is sitting idle. You don’t even have to fiddle with the half-working settings of a Java Servlet container or the massive weight of the Spring framework.

All you have to do with Lambda is write a single-purpose application (currently in Java, Python or NodeJS), deploy it and trigger it to run. It can be given an endpoint with API Gateway or receive notifications from other Amazon services or from any SNS topic. You only pay for execution time and the memory capacity reserved while the function runs.
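To give a sense of scale, a Python function for Lambda can be as small as a single handler taking an event and a context (the function and module names here are just whatever you configure as the handler):

def handler(event, context):
    # the event carries the trigger payload (e.g. an SNS message)
    print("received event: {}".format(event))
    return {"status": "ok"}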

The trade-off is obvious. Sacrifice some flexibility in exchange for a simpler life in which you can spend more of your time working on the code and delivering value. The biggest gain from adopting Lambda is in the cost of developers’ time. The business saves money by paying developers to think about, design and write code full-time, rather than spending hours trying to wrap their heads around complex, multi-faceted configuration for a fleet of servers we don’t even need half the time, configuration which inevitably contains faults and goes wrong in production at inconvenient times.

AWS Lambda isn’t best suited to monolithic applications. It works best if you build many small, single-purpose applications. This might mean each endpoint of your traditional monolith has a separate codebase, perhaps sharing a small library of common abstractions. The composable, single-purpose model encourages a scripting-like mindset, so complex frameworks become a hindrance. Even a basic CRUD application quickly decomposes into 4 or 5 tiny scripts. The Python and NodeJS runtimes are ideal for this approach, but it’s easy enough to extract functionality from existing Java monoliths as well.

In summary, AWS Lambda feels like the future of software development in ways that promising new programming languages don’t. It provides a great deal of flexibility within the problem domain while taking away most of the undifferentiated wastes of time in the software development process (internal corporate noise excepted). It encourages keeping delivered software very small and single-purpose, which makes it easy to keep the quality bar fairly high through peer reviews. I’m pretty excited to see how Lambda evolves over the next few years and what the competitors do to up their game.