Authorize Access to WebSocket API Gateway with Cognito

Danila Loginov
7 min read · Jan 22, 2024

Following my article Authorize Access to WebSocket API Gateway with AWS Signature V4, I’d like to show another solution for the same problem, this time using AWS Cognito.

Cognito authorization is not supported out of the box for WebSocket API Gateways, and I haven’t found a simple, straightforward, and reusable end-to-end solution on the Internet. The Lambda authorizer implementation described here doesn’t require AppSync or special AWS SDKs to verify Cognito tokens, since verification can easily be done without them.

Context

Let’s recall the scenario: we have a WebSocket API configured in AWS with API Gateway, backed by a set of simple Lambdas that handle the $connect, $default, and $disconnect routes and communicate with DynamoDB.

Sample WebSocket API Infrastructure View

Problem

Since this is an abstract example, the business logic behind the WebSocket API is not our concern. What’s important is that the WebSocket API is public and has no authorization mechanism implemented so far.

Our goal is to authorize clients’ access to this WebSocket API.

As described in the previous article, the straightforward approach of implementing authorization logic in the WS Connect Function sacrifices quality attributes that could be important in the target architecture: Performance, Scalability, Flexibility, Security, Maintainability, Reusability, and Testability. Please refer to the previous article for the full explanation.

Suggested Solution

As in the previous article, and in line with the separation of concerns principle, this solution decouples the authorization logic from the WebSocket API back-end implementation.

However, unlike last time, instead of using Identity and Access Management (IAM), I suggest utilizing Cognito, the Customer Identity and Access Management platform provided by AWS.

As mentioned above, while it can be easily integrated out-of-the-box with HTTP API Gateways, it is not supported for WebSocket API Gateways. What is supported by all types of API Gateways is Lambda authorizers.

So we need to implement an Authorizer Function that WebSocket API Gateway invokes when a WebSocket connection is being established, and integrate WebSocket API Gateway with it. As with the IAM solution, this is the only change we need in the WebSocket API infrastructure!

The interesting part of this particular solution is how to implement the Authorizer Function that will validate the requests to WebSocket API using Cognito and pass user information to the WebSocket API back-end. Before jumping into the Authorizer Function implementation details, let’s review the additional components required to implement the end-to-end solution.

Updated Infrastructure View

Before requests reach the Authorizer Function, clients are required to get a regular access token (JWT) from Cognito. To make this happen, we first need to set up a Cognito User Pool, as well as a User Pool Client: the application configuration that will be allowed to obtain user data and tokens.

Second, we need to implement a sample web application that triggers the authentication flow using the User Pool Client configuration from the previous step. It can be implemented in different ways, but for a simple end-to-end example, I suggest setting up a web app that redirects users to the Cognito Hosted UI, which provides the Authorization Code flow out of the box.
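To make the redirect step concrete, here is a minimal sketch of how a web app might build the Hosted UI URL that starts the Authorization Code flow. The domain, client ID, and redirect URI below are hypothetical placeholders, and the function name is mine rather than taken from the repository.

```javascript
// Hypothetical values: replace with your own Hosted UI domain and User Pool Client settings.
const loginConfig = {
  hostedUiDomain: 'example.auth.us-east-1.amazoncognito.com',
  clientId: 'exampleClientId',
  redirectUri: 'http://localhost:3000/callback',
};

// Builds the URL of the authorize endpoint with the parameters
// required for the Authorization Code flow (response_type=code).
function buildLoginUrl({ hostedUiDomain, clientId, redirectUri }, scopes = ['openid']) {
  const url = new URL(`https://${hostedUiDomain}/oauth2/authorize`);
  url.searchParams.set('response_type', 'code');
  url.searchParams.set('client_id', clientId);
  url.searchParams.set('redirect_uri', redirectUri);
  url.searchParams.set('scope', scopes.join(' '));
  return url.toString();
}
```

In a browser, assigning the returned string to `window.location.href` would send the user to the Hosted UI.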

After users complete the sign-up or login flow in the Cognito Hosted UI, they are redirected back to the web app, which exchanges the Authorization Code for tokens using the Cognito-provided Token endpoint.
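The code-for-tokens exchange can be sketched as follows. This assumes a public User Pool Client without a client secret and a runtime with a global `fetch` (browsers, Node.js 18+); the function names and the shape of the config object are illustrative, not the repository's actual code.

```javascript
// Builds the POST request for the Token endpoint.
// Kept separate from the network call so it is easy to inspect and test.
function buildTokenRequest({ hostedUiDomain, clientId, redirectUri }, code) {
  return {
    url: `https://${hostedUiDomain}/oauth2/token`,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
      body: new URLSearchParams({
        grant_type: 'authorization_code',
        client_id: clientId,
        code,
        redirect_uri: redirectUri, // must match the URI used in the authorize request
      }).toString(),
    },
  };
}

// Sends the request and returns the parsed token response,
// which includes access_token, id_token, and refresh_token.
async function exchangeCodeForTokens(config, code) {
  const { url, options } = buildTokenRequest(config, code);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Token endpoint responded with status ${response.status}`);
  }
  return response.json();
}
```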

Although it sounds complex, you will find everything needed to complete this setup, including the web app, in the GitHub repository created for this article.

Updated Infrastructure View

Among the different tokens (access, ID, refresh) that the web app obtains, we’re interested in the access token, which clients will pass in the Authorization header (the header name is configurable) along with the request to establish a WebSocket API connection.

When WebSocket API Gateway receives an incoming request, it invokes the Authorizer Function, which verifies the JWT using public keys exposed by Cognito. If the JWT is valid, the function passes the token payload, along with any additional parameters configured, on to the WS Connect Function, which proceeds as before.

Authorizer Function

I provide an example in Node.js, but the logic will be similar in any language you choose for this Lambda. I’ll only cover the main points here; please refer to the GitHub repository for the full implementation.

First, the Authorizer Function gets the header configured as an Identity Source at the WebSocket API Gateway level. In this example the header is configured to mimic the regular HTTP API Gateway authorization following the Authorization: Bearer JWT format, so the function obtains the JWT part and verifies it with the verifyToken(region, userPoolId, token) function:
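The header-parsing step might look like the sketch below. It assumes the authorizer event carries the request headers under `event.headers` and looks the header up case-insensitively, since clients may send `authorization` in lowercase. The function name `extractBearerToken` is hypothetical.

```javascript
// Returns the JWT from an "Authorization: Bearer <JWT>" header, or null if the
// header is missing or does not follow the Bearer format.
function extractBearerToken(event, headerName = 'Authorization') {
  const headers = event.headers || {};
  const key = Object.keys(headers).find(
    (name) => name.toLowerCase() === headerName.toLowerCase(),
  );
  if (!key) {
    return null;
  }
  const match = /^Bearer\s+(.+)$/i.exec(String(headers[key]).trim());
  return match ? match[1] : null;
}
```

The returned token would then be handed to `verifyToken(region, userPoolId, token)`, and a `null` result would lead to a denied connection.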

To verify the token with the help of the jsonwebtoken library, the function first needs the public key in PEM format, looked up by the key ID (kid) encoded in the JWT header.
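Reading the key ID does not require verifying anything yet: the JWT header is just base64url-encoded JSON. The sketch below decodes it with Node.js built-ins only (the function name is hypothetical); jsonwebtoken’s `decode` with the `complete: true` option exposes the same header.

```javascript
// Extracts the "kid" (key ID) from the header of a JWT without verifying it.
// The result must only be used to pick the public key; the token itself is
// still untrusted until its signature has been verified.
function getKeyId(token) {
  const [encodedHeader] = token.split('.');
  const header = JSON.parse(Buffer.from(encodedHeader, 'base64url').toString('utf8'));
  return header.kid;
}
```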

Cognito exposes the public keys for client-side verification at an endpoint of the following format: https://cognito-idp.us-east-1.amazonaws.com/us-east-1_abc123/.well-known/jwks.json. The matching key can be easily fetched from there and PEM-encoded using another library, jwk-to-pem:
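As an illustration of this step, the sketch below fetches the key set and converts the matching key to PEM. Note the substitution: to stay dependency-free it uses Node.js’s built-in `crypto.createPublicKey` instead of the jwk-to-pem library the article relies on. The function names are hypothetical, and a global `fetch` (Node.js 18+) is assumed.

```javascript
const { createPublicKey } = require('node:crypto');

// Builds the JWKS endpoint URL for a given region and User Pool ID.
function buildJwksUrl(region, userPoolId) {
  return `https://cognito-idp.${region}.amazonaws.com/${userPoolId}/.well-known/jwks.json`;
}

// Converts a JWK to a PEM-encoded public key.
// Swapped-in technique: built-in crypto rather than jwk-to-pem.
function jwkToPemWithNodeCrypto(jwk) {
  return createPublicKey({ key: jwk, format: 'jwk' }).export({ type: 'spki', format: 'pem' });
}

// Fetches the key set and returns the PEM for the key with the given ID.
// fetchImpl is injectable so the lookup can be exercised without network access.
async function getPublicKeyPem(region, userPoolId, kid, fetchImpl = fetch) {
  const response = await fetchImpl(buildJwksUrl(region, userPoolId));
  const { keys } = await response.json();
  const jwk = keys.find((key) => key.kid === kid);
  if (!jwk) {
    throw new Error(`No public key found for kid "${kid}"`);
  }
  return jwkToPemWithNodeCrypto(jwk);
}
```

The resulting PEM string is what gets passed to jsonwebtoken for signature verification. Since the keys change rarely, caching them between invocations is a reasonable optimization.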

After the token is validated, the Authorizer Function generates a policy allowing (or denying, based on your business requirements) clients’ access to the WebSocket API Gateway and passes the token payload together with any parameters needed through the context:
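A minimal sketch of the policy generation follows; the function names are illustrative rather than the repository’s. One detail worth handling: API Gateway accepts only strings, numbers, and booleans as context values, so the hypothetical `toContext` helper stringifies anything else, such as arrays.

```javascript
// Flattens a token payload into a context map that API Gateway accepts:
// primitive values are kept, objects and arrays are JSON-stringified.
function toContext(payload) {
  return Object.fromEntries(
    Object.entries(payload).map(([key, value]) => [
      key,
      ['string', 'number', 'boolean'].includes(typeof value) ? value : JSON.stringify(value),
    ]),
  );
}

// Builds the Lambda authorizer response with an Allow or Deny policy
// for the resource being invoked (event.methodArn in the authorizer event).
function generateAuthorizerResponse(effect, methodArn, context = {}, principalId = 'user') {
  return {
    principalId,
    policyDocument: {
      Version: '2012-10-17',
      Statement: [
        {
          Action: 'execute-api:Invoke',
          Effect: effect,
          Resource: methodArn,
        },
      ],
    },
    context,
  };
}
```

For a valid token, the handler would return something like `generateAuthorizerResponse('Allow', event.methodArn, toContext({ ...payload, customPayload: 'customPayloadValue' }))`.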

Let’s take a look at the Authorizer Function output when the JWT is valid:

{
  "context": {
    "sub": "1d7856e9-7bd7-45fe-beb1-77798de950bc",
    "iss": "https://cognito-idp.us-east-1.amazonaws.com/us-east-1_abc123",
    "version": 2,
    "client_id": "281794089m6k93mrbcvsd1o5jn",
    "origin_jti": "323decee-01d5-4c0a-b86e-8fa0e44b88a9",
    "event_id": "7c659f99-9598-4e9e-aba9-ac0088502953",
    "token_use": "access",
    "scope": "openid profile email",
    "auth_time": 1705726879,
    "exp": 1705730479,
    "iat": 1705726880,
    "jti": "834587a8-763b-4c2d-9641-f65d89613767",
    "username": "loginov-rocks",
    "customPayload": "customPayloadValue"
  },
  "policyDocument": {
    "Version": "2012-10-17",
    "Statement": [
      {
        "Action": "execute-api:Invoke",
        "Effect": "Allow",
        "Resource": "arn:aws:execute-api:us-east-1:777845240123:abc123/$default/$connect"
      }
    ]
  },
  "principalId": "user"
}

Because API Gateway passes the context data from the Authorizer Function directly to the WS Connect Function, clients cannot tamper with it.

Moreover, the context data will also be available in every other WebSocket API back-end call, such as to WS Default and WS Disconnect Functions.
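On the back-end side, reading this data could look like the sketch below. It assumes a Lambda proxy integration, where the authorizer context appears under `event.requestContext.authorizer`. The handler and helper names are hypothetical.

```javascript
// Returns the authorizer context attached to the event by API Gateway,
// or an empty object if the route was invoked without one.
function getAuthorizerContext(event) {
  return (event.requestContext && event.requestContext.authorizer) || {};
}

// Hypothetical WS Connect handler: it trusts the context because it comes
// from the Authorizer Function, not from the client.
async function wsConnectHandler(event) {
  const { username, customPayload } = getAuthorizerContext(event);
  console.log(`Connection ${event.requestContext.connectionId} opened by ${username}`, {
    customPayload,
  });
  return { statusCode: 200 };
}
```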

With everything in place, let’s try to establish a connection to the WebSocket API protected by the described Authorizer Function. First, we need to configure the web app to point to the correct User Pool Client and launch it on localhost:

Cognito Web Auth: Step 1

Next, when we click on the “Login” button, the web app redirects the browser to the Cognito Hosted UI, where we need to complete the signup process:

Cognito Web Auth: Step 2

After successful signup or login, we are redirected back to the web app, which exchanges the Authorization Code for the auth data in the background:

Cognito Web Auth: Step 3

From here, we need to copy the access token value and use it in the Authorization header to establish a connection with the WebSocket API from the CLI using wscat:

wscat -c 'wss://abc123.execute-api.us-east-1.amazonaws.com/$default?clientQueryParameter=clientQueryParameterValue' \
-H 'authorization:Bearer eyJraWQiOiJSWmhsTDRLaVpiU1Q3WnRDeVRGT01yYUxjbEh4ZElkN2RaZjBcLzV5V3dVMD0iLCJhbGciOiJSUzI1NiJ9...' \
-H 'clientHeader:clientHeaderValue'

Test

Voilà — successful connection!

Benefits

Similarly to the IAM solution, the Authorizer Function can be extended to obtain user-specific parameters, as well as query parameters, and pass them to the WebSocket back-end, for example, to support use cases where clients do not know upfront which type of connection they need.

This way the business logic in the Authorizer Function can determine the parameters of the request to the WebSocket API, and its back-end will handle the requests appropriately, keeping separation of concerns.

Examples of such parameters controlled by the authorization logic could be user preferences, geographical filters, type of data, time intervals to receive data through WebSockets, and any other custom metadata relevant to your use case.

Other benefits of utilizing the Authorizer Function, shared with the previous approach, include the ability:

  • to deny user access, even if the user token is valid,
  • to be integrated from the front-end web or mobile apps.
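To illustrate the first point, the sketch below shows how an Authorizer Function might decide between Allow and Deny after the signature has already been verified. The claim names come from the access token payload shown earlier; the `blockedUsernames` list is a made-up stand-in for whatever business rule applies, and the function name is hypothetical.

```javascript
// Decides the policy effect for an already verified token payload.
// Checks that the token is an access token issued by the expected User Pool
// for the expected User Pool Client, then applies a sample business rule.
function decideEffect(payload, { issuer, clientId, blockedUsernames = [] }) {
  if (payload.token_use !== 'access') {
    return 'Deny';
  }
  if (payload.iss !== issuer || payload.client_id !== clientId) {
    return 'Deny';
  }
  // Hypothetical business rule: a valid token is not enough for blocked users.
  if (blockedUsernames.includes(payload.username)) {
    return 'Deny';
  }
  return 'Allow';
}
```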

This matters for front-end integration when the front-end already uses Cognito: because the approach mimics the HTTP API Gateway authorization, it enhances interoperability. The front-end does not need to deviate from its existing authorization patterns and can reuse the same access token.

Last but not least, when an Identity Source is configured at the WebSocket API Gateway level, the gateway can drop connection requests that lack the header, avoiding unnecessary calls to the Authorizer Function.

Conclusion

In summary, the proposed solution to authorize clients’ access to a public WebSocket API in AWS involves decoupling the authorization logic from the WebSocket API back-end. By leveraging Cognito authorization in WebSocket API Gateway, the authorization process is centralized, promoting better performance, scalability, flexibility, security, maintainability, reusability, and testability.

The use of Cognito for authorizing WebSocket API requests further enhances security and can serve as a foundation for role-based access control. The separation of concerns principle is upheld by offloading authorization checks to the Authorizer Function, allowing the WS Connect Function to focus on processing WebSocket connections efficiently.

GitHub

Please find the full implementation of the Lambdas and the CloudFormation template for the AWS infrastructure on GitHub: https://github.com/loginov-rocks/WebSocket-API-Gateway-Cognito-Authorizer. It can be useful as a boilerplate for the sample WebSocket API described in the article.

The WS Connect and Disconnect Functions manage connections by putting and deleting the corresponding items in DynamoDB, which also keep any custom headers and query parameters sent by the client, along with the context passed by the Authorizer Function. The WS Default Function echoes users’ messages together with the data stored for that connection in DynamoDB.

That’s all Folks!
