Picture a scenario with a WebSocket API configured in AWS API Gateway, backed by a set of simple Lambdas that handle the $connect, $default, and $disconnect routes and communicate with DynamoDB.
Since this is an abstract example, the business logic behind the WebSocket API is not our concern. What matters is that the WebSocket API is public and has no authorization mechanism yet.
Our goal is to authorize clients’ access to this WebSocket API.
A straightforward approach is to implement the authorization logic in the WS Connect Function. However, to a certain extent this sacrifices quality attributes that could be important in the target architecture:
- Performance: checking authorization in the WS Connect Lambda function may introduce additional latency for each connection request. This could impact the overall performance of your WebSocket API, especially if the authorization logic is complex or involves external calls.
- Scalability: if the authorization logic is resource-intensive or time-consuming, it might limit the scalability of your WebSocket API. As the number of simultaneous connections increases, the WS Connect Function might struggle to handle the load efficiently.
- Flexibility: placing authorization logic directly in the WS Connect Function might make it harder to adapt and modify the authorization process. Any changes or updates to the authorization logic would require modifying the WS Connect Function, potentially leading to a more rigid and less flexible system.
- Security: embedding authorization logic within the WS Connect Function could introduce security risks. If there are vulnerabilities in the authorization process, they might be exposed to potential attackers during the initial connection phase.
- Maintainability: managing and maintaining authorization logic within the WS Connect Function might become challenging over time, especially as the WebSocket API evolves. Separating concerns and using a dedicated authorization mechanism might make the codebase more modular and easier to maintain.
- Reusability: authorization logic implemented in the WS Connect Function might not be easily reusable across different parts of your application. A more centralized and modular approach to authorization could facilitate reuse in other components or services.
- Testability: testing authorization logic within the WS Connect Function might be more challenging compared to testing it in isolation. A decoupled authorization mechanism could provide better testing capabilities.
Following the separation of concerns principle, one possible solution is to decouple the authorization logic from the WebSocket API back-end implementation.
Out of the box, WebSocket API Gateway can be configured to authorize requests using Identity and Access Management (IAM) authorization. And that’s the only change in the WebSocket API infrastructure we need!
The question is how to authorize clients’ requests with IAM. The answer is another out-of-the-box capability provided by API Gateway: signing AWS requests. Luckily, it’s supported not only by HTTP API Gateway but also by WebSocket API Gateway.
The remaining challenge is how to sign requests to WebSocket API Gateway using AWS Signature V4, which is the main driver for this article. But before that, let’s assess how this solution fares against the quality attributes described above:
- Performance: by offloading the authorization logic to IAM, the WS Connect Function can focus on processing WebSocket connections without the additional overhead of complex authorization checks.
- Scalability: IAM authorization is handled at the API Gateway level, allowing it to scale efficiently and independently of the WS Connect Function. This ensures that the WebSocket API can handle a larger number of simultaneous connections without overloading the WS Connect Function.
- Flexibility: separating authorization logic from the WS Connect Function makes it more flexible and adaptable. Changes or updates to the authorization process can be made centrally in IAM without requiring modifications to the WS Connect Function. This promotes a more agile and easily modifiable system.
- Security: IAM provides a robust and secure authorization mechanism. The connection phase becomes more secure by leveraging AWS Signature V4 for WebSocket API requests. IAM’s role-based access control adds an extra layer of security, reducing the risk of vulnerabilities in the authorization process.
- Maintainability: with IAM handling authorization, the WebSocket API codebase becomes more modular and easier to maintain. Authorization logic is centralized and managed separately, simplifying the overall codebase and making it more maintainable over time.
- Reusability: the authorization logic implemented separately can be reused across different parts of the application. Since IAM is a centralized service, it facilitates the reuse of authorization mechanisms in other components or services, promoting a more modular and reusable design.
- Testability: testing the authorization logic becomes more straightforward when it’s decoupled from the WS Connect Function. IAM provides testing capabilities for authorization independently of the WebSocket API implementation, leading to more effective and focused testing.
Updated Infrastructure View
To implement the suggested solution we need a separate component authorizing user access and generating URLs for clients to access WebSocket API. For example, it can be a simple Lambda with a single responsibility of generating the URL to WebSocket API and signing it with AWS Signature V4 — let’s name it Sign Function.
Access to the Sign Function can be exposed through creating a Lambda Function URL or proxying through a separate HTTP API Gateway, which can be beneficial in case access to that Sign Function should be protected as well with authorization at the API Gateway layer, such as Cognito, JWT or another Lambda authorizer. For the sake of consistency, I would continue with a separate HTTP API Gateway in front of the Sign Function.
When signing the request to WebSocket API Gateway, we need to have an identity, and again a simple example can be the identity of the Sign Function itself. In this case, we would need to configure access to WebSocket API Gateway in the Sign Function role policy.
Nota bene! A more correct approach would be to create a separate identity and utilize its credentials when signing the requests.
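Whichever identity ends up signing the requests, its policy needs permission to invoke the WebSocket API. Below is a minimal sketch of such a policy document built in Node.js. The function name buildWebSocketInvokePolicy and all the values passed to it are placeholders of mine, and the wildcard at the end of the resource covers all routes of the stage, which you may want to narrow down (for example to $connect only).

```javascript
// Builds an IAM policy document that allows invoking a WebSocket API.
// The function name and every value passed in are illustrative placeholders.
function buildWebSocketInvokePolicy({ region, accountId, apiId, stage }) {
  return {
    Version: '2012-10-17',
    Statement: [
      {
        Effect: 'Allow',
        Action: 'execute-api:Invoke',
        // Resource format: arn:aws:execute-api:region:account-id:api-id/stage/route-key
        Resource: `arn:aws:execute-api:${region}:${accountId}:${apiId}/${stage}/*`,
      },
    ],
  };
}

console.log(JSON.stringify(buildWebSocketInvokePolicy({
  region: 'us-east-1',
  accountId: '123456789012',
  apiId: 'abc123',
  stage: 'development',
}), null, 2));
```

The resulting object can be attached to the Sign Function role (or to the separate identity) in whatever way your infrastructure code prefers.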
In the updated infrastructure, clients are required to first reach out to the Sign Function to obtain a signed request. The Sign Function, using its credentials or a separate identity’s credentials, generates a signature for the request, including necessary parameters, and returns it to the clients.
After that, clients proceed with communication to WebSocket API by including the signed request in their WebSocket connection attempts. WebSocket API Gateway validates the signature provided with the request through IAM and passes the connection request to WS Connect Function, which proceeds as before.
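The two-step flow on the client side could look roughly like the sketch below. It assumes the Sign Function responds with JSON of the shape { url, headers }, and that the WebSocket constructor passed in accepts a headers option the way the ws package for Node.js does. The function name connectWithSignedRequest and the endpoint are hypothetical.

```javascript
// Step 1: ask the Sign Function for a signed request.
// Step 2: open the WebSocket connection using the signed URL and headers.
// fetchImpl and WebSocketImpl are injected so the flow is easy to test.
async function connectWithSignedRequest(signEndpoint, { fetchImpl = fetch, WebSocketImpl }) {
  const response = await fetchImpl(signEndpoint);
  if (!response.ok) {
    throw new Error(`Sign Function responded with status ${response.status}`);
  }
  // Assumed response shape: { url: 'wss://...', headers: { authorization: '...', ... } }
  const { url, headers } = await response.json();
  return new WebSocketImpl(url, { headers });
}

// Possible usage with the ws package (not bundled with Node.js):
// const WebSocket = require('ws');
// const socket = await connectWithSignedRequest('https://example.com/sign', { WebSocketImpl: WebSocket });
```

Injecting the constructor keeps the sketch independent of a particular WebSocket client library.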
Now that we have figured out the infrastructure part of the solution, let’s take a look at how the Sign Function can be implemented. I provide an example in Node.js, but the logic will be similar in any language you choose for this Lambda.
Clients can make a request providing the WebSocket API URL they want to be signed, or the Sign Function can return it preconfigured. Details of the HTTP request handling aside (you can find them in the GitHub repository), the part that concerns us is how to sign the URL:
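As an illustration, here is a minimal sketch of Signature V4 header signing that uses only Node.js built-ins. It is not the code from the repository: the name signWebSocketUrl, the shape of its arguments, and the choice to sign x-amz-content-sha256 are my assumptions, and in a real project an SDK signer such as SignatureV4 from @smithy/signature-v4 would typically do this work.

```javascript
const crypto = require('node:crypto');

const sha256Hex = (data) => crypto.createHash('sha256').update(data, 'utf8').digest('hex');
const hmac = (key, data) => crypto.createHmac('sha256', key).update(data, 'utf8').digest();
// SigV4 requires RFC 3986 encoding, which is stricter than encodeURIComponent.
const encodeRfc3986 = (value) =>
  encodeURIComponent(value).replace(/[!'()*]/g, (c) => `%${c.charCodeAt(0).toString(16).toUpperCase()}`);
const byNameThenValue = ([n1, v1], [n2, v2]) =>
  (n1 === n2 ? (v1 < v2 ? -1 : v1 > v2 ? 1 : 0) : n1 < n2 ? -1 : 1);

function signWebSocketUrl({ url, query = {}, headers = {}, credentials, region, now = new Date() }) {
  const parsed = new URL(url);
  // Keep query parameters already encoded in the URL and add the extra ones.
  for (const [name, value] of Object.entries(query)) parsed.searchParams.set(name, value);

  const amzDate = now.toISOString().replace(/[:-]|\.\d{3}/g, ''); // e.g. 20231007T030855Z
  const scope = `${amzDate.slice(0, 8)}/${region}/execute-api/aws4_request`;
  const payloadHash = sha256Hex(''); // the handshake is a GET with an empty body

  // Headers to sign: lowercase names, trimmed values, sorted by name.
  const toSign = { host: parsed.host, 'x-amz-date': amzDate, 'x-amz-content-sha256': payloadHash };
  if (credentials.sessionToken) toSign['x-amz-security-token'] = credentials.sessionToken;
  for (const [name, value] of Object.entries(headers)) {
    toSign[name.toLowerCase()] = String(value).trim().replace(/\s+/g, ' ');
  }
  const names = Object.keys(toSign).sort();
  const canonicalHeaders = names.map((name) => `${name}:${toSign[name]}\n`).join('');
  const signedHeaders = names.join(';');

  const canonicalUri = parsed.pathname.split('/').map(encodeRfc3986).join('/');
  const canonicalQuery = [...parsed.searchParams.entries()]
    .map(([name, value]) => [encodeRfc3986(name), encodeRfc3986(value)])
    .sort(byNameThenValue)
    .map(([name, value]) => `${name}=${value}`)
    .join('&');

  const canonicalRequest = ['GET', canonicalUri, canonicalQuery, canonicalHeaders, signedHeaders, payloadHash].join('\n');
  const stringToSign = ['AWS4-HMAC-SHA256', amzDate, scope, sha256Hex(canonicalRequest)].join('\n');

  // Derive the signing key: secret -> date -> region -> service -> "aws4_request".
  const signingKey = scope.split('/').reduce((key, part) => hmac(key, part), `AWS4${credentials.secretAccessKey}`);
  const signature = hmac(signingKey, stringToSign).toString('hex');

  const { host, ...requestHeaders } = toSign; // the WebSocket client sets Host itself
  requestHeaders.authorization =
    `AWS4-HMAC-SHA256 Credential=${credentials.accessKeyId}/${scope}, SignedHeaders=${signedHeaders}, Signature=${signature}`;
  const signedUrl = `${parsed.protocol}//${parsed.host}${parsed.pathname}${canonicalQuery ? `?${canonicalQuery}` : ''}`;
  return { url: signedUrl, headers: requestHeaders };
}

// Example call with placeholder credentials (not real keys).
const example = signWebSocketUrl({
  url: 'wss://abc123.execute-api.us-east-1.amazonaws.com/development?clientQueryParameter=clientQueryParameterValue',
  query: { customQueryParameter: 'customQueryParameterValue' },
  headers: { clientHeader: 'clientHeaderValue' },
  credentials: { accessKeyId: 'AKIDEXAMPLE', secretAccessKey: 'placeholder-secret' },
  region: 'us-east-1',
});
console.log(example.url);
```

In a Lambda, the credentials would normally come from the execution role (or from the separate identity mentioned above) rather than being passed in by hand.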
This function expects an AWS url similar to wss://abc123.execute-api.us-east-1.amazonaws.com/$default, and optionally Objects with query parameters and headers to be added to the request. This function also respects the existing query parameters encoded in the url.
After the request is signed it returns the composed URL (with query parameters added, if any) and an Object with headers that should be set when making the request:
"authorization": "AWS4-HMAC-SHA256 Credential=ASIAWT7PK3CR3PLXHJQ2/20231007/us-east-1/execute-api/aws4_request, SignedHeaders=fourth;host;third;x-amz-content..."
Signature V4 includes headers and query parameters, making it resistant to tampering: if somebody tries to mess with the headers or parameters, the signature won’t match, and the request authorization will fail at the WebSocket API Gateway level, avoiding invoking the WS Connect Function.
Let’s try to make a signed request from the CLI using wscat:
wscat -c 'wss://abc123.execute-api.us-east-1.amazonaws.com/development?clientQueryParameter=clientQueryParameterValue&customQueryParameter=customQueryParameterValue' \
-H 'x-amz-date:20231007T030855Z' \
-H 'x-amz-security-token:IQoJb3JpZ2luX2VjEHsaCXVzLWVhc3QtMSJHMEUCIBlnjS3aHywkkKwGgC+avqv7czYBJtgmOOFrLoZ76QAiAiEAteWpLJEMvlaQLkHDiKSRbK3kuliale9VJdUuJ8aJ6oIq...' \
-H 'x-amz-content-sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855' \
-H 'authorization:AWS4-HMAC-SHA256 Credential=ASIAWT7PK3CR3PLXHJQ2/20231007/us-east-1/execute-api/aws4_request, SignedHeaders=fourth;host;third;x-amz-content...' \
-H 'clientHeader:clientHeaderValue'
Voilà — successful connection!
One not-so-obvious benefit of this solution, which was important in my use case, is that the Sign Function can be extended to add query parameters to the URL or headers to the request. These values are available in the context of the WS Connect Function, and clients cannot change them: the signature covers the whole URL and the signed headers, so any modification invalidates it.
Imagine a use case where clients do not know upfront what type of connection they need. The business logic in the Sign Function can then determine the parameters of the request to the WebSocket API, and its back-end will handle the requests accordingly, preserving the separation of concerns.
Examples of such parameters controlled by the authorization logic could be user preferences, geographical filters, type of data, time intervals to receive data through WebSockets, and any other custom metadata relevant to your use case.
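To make this more tangible, here is a small sketch of how the Sign Function might decide on such parameters before signing. Everything in it is hypothetical: the function name buildSignedQuery, the user fields plan and country, and the resulting parameters dataType and region are made up for illustration.

```javascript
// Hypothetical rules: the Sign Function derives connection parameters from
// what it knows about the authenticated user. None of these names come from
// the repository; they only illustrate the idea.
function buildSignedQuery(clientQuery, user) {
  return {
    ...clientQuery,
    // Server-controlled values are spread last, so whatever the client sent
    // under the same names is overridden before the URL is signed.
    dataType: user.plan === 'premium' ? 'realtime' : 'delayed',
    region: user.country,
  };
}

console.log(buildSignedQuery({ topic: 'news', dataType: 'realtime' }, { plan: 'free', country: 'DE' }));
```

The resulting object is what would be passed on as the query parameters to sign, so the WS Connect Function can trust these values.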
In summary, the proposed solution to authorize clients’ access to a public WebSocket API in AWS involves decoupling the authorization logic from the WebSocket API back-end. By leveraging IAM authorization in WebSocket API Gateway, the authorization process is centralized, promoting better performance, scalability, flexibility, security, maintainability, reusability, and testability.
The use of AWS Signature V4 for signing WebSocket API requests further enhances security and ensures a robust role-based access control mechanism. The separation of concerns principle is upheld by offloading authorization checks to IAM, allowing the WS Connect Function to focus on processing WebSocket connections efficiently.
Please find the full implementation behind Lambdas and CloudFormation template for AWS infrastructure on GitHub: https://github.com/loginov-rocks/WebSocket-API-Gateway-IAM-Signer — it can be useful as a boilerplate for the sample WebSocket API described in the article.
WS Connect and Disconnect Functions manage connections by putting and deleting the corresponding items in DynamoDB, which will also keep any of the custom headers and query parameters set by the Sign Function. WS Default Function echoes users’ messages with data stored for that connection in DynamoDB.
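As a rough sketch of the connect side, the item stored for a connection could be derived from the $connect event like this. The attribute names connectionId, headers, and query are my assumptions and the repository may use different ones. The event fields used (requestContext.connectionId, headers, queryStringParameters) are those of the Lambda proxy event for WebSocket APIs.

```javascript
// Maps a $connect event to the item stored for the connection.
// Attribute names are illustrative; the actual table schema may differ.
function buildConnectionItem(event) {
  return {
    connectionId: event.requestContext.connectionId,
    // Headers and query parameters arrive already validated by the IAM
    // authorizer, including anything the Sign Function added before signing.
    headers: event.headers ?? {},
    query: event.queryStringParameters ?? {},
  };
}

console.log(buildConnectionItem({
  requestContext: { connectionId: 'example-connection-id' },
  headers: { clientHeader: 'clientHeaderValue' },
  queryStringParameters: { customQueryParameter: 'customQueryParameterValue' },
}));
```

The WS Connect Function would then put this item into DynamoDB, and the WS Disconnect Function would delete it by connectionId.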
That’s all Folks!
Update on January 21, 2024
Please refer to my article Authorize Access to WebSocket API Gateway with Cognito describing another solution for the same problem utilizing Cognito instead of IAM.
Update on February 14, 2024
The Sign Function code was updated to sign requests using the query parameters authentication method. This is the only authentication method available when establishing a WebSocket connection from web applications, because the browser WebSocket API does not allow setting custom request headers.
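For reference, a minimal sketch of Signature V4 query parameters authentication with Node.js built-ins is shown below. It is not the updated code from the repository: the name presignWebSocketUrl, the argument shape, and the 60-second default expiry are assumptions, and an SDK signer (for example the presign method of SignatureV4 from @smithy/signature-v4) would normally be used instead.

```javascript
const crypto = require('node:crypto');

const sha256Hex = (data) => crypto.createHash('sha256').update(data, 'utf8').digest('hex');
const hmac = (key, data) => crypto.createHmac('sha256', key).update(data, 'utf8').digest();
// SigV4 requires RFC 3986 encoding, which is stricter than encodeURIComponent.
const encodeRfc3986 = (value) =>
  encodeURIComponent(value).replace(/[!'()*]/g, (c) => `%${c.charCodeAt(0).toString(16).toUpperCase()}`);
const byNameThenValue = ([n1, v1], [n2, v2]) =>
  (n1 === n2 ? (v1 < v2 ? -1 : v1 > v2 ? 1 : 0) : n1 < n2 ? -1 : 1);

function presignWebSocketUrl({ url, credentials, region, expiresIn = 60, now = new Date() }) {
  const parsed = new URL(url);
  const amzDate = now.toISOString().replace(/[:-]|\.\d{3}/g, ''); // e.g. 20240214T000000Z
  const scope = `${amzDate.slice(0, 8)}/${region}/execute-api/aws4_request`;

  // Authentication data travels in the query string instead of headers.
  const params = [
    ...parsed.searchParams.entries(),
    ['X-Amz-Algorithm', 'AWS4-HMAC-SHA256'],
    ['X-Amz-Credential', `${credentials.accessKeyId}/${scope}`],
    ['X-Amz-Date', amzDate],
    ['X-Amz-Expires', String(expiresIn)],
    ['X-Amz-SignedHeaders', 'host'],
  ];
  if (credentials.sessionToken) params.push(['X-Amz-Security-Token', credentials.sessionToken]);

  const canonicalQuery = params
    .map(([name, value]) => [encodeRfc3986(name), encodeRfc3986(value)])
    .sort(byNameThenValue)
    .map(([name, value]) => `${name}=${value}`)
    .join('&');
  const canonicalUri = parsed.pathname.split('/').map(encodeRfc3986).join('/');

  // Only the Host header is signed; the payload of the GET handshake is empty.
  const canonicalRequest = ['GET', canonicalUri, canonicalQuery, `host:${parsed.host}\n`, 'host', sha256Hex('')].join('\n');
  const stringToSign = ['AWS4-HMAC-SHA256', amzDate, scope, sha256Hex(canonicalRequest)].join('\n');

  // Derive the signing key: secret -> date -> region -> service -> "aws4_request".
  const signingKey = scope.split('/').reduce((key, part) => hmac(key, part), `AWS4${credentials.secretAccessKey}`);
  const signature = hmac(signingKey, stringToSign).toString('hex');

  return `${parsed.protocol}//${parsed.host}${parsed.pathname}?${canonicalQuery}&X-Amz-Signature=${signature}`;
}

// Example call with placeholder credentials (not real keys).
console.log(presignWebSocketUrl({
  url: 'wss://abc123.execute-api.us-east-1.amazonaws.com/development',
  credentials: { accessKeyId: 'AKIDEXAMPLE', secretAccessKey: 'placeholder-secret' },
  region: 'us-east-1',
}));
```

A web application can then pass the returned URL straight to new WebSocket(url) without setting any headers.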