Summarizing and Manipulating Your Data

This tutorial describes how to use tinybio, with tiny intern enabled, to quickly understand CSV data. Specifically, we'll demonstrate how to:

  • Upload the data you would like to analyze to the workbench
  • Use packages like pandas and numpy to understand your data
  • Download any images or CSVs that you may have produced, along with the code that made them

Uploading Your Data

You can analyze any datasets that you have uploaded to your workbench using tinybio. To get started simply:

  1. Click on the cloud icon next to the chat window.
  2. Click on the upload button next to the chat window.
  3. Select the files that you would like to upload. Please note that the instance is best suited to working with counts data or individual read files. It is not set up well for alignment or feature counting.

Running The Analysis

You can start your analysis by simply asking tinybio to do it via chat.

For the purposes of our tutorial we'll use the tutorial dataset that we have included with your workbench called wisconsin_breast_cancer_dataset.csv. We'll simply ask:

  • "can you please describe the wisconsin cancer dataset that you have access to in the workbench?"

tinybio will now do the following:

  • Write a Python script to a file, likely using pandas, to help understand the file contents.
  • Use a bash command with python to run the file that it just wrote.
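
As a rough illustration of the kind of script this step might produce, here is a minimal sketch using pandas. The summarize helper and the three-row stand-in table (including its column names) are assumptions made for illustration; they are not the code tinybio actually writes, nor the real contents of the tutorial dataset.

```python
import io

import pandas as pd


def summarize(df: pd.DataFrame) -> dict:
    """Collect a few basic facts about a table (hypothetical helper)."""
    return {
        "n_rows": len(df),
        "n_columns": df.shape[1],
        "dtypes": df.dtypes.astype(str).to_dict(),
        "missing": df.isna().sum().to_dict(),   # missing values per column
        "numeric_summary": df.describe(),       # count, mean, std, quartiles
    }


# Tiny stand-in table; these column names are illustrative only.
sample_csv = io.StringIO(
    "id,diagnosis,radius_mean\n"
    "1,M,17.99\n"
    "2,B,13.54\n"
    "3,B,\n"
)
df = pd.read_csv(sample_csv)

overview = summarize(df)
print(overview["n_rows"], "rows,", overview["n_columns"], "columns")
print(overview["missing"])
print(overview["numeric_summary"])
```

In the workbench, the same helper could be pointed at the uploaded file by replacing the stand-in table with pd.read_csv("wisconsin_breast_cancer_dataset.csv"), assuming the file sits in the script's working directory.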

If tinybio made a mistake in the file that it wrote, it will try again, writing a new file that incorporates the feedback it got from the error.
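
The write, run, and retry cycle described above can be sketched as a small loop. This is only an illustration of the idea under stated assumptions: run_with_retries and toy_writer are hypothetical names, the toy writer stands in for the assistant, and the script is launched directly with the Python interpreter rather than through a bash command. It does not show tinybio's internals.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Callable, Optional


def run_with_retries(
    write_script: Callable[[Optional[str]], str], max_attempts: int = 3
) -> str:
    """Write a script, run it, and on failure rewrite it using the error text."""
    feedback: Optional[str] = None
    with tempfile.TemporaryDirectory() as workdir:
        script_path = Path(workdir) / "analysis.py"
        for _ in range(max_attempts):
            # Ask the writer for a (new) script, passing along the last error.
            script_path.write_text(write_script(feedback))
            result = subprocess.run(
                [sys.executable, str(script_path)],
                capture_output=True,
                text=True,
            )
            if result.returncode == 0:
                return result.stdout
            feedback = result.stderr  # the traceback becomes the feedback
    raise RuntimeError(
        f"script still failing after {max_attempts} attempts:\n{feedback}"
    )


def toy_writer(feedback: Optional[str]) -> str:
    # Hypothetical stand-in for the assistant: the first draft has a bug,
    # and the second draft is written once an error has been reported.
    if feedback is None:
        return "print(undefined_name)\n"
    return "print('rows: 3')\n"


output = run_with_retries(toy_writer)
print(output)
```

Capping the number of attempts keeps a script that can never succeed from looping forever.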

Now that we have an initial overview of the file, we can ask more advanced questions and request analyses like:

  • "can you build me a model that predicts which factors are most predictive whether a tumor is malignant or benign?"

tinybio will do the following:

  • Write a Python script using sklearn and pandas to load the data, transform it, scale the features, train a logistic regression, write out the predictions, and save the output to a CSV.
  • Use a bash command with python to run the file that it just wrote.
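
To make these steps concrete, here is a minimal sketch of such a script using sklearn and pandas. Everything specific in it is an assumption for illustration: the rank_features helper, the synthetic data, and the column names (signal_feature, noise_feature, diagnosis) are hypothetical, and ranking features by the absolute size of their coefficients on standardized inputs is just one simple way to gauge which factors matter. The script tinybio generates may differ.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def rank_features(df: pd.DataFrame, target: str, positive_label: str):
    """Scale features, fit a logistic regression, rank features by |coefficient|."""
    X = df.drop(columns=[target])
    y = (df[target] == positive_label).astype(int)  # 1 = positive class

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0, stratify=y
    )

    # Fit the scaler on the training split only, then reuse it for the test split.
    scaler = StandardScaler().fit(X_train)
    model = LogisticRegression(max_iter=1000)
    model.fit(scaler.transform(X_train), y_train)

    predictions = pd.DataFrame(
        {
            "actual": y_test.to_numpy(),
            "predicted": model.predict(scaler.transform(X_test)),
        },
        index=X_test.index,
    )

    coefs = pd.Series(model.coef_[0], index=X.columns)
    ranking = coefs.reindex(coefs.abs().sort_values(ascending=False).index)
    return ranking, predictions


# Synthetic stand-in data: one informative column and one pure-noise column.
rng = np.random.default_rng(0)
n = 200
signal = rng.normal(size=n)
df = pd.DataFrame(
    {
        "signal_feature": signal,
        "noise_feature": rng.normal(size=n),
        "diagnosis": np.where(signal + 0.1 * rng.normal(size=n) > 0, "M", "B"),
    }
)

ranking, predictions = rank_features(df, target="diagnosis", positive_label="M")
print(ranking)
print(predictions.head())
```

Saving the output is then one more call, for example predictions.to_csv("predictions.csv", index=False), where the file name is again only a placeholder.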

Again, if tinybio made a mistake in the file that it wrote, it will try again, writing a new file that incorporates the feedback it got from the error.

Once the analysis is finished, you can download the data that was generated and the scripts that generated it. If you would like to make changes to the file, simply ask the

Downloading The Results

To download the figures and output that you have produced, open your workbench by clicking on the cloud icon next to the chat. This opens the workbench window.

You can download both the script that generated the results of your analysis and any figures that it generated.

If you would like to see the full list of packages that we have installed, see here: tinybio intern packages list