
📦 Project Spaces, Storage, and Available Software and Analysis Tools

Project Spaces

Project owners can request new projects using this form. To add team members to a project space, contact research@hbs.edu. Note that you cannot request a project space or be added to an existing project until you have logged into the RCP at least once.

The landing page of the RCP displays tiles with all projects that you have access to:

image

Clicking on a project will bring you to the project workbench. The workbench displays active sessions on the top half of the page and available launchers, each corresponding to different analysis tools, on the bottom half of the page:

image

Storage

S3 Bucket

Each project has shared S3 storage available to all members of the project. This can be accessed by clicking on the "Files" tab to the right of the Workbench:

image

Please see our documentation about Transferring Files to learn more about using this feature.

Database

The RCP offers database capabilities through a back-end connection to Amazon Aurora. Connection parameters, including the username, password, and hostname, can be obtained by clicking on the "Service Credentials" box in the upper right-hand side of the Workbench. Clicking on the box will reveal the connection parameters:

image

Connecting to an Existing Database

See below for sample code to connect to your database using Python. If you prefer to use R, please contact RCS for customized instructions.

Python

Using the mysql-connector-python package (imported as mysql.connector):

import mysql.connector

# Connect to the database using the values from Service Credentials
conn = mysql.connector.connect(
    user='username',
    password='password',
    host='endpoint',
    database='databasename',
)
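
Once connected, queries are run through a cursor. The following is a minimal sketch, assuming the mysql-connector-python package is available in your launcher. The helper names `connection_settings` and `fetch_rows` are hypothetical and used only for illustration, and the placeholder values should be replaced with those shown under Service Credentials.

```python
def connection_settings(user, password, host, database):
    """Bundle the Service Credentials values into keyword arguments for connect()."""
    return {"user": user, "password": password, "host": host, "database": database}


def fetch_rows(settings, query, params=None):
    """Run a query and return all rows, always closing the connection afterwards."""
    # Imported inside the function so the settings helper works without the driver
    import mysql.connector

    conn = mysql.connector.connect(**settings)
    try:
        cursor = conn.cursor()
        cursor.execute(query, params)
        rows = cursor.fetchall()
        cursor.close()
        return rows
    finally:
        conn.close()


# Placeholder values, as in the sample above
settings = connection_settings("username", "password", "endpoint", "databasename")

# Example call (requires a reachable database and an existing table):
# rows = fetch_rows(settings, "SELECT * FROM my_table LIMIT 5")
```

Closing the connection in a `finally` block means it is released even if the query raises an error.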

Creating a New Database

Database Naming Requirements

Your database name must start with a letter, and can only consist of letters, numbers, or underscores!
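
The naming rule above can be checked before attempting to create a database. This is a small sketch using only the Python standard library. The function name `is_valid_db_name` is hypothetical, and the pattern assumes "letters" means the ASCII letters A-Z and a-z.

```python
import re

# Starts with a letter, followed by letters, digits, or underscores only
# (assumption: "letters" means ASCII letters)
DB_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_db_name(name):
    """Return True if name satisfies the naming rule described above."""
    return DB_NAME_PATTERN.fullmatch(name) is not None


for candidate in ["NewDatabase", "sales_2024", "2024_sales", "my-db"]:
    print(candidate, is_valid_db_name(candidate))
```

Here the first two candidates pass, while `2024_sales` fails because it starts with a digit and `my-db` fails because it contains a hyphen.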

There are several ways to create a new database from a DataFrame, which may have been loaded from various file formats (such as CSV or Parquet). Below is sample Python code demonstrating one approach.

Python

Using the sqlalchemy Python package (the mysql+pymysql connection URL below also requires the pymysql driver):

from sqlalchemy import create_engine, text

# Define your Aurora cluster credentials and database name
# The endpoint comes from the Service Credentials box
AURORA_ENDPOINT = "endpoint"
DB_USER = "username"
DB_PASSWORD = "password"
# Database name of your choosing. It must start with a letter and can
# only consist of letters, numbers, or underscores
NEW_DB_NAME = "NewDatabase"

try:
    # Create a SQLAlchemy engine
    engine = create_engine(f'mysql+pymysql://{DB_USER}:{DB_PASSWORD}@{AURORA_ENDPOINT}')

    # Create the database
    with engine.connect() as connection:
        connection.execute(text(f"CREATE DATABASE IF NOT EXISTS {NEW_DB_NAME}"))

    # Add the DataFrame to the database
    # (df is assumed to be an existing pandas DataFrame, e.g. loaded from a CSV or Parquet file)
    df.to_sql(name='my_table', con=engine, schema=NEW_DB_NAME, if_exists='replace', index=False)

except Exception as e:
    print(f"An error occurred: {e}")
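
The connection URL in the sample above embeds the username and password directly. If a password contains characters such as `@` or `/`, the URL may be parsed incorrectly, so it can help to percent-encode the credentials first. The sketch below uses only the Python standard library. The helper name `build_mysql_url` is hypothetical, and the credential values are placeholders.

```python
from urllib.parse import quote


def build_mysql_url(user, password, endpoint, database=None):
    """Build a mysql+pymysql URL with the user and password percent-encoded."""
    # safe="" ensures characters such as "/" and "@" are encoded as well
    url = f"mysql+pymysql://{quote(user, safe='')}:{quote(password, safe='')}@{endpoint}"
    if database:
        url += f"/{database}"
    return url


# Placeholder credentials; the password deliberately contains "@" and "/"
print(build_mysql_url("username", "p@ss/word", "endpoint"))
```

The resulting string can be passed to `create_engine` in place of the f-string used in the sample.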

Available Software and Analysis Tools

The RCP launchers feature the most commonly used research software and analysis tools, including RStudio, Spyder, VS Code, and Stata. Where applicable, each launcher is preloaded with commonly used packages and modules.

Installing Packages or Modules

Important

Please note that the packages and modules you install are only available within the launcher where you installed them, not across the project's other launchers. If you terminate the launcher, these packages and modules will be deleted.

If the package that you need to use is not preloaded, you can install it using the usual commands.

Installing R Packages

Running the standard install.packages() command from within RStudio will download and install the specified packages.

install.packages('somepkg')

Installing Python Modules

Python modules can be installed using the pip install command:

pip install some_module

To upgrade a module that is already installed, add the --upgrade option:

pip install --upgrade some_module
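
Because installed modules are deleted when a launcher is terminated, it can be useful to check at the top of a script which modules are missing before running an analysis. The following is a minimal sketch using only the Python standard library. The function name `missing_modules` is hypothetical, and it handles top-level import names only. Keep in mind that a module's import name can differ from the name used with pip install.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# 'json' ships with Python; 'some_module' is the placeholder name used above
needed = ["json", "some_module"]
for name in missing_modules(needed):
    print(f"Missing: {name} (try: pip install {name})")
```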