📦 Project Spaces, Storage, and Available Software and Analysis Tools
Project Spaces
Project owners can request new projects using this form. To add team members to a project space, contact research@hbs.edu. Note that you cannot request a project space or be added to an existing project until you have logged in to the RCP at least once.
The landing page of the RCP displays tiles with all projects that you have access to:
Clicking on a project will bring you to the project workbench. The workbench displays active sessions on the top half of the page and available launchers, each corresponding to different analysis tools, on the bottom half of the page:
Storage
S3 Bucket
Each project has shared S3 storage that is available to all project members. This can be accessed by clicking on the "Files" tab to the right of the Workbench:
Please see our documentation about Transferring Files to learn more about using this feature.
Database
The RCP offers database capabilities through a back-end connection to Amazon Aurora. Connection parameters, including the username, password, and hostname, can be obtained by clicking on the "Service Credentials" box that appears in the upper right-hand side of the Workbench. Clicking on the box will reveal the connection parameters:
Connecting to an Existing Database
See below for sample code to connect to your database using Python. If you prefer to use R, please contact RCS for customized instructions.
Python
Using the mysql-connector-python package (imported as mysql.connector):
import mysql.connector

# Connect to the database using the values shown under "Service Credentials"
conn = mysql.connector.connect(
    user='username',
    password='password',
    host='endpoint',
    database='databasename',
)
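To keep the Service Credentials out of scripts that you share with others, one option is to read them from environment variables instead of typing them into the code. The following is a minimal sketch, not an RCP-provided mechanism: the variable names RCP_DB_USER, RCP_DB_PASSWORD, RCP_DB_HOST, and RCP_DB_NAME and the helper load_db_params are hypothetical, and you would need to set the variables yourself in your session.

```python
import os

# Hypothetical environment variable names; the RCP does not set these for you.
ENV_TO_PARAM = {
    "RCP_DB_USER": "user",
    "RCP_DB_PASSWORD": "password",
    "RCP_DB_HOST": "host",
    "RCP_DB_NAME": "database",
}


def load_db_params(environ=os.environ):
    """Build keyword arguments for mysql.connector.connect() from the environment."""
    missing = [name for name in ENV_TO_PARAM if not environ.get(name)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {param: environ[name] for name, param in ENV_TO_PARAM.items()}
```

With the variables set, the connection call becomes conn = mysql.connector.connect(**load_db_params()).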
Creating a New Database
Database Naming Requirements
Your database name must start with a letter and can contain only letters, numbers, and underscores.
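The naming rule above can be checked in code before you attempt to create a database. This is a small sketch under one assumption: that "letters" means the ASCII letters A-Z and a-z. The helper name is_valid_database_name is illustrative and not part of the RCP.

```python
import re

# Starts with a letter, followed by letters, digits, or underscores only.
# Assumption: "letters" means ASCII letters.
_DB_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_database_name(name: str) -> bool:
    """Return True if `name` satisfies the database naming requirements."""
    return _DB_NAME_PATTERN.fullmatch(name) is not None


for candidate in ["NewDatabase", "my_db2", "1db", "my-db"]:
    print(candidate, is_valid_database_name(candidate))
```

Checking the name first is also a reasonable safeguard when the name is later inserted into a SQL statement as plain text.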
There are several ways to create a new database from a DataFrame, which may have been loaded from various file formats (such as CSV or Parquet). Below is sample Python code demonstrating one approach.
Python
Using the SQLAlchemy package with the PyMySQL driver:
from sqlalchemy import create_engine, text

# Define your Aurora cluster credentials and database name
AURORA_ENDPOINT = "endpoint"  # The endpoint from the Service Credentials
DB_USER = "username"
DB_PASSWORD = "password"
# Database name of your choosing; it must start with a letter and contain
# only letters, numbers, or underscores
NEW_DB_NAME = "NewDatabase"

try:
    # Create a SQLAlchemy engine
    engine = create_engine(f"mysql+pymysql://{DB_USER}:{DB_PASSWORD}@{AURORA_ENDPOINT}")

    # Create the database
    with engine.connect() as connection:
        connection.execute(text(f"CREATE DATABASE IF NOT EXISTS {NEW_DB_NAME}"))

    # Add the DataFrame to the database
    # (df is a pandas DataFrame that you have already loaded)
    df.to_sql(name="my_table", con=engine, schema=NEW_DB_NAME, if_exists="replace", index=False)
except Exception as e:
    print(f"An error occurred: {e}")
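If your password contains characters such as @ or /, a connection URL built by plain string formatting may be parsed incorrectly. One common workaround is to percent-encode the user name and password before building the URL. The sketch below uses only the Python standard library; the helper name build_connection_url and the sample password are illustrative.

```python
from urllib.parse import quote_plus


def build_connection_url(user: str, password: str, endpoint: str) -> str:
    """Return a mysql+pymysql URL with the user name and password percent-encoded."""
    return f"mysql+pymysql://{quote_plus(user)}:{quote_plus(password)}@{endpoint}"


url = build_connection_url("username", "p@ss/word", "endpoint")
print(url)  # mysql+pymysql://username:p%40ss%2Fword@endpoint
```

The resulting string can be passed to create_engine() in place of the formatted URL.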
Available Software and Analysis Tools
The RCP launchers feature the most commonly used research software and analysis tools, including RStudio, Spyder, VS Code, and Stata. Where applicable, each launcher is also preloaded with commonly used packages and modules.
Installing Packages or Modules
Important
Please note that the packages and modules you install are available only within the launcher where you install them, not across the project's other launchers. If you terminate the launcher, these packages and modules will be deleted.
If the package that you need to use is not preloaded, you can install it using the usual commands.
Installing R Packages
Running the standard install.packages() command from within RStudio will download and install the specified packages.
Installing Python Modules
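Python modules can be installed using the pip install command, where modulename is a placeholder for the module you need:
pip install modulename
To upgrade a module that is already installed, also include the --upgrade option:
pip install --upgrade modulename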
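Because installed modules are deleted when a launcher is terminated, it can help to check at the start of a script which modules are missing. The following is a sketch using only the standard library; the helper names missing_modules and pip_install_command are hypothetical. It assumes top-level import names, which can differ from the name you give to pip.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level import names from `names` that are not installed."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def pip_install_command(names, upgrade=False):
    """Build (but do not run) a pip command for the current Python interpreter."""
    command = [sys.executable, "-m", "pip", "install"]
    if upgrade:
        command.append("--upgrade")
    return command + list(names)


# Example module names; replace them with the modules your script needs.
to_install = missing_modules(["json", "pandas"])
if to_install:
    print("Run:", " ".join(pip_install_command(to_install)))
```

To install from within the script, the command list could be passed to subprocess.run(..., check=True).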