Google Colab, short for "Colaboratory", is a free cloud-based platform provided by Google that allows users to write and execute Python code in a web browser. It’s especially popular in the machine learning and data science community because it requires no setup and provides free access to GPUs and TPUs. It combines the interactivity of Jupyter notebooks with the scalability of the cloud.
# Print a message in Google Colab
print("Hello from Google Colab!")
Google Colab offers several advantages over traditional local Jupyter Notebooks. It doesn’t require installation or maintenance—just log in with your Google account. It supports free access to hardware accelerators like GPUs and TPUs, automatic saving to Google Drive, easy sharing just like Google Docs, and seamless collaboration. Colab also comes pre-installed with many data science and machine learning libraries.
# Check if you're running in Colab
try:
    import google.colab
    print("Running in Google Colab")
except ImportError:
    print("Not running in Google Colab")
Key features of Google Colab include free GPU and TPU access, Google Drive integration, real-time collaboration, code cells alongside rich-text Markdown cells, easy data visualization, and built-in access to many popular Python libraries. It also supports interactive widgets and importing datasets from various sources. Notebooks can be shared like Google Docs, enabling multiple users to edit and run code simultaneously.
# Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')
Google Colab is widely used in education for teaching Python, data science, and machine learning. In data science and AI, it serves as a free development environment for prototyping, model training, and visualization. Researchers also benefit by sharing executable papers and running experiments in the cloud without setting up infrastructure. It reduces technical barriers and speeds up collaboration across disciplines.
# Simple data visualization example
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]
y = [10, 20, 25, 30]

plt.plot(x, y)
plt.title("Sample Line Chart")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.show()
To use Google Colab, all you need is a Google account. Simply visit colab.research.google.com and sign in. Once signed in, you can create, open, and save notebooks to your Google Drive. The integration with your Google account ensures autosaving, sharing, and access across devices without any software installation.
# Colab auto-saves your work to Google Drive when signed in.
print("Signed in and ready to go!")
Creating a new notebook is as simple as clicking "File > New Notebook". Google Colab automatically names and saves the notebook in your Drive. You can also rename notebooks, move them between folders, or download them in multiple formats like `.ipynb` and `.py`. Notebooks are auto-saved continuously to prevent data loss.
# No code needed: click "File > Save" or use Ctrl+S.
# Colab also saves your changes automatically as you work.
The Colab interface consists of a toolbar (for actions like saving, inserting cells, and running code), a sidebar with file access, and a main editor for writing code or text. Each notebook consists of "cells"—either code or text (Markdown). The menus offer powerful features like connecting runtimes, managing variables, and inserting charts or tables.
# Toolbar: Run ▷ Add cell ➕ Save 💾 etc.
# Menu: Runtime > Run all / Restart Runtime
print("Interface elements help you control your notebook.")
Google Colab supports a variety of file types including Jupyter notebooks (`.ipynb`), Python scripts (`.py`), data files like `.csv`, `.xlsx`, and more. You can upload files manually, fetch from Google Drive or GitHub, or download outputs in various formats. This versatility makes Colab ideal for data science workflows.
# Reading a CSV file
import pandas as pd

df = pd.read_csv("sample.csv")  # Replace with your file
df.head()
Colab allows users to switch runtime types to optimize performance. You can choose between CPU, GPU (for machine learning acceleration), or TPU (Tensor Processing Unit). Navigate to Runtime > Change runtime type to select. This is especially useful for training deep learning models or processing large datasets.
# Check current runtime hardware
import tensorflow as tf
print("GPU available:", tf.config.list_physical_devices('GPU'))
Code cells in Google Colab are where you write and execute Python code interactively. You can define variables, write functions, import libraries, or run scripts line by line. After writing code, press **Shift+Enter** or click the "Run" button. This flexibility makes it perfect for debugging, learning, and running code without needing a local setup.
# Basic Python example in a Colab code cell
name = "Colab User"
print("Hello,", name)
Google Colab offers many keyboard shortcuts to improve your coding efficiency. For instance, use **Shift+Enter** to run a cell and move to the next, **Ctrl+M B** to insert a new code cell below, or **Ctrl+/** to toggle comments. You can view all hotkeys by going to “Tools > Keyboard shortcuts” in the menu.
# This is for demonstration: try the shortcuts in Colab
print("Use Ctrl+/ to comment or uncomment this line")
In Colab, cells are either **code cells** or **text cells**. Code cells execute Python code, while text cells are used for writing Markdown (e.g., headings, lists, formatted text). This combination helps you mix documentation with executable code, making it great for tutorials, reports, and presentations. You can switch cell types from the toolbar or using hotkeys like **Ctrl+M M**.
# Markdown example (write in a text cell instead):
# ## Section Title
# - Point 1
# - Point 2
# **Bold**, *italic*, and `code`
Comments are essential for writing understandable and maintainable code. In Python, comments start with `#` and are ignored during execution. Use comments to explain logic, flag future work, or annotate code for collaborators. In Markdown cells, you can write formatted explanations, headings, and notes to guide readers through your notebook.
# This line defines a temperature in Celsius
celsius = 25

# Convert to Fahrenheit
fahrenheit = (celsius * 9/5) + 32
print("Temperature in Fahrenheit:", fahrenheit)
As you run cells in Colab, outputs accumulate. You can clear outputs by right-clicking on a cell and choosing "Clear output" or use the top menu "Edit > Clear all outputs". To reset the environment completely, use "Runtime > Restart runtime", which wipes memory and variables—helpful when your notebook becomes slow or buggy.
# Long loop to show output
for i in range(5):
    print("Running iteration", i)

# After this, use Runtime > Restart to clear all variables
Google Colab allows easy uploading and downloading of files to and from your session. This is useful when working with datasets or saving outputs. The upload button brings up a file picker, while downloading can be done with `files.download()`. Keep in mind, files uploaded this way are temporary and lost after the session ends unless saved elsewhere.
from google.colab import files

# Upload a file
uploaded = files.upload()

# Download a file
files.download('example.csv')
Mounting Google Drive lets you access persistent storage in your Google account. Once mounted, Colab treats your Drive as a local filesystem. This is essential for accessing large files, saving results, or loading scripts across multiple sessions without re-uploading every time. Mounted Drive appears under `/content/drive/`.
from google.colab import drive

# Mount Google Drive
drive.mount('/content/drive')
Google Colab supports reading structured data files such as CSV, Excel, and JSON using common Python libraries like pandas. This enables fast exploration and preprocessing of tabular data, which is crucial in data science workflows. You can load files directly from Drive or after uploading them temporarily.
import pandas as pd

# Read CSV file
df = pd.read_csv('/content/sample.csv')

# Read Excel file
df_xls = pd.read_excel('/content/sample.xlsx')

# Read JSON file
df_json = pd.read_json('/content/sample.json')
When Google Drive is mounted, you can directly read from and write to it like a regular directory. This is useful for saving model outputs, logs, or notebooks persistently. Ensure your path starts with `/content/drive/MyDrive/` followed by your folder structure. Files saved here remain in Drive even after the session ends.
# Save a DataFrame to CSV in Google Drive
df.to_csv('/content/drive/MyDrive/my_data.csv', index=False)
Google Colab supports loading datasets from URLs, cloud storage, or APIs. This makes it highly versatile for real-world tasks involving public or remote data. You can use `requests`, `wget`, or `gdown` to download files. APIs like Kaggle or Hugging Face datasets are also commonly integrated in AI workflows.
import pandas as pd

# Download dataset using wget
!wget https://people.sc.fsu.edu/~jburkardt/data/csv/airtravel.csv

# Load it into pandas
df = pd.read_csv('airtravel.csv')
In Google Colab, you can use Markdown in text cells to format notes and documentation. Basic formatting includes bold, italic, strikethrough, and inline code. Markdown helps you make notebooks more readable, combining narrative and code in one document for better presentation and reproducibility.
**Bold text**
*Italic text*
~~Strikethrough~~
`Inline code`
Markdown allows organizing content with headers, bullet or numbered lists, and tables. Headers are denoted by `#`, and lists use `-` or numbers. Tables use pipe `|` separators. These structures help make your notebook clean and structured for presentations or reporting.
# Heading 1
## Heading 2
### Heading 3

- Item 1
- Item 2

1. First
2. Second

| Name | Age |
|------|-----|
| John | 25 |
| Lisa | 30 |
You can embed images and links using standard Markdown syntax. This is useful for adding visual aids or directing users to documentation, external datasets, or related resources. Images can be added via URL or uploaded and linked directly from Google Drive.
[Visit Google](https://www.google.com)

Colab supports LaTeX syntax for displaying mathematical expressions using dollar signs. This is ideal for documenting mathematical models, statistical formulas, or theoretical content inside notebooks. You can use inline equations or block math environments for more complex notations.
Inline: $a^2 + b^2 = c^2$
Block: $$ \sum_{i=1}^{n} x_i = X $$
`Matplotlib` and `Seaborn` are powerful libraries for creating static, animated, and interactive plots. `Matplotlib` provides detailed control over every element of a plot, while `Seaborn` offers high-level APIs for statistical graphics. These libraries help visualize trends, patterns, and insights in data, which is essential in both data exploration and presentation.
import matplotlib.pyplot as plt
import seaborn as sns

data = [10, 20, 30, 25]

sns.set_style("darkgrid")
sns.lineplot(x=[1, 2, 3, 4], y=data)
plt.title("Simple Line Chart")
plt.show()
For interactive and web-based visualizations, tools like `Plotly` and `Bokeh` are ideal. They allow zooming, hovering, and real-time updates directly in notebooks. These tools are used in dashboards, presentations, and modern data science workflows to enhance user experience and data interaction.
import plotly.express as px

fig = px.bar(x=["A", "B", "C"], y=[10, 20, 15], title="Interactive Bar Chart")
fig.show()
In Colab, charts and visualizations display directly below the code cell thanks to inline rendering. This is enabled automatically in Colab; in other Jupyter environments you may need the `%matplotlib inline` magic command. Inline charts make it easy to analyze and revise plots without switching contexts or saving separate files.
# In Colab, plots are inline by default
import matplotlib.pyplot as plt

plt.plot([1, 2, 3], [3, 2, 1])
plt.title("Inline Plot Example")
plt.show()
Visualization libraries allow extensive customization including colors, labels, markers, grid styles, and sizes. This helps in making charts more readable and tailored to different audiences. Whether for reports or presentations, well-designed graphs improve clarity and professionalism in data storytelling.
import matplotlib.pyplot as plt

# Customize color, marker, and line style
plt.plot([1, 2, 3], [3, 2, 1], color='green', marker='o', linestyle='--')
plt.title("Customized Plot")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.grid(True)
plt.show()
`Scikit-learn` is a popular library for classical machine learning. It provides easy-to-use tools for classification, regression, clustering, and dimensionality reduction. In Colab, it is pre-installed, making it easy to begin training models and evaluating results quickly. Its API is simple yet powerful for both beginners and professionals.
from sklearn.linear_model import LinearRegression
import numpy as np

X = np.array([[1], [2], [3]])
y = np.array([2, 4, 6])

model = LinearRegression()
model.fit(X, y)
print("Prediction for 4:", model.predict([[4]]))
Google Colab supports `TensorFlow` and `Keras` out of the box, making it easy to build and train deep learning models. TensorFlow is powerful and scalable, while Keras provides a user-friendly API. Together, they allow rapid prototyping and production-ready deployment for neural networks, especially with GPU/TPU acceleration.
import tensorflow as tf
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Dense(10, activation='relu'),
    keras.layers.Dense(1)
])
model.compile(optimizer='adam', loss='mse')
print("Model ready")
Pre-trained models save time and computation by leveraging existing knowledge. Libraries like TensorFlow Hub and Hugging Face offer ready-to-use models for NLP, vision, and more. You can use them in Colab to quickly perform tasks like sentiment analysis, object detection, or text summarization without training from scratch.
import tensorflow_hub as hub

model = hub.load("https://tfhub.dev/google/nnlm-en-dim50/2")
embeddings = model(["This is a test sentence"])
print(embeddings)
Colab provides free access to GPUs and TPUs, which can accelerate model training significantly. You can select your hardware under "Runtime > Change runtime type". Using a GPU or TPU is especially helpful for training deep learning models with large datasets or complex architectures, often cutting training time substantially compared with CPU-only runs.
# Check GPU
import tensorflow as tf
print("GPU Available:", tf.config.list_physical_devices('GPU'))
Saving trained models is essential for reuse, sharing, or deployment. TensorFlow and Keras provide methods to save the entire model structure and weights. In Colab, you can also save them to Google Drive for persistent storage. Loading a model later allows you to resume training or perform inference.
from tensorflow import keras

# Save model (assumes `model` is the Keras model built earlier)
model.save("my_model.h5")

# Load model
loaded_model = keras.models.load_model("my_model.h5")
Google Colab makes it easy to consume REST APIs using Python libraries like `requests`. You can send HTTP requests to external services, retrieve JSON data, and parse it directly in your notebook. This allows seamless integration with web services for data collection, analysis, or triggering workflows.
import requests

response = requests.get("https://api.publicapis.org/entries")
data = response.json()
print("Number of APIs:", len(data['entries']))
Web scraping is the process of extracting data from websites. In Colab, libraries like BeautifulSoup and Scrapy help you parse HTML content, navigate the document tree, and extract relevant information. This is useful for gathering data not available through APIs or for research purposes.
from bs4 import BeautifulSoup
import requests

url = "https://example.com"
page = requests.get(url)
soup = BeautifulSoup(page.content, "html.parser")
print("Page title:", soup.title.string)
Many popular AI and data platforms provide Python SDKs that can be used in Colab. Hugging Face offers pretrained models for NLP, OpenAI provides APIs for language generation, and Kaggle allows dataset downloads and competition participation. These integrations empower powerful workflows directly in notebooks.
# Example: Download dataset from Kaggle
!pip install kaggle
!mkdir -p ~/.kaggle
# Upload your kaggle.json credentials to ~/.kaggle/kaggle.json
!kaggle datasets download -d zynicide/wine-reviews
Google Sheets can be accessed programmatically via the Google Sheets API. In Colab, you can authenticate with Google and read or write spreadsheet data. This is useful for managing data collaboratively or using Sheets as a lightweight database.
from google.colab import auth
import gspread
from google.auth import default

# Authenticate the current Colab user and build credentials for gspread
# (google.auth replaces the deprecated oauth2client library)
auth.authenticate_user()
creds, _ = default()
gc = gspread.authorize(creds)

spreadsheet = gc.open('Your Sheet Name')
worksheet = spreadsheet.sheet1
values = worksheet.get_all_values()
print(values)
Python’s core features include functions for reusable code blocks, classes for object-oriented programming, and decorators to modify behavior of functions or classes dynamically. Using these in Colab helps write clean, modular, and maintainable code, which is critical in complex data science projects.
def greet(name):
    return f"Hello, {name}!"

class Person:
    def __init__(self, name):
        self.name = name

    def say_hello(self):
        print(greet(self.name))

p = Person("Colab User")
p.say_hello()
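The paragraph above mentions decorators, which the example does not show. The following is a minimal sketch of one; the names `timed`, `last_elapsed`, and `total` are hypothetical and chosen only for illustration.

```python
import functools
import time


def timed(func):
    """Hypothetical decorator: record how long the latest call to func took."""
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        wrapper.last_elapsed = time.perf_counter() - start
        return result

    wrapper.last_elapsed = None  # no call has been timed yet
    return wrapper


@timed
def total(n):
    """Sum the integers from 0 to n - 1."""
    return sum(range(n))


print(total(1000))      # 499500
print(total.__name__)   # total
print(f"Last call took {total.last_elapsed:.6f}s")
```

Applying `@timed` leaves the function body untouched while adding behavior around each call, which is the sense in which a decorator modifies behavior dynamically.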
Exception handling allows your code to gracefully manage errors without crashing. Using `try-except` blocks in Colab helps catch errors such as file not found or invalid inputs, enabling your notebook to continue running and provide meaningful feedback.
try:
    result = 10 / 0
except ZeroDivisionError:
    print("Cannot divide by zero!")
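The two cases named above, a missing file and invalid input, could be handled along these lines. This is only a sketch: the helper names `parse_positive_int` and `read_first_line` and the file name are made up for the example.

```python
def parse_positive_int(text):
    """Hypothetical helper: convert text to a positive int or raise ValueError."""
    value = int(text)  # int() raises ValueError for non-numeric input
    if value <= 0:
        raise ValueError(f"expected a positive number, got {value}")
    return value


def read_first_line(path):
    """Return the first line of a file, or None if the file does not exist."""
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.readline().rstrip("\n")
    except FileNotFoundError:
        print(f"File not found: {path}")
        return None


# Invalid inputs are reported, and the loop keeps going
for raw in ["42", "abc", "-5"]:
    try:
        print("Parsed:", parse_positive_int(raw))
    except ValueError as err:
        print("Invalid input:", err)

# A missing file yields None instead of stopping the notebook
print(read_first_line("no_such_file_for_this_demo.txt"))
```

Catching specific exception types, rather than using a bare `except`, keeps unrelated bugs visible instead of silently swallowing them.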
For compute-intensive tasks or I/O-bound operations, Python supports concurrency via multithreading and multiprocessing. In Colab, you can use the `threading` and `multiprocessing` modules to speed up workflows by running tasks in parallel, although be mindful of resource limits.
import threading

def print_numbers():
    for i in range(5):
        print(i)

thread = threading.Thread(target=print_numbers)
thread.start()
thread.join()
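For I/O-bound work, a thread pool is often more convenient than managing threads by hand. The sketch below uses the standard library's `concurrent.futures`; `fake_download` is a hypothetical stand-in that simulates a network wait with `time.sleep`.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_download(item_id):
    """Stand-in for an I/O-bound task such as a network request."""
    time.sleep(0.1)  # simulated waiting time
    return item_id * 2


item_ids = list(range(8))

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    # map() returns results in the same order as the inputs
    results = list(pool.map(fake_download, item_ids))
elapsed = time.perf_counter() - start

print(results)
print(f"Finished in {elapsed:.2f}s with 4 worker threads")
```

Threads help here because the tasks spend their time waiting. For CPU-bound work, the `multiprocessing` module or `ProcessPoolExecutor` is usually the better fit, since threads in standard CPython do not execute Python bytecode in parallel.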
Colab supports IPython magic commands and shell escapes that simplify common tasks: `%%time` measures the execution time of a whole cell (it must be the first line of that cell), `!` runs a shell command, and `%load` loads external code into a cell, among others. These improve productivity by integrating system commands and timing within your notebook.
%%time
# Time the execution of this whole cell (%%time must be the cell's first line)
total = 0
for i in range(1000000):
    total += i

# In a separate cell: run a shell command
!ls -l

# In a separate cell: load an external Python script
# %load script.py
Google Colab allows seamless sharing of notebooks similar to Google Docs. You can share via link or directly add collaborators by email. Permissions can be set as “Viewer”, “Commenter”, or “Editor”, controlling who can run code or only view content. This facilitates easy teamwork and code reviews.
# No code needed — use the "Share" button at top-right to invite collaborators
Collaborators can add comments to specific cells or sections, providing feedback or suggestions without modifying code. This feature helps teams review notebooks asynchronously and discuss improvements, making notebooks living documents that evolve with input.
# No code needed — right-click on a cell and select "Add comment"
Colab maintains a detailed version history automatically. You can review, restore, or name past versions of your notebook. This is useful to track changes over time, recover previous states, or audit code modifications made by collaborators.
# No code needed: open "File > Revision history" to explore changes
Multiple users can edit a Colab notebook simultaneously, seeing each other’s changes in real time. This live collaboration improves teamwork efficiency, allowing pair programming, joint debugging, or teaching sessions without switching between tools.
# Just share the notebook link and collaborate live!
Mounting Google Drive in Colab allows you to access the files stored in your Drive as if they were on your local machine. This means you can read, write, and modify files directly, making it easier to work with large datasets and save outputs persistently across sessions. The mounting process requires authorization through your Google account.
from google.colab import drive

# Mount Google Drive
drive.mount('/content/drive')

# Now files are accessible under /content/drive/MyDrive/
By default, notebooks in Colab are saved to Google Drive under the "Colab Notebooks" folder. You can manually save or copy notebooks to specific Drive folders using the File menu. This integration ensures your work is safely stored in the cloud and accessible from any device logged into your Google account.
# No explicit code needed; use File > Save a copy in Drive or File > Save.
Once your Drive is mounted, you can read from or write files just like on a local file system using standard Python commands. This includes CSVs, images, text files, or any file type. This persistent file access enables you to continue work seamlessly between sessions and share files with collaborators.
# Example: read CSV from Drive
import pandas as pd

df = pd.read_csv('/content/drive/MyDrive/dataset.csv')
print(df.head())

# Save dataframe back to Drive
df.to_csv('/content/drive/MyDrive/output.csv', index=False)
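Because a mounted Drive behaves like an ordinary directory, standard tools such as `pathlib` work on it. The sketch below writes to Drive when it is mounted and falls back to a temporary directory otherwise; the helper `output_dir` and the `colab_outputs` folder name are assumptions made for this example.

```python
import tempfile
from pathlib import Path

# Location of "My Drive" after drive.mount('/content/drive')
DRIVE_ROOT = Path("/content/drive/MyDrive")


def output_dir(subfolder="colab_outputs"):
    """Return a writable folder: inside Drive if mounted, else a local temp dir."""
    base = DRIVE_ROOT if DRIVE_ROOT.is_dir() else Path(tempfile.gettempdir())
    target = base / subfolder
    target.mkdir(parents=True, exist_ok=True)
    return target


out_path = output_dir() / "results.txt"
out_path.write_text("hello from Colab\n", encoding="utf-8")

print("Saved to", out_path)
print(out_path.read_text(encoding="utf-8"))
```

The fallback keeps the same notebook runnable outside Colab, although files written to the temporary directory do not persist the way files in Drive do.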
The Google Drive API allows programmatic control over files and permissions. You can automate file sharing, access restrictions, and metadata management. Access control ensures that sensitive data is protected and shared only with authorized users. In Colab, you can integrate Drive API via Google APIs Client Library for Python.
# Example: install Google API client
!pip install --quiet --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib

# Example snippet (full auth and API usage requires more setup)
from googleapiclient.discovery import build
# Further code needed to authenticate and manage permissions
The ipywidgets library enables interactive widgets in Jupyter and Colab notebooks. Widgets allow users to interact with your code via buttons, sliders, checkboxes, and more. This interactivity is ideal for demonstrations, parameter tuning, or data exploration without rewriting code cells.
import ipywidgets as widgets
from IPython.display import display

button = widgets.Button(description="Click me!")
display(button)
Sliders, dropdown menus, and checkboxes allow users to select values or options interactively. These widgets can control variables dynamically in your notebook, making it easy to experiment with different parameters and see results update immediately.
slider = widgets.IntSlider(min=0, max=100, step=1, description='Value:')
dropdown = widgets.Dropdown(options=['Option 1', 'Option 2'], description='Choose:')
checkbox = widgets.Checkbox(value=False, description='Check me')
display(slider, dropdown, checkbox)
By linking widgets to functions, you can create dynamic visualizations and inputs that update automatically when the widget value changes. This enables rich, interactive data analysis experiences directly inside the notebook without manual code reruns.
from ipywidgets import interact

def square(x):
    return x * x

interact(square, x=widgets.IntSlider(min=0, max=10, step=1))
You can create forms for gathering multiple inputs using ipywidgets, combining various widget types into a single interface. This allows collecting structured user input efficiently, which can then be processed or analyzed within the notebook.
text = widgets.Text(description="Name:")
age = widgets.IntText(description="Age:")
submit = widgets.Button(description="Submit")

def on_submit(b):
    print(f"Name: {text.value}, Age: {age.value}")

submit.on_click(on_submit)
display(text, age, submit)
Selecting the appropriate hardware accelerator in Colab—CPU, GPU, or TPU—can significantly affect performance. GPUs are ideal for parallel tasks like deep learning, while TPUs optimize TensorFlow models specifically. Benchmarking helps you evaluate the speed differences and choose the best option for your workload.
import tensorflow as tf

# Check available devices
print("GPUs:", tf.config.list_physical_devices('GPU'))
print("TPUs:", tf.config.list_physical_devices('TPU'))
Effective memory management is critical when running large computations. Monitor your session’s RAM usage and clear variables or outputs when no longer needed to avoid memory overflow. Restarting the runtime can also free resources and keep your notebook responsive.
import gc

# Delete large variables to free memory
large_variable = list(range(10_000_000))  # example of a large object
del large_variable
gc.collect()  # ask Python to reclaim unreferenced memory now

# Restart runtime from menu: Runtime > Restart runtime
Sometimes, caches like GPU memory or Python package caches consume resources. Clearing caches helps recover memory and maintain optimal performance. Python's `gc.collect()` reclaims unreferenced objects, but restarting the runtime is the most reliable way to release GPU memory in Colab.
# Restart runtime to clear caches (manual step)
# Alternatively, clear variables as needed
Colab supports magic commands like `%time` and `%timeit` to profile execution time, and `%memit` from memory_profiler to measure memory usage. These tools are invaluable for identifying bottlenecks and optimizing code efficiency.
# Time a single run
%time sum(range(1000000))

# Time multiple runs for average
%timeit sum(range(1000000))

# %memit requires the memory_profiler package and its extension:
# !pip install memory_profiler
# %load_ext memory_profiler
# %memit sum(range(1000000))
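The magic commands above only work inside IPython. A rough equivalent in plain Python, using the standard library's `timeit` and `tracemalloc`, might look like the sketch below; `build_squares` is a hypothetical workload chosen for illustration.

```python
import timeit
import tracemalloc


def build_squares(n):
    """Build a list of the first n squares."""
    return [i * i for i in range(n)]


# Timing: total time for several runs, then the average per run
runs = 5
seconds = timeit.timeit(lambda: build_squares(100_000), number=runs)
print(f"Average time per run: {seconds / runs:.4f}s")

# Memory: peak allocation traced while building the list
tracemalloc.start()
squares = build_squares(100_000)
current_bytes, peak_bytes = tracemalloc.get_traced_memory()
tracemalloc.stop()
print(f"Peak traced memory: {peak_bytes / 1024:.0f} KiB")
```

Note that `tracemalloc` only tracks memory allocated by Python itself, so memory held by native libraries or on the GPU does not show up in these numbers.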
Google Colab lets you download notebooks as `.ipynb` files or Python scripts from the “File > Download” menu. For a readable report, “File > Print” can save the notebook as a PDF through the browser's print dialog, and a downloaded `.ipynb` can be converted to HTML with a tool such as `nbconvert`. Exporting as a Python script allows running the code outside Colab. These options enable flexible usage of your work.
# Export manually:
# File > Download > Download .ipynb or Download .py
# File > Print > save as PDF from the browser's print dialog
Besides exporting, you can download any files generated during your Colab session, including datasets, models, or outputs. Files can be saved locally using Python code or directly through the Colab interface. This ability to download ensures you can keep local backups or use results in other projects.
from google.colab import files

files.download('my_model.h5')  # Downloads the file to your local computer
Colab integrates smoothly with GitHub, allowing you to save notebooks directly to repositories or open notebooks hosted there. This integration facilitates version control, collaboration, and sharing. You can push changes to GitHub to maintain a history and collaborate with others efficiently.
# Upload notebook manually:
# File > Save a copy in GitHub

# Or clone and push using Git commands in a code cell:
!git clone https://github.com/yourusername/yourrepo.git
After training, models can be deployed to user-friendly web apps using platforms like Streamlit or Gradio. These tools allow creating interactive interfaces for demos or production without heavy backend coding. Colab supports quick prototyping with these libraries, enabling seamless model sharing and interaction with minimal setup.
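# Example: install streamlit and run app
!pip install streamlit

# Start the app in the background (assumes app.py exists in the working directory).
# Note: the app's port is not directly reachable from your browser without a tunnel or proxy.
!streamlit run app.py &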
Google Colab has usage limits such as RAM capacity, maximum continuous runtime (usually 12 hours), and automatic disconnection after idle periods. These limits ensure fair resource sharing but can interrupt long-running jobs. Being aware of these constraints helps in planning experiments and managing session lifecycles effectively.
# Check RAM usage
!cat /proc/meminfo

# Check uptime
!uptime
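The same resource information can also be read from Python, which makes it easier to log or act on. The sketch below assumes a Linux runtime such as Colab's for the `/proc/meminfo` part and returns an empty result elsewhere; `read_meminfo` is a hypothetical helper.

```python
import os
import shutil
from pathlib import Path


def read_meminfo(path="/proc/meminfo"):
    """Parse /proc/meminfo into {field: kibibytes}; empty dict if unavailable."""
    info = {}
    meminfo = Path(path)
    if not meminfo.exists():
        return info
    for line in meminfo.read_text().splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[key] = int(parts[0])
    return info


mem = read_meminfo()
if "MemTotal" in mem:
    print(f"Total RAM: {mem['MemTotal'] / 1024**2:.1f} GiB")
if "MemAvailable" in mem:
    print(f"Available RAM: {mem['MemAvailable'] / 1024**2:.1f} GiB")

disk = shutil.disk_usage("/")
print(f"Disk free: {disk.free / 1024**3:.1f} GiB of {disk.total / 1024**3:.1f} GiB")
print("CPU cores:", os.cpu_count())
```

Checking these values before a long job can show early whether a dataset or model is likely to fit in the session's memory and disk.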
Since Colab sessions can reset unexpectedly, it’s important to save work regularly. Use Google Drive integration to save data or models and export checkpoints frequently. Saving code and outputs minimizes the impact of interruptions and helps maintain continuity across sessions.
from google.colab import drive
drive.mount('/content/drive')

# Save model checkpoint
model.save('/content/drive/MyDrive/model_checkpoint.h5')
Colab disconnects sessions after inactivity or reaching runtime limits. To handle this, users can periodically interact with the notebook, split workloads into smaller parts, or save intermediate results. Additionally, Colab Pro offers longer runtimes and more resources for users requiring extended sessions.
# Keeps a cell running for about an hour.
# Note: this is not guaranteed to prevent disconnection; saving intermediate results is more reliable.
from time import sleep

for i in range(60):
    print(i)
    sleep(60)
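Saving intermediate results, as suggested above, can be as simple as writing a small checkpoint after every step so that a restarted session resumes where the last one stopped. In this sketch the checkpoint lives in a temporary directory; in Colab a path under the mounted Drive would persist across sessions. The names `CHECKPOINT`, `run`, and `stop_after` are hypothetical, and `stop_after` only simulates a disconnect.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical checkpoint file; a Drive path would survive a session reset
CHECKPOINT = Path(tempfile.gettempdir()) / "colab_demo_checkpoint.json"


def load_state():
    """Load saved progress, or start from scratch if no checkpoint exists."""
    if CHECKPOINT.exists():
        return json.loads(CHECKPOINT.read_text(encoding="utf-8"))
    return {"next_step": 0, "running_total": 0}


def save_state(state):
    CHECKPOINT.write_text(json.dumps(state), encoding="utf-8")


def run(total_steps, stop_after=None):
    """Process the remaining steps, checkpointing after each one."""
    state = load_state()
    for step in range(state["next_step"], total_steps):
        if stop_after is not None and step >= stop_after:
            break  # simulate the session being cut off
        state["running_total"] += step  # placeholder for real work
        state["next_step"] = step + 1
        save_state(state)
    return state


CHECKPOINT.unlink(missing_ok=True)   # start fresh for the demo
print(run(10, stop_after=4))         # interrupted after steps 0 to 3
print(run(10))                       # resumes at step 4 and finishes
```

The second call picks up at step 4 instead of repeating earlier work, which is the behavior you want after an unexpected disconnect.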
Google Colab Pro and Pro+ provide enhanced features like longer runtimes, more RAM, and priority access to GPUs and TPUs. These plans reduce limitations faced by free users and improve productivity for professionals and researchers with demanding workloads.
# No code needed — upgrade via Google Colab interface
Structuring your notebook by grouping related code and text cells makes it easier to understand and maintain. Use clear section titles, divide tasks into logical blocks, and avoid mixing unrelated operations. This improves readability, facilitates debugging, and helps collaborators follow your workflow.
# Use Markdown cells for titles and explanations
# Separate data loading, processing, modeling in different cells
Adding comments and Markdown explanations enhances clarity. Describe the purpose of code blocks, explain complex logic, and provide context for decisions. Good documentation not only helps others but also aids your future self in recalling why certain approaches were taken.
# Example comment in Python
data = [10, 20, 30, 25]  # sample values

# Calculate the mean of dataset
mean_value = sum(data) / len(data)
To ensure others can reproduce your results, fix random seeds, specify package versions, and clearly list dependencies. Share datasets or links and save checkpoints. Reproducibility is crucial for scientific integrity and collaborative projects.
import numpy as np

np.random.seed(42)
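Seeding NumPy covers only one source of randomness, and the paragraph above also calls for recording package versions. The following sketch seeds Python's own `random` module (and NumPy, if it is installed) and reports installed versions; `seed_everything` and `package_versions` are hypothetical helper names.

```python
import platform
import random
from importlib import metadata

SEED = 42


def seed_everything(seed=SEED):
    """Seed Python's RNG, and NumPy's if NumPy is installed."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy is optional for this sketch


def package_versions(names):
    """Map each package name to its installed version, if any."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


seed_everything()
first = [random.random() for _ in range(3)]
seed_everything()
second = [random.random() for _ in range(3)]

print("Same draws after reseeding:", first == second)
print("Python:", platform.python_version())
print(package_versions(["numpy", "pandas"]))
```

Printing the versions at the top of a notebook gives readers what they need to pin the same dependencies. Deep learning frameworks have their own seeding functions, which would need to be called separately.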
Provide direct links to datasets used and related notebooks to facilitate access and reuse. Hosting datasets on platforms like Google Drive or GitHub and referencing them improves collaboration and transparency.
# Load dataset from URL
import pandas as pd

url = "https://raw.githubusercontent.com/datasets/covid-19/main/data/countries-aggregated.csv"
data = pd.read_csv(url)
Starting from well-structured templates or boilerplate notebooks saves time and enforces good practices. They can include pre-configured imports, styles, and common functions tailored to your domain or workflow.
# Create your own template by copying a base notebook with setup
Real-world datasets often contain missing, inconsistent, or noisy data. Colab is an excellent environment to perform data cleaning using libraries like pandas and NumPy. Cleaning improves data quality and helps generate accurate insights during analysis.
import pandas as pd

data = pd.read_csv('dataset.csv')
data = data.dropna()  # Remove missing values
Colab supports advanced NLP projects using transformer models like BERT or GPT through Hugging Face’s Transformers library. These models enable tasks like text classification, sentiment analysis, and summarization with state-of-the-art performance.
from transformers import pipeline

classifier = pipeline('sentiment-analysis')
result = classifier("I love using Google Colab!")
print(result)
Using deep learning libraries like TensorFlow and Keras in Colab, you can build convolutional neural networks (CNNs) to classify images. CNNs are widely used in computer vision tasks and benefit from Colab’s free GPU acceleration.
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, (3,3), activation='relu', input_shape=(28,28,1)),
    layers.MaxPooling2D((2,2)),
    layers.Flatten(),
    layers.Dense(10, activation='softmax')
])
model.summary()
EDA involves summarizing main dataset characteristics often with visual methods. Colab’s interactive plots and data manipulation libraries make it easy to perform EDA, helping to uncover patterns and inform model building.
import seaborn as sns

# Assumes `data` is a pandas DataFrame loaded earlier
sns.pairplot(data)
Colab can be used to scrape data from websites using packages like BeautifulSoup, then visualize the scraped data through dashboards using Plotly or Dash. This enables creating dynamic reports and insights from real-time web data.
import requests
from bs4 import BeautifulSoup

url = "https://example.com"
page = requests.get(url)
soup = BeautifulSoup(page.content, "html.parser")
print(soup.title.text)
Google Colab runs your code on virtual machines hosted by Google, so your data is processed remotely. While notebooks and files stored in your Google Drive remain private by default, be aware that sharing notebooks or outputs publicly can expose sensitive data. Always review what information your code outputs and shares.
# Example: Avoid printing sensitive data
password = "supersecret"
# Do NOT print password in output
# print(password)  # Avoid this in shared notebooks
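Saved cell outputs travel with the notebook file, so a value printed once can leak when the notebook is shared later. Colab's Edit menu offers "Clear all outputs"; the sketch below does the same thing on a notebook's JSON, assuming the nbformat 4 layout. The function name `strip_outputs` and the sample notebook are illustrative.

```python
import copy
import json


def strip_outputs(notebook):
    """Return a copy of an nbformat-4 notebook dict with code outputs removed."""
    cleaned = copy.deepcopy(notebook)
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# Made-up notebook whose saved output accidentally contains a secret.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 1,
            "source": ["print(token)"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["supersecret\n"]}
            ],
        }
    ],
}

shareable = strip_outputs(notebook)
print(json.dumps(shareable["cells"][0], indent=1))
```

This only removes outputs; secrets typed directly into cell source still need to be removed by hand.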
Colab notebooks use Google Drive’s sharing system. You can control who can view, comment, or edit your notebook via the “Share” button. Carefully set permissions to avoid unwanted access. Use private sharing links and restrict editing rights to trusted collaborators only to maintain control over your code and data.
# No direct code — set permissions via the "Share" button in Colab UI
Never hard-code sensitive information like API keys or passwords in notebooks. Instead, use environment variables or external configuration files that are excluded from version control. Encrypt or mask sensitive outputs before sharing notebooks publicly to minimize data exposure risks.
import os

api_key = os.getenv('API_KEY')  # Store API key securely, don't hardcode
print("API Key loaded securely:", api_key is not None)
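When a secret has to appear in output at all, for example to confirm which key is active, masking it is one option. The helper below is a minimal sketch; `mask_secret` and its `visible` parameter are hypothetical names, not part of any library.

```python
import os


def mask_secret(value, visible=4):
    """Replace all but the last `visible` characters with asterisks."""
    if not value:
        return "<not set>"
    if len(value) <= visible:
        return "*" * len(value)  # Too short to reveal any part safely
    return "*" * (len(value) - visible) + value[-visible:]


api_key = os.getenv("API_KEY")
print("API key in use:", mask_secret(api_key))
```

Showing only the last few characters is usually enough to tell keys apart without exposing them.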
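Colab allows you to set environment variables temporarily during a session, which keeps secrets like tokens or keys out of the code you share. Set them from Python with `os.environ` or the `%env` magic rather than with `!export`: each `!` command runs in its own subshell, so a variable exported there never reaches the notebook's Python process. Colab's built-in Secrets panel is another option.
# Set the variable from Python so it persists for the session
# (a "!export ..." shell command runs in a temporary subshell and is lost)
import os
os.environ['API_KEY'] = "your_secret_key"

# Access it later without printing the value itself
print("API Key set:", os.getenv('API_KEY') is not None)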
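Note that each `!` shell command runs in its own subshell, so variables need to be set in the Python process itself to be visible to later cells. A small sketch of that pattern follows, with a helper that fails loudly when a variable is missing. The name `require_env` is hypothetical, and the placeholder value stands in for something you would normally enter interactively, for example with `getpass.getpass()`.

```python
import os

# Placeholder value for illustration; in a real session, read it interactively
# (e.g. with getpass.getpass()) instead of typing it into a cell.
os.environ["API_KEY"] = "your_secret_key"


def require_env(name):
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name!r} is not set")
    return value


api_key = require_env("API_KEY")
print("API key loaded:", bool(api_key))

# In Colab itself, the Secrets panel is an alternative (not runnable elsewhere):
#   from google.colab import userdata
#   api_key = userdata.get("API_KEY")
```

Failing early with a named variable tends to be easier to debug than a later authentication error from the API.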
To protect your work, avoid sharing notebooks publicly if they contain sensitive data or credentials. Regularly review sharing permissions, backup important notebooks, and log out from shared devices. Use version control systems and encrypted storage for critical projects to ensure confidentiality and integrity.
# Backup notebooks using Google Drive sync or git
# Use .gitignore to exclude sensitive files
# Always review shared links before distributing
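Jupyter notebook extensions add features such as code folding, a table of contents, and variable inspectors. Colab runs its own frontend rather than the classic Jupyter interface, so native extension support is limited: packages built for classic Jupyter may install without having any visible effect. Colab provides some of these features itself, including a table of contents and a variable inspector, and small JavaScript snippets can customize cell output.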
# Example: Install a Jupyter extension (may not persist after restart)
!pip install jupyter_contrib_nbextensions
# Enable extensions manually if supported
You can inject custom JavaScript and CSS into Colab notebooks to modify appearance or behavior, for example to change styles, hide elements, or add interactive widgets. Use the IPython `display` module to run scripts or add styles dynamically. Because Colab renders each cell's output in its own sandboxed frame, such tweaks generally affect that cell's output rather than the whole interface.
from IPython.display import display, HTML, Javascript

# Inject CSS (add your own rules inside the <style> element)
display(HTML('''
<style>
/* CSS rules go here */
</style>
'''))

# Inject JavaScript
display(Javascript('alert("Welcome to customized Colab!")'))
Colab supports integrating various APIs and services like Slack, GitHub, or cloud platforms. Use their Python SDKs or REST APIs to automate workflows, fetch data, or trigger external processes. This expands your notebook’s capabilities beyond pure computation.
# Example: Use GitHub API to list repos
import requests

response = requests.get("https://api.github.com/users/octocat/repos", timeout=10)
response.raise_for_status()  # Fail clearly on HTTP errors such as rate limiting
repos = response.json()

print("Repository names:")
for repo in repos:
    print(repo['name'])
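Many APIs require authentication, which ties back to keeping tokens out of the notebook. The sketch below builds an authenticated GitHub request with the token taken from an environment variable. It uses the standard library's `urllib` instead of `requests` so the request can be inspected without any network call; the function names and the `GITHUB_TOKEN` variable name are assumptions made for illustration.

```python
import json
import os
import urllib.request


def build_repos_request(user, token=None):
    """Prepare (but do not send) a request for a user's public repositories."""
    url = f"https://api.github.com/users/{user}/repos"
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, headers=headers)


def repo_names(payload):
    """Extract repository names from the decoded JSON response."""
    return [repo["name"] for repo in payload]


def list_repos(user, token=None, timeout=10):
    """Send the request and return repository names (needs network access)."""
    request = build_repos_request(user, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return repo_names(json.load(response))


# GITHUB_TOKEN is an assumed variable name; unauthenticated calls also work
# for public data, subject to stricter rate limits.
request = build_repos_request("octocat", os.getenv("GITHUB_TOKEN"))
print("Prepared request for:", request.full_url)
# print(list_repos("octocat"))  # Uncomment to call the API
```

Separating request construction from sending makes the authentication logic easy to check before any call is made.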
To save time, create notebooks with boilerplate code, imports, and explanations that can be reused for similar projects. Share these templates within teams or publish them to speed up onboarding and maintain consistency. Templates can be saved in Drive or GitHub for easy access.
# No code needed: just save a clean notebook as template.ipynb
# Copy and reuse when starting new projects
Google Apps Script allows automation of Google Workspace apps. Combined with Colab, you can automate data flows, trigger notebooks from Sheets, or send emails based on notebook results. This integration helps build end-to-end pipelines that connect data, analysis, and reporting.
// Example Apps Script snippet to trigger Colab (conceptual)
function runColabNotebook() {
  // Use Apps Script UrlFetchApp to trigger notebook via API or webhook
}