GET GOING WITH PYTHON
GPU DATA COMPUTING & DATA SCIENCE
Pick and choose from learning paths carefully planned by RAPIDS GPU project creators and experts
ANALYSTS
Friendly introductions to what you need to know
Get going with Jupyter notebooks, loading large files, filtering, statistics, visualization, and machine learning... all with automatic GPU acceleration!
PYDATA PROS
Ramp up and scale quickly with GPU replacements
Set up, learn the packages, go from CPU to single-GPU to multi-GPU, pick up best practices and reference workflows, and interact with community leaders
LEARNING PATHS OPTIMIZED
FOR YOUR INDUSTRY'S USE CASES
Built from years of experience working with top tech companies, enterprises, and federal agencies
WE TAILOR TO YOUR ANALYTICS GOALS
Upcoming: Security & fraud investigations and analytics
Next: Supply chain optimization, sales & marketing analytics, and genomics
Contact us about future & custom paths!
RAMP UP, GO LIVE, AND INCREASE IMPACT
Specialized tracks and materials
Industry-specific datasets
Ready-to-go solution skeletons
Instructors who have delivered similar projects
Peers tackling related problems
PICK YOUR PATH
Cover the fundamentals in the best way for you
Webinars
DIY tutorials
Expert-led labs
Community chat
Private trainings
BY THE GPU COMMUNITY,
FOR THE GPU COMMUNITY
You may have seen our instructors speak at your favorite events, including:
Amazon Re:Invent, GDC, Nvidia GTC, O'Reilly Strata, Strange Loop, JupyterCon, PyCon, BlackHat, BlueHat, DefCon, BSides, ISSA, GraphConnect, Oakland, WWW, ForwardJS, SplunkConf, and more. We helped start the GPU dataframe computing movement and are core members of the GPU Open Analytics Initiative and RAPIDS.ai. We look forward to meeting you!
Open-Source SQL
on RAPIDS
Our solutions staff is experienced in helping enterprise data scientists and data architects scale solutions for everything from sales & marketing to supply chain optimization.
100X Graph Investigation Visualization & Automation
Graphistry is the first RAPIDS-native visual analytics platform. We work with data teams to power and deploy data-intensive graph projects. We have deep experience in security & fraud analytics, and are growing in fintech, genomics, and sales & marketing.
Scaling Python Simply
Founded by creators of Dask, Coiled helps you run at maximum speed and minimum cost.
... MORE TO BE ANNOUNCED!
(Your Name Here!)
We are bringing together project creators, full-time users, and expert communicators to help spread the knowledge. New member announcements coming soon!
FREE PUBLIC TRAINING
Upcoming community sessions
[PAST] GPU SECURITY ANALYTICS 1:
THE TOUR
July 16th @ 11a PT / 2p ET
GUEST INSTRUCTOR:
Leo Meyerovich (CEO @ Graphistry, Inc.)
LIVE STREAM (40min):
Incident Response - Network log mapping: Netflow (ssh, dns, http, ..) to exfil analysis & dynamic visual network map
Threat Research - Scam domains analysis: Using AT&T AlienVault OTX DNS data and graph analytics
Hunting - Killchain mapping & clustering: Over ELK winlogs (Project Mordor APT data)
Covered GPU tech: Python Jupyter notebooks, BlazingSQL, cuDF (dataframes & regex), cuML/UMAP, cuGraph, Apache Arrow, Graphistry
OPTIONAL LIVE LAB (1hr):
Load and explore your first large file using GPUs: Netflow & Zeek/CoreLight
Extra emphasis on log analysis, viz, & graph
Tech: Jupyter, cuDF, Graphistry, ...
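As a taste of the lab, here is a minimal sketch of loading and summarizing a Netflow-style log with a dataframe. It assumes cuDF's pandas-like `read_csv` and `groupby` calls, and falls back to pandas on machines without RAPIDS installed. The column names, the sample rows, and the `top_talkers` helper are hypothetical, made up for illustration, and are not taken from the session materials.

```python
import io

try:
    import cudf as xdf  # GPU dataframes (needs an NVIDIA GPU with RAPIDS installed)
except ImportError:
    import pandas as xdf  # CPU fallback with the same calls used below

# Hypothetical Netflow-style sample; a real lab file would be far larger.
SAMPLE_CSV = """src,dst,bytes
10.0.0.1,10.0.0.2,1200
10.0.0.1,10.0.0.3,800
10.0.0.4,10.0.0.2,5000
10.0.0.5,10.0.0.2,100
"""


def top_talkers(csv_text, n=2):
    """Return the n source addresses that sent the most bytes."""
    df = xdf.read_csv(io.StringIO(csv_text))
    # Sum bytes per source address, largest totals first.
    totals = df.groupby("src")["bytes"].sum().sort_values(ascending=False)
    return totals.head(n)


if __name__ == "__main__":
    print(top_talkers(SAMPLE_CSV))
```

Because the dataframe code is the same on both paths, a notebook written this way can be prototyped on a laptop and then run on a GPU by installing cuDF.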
[PAST] RAPIDS DATA SCIENCE 1:
THE TOUR
July 28th @ 11a PT / 2p ET
(Session links below)
GUEST INSTRUCTOR:
Rodrigo Aramburu (CEO @ BlazingSQL)
LIVE STREAM (40min):
RAPIDS stack: GPU components and fundamentals
Data manipulation: Use GPU dataframes and SQL to inspect and transform data
Data visualization: Render datasets in different charts both on and off the GPU
Machine learning: Analyze dataframes with GPU ML libraries
Covered GPU tech: Python Jupyter Notebooks, BlazingSQL, cuDF (DataFrames), cuML, Apache Arrow, Dask, cuXFilter, Datashader, Matplotlib...
OPTIONAL LIVE LAB (1hr):
Load and explore a large file using GPUs
Emphasis on cuDF and SQL
Tech: Jupyter, cuDF, BlazingSQL, Dask, cuML, Datashader
[NEXT] SCALING TO WAY MORE DATA THAN 1 GPU CAN HANDLE
August 11th @ 11a PT / 2p ET
(Special session taking the place of the postponed Dask session, which is now August 18th)
GUEST INSTRUCTOR:
Felipe Aramburu (CTO @ BlazingSQL)
LIVE STREAM + HANDS-ON (1hr):
RAPIDS stack: Python GPU components
BlazingSQL: Load & analyze more data than 1 GPU can handle with automatic out-of-core memory handling
Dask-cuDF: Run on multiple GPUs
Mixed format: Split between overview, hands-on, and discussion
[NEXT] EASY PYTHON MULTI-GPU PROGRAMMING WITH DASK-CUDF
August 18th @ 11a PT / 2p ET
(Previously scheduled for August 11th)
GUEST INSTRUCTOR:
Matthew Rocklin (Dask creator, Coiled founder, ex-Nvidia)
LIVE STREAM (40min):
dask-cudf: A Python multi-GPU library for running RAPIDS GPU code over multiple Dask workers
Dask: Parallel and distributed computing in Python
RAPIDS: Python GPU ecosystem
cuDF: Python GPU dataframes in RAPIDS
OPTIONAL LIVE LAB (1hr):
Hands-on session: load a large dataset and easily compute over it using dask-cudf across multiple GPU nodes. Experience for yourself how the RAPIDS ecosystem recently won the TPCx-BB big data benchmark!