Quick Start Tutorial¶
This tutorial guides you through building your first machine learning model with Arc. You'll build a diabetes prediction model in a few minutes, without writing any ML code!
What You'll Build¶
In this tutorial, you'll:
- Download the Pima Indians Diabetes dataset
- Let Arc analyze the data and engineer features
- Generate an Arc-Graph model specification
- Train and evaluate the model
- View predictions and performance metrics
Time required: ~5 minutes
Prerequisites¶
Before starting, make sure you have:
- Installed Arc
- Configured your API key
Step 1: Start Arc¶
Launch Arc's interactive chat interface:
You should see Arc's welcome message with the ASCII logo.
Step 2: Describe What You Want¶
Simply tell Arc what you want to build in plain English:
Download the Pima Indians Diabetes dataset and build a model to predict diabetes from patient health metrics
Press Enter and watch Arc work its magic!
What Happens Next¶
Arc will automatically:
- Download the Dataset
  - Fetches the Pima Indians Diabetes dataset
  - Loads it into Arc's database
- Analyze the Data
  - Examines features: pregnancies, glucose, blood pressure, BMI, age, etc.
  - Determines appropriate preprocessing steps
- Engineer Features
  - Normalizes numerical features
  - Creates train/test splits
  - Generates processed data tables
- Generate Arc-Graph Specification
  - Creates a YAML specification for the model architecture
  - Defines inputs, model layers, and outputs
  - You'll see the spec and can review/approve it
- Train the Model
  - Trains the model with your data
  - Launches TensorBoard for real-time monitoring
  - Tracks metrics (loss, accuracy, etc.)
- Evaluate Performance
  - Computes evaluation metrics
  - Shows predictions vs actual values
  - Displays model performance statistics
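To make the "Engineer Features" step more concrete, here is a minimal sketch of what normalizing a numerical feature and creating a train/test split can look like. It is not Arc's implementation: the helper names `standardize` and `train_test_split`, the 80/20 split, the seed, and the sample glucose values are all assumptions chosen for illustration.

```python
import random
import statistics


def standardize(column):
    """Scale a numeric column to zero mean and unit variance (z-score)."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    if std == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mean) / std for x in column]


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle rows deterministically, then split into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Made-up glucose readings for illustration only (not taken from the dataset).
glucose = [85.0, 148.0, 183.0, 89.0, 137.0, 116.0, 78.0, 115.0, 197.0, 125.0]

scaled = standardize(glucose)
train, test = train_test_split(scaled, test_fraction=0.2)
print(len(train), len(test))  # 8 2
```

Fixing the shuffle seed keeps the split reproducible between runs, which matters when you compare models trained on the same data.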
Step 3: Explore Your Results¶
Once training completes, you can explore your data and results using SQL:
-- View available tables
/sql SHOW TABLES
-- See predictions
/sql SELECT * FROM predictions LIMIT 10
-- Check model performance
/sql SELECT * FROM evaluations ORDER BY created_at DESC LIMIT 1
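The last query relies on `ORDER BY created_at DESC LIMIT 1` to pick the most recent evaluation. The sketch below reproduces that pattern outside Arc, using Python's built-in `sqlite3` module as a stand-in database. The table schema (`model`, `accuracy`) and the rows are hypothetical; only the `evaluations` table name and the `created_at` column come from the query above, and Arc's real tables may look different.

```python
import sqlite3

# In-memory stand-in database; Arc's own database is not used here.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: only `created_at` appears in the tutorial's query.
conn.execute(
    "CREATE TABLE evaluations (model TEXT, accuracy REAL, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO evaluations VALUES (?, ?, ?)",
    [
        ("run-1", 0.71, "2024-01-01 10:00:00"),  # made-up values
        ("run-2", 0.76, "2024-01-02 09:30:00"),  # made-up values
    ],
)

# Same shape as the tutorial query: newest row first, keep only one.
latest = conn.execute(
    "SELECT * FROM evaluations ORDER BY created_at DESC LIMIT 1"
).fetchone()
print(latest)
```

Sorting by timestamp works here because the timestamps are stored in a format that orders chronologically when compared as text.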
Understanding the Arc-Graph¶
Arc generated an Arc-Graph specification for your model. It looks something like this:
# Arc-Graph: Model Architecture
inputs:
  patient_data:
    dtype: float32
    shape: [null, 8]
    columns: [pregnancies, glucose, blood_pressure, skin_thickness,
              insulin, bmi, diabetes_pedigree, age]
graph:
  - name: classifier
    type: torch.nn.Linear
    params:
      in_features: 8
      out_features: 1
    inputs:
      input: patient_data
  - name: sigmoid
    type: torch.nn.Sigmoid
    inputs:
      input: classifier.output
outputs:
  prediction: sigmoid.output
trainer:
  optimizer:
    type: torch.optim.Adam
    params:
      lr: 0.001
  loss: torch.nn.BCELoss
  epochs: 50
  batch_size: 32
This specification is:
- Human-readable - You can understand and modify it
- Portable - Runs anywhere PyTorch runs
- Versionable - Track changes in Git
- Reproducible - Guarantees train/serve parity
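To see what the graph in the specification computes, the following sketch implements the same math in plain Python: a linear layer with 8 inputs and 1 output, a sigmoid, and binary cross-entropy as the loss. It is an illustration of the arithmetic, not Arc or PyTorch code. The function names, the sample feature values, and the zero-valued weights are assumptions (PyTorch initializes weights randomly), and the `eps` clamp is a simple numerical safeguard chosen here rather than PyTorch's exact behavior.

```python
import math


def forward(features, weights, bias):
    """Linear layer (8 inputs -> 1 output) followed by a sigmoid."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def bce_loss(prediction, label, eps=1e-7):
    """Binary cross-entropy for a single example with label 0 or 1."""
    # Clamp to avoid log(0); this safeguard is an assumption of the sketch.
    p = min(max(prediction, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


# Illustrative, already-normalized values in the spec's column order:
# pregnancies, glucose, blood_pressure, skin_thickness,
# insulin, bmi, diabetes_pedigree, age
patient = [0.6, 0.8, 0.1, 0.9, -0.7, 0.2, 0.5, 1.4]

weights = [0.0] * 8  # untrained placeholder weights
bias = 0.0

p = forward(patient, weights, bias)
print(p)               # 0.5
print(bce_loss(p, 1))  # about 0.693, i.e. log(2)
```

With all-zero weights the model outputs 0.5 for every patient, so the loss starts at log(2). Training adjusts the weights and bias to push that loss down.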
Learn more about Arc-Graph in the Arc-Graph documentation.
Step 4: View Training Progress¶
Arc automatically launches TensorBoard to visualize training progress. Check your console output for the TensorBoard URL (usually http://localhost:6006).
In TensorBoard, you can view:
- Training and validation loss curves
- Accuracy metrics over time
- Model architecture graph
- Hyperparameter comparisons
What's Next?¶
Now that you've built your first model, explore more:
- Model Training Guide - Learn about training workflows
- Feature Engineering - Advanced data preparation
- Making Predictions - Use your model for inference